SWE-Bench Verified: Thinking Optional

The chart hilariously reveals that GPT-5 scores a whopping 74.9% accuracy on software engineering benchmarks, but the pink bars tell the real story: 52.8% of that is achieved "without thinking," while only the remaining sliver (22.1 points) comes from actual "thinking." Meanwhile, OpenAI's o3 and GPT-4o trail behind at 69.1% and 30.8% respectively, with apparently zero thinking involved. It's basically an admission that these models are regurgitating patterns rather than performing actual reasoning. The perfect metaphor for when your code works but you have absolutely no idea why.