ARC-AGI-3 Benchmark Shows Top AI Models Fail at General Reasoning, Scoring Below 1%
A new benchmark is forcing a hard question: are today’s most advanced AI models actually capable of reasoning, or are they just very good at pattern recognition?
Google and OpenAI Models Outperform Humans at ICPC Coding Finals
OpenAI’s GPT-5 achieved a perfect score of 12/12 at the 2025 ICPC World Finals, while Google’s Gemini 2.5 Deep Think solved 10/12, a result that ranked second overall.