Common Sense Is (Still) All You Need
The Real Bottleneck in AI Isn’t Compute; It’s Understanding
Disclaimer: This is a review of an opinion paper and does not necessarily represent the reviewer’s position.
In recent years, AI has made major advances—from conversational agents and image generation to self-driving cars. Yet one basic ability still separates machines from the kind of intuitive, adaptive intelligence shown by even the simplest animals: common sense. In Common Sense Is All You Need, Hugo Latapie argues that this missing capability is not just an inconvenience but is the core obstacle keeping AI from achieving real autonomy. He makes the case that if we want AI to safely operate in dynamic, unpredictable environments, then common sense must be designed in from the ground up. Bigger models, more data, and better benchmarks won’t close the gap. Instead, we need a shift in how we build, train, and evaluate AI systems.
The paper’s core argument is that today’s AI efforts are skewed toward performance and scale at the expense of general understanding. Latapie challenges the field to move away from narrowly optimized models and benchmarks, and instead focus on what actually enables humans and animals to learn and act effectively in the real world. This means prioritizing flexible learning, adaptation, and contextual reasoning, not just better language completion or object recognition. It’s a call to reorient the foundations of AI around the very qualities we take for granted in natural intelligence.
What’s New and Important Here?
One of the paper’s most original contributions is its expansion of the concept of embodiment in AI. Typically, “embodiment” refers to robots or agents with physical bodies that learn by interacting with the environment: picking up objects, walking, or navigating spaces. Latapie broadens this idea by introducing the notion of cognitive embodiment, where AI systems engage not only with physical environments but also with structured, abstract ones. In this view, interacting with a constrained puzzle or logic problem can be as cognitively rich as exploring a room. An AI that genuinely embodies an abstract environment can learn to reason through trial and feedback, rather than just recognize patterns.
A good example of this is the Abstraction and Reasoning Corpus (ARC), a benchmark composed of visual transformation puzzles. Each task presents a small set of input-output grid pairs, and the AI must figure out the rule and apply it to a new grid. While ARC is meant to test reasoning, many AI systems today “solve” it by memorizing patterns from large sets of training examples. Latapie argues that this misses the point: to really test for reasoning, AI should approach these problems with minimal prior knowledge and learn from the structure of the task itself. That’s cognitive embodiment in action, and it’s central to his proposal.
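To make the setup concrete, here is a minimal sketch of what an ARC-style task involves: a few input/output grid pairs demonstrate a hidden transformation, and a solver with minimal priors must find a rule consistent with all of them and apply it to a fresh input. The grids and the tiny hypothesis space below are invented for illustration and are not drawn from the paper or from the real ARC dataset.

```python
# Toy ARC-style task: infer a grid transformation from a few demonstrations,
# then apply it to a new input. Grids and candidate rules are illustrative only.

def transpose(grid):
    return [list(row) for row in zip(*grid)]

def flip_horizontal(grid):
    return [row[::-1] for row in grid]

def flip_vertical(grid):
    return grid[::-1]

# A tiny hypothesis space of candidate transformations.
CANDIDATE_RULES = {
    "transpose": transpose,
    "flip_horizontal": flip_horizontal,
    "flip_vertical": flip_vertical,
}

def infer_rule(demonstrations):
    """Return the first candidate rule consistent with every demo pair."""
    for name, rule in CANDIDATE_RULES.items():
        if all(rule(inp) == out for inp, out in demonstrations):
            return name, rule
    return None, None

# Two demonstration pairs whose hidden rule is a horizontal flip.
demos = [
    ([[1, 2, 0],
      [0, 3, 0]],
     [[0, 2, 1],
      [0, 3, 0]]),
    ([[5, 0],
      [0, 7]],
     [[0, 5],
      [7, 0]]),
]

name, rule = infer_rule(demos)
test_input = [[4, 0, 0],
              [0, 0, 9]]
print(name)              # -> flip_horizontal
print(rule(test_input))  # -> [[0, 0, 4], [9, 0, 0]]
```

The point of the sketch is the workflow, not the solver: a system exhibiting cognitive embodiment would, in Latapie’s sense, construct and test hypotheses like these from the task itself rather than retrieve patterns memorized from large training corpora.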
Another key idea is the importance of starting from a blank slate, or, as Latapie calls it, tabula rasa. Most modern AI models depend on pretraining with massive datasets, which makes them powerful in familiar situations but poor at generalizing to new ones. In contrast, humans and animals regularly encounter novel environments and must figure things out on the fly. Latapie argues that real autonomy requires this kind of flexible, low-assumption learning. AI systems should begin with very little preloaded information and build up knowledge as they interact with the world or task at hand. This would prevent overfitting to specific training sets and encourage deeper, more transferable reasoning.
Perhaps the most piercing critique in the paper is what Latapie describes as the “magic happens here” problem. In many current AI development workflows, the transition from advanced pattern recognition to general intelligence is assumed to somehow emerge from scale. If we just train bigger models, with more data and compute, autonomy will arrive… eventually. But Latapie calls this wishful thinking. Without common sense reasoning explicitly built into the architecture and training process, we’re merely pushing against the ceiling of diminishing returns. Progress appears smooth at first, but eventually hits a wall. The core issue, the absence of real-world understanding, remains unsolved.
Benchmarks and What’s Wrong With Them
Latapie dedicates a substantial part of the paper to analyzing the limits of popular benchmarks and evaluation methods, using three high-profile examples. First is the ARC challenge, which was designed to test abstraction and reasoning rather than memorization. However, in practice, many AI models tackle ARC by relying on hundreds of training examples, and sometimes even the test problems leak into development workflows. This creates the illusion of general intelligence, when in fact models are just pattern-matching. Latapie proposes redesigning ARC to truly enforce minimal prior knowledge and test the system’s ability to reason in real time from a small set of examples.
Next, he looks at the case of full self-driving (FSD) systems, which are often benchmarked using the SAE autonomy levels, from Level 1 (basic driver assistance) to Level 5 (complete autonomy in all conditions). Today’s systems are largely capped at Levels 2 or 3, and Level 4 vehicles often rely on remote human assistance for unusual situations. According to Latapie, this ceiling is not just a hardware or sensor problem but a reasoning problem. Without the ability to understand context, recognize unusual patterns, and make intuitive decisions, vehicles will always require fallback mechanisms. Common sense, not just better cameras or maps, is what separates current AI from the full autonomy of a human driver.
Finally, Latapie critiques the well-known Turing Test, which evaluates a machine’s ability to engage in a conversation indistinguishable from that of a human. While passing the Turing Test can be impressive, he argues that it doesn’t indicate true understanding. A chatbot might produce fluent answers using statistical associations, without any grasp of the meaning or context behind its words. Such systems may appear smart in controlled conversation but would fail catastrophically in environments that require reasoning, decision-making, or adaptability. In Latapie’s view, these benchmarks risk distracting the field from what really matters: developing grounded, reasoning-based intelligence.
Scaling Isn’t the Answer (Alone)
Latapie acknowledges that scaling has brought genuine breakthroughs, particularly in language and vision. Larger models have unlocked impressive capabilities in text generation, translation, and image analysis. But he also highlights a growing problem: scaling is starting to hit diminishing returns. Several real-world benchmarks have stalled despite increases in compute and dataset size. For instance, in object recognition (COCO), anomaly detection in video (UCF-Crime), and action localization in video (ActivityNet), performance improvements have plateaued even as models grow more complex.
This suggests that we may be nearing the practical limits of what scaling alone can achieve. The implication is that future progress will depend less on size and more on solving deeper architectural challenges, like how to build systems that can learn, adapt, and reason across unfamiliar situations. Scaling gives us sharper tools, but without common sense, those tools remain narrowly useful and brittle when faced with real-world complexity.
Theoretical Rigor
The paper also tackles several famous theoretical issues in AI, explaining how common sense could help address them. One is the No Free Lunch Theorem, which says that no learning algorithm is best for every possible problem. Latapie’s response is practical: we don’t need to solve every possible problem, just the ones that matter. By focusing on well-structured domains with clear patterns and rules, AI systems can learn efficiently without violating the theorem’s limits.
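For reference, one standard statement of the theorem (due to Wolpert and Macready, 1997; quoted here for context, not taken from Latapie’s paper) says that for any two search algorithms a_1 and a_2, performance summed over all possible objective functions f is identical:

```latex
% No Free Lunch theorem, standard form: d_m^y is the sequence of m observed
% cost values, f ranges over all objective functions on a finite domain,
% and a_1, a_2 are any two search algorithms.
\sum_{f} P\left(d_m^y \mid f, m, a_1\right) = \sum_{f} P\left(d_m^y \mid f, m, a_2\right)
```

Latapie’s reading is that the averaging over all possible functions is exactly what practical systems can sidestep: the environments that matter are structured, so an agent that exploits that structure can perform well without contradicting the theorem.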
He also addresses the Frame Problem, which refers to the difficulty of figuring out what matters and what doesn’t in any given situation. Latapie suggests that embodied interaction, whether in the physical world or in structured tasks, helps AI systems learn relevance through experience. Similarly, the Qualification Problem, which points to the challenge of listing all the preconditions for an action, can be managed through adaptive reasoning and learning from trial and error. These responses ground abstract philosophical concerns in concrete design strategies, showing how they can be addressed through better AI architecture rather than brute-force data accumulation.
How Should We Build AI Instead?
To move forward, Latapie proposes a series of practical changes in how we build and evaluate AI. First, benchmarks need to be redesigned to test reasoning, not recognition. That means limiting the prior knowledge an AI system is allowed to use, forcing it to engage with each problem as new. Evaluation should shift away from just scoring correct answers and toward understanding how the system reached its conclusion, meaning that process matters more than output.
Architecturally, Latapie advocates for integrating symbolic logic with learning-based approaches. This hybrid strategy would combine the structure and transparency of logic with the flexibility of statistical models. He also emphasizes the need to draw from neuroscience and cognitive science, specifically to build architectures that reflect how biological systems acquire and apply common sense. In short, the AI software stack needs to evolve. Adding more layers to today’s models won’t be enough. We need to rethink what the base layers should even be.
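As a rough illustration of what such a hybrid could look like in code (the labels, rules, and mock model below are my own, not an architecture from the paper): a learned component proposes interpretations with confidences, and a small symbolic layer enforces constraints the statistical model cannot guarantee on its own.

```python
# Minimal neuro-symbolic sketch: a (mocked) learned detector proposes labeled
# detections, and a symbolic constraint layer resolves contradictions.
from dataclasses import dataclass

@dataclass
class Detection:
    label: str
    confidence: float

def mock_perception(frame_id: int) -> list[Detection]:
    """Stand-in for a learned perception model (e.g., a neural detector)."""
    if frame_id == 0:
        return [Detection("red_light", 0.81), Detection("green_light", 0.78)]
    return [Detection("green_light", 0.92)]

# Symbolic knowledge: pairs of labels that cannot both hold at once.
MUTUALLY_EXCLUSIVE = [("red_light", "green_light")]

def apply_rules(detections: list[Detection]) -> list[Detection]:
    """Drop the lower-confidence member of any contradictory pair."""
    kept = list(detections)
    for a, b in MUTUALLY_EXCLUSIVE:
        hits = [d for d in kept if d.label in (a, b)]
        if len(hits) == 2:
            kept.remove(min(hits, key=lambda d: d.confidence))
    return kept

for frame in (0, 1):
    consistent = apply_rules(mock_perception(frame))
    print(frame, [d.label for d in consistent])
# 0 ['red_light']   (contradiction resolved by the symbolic layer)
# 1 ['green_light']
```

The division of labor is the point: the statistical side supplies flexible perception, while the symbolic side contributes the transparent, checkable structure that Latapie argues current stacks lack.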
What Happens If We Don’t Change?
Latapie warns that continuing along the current path (scaling up models, polishing benchmarks, and assuming that autonomy will emerge) risks slowing progress and damaging trust. AI systems that look good in demos but fail in messy, real-world conditions can lead to disillusionment among users, developers, and investors. Worse, systems without common sense may behave in unpredictable or unsafe ways, especially as they gain more autonomy and the ability to self-modify. If these systems lack grounding in human values or context awareness, they could easily make decisions that appear intelligent but cause real harm.
The paper argues that public fears around AI, especially fears of superintelligent systems behaving recklessly, are really fears of intelligence without common sense. And those fears are not unfounded. Building AI that understands its environment, learns responsibly, and reasons about consequences is not just a technical goal but a safety requirement.
https://arxiv.org/pdf/2501.06642