LLM evolution has clearly slowed, yet in the short term AI is becoming far more interesting from an engineering perspective than from a “purely modeling” one. It is exactly in this space – where the structural limitations of LLMs are managed through architectures, mathematics, and deep agents – that Fastal is focusing its research and development.
From the LLM Plateau to the “AI Bubble”
In recent years, the perceived leap between one generation of models and the next has shrunk drastically: GPT-5 was met with a mix of initial hype and strong disappointment, with many analysts speaking of a technical plateau rather than a revolution. Even in the mainstream, the “AI bubble” narrative is gaining ground, amplified by skeptical voices such as the authoritative yet controversial Gary Marcus, who has long highlighted the structural limits of simply scaling data and parameters.
At the same time, the discourse on AGI has shifted: there is growing consensus that generality will not be achieved simply by making LLMs bigger, but by working on reasoning structure, integration with external tools, and forms of neuro-symbolic hybridization. This paradigm shift opens space for a new type of innovation, less spectacular in benchmarks but much more concrete in business contexts.
Products That Work, Despite the Plateau
While public debate oscillates between enthusiasm and disillusionment, companies competing on the front lines continue to release LLM-based products with increasingly reliable performance in real workflows. Even though the “core” models improve only incrementally, enterprise solutions are proving genuinely useful for document automation, user assistance, assisted coding, and decision support.
The qualitative leap comes from three main levers (a minimal sketch follows the list):
- orchestration of multiple models and services (not a single “omnipotent” LLM);
- extensive use of external tools (APIs, databases, search engines, internal systems);
- increasingly rigorous control of context, memory, and output formats.
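To make the first and third levers concrete, here is a minimal Python sketch; `call_small_model` and `call_large_model` are hypothetical stubs standing in for real model endpoints, and the tool lever is sketched separately further below.

```python
import json
from typing import Callable

# Hypothetical stubs standing in for real LLM endpoints, so the
# sketch stays self-contained and runnable.
def call_small_model(prompt: str) -> str:
    return json.dumps({"answer": f"short reply to: {prompt}", "confidence": 0.72})

def call_large_model(prompt: str) -> str:
    return json.dumps({"answer": f"detailed reply to: {prompt}", "confidence": 0.91})

REQUIRED_KEYS = {"answer", "confidence"}  # lever 3: a rigid output contract

def route(task: str, complexity: int) -> dict:
    # Lever 1: orchestrate several models instead of one "omnipotent" LLM.
    model: Callable[[str], str] = call_large_model if complexity > 5 else call_small_model
    parsed = json.loads(model(task))
    # Lever 3: validate the output format before it reaches downstream systems.
    missing = REQUIRED_KEYS - parsed.keys()
    if missing:
        raise ValueError(f"model output missing keys: {missing}")
    return parsed

print(route("summarize contract X", complexity=7))
```

The routing rule and the output contract live in ordinary, testable code rather than inside the model, which is precisely the point.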
It is in this type of integration – rather than in the new “super-model” – that Fastal sees the real value for enterprises and Public Administration.
How to Work Around LLM Limitations
The known limitations of LLMs (hallucinations, lack of long-term memory, difficulty with multi-step tasks) are now addressed with specific architectural patterns. Some key elements:
External Memory and Persistent State
Deep agents separate memory from prompt context, using file systems, relational or vector databases as sources of truth accessible via tools. The model doesn’t “remember everything” but learns to read and write in a structured information space, closer to how enterprise information systems work.
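A minimal sketch of the pattern, assuming a simple file-backed store (the `MemoryStore` name is illustrative, not an existing library):

```python
import json
from pathlib import Path

class MemoryStore:
    """File-backed memory: the source of truth lives outside the prompt."""

    def __init__(self, path: str = "agent_memory.json"):
        self.path = Path(path)
        if not self.path.exists():
            self.path.write_text("{}")

    def read(self, key: str) -> str | None:
        return json.loads(self.path.read_text()).get(key)

    def write(self, key: str, value: str) -> None:
        data = json.loads(self.path.read_text())
        data[key] = value
        self.path.write_text(json.dumps(data, indent=2))

# Exposed to the agent as read/write tools: only the keys it asks for
# enter the context window, while the file remains the state of record.
memory = MemoryStore()
memory.write("client_42/status", "KYC review pending")
print(memory.read("client_42/status"))
```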
Tools and Actions in the World
The agent doesn’t just produce text but can invoke atomic tools: database queries, calls to external services, code execution, file manipulation. This shifts part of the intelligence from the model to the system orchestrating tools and controls, improving reliability and auditability.
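A sketch of what such an invocation layer might look like, with a stub `query_db` tool; the registry and audit log here are illustrative assumptions, not a specific framework:

```python
import datetime
from typing import Callable

TOOLS: dict[str, Callable[..., str]] = {}
AUDIT_LOG: list[dict] = []  # every call is recorded for later audit

def tool(fn: Callable[..., str]) -> Callable[..., str]:
    """Register a function as an atomic tool the agent may invoke."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def query_db(sql: str) -> str:
    return f"(rows for: {sql})"  # stub standing in for a real database call

def invoke(name: str, **kwargs) -> str:
    # The orchestrating system, not the model, decides which tools exist
    # and logs every invocation: reliability plus auditability.
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    result = TOOLS[name](**kwargs)
    AUDIT_LOG.append({
        "ts": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "tool": name, "args": kwargs, "result": result,
    })
    return result

print(invoke("query_db", sql="SELECT status FROM cases WHERE id = 42"))
```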
Extreme Context Engineering
The most advanced systems make sophisticated use of the system prompt: detailed protocols, stopping rules, criteria for creating sub-agents, naming standards, and response formats. The model is treated as a statistical inference engine to be channeled into highly structured procedures, reducing unwanted variability.
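As an illustration only, a fragment of what such a structured system prompt might contain; the protocol items are invented for the example:

```python
# Illustrative fragment of a highly structured system prompt.
SYSTEM_PROMPT = """\
ROLE: compliance-report drafting agent.
PROTOCOL:
  1. Restate the task in one sentence before acting.
  2. If a required document is missing, STOP and ask the user.
  3. Never invent figures; cite the source file for every number.
STOPPING RULE: after 3 failed verification attempts, escalate to a human.
SUB-AGENTS: spawn one per document; never more than 5 in parallel.
OUTPUT FORMAT: JSON with keys "summary", "findings", "sources".
"""
```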
This engineering “around” the LLM is today’s true frontier for application reliability, and represents natural ground for a company like ours with strong expertise in architectures, databases, and systems integration.
System 2 and the Mathematics of Reasoning
In contemporary debate, “System 2” refers to the set of techniques aimed at introducing slower, deliberative, and verifiable forms of thinking above (or alongside) the fast, associative “System 1” behavior of LLMs. The idea borrows the psychological distinction between intuitive and rational systems and translates it into computational mechanisms: planning, verification, step decomposition, formal control.
In this direction lies the use of more refined mathematics to control reasoning coherence:
- topological and optimization geometry techniques to analyze the structure of the solution space and make the model’s trajectory more stable during reasoning;
- explicit verification and search methods (e.g., tree-of-thoughts, guided exploration, controlled sampling) to explore multiple reasoning chains and choose those consistent with logical constraints or cost objectives (see the sketch after this list);
- integration with symbolic engines or proof assistants to obtain formal guarantees on critical mathematical or logical steps.
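As a toy illustration of the search-based techniques mentioned above, here is a best-first exploration in the spirit of tree-of-thoughts; `propose` and `score` are stubs standing in for model calls and a verifier:

```python
import heapq

def propose(state: str) -> list[str]:
    # Stub: in a real system an LLM proposes candidate next reasoning steps.
    return [f"{state} -> step{i}" for i in range(3)]

def score(state: str) -> float:
    # Stub: a verifier or cost model would rate each partial chain;
    # here chains containing "step0" are (arbitrarily) preferred.
    return state.count("step0") - 0.01 * len(state)

def tree_search(root: str, depth: int = 3, beam: int = 2) -> str:
    # Expand every chain in the frontier, then keep only the `beam`
    # highest-scoring ones: controlled exploration instead of a single
    # greedy chain of thought.
    frontier = [root]
    for _ in range(depth):
        candidates = [nxt for state in frontier for nxt in propose(state)]
        frontier = heapq.nlargest(beam, candidates, key=score)
    return max(frontier, key=score)

print(tree_search("task: check AML rule 7"))
```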
This “mathematical prosthesis” doesn’t magically transform LLMs into fully logical systems, but it drastically reduces gross errors and inconsistencies, making their use safer in regulated and mission-critical contexts. It is one of the trajectories we consider strategic for applications in regulated areas such as AML, compliance, and public administration.
Deep Agents and the Fastal Model
So-called deep agents (or Agents 2.0) represent the evolution of “single-loop” agents toward complex architectures capable of handling long, multi-step tasks distributed over time. Examples like Claude Code show agents that plan, write files, execute code, verify results, iterate, and document the process autonomously, maintaining coherent state across iterations.
These systems rest on four fundamental pillars (a minimal sketch follows the list):
- Explicit planning: the model decides when to stop to plan, decompose the problem, define milestones and success criteria before acting.
- Specialized sub-agents: an orchestrator delegates tasks to specific agents (for scraping, code-editing, data analysis, user interaction) with well-defined responsibilities.
- Memory and file system: work doesn’t live only in context but in persistent files and structures that agents read and update over time.
- Rigorous human-machine collaboration protocols: standards for file formats, rules for asking user confirmation, stopping criteria, detailed logs for audit.
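A deliberately simplified sketch of the four pillars working together; the sub-agents and the milestone format are invented for illustration:

```python
from dataclasses import dataclass, field

@dataclass
class Plan:
    milestones: list[str]
    done: list[str] = field(default_factory=list)

def scraper_agent(task: str) -> str:    # stub sub-agents with narrow,
    return f"scraped data for {task}"   # well-defined responsibilities

def analyst_agent(task: str) -> str:
    return f"analysis of {task}"

SUB_AGENTS = {"scrape": scraper_agent, "analyze": analyst_agent}

def orchestrate(goal: str, confirm: bool = True) -> Plan:
    # Pillar 1: explicit planning, with milestones defined before acting.
    plan = Plan(milestones=[f"scrape:{goal}", f"analyze:{goal}"])
    for milestone in plan.milestones:
        kind, _, task = milestone.partition(":")
        # Pillar 4: a human-in-the-loop protocol before each step.
        if confirm and input(f"Run {milestone}? [y/n] ").strip() != "y":
            break
        plan.done.append(SUB_AGENTS[kind](task))  # Pillar 2: delegation
    # Pillar 3 (not shown): `plan` would be persisted to disk so later
    # runs can resume from the same state.
    return plan

print(orchestrate("client 42 risk review", confirm=False))
```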
For us, this is the truly interesting landscape: not yet another “bigger” model, but an ecosystem of deep agents, tools, and controls that transform LLMs into components of intelligent information systems. This is where we are directing our research and development activities:
- design of vertical deep agents for regulated domains (AML, compliance, PA), where reasoning traceability is crucial;
- experimentation with System 2 architectures that combine LLMs, symbolic engines, and structured memory to reduce errors and improve verifiability;
- development of reusable frameworks and patterns to integrate these agents into enterprise legacy systems, with particular attention to security, governance, and data sovereignty.
In summary, if the “new model → amazement → bubble” cycle is slowing down, the “better architecture → more reliable process → business value” cycle has just begun. It is on this second cycle that we have chosen to invest, convinced that the real innovation of the coming years will be less flashy but much more transformative for daily work.