AI in 2025: A Combinatorial Explosion of Possibilities, but NOT AGI

Community Article Published January 4, 2025

Opinion paper by Charles Fadel


Introduction

Waymo, the robotaxi service, provides an interesting analogy: although it functions “up to Level 4” (while requiring expensive behind-the-scenes human intervention), it is now being deployed by the tens of thousands. It does not have to be perfect to be useful.

GenAI is now in the same situation: its problems are well known (biases, lack of a world model, hallucinations/confabulations, “jagged” capabilities, etc.), but it is rapidly transitioning from a science phase (and the hype of “scaling at all costs”) to an engineering phase in which a significant number of development vectors are at play. These vectors are reviewed herein. The combinatorial possibilities they offer preclude any hope of forecasting what capabilities might arise within 2025 (“combinatorial” because the vectors will interact with each other in unforeseeable ways). This list of 23 development vectors not only underscores the combinatorial complexity at play but also pinpoints the areas with the highest transformative potential (marked in bold italics). Whether you’re an engineer, researcher, or enthusiast, these insights offer a roadmap to understanding GenAI's pivotal transition into its engineering phase.
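
To give a rough sense of scale for the word “combinatorial”, the short sketch below counts how many distinct groupings of the 23 vectors exist. It assumes, purely for illustration, that any subset of two or more vectors could interact; the variable names are hypothetical and not part of the original argument.

```python
from math import comb

N_VECTORS = 23  # number of development vectors listed in this article

# Ways to pick exactly two or exactly three vectors that might interact.
pairs = comb(N_VECTORS, 2)
triples = comb(N_VECTORS, 3)

# All subsets of two or more vectors: every subset, minus the empty set
# and the 23 single-vector subsets.
groupings = 2 ** N_VECTORS - N_VECTORS - 1

print(f"pairs: {pairs}, triples: {triples}, groupings of 2+: {groupings}")
```

Even under this simplified counting, the number of possible interactions grows from hundreds of pairs to millions of larger groupings, which is the intuition behind the forecasting difficulty.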

1. Datasets:

  • Specialized: As early as two years ago, Google and Stanford, as well as Bloomberg and Columbia, trained specialized LLMs for health care and finance, respectively. The hope is that an LLM trained on a specialty dataset will give more accurate, less “hallucinatory” responses than broad consumer LLMs.
  • Curated: The first task of data science is to clean the data before any computation is run on it. This process minimizes biases, reduces noise, and improves LLM accuracy across various tasks.
  • Synthetic: There is a movement afoot to generate synthetic data to address issues such as privacy concerns, a lack of labeled data, and the need for edge cases that test robustness. The advantage of this approach is that it can generate near-infinite amounts of data, unconstrained by what real data happens to exist.
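
As a concrete illustration of the “Curated” point, here is a minimal sketch of two common cleaning steps: dropping very short records and removing duplicates. The function names, the sample records, and the `min_words` threshold are assumptions made for this example; real pipelines add many more filters, and this sketch does nothing about bias.

```python
import hashlib
import re


def normalize(text: str) -> str:
    """Collapse whitespace and lowercase so near-identical copies compare equal."""
    return re.sub(r"\s+", " ", text).strip().lower()


def curate(records, min_words=3):
    """Drop very short records and duplicates (compared after normalization)."""
    seen = set()
    kept = []
    for raw in records:
        norm = normalize(raw)
        if len(norm.split()) < min_words:
            continue  # too short to be useful (threshold is an assumption)
        digest = hashlib.sha256(norm.encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # same content already kept
        seen.add(digest)
        kept.append(raw.strip())
    return kept


# Hypothetical sample records, including a duplicate and a noise line.
sample = [
    "Aspirin reduces fever.  ",
    "aspirin   reduces fever.",
    "ok",
    "Bonds pay a fixed coupon.",
]
cleaned = curate(sample)
print(cleaned)
```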

2. Training:

  • Post-Training + RLHF: Post-training refines a pre-trained LLM by exposing it to domain-specific data. RLHF (reinforcement learning from human feedback) aligns the model's outputs with human preferences. Both improve LLM performance.
  • Sparser/Distilled: Sparse training selectively activates important parameters, while distillation transfers knowledge from a larger model to a smaller one. Both reduce the complexity of the underlying neural networks, improving efficiency while aiming to give up little performance.
  • Knowledge Graphs: KGs are integrated into LLMs to provide structured, relational data by mapping domains as graphs. They enable LLMs to perform more factually grounded reasoning and to handle more complex queries.
  • Transparent: Designing and documenting the training process “transparently” allows users to understand the model's data sources and decision-making processes. This makes the model's limitations and behaviors more interpretable and auditable.
  • Physical world: The lack of real-world models is a significant drawback for LLMs. New efforts such as Genesis and World Labs might resolve some of the difficulty by modeling physical realities, even if initially for the sake of robotics.
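
The distillation idea mentioned above can be sketched with the commonly used temperature-softened loss: the smaller “student” model is penalized by the KL divergence between its softened output distribution and that of the larger “teacher”. This is a minimal pure-Python illustration under assumed toy logits, not a description of how any particular vendor trains its models.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with a temperature that softens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by temperature squared."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


# Hypothetical logits over three classes.
teacher = [2.0, 1.0, 0.1]
good_student = [1.9, 1.1, 0.1]
poor_student = [0.1, 1.0, 2.0]

print(distillation_loss(teacher, good_student))
print(distillation_loss(teacher, poor_student))
```

A student whose outputs track the teacher's closely yields a smaller loss than one that ranks the classes in the opposite order, which is the signal used to transfer knowledge into the smaller model.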

3. LLMs:

  • Multimodal: Multimodality has been introduced over the past two years in the form of image, audio, and video generation, with steady progress.
  • SLMs: Small language models such as Microsoft Phi-3 have become powerful enough to be useful on client devices and for task-specific agents. They are optimized for tasks requiring structured representations such as tables, graphs, and domain-specific schemas, but also for lightweight and real-time applications.
  • Context Adherence: An AI model's ability to maintain, interpret, and act appropriately on the context of a given dataset or conversation. This capability ensures that the AI understands the relationships within the data or conversation, for more accurate outputs.
  • Continual Learning: The ability of an LLM to remain current, fix its biases, etc. is of course important. But given agency and a closed feedback loop, it could theoretically self-improve to vast potency (as self-play reinforcement learning has demonstrated in learning games, for instance). This is definitely an area to keep an eye on.

4. Inference:

  • “Reasoning”/Inference compute: The recent introduction of OpenAI's o1-class models, rapidly followed by o3 (with as-yet unclear capabilities), represents a major departure from brute-force scaling of training toward models that “reflect” for longer at inference time to improve their answers.
  • “Metacognition”: A model's ability to monitor, evaluate, and adjust its own reasoning processes to ensure accurate conclusions. It involves self-awareness of cognitive steps, enabling improvements in decision-making and error correction during complex problem-solving.
  • Inference Processors: NVIDIA crushed its would-be competitors in the GPU race, but inference processors represent an area where NVIDIA CUDA compatibility is not as critical. This allows competitors (Groq, Cerebras, Google, etc.) to offer alternative devices that vastly accelerate inference.
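
One simple, publicly described way to trade extra inference compute for better answers is to sample several candidate answers and keep the most frequent one (often called self-consistency or majority voting). The sketch below illustrates only that general idea; it is not a description of how o1 or o3 work internally, and `noisy_solver` is a hypothetical stand-in for a sampled model response.

```python
import random
from collections import Counter


def noisy_solver(question, rng):
    """Hypothetical stand-in for one sampled answer: correct about 60% of the time."""
    correct = "4"
    return correct if rng.random() < 0.6 else rng.choice(["3", "5"])


def majority_vote(answers):
    """Return the most frequent answer among the samples."""
    return Counter(answers).most_common(1)[0][0]


def answer_with_budget(question, n_samples, seed=0):
    """Spend more inference compute by drawing n_samples answers, then vote."""
    rng = random.Random(seed)
    samples = [noisy_solver(question, rng) for _ in range(n_samples)]
    return majority_vote(samples)


for budget in (1, 5, 201):
    print(budget, answer_with_budget("What is 2 + 2?", budget))
```

With a single sample the answer is only as reliable as one draw; as the sample budget grows, the vote becomes increasingly likely to settle on the answer the solver produces most often.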

5. Agentic:

  • Model Context Protocol: Anthropic has introduced this protocol to provide a universal, open standard for connecting AI systems with data sources. Replacing a multiplicity of fragmented integrations with a single protocol will help produce better, more relevant responses. It could be extremely powerful if adopted as a standard by the industry, but given how competitive the industry is, that is not expected.
  • Agents: Software entities that take actions autonomously to achieve specific goals. They are a major focus of the AI industry. While brute-force scaling of training has plateaued, this area represents, along with “Reasoning” models, the heaviest focus.
  • Co-Pilots: Whereas Agents are autonomous, co-pilots are assistive. They are meant to assist in tasks and operations to enhance efficiency. They too are a major focus of the AI industry.
  • UX/Avatars: User interfaces, including avatars, and even something as basic as giving a system a human name, will push the already-anthropomorphizing human mind even further. Human-like attachment and advice-following have already happened and are expected to increase vastly, with uncomfortable outcomes.
  • AI Twins: These are already in use as conversational bots/avatars of the deceased (from Socrates to one's own loved ones). This trend will accelerate and encompass far more personal data of the living as well, with all the security and privacy implications that entails.
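
The distinction drawn above between autonomous agents and assistive co-pilots can be sketched as one loop with an optional approval step. Everything here is hypothetical: `scripted_policy` stands in for an LLM choosing the next action, and the two tools are toy functions. It is a sketch of the general pattern, not of any vendor's agent framework.

```python
from typing import Callable, Dict, List, Optional, Tuple

Action = Tuple[str, str]  # (tool name, argument)
Policy = Callable[[str, List[str]], Optional[Action]]
Approver = Callable[[str, str], bool]


def scripted_policy(goal: str, history: List[str]) -> Optional[Action]:
    """Hypothetical stand-in for an LLM: pick the next action from the history."""
    if not history:
        return ("search", goal)
    if len(history) == 1:
        return ("summarize", history[-1])
    return None  # goal considered reached


def run(goal: str, tools: Dict[str, Callable[[str], str]], policy: Policy,
        approve: Optional[Approver] = None, max_steps: int = 5) -> List[str]:
    """Agent mode when approve is None; co-pilot mode when a human must approve."""
    history: List[str] = []
    for _ in range(max_steps):
        action = policy(goal, history)
        if action is None:
            break
        name, arg = action
        if approve is not None and not approve(name, arg):
            break  # co-pilot: the human declined the suggested action
        history.append(tools[name](arg))
    return history


# Toy tools (assumptions for illustration only).
tools = {
    "search": lambda q: f"results for {q}",
    "summarize": lambda text: f"summary of {text}",
}

agent_trace = run("GenAI trends", tools, scripted_policy)
copilot_trace = run("GenAI trends", tools, scripted_policy,
                    approve=lambda name, arg: name == "search")
print(agent_trace)
print(copilot_trace)
```

In agent mode the loop runs to completion on its own; in co-pilot mode each suggested action is gated by a human decision, so the same policy produces a shorter trace when an action is declined.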

6. Access:

  • Cloud to Edge (clients): During 2025, multiple vendors of laptops and smartphones will be introducing devices with AI co-processors. They will be able to run LLMs locally, not just in the cloud, opening the door to multiple apps yet to be conceived.
  • Hobby kits: NVIDIA has already offered a $250 hobby kit, akin to the many robotics kits for hobbyists. This will spread knowledge about GenAI and, coupled with edge devices, could enable a myriad of interesting apps.
  • Distributed: SETI@home and Folding@home showed, from the late 1990s onward, the power of distributed computing that puts idle CPU time to use. This capability has since been demonstrated with GPUs and will spread further with edge devices.

Conclusion:

There is a vast multiplicity of effort vectors aimed at improving query response quality via better data, better training, and better inference, as well as at expanding GenAI’s capabilities via agents and client access. All these capabilities will interact with each other during 2025, and very surprising results are quite likely.

The views expressed by the author do not necessarily reflect the editorial stance of Turing Post

