Cogito v1: The Open Source Self-Improving AI | Turtles AI
Deep Cogito has announced the release of a set of open source large language models (LLMs), called Cogito v1, ranging in size from 3 billion to 70 billion parameters. These models were trained using the Iterated Distillation and Amplification (IDA) technique, which aims to overcome the limitations of human supervisors and promote iterative self-improvement towards general superintelligence. Cogito v1 models outperform other open source models of similar size, such as LLaMA, DeepSeek, and Qwen, in several standard benchmarks.
Key Points:
- Open Source Model Release: Deep Cogito has made LLMs ranging from 3B to 70B parameters available on platforms such as Hugging Face and Ollama; they can also be used via APIs on Fireworks AI and Together AI.
- IDA Training Technique: Iterated Distillation and Amplification was used for training, allowing the models' capabilities to improve iteratively without being limited by the intelligence of human supervisors.
- Superior Benchmark Performance: Cogito v1 models demonstrated superior performance compared to other open source models of equivalent size, excelling in several standard benchmarks.
- Future Model Plans: Deep Cogito plans to release even larger models, up to 671B parameters, in the coming weeks and months as it continues to improve the capabilities of its LLMs.
Deep Cogito, a San Francisco-based, venture-backed AI company, has announced the release of a set of open source large language models (LLMs), called Cogito v1. These models, ranging in size from 3 billion to 70 billion parameters, were trained using Iterated Distillation and Amplification (IDA), a strategy that allows models to improve iteratively without being constrained by the intelligence of human supervisors. IDA alternates two phases: an amplification phase, in which the model spends additional computation (for example, via subroutines that carry out more complex reasoning) to reach capabilities beyond its baseline, and a distillation phase, in which those amplified capabilities are internalized back into the model's parameters. Repeating this cycle allows the models to progressively improve their performance beyond what human-supervised training signals alone would permit.
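The amplify-then-distill cycle described above can be illustrated with a deliberately simplified toy model. Here a "model" is just a probability of answering correctly, amplification is a majority vote over several copies (extra computation yielding better answers), and distillation folds a fraction of the gain back into the base model. This is an illustrative sketch of the IDA loop's structure, not Deep Cogito's actual training procedure:

```python
from math import comb

def amplify(p: float, k: int = 5) -> float:
    """Amplification stand-in: accuracy of a majority vote over k
    independent copies of a model that is correct with probability p.
    More computation (larger k) yields a stronger composite answer."""
    majority = k // 2 + 1
    return sum(comb(k, i) * p**i * (1 - p)**(k - i)
               for i in range(majority, k + 1))

def distill(p: float, p_amp: float, retention: float = 0.9) -> float:
    """Distillation stand-in: fold a fraction of the amplified
    accuracy gain back into the base model's parameters."""
    return p + retention * (p_amp - p)

# Iterate the amplify-then-distill cycle from a weak starting model.
p = 0.6
trajectory = [p]
for _ in range(8):
    p = distill(p, amplify(p))
    trajectory.append(p)

print(f"start: {trajectory[0]:.3f}, after 8 cycles: {trajectory[-1]:.3f}")
```

Each cycle the distilled model becomes the new base, so the next amplification starts from a stronger point; accuracy climbs toward 1 without any external (human) supervisor providing better labels, which is the core intuition behind IDA.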
Cogito v1 models demonstrated superior performance compared to other open source models of equivalent size, such as LLaMA, DeepSeek, and Qwen, in several standard benchmarks. In particular, the 70 billion parameter model outperformed the recently released Llama 4 109B mixture-of-experts (MoE) model. The Cogito models are optimized for use cases including coding, function invocation, and agents, and can operate in both a standard mode and a reasoning mode, allowing greater flexibility across different tasks.
Deep Cogito has made these models available on platforms such as Hugging Face and Ollama, and they can be accessed directly via APIs on Fireworks AI and Together AI. The company plans to launch even larger models, including 109 billion, 400 billion, and 671 billion parameter versions, in the coming weeks and months, as well as improved checkpoints for each of these sizes. This roadmap underscores Deep Cogito's commitment to continue pushing the boundaries of AI toward general superintelligence, through scientific innovation and the adoption of advanced techniques such as IDA.
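Both Fireworks AI and Together AI expose OpenAI-compatible chat-completions endpoints, so a request to a Cogito model can be assembled as a standard chat payload. The sketch below only builds the JSON body (no network call); the model identifier and the system-prompt toggle for reasoning mode are assumptions for illustration, so check the provider's model catalog and Deep Cogito's usage notes for the exact values:

```python
import json

# Hypothetical model identifier -- an assumption, not a confirmed ID;
# look up the exact name in the provider's model catalog.
MODEL = "deepcogito/cogito-v1-preview-llama-70B"

def build_chat_request(prompt: str, reasoning: bool = False) -> str:
    """Assemble an OpenAI-style chat-completions payload.
    The system-prompt toggle used to switch the model into its
    reasoning mode is an assumption based on published usage notes."""
    messages = []
    if reasoning:
        messages.append({"role": "system",
                         "content": "Enable deep thinking subroutine."})
    messages.append({"role": "user", "content": prompt})
    return json.dumps({"model": MODEL, "messages": messages})

payload = build_chat_request("Write a binary search in Python.",
                             reasoning=True)
print(payload)
```

Because the payload follows the OpenAI chat-completions shape, the same body can be POSTed to either provider's endpoint with the appropriate base URL and API key, or used with an OpenAI-compatible client library.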
Deep Cogito’s strategy fits into a broader context of AI research, where techniques such as Iterated Distillation and Amplification have been proposed to address the problems of model alignment and self-improvement. Previous studies suggest that IDA is a promising approach for building robust training signals by decomposing and recomposing complex tasks, while maintaining control over models as their capabilities surpass those of humans.
With the release of Cogito v1, Deep Cogito not only contributes to the open source community, but also lays the foundation for future developments in AI, aiming to create increasingly advanced models capable of tackling complex tasks with a level of intelligence that surpasses that of humans.