Mistral AI

The French AI startup Mistral, recognized for its robust open source AI models, introduced two new models today: a math-focused model and a code-generating model for developers, the latter built on the Mamba architecture introduced late last year by outside researchers.

The Mamba architecture aims to improve on the efficiency of the transformer architecture used by most leading LLMs, replacing its attention mechanism with a selective state-space design. Unlike traditional transformer-based models, Mamba-based models promise faster inference times and longer context windows. Companies and developers, including AI21, have already released new AI models built on this architecture.
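The efficiency argument can be illustrated with a toy comparison: a state-space model processes each token by updating a fixed-size hidden state, so per-token cost and memory stay constant as the sequence grows, whereas self-attention compares every token against all previous ones. The sketch below is a didactic simplification, not Mistral's implementation; real Mamba layers use input-dependent (selective) parameters and a hardware-aware parallel scan.

```python
import numpy as np

def ssm_scan(x, A, B, C):
    """Toy linear state-space recurrence, the core idea behind Mamba-style
    models: each token updates a fixed-size hidden state, so inference cost
    grows linearly with sequence length and the state never grows.
    (Simplified: real Mamba makes A, B, C input-dependent.)"""
    h = np.zeros(A.shape[0])
    ys = []
    for x_t in x:                 # one pass over the sequence
        h = A @ h + B * x_t       # constant work per token, independent of length
        ys.append(C @ h)          # read a scalar output off the state
    return np.array(ys)

# Tiny illustrative example: 1-D input sequence, 4-dimensional state.
rng = np.random.default_rng(0)
A = 0.9 * np.eye(4)              # stable decay keeps the state bounded
B = rng.normal(size=4)
C = rng.normal(size=4)
x = rng.normal(size=16)

y = ssm_scan(x, A, B, C)
print(y.shape)                   # one output per token, constant memory
```

Because the state is fixed-size, doubling the input length simply doubles the work, which is why such models handle long inputs more gracefully than quadratic attention.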

Leveraging this new design, Mistral’s Codestral Mamba 7B provides rapid response times even with longer input texts. Codestral Mamba is particularly effective for code productivity, especially in local coding projects.

Mistral tested the model, which will be freely available on Mistral’s la Plateforme API, with inputs of up to 256,000 tokens — double the context window of OpenAI’s GPT-4o.

Benchmark tests conducted by Mistral showed Codestral Mamba outperforming other open source models such as CodeLlama 7B, CodeGemma-1.1 7B, and DeepSeek on HumanEval tests.

Developers can modify and deploy Codestral Mamba from its GitHub repository and through HuggingFace, with an open source Apache 2.0 license.

Mistral claimed that the earlier version of Codestral surpassed other code generators such as CodeLlama 70B and DeepSeek Coder 33B.

AI-powered code generation and coding assistants have become prevalent applications, with platforms like GitHub’s Copilot, powered by OpenAI, Amazon’s CodeWhisperer, and Codeium gaining traction.

Mathstral: Optimized for STEM Applications

Mistral’s second model release is Mathstral 7B, an AI model specifically designed for mathematical reasoning and scientific research. Developed in collaboration with Project Numina, Mathstral boasts a 32K context window and will be released under an Apache 2.0 open source license.

Mistral stated that Mathstral outperforms existing models of its size in mathematical reasoning tasks, and that its benchmark results improve further when more computation is spent at inference time. Users can employ it as is or fine-tune the model.
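A common way to spend more compute at inference time is majority voting: sample several solutions and keep the most frequent final answer. The sketch below shows the general technique only; it is not Mistral's specific evaluation procedure, and the sample answers are hypothetical.

```python
from collections import Counter

def majority_vote(answers):
    """Return the most frequent final answer among sampled solutions.
    Drawing more samples (more inference-time compute) typically raises
    accuracy on math benchmarks, since occasional wrong answers get
    outvoted by the consistent correct one."""
    answer, _count = Counter(answers).most_common(1)[0]
    return answer

# Hypothetical: extracted final answers from five sampled completions.
samples = ["42", "41", "42", "42", "7"]
print(majority_vote(samples))  # prints "42"
```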

“Mathstral exemplifies the excellent performance and speed trade-offs achievable when developing models for specific purposes – a development philosophy we actively promote on la Plateforme, especially with its new fine-tuning capabilities,” Mistral mentioned in a blog post.

Mathstral can be accessed through Mistral’s la Plateforme and HuggingFace.

Mistral, committed to offering its models on an open-source basis, has been steadily competing with other AI developers like OpenAI and Anthropic.

The company recently secured $640 million in series B funding, pushing its valuation to nearly $6 billion, with investments from tech giants such as Microsoft and IBM.

