Mistral AI has introduced two major updates: the release of Pixtral Large, a new multimodal model, and significant enhancements to its AI assistant platform, Le Chat.
Pixtral Large is a 124-billion-parameter model designed for both text and image processing. It builds on Mistral Large 2 and excels at understanding documents, charts, and natural images. According to Mistral's reported results, the model scores 69.4% on MathVista and outperforms competitors such as GPT-4o and Gemini-1.5 Pro on document question answering (DocVQA) and chart analysis (ChartQA). It combines a 123B-parameter multimodal decoder with a 1B-parameter vision encoder, and its 128K context window can hold at least 30 high-resolution images. This makes it suitable for complex visual tasks, including mathematical reasoning and document analysis.
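To make the context-window figure concrete, the sketch below estimates how many tokens each image could occupy if 30 images share one window. It is a back-of-the-envelope calculation, not a description of how Pixtral Large actually tokenizes images: reading "128K" as 128,000 tokens is an assumption, and the function name `tokens_per_image` and the `reserved_for_text` parameter are hypothetical names introduced for illustration.

```python
# Rough per-image token budget when several images share one context window.
# Assumptions (not from Mistral): "128K" is taken as 128,000 tokens, and the
# window is split evenly across images after setting aside room for text.

CONTEXT_TOKENS = 128_000  # assumed reading of the "128K" context window
IMAGE_COUNT = 30          # image count quoted in the text


def tokens_per_image(context_tokens: int, image_count: int,
                     reserved_for_text: int = 0) -> int:
    """Return the whole number of tokens available to each image."""
    if image_count <= 0:
        raise ValueError("image_count must be positive")
    if not 0 <= reserved_for_text <= context_tokens:
        raise ValueError("reserved_for_text must fit inside the context window")
    return (context_tokens - reserved_for_text) // image_count


if __name__ == "__main__":
    # The whole window devoted to images.
    print(tokens_per_image(CONTEXT_TOKENS, IMAGE_COUNT))
    # The same window with 8,000 tokens kept back for the prompt and answer.
    print(tokens_per_image(CONTEXT_TOKENS, IMAGE_COUNT, reserved_for_text=8_000))
```

Under these assumptions each image gets a budget of roughly four thousand tokens, which gives a sense of scale for what "high-resolution" can mean inside a single request.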
Mistral AI also updated its Le Chat platform, which now includes:
- Web search with citations
- A new "Canvas" tool for collaborative ideation
- Advanced document and image understanding powered by Pixtral Large
Users can also generate images using Black Forest Labs' Flux Pro 1.1 model. Additionally, Le Chat introduces task automation through "agents," which let users automate workflows such as receipt scanning or meeting summarization and can use Pixtral Large as their base model.
These developments reflect Mistral's strategy of providing cutting-edge AI tools for both research and commercial use while keeping them accessible through free tiers during the beta period.