The recent leak reveals that Mistral AI's chatbot platform, Le Chat, will soon integrate the Pixtral 12B multimodal AI model, enabling users to work with images in addition to text. Here's a breakdown of the key points:
Who: The leak comes from @testingcatalog, a source known for sharing early insights into tech developments. Mistral AI, a French AI startup, is behind the Pixtral 12B model.
What: Pixtral 12B is a multimodal AI model capable of processing both text and images. It can perform tasks such as creating captions for images, counting objects in photos, and answering questions about the content of images.
When: The integration of Pixtral 12B into Le Chat is expected to happen soon, though no specific timeline has been provided.
Where: The feature will be available on Mistral AI's chatbot platform, Le Chat, and possibly on its API platform, Le Platforme.
Why: This integration is significant because it marks a major advancement in AI capabilities, allowing for more versatile and interactive use cases. It positions Mistral AI as a competitor to other leading AI companies like OpenAI and Anthropic.
How: Users will be able to upload images via URLs or base64 encoding and interact with them within the chat. The model supports up to 4 images per chat and allows for editing, including adding and removing images.
The Pixtral 12B model itself is built on Mistral's Nemo 12B text model and includes a 400 million-parameter vision adapter. It is available for download under an Apache 2.0 license, allowing for free use and modification. This development underscores Mistral AI's strategy of releasing open models while offering managed versions for corporate customers, a move that has contributed to its rapid rise in the AI world, with a recent valuation of $6 billion.