Google updated its artificial intelligence assistant Gemini to version 1.5 Pro, giving the model so-called ears. The model can now listen to uploaded audio files and pull information out of them without needing a written transcript.
Google announced during its Google Cloud Next event that it is making Gemini 1.5 Pro available to the public for the first time through Vertex AI, its platform for building AI applications. Gemini 1.5 Pro was first announced in February.
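For developers who do have Vertex AI access, a request along the following lines would exercise the new audio input. This is a minimal sketch assuming the Vertex AI Python SDK; the project ID, Cloud Storage path, and preview model name below are placeholders, not details from the announcement.

```python
import vertexai
from vertexai.generative_models import GenerativeModel, Part

# Placeholder project and region -- substitute your own Vertex AI setup.
vertexai.init(project="your-project-id", location="us-central1")

# Assumed preview model ID; check Vertex AI for the current Gemini 1.5 Pro name.
model = GenerativeModel("gemini-1.5-pro-preview-0409")

# Reference an audio file stored in Cloud Storage (placeholder path).
audio = Part.from_uri("gs://your-bucket/earnings-call.mp3", mime_type="audio/mpeg")

# Ask the model to parse the recording directly, with no transcript provided.
response = model.generate_content([audio, "Summarize the key points of this recording."])
print(response.text)
```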
This new version of Gemini Pro, meant to be the middleweight model of the Gemini family, outperforms Gemini Ultra, currently the largest and most powerful model in the lineup. Gemini 1.5 Pro understands complex instructions and eliminates the need to fine-tune models.
Gemini 1.5 Pro is only available to people with access to Vertex AI and AI Studio. Currently, most people interact with Gemini language models through the Gemini chatbot. Gemini Ultra powers the Gemini Advanced chatbot, and while it’s powerful and can understand long commands, it’s not as fast as Gemini 1.5 Pro.
Gemini 1.5 Pro isn’t the only big AI model from Google getting an update. Imagen 2, the text-to-image generation model that helps power Gemini’s image creation capabilities, is getting a new feature that lets users add or remove elements from images.
Source: The Verge