Gemini 1.5 Pro Now Listens to Audio and Is Available to All
At the Google Cloud Next 2024 event in Las Vegas, Google announced that it is making Gemini 1.5 Pro available to all users. The highly anticipated model is now in public preview with a 1 million token context window, and you no longer have to sign up for the waitlist to access it.
I tried to access the Gemini 1.5 Pro model from a new Google account and the model was readily available without any wait. And all this is available for free.
That said, it does not mean you can start using the Gemini 1.5 Pro model on the Gemini portal. For now, you will have to head to aistudio.google.com to access the model. After a few months of public preview, the model will be made available on the Gemini portal, where you will likely need a Gemini Advanced subscription to use it.
Keep in mind that Gemini 1.5 Pro is a mid-tier model built on the MoE (Mixture of Experts) architecture, yet it easily beats the largest Gemini 1.0 Ultra model. And in our comparison with the GPT-4 model, Gemini 1.5 Pro showed remarkable capabilities in several tests. When Gemini 1.5 Pro debuts on the Gemini portal, expect it to perform better than GPT-4 and Claude 3’s Opus model.
Apart from that, Gemini 1.5 Pro can now process audio files too. You can upload audio files of meetings or videos, and the model can listen to them directly, without you having to generate a transcript first. This can be of immense help to people who want to pull quick, structured information out of recorded meetings or discussions.
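For readers who would rather script this than use the AI Studio interface, here is a minimal sketch of how an audio clip and a text prompt might be packaged into a single request for the Gemini API. It only builds the JSON body and does not send anything. The endpoint URL, the model ID gemini-1.5-pro-latest, and the helper name build_audio_request are assumptions for illustration and are not taken from Google's announcement, so check the current API reference before relying on them.

```python
import base64
import json

# Assumed REST endpoint and model ID; verify both against the current API docs.
API_URL = (
    "https://generativelanguage.googleapis.com/v1beta/"
    "models/gemini-1.5-pro-latest:generateContent"
)


def build_audio_request(audio_bytes: bytes, mime_type: str, prompt: str) -> dict:
    """Build a request body that pairs a text prompt with inline audio.

    Hypothetical helper: the audio is base64-encoded and placed next to the
    prompt as a second "part" of the same user message.
    """
    encoded_audio = base64.b64encode(audio_bytes).decode("ascii")
    return {
        "contents": [
            {
                "parts": [
                    {"text": prompt},
                    {"inline_data": {"mime_type": mime_type, "data": encoded_audio}},
                ]
            }
        ]
    }


def to_json(body: dict) -> str:
    """Serialize the request body so it can be sent with any HTTP client."""
    return json.dumps(body)
```

To try it, read a recording with open("meeting.mp3", "rb").read(), pass the bytes with the MIME type "audio/mpeg" and a prompt such as "Summarize the key decisions in this meeting", then POST the JSON to API_URL with your own API key. Inline data suits short clips; longer recordings would need to be uploaded separately first.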
Gemini 1.5 Pro could already process videos and images, and now audio files are supported too, which makes it a powerful multimodal model with a context length of 1 million tokens. We tested the audio processing capability of the Gemini 1.5 Pro model. Here is how it went.
How to Process Audio Files on Gemini 1.5 Pro
So this is how you can upload and process audio files on Gemini 1.5 Pro. It’s really a powerful model from the Google DeepMind team and I am excited that it’s now available to the public at large without any cost. Go ahead and try it and let us know your thoughts in the comment section below.
Arjun Sha
Passionate about Windows, ChromeOS, Android, security and privacy issues, with a penchant for solving everyday computing problems.