Google has introduced Gemini, its highly anticipated general-purpose, multimodal generative AI model, which the company claims is more powerful than OpenAI’s GPT-4.
The new LLM comes in three sizes, Nano, Pro, and Ultra, to suit different user needs. Nano is built for fast on-device tasks, Pro serves as a versatile middle tier, and Ultra is the most powerful. Ultra, however, is still undergoing safety checks and will be available next year.
Google claims Gemini has five times the computational power of GPT-4, which it says allows faster training and potentially larger model sizes. It said Gemini Ultra is the first model to outperform human experts on MMLU (Massive Multitask Language Understanding), one of the most widely used benchmarks for testing the knowledge and problem-solving abilities of AI models.
Google said Gemini Ultra excels at tasks involving deliberate reasoning, surpassing previous state-of-the-art models. Furthermore, it excels at image benchmarks, demonstrating native multi-modality and complex reasoning abilities.
The standard approach to creating multi-modal models involves training separate components for different modalities and then combining them. Gemini, by contrast, was designed to be natively multi-modal and was pre-trained on different modalities from the beginning. Google says this design allows Gemini to understand and reason about all kinds of inputs far better than existing multi-modal models.
Gemini was trained to recognize and understand text, images, audio, and more simultaneously, which makes it proficient in explaining reasoning in complex subjects like math and physics.
Gemini’s sophisticated multi-modal reasoning capabilities can help make sense of complex written and visual information. According to Google, the model can extract insights from hundreds of thousands of documents, enabling breakthroughs at digital speeds in many fields from science to finance.
Gemini can understand, explain, and generate high-quality code in the world’s most popular programming languages. Its ability to reason about complex information places it among the leading foundation models for coding globally.
Google trained Gemini on its AI-optimized infrastructure using Google’s in-house designed Tensor Processing Units (TPUs), making it less subject to shortages of the GPUs that GPT-4 and other models depend on.
Google said it designed Gemini to be its most reliable and scalable model to train, and its most efficient to serve. The company said it is adding new protections to account for Gemini’s multi-modal capabilities, considering potential risks at each stage of development.
Gemini is now rolling out across a range of products and platforms. For instance, Google’s chatbot, Bard, will use a fine-tuned version of Gemini Pro for more advanced reasoning, planning, understanding, and more.