At the Google I/O developer conference in May 2023, CEO Sundar Pichai announced the company’s upcoming artificial intelligence (AI) system, Gemini.
The large language model (LLM) is being developed by the Google DeepMind division (Brain Team + DeepMind). It could compete with AI systems like ChatGPT from OpenAI and possibly outperform them.
While details remain scarce, here is what we can piece together from the latest interviews and reports about Google Gemini.
Pichai stated that Gemini combines the strengths of DeepMind’s AlphaGo system, known for mastering the complex game Go, with extensive language modeling capabilities.
He said it is designed from the ground up to be multimodal, integrating text, images, and other data types. This could allow for more natural conversational abilities.
Pichai also hinted at future capabilities like memory and planning that could enable tasks requiring reasoning.
In an update to his professional bio over the summer, Google Chief Scientist Jeffrey Dean said Gemini is one of the “next-generation multimodal models” he is co-leading.
He stated it will utilize Pathways, Google’s new AI infrastructure, to enable scaling up training on diverse datasets.
This hints at Gemini potentially being the largest language model created to date, likely exceeding the size of GPT-3 with over 175 billion parameters.
Additional details came from Demis Hassabis, CEO of DeepMind.
In June, he told Wired that techniques from AlphaGo, like reinforcement learning and tree search, may give Gemini new abilities like reasoning and problem-solving.
Hassabis stated Gemini is a “series of models” that will be made available in different sizes and capabilities.
He also mentioned Gemini may utilize memory, fact-checking against sources like Google Search, and improved reinforcement learning to enhance accuracy and reduce hazardous hallucinated content.
In a September Time interview, Hassabis reiterated that Gemini aims to combine scale and innovation.
He said incorporating planning and memory is in the early exploratory stages.
Hassabis also stated Gemini may employ retrieval methods to output entire blocks of information, rather than word-by-word generation, to improve factual consistency.
He revealed that Gemini builds on DeepMind’s multimodal work like the image captioning system Flamingo.
Overall, Hassabis said Gemini is showing “very promising early results.”
In an interview with Wired, published a few days later, Pichai provided the most unambiguous indication of how Gemini fits into Google’s product roadmap.
He stated conversational AI systems like Bard are “not the end state” but waypoints leading towards more advanced chatbots.
Pichai said Gemini and future iterations will ultimately become “incredible universal personal assistants” integrated throughout people’s daily lives in areas like travel, work, and entertainment.
He reiterated that Gemini will combine strengths of text and images, stating that today’s chatbots will “look trivial” in comparison within a few years.
OpenAI CEO tweeted what appeared to be a response to a paywalled-article reporting that Google Gemini could outperform GPT-4.
Are the numbers wrong?
— Elon Musk (@elonmusk) August 30, 2023
There was no official response to the follow-up question by Elon Musk on whether the numbers provided by SemiAnalysis are correct.
This suggests Gemini may soon be ready for a beta release and integration into services like Google Cloud Vertex AI.
While the news about Gemini is promising thus far, Google isn’t the only company reportedly ready to launch a new LLM to compete with OpenAI.
Meta most recently announced the release of Llama 2, an open-source AI model, in partnership with Microsoft. The company appears dedicated to responsibly creating AI that is more accessible.
What we know so far indicates Gemini could represent a significant advancement in natural language processing.
The fusion of DeepMind’s latest AI research with Google’s vast computational resources makes the potential impact challenging to overstate.
If Gemini lives up to expectations, it could drive a change in interactive AI, aligning with Google’s ambitions to “bring AI in responsible ways to billions of people.”
The latest news from Meta and Google comes a few days after the first AI Insight Forum, where tech CEOs privately met with a portion of the United States Senate to discuss the future of AI.
Featured image: VDB Photos/Shutterstock
Last night at about 4:20 pm ET, Google began to roll out an update to the Helpful content system – the Google September 2023 Helpful Content Update. What is new is not the part about finding hidden gems (that will come later) but a new “improved classifier.” Google wrote on X, The September 2023 helpful...
Given the popularity of short-form video, it’s only logical that all businesses should be considering the entertainment value of the medium, and how it could potentially be used to deliver your brand message. The right video clip could give your brand awareness efforts a huge boost, and the DIY ethos of short-form content also makes...