Google’s Gemini is still playing catch-up, but it promises a bright future

Going into this, I was going to talk about how excited and hopeful I was that Google was finally ready to take on OpenAI and, by extension, Microsoft. It was difficult not to be excited, especially after seeing all of the different promotional videos, such as Mark Rober using Bard with Gemini to find the best design of a paper airplane. We also got a “Hands-on with Gemini” video (below), which gives us a glimpse of what could be possible with future iterations.

Then the other shoe dropped, and Google got caught with its hands in the cookie jar.

While the video is composed of different prompts showcasing what Gemini can do, it’s not as clear-cut as Google made it seem. Shortly after the original announcement went live, Bloomberg published a piece suggesting that something was awry.

As it turns out, the voice you hear responding to the questions isn’t actually Gemini. In its statement to Bloomberg, Google says the video was made “using still image frames from the footage, and prompting via text.” Additionally, the YouTube video description was updated, stating, “For the purposes of this demo, latency has been reduced, and Gemini outputs have been shortened for brevity.”

Something else that Google hasn’t clarified is whether the video is a showcase of Gemini Ultra or whether there’s something else going on. An accompanying post was shared over at the Google for Developers blog, providing more insight into “how it’s made.” But again, the only description for what version of Gemini is being used is the “multimodal model Gemini,” without explicitly stating whether it’s Ultra or something else created for demonstration.

This comes after OpenAI recently introduced the ability to have a voice conversation with ChatGPT and get vocal responses. I can’t help but feel as though this is what disappoints me the most. I do not doubt that Google is capable of offering something similar, but for whatever reason, the company just went the “easy route” to avoid what happens to other companies during tech demos.

For reference, Amazon recently introduced its next-generation version of Alexa, powered by generative AI. During the announcement, Alexa struggled with the speed of some responses, and there were a couple of times when the original question needed to be asked again.

What this did was reaffirm that the progression of generative AI is fluid and constantly evolving. But I have to give Amazon some kudos because it didn’t just shy away and show off an edited promotional video.

Google needs to prove it can deliver on Gemini

If I’m coming off as a bit repetitive, it’s for a reason. Google arguably announced something that will affect everyone (Gemini is coming to Search, eventually), and the company wasn’t transparent about everything. Instead, Google went for the shock value, which it nailed, but there’s a bigger problem.

What if Gemini Ultra is released “early next year” but can’t do what the hands-on video shows? How many times has a company showcased something, only to fall short on the claims and then “move the goalposts” to avoid too much backlash?

Or worse, what if this multimodal Gemini model never sees the light of day? Do you remember when Google wowed everyone at I/O 2017 by removing a chain-link fence that was obstructing a child playing baseball? This specific functionality has never been released, and while you can argue that this is what Magic Eraser is designed to do, I urge you to try and take a picture of something with that kind of obstruction. You’ll either end up frustrated and give up or will have a picture with a bunch of “smudges” from Magic Eraser.

Circling back to the potential impact of Gemini on our daily lives, I’m left with more questions than answers. Assistant with Bard is expected to come sometime next year, and while Bard has been infused with Gemini Pro, it’s unclear whether this is the same experience coming to our phones. The more likely scenario is that a different version of Bard will be infused with Gemini Nano, which will then be integrated into Google Assistant.

Then, there’s the question of device compatibility, as Gemini Nano is currently limited to only the Pixel 8 Pro. Despite using the same Tensor G3 chip, owners of the regular Pixel 8 have been left out. This raises the question of what hardware constraints there are with Gemini.

Using Gemini in Google Recorder on the Pixel 8 Pro

(Image credit: Google)

As far as we can tell, the only relevant difference between the Pixel 8 and Pixel 8 Pro is the amount of RAM (8GB vs. 12GB). It doesn’t seem as though the Tensor G3 on the smaller phone is running at lower clock speeds, and everything else appears to be the same between them. So why couldn’t Google just bring these features to both devices?

When it comes to bringing Gemini to the best Android phones, it seems Google is largely leaving that up to app developers with the help of AICore. However, this is an Android 14-only system service, so you’ll need a phone that can be updated to Android 14 and hope that app developers jump on board and use AICore.

On the bright side, the AICore announcement post does reveal that Gemini Nano and AICore run on “NPUs in flagship Qualcomm Technologies, Samsung S.LSI, and MediaTek silicon.”

Samsung Galaxy AI blog post hero

(Image credit: Samsung)

I’m still excited about Gemini

If you put aside the edited showcase video, Google does deserve some recognition here. The company hasn’t just stood on the sidelines, letting OpenAI and Microsoft reap all of the rewards. Well, it has, but I’m hoping the announcement of Gemini will result in better experiences for everyone.

Ever since Google Bard was introduced back in February, it has been pretty obvious that this wasn’t an actual competitor to OpenAI’s ChatGPT. Never mind the regular hallucinations; even as a general-purpose chatbot, Bard failed to hit the mark.

Since Bard’s initial launch, Google has unveiled Bard Extensions, which aim to let you get summaries of emails, access documents, and book a trip. However, Extensions are limited to what Google can provide and only tie into Google services, and the feature pales in comparison to ChatGPT’s platform.

But with this new large language model, Google is aiming to bring Gemini into various facets of our lives. That starts with Gemini Pro in Bard, while Gemini Nano is making its way to Gboard and Google’s native Recorder app in a limited fashion, beginning with the Pixel 8 Pro.

Multi-modal overview of Google Gemini Ultra, Pro, and Nano

(Image credit: Google)

The company also took the correct approach in putting (most of) its cards on the table while still owning up to the mistake it made. If anything, I’m disappointed that we have to wait even longer for Google’s GPT-4 competitor, as Gemini Ultra won’t be arriving until next year.

Now, I’m going to keep hoping for a surprise update that supercharges Assistant with Bard and Gemini. Or maybe, just maybe, one that brings a Gemini Copilot to ChromeOS in the same way that Microsoft has done with Windows.

Editor’s note: We reached out to Google for comment, but did not receive a response in time for publishing. We will update this piece with more information as it’s provided.