Hacker News

Google is absolutely running away with it. The greatest trick they ever pulled was letting people think they were behind.



Their models might be impressive, but their products absolutely suck donkey balls. I’ve given Gemini web/CLI two months and ran back to ChatGPT. Seriously, it would just COMPLETELY forget context mid-dialog. When asked about improving air quality, it just gave me a list of (mediocre) air purifiers without asking for any context whatsoever, and I can list thousands of conversations like that. Shopping or comparing options is just nonexistent. It uses Russian propaganda sources for answers and switches to Chinese mid-sentence (!) while explaining some generic Python functionality. It’s an embarrassment and I don’t know how they justify the 20-euro price tag on it.

I agree. On top of that, in true Google style, basic things just don't work.

Any time I upload an attachment, it just fails with something vague like "couldn't process file", whether that's a simple .md or .txt with fewer than 100 lines or a PDF. I tried making a Gem today; it just wouldn't let me save it, with some vague error too.

I also tried having it read and write stuff to "my stuff" and Google Drive. But it would consistently write without being able to read it back. Or it would read one file from Google Drive and ignore everything else.

Their models are seriously impressive. But as usual Google sucks at making them work well in real products.


I don't find that at all. At work, we've no access to the API, so we have to force-feed a dozen (or more) documents, code files, and instruction prompts through the web interface's upload feature. The only failures I've ever had in well over 300 sessions were due to connectivity issues, not interface failures.

Context window blowouts? All the time, but never document upload failures.


I'm talking about Gemini in the app and on the web. As well as AI studio. At work we go through Copilot, but there the agentic mode with Gemini isn't the best either.

Honestly, this is as Google a product as you can get. Prizes for some, beatings for others.

What I love about Gemini mobile is that, if you look at the app wrong, it completely loses the response. It still generates it (and uses up your quota), but it never displays it!

This is the company that made Android, and it can't make an Android app that fetches a response from a server. Astonishing.


It's so capable at some things, and garbage at others. I uploaded a photo of some words for a spelling bee and asked it to quiz my kid on the words. The first word it asked wasn't on the list. After multiple attempts to get it to ask only the words in the uploaded pic, it did, and then it would get the spellings wrong in the Q&A. I gave up.

I had it process a photo of my D&D character sheet and help me debug it as I'm a n00b at the game. Also did a decent, although not perfect, job of adding up a handwritten bowling score sheet.

How can the models be impressive if they switch to Chinese mid-sentence? I've observed those bizarre bugs too. Even GPT-3 didn't have those. Maybe GPT-2 did. It's actually impressive that they managed to botch it so badly.

Google is great at some things, but this isn't it.


Antigravity is an embarrassment.

The models feel terrible, somehow, like they're being fed terrible system prompts.

Plus the damn thing kept crashing and asking me to "restart it". What?!

At least Kiro does what it says on the tin.


My experience with Antigravity is the opposite. It's the first time in over 10 years that an IDE has managed to pull me a bit out of the JetBrains suite. I didn't think that was possible, as I'm a hardcore JetBrains user/lover.

It's literally just vscode? I tried it the other day and I couldn't tell it apart from windsurf besides the icon in my dock

Yeah, same here. Even though it's VS Code, I'm still using it and don't plan to renew IntelliJ again. Gemini was crap but Opus smashes it.

It is Windsurf, isn't it? Why would you expect it to be different?


Have you tried Cursor or VS Code with Github Copilot in agent mode (recently, not 3 or 6 months ago)?

I've recently tried a buuuuunch of stuff (including Antigravity and Kiro) and I really, really, could not stomach Antigravity.


I disagree. At least in my brief test drive, when used with Claude, the performance was on par with Cursor except that the Agent could actually interact with the terminal properly (Cursor is comically bad at this for some reason).

When the (generous!) Claude credits dry up functionality stops however. Gemini is as useless in Antigravity as everywhere else.


I've used their Pro models very successfully in demanding API workloads (classification, extraction, synthesis). On benchmarks it crushed the GPT-5 family. Gemini is my default right now for all API work.
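For context, a minimal sketch of what a classification-style API workload like that can look like. The SDK call shape and model id below are assumptions (based on the google-genai Python SDK), and `build_classification_prompt` is a hypothetical helper, not anything from Google's API:

```python
# Hedged sketch of a classification-style API workload. The commented-out
# SDK call and model id are assumptions (google-genai Python SDK);
# build_classification_prompt is a hypothetical helper.

def build_classification_prompt(text: str, labels: list[str]) -> str:
    """Constrain the model to answer with exactly one of the given labels."""
    return (
        "Classify the following text into exactly one of these labels: "
        + ", ".join(labels)
        + ".\nRespond with the label only.\n\nText: "
        + text
    )

# The actual call would look roughly like this (requires an API key):
# from google import genai
# client = genai.Client()
# resp = client.models.generate_content(
#     model="gemini-3-pro-preview",  # assumed model id
#     contents=build_classification_prompt(
#         "Great battery life, terrible screen.",
#         ["positive", "negative", "mixed"],
#     ),
# )
# print(resp.text)
```

Constraining the output to a fixed label set like this is what makes such workloads easy to evaluate in bulk, which is presumably why they benchmark so cleanly.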

It took me only a week, however, to ditch Gemini 3 as a user. The hallucinations were off the charts compared to GPT-5. I've never even bothered with their CLI offering.


Sadly true.

It is also one of the worst models to have a sort of ongoing conversation with.


It’s all context/use case; I’ve had weird things too, but if you only use markdown inputs and specific prompts, Gemini 3 Pro is insane, not to mention the context window

Also, because of the long context window (1 million tokens on Thinking and Pro! Claude and OpenAI only have 128k), Deep Research is the best

That being said, for coding I definitely still use Codex with GPT 5.3 XHigh lol


100x agree. It gives inconsistent edits and would regularly try to do things I explicitly told it not to.

Agreed on the product. I can't make Gemini read my emails in Gmail. One day it says it doesn't have access; the next day it says "Query unsuccessful". Claude Desktop has no problem reaching Gmail, on the other hand :)

I don't have any of these issues with Gemini. I use it heavily every day. A few glitches here and there, but it's been enormously productive for me. Far more so than ChatGPT, which I find mostly useless.

And it gives incorrect answers about itself and Google’s services all the time. It kept pointing me to nonexistent UI elements. At least it apologizes profusely! ffs

Their models are absolutely not impressive.

Not a single person is using it for coding (outside of Google itself).

Maybe some people on a very generous free plan.

Their model is a fine mid-2025 model, backed by enormous compute resources and an army of GDM engineers to help the “researchers” keep the model on task as it traverses the “tree of thoughts”.

But that isn’t “the model” that’s an old model backed by massive money.


Uhh, just false.

It's just poop tier.

Come on.

Worthless.

Do you have any market counterpoints?

Market counterpoints that aren't really just a repackaging of:

  1. "Google has the world's best distribution" and/or  
  2. "Google has a firehose of money that allows them to sell their 'AI product' at an enormous discount"?
Good luck!

These benchmarks are super impressive. That said, Gemini 3 Pro benchmarked well on coding tasks, and yet I found it abysmal. A distant third behind Codex and Claude.

Tool calling failures, hallucinations, bad code output. It felt like using a coding model from a year ago.

Even just as a general-use model, somehow ChatGPT has a smoother integration with web search (than Google!!), knowing when to use it and not needing me to prompt it directly multiple times to search.

Not sure what happened there. They have all the ingredients in theory but they've really fallen behind on actual usability.

Their image models are kicking ass though.


Peacetime Google is not like wartime Google.

Peacetime Google is slow, bumbling, bureaucratic. Wartime Google gets shit done.


OpenAI is the best thing that happened to Google apparently.

Just not search. The search product has pretty much become useless over the past three years, and the AI answers often only get back to the level of five years ago. This creates a sense that things are better, but really it’s just become impossible to get reliable information from an avenue that used to work very well.

I don’t think this is intentional, but I think they stopped fighting SEO entirely to focus on AI. Recipes are the best example: completely gutted, and almost all recipe sites (and therefore the entire search page) are run by the same company. I didn’t realize how utterly consolidated huge portions of information on the internet were until every recipe site about 3 months ago simultaneously implemented the same anti-adblock.


The search product became useless on a particular day in 2019, as discussed on HN some time ago:

https://news.ycombinator.com/item?id=40133976


Competition always is. I think there was a real fear that their core product was going to be replaced. They're already cannibalizing it internally, so it was THE wake-up call.

Next they compete on ads...

Wartime Google gave us Google+. Wartime Google is still bumbling, and despite OpenAI's numerous missteps, I don't think it has to worry about Google hurting its business yet.

I do miss Google+. For my brain / use case, it was by far the best social network out there, and the Circle friends and interest management system is still unparalleled :)

Google+ was fun. Failed in the market though.

Apple made a social network called Ping. Disaster. MobileMe was silly.

Microsoft made Zune and the Kin 1 and Kin 2 devices and Windows phone and all sorts of other disasters.

These things happen.


I have a hypothesis that Google+ just wasn't addictive. Which is a good thing now, but not back then.

Windows Phone was actually good. I would even say that my Lumia something was one of the best experiences ever on mobile. G+ was also good. Efficient markets mean that you can "extract" rent, via selling data or attention etc., not really what is good.

But wait two hours for what OpenAI has! I love the competition, and how someone just a few days ago was claiming that ARC-AGI-2 was proof that LLMs can't reason. The goalposts will shift again. I feel like most of human endeavor will soon be just about trying to continuously show that AIs don't have AGI.

"AGI" doesn't mean anything concrete, so it's all a bunch of non-sequiturs. Your goalposts don't exist.

Anyone with any sense is interested in how well these tools work and how they can be harnessed, not some imaginary milestone that is not defined and cannot be measured.


I agree. I think the emergence of LLMs has shown that AGI really has no teeth. I think for decades the Turing test was viewed as the gold standard, but it's clear that there doesn't appear to be any good metric.

The Turing test was passed in the 80s; somehow it has remained relevant in pop culture despite the fact that it's not a particularly difficult technical achievement.

It wasn’t passed in the 80s. Not the general Turing test.

c. 2022 for me.

> I feel like most of human endeavor will soon be just about trying to continuously show that AI's don't have AGI.

I think you overestimate how much your average person-on-the-street cares about LLM benchmarks. They already treat ChatGPT or whichever as generally intelligent (including to their own detriment), are frustrated about their social media feeds filling up with slop and, maybe, if they're white-collar, worry about their jobs disappearing due to AI. Apart from a tiny minority in some specific field, people already know themselves to be less intelligent along any measurable axis than someone somewhere.


Soon they can drop the bioweapon to welcome our replacement.

Not in my experience with Gemini Pro and coding. It hallucinates APIs that aren't there. Claude does not do that.

Gemini has flashes of brilliance, but I regard it as unpolished: some things work amazingly, some basics don't work.


It's very hard to tell the difference between bad models and stinginess with compute.

I subscribe to both Gemini ($20/mo) and ChatGPT Pro ($200/mo).

If I give the same question to "Gemini 3.0 Pro" and "ChatGPT 5.2 Thinking + Heavy thinking", the latter is 4x slower and it gives smarter answers.

I shouldn't have to enumerate all the different plausible explanations for this observation. Anything from Gemini deciding to nerf the reasoning effort to save compute, versus TPUs being faster, to Gemini being worse, to this being my idiosyncratic experience, all fit the same data, and are all plausible.


You nailed it. Gemini 3 Pro seems very "lazy" and seems to never reason for more than 30 seconds, which significantly impacts the quality of its outputs.

Have you used Gemini CLI, and then Codex? Gemini is so trigger-happy; the moment you don’t tell it "don’t make any changes", it runs off and starts doing all kinds of unrelated refactorings. This is the opposite of what I want. I want considerate, surgical implementations. I need to have a discussion of the scope, and sequence diagrams, first. It should read a lot of files instead of hallucinating about my architecture.

Their chat feels similar. It just runs off like a wild dog.


Gemini's UX (and of course its privacy cred, as with anything Google) is the worst of all the AI apps. In the eyes of the Common Man, it's the UI that will win out, and ChatGPT's is still the best.

Google privacy cred is ... excellent? The worst data breach I know of them having was a flaw that allowed access to names and emails of 500k users.

Link? Are you conflating "500k Gmail accounts leaked [by a third party]" with Gmail having a breach?

Afaik, Google has had no breaches ever.



Google is the breach.

Their SECURITY cred is fantastic.

Privacy, not so much. How many hundreds of millions have they been fined for “incognito mode” in Chrome being a blatant lie?


> Their SECURITY cred is fantastic.

In a world where Android vulnerabilities and exploits don't exist


They don't even let you have multiple chats if you disable their "App Activity" or whatever (wtf is with that ass naming? they don't even have a "Privacy" section in their settings the last time I checked)

And when I swap back into the Gemini app on my iPhone after a minute or so, the chat disappears. And other weird passive-aggressive take-my-toys-away behavior if you don't bare your body and soul to Googlezebub.

ChatGPT and Grok work so much better without accounts or with high privacy settings.


Google's most profitable branch is AdSense; they don't need breaches to have privacy issues, given that elephant-sized conflict of interest.

This exactly! "Oh that gang of thieves that also sells doors has never had their house broken into"

I hate how they insist on knowing everything I do all the time, but heavens forbid the minute I'm on a VPN or shared connection I have to do unpaid manual labor (100 CAPTCHAs) to train their AI


If you consider "privacy" to be 'a giant corporation tracks every bit of possible information about you and everyone else'?

OpenAI is running ads. Do you think they'll track less?

You mean AI Studio or something like that, right? Because I can't see a problem with Google's standard chat interface. All other AI offerings are confusing both regarding their intended use and their UX, though, I have to concur with that.

The lack of "projects" alone makes their chat interface really unpleasant compared to ChatGPT and Claude.

No projects, it completely forgets context mid-dialog, mediocre responses even on thinking, research got kneecapped somehow and is completely useless now, it uses Russian propaganda videos as search material (what’s wrong with you, Google?), it’s janky on mobile, and it consumes GIGABYTES of RAM on the web (seriously, what the fuck?). Left a couple of tabs open overnight and my Mac was almost completely frozen because 10 tabs consumed 8 GB of RAM doing nothing. It’s a complete joke.

Fair enough. I'm always astonished how different experiences are because mine is the complete opposite. I almost solely use it for help with Go and Javascript programming and found Gemini Pro to be more useful than any other model. ChatGPT was the worst offender so far, completely useless, but Claude has also been suboptimal for my use cases.

I guess it depends a lot on what you use LLMs for and how they are prompted. For example, Gemini fails the simple "count from 1 to 200 in words" test, whereas Claude does it without further questions.
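For anyone who wants to run that test themselves, the expected answer can be generated locally and diffed against a model's output. A small sketch; the function name and the hyphenation/spelling conventions ("twenty-one", "one hundred fifteen") are my own assumptions:

```python
# Reference generator for the "count from 1 to 200 in words" test,
# so a model's answer can be checked mechanically. Naming and
# hyphenation style are assumptions; adjust to taste.

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen",
        "fourteen", "fifteen", "sixteen", "seventeen", "eighteen",
        "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty",
        "seventy", "eighty", "ninety"]

def to_words(n: int) -> str:
    """Spell out an integer in the range 0..999 in English words."""
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    hundreds, rest = divmod(n, 100)
    return ONES[hundreds] + " hundred" + (" " + to_words(rest) if rest else "")

# The full expected transcript: "one", "two", ..., "two hundred"
expected = [to_words(n) for n in range(1, 201)]
```

Models that drift mid-list (skipping numbers, switching languages) fail exactly this kind of mechanical diff.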

Another possible explanation would be that processing time is distributed unevenly across the globe and companies stay silent about this. Maybe depending on time zones?


AI Studio is also significantly improved as of yesterday.

I find Gemini's web page much snappier to use than ChatGPT's - I've largely swapped to it for most things except more agentic tasks.

> Gemini's UX ... is the worst of all the AI apps

Been using Gemini + OpenCode for the past couple weeks.

Suddenly, I get a "you need a Gemini Access Code license" error but when you go to the project page there is no mention of this or how to get the license.

You really feel the "We're the phone company and we don't care. Why? Because we don't have to." [0] when you use these Google products.

PS for those that don't get the reference: US phone companies in the 1970s had a monopoly on local and long distance phone service. Similar to Google for search/ads (really a "near" monopoly but close enough).

0 - https://vimeo.com/355556831


Gemini is completely unusable in VS Code. It's rated 2/5 stars, pathetic: https://marketplace.visualstudio.com/items?itemName=Google.g...

Requests regularly time out, the whole window freezes, it gets stuck in schizophrenic loops, edits cannot be reverted and more.

It doesn't even come close to Claude or ChatGPT.


Once Google launched Antigravity, I stopped using VS Code.

Smart idea to say anything against Google here from a throwaway account, I'm sitting in negative karma for that :')

Anti Google comments do pretty well on average. It's a popular sentiment. However, low effort comments don't.

I'm leery to use a Google product in light of their history of discontinuing services. It'd have to be significantly better than a similar product from a committed competitor.

They seem to be optimizing for benchmarks instead of real world use

Yeah if only Gemini performed half as well as it does on benches, we'd actually be using it.

I'd personally bet on Google and Meta in the long run since they have access to the most interesting datasets from their other operations.

Agree. Anyone with access to large proprietary data has an edge in their space (not necessarily for foundation models): Salesforce, Adobe, AutoCAD, Caterpillar.

Google is still behind the largest models I'd say, in real world utility. Gemini 3 Pro still has many issues.

It was obvious to me that they were a top contender 2 years ago ... https://www.reddit.com/r/LocalLLaMA/comments/1c0je6h/google_...

Those black Nazis in the first image model were a cause of insider trading.

What is their Claude code equivalent?


They were behind. Way behind. But they caught up.

Trick? Lol, not a chance. Alphabet is a pure-play tech firm that has to produce products to make the tech accessible. They really lack in the latter, and this is visible when you see the interactions of their VPs. Luckily for them, if you build enough of a lead with the tech, you get many chances to sort out the product stuff.

You sound like Russ Hanneman from SV

It's not about how much you earn. It's about what you're worth.

Don't let the benchmarks fool you. Gemini models are completely useless no matter how smart they are. Google still hasn't figured out tool calling and making the model follow instructions. They seem to only care about benchmarks and being the most intelligent model on paper. This has been a problem with Gemini since 1.0, and they still haven't fixed it.

Also the worst model in terms of hallucinations.


Disagree.

Claude Code is great for coding, Gemini is better than everything else for everything else.


What is "everything else" in your view? Just curious -- I really only seriously use models for coding, so I am curious what I am missing.

Role-playing, but Claude is just as bad: the same censored garbage, with the CEO wanting to be your dad. Grok is best for everything else by far.

Are you using Gemini model itself or using the Gemini App? They are different.

Both

And mathematics?


