This software has been terrible for me. Burns tokens like crazy, and fails. Most times I try to use the browser plugin, it just says it can't use the plugin. When it does work, it takes minutes to click a button. Unusable workflow.
I ask it to generate a PNG with an alpha channel. It can't. Instead, it outputs a chroma-keyed image, then generates a Python script to remove the chroma key (fails), then a JS script (which also fails). Then my 5h allotment is up.
It's frustrating because if it worked as they advertise, it'd be an amazing tool.
Although they can technically do it, I wouldn't be asking LLMs to generate binary files like PNG with alpha channels, no matter how simple that may seem. If it's easy enough to manually create one yourself, I would do that.
The best way for an LLM to do this is likely to write a scratch program (which is what it seems to have reached for in the second half): writing code is something they are good at, and a library can then create the image.
At some point it is just easier to handle such things yourself, and keep LLMs for text-based formats.
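To make the "write code, let the library produce the bytes" idea concrete, here is a minimal sketch of what such a scratch program could look like. It is only an illustration, not what the tool in question actually generated: it assumes an 8-bit RGBA image, uses just Python's standard `zlib` and `struct` modules, and the function names (`png_chunk`, `encode_rgba_png`, `make_demo_image`) and the output filename `disc.png` are made up for the example. In practice a dedicated imaging library would likely be the more convenient choice.

```python
import struct
import zlib


def png_chunk(kind: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def encode_rgba_png(width: int, height: int, pixels: list) -> bytes:
    """Encode rows of (r, g, b, a) tuples as an 8-bit RGBA PNG."""
    raw = bytearray()
    for row in pixels:
        raw.append(0)  # filter type 0 (None) at the start of each scanline
        for r, g, b, a in row:
            raw.extend((r, g, b, a))
    # IHDR fields: width, height, bit depth 8, color type 6 (RGB + alpha),
    # compression 0, filter 0, interlace 0
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 6, 0, 0, 0)
    return (
        b"\x89PNG\r\n\x1a\n"
        + png_chunk(b"IHDR", ihdr)
        + png_chunk(b"IDAT", zlib.compress(bytes(raw)))
        + png_chunk(b"IEND", b"")
    )


def make_demo_image(size: int = 64) -> list:
    """Hypothetical demo: an opaque red disc on a fully transparent background."""
    center = (size - 1) / 2
    radius = size / 3
    rows = []
    for y in range(size):
        row = []
        for x in range(size):
            inside = (x - center) ** 2 + (y - center) ** 2 <= radius ** 2
            row.append((255, 0, 0, 255) if inside else (0, 0, 0, 0))
        rows.append(row)
    return rows


if __name__ == "__main__":
    # Write the demo image; the filename is arbitrary.
    with open("disc.png", "wb") as f:
        f.write(encode_rgba_png(64, 64, make_demo_image(64)))
```

The point is that the transparency comes from real alpha values written by code, so there is no chroma key to strip out afterwards.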
100% agree. I've spent many hours testing out local models/harnesses. So far, they're very much not worth the tradeoff. Obviously, I hope that changes.
This project led me to propose the Taft Test:
Does your page design improve when you replace every image with William Howard Taft?
If so, then, maybe all those images aren’t adding a lot to your article. At the very least, leave Taft there! You just admitted it looks better.
Two reasons: The people in the pic look more or less like our real selves. The synthesized photo shows the process of discussing highly conceptual approaches, which was our everyday routine for 10 years or so.
> Given these positive signals, we would welcome contributions to integrate a performant and memory-safe JPEG XL decoder in Chromium. In order to enable it by default in Chromium we would need a commitment to long-term maintenance. With those and our usual launch criteria met, we would ship it in Chrome.
That conversation doesn't apply to their core products: Search, Mail, Maps, Chrome, Android. Their commitment to maintaining these services over decades has been amazing. It's everything else that sucks.
I asked it to analyze my tennis serve. It was just dead wrong. For example, it said my elbow was bent. I had to show it a still image of full extension on contact, then it admitted, after reviewing again, it was wrong. Several more issues like this. It blamed it on video being difficult. Not very useful, despite the advertisements: https://x.com/sundarpichai/status/1990865172152660047
I’ve never seen such a huge delta between advertised capabilities and real world experience. I’ve had a lot of very similar experiences to yours with these models where I will literally try verbatim something shown in an ad and get absolutely garbage results. Do these execs not use their own products? I don’t understand how they are even releasing this stuff.