I would expect that the mode is a feature of the VS Code integration and not something the model would necessarily even be aware of. For it to not operate correctly in ask mode when that mode is properly set seems to me a bug in the underlying integration. The fact that the model responded the way it did and then corrected itself is interesting.
Well, you didn’t share the exact text of the original prompt, but you described yourself as having asked it to do something. These models are being trained for agentic behavior on data where an agent is asked _to do something_, and as their output is purely probabilistic, the rewarded response will then often include the text “I have done something” even though they have not done it. Perhaps there _is_ an issue with the integration that caused the response you saw, but based purely on my experience and the limited information you gave, my immediate guess is that the model positively associates your prompt with the generated response.
Okay, let’s talk about that first command: sort all files in this directory by relevance. Relevance to what? Is this a permanent sort (what does it do to the file names?) or just a reordered ls output? What button in what OS is this superseding? Is this really a huge improvement over the smart search functionality in most OSes (besides requiring an open terminal and additional keystrokes for the abstractions)?
Some of these are slick despite not really being examples of the obsolescence of UX idioms. But that first one is a very strange example to lead with.
The video shows her driving away from the officers. They approach her window, she backs up a bit, turns her tires away from the officers and starts driving forward when the agent fires into the car.
Incorrect. At least two of the three shots went through the driver's side window. She was driving by Jonathan Ross, who shot at her head and then called her a "fcking btch".
She was doing exactly that. She was turning to leave. They escalated and then shot her. It's just blatant state murder. Governments killing people. Indefensible even by the thirstiest bootlickers
The west = the western hemisphere, the new world, the Americas, from the Northwest Territories to Patagonia
But why trouble yourself to distinguish anyway? 30% of the Irish are obese, 28% of the UK, nearly a quarter of Belgians and Germans. If over a fifth of your population is sick, does it really bolster your national ego so much that some other place is sicker?
If the sideloaded app manages to hack HSBC and steal customers' money, the bank is going to face demands to refund those customers a bunch of money. I can understand their position.
I understand that, but the thing I've never understood is that banking apps only care about meaningless measurements like whether a device passes Play Integrity. I have a tablet that passes Play Integrity but is also over 6 years behind on security updates. That device should not be allowed to run banking apps.
Why not refuse to run on devices that don't have current security updates? How useful is Play Integrity actually for avoiding these types of problems?
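To make the question concrete, here is a rough sketch of the kind of check I mean. It assumes the app can read the device's security patch level as an ISO date string (Android exposes one as `Build.VERSION.SECURITY_PATCH`); the function name and the one-year threshold are made up for illustration and are not anything a bank actually uses.

```python
from datetime import date

# Hypothetical policy: refuse devices whose last security patch is over a year old.
MAX_PATCH_AGE_DAYS = 365


def patch_is_current(patch_level: str, today: date,
                     max_age_days: int = MAX_PATCH_AGE_DAYS) -> bool:
    """Return True if the patch level (e.g. '2019-03-05') is recent enough."""
    patched = date.fromisoformat(patch_level)
    return (today - patched).days <= max_age_days
```

A tablet last patched six years ago fails this check no matter what its integrity verdict says.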
Most banks now require their app for MFA for payments, sadly. They used to offer these "calculator" devices but most banks I know of in my country now require their app. Which sucks for me because I don't want to have my authenticator on a hackable internet-connected device.
I don’t understand the appeal (is there any?) of overhead showers. Yeah, sometimes I don’t want to wash my hair, but I’ve got a different primary beef. For me, it’s how do I rinse my pits or between my legs or anywhere facing down, folded or partially covered? I’m just supposed to rely on enough water running down the nearby parts of my body? Bend at difficult-to-balance angles on the slick shower floor? Cup my hand and splash? Garbage design through and through imo, and I regard anyone whose last shower was under one as likely unclean in some rather key areas.
What are your users journaling about? Do you think it’s possible they could be journaling about things that, in their region, could lead to prosecution? Ostracization? Take your users’ security seriously. You don’t know what they are putting in “just a journal app”.
I just don’t understand this claim that it’s faster to tell an LLM to do something than to just do it. Yes, for large tasks, but the author discusses “edits” and being heavily involved with “pedantic” requests. It’s not specific enough to evaluate, but what I’m picturing from their description makes the argument dubious to me.