sethhochberg's comments

There are examples of the warehouse-based model working, but they clearly require both density _and_ mindshare. It's not clear Kroger had either, based on the other comments in here. FreshDirect in NYC has been operating since the early 2000s with a fleet of tiny trucks, a couple of employees in each, and a giant fulfillment center with essentially zero retail footprint.

(As an aside, they also have some of the best meat and produce you can get in the city without going to a farmers market. So many retail grocery stores here lack loading docks that the food handling involved in getting from the truck to the sidewalk to the basement of the store to the shelves is really, really rough, especially during the summer months. Skipping all that and going warehouse-to-home has advantages.)


It's hard to have requirements gathering, documentation, and product design practices good enough that an engineer can really wrap their head around a problem, come up with a thoughtful, long-term-maintainable design for a system, and then consistently follow it during implementation.

And it's even harder to make sure everyone who reviews or tests that code has a similar level of understanding of the problem the system is trying to solve, so they can review the code or test for fitness for purpose and challenge/validate the design choices made.

And it's perhaps hardest of all to have an org-wide planning or roadmap process that can tolerate that well-informed peer reviewer or tester actually pushing back in a meaningful way and "delaying" work.

That's not to say that this level of shared understanding in a team isn't possible or isn't worth pursuing: but it IS a hard thing to do, and a relatively small number of engineering organizations pull it off consistently. Some view it as an unacceptable level of overhead and don't even try. But most, in my experience, just hope that enough of the right things happen on enough of the right projects to keep the whole mess afloat.


Eh, it's either that, or the wrong people are getting promoted. Technical skills != business process modelling.

We're just using GitHub Copilot as our primary entry point for all of the model families. It's the only way we can easily offer our devs some level of Claude, Gemini, and Codex all in one place.


Copilot has gotten a lot better lately, at least on Insiders. They were actually serving close to 200k tokens of context on Insiders last I checked, which brings it more in line with the first-party APIs.


I always find the idea that there's something to navigate kind of curious - as you say, it's lots of managed versions of open source tools and a mix of proprietary management frameworks on top. Some of what they offer are genuinely unique products for niche use cases, but if you have that niche you probably already know which services can support it, like the people in the other comments here mentioning the IoT APIs.

But my teams and I are rarely asking the general question of "how should I run my service on AWS"; it's much more typically "I need a managed Postgres database, which AWS product offers that?" or "I have an OCI image, what managed platform can I run it in?" or even "I want this endpoint to be available all the time, but its usage is very unpredictable/intermittent, so I don't want to pay for idle compute." There might still be a couple of possible answers to those questions, but by the time I arrive there I'm solving for a specific problem.

It's sort of like walking into a kitchen hungry, seeing 3 knives, a stove, an oven, and a dozen peelers and can openers, and being very overwhelmed by all of it (do I need the knife with a smooth edge or the serrated one?) until you decide you want to eat a grilled cheese, grab a skillet, put it on a burner, and everything makes sense once you actually start to cook a specific thing.


They've gotten much better at streamlining setup and suggesting sane defaults over the years. I hear the GP that there are soooo many knobs. I've found that AWS does a pretty good job, like in the Postgres-compatible RDS case, of suggesting defaults that make sense for most people. And when you run into issues / scaling problems, you can Claude your way to which settings to research.

The only one that still drives me insane is IAM. That product makes me feel dumb every time I use it, even for simple use cases like "I want a managed Redis-compatible instance that can only be accessed by these resources." The groups and users and roles and VPCs have never felt intuitive to me, despite having a clear idea of what I want the end state to be.
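
For that specific Redis case, what I eventually landed on wasn't IAM at all but VPC security groups. A minimal boto3 sketch, assuming two hypothetical security groups (the IDs here are made up):

    import boto3

    ec2 = boto3.client("ec2")

    # Hypothetical IDs: REDIS_SG sits on the ElastiCache nodes,
    # APP_SG is attached to the instances allowed to reach them.
    REDIS_SG = "sg-0123456789abcdef0"
    APP_SG = "sg-0fedcba9876543210"

    # Allow inbound Redis (port 6379) only from members of the app group.
    ec2.authorize_security_group_ingress(
        GroupId=REDIS_SG,
        IpPermissions=[{
            "IpProtocol": "tcp",
            "FromPort": 6379,
            "ToPort": 6379,
            "UserIdGroupPairs": [{"GroupId": APP_SG}],
        }],
    )

Whether that's the blessed way to do it, I still couldn't tell you - which is sort of the point.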


They're quite popular for distributed audio systems in general (of which sound masking is one type). "Constant voltage audio" comes in a few flavors; 70v is very common in the US, while other parts of the world often use 100v. Background music systems in retail, voice paging systems, etc. use constant voltage hardware because it's much better technology for very long cable runs, daisy-chained speakers, and centrally located amplifiers.

The cost is fidelity. Full-range audio transformers aren't cheap, so these systems usually make some compromises because your announcements or smooth jazz over the pasta aisle don't need to be true hi-fi.

It's cool technology. Most of the speakers have variable power taps, so you can run a bunch of them in parallel on a single line and control each speaker's actual volume as needed, based on where it's deployed, by varying the transformer tap.
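
To make the tap math concrete (these tap values are hypothetical, just for illustration): each speaker's transformer tap sets its power draw from the line, and the amplifier only needs enough capacity for the sum of the taps plus some headroom.

    # 70V line: each speaker's tap sets its power draw; the amp
    # just needs capacity for the sum of the taps, plus headroom.
    LINE_VOLTAGE = 70.7  # volts RMS (the nominal "70V" standard in the US)

    taps_watts = [5, 5, 10, 2.5, 2.5]  # hypothetical per-speaker tap settings

    total_load = sum(taps_watts)              # 25 W across the whole chain
    line_current = total_load / LINE_VOLTAGE  # ~0.35 A on the entire run

    # Rule of thumb: leave ~20% amplifier headroom over the summed taps.
    min_amp_rating = total_load * 1.2

    print(f"Total load: {total_load} W")
    print(f"Line current: {line_current:.2f} A")
    print(f"Suggested minimum amp rating: {min_amp_rating:.0f} W")

That tiny line current is the whole trick: it's why you can daisy-chain dozens of speakers over hundreds of feet of modest-gauge wire.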


I know this is written to be tongue-in-cheek, but it's really almost the exact same problem playing out on both sides.

LLMs hallucinate because training on source material is a lossy process, and the bigger, heavier LLM-integrated systems that can research and cite primary sources are slow and expensive, so few people use those techniques by default. Lowest time to a good-enough response is the primary metric.

Journalists oversimplify and fail to ask followup questions because, while they can research and cite primary sources, doing so is slow and expensive in an infinitesimally short news cycle, so nobody does it by default. Whoever publishes something that someone will click on first gets the ad impressions, so that's the primary metric.

In either case, we've got pretty decent tools and techniques for better accuracy and education - whether via humans or LLMs and co - but most people, most of the time, don't value them.


> LLMs hallucinate because training on source material is a lossy process and bigger,

LLMs hallucinate because they are probabilistic by nature, not because the source material is lossy or too big. They are literally designed to introduce some level of "randomness": https://thinkingmachines.ai/blog/defeating-nondeterminism-in...


So if you set temperature=0 and run the LLM serially (making it deterministic), would it stop hallucinating? I don't think so. I would guess that the nondeterminism issues mentioned in the article are not at all a primary cause of hallucinations.


I thought that temperature can never actually be zero, or it creates a division-by-zero problem or something similar.

I'm no ML or math expert, just repeating what I've heard.


That's an implementation detail, I believe. What I meant was just greedy decoding (picking the token with the highest logit in the LLM's output), which can be implemented very easily.
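
A toy sketch of what I mean, with made-up logits rather than a real model:

    import math, random

    def sample_next_token(logits, temperature=1.0):
        # Temperature 0 is special-cased as greedy decoding (argmax),
        # which is why implementations never actually divide by zero.
        if temperature == 0.0:
            return max(range(len(logits)), key=lambda i: logits[i])
        # Otherwise: softmax over temperature-scaled logits, then sample.
        scaled = [l / temperature for l in logits]
        m = max(scaled)  # subtract the max for numerical stability
        weights = [math.exp(s - m) for s in scaled]
        return random.choices(range(len(logits)), weights=weights)[0]

    toy_logits = [2.0, 1.0, 0.5]  # pretend 3-token vocabulary
    print(sample_next_token(toy_logits, temperature=0.0))  # always token 0
    print(sample_next_token(toy_logits, temperature=1.0))  # usually token 0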


Did you read the whole article?

"In other words, the primary reason nearly all LLM inference endpoints are nondeterministic is that the load (and thus batch-size) nondeterministically varies! This nondeterminism is not unique to GPUs — LLM inference endpoints served from CPUs or TPUs will also have this source of nondeterminism."


Classical LLM hallucination happens because AI doesn’t have a world model. It can’t compare what it’s saying to anything.

You’re right that LLMs favor helpfulness, so they may just make things up when they don’t know them, but this alone doesn’t capture the crux of hallucination imo; it’s deeper than just being overconfident.

OTOH, there was an interesting article recently that I’ll try to find saying humans don’t really have a world model either. While I take the point, we can have one when we want to.

Edit: see https://www.astralcodexten.com/p/in-search-of-ai-psychosis re humans not having world models


You're right, "journalists don't have a world model and can't compare what they're saying to anything" explains a lot.


I think many people would suggest that this was more of an accident due to the ubiquity of the browser, though.

The transition from "websites" to "web apps" was well underway by the time the dev tools became a built-in browser feature - Chrome was notable for being the first browser to ship with the console, inspectors, etc. out of the box, but that came later. The developer experience was quite a bit rougher in the early days, and then better but still not native in the days of plugins like Firebug.

The web became the premium app distribution platform first and foremost because it was the lowest-common-denominator distribution channel. JavaScript was just the tool that happened to be available where everyone wanted their code to run.


Yes, but doesn't maintaining a webapp become a nightmare when the user can inspect CSS, edit HTML, or access the JavaScript console?


The difference is that the changes the user can make don’t flow back to the original. If the user hits refresh, they get your copy of the web app, not theirs.


In the US, there's often a large labor/materials upcharge on anything that can be branded as "green" - you see a lot of the same thing with higher-end heat pump systems and such, too. Efficiency is (for whatever reasons) frequently sold as a luxury product feature in our market, and the installers take advantage.


The evolutionary force is really just "everyone else showed up at the party". The Internet has gone from a capital-I thing that was hard to access, to a little-i internet that was easier to access and well known but still largely distinct from the real world, to now... just the real world in virtual form. Internet morality mirrors real world morality.

For the most part, everybody is participating now, and that brings all of the challenges of any other space with everyone's competing interests colliding - but fewer established systems of governance.


I don't know if they're available in other markets, but in the US I've been very happy with Lutron Caseta switches for that sort of "smart enough" use case. Everything generally works like normal dumb switches if the hub is offline or doesn't exist, and you only need the hub to manage configuration or enable the remote control (outside the home) features. The fact that the switches look like, act like, and install like traditional dimmers and control traditional light fixtures is really what sold me: I've never liked the idea of the smart parts being in something like a light bulb, which is basically a replaceable wear item.

