All of these systems use massive pools of GPUs, and allocate many requests to each node. The “slow it down” knob is to steer a request to nodes with more concurrent requests; “speed it up” is to route to less-loaded nodes.
But it’s actually not so difficult is it? The simplest way to make a slow pool is by having fewer GPUs and queuing requests for the non-premium users. Dead simple engineering.
Oh, of course. That’s just conspiratorial thinking. Paying to be in a premium pool makes sense, all of this “they probably serve rotten food to make people pay for quality food” nonsense is just silly.