My takeaway was that the race condition was the root cause. Take away that bug, and suddenly there's no incident, regardless of any processing delays.
Right. Sounds like it’s a case of “rolling your own distributed system algorithm” without the up-front investment in implementing a truly robust distributed system.
Often network engineers are unaware of some of the tricky problems that DS research has addressed/solved in the last 50 years because the algorithms are arcane and heuristics often work pretty well, until they don’t. But my guess is that AWS will invest in some serious redesign of the system, hopefully with some rigorous algorithms underpinning the updates.
Consider this a nudge for all you engineers that are designing fault tolerant distributed systems at scale to investigate the problem spaces and know which algorithms solve what problems.
That's true, if you use the CAP definition of consistency. Otherwise, I'd say that the DNS design satisfies each of those terms:
- "Rapidly updatable" depends on the specific implementation, but the design allows for 2 billion changesets in flight before mirrors fall irreparably out of sync with the master database, and the DNS specs include all components necessary for rapid updates: push-based notifications and incremental transfers.
- DNS is designed to be eventually consistent, and each replica is expected to always serve internally consistent data. It's certainly possible for two mirrors to return different answers to the same query, but eventual consistency does not preclude that (see the sketch after this list).
- Distributed: the DNS system certainly is a distributed database; in fact it was specifically designed to allow for replication across organization boundaries -- something that very few other distributed systems offer. What DNS does not offer is multi-master operation, but neither do e.g. Postgres or MSSQL.
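A rough sketch of the convergence mechanism, using Node's built-in dns/promises module (the zone and the nameserver IPs are placeholders): each mirror serves an internally consistent copy of the zone, and the SOA serial is the changeset counter that NOTIFY/IXFR use to bring the mirrors back in step.

```typescript
// Sketch only: compare the SOA serial reported by two authoritative mirrors.
// Placeholder zone and nameserver IPs; any zone/mirror pair works the same way.
import { Resolver } from 'node:dns/promises';

async function soaSerial(nameserver: string, zone: string): Promise<number> {
  const resolver = new Resolver();
  resolver.setServers([nameserver]);
  const soa = await resolver.resolveSoa(zone);
  return soa.serial; // 32-bit serial, compared with RFC 1982 serial arithmetic
}

async function main() {
  const zone = 'example.com';
  const mirrors = ['192.0.2.1', '192.0.2.2']; // placeholder authoritative servers
  const serials = await Promise.all(mirrors.map((ns) => soaSerial(ns, zone)));
  // Serials that briefly differ mid-transfer are exactly what eventual consistency
  // allows; NOTIFY pushes the change and IXFR ships only the delta, so they converge.
  console.log(serials);
}

main().catch(console.error);
```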
Further, please don’t stop at Raft. Raft is popular because it is easy to understand, not because it is the best way to do distributed consensus. It is non-deterministic (thus requiring odd numbers of electors), requires timeouts for liveness (thus latency can kill you), and isn’t all that good for general-purpose consensus, IMHO.
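To make those two complaints concrete, here's a minimal sketch (my own illustration, not any particular implementation) of the two mechanisms in question: randomized election timeouts and majority quorums.

```typescript
// Illustration only: the two Raft mechanics being criticized above.

// Split votes are resolved by randomized election timeouts, not a deterministic rule.
// If real network latency approaches this window, elections churn and liveness suffers.
const ELECTION_TIMEOUT_MIN_MS = 150;
const ELECTION_TIMEOUT_MAX_MS = 300;

function randomElectionTimeout(): number {
  return ELECTION_TIMEOUT_MIN_MS +
    Math.random() * (ELECTION_TIMEOUT_MAX_MS - ELECTION_TIMEOUT_MIN_MS);
}

// Progress requires a strict majority, which is why clusters are sized with odd
// node counts: 4 nodes tolerate the same single failure as 3, at higher cost.
function quorum(clusterSize: number): number {
  return Math.floor(clusterSize / 2) + 1;
}

console.log(randomElectionTimeout()); // e.g. 231.7
console.log(quorum(3), quorum(5));    // 2, 3
```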
0.02% almost reads like satire. If you're on to something, you should consider open-sourcing it or giving far more details to start a real discussion.
One suggestion I have is that you should run this on many binaries instead of just a single one, to get an idea of the effect across a variety of them. If you're already doing this, my apologies. The post made it sound like you've only tested this on a single binary.
I'm starting work, and unfortunately I can't share the material yet because my files are a mess. But I'm already running tests on different binaries and the results are getting better; I'll provide a more detailed description soon, along with links to the materials.
I was thinking they could use something like CloudWatch Events to trigger sweeps and significantly reduce the number of scheduled sweeps.
They could even use cost allocation tags to predict whether a bucket or group of buckets should be scanned when it’s growing unexpectedly. Cost isn't a perfect metric, but there's definitely signal there.
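Something like this, roughly (a sketch with the AWS SDK v3 EventBridge client; the rule name, bucket, and Lambda ARN are placeholders, and it assumes EventBridge notifications are enabled on the bucket):

```typescript
// Sketch only: fire a sweep on object writes instead of on a fixed schedule.
// Placeholder names/ARNs; assumes the bucket has EventBridge notifications enabled.
import {
  EventBridgeClient,
  PutRuleCommand,
  PutTargetsCommand,
} from '@aws-sdk/client-eventbridge';

const client = new EventBridgeClient({});

async function createSweepTrigger(bucket: string, sweepLambdaArn: string) {
  // Match S3 "Object Created" events for the bucket in question.
  await client.send(new PutRuleCommand({
    Name: 'sweep-on-object-created',
    EventPattern: JSON.stringify({
      source: ['aws.s3'],
      'detail-type': ['Object Created'],
      detail: { bucket: { name: [bucket] } },
    }),
  }));

  // Point the rule at the sweep Lambda.
  await client.send(new PutTargetsCommand({
    Rule: 'sweep-on-object-created',
    Targets: [{ Id: 'sweep-lambda', Arn: sweepLambdaArn }],
  }));
}

createSweepTrigger(
  'my-example-bucket',
  'arn:aws:lambda:us-east-1:123456789012:function:sweep', // placeholder ARN
).catch(console.error);
```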
This sort of work is something I wouldn't be able to do, but I can't help but point out at least one potential issue with the paper. It's a lot easier to find problems than solutions I guess.
Are the benchmarks comparing Node versions valid evidence of a real-world performance increase?
This isn't necessarily unknown, but knowing that there's a cost associated with promises and async/await will get you pretty far in terms of performance. I remember the idea first clicking when I watched this:
https://www.youtube.com/watch?v=SMBvjmeOotA
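The gist, as a rough micro-benchmark sketch of my own (absolute numbers vary a lot by Node version and workload): awaiting inside a hot loop forces a microtask hop per iteration, which is the kind of promise overhead being discussed.

```typescript
// Sketch: same arithmetic, with and without an await per iteration.

function sumSync(n: number): number {
  let total = 0;
  for (let i = 0; i < n; i++) total += i;
  return total;
}

async function sumAwait(n: number): Promise<number> {
  let total = 0;
  // Each await schedules a microtask, so the loop can't stay on a tight synchronous path.
  for (let i = 0; i < n; i++) total += await Promise.resolve(i);
  return total;
}

async function main() {
  const n = 1_000_000;

  console.time('sync loop');
  sumSync(n);
  console.timeEnd('sync loop');

  console.time('await per iteration');
  await sumAwait(n);
  console.timeEnd('await per iteration');
}

main();
```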
It's an interesting trick with the hardcoded array! Thanks.
The video you posted is a bit chaotic, so I don't really want to watch 1.5 hours just to get something that could be summarized in one line. But anyway... I understand the downsides of promise calls, but there are also advantages, such as not blocking page rendering, which I rely on heavily.
I haven't heard that you're taxed when you lose money. My understanding was that if I invest $10,000 in a taxable account, lose $1,000, and withdraw the remaining $9,000, I won't pay taxes on the $9,000 (ignoring transaction fees).
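Spelled out (an illustration of that understanding, not tax advice): tax applies to the gain, i.e. proceeds minus cost basis, not to the amount withdrawn.

```typescript
// Illustration of the numbers above, not tax advice.
const costBasis = 10_000;  // amount invested
const proceeds = 9_000;    // amount withdrawn after the loss
const gain = proceeds - costBasis;      // -1000: a capital loss, not a gain
const taxableGain = Math.max(0, gain);  // 0 -> no tax owed on the $9,000 withdrawal
console.log({ gain, taxableGain });
```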