I've been reading about the Mill for many years now and was fascinated by many of their ideas. However, the more deeply I learn about the details of how CPUs and GPUs work, the more I'm convinced that we're not going to see the kind of paradigm shift that they're touting.
They have some cool ideas, but many are incompatible with existing software, e.g. all the memory protection stuff. Other ideas, like how memory loads work, could be integrated (and have long been in place in GPUs, though not in CPUs), but they still depend on compiler changes, which presumably they've made. Past experience with hardware projects that rely on compiler changes, though, is that the compiler changes don't perform well enough in practice. Or perhaps compilers improve so slowly that traditional architectures can easily absorb the ideas in time.
Another example is that Apple's M1 proves that wider frontends are possible in traditional architectures, as long as the instruction size is fixed. There's no particular reason to believe that the Mill would be inherently better at this than a traditional architecture.
If you have a lay person's interest in hardware architecture and haven't looked at their docs yet, do so. The ideas will likely tickle your brain in a good way -- they did for me. But don't expect much, if anything, from them in production.
> past experience with hardware projects that rely on compiler changes is that the compiler changes aren't performing well enough in practice.
> Another example is that Apple's M1 proves that wider frontends are possible in traditional architectures, as long as the instruction size is fixed.
Doesn't the M1 prove that you can transpile code from one architecture to another and gain performance benefits? As in, Rosetta code works at comparable or better speed at far lower power consumption?
It wouldn't have been Transmeta's patents, but older ones. IBM's DAISY and HP's Dynamo stand out, but there are probably even older examples depending on what exactly is being patented.
The original Rosetta used the same underlying tech with the exception of the memory model stuff, but Transmeta never had to deal with that since they only handled single cores. (And SPARC shipped a TSO/WMO switch in a control register all the way back in the mid '90s.)
FX!32 predates Rosetta. I don't think Transmeta's patents stopped any innovation here; I think it has more to do with technical difficulty and with social and political issues within an org. The fact that Apple has pulled off an ISA change multiple times is quite amazing.
Programmers got cut off from the microcode starting in the 80s; had this not been the case, I think we would have seen more activity lower in the stack.
Yeah, the AM2900 series is fascinating. The devices themselves are totally unavailable, which is fine; FPGAs are cheap and ubiquitous. It would be a great project to reimplement the parts in an HDL!
I don't understand why they need the giant funding anymore at this stage in their development. We're in the middle of a very transformative period in the democratization of semiconductors. A couple of grad students were able to make a competitive (for the gate count) open-source OoO core with BOOM. There's an open-source PDK floating around now. It's only for 130nm, but that's good enough for a Pentium III-level core, and they should be very, very good at that gate count without paying a cent.
It's not 2004 anymore. They should be able to make progress toward a chip at the gate counts they've been pitching with very little capital investment, given how much democratization of chip tech has happened over the past few years.
Plus I was under the impression that VCs want to at least see something running in an FPGA for a chip startup. The space is filled with software engineers who have trouble synthesizing their HDL. Cut out the fancy meta chip generator, sit down and write enough HDL for a 'bronze' core, and then I imagine the money will start flowing for a leading-edge node where the arch will really shine.
Somehow the pandemic shut them down; I would have expected the opposite, that it would focus them on getting something working.
It sounds to me like The Mill is an effort being driven by people who are unwilling to get their hands dirty in anything they don't personally know and enjoy doing, which is never a good sign for a startup, IMHO. OTOH, if they can convince the money to spend big on an organization prior to having a product, well, good for them! On general principle I'm in favor of transferring wealth from financiers to tech people.
Yeah, that's for sure a possibility. I've got a huge soft spot for them though (and no skin in the game) and wish to see them do well.
And I've seen a few times in the industry where marketing claims are taken at face value by later engineers who then actually live up to said claims, thinking "well, they did it, there has to be a way". The positive outlook from being sure there is a way, rather than just having a nagging suspicion that it isn't possible, is huge. Even if the Mill is vaporware at the end of the day, they're probably more valuable to the industry as a pining for what could have been (taken at face value) rather than as an example of what not to do, like most vaporware.
* Function calls as the primary built-in control flow mechanism. This is so obviously better it's insane; this should be the base unit of control flow in an ISA. No inventing incompatible "conventions" for calls, no manually saving & restoring registers (or need to eke out optimizations by avoiding registers), no accidentally trampling on external state. Called functions receive parameters in a consistent location and return values in a consistent location; other incidental state not part of the parameters is inaccessible, and upon return the caller's state is automatically restored as if it had just executed a 1-cycle CPU instruction. All for an overhead of 1 cycle.
* Instructions and functions can have multiple return values. No stupid out params just to return both an error and a value, or a value that is bigger than one word. CPU instructions use multiple return values for things that obviously need them, instead of overloading stateful flags (eww) or interrupts (heave).
* Unified address space & address translation pushed down to the memory controller. This solves so many problems it's not even funny. Cached writes to memory can unblock as soon as they hit L1; reading from uninitialized memory gives you a pre-zeroed page -- instantly -- not in 300 cycles; the previous two mean it's possible to allocate + write + read + deallocate a memory page without it ever touching main memory, served entirely by the CPU cache hierarchy; expensive TLB address translation is ejected from the hot path of memory access instructions; and memory-protection-based access control can be done in parallel with the fetch instead of the fetch being delayed until translation is calculated. (A process should not assume it's the only thing that exists or that its starting address is always the same; ASLR should be on by default. Arguing that per-program virtual memory is a security feature is like arguing that NAT is a security feature. Access and addressing can be separated securely.)
* Machine code is basically a directly encoded form of Static Single Assignment (SSA) which is typically a compiler Intermediate Representation (IR) middle step in the compilation process, but the Mill consumes SSA directly which means that a lot of data flow information is preserved during the lowering to machine code and doesn't have to be inferred or guessed dynamically by the CPU during execution.
Register windows were one of the few things that worked well in Itanium. I ran some networking benchmarks (interrupt and function call heavy code) on the first generation hardware and the combination of efficient hardware register spills combined with pretty massive memory bandwidth compared to x86 at the time lead to quite impressive results. Sadly, it took ages for the first gen hardware to ship, plus it had this huge chunk of a useless x86 core taking up valuable die space. And I stubbed my toe on that machine way too much...
I used to be quite excited for the mill. But nowadays I don't believe it will go anywhere. The mill has been in development for almost 20 years and it has literally nothing of substance to show for it. Not even something you can run on an FPGA.
For reference, when the Mill project was started, Intel was releasing the Pentium 4.
I'm really interested in the Mill architecture too, but I've always wondered what its target market would be. Does it offer a power-performance-price profile that hasn't been serviced yet? Or maybe it's faster/cheaper/lower-power all-round compared to any other general-purpose CPU? It's hard to know without real performance metrics.
Ah, so they're just waiting until their patents expire? Or maybe they're waiting for some patent they need to expire?
Look: MS did not manufacture mice until the mouse patent became famous for making no money. And then it became a global must-have. It was impossible to make a singleton in C++ in the 200x's, but atomics were described in the literature (even in SICP), and only when the patent expired did Intel make them an everyday tool...
Patents should be automatically invalidated before half a human lifetime is wasted.
IMO patents should have mandatory FRAND (Fair, Reasonable, And Non-Discriminatory) licensing. With higher (2x?) license fees allowed if a working prototype has been submitted to the patent office instead of just a design.
Incentivize new inventions, but keep profits fair. Incentivize working designs over on-paper ideas. Promote the progress of useful inventions over rent-seeking.
> It was impossible to make a singleton in C++ in the 200x's, but atomics were described in the literature (even in SICP), and only when the patent expired did Intel make them an everyday tool
I'm curious about what you mean here. Are you talking about an atomic compare-and-swap operation? I'm sure these have been implemented in processors back as far as the 90s, if not earlier.
Yep, atomics were implemented earlier, but as far as I know they were patented, and only after the patent expired did Intel put them in. So, how many years did it take to "popularize" atomics? Ten? Twenty?
I suppose multi-threading wasn't a huge problem for most applications. But I'm not sure how a mutex could ever be properly implemented in a multithreaded environment without an atomic compare-and-swap, and those have certainly been a thing for a very long time. In terms of atomics as an assembler intrinsic, I think that only became popular when intrinsics started being supported in C compilers, but I doubt the patent was holding that back - unless you know different?
To be fair, it's a number of useful innovations that have taken decades of research, and would not have otherwise been done without property rights granting some sort of pay-out at the end. And many of the ideas would require a dramatic overhaul of how compilers generate and optimize code, so they couldn't just be co-opted by existing processor families without breaking compatibility.
Patents do not apply to academic research, do they? Amateurs can also post some VHDL on github without fear of retribution, so still strange.
For commercial application, if the ideas are good, and the patents valid, someone could just license them. Or are they sitting on an inbox full of desperate licensing requests that they're ghosting while whining they can't get funding?
The series of videos about the Mill on YouTube is fascinating, and it got me interested in hardware. It is very sad that nothing seems to have come of it, especially since their architecture was supposed to be easier to build in hardware than existing architectures. It would be valuable as an experiment even if it never ends up being better than, e.g., the M1; negative results are useful too.
Is it really so hard for them to raise money? Look at some of the stupid ideas that have raised tens to hundreds of millions of dollars over the last few years.
> Is it really so hard for them to raise money? Look at some of the stupid ideas that have raised tens to hundreds of millions of dollars over the last few years.
Yes. I think it’s hard for them because they probably aren’t pitching to VCs correctly. The Mill folks desperately need some elevator pitch material. Ivan Godard mentioned [0] that they’re having problems raising VC money, in a world where VC money has been flowing like water.
Every single video/document from Mill Computing has been targeted towards highly technical audiences. I’ve seen zero material from them targeted to laypeople. You’re never going to secure any funding if your website only consists of a series of highly technical slides and lecture videos. This [1] is their “high level” overview page, which is already way too technical.
Anything that's a sufficient departure from the norm is going to have a very hard time raising VC money from all but a very small handful of VCs that take really big transformative swings. That's not to say that Mill Computing doesn't need to do a better job at making material for laypersons to help build a zeitgeist around their thesis and what they're doing, but it should also be considered that they're doing something that VCs don't have much interest in for reasons that have nothing to do with the existential merits of Mill Computing. Which is just the long-winded way of saying, VC money is probably the wrong money for them.
Mill really needed to be proven in academia with a physical implementation before going the startup route. If you look at RISC-V, that's been a proven path towards derisking the challenges of a clean sheet design+ISA.
Oh, I thought it was about the people claiming to be building a replica of Babbage's analytical engine, where the CPU is called "The Mill" [1] 11 years on, it's still all hype. They haven't even posted design documents, let alone built anything. Or even posted an annotated archive of the original documents.
When that started, I contacted them and asked how many part numbers the Analytical Engine needed. That's a basic number for sizing the build job. They didn't have an estimate.
(In manufacturing, a part is one object, and a "part number" is the term for a part design from which many parts are made. Engineering and tooling effort is driven by the number of unique part numbers.)
If they were serious, they could start posting drawings and getting people to model them in SolidWorks or Fusion 360. Somebody would probably CNC machine the parts and make a sample storage register.
VCs have been doling out cash by the kilo since 2020. Why haven’t they been able to secure funding, even for the modest goal of making an FPGA proof of concept?
I assume it's because there's nothing reaching the strata of VCs that tells them that Mill Computing's thesis is a slam dunk for a VC thesis. VCs invest mostly as a herd, not as pioneering eccentric outliers. Unless a company is doing a thing that is already inbounds of where the herd is at or where its vanguard is headed, then those companies will have a comparatively very hard time raising VC money.
It might look to you like VCs are just throwing gobs of money around at anything with a pulse, but they're not. They're not throwing money at 100,000 distinctly unique kinds of market making or new category creating ventures. They're throwing lots of money at slightly different flavors of parallel attempts at iterative improvements in mostly established markets.
For what Mill Computing is purporting to do, I'd expect government grants and maybe strategic investment from industry CVCs to be their only real viable bet. That, or finding extremely "smart" or extremely "dumb" money in the VC marketplace, because I'd be very surprised if anybody in between is going to be looking for a Mill Computing to fill out their portfolio. It's way too esoteric. That's not an indictment of Mill Computing or of VCs. Neither of them seems like they're in the right place for the other.
That's only one part of the problem. I contacted them asking if they needed my help because I was interested in the project, and they wanted my help. However, they said I would have to spend at least 5 hours a week on it, every week. They demanded that.
Well, that wasn't something I could promise for something I would do in my spare time, so I had to decline.
What the post author is measuring is simply the absence of momentum and passion from whoever was curating the news/communication at the time. A company with dedicated staff for this purpose would have kept the news flowing regularly for PR reasons.