Tart: VMs on macOS using Apple's native Virtualization.Framework (tart.run)
261 points by PaulHoule on Jan 19, 2024 | 135 comments


I've been experimenting with tart recently as a means of spinning up quick test environments for management workflows. I've been pleasantly surprised. There's seemingly nothing special under the hood. It's just a CLI wrapper around Apple's Virtualization framework, but the UI is clean and the basic workflow to get from an IPSW to a running VM is very straightforward.
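
For anyone curious, the basic flow looks roughly like this (from memory of the tart docs, so exact flags and image names may differ by version):

  # build a vanilla macOS VM straight from an IPSW (tart can fetch the latest one)
  tart create --from-ipsw=latest sonoma-vanilla
  tart run sonoma-vanilla

  # or skip the install and start from a prebuilt image in an OCI registry
  tart clone ghcr.io/cirruslabs/macos-sonoma-base:latest sonoma-base
  tart run sonoma-base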


As a Mac Systems Administrator, I have extensive experience with Tart, having been an early adopter. My role involves rigorously testing various aspects of Mac systems, such as settings, scripts, workflows, and software deployments. Tart has been instrumental in this process, particularly due to its capability to swiftly set up virtual machines (VMs). A key feature I frequently utilize is booting a VM's macOS into Recovery OS, which allows me to disable security settings. This step is crucial for enrolling devices in DEP, despite manual registration being a different process. By disabling SIP, I can activate specific flags that enable DEP-like features.


Their “special sauce” is a well thought-out CLI. It’s easy to understand, easy to remember, and the things one would need are implemented.

There are many other tools on GitHub which use Apple’s virtualization.framework to do this, but their UX is all horrific compared to tart.


Also want to note the availability of a GitLab executor, a Buildkite plugin, and many more integrations for all kinds of workflows.


Orb also runs Linux machines, a feature I missed for the first few weeks of using it!

https://docs.orbstack.dev/machines/


Tart runs Linux virtual machines as well.

It is an excellent tool.


I'm curious why file system performance from the VM through the hypervisor to the host is so slow on macOS. Is this some sort of fundamental limitation or is it a case of "Apple hasn't made this a priority"?

My knowledge could be out of date and maybe this is fixed, but I've tried using Docker on macOS and it's almost unusable for dev environments with lots of files because of the file system performance.


How recently have you tried it? The current best-performance option for bind mounts in Docker Desktop on macOS (VirtioFS, using Virtualization.framework) didn't become the default until June (for new installs) or September (for existing installs), and wasn't available at all prior to December 2022.


I haven't tried it on a large project recently. How good is it?


Better, but still a pain. Nothing compared to running Docker on Linux.


But Docker on Linux doesn't involve any virtualization whatsoever, so filesystem performance is 100% native.


Have you tried OrbStack?


Cross-OS fs performance is a problem, especially latency-wise. It’s the same with Parallels, VirtualBox, and VMware, across just about any fs. Or you can have problems with hardlinks as well.

I personally use mutagen to sync host and vm and it works great.
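
If it helps, the setup is basically just two commands (the names here are made up for illustration):

  # continuously two-way sync a project directory into the VM over SSH
  mutagen sync create --name=myproj ~/code/myproj admin@dev-vm:~/myproj
  mutagen sync list    # check session status / conflicts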


Did you have VirtioFS enabled when you tried it?


File performance in itself isn’t bad per se, but keeping two file trees in sync is a mess with Docker because every file operation needs to be duplicated or something. There’s some async stuff as well but ultimately most programs meant for Linux assume file access to be super fast so any sort of intermediate steps end up being slow (see also: git on Windows being dog slow)

You want super speedy docker on Mac? Run docker _inside_ a Linux vm. Even better, just use Linux!


I've been using UTM, it seems to work OK. I use arm64 images, I haven't tried x86_64 images. Works for Windows and Debian.


UTM isn’t a direct analog to tart; tart is scriptable extremely easily and there’s an accompanying Packer plugin for it, so it’s a breeze to completely automate the creation of VM images, distribute them via a container registry, as well as create and destroy VMs created from those images. It’s a single tool which does all the things an automated CI system needs to be able to do.


x86_64 runs using QEMU for emulation, and it's incredibly slow. I'd call it practically unusable.


Hard agree, it does a decent job emulating an x86 container. However Angular builds were still painfully slow.


For running an x64 container, I'd recommend running aarch64 Linux and then running the container using Rosetta 2, that yields significantly better performance than running everything through QEMU.


Wait what? I thought Rosetta 2 only worked in MacOS? Where does aarch64 Linux fit in this picture?



Coincidentally I set this up last night under UTM (https://docs.getutm.app/advanced/rosetta/) and it worked astonishingly well. Outside of Rosetta, the native virt seemed a bit slower than whatever UTM does normally, but perfectly usable on my M1 MBP.


If you're saying the rest of the aarch64 VM got slower after enabling Rosetta 2, that's unfortunately expected. Under Linux, TSO is enabled for everything, even aarch64 processes when Rosetta is active. On macOS, TSO is selectively enabled only for the x64 processes and leaves everything else alone.


Mac OS 9 (PPC) runs OK. A little slow (MacBook Air M2).

https://mac.getutm.app/gallery/


After installing UTM, did your Mac experience a noticeable slow down, even when no VMs are running? It was frustrating enough for me to uninstall it altogether.


I've been using arm64 Debian and NixOS with no slowdowns, 32GB memory. I don't care to try x86_64 emulation personally, but the native Linux versions run well in my experience. The only annoying thing I've found was that while video would run smoothly, there was anywhere from a 100-500ms delay in the sound when playing YouTube in the VM.


Slightly off topic, but I tried NixOS on a laptop with just 4GB and it ran out of memory doing a refresh of some sort (I'm a Nix noob at the moment).

My reading suggests I am stuffed, but I feel the goal is worthwhile. There does not seem to be a minimum RAM requirement, but if you had any pointers I would be grateful.


I have Ollama, Jellyfin, and keep a UTM VM with Kali open at all times. I don't notice any slowdowns.

Chip: Apple M1 Max. Total Number of Cores: 10 (8 performance and 2 efficiency). Memory: 64 GB.


Using UTM on an M2 MacBook: the Debian VM is blazing fast, and so is the host Mac.

The networking, on the other hand, is doing my head in.


Any idea how this compares with UTM in terms of features or what it's intended use-case or strengths are?


Now if only there were a simple solution for running x86-64 VMs.


Maybe not a 100% fit, but since Docker Desktop 4.25 they run x86-64 binaries with Rosetta 2, which should offer near-native performance: https://www.docker.com/blog/docker-desktop-4-25/
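
Once the Rosetta setting is enabled in Docker Desktop, running an amd64 image is just a flag away, e.g.:

  docker run --rm --platform linux/amd64 alpine uname -m   # prints x86_64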


The whole point of virtualization is you're running as close as possible to directly on native hardware. That's literally what makes virtualization distinct from emulation for VMs.

If you're trying to run an x86 client OS you need an emulator, there's just no way around it. If you just have some x86 binaries and don't actually need a full x86 OS, they've made rosetta available for client linux VMs.


The thing is, such an emulator exists; macOS itself uses it to run old x86 macOS apps transparently. Why is it that as soon as that app happens to be, say, VMware, the OS suddenly says nuh uh, not gonna do it?

The technology works, I run AAA x86 games on my Mac Studio via Crossover. Sure, the performance is a bit limited, but it's limited by the nature of an integrated, albeit fairly powerful, GPU. It works surprisingly well considering many of these games are targeted at, say, 1000-series NVidia cards.

But wanting to run an 8GB Linux instance so I can run a local dev environment is an impossible ask? (Before anyone asks, no, ARM Linux isn't really a viable solution for...reasons I don't feel like going into but are mostly boring and technical-debty).


The canonical mechanism for running amd64 Linux processes appears to be to virtualise aarch64 and use binfmt-misc with Rosetta 2 to get emulation working.

It does make a certain amount of sense that Apple would have hardware virtualisation support for native VMs but not for emulated VMs. I can imagine (but I've not checked) that support for emulation of the VT extensions is lacking.

As a random person on the Internet, I'm obviously overqualified to suggest that you use native virtualisation to run aarch64 Linux, then use Rosetta within that Linux VM to run whatever amd64 software virtualisation tool you prefer. This is quite similar to what containerisation tooling does -- Docker (and similar) on aarch64 runs an ARM VM, then uses Rosetta inside that VM to run containers. You don't get a native amd64 kernel that way, but even without nested virtualisation you get a complete (namespaced) amd64 userspace.
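
For the curious, the guest-side setup is small. A sketch (the share name, mount point, and magic/mask bytes here are from memory of Apple's Rosetta-for-Linux docs, so verify against the current documentation):

  # inside the aarch64 Linux guest, after the VM config exposes Rosetta as a virtiofs share
  sudo mkdir -p /media/rosetta
  sudo mount -t virtiofs rosetta /media/rosetta

  # register Rosetta as the binfmt_misc handler for amd64 ELF binaries
  sudo /usr/sbin/update-binfmts --install rosetta /media/rosetta/rosetta \
    --magic "\x7fELF\x02\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x3e\x00" \
    --mask "\xff\xff\xff\xff\xff\xfe\xfe\x00\xff\xff\xff\xff\xff\xff\xff\xff\xfe\xff\xff\xff" \
    --credentials yes --preserve no --fix-binary yes

After that, amd64 binaries (and amd64 containers under Docker) run transparently inside the otherwise-native aarch64 VM.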


Ah, that's not quite accurate.

Rosetta is an emulator (sorry <marketing>translator</marketing>) for userspace. Happily Apple actually provides direct support for Linux in Rosetta (as in it can parse Linux binaries and libraries, remap syscalls correctly, etc., in Linux VMs), and there's even a WWDC video on how to integrate it, so that you can install an ARM Linux kernel and then run x86_64 Linux binaries via Rosetta. Windows has its own translator so doesn't need Rosetta.

At a basic level the semantics of usermode x86 and usermode arm are identical, so you can basically say "I just care about translating the x86 instructions to arm equivalents", and all the rest of the "make this code run correctly" semantics of the hardware (memory protection, etc) then just work.

That basically breaks down completely for kernel mode code, for a variety of reasons. At one level, a lot of the things Rosetta-style translators can do for speed break down because they can't just leverage memory protection any more. But at a more fundamental level your translator now needs to translate completely different mechanisms for how memory is accessed, how page tables are set up, how interrupts are handled, etc. It's hard to say exactly which bit would be the worst part, but likely memory protection - you could very easily end up in a state where your translator has to add non-hardware-supported memory protection checks on every memory access, and that is slow. Significant silicon resources are expended (the extremely high performance and physically colocated TLBs, caches, etc) to make paged memory efficient, and suddenly you have actual code running and walking what (to the CPU) is just regular RAM. So now what was previously a single memory access for the CPU involves a code loop with multiple (again, for the CPU) random memory accesses.

Those problems are why CPUs added explicit support for virtualization in the early 2000s. VMware, etc. started making virtual machines an actual viable product because they were able to set things up so that the majority of code in the client OS was running directly on the host CPU. The problem they had for performance was essentially what I described above, only they didn't also have to translate all of the instructions (I believe their implementation at the time did translate some instructions, especially kernel mode code; part of their "tolerable" performance came from being very clever, and of course the client OS kernel is running in the host OS's user mode so it definitely has to handle client kernel use of kernel-mode-only instructions). A lot of virtualization support in CPUs basically boils down to things like nested page tables, which let your VM just say "these are the page tables you should also be looking at while I'm running".

Now for cross architecture emulation that's just not an option, as the problem is your client kernel is trying to construct the data structures used by its target cpu, and those data structures don't match, and may or may not even have an equivalent, so there is no "fix" that doesn't eventually boil down to the host architecture supporting the client architecture directly, and at that point you're essentially adding multiple different implementations of the same features to the silicon which is astronomically expensive.

The far better solution is to make the child OS run a kernel native to the host architecture, and then have that do usermode translation like Rosetta for application compatibility.


You can do it with QEMU using its emulation backend to emulate the entire x86 boot chain. It will be dozens (hundreds?) of times slower.

Hypervisors work efficiently because they allow you to "nest" certain functions inside of other userspace processes, which are typically only available to the operating system (with a higher privilege level), things like delivering interrupts efficiently, or managing page tables. The nesting means that the nested operating system is running on the same hardware in practice, and so it inherits many of the constraints of the host architecture. So, virtualization extensions can be seen as a kind of generalization of existing features, designed so that nesting them is efficient and secure.

For example, different architectures have different rules about how page tables are set up and how virtual memory page faults are handled in that environment. The entire memory mapped layout of the system is completely different. The entire memory model (TSO vs weak ordering) is different. There are strict correctness requirements. A linux kernel for x86 has specific x86 code to manage that and perform those sequences correctly. You cannot just translate x86 code to ARM code and hope the same sequence works; you have to emulate the entire processor environment, so that x86 code works as it should.

Rosetta does not emulate code; it is a binary translator. It translates x86 code to ARM code up front, then runs that. The only reason this works is because normal userspace programs have an ABI that dictates how they interoperate with each other across process and address space and function boundaries. When CrossOver runs an x86 game for example, it translates the x86 to ARM. That program then calls Metal APIs, libmetal.dylib or something, but that .dylib itself isn't x86-to-ARM translated. It is simply a "pass-through" shim to your system-native Metal APIs. So the graphics stack achieves native performance; the application overhead comes from the x86-to-ARM translation, which needs to preserve the semantics of the original code.

Rosetta-for-Linux works the same way, because there is a stable ABI that exists between processes and function calls, and the Linux kernel ABI is considered stable (though not between architectures in some cases). It translates the x86-to-ARM, and then that binary is run and it gets to call into the native Linux kernel, which is not translated, etc. This basically works well in practice. It is also how Windows' x86-on-ARM emulation works.

If you want to emulate an entire x86 processor, including an x86 Linux kernel, you have to do exactly that, emulate it. Which includes the entire boot process, the memory model, CPU instructions that may not have efficient 1-to-1 translations, etc.

Unfortunately, what you are asking is not actually reasonably possible, in a technical sense. Your options are either to use Rosetta-on-Linux to translate your binary, or get an actual x86 Linux machine.


> Rosetta does not emulate code; it is a binary translator. It translates x86 code to ARM code up front, then runs that. The only reason this works is because normal userspace programs have an ABI that dictates how they interoperate with each other across process and address space and function boundaries. When CrossOver runs an x86 game for example, it translates the x86 to ARM.

I don't think this is true. I think code run under WINE is always JITted because it's too much unlike a Mac binary.


Honestly I feel like ([edit: removed personal detail that in hindsight isn't really relevant in this specific discussion]) translation vs emulation is very close to a purely marketing distinction. I feel like the only thing that makes it a semantically valid distinction is if you take an interpretation of translation meaning explicitly that the "translated" code is run as native code on the CPU, rather than as a loop written in native code that manually evaluates each original instruction. But that's still iffy, as many things that people would absolutely call emulators nowadays are generating native code as well.

JIT vs non-JIT also isn't really sufficient. Rosetta necessarily does runtime code generation as well[1], because Rosetta needs to support running code generated by x86_64 JITs at runtime (note that this is actually often slower than simply not using a JIT in the first place, to the extent that JavaScriptCore at least has a mechanism to ensure it does not use the JITs if it's running under Rosetta for this reason). Then, as you say, programs like WINE are definitionally not using platform binaries, so any code that they load and start to execute looks basically identical to "this program generated code at runtime". I do wonder if there's something that could be done either as some kind of Rosetta API to register a block of code as being logically static to allow some kind of caching, or by WINE doing some fairly butchered rewriting to convert the Windows objects into something that looks like "normal" Mach-Os to essentially trick Rosetta into the AOT and/or caching path.

[1] Until the 2023 rosetta update the linux VM support only supported JIT translation, that update added support for a more Mac like model where there's a central daemon to handle loading binaries and libraries so they can be cached and shared


Your terminology is off and you seem to be talking about a hypervisor or something. Virtualization is just virtualizing computer hardware and emulation is one way to do it


Sorry, I'm not sure what's going on here, I replied to another comment about this and there's multiple people stating that (s)he's thinking of containers.

I think we have collective amnesia about what virtualisation actually is and why it's distinct from containerisation.

Virtualisation is absolutely about "skipping" parts of the emulation chain to do direct calls to the CPU in a way that does not need to be translated; in this way it gets much closer to the hardware.

Containerisation was considered faster still because instead of having even the ~5% overhead of a VM that can't use the same process scheduler in the kernel, you can share one.

Yes, containers are able to execute faster than VMs, but the parent is absolutely right that the entire point of virtualisation as a concept was to get closer to the CPU from inside an emulated computer.


…or they edited their post afterwards ;)

I’ve been using containers for decades on FreeBSD and Solaris, long before Linux ever caught on. And virtualisation even longer.

In fact I have fond memories of using the first version of VMWare (which was a literal emulator at that point because x86 didn’t support virtualisation back in 1999) to run Windows 2000 from Linux.

So, like the others who responded, I definitely know the difference between virtualisation and containerisation.


You have a similar background to me; though I used Zones on Solaris before Jails on FreeBSD.

I also used VMWare when it was an emulator and hated how abysmally slow it was


Christ was it slow!!

I wasn’t using it for anything serious thankfully.


You’re thinking of containerisation. Virtualisation does abstract away direct interfaces with the hardware. And some virtual machines are literal emulators.


No, he's right.

Containerisation is distinct from virtualisation.

Virtualisation shares some areas with Emulation, but it's essentially passing CPU instructions to the CPU without translation from some alternative CPU machine language.

The difference here is the level; in descending order:

* Containerisation emulates a userland environment, shares OS/kernel interfaces.

* Virtualisation emulates hardware devices, but not CPU instructions; there are some "para-virt" providers (Xen) that will share even a kernel here, but this is not that.

* Emulation emulates an entire computer including its CPU


I don’t think you’ve read the comment chain correctly because you’re literally just repeating what I just said.

Though you make a distinction between virtualisation and emulation when in fact they can be the same thing (they aren’t always, but sometimes they are. It just depends on the problem you’re trying to solve).


>> The whole point of virtualization is you're running as close as possible to directly on native hardware. That's literally what makes virtualization distinct from emulation for VMs.

> You’re thinking of containerisation.

no.


They edited their post. If you look at the comments others have made, you can see their original comment was much more ambiguous


Is there a way to see original comments?


Unlikely. It’s too recent for Wayback machine to cache.

Their post was ostensibly the same but much more vaguely worded. And if you say “virtualisation is about being as close to the hardware as possible” without much more detail to someone else who talks about wanting to run a VM with a different guest CPU, then it’s understandable that people will assume the reply is mixing up virtualisation with containerisation, since there’s nothing in virtualisation that says you cannot emulate hardware in the guest. Whereas containerisation is very much intended to run natively on the hardware.


Whose post was edited? If they're referring to my original one, then there's only one possible edit I made, because I recall making a comment today that was missing a word, but I updated that instantly so there were no comments or anything, or really any likelihood anyone had read it prior to my edit. My "edit" that now has to be an additional comment is at https://news.ycombinator.com/item?id=39061507 - I'm curious if you agree with my justification for treating virtualization as meaning explicitly non-emulated these days. From your other comments it seems you do agree with me, but I'd like to know how you feel about the rationale.

In response to the editing comment made, my rule for editing or changing comments is that if it's not an instantaneous edit, then the edit should be marked. In response to this thread (the whole thing, not just the one leading down these branches) I did write a stupidly long addendum but spent so long trying to find old marketing material I then couldn't make the edit \o/

My more careful rule about editing is that if someone has replied to a comment, I won't change the text that they replied to, unless the confusion is something very simple like a missing "not" or something where the original text was clearly wrong, and I'll generally do some variation of " .... [edit: not] ...". Otherwise I try to do it as adding additional text with a note saying the new text was added.

I find silent edits incredibly annoying, as it means you can't tell if/what has changed in a comment you replied to, and it allows for people to exhibit some really screwed up behaviour. Basically along the same lines as people brought up when apple added editing/deleting to iMessage, where the original betas I think didn't show edits/deletion had occurred, nor what had been changed. I don't think it's reasonable in this day and age for sites like HN to not provide an edit history for comments.


> I'm curious if you agree with my justification on treating virtualization is meaning explicitly non-emulated these days.

I often see people try to make a distinction between hardware virtualisation and hardware emulation but the reality is they're just different sides of the same coin. It's like saying TypeScript isn't a compiled language because it doesn't produce assembly like C would. Sometimes you need to emulate the entire CPU. Other times the guest and host CPUs are the same and that CPU supports hardware assisted virtualisation. But even AMD64 virtualisation solutions have to emulate some parts of the hardware stack (like the virtual network card) regardless of any paravirtualisation and other virt-extensions provided by the CPU, GPU, and so on.

To compound the confusing jargon: all emulators are virtual machines, but not all virtual machines are emulators.

The distinction between containerisation and virtualisation is a lot easier to describe. Rather than placing hardware gaps (whether those hardware gaps are defined in software or hardware) like you do with virtualisation, instead you have all code running natively with only your kernel for protection.

I guess you could draw some parallels between modern hypervisors and kernels, but even here, you wouldn't run multiple kernels on top of each other in a container. However you would run multiple kernels on top of a hypervisor. The layer of separation is a lot different.

Personally I view the things as being

  1. containerisation
  2. virtualisation
    2a. hardware assisted
    2b. emulation
ie emulation is a subset of virtualisation. Distinct in some ways but not an entirely new category in itself


This dichotomy of hardware assisted vs emulation is too simplistic though.

> Sometimes you need to emulate the entire CPU. Other times the guest and host CPUs are the same and that CPU supports hardware assisted virtualisation.

You don't need dedicated hardware assistance to do some amount of virtualization if you have an MMU and memory protection. This is how early VMware on x86 worked, it was not a full blown emulator, it did not need to emulate the entire CPU. Most guest code ran unaltered without emulation, only certain ring 0 code had to be emulated.

VMware ESX and workstation ran usably well back in the x86 days, I mean it was a viable product. It was not just an "emulator", in contrast to something like BOCHS or early qemu.

https://en.wikipedia.org/wiki/X86_virtualization

https://www.vmware.com/pdf/asplos235_adams.pdf

I don't think that's an accurate categorization, really. Emulation is a technique that may be used in parts of the implementation of virtualization, but they're different concepts; one is not a subset of the other.

> All emulators are a virtual machine

No, I can emulate a single piece of hardware like a network adapter. I can emulate just a CPU core independent of a particular virtual environment.


> This is how early VMware on x86 worked, it was not a full blown emulator, it did not need to emulate the entire CPU.

Of course it didn't. But it did (and parts of it still does) emulate hardware. Which is my point.

> This dichotomy of hardware assisted vs emulation is too simplistic though.

Is it though? Or is this need to define different aspects of virtualisation as entirely different fields effectively being too simplistic? I'm acknowledging that there is shared heritage and commonality between principles -- even to the extent that some directly borrow from the other.

You say "dichotomy" to refer to my comments yet you're the one trying to divide a complex field into small pigeonholes without acknowledging that they're intermingled. This is what PR people do to sell products to customers, not engineers like us.

> VMware ESX and workstation ran usably well back in the x86 days

Workstation predates ESX by a few years and the 1.x versions of Workstation really didn't run well _at all_. I used it personally, it was a bloody cool tech demo and even back then I could see the potential, but it was far too slow to use for anything serious. Particularly when UNIX and BSDs were still in vogue (albeit barely) and they had excellent support for application sandboxing via containerisation.

> I don't think an accurate categorization really. Emulation is a technique that may be used in parts of the implementation for virtualization,

So you agree with the technical description but not the categorisation?

> but they're different concepts, one is not a subset of the other.

That's like saying DocumentDB and Postgres aren't both databases because one is SQL and the other is not.

The concepts between emulation and virtualisation are similar enough that I'd argue that one is a subset of the other. What you're discussing is the implementation detail of one form of virtualisation -- bearing in mind that there are other ways to run a virtual machine which we also haven't covered here. You've even agreed yourself that a VM requires parts of the full machine to be emulated for it to work. So the only reason people don't (still) refer to emulation as a subset of virtualisation is marketing.

> It was not just an "emulator", in contrast to something like BOCHS or early qemu.

I really don't get what you're trying to say with "early qemu". qemu still does full emulation. It also supports virtualisation too. If anything, it's another example of the point that I'm making which is that hardware virtualisation and emulation are too frequently intermingled for it to be sensible claiming they're distinct categories of computing.

> No, I can emulate a single piece of hardware like a network adapter. I can emulate just a CPU core indepedent of a particular virtual environment.

And that's exactly why they're a subset of virtual machines. I know this isn't the common way to refer to a VM but I'd argue that an emulated hardware adapter is still a virtual machine because it takes input, returns output, and is sandboxed and self contained. In the pure mathematical sense, it is a virtual machine just like how some software runtimes are also classified as virtual machines.

I feel a lot of the issue here is down to businesses redefining common terms over the years to make their products seem extra special.


> early qemu

By early qemu, meaning qemu prior to KVM and KQEMU (ie between 2003 and 2007), which only did full CPU emulation.

> Workstation predates ESX by a few years and the 1.x versions of Workstation really didn't run well _at all_

It improved quickly. By 3.0 (early 2002) I was using a Windows 2000 desktop with Visual Studio on a Linux host as a daily driver workstation. That was still on pre-VT-x x86. I then used this on a Pentium M (no VT-x) ThinkPad for years.

> I know this isn't the common way to refer to a VM but I'd argue that an emulated hardware adapter is still a virtual machine ...

The thing is overall I agree with you, I think. I agree that these terms have been used in different ways over the years to discuss overlapping concepts. Maybe the only thing is I was pushing back on a notion of "one true" ontology, but maybe that was not really your point. I also thought you were implying that system virtual machines for targets without hardware assistance (your other comment seems to suggest this) require full CPU emulation, but perhaps that was not your intent.


> I also thought you were implying that system virtual machines for targets without hardware assistance (your other comment seems to suggest this) require full CPU emulation, but perhaps that was not your intent.

In fairness, I did write my comment that way. So I can see why that's the conclusion you drew. Sorry for the confusion there.


1. containerization is often implemented these days in terms of partial virtualization, because the historical approaches, which were essentially a bunch of variations on chrooting, were not sufficiently isolated to create a sufficient security boundary for a multiuser "cloud" hosting service.

2. virtualization, as my update/comment up the thread said, the definition of virtualization being "host os code runs directly on the cpu" has been pretty much the standard definition for a couple of decades at this point. If you say you offer virtualization, but you implement it using an academically "accurate" definition that allows emulation I would imagine that you would have difficulty finding a user that accepts that definition. Again, as I've stated elsewhere "hardware virtualization support" that CPUs have acquired since the 90s is essentially multi-level page tables and cpu mode options so that a virtual machine runtime doesn't need to rewrite kernel mode code. It has not meaningfully impacted user mode code at all.

It is not reasonable to reject the evolution of language when considering the meaning of a word, nor the context of the environment in which it is discussed. The fact that 30 years ago you could say an emulator was a virtual machine is not relevant today, where the terminology very clearly does not include ISA emulation. This is as true for modern tech terminology like "virtualization" as it is for other tech terminology. For example, no one would accept me presenting a person good at maths as a "computer" either, despite that being what it used to mean.


> containerization is often implemented these days in terms of partial virtualization, because the historical approach that were essentially a bunch of variations of chrooting were not sufficiently isolated to create a sufficient security boundary for a multiuser "cloud" hosting service.

Containerisation has zero virtualisation. There's no virtual environment at all. It's just using kernel primitives to create security boundaries around native processes and syscalls.

You're also talking very Linux specific. Linux was late to the containerisation game. Like decades late. And frankly, I think FreeBSD and Solaris still have superior implementations too. Linux is getting there though.

> 2. virtualization, as my update/comment up the thread said, the definition of virtualization being "host os code runs directly on the cpu" has been pretty much the standard definition for a couple of decades at this point.

The issue that I take with your comment here is that it implies containerisation doesn't run directly on the CPU. Or that some forms of emulation cannot run directly on the CPU either -- sometimes the instruction sets are similar enough that you can dynamically translate the differences in real time.

The actual implementation of these things is, well, complicated. Making sweeping generalisations that one cannot do the other is naturally going to be inaccurate.

> If you say you offer virtualization, but you implement it using an academically "accurate" definition that allows emulation I would imagine that you would have difficulty finding a user that accepts that definition.

What you're talking about now is entirely product marketing. And that's what I take issue with when talking about these topics. Just because something is marketed as "emulation" or "virtualisation" it doesn't mean the disciplines of one cannot be a subset of the other.

> Again, as I've stated elsewhere "hardware virtualization support" that CPUs have acquired since the 90s is essentially multi-level page tables and cpu mode options so that a virtual machine runtime doesn't need to rewrite kernel mode code.

If you're talking about x86 (which I assume is the case because that's when virtualisation became a commodity) then you're out by about a decade. It was around 2006 when Intel and AMD released their x86 virt extensions and before then, VMWare ran like shit on consumer hardware.

I was there, using VMware 1.0 in the late 90s / early 00s. I remember the excitement I had for x86 finally adding hardware assistance.

Sure, there were ways to get virtualised software to run natively on x86 before then. But it didn't work for privileged code. And as we all know, swapping memory in and out of ring-0 is expensive. So having that part of the process artificially slowed down killed VMware's performance for a lot of people.

This is also why paravirtualisation was such a popular concept back then. It allowed you to bypass those constraints.

> It is not reasonable to reject the evolution of language when considering the meaning of a word, nor the context of the environment in which it is discussed.

But that's the problem here. The evolution is completely arbitrary. It's based on marketing and PR rather than technology. In technical terms, there are multiple different ways to virtualise hardware (even without discussing emulation). And in technical terms, full virtual environments are still dependent on parts of that machine being emulated. So it's not unreasonable to argue that emulation is a subset of virtualisation. It always used to be considered that way, and little has changed aside from how companies now market their software to management.

> For example, no one would accept me presenting a person good at maths as a "computer" either, despite that being what it used to mean.

That's probably not the best of examples because even in the 1800s a "computer" wasn't just someone who was good at maths. It was someone who computed mathematical problems. The term was used to define the machine (albeit a fleshy organic one) rather than the skill.

To that end, you do still sometimes see people refer to themselves as a computer if they're manually computing stuff. It's typically said in jest though, but it does demonstrate that the term hasn't actually drifted as far from its original meaning as you claim.

There definitely are other examples of terms that have evolved and I'm all for languages evolving. Plenty of terms we use in tech are borrowed terms that have evolved. Like `file`, `post`, `terminal`, etc. But they still refer to the same specific properties of computing which have evolved with time.

The problem with this virtualisation vs emulation discussion is that those concepts haven't evolved in the same dramatic way. Methods of virtualisation have diversified but it still relies on elements of emulation too. And modern emulation borrows a lot from principles learned through hardware assisted virtualisation. They're fields that are still heavily entwined. And "virtualisation" itself isn't a single method of implementation (neither is emulation for that matter, but more so for virtualisation). So arguing that emulation isn't a subset of virtualisation is just bullshit marketing.

And that's why I'm unwilling to acknowledge this rebranding of the term. Once you start building virtual machines you end up with a real mix and match of paravirtualisation, hardware assisted virtualisation, emulation and so on and so forth, all components powering the same VM.


u/hnlmorg edited one of their comments in this chain, I notice just now.

They said “you are just saying what I said” but now it has been altered to a much more agreeable and less aggressive statement. Shenanigans.

nonetheless I believe it is yours that is being accused of being modified, and I do agree with you.


Note that these aren't necessarily layered: You can virtualize with emulation, but you can also emulate without virtualization, which is what e.g. Rosetta does on macOS, or QEMU's userland emulation mode.


virtualization != containerization.


This was going to be an addendum to the above comment, but I took too long trying to hunt down old marketing material for Virtual PC. Anyway, here is the intended addendum.

[edit: Giant addendum starts here, in response to comments in replies, rather than saying variations on the same things over and over again. Nothing preceding this comment has been changed or edited from my original post. As I've said elsewhere, I really wish HN provided edit history on comments]

First off, for people saying I'm talking about containers, I was not considering them at all, I consider them as essentially tangential to the topic virtual machines and virtualization. I haven't looked into modern container infrastructure to really understand the full nuances, my exceptionally vague understanding of the current state of the art is the original full-vm-per-container model docker, et al introduced has been improved somewhat to allow better resource sharing between clients and the host hardware than a full-vm provides but is still using some degree of virtualization provide better security boundaries than just basic chroot containers could ever do. I'm curious about exactly how much of a kernel modern container VMs have based on the comments in this thread, and if I ever have time between my job and long winded HN comments I'll try to look into it - I’d love good references on exactly how kernel level operation is split and shared in modern container implementation .

Anyway, as commonly used, virtualization means "code in the VM runs directly on the host CPU", and has done for more than two decades now. An academic definition of a virtual machine may include emulation, but if you see any company or person talking about supporting virtual machines, virtual hosts, or virtualized X in any medium - press, marketing, article (tech or non-tech press) - you will know that they are not talking about emulation. The reason is simply that the technical characteristics of an emulated machine are so drastically different that any use of emulation has to be explicitly called out. Hence, absent any qualifier, virtualization means the client code executes directly on the host hardware. The introduction of hypervisors in the CPU simply meant that more things could be done directly on the host hardware without requiring expensive/slow runtime support from the VM runtime; it did not change the semantics of what “virtual machine” meant vs emulation even at the time CPUs with direct support for virtualization entered the general market.

Back when VMWare first started out a big part of their marketing and performance messaging boiled down to "Virtual Machine != emulation" and that push was pretty much the death knell for a definition of “virtualization” and “virtual machine” including emulation. As that model took off, "hypervisor" was introduced to general industry as the term for the CPU mechanism to support virtualization more efficiently (I'm sure in specialized industries and academia it existed earlier) by allowing _more_ code to run directly, but for the most part there was no change to userspace code in the client machine. Most of the early “hypervisor”/virtualization extensions (I believe on ARM they’re explicitly called the “virtualization extensions”, because virtualization does not mean emulation) were just making it easier for VM runtimes to avoid having to do anything to code running in kernel mode so that that code could be left to run directly on the host CPU as well.

The closest emulation ever got to "virtualization" in non-academic terminology that I recall is arguably "Virtual PC for Mac" (for young folk virtual pc was an x86 emulator for PPC macs that was eventually bought by MS IIRC), which said “virtual pc” in the product name. It did not however use the term virtualization, and was only ever described as explicitly emulation in the tech press, I certainly have no recollection of it ever even being described as a virtual machine even back during its window of relevance. I'd love to find actual marketing material from the era because I'm genuinely curious what it actually said, but the product name seems to have been reused over time so google search results are fairly terrible and my attempts in the wayback machine are also fairly scattershot :-/

But if we look at the context, once apple moved to x86, from Day 1 the Parallels marketing that targeted the rapidly irrelevant "virtual pc for Mac" product talked about using virtualization rather than emulation to get better performance than virtual pc, but the rapid decline in the relevance of PPC meant that talking about not being emulation ceased being relevant because the meaning of a virtual machine in common language is native client running code directly on the host CPU.

So while an academic argument may have existed that virtualization included emulation in the past, the reality is that the meaning of virtualization in any non-academic context since basically the late 90s has been client code runs directly on the host CPU, not via emulation. Given that well established meaning, my statement that virtualization of a non-host-architecture OS is definitionally not possible is a reasonable statement, that is correct in the context of the modern use of the word virtualization (again we’re talking a couple of decades here, not some change in the last few months).

If you really want to argue with this, I want you to ask yourself how you would respond if you had leased a hundred virtualized x86 systems, and then found half of them were running at 10% the speed of the rest because they were actually emulated hardware, and then if you think that a lawyer for that company would be able to successfully argue that the definition of “virtualization includes emulation” would pass muster when you could bring in reps from every other provider, and every commercial VM product and none of them involved emulation, and every article published for decades about how [cloud or otherwise] VMs work (none of which mention emulation). If you really think that your response would be “ah you got me”, or that that argument would work in court, then fair play to you, you’re ok with your definition and we’ll have to agree to disagree, but I think the vast majority of people in tech would disagree.


[addendum: per u/pm215, Hypervisor.Framework does still exist and is apparently supported on Apple silicon; I assume the absence of hardware docs just makes it miserable. OTOH maybe the Asahi GPU drivers, etc can work in that model? I really haven't ever done anything substantially more than what the WWDC demos do so am not a deep fount of knowledge here :D. To avoid confusion with replies I have not edited or changed my original comment. I kind of wish that HN UX exposed edit histories for comments, or provided separate addendum/correction options]

Virtualization.Framework is how you have to do virtualization on apple silicon as it is the userspace API layer that interacts with the kernel. There is no API you can use.

Virtualization.Framework is pretty much everything you need out of the box for a generic "I have an isolated virtual machine" model, basically it's just missing a configuration UI and main()

There are a couple of WWDC sessions over the last few WWDCs on using the framework, configuring rosetta, and improvements in the 2023 OS update

https://developer.apple.com/videos/play/wwdc2022/10002 https://developer.apple.com/videos/play/wwdc2023/10007

[1] commercial VM products probably require more work to compete in the market, things like the transparent desktop, window hosting, etc


Apple also documents the Virtualization framework fairly well at https://developer.apple.com/documentation/virtualization, with links to various code samples.

For example, https://developer.apple.com/documentation/virtualization/run...:

“This sample configures a virtual machine for a Linux-based operating system. You run the sample from the command line, and you specify the locations of the Linux kernel to run and initial RAM disk to load as command-line parameters. The sample configures the boot loader that the virtual machine requires to run the guest operating system, and it configures a console device to handle standard input and output. It then starts the virtual machine and exits when the Linux kernel shuts down.”


Interesting! I really need a cheap way to spin up an Apple Silicon container to create binaries for an open source project on GitHub. I don't want to spend money on an Apple Silicon runner in GitHub and I also don't want to run the build directly on my M2 MacBook Pro along with my other development work.


You don't have to use Virtualization.Framework. Hypervisor.Framework is the lower level API -- https://developer.apple.com/documentation/hypervisor ; QEMU uses that.


I was considering a similar approach were I still stuck with Apple for work: make a Firecracker OCI runtime for macOS. Fortunately Intune for Linux came around before I had to resort to that.


Virtualization.framework does most of the things Firecracker does on Linux. It's not literally the same, of course, but it does a comparable amount of the work for you. Here's an example application which uses it:

https://github.com/apinske/virt/blob/master/virt/virt/main.s...

And yes, that's really the whole thing. Once the VM is configured (which is what most of the code is concerned with), running it is fully handled by the framework.


Firecracker is also the distro that makes assumptions (and therefore boot time wins) about being run inside the Firecracker VMM, as far as I understand it. You'd also need the OCI runtime, and a Docker-compatible socket would make tons of sense.


I do like the idea of using container registries for VM images


I set up a workflow at $DAY_JOB for building the rootfs as a container and then "promoting" it to a vmdk and creating the ovf metadata file to allow it to be imported into VMWare as its own machine.

This was ~3 years ago and at least at the time I was annoyed at how little established tooling there seemed to be for doing an appliance image build offline— everyone was just like "why? Boot some public cloud-init template and use that as the basis for your terraform/ansible/whatever. If you actually need an OVF then export it from VMWare and be done with it."

On the other hand, once I got down in the weeds with things, I did find there were some bits that were a bit hairy about the promotion process— especially with "minimal" containers that have no users or init system, not to mention of course no filesystem, kernel, or bootloader, there is a fair bit that you have to do with a typical container to ready it for becoming a real boy.


Could put everything into the image - I mean using the registry just for storage/transmission rather than reusing pre-existing minimal container images.


Are there any similar open source tools that allow you to manage MacOS VMs? I'm aware of Lima / Colima but it seems they're for Linux only.



Yes, there is VirtualBuddy and Viable (not sure if this one is Open Source):

- https://github.com/insidegui/VirtualBuddy - https://eclecticlight.co/virtualisation-on-apple-silicon/



Interesting that the macOS images are publicly available from GHCR. I would've thought that would cause legal problems.

As for storing images in an OCI registry, I can't quite tell if Tart is layer-aware. If you pull a macOS image, modify it, and push it back to the registry, will Tart simply push a new layer with modifications? I'm guessing this isn't possible.


macOS EULA has some special wording around "Permitted Developer Services" allowing more flexibility:

> Permitted Developer Services means continuous integration services, including but not limited to software development, building software from source, automated testing during software development, and running necessary developer tools to support such activities.

As for layering, Tart doesn't support it at the moment. It seems APFS is not particularly good for layering, but Tart uses sparse files when storing VMs locally. A sparse disk image file basically "skips" zero bytes, which saves a lot.


Serious question: how far can you go with base model's 8GB RAM?

Doing VM workflows is one reason I didn't bother with recent MacBooks, as nice as they are. It is simply much cheaper to get a machine with removable RAM and then upgrade it later. Without going there, I can also build a decent ThinkPad T14 with 32GB for around $1,100 even though the RAM is soldered.


If you want to do VM workflows, I would definitely recommend upgrading the RAM (and probably the SSD) when buying the machine. Yes, it is not cheap with Apple, but that's still no reason not to get an Apple machine, if you are in the market for one in the first place. The big bonus you get is not only the nice hardware overall, but the ability to run Linux on a very fast ARM machine.


I can edit video in Final Cut Pro on my 8GB M1 Mac Mini while doing other things.


> I can edit video in Final Cut Pro on my 8GB M1 Mac Mini while doing other things.

I can't use IntelliJ or vscode with autocompletion on a 2023 MacBook Air with 8GB of RAM with a bunch of my projects.

The same projects run like a breeze on a cheap and very crappy Beelink minipc with 16GB of RAM whose total cost is lower than a RAM upgrade on a MacBook Air.


> I can't use IntelliJ or vscode with autocompletion on a 2023 MacBook Air with 8GB of RAM with a bunch of my projects.

That's surprising. (More developer anecdata: https://duncsand.medium.com/is-apples-cheapest-macbook-good-...)

Still, I'd absolutely recommend that devs and other creators spend the extra $200 for 16GB. And yes, it's outrageously priced in comparison to buying matched sticks for your PC.


> Still, I'd absolutely recommend that devs and other creators spend the extra $200 for 16GB.

Nowadays a Beelink SEi12 i7-12650H sells for around $550, and it ships with 32GB of RAM by default. Beelink is ultra crappy, but it goes to show how absurd the $200 markup demanded by Apple is, just to turn one of their laptops into a decent working machine.


I've used both IntelliJ and vscode on significantly sized projects on an 8GB MBA with no issues. It's not as fast as with 16GB, but it's definitely not unusable.


> It's not as fast as with 16GB, but it's definitely not unusable.

I'm sure it varies with how large your projects are. To me the impact was serious enough to force me to shift my development work to a crappy minipc from Beelink.


I'm curious if a native editor like Panic's Nova or BBEdit would work better than a Java or Electron app?


BBEdit is a lightweight editor compared to the IntelliJ IDEs. It is hard to compare the two as if they were on the same footing. But yes, if you can work with BBEdit on a project, go for it.

On the same note, Sublime will still win the editor performance competition on Mac and probably all platforms.


Does anyone know if GPU passthrough is available in any kind of macOS guests? I was hoping to run fast LLM inference in containers or virt guests.


There is a paravirtualized GPU available to macOS guests. But it's a bit limited and might not work for LLMs, though it's worth trying.


Hey, big shout to CirrusLabs and the CirrusCI people because it's the coolest CI.

Edit: CirrusLabs is the org behind Tart I believe.


I don't get it - why would one pay for something that comes for free on every Mac? Bootstrapping one in Swift is quite straightforward and there are a number of tools and apps (with UI) like virt (https://github.com/apinske/virt)


I guess their selling point is the container registry?


This tool used to be open (AGPLv3 for some reason; it’s not a network service) and they changed to this crap license once they realized they had something good.

The AGPL version is available in homebrew.


The realization behind the license change was a bit different. From day 1 we knew Tart is good and hoped to build a healthy ecosystem around it. Unfortunately, our enthusiasm wasn't met by big companies. You can check this blog post for more details: https://tart.run/blog/2023/02/11/changing-tart-license/.

The free usage is pretty generous: free on any number of personal computers, including personal workstations, and up to 100 cores (12 Mac Minis) when installed on servers. If your organization needs a bigger installation, then it probably values the product and can budget a little bit to support it.


I adore tart. I've used it over a year for work. It works very well with Cirrus Labs' images.


Can this be used to set up a dev Linux VM? Run JetBrains IDEs etc. on the VM? And if I were to run PyTorch, will it have access to the GPU / MPS?


You can set up a Linux dev VM, but unfortunately it won't have access to the GPU. There is Rosetta 2 support though, which works brilliantly with amd64 binaries.


I tried the latest Ubuntu image, but don't know what user and password to use to log in. Any idea?


If you just need Ubuntu then you can try "Multipass" from Canonical (https://multipass.run/). Works quite well on my M2 Air. I haven't tried using Linux GUI with it though as I need only terminal based VMs.
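
Getting a shell is about two commands (flag names have shifted a bit between Multipass versions):

  multipass launch --name dev --cpus 4 --memory 8G
  multipass shell dev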


creativity, should be user ubuntu pass ubuntu


I have tried user ubuntu with ... empty, pass, ubuntu. None of them works.


admin/admin worked for me


Thanks! working now


Has no one here heard of mutagen before? It solves our performance issue regarding syncing


What's the relationship with Veertu/Anka? The FAQ talks about Anka a lot.


Initially there was a lot of interest from Anka users in migrating to a more scalable and DX-friendly tool. That's why the FAQ starts with a comparison. In a nutshell, these tools are similar but take different approaches to the same problems: OCI registries vs own implementation, CLI-first vs REST-oriented, Packer/VNC vs own scripting, etc.


Is there a modern easy GUI that allows snapshots without reaching for QEMU?


Yes, VMTek does snapshots natively and takes advantage of APFS. We have a great graphical snapshot UI as well.


Interesting, thanks. I was looking for an open source solution though, sorry for not specifying.


Does VMTek support *BSDs (i.e. FreeBSD / OpenBSD)?


Can Tart be used with Vagrant?


Please check out and comment on this issue. There is some movement around it, but primarily it's waiting for the promised new Vagrant version in Go.

https://github.com/hashicorp/vagrant/issues/12760


APPROPRIATELY-SIZED WARNING: The website and GitHub repo do not make it immediately obvious that this project isn't open source* and that if you use it at work your company may have a bad time.

*: https://opensource.org/osd/

I wonder if we could add something like crowd-voted tags to submissions, to include e.g. the license for software that has a GitHub link.


Licenses like this inspire me to recreate it from scratch and redistribute under open source. Some would call this a “race to the bottom” but I see it as doing my part to contribute to chaos


No need; prior to this license they licensed it under AGPLv3, even though it is not a network service. You can use that version as-is or you could extend it.

Commits prior to the 1.0 release and the license change are there in GitHub, last I looked.


If you obtain and use software for commercial purposes without checking the licensing and usage terms, you need to grab a lawyer and have him give you a crash course at your leisure before some lawyers come knocking to give you a crash course at their leisure.


LICENSE file in the repo is also pretty clear in its intended limitations.

However, for the license to be effective it probably needs to be rewritten to handle agents of organizations. As it reads now, arguably 30 employees could deploy the software on 500 six-core Macs without violating the license.


Seemed pretty obvious to me after looking at the "Support & Licensing" section of the website. They explain it in clear terms.


Pretty much nothing about the website and repo indicated that this is an open source project. Why would anyone just assume that it is? Hell, one of the quick links at the top of the home page is "Support & Licensing".

Open source licensing is still licensing. I feel like people increasingly treat it as "yay freebies for me" and not as actual licenses that you still have to look for, read, and comply with.


> Pretty much nothing about the website and repo indicated that this is an open source project. Why would anyone just assume that it is?

There is a prominent GitHub link to the codebase.

I think the GitHub terms of service demand that public repo owners grant public downloaders minimal rights to view and use (but not necessarily redistribute) the code.

But the LICENSE file disallows use of the code to some. If the GitHub ToS does what I recall, it seems possible that the copyright license and the repo's presence on GitHub are actually in conflict with each other. (I'm not at all expert here, though.)


> if we could add something like crowd-voted tags to submissions

the repetitive license threads already add too much noise to most submissions (there are already two in this one) - anyone interested in using this can just read the license that comes with the repo. Just about every submission has more interesting things to discuss than license stuff.



Heads up: it's under one of those BSL-esque weirdo licenses [1] parameterised on seats and, get this, a seat is defined as a single CPU core (if you are not an individual). So don't get any ideas about running it on more than 5 Mac Studios if you're a university that wants to run CI for some open-source project along with those mirrors.

[1]: https://fair.io/?a


> What counts as “using” Fair Source licensed software with respect to the Use Limitation (e.g., 25-user limit)?

> The license doesn’t define “use” exactly because the way people deploy software changes all the time. Instead, it relies on a common-sense definition. For example, executing the software, modifying the code, or accessing a running copy of the software constitutes “use.”

Appealing to common sense for a critical definition in a binding license agreement? What could go wrong!


I don't know what the intended audience is. But for managing many instances among a few mac studios, it's much better to invest an afternoon to get the right qemu command and just use that, instead of all these fancy UIs.


Just use lima. https://lima-vm.io/

You have an option to use Native or QEMU.
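
For reference, the quick start is roughly (per the Lima docs):

  limactl start        # creates and boots the default Ubuntu instance
  lima uname -a        # "lima" is shorthand for "limactl shell default"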


Or, if you're doing containers (like Tart/Virt): colima. (which uses lima, of course).

https://github.com/abiosoft/colima


From a quick look at Lima, I don't think it's exactly the same thing. Tart allows running macOS and Linux VMs, while Lima seems focused on Linux VMs only. I don't use Tart, but having infrastructure-as-code-like tooling that can be used to define macOS environments (and store the VMs in container registries) sounds useful and I'm not aware of another solution that does it.


Does Lima only support Linux guests? The page seems to suggest so. Tart supports macOS guests too.


Also check out https://www.getvmtek.com for something very polished. We just released a huge update a few days ago.


Kind of a dark pattern to hide the license price (last line on the App Store page -> in-app purchases). It should be shown really prominently on the "buy now" page.

Also the features page is garbage. Wall of text with fairly generic stuff while it's still unclear: Can it run Windows? Can it run Linux? Arm64, x64 or both? MacOS?

Your main competition is VMWare Fusion and Parallels. See what features they advertise, make sure you are better and cheaper. Currently it looks like a university project rather than a real product.


It is a webpage that was quickly put together. We are a small company and our focus has been on the product. You can run Linux and macOS currently. That is listed on the features page as well as the app store product listing.

There is no dark pattern, it is actually a problem with the way Apple allows developers to sell software on the Mac App Store. We don't have proper control over the process and thus end up with this convoluted purchasing system that is more geared towards subscriptions - which are the real dark pattern nowadays. We sell without any subscriptions - at a very fair price that is extremely competitive with other products. The old upgrade pricing model was a lot more fair to both developers and users and we are sticking to it. This is the only way to offer a free trial on the App Store without requiring a separate installation, which would be inconvenient for everyone.

Far from a university project - but your opinion is yours to keep. It is a shame you are so negative, as it really is a labour of love and a quality Mac app. In fact we have a lot of firsts here, no one has really done a lot of these things (snapshots, suspend & resume, proper dynamic resolution with retina support) with the new Apple framework yet.



