Suppose that a codebase has two different functions with two different purposes,...

pchiusano · on July 5, 2024

If I understand your question, this would work much the same way as in any other language. Suppose you have:

  allocationPolicy = 23484

  -- two usages of allocationPolicy
  foo = allocationPolicy + 1
  bar = allocationPolicy + 99

Later you realize you want different allocation policies to be used in different parts of your app. You might first want to rename the existing `allocationPolicy`:

  move.term allocationPolicy defaultAllocationPolicy

At this point, all the code still references that hash, which now has the name `defaultAllocationPolicy`.

Next, if (say) you wanted `foo` to reference a different definition, you'd `edit foo`, and introduce:

  fooAllocationPolicy = 283

  foo = fooAllocationPolicy + 1

Then `update` and you're done. The new version of `foo` references `fooAllocationPolicy` while `bar` continues to reference `defaultAllocationPolicy`.
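To make the rename-then-update steps above concrete, here is a minimal toy model in Python. It is not how Unison is implemented: the `Codebase` class, its method names, and the truncated SHA-256 text hash are all assumptions made for illustration. The only idea it tries to capture is that names live in a separate table pointing at hashes, while stored definitions refer to their dependencies by hash.

```python
import hashlib


def content_hash(text: str) -> str:
    """Toy content hash: first 8 hex digits of SHA-256, prefixed with '#'."""
    return "#" + hashlib.sha256(text.encode("utf-8")).hexdigest()[:8]


class Codebase:
    """Hypothetical model: names point at hashes, bodies refer to hashes."""

    def __init__(self):
        self.names = {}   # name -> hash
        self.bodies = {}  # hash -> body, with dependencies stored as hashes

    def update(self, name, template, deps=()):
        # Resolve each dependency name to its hash before storing the body.
        body = template.format(*(self.names[d] for d in deps))
        h = content_hash(body)
        self.bodies[h] = body
        self.names[name] = h

    def move_term(self, old, new):
        # A rename touches only the name table; no stored body is rewritten.
        self.names[new] = self.names.pop(old)

    def view(self, name):
        # Pretty-print by substituting current names back in for hashes.
        body = self.bodies[self.names[name]]
        for n, h in self.names.items():
            body = body.replace(h, n)
        return body


cb = Codebase()
cb.update("allocationPolicy", "23484")
cb.update("foo", "{} + 1", ["allocationPolicy"])
cb.update("bar", "{} + 99", ["allocationPolicy"])

# Rename: foo and bar still reference the same hash, now under a new name.
cb.move_term("allocationPolicy", "defaultAllocationPolicy")
after_rename = cb.view("foo")

# Give foo its own policy; bar is left alone.
cb.update("fooAllocationPolicy", "283")
cb.update("foo", "{} + 1", ["fooAllocationPolicy"])

print(after_rename)    # defaultAllocationPolicy + 1
print(cb.view("foo"))  # fooAllocationPolicy + 1
print(cb.view("bar"))  # defaultAllocationPolicy + 99
```

Because `bar` stores the hash of the old policy rather than its name, neither the rename nor the new version of `foo` can change what `bar` refers to.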

A couple of other notes:

* It's very rare for independent implementations to end up with the same hash. It probably only happens for some very simple definitions that exist in base (like the identity function, say).

* If a hash has multiple names in your project because you've explicitly added them with `alias.term`, the pretty-printer picks one using a deterministic rule (it prefers names you've given that hash in your project, then it consults library dependencies). If you really want to give two definitions different hashes even though they are functionally the same, you can introduce a minor change, like an unused binding.

* The type of a definition is part of its hash, so sometimes you might specialize a more generic function with a narrower type signature, and this gets its own hash.

* The OP is slightly out of date re: patches. We use something simpler now for updates and merges. When you update, we compare the new and old namespace to obtain a diff, which is applied to the ASTs in the namespace. If the result typechecks, you're done. If not, we make a minimal scratch file for you to get typechecking - it will contain the minimal transitive dependents of the change.
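The points about hashes in the notes above can be sketched with a toy hash over a type signature plus a body. This is a simplification under stated assumptions: `term_hash` is a hypothetical helper that hashes plain text, and it makes no attempt to model how Unison actually hashes a definition. The definition strings are made up for the example.

```python
import hashlib


def term_hash(type_sig: str, body: str) -> str:
    """Toy hash covering both the type signature and the body text."""
    payload = type_sig + "\n" + body
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Two people independently write the identity function: same content, same hash.
identity_a = term_hash("a -> a", "x -> x")
identity_b = term_hash("a -> a", "x -> x")

# Same body with a narrower type signature: a different hash.
identity_nat = term_hash("Nat -> Nat", "x -> x")

# Functionally the same, but with an unused binding added: a different hash.
identity_padded = term_hash("a -> a", "x -> let unused = 0 in x")

print(identity_a == identity_b)       # True
print(identity_a == identity_nat)     # False
print(identity_a == identity_padded)  # False
```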

LegionMammal978 · on July 5, 2024

Thank you for the more detailed explanations. I was mainly wondering about the more philosophical concern of "unintended identical hashes" causing coupling between parts of the codebase with different purposes (which may want to evolve apart in the future), which you say is thankfully rare in practice.

For instance, say you have a function which generates a certain business report, and your boss wants you to fiddle with the formatting every quarter. Your colleague has a similar function to generate a report with the same data, but according to their own boss's quarterly formatting requirements. With content-based identity, it would seem like you have to be wary of your own and your colleague's reports ever aligning, lest they have the same hash and lose their distinct identities.

> If you really want to give two definitions different hashes even though they are functionally the same, you can introduce a minor change, like an unused binding.

Interesting. Is there at least any way to detect this condition before it occurs? (I.e., to know if you're trying to define a new function which happens to have the same hash as an existing one.)

zawodnaya · on July 6, 2024

Yes, Unison detects this condition when it occurs and tells you about it.
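As a rough idea of what such a check could look like (a sketch only, not Unison's actual mechanism or output): when a name is bound to a definition, look up whether that hash is already bound to other names and report them. The `add_term` function and the report definitions below are hypothetical.

```python
import hashlib


def add_term(names: dict, name: str, definition: str) -> list:
    """Bind `name` to the hash of `definition`.

    Returns the other names already bound to the same hash, so the
    caller can warn before the two definitions share an identity.
    """
    h = hashlib.sha256(definition.encode("utf-8")).hexdigest()
    existing = sorted(n for n, other in names.items() if other == h and n != name)
    names[name] = h
    return existing


names = {}
first = add_term(names, "myReport", "rows -> formatTable rows")
clash = add_term(names, "colleagueReport", "rows -> formatTable rows")
unrelated = add_term(names, "csvReport", "rows -> formatCsv rows")

if clash:
    print("colleagueReport has the same hash as:", ", ".join(clash))
```

In the report scenario above, a warning like this is the point at which you could decide whether the two reports should share one definition or be made distinct on purpose.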

carapace · on July 5, 2024

> two different functions ... have the same implementation

Then they are not different functions.

> It would seem like there is no way to disambiguate a function apart from its current implementation.

Right, that's the whole point.

In Unison the name of a function is a hash of its implementation. Change the implementation and you change the identity.

> how do you reliably separate the two in this model?

There is no way to confuse the two in this model.

DatoClement · on July 5, 2024

My understanding is that the unique hash is really the true identifier of the implementation. So when you change the implementation, the two functions will have two different hashes? I am not sure if this is exactly your problem.