Introduction

These are the notes I took while learning about Nix and NixOS. They are endorsed by no one, not even myself.

My approach when learning something is to explain it very basically, beginning with basic principles and working up to more complex topics without leaving any gaps, trying to state all but the most obvious of assumptions. In other words, a good way to learn is to (pretend to) explain something to a child, hence the title of these notes - "ELI5" stands for "Explain like I'm 5".

I don't quite take this literally. For example, I may not define "software" the way I might if I were literally talking to a 5-year-old. Still, I like to learn by explaining things very thoroughly and methodically.

Without further ado, then - what is Nix?

What Is Nix?

This is already a slightly tricky question, because Nix can mean multiple things. On the one hand, it is a package manager. On the other hand, it is also the programming language in which packages and configurations for that package manager are written. And finally, NixOS is the name of the operating system (a distribution or flavor of Linux) that is built with, and according to the principles of, the Nix language and package manager.

So Nix is a package manager, Nix is a programming language, and NixOS is a so-called distribution of Linux (an operating system).

A few words about NixOS at this point - it is indisputably Linux, GNU/Linux to be precise. It is quite different from most other distributions of Linux because its directory system is quite different, in ways which will soon become clear, and for reasons which will also soon become clear. One exception is Guix System, a related distribution of GNU/Linux that is built on the same general principles, but using a different language (Guile Scheme, a dialect of Lisp) and which does not allow any unfree software at all, unlike NixOS, which allows users to opt in to unfree software.

So NixOS is a distribution of Linux that is quite different from most other distributions, and it is built around the Nix package manager, which in turn is built on the Nix programming language. What is the Nix language built on? Theory and C++. Getting into the details of how Nix is implemented in C++ (or, by extension, how it might be implemented in any other Turing-complete language) is a more advanced topic; to really understand Nix, it is more important to understand the theory behind it and to see examples of its practical usage, so we'll start with that.

Having said a few words about what Nix and NixOS are, I will use a simple convention from now on: If it is ambiguous whether I mean the language or the package manager, I will say "Nix [lang]" or "Nix [pm]". In cases where it is perfectly clear from the context or where it doesn't make a difference, of course, I will just say Nix. And typically if I can just say "Nix", whatever I say will apply to NixOS as well.

The next step in understanding Nix is to understand the background and motivation behind why we might want such a thing, why it represents such a profound paradigm shift, and why it is (in my opinion and the opinion of many other happy Nix users) such a nice thing to have.

Problems with Package Management

First, a word about terminology - a more precise term may be "software deployment". But for an entry-level explanation of Nix, the important thing is just to pick a term and be clear about what it means. So when I say "package management", I'm also referring to software deployment, software dependency management, and so on. Basically, we're talking about using software that depends on other software, whether it be in development or deployment.

So for the very big picture: Software usually depends on other software. In theory, it is possible to write everything from scratch if you are both extremely talented and extremely insane. In practice, it is much nicer to 'stand on the shoulders of giants' and not 'reinvent the wheel', using instead software that has already been written.

Good software, in some ways, is like a living organism. New versions are released from time to time, sometimes to add features, sometimes to fix bugs and to patch security vulnerabilities, and sometimes just to improve code quality. It is generally a good idea to keep up-to-date on software, but in practice, some projects are better than others at keeping their dependencies up-to-date. If we're unlucky, this can lead to dependency conflicts, where our project has dependencies which we'll call A and B, and A depends on a different, older version of B. Under "vanilla" Linux, we generally can't have two versions of the same package installed side by side. We may have to use an older version of A, or maybe "vendor" one of the packages, copying it into our codebase and maintaining it as our own, which of course is extra work we'd probably prefer to avoid. Obviously, waiting for A to be updated is also far from ideal.

It's also possible for A and B to each depend on a third package, C, but require different versions of it, once again requiring some suboptimal and frustrating solution.
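To make the A/B/C situation concrete, here is a minimal sketch in Python. The package names and version numbers are made up, and the "one slot per package name" rule is my simplification of how a traditional system behaves, not a description of any particular package manager.

```python
# Hypothetical requirement table: package name -> {dependency: required version}.
REQUIREMENTS = {
    "A": {"C": "1.0"},
    "B": {"C": "2.0"},
}


def find_conflicts(requirements):
    """Return dependencies that are requested in more than one version."""
    wanted = {}  # dependency name -> set of versions requested
    for deps in requirements.values():
        for dep, version in deps.items():
            wanted.setdefault(dep, set()).add(version)
    # With only one slot per name, more than one requested version is a conflict.
    return {
        dep: sorted(versions)
        for dep, versions in wanted.items()
        if len(versions) > 1
    }


print(find_conflicts(REQUIREMENTS))  # {'C': ['1.0', '2.0']}
```

The conflict only exists because C can occupy a single slot. If both versions could be installed at once, the function would have nothing to report.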

There is another common problem - packages with unclear dependencies. In some cases, these dependencies are unknown or unrecognized because they are usually present on a Linux system. But if they are missing, the project may fail to build or run. Anyone who has tried to install packages on a fresh install of Ubuntu or Debian has probably encountered this, for example. Usually the fix is simple - you search the error and find a Stack Overflow answer explaining which packages you need to install in order for the packages to build or run. Still, this kind of approach to software installation is mildly irritating and clearly suboptimal. Sometimes, the problem is not so simple and developers can waste hours trying to figure out exactly which dependencies are missing.

Finally, another example of something about the status quo that we might want to improve upon - getting programs configured right is a fine art, and "dotfiles" (configuration files) are very important to users who want programs to look or behave a certain way. Configurations are an important part of the software, and managing configurations on multiple devices can be tedious. This can also be problematic when developers are working on the same project. The "it works on my machine" trope has become a meme precisely because differences between local environments can be so difficult to track down and to manage.

These are just a few examples to illustrate why installing software, and developing software that depends on other software, is not always as straightforward a process as one might assume or hope. So what to do? And what do people do?

Non-Nix Solutions

Most programming languages have built-in package managers, or package managers that are at least closely associated with and developed for the language. In the world of Linux, package managers and package repositories are among the most important differences between distributions.

Some programming languages or tools also provide "virtual environments" (as in the case of Python) or similar, to isolate dependencies in one project from the dependencies used in other projects or globally on the system.

There are also other approaches. One of the most popular is containerization, similar to virtual machines (simulating another computer via software, and working within this 'virtual' computer or virtual machine), but more lightweight. Basically, a container shares the kernel and hardware resources with the host system, but its file system hierarchy is separate and access to the host system must be explicitly granted.

These approaches have their benefits, but they are still far from ideal. Containers are still fairly heavy, as they require installing nearly everything needed for a program to run inside the container. Images (the disk snapshot of a container) can be several or even tens of GB in size. And this still leaves the possibility of dependency conflicts or dependencies that are not fully specified, requiring a tedious process of trial-and-error to find and install missing dependencies.

Similarly, virtual environments require a lot of duplication of code, and external dependencies (system libraries, drivers, headers in the case of languages like C/C++, etc.) are not installed inside the virtual environment. Then there is also the case of dependency resolution. Again, in the case of Python - a popular and relatively simple programming language with notoriously problematic package and environment management - there are tools like Poetry for dependency resolution and virtual environment management, which do improve the situation, but even so, dependency resolution may fail.

So what to do?

Desiderata for a Package Manager

At risk of sounding too didactic, it can be very helpful, before going on to learn about Nix, to think about how you might go about trying to solve these problems, or at least mitigate them.

Avoid Duplication

Ideally, we would like to avoid unnecessary duplication of dependencies. Even more importantly, we want to know what all dependencies of a project are, so that we can get it right the first time - no trial-and-error. It would be nice to have different versions of the same package if we happen to need both of them, and this is true whether we need them simultaneously or simply for different projects.

Declarative over Imperative / Procedural Package Management

Another thing that would be great is declarativeness. Usually we have to install packages as we need them, one after the other, as we figure out we need them. The traditional approach to dependency management is, to some degree, imperative - we can't always just say what we want; we have to tell the package manager how to do it. This is more true of some tools than others (Poetry vs Pip, for example), but generally, we have to spell it out, step-by-step, more than we would like to. apt or pacman, for example, will often raise an error if a dependency is missing, rather than just installing it. The best package managers are good at figuring out which dependencies are required and doing it automatically, rather than requiring the user to imperatively install dependencies. Declarativeness is nice - we say what and the package manager figures out how.

To be fair, there are install scripts. A script is clearly imperative - it is, after all, just a sequence of instructions - by definition. At the same time, it is a more declarative approach to imperativeness, since it records the steps all in one place, in a reproducible manner. This, I believe, is why scripts are so popular. It's also why Ansible is such a powerful and popular approach to configuration management. But that doesn't mean that the scripting approach can't be improved upon.
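Here is a small sketch of what "we say what and the package manager figures out how" could look like. It is only an illustration under my own assumptions: the function name plan and the package sets are hypothetical, and real tools have to deal with dependencies, versions, and ordering on top of this.

```python
def plan(installed, desired):
    """Work out the steps that turn the `installed` set into the `desired` set."""
    to_install = sorted(desired - installed)
    to_remove = sorted(installed - desired)
    return to_install, to_remove


# The user only declares the end state ...
installed = {"git", "vim", "htop"}
desired = {"git", "neovim", "htop"}

# ... and the tool derives the imperative steps from it.
print(plan(installed, desired))  # (['neovim'], ['vim'])
```

An install script would instead hard-code the two steps, and it would only be correct for a machine that starts out in exactly the state the script's author had in mind.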

Fine-Grained Control

While we're dreaming, we might also add a few more items to the wish list. It would be nice to be able to specify different options when installing software. A package manager powerful enough to provide fine-grained control would be nice.

Pre-Built Packages

On the other hand, it would be tedious and in most cases undesirable to have to specify everything. Moreover, having to build everything from source is also rarely desirable to non-Gentoo users. So having the option of installing pre-built packages where available and building where necessary would be the best of both worlds. But we would need to be sure that pre-built binaries, compressed files, etc. actually met the requirements.
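One conceivable way to be sure that a pre-built artifact "actually meets the requirements" is to identify it by a hash of everything that went into building it. The sketch below is purely illustrative: recipe_key, obtain, fake_build, and the recipe fields are names I made up, and the dictionary stands in for a remote cache of pre-built packages.

```python
import hashlib
import json


def recipe_key(recipe):
    """Hash a build recipe so that identical inputs give an identical key."""
    canonical = json.dumps(recipe, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def obtain(recipe, prebuilt_cache, build):
    """Use a pre-built artifact if one matches the recipe exactly, else build."""
    key = recipe_key(recipe)
    if key in prebuilt_cache:
        return prebuilt_cache[key], "downloaded"
    artifact = build(recipe)
    prebuilt_cache[key] = artifact
    return artifact, "built"


def fake_build(recipe):
    # Stand-in for a real compiler run.
    return f"binary of {recipe['name']}-{recipe['version']}"


cache = {}
default = {"name": "hello", "version": "1.0", "options": {"gui": False}}
custom = {"name": "hello", "version": "1.0", "options": {"gui": True}}

print(obtain(default, cache, fake_build)[1])  # built
print(obtain(default, cache, fake_build)[1])  # downloaded
print(obtain(custom, cache, fake_build)[1])   # built
```

Changing a single build option changes the key, so a customized package is never confused with the stock pre-built one. That is the kind of guarantee that would let fine-grained control and pre-built packages coexist.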

Garbage Collection

What about when packages build up and take up a lot of space on the hard drive, but we're afraid to remove them because we don't know if something still needs them? Of course, most package managers do support dependency graphs and can sometimes provide information on what is still required by what, but this is far from perfect, especially since in practice, not all packages are managed by just one package manager. On Debian and its children, for example, you might install most packages with apt, but some from source, and some through the package managers of various programming languages, maybe npm or Cargo. It is not uncommon to uninstall something you think is no longer needed, only to find that removing the package broke something in unexpected ways.

You might try to be at least a bit more declarative about installing packages. You might keep track of what exactly you installed and do your best to roll back the state of your environment by tracing your steps and trying to undo them. Some commands like apt autoremove are helpful for this sort of thing. But having totally reliable garbage collection would be nice. Try reading any moderately complex uninstall script, and you'll see why this might be such a nice feature to have. On 'vanilla' Linux, programs can install a lot of files and data in different places. /usr/bin, /usr/share, ~/.cache, ~/.local, ~/.config, ~/, ~/etc, and so on.
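"Totally reliable" garbage collection boils down to a reachability question: starting from the things we explicitly asked for, which packages can still be reached by following dependencies? Everything else is garbage. Here is a minimal sketch, assuming a complete dependency graph (which is exactly what traditional setups lack); all package names are invented.

```python
def live_packages(depends_on, roots):
    """Return every package reachable from the roots via dependency edges."""
    live = set()
    stack = list(roots)
    while stack:
        pkg = stack.pop()
        if pkg in live:
            continue
        live.add(pkg)
        stack.extend(depends_on.get(pkg, ()))
    return live


def garbage(depends_on, roots):
    """Installed packages that nothing the user asked for still needs."""
    return set(depends_on) - live_packages(depends_on, roots)


# Hypothetical graph: package -> packages it depends on.
DEPENDS_ON = {
    "editor": ["libui", "libc"],
    "libui": ["libc"],
    "libc": [],
    "old-tool": ["libold"],
    "libold": ["libc"],
}

print(sorted(garbage(DEPENDS_ON, roots=["editor"])))  # ['libold', 'old-tool']
```

Note that libc survives even though old-tool's library used it, because editor still reaches it. The answer is only as trustworthy as the graph, which is why packages installed outside the package manager make this so hard in practice.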

Rollbacks

What about being able to go back to a previous state, like disk backups without the hassle? How would that be possible? What if an installation fails and we are left with chaos, wishing we could go back to a previous state when we knew everything worked? Rollbacks would be another great feature to have. Most users who install Arch Linux don't install it just once per machine. Often clean re-installs become necessary.
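One way to picture cheap rollbacks: never modify a system state in place. Instead, record each new state next to the old ones and keep a pointer to whichever is active. The class below is a toy model under that assumption (the name Generations and its methods are mine, for illustration only); going back is then just moving the pointer.

```python
class Generations:
    """Keep every past system state and a pointer to the active one."""

    def __init__(self, initial):
        self._history = [frozenset(initial)]
        self._current = 0

    @property
    def current(self):
        return self._history[self._current]

    def switch(self, packages):
        """Record a new state and make it active; old states stay untouched."""
        self._history.append(frozenset(packages))
        self._current = len(self._history) - 1

    def rollback(self):
        """Re-activate the previous state, if there is one."""
        if self._current > 0:
            self._current -= 1
        return self.current


system = Generations({"git"})
system.switch({"git", "broken-driver"})  # an update that goes wrong
system.rollback()                        # back to the known-good state
print(sorted(system.current))            # ['git']
```

Because old states are immutable and never overwritten, a failed installation cannot leave a known-good state half-modified.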

No Side Effects

While we're at it, what about when you install / uninstall one thing and it completely borks something else? Sometimes even something seemingly unrelated. Clearly side effects are not ideal. If we could install packages in such a way that they 'left other packages alone', that would be great.

Peaceful Co-Existence of Different Versions

We already mentioned it, and it's related to the point above, but there are often cases where it would be desirable to have different versions of the same package side by side. If there is just one available place for a package of a certain name, how is this even possible? It isn't, not in the $PATH-searching approach, where the first executable named, for example, scala found in any of the directories in the $PATH environment variable is the one that gets used. We just can't have another one installed and available under the same name. So clearly some innovation is needed - consider this a teaser.
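The $PATH lookup just described can be sketched in a few lines of Python. This is a simplified model of what a shell does, and it assumes a POSIX-like system. The helper make_fake_executable exists only to set up the demonstration, and the two temporary directories stand in for entries of $PATH.

```python
import os
import stat
import tempfile


def first_on_path(name, path_dirs):
    """Return the first executable called `name` in `path_dirs`, or None."""
    for directory in path_dirs:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def make_fake_executable(directory, name):
    """Create a tiny executable file for demonstration purposes."""
    path = os.path.join(directory, name)
    with open(path, "w") as handle:
        handle.write("#!/bin/sh\n")
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
    return path


with tempfile.TemporaryDirectory() as first, tempfile.TemporaryDirectory() as second:
    make_fake_executable(first, "scala")   # imagine this is one version
    make_fake_executable(second, "scala")  # and this is another version
    found = first_on_path("scala", [first, second])
    print(found == os.path.join(first, "scala"))  # True
```

The copy in the second directory is installed but effectively invisible: whichever directory comes first wins, and the only way to reach the other version by the name scala is to reorder the search path.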

Conclusions

Ok, you're probably thinking this is all a bit contrived, that I'm just cherry-picking features that I already know exist in Nix. That's a fair objection, but I still believe this is a useful exercise, as it puts some of the features of Nix into perspective, especially the trickier bits. As you can see, there are a lot of perfectly reasonable requirements that traditional package management failed to meet, and understanding why and how prior systems failed is key to understanding Nix and its design.

Rough Sketch of a Solution

How would we go about designing a workable implementation of all or even some of the desiderata we just listed?

TODO, WIP

Interlude 1 - Why is Nix hard?

  • unfamiliar language with mind-stretching new concepts - unless you are a Haskell (or PureScript, Idris, etc.) developer, the Nix language will take some getting used to. If you're coming from a background in C, C++, Python, Java, JavaScript, C# or similar, you are used to thinking of programming in terms of sequences of instructions. Sure, objects abstract this away to some extent, but you can still find the entry point and trace what happens, step-by-step. This has the indisputable advantage of being close to the way computers actually work: in assembly language (and to an even greater extent in machine code), your program really is just a sequence of instructions being executed. Functional programming is a bit further removed from how computers actually work. To what end? The benefit of functional programming is the focus on a clean mapping from input to output without side effects - in other words, functional programming is further from the CPU and closer to mathematics. In practice, this means that Nix can be a bit frustrating to read at first, because order of execution doesn't matter like it does in, say, Python. Nix is declarative, and this is a huge advantage, but it does mean that reading Nix feels different than reading a more 'normie' language. Nix has been described as 'JSON with functions', and I find this is a helpful way of thinking about it. Of course, it also has a few quirks like 'let-in syntax', but with time and experience, the benefits of (nearly) every design choice will become clear and they will become more of a delight and less of a hindrance to understanding.
  • terribly unhelpful error messages - even the core team of Nix will acknowledge this. This is closely related to the point above; functional languages, because they are inherently less sequential, can make it harder to localize the exact cause of the error, because sometimes expressions are evaluated in a different place than where they are written in your source code, and the error then refers to that evaluation-related location, rather than the location where you would actually fix the error. The Nix core developers have this on their radar and are working on it, but it's one of the pain points that people generally learn to live with. Over time, debugging Nix is something you will get a feel for, even if the error traces are less immediately helpful than in, say, Julia or Python.
  • lacking documentation - this is getting better all the time, but it's uncontroversial to say that Nix/NixOS does not have the same caliber of documentation as, say, PHP or C#. The ecosystem can feel a bit chaotic, especially at first when you're still getting a feel for it.
  • slightly awkward transition phase - at the time of writing (January 2024), Nix flakes and the 'Nix command' (nix <subcommand> rather than nix-<command>) are in the awkward position of being considered best practices by the community while also being officially 'experimental'. This means that most of the official documentation focuses on the 'stable', non-experimental commands and features. There are some excellent sources that remedy this, and this is getting better with time. But the transition from 'Nix 2' to 'Nix 3.0' (not yet released, but planned to include flakes and the Nix command as first-class citizens) is not entirely painless. Still, it's probably better than Scala's 2-to-3 migration and indisputably much better than Python's 2-to-3 chaos :)

Interlude 2 - Why is Nix easy?

Having just talked about why Nix is hard, it's only fair to ask on the other hand why Nix is easy. Here are a few things that come to mind:

  • bizarrely easy rollbacks - nearly impossible to really bork your system (i.e., you'd have to try). This is because if something goes wrong, you can simply reboot and select the last (or any previous) "generation" where everything was stable and working just fine. Then, rather than losing months or even years' worth of accumulated experience and configuration, you just lose whatever changes you made since the last time you ran nixos-rebuild switch (same goes for home-manager switch).
  • wonderful community - perhaps surprising for a functional language, since functional-language communities sometimes have a reputation for being elitist, snobbish, and unwelcoming compared to some wonderfully welcoming and helpful communities like (in my experience) the Julia, Raku, or Rust communities. In my experience, NixOS communities, channels, and fora - both official and unofficial - have more in common with the latter. Much like with Rust, NixOS tends to inspire a certain kind of mostly-benign fanaticism, and this has the wonderful side-effect of willingness - often even eagerness - to help beginners.
  • availability of software - Nixpkgs is the best software repository out there. I'm asserting this without evidence, but take a look at repology.org - it's impressive. And maybe the best part is related to the next point - if something is missing, you can still use it in the same way (and with the same functionality and advantages) by writing your own expression for it, and you can add it to Nixpkgs using basic Git and GitHub.
  • extensibility - once you have reached a certain level of proficiency (and yes, that is easier said than done, but it is doable), the possibilities are limitless. Nix is extremely powerful and allows fine-grained control over small details. If you want to change the way something is built, you can do that in Nix. If you want to have the same configuration for all of your devices, but with certain differences, you can do that. If you want to write your own functions to control how Nix configures your favorite software, you can do that.
  • everything all in one place - coming from other Linux distros (with the exception of Guix System, which is a GNU-y and LISP-y spinoff of NixOS), it feels like a luxury to be able to declare and configure everything all in one folder that is tracked by Git. This means backups are trivially easy - built-in, in fact. And you don't need to look through a dozen locations to find where a given app is getting its configuration from. Having a clean overview of your system is priceless.