Dune Package Management RFC

Initially posted on GitHub

This document serves as a Request for Comments (RFC) on adding package management support to Dune.

At present, the project is in the prototyping stage, and we are exploring different approaches. Consequently, this document is intended to provide a guideline rather than a definitive specification for the implementation of package management in Dune. As we continue to prototype features and make progress, we will update this document accordingly.

The primary aim of this document is to stimulate community feedback and invite discussions about our plans.

We will generally stay high-level and describe the user experience rather than how we intend to implement each part. The specific implementation details can be discussed in separate issues and Pull Requests.

Outlined below are the high-level principles that guide our integration of package management into Dune:

  1. Seamless integration: Package management features should be fully integrated. Users shouldn't have to read the opam manual to learn how to use package management features in Dune.
  2. Compatibility with existing features: Package management should work with existing Dune features, such as watch mode, in the way users would expect.
  3. Windows support: The package management system will not be considered stable until it supports Windows.
  4. Cross-compilation: Support for cross-compilation should be a first-class citizen.
  5. Build directory isolation: No state should exist outside the _build directory.
  6. Explicit state modification: Commands that modify the state implicitly, such as pinning, are forbidden.
  7. Performance: Features should be as fast as they can be.

TL;DR: Here are some elements from the RFC we'd like to highlight:

Overview

Package management in Dune aims to improve the user experience on the OCaml Platform by addressing the friction caused by using separate opam and dune CLI tools.

From the user's perspective, downloading and building dependencies become an integral part of building a project. Dune becomes the primary tool that users install on their systems to work in OCaml. The workflow to go from a clean environment to a compiled project looks like:

$ apt install dune # or winget install dune
$ git clone git@github.com:xyz/project-a.git
$ cd project-a
$ dune build

Executing dune build will download and build the project's dependencies before proceeding with the project build.

Note: whether dune build should install dependencies is still under discussion within the Dune team, and we might introduce a separate command for that purpose.

To achieve this, several steps are taken under the hood:

  1. Dune retrieves project dependencies from the dune-project file, where they are defined in the existing package stanza.
  2. Dune reads the dune-workspace file to gather additional workspace configurations, such as the opam repositories to use and the compiler version. If no dune-workspace file is present, Dune will use sensible defaults like the ocaml/opam-repository and the latest version of the compiler.
  3. Using the dependencies and configuration, Dune generates a lockfile containing the specific version of each dependency, its source, and its build commands.
  4. Relying on the generated lockfile, Dune fetches the dependencies, stores them in the _build directory, and builds them.
  5. Once the dependencies are built, the project build proceeds as usual by compiling the project's source tree.

An important aspect of our current plan involves leveraging opam as a library, often referred to as opam 3. This approach allows us to re-use existing opam features that we would otherwise need to re-implement in Dune.

In this scenario, the opam client will continue to exist as a separate tool that uses the opam library. This separation ensures that the opam client remains available to non-Dune users, and it enforces compatibility between the package management functionality of Dune and that of the opam client.

Note that, following Principle 1, the use of opam as a library is an implementation detail and will have no impact on the user.

Fetching sources

PR: Source fetching

Source fetching in Dune will support every kind of source that can be specified in the opam url field, ensuring compatibility with existing packages in the opam-repository. Dune will also support local sources, offering a workflow for changes that span multiple projects.

When fetching sources, Dune will create a dedicated directory within the _build directory. This directory is intended for Dune's private use and will be hidden from the user.

Caching

RFC: Shared Build Sandboxes

Dune's package management will use a global cache for all dependencies. The caching mechanism will also work with the compiler, eliminating the need to recompile it for each project.

The shared build sandboxes will be set up in a way that they produce the same paths regardless of the workspace location. This is achieved by maintaining all sandboxes in a single directory and using a hash based on the action and its dependencies. A file lock system will be used to manage concurrent access to the sandbox during the build process.
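As a rough illustration of how a location-independent sandbox path could be derived, the sketch below hashes only the action and the digests of its dependencies. The root directory, the choice of SHA-256, the 16-character truncation, and the function name are all assumptions made for this example, not part of the proposal.

```python
import hashlib
from pathlib import PurePosixPath

# Hypothetical root shared by every workspace on the machine.
SANDBOX_ROOT = PurePosixPath("/var/cache/dune/sandboxes")


def sandbox_path(action: str, dep_digests: list[str]) -> PurePosixPath:
    """Derive a sandbox directory from the action and its dependencies only."""
    h = hashlib.sha256()
    h.update(action.encode("utf-8"))
    # Sort so that the order in which dependencies are listed is irrelevant.
    for digest in sorted(dep_digests):
        h.update(b"\0")  # separator between fields
        h.update(digest.encode("utf-8"))
    # The workspace location never enters the hash, so two checkouts of the
    # same project map to the same directory.
    return SANDBOX_ROOT / h.hexdigest()[:16]
```

Because the workspace path is not an input, two workspaces that run the same action on the same dependencies would share a sandbox, which is why concurrent access needs to be guarded by a lock.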

Sandboxing

All components and state associated with a project remain confined within the project itself, with the exception of the cache. The closest comparison with opam is a local opam switch for every project.

We plan to support all the sandboxing functionality that opam offers through Bubblewrap. Moreover, we'll forbid package builds from accessing packages that aren't listed as dependencies.

All state information is contained within the _build directory, which is internal and managed exclusively by Dune.

We plan to cache the compiler so that it does not have to be recompiled for every project.

By default, the opam-system compiler will not be used.

It's worth noting that Dune won't support opam switches: the sandboxing mechanism is specific to Dune.

State and Configuration

All components and settings are maintained within the project; nothing resides outside the workspace.

Configuration information is stored in the dune-workspace file, which includes details about the opam repository, compiler, and sources.

Dune will support multiple opam repositories within the dune-workspace.

We will also keep the notion of context in the dune-workspace, where users can define multiple contexts with different configurations. This will make it easy to test with different versions of the compiler or with different opam repositories.

Building opam Packages

PR: Build opam Packages

To be able to build every package in the opam repository, we will implement support for building arbitrary opam packages: Dune will read the build instructions from opam files, execute them, and produce build artifacts in the _build directory.

Support for building opam packages will extend to vendored packages as well. Dune will be capable of vendoring non-dune opam packages.

In the initial version, for opam packages that already use Dune, the build process will remain the same as for other opam packages. Dune will be installed as a dependency and the installed Dune will be used to build the dependencies.

To keep things simple for the initial version, this workflow will be used universally, with the possibility of optimizing and using fully composed rules in future versions.

Cross-compilation

Dune will support cross-compilation for all opam packages, eliminating the need to port packages to Dune to leverage cross-compilation.

Cross-compilation support is possible under the condition that code generators are located in a separate package from their users. Specifically, a package cannot build a code generator and use it simultaneously.

There are still unanswered questions, like how to support packages that already support cross-compilation without Dune, such as the low-level Mirage packages (mirage-*-solo5, mirage-*-xen, etc.), or packages that install cross-compilers such as ocaml-freestanding. We aim to maintain compatibility and work with these existing cross-compilation configurations and we'll explore solutions for these as we prototype building opam packages.

Windows support

Following Principle 3, we will not accept features or changes that would cause issues on Windows. Windows support will be first-class from the first stable version onward. We will be uncompromising on this.

Lockfiles

Dune will produce a lockfile when building projects. The lockfile will contain a stanza listing all dependencies, their sources, and the required build actions. When a lockfile is present, dune build will use it by default.

An important design decision for the lockfile is that it contains both sources and build commands, which means that reading from the opam-repository is not required when a lockfile is available. As a result, a lockfile removes two steps that are currently needed when installing dependencies with opam: (1) running the solver and (2) reading from the opam-repository to get the build instructions and sources. This will make compilation of projects significantly faster, since it essentially consists of downloading the sources of the dependencies and building them.

The generation of the lockfile isn't part of the build. Dune will provide a separate command (e.g. dune lock) to generate it.
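To make this concrete, here is a minimal sketch of what a lockfile entry might carry and how a build order could be derived from it without a solver or a repository lookup. The field names, the example packages, and the URLs are hypothetical; the actual lockfile format is not specified in this document.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter


@dataclass(frozen=True)
class LockedPackage:
    version: str
    source: str                      # where to download the sources from
    build: tuple[str, ...]           # build commands, already resolved
    depends: tuple[str, ...] = ()    # names of other locked packages


def build_order(lock: dict[str, LockedPackage]) -> list[str]:
    """Order packages so that each one comes after its dependencies."""
    graph = {name: set(pkg.depends) for name, pkg in lock.items()}
    return list(TopologicalSorter(graph).static_order())


# Hypothetical lockfile contents, for illustration only.
lock = {
    "app-lib": LockedPackage(
        "1.0", "https://example.org/app-lib-1.0.tar.gz", ("make",), ("base-lib",)
    ),
    "base-lib": LockedPackage(
        "2.3", "https://example.org/base-lib-2.3.tar.gz", ("make",)
    ),
}
order = build_order(lock)
```

Everything the build needs (version, source, commands, and ordering) comes from the locked data itself, which is the property that lets the solver and the repository be skipped.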

Watch Mode

Following Principle 2, package management will integrate with watch mode and will work in the way that users would expect.

Dune will monitor the dune-project and dune-workspace files, as well as the lockfile, and will automatically update the dependencies when they change. From the user's point of view, this means that you won't need to leave the editor when running Dune in watch mode. Every change to the workspace can be made by editing the source files.
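One simple way to picture the monitoring is to compare content digests of the watched files between two points in time. The sketch below is only an illustration: the lockfile name dune.lock is a placeholder (this document does not name the file), and a real implementation would more likely rely on file-system notifications than on polling.

```python
import hashlib
from pathlib import Path

# "dune.lock" is a placeholder name; the lockfile name is not fixed here.
WATCHED = ("dune-project", "dune-workspace", "dune.lock")


def snapshot(root, names=WATCHED):
    """Map each watched file to a digest of its contents (None if absent)."""
    result = {}
    for name in names:
        path = Path(root) / name
        if path.is_file():
            result[name] = hashlib.sha256(path.read_bytes()).hexdigest()
        else:
            result[name] = None
    return result


def changed(before, after):
    """Names whose contents differ between two snapshots."""
    return sorted(name for name in after if before.get(name) != after[name])
```

A watch loop would take a snapshot, wait, take another, and update the dependencies whenever changed returns a non-empty list.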

System Dependencies

Dune will not automatically install missing system dependencies (aka depext). Instead, it will provide a user interface to inform users about missing dependencies and offer hints on how to install them. Building on top of opam's depext system, Dune can include build rules that check for the availability of required system packages and provide custom error messages when these rules do not succeed.
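The sketch below shows the general shape such a check could take: probe for each required system package and, if any are missing, produce an error message with an installation hint. The function names, the probe callback, the message wording, and the default install command are assumptions made for this example.

```python
def missing_depexts(required, is_installed):
    """Return the required system packages that the probe reports as absent."""
    return [pkg for pkg in required if not is_installed(pkg)]


def depext_hint(package, missing, install_command="apt install"):
    """Format an error with an installation hint, or None if nothing is missing."""
    if not missing:
        return None
    return (
        f"Package {package} requires system packages that are not installed: "
        f"{', '.join(missing)}\n"
        f"Hint: try running `{install_command} {' '.join(missing)}`"
    )


# Illustrative data: pretend only libgmp-dev is present on the system.
installed = {"libgmp-dev"}
missing = missing_depexts(["libgmp-dev", "libssl-dev"], installed.__contains__)
message = depext_hint("example-pkg", missing)
```

Keeping the probe as a parameter means the same check can sit on top of whatever detection mechanism each platform provides. Nothing is installed automatically; the function only reports.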

To facilitate the installation of system dependencies, we will explore the possibility of introducing an explicit command that handles the installation of system dependencies. This command would leverage the functionality of opam-depext to test the presence or absence of such dependencies and install them if needed.

Vendoring and Pinning

In version 1, there will be no specific vendoring workflow beyond the existing vendoring features already supported by Dune.

However, an alternative workflow will be provided by supporting the local filesystem as a source for dependencies. This can be configured in the dune-workspace. With this approach, Dune will be able to monitor changes in the dependencies' sources and rebuild the project as needed. This effectively supports working on multiple projects at the same time.

This also offers an alternative workflow to opam pins. It's worth noting that opam pins are not supported by the opam repository. This configuration is therefore specific to the workspace, which is why it belongs in dune-workspace and not dune-project, and it will not be used when releasing packages.

This workflow enables substituting custom sources for any package listed in the lockfile, not only the leaves. Sources for packages are typically defined by actions that fetch URLs; however, with this approach, the source of a package can be populated from a directory in the workspace. To implement this feature, the corresponding directory should be excluded from the normal build process by using data_only_dirs. This approach addresses the pain point of having to vendor all the transitive reverse dependencies of any deep dependency that needs modification, allowing for more flexibility in vendoring specific parts of the package graph.
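The substitution can be thought of as an overlay applied to the locked sources. The following sketch assumes a simple mapping from package name to source. The function name, the data shapes, and the example packages are hypothetical.

```python
def apply_source_overrides(locked_sources, local_sources):
    """Return a copy of the locked sources with workspace directories substituted."""
    # Only packages that appear in the lockfile can be overridden.
    unknown = set(local_sources) - set(locked_sources)
    if unknown:
        raise KeyError(f"not in the lockfile: {', '.join(sorted(unknown))}")
    # Later entries win, so local directories replace the fetched URLs.
    return {**locked_sources, **local_sources}


# Illustrative data: override a deep dependency with a workspace directory.
locked = {
    "a": "https://example.org/a-1.0.tar.gz",
    "b": "https://example.org/b-2.0.tar.gz",
}
resolved = apply_source_overrides(locked, {"b": "vendor/b"})
```

Only the overridden package changes its source. Every other entry, including packages that depend on it, keeps the source recorded in the lockfile, so there is no need to vendor them as well.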

Solver

In the initial version of the integrated package management, Dune will use 0install as its solver.

0install will receive its input from the dune-project package field and the dune-workspace configuration.

Users will have the option to point to an alternative opam repository, and even to use multiple repositories, in their dune-workspace configuration.

From experience with opam-monorepo, which uses 0install, we expect that one challenge will be the clarity of error messages. While we recognise that producing user-friendly error messages from solvers is generally a hard problem, we will work upstream when appropriate to improve the quality of error messages.