Toward a New CL Project Index

Written on 2021-09-05 22:00:00

Quicklisp has had a profound impact on the CL community. It's transformed the way CL devs share libraries, made it easier for devs to re-use existing code instead of implementing everything in house (and encouraged them to do so), and is widely used. While Quicklisp took the CL community a huge step forward, I nevertheless think we can and should do better.

To that end, I've been working on two interlinked projects, CLPM and the Common Lisp Project Index (CLPI). I've posted about CLPM in various places before and awareness of it is already growing in the CL community. Therefore, this post will focus on CLPI and why I think it is important. My ultimate goal is to find like-minded people to collaborate with on bringing CLPI (or something similar) to reality.

I've been meaning to make a post like this for a while, but life kept putting it on the back burner. However, I've recently found more CLPM users in the wild, which always gets my energy levels up for this type of work. Plus, discussions I've seen in various Lisp forums (including this tweet that was brought to my attention) have made me think that the time may finally be ripe to start discussing this topic more broadly.

Before continuing, I want to make clear that I have the utmost respect for Xach, Quicklisp, and the services he provides to the community. This post does critique what is probably Xach's best-known work, but it is by no means an attack against either QL or him, and I will not tolerate any comments or discussion that cross that line.

What is a Project Index?

First, let's clarify at a high level what I mean by a project index. Basically, a project index is a listing of projects and ASDF systems. For every project, it contains information on what releases are available (how to actually get the code), along with what systems are included in each release and what the dependencies of those systems are.

A project index lets you quickly answer questions like "what is the latest version of cffi?", "what are the dependencies of the latest version of cffi?", or "where can I download the latest version of ironclad?" without needing to load any code.
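To make the idea concrete, here is a minimal in-memory sketch of such an index. It is only an illustration: the class names, the field layout, and all of the sample data (versions, URLs, dependency lists) are made up for this example and are not taken from Quicklisp or CLPI.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Release:
    version: str
    url: str  # where the code for this release can be downloaded
    # Maps each ASDF system in the release to the names of its dependencies.
    systems: Dict[str, List[str]] = field(default_factory=dict)


@dataclass
class ProjectIndex:
    # Maps a project name to its releases, oldest first.
    projects: Dict[str, List[Release]] = field(default_factory=dict)

    def latest(self, project: str) -> Release:
        """Answer "what is the latest version of X?" without loading any code."""
        return self.projects[project][-1]

    def dependencies(self, project: str, system: str) -> List[str]:
        """Answer "what does the latest version of X depend on?"."""
        return self.latest(project).systems[system]


# Hypothetical sample data, for illustration only.
index = ProjectIndex({
    "cffi": [
        Release("1.0.0", "https://example.invalid/cffi-1.0.0.tgz",
                {"cffi": ["alexandria"]}),
        Release("1.1.0", "https://example.invalid/cffi-1.1.0.tgz",
                {"cffi": ["alexandria", "babel"]}),
    ],
})

latest = index.latest("cffi")
print(latest.version, latest.url)
print(index.dependencies("cffi", "cffi"))
```

Every question is answered by plain data lookups, which is the point: nothing here requires loading a system definition.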

Quicklisp Issues

Now let's look at what I consider to be flaws of the Quicklisp project index model.

1. Conflation of project manager and project index. When I mention Quicklisp, what do you think of first? Perhaps the quicklisp-client that gets loaded into your image and provides ql:quickload? Or is it the distinfo.txt, systems.txt, and releases.txt files that contain all the projects known to Quicklisp?
   The problem is that it's both! I think that there needs to be a clear separation between the project index (distinfo.txt and friends) and the consumers of the project index (quicklisp-client). Such a separation both makes it clearer what is being referred to in casual conversation and makes it easier to build competing consumers or servers of the project index.
2. The project index format is not documented. I believe this is a consequence of the previous issue. To the best of my knowledge, the only documentation of the QL project index format is the quicklisp client code that interacts with it. This makes it harder to implement both competing clients (I had to do quite a bit of code diving to get CLPM to understand the QL project index) and competing servers (there exist several forks of the quickdist project, yet none of them seem to create the dist versions file that CLPM needs).
3. Not a lot of data is provided. The QL project index does not contain critical information, such as license, ASDF system version number, location of the canonical VCS repo, or author names.
4. Not easily extensible. The only way to include more information in a QL project index is to add more files. Information cannot be added to releases.txt or systems.txt without breaking the QL client. Additionally, if the current aesthetics are to be maintained, each line in a file must represent one entry that consists of tokens to be interpreted as strings, integers, or a list of strings (but only one list per entry).
5. Enforces a particular style of development. A QL project index is rolling: it always grabs the latest version of projects. This forces projects to always use the latest and "greatest" (scare quotes intended) versions of their dependencies or risk being cut from the index. Additionally, it makes it difficult for developers to continue supporting old versions of their code that they would like to maintain; if version 1.0.0 of system A is released, then version 2.0.0, followed by 1.1.0, version 1.1.0 will never show up in a QL project index.
6. Takes control of releases away from developers. Not only does the QL project index preclude releasing bug fixes to older, stable code, it also takes away the choice of when to perform a release. A developer cannot say "oh crap, I just realized 1.0.0 had a huge bug, I need to get 1.0.1 out today!", instantly publish 1.0.1, and then have others immediately use it. Instead, they have to wait until the next time the QL project index maintainer decides to poll them and see if a new version is available. For the primary QL index, this process can take about a month.
7. The index is not space efficient. There is a lot of duplicated information in a QL project index. If a project had new releases in QL version M and N, then the information for the release in version M is replicated identically for releases M through N-1. This is an issue if you want to make a consumer that can show when things changed, can install any version of a project, or just wants to efficiently sync index data over the internet.
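The duplication described in the last point can be made concrete with a small counting sketch. It is a simplified model, not a measurement: it assumes a project stays in the index once it appears, and the function names and the numbers in the example are hypothetical.

```python
def snapshot_rows(release_dists, total_dists):
    """Rows stored when every dist version repeats the project's current release.

    release_dists: sorted, 1-based dist numbers in which the project got a new release.
    total_dists: number of dist versions published so far.
    """
    if not release_dists:
        return 0
    # From its first appearance on, the project has one row in every dist version.
    return total_dists - release_dists[0] + 1


def per_release_rows(release_dists):
    """Rows stored when each release is recorded exactly once."""
    return len(release_dists)


# A project with new releases in dist versions 3 and 10, out of 12 dists so far.
dists_with_release = [3, 10]
print(snapshot_rows(dists_with_release, 12))   # one row per dist from 3 to 12
print(per_release_rows(dists_with_release))    # one row per actual release
```

Under this model, the entry for the release that appeared in dist 3 is stored again for dists 4 through 9 even though nothing changed.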

Ultralisp

A side note on Ultralisp. Ultralisp largely seeks to address issue 6. As far as I can tell, though, it still polls, so developers cannot push new versions to it on demand (please correct me if I'm wrong here!). Even if it does allow pushes, it still falls victim to all the other issues except 1. Additionally, Ultralisp is particularly affected by issue 7 given its update frequency.

CLPI

To address these concerns, I've been slowly developing the Common Lisp Project Index (CLPI) specification. Additionally, I currently have two instances of the index running. One mirrors the data available in the primary QL index, the other is for internal use with my coworkers. Last, CLPM can efficiently consume an index that follows the CLPI spec.

I'm not claiming that CLPI is perfect, but I think it is a significant step forward from QL project indices. Plus, I have some experience running it so I also know that it works (albeit for relatively small audiences). The QL mirror is located at https://quicklisp.common-lisp-project-index.org/.

Now, let's take a brief dive into each of the issues I raised with the QL project index and see how CLPI addresses them.

Conflation of project manager and project index

There is no project manager named CLPI. I do not ever plan on creating one. In any case, Common Lisp Project Index would be a weird name for a project manager.

The project index format is not documented

The current specification of the format of CLPI indices is located at https://gitlab.common-lisp.net/clpm/clpi/-/blob/master/specs/clpi-0.4.org.

The current object model used by CLPI is located at https://gitlab.common-lisp.net/clpm/clpi/-/blob/master/specs/clpi-object-model.org.

Not a lot of data is provided

CLPI allows a project's canonical VCS to be provided. Each system can have the author, license, description, and version specified. System dependencies can include ASDF's (:version ...) and (:feature ...) specifiers.

Not easily extensible

Every file must contain one or more forms that are suitable for READing. Additionally, all the nontrivial files consist of plists. This makes it trivial both to write a parser for each file and to extend files with extra information without breaking consumers (so long as the extra information does not change the semantics on which older versions are relying).
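As a rough illustration of how little a consumer needs, here is a sketch of a reader for a small subset of Lisp syntax (lists, strings, and atoms), written in Python to show that even a non-Lisp consumer can manage. The sample entries and their keys are invented for this example and are not taken from the CLPI spec; atoms (including keywords and integers) are simply kept as lowercased strings.

```python
import re

# One token: a parenthesis, a double-quoted string, or a bare atom.
TOKEN = re.compile(r'''\s*(?:([()])|"((?:\\.|[^"\\])*)"|([^\s()"]+))''')


def read_forms(text):
    """Return every top-level form in `text` as nested Python lists."""
    text = text.strip()
    stack = [[]]
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"cannot parse input at offset {pos}")
        pos = match.end()
        paren, string, atom = match.groups()
        if paren == "(":
            stack.append([])
        elif paren == ")":
            if len(stack) == 1:
                raise ValueError("unexpected closing parenthesis")
            finished = stack.pop()
            stack[-1].append(finished)
        elif string is not None:
            # Undo backslash escapes inside the string.
            stack[-1].append(re.sub(r"\\(.)", r"\1", string))
        else:
            stack[-1].append(atom.lower())
    if len(stack) != 1:
        raise ValueError("missing closing parenthesis")
    return stack[0]


def plist_to_dict(plist):
    """Turn (:key value :key value ...) into a dict keyed by keyword."""
    if len(plist) % 2:
        raise ValueError("plist has an odd number of elements")
    return dict(zip(plist[0::2], plist[1::2]))


# Hypothetical file contents; the second entry carries an extra :license key.
SAMPLE = '''
(:version "1.0.0" :url "https://example.invalid/a-1.0.0.tgz")
(:version "1.1.0" :url "https://example.invalid/a-1.1.0.tgz" :license "MIT")
'''

releases = [plist_to_dict(form) for form in read_forms(SAMPLE)]
# An older consumer that only knows :version and :url is unaffected by :license.
versions = [release[":version"] for release in releases]
print(versions)
```

Because each entry is a property list, a consumer looks up the keys it understands and ignores the rest, which is what lets a file grow new keys without breaking older readers.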

Enforces a particular style of development

Every release of every project is made available. Additionally, with the preservation of (:version ...) specifiers from ASDF's dependency lists, developers can easily provide version constraints and project managers can also take those constraints into account.
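Here is a minimal sketch of how a project manager might use that information. It assumes plain dotted numeric versions and treats a (:version ...) constraint as a minimum version; the `below` upper bound is a hypothetical project-manager-side pin, not something ASDF provides, and the function names are made up for this example.

```python
def parse_version(text):
    """Turn a dotted version string such as "1.10.2" into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def select_release(available, minimum=None, below=None):
    """Pick the newest version that is >= `minimum` and < `below`, if any."""
    candidates = []
    for version in available:
        parsed = parse_version(version)
        if minimum is not None and parsed < parse_version(minimum):
            continue
        if below is not None and parsed >= parse_version(below):
            continue
        candidates.append(version)
    return max(candidates, key=parse_version, default=None)


# Releases in publication order: 1.0.0, then 2.0.0, then a 1.1.0 maintenance release.
available = ["1.0.0", "2.0.0", "1.1.0"]

print(select_release(available, minimum="1.0.0"))                 # newest overall
print(select_release(available, minimum="1.0.0", below="2.0.0"))  # stay on the 1.x line
```

The second call can only find 1.1.0 because the index keeps every release rather than just the most recently published one.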

Takes control of releases away from developers

The proof of concept CLPI server I have developed for my internal use allows a developer to push releases on demand. I am using this in conjunction with Gitlab CI to push releases when tags are created on our git repos.

The index is not space efficient

CLPI borrows a lot of ideas from Rubygems' compact_index. While it is not required as part of the spec, CLPI instances can signal that they intend to only append to the files served to consumers. This lets consumers easily use HTTP headers to download only the new parts of each file that they have yet to see.
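The following sketch shows one way a consumer could sync an append-only file. It assumes the HTTP header in question is `Range` and that the server answers with the standard 206, 200, or 416 status codes; the helper names and the in-process `fake_server` stand-in are hypothetical and exist only so the example runs without a network.

```python
def range_header(cached_size):
    """Ask only for the bytes that come after what is already cached."""
    return {"Range": f"bytes={cached_size}-"}


def apply_response(cached, status, body):
    """Merge a server response into the cached copy of an append-only file."""
    if status == 206:  # Partial Content: body holds only the new tail
        return cached + body
    if status == 200:  # server ignored the Range header and sent the whole file
        return body
    if status == 416:  # Range Not Satisfiable: nothing new has been appended
        return cached
    raise ValueError(f"unexpected HTTP status {status}")


def fake_server(content, headers):
    """Stand-in for a real server, handling only the open-ended ranges used here."""
    start = int(headers["Range"][len("bytes="):-1])
    if start >= len(content):
        return 416, b""
    return 206, content[start:]


cached = b"release 1.0.0\n"
remote = b"release 1.0.0\nrelease 1.1.0\n"

status, body = fake_server(remote, range_header(len(cached)))
cached = apply_response(cached, status, body)
print(cached == remote)
```

Only the bytes for the new entry cross the wire, and a second sync against an unchanged file transfers nothing.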

Additionally, instead of a monolithic file like releases.txt that contains release information, CLPI splits this info into project specific files. For example, you can get all the known releases for fiveam by downloading https://quicklisp.common-lisp-project-index.org/clpi/v0.4/projects/fiveam/releases-0. To do the same thing with a QL index, you'd have to download releases.txt for every dist version (currently 117 in the primary QL index). For comparison, the CLPI file is currently 2183 bytes, while a single releases.txt file (from the 2021-08-07 dist) is 506134 bytes, or over 200 times bigger. Additionally, the CLPI version also tells you the dependencies. To get that from QL, you also need to download systems.txt (the 2021-08-07 version is 374391 bytes).
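The size comparison can be checked with a few lines of arithmetic. The byte counts are the ones quoted above; the variable names are just for this example.

```python
CLPI_FIVEAM_BYTES = 2183     # CLPI releases file for fiveam
QL_RELEASES_BYTES = 506134   # releases.txt from the 2021-08-07 dist
QL_SYSTEMS_BYTES = 374391    # systems.txt from the 2021-08-07 dist

# Release information alone.
releases_ratio = QL_RELEASES_BYTES / CLPI_FIVEAM_BYTES

# Release information plus dependencies, which the CLPI file already includes.
combined_ratio = (QL_RELEASES_BYTES + QL_SYSTEMS_BYTES) / CLPI_FIVEAM_BYTES

print(f"releases.txt alone: about {releases_ratio:.0f}x larger")
print(f"releases.txt + systems.txt: about {combined_ratio:.0f}x larger")
```

That works out to roughly 232 times larger for releases.txt alone and roughly 403 times larger once systems.txt is included, and that is for a single dist version.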

Next Steps

Does CLPI excite you? Do you want it to become reality? Awesome, I'd love to collaborate with you to bring CLPI or something similar to the CL community at large! Please reach out to me here, on #commonlisp or #clpm on Libera.chat (I'm etimmons), on Matrix at #clpm:matrix.org (I'm @eric:davidson-timmons.com), or via email at clpm-devel@common-lisp.net.

There's a lot to do and I really want to make this a community effort.

I'd love it if people could provide feedback on both the object model and the index format!
I'd love to work with people excited to help take my proof of concept CLPI server and make it production ready (or just make a new one from scratch)! This would include implementing a database backend, support for multiple users, and a permissions system.
I'd love to work with others interested in standing up a CLPI server for the whole community to craft a set of community guidelines and policies that address concerns such as when and how projects can be pulled (remember NPM's leftpad incident??), project ownership, etc!
I'd love to have feedback from all the people out there that are unsatisfied by both the QL client and CLPM! If you're making your own project manager, is there anything we can do with the CLPI spec to make your life easier? Do you have something like CLPI that we can learn from/build off?
But perhaps most of all, I'd love to hear if developers would be interested in publishing their code to a community CLPI server! This is one place where QL's model shines. Xach does all the work, so it is nearly effortless for individual developers to get their latest releases into the QL index. Under the CLPI model, someone (ideally the developer, but potentially a proxy maintainer) would have to perform an action on every release to get it into the CLPI instance.

It's likely that I'll continue putzing along with CLPI, even if I don't get any help. But it'll likely never get to the point of being usable by the community at large without input from others. And even if I somehow managed to get a CLPI server that is usable by the whole community, I wouldn't host it without a team willing to help maintain it, both policy- and tech-wise. I run enough projects with a bus factor of one as it is.