
No, Llama 2 is not actually open source

I’m not surprised that Meta is completely abusing the term “open source” to position themselves as the champions of the distributed FOSS community in the AI space, but it’s time to be explicit: their so-called “open source” Llama 2 model is not actually open source.

Open source is a term defined by a non-profit called the Open Source Initiative (OSI). The OSI has explicitly noted that it doesn’t yet have a definition for open source AI, but it is the arbiter of the term, and it is in the process of putting together an updated definition that takes into consideration the development and use of machine learning models. But even under the original definition of open source, which would arguably be what applies here until a new definition is codified, Llama 2 doesn’t qualify.

The OSI explicitly calls out that having the code available is not sufficient for something to be called open source. The actual definition of open source includes provisions that the software’s license must satisfy.

Of most interest here are articles 3 and 6. Article 3 states that open source software must not put restrictions on derivative works, and article 6 states that the license cannot discriminate against fields of endeavor [1], meaning it cannot restrict how the software is used.

Llama 2’s license violates both of these articles. It specifically says that you can’t use the model to train competing models. It also comes with an acceptable use policy containing a long list of things that you’re not allowed to use the model to do, but it’s less clear to me whether that policy counts as part of the license itself for the purposes of the OSI’s definition.

So, while Llama 2 is certainly interesting, and more openly licensed than some other AI language models, it’s definitely not open source. Articles covering the software would be wise to note that the more appropriate term is “permissively licensed”, and direct all other feedback to the OSI’s open calls for defining open source AI.


[1] As an aside, I’m not saying I agree with the fields of endeavor requirement in the definition of open source software. There’s a lot to be said for the idea that the software industry has played fast and loose with accountability for the damage its products cause. However, more aggressive licensing terms about how software can be used are probably not the answer, and it’s certainly not up to Meta to redefine a several-decades-old term that has a specific meaning and definition in order to further their abusive software practices. I don’t trust anything that Meta does given LeCun’s influence, and this is a further example of how they’re poisoning the industry – which they’ve, frankly, done enough of already.