Purportedly open-source generative AI (genAI) tools from Meta, X, and Microsoft have failed to reach the bar set by the newly released Open Source AI Definition (OSAID), which aims to ensure that new models can be scrutinised for security, trustworthiness, and repeatability.
Developed over the past year and released in final form by the Open Source Initiative’s (OSI’s) more than 20 sponsoring organisations at the end of October, OSAID v1.0 outlines ‘four freedoms’ that members believe every genAI system must embrace.
These include using the system for any purpose without having to ask for permission; being able to study how the system works and inspect its components; modifying the system for any purpose, including changing its output; and sharing the system with others.
Many genAI firms already claim to support open source AI, but each is working to its own definition of the term – for example, declining to provide commercially sensitive source code and preventing users from seeing the data the genAI systems were trained with.
This means the systems can’t be properly scrutinised to understand why they generate the outputs they produce – vexing companies that want to implement their own genAI systems but worry that they can’t actually explain what is going on inside ‘black box’ solutions.
Without understanding what data was used to train a particular genAI model – and clearance from its owners for others to reuse it – companies, auditors and governments can’t verify a genAI system’s accuracy or bias, creating risk and governance issues.
With genAI systems increasingly and controversially hoovering up confidential information, copyrighted news content, books, and proprietary data sets, much of that data simply isn’t legally or practically available to those who need to understand how the systems work.
Retracing genAI’s steps
OSAID’s 17-element checklist includes requirements that genAI models “provide enough information about their training data so that a skilled person can recreate a substantially equivalent system using the same or similar data,” Mozilla AI strategy head Ayah Bdeir said.
“This is the starting point to addressing the complexities of how AI training data should be treated, acknowledging the challenges of sharing full datasets while working to make open datasets a more commonplace part of the AI ecosystem.”
This requirement limits the open source AI designation to genAI systems whose training data is documented in enough detail – and the lack of such transparency is part of the reason that Meta’s Llama 2, X’s Grok, Microsoft’s Phi-2 and Mistral’s Mixtral failed to pass OSI testing.
That’s a blow for Meta, whose founder and CEO Mark Zuckerberg launched the latest version of Llama in July with the proclamation that a deep commitment to open source “is necessary for a positive AI future.”
“Llama needs to develop into a full ecosystem of tools,” he added. “If we were the only company using Llama, this ecosystem wouldn’t develop and we’d fare no better than the closed variants of Unix.”
OSAID’s criteria were shaped by the open-source definitions used by AI systems including BLOOM, OpenCV, Llama 2, and Pythia – with the five genAI models that ultimately passed including EleutherAI’s Pythia, AI2’s OLMo, LLM360’s Amber and CrystalCoder, and Google’s T5.
BigScience’s BLOOM, BigCode’s StarCoder2, and TII’s Falcon “would probably pass if they changed their licenses/legal terms,” OSI noted.
From little things, open things grow
Just as the success of open-source Linux helped it beat closed Unix rivals, open source can be a powerful catalyst for innovation that drives productive competition amongst like-minded rivals – but it’s never easy to get fierce competitors to put aside their differences.
Reaching enough industry consensus to release OSAID 1.0 “was a difficult journey,” OSI executive director Stefano Maffulli said, admitting that the “delicate process [was] filled with differing opinions and uncharted technical frontiers – and the occasional heated exchange.”
Given genAI’s high stakes – a recent funding round valued OpenAI at $240 billion ($US157 billion) and Elon Musk is said to be lining up funding that would value his xAI at $60 billion ($US40 billion) – can OSAID drive similar discipline in a market promising untold profits?
“It’s a good start,” Julien Sobrier, senior product manager at Endor Labs – a specialist firm that advises companies on how to integrate open-source systems into their software development lifecycle, and ranks genAI systems’ security and quality – told Information Age.
Although the 1.0 version of OSAID “is definitely a very broad definition and almost more like a vision,” Sobrier explained, “it tells you in very broad strokes ‘what does it mean to be open?’ – but it doesn’t really tell you how to structure information to make it easy to use.”
As a growing number of genAI industry players close ranks around OSAID – founding supporters include open-source advocates like SUSE, Mozilla, Nextcloud and the Eclipse Foundation – Sobrier said he “expects the same trajectory that we’ve seen” in other efforts.
Effective security needs widespread scrutiny
Broader adoption of OSAID’s principles will be critical not only for helping companies that adopt genAI understand what they’re getting into, but also for helping armies of security specialists pore over new open-source AI projects to find and fix potential problems early.
Ongoing code review will be critical as cyber criminals stress-test genAI architectures: Protect AI researchers uncovered 34 genAI vulnerabilities last month alone, while a newly discovered exploit uses hexadecimal encoding to trick ChatGPT-4o and other models into writing malicious code.
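The mechanism behind that exploit is simple to illustrate: encoding an instruction in hexadecimal hides its plain-text keywords from naive filters, while a model can still be asked to decode and follow it. The short Python sketch below uses a deliberately harmless instruction and a made-up keyword filter as assumptions – it is not the published exploit, just a demonstration of why encoded prompts are hard to screen.

# Illustrative only: a harmless instruction stands in for a malicious one.
instruction = "print the first ten prime numbers"

# Hex-encoding the instruction removes its readable keywords.
encoded = instruction.encode("utf-8").hex()
print(encoded)  # e.g. '7072696e742074686520...'

# A naive keyword filter inspecting the encoded prompt finds nothing to block.
blocked_terms = ["exploit", "malware"]
print(any(term in encoded for term in blocked_terms))  # False

# Decoding recovers the original instruction verbatim - which is what a model
# asked to "decode this hex and follow the instruction" would act on.
decoded = bytes.fromhex(encoded).decode("utf-8")
print(decoded == instruction)  # True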
With genAI proving so open to manipulation, getting a clearer view of what’s going on under the hood will be “a sobering moment for organisations,” David Brauchler, head of AI and ML security with consultancy NCC Group, told Information Age.
“Right now, a lot of them think that just by adding guard rails to try to influence the behaviour of these large language models, that they can have certain security assurances that they really can’t.”
“As we move towards an agentic context where models are able to execute useful functions on behalf of users, that just gets more and more dangerous from a security perspective.”
“The goal is to make it so that no matter what the model itself does, your systems are still secure.”