Open source software has a number of benefits over commercial products, not least the fact that it can be downloaded for free. This means anyone can analyse the code and, assuming they have the right hardware and software environment configured, they can start using the open source code immediately.
With artificial intelligence (AI), there are two parts to being open. The source code for the AI engine itself can be downloaded from a repository, inspected and run on suitable hardware just like other open source code. But open also applies to the data model, which means it is entirely feasible for someone to run a local AI model that has already been trained.
In other words, with the right hardware, a developer is free to download an AI model, disconnect the target hardware from the internet and run it locally without the risk of query data being leaked to a cloud-based AI service.
And since it is open source, the AI model can be installed locally so it does not incur the costs associated with cloud-hosted AI models, which are generally charged based on the volume of queries measured in tokens submitted to the AI engine.
How does an open model differ from commercial AI?
All software needs to be licensed. Commercial products are increasingly charged on a subscription basis and, in the case of large language models (LLMs), the cost correlates with usage: the volume of tokens submitted to the LLM and the hours of graphics processing unit (GPU) time the model consumes when it is queried.
Like all open source software, an LLM that is open source is subject to the terms and conditions of the licensing scheme used. Some of these licences put restrictions on how the software is used but, generally, there are no licence fees associated with running an open model locally.
However, there is a charge if the open model is run on public cloud infrastructure or accessed as a cloud service, which is usually calculated based on the volume of tokens submitted to the LLM programmatically using application programming interfaces (APIs).
What are the benefits of open source AI models?
Beyond the fact that they can be downloaded and deployed on-premise without additional cost, their openness helps to progress the development of the model in a similar way to how the open source community is able to improve projects.
Just like other open source projects, an AI model that is open source can be checked by anyone. This should help to improve its quality and remove bugs, and go some way to tackling bias, which can arise when the source data on which a model is trained is not diverse enough.
How to get started with open models
Most AI models offer free or low-cost access via the web to enable people to work directly with the AI system. Programmatic access via APIs is often charged based on the volume of tokens submitted to the model as input data, such as the number of words in a natural language query. There can also be a charge for output tokens, which is a measure of the data produced by the model when it responds to a query.
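To make the token-based charging concrete, the short sketch below estimates the cost of API usage from input and output token counts. The prices used are purely hypothetical placeholders, not the rates of any real provider, and the function name is illustrative.

```python
def estimate_api_cost(input_tokens: int, output_tokens: int,
                      input_price_per_million: float,
                      output_price_per_million: float) -> float:
    """Estimate the charge for a batch of queries.

    Prices are expressed per one million tokens, and input and
    output tokens are priced separately.
    """
    input_cost = (input_tokens / 1_000_000) * input_price_per_million
    output_cost = (output_tokens / 1_000_000) * output_price_per_million
    return input_cost + output_cost


# Hypothetical prices: 2.00 per million input tokens, 6.00 per million output tokens
cost = estimate_api_cost(
    input_tokens=1_000_000,
    output_tokens=500_000,
    input_price_per_million=2.0,
    output_price_per_million=6.0,
)
print(f"Estimated cost: {cost:.2f}")
```

Because output tokens are often priced differently from input tokens, an estimate like this needs both figures; a workload that produces long responses can cost considerably more than its query volume alone suggests.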
Since it is open source, an open model can be downloaded from its open source repository (“repo”) on GitHub. The repository generally contains different builds for target systems – such as distributions of Linux, Windows and macOS.
However, while this approach is how developers tend to use open source code, it can be a very involved process and a data scientist may just want to “try” the latest, greatest model, without having to get into the somewhat arduous process of getting the model up and running.
Step in Hugging Face, an AI platform where people who want to experiment with AI models can research what is available and test them on datasets all from one place. There is a free version, but Hugging Face also provides an enterprise subscription and various pricing for AI model developers for hosting and running their models.
Another option is Ollama, an open source, command-line tool that provides a relatively easy way to download and run LLMs. For a full graphical user interface to interact with LLMs, it is necessary to run an AI platform such as Open WebUI, an open source project available on GitHub.
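As a rough illustration of what running a model locally looks like in practice, the sketch below sends a prompt to Ollama's local HTTP interface using only the Python standard library. It assumes Ollama is already installed and running on its default port (11434) and that a model has already been downloaded; the model name shown in the usage note is just an example, not a recommendation.

```python
import json
import urllib.request

# Default local endpoint; assumes an Ollama server is running on this machine
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str,
                  url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build (but do not send) a non-streaming generate request."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local_model(model: str, prompt: str) -> str:
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(model, prompt)) as resp:
        return json.loads(resp.read())["response"]
```

With the server running, a call such as `ask_local_model("llama3.2", "Summarise our returns policy")` would return the model's reply as a string. The request goes to `localhost`, so the query never leaves the machine.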
How open source AI models support corporate IT security
Cyber security leaders have raised concerns over the ease with which employees can access popular LLMs, which presents a data leakage risk. Among the widely reported leaks is Samsung Electronics’ use of ChatGPT to help developers debug code. The code – in effect, Samsung Electronics’ intellectual property – was uploaded into the ChatGPT public LLM and effectively became subsumed into the model.
The tech giant quickly took steps to ban the use of ChatGPT, but the growth in so-called copilots and the rise of agentic AI have the potential to leak data. Software providers deploying agentic technology will often claim they keep a customer’s private data entirely separate, which means such data is not used to train the AI model. But unless it is indeed trained with the latest thinking, shortcuts, best practices and mistakes, the model will quickly become stale and out of date.
An AI model that is open can be run in a secure sandbox, either on-premise or hosted in a secure public cloud. But this model represents a snapshot of the AI model the developer released, and similar to AI in enterprise software, it will quickly go out of date and become irrelevant.
However, whatever information is fed into it remains within the confines of the model, which allows organisations willing to invest the necessary resources to retrain the model using this information. In effect, new enterprise content and structured data can be used to teach the AI model the specifics of how the business operates.
What hardware do you need?
There are YouTube videos demonstrating that an LLM such as the Chinese DeepSeek-R1 model can run on an Nvidia Jetson Nano embedded edge device or even a Raspberry Pi, using a suitable adapter and a relatively modern GPU card. Assuming the GPU is supported, it also needs plenty of video memory (VRAM). This is because for best performance, the LLM needs to run in memory on the GPU.
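A back-of-the-envelope way to judge whether a model will fit in VRAM is to multiply the number of parameters by the storage size of each parameter. The sketch below does this arithmetic. The 7 billion parameter figure is a hypothetical example, and the 20% overhead allowance for working memory is an assumption for illustration rather than a measured value; real requirements vary with the model, context length and runtime.

```python
def weights_memory_gb(num_parameters: float, bits_per_parameter: int) -> float:
    """Memory needed to hold the model weights, in gigabytes (10^9 bytes)."""
    total_bytes = num_parameters * bits_per_parameter / 8
    return total_bytes / 1e9


def fits_in_vram(num_parameters: float, bits_per_parameter: int,
                 vram_gb: float, overhead: float = 1.2) -> bool:
    """Rough check: weights plus an assumed overhead must fit in VRAM.

    The overhead factor (default 1.2) is an illustrative assumption.
    """
    needed = weights_memory_gb(num_parameters, bits_per_parameter) * overhead
    return needed <= vram_gb


# Hypothetical 7 billion parameter model on a GPU with 8GB of VRAM
params = 7e9
print(weights_memory_gb(params, 16))   # 16-bit weights
print(weights_memory_gb(params, 4))    # 4-bit quantised weights
print(fits_in_vram(params, 16, vram_gb=8))
print(fits_in_vram(params, 4, vram_gb=8))
```

The comparison shows why reduced-precision (quantised) builds of a model are popular for local use: storing each parameter in fewer bits cuts the memory needed for the weights proportionally.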
Inference requires less memory and fewer GPU cores, but the more processing power and VRAM available, the faster the model is able to respond, measured in the number of tokens it can process per second. For training LLMs, the number of GPU cores and VRAM requirements go up significantly, which equates to extremely costly on-premise AI servers. Even if the GPUs are run in the public cloud with metered usage, there is no getting away from the high costs needed to run inference workloads continuously.
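Two simple calculations help put these points in perspective: how long a response takes at a given tokens-per-second rate, and what a metered cloud GPU costs if it is left running around the clock. The throughput and hourly rate below are hypothetical figures chosen for illustration only.

```python
def response_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Time to generate a response at a given throughput."""
    return output_tokens / tokens_per_second


def monthly_gpu_cost(hourly_rate: float, hours_per_day: float = 24,
                     days: int = 30) -> float:
    """Cost of a metered GPU instance over a month of usage."""
    return hourly_rate * hours_per_day * days


# Hypothetical figures: 500-token answer at 25 tokens per second,
# and a GPU instance billed at 2.00 per hour
print(response_seconds(500, 25))
print(monthly_gpu_cost(2.0))
print(monthly_gpu_cost(2.0, hours_per_day=8))
```

Comparing the continuous and part-time figures illustrates why always-on inference in the cloud becomes expensive, and why some workloads are better suited to hardware that is only paid for when it is in use.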
Nevertheless, the sheer capacity of compute power available from the hyperscalers means that it may be cost effective to upload training data to an open LLM hosted in a public cloud.
How to make open source AI models more affordable to run
As its name suggests, a large language model is large. LLMs require huge datasets and immense farms of powerful servers for training. Even if an AI model is open source, the sheer cost of the hardware means that only those organisations that are prepared to make upfront investments in hardware or reserve GPU capacity in the public cloud have the means to operationalise LLMs fully.
But not everyone needs an LLM, and that is why there is so much interest in models that can run on much cheaper hardware. These so-called small language models (SLMs) are less compute intensive, and some will even run on edge devices, smartphones and personal computers (see box).