Run LLMs Locally using Ollama

A step-by-step guide to running large language models locally on your laptop.

Marc
3 min read · Mar 10, 2024
Ollama Logo (source: Ollama)

Introduction

Since the release of ChatGPT, there has been a drastic rise in the popularity of large language models (LLMs). Most people interact with LLMs via externally hosted APIs; Ollama instead allows you to host LLMs locally on your own laptop.

Ollama provides the ability to interact with open-source, customisable LLMs via a command line interface (CLI), a REST API, or, through LangChain, a Jupyter Notebook. It is extremely simple to install and will have you interacting with local LLMs in a matter of minutes.

Installing Ollama

Ollama can be installed on macOS, Windows, and Linux. On macOS and Windows, download the installer from the Ollama website; on Linux, run the following command:

curl -fsSL https://ollama.com/install.sh | sh

Once installed, run the following command in your CLI:

ollama run <MODEL_NAME>

This will download your LLM of choice and initiate a conversation. The easiest approach to interacting with LLMs using Ollama is via the CLI.
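For example, to download and chat with Llama 2 (any model name from the Ollama library works here; the response below is illustrative, not an actual transcript):

ollama run llama2
>>> Why is the sky blue?
The sky appears blue because sunlight is scattered by molecules in the
atmosphere, a phenomenon known as Rayleigh scattering.

Type /bye to exit the session.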

What models are available?

Ollama supports many of the leading state-of-the-art (SOTA) open-source LLMs. As of March 2024, the models available through Ollama are listed below:

Ollama Model List (Source: GitHub)
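For reference, a few entries from that list (model names and approximate download sizes as of early 2024; check the Ollama library for the current catalogue):

llama2 — 7B parameters, ~3.8 GB (ollama run llama2)
mistral — 7B parameters, ~4.1 GB (ollama run mistral)
gemma — 7B parameters, ~4.8 GB (ollama run gemma)
codellama — 7B parameters, ~3.8 GB (ollama run codellama)
llava — 7B parameters, ~4.5 GB (ollama run llava)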

Note: You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 13B models, and 32 GB to run the 33B models.

Model Customisation

All the models listed above can be customised by composing your own system prompt. For example, to customise the llama2 model, first run the following command:

ollama pull llama2

Once you have pulled the model, create a Modelfile consisting of your system prompt and other parameters:


FROM llama2

PARAMETER temperature 1

SYSTEM """
You are a Python programmer specialising in machine learning.
"""

The Modelfile above creates a customised llama2 model that acts as a machine learning specialist; the temperature parameter of 1 allows the model to be more creative in its outputs.
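Beyond temperature, the Modelfile format supports further parameters. A sketch showing a couple of common ones (the values here are illustrative, not recommendations):

FROM llama2
# Higher values make answers more creative; lower values make them more deterministic
PARAMETER temperature 1
# Size of the context window, in tokens
PARAMETER num_ctx 4096
SYSTEM """
You are a Python programmer specialising in machine learning.
"""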

To create and run your new custom LLM, run the following command:

ollama create ml_spec -f ./Modelfile

ollama run ml_spec
>>> hello, what is your profession?

Hello! I am an ML specialist working with the programming language Python.

Running LLMs Outside the CLI

Now that we have our custom LLM, we do not want to interact with it only via the CLI. Thanks to LangChain, Ollama LLMs can be run from Jupyter Notebooks, and Ollama also provides its own REST API.

REST API

Ollama’s REST API allows you to run and manage your local LLMs. To generate a response from your model, run the following command:

curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Why is the sky blue?"
}'
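Note that by default the generate endpoint streams its answer back as a series of JSON objects, one per line. To receive the full answer in a single JSON response instead, set "stream" to false:

curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Why is the sky blue?",
  "stream": false
}'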

If you want to interact with the LLM in a conversational style, run the following command:

curl http://localhost:11434/api/chat -d '{
  "model": "mistral",
  "messages": [
    { "role": "user", "content": "why is the sky blue?" }
  ]
}'
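As with the generate endpoint, the reply arrives as a stream of JSON objects; the model's text accumulates in the message field, and the final object carries "done": true. A single chunk looks roughly like this (truncated, illustrative):

{"model":"mistral","created_at":"2024-03-10T12:00:00Z","message":{"role":"assistant","content":"The"},"done":false}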

For more information, see the API documentation in the Ollama GitHub repository.

Jupyter Notebook

One of the most common approaches when experimenting with Python is via Jupyter Notebooks.

To run your models in Jupyter, you'll need to leverage LangChain:

from langchain.llms import Ollama  # in newer LangChain versions: from langchain_community.llms import Ollama

By using LangChain, it is now really easy to call your local LLM:

# Point the LangChain wrapper at the local Ollama server (11434 is Ollama's default port)
ollama = Ollama(base_url="http://localhost:11434", model="llama2")

TEXT_PROMPT = "Why is the sky blue?"

print(ollama(TEXT_PROMPT))
>> The sky appears blue due to a phenomenon called Rayleigh Scattering.

Conclusion

Ollama is a great way to work with LLMs locally. Not only is it extremely simple to set up, but the combination of Ollama and LangChain also allows users to use their custom LLMs directly in Jupyter Notebooks.

As new open-source LLMs are released, Ollama makes them available with a single command. As long as your laptop has enough compute, you should have no trouble working with new SOTA open-source LLMs.

For more information on Ollama, visit the GitHub repository at https://github.com/ollama/ollama.

If you enjoyed reading this article, please follow me on Medium, Twitter, and GitHub for similar content relating to Data Science, Artificial Intelligence, and Engineering.

Happy learning! 🚀
