Run LLMs Locally using Ollama
A step-by-step guide to running large language models locally on your laptop.

Introduction
Since the release of ChatGPT, there has been a drastic rise in the popularity of large language models (LLMs). Most people interact with LLMs through externally hosted APIs; Ollama instead lets you host LLMs locally on your own laptop.
Ollama provides the ability to interact with open-source and customisable LLMs via a command line interface (CLI), REST API, or Jupyter Notebook. It is extremely simple to install and will have you interacting with local LLMs in a matter of minutes.
Installing Ollama
Ollama can be downloaded for macOS, Windows, and Linux. On Linux, you can install it with the following command:
curl -fsSL https://ollama.com/install.sh | sh
Once installed, run the following command in your CLI:
ollama run <MODEL_NAME>
This will download your LLM of choice and initiate a conversation. The easiest way to interact with LLMs through Ollama is via the CLI.
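For example, running Llama 2 from the terminal starts an interactive prompt (the exchange below is purely illustrative; your output will differ):
ollama run llama2
>>> Why is the sky blue?
The sky appears blue because air molecules scatter shorter blue wavelengths of sunlight more strongly than longer red wavelengths.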
What models are available?
Ollama supports many of the leading open-source LLMs, including Llama 2, Mistral, Gemma, and Code Llama (as of March 2024). The full, regularly updated list is available in the Ollama model library (https://ollama.com/library).
Note: You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 13B models, and 32 GB to run the 33B models.
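Beyond the CLI, any model you have pulled can also be queried programmatically, for example from a Jupyter Notebook, through the local REST API. Below is a minimal sketch, assuming the Ollama server is running on its default port 11434 and that the llama2 model has already been downloaded; the prompt is only an example.
import requests

# Send a single, non-streaming generation request to the local Ollama server
response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama2",  # any model you have pulled locally
        "prompt": "Explain overfitting in one sentence.",
        "stream": False,  # return one JSON object instead of a stream
    },
)
print(response.json()["response"])  # the generated text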
Model Customisation
All the models listed above can be customised by composing your own system prompt. For example, to customise the llama2 model, first run the following command:
ollama pull llama2
Once you have pulled the model, create a Modelfile consisting of your system prompt and other parameters:
FROM llama2
PARAMETER temperature 1
SYSTEM """
You are a Python programmer specialising in machine learning.
"""