
Create a customized LLM with Ollama in 2 minutes

Discover Ollama, run large language models (LLMs) locally in total confidentiality, and customize Llama easily on your device!
Updated on December 9, 2024

Generative AI: A Strategic Concern

Generative artificial intelligence has become a real strategic issue, for businesses and governments alike.

Since the arrival of ChatGPT, numerous solutions have emerged: Gemini and Mistral, for example.

The problem lies in the fact that:

  • Accessing these services can incur very high costs under intensive use;
  • We do not know what these services are doing with our personal data;
  • Our data necessarily passes through an internet connection;
  • We are forced to endure the outages that occur on these services.

Fortunately, there’s an alternative to address these problems: Ollama.

Ollama’s homepage

 

What is Ollama?

Ollama is a relatively recent project born out of the desire to simplify the use of language models (often called LLMs for Large Language Models) by allowing them to run directly on local machines rather than hosting them on cloud servers.

The idea behind Ollama is to make language models more accessible to developers while respecting growing concerns about data privacy.

If you know Docker, think of Ollama as the Docker of LLMs. If that doesn’t ring a bell, picture it as an isolated space on your computer: your prompts and data never leave your machine.

With Ollama, there are commands reminiscent of Docker’s commands: pull and run, for example.

With Ollama, all your data is stored locally on your computer. You can also easily fine-tune language models to your needs.

Goodbye monthly subscriptions to online services! 😁

 

Installing Ollama

To install Ollama, start by visiting the official website.

Just click on the big "Download" button. You shouldn’t have trouble finding it. 😉

Select your operating system and off you go!

Click on your operating system

 

If you are on Mac or Windows

A file should now download to your computer: open it and follow the installation prompts.

If all goes well, you should see something like this:

Screenshot of Ollama’s welcome message on Mac

A message will ask you to install the ollama command-line tool. Just click Install.

Congratulations! You have installed Ollama!

 

If you are on Linux

Open your terminal and type:

CONSOLE
curl -fsSL https://ollama.com/install.sh | sh

You should now be able to use the Ollama command!
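
On most Linux distributions, the official install script also registers Ollama as a background systemd service. If you want to confirm it is running (a quick sanity check, assuming a systemd-based distro), you can try:

CONSOLE
systemctl status ollama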

 

Launching an Ollama Instance

Now that Ollama is installed, let’s start a first instance!

Before going any further, verify that the following command returns the version number of Ollama.

CONSOLE
ollama --version

Before starting our first instance, let’s take a second to understand what we’re about to do.

We will begin with Meta’s language model: Llama, in version 3.1 (the latest at the time of writing).

If this doesn’t ring a bell, know that this language model is among the very best available: in its largest configuration, it even ranks above OpenAI’s GPT-4 on several benchmarks. Moreover, it’s open source.

Performance of Llama 3.1 (405 billion parameters) vs. GPT-4 (source)

Now we have your attention. 🙃

Launching the Llama 3.1 model

To launch a language model, we will use the following command:

CONSOLE
ollama run <model name>

For <model name>, you can choose from the extensive list of language models supported by Ollama.

We’ll work with llama3.1. Here’s what we’ll type in the terminal:

CONSOLE
ollama run llama3.1

There are multiple versions of Llama 3.1:

  • llama3.1 - the base version (8B parameters)
  • llama3.1:70b - the 70B-parameter version
  • llama3.1:405b - the 405B-parameter version 🤪

Do not download the 405B version unless you have a very powerful machine: think around 231 GB of RAM and roughly four RTX 4090 GPUs.

The Llama 3.1 download should finish after a few minutes on a fast (fiber) connection: the base model weighs about 4.7 GB.
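
To see which models are already downloaded on your machine (and how much disk space they take), Ollama ships a list command:

CONSOLE
ollama list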

Our model is now ready to respond

 

Using the Llama 3.1 model

Now that the model is ready, let’s ask a question!

CONSOLE
>>> What is the answer to the mystery of life?

(The model responds with a philosophical explanation instead of just “42”.)

To stop the model, you have two options:

  • type /bye
  • press Ctrl + d
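
As an aside, the interactive prompt accepts a few other slash commands. For instance, /? lists the available commands and /show info prints details about the loaded model (the exact list may vary between Ollama versions):

CONSOLE
>>> /?
>>> /show info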

 

Using Llama 3.1 via a REST API

You can also use Llama 3.1 through an API. To do this, we first need to run a local server; the same approach will also let you set up your model directly on a production server.

Starting the Ollama server

To start our server, run:

CONSOLE
ollama serve

Your server should now be running, printing various log messages that you can safely ignore.

If you see this error:

CONSOLE
Error: listen tcp 127.0.0.1:11434: bind: address already in use

It means your server is already running. If you installed Ollama on Mac or Windows, this is normal: the installer launches Ollama automatically! Just go to the menu bar (Mac) or taskbar (Windows), find the Ollama icon, and click "Stop ollama".

If you’re on Linux, you can stop Ollama via terminal:

CONSOLE
systemctl stop ollama
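
Alternatively, instead of stopping the existing instance, you can start a second server on a different address with the OLLAMA_HOST environment variable (the port 11435 below is just an arbitrary free port):

CONSOLE
OLLAMA_HOST=127.0.0.1:11435 ollama serve

Requests then need to target that port, e.g. http://localhost:11435/api/generate.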

 

Using the REST API for Ollama

Now open a new terminal and use curl to send a request to our new Ollama server:

CONSOLE
curl -X POST http://localhost:11434/api/generate -d '{
  "model": "llama3.1",
  "prompt": "Salut comment que ça va ?",
  "stream": false
}'

We send the request to the endpoint exposed by our Ollama server, with three parameters:

  • model - The model to use
  • prompt - Our query
  • stream - When set to false, we receive a single JSON object containing the full response along with additional metadata. If set to true, the response arrives as a stream of partial objects.

You should get a JSON object you can use in your projects.
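
Note that generate is not the only endpoint: Ollama also exposes a chat endpoint that takes a list of messages with roles, which is more convenient for multi-turn conversations. A minimal sketch:

CONSOLE
curl -X POST http://localhost:11434/api/chat -d '{
  "model": "llama3.1",
  "messages": [
    { "role": "user", "content": "Hello, how are you?" }
  ],
  "stream": false
}'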

Example of a response from our brand new Ollama server

 

Customizing Llama 3.1 to Your Needs

Did you think that was all? Not with Believemy. 😗

We will now learn how to customize the Llama model to our needs. If you know Docker, this will feel familiar.

Creating a Modelfile

To customize your language model, create a Modelfile (similar to a Dockerfile). Open your favorite code editor, or, if you don’t have one, use Notepad (I never thought I’d say that on Believemy 👀).

Create a file named Modelfile, without any extension:

CONSOLE
FROM llama3.1

PARAMETER temperature 1

SYSTEM """
You respond in French. Behave as if you were Steve Jobs and always end your messages with "Steve Jobs."
"""

Explanations:

  • FROM - Our base model
  • PARAMETER temperature - The higher the value, the more creative and unpredictable the model. The lower the value, the more focused and coherent it is.
  • SYSTEM - Here we define instructions that the language model should always follow. It’s like giving it an initial personality or rules.

For more details on customizing a language model with Ollama, check out this documentation.
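
To give you an idea of what else a Modelfile can carry, here is a slightly fuller sketch. num_ctx (context window size) and top_p (nucleus sampling) are standard Modelfile parameters; the values below are purely illustrative:

CONSOLE
FROM llama3.1

PARAMETER temperature 1
PARAMETER num_ctx 4096
PARAMETER top_p 0.9

SYSTEM """
You respond in French. Behave as if you were Steve Jobs and always end your messages with "Steve Jobs."
"""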

 

Creating a new Ollama model

We can now create our own model (a customized version of Llama):

CONSOLE
ollama create steve -f ./Modelfile

Here’s what this does:

  • create - Creates a new model
  • steve - Our chosen name for the model
  • -f - Indicates we’ll specify a path to a file
  • ./Modelfile - The file path

Use your own chosen name and path. The command might take a while, because Ollama downloads the base model (if needed) and applies our customizations.

Our new Ollama model is ready
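
Once the build finishes, you can double-check what went into your model: ollama show with the --modelfile flag prints back the Modelfile the model was created from:

CONSOLE
ollama show steve --modelfile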

 

Using a customized Ollama model

Now we can finally use our new model! Remember the command? 😉

CONSOLE
ollama run steve

Here my model is named steve. Use the name you chose for yours, and don’t confuse it with the base model "llama3.1".

Example response with our new Ollama model based on Llama 3.1

Isn’t that incredible? 🥲

You can also use the REST API:

CONSOLE
curl -X POST http://localhost:11434/api/generate -d '{
  "model": "steve",
  "prompt": "Salut comment que ça va ?",
  "stream": false
}'
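
One last tip: if you set "stream" to true (or simply omit it, since streaming is the default for this endpoint), the server answers with one JSON object per line as the tokens are generated, which you can watch live in the terminal:

CONSOLE
curl -X POST http://localhost:11434/api/generate -d '{
  "model": "steve",
  "prompt": "Salut comment que ça va ?",
  "stream": true
}'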

 

Conclusion

The possibilities are huge! Have fun with Ollama! This mini-tutorial was a pleasure to create. If you enjoyed it, feel free to share it on your networks and tag Believemy or me, so we can comment on your share!

In the meantime, if you want to discuss LLMs, come visit our private Discord channel.

Peace! 😉

Steve Jobs. (🙃)
