🧠 Offline LLMs in Linux | Ollama on Linux | Easy Setup Guide

Siddhant Bali, an aspiring tech entrepreneur, is an Undergraduate Research Scholar at IIIT Delhi, currently pursuing a B.Tech in Computer Science Engineering with a focus on design (CSD). Excelling in college activities and event management, Siddhant's entrepreneurial spirit propels him into innovative ventures. Connect on LinkedIn or reach out at siddhant22496@iiitd.ac.in for more info.

📌 What is Ollama?

Ollama is a tool to run LLMs (Large Language Models) locally on your computer with just one command. It's beginner-friendly, supports offline usage, and works on most modern Linux systems.


✅ Features

  • Supports models like LLaMA 3, Mistral, Phi-3, Code LLaMA, and more

  • CLI-based: clean and fast

  • Works on CPU or GPU

  • Easy install and model usage

  • Free and open-source


🖥️ System Requirements

  • OS: Linux (Ubuntu, Fedora, Arch, etc.)

  • Memory: 8 GB+ RAM recommended

  • CPU: Any modern x86_64 processor

  • (Optional) GPU: For faster performance (NVIDIA preferred)
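The requirements above can be checked from a terminal before installing. The following is a minimal sketch, assuming a Linux system that exposes /proc/meminfo. The helper name kb_to_gib is made up for this example, and nvidia-smi is only present when the NVIDIA driver is installed.

```shell
# Hypothetical helper: convert a kB figure (the unit used in /proc/meminfo)
# to whole GiB, rounded to the nearest integer.
kb_to_gib() {
    echo $(( ($1 + 524288) / 1048576 ))
}

# Total RAM in kB; stays empty (treated as 0 below) if /proc/meminfo is unavailable.
mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo 2>/dev/null)
echo "RAM: $(kb_to_gib "${mem_kb:-0}") GiB (8 GB or more recommended)"

# CPU architecture; this guide assumes x86_64.
echo "Architecture: $(uname -m)"

# Optional GPU check: nvidia-smi ships with the NVIDIA driver.
if command -v nvidia-smi >/dev/null 2>&1; then
    echo "NVIDIA tooling: found"
else
    echo "NVIDIA tooling: not found (CPU-only use is still possible)"
fi
```

The rounding is there because a machine sold as 8 GB usually reports slightly less than that in MemTotal.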


🛠️ 1. Install Ollama

🔹 Run this command:

curl -fsSL https://ollama.com/install.sh | sh

This:

  • Installs the Ollama CLI

  • Sets up the system service

  • Adds the ollama user and group
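After the installer finishes, it can help to confirm each of the results listed above. This is a rough sketch, assuming a systemd-based distribution and that the service, user, and group are all named ollama. The helper check_present is a made-up name, not part of Ollama.

```shell
# Hypothetical helper: run a command quietly and report yes/no.
# $1 = label to print, remaining arguments = the command to test.
check_present() {
    label=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "$label: yes"
    else
        echo "$label: no"
    fi
}

check_present "ollama CLI on PATH"    command -v ollama
check_present "ollama service active" systemctl is-active --quiet ollama
check_present "ollama user exists"    id ollama
check_present "ollama group exists"   getent group ollama
```

Each check prints a line instead of aborting, so a single missing piece does not hide the state of the others.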


🚀 2. Run a Model

🔹 Example (LLaMA 3):

ollama run llama3

It will:

  • Pull the model automatically (first time only)

  • Start an interactive chat in your terminal
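Besides the interactive chat, ollama run also accepts the prompt as an argument, which makes it usable from scripts. Here is a small sketch; the function name ask and the example prompt are placeholders chosen for illustration.

```shell
# Hypothetical wrapper: send one prompt to a model and print the reply.
# $1 = model name, remaining arguments = the prompt text.
ask() {
    model=$1
    shift
    # With a prompt argument, `ollama run` answers once and exits
    # instead of opening the interactive chat.
    ollama run "$model" "$*"
}

# Example call (the model is pulled first if it is not on disk yet):
# ask llama3 "Explain what a symlink is in one sentence."
```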


🧠 3. Try Other Models

| Model Name | Size (Params) | Type | Strengths | Use Case Examples | Command |
|---|---|---|---|---|---|
| LLaMA 3 | 8B / 70B | General-purpose | Balanced reasoning, long context | Chatbots, coding, general Q&A | ollama run llama3 |
| Mistral | 7B | General-purpose | Fast, good quality | Lightweight assistant, dev tools | ollama run mistral |
| Phi-3 | 3.8B | Lightweight | Extremely small, fast | Mobile devices, embedded use, casual chat | ollama run phi3 |
| Code LLaMA | 7B / 13B | Code-focused | Best for programming tasks | Code generation, debugging | ollama run codellama |
| LLaMA 2 | 7B / 13B | General-purpose | Earlier version, still powerful | Chat, essays, summarization | ollama run llama2 |
| Gemma | 2B / 7B | Google model | Fast & aligned | Chat, education, summarization | ollama run gemma |
| Neural Chat | 7B | Chat optimized | Tuned for conversational flow | Personal assistant, Q&A | ollama run neural-chat |

ollama run mistral
ollama run phi3
ollama run codellama
ollama run llama2

🔎 List All Installed Models:

ollama list
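The output of ollama list can also be used in scripts, for example to test whether a model is already installed. The sketch below assumes the output starts with one header row and that the first column holds the model name in name:tag form. The function name has_model is invented for this example.

```shell
# Hypothetical helper: succeed if the given model appears in `ollama list`.
# Accepts either a bare name ("llama3") or a full name:tag ("llama3:latest").
has_model() {
    ollama list | awk -v want="$1" '
        NR > 1 {                      # skip the header row
            split($1, parts, ":")     # "llama3:latest" -> parts[1] = "llama3"
            if ($1 == want || parts[1] == want) found = 1
        }
        END { exit !found }           # exit status 0 only when a match was seen
    '
}

# Example: only download when the model is missing.
# has_model llama3 || ollama pull llama3
```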

❌ Remove a Model:

ollama rm mistral

🔧 4. Use as a Local API (Optional)

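Start the Ollama server (skip this step if the system service set up during installation is already running, since the port will then be in use):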

ollama serve

Use the HTTP API to query models:

curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "What is the capital of India?"
}'
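By default, /api/generate streams its answer as a sequence of JSON objects, one per line. For scripts, a single JSON object is often easier to handle, and the "stream": false field requests that. This is a minimal sketch, assuming the server listens on the default port shown above. The helper name generate_once is invented here, and it does no JSON escaping.

```shell
# Hypothetical helper: POST one prompt to the local Ollama API and
# print a single JSON reply instead of a stream.
# $1 = model, $2 = prompt. The prompt must not contain double quotes
# or backslashes, because this sketch does not escape them.
generate_once() {
    curl -s http://localhost:11434/api/generate \
        -d "{\"model\": \"$1\", \"prompt\": \"$2\", \"stream\": false}"
}

# Example call; the generated text is in the "response" field of the reply:
# generate_once llama3 "What is the capital of India?"
```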

📁 Model Location

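When you start the server yourself with ollama serve, models are stored at:

~/.ollama/models

When Ollama runs as the system service under the ollama user, models are typically stored under /usr/share/ollama/.ollama/models instead.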
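Models are large, so it is worth checking how much disk space the model directory takes. This is a small sketch; models_size is a made-up helper name, and the directory passed to it should be adjusted if your models live elsewhere.

```shell
# Hypothetical helper: print the disk space used by a model directory,
# or a short note if the directory does not exist.
# $1 = path to the models directory.
models_size() {
    if [ -d "$1" ]; then
        # du prints "<size><TAB><path>"; keep only the size column.
        du -sh "$1" | cut -f1
    else
        echo "no models directory at $1"
    fi
}

models_size "$HOME/.ollama/models"
```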

🛑 Uninstall Ollama

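# Stop the service and remove its unit file
sudo systemctl stop ollama
sudo systemctl disable ollama
sudo rm /etc/systemd/system/ollama.service
# Remove the binary, libraries, and downloaded models
sudo rm -rf /usr/local/bin/ollama /usr/local/lib/ollama /usr/share/ollama ~/.ollama
sudo userdel ollama
sudo groupdel ollama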

💡 Tips

  • Prefer quantized model variants: Q4_K_M uses less memory and runs faster, while Q8_0 stays closer to full-precision quality at a larger size

  • You can run Ollama completely offline after the model is downloaded

  • Combine it with tools like LM Studio, KoboldCpp, or Streamlit apps
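Offline use only works for models that are already on disk, so it can be convenient to download everything while still online. This is a minimal sketch using ollama pull, which downloads a model without starting a chat. The name prefetch_models is made up, and the model list is only an example.

```shell
# Hypothetical helper: download each named model ahead of time.
# A failed download is reported on stderr and the loop continues.
prefetch_models() {
    for model in "$@"; do
        if ! ollama pull "$model"; then
            echo "could not pull $model" >&2
        fi
    done
}

# Example: fetch two models before disconnecting from the network.
# prefetch_models llama3 mistral
```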