Raspberry Pi AI Server Running DeepSeek R1
This guide provides a comprehensive overview of setting up and running the DeepSeek R1 large language model (LLM) on the Raspberry Pi 5. With recent improvements in LLM efficiency, running them on small, low-power devices like the Raspberry Pi has become feasible. Below, we break down the steps and key considerations in detail:
1. Core Features of the Project
Offline Large Model Inference
Supports running lightweight large models like DeepSeek R1 1.5B/7B, enabling text generation, code writing, and Q&A interactions.
Fully local execution with no internet requirement, ensuring privacy protection.
Portable Hardware Design
Raspberry Pi 5 + PiSugar 3 Plus battery provides 2-3 hours of runtime in a lightweight design.
Custom 3D-printed case integrates cooling and power management.
Multi-Modal Interaction Interface
Open WebUI: Provides a ChatGPT-like web-based interface.
Command Line Interface: Allows developers to directly invoke APIs.
Local Network Sharing: Enables multi-user access via a web browser.
2. Project Architecture
[PiSugar 3 Plus Battery]
↓ Power Supply
[Raspberry Pi 5]
├─ Ollama Service (DeepSeek R1 Model)
├─ Open WebUI (Frontend Interaction)
├─ Power Management Module (Battery Monitoring / Power Saving Strategy)
└─ Cooling Control (Temperature-Controlled Fan + Metal Case)
↓ Output
[User Device] → Browser/SSH/API Call
3. Hardware and Software Configuration
Raspberry Pi 5 Performance
8GB RAM: Sufficient for running models with 1.5B/7B parameters (7B models require about 4-6GB of RAM).
CPU: Cortex-A76 quad-core 2.4GHz, significantly improved over Raspberry Pi 3B+/4B, enabling faster inference speeds.
32GB microSD (TF) Card: Enough for the OS, Ollama, and the models; the 1.5B model takes roughly 1GB on disk, the 7B model several GB more (see the quick checks after this section).
PiSugar 3 Plus Battery
5000mAh Capacity: Powers the Raspberry Pi 5 for 2-3 hours under high-load LLM inference and 6-8 hours in low-power mode.
Portability: Combined with a custom case, it creates a true "pocket-sized AI server."
System Image
Raspberry Pi OS (64-bit)
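Once the system set up in the steps below is running, these specs are easy to confirm over SSH; the following commands are standard on Raspberry Pi OS:
free -h                 # total memory, should report close to 8GB
lscpu                   # confirms the quad-core Cortex-A76 and clock speed
df -h /                 # free space remaining on the TF card
vcgencmd measure_temp   # SoC temperature, useful when testing the fan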
4. Implementation Steps
Step 1: Flash the System Image
Preparation
Required materials: Card reader, TF card.
Download the flashing tool: Raspberry Pi Imager, from the official site at https://www.raspberrypi.com/software/.
Flashing Process
Click "CHOOSE DEVICE" and select Raspberry Pi 5.
Click "CHOOSE OS" and select Raspberry Pi OS (64-bit).
Insert the microSD card into the card reader, connect it to your computer, and select "CHOOSE STORAGE".

Customize OS settings:
Username & Password: Set administrator credentials for SSH login.
WiFi Credentials: Ensure the Raspberry Pi connects to the correct network.
Device Hostname: The name the Pi announces on the network (used as raspberrypi.local in Step 2).
Region Settings: Configure time zone and keyboard layout.

Enable Remote Access: Under "Services", enable SSH and select "Password Authentication".

Click "SAVE", then confirm OS customization settings.
Click "YES" to write the image to the storage device.
Once flashing is complete, eject the SD card and insert it into the Raspberry Pi to boot.
Step 2: Connect Raspberry Pi via SSH
After assembling the Raspberry Pi 5, PiSugar 3 Plus, cooling fan, and TF card, power on the device.
On your host machine, press Win + R, type cmd, and press Enter to open the Command Prompt. Then, run:
ssh pi@raspberrypi.local
# Or specify the IP directly: ssh pi@192.168.x.x
# "pi" is the username and "raspberrypi" the hostname you set in Raspberry Pi Imager

Step 3: Install Ollama and Download the Model
Ollama is a tool designed for running and customizing large language models in local environments. It provides a simple, efficient interface for managing models, making AI deployment easier for developers and end users alike.
Install Ollama
sudo apt install curl -y
# One-click installation (automatically configures service)
curl -fsSL https://ollama.com/install.sh | sh
# Verify installation status
sudo systemctl status ollama
# Should display "Active: active (running)"
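The run command in the next step downloads the model automatically on first use; if you would rather fetch it ahead of time and confirm the download, an explicit pull works too (model tags as published in the Ollama library):
ollama pull deepseek-r1:1.5b
# List downloaded models and their on-disk sizes
ollama list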
Step 4: Launch the DeepSeek R1 Model
Run the following command to start the AI chatbot on your Raspberry Pi:
ollama run deepseek-r1:1.5b
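Besides the interactive prompt, the Ollama service exposes a local REST API on port 11434, which is what Open WebUI talks to in the next step. A quick smoke test from a second SSH session, using the generate endpoint ("stream": false returns one complete response instead of streamed chunks):
curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:1.5b",
  "prompt": "Why is the sky blue?",
  "stream": false
}'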

Step 5: Deploy Open WebUI
For a ChatGPT-like experience, install Open WebUI to turn your Raspberry Pi into a small AI chat server.
Set Up Python Virtual Environment
cd
# Create a virtual environment in ~/webui and activate it
python3 -m venv ~/webui
source ~/webui/bin/activate
# Install Open WebUI (it pulls in many dependencies, so this can take a while)
pip install open-webui
Start the Service
open-webui serve
# Note: a plain pip install does not register a systemd unit, so
# "systemctl status open-webui" will fail at this point; this command
# runs the server in the foreground. For auto-start on boot, see the
# optional service sketch at the end of this step.

Open a browser and navigate to http://<raspberrypi_ip>:8080 to access the web interface. If the page does not load, confirm the browsing device is on the same network as the Pi and that open-webui serve is still running.
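Since open-webui serve stops when your SSH session ends, you may prefer to have systemd manage it the way the install script does for Ollama. Below is a minimal sketch of such a unit, assuming the login user is pi and the virtual environment lives at /home/pi/webui; adjust both to your setup:
sudo nano /etc/systemd/system/open-webui.service
# Add the following:
[Unit]
Description=Open WebUI
After=network-online.target ollama.service

[Service]
User=pi
ExecStart=/home/pi/webui/bin/open-webui serve
Restart=on-failure

[Install]
WantedBy=multi-user.target
# Then enable and start it:
sudo systemctl daemon-reload
sudo systemctl enable --now open-webui
sudo systemctl status open-webui
# Should now display "Active: active (running)"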
Q&A
1. Unable to SSH using Hostname
Ensure the Raspberry Pi and host machine are on the same network.
Make sure SSH is enabled during system flashing.
Use PiSugar’s WiFi Configuration Tool to manage connections:
curl https://cdn.pisugar.com/PiSugar-wificonfig/script/install.sh | sudo bash
2. Speeding Up Ollama Model Downloads via Proxy
Find your host machine's LAN IP (cmd → ipconfig → "IPv4 Address").
Enable "Allow LAN" in Clash on the host (default port: 7890).
On the Raspberry Pi, point the proxy variables at that IP, not 127.0.0.1 (which refers to the Pi itself):
export https_proxy=http://192.168.x.x:7890
export http_proxy=http://192.168.x.x:7890
export all_proxy=socks5://192.168.x.x:7890
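One caveat: model downloads are performed by the Ollama daemon, which runs as a systemd service and does not inherit variables exported in your SSH shell. If pulls still bypass the proxy, Ollama's documentation recommends setting it on the service itself (substitute your host's actual IP):
sudo systemctl edit ollama
# Add the following in the editor:
[Service]
Environment="HTTPS_PROXY=http://192.168.x.x:7890"
# Then apply the change:
sudo systemctl daemon-reload
sudo systemctl restart ollama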
3. Resolving Open WebUI Installation Timeout
If pip times out while installing Open WebUI in your region, switch to a closer package mirror by editing the pip configuration:
sudo nano /etc/pip.conf
# Add the following two lines:
[global]
index-url=https://pypi.tuna.tsinghua.edu.cn/simple
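Equivalently, pip can write that setting itself, or the mirror can be passed to a single install:
pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
# Or for one-off use:
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple open-webui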
After configuration, Open WebUI should install smoothly. The process may take some time, so please be patient.
With these steps, you can turn a Raspberry Pi 5 into a portable AI server capable of running advanced language models entirely offline!