
How to Fix Ollama Model Pull Stalling and Progress Reverting

Nothing disrupts a development workflow quite like a failing local environment setup. If you are attempting to pull a large language model and find your Ollama pull stuck, you are not alone.

The symptoms are highly specific: the download speed abruptly drops to 0 B/s, the terminal hangs, and you watch the Ollama download progress revert backward (e.g., dropping from 65% back to 40%). If you inspect the background service logs, you will likely see a recurring Ollama part attempt failed error.

This guide breaks down the underlying network mechanics causing this failure and provides concrete, production-tested solutions to resolve it.

Root Cause Analysis: Why Progress Reverts

To fix the problem, you must understand how Ollama handles model distribution.

Ollama stores and distributes models similarly to Docker images. A model is not a single file; it is a manifest composed of multiple layer blobs (hashed via SHA256). To optimize speed, Ollama utilizes multipart downloading, opening multiple concurrent connections to fetch chunks of these blobs simultaneously.
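You can see this layout on disk yourself. A quick sketch, assuming the default storage location (adjust if you have set a custom OLLAMA_MODELS path):

```shell
# Inspect Ollama's on-disk model storage.
# Manifests are small JSON files; blobs are the SHA256-named layers.
MODELS_DIR="${OLLAMA_MODELS:-$HOME/.ollama/models}"
ls "$MODELS_DIR/manifests/registry.ollama.ai/library" 2>/dev/null \
  || echo "no manifests found under $MODELS_DIR"
ls -lh "$MODELS_DIR/blobs" 2>/dev/null \
  || echo "no blobs found under $MODELS_DIR"
```

A single model typically maps to one manifest file plus several multi-gigabyte blobs, which is why the pull opens multiple streams in the first place.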

The reverting progress behavior occurs due to the following sequence:

  1. Chunk Discarding: When a network drop occurs mid-download, the partial chunk currently held in memory is corrupted. Ollama discards these incomplete bytes.
  2. Progress Recalculation: The terminal progress bar represents the total successfully verified bytes written to disk. When the corrupted chunk is discarded, the total valid byte count decreases, causing the percentage to visually jump backward.
  3. Connection Drops: These drops typically occur when the registry's maximum concurrent stream threshold is hit, triggering rate limiting from the underlying Content Delivery Network (CDN). Alternatively, your ISP's equipment or your router's NAT table becomes exhausted by the aggressive multiplexing and silently drops packets.

When the connection is forcibly closed by the CDN or router, Ollama logs the part attempt failed error and enters a retry loop that can stall indefinitely.
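To confirm you are hitting this failure mode, you can filter the service logs for the retry message. A minimal sketch; the grep pattern is an assumption based on typical log wording and may need adjusting for your Ollama version:

```shell
# Filter a log stream for the multipart retry failure.
# The pattern is an assumption; tweak it to match your version's log wording.
detect_part_failures() {
  grep -Ei "part.*attempt.*failed"
}

# On Linux with systemd, feed the live service log through the filter:
#   journalctl -u ollama -f --no-pager | detect_part_failures
```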

Step 1: Clear Corrupted Partial Blobs

The first step to fix Ollama network error loops is to purge the corrupted .partial files. Ollama attempts to resume downloads by reading these files, but a corrupted tail byte will cause the retry to fail instantly.

Stop the Ollama service before deleting these files to prevent file-lock conflicts.

Linux / macOS

# 1. Stop the Ollama service
sudo systemctl stop ollama    # Linux
# Or if running via Homebrew services on macOS:
# brew services stop ollama

# 2. Navigate to the blobs directory
cd ~/.ollama/models/blobs

# 3. Remove all partial downloads
find . -type f -name "*.partial" -delete

# 4. Restart the service
sudo systemctl start ollama

Windows (PowerShell)

# 1. Stop the Ollama application from the system tray or task manager
Stop-Process -Name "ollama" -Force

# 2. Remove partial files in the default Windows directory
$blobPath = "$env:USERPROFILE\.ollama\models\blobs"
Get-ChildItem -Path $blobPath -Filter "*.partial" -Recurse | Remove-Item -Force

# 3. Restart the Ollama server
Start-Process "ollama" -ArgumentList "serve"

After clearing the partial files, re-run your ollama pull <model> command. In many cases, starting with a clean slate bypasses the previous chunking error.
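Before re-running the pull, it can help to verify the purge actually worked. A quick check, assuming the default storage location:

```shell
# Count any surviving .partial files; expect 0 before re-pulling.
BLOBS="${OLLAMA_MODELS:-$HOME/.ollama/models}/blobs"
leftover=$(find "$BLOBS" -type f -name "*.partial" 2>/dev/null | wc -l)
echo "partial files remaining: $leftover"
```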

Step 2: Mitigate IPv6 and DNS Routing Failures

If the pull continues to stall, the issue likely lies in your network routing to Ollama's CDN (often fronted by Cloudflare). IPv6 fragmentation and ISP-provided DNS servers frequently cause silent packet drops during sustained, multi-connection downloads.

Force IPv4 Binding

On some network hardware, IPv6 handles large packets poorly, leading to MTU (Maximum Transmission Unit) black holes. You can force Ollama to bind to IPv4.

If you are running Ollama via systemd on Linux, modify the service file:

sudo systemctl edit ollama.service

Add the following environment variable definition to restrict the host binding:

[Service]
Environment="OLLAMA_HOST=0.0.0.0"

Save, exit, and restart the daemon:

sudo systemctl daemon-reload
sudo systemctl restart ollama

Switch to a Reliable DNS

Change your system's DNS provider to Cloudflare (1.1.1.1) or Google (8.8.8.8). Because Ollama's CDN uses geographic load balancing, a different DNS resolver can steer your system away from a degraded edge server that is aggressively dropping streams.
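Before changing anything, you can probe whether the IPv4 and IPv6 paths to the registry behave differently. A sketch, assuming curl is installed and the default registry host registry.ollama.ai:

```shell
# Compare IPv4 vs IPv6 reachability of the Ollama registry.
# A large timing gap, or an unreachable IPv6 path, points at routing issues.
curl -4 -sS -o /dev/null --max-time 10 \
  -w "IPv4: HTTP %{http_code} in %{time_total}s\n" \
  https://registry.ollama.ai/ || echo "IPv4: unreachable"
curl -6 -sS -o /dev/null --max-time 10 \
  -w "IPv6: HTTP %{http_code} in %{time_total}s\n" \
  https://registry.ollama.ai/ || echo "IPv6: unreachable"
```

If only the IPv6 probe fails or is dramatically slower, forcing IPv4 (above) is the right lever; if both struggle, move on to the DNS change.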

Step 3: The Bulletproof Workaround (Manual GGUF Import)

If your network strictly prohibits the concurrent stream behavior of the Ollama registry, you can bypass ollama pull entirely.

Under the hood, Ollama models are packaged GGUF weight files. You can download the weights directly via a browser or a resilient download manager (such as wget or aria2c), and then build the Ollama model locally.

1. Download the GGUF File

Navigate to HuggingFace and locate the GGUF version of your desired model (e.g., Meta-Llama-3-8B-Instruct.Q4_K_M.gguf). Download it using a tool that supports robust resuming:

# Using wget with infinite retries and continue support
wget -c --retry-connrefused --tries=0 --timeout=15 "https://huggingface.co/path/to/model.gguf" -O model.gguf
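Once the download finishes, verify the file before importing it. A sketch, assuming the model page publishes a SHA-256 checksum (the placeholder below must be replaced with the real hash):

```shell
# Verify the downloaded GGUF against the published SHA-256.
expected="<sha256-published-on-the-model-page>"   # placeholder, replace me
actual=$(sha256sum model.gguf 2>/dev/null | awk '{print $1}')
if [ "$actual" = "$expected" ]; then
  echo "checksum OK"
else
  echo "checksum MISMATCH -- re-download before importing"
fi
```

A mismatch here means the resume logic stitched the file together incorrectly; restarting the wget command with `-c` will usually repair only the tail, so a full re-download is safer.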

2. Create a Modelfile

In the same directory as your downloaded model.gguf, create a file literally named Modelfile. This file dictates how Ollama should parse and configure the weights.

# Modelfile
FROM ./model.gguf

# Optional: Define system parameters or templates
PARAMETER temperature 0.7
PARAMETER num_ctx 4096

TEMPLATE """{{ if .System }}<|start_header_id|>system<|end_header_id|>

{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>

{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>

"""

Note: Ensure you use the correct prompt template for your specific model architecture.
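If you already have an official build of the same architecture installed, you can copy its template instead of writing one by hand (llama3 below is just an example model name):

```shell
# Print the Modelfile (including TEMPLATE) of an already-installed model.
ollama show llama3 --modelfile 2>/dev/null \
  || echo "llama3 is not installed locally; substitute a model you have"
```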

3. Build the Model Locally

Use the ollama create command to ingest the local GGUF file into the Ollama blob registry. This operation requires zero network usage.

ollama create my-local-model -f Modelfile

Once the terminal outputs success, you can run the model normally:

ollama run my-local-model
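As a final check, you can exercise the imported model through the local REST API. A sketch, assuming the daemon is listening on the default port 11434 and the model name used above:

```shell
# Smoke-test the imported model via Ollama's REST API.
curl -sS --max-time 60 http://localhost:11434/api/generate \
  -d '{"model": "my-local-model", "prompt": "Say hello", "stream": false}' \
  || echo "daemon not reachable on :11434"
```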

Common Pitfalls and Edge Cases

Docker Container Environments

If you are running Ollama inside a Docker container (ollama/ollama), the partial files exist inside the container's volume. To clear them, you must execute the shell command from within the running container:

docker exec -it ollama-container-name find /root/.ollama/models/blobs -name "*.partial" -delete

VPN and MTU Size

Running ollama pull over a corporate VPN often triggers the reverting-progress bug. VPNs typically lower your effective MTU to accommodate encryption overhead, so Ollama's packets may exceed it, causing fragmentation and silent packet loss. Temporarily disabling the VPN, or lowering your network adapter's MTU to around 1350, can stabilize the pull.
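To check and adjust the MTU on Linux, you can use the ip tool (eth0 below is a placeholder; find your interface name with `ip -br link`):

```shell
# Show the current MTU of the interface (eth0 is an assumed name).
ip link show eth0 2>/dev/null | grep -o 'mtu [0-9]*' \
  || echo "interface eth0 not found"

# Temporarily lower the MTU for the pull, then restore the original value:
#   sudo ip link set dev eth0 mtu 1350
#   sudo ip link set dev eth0 mtu 1500
```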

Conclusion

The "Ollama pull stuck" and progress reverting issues are fundamentally network concurrency problems. By clearing corrupted .partial blobs, you prevent the daemon from trapping itself in a local retry loop. By optimizing your network routing, or bypassing the multi-stream registry entirely via manual GGUF imports, you can reliably complete local model deployments even in the face of CDN rate limits.