Two Ways to Load GGUF Models in Ollama

Administrator

2025-02-08

Ollama


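GGUF (commonly expanded as GPT-Generated Unified Format) is a binary file format designed for large machine learning models. A GGUF file is produced by optimizing and converting the original pretrained weights of a large model, which gives it fast loading and low resource consumption. The format supports memory mapping, so model data can be mapped directly into memory instead of being copied in, which makes data handling more efficient. GGUF is also designed to work across hardware platforms and runs efficiently on both CPU and GPU.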

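Ollama can load GGUF models either online or from a local file.

1. Online loading: pass the model's path on the hosting site directly

Supported sites include: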

modelscope.cn
huggingface.co or hf.co
hf-mirror.com

The format is:

ollama run modelscope.cn/{username}/{model}:{version}

For example:

ollama run modelscope.cn/unsloth/DeepSeek-R1-Distill-Qwen-7B-GGUF:DeepSeek-R1-Distill-Qwen-7B-Q4_K_M.gguf
or
ollama run hf-mirror.com/unsloth/DeepSeek-R1-Distill-Qwen-7B-GGUF:DeepSeek-R1-Distill-Qwen-7B-Q4_K_M.gguf
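The reference format above can be assembled from its parts. The following is a minimal sketch that reuses the repository and file name from the example; the variable names are illustrative, and the script only prints the command instead of running it:

```shell
#!/usr/bin/env bash
# Illustrative variable names; the values come from the example above.
host="modelscope.cn"                                  # or hf.co / hf-mirror.com
username="unsloth"
model="DeepSeek-R1-Distill-Qwen-7B-GGUF"
version="DeepSeek-R1-Distill-Qwen-7B-Q4_K_M.gguf"     # the GGUF file inside the repository

# Compose {host}/{username}/{model}:{version}
ref="${host}/${username}/${model}:${version}"

# Print the command rather than executing it, so the sketch runs without Ollama installed.
echo "ollama run ${ref}"
```

Changing `host` is enough to switch between the sites listed earlier, since the rest of the reference stays the same.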

2. Local offline loading, which takes slightly more work

First download the model file from one of the sites, for example:

https://modelscope.cn/models/unsloth/DeepSeek-R1-Distill-Qwen-7B-GGUF/file/view/master?fileName=DeepSeek-R1-Distill-Qwen-7B-Q4_K_M.gguf&status=2
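Before pointing Ollama at the download, it can help to confirm that the saved file really is a GGUF file and not, say, an HTML page saved by mistake. The following is a minimal sketch that relies on GGUF files beginning with the four ASCII bytes `GGUF`; the function name `is_gguf` and the path in the usage comment are illustrative assumptions:

```shell
#!/usr/bin/env bash
# is_gguf FILE: succeed if FILE starts with the 4-byte magic "GGUF".
is_gguf() {
    [ "$(head -c 4 "$1" 2>/dev/null)" = "GGUF" ]
}

# Usage (the path is an assumption about where the file was saved):
#   is_gguf ./DeepSeek-R1-Distill-Qwen-7B-Q4_K_M.gguf && echo "looks like GGUF"
```

This only checks the file header, so it does not detect a truncated or partially downloaded file.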

After the download finishes, create a new .txt file, write FROM followed by the path to the gguf file in it, and save it.
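The file only needs a single FROM line. The following is a minimal sketch that writes it from the shell, assuming the GGUF file was saved in the current directory; the file name `my-model.txt` is a hypothetical choice for the custom name:

```shell
#!/usr/bin/env bash
# Hypothetical names: adjust the path and file name to your setup.
gguf_path="./DeepSeek-R1-Distill-Qwen-7B-Q4_K_M.gguf"   # where the download was saved (assumed)
modelfile="my-model.txt"                                 # the custom .txt file (assumed name)

# Write a single FROM line pointing at the local GGUF file.
printf 'FROM %s\n' "$gguf_path" > "$modelfile"

# Show the result.
cat "$modelfile"
```

An absolute path works as well and avoids surprises if the create command is later run from a different directory.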

Open a command line and enter:

ollama create <custom-name> -f <custom-name>.txt

When the output shows success, the model is ready to run.
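The create step can be followed by a quick check that the model was registered. The following is a rough sketch, assuming the hypothetical custom name `my-model` and a matching `my-model.txt` in the current directory; it skips the commands when either Ollama or the file is missing:

```shell
#!/usr/bin/env bash
# Hypothetical names: "my-model" stands in for the custom name.
name="my-model"
modelfile="${name}.txt"

if command -v ollama >/dev/null 2>&1 && [ -f "$modelfile" ]; then
    # Register the local GGUF under the custom name, then list installed models.
    ollama create "$name" -f "$modelfile" && ollama list
    # Start an interactive session afterwards with: ollama run "$name"
else
    echo "ollama or ${modelfile} not found; skipping"
fi
```

If the custom name appears in the `ollama list` output, `ollama run my-model` starts a session with the locally loaded model.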

Those are the two ways to load GGUF models in Ollama.