AMD Instinct MI50 通过llama.cpp 在 ROCm7.0.2上运行-尧图网站建设

📅 发布时间：2026/6/21 21:06:16

关于网上传言MI50 ROCm7.0.2的性能提升了，这边做了下测试。

ROCm7.0.2安装方法:

ROCm 7.0 Install for Mi50 32GB | Ubuntu 24.04 LTS : r/LocalLLaMA

这边系统使用的ubuntu22.04

ROCm安装

wget https://repo.radeon.com/amdgpu-install/7.0.2/ubuntu/jammy/amdgpu-install_7.0.2.70002-1_all.deb
sudo apt install ./amdgpu-install_7.0.2.70002-1_all.deb
sudo apt update
sudo apt install python3-setuptools python3-wheel
sudo usermod -a -G render,video $LOGNAME # Add the current user to the render and video groups
sudo apt install rocm

Drivers安装

wget https://repo.radeon.com/amdgpu-install/7.0.2/ubuntu/jammy/amdgpu-install_7.0.2.70002-1_all.deb
sudo apt install ./amdgpu-install_7.0.2.70002-1_all.deb
sudo apt update
sudo apt install "linux-headers-$(uname -r)" "linux-modules-extra-$(uname -r)"
sudo apt install amdgpu-dkms

Guide:
1. Run the commands from the ROCm quick install: https://rocm.docs.amd.com/projects/install...
2. Before rebooting to complete the install, download the 6.4 rocblas from the AUR: https://archlinux.org/packages/extra/x86_6...
3. Extract it
4. Copy all files that contain the filename "gfx906" in rocblas-6.4.3-3-x86_64.pkg/opt/rocm/lib/rocblas/library to /opt/rocm/lib/rocblas/library
5. Reboot, enrolling MOK if needed
6. Check by running sudo update-alternatives --display rocm

Now you can build llama.cpp with ROCm + flash attention (adjust j value according to number of threads):

HIPCXX="$(hipconfig -l)/clang" HIP_PATH="$(hipconfig -R)" \
cmake -S . -B build -DGGML_HIP=ON -DAMDGPU_TARGETS=gfx906 -DGGML_HIP_ROCWMMA_FATTN=ON -DCMAKE_BUILD_TYPE=Release \
&& cmake --build build --config Release -- -j 16

Note: Vulkan also works, but in my findings prompt processing seems to be better on ROCm.

测试使用qwen3 vl 32b：

./llama-server -m ~/.lmstudio/models/huihui-ai/Huihui-Qwen3-VL-32B-Thinking-abliterated/ggml-model-Q4_K_M.gguf --port 8080

运行后进入浏览器测试

测试速度相对LM中Vulkan而言，感觉提升也不大，可能是对部分模型优化会更好，后面试试其他模型。