How to get ROCM running in Docker to do machine learning on your AMD GPU

Drew Gillies
2 min read · Dec 6, 2022
Photo by <a href="https://unsplash.com/@lnz_uk?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Vagelis Lnz</a> on <a href="https://unsplash.com/s/photos/amd-gpu?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a>

Have you ever wanted to get up and running with machine learning on your desktop, only to find that PyTorch and TensorFlow don’t want to run on your AMD GPU and only support Nvidia GPUs? If so, this article should help you get started doing machine learning with the ROCm versions of PyTorch or TensorFlow in a Docker container. I have tested this on both Ubuntu 20.04 and Ubuntu 22.04, and it seems to work well.

I followed the instructions here: https://docs.amd.com/bundle/ROCm-Installation-Guide-v5.4/page/How_to_Install_ROCm.html#d23e2098 to get ROCm installed on my machine. Note that AMD will probably publish newer releases with different instructions, so it’s always good to check there, but here are the step-by-step instructions for Ubuntu 22.04. I’m assuming you already have Docker installed on your machine.

sudo apt-get update
wget https://repo.radeon.com/amdgpu-install/5.4/ubuntu/jammy/amdgpu-install_5.4.50400-1_all.deb
sudo apt-get install ./amdgpu-install_5.4.50400-1_all.deb
sudo amdgpu-install --usecase=rocm
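Once the install finishes, you can sanity-check it before touching Docker. Both `rocminfo` and `rocm-smi` ship with ROCm, so if these print information about your GPU you’re in good shape (you may need to reboot, or add your user to the `render` and `video` groups, first):

```shell
# List the compute agents ROCm can see; your GPU should show up
# with a gfx target name (e.g. gfx1030 for RDNA2 cards).
rocminfo | grep gfx

# Show driver-level GPU stats (temperature, clocks, utilization).
rocm-smi
```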

This is all I had to do to get ROCm installed on my host machine. Now let’s get a Docker container working. Here are the instructions for getting PyTorch, TensorFlow, and MAGMA installed in Docker containers: https://docs.amd.com/bundle/ROCm-Deep-Learning-Guide-v5.4-/page/Frameworks_Installation.html#d1667e103. I’ll show how I got PyTorch running, and how I ran it so that Jupyter notebooks are accessible from my host machine.

docker pull rocm/pytorch:latest
docker run --net=host -it --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --device=/dev/kfd --device=/dev/dri --group-add video --ipc=host --shm-size 8G rocm/pytorch:latest

The --net=host argument means that if you run jupyter notebook inside your container, you can access it on the same port on your host machine. The --device=/dev/kfd and --device=/dev/dri flags expose the GPU to the container, --group-add video grants the container permission to use it, and --ipc=host with --shm-size 8G gives PyTorch’s data loaders enough shared memory to work with.

Once inside the Docker container, you can run:

python
import torch
torch.cuda.is_available()

torch.cuda.is_available() should return True if it worked — ROCm builds of PyTorch expose the GPU through the CUDA API, so no code changes are needed. Then we can run some tests.
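As a quick smoke test beyond is_available(), you can run a small tensor op on the GPU. This is a minimal sketch; since ROCm PyTorch reuses the “cuda” device name, the exact same code runs on an Nvidia setup:

```python
import torch

# ROCm builds of PyTorch expose the GPU via the CUDA API, so "cuda" works here.
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using device: {device}")

# A small matrix multiply; on the GPU this exercises the ROCm/HIP stack.
x = torch.randn(1024, 1024, device=device)
y = torch.randn(1024, 1024, device=device)
z = x @ y
print(z.shape, z.device)
```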

cd apex/tests/L0
./run_rocm.sh

This will run a batch of tests. You can monitor your GPU utilization in another terminal with this command:

watch -n .1 rocm-smi

That’ll show you how much of your GPU is being utilized.

I used my AMD RX 6900 XT to get this working. I haven’t tested other AMD GPUs, but this was fun to actually get working.
