Debian GNU/Linux

The tinyrocs operating system (OS) is based on Debian GNU/Linux.

Debian

Debian stable (bookworm/12) is the base installation.

Installation

Perform a minimal Debian server installation: no GUI, only the SSH server.

Dependencies

Install dependencies from Debian repositories.

# Debian Repositories /etc/apt/sources.list
deb http://deb.debian.org/debian/ bookworm main
deb http://security.debian.org/debian-security bookworm-security main
deb http://deb.debian.org/debian/ bookworm-updates main
deb http://deb.debian.org/debian/ bookworm-backports main

# Use IPv4 for apt
echo 'Acquire::ForceIPv4 "true";' | tee /etc/apt/apt.conf.d/99force-ipv4

# Install dependencies
apt update
apt install bc bison build-essential ccache cmake-curses-gui colordiff \
  cpufrequtils devscripts dpkg-dev equivs flex gfortran git haveged host \
  libbz2-dev libdrm-dev libedit-dev libegl1-mesa-dev libelf-dev libffi-dev \
  libglut-dev libhdf5-openmpi-dev liblzma-dev libncurses-dev libnuma-dev \
  libopenmpi-dev libpomp2-dev libsqlite3-dev libssl-dev libsystemd-dev \
  libudev-dev libxml2-dev libxml2-utils libz3-dev libzstd-dev lshw \
  lzma-dev mesa-common-dev net-tools ninja-build nlohmann-json3-dev \
  ntpsec-ntpdate nvme-cli ocl-icd-opencl-dev openmpi-bin pahole pkg-config \
  portaudio19-dev python3-argcomplete python3-pip python3-pygments \
  python3-venv python3-virtualenv python3-yaml quilt rsync rsyslog sshfs \
  sudo swig traceroute vim xxd python3-sphinx git-lfs hwdata \
  lua5.3 liblua5.3-dev libmpfr-dev libmsgpack-dev libfmt-dev \
  environment-modules python3-numpy pybind11-dev libopengl-dev zip zsh \
  hpcc gawk googletest libdw-dev libgtest-dev libsigsegv2 \
  libbabeltrace-dev libbabeltrace1 libbison-dev libncurses5-dev \
  libtext-unidecode-perl tex-common texinfo ucx-utils libucx-dev \
  librdmacm-dev libpci-dev python-is-python3

apt install -t bookworm-backports linux-cpupower

# These packages from Debian's repository are not used.
# Make sure they are removed.
apt purge --autoremove firmware-amd-graphics libhsa-runtime64-1 \
  opencl-c-headers rocminfo

apt clean
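
To confirm that the purge above took effect, a check along the following lines may help. This is a minimal sketch rather than part of the original procedure: the function name still_installed and the DPKG_QUERY override variable are hypothetical, and the check assumes only that dpkg-query reports a status ending in "installed" for packages that are present.

```shell
# Hypothetical helper: print each named package that dpkg still reports as installed.
# DPKG_QUERY is an illustrative override so the query command can be swapped out.
still_installed() (
  for pkg in "$@"; do
    # dpkg-query prints e.g. "install ok installed" and fails for unknown packages.
    status=$("${DPKG_QUERY:-dpkg-query}" -W -f='${Status}' "$pkg" 2>/dev/null) || continue
    case "$status" in
      *" installed") echo "$pkg" ;;
    esac
  done
)

# Example: prints nothing when the purge succeeded.
# still_installed firmware-amd-graphics libhsa-runtime64-1 opencl-c-headers rocminfo
```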

OS Configuration

Operating system configuration.

# Passwordless sudo for members of the sudo group
sed -i -e 's/%sudo\tALL=(ALL:ALL) ALL/%sudo ALL=(ALL) NOPASSWD: ALL/g' /etc/sudoers

# Add user to some groups (user name debian here):
adduser debian sudo
adduser debian render

# Disable unneeded startup services
systemctl disable XXX
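
Editing /etc/sudoers in place carries some risk, since a malformed file can disable sudo entirely. One cautious variant of the sed edit above is to patch a copy, validate it with visudo, and only then install it. The sketch below assumes GNU sed and the stock Debian %sudo line; the function name patch_sudoers and the /tmp/sudoers.new path are hypothetical.

```shell
# Hypothetical helper: apply the NOPASSWD substitution to the file given as $1.
# Matches the stock line "%sudo<TAB>ALL=(ALL:ALL) ALL" with any run of whitespace.
patch_sudoers() {
  sed -i -E 's/^%sudo[[:space:]]+ALL=\(ALL:ALL\)[[:space:]]+ALL/%sudo ALL=(ALL) NOPASSWD: ALL/' "$1"
}

# Intended use (as root): edit a copy, validate it, then install it.
# cp /etc/sudoers /tmp/sudoers.new
# patch_sudoers /tmp/sudoers.new
# visudo -c -f /tmp/sudoers.new && install -m 0440 -o root -g root /tmp/sudoers.new /etc/sudoers
```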

User Configuration

Set up the user account and configure it to use the caching services already available in the cluster.

ccache

A Redis server for ccache remote storage runs on the tinyrocs network. Edit ~/.config/ccache/ccache.conf as follows:

remote_storage = redis://192.168.1.2
remote_only = true
reshare = true
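
For scripted setups, the same three settings could be written with a small helper such as the one below. This is only a sketch: the function name write_ccache_conf is hypothetical, and it overwrites any existing ccache.conf in the target directory.

```shell
# Hypothetical helper: write the remote-storage settings into $1/ccache.conf.
# Defaults to the directory used above; note that an existing file is overwritten.
write_ccache_conf() (
  dir=${1:-"$HOME/.config/ccache"}
  mkdir -p "$dir"
  cat > "$dir/ccache.conf" <<'EOF'
remote_storage = redis://192.168.1.2
remote_only = true
reshare = true
EOF
)

# Example:
# write_ccache_conf
```

Afterwards, ccache --show-config should list the same three values.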

PATH

Add the ROCm binary path and ccache (XXX) to ~/.bashrc:

PATH=/usr/lib/ccache:/opt/rocm/bin:$PATH
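
On Debian, /usr/lib/ccache holds compiler-named symlinks to ccache, so placing it first in PATH routes compiler calls through the cache. If the line is added by a script, it is worth avoiding duplicates on repeated runs. A minimal sketch, with the hypothetical function name add_path_line:

```shell
# Hypothetical helper: append the PATH line to the file in $1 unless already present.
add_path_line() (
  rc=${1:-"$HOME/.bashrc"}
  # Single quotes keep $PATH literal so it is expanded when the shell starts.
  line='PATH=/usr/lib/ccache:/opt/rocm/bin:$PATH'
  touch "$rc"
  # -F fixed string, -x whole-line match, -q quiet
  grep -Fxq "$line" "$rc" || printf '%s\n' "$line" >> "$rc"
)

# Example:
# add_path_line
```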

Python pip cache

If the LAN pip cache (devpi) is available, use it by editing ~/.config/pip/pip.conf, for example:

[global]
trusted-host = 192.168.1.3
index-url = http://192.168.1.3:4040/root/pypi/+simple/

[search]
index = http://192.168.1.3:4040/root/pypi/
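
Since the cache is only to be used when available, the configuration could be generated conditionally. The sketch below prints the same pip.conf for a given host and port; the function name pip_conf_for is hypothetical, and the commented curl probe with its 3-second timeout is an assumed way of testing availability, not something the cluster setup prescribes.

```shell
# Hypothetical helper: print a pip.conf for a cache at host $1, port $2.
pip_conf_for() {
  printf '[global]\ntrusted-host = %s\nindex-url = http://%s:%s/root/pypi/+simple/\n\n' "$1" "$1" "$2"
  printf '[search]\nindex = http://%s:%s/root/pypi/\n' "$1" "$2"
}

# Intended use: write the file only when the cache answers within 3 seconds.
# if curl -fsS --max-time 3 -o /dev/null http://192.168.1.3:4040/root/pypi/+simple/; then
#   mkdir -p ~/.config/pip && pip_conf_for 192.168.1.3 4040 > ~/.config/pip/pip.conf
# fi
```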

Monitoring and Control

Applications to monitor and control the hardware.

Most require their dependencies to be built first.

nvtop

nvtop gives a quick view of the GPUs in a text console.

Build it with something like:

git clone https://github.com/syllo/nvtop
cd nvtop
mkdir build
cd build
ccmake ..
# Disable other GPUs, enable AMD
make

./src/nvtop
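
For unattended builds, the interactive ccmake step could be replaced by passing the options on the cmake command line. The sketch below assumes the option names NVIDIA_SUPPORT, INTEL_SUPPORT and AMDGPU_SUPPORT; confirm them in ccmake for the checked-out version, which may offer further GPU backends to switch off. The function name nvtop_cmake_opts is hypothetical.

```shell
# Hypothetical helper: print cmake options that keep only AMD GPU support.
# The option names are assumptions; verify them against the nvtop source tree.
nvtop_cmake_opts() {
  printf '%s\n' -DNVIDIA_SUPPORT=OFF -DINTEL_SUPPORT=OFF -DAMDGPU_SUPPORT=ON
}

# Intended use, from nvtop/build:
# cmake .. $(nvtop_cmake_opts) && make
```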

btop

git clone https://github.com/aristocratos/btop
cd btop/
rm -rf build
cmake -B build -G Ninja \
  -DCMAKE_BUILD_TYPE=Release \
  -DBTOP_GPU=ON \
  -DBTOP_RSMI_STATIC=ON

ninja -C build