Installing NVIDIA Drivers on Ubuntu: Don’t Let Your High-End GPU Perform Like a Snail

Ubuntu tutorial - IT technology blog

Why is Ubuntu’s default driver not enough?

Right after installation, Ubuntu drives NVIDIA cards with the open-source Nouveau driver. This driver is fine for web browsing or word processing. However, if you plan to train PyTorch models or climb the ranks on Steam, Nouveau is a nightmare. It has no CUDA support, and on many recent cards it cannot raise the GPU to its full clock speeds, so frame rates collapse and the desktop stutters no matter how much other Ubuntu performance tuning you do.

After six months of managing a cluster of Ubuntu 22.04 workstations for an AI team, I’ve learned a painful lesson: never install drivers from the .run file on NVIDIA’s website unless you’re an expert. The .run installer works outside apt, so a kernel update pulled in by apt upgrade can leave the module unbuilt or in conflict with packaged files, leading to the legendary black screen. Installing packaged drivers from a PPA, so that apt manages them like everything else, is the most sustainable approach for a real-world production environment.

Step 1: Check hardware and clean up the system

First, identify your graphics card model to avoid installing an unsupported version. Type the following command into the terminal:

lspci | grep -i nvidia
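
It also helps to know which kernel driver is bound to the card before you change anything. The following is a minimal sketch, assuming the usual lspci -k layout (an unindented device line followed by indented detail lines); the function name nvidia_kernel_driver is hypothetical and only used for illustration:

```shell
# Print the kernel driver bound to each NVIDIA PCI device.
# Reads `lspci -k` style output on stdin.
nvidia_kernel_driver() {
    awk '
        # Unindented lines start a new device; remember whether it is an NVIDIA one.
        /^[0-9a-f]/ { is_nvidia = (tolower($0) ~ /nvidia/) }
        # Inside an NVIDIA device block, print the driver name (the last field).
        is_nvidia && /Kernel driver in use:/ { print $NF }
    '
}

# Only query the hardware when lspci is available.
if command -v lspci >/dev/null 2>&1; then
    lspci -k | nvidia_kernel_driver
fi
```

On a fresh install this typically prints nouveau; after a successful driver installation and reboot it should print nvidia.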

If the system has leftovers from an older or broken driver installation, purge them first to avoid “ghost” conflicts later. Quote the pattern so the shell cannot expand it against files in the current directory:

sudo apt-get purge 'nvidia*' -y
sudo apt-get autoremove -y

Step 2: Install Driver via PPA (Recommended)

Ubuntu’s default repositories often lag well behind NVIDIA’s driver releases. To get the best performance from an RTX 3090 or an RTX 40-series card, I always prioritize the graphics-drivers team’s PPA. In my experience, this repository usually picks up new releases within 24 to 48 hours of NVIDIA publishing them.

Add the repository and update the package list:

sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update

Next, let the ubuntu-drivers tool automatically recommend the optimal version:

ubuntu-drivers devices

The system will return a result similar to the following:

vendor   : NVIDIA Corporation
model    : GA102 [GeForce RTX 3090]
driver   : nvidia-driver-535 - third-party free recommended

To install the version flagged as recommended for your card, simply run:

sudo ubuntu-drivers autoinstall

If you need a specific version (such as version 535 to match CUDA 12.x), install that package by name instead, for example sudo apt install nvidia-driver-535. Either way, don’t forget to restart your computer so the kernel can load the new driver:

sudo reboot

Step 3: Configure NVIDIA Prime for Laptops

Specifically for those using laptops with discrete graphics cards: power management is vital. Many complain that their machines get hot and battery life drops like a stone after installing drivers. This is often because the NVIDIA card is always running in the background even during light tasks.

You can actively switch modes to save battery:

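  • Performance Mode (prime-select nvidia): Render everything on the NVIDIA card.
  • Intel Mode (prime-select intel): Completely disable the discrete card for maximum battery savings.
  • On-demand Mode (prime-select on-demand): Render on the integrated GPU and activate NVIDIA only for the applications that need it.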

Switching is a single command: sudo prime-select on-demand. Then log out and log back in for the change to take effect; a full reboot is the safer choice when switching to or from Intel mode.
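
In on-demand mode, an application runs on the NVIDIA card only when it is asked to. Below is a minimal sketch of one way to do that, using the PRIME render offload environment variables documented in NVIDIA’s driver README; the wrapper name run_on_nvidia is hypothetical:

```shell
# Run a single command on the NVIDIA GPU while the rest of the session
# stays on the integrated GPU (on-demand mode).
run_on_nvidia() {
    env __NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME=nvidia "$@"
}

# Show the currently selected profile when prime-select is installed.
if command -v prime-select >/dev/null 2>&1; then
    prime-select query
fi

# Example (glxinfo comes from the mesa-utils package):
#   run_on_nvidia glxinfo | grep "OpenGL renderer"
```

If the renderer string names your NVIDIA card, offloading works; without the wrapper, the same command should report the integrated GPU.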

Step 4: Testing and Performance Monitoring

After the computer restarts, check if the driver is “alive” using the command:

nvidia-smi

If the table shows the driver version and the card’s VRAM capacity, you’ve succeeded. However, a plain nvidia-smi call prints only a single snapshot. To monitor GPU load in real time while training models, I recommend nvtop.

sudo apt install nvtop

With nvtop, you get live utilization and memory charts alongside clock speed and temperature readings, plus a per-process list. It helps you spot immediately when a process is abnormally “hogging” all the VRAM.
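
For scripts and logs, where an interactive tool is no help, nvidia-smi can also emit plain CSV. The sketch below assumes the index, memory.used and memory.total query fields and the csv,noheader,nounits format; the helper name vram_report is hypothetical:

```shell
# Convert "index, used, total" lines (values in MiB, one line per GPU)
# into a short VRAM usage report. Malformed lines are skipped.
vram_report() {
    awk -F', *' 'NF == 3 && $3 > 0 {
        printf "GPU %s: %d%% of %d MiB used\n", $1, $2 * 100 / $3, $3
    }'
}

# Only query the GPU when the driver tools are installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=index,memory.used,memory.total \
               --format=csv,noheader,nounits | vram_report
fi

# For a refreshing view without nvtop, nvidia-smi can loop by itself:
#   nvidia-smi -l 1
```

Piping the query through a small filter like this keeps the output easy to grep or append to a training log.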

Troubleshooting: Common “Tricky” Issues

1. The Secure Boot Barrier

This is the most common reason why drivers don’t work after installation. If Secure Boot is enabled in the UEFI firmware, Ubuntu will ask you to create a MOK (Machine Owner Key) password during installation. On the next reboot, select Enroll MOK in the blue screen that appears and enter that password. If you accidentally skip this, the kernel will refuse to load the NVIDIA module because its signature is not trusted.
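
If you are unsure whether this is what happened, a quick check narrows it down. This is a minimal sketch, assuming mokutil and lsmod are installed (both normally are on Ubuntu); the helper name nvidia_module_loaded is hypothetical:

```shell
# Succeed if a module named exactly "nvidia" appears in lsmod-style input.
nvidia_module_loaded() {
    awk '$1 == "nvidia" { found = 1 } END { exit !found }'
}

if command -v mokutil >/dev/null 2>&1 && command -v lsmod >/dev/null 2>&1; then
    # Prints "SecureBoot enabled" or "SecureBoot disabled".
    mokutil --sb-state || echo "could not read the Secure Boot state"
    if lsmod | nvidia_module_loaded; then
        echo "nvidia module is loaded"
    else
        echo "nvidia module is NOT loaded; if Secure Boot is enabled, re-check MOK enrollment"
    fi
fi

# To be prompted for enrollment again on the next boot (assumes Ubuntu's
# shim-signed tooling is present):
#   sudo update-secureboot-policy --enroll-key
```

Secure Boot enabled together with a missing nvidia module is a strong hint that the enrollment step was skipped.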

2. Rescuing Your System from a Black Screen

Don’t panic if your machine gets stuck on a black screen. Press Ctrl + Alt + F3 to switch to a text console. Log in and purge the recently installed drivers with sudo apt-get purge 'nvidia*' (keep the quotes so the shell does not expand the pattern). After rebooting, the machine falls back to the default Nouveau driver, so you can get to a desktop and look for the real cause.

Real-world Experience from Production

In system operations, stability is always more important than having the latest version. For AI servers, especially ones hosting Python development environments, I usually stick with long-lived production-branch driver versions such as 525 or 535. Chasing new-feature or beta releases only makes the system more prone to crashing.

One final note: if you run GPU workloads in Docker (or in LXD containers), install the NVIDIA Container Toolkit. Without it, Docker cannot expose the GPU to containers. In that case, no matter how capable the host machine’s driver is, your container will still be forced to run on the slow CPU.
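
A quick way to confirm the toolkit works is to run nvidia-smi inside a throwaway container. This is a sketch under assumptions: the toolkit is already installed from NVIDIA’s repository, the ubuntu image is used as in NVIDIA’s sample workload, and the function name gpu_smoke_test is hypothetical:

```shell
# Run nvidia-smi in a disposable container to confirm the GPU is visible.
# The optional first argument selects the image (defaults to "ubuntu").
gpu_smoke_test() {
    docker run --rm --gpus all "${1:-ubuntu}" nvidia-smi
}

# Typical sequence on the host (needs root or membership in the docker group):
#   sudo nvidia-ctk runtime configure --runtime=docker
#   sudo systemctl restart docker
#   gpu_smoke_test
```

If the container prints the same table you see on the host, the GPU is reachable from Docker; an error about GPU device drivers or capabilities usually means the toolkit is missing or Docker was not restarted after configuring it.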
