How to Create a GPU-Optimized Machine Image with HashiCorp Packer on GCP

by SkillAiNest

Whenever you spin up GPU infrastructure, you do the same thing: install CUDA drivers and DCGM, apply OS-level GPU tuning, and fight dependency issues. It's the same ritual every time, wasting expensive cloud credits and causing frustration before the real work starts.

In this article, you’ll create a reusable GPU-optimized machine image using Packer that comes preloaded with NVIDIA drivers, the CUDA Toolkit, the NVIDIA Container Toolkit, DCGM, persistence mode, and system-level GPU tuning.


Prerequisites

  • HashiCorp Packer >= 1.9

  • Google Compute Packer plugin (installed via packer init)

  • Optionally, the AWS Packer plugin for EC2 builds, by adding an amazon-ebs source block

  • GCP project enabled with Compute Engine API (or AWS account with EC2 access)

  • GCP authentication (gcloud auth application-default login) or AWS credentials

  • Access to an NVIDIA GPU instance type (e.g., A100, H100, L4 on GCP; p4d, p5, g6 on AWS)

Project setup

Step 1: Install Packer.

To get started on macOS, you’ll install Packer with the steps below (or you can follow the official documentation for Linux and Windows installation).

First, you will install the official Packer formula from the terminal.

Add the HashiCorp tap, the repository of all HashiCorp packages:

$ brew tap hashicorp/tap

Now, install Packer from hashicorp/tap/packer:

$ brew install hashicorp/tap/packer

Step 2: Set up the project directory.

With Packer installed, create your project directory. For clean code and separation of concerns, your project directory should look like the tree below. Go ahead and create these files in a packer_demo folder using the command below:

mkdir -p packer_demo/script && touch packer_demo/{build.pkr.hcl,source.pkr.hcl,variable.pkr.hcl,local.pkr.hcl,plugins.pkr.hcl,values.pkrvars.hcl} packer_demo/script/base.sh

Your file directory should look like this:

packer_demo
├── build.pkr.hcl                 # Build pipeline — provisioner ordering
├── source.pkr.hcl                # GCP source definition (googlecompute)
├── variable.pkr.hcl              # Variable definitions with defaults
├── local.pkr.hcl                 # Local values
├── plugins.pkr.hcl               # Packer plugin requirements
├── values.pkrvars.hcl            # Variable values (copy and customize)
└── script/
    └── base.sh                   # Provisioning script

Step 3: Install packer plugins.

In your plugins.pkr.hcl file, define the packer block. The packer {} block contains configuration settings, including the desired plugin versions. Inside it, a required_plugins block declares every plugin the template needs to build your image. If you’re on Azure or AWS, you can check the Packer plugin registry for the latest plugins.

packer {
  required_plugins {
    googlecompute = {
      source  = "github.com/hashicorp/googlecompute"
      version = "~> 1"
    }
  }
}

Then, initialize your Packer plugins with the command below:

packer init .

Step 4: Define your source.

With your plugin installed, you can now define your source block. A source block configures a specific builder plugin, which is then invoked by a build block. The source block contains your project_id, the zone where your machine will be built, source_image_family (think of it as your base image family, like Debian or Ubuntu), and your source_image_project_id.

In GCP, each image belongs to a project, such as ubuntu-os-cloud for Ubuntu. You will set machine_type to a GPU machine type: because you are building a base image for GPU machines, the build instance must be able to run your commands.

source "googlecompute" "gpu-node" {
  project_id              = var.project_id
  zone                    = var.zone
  source_image_family     = var.image_family
  source_image_project_id = var.image_project_id
  ssh_username            = var.ssh_username
  machine_type            = var.machine_type



  image_name        = var.image_name
  image_description = var.image_description

  disk_size           = var.disk_size
  on_host_maintenance = "TERMINATE"

  tags = ("gpu-node")

}

The setting on_host_maintenance = "TERMINATE" on Google Compute Engine ensures that the VM instance stops during infrastructure maintenance instead of live-migrating. This is important when using GPUs or specialized hardware that cannot be migrated, and it prevents data corruption.

You will define all your variables in the variable.pkr.hcl file and set their values in values.pkrvars.hcl. Always remember to add values.pkrvars.hcl to your .gitignore.
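For example, a minimal .gitignore for this project might look like the fragment below (the packer_cache entry is an assumption; Packer writes its download cache there by default):

```
# Keep machine-specific values and Packer caches out of version control
values.pkrvars.hcl
*.auto.pkrvars.hcl
packer_cache/
```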

variable "image_name" {
  type        = string
  description = "The name of the resulting image"
}

variable "image_description" {
  type        = string
  description = "Description of the image"
}

variable "project_id" {
  type        = string
  description = "The GCP project ID where the image will be created"
}

variable "image_family" {
  type        = string
  description = "The image family to which the resulting image belongs"
}

variable "image_project_id" {
  type        = list(string)
  description = "The project ID(s) to search for the source image"
}

variable "zone" {
  type        = string
  description = "The GCP zone where the build instance will be created"
}

variable "ssh_username" {
  type        = string
  description = "The SSH username to use for connecting to the instance"
}
variable "machine_type" {
  type        = string
  description = "The machine type to use for the build instance"
}

variable "cuda_version" {
  type        = string
  description = "CUDA toolkit version"
  default     = "13.1"
}

variable "driver_version" {
  type        = string
  description = "NVIDIA driver version"
  default     = "590.48.01"
}

variable "disk_size" {
  type        = number
  description = "Boot disk size in GB"
  default     = 50
}

And in values.pkrvars.hcl:

image_name        = "base-gpu-image-{{timestamp}}"
image_description = "Ubuntu 24.04 LTS with gpu drivers and health checks"
project_id        = "your gcp project id"
image_family      = "ubuntu-2404-lts-amd64"
image_project_id  = ["ubuntu-os-cloud"]
zone              = "us-central1-a"
ssh_username      = "packer"
machine_type      = "g2-standard-4"
disk_size        = 50
driver_version   = "590.48.01"
cuda_version      = "13.1" 
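If you prefer to stamp the image name at invocation time rather than relying on template interpolation, a hedged shell sketch using the same naming scheme:

```shell
# Build a unique image name from the current Unix timestamp,
# mirroring the base-gpu-image-<timestamp> pattern above.
name="base-gpu-image-$(date +%s)"
echo "$name"
```

You could then pass it on the command line with packer build -var "image_name=$name" instead of setting it in the var file.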

Step 5: Writing the Build Template

Create build.pkr.hcl. The build block creates a temporary instance, runs provisioners, and builds an image.

The provisioners in this template are ordered as follows:

  • The first provisioner runs system updates and upgrades.

  • The second provisioner reboots the instance (expect_disconnect = true).

  • The third provisioner waits for the instance to come back (pause_before), then runs script/base.sh. It sets max_retries to handle transient SSH timeouts and passes the DRIVER_VERSION and CUDA_VERSION environment variables.

Finally, a post-processor reports the image ID and completion status:

build {
  sources = ["source.googlecompute.gpu-node"]

  provisioner "shell" {
    inline = [
      "set -e",
      "sudo apt update",
      "sudo apt -y dist-upgrade"
    ]
  }

  provisioner "shell" {
    expect_disconnect = true
    inline            = ["sudo reboot"]
  }

  # Base: NVIDIA drivers, CUDA, DCGM
  provisioner "shell" {
    pause_before = "60s"
    script       = "script/base.sh"
    max_retries  = 2
    environment_vars = [
      "DRIVER_VERSION=${var.driver_version}",
      "CUDA_VERSION=${var.cuda_version}"
    ]
  }

  post-processor "shell-local" {
    inline = [
      "echo '=== Image Build Complete ==='",
      "echo 'Image ID: ${build.ID}'",
      "date"
    ]
  }
}

Step 6: Writing the GPU Provisioning Script

Now we’ll go through the base script and break down its key parts.

Section 1: Installing Kernel Headers and Build Tools

Before installing NVIDIA drivers, the system requires kernel headers and build tools. The NVIDIA driver compiles a kernel module during installation via DKMS, so if the headers for your running kernel are not present, the build will fail silently, and the driver will not load at boot.

log "Installing kernel headers and build tools..."
sudo apt-get install -qq -y \
  "linux-headers-$(uname -r)" \
  build-essential \
  dkms \
  curl \
  wget
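As a sanity check before the driver install, you can verify that the matching headers package is present. This is a hedged sketch: the dpkg query is Debian/Ubuntu-specific and is not part of the original script.

```shell
# Name of the headers package that must match the running kernel
hdr_pkg="linux-headers-$(uname -r)"
echo "checking for ${hdr_pkg}"

# dpkg -s exits non-zero when the package is missing
if ! dpkg -s "$hdr_pkg" >/dev/null 2>&1; then
  echo "WARNING: ${hdr_pkg} not installed; the DKMS build will fail" >&2
fi
```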

Section 2: Installing NVIDIA’s Apt Repository

This snippet downloads and installs NVIDIA’s official keyring package based on your OS Linux distribution, which adds the trusted signing keys required by the system to authenticate CUDA packages.

log "Adding NVIDIA CUDA apt repository (${DISTRO})..."
wget -q " \
  -O /tmp/cuda-keyring.deb
sudo dpkg -i /tmp/cuda-keyring.deb
rm /tmp/cuda-keyring.deb
sudo apt-get update -qq
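The ${DISTRO} value in that URL is derived from /etc/os-release. The derivation can be sketched as a small helper (the function name is hypothetical; the logic mirrors the DISTRO line in base.sh):

```shell
# Turn an os-release ID and VERSION_ID into the CUDA repo slug,
# e.g. ubuntu + 24.04 -> ubuntu2404 (dots removed)
distro_slug() {
  local id="$1" version_id="$2"
  echo "${id}${version_id}" | tr -d '.'
}

distro_slug ubuntu 24.04   # → ubuntu2404
distro_slug debian 12      # → debian12
```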

Section 3: Pinning the NVIDIA Driver Version

Pinning an NVIDIA driver to a specific version ensures that the system always installs and uses the exact same driver version, even when new drivers appear in the repository.

NVIDIA drivers are tightly coupled with CUDA Toolkit versions, kernel versions, and container runtimes such as Docker or NVIDIA Container Toolkit.

A mismatch, such as auto-upgrading the system to a new driver, can cause CUDA to stop working, break GPU acceleration, or make the machine image inconsistent across deployments.

log "Pinning driver to version ${DRIVER_VERSION}..."
sudo apt-get install -qq -y "nvidia-driver-pinning-${DRIVER_VERSION}"
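An alternative (or belt-and-braces) approach is an apt preferences file. This is a minimal sketch using the standard apt_preferences syntax; the file path and package globs are assumptions, not part of the original script:

```
# /etc/apt/preferences.d/nvidia-pin — hold NVIDIA packages at the pinned version
Package: nvidia-* libnvidia-*
Pin: version 590.48.01*
Pin-Priority: 1001
```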

Section 4: Installing the Driver

The libnvidia-compute package installs only the compute-relevant userspace libraries (CUDA driver components), whereas nvidia-dkms-open installs the open-source NVIDIA kernel module, built locally by DKMS.

Together, these packages give you a fully functional CUDA driver environment without any GUI or graphics dependencies.

Here, we use NVIDIA’s compute-only driver stack with open-source kernel modules because it deliberately avoids installing any display-related components, which you don’t need.

This DKMS-based approach integrates well with Linux distros while staying lightweight and compute-focused.

log "Installing NVIDIA compute-only driver (open kernel modules)..."
sudo apt-get -V install -y \
  libnvidia-compute \
  nvidia-dkms-open

Section 5: Installing the CUDA Toolkit

This part of the script installs CUDA Toolkit for a specific version and then ensures that CUDA’s executables and libraries are available system-wide for each user and each shell session.

This adds the CUDA binaries to the PATH, so commands like nvcc, cuda-gdb, and cuda-memcheck work without specifying full paths. It also adds the CUDA libraries to LD_LIBRARY_PATH, so applications can find CUDA shared libraries at runtime.

log "Installing CUDA Toolkit ${CUDA_VERSION}..."
sudo apt-get install -qq -y "cuda-toolkit-${CUDA_VERSION}"

# Persist CUDA paths for all users and sessions
cat <<'EOF' | sudo tee /etc/profile.d/cuda.sh
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:${LD_LIBRARY_PATH:-}
EOF
echo "/usr/local/cuda/lib64" | sudo tee /etc/ld.so.conf.d/cuda.conf
sudo ldconfig
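One detail worth noting: the ${LD_LIBRARY_PATH:-} expansion falls back to an empty string when the variable is unset, so the exported line cannot trip the unbound-variable error under set -u. A quick standalone sketch:

```shell
set -u
unset LD_LIBRARY_PATH

# ${LD_LIBRARY_PATH:-} expands to "" when the variable is unset,
# so this works even in strict (set -u) scripts
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:${LD_LIBRARY_PATH:-}
echo "$LD_LIBRARY_PATH"   # → /usr/local/cuda/lib64:
```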

Section 6: NVIDIA Container Toolkit

This block installs and configures the NVIDIA Container Toolkit so that containers (under Docker or containerd) can safely and correctly access the GPU. This is an important step for Kubernetes GPU nodes, Docker GPU workloads, and any system that needs GPU acceleration inside containers.

log "Installing NVIDIA Container Toolkit..."
# URLs follow NVIDIA's documented libnvidia-container repo setup
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey \
  | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg

curl -fsSL https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list \
  | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' \
  | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

sudo apt-get update -qq
sudo apt-get install -qq -y nvidia-container-toolkit

# Configure for containerd (primary Kubernetes runtime)
sudo nvidia-ctk runtime configure --runtime=containerd

# Configure for Docker if present on this image
if systemctl list-unit-files | grep -q "^docker.service"; then
  sudo nvidia-ctk runtime configure --runtime=docker
fi

Section 7: Installing DCGM (Data Center GPU Manager)

This section covers installation and verification of NVIDIA DCGM (Data Center GPU Manager), NVIDIA’s official management and telemetry framework for data center GPUs.

It offers health monitoring and diagnostics, telemetry (including temperature, clocks, power, and usage), error reporting, and integration with Kubernetes, Prometheus, and monitoring agents. Your GPU monitoring stack depends on it.

The script extracts the installed DCGM version and checks that it meets the minimum version (4.3) required for the 590+ NVIDIA driver series, then enforces that requirement. This prevents a mismatch between the GPU driver and DCGM, which would break monitoring and health checks. It also enables Fabric Manager for NVLink/NVSwitch systems if you are on a multi-GPU topology such as A100/H100 DGX or other multi-GPU servers.

log "Installing DCGM..."
sudo apt-get install -qq -y datacenter-gpu-manager

DCGM_VER=$(dpkg -s datacenter-gpu-manager 2>/dev/null | awk '/^Version:/{print $2}' | sed 's/^[0-9]*://')
DCGM_MAJOR=$(echo "${DCGM_VER}" | cut -d. -f1)
DCGM_MINOR=$(echo "${DCGM_VER}" | cut -d. -f2)
if [[ "${DCGM_MAJOR}" -lt 4 ]] || { [[ "${DCGM_MAJOR}" -eq 4 ]] && [[ "${DCGM_MINOR}" -lt 3 ]]; }; then
  error "DCGM ${DCGM_VER} is below the 4.3 minimum required for driver 590+. Check your CUDA repo."
fi
log "DCGM installed: ${DCGM_VER}"

sudo systemctl enable nvidia-dcgm
sudo systemctl start  nvidia-dcgm

# Fabric Manager — only needed for NVLink/NVSwitch GPUs (A100/H100 multi-GPU nodes)
if systemctl list-unit-files | grep -q "^nvidia-fabricmanager.service"; then
  log "Enabling nvidia-fabricmanager for NVLink GPUs..."
  sudo systemctl enable nvidia-fabricmanager
  sudo systemctl start  nvidia-fabricmanager
fi
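The sed in the version check strips the Debian epoch prefix (the leading N: before the upstream version) so the numeric comparison works. Isolated as a hypothetical helper:

```shell
# Remove a leading "N:" epoch from a dpkg version string
strip_epoch() { echo "$1" | sed 's/^[0-9]*://'; }

strip_epoch "1:3.3.9"    # → 3.3.9
strip_epoch "4.3.0-1"    # → 4.3.0-1 (no epoch, unchanged)
```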

Section 8: Enabling Persistence Mode

The NVIDIA driver normally unloads itself when the GPU is idle. When a new workload starts, the driver must be reloaded, the GPU reinitialized, and memory mappings set up. This adds anywhere from a few hundred milliseconds to several seconds of latency, depending on the GPU and system.

Enabling nvidia-persistenced keeps the NVIDIA driver loaded in memory even when no GPU workloads are running.

log "Enabling nvidia-persistenced..."
sudo systemctl enable nvidia-persistenced
sudo systemctl start  nvidia-persistenced

Section 9: System Tuning for GPU Compute Workloads

This block applies a set of system-level performance and stability tunings that are standard for high-performance GPU servers, Kubernetes GPU nodes, and ML/AI workloads.

Each line targets a specific bottleneck or instability pattern that appears in a real GPU production environment.

  • Swap and memory behavior: Disabling swap and setting vm.swappiness=0 prevents the kernel from pushing GPU-bound processes to swap. GPU workloads are highly sensitive to latency, and swapping can cause CUDA context resets and GPU driver timeouts.

  • Huge pages for large memory allocations: Setting vm.nr_hugepages=2048 allocates a pool of huge pages, which reduces TLB pressure for large contiguous memory allocations.

    CUDA, NCCL, and deep learning frameworks often allocate large buffers, and huge pages reduce page-table overhead, improve memory bandwidth, and reduce latency for large tensor operations. This is especially useful on multi-GPU servers.

  • CPU frequency governor: Installing cpupower and forcing the CPU governor to performance ensures the CPU stays at maximum frequency instead of scaling down.

    GPU workloads often become CPU-bound during data preprocessing, kernel launches, and NCCL communication. Keeping CPUs at full speed reduces latency and improves throughput.

  • NUMA and topology tools: Installing numactl, libnuma-dev, and hwloc provides tools for pinning processes to NUMA nodes, understanding CPU–GPU affinity, and optimizing multi-GPU placement.

  • Disabling irqbalance: Stopping and disabling irqbalance lets the NVIDIA driver manage interrupt affinity. On GPU servers, irqbalance can route GPU interrupts to suboptimal CPUs, causing high latency and low throughput.

log "Applying system tuning..."

# Disable swap (critical for Kubernetes scheduler and ML stability)
sudo swapoff -a
sudo sed -i '/ swap / s/^/#/' /etc/fstab
echo "vm.swappiness=0"     | sudo tee /etc/sysctl.d/99-gpu-swappiness.conf

# Hugepages — reduces TLB pressure for large memory allocations
echo "vm.nr_hugepages=2048" | sudo tee /etc/sysctl.d/99-gpu-hugepages.conf

# CPU performance governor
sudo apt-get install -qq -y linux-tools-common "linux-tools-$(uname -r)" || true
sudo cpupower frequency-set -g performance || true

# NUMA and topology tools for GPU affinity tuning
sudo apt-get install -qq -y numactl libnuma-dev hwloc

# Disable irqbalance — let NVIDIA driver manage interrupt affinity
sudo systemctl disable irqbalance || true
sudo systemctl stop    irqbalance || true

# Apply all sysctl settings now
sudo sysctl --system
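To size the huge-page reservation: with the default 2 MiB huge page on x86_64, vm.nr_hugepages=2048 pins about 4 GiB of RAM, so make sure the build machine and target instances can spare it. A quick arithmetic check:

```shell
# 2048 pages x 2 MiB per page, expressed in GiB
pages=2048
page_mib=2
echo $(( pages * page_mib / 1024 ))   # → 4
```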

Full base.sh script here:

#!/bin/bash
set -euo pipefail

log()   { echo "(BASE) $1"; }
error() { echo "(BASE)(ERROR) $1" >&2; exit 1; }

###############################################################
###############################################################
(( -z "${DRIVER_VERSION:-}" )) && error "DRIVER_VERSION is not set."
(( -z "${CUDA_VERSION:-}"   )) && error "CUDA_VERSION is not set."

log "DRIVER_VERSION : ${DRIVER_VERSION}"
log "CUDA_VERSION   : ${CUDA_VERSION}"

DISTRO=$(. /etc/os-release && echo "${ID}${VERSION_ID}" | tr -d '.')
ARCH="x86_64"

export DEBIAN_FRONTEND=noninteractive

###############################################################
# 1. System update
###############################################################
log "Updating system packages..."
sudo apt-get update -qq
sudo apt-get upgrade -qq -y

###############################################################
# 2. Pre-installation — kernel headers
#    Source: 
###############################################################
log "Installing kernel headers and build tools..."
sudo apt-get install -qq -y \
  "linux-headers-$(uname -r)" \
  build-essential \
  dkms \
  curl \
  wget

###############################################################
# 3. NVIDIA CUDA Network Repository
###############################################################
log "Adding NVIDIA CUDA apt repository (${DISTRO})..."
wget -q " \
  -O /tmp/cuda-keyring.deb
sudo dpkg -i /tmp/cuda-keyring.deb
rm /tmp/cuda-keyring.deb
sudo apt-get update -qq

###############################################################
# 4. Pin driver version BEFORE installation (590+ requirement)
###############################################################
log "Pinning driver to version ${DRIVER_VERSION}..."
sudo apt-get install -qq -y "nvidia-driver-pinning-${DRIVER_VERSION}"

###############################################################
# 5. Compute-only (headless) driver — Open Kernel Modules
#    Source: NVIDIA Driver Installation Guide — Compute-only System (Open Kernel Modules)
#
#    libnvidia-compute  = compute libraries only (no GL/Vulkan/display)
#    nvidia-dkms-open   = open-source kernel module built via DKMS
#
#    Open kernel modules are the NVIDIA-recommended choice for
#    Ampere, Hopper, and Blackwell data centre GPUs (A100, H100, etc.)
###############################################################
log "Installing NVIDIA compute-only driver (open kernel modules)..."
sudo apt-get -V install -y \
  libnvidia-compute \
  nvidia-dkms-open

###############################################################
# 6. CUDA Toolkit
###############################################################
log "Installing CUDA Toolkit ${CUDA_VERSION}..."
sudo apt-get install -qq -y "cuda-toolkit-${CUDA_VERSION}"

# Persist CUDA paths for all users and sessions
cat <<'EOF' | sudo tee /etc/profile.d/cuda.sh
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:${LD_LIBRARY_PATH:-}
EOF
echo "/usr/local/cuda/lib64" | sudo tee /etc/ld.so.conf.d/cuda.conf
sudo ldconfig

###############################################################
# 7. NVIDIA Container Toolkit
#    Required for GPU workloads in Docker / containerd / Kubernetes
###############################################################
log "Installing NVIDIA Container Toolkit..."
# URLs follow NVIDIA's documented libnvidia-container repo setup
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey \
  | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg

curl -fsSL https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list \
  | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' \
  | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

sudo apt-get update -qq
sudo apt-get install -qq -y nvidia-container-toolkit

# Configure for containerd (primary Kubernetes runtime)
sudo nvidia-ctk runtime configure --runtime=containerd

# Configure for Docker if present on this image
if systemctl list-unit-files | grep -q "^docker.service"; then
  sudo nvidia-ctk runtime configure --runtime=docker
fi

###############################################################
# 8. DCGM — DataCenter GPU Manager
###############################################################
log "Installing DCGM..."
sudo apt-get install -qq -y datacenter-gpu-manager
 
DCGM_VER=$(dpkg -s datacenter-gpu-manager 2>/dev/null | awk '/^Version:/{print $2}' | sed 's/^[0-9]*://')
DCGM_MAJOR=$(echo "${DCGM_VER}" | cut -d. -f1)
DCGM_MINOR=$(echo "${DCGM_VER}" | cut -d. -f2)
if [[ "${DCGM_MAJOR}" -lt 4 ]] || { [[ "${DCGM_MAJOR}" -eq 4 ]] && [[ "${DCGM_MINOR}" -lt 3 ]]; }; then
  error "DCGM ${DCGM_VER} is below the 4.3 minimum required for driver 590+. Check your CUDA repo."
fi
log "DCGM installed: ${DCGM_VER}"

sudo systemctl enable nvidia-dcgm
sudo systemctl start  nvidia-dcgm

# Fabric Manager — only needed for NVLink/NVSwitch GPUs (A100/H100 multi-GPU nodes)
if systemctl list-unit-files | grep -q "^nvidia-fabricmanager.service"; then
  log "Enabling nvidia-fabricmanager for NVLink GPUs..."
  sudo systemctl enable nvidia-fabricmanager
  sudo systemctl start  nvidia-fabricmanager
fi

###############################################################
# 9. NVIDIA Persistence Daemon
#    Keeps the driver loaded between jobs — reduces cold-start
#    latency on the first CUDA call in each new workload
###############################################################
log "Enabling nvidia-persistenced..."
sudo systemctl enable nvidia-persistenced
sudo systemctl start  nvidia-persistenced

###############################################################
# 10. System tuning for GPU compute workloads
###############################################################
log "Applying system tuning..."

# Disable swap (critical for Kubernetes scheduler and ML stability)
sudo swapoff -a
sudo sed -i '/ swap / s/^/#/' /etc/fstab
echo "vm.swappiness=0"     | sudo tee /etc/sysctl.d/99-gpu-swappiness.conf

# Hugepages — reduces TLB pressure for large memory allocations
echo "vm.nr_hugepages=2048" | sudo tee /etc/sysctl.d/99-gpu-hugepages.conf

# CPU performance governor
sudo apt-get install -qq -y linux-tools-common "linux-tools-$(uname -r)" || true
sudo cpupower frequency-set -g performance || true

# NUMA and topology tools for GPU affinity tuning
sudo apt-get install -qq -y numactl libnuma-dev hwloc

# Disable irqbalance — let NVIDIA driver manage interrupt affinity
sudo systemctl disable irqbalance || true
sudo systemctl stop    irqbalance || true

# Apply all sysctl settings now
sudo sysctl --system

###############################################################
# Done
###############################################################
log "============================================"
log "Base layer provisioning complete."
log "  OS      : ${DISTRO}"
log "  Driver  : ${DRIVER_VERSION} (open kernel modules, compute-only)"
log "  CUDA    : cuda-toolkit-${CUDA_VERSION}"
log "  DCGM    : ${DCGM_VER}"
log "============================================"

Step 7: Validating and Running the Build

Validate the template first, then run the build. Validation catches syntax or variable errors early, so the build doesn’t start on a broken configuration.

packer validate -var-file=values.pkrvars.hcl .

If validation succeeds, you will see a short confirmation like The configuration is valid. After that, start the build. Expect this process to create a temporary VM, run your provisioners, and create an image:

packer build -var-file=values.pkrvars.hcl .

The build usually takes 15–20 minutes, depending on network speed and package installation. Watch the Packer log for three main checkpoints:

  • Instance creation – Confirms that the temporary VM was provisioned.

  • Provisioner output – Shows each step of the script (updates, reboot, script/base.sh) and any errors.

  • Image creation – Confirms the build completed and the image artifact was written.

If the build fails, copy the failing provisioner’s log lines, fix the script or variables, and run the build again. For quick troubleshooting, rerun the failed provisioner on a matching local test VM to reproduce the issue faster.

googlecompute.gpu-node: output will be in this color.

==> googlecompute.gpu-node: Checking image does not exist...
==> googlecompute.gpu-node: Creating temporary RSA SSH key for instance...
==> googlecompute.gpu-node: no persistent disk to create
==> googlecompute.gpu-node: Using image: ubuntu-2404-noble-amd64-v20260225
==> googlecompute.gpu-node: Creating instance...
==> googlecompute.gpu-node: Loading zone: us-central1-a
==> googlecompute.gpu-node: Loading machine type: g2-standard-4
==> googlecompute.gpu-node: Requesting instance creation...
==> googlecompute.gpu-node: Waiting for creation operation to complete...
==> googlecompute.gpu-node: Instance has been created!
==> googlecompute.gpu-node: Waiting for the instance to become running...
==> googlecompute.gpu-node: IP: 34.58.58.214
==> googlecompute.gpu-node: Using SSH communicator to connect: 34.58.58.214
==> googlecompute.gpu-node: Waiting for SSH to become available...
systemd-logind.service
==> googlecompute.gpu-node:  systemctl restart unattended-upgrades.service
==> googlecompute.gpu-node:
==> googlecompute.gpu-node: No containers need to be restarted.
==> googlecompute.gpu-node:
==> googlecompute.gpu-node: User sessions running outdated binaries:
==> googlecompute.gpu-node:  packer @ session #1: sshd(1535)
==> googlecompute.gpu-node:  packer @ user manager service: systemd(1540)
==> googlecompute.gpu-node: Pausing 1m0s before the next provisioner...
==> googlecompute.gpu-node: Provisioning with shell script: script/base.sh
==> googlecompute.gpu-node: (BASE) DRIVER_VERSION : 590.48.01
==> googlecompute.gpu-node: (BASE) CUDA_VERSION   : 13.1
==> googlecompute.gpu-node: (BASE) Updating system packages...
==> googlecompute.gpu-node: (BASE) Installing kernel headers and build tools...
==> googlecompute.gpu-node: (BASE) Installing CUDA Toolkit 13.1...
==> googlecompute.gpu-node: (BASE) Installing DCGM...
==> googlecompute.gpu-node: (BASE) Enabling nvidia-persistenced...
==> googlecompute.gpu-node: (BASE) Applying system tuning...
==> googlecompute.gpu-node: vm.swappiness=0
==> googlecompute.gpu-node: vm.nr_hugepages=2048
==> googlecompute.gpu-node: Setting cpu: 0
==> googlecompute.gpu-node: Error setting new values. Common errors:
==> googlecompute.gpu-node: (BASE) ============================================
==> googlecompute.gpu-node: (BASE) Base layer provisioning complete.
==> googlecompute.gpu-node: (BASE)   OS      : ubuntu2404
==> googlecompute.gpu-node: (BASE)   Driver  : 590.48.01 (open kernel modules, compute-only)
==> googlecompute.gpu-node: (BASE)   CUDA    : cuda-toolkit-13.1
==> googlecompute.gpu-node: (BASE)   DCGM    : 1:3.3.9
==> googlecompute.gpu-node: (BASE) ============================================
==> googlecompute.gpu-node: Deleting instance...
==> googlecompute.gpu-node: Instance has been deleted!
==> googlecompute.gpu-node: Creating image...
==> googlecompute.gpu-node: Deleting disk...
==> googlecompute.gpu-node: Disk has been deleted!
==> googlecompute.gpu-node: Running post-processor:  (type shell-local)
==> googlecompute.gpu-node (shell-local): Running local shell script: 
==> googlecompute.gpu-node (shell-local): === Image Build Complete ===
==> googlecompute.gpu-node (shell-local): Image ID: packer-69b6c2ee-883a-3602-7bb5-059f1ba27c8b
==> googlecompute.gpu-node (shell-local): Sun Mar 15 15:50:09 WAT 2026
Build 'googlecompute.gpu-node' finished after 17 minutes 55 seconds.

==> Wait completed after 17 minutes 55 seconds

==> Builds finished. The artifacts of successful builds are:
--> googlecompute.gpu-node: A disk image was created in the 'my_project-00000' project: base-gpu-image-1773585134

Step 8: Test the image and verify the GPU stack.

Verify that the image exists in the GCP console under Compute Engine → Storage → Images, and locate your newly created OS image.

Your image information on GCP

Create a test VM from the image:

gcloud compute instances create my-gpu-vm \
  --machine-type=g2-standard-4 \
  --accelerator=count=1,type=nvidia-l4 \
  --image=base-gpu-image-1772718104 \
  --image-project=YOUR_PROJECT_ID \
  --boot-disk-size=50GB \
  --maintenance-policy=TERMINATE \
  --restart-on-failure \
  --zone=us-central1-a

Created (
NAME       ZONE           MACHINE_TYPE   PREEMPTIBLE  INTERNAL_IP    EXTERNAL_IP      STATUS
my-gpu-vm  us-central1-a  g2-standard-4               10.128.15.227  104.154.184.217  RUNNING

Once the instance is RUNNING, SSH in and verify that the NVIDIA driver and GPU are visible:

Output from Nvidia-SMI command showing driver and CUDA version.

Image confirming persistence mode is enabled.

The nvidia-smi output confirms:

This is exactly what a healthy GPU node should look like. Note Disp.A: Off — this confirms that our compute-only driver selection is working: no display adapters are enabled.

Verify the installed CUDA toolkit by running nvcc --version. You can see that version 13.1 was installed, as specified.

Output from the nvcc --version command

Let’s verify the DCGM installation by running dcgmi discovery -l. Successful output indicates that DCGM is running and communicating with the driver.

Output from the dcgmi discovery -l command showing device information.

Conclusion

You now have a production-grade, GPU-optimized base image that includes the NVIDIA compute-only driver built with open kernel modules, DCGM for monitoring, and the CUDA Toolkit. It also applies OS-level tuning suited to GPU compute workloads, giving you a consistent, reproducible environment without manual setup.

From here, you can extend the build by adding application layer scripts to install frameworks like PyTorch, TensorFlow, or vLLM, or create an instance template that uses this image to scale your GPU infrastructure.

The complete Packer project includes additional scripts for training and inference workloads that you can use to extend your image.
