
A cold CI run that takes 12 minutes drops to 3 when caches are warm. That's not a hypothetical. It's what happens when you stop reinstalling 800 MB of node_modules from scratch on every push. Caching is the single highest-leverage optimization available in GitHub Actions, yet most teams either skip it entirely or misconfigure their cache keys and wonder why they never hit.
This guide covers every caching layer you can use in GitHub Actions: the built-in cache action for npm/pnpm/yarn, Docker layer caching via BuildKit, and Turborepo's remote cache for monorepos. You'll get copy-pasteable workflow snippets and the reasoning behind each cache key pattern.
The actions/cache@v4 action stores and retrieves files from GitHub's cache service using a key-based lookup. You specify a key, a path, and optionally a list of restore-keys. The lookup order is straightforward:
1. An exact match on key.
2. A prefix match on key.
3. Each entry in restore-keys, from top to bottom, matched as a prefix.
Only an exact match on key counts as a cache hit. Anything else is reported as a miss, even when a restore-key restored files. On a miss, the action creates a new cache entry at the end of the job using the primary key. You can't update an existing entry; you can only create new ones.
A well-structured cache key has three parts: a static prefix that identifies what's being cached, a context segment that scopes it (OS, Node version), and a hash that changes when the underlying content changes.
    key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
Here runner.os prevents cross-OS cache pollution. The hashFiles function produces a SHA-256 of your lockfile. Any dependency change produces a new hash, which triggers a fresh cache. Keys have a maximum length of 512 characters.
restore-keys give you a safety net. If the exact lockfile hash doesn't match, you can still restore a partially stale cache that's close enough. The trick is ordering them from most specific to least specific:
    restore-keys: |
      ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
      ${{ runner.os }}-node-
      ${{ runner.os }}-
The most recent cache whose key starts with the restore-key prefix wins. So if you added one package, you'll still restore 99% of your previous cache and only download the delta with npm install.
Caches are scoped to branches. A workflow on branch feature-b can read caches from its own branch and from the default branch (main), but not from sibling branches. PR workflows can also access caches from their base branch. This isolation prevents stale data from unrelated branches from polluting your builds, but it means your first CI run on a new feature branch will typically fall back to the main branch cache.
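Because new branches fall back to the default branch's caches, it can help to make sure main always has a recent entry. The following is a minimal sketch, assuming an npm project with a package-lock.json in the repository; the workflow name and Node version are placeholders rather than anything prescribed by GitHub:

```yaml
# Hypothetical workflow: runs on every push to main so that new branches
# have a default-branch cache to fall back to.
name: Warm dependency cache
on:
  push:
    branches: [main]
jobs:
  warm:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v5
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: 'npm'   # saves the npm download cache under a lockfile-based key
      - run: npm ci
```

If your regular CI workflow already runs on pushes to main, it fills this role and a separate workflow is unnecessary.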
For Node.js dependencies you have two approaches: use the built-in caching in actions/setup-node, or configure actions/cache manually. The first is simpler. The second gives you more control.
actions/setup-node handles cache keys, paths, and restore-keys automatically when you pass the cache input. It caches the package manager's global cache directory, not node_modules.
    steps:
      - uses: actions/checkout@v5
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: 'npm'
      - run: npm ci
Swap 'npm' for 'pnpm' or 'yarn' for those package managers. The action detects the appropriate lockfile automatically. For pnpm, the pnpm binary must already be installed on the runner before setup-node runs.
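For pnpm, the built-in cache needs pnpm on the PATH so that setup-node can locate the store. A rough sketch of that ordering, assuming the pnpm/action-setup action is used to install pnpm; the pinned version number is an illustrative assumption, not a recommendation:

```yaml
steps:
  - uses: actions/checkout@v5
  # Install pnpm first; setup-node asks pnpm for its store path.
  - uses: pnpm/action-setup@v4
    with:
      version: 10   # assumed version; can be omitted if package.json has a packageManager field
  - uses: actions/setup-node@v4
    with:
      node-version: 20
      cache: 'pnpm'
  - run: pnpm install --frozen-lockfile
```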
When you need granular control, use the cache action directly. The key decision is what to cache: the global package cache (~/.npm) or node_modules directly.
Caching the global cache directory (~/.npm for npm, or the pnpm store for pnpm, whose location depends on the pnpm version and OS) is safer. You still run npm ci after restoring the cache, but the install pulls packages from the local cache instead of the registry. It's slower than caching node_modules but avoids native module ABI mismatches and postinstall script issues.
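As a point of comparison with the node_modules variant, caching the global npm directory by hand might look like the sketch below. It assumes npm's default cache location on Linux and macOS runners (~/.npm); the step names are arbitrary:

```yaml
- name: Cache npm download cache
  uses: actions/cache@v4
  with:
    path: ~/.npm
    key: ${{ runner.os }}-npm-${{ hashFiles('**/package-lock.json') }}
    restore-keys: |
      ${{ runner.os }}-npm-
# No cache-hit condition here: npm ci always runs and reads tarballs
# from ~/.npm where it can instead of downloading them again.
- name: Install dependencies
  run: npm ci
```

Because the install always runs, a stale or partial restore is harmless: npm fetches whatever is missing.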
Caching node_modules directly is faster on hit because you skip the install step entirely. But if your lockfile hash is the same while the runner OS or Node version changed, you can restore incompatible native binaries. If you go this route, include the Node version in your cache key.
    - name: Cache node_modules
      uses: actions/cache@v4
      id: cache-deps
      with:
        path: node_modules
        key: ${{ runner.os }}-node-${{ matrix.node-version }}-${{ hashFiles('**/package-lock.json') }}
        restore-keys: |
          ${{ runner.os }}-node-${{ matrix.node-version }}-
    - name: Install dependencies
      if: steps.cache-deps.outputs.cache-hit != 'true'
      run: npm ci
Notice the conditional: npm ci only runs when there is no exact key match. On an exact hit, node_modules is already in place. A partial restore through restore-keys does not set cache-hit to 'true', so npm ci still runs and rebuilds node_modules from the lockfile.
pnpm uses a content-addressable store, so caching the store directory is the idiomatic approach. You need to resolve the store path first:
    - name: Get pnpm store directory
      shell: bash
      run: echo "STORE_PATH=$(pnpm store path --silent)" >> "$GITHUB_ENV"
    - name: Cache pnpm store
      uses: actions/cache@v4
      with:
        path: ${{ env.STORE_PATH }}
        key: ${{ runner.os }}-pnpm-store-${{ hashFiles('**/pnpm-lock.yaml') }}
        restore-keys: |
          ${{ runner.os }}-pnpm-store-
    - run: pnpm install --frozen-lockfile
pnpm still runs the install after cache restore because it needs to link packages from the store into node_modules. The linking is fast; the expensive part (downloading tarballs) is avoided by the cached store.
Docker image builds in CI are a different beast. Each GitHub Actions runner starts with a clean Docker daemon, so every layer rebuilds from scratch unless you explicitly cache them. BuildKit, the modern Docker build engine, supports several cache backends that work with GitHub Actions.
The simplest option is the GitHub Actions cache backend (type=gha). BuildKit talks directly to GitHub's cache service, the same infrastructure that actions/cache uses, so it shares the same 10 GB default storage limit per repository.
    - name: Set up Docker Buildx
      uses: docker/setup-buildx-action@v4
    - name: Build and push
      uses: docker/build-push-action@v7
      with:
        push: true
        tags: ghcr.io/myorg/myapp:latest
        cache-from: type=gha
        cache-to: type=gha,mode=max
The mode=max flag tells BuildKit to cache all layers, including intermediate ones from multi-stage builds. Without it, only the final stage's layers are cached, which means cache misses when you change early build stages.
One gotcha: the GHA cache backend requires Docker Buildx >= v0.21.0 and BuildKit >= v0.20.0 since GitHub migrated to Cache service API v2 in April 2025. If you're on self-hosted runners with older tooling, use docker/setup-buildx-action@v4 to install a compatible version.
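On a self-hosted runner, one way to meet those minimums is to pin both versions explicitly. This is a sketch under the assumption that the action's version and driver-opts inputs are available in the release you use; the exact version tags below are simply the minimums quoted above, not tested recommendations:

```yaml
- name: Set up Docker Buildx
  uses: docker/setup-buildx-action@v4
  with:
    # Buildx CLI version to install (minimum for Cache service API v2)
    version: v0.21.0
    # BuildKit image used by the docker-container driver
    driver-opts: image=moby/buildkit:v0.20.0
```

Pinning trades automatic upgrades for reproducibility; leaving both inputs unset installs whatever the action defaults to.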
If you're hitting the 10 GB cap or want to share Docker layer caches across repositories, push cache data to a container registry instead:
    - name: Build and push
      uses: docker/build-push-action@v7
      with:
        push: true
        tags: ghcr.io/myorg/myapp:latest
        cache-from: type=registry,ref=ghcr.io/myorg/myapp:buildcache
        cache-to: type=registry,ref=ghcr.io/myorg/myapp:buildcache,mode=max
The cache lives as a tagged image in your registry (the :buildcache tag). This approach doesn't count against GitHub's cache quota, but it does count against your registry storage. It's the right call when your Docker images are large or when you need cache sharing across repos.
If you're running a monorepo with Turborepo, the cache action and Docker layer caching only cover part of the picture. Turborepo has its own caching layer that works at the task level: it hashes your source files, config, dependencies, and environment variables into a fingerprint. If the fingerprint matches, Turbo skips the entire task and restores the output from cache.
Locally, Turborepo stores this cache in .turbo/cache. But in CI, every runner starts clean, so you need Remote Caching to share results across runs and across developers.
Vercel provides a managed Remote Cache that works out of the box if you're on their platform. For CI, you just need two environment variables:
    env:
      TURBO_TOKEN: ${{ secrets.TURBO_TOKEN }}
      TURBO_TEAM: ${{ vars.TURBO_TEAM }}
    steps:
      - uses: actions/checkout@v5
      # pnpm must be installed before setup-node can cache its store;
      # this reads the version from packageManager in package.json
      - uses: pnpm/action-setup@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: 'pnpm'
      - run: pnpm install --frozen-lockfile
      - run: pnpm turbo run build lint test
TURBO_TOKEN is the bearer token for authentication. TURBO_TEAM is your Vercel team slug. With those set, every turbo run command automatically reads from and writes to the remote cache.
If you don't want to depend on Vercel, Turborepo's Remote Cache API is documented and open. Several open-source servers implement it, including ducktors/turborepo-remote-cache (backed by S3, GCS, or local storage). You point Turbo at your server by setting TURBO_API in addition to the token:
    env:
      TURBO_TOKEN: ${{ secrets.TURBO_TOKEN }}
      TURBO_TEAM: my-org
      TURBO_API: https://cache.internal.myorg.com
Self-hosting gives you full control over retention, storage costs, and geographic placement. The tradeoff is you're running and maintaining another service.
Understanding the hash inputs helps you debug cache misses. Turborepo's fingerprint includes:
- Source files in the package (or the files matched by the task's inputs key)
- Environment variables declared in env and globalEnv
- Files listed in globalDependencies
If you're getting unexpected misses, use turbo run build --dry to inspect the task inputs without running anything, or turbo run build --summarize to generate a full run summary you can diff between two runs.
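To compare task inputs between two CI runs, the dry-run output can be saved as a build artifact. A small sketch, assuming turbo is installed as a dev dependency and runnable through npx; the file and artifact names are made up for illustration:

```yaml
- name: Capture Turborepo task inputs
  # --dry=json prints the planned tasks and their hashes without running them
  run: npx turbo run build --dry=json > turbo-dry-run.json
- name: Upload dry-run report
  uses: actions/upload-artifact@v4
  with:
    name: turbo-dry-run
    path: turbo-dry-run.json
```

Downloading the artifact from a hitting run and a missing run and diffing the two files usually points at the input that changed.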
Caching is only useful if you can trust the cache. Here's when it goes wrong, and what to do about it.
If you cache node_modules and your lockfile doesn't change but the runner image updates its system libraries or Node.js version, native modules like sharp or bcrypt can break. The fix: include the Node version in your cache key, or cache the global package cache instead of node_modules.
Sometimes you just need to nuke the cache. The cache action can't overwrite or delete an existing entry, but you can add a version segment to your key:
    key: ${{ runner.os }}-node-v2-${{ hashFiles('**/package-lock.json') }}
Bump v2 to v3 when you need a fresh start. You can also delete caches through the GitHub UI or the CLI: gh cache delete --all removes everything, and gh cache delete with a cache ID or key removes a single entry.
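If bumping the version by commit is inconvenient, the segment can come from a repository variable instead, so a reset needs no code change. This is one possible sketch; CACHE_VERSION is a hypothetical variable name you would define yourself under the repository's Actions variables:

```yaml
- uses: actions/cache@v4
  with:
    path: ~/.npm
    # CACHE_VERSION is a hypothetical repository variable, e.g. "v2".
    # If it is unset, the expression evaluates to an empty string.
    key: ${{ runner.os }}-node-${{ vars.CACHE_VERSION }}-${{ hashFiles('**/package-lock.json') }}
```

Changing the variable's value makes every subsequent run miss the old keys and start a fresh cache.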
If your build output depends on an environment variable (like API_URL for different environments) but you don't list it in env or globalEnv in turbo.json, Turbo won't include it in the hash. You'll get a cache hit that deploys a staging build to production. Always declare environment variables that affect your output.
Caching isn't set-and-forget. You need to track whether it's actually working.
The actions/cache action exposes a cache-hit output. It is the string 'true' on an exact key match, 'false' when a restore-key matched instead, and empty when nothing was restored. You can log it in a subsequent step, or send it to your observability stack. Over time, you want to see high hit rates on your default branch (where the lockfile is stable) and lower but non-zero rates on feature branches.
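A lightweight way to record the result is to write it to the job summary. The sketch below assumes a cache step with the id npm-cache; the id, step names, and summary wording are illustrative choices:

```yaml
- name: Cache npm download cache
  id: npm-cache
  uses: actions/cache@v4
  with:
    path: ~/.npm
    key: ${{ runner.os }}-npm-${{ hashFiles('**/package-lock.json') }}
- name: Report cache result
  env:
    # Pass the output through an environment variable rather than
    # interpolating it directly into the shell command.
    CACHE_HIT: ${{ steps.npm-cache.outputs.cache-hit }}
  run: echo "npm cache exact hit: ${CACHE_HIT:-none}" >> "$GITHUB_STEP_SUMMARY"
```

The same environment variable could be forwarded to a metrics endpoint instead of the summary if you track hit rates over time.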
GitHub's default cache limit is 10 GB per repository. Enterprise and organization admins can increase this, but anything beyond the default incurs additional costs. Entries not accessed within 7 days are evicted automatically. When you exceed the limit, GitHub evicts the oldest entries first.
This creates a real problem for repos with many branches. If each branch creates its own cache entries, you can blow through 10 GB quickly. A monorepo with 5 packages, 3 active branches, and Docker builds can easily consume the full quota. Watch for cache thrashing: if your hit rate drops suddenly, check your total cache usage via the GitHub UI or the REST API.
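Total usage can be checked from a workflow as well as from the UI. A minimal sketch, assuming a GitHub-hosted runner (where the gh CLI is preinstalled) and the repository-level cache usage endpoint of the REST API; the job name is arbitrary:

```yaml
jobs:
  cache-usage:
    runs-on: ubuntu-latest
    permissions:
      actions: read   # needed to read cache usage with the workflow token
    steps:
      - name: Report Actions cache usage
        env:
          GH_TOKEN: ${{ github.token }}
        # Prints JSON that includes the active cache count and size in bytes
        run: gh api "repos/$GITHUB_REPOSITORY/actions/cache/usage"
```

Running this on a schedule and watching the size approach the quota gives early warning before eviction starts to hurt hit rates.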
Rate limits also apply: 200 uploads and 1,500 downloads per minute per repository. In practice, only very large matrix builds hit these.
The most convincing metric is actual time saved. Compare your average CI duration for cache hits vs. cache misses. For npm installs, the difference is typically 1-4 minutes depending on dependency count. For Docker builds with many layers, it can be 5-10 minutes. For Turborepo, the "FULL TURBO" case (all tasks cached) can turn a 15-minute pipeline into under a minute.
Here's a complete workflow for a Turborepo monorepo that caches npm dependencies, uses Turborepo remote cache, and builds a Docker image with layer caching:
    name: CI
    on:
      push:
        branches: [main]
      pull_request:
    jobs:
      build-and-test:
        runs-on: ubuntu-latest
        env:
          TURBO_TOKEN: ${{ secrets.TURBO_TOKEN }}
          TURBO_TEAM: ${{ vars.TURBO_TEAM }}
        steps:
          - uses: actions/checkout@v5
          - uses: actions/setup-node@v4
            with:
              node-version: 20
              cache: 'npm'
          - run: npm ci
          - run: npx turbo run build lint test
      docker:
        runs-on: ubuntu-latest
        needs: build-and-test
        if: github.ref == 'refs/heads/main'
        # GITHUB_TOKEN needs packages: write to push to ghcr.io
        permissions:
          contents: read
          packages: write
        steps:
          - uses: actions/checkout@v5
          - uses: docker/setup-buildx-action@v4
          - uses: docker/login-action@v4
            with:
              registry: ghcr.io
              username: ${{ github.actor }}
              password: ${{ secrets.GITHUB_TOKEN }}
          - uses: docker/build-push-action@v7
            with:
              push: true
              tags: ghcr.io/${{ github.repository }}:${{ github.sha }}
              cache-from: type=gha
              cache-to: type=gha,mode=max
Three caching layers in one workflow: setup-node handles npm, Turborepo's remote cache handles task-level caching across packages, and BuildKit's GHA backend handles Docker layers. Each layer addresses a different stage of the pipeline, and they don't interfere with each other.
Start with the simplest caching setup that covers your dependencies and build artifacts. Measure your hit rates and CI times for a week before adding complexity. Most teams get 60-80% of the possible speedup from just caching npm correctly, and adding Docker and Turborepo layers closes the remaining gap.