Software Supply Chain Security Handbook
V1

Every dependency is a trust decision.

From SBOMs to Sigstore, from typosquatting to AI-hallucinated packages — a complete practitioner's guide to securing the modern software supply chain.

SBOM · SPDX · CycloneDX | Sigstore · Cosign · Rekor | SLSA Provenance | Dependency Confusion | Typosquatting · Protestware | AI Package Hallucination | VEX · OSV · CSAF
🗺

Threat Landscape

// Why supply chain security is the defining challenge of this decade

Software supply chain attacks target the trusted dependencies, tools, and build processes used to create software — not the software itself. Attackers compromise one package or build system to reach thousands of downstream consumers. The 2020 SolarWinds breach, the 2021 Log4Shell vulnerability, and the 2022 node-ipc protestware incident all share the same root: implicit trust in code we didn't write.

🚨
Scale of impact: The average enterprise application depends on hundreds of direct packages and thousands of transitive dependencies. A single compromised package at the top of the npm or PyPI popularity charts can reach millions of applications within hours. In the AI era, large language models are actively inventing package names that attackers then publish.

The Three Layers of Supply Chain Risk

Source Code

Malicious contributions to open-source repos, compromised maintainer accounts, insider threats, and typosquatted package names that closely mimic trusted libraries.

Build & Packaging

Tampered build pipelines, unsigned release artifacts, compromised registries (npm, PyPI, Maven Central), and CI/CD system compromises that inject malicious code at packaging time.

Distribution & Runtime

Mirror poisoning, CDN tampering, dependency confusion attacks exploiting internal vs. public registry resolution, and protestware that activates based on runtime conditions.

Notable Incidents

Incident | Year | Vector | Scope
SolarWinds Orion | 2020 | Compromised build server; backdoor injected into signed release | 18,000+ organizations
Log4Shell (Log4j) | 2021 | RCE via JNDI injection in ubiquitous logging library | Billions of JVM apps
ua-parser-js hijack | 2021 | npm account takeover; crypto miner + credential stealer injected | 8M weekly downloads
node-ipc protestware | 2022 | Maintainer intentionally added file-wiping payload for RU/BY IPs | 1M+ weekly downloads
PyTorch compromise | 2022 | Dependency confusion: a malicious torchtriton on PyPI shadowed the same-named package on the PyTorch nightly index | Nightly build users
XZ Utils backdoor | 2024 | Long-term social engineering of maintainer; backdoor in liblzma | Linux distros globally
AI package hallucination | 2023– | LLMs invent nonexistent package names; attackers publish them | All AI-assisted devs
⚔️

Attack Vectors

// The techniques attackers use to compromise your dependencies
Critical
Dependency Confusion
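Attacker publishes a public package with the same name as your private internal package. Resolvers that consult both the public and the private registry typically pick the highest version they find, so an inflated public version number wins and the malicious package is installed.
Victim uses an unscoped internal package named mycompany-auth at version 1.2.0. Attacker publishes mycompany-auth 99.0.0 to the public npm registry.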
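The resolution behavior that makes dependency confusion possible can be sketched in a few lines. This is a simplified model, not the algorithm of any real package manager: the index names, the `mycompany-` prefix, and the registry contents are all hypothetical, and versions are assumed to be plain dotted numbers.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Convert a dotted numeric version like '1.3.0' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def naive_resolve(name: str, indexes: dict[str, dict[str, list[str]]]) -> tuple[str, str]:
    """Pick the highest version of `name` across ALL indexes (the vulnerable behavior)."""
    candidates = [
        (parse_version(version), index_name, version)
        for index_name, packages in indexes.items()
        for version in packages.get(name, [])
    ]
    if not candidates:
        raise LookupError(f"{name} not found in any index")
    _, index_name, version = max(candidates)
    return index_name, version


def scoped_resolve(
    name: str,
    indexes: dict[str, dict[str, list[str]]],
    private_prefix: str = "mycompany-",   # hypothetical internal naming convention
    private_index: str = "internal",      # hypothetical index name
) -> tuple[str, str]:
    """Resolve names carrying the private prefix from the private index only."""
    if name.startswith(private_prefix):
        return naive_resolve(name, {private_index: indexes[private_index]})
    return naive_resolve(name, indexes)


# Hypothetical registry contents: the attacker has published 99.0.0 publicly.
INDEXES = {
    "internal": {"mycompany-auth": ["1.2.0", "1.3.0"]},
    "public": {"mycompany-auth": ["99.0.0"], "requests": ["2.31.0"]},
}

print(naive_resolve("mycompany-auth", INDEXES))   # ('public', '99.0.0')
print(scoped_resolve("mycompany-auth", INDEXES))  # ('internal', '1.3.0')
```

The only difference between the two functions is which indexes are consulted for internal names, which is the property scoped packages and registry mapping are meant to enforce.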
Critical
Typosquatting
Publishing packages with names nearly identical to popular libraries — one character off, swapped letters, or plausible misspellings that developers mistype.
colourama (vs colorama), lodash_ (vs lodash), requets (vs requests)
High
Account Takeover
Compromising a maintainer's registry credentials via phishing, credential stuffing, or leaked tokens — then publishing a malicious version of a trusted package.
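ua-parser-js, coa, and rc were hijacked this way on npm in 2021. event-stream (2018) was a related case: the original maintainer handed publishing rights to an attacker posing as a volunteer.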
High
Protestware / Sabotage
A legitimate maintainer intentionally inserts malicious code — often politically motivated, triggered conditionally by locale, IP range, or other runtime factors.
node-ipc (2022): file wiper activated for Russian and Belarusian IP addresses.
High
Build Pipeline Compromise
Injecting malicious code into the CI/CD pipeline after source code review — in build scripts, post-install hooks, or through compromised build tooling.
SolarWinds: malicious code injected during the build process, not present in source repo.
Medium
Malicious Install Scripts
Packages that use preinstall or postinstall hooks to execute arbitrary shell commands the moment the package is installed, before any code review.
npm postinstall scripts execute automatically unless --ignore-scripts is used.
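Install hooks are declared in the `scripts` section of a package's `package.json`, so they can be surfaced before installation. A minimal sketch, assuming the manifest text has already been fetched; the function name and sample manifest are hypothetical:

```python
import json

# npm lifecycle scripts that run automatically during `npm install`.
INSTALL_HOOKS = ("preinstall", "install", "postinstall")


def find_install_hooks(package_json_text: str) -> dict[str, str]:
    """Return the install-time lifecycle scripts declared in a package.json document."""
    manifest = json.loads(package_json_text)
    scripts = manifest.get("scripts") or {}
    return {hook: scripts[hook] for hook in INSTALL_HOOKS if hook in scripts}


# Hypothetical manifest for illustration.
sample = json.dumps({
    "name": "some-package",
    "version": "1.0.0",
    "scripts": {"postinstall": "node setup.js", "test": "jest"},
})
print(find_install_hooks(sample))  # {'postinstall': 'node setup.js'}
```

An empty result means no install-time hook is declared; it says nothing about what the package does when imported.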
⚠️
The long-game attack: The XZ Utils backdoor (2024) involved a fake persona spending two years building trust with the maintainer, then slowly introducing the backdoor across multiple commits. Automated scanning cannot reliably catch this. Provenance verification, reproducible builds, and SLSA frameworks reduce (but cannot eliminate) this risk.
📋

What Is an SBOM?

// Software Bill of Materials — the ingredient list for your software

A Software Bill of Materials (SBOM) is a formal, machine-readable inventory of every component in a software artifact — direct dependencies, transitive dependencies, their versions, licenses, known vulnerabilities, and their relationships. Think of it as the nutrition label for software.

What an SBOM Contains
  • Component identity — name, version, package URL (PURL), CPEs
  • Supplier info — author, maintainer, originating organization
  • Relationships — what depends on what (dependency graph)
  • Licenses — SPDX license expressions per component
  • Checksums — SHA-256 / SHA-512 hashes of each component
  • VEX data — Vulnerability Exploitability eXchange status
  • SBOM metadata — timestamp, creator, spec version, document namespace
Why SBOMs Matter
  • Vulnerability response — instantly identify if Log4j-equivalent is in any product
  • License compliance — catch GPL in a proprietary product before shipping
  • Procurement security — require SBOMs from vendors before purchase
  • Incident triage — know in minutes, not days, what's affected
  • Regulatory compliance — US EO 14028, EU Cyber Resilience Act mandate SBOMs
  • Software composition visibility — end "I didn't know we used that"
ℹ️
Executive Order 14028 (US, 2021) brought SBOMs into federal procurement: it directed NTIA to publish minimum SBOM elements and called for guidance under which vendors provide an SBOM to federal purchasers. The EU Cyber Resilience Act (CRA, 2024) extends similar requirements to products with digital elements sold in the EU. SBOMs are rapidly becoming a contractual and legal requirement, not just a best practice.
📄

SPDX vs CycloneDX

// The two dominant SBOM formats — choose based on use case
SPDX ISO/IEC 5962:2021

The Linux Foundation standard, ISO-ratified. Originally designed for license compliance, expanded to full SBOM. Widely accepted for regulatory compliance.

Formats: SPDX-TV (tag-value), JSON, RDF, YAML, XML

Strengths: License expression accuracy, ISO standard status, mature tooling, NTIA minimum element coverage, legal defensibility.

Best for: Legal/compliance teams, government procurement, license auditing, cross-industry interoperability.

Version: SPDX 2.3 / 3.0 · spdx.dev

CycloneDX OWASP Standard

The OWASP standard, security-first design. Purpose-built for vulnerability management, VEX, and DevSecOps integration. Rich security metadata.

Formats: JSON (preferred), XML, Protobuf

Strengths: Security metadata depth, VEX support, services/hardware/ML model BOMs, faster tooling adoption, CSAF integration.

Best for: Security teams, DevSecOps pipelines, vulnerability management, software-defined systems, AI/ML artifact tracking.

Version: CycloneDX 1.5 / 1.6 · cyclonedx.org

NTIA Minimum SBOM Elements

The US National Telecommunications and Information Administration defines seven minimum elements any SBOM must contain regardless of format:

Element | Description | Example
Supplier Name | Entity that creates, defines, and identifies the component | The Apache Software Foundation
Component Name | Designation assigned to the unit of software | log4j-core
Component Version | Identifier used by the supplier to specify a change | 2.17.1
Other Unique IDs | Other identifiers such as PURL, CPE, SWID | pkg:maven/org.apache.logging.log4j/log4j-core@2.17.1
Dependency Relationships | Characterizes the relationship between components | app → log4j-core@2.17.1
Author of SBOM Data | Entity creating the SBOM data for this component | syft@1.0.0
Timestamp | Date/time the SBOM data was assembled | 2024-03-15T14:22:00Z
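The seven elements can be checked mechanically once an SBOM is parsed. The sketch below maps them onto CycloneDX JSON fields (`supplier.name`, `purl`/`cpe`, `bom-ref`, `dependencies[].ref`, `metadata.timestamp`, `metadata.tools`/`metadata.authors`); that mapping is an assumption made for illustration, not an official NTIA conformance test.

```python
def missing_ntia_elements(bom: dict) -> dict[str, list[str]]:
    """Map '<document>' or a component name to the minimum elements it lacks."""
    problems: dict[str, list[str]] = {}

    metadata = bom.get("metadata", {})
    doc_missing = []
    if not (metadata.get("authors") or metadata.get("tools")):
        doc_missing.append("Author of SBOM Data")
    if not metadata.get("timestamp"):
        doc_missing.append("Timestamp")
    if doc_missing:
        problems["<document>"] = doc_missing

    # A component has a recorded relationship if the dependency graph mentions its bom-ref.
    refs_in_graph = {entry.get("ref") for entry in bom.get("dependencies", [])}

    for comp in bom.get("components", []):
        missing = []
        if not (comp.get("supplier") or {}).get("name"):
            missing.append("Supplier Name")
        if not comp.get("name"):
            missing.append("Component Name")
        if not comp.get("version"):
            missing.append("Component Version")
        if not (comp.get("purl") or comp.get("cpe")):
            missing.append("Other Unique IDs")
        ref = comp.get("bom-ref")
        if ref is None or ref not in refs_in_graph:
            missing.append("Dependency Relationships")
        if missing:
            problems[comp.get("name") or "<unnamed>"] = missing
    return problems


complete = {
    "metadata": {"timestamp": "2024-03-15T14:22:00Z", "tools": [{"name": "syft"}]},
    "components": [{
        "bom-ref": "log4j-core",
        "name": "log4j-core",
        "version": "2.17.1",
        "supplier": {"name": "The Apache Software Foundation"},
        "purl": "pkg:maven/org.apache.logging.log4j/log4j-core@2.17.1",
    }],
    "dependencies": [{"ref": "log4j-core", "dependsOn": []}],
}
print(missing_ntia_elements(complete))                               # {}
print(missing_ntia_elements({"components": [{"name": "mystery-lib"}]}))
```

Running a check like this on vendor-supplied SBOMs gives a quick, objective acceptance criterion for procurement.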
⚙️

Generating SBOMs

// Syft, cdxgen, and language-native tools

Syft — Universal SBOM Generator

bash · Anchore Syft
    # Install Syft
    curl -sSfL https://raw.githubusercontent.com/anchore/syft/main/install.sh | sh -s -- -b /usr/local/bin

    # Generate SBOM from a Docker image (CycloneDX JSON)
    syft nginx:1.25 -o cyclonedx-json=sbom.cdx.json

    # Generate SBOM from a local directory / repo
    syft dir:./my-app -o spdx-json=sbom.spdx.json

    # Generate SBOM from a built OCI image and attest it with cosign.
    # Reference the image by digest. Plain `syft <source>` works across versions;
    # `syft packages` is deprecated in favor of `syft scan`.
    syft my-registry/my-image@sha256:abc123 -o cyclonedx-json \
      | cosign attest --predicate - --type cyclonedx \
          my-registry/my-image@sha256:abc123

    # Output both formats simultaneously
    syft my-image:latest \
      -o spdx-json=sbom.spdx.json \
      -o cyclonedx-json=sbom.cdx.json \
      -o table   # human-readable summary to stdout

cdxgen — CycloneDX-Native Generator

bash · cdxgen (OWASP)
    # Install cdxgen globally
    npm install -g @cyclonedx/cdxgen

    # Auto-detect language and generate SBOM (omit -t to let cdxgen detect the project type)
    cdxgen -o bom.json

    # Language-specific generation
    cdxgen -o bom.json -t python   # pip/poetry/conda
    cdxgen -o bom.json -t java     # Maven/Gradle
    cdxgen -o bom.json -t nodejs   # npm/yarn/pnpm
    cdxgen -o bom.json -t dotnet   # NuGet
    cdxgen -o bom.json -t golang   # go modules
    cdxgen -o bom.json -t rust     # cargo

    # Deep analysis including transitive deps
    cdxgen -o bom.json --deep

    # Generate for a container image
    cdxgen -o bom.json -t docker nginx:1.25

Language-Native Tools

bash · Ecosystem-native SBOM generation
    # Python: cyclonedx-py (installed by the cyclonedx-bom package)
    pip install cyclonedx-bom
    cyclonedx-py poetry -o sbom.json                          # from poetry.lock
    cyclonedx-py requirements requirements.txt -o sbom.json   # from requirements.txt

    # Node.js: @cyclonedx/cyclonedx-npm
    npx @cyclonedx/cyclonedx-npm --output-format json --output-file sbom.json

    # Java (Maven plugin)
    mvn org.cyclonedx:cyclonedx-maven-plugin:makeAggregateBom

    # Java (Gradle plugin), add to build.gradle:
    #   id("org.cyclonedx.bom") version "1.8.2"
    gradle cyclonedxBom

    # Go: cyclonedx-gomod (osv-scanner consumes SBOMs; it does not generate them)
    go install github.com/CycloneDX/cyclonedx-gomod/cmd/cyclonedx-gomod@latest
    cyclonedx-gomod mod -json -output sbom.json

    # .NET: Microsoft SBOM Tool
    dotnet tool install --global Microsoft.Sbom.DotNetTool
    sbom-tool generate -b . -bc . -pn MyApp -pv 1.0 -ps MyCompany -nsb https://mycompany.com

    # Rust (cargo-sbom)
    cargo install cargo-sbom
    cargo sbom --output-format cyclone_dx_json_1_4 > sbom.json
SBOM generation in CI: Generate SBOMs at build time — not retroactively — so they accurately reflect the exact artifact produced. Attach the SBOM to the release artifact and sign it. Syft's OCI attestation support makes this straightforward for containerized workloads: the SBOM is stored alongside the image digest in the registry, not as a separate file to manage.
🔍

Consuming & Auditing SBOMs

// Grype, OSV-Scanner, and VEX-driven vulnerability triage

Vulnerability Scanning Against SBOMs

bash · Grype (Anchore vulnerability scanner)
    # Install Grype
    curl -sSfL https://raw.githubusercontent.com/anchore/grype/main/install.sh | sh -s -- -b /usr/local/bin

    # Scan a container image directly
    grype nginx:1.25

    # Scan an existing SBOM
    grype sbom:./sbom.spdx.json
    grype sbom:./sbom.cdx.json

    # Fail CI when any vulnerability of severity high or above is found
    grype sbom:./sbom.cdx.json --fail-on high

    # Output structured JSON for pipeline consumption
    grype sbom:./sbom.cdx.json -o json > vuln-report.json

    # Use a VEX document to suppress known non-exploitable CVEs
    grype sbom:./sbom.cdx.json --vex ./vex.json

    # OSV-Scanner (Google): cross-ecosystem, fast
    osv-scanner --sbom=sbom.cdx.json
    osv-scanner --lockfile=package-lock.json   # also works on lockfiles
    osv-scanner -r ./my-project                # recursive directory scan

VEX — Vulnerability Exploitability eXchange

VEX documents let producers communicate whether a known CVE is actually exploitable in their product. This eliminates alert fatigue from CVEs that exist in a dependency but aren't reachable in your specific build or configuration.

json · CycloneDX VEX document example
    {
      "bomFormat": "CycloneDX",
      "specVersion": "1.6",
      "version": 1,
      "metadata": {
        "timestamp": "2024-09-15T10:00:00Z",
        "component": { "type": "application", "name": "my-api", "version": "3.2.1" }
      },
      "vulnerabilities": [
        {
          "id": "CVE-2021-44228",
          "ratings": [{ "severity": "critical" }],
          "affects": [{
            "ref": "pkg:maven/org.apache.logging.log4j/log4j-core@2.14.1",
            "versions": [{ "range": "vers:maven/>=2.0-beta9|<=2.14.1" }]
          }],
          "analysis": {
            "state": "not_affected",
            "justification": "protected_by_mitigating_control",
            "detail": "JNDI lookup disabled via LOG4J_FORMAT_MSG_NO_LOOKUPS=true; log4j-core not reachable from user-controlled input.",
            "response": ["workaround_available"]
          }
        }
      ]
    }
In this example CVE-2021-44228 is Log4Shell, and analysis.state carries the VEX status. JSON does not allow comments, so annotations are kept out of the document itself.

VEX Status Values

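Status | Meaning | Action for Consumer
not_affected | The vulnerability does not affect this product | Suppress from alerts; log justification
affected | The product is affected and action is needed | Treat as active vulnerability; patch or mitigate
fixed | A fix has been applied in this version | Verify fix version; update if needed
under_investigation | Status is being analyzed | Watch for status update; don't suppress
These are the status names used by CSAF and OpenVEX. CycloneDX expresses the same idea through analysis.state values such as not_affected, exploitable, resolved, and in_triage.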
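The consumer actions in the status table translate into a simple triage rule: only `not_affected` suppresses a finding, and every other status (or no statement at all) stays actionable. A minimal sketch, assuming scanner findings and VEX statements have already been normalized into flat dictionaries; the field names `cve`, `purl`, and `status` are hypothetical, not a real tool's schema.

```python
def triage(findings: list[dict], vex_statements: list[dict]) -> tuple[list[dict], list[dict]]:
    """Split findings into (actionable, suppressed) using VEX status per CVE and component."""
    status_by_key = {(s["cve"], s["purl"]): s["status"] for s in vex_statements}
    actionable, suppressed = [], []
    for finding in findings:
        status = status_by_key.get((finding["cve"], finding["purl"]), "unknown")
        record = {**finding, "vex_status": status}
        # Only an explicit not_affected statement suppresses an alert.
        if status == "not_affected":
            suppressed.append(record)
        else:
            actionable.append(record)
    return actionable, suppressed


LOG4J = "pkg:maven/org.apache.logging.log4j/log4j-core@2.14.1"
findings = [
    {"cve": "CVE-2021-44228", "purl": LOG4J},
    {"cve": "CVE-2021-45046", "purl": LOG4J},
]
vex = [
    {"cve": "CVE-2021-44228", "purl": LOG4J, "status": "not_affected"},
    {"cve": "CVE-2021-45046", "purl": LOG4J, "status": "under_investigation"},
]
actionable, suppressed = triage(findings, vex)
print([f["cve"] for f in actionable])   # ['CVE-2021-45046']
print([f["cve"] for f in suppressed])   # ['CVE-2021-44228']
```

Keeping the status on each record preserves the justification trail that the table asks consumers to log.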
📜

SBOM Policy & Mandates

// Regulatory requirements and vendor expectations
MUST — US Federal
US Executive Order 14028
Any software sold to US federal agencies must include an SBOM. CISA has published minimum element guidance aligned to NTIA standards. Required for FedRAMP authorization.
MUST — EU Market
EU Cyber Resilience Act (CRA)
Products sold in the EU with digital elements must maintain SBOMs, report actively exploited vulnerabilities, and support secure updates. Enforcement begins 2026–2027.
SHOULD — Healthcare
FDA Medical Device SBOMs
US FDA requires SBOMs for premarket submissions of medical devices with software. Must include all commercial, open-source, and off-the-shelf software components.
SHOULD — Enterprise Procurement
Vendor SBOM Requirements
Require SBOMs from all software vendors as a procurement condition. Specify CycloneDX 1.5+ or SPDX 2.3+ in contracts. Automate ingestion into your VDR (Vulnerability Disclosure Report) workflow.
⚠️
SBOM freshness: An SBOM is only useful if it reflects the current state of the artifact. Implement automated SBOM generation on every build, version it alongside release artifacts, and republish updated SBOMs when dependency updates occur — even if the application code hasn't changed.
🔑

Sigstore

// Keyless signing for the open-source era

Sigstore is an open standard and toolchain for signing, verifying, and protecting software artifacts without requiring teams to manage private key infrastructure. It uses ephemeral OIDC-based signing certificates and a transparency log (Rekor) to create a tamper-evident, publicly auditable record of all signing events.

Sigstore Components

Cosign

Signs and verifies container images, blobs, and OCI artifacts. Stores signatures in OCI registries alongside the artifact. The primary user-facing CLI tool.

Rekor

The immutable, append-only transparency log. Every signature event is recorded. Anyone can query Rekor to verify when and by whom an artifact was signed.

Fulcio

The OIDC-based certificate authority. Issues short-lived signing certificates (10-minute TTL) bound to an identity (GitHub Actions workflow, Google account, etc.) — no long-lived private keys.
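The tamper-evident property of an append-only log can be illustrated with a hash chain, where each entry commits to the one before it. This is a teaching sketch only: Rekor is built on a Merkle tree rather than a linear chain, and the class and field names below are hypothetical.

```python
import hashlib
import json


class HashChainLog:
    """Toy append-only log: each entry hash covers the previous entry hash."""

    GENESIS = "0" * 64

    def __init__(self) -> None:
        self.entries: list[dict] = []

    def append(self, record: dict) -> str:
        prev = self.entries[-1]["entry_hash"] if self.entries else self.GENESIS
        body = json.dumps(record, sort_keys=True)
        entry_hash = hashlib.sha256((prev + body).encode()).hexdigest()
        self.entries.append({"prev": prev, "body": body, "entry_hash": entry_hash})
        return entry_hash

    def verify(self) -> bool:
        """Recompute every hash; any edited or reordered entry breaks the chain."""
        prev = self.GENESIS
        for entry in self.entries:
            expected = hashlib.sha256((prev + entry["body"]).encode()).hexdigest()
            if entry["prev"] != prev or entry["entry_hash"] != expected:
                return False
            prev = entry["entry_hash"]
        return True


log = HashChainLog()
log.append({"artifact": "sha256:abc", "signer": "release.yml@myorg/my-repo"})
log.append({"artifact": "sha256:def", "signer": "release.yml@myorg/my-repo"})
print(log.verify())  # True

# Rewriting history is detectable because later hashes no longer match.
log.entries[0]["body"] = json.dumps({"artifact": "sha256:evil", "signer": "attacker"})
print(log.verify())  # False
```

The point of the exercise is that verification needs no secret: anyone holding the log can recompute the hashes.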

Cosign — Signing & Verification

bash · Cosign (keyless signing via OIDC)
    # Install cosign
    brew install cosign
    # or: go install github.com/sigstore/cosign/v2/cmd/cosign@latest

    # ─── SIGNING ───────────────────────────────────────────────
    # Keyless sign a container image (OIDC identity: GitHub Actions, GCP, etc.)
    cosign sign \
      --yes \
      ghcr.io/myorg/my-image@sha256:abc123...

    # Sign with a key file (for air-gapped / non-OIDC environments)
    cosign generate-key-pair   # produces cosign.key + cosign.pub
    cosign sign --key cosign.key ghcr.io/myorg/my-image@sha256:abc123

    # Sign and attach an SBOM as an attestation
    cosign attest \
      --predicate sbom.cdx.json \
      --type cyclonedx \
      --yes \
      ghcr.io/myorg/my-image@sha256:abc123

    # Sign a blob / binary (e.g. a release tarball)
    cosign sign-blob --yes ./my-app-linux-amd64.tar.gz \
      --bundle bundle.json   # saves sig + cert + rekor entry

    # ─── VERIFICATION ──────────────────────────────────────────
    # Verify keyless: enforce that the identity is a GitHub Actions workflow
    cosign verify \
      --certificate-identity-regexp='https://github.com/myorg/my-repo/.*' \
      --certificate-oidc-issuer=https://token.actions.githubusercontent.com \
      ghcr.io/myorg/my-image@sha256:abc123

    # Verify with a public key
    cosign verify --key cosign.pub ghcr.io/myorg/my-image@sha256:abc123

    # Verify an SBOM attestation
    cosign verify-attestation \
      --type cyclonedx \
      --certificate-identity-regexp='https://github.com/myorg/.*' \
      --certificate-oidc-issuer=https://token.actions.githubusercontent.com \
      ghcr.io/myorg/my-image@sha256:abc123 \
      | jq '.payload | @base64d | fromjson'

    # Verify a blob
    cosign verify-blob ./my-app-linux-amd64.tar.gz --bundle bundle.json \
      --certificate-identity-regexp='https://github.com/myorg/.*' \
      --certificate-oidc-issuer=https://token.actions.githubusercontent.com

GitHub Actions — Keyless Signing in CI

yaml · .github/workflows/release.yml
    name: Build, SBOM, Sign
    on: { push: { tags: ['v*'] } }

    permissions:
      id-token: write       # required for keyless OIDC signing
      contents: read
      packages: write       # push to ghcr.io
      attestations: write   # GitHub artifact attestations

    jobs:
      release:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4

          # Pushing and signing both need an authenticated registry session
          - name: Log in to ghcr.io
            uses: docker/login-action@v3
            with:
              registry: ghcr.io
              username: ${{ github.actor }}
              password: ${{ secrets.GITHUB_TOKEN }}

          - name: Build image
            run: |
              docker build -t ghcr.io/${{ github.repository }}:${{ github.sha }} .
              docker push ghcr.io/${{ github.repository }}:${{ github.sha }}
              # Capture the digest: always sign the digest, never just the tag
              DIGEST=$(docker inspect --format='{{index .RepoDigests 0}}' \
                ghcr.io/${{ github.repository }}:${{ github.sha }})
              echo "IMAGE_DIGEST=$DIGEST" >> "$GITHUB_ENV"

          - name: Generate SBOM (Syft)
            uses: anchore/sbom-action@v0
            with:
              image: ${{ env.IMAGE_DIGEST }}
              format: cyclonedx-json
              output-file: sbom.cdx.json

          - name: Install Cosign
            uses: sigstore/cosign-installer@v3

          - name: Sign image
            run: |
              cosign sign --yes ${{ env.IMAGE_DIGEST }}

          - name: Attest SBOM
            run: |
              cosign attest \
                --predicate sbom.cdx.json \
                --type cyclonedx \
                --yes \
                ${{ env.IMAGE_DIGEST }}

          - name: Scan for vulnerabilities
            uses: anchore/scan-action@v3
            with:
              sbom: sbom.cdx.json
              fail-build: "true"
              severity-cutoff: high
🔑
Always sign the digest, never the tag. Tags are mutable — latest can point to a different image tomorrow. Signatures attach to the immutable SHA-256 digest. If you sign myimage:latest, cosign resolves the digest at signing time, but consumers must verify against the digest to get the guarantee.
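A deployment or admission script can enforce the digest rule with a simple pattern check before it ever calls a verifier. A minimal sketch; the function name is hypothetical and the check covers only sha256 digests.

```python
import re

# An image reference pinned by digest ends in @sha256:<64 lowercase hex characters>.
DIGEST_REF = re.compile(r"^[^@\s]+@sha256:[0-9a-f]{64}$")


def is_digest_pinned(image_ref: str) -> bool:
    """True when the reference names an immutable sha256 digest rather than only a tag."""
    return bool(DIGEST_REF.match(image_ref))


digest = "a" * 64  # placeholder digest for illustration
print(is_digest_pinned(f"ghcr.io/myorg/my-image@sha256:{digest}"))      # True
print(is_digest_pinned(f"ghcr.io/myorg/my-image:v1@sha256:{digest}"))   # True
print(is_digest_pinned("ghcr.io/myorg/my-image:latest"))                # False
```

Rejecting tag-only references early keeps the mutable-tag problem from reaching the verification step at all.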
🏛

SLSA Framework

// Supply chain Levels for Software Artifacts — tamper-evident provenance

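SLSA (pronounced "salsa") is a security framework and checklist of standards to prevent tampering, improve integrity, and secure packages and infrastructure in your projects. SLSA v0.1 defined four levels of build security assurance; the v1.0 specification keeps Build levels 1 to 3 and defers the former level 4 requirements to future versions.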

Level | Requirements | Guarantee | Typical Environment
SLSA 1 | Build process is scripted/automated; provenance generated (may be unverified) | Basic documentation; not tamper-resistant | CI with any provenance generation
SLSA 2 | Hosted build service; provenance is authenticated and signed | Builds from a specific hosted service; prevents some tampering | GitHub Actions, GitLab CI with provenance action
SLSA 3 | Hardened build platform; builds from specific source repo; provenance non-forgeable by build service | Build source auditable; cannot be tampered by the build service itself | Hermetically sealed CI, provenance from trusted builder
SLSA 4 (v0.1; not part of the v1.0 Build track) | Two-party review of all source changes; hermetic, reproducible builds | Full auditability; reproducible builds verifiable by anyone | High-assurance projects (OS distros, critical infra)

Generating SLSA Provenance in GitHub Actions

yaml · SLSA Level 3 provenance for a Go binary
    name: Release with SLSA Provenance
    on: { release: { types: [created] } }

    permissions:
      id-token: write
      contents: write
      actions: read   # required by SLSA generator

    jobs:
      build:
        runs-on: ubuntu-latest
        outputs:
          hash: ${{ steps.hash.outputs.hash }}
        steps:
          - uses: actions/checkout@v4
          - name: Build binary
            run: go build -o my-app ./cmd/...
          - name: Compute hash
            id: hash
            run: |
              sha256sum my-app > hash.txt
              echo "hash=$(base64 -w0 hash.txt)" >> "$GITHUB_OUTPUT"
          - uses: actions/upload-artifact@v4
            with: { name: binary, path: my-app }

      # The SLSA generator signs the provenance using the workflow's OIDC identity
      provenance:
        needs: [build]
        uses: slsa-framework/slsa-github-generator/.github/workflows/generator_generic_slsa3.yml@v2.0.0
        with:
          base64-subjects: ${{ needs.build.outputs.hash }}
          upload-assets: true   # attaches .intoto.jsonl to GitHub Release

In-Toto Attestations

// Structured claims about software supply chain steps

In-Toto is the attestation framework underlying SLSA provenance, SBOM attestations, and other supply chain claims. An attestation is a cryptographically signed statement: "this software was built from this source, using this tool, at this time, producing this output."

json · In-Toto SLSA Provenance attestation structure
    {
      "_type": "https://in-toto.io/Statement/v0.1",
      "predicateType": "https://slsa.dev/provenance/v0.2",
      "subject": [{
        "name": "my-app",
        "digest": { "sha256": "4d82d6854a8..." }
      }],
      "predicate": {
        "builder": {
          "id": "https://github.com/slsa-framework/slsa-github-generator/.github/workflows/generator_generic_slsa3.yml@v2.0.0"
        },
        "buildType": "https://github.com/slsa-framework/slsa-github-generator/generic@v1",
        "invocation": {
          "configSource": {
            "uri": "git+https://github.com/myorg/my-repo@refs/heads/main",
            "digest": { "sha1": "abc123def456" },
            "entryPoint": ".github/workflows/release.yml"
          }
        },
        "materials": [{
          "uri": "git+https://github.com/myorg/my-repo",
          "digest": { "sha1": "abc123def456" }
        }]
      }
    }
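Once an attestation's signature has been verified by a tool such as cosign, a consumer still has to check that the statement talks about the artifact in hand and names a builder it trusts. The sketch below covers only those two policy checks on an already-decoded statement; it performs no signature verification, and the function name and sample values are hypothetical.

```python
import hashlib


def check_provenance(artifact: bytes, statement: dict, trusted_builders: set[str]) -> tuple[bool, str]:
    """Check subject digest and builder identity of a decoded in-toto statement."""
    digest = hashlib.sha256(artifact).hexdigest()
    subject_digests = {
        subject.get("digest", {}).get("sha256")
        for subject in statement.get("subject", [])
    }
    if digest not in subject_digests:
        return False, "artifact digest is not listed in the statement subjects"

    builder_id = statement.get("predicate", {}).get("builder", {}).get("id")
    if builder_id not in trusted_builders:
        return False, f"untrusted builder: {builder_id}"
    return True, "ok"


BUILDER = (
    "https://github.com/slsa-framework/slsa-github-generator/"
    ".github/workflows/generator_generic_slsa3.yml@v2.0.0"
)
binary = b"pretend this is the my-app binary"
statement = {
    "subject": [{"name": "my-app", "digest": {"sha256": hashlib.sha256(binary).hexdigest()}}],
    "predicate": {"builder": {"id": BUILDER}},
}
print(check_provenance(binary, statement, {BUILDER}))
print(check_provenance(binary + b"!", statement, {BUILDER}))
```

Pinning the trusted builder list is what turns provenance from documentation into an enforceable policy.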
🕸

Dependency Risks

// Understanding what you're actually importing
🔴
The real scale: An average Node.js application installs hundreds of packages including transitive deps. A typical node_modules folder contains code from thousands of unique contributors worldwide. You haven't reviewed 99%+ of it. The question isn't whether you trust it — it's whether you've made an informed decision.

Risk Dimensions for Every Dependency

Dimension | Low Risk | High Risk
Maintainer count | Multiple active maintainers, org-owned | Single maintainer, inactive for years
Dependency depth | Direct dep with few transitive deps | Deep transitive dep, many sub-dependencies
Install scripts | No postinstall/preinstall hooks | Executes shell commands on install
Network access in tests | All tests run offline | Tests or build scripts reach external URLs
Published vs. source diff | Reproducible build; source matches artifact | Can't reproduce the published artifact from source
License | MIT, Apache-2.0, BSD (permissive) | GPL/AGPL (copyleft); unknown license (high risk)
🛡

Hardening Dependencies

// Locking, pinning, private proxies, and allowlisting

Pin Exact Versions — Use Lockfiles

✓ Pinned (reproducible)
    # requirements.txt: exact versions
    requests==2.31.0
    numpy==1.26.4

    # package.json: no range prefixes
    { "dependencies": { "express": "4.18.2" } }   ← exact

    # Always commit lockfiles
    package-lock.json   ← commit this
    poetry.lock         ← commit this
    Cargo.lock          ← commit this
✕ Unpinned (supply chain risk)
    # requirements.txt: any future version is accepted
    requests>=2.0.0

    # package.json: ^ and ~ allow updates
    { "dependencies": { "express": "^4.0.0" } }   ← allows 4.x

    # .gitignore-ing lockfiles means every install could differ
    *.lock   ← do NOT ignore
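Pinning can be enforced in review or CI with a small linter over `requirements.txt`. This is a sketch under simplifying assumptions: it treats only `name==version` (optionally with extras) as pinned, skips option lines, and does not understand URLs or editable installs.

```python
import re

# name, optional [extras], then an exact == version with no wildcard.
PINNED = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]+\])?==[^=*\s]+$")


def unpinned_requirements(requirements_text: str) -> list[str]:
    """Return requirement specifiers that are not pinned to one exact version."""
    unpinned = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()        # drop comments
        if not line or line.startswith("-"):       # skip blanks and options like -r / --hash
            continue
        spec = line.split(";", 1)[0]               # drop environment markers
        spec = spec.split("\\", 1)[0].strip()      # drop line-continuation backslash
        spec = spec.split(" --hash", 1)[0].strip() # drop inline hash options
        if not PINNED.match(spec):
            unpinned.append(spec)
    return unpinned


sample_requirements = """\
requests==2.31.0
numpy==1.26.4  # pinned
flask>=2.0
django
-r other.txt
uvicorn[standard]==0.29.0
pandas==2.*
"""
print(unpinned_requirements(sample_requirements))  # ['flask>=2.0', 'django', 'pandas==2.*']
```

A non-empty result is a reasonable condition for failing a pre-merge check.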

Private Package Proxies & Allowlisting

bash · Private proxy configuration
    # npm: route all installs through Artifactory / Nexus / Verdaccio
    npm config set registry https://artifactory.mycompany.com/artifactory/api/npm/npm/

    # pip: private PyPI proxy
    pip install --index-url https://nexus.mycompany.com/repository/pypi-proxy/simple/ requests

    # pip.conf (system-wide)
    [global]
    index-url = https://nexus.mycompany.com/repository/pypi-proxy/simple/
    # trusted-host = nexus.mycompany.com
    #   Only needed for plain-HTTP or self-signed hosts. It tells pip to trust the host
    #   without valid HTTPS, so leave it out when the proxy has a valid certificate.

    # Maven: settings.xml with mirror
    <mirror>
      <id>central-proxy</id>
      <mirrorOf>*</mirrorOf>
      <url>https://nexus.mycompany.com/repository/maven-public/</url>
    </mirror>

    # Docker: always pull from internal mirror, never Docker Hub directly
    DOCKER_MIRROR=registry.mycompany.com/dockerhub-proxy

    # Prevent dependency confusion: namespace all internal packages
    # npm: use scoped packages (@myorg/package-name) and map the scope to the internal registry
    #      (the npm-internal repository path below is a placeholder)
    npm config set @myorg:registry https://artifactory.mycompany.com/artifactory/api/npm/npm-internal/
    # pip: register internal package names on PyPI as dummies
    # Or: configure Artifactory to only resolve @myorg scope from internal

Dependency Confusion — Specific Mitigations

MUST — npm
Use scoped packages + scope-to-registry mapping
All internal packages must use a scoped name like @myorg/package. Map the scope to your private registry (for example with an @myorg:registry entry in .npmrc) so that @myorg/* resolves exclusively from it, never from the public npm registry.
MUST — pip
Claim internal names on PyPI
Publish placeholder/dummy packages on PyPI for every internal package name you use. An attacker can't publish a malicious version of a name that's already claimed. Use --no-index in air-gapped builds.
MUST — All
Block public registry in prod build
Production builds should only resolve from your internal proxy. Network-level egress rules or firewall policies should prevent build agents from reaching public registries directly.

npm audit & Lockfile Integrity

bash · Integrity verification
    # npm: install exactly what the lockfile specifies, verifying integrity hashes
    npm ci                         # fails if package.json and the lockfile disagree
    npm audit --audit-level=high   # fail on high+ severity
    npm audit fix                  # auto-remediate where a compatible fix exists

    # pip: verify checksums
    pip install --require-hashes -r requirements.txt
    # requirements.txt with hashes:
    #   requests==2.31.0 \
    #     --hash=sha256:58cd2187423839... \
    #     --hash=sha256:942c5a758f98d7...

    # pip-audit: dedicated audit tool
    pip install pip-audit
    pip-audit --requirement requirements.txt
    pip-audit --requirement requirements.txt --format cyclonedx-json --output vuln.cdx.json

    # cargo: audit for Rust
    cargo install cargo-audit
    cargo audit

    # Go: vulnerability check
    go install golang.org/x/vuln/cmd/govulncheck@latest
    govulncheck ./...
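The idea behind hash-checked installs is simple: the downloaded file's SHA-256 must match one of the hashes recorded next to the requirement. The sketch below reproduces that comparison for illustration; it is not pip's implementation, and the helper names and sample bytes are hypothetical.

```python
import hashlib
import re

HASH_OPTION = re.compile(r"--hash=(sha256:[0-9a-f]{64})")


def allowed_hashes(requirement_entry: str) -> set[str]:
    """Collect the sha256 hashes listed for one requirement entry."""
    return set(HASH_OPTION.findall(requirement_entry))


def artifact_is_allowed(artifact: bytes, requirement_entry: str) -> bool:
    """True when the artifact's sha256 appears among the entry's recorded hashes."""
    actual = "sha256:" + hashlib.sha256(artifact).hexdigest()
    return actual in allowed_hashes(requirement_entry)


wheel = b"pretend this is the downloaded wheel"
entry = "requests==2.31.0 \\\n    --hash=sha256:" + hashlib.sha256(wheel).hexdigest()

print(artifact_is_allowed(wheel, entry))                 # True
print(artifact_is_allowed(wheel + b" tampered", entry))  # False
```

Because the hash is recorded at review time, a registry or mirror that later serves different bytes for the same version is caught at install time.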
🤖

AI-Era Threats

// Package hallucination, AI-assisted attacks, and LLM supply chain risks
🚨
AI package hallucination is a live, weaponized threat. Large language models — including coding assistants — frequently invent plausible-sounding but nonexistent package names. Attackers monitor for these hallucinated names and publish malicious packages under those names. Any developer who follows an LLM's pip install or npm install suggestion without verification may install attacker-controlled code.

How the Attack Works

text · AI hallucination → supply chain attack flow
    Developer asks LLM: "How do I parse YAML in Python?"

    LLM responds: "You can use the `yaml-python-advanced` library:
        pip install yaml-python-advanced
        import yaml_python_advanced as ypa
        data = ypa.parse(my_string)"

    ❌ PROBLEM: `yaml-python-advanced` doesn't exist.
       The correct package is `pyyaml` (pip install pyyaml).

    Attacker flow:
    1. Collect hallucinated package names from LLM outputs
       (published studies have catalogued large numbers of unique invented names)
    2. Register the name on PyPI / npm before any legitimate package exists
    3. Include credential harvester, crypto miner, or backdoor in the package
    4. Wait for developers to blindly run the install command an LLM suggested
    5. Achieve code execution on developer machines and CI/CD pipelines

Verified Hallucination Patterns (Research Findings)

Pattern | Example Hallucination | Actual Package
Adding "advanced" or "pro" | requests-advanced | requests
Adding language prefix | python-http-client-v2 | httpx
Combining real package names | numpy-pandas-utils | numpy, pandas
Plausible-sounding utilities | csv-tools-enhanced | csv (stdlib)
Version-suffixed packages | flask-api-v3 | flask

Mitigations for AI-Assisted Development

MUST
Verify before installing
Always look up every package an LLM suggests on the official registry (pypi.org, npmjs.com) before running install. Verify: does this package exist? Is it what the LLM described? Check download counts, age, and maintainer history.
MUST
Use a package allowlist
Maintain an approved package list. Any package not on the list requires a security review and explicit approval before adding to the project — regardless of source (LLM, StackOverflow, colleague recommendation).
MUST
Block unknown packages in CI
CI pipelines should only install packages that are in the lockfile and verified against your internal proxy. Any new package must go through a review gate rather than being added directly in a PR.
SHOULD
Monitor LLM suggestions in PRs
Review PRs for AI-suggested package additions with the same scrutiny as code changes. Treat any requirements.txt or package.json diff as a security-relevant change requiring explicit review.
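The allowlist and review-gate mitigations above can be combined into one check that runs on every change to `requirements.txt`: any newly added name that is not on the approved list is flagged for review. A minimal sketch; the function names and the sample allowlist are hypothetical, and name normalization follows the usual PyPI convention of treating `-`, `_`, and `.` as equivalent.

```python
import re

NAME_RE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*")


def normalize(name: str) -> str:
    """Normalize a package name so PyYAML, pyyaml, and py_yaml-style variants compare equal."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_names(requirements_text: str) -> set[str]:
    """Extract normalized package names from requirements.txt content."""
    names = set()
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or line.startswith("-"):
            continue
        match = NAME_RE.match(line)
        if match:
            names.add(normalize(match.group(0)))
    return names


def needs_review(old_text: str, new_text: str, allowlist: set[str]) -> list[str]:
    """Names added by the change that are not on the approved list."""
    approved = {normalize(name) for name in allowlist}
    added = requirement_names(new_text) - requirement_names(old_text)
    return sorted(added - approved)


before = "requests==2.31.0\n"
after = "requests==2.31.0\nPyYAML==6.0.1\nyaml-python-advanced==1.0.0\n"
print(needs_review(before, after, {"requests", "pyyaml"}))  # ['yaml-python-advanced']
```

The check is source-agnostic: it flags the name whether it came from an LLM, a forum answer, or a colleague.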

AI Coding Assistant Risks Beyond Package Hallucination

Outdated Dependency Advice

LLMs have training cutoffs and may recommend specific versions that are outdated or have known CVEs. The model cannot know about vulnerabilities disclosed after its training data cutoff.

Mitigation: Always run npm audit, pip-audit, or Grype after adding any LLM-suggested dependency.

Poisoned Training Data

Research demonstrates that code completion models can be trained or fine-tuned to suggest insecure coding patterns — SQL injection, hardcoded credentials, unsafe deserialization — with high confidence.

Mitigation: Treat all AI-generated code as untrusted user input. SAST scan everything. Code review is not optional.

🔭

Scanning & Detection

// OSV, Grype, Trivy, Dependabot, and behavioral detection

Vulnerability Databases

Database | Coverage | Format | URL
OSV | All ecosystems (PyPI, npm, Go, Maven, crates.io, GitHub) | JSON / API | osv.dev
NVD | CVEs across all software; authoritative CVSS scores | JSON / OVAL | nvd.nist.gov
GitHub Advisory | npm, PyPI, Maven, RubyGems, Go, NuGet, Rust | GraphQL / API | github.com/advisories
Exploit-DB | Proof-of-concept exploits for CVEs | Web / CSV | exploit-db.com

Detecting Malicious Packages Behaviorally

bash · Package Behavior Analysis
    # Socket.dev: detect malicious npm packages before install
    npm install -g @socketsecurity/cli
    socket npm install lodash            # audits before installing
    socket report --repo myorg/my-repo   # full repo audit

    # GuardDog: Datadog's malicious package detector
    pip install guarddog
    guarddog pypi scan requests             # scan specific package
    guarddog pypi verify requirements.txt   # scan the packages listed in a requirements file

    # What guarddog checks for:
    # - Typosquatting analysis
    # - Install scripts presence
    # - Outbound network connections in package code
    # - File system access outside the package
    # - Code obfuscation / eval() of dynamic strings
    # - Credential environment variable access

    # Trivy: comprehensive scanner (images, repos, SBOMs)
    trivy image nginx:1.25
    trivy repo https://github.com/myorg/my-repo
    trivy sbom sbom.cdx.json

    # Dependabot (GitHub): automated PR-based updates
    # .github/dependabot.yml:
    version: 2
    updates:
      - package-ecosystem: "npm"
        directory: "/"
        schedule: { interval: "weekly" }
        open-pull-requests-limit: 10
      - package-ecosystem: "pip"
        directory: "/"
        schedule: { interval: "daily" }
🚀

CI/CD Pipeline Security Gates

// The supply chain security checks every pipeline should enforce

Every stage of the CI/CD pipeline is an opportunity to detect and block supply chain compromise. Security gates should fail the build — not just warn — on violations.

Pipeline stages: 📥 Source (Checkout + Verify) → 🔍 Scan (Dep Audit) → 🏗 Build (Hermetic Build) → 📋 SBOM (Generate + Sign) → 🔭 Scan (Vuln + License) → 🔑 Sign (Cosign + SLSA) → 🚢 Deploy (Verify + Admit)
yaml · Complete secure pipeline template (.github/workflows/secure-build.yml)
    name: Secure Build Pipeline
    on: [push, pull_request]

    permissions:
      contents: read              # minimal permissions by default

    jobs:
      security-checks:
        runs-on: ubuntu-latest
        permissions:              # permissions are set per workflow or per job, not per step
          contents: read
          security-events: write  # for SARIF upload
          packages: write         # push to ghcr.io
          id-token: write         # keyless OIDC signing
        steps:
          # ── 1. Pin action versions by SHA (not tag!) ──
          # Only checkout is SHA-pinned here for brevity; pin every action the same way.
          - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4 SHA
            with:
              persist-credentials: false   # don't expose GitHub token to steps

          # ── 2. Dependency audit (fail on high+) ──
          - name: Audit dependencies (npm)
            run: npm audit --audit-level=high
          - name: Install with frozen lockfile
            run: npm ci   # ci = frozen lockfile, fails if inconsistent

          # ── 3. SAST scan (CodeQL: init before the build, analyze after it) ──
          - uses: github/codeql-action/init@v3
            with: { languages: javascript }

          # ── 4. Build ──
          - name: Build
            run: npm run build
            env:
              NODE_ENV: production
              NO_INTERNET: "1"   # signal only: enforce hermetic builds with network policy
          - uses: github/codeql-action/analyze@v3

          # ── 5. Container build + SBOM ──
          - uses: docker/login-action@v3
            with:
              registry: ghcr.io
              username: ${{ github.actor }}
              password: ${{ secrets.GITHUB_TOKEN }}
          - uses: docker/build-push-action@v5
            id: build
            with:
              context: .
              push: true
              tags: ghcr.io/${{ github.repository }}:${{ github.sha }}
          - uses: anchore/sbom-action@v0
            with:
              image: ghcr.io/${{ github.repository }}@${{ steps.build.outputs.digest }}
              format: cyclonedx-json
              output-file: sbom.cdx.json

          # ── 6. Vulnerability scan the SBOM ──
          - uses: anchore/scan-action@v3
            with:
              sbom: sbom.cdx.json
              fail-build: "true"
              severity-cutoff: high

          # ── 7. Sign image + SBOM ──
          - uses: sigstore/cosign-installer@v3
          - name: Sign image and attest SBOM
            run: |
              cosign sign --yes ghcr.io/${{ github.repository }}@${{ steps.build.outputs.digest }}
              cosign attest --yes --predicate sbom.cdx.json --type cyclonedx \
                ghcr.io/${{ github.repository }}@${{ steps.build.outputs.digest }}
⚠️
Pin GitHub Actions by SHA, not tag. A reference such as uses: actions/checkout@v4 points at a mutable tag: it could resolve to different code tomorrow if the action repo is compromised or the tag is moved. Use the full commit SHA instead: uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683. Tools like Renovate and Dependabot can keep SHA pins up to date automatically.
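A pinning rule is easier to keep if something checks it. The sketch below scans workflow text for `uses:` references whose version is not a full 40-character commit SHA. It is a regex-based approximation, not a YAML parser, and the function name and the choice to skip local (`./`) and `docker://` references are assumptions made for illustration.

```python
import re

# Matches "uses: owner/repo@ref" with or without a leading list dash.
USES_RE = re.compile(r"^\s*(?:-\s*)?uses:\s*([^\s#]+)", re.MULTILINE)
FULL_SHA_RE = re.compile(r"^[0-9a-f]{40}$")


def unpinned_actions(workflow_text):
    """Return `uses:` references that are not pinned to a full commit SHA."""
    unpinned = []
    for ref in USES_RE.findall(workflow_text):
        ref = ref.strip("'\"")
        if ref.startswith("./") or ref.startswith("docker://"):
            continue  # local actions and Docker references are out of scope here
        _, _, version = ref.partition("@")
        if not FULL_SHA_RE.match(version):
            unpinned.append(ref)
    return unpinned


SAMPLE = """\
steps:
  - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4
  - uses: github/codeql-action/analyze@v3
  - uses: ./local-action
"""

print(unpinned_actions(SAMPLE))
```

On the sample, only the `@v3` reference is reported. Run in CI over every file in `.github/workflows/`, a non-empty result can be turned into a failing exit code.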
🚨

Incident Response

// When a dependency is compromised — the first 60 minutes
text · Supply chain incident runbook
T+0 min: DETECTION
□ Source of intelligence: CVE feed / security advisory / internal scan
□ Identify the affected package: name, version range, registry
□ Query your SBOM inventory: which products and versions use this package?
  → for f in *.cdx.json; do grype "sbom:$f" | grep package-name; done
  → for f in *.json; do osv-scanner --sbom="$f"; done
□ Open incident bridge; notify security team lead

T+5 min: SCOPING
□ Are affected versions deployed in production? Which environments?
□ Is the vulnerability exploitable in your usage context? (VEX analysis)
□ What is the CVSS score? Is there a public PoC exploit?
□ Are customers / external services exposed?

T+15 min: CONTAINMENT (if actively exploitable)
□ Isolate affected services from external network if RCE/critical
□ Block malicious outbound connections if package exfiltrates data
□ Rotate any credentials that may have been exposed
□ Enable enhanced logging on affected systems

T+20 min: REMEDIATION
□ Pin to patched version in all affected repos
□ Open emergency PRs; fast-track review + merge
□ Trigger emergency CI/CD pipeline run
□ Deploy patched artifacts to all affected environments
□ Rebuild and re-sign container images; push new SBOMs

T+45 min: VERIFICATION
□ Re-run vulnerability scan against patched SBOM and confirm the CVE is resolved
□ Verify patched images are deployed across all environments
□ Confirm no indicators of compromise in logs (data exfil, reverse shell, etc.)

T+60 min: COMMUNICATION
□ Customer notification (if applicable, per SLA/regulatory requirements)
□ Internal post-incident summary to stakeholders
□ Update VEX documents for any CVEs deemed not_affected in other products

POST-INCIDENT
□ Blameless post-mortem within 5 business days
□ Update SBOM freshness SLA if gap discovered
□ Add new detection rule for class of vulnerability
□ Review dependency update frequency policy
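The "query your SBOM inventory" step can also be scripted directly against CycloneDX JSON. The sketch below assumes SBOMs are already loaded as dicts keyed by product name and that the advisory lists exact affected versions; version-range matching is deliberately left out. The function names, product names, and `example-lib` package are hypothetical.

```python
def iter_components(node):
    """Yield every component in a CycloneDX document, including nested ones."""
    for comp in node.get("components", []):
        yield comp
        yield from iter_components(comp)


def affected_components(sbom, package_name, bad_versions):
    """Return (name, version, purl) tuples for components named in the advisory."""
    return [
        (comp["name"], comp["version"], comp.get("purl", ""))
        for comp in iter_components(sbom)
        if comp.get("name") == package_name and comp.get("version") in bad_versions
    ]


def affected_products(inventory, package_name, bad_versions):
    """Map each product to its matching components; unaffected products are omitted."""
    result = {}
    for product, sbom in inventory.items():
        hits = affected_components(sbom, package_name, bad_versions)
        if hits:
            result[product] = hits
    return result


# Hypothetical inventory: two products, one of which ships an affected version.
INVENTORY = {
    "billing-api": {"components": [
        {"name": "example-lib", "version": "2.1.0", "purl": "pkg:npm/example-lib@2.1.0"},
    ]},
    "web-frontend": {"components": [
        {"name": "wrapper", "version": "1.0.0", "components": [
            {"name": "example-lib", "version": "2.0.9", "purl": "pkg:npm/example-lib@2.0.9"},
        ]},
    ]},
}

print(affected_products(INVENTORY, "example-lib", {"2.1.0", "2.1.1"}))
```

Only `billing-api` is reported here, because `web-frontend` bundles a version outside the affected set. The answer is only as current as the SBOMs in the inventory.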
⚖️

Compliance & Policy

// License compliance, allowlists, and policy-as-code

License Compliance

| License Type | Examples | Commercial Use | Copyleft Risk |
| --- | --- | --- | --- |
| Permissive | MIT, Apache-2.0, BSD-2-Clause, ISC | ✓ Free to use | None |
| Weak Copyleft | LGPL-2.1, MPL-2.0, EUPL | Conditional: if modified, share changes | Library modifications must be open |
| Strong Copyleft | GPL-2.0, GPL-3.0 | ⚠ Derivative works must be GPL | May require open-sourcing your product |
| Network Copyleft | AGPL-3.0 | ⚠ Even SaaS use triggers copyleft | Most restrictive for SaaS businesses |
| Commercial Restriction | BUSL-1.1, SSPL, CC-NC | ✗ Commercial use prohibited or restricted | Legal review required |
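The categories in the table can be encoded as a small lookup so that tooling and reviewers share one vocabulary. The following is a rough sketch: the prefix matching, the category labels, and the `classify` function are illustrative assumptions, and the mapping covers only the licenses the table names. Anything else comes back as `unknown` and should go to a human.

```python
# Exact SPDX identifiers the table lists as permissive.
PERMISSIVE = {"MIT", "Apache-2.0", "BSD-2-Clause", "ISC"}

# License families from the table, matched by identifier prefix so that
# variants such as "GPL-3.0-only" or "AGPL-3.0-or-later" are covered.
CATEGORY_PREFIXES = [
    ("AGPL-", "network-copyleft"),
    ("LGPL-", "weak-copyleft"),
    ("MPL-", "weak-copyleft"),
    ("EUPL-", "weak-copyleft"),
    ("GPL-", "strong-copyleft"),
    ("BUSL-", "commercial-restriction"),
    ("SSPL-", "commercial-restriction"),
    ("CC-BY-NC", "commercial-restriction"),
]


def classify(spdx_id):
    """Return the table's category for an SPDX identifier, or 'unknown'."""
    if spdx_id in PERMISSIVE:
        return "permissive"
    for prefix, category in CATEGORY_PREFIXES:
        if spdx_id.startswith(prefix):
            return category
    return "unknown"


for license_id in ["MIT", "LGPL-2.1-only", "GPL-3.0-only", "AGPL-3.0-or-later", "Unlicense"]:
    print(license_id, "->", classify(license_id))
```

Treating `unknown` as "needs review" rather than "allowed" keeps the lookup conservative when a new license shows up in a dependency tree.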

Policy-as-Code with OPA and Syft

bash · License and vulnerability policy enforcement
# license_finder: Ruby gem, works across ecosystems
gem install license_finder
license_finder                                  # scan current project
license_finder approvals add lodash             # approve a specific package
license_finder permitted_licenses add MIT Apache-2.0 BSD-2-Clause ISC
license_finder restricted_licenses add AGPL-3.0 GPL-3.0

# In CI: fail if any restricted or unapproved license is detected
license_finder --decisions-file=approved_licenses.yml || exit 1

# Trivy: license scanning built in
trivy image --scanners license nginx:1.25
trivy repo --scanners license --ignored-licenses MIT,Apache-2.0 .

# sbom-scorecard: SBOM quality + policy scoring
go install github.com/ebay/sbom-scorecard/cmd/sbom-scorecard@latest
sbom-scorecard score sbom.cdx.json
# Reports: NTIA compliance, component completeness, has hashes, has licenses, etc.

# Conftest (OPA): SBOM policy-as-code
mkdir -p policy
cat > policy/sbom.rego <<'EOF'
package sbom

import rego.v1

# CycloneDX records a license either as license.id or as an SPDX expression,
# so the policy checks both shapes.
deny contains msg if {
    some component in input.components
    some entry in component.licenses
    entry.license.id == "GPL-3.0-only"
    msg := sprintf("GPL-3.0 detected in component: %v", [component.name])
}

deny contains msg if {
    some component in input.components
    some entry in component.licenses
    entry.expression == "GPL-3.0-only"
    msg := sprintf("GPL-3.0 detected in component: %v", [component.name])
}
EOF

# Conftest evaluates the "main" package by default; point it at "sbom"
conftest test sbom.cdx.json --policy policy/ --namespace sbom
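A license policy over a CycloneDX SBOM has to cope with the fact that a component's `licenses` entries can carry either a `license` object (with `id` or `name`) or a free-form SPDX `expression`. The sketch below shows one way to handle both shapes when checking a deny list. The function names and the default deny list are assumptions, and the expression handling is a simple tokenizer, not a full SPDX expression parser.

```python
import re

TOKEN_RE = re.compile(r"[A-Za-z0-9.+-]+")


def normalize(token):
    """Strip SPDX version qualifiers so 'GPL-3.0-only' compares equal to 'GPL-3.0'."""
    for suffix in ("-only", "-or-later"):
        if token.endswith(suffix):
            return token[: -len(suffix)]
    return token.rstrip("+")


def license_ids(component):
    """Collect normalized license identifiers from both CycloneDX license shapes."""
    ids = []
    for entry in component.get("licenses", []):
        texts = []
        if "expression" in entry:
            texts.append(entry["expression"])
        lic = entry.get("license", {})
        if "id" in lic:
            texts.append(lic["id"])
        for text in texts:
            ids.extend(normalize(tok) for tok in TOKEN_RE.findall(text))
    return ids


def denied_components(sbom, denied=("GPL-3.0", "AGPL-3.0")):
    """Return (component name, license) pairs that hit the deny list."""
    hits = []
    for comp in sbom.get("components", []):
        for lic in license_ids(comp):
            if lic in denied:
                hits.append((comp.get("name"), lic))
    return hits


# Hypothetical SBOM fragment covering both license shapes.
SBOM = {"components": [
    {"name": "a", "licenses": [{"license": {"id": "MIT"}}]},
    {"name": "b", "licenses": [{"expression": "MIT OR GPL-3.0-only"}]},
    {"name": "c", "licenses": [{"license": {"id": "LGPL-3.0-only"}}]},
    {"name": "d", "licenses": [{"license": {"id": "AGPL-3.0-or-later"}}]},
]}

print(denied_components(SBOM))
```

Comparing whole tokens avoids the substring trap where `GPL-3.0` would also match `LGPL-3.0`. The check is deliberately conservative: a dual-licensed component such as `b` is flagged even though the `OR` would let you choose MIT, leaving that decision to a reviewer.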
🔧

Tooling Reference

// The supply chain security toolchain

SBOM Generation

🔬
Syft
anchore/syft · github.com/anchore/syft
Universal SBOM generator for container images, filesystems, and source repos. Outputs SPDX, CycloneDX, and custom formats. Can attach SBOMs to OCI images as signed attestations. Pairs with Grype for vulnerability scanning.
cdxgen
@cyclonedx/cdxgen · npmjs.com/package/@cyclonedx/cdxgen
CycloneDX-native SBOM generator with deep multi-language support. Handles 20+ build systems. Particularly strong for Java/Gradle and complex monorepos. Generates ML model BOMs (mlbom) for AI/ML projects.

Vulnerability Scanning

🦷
Grype
anchore/grype · github.com/anchore/grype
Fast, offline-capable vulnerability scanner. Accepts container images, SBOMs (SPDX/CycloneDX), directories, and more. Integrates with VEX for suppression. Same ecosystem as Syft.
🔭
Trivy
aquasecurity/trivy · trivy.dev
All-in-one scanner: container images, filesystems, Git repos, Kubernetes configs, Terraform, SBOMs. License scanning built in. Strong Kubernetes/cloud-native integration. Maintained by Aqua Security.
🌐
OSV-Scanner
google/osv-scanner · google.github.io/osv-scanner
Google's open-source scanner backed by the OSV database. Scans lockfiles across 15+ ecosystems and SBOMs. Low false-positive rate. Can output CycloneDX for pipeline consumption.

Signing & Provenance

🔑
Cosign
sigstore/cosign · docs.sigstore.dev
The primary Sigstore CLI. Signs and verifies OCI images, blobs, and SBOM attestations. Keyless OIDC signing for CI/CD environments. Signatures stored in OCI registries alongside artifacts.
🏛
slsa-github-generator
slsa-framework/slsa-github-generator
Reusable GitHub Actions workflows for generating SLSA Level 3 provenance for Go binaries, generic artifacts, and container images. Provenance is signed keylessly through Sigstore using the workflow's GitHub OIDC identity, and the signature is recorded in Rekor.

Policy & Detection

🧲
Socket
socket.dev · @socketsecurity/cli
Real-time malicious package detection for npm and PyPI. Intercepts install commands, analyzes package behavior (network calls, file access, obfuscation) before installation. GitHub app for PR-level scanning.
🐕
guarddog
DataDog/guarddog · github.com/DataDog/guarddog
Datadog's malicious package detector. 20+ heuristics covering typosquatting, install scripts, exfiltration patterns, obfuscation. Works on PyPI and npm. Integrates cleanly into CI pipelines.

Reference Links