Create MRVA Analysis

In this workflow, the main mechanism for creating a MRVA analysis is via the Create a CodeQL variant analysis API. This will produce the raw data.

MRVA queries run against CodeQL databases stored on GitHub. A repository cannot participate in variant analysis until a valid CodeQL database exists for the target language. Each code scanning workflow run generates a fresh database snapshot per analyzed language, with the most recent version persisted automatically. Large-scale MRVA is only feasible after CodeQL has been fully rolled out across the target repositories for the language under analysis.

Controller Repository

MRVA uses GitHub Actions to execute queries. No dedicated workflow file is required, but a controller repository must exist to orchestrate the analysis. The controller repository must contain at least one commit.

Runner Configuration

Self-hosted runners are supported via the MRVA_RUNNER_OS action variable (not a secret) on the controller repository. The value is a JSON array mapping to the runs-on field:

["self-hosted", "some-label", "another-label"]

Query Packs

A CodeQL query pack is a distributable unit that bundles queries, query suites, optional libraries, and a manifest (codeql-pack.yml). Packs are versioned, portable, and define dependencies on other packs. The query_pack field in the API request specifies which queries MRVA executes. You can read more about creating CodeQL packs here.

API

Request Body

Field	Type	Required	Description
`language`	`string`	Yes	Target language (`javascript`, `typescript`, `python`, `java`, `cpp`, `csharp`, `go`, `ruby`). Must match the CodeQL database language. One database exists per language per repository.
`query_pack`	`string` (Base64)	Yes	Base64-encoded CodeQL query pack tarball (`.tgz`).
`repositories`	`string[]`	One of	Explicit repository list in `owner/name` format. Mutually exclusive with `repository_owners`.
`repository_owners`	`string[]`	One of	Organization handles. Targets all eligible repositories under each owner. Mutually exclusive with `repositories`.

Bundle, Encode, and Submit

The following example script bundles a CodeQL query pack, Base64-encodes it, constructs the JSON payload, and submits the variant analysis request. The payload is written to a temporary file because the encoded query pack exceeds command-line argument limits. repos.json contains a JSON array of "owner/name" strings. GITHUB_TOKEN must be set as an environment variable. There are multiple ways to achieve this, the purpose of this example is to demonstrate some of the considerations that should be made when calling the create endpoint, particularly when exceeding length limits.

#!/usr/bin/env bash
set -euo pipefail

usage() {
  echo "Usage: $0 -c <controller_repo> -l <language> -p <pack_dir> -r <repos_file>" >&2
  echo "  -c  Controller repository (owner/name)" >&2
  echo "  -l  CodeQL language" >&2
  echo "  -p  Query pack source directory" >&2
  echo "  -r  Path to repos.json" >&2
  exit 1
}

while getopts "c:l:p:r:" opt; do
  case "$opt" in
    c) CONTROLLER_REPO="$OPTARG" ;;
    l) LANGUAGE="$OPTARG" ;;
    p) PACK_DIR="$OPTARG" ;;
    r) REPOS_FILE="$OPTARG" ;;
    *) usage ;;
  esac
done

: "${CONTROLLER_REPO:?Missing -c <controller_repo>}"
: "${LANGUAGE:?Missing -l <language>}"
: "${PACK_DIR:?Missing -p <pack_dir>}"
: "${REPOS_FILE:?Missing -r <repos_file>}"

BUNDLE_OUTPUT="$(basename "$PACK_DIR")-bundle.tar.gz"
ENCODED_OUTPUT="$(basename "$PACK_DIR")-base64.txt"

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
cd "$SCRIPT_DIR"

if ! command -v codeql &>/dev/null; then
  echo "Error: 'codeql' CLI not found in PATH." >&2
  exit 1
fi

# Bundle query pack
echo "==> Installing pack dependencies..."
codeql pack install "$PACK_DIR"

echo "==> Bundling CodeQL pack..."
codeql pack bundle \
  --output="$BUNDLE_OUTPUT" \
  "$PACK_DIR"

echo "==> Base64-encoding bundle..."
base64 -w 0 "$BUNDLE_OUTPUT" > "$ENCODED_OUTPUT"

BYTE_SIZE=$(wc -c < "$BUNDLE_OUTPUT")
ENCODED_SIZE=$(wc -c < "$ENCODED_OUTPUT")
echo "  Bundle:  $BUNDLE_OUTPUT ($BYTE_SIZE bytes)"
echo "  Base64:  $ENCODED_OUTPUT ($ENCODED_SIZE bytes)"

# Submit variant analysis
PAYLOAD=$(mktemp)
trap 'rm -f "$PAYLOAD"' EXIT

jq -n \
  --arg language "$LANGUAGE" \
  --rawfile query_pack "$SCRIPT_DIR/$ENCODED_OUTPUT" \
  --slurpfile repositories "$SCRIPT_DIR/$REPOS_FILE" \
  '{language: $language, query_pack: $query_pack, repositories: $repositories[0]}' \
  > "$PAYLOAD"

echo "==> Submitting variant analysis..."
curl -L \
  -X POST \
  -H "Accept: application/vnd.github+json" \
  -H "Authorization: Bearer ${GITHUB_TOKEN}" \
  -H "X-GitHub-Api-Version: 2026-03-10" \
  -H "Content-Type: application/json" \
  -d @"$PAYLOAD" \
  "https://api.github.com/repos/${CONTROLLER_REPO}/code-scanning/codeql/variant-analyses"

The following demonstrates the full workflow: executing the scripts above with the required flags, bundling and encoding the query pack, and submitting the variant analysis request to the GitHub API. Note the id field in the API response as it is required in subsequent steps of this workflow. See the official documentation for full response details.

MRVA Report Documentation