
Fallback benchmark comparison to dry-run when secrets are unavailable#18

Merged
sharpninja merged 3 commits into main from copilot/fix-benchmark-pipeline-failure
Mar 21, 2026

Conversation


Copilot AI commented Mar 21, 2026

Description

The benchmark comparison workflow was failing before it could produce artifacts whenever the required benchmark secrets were not configured. This change makes the pipeline fall back gracefully to a dry-run path in that case, while preserving full benchmark execution when the environment is fully configured.

Related Issues

Addresses the benchmark pipeline build failure.

Proposed Changes

  • Workflow mode selection

    • Add a Select benchmark mode step to .github/workflows/benchmark-comparison.yml.
    • Detect whether the benchmark-only secrets are available and set a single workflow argument for downstream steps.
  • Graceful fallback

    • Run scripts/benchmark_smoke.py with --dry-run when benchmark secrets are missing.
    • Keep the existing full benchmark behavior unchanged when all required secrets are present.
  • Artifact/report continuity

    • Ensure the workflow still generates benchmark JSON outputs and the comparison report even in repositories/environments that do not have benchmark credentials configured.
    • Add a short step summary message indicating whether the job ran in full or dry-run mode.
- name: Select benchmark mode
  shell: bash
  run: |
    if [[ -n "${GRAPHRAG_API_KEY:-}" && -n "${GRAPHRAG_API_BASE:-}" && -n "${AZURE_AI_SEARCH_URL_ENDPOINT:-}" && -n "${AZURE_AI_SEARCH_API_KEY:-}" ]]; then
      echo "BENCHMARK_EXTRA_ARG=" >> "$GITHUB_ENV"
    else
      echo "BENCHMARK_EXTRA_ARG=--dry-run" >> "$GITHUB_ENV"
    fi
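Downstream steps can then consume `BENCHMARK_EXTRA_ARG` and record the selected mode in the job step summary. The sketch below is illustrative, not the exact workflow wiring: the function name is hypothetical, and in the workflow this logic would live in the body of a `run:` step (`GITHUB_STEP_SUMMARY` is provided by GitHub Actions):

```shell
# Sketch of a downstream step that reads BENCHMARK_EXTRA_ARG (set by the
# mode-selection step above) and writes the mode to the job step summary.
# Wrapped in a function here only for illustration.
record_benchmark_mode() {
  local mode="full"
  if [[ "${BENCHMARK_EXTRA_ARG:-}" == "--dry-run" ]]; then
    mode="dry-run"
  fi
  # GITHUB_STEP_SUMMARY is a file path provided by GitHub Actions;
  # fall back to stdout when running outside Actions.
  echo "Benchmark comparison ran in ${mode} mode" >> "${GITHUB_STEP_SUMMARY:-/dev/stdout}"
  echo "${mode}"
}
```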

Checklist

  • I have tested these changes locally.
  • I have reviewed the code changes.
  • I have updated the documentation (if necessary).
  • I have added appropriate unit tests (if applicable).

Additional Notes

The failure was not in report generation; the Python benchmark step was exiting during fixture execution because the secret-backed model/search configuration could not be resolved. This change keeps the workflow useful in forks and CI environments where those secrets are intentionally absent.
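The failure mode above can be sketched as a guard that fails fast when secret-backed configuration is unresolved unless a dry run was requested. This is a minimal illustration, not the actual script: the function name and JSON shape are hypothetical, and only one of the four secrets is checked for brevity:

```shell
# Minimal sketch of the failure mode and its dry-run escape hatch.
# run_benchmark_smoke is a hypothetical stand-in for the Python step.
run_benchmark_smoke() {
  local extra_arg="${1:-}"
  # Without secret-backed configuration, a full run cannot proceed.
  if [[ -z "${GRAPHRAG_API_KEY:-}" && "${extra_arg}" != "--dry-run" ]]; then
    echo "error: benchmark secrets missing; pass --dry-run" >&2
    return 1
  fi
  if [[ "${extra_arg}" == "--dry-run" ]]; then
    # Dry-run still emits a JSON payload so the comparison report
    # has something to consume.
    echo '{"mode": "dry-run", "results": []}'
  else
    echo '{"mode": "full"}'
  fi
}
```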



Copilot AI self-assigned this Mar 21, 2026
Copilot AI and others added 2 commits March 21, 2026 19:18
Co-authored-by: sharpninja <16146732+sharpninja@users.noreply.github.com>
Agent-Logs-Url: https://github.com/sharpninja/graphrag/sessions/cd6db3fa-53c3-4678-a5ee-0bc3ee2bdb6f
Copilot AI changed the title [WIP] Fix build failure on benchmark pipeline Fallback benchmark comparison to dry-run when secrets are unavailable Mar 21, 2026
Copilot AI requested a review from sharpninja March 21, 2026 19:22
@sharpninja sharpninja marked this pull request as ready for review March 21, 2026 19:22
Copilot AI review requested due to automatic review settings March 21, 2026 19:22

@sharpninja sharpninja merged commit 7338750 into main Mar 21, 2026
21 of 27 checks passed

Copilot AI left a comment


Pull request overview

Updates the Benchmark Comparison GitHub Actions workflow to automatically fall back to a --dry-run benchmark execution path when required benchmark secrets are not available, ensuring JSON outputs and the comparison report are still produced.

Changes:

  • Adds a “Select benchmark mode” step that detects whether required secret-backed environment variables are present and sets BENCHMARK_EXTRA_ARG accordingly.
  • Passes ${BENCHMARK_EXTRA_ARG} into both Python and .NET benchmark invocations so they can run in full mode or --dry-run.
  • Writes a clear mode indicator into the job step summary (full vs dry-run).
