Enable manual benchmark runs and fix benchmark CI follow-ups#17
sharpninja merged 3 commits into main
Conversation
Co-authored-by: sharpninja <16146732+sharpninja@users.noreply.github.com> Agent-Logs-Url: https://github.com/sharpninja/graphrag/sessions/4253a90a-67dd-4ef4-8111-25ea94ea578c
Pull request overview
Enables ad hoc benchmark runs by making the existing Benchmark Comparison GitHub Actions workflow manually triggerable, while preserving the behavior that skips draft pull request runs.
Changes:
- Added a `workflow_dispatch` trigger to allow manual workflow execution from the Actions UI.
- Updated the benchmark job `if:` condition to skip only draft PRs, while allowing push and manual (`workflow_dispatch`) events to run.
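A minimal sketch of what the resulting trigger and gating configuration might look like (job name, branches, and steps are assumptions, not the actual workflow file):

```yaml
on:
  push:
    branches: [main]
  pull_request:
  workflow_dispatch:   # enables manual runs from the Actions UI

jobs:
  benchmark:
    # Skip draft PRs; push and workflow_dispatch events always run.
    if: github.event_name != 'pull_request' || github.event.pull_request.draft == false
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
```

Because the `if:` condition only tests the draft flag when the event is a pull request, push and manually dispatched runs are unaffected by the gate.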
@copilot discover the causes of the pipeline failures, but do not change any code.
Investigated on 69ce79c with no code changes. The failures stem from multiple pre-existing issues; none of them point to the changes in this PR.
@copilot fix these: Benchmark Comparison: `scripts/benchmark_smoke.py:97-100` assumes every `workflow_config` value is a mapping and calls `.get(...)`, but `tests/fixtures/azure/config.json` includes `"skip_assert": true`, so the run crashes with `AttributeError: 'bool' object has no attribute 'get'` before benchmarking starts.
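The failure mode above can be reproduced in isolation. The dict shape below is an assumption based on the description, not the real fixture contents:

```python
# Minimal reproduction of the reported crash: a boolean value mixed in
# with mapping values, iterated by code that assumes everything is a dict.
workflow_config = {
    "skip_assert": True,  # boolean flag, not a mapping
    "create_graph": {"expected_artifacts": ["graph.parquet"]},  # hypothetical step
}

for name, entry in workflow_config.items():
    try:
        entry.get("expected_artifacts", [])  # fails for the boolean entry
    except AttributeError as exc:
        print(f"{name}: {exc}")  # → skip_assert: 'bool' object has no attribute 'get'
```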
Co-authored-by: sharpninja <16146732+sharpninja@users.noreply.github.com> Agent-Logs-Url: https://github.com/sharpninja/graphrag/sessions/ed8e64ed-c4d1-4644-8660-94891a6c9f68
Addressed in b278d2f.
Validated with `python3 -m pytest tests/unit/test_benchmark_smoke.py`.
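The fix as described, skipping non-mapping `workflow_config` entries, could be sketched like this (the function name and artifact keys are hypothetical, not the actual `benchmark_smoke.py` code):

```python
def collect_expected_artifacts(workflow_config):
    """Gather expected artifacts, ignoring non-mapping entries such as skip_assert."""
    artifacts = []
    for entry in workflow_config.values():
        if not isinstance(entry, dict):  # guard: skip booleans and other scalars
            continue
        artifacts.extend(entry.get("expected_artifacts", []))
    return artifacts

# The boolean entry is now ignored instead of raising AttributeError.
config = {"skip_assert": True, "create_graph": {"expected_artifacts": ["graph.parquet"]}}
print(collect_expected_artifacts(config))  # → ['graph.parquet']
```

The `isinstance(entry, dict)` guard preserves behavior for every well-formed entry while tolerating scalar flags in the same fixture.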
Description
This pull request makes the benchmark comparison workflow manually triggerable from GitHub Actions and addresses follow-up CI failures uncovered on the PR. In addition to preserving the existing draft-PR skip behavior, it fixes the benchmark smoke fixture loading logic and resolves the repository spellcheck failures.
Related Issues
Addresses the benchmark pipeline triggerability issue and the follow-up benchmark/spellcheck CI failures found during review.
Proposed Changes
Workflow trigger
- Added `workflow_dispatch` to `.github/workflows/benchmark-comparison.yml` so the benchmark comparison workflow can be launched on demand.

Job gating
- Updated the benchmark job `if:` condition to skip only draft PRs while still allowing push and `workflow_dispatch` events to run.
Benchmark smoke fix
- Updated `scripts/benchmark_smoke.py` so `workflow_config` entries that are not mappings, such as `skip_assert: true`, are ignored when collecting `expected_artifacts`.

Test coverage
- Extended `tests/unit/test_benchmark_smoke.py` for the Azure fixture case with non-mapping `workflow_config` entries.

Spellcheck
- Updated `dictionary.txt` with the repository-specific terms that were causing the existing cspell workflow to fail.

Checklist
Additional Notes
Validated locally with:
- `python3 -m pytest tests/unit/test_benchmark_smoke.py`
- `./scripts/spellcheck.sh`

This PR still includes the original workflow-only change, but now also contains the minimal follow-up fixes required to get the related benchmark and spellcheck CI paths passing.