feat(ai): LLM cost tracking and AI span inspector#3213
🦋 Changeset detected
Latest commit: 463f972
The changes in this PR will be included in the next version bump. This PR includes changesets to release 29 packages.
Walkthrough
Adds end-to-end LLM cost tracking and UI. Introduces an internal llm-pricing package (types, registry, default prices, seeding, and tests), Prisma schema and migration for LLM pricing tables, and a ClickHouse llm_usage_v1 table plus insert helpers. Adds a llmPricingRegistry singleton, pricing enrichment that writes trigger.llm.* attributes and a side-channel _llmUsage, OTLP exporter changes (array handling and runTags), dual-write to ClickHouse, admin APIs/UIs for model management, and multiple React components/utilities to parse and display AI span data.
Estimated code review effort
🎯 5 (Critical) | ⏱️ ~120 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
e99ab5b to
4c0b521
Compare
🧹 Nitpick comments (1)
apps/webapp/test/otlpExporter.test.ts (1)
436-438: Avoid `undefined as any` for registry cleanup.
The type assertion bypasses type safety. Consider exposing a dedicated reset/unload function from the module (e.g., `resetLlmPricingRegistry()`) or accepting `undefined` in the function signature if it's a valid state.
♻️ Suggested approach
Option 1 - Accept undefined in the function signature:
    // In enrichCreatableEvents.server.ts
    export function setLlmPricingRegistry(registry: LlmPricingRegistry | undefined): void
Option 2 - Add a dedicated reset function:
    // In enrichCreatableEvents.server.ts
    export function resetLlmPricingRegistry(): void
    // In test
    afterEach(() => { resetLlmPricingRegistry(); });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/webapp/test/otlpExporter.test.ts` around lines 436 - 438, The test currently calls setLlmPricingRegistry(undefined as any) which bypasses type safety; update the module (enrichCreatableEvents.server.ts) to either allow undefined in the setter signature (export function setLlmPricingRegistry(registry: LlmPricingRegistry | undefined): void) or add a dedicated reset function (export function resetLlmPricingRegistry(): void), then change the test afterEach to call the new resetLlmPricingRegistry() or call setLlmPricingRegistry(undefined) with the adjusted type so the cleanup is type-safe and no longer uses undefined as any.
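For illustration, both options from this comment can live in one module. This is only a minimal sketch: the `LlmPricingRegistry` shape and the `getLlmPricingRegistry()` accessor are hypothetical stand-ins, not the PR's actual code, and the `export` keywords are omitted so the snippet runs on its own.

```typescript
// Hypothetical minimal shape; the real LlmPricingRegistry type is defined in the PR.
type LlmPricingRegistry = {
  isLoaded: boolean;
};

// Module-level reference. `undefined` means no registry is configured.
let registry: LlmPricingRegistry | undefined;

// Option 1: the setter accepts undefined as a valid state.
function setLlmPricingRegistry(next: LlmPricingRegistry | undefined): void {
  registry = next;
}

// Option 2: a dedicated reset function for test cleanup.
function resetLlmPricingRegistry(): void {
  registry = undefined;
}

// Hypothetical accessor, included only so the state can be observed here.
function getLlmPricingRegistry(): LlmPricingRegistry | undefined {
  return registry;
}
```

With either option, the test's `afterEach` can clear the registry without the `undefined as any` cast.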
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: d7a96f6e-7769-4037-b7bf-7ff0bcdfef7c
📒 Files selected for processing (2)
- apps/webapp/test/otlpExporter.test.ts
- internal-packages/llm-pricing/package.json
🚧 Files skipped from review as they are similar to previous changes (1)
- internal-packages/llm-pricing/package.json
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (27)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (8, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 8)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
- GitHub Check: sdk-compat / Bun Runtime
- GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
- GitHub Check: sdk-compat / Node.js 20.20 (ubuntu-latest)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
- GitHub Check: sdk-compat / Cloudflare Workers
- GitHub Check: sdk-compat / Node.js 22.12 (ubuntu-latest)
- GitHub Check: sdk-compat / Deno Runtime
- GitHub Check: typecheck / typecheck
- GitHub Check: Analyze (javascript-typescript)
🧰 Additional context used
📓 Path-based instructions (14)
**/*.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.{ts,tsx}: Use types over interfaces for TypeScript
Avoid using enums; prefer string unions or const objects instead
**/*.{ts,tsx}: Use task export syntax: export const myTask = task({ id: 'my-task', run: async (payload) => { ... } })
Use Run Engine 2.0 (@internal/run-engine) and redis-worker for all new work - avoid DEPRECATED zodworker (Graphile-worker wrapper)
Prisma 6.14.0 client and schema use PostgreSQL in internal-packages/database - import only from Prisma client
Files:
apps/webapp/test/otlpExporter.test.ts
{packages/core,apps/webapp}/**/*.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use zod for validation in packages/core and apps/webapp
Files:
apps/webapp/test/otlpExporter.test.ts
**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use function declarations instead of default exports
Files:
apps/webapp/test/otlpExporter.test.ts
**/*.{test,spec}.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use vitest for all tests in the Trigger.dev repository
Files:
apps/webapp/test/otlpExporter.test.ts
apps/webapp/**/*.test.{ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)
Test files should only import classes and functions from `app/**/*.ts` files and should not import `env.server.ts` directly or indirectly; pass configuration through options instead
In test files, do not import `env.server.ts` directly; pass configuration as constructor arguments or options instead for testable code
Files:
apps/webapp/test/otlpExporter.test.ts
apps/webapp/**/*.{ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)
apps/webapp/**/*.{ts,tsx}: When importing from `@trigger.dev/core` in the webapp, use subpath exports from the package.json instead of importing from the root path
Follow the Remix 2.1.0 and Express server conventions when updating the main trigger.dev webapp
Files:
apps/webapp/test/otlpExporter.test.ts
**/*.ts
📄 CodeRabbit inference engine (.cursor/rules/otel-metrics.mdc)
**/*.ts: When creating or editing OTEL metrics (counters, histograms, gauges), ensure metric attributes have low cardinality by using only enums, booleans, bounded error codes, or bounded shard IDs
Do not use high-cardinality attributes in OTEL metrics such as UUIDs/IDs (envId, userId, runId, projectId, organizationId), unbounded integers (itemCount, batchSize, retryCount), timestamps (createdAt, startTime), or free-form strings (errorMessage, taskName, queueName)
When exporting OTEL metrics via OTLP to Prometheus, be aware that the exporter automatically adds unit suffixes to metric names (e.g., 'my_duration_ms' becomes 'my_duration_ms_milliseconds', 'my_counter' becomes 'my_counter_total'). Account for these transformations when writing Grafana dashboards or Prometheus queries
Files:
apps/webapp/test/otlpExporter.test.ts
**/*.{js,ts,jsx,tsx,json,md,yaml,yml}
📄 CodeRabbit inference engine (AGENTS.md)
Format code using Prettier before committing
Files:
apps/webapp/test/otlpExporter.test.ts
**/*.test.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (AGENTS.md)
**/*.test.{ts,tsx,js,jsx}: Test files should live beside the files under test and use descriptive `describe` and `it` blocks
Tests should avoid mocks or stubs and use the helpers from `@internal/testcontainers` when Redis or Postgres are needed
Use vitest for running unit tests
Files:
apps/webapp/test/otlpExporter.test.ts
**/*.test.{ts,tsx,js}
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.test.{ts,tsx,js}: Use vitest exclusively for testing - never mock anything, use testcontainers instead
Place test files next to source files with naming convention: SourceFile.ts -> SourceFile.test.ts
Files:
apps/webapp/test/otlpExporter.test.ts
**/*.test.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
Use testcontainers for Redis/PostgreSQL testing instead of mocks with redisTest, postgresTest, or containerTest helpers from `@internal/testcontainers`
Files:
apps/webapp/test/otlpExporter.test.ts
apps/{webapp,supervisor}/**/*
📄 CodeRabbit inference engine (CLAUDE.md)
When modifying only server components (apps/webapp/, apps/supervisor/) with no package changes, add a .server-changes/ file instead of a changeset
Files:
apps/webapp/test/otlpExporter.test.ts
**/*.{ts,tsx,js}
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.{ts,tsx,js}: Always import from `@trigger.dev/sdk` for Trigger.dev tasks - never use `@trigger.dev/sdk/v3` or deprecated client.defineJob
Import subpaths only from `@trigger.dev/core`, never import from root
Add crumbs as you write code using `// @crumbs` comments or `// #region @crumbs` blocks for agentcrumbs debug tracing
Files:
apps/webapp/test/otlpExporter.test.ts
apps/webapp/**/*.{ts,tsx,jsx,js}
📄 CodeRabbit inference engine (CLAUDE.md)
Remix 2.1.0 is used in apps/webapp for the main API, dashboard, and orchestration with Express server
Files:
apps/webapp/test/otlpExporter.test.ts
🧠 Learnings (11)
📓 Common learnings
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3213
File: apps/webapp/app/components/runs/v3/ai/extractAISpanData.ts:52-52
Timestamp: 2026-03-13T13:42:56.298Z
Learning: In the trigger.dev codebase (PR `#3213`), `extractAISpanData.ts` (`apps/webapp/app/components/runs/v3/ai/extractAISpanData.ts`) is a read-side UI helper that reads already-enriched `trigger.llm.*` span attributes for display. The actual LLM cost computation and gateway/OpenRouter cost fallback logic lives in `enrichCreatableEvents.server.ts` (`apps/webapp/app/v3/utils/enrichCreatableEvents.server.ts`) via `extractProviderCost()`. The `gatewayCost` parsed in `extractAISpanData` is for UI display purposes only, not for cost calculation.
📚 Learning: 2026-03-02T12:42:56.114Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: apps/webapp/CLAUDE.md:0-0
Timestamp: 2026-03-02T12:42:56.114Z
Learning: Applies to apps/webapp/**/*.test.{ts,tsx} : In test files, do not import `env.server.ts` directly; pass configuration as constructor arguments or options instead for testable code
Applied to files:
apps/webapp/test/otlpExporter.test.ts
📚 Learning: 2026-03-03T13:07:33.177Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3166
File: internal-packages/run-engine/src/batch-queue/tests/index.test.ts:711-713
Timestamp: 2026-03-03T13:07:33.177Z
Learning: In `internal-packages/run-engine/src/batch-queue/tests/index.test.ts`, test assertions for rate limiter stubs can use `toBeGreaterThanOrEqual` rather than exact equality (`toBe`) because the consumer loop may call the rate limiter during empty pops in addition to actual item processing, and this over-calling is acceptable in integration tests.
Applied to files:
apps/webapp/test/otlpExporter.test.ts
📚 Learning: 2025-11-27T16:26:58.661Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/webapp.mdc:0-0
Timestamp: 2025-11-27T16:26:58.661Z
Learning: Applies to apps/webapp/**/*.test.{ts,tsx} : Test files should only import classes and functions from `app/**/*.ts` files and should not import `env.server.ts` directly or indirectly; pass configuration through options instead
Applied to files:
apps/webapp/test/otlpExporter.test.ts
📚 Learning: 2026-01-15T10:48:02.687Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-15T10:48:02.687Z
Learning: Applies to **/*.test.{ts,tsx,js,jsx} : Use vitest for running unit tests
Applied to files:
apps/webapp/test/otlpExporter.test.ts
📚 Learning: 2025-11-27T16:26:37.432Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-27T16:26:37.432Z
Learning: Applies to **/*.{test,spec}.{ts,tsx} : Use vitest for all tests in the Trigger.dev repository
Applied to files:
apps/webapp/test/otlpExporter.test.ts
📚 Learning: 2026-03-13T13:37:49.544Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T13:37:49.544Z
Learning: Applies to **/*.test.{ts,tsx,js} : Use vitest exclusively for testing - never mock anything, use testcontainers instead
Applied to files:
apps/webapp/test/otlpExporter.test.ts
📚 Learning: 2026-01-15T10:48:02.687Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-15T10:48:02.687Z
Learning: Applies to **/*.test.{ts,tsx,js,jsx} : Test files should live beside the files under test and use descriptive `describe` and `it` blocks
Applied to files:
apps/webapp/test/otlpExporter.test.ts
📚 Learning: 2026-01-15T10:48:02.687Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-15T10:48:02.687Z
Learning: Applies to **/*.test.{ts,tsx,js,jsx} : Tests should avoid mocks or stubs and use the helpers from `internal/testcontainers` when Redis or Postgres are needed
Applied to files:
apps/webapp/test/otlpExporter.test.ts
📚 Learning: 2026-03-02T12:43:25.254Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: internal-packages/run-engine/CLAUDE.md:0-0
Timestamp: 2026-03-02T12:43:25.254Z
Learning: Applies to internal-packages/run-engine/src/engine/tests/**/*.test.ts : Implement tests for RunEngine in `src/engine/tests/` using testcontainers for Redis and PostgreSQL containerization
Applied to files:
apps/webapp/test/otlpExporter.test.ts
📚 Learning: 2026-03-13T13:42:56.298Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3213
File: apps/webapp/app/components/runs/v3/ai/extractAISpanData.ts:52-52
Timestamp: 2026-03-13T13:42:56.298Z
Learning: In the trigger.dev codebase (PR `#3213`), `extractAISpanData.ts` (`apps/webapp/app/components/runs/v3/ai/extractAISpanData.ts`) is a read-side UI helper that reads already-enriched `trigger.llm.*` span attributes for display. The actual LLM cost computation and gateway/OpenRouter cost fallback logic lives in `enrichCreatableEvents.server.ts` (`apps/webapp/app/v3/utils/enrichCreatableEvents.server.ts`) via `extractProviderCost()`. The `gatewayCost` parsed in `extractAISpanData` is for UI display purposes only, not for cost calculation.
Applied to files:
apps/webapp/test/otlpExporter.test.ts
🔇 Additional comments (3)
apps/webapp/test/otlpExporter.test.ts (3)
1-12: LGTM!
Imports are appropriate: vitest lifecycle hooks for test isolation, server module for the function under test, and proper subpath import from `@trigger.dev/core/v3`.
440-474: LGTM!
The `makeGenAiEvent` helper is well-structured with sensible defaults and a clean override pattern for test flexibility.
476-641: Excellent test coverage for LLM cost enrichment.
The test suite comprehensively covers:
- Cost calculation and attribute enrichment
- Accessory pills rendering
- Side-channel `_llmUsage` for dual-write
- Edge cases (partial spans, missing models, zero tokens, unknown models)
- Token attribute normalization (prompt_tokens → input_tokens alias)
- Registry state handling
The use of `toBeCloseTo` for floating-point assertions is appropriate.
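The token attribute normalization mentioned above could look roughly like the following. It is a sketch under assumptions, not the code under test: the helper names `firstNumber` and `normalizeTokenUsage` are hypothetical, and the attribute keys assume the `gen_ai.usage.*` prefix with `prompt_tokens`/`completion_tokens` as fallbacks for `input_tokens`/`output_tokens`.

```typescript
type SpanAttributes = Record<string, unknown>;

type TokenUsage = { inputTokens: number; outputTokens: number };

// Return the first attribute in `keys` that holds a finite number.
function firstNumber(attrs: SpanAttributes, keys: string[]): number | undefined {
  for (const key of keys) {
    const value = attrs[key];
    if (typeof value === "number" && Number.isFinite(value)) return value;
  }
  return undefined;
}

// Hypothetical helper: prefer the current attribute names, fall back to the aliases.
function normalizeTokenUsage(attrs: SpanAttributes): TokenUsage | undefined {
  const inputTokens = firstNumber(attrs, [
    "gen_ai.usage.input_tokens",
    "gen_ai.usage.prompt_tokens",
  ]);
  const outputTokens = firstNumber(attrs, [
    "gen_ai.usage.output_tokens",
    "gen_ai.usage.completion_tokens",
  ]);
  // No usage attributes at all: nothing to enrich.
  if (inputTokens === undefined && outputTokens === undefined) return undefined;
  return { inputTokens: inputTokens ?? 0, outputTokens: outputTokens ?? 0 };
}
```

Listing the preferred key first means a span carrying both names resolves to the non-alias value.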
Calculates costs from gen_ai.* span attributes using an in-memory pricing registry backed by Postgres, with model prices synced from Langfuse (145 models). Costs are dual-written to span attributes (trigger.llm.*) and a new llm_usage_v1 ClickHouse table for efficient aggregation.
- New @internal/llm-pricing package with ModelPricingRegistry
- Prisma schema for llm_models, llm_pricing_tiers, llm_prices
- ClickHouse llm_usage_v1 table with DynamicFlushScheduler batching
- Cost enrichment in enrichCreatableEvents() with gen_ai.usage.* extraction
- TRQL llm_usage table schema for querying
- Admin API endpoints for model CRUD, seed, and registry reload
- Pill-style accessories on spans showing model, tokens, and cost
- Anthropic logo icon for RunIcon
- Style merge fix for partial/completed span deduplication
- Env vars: LLM_COST_TRACKING_ENABLED, LLM_PRICING_RELOAD_INTERVAL_MS
refs TRI-7773
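As a rough illustration of the cost calculation described in this commit, the arithmetic reduces to multiplying token counts by per-token prices. The types, the `calculateCost` name, and the assumption that prices are stored in USD per single token are all hypothetical; the actual registry may also apply pricing tiers and further usage types.

```typescript
// Hypothetical price shape: USD per single token.
type ModelPrices = { input: number; output: number };
type TokenUsage = { inputTokens: number; outputTokens: number };
type CostBreakdown = { inputCost: number; outputCost: number; totalCost: number };

// Multiply each token count by its per-token price and sum the parts.
function calculateCost(usage: TokenUsage, prices: ModelPrices): CostBreakdown {
  const inputCost = usage.inputTokens * prices.input;
  const outputCost = usage.outputTokens * prices.output;
  return { inputCost, outputCost, totalCost: inputCost + outputCost };
}

// Made-up prices for illustration only, not taken from any real model.
const examplePrices: ModelPrices = { input: 0.000003, output: 0.000015 };
const exampleCost = calculateCost({ inputTokens: 1000, outputTokens: 500 }, examplePrices);
console.log(exampleCost.totalCost.toFixed(4)); // prints "0.0105"
```

Because the products are floating-point values, comparisons in tests are better done with a tolerance than with strict equality.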
…d-on-startup
- Add friendly_id column to llm_models (llm_model_xxx format)
- Use friendlyId as matchedModelId in all external surfaces
- Add durationNs render type to TSQLResultsTable and QueryResultsChart
- Add 4 example queries for llm_usage in query editor
- Add LLM_PRICING_SEED_ON_STARTUP env var for local bootstrapping
- Update admin API and seed to generate friendlyId
refs TRI-7773
Adds a model admin dashboard for testing model strings, adding and editing models, and viewing missing models so they can be added easily. Also extracts cost data from AI gateway provider response metadata and improves enrichment.
5cf618a to
622bee1
Compare
- … `llm_usage_v1` table for analytics
- … `providerMetadata` when registry pricing is unavailable
- … `mistral/mistral-large-3` matches `mistral-large-3` pricing)
- … `pnpm run db:seed:ai-spans`) with 51 spans across 12 provider systems for local dev testing
- … `completionTokens`/`promptTokens` aliases, `ai.response.object` display for generateObject, cache read/write token breakdown
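The prefix-stripped matching mentioned above (`mistral/mistral-large-3` matching `mistral-large-3` pricing) might be sketched as follows. The `PricedModel` type and `matchModel` function are hypothetical, and the sketch assumes plain string equality, whereas the real registry may match on patterns.

```typescript
// Hypothetical entry shape; the real registry stores more than a name.
type PricedModel = { modelName: string };

// Try an exact match first, then retry with the provider prefix removed.
function matchModel(responseModel: string, models: PricedModel[]): PricedModel | undefined {
  const exact = models.find((m) => m.modelName === responseModel);
  if (exact) return exact;

  // "mistral/mistral-large-3" -> "mistral-large-3"
  const slash = responseModel.indexOf("/");
  if (slash === -1) return undefined;
  const stripped = responseModel.slice(slash + 1);
  return models.find((m) => m.modelName === stripped);
}

const pricedModels: PricedModel[] = [{ modelName: "mistral-large-3" }];
const matched = matchModel("mistral/mistral-large-3", pricedModels);
console.log(matched?.modelName); // prints "mistral-large-3"
```

Trying the exact name before stripping keeps an explicitly priced gateway-prefixed entry from being shadowed by the bare model name.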