Benchmark Controls
Run the full benchmark to compare all profiles under the same prompt, context, and output budget.
Comparison Notes
- Prompt and output token budget stay fixed across all profiles.
- Profiles differ only in synthetic runtime shape: init delay, prefill chunking, decode chunking, and worker mode.
- The best profile is the one with the best combined score of time to first token (TTFT) and decode throughput for the benchmark draft.
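
The notes above do not say how the TTFT and decode throughput figures are combined, so the following is only a minimal sketch of one plausible scoring rule. It assumes each metric is normalized against the best value seen in the run and that the two terms are weighted equally. The names `ProfileResult`, `combined_scores`, `best_profile`, and `ttft_weight`, along with the profile labels and numbers, are hypothetical and used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProfileResult:
    name: str                 # hypothetical profile label
    ttft_ms: float            # time to first token in ms, lower is better
    decode_tok_per_s: float   # decode throughput in tokens/s, higher is better


def combined_scores(results, ttft_weight=0.5):
    """Return {profile name: score in (0, 1]}.

    Assumption: each metric is scaled relative to the best value in the run,
    then blended with a fixed weight. The real benchmark may score differently.
    """
    best_ttft = min(r.ttft_ms for r in results)
    best_decode = max(r.decode_tok_per_s for r in results)
    scores = {}
    for r in results:
        ttft_term = best_ttft / r.ttft_ms                # 1.0 for the fastest first token
        decode_term = r.decode_tok_per_s / best_decode   # 1.0 for the highest throughput
        scores[r.name] = ttft_weight * ttft_term + (1 - ttft_weight) * decode_term
    return scores


def best_profile(results, ttft_weight=0.5):
    """Return the name of the profile with the highest combined score."""
    scores = combined_scores(results, ttft_weight)
    return max(scores, key=scores.get)


# Made-up measurements, taken under the same prompt and output budget.
results = [
    ProfileResult("profile-a", ttft_ms=200.0, decode_tok_per_s=40.0),
    ProfileResult("profile-b", ttft_ms=400.0, decode_tok_per_s=50.0),
]

print(best_profile(results))
```

Normalizing against the best value in the run keeps both terms on the same scale, so neither milliseconds nor tokens per second dominates the sum. Changing `ttft_weight` shifts the preference toward responsiveness or toward sustained throughput.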
Last Comparison Matrix
No benchmark run yet.