Local LLM benchmarking
Benchmark local models with repeatable prompt runs
Select models from the default local catalog, define prompts with generation settings, run sequential inference benchmarks, and compare averages and per-run details in interactive charts.
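The same sequential run loop can be reproduced outside the UI. The sketch below is a minimal version, assuming an Ollama-style local endpoint at `http://localhost:11434/api/generate`; the endpoint, model names, and the `options` generation settings are illustrative stand-ins, not this app's actual implementation.

```ts
// Minimal sequential benchmark sketch (assumes an Ollama-style local endpoint).
interface RunResult {
  model: string;
  prompt: string;
  durationMs: number; // end-to-end wall-clock time for the run
  ok: boolean;        // false if the request failed
}

const ENDPOINT = "http://localhost:11434/api/generate"; // hypothetical local endpoint

async function benchmark(
  models: string[],
  prompts: string[],
  runsPerPair = 3,
): Promise<RunResult[]> {
  const results: RunResult[] = [];
  // Strictly sequential: one request at a time, so runs don't contend for the GPU/CPU.
  for (const model of models) {
    for (const prompt of prompts) {
      for (let i = 0; i < runsPerPair; i++) {
        const start = performance.now();
        let ok = true;
        try {
          const res = await fetch(ENDPOINT, {
            method: "POST",
            headers: { "Content-Type": "application/json" },
            // "options" stands in for per-prompt generation settings.
            body: JSON.stringify({ model, prompt, stream: false, options: { temperature: 0 } }),
          });
          if (!res.ok) ok = false;
          else await res.json(); // wait for the full response: end-to-end timing
        } catch {
          ok = false;
        }
        results.push({ model, prompt, durationMs: performance.now() - start, ok });
      }
    }
  }
  return results;
}
```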
4. Results
Aggregated averages per model/prompt. Hover over a bar to inspect each run and the failed-run count.
End-to-end duration per run
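One way to derive the aggregated view from per-run records is to group by model/prompt pair, average the duration over successful runs, and count failures separately. This is a sketch assuming the `RunResult` shape from the earlier snippet, not the app's actual aggregation code:

```ts
// Aggregate per model/prompt: mean duration over successful runs, plus failed-run count.
interface Aggregate {
  model: string;
  prompt: string;
  avgDurationMs: number; // mean end-to-end duration of successful runs only
  runs: number;          // total runs for this model/prompt pair
  failed: number;        // runs excluded from the average
}

function aggregate(results: RunResult[]): Aggregate[] {
  const groups = new Map<string, RunResult[]>();
  for (const r of results) {
    const key = `${r.model}::${r.prompt}`;
    const bucket = groups.get(key);
    if (bucket) bucket.push(r);
    else groups.set(key, [r]);
  }
  return [...groups.values()].map((runs) => {
    const ok = runs.filter((r) => r.ok);
    return {
      model: runs[0].model,
      prompt: runs[0].prompt,
      avgDurationMs: ok.length
        ? ok.reduce((sum, r) => sum + r.durationMs, 0) / ok.length
        : 0,
      runs: runs.length,
      failed: runs.length - ok.length,
    };
  });
}
```

Averaging only successful runs keeps failures from skewing the bars, which is why the failed-run count is surfaced alongside the average.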
No benchmark results yet. Configure models and prompts, then run a benchmark.
Execution log
No events yet.
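An execution log of this kind can be modeled as an append-only list of timestamped events. A minimal sketch follows; the event kinds and usage string are illustrative assumptions, not the app's actual event schema:

```ts
// Append-only execution log with timestamped events (event kinds are illustrative).
type LogEvent = {
  at: string; // ISO 8601 timestamp
  kind: "run-started" | "run-finished" | "run-failed";
  detail: string;
};

const log: LogEvent[] = [];

function logEvent(kind: LogEvent["kind"], detail: string): void {
  log.push({ at: new Date().toISOString(), kind, detail });
}

// Usage (hypothetical): logEvent("run-started", "llama3 x prompt #1, run 2/3");
```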