Benchmark

How does querying premium data providers compare to searching the web? To find out, we tested 100 domain-specific queries across three verticals (Company Data, Insurance, Crypto) and measured both quality and token consumption. Responses were summarized using Gemini 2.5 Flash and evaluated by an LLM-as-Judge with extended thinking enabled.

The full results, methodology, and raw data are available at benchmark.kirha.com. The benchmark is fully open source so you can reproduce it or run your own tests.

GitHub: kirha-ai/benchmark

Results

| | Kirha | Web Search |
| --- | --- | --- |
| Overall score | 87 / 100 | 61 / 100 |
| Total tokens consumed across all tests | 233,920 | 4,604,853 |

Kirha uses about 95% fewer tokens (233,920 vs. 4,604,853) while scoring 42% higher overall in relative terms (87 vs. 61, a 26-point gap).
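The headline percentages can be re-derived from the totals reported above. The following is a minimal sketch, not part of the benchmark code: the variable names are ours, and the per-query average assumes that all 100 queries contribute to the token totals, which the summary implies but does not state outright.

```python
# Figures copied from the results reported above.
kirha_tokens = 233_920
web_tokens = 4_604_853
kirha_score = 87
web_score = 61
num_queries = 100  # assumption: every query is included in the token totals

# Fraction of tokens saved relative to web search.
token_reduction = 1 - kirha_tokens / web_tokens

# Relative (not absolute) improvement in overall score.
score_uplift = kirha_score / web_score - 1

# How many times more tokens web search consumed.
token_ratio = web_tokens / kirha_tokens

print(f"Token reduction: {token_reduction:.1%}")
print(f"Relative score uplift: {score_uplift:.1%}")
print(f"Web search used {token_ratio:.1f}x the tokens")
print(
    "Average tokens per query: "
    f"Kirha {kirha_tokens / num_queries:,.0f}, "
    f"web search {web_tokens / num_queries:,.0f}"
)
```

Note that the score improvement is a ratio of the two overall scores; in absolute terms the gap is 26 points on a 100-point scale.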

Score breakdown

| Metric | Kirha | Web Search |
| --- | --- | --- |
| Relevance | 89 | 70 |
| Accuracy | 87 | 55 |
| Completeness | 81 | 63 |
| Freshness | 94 | 64 |
| Actionability | 86 | 52 |
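This summary does not say how the five metrics are combined into the overall score. As a sanity check, the sketch below assumes an unweighted mean (our assumption, not a documented aggregation rule) and also ranks the metrics by the gap between the two approaches. The dictionary layout and variable names are illustrative.

```python
from statistics import mean

# Per-metric scores from the breakdown above: (Kirha, Web Search).
scores = {
    "Relevance": (89, 70),
    "Accuracy": (87, 55),
    "Completeness": (81, 63),
    "Freshness": (94, 64),
    "Actionability": (86, 52),
}

# Assumption: the overall score is an unweighted mean of the five metrics.
kirha_mean = mean([kirha for kirha, _ in scores.values()])
web_mean = mean([web for _, web in scores.values()])

# Point gap per metric, largest first.
gaps = sorted(
    ((name, kirha - web) for name, (kirha, web) in scores.items()),
    key=lambda item: item[1],
    reverse=True,
)

print(f"Mean of metrics: Kirha {kirha_mean:.1f}, web search {web_mean:.1f}")
for name, gap in gaps:
    print(f"{name}: +{gap} points")
```

Under that assumption the means round to the reported overall scores of 87 and 61, so a simple average is at least consistent with the published numbers. The full methodology at benchmark.kirha.com is the authoritative source for how scores are actually aggregated.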

Why the difference

Web search returns raw HTML pages that the agent has to parse, filter, and often re-query to find the right data. This consumes tokens at every step: fetching pages, extracting content, discarding noise, and sometimes retrying with different queries.

Kirha queries premium data providers directly and returns structured, domain-specific data. The agent gets exactly what it needs in a single call, with no parsing and no noise. Fewer tokens in, better data out.
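To make the contrast between raw HTML and structured data concrete, here is a toy comparison. Everything in it is hypothetical: the payloads are invented for illustration, and `rough_token_count` is a crude regex-based estimate rather than a real model tokenizer. It is not how the benchmark measures tokens.

```python
import json
import re


def rough_token_count(text: str) -> int:
    """Crude estimate: count word runs and individual punctuation marks.

    Real tokenizers split text differently; this is only meant to show
    the relative size of two payloads, not exact token usage.
    """
    return len(re.findall(r"\w+|[^\w\s]", text))


# Hypothetical web page: the useful facts are buried in markup and page chrome.
raw_html = """<html><head><title>Acme Corp - Company profile</title>
<script>window.analytics = {id: "x1"};</script></head>
<body><nav><a href="/">Home</a><a href="/pricing">Pricing</a></nav>
<div class="profile"><h1>Acme Corp</h1><p>Founded: <span>1999</span></p>
<p>Employees: <span>250</span></p></div>
<footer>Cookie settings | Terms | Privacy</footer></body></html>"""

# Hypothetical structured response carrying the same facts.
structured = json.dumps(
    {"company": "Acme Corp", "founded": 1999, "employees": 250}
)

html_tokens = rough_token_count(raw_html)
json_tokens = rough_token_count(structured)

print(f"Raw HTML estimate: {html_tokens} tokens")
print(f"Structured estimate: {json_tokens} tokens")
```

Even in this small example the markup, scripts, and navigation outweigh the three facts the agent actually needs. With web search, that overhead is paid again for every page fetched and every retried query.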

Run your own comparison

Visit benchmark.kirha.com to explore individual test results, see the prompts used, and read the full methodology.
