Kirha Docs - Benchmark

How does querying premium data providers compare to searching the web? To find out, we tested 100 domain-specific queries across three verticals (Company Data, Insurance, Crypto) and measured both quality and token consumption. Responses were summarized using Gemini 2.5 Flash and evaluated by an LLM-as-Judge with extended thinking enabled.

The full results, methodology, and raw data are available at benchmark.kirha.com. The benchmark is fully open source, so you can reproduce it or run your own tests.

Repository: kirha-ai/benchmark

Results

Metric	Kirha	Web Search
Overall score	87 / 100	61 / 100
Total tokens consumed across all tests	233,920	4,604,853

Kirha uses about 95% fewer tokens than web search while scoring 26 points (roughly 43%) higher overall.
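The headline percentages can be re-derived from the table with a few lines of arithmetic. The sketch below is only a sanity check on the published totals; the variable names are illustrative, and the per-query averages assume that all 100 queries were run against both systems, which the table does not state explicitly.

```python
# Totals copied from the results table.
kirha_tokens = 233_920
web_tokens = 4_604_853
kirha_score = 87
web_score = 61

# Assumption: both totals cover the same 100 queries.
num_queries = 100

# Share of tokens saved relative to web search.
token_reduction = 1 - kirha_tokens / web_tokens

# How many times more tokens web search consumed.
token_ratio = web_tokens / kirha_tokens

# Relative improvement of the overall score over web search.
score_gain = kirha_score / web_score - 1

print(f"Token reduction:     {token_reduction:.1%}")  # 94.9%
print(f"Token ratio:         {token_ratio:.1f}x")     # 19.7x
print(f"Relative score gain: {score_gain:.1%}")       # 42.6%
print(f"Avg tokens/query:    Kirha {kirha_tokens / num_queries:,.0f}, "
      f"Web Search {web_tokens / num_queries:,.0f}")
```

Note that the score comparison is relative (87 / 61), not a difference in percentage points; the absolute gap is 26 points.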

Score breakdown

Metric	Kirha	Web Search
Relevance	89	70
Accuracy	87	55
Completeness	81	63
Freshness	94	64
Actionability	86	52
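The page does not say how the overall score is derived from these five metrics. As a rough consistency check, the sketch below assumes an unweighted mean (an assumption, not documented methodology) and also lists the per-metric gaps. Under that assumption the rounded means match the reported overall scores of 87 and 61.

```python
from statistics import mean

# (Kirha, Web Search) scores copied from the breakdown table.
scores = {
    "Relevance": (89, 70),
    "Accuracy": (87, 55),
    "Completeness": (81, 63),
    "Freshness": (94, 64),
    "Actionability": (86, 52),
}

# Assumption: the overall score is an unweighted mean of the five metrics.
kirha_mean = mean([kirha for kirha, _ in scores.values()])
web_mean = mean([web for _, web in scores.values()])

# Point gap per metric, largest first.
gaps = sorted(
    ((metric, kirha - web) for metric, (kirha, web) in scores.items()),
    key=lambda item: item[1],
    reverse=True,
)

print(f"Kirha mean:      {kirha_mean:.1f}")  # 87.4
print(f"Web Search mean: {web_mean:.1f}")    # 60.8
for metric, gap in gaps:
    print(f"{metric:<14} +{gap}")
```

The widest gaps are in Actionability (34 points) and Accuracy (32 points); the narrowest are in Relevance (19) and Completeness (18).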

Why the difference

Web search returns raw HTML pages that the agent has to parse, filter, and often re-query to find the right data. This consumes tokens at every step: fetching pages, extracting content, discarding noise, and sometimes retrying with different queries.

Kirha queries premium data providers directly and returns structured, domain-specific data. The agent gets exactly what it needs in a single call, with no parsing and no noise. Fewer tokens in, better data out.

Run your own comparison

Visit benchmark.kirha.com to explore individual test results, see the prompts used, and read the full methodology.
