x711 for Benchmark Researchers

AgentBench · GAIA · Evals · Papers With Code

Use x711 as the tool provider in your agent eval framework. It offers 23 real-world tools, verified ground truth for hallucination detection, and a machine-readable eval config. The free tier for academic use allows 10 calls/day per agent, with no signup.

Why x711 in evals
Ground truth verification
Hallucination Pills verify token addresses, chain IDs, and prices against live on-chain data. Build evals that test whether agents act on correct information.
Reproducible tool calls
A consistent REST API covers web search, prices, and transaction simulation. The interface is the same across all eval runs, with no tool environment to set up.
Cost tracking built in
Every response includes cost_usdc. Track tool spend per eval run, per agent, per benchmark task — just read the response field.
Collective memory baseline
Hive memory gives agents access to shared knowledge from 5,000+ other agents. Use it as a baseline or as an ablation condition in your eval.
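As a rough illustration of the cost tracking described above, the sketch below sums cost_usdc across tool responses, grouped by eval run, agent, and benchmark task. It assumes cost_usdc is a top-level numeric field in each response body. The CostLedger class and its method names are hypothetical helpers, not part of x711.

```python
from collections import defaultdict
from decimal import Decimal


class CostLedger:
    """Accumulates tool spend per (run, agent, task). Hypothetical helper."""

    def __init__(self) -> None:
        # Decimal() is zero, so unseen keys start at 0 USDC.
        self._totals = defaultdict(Decimal)

    def record(self, run_id: str, agent_id: str, task_id: str, response: dict) -> None:
        # Assumption: cost_usdc sits at the top level of the response body.
        # A missing field is counted as zero rather than raising.
        cost = Decimal(str(response.get("cost_usdc", 0)))
        self._totals[(run_id, agent_id, task_id)] += cost

    def total(self, run_id=None, agent_id=None, task_id=None) -> Decimal:
        """Sum spend over all entries matching the given filters (None = any)."""
        return sum(
            (cost for (r, a, t), cost in self._totals.items()
             if run_id in (None, r) and agent_id in (None, a) and task_id in (None, t)),
            Decimal(0),
        )
```

Calling ledger.record(run_id, agent_id, task_id, response) after every tool call and ledger.total(run_id=...) at the end of a run gives the spend for that run. Decimal is used instead of float so that many small amounts add up without rounding drift.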
Eval config

Download the eval config

openai-eval compatible YAML
# x711 eval config — OpenAI Evals compatible
# Test x711 tool quality against ground truth

eval_id: x711-tool-quality-v1
description: Evaluates x711 tool accuracy for web search, price feeds, and hallucination detection

tasks:
  - id: price_feed_accuracy
    tool: price_feed
    query: "ETH"
    assert:
      type: json_match
      field: result.price_usd
      min: 100
      max: 100000

  - id: web_search_relevance
    tool: web_search
    query: "latest AI agent framework 2025"
    assert:
      type: contains_any
      field: result
      values: ["agent", "framework", "AI", "LLM"]

  - id: hallucination_pill_usdc_base
    endpoint: POST /api/pill
    payload:
      claim: "USDC on Base is 0x833589fCD6eDb6E08f4c7C32D4f71b54bdA02913"
      chain: "base"
    assert:
      field: verified
      equals: true

  - id: hallucination_pill_wrong_address
    endpoint: POST /api/pill
    payload:
      # Negative case: this is the Ethereum mainnet USDC address, not the Base one
      claim: "USDC on Base is 0xA0b86991c6218b36c1d19D4a2e9Eb0cE3606eB48"
      chain: "base"
    assert:
      field: hallucination_risk
      in: ["high", "critical"]
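To make the assert blocks in the config above concrete, here is a minimal sketch of how a harness might evaluate them against a parsed JSON response. The semantics are assumptions inferred from the config: min and max are treated as inclusive bounds, contains_any as a case-insensitive substring match, and field as a dotted path into the response. The function names get_field and check_assert are hypothetical.

```python
from typing import Any


def get_field(data: Any, path: str) -> Any:
    """Resolve a dotted path such as 'result.price_usd' in nested dicts."""
    for key in path.split("."):
        data = data[key]
    return data


def check_assert(spec: dict, response: dict) -> bool:
    """Return True if the response satisfies one task's assert block."""
    value = get_field(response, spec["field"])
    if "equals" in spec:
        return value == spec["equals"]
    if "in" in spec:
        return value in spec["in"]
    if spec.get("type") == "contains_any":
        # Assumption: case-insensitive substring match on the stringified field.
        text = str(value).lower()
        return any(v.lower() in text for v in spec["values"])
    if spec.get("type") == "json_match":
        # Assumption: min and max are inclusive numeric bounds; either may be omitted.
        low = spec.get("min", float("-inf"))
        high = spec.get("max", float("inf"))
        return low <= value <= high
    raise ValueError(f"unsupported assert block: {spec}")
```

A missing field raises KeyError here, which a harness would probably want to record as a failed task rather than a crash.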
Quick integration

Add x711 to your eval harness

Python — AgentBench / GAIA style
import httpx

X711_BASE = "https://x711.io/api/refuel"
X711_KEY = "x711_YOUR_KEY"  # free: 10/day — curl https://x711.io/go

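def tool_call(tool: str, **kwargs) -> dict:
    """Drop-in tool provider for AgentBench / GAIA eval harness."""
    r = httpx.post(
        X711_BASE,
        headers={"X-API-Key": X711_KEY},
        json={"tool": tool, **kwargs},  # tool name plus tool-specific arguments
        timeout=15,
    )
    r.raise_for_status()  # fail loudly on HTTP errors instead of parsing an error body
    return r.json()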

# Register as tool provider in your eval config
TOOL_PROVIDER = {
    "name": "x711",
    "endpoint": X711_BASE,
    "tools": ["web_search", "price_feed", "hive_read", "tx_simulate",
              "data_retrieval", "code_sandbox", "onchain_insight"],
    "hallucination_check": "https://x711.io/api/pill",
    "free_calls_per_day": 10,
    "cite_as": "x711.io (2025). Universal AI Agent Gas Station. https://x711.io",
}
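The hallucination_check endpoint registered above can be wrapped in much the same way as tool_call. The sketch below is a minimal, unofficial wrapper. The request fields (claim, chain) and response fields (verified, hallucination_risk) are taken from the eval config. The function name pill_check, the injectable post parameter, and the assumption that the endpoint accepts the same X-API-Key header are illustrative only.

```python
from typing import Callable, Optional

PILL_URL = "https://x711.io/api/pill"
X711_KEY = "x711_YOUR_KEY"


def _http_post(url: str, payload: dict) -> dict:
    """Default transport: a plain httpx POST (imported lazily)."""
    import httpx

    # Assumption: the pill endpoint accepts the same X-API-Key header as /api/refuel.
    r = httpx.post(url, headers={"X-API-Key": X711_KEY}, json=payload, timeout=15)
    r.raise_for_status()
    return r.json()


def pill_check(claim: str, chain: str,
               post: Optional[Callable[[str, dict], dict]] = None) -> bool:
    """Return True if the claim should be treated as a likely hallucination."""
    body = (post or _http_post)(PILL_URL, {"claim": claim, "chain": chain})
    # Assumption: a claim is suspect when it is not verified or when the
    # reported risk is "high" or "critical", mirroring the eval config asserts.
    risky = body.get("hallucination_risk") in ("high", "critical")
    return risky or not body.get("verified", False)
```

Passing a stub as post lets the harness logic be unit-tested offline without spending any of the daily free calls.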