x711 for Researchers

Papers With Code · AgentBench · GAIA · arXiv

x711 is a production-grade tool API for agent benchmarks. It offers 29 tools, including web search, live crypto prices, 7-chain transaction simulation, pgvector collective memory, and on-chain hallucination verification. A free academic tier and a downloadable eval config are available.

Why x711 in research
Production deployment
x711 is a live production system serving 5,000+ autonomous agents. It is not a mock: latency, costs, and on-chain data are all real.
Ground truth verification
Hallucination Pills verify token addresses, chain IDs, and prices against live blockchain data. Use them as your eval oracle for on-chain reasoning tasks.
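One way to use a verifier as an eval oracle is to score each agent claim against it and report the confirmed fraction. The sketch below is illustrative only: the request and response format of the verification API is not described here, so `verify` is a hypothetical callable that would wrap the real call, and `stub_verify` is a local stand-in with two well-known chain IDs.

```python
from typing import Callable, Iterable

# Hypothetical claim shape, chosen for this example only.
Claim = dict  # e.g. {"kind": "chain_id", "subject": "ethereum", "value": 1}

def oracle_accuracy(claims: Iterable[Claim], verify: Callable[[Claim], bool]) -> float:
    """Return the fraction of agent claims that the oracle confirms."""
    claims = list(claims)
    if not claims:
        return 0.0
    confirmed = sum(1 for claim in claims if verify(claim))
    return confirmed / len(claims)

# Local stub standing in for a live verification call (assumption, not the real API).
KNOWN_CHAIN_IDS = {"ethereum": 1, "polygon": 137}

def stub_verify(claim: Claim) -> bool:
    return KNOWN_CHAIN_IDS.get(claim["subject"]) == claim["value"]

agent_claims = [
    {"kind": "chain_id", "subject": "ethereum", "value": 1},     # correct
    {"kind": "chain_id", "subject": "polygon", "value": 1337},   # hallucinated
]
accuracy = oracle_accuracy(agent_claims, stub_verify)
print(f"oracle-confirmed claims: {accuracy:.0%}")
```

In a real harness, `stub_verify` would be replaced by a function that queries the verification endpoint and maps its response to a boolean.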
Collective memory baseline
The Hive (pgvector) gives agents access to shared knowledge from 5,000+ agents. Use it as a retrieval baseline or as a multi-agent coordination substrate.
Cost-aware evaluation
Every response includes a cost_usdc field, so you can measure tool cost per task, per agent, and per run. This is a real economic signal that most benchmarks ignore.
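As a rough illustration of cost-aware evaluation, the sketch below sums `cost_usdc` per task and per agent from logged tool calls. Only the `cost_usdc` field name comes from the text above; the record layout (`task_id`, `agent_id`, `response`) and the sample values are assumptions made for this example.

```python
from collections import defaultdict
from decimal import Decimal

def summarize_costs(records):
    """Sum cost_usdc per task and per agent from logged tool-call records."""
    per_task = defaultdict(Decimal)
    per_agent = defaultdict(Decimal)
    for rec in records:
        # Convert via str so small USDC amounts add up without float drift.
        cost = Decimal(str(rec["response"].get("cost_usdc", 0)))
        per_task[rec["task_id"]] += cost
        per_agent[rec["agent_id"]] += cost
    return dict(per_task), dict(per_agent)

# Hypothetical log entries; real responses carry more fields than cost_usdc.
records = [
    {"task_id": "t1", "agent_id": "a1", "response": {"cost_usdc": 0.002}},
    {"task_id": "t1", "agent_id": "a1", "response": {"cost_usdc": 0.001}},
    {"task_id": "t2", "agent_id": "a2", "response": {"cost_usdc": 0.005}},
]
per_task, per_agent = summarize_costs(records)
total = sum(per_task.values(), Decimal(0))
print(per_task, per_agent, total)
```

Reporting cost next to accuracy makes it possible to compare agents that reach the same score with very different tool spend.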
Citation

How to cite x711

BibTeX
@misc{x711_2025,
  title   = {x711: Universal AI Agent Gas Station — 24/7 Pay-per-call Tool Infrastructure},
  author  = {x711.io},
  year    = {2025},
  url     = {https://x711.io},
  note    = {Production tool API for autonomous agents. 29 tools including
             web search, live crypto prices (500+ tokens), 7-chain transaction
             simulation, pgvector collective memory, and on-chain hallucination
             verification via the Hallucination Pills primitive.
             MCP endpoint: https://x711.io/mcp}
}
Eval config

Download the eval config

AgentBench / GAIA / OpenAI Evals compatible
# Quick integration — drop into your eval harness
TOOL_PROVIDER = {
    "name": "x711",
    "refuel_endpoint": "https://x711.io/api/refuel",
    "pill_endpoint": "https://x711.io/api/pill",
    "tools": ["web_search","price_feed","hive_read","tx_simulate",
              "data_retrieval","code_sandbox","onchain_insight"],
    "free_calls_per_day": 10,
    "auth_header": "X-API-Key",
}
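A harness might turn this config into an authenticated request along the following lines. This is a minimal sketch under stated assumptions: the JSON payload shape (`tool`, `arguments`) is hypothetical, as is the choice of `refuel_endpoint` as the target for tool calls. Only the endpoint URL and the `X-API-Key` header name come from the config above. The request is built but not sent.

```python
import json
import urllib.request

# Subset of the config above, repeated so the example is self-contained.
TOOL_PROVIDER = {
    "refuel_endpoint": "https://x711.io/api/refuel",
    "auth_header": "X-API-Key",
}

def build_tool_request(provider, api_key, tool, arguments):
    """Build (but do not send) a POST request for one tool call."""
    # Hypothetical payload shape; check the provider docs for the real schema.
    body = json.dumps({"tool": tool, "arguments": arguments}).encode("utf-8")
    headers = {
        provider["auth_header"]: api_key,
        "Content-Type": "application/json",
    }
    return urllib.request.Request(
        provider["refuel_endpoint"], data=body, headers=headers, method="POST"
    )

req = build_tool_request(
    TOOL_PROVIDER, "demo-key", "web_search", {"query": "GAIA benchmark"}
)
print(req.get_method(), req.full_url)
```

To execute the call, pass `req` to `urllib.request.urlopen` with a timeout, and count calls against the daily free quota in your harness.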