# BrowseAI Dev: Full Documentation

> Research infrastructure for AI agents. Real-time web search with evidence-backed citations and confidence scores.

## What Is BrowseAI Dev?

BrowseAI Dev is open-source research infrastructure that gives AI agents real-time web search with evidence-backed citations. It returns structured JSON (claims, sources, confidence, contradictions) that agents can programmatically evaluate, not a chat response. Available as MCP server (npm: browseai-dev, renamed from browse-ai; the old name still works), REST API, and Python SDK (PyPI: browseaidev, renamed from browseai; the old name still works). MIT licensed.

## Why Use BrowseAI Dev?

1. **Structured output**: Every response includes extracted claims with source citations, verification scores, consensus levels, and contradiction flags
2. **Evidence-based confidence**: 7-factor algorithm computed from real signals, not LLM self-assessment
3. **Self-improving**: Domain authority scores improve with usage via Bayesian cold-start smoothing
4. **Multi-surface**: Same capabilities across MCP, REST API, and Python SDK
5. **Research sessions**: Persistent memory across multiple queries for deep research

## Installation

### MCP Server (for Claude, Cursor, Windsurf, etc.)

```json
{
  "mcpServers": {
    "browseai-dev": {
      "command": "npx",
      "args": ["-y", "browseai-dev"]
    }
  }
}
```

### Python SDK

```bash
pip install browseaidev
```

### Framework Integrations

```bash
pip install langchain-browseaidev   # LangChain tools
pip install crewai-browseaidev      # CrewAI tools
pip install llamaindex-browseaidev  # LlamaIndex tools
```

### REST API

```bash
curl -X POST https://browseai.dev/api/browse/answer \
  -H "Content-Type: application/json" \
  -d '{"query": "How do mRNA vaccines work?"}'
```

## API Endpoints

### POST /browse/search

Search the web and return ranked results.
```json
{"query": "quantum computing breakthroughs 2024", "limit": 5}
```

### POST /browse/answer

Full research pipeline: search → fetch → extract → verify → cite → score.

```json
{"query": "How does CRISPR gene editing work?", "depth": "fast"}
```

Set `depth: "thorough"` for auto-retry with a rephrased query when confidence < 60%.

### POST /browse/extract

Extract structured claims from a specific URL.

```json
{"url": "https://example.com/article", "query": "pricing details"}
```

### POST /browse/open

Fetch and parse a web page into clean text.

```json
{"url": "https://example.com/article"}
```

### POST /browse/compare

Compare raw LLM answer vs evidence-backed answer side-by-side.

```json
{"query": "Is nuclear energy safe?"}
```

### POST /browse/feedback

Submit feedback on a result to improve future accuracy.

```json
{"resultId": "abc123", "rating": "good"}
```

Ratings: "good", "bad", "wrong". Optional: `claimIndex` to flag a specific wrong claim.

### Research Sessions

#### POST /session/create

Create a persistent research session.

```json
{"topic": "AI safety research"}
```

#### POST /session/:id/ask

Research within a session. Recalls prior findings before searching.

```json
{"query": "What are the main approaches to AI alignment?"}
```

#### POST /session/:id/recall

Query session knowledge without new web search.

```json
{"query": "What did we learn about RLHF?"}
```

#### POST /session/:id/share

Share a session publicly for other agents to fork.

#### GET /session/:id/knowledge

Export all accumulated claims from a session.

#### POST /session/fork/:shareId

Fork a shared session to continue the research.
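Every endpoint above except the knowledge export is a POST, and each example shows a JSON body. As a minimal sketch, the helper below builds (but does not send) such requests with Python's standard library. The base URL comes from the curl example in the Installation section; `build_request` is a hypothetical helper name, the assumption that every route shares the `/api` prefix is not confirmed by this document, and the `claimIndex` value of 0 is purely illustrative.

```python
import json
import urllib.request

# Base URL taken from the curl example; assuming every route shares it.
BASE_URL = "https://browseai.dev/api"


def build_request(path, payload, headers=None):
    """Build a JSON POST request for an endpoint path such as '/browse/answer'.

    Hypothetical helper: it only constructs the request and never touches
    the network.
    """
    body = json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(BASE_URL + path, data=body, method="POST")
    request.add_header("Content-Type", "application/json")
    for name, value in (headers or {}).items():
        request.add_header(name, value)
    return request


# Thorough research query, mirroring the /browse/answer example.
answer_request = build_request(
    "/browse/answer",
    {"query": "How does CRISPR gene editing work?", "depth": "thorough"},
)

# Flag one specific claim as wrong; the index value here is illustrative only.
feedback_request = build_request(
    "/browse/feedback",
    {"resultId": "abc123", "rating": "wrong", "claimIndex": 0},
)

print(answer_request.get_method(), answer_request.full_url)
```

To actually send a request, pass it to `urllib.request.urlopen` and decode the response body with `json.load`. Keeping construction separate from sending makes the payloads easy to inspect or log first.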
## Response Format

### Answer Response

```json
{
  "answer": "mRNA vaccines work by...",
  "claims": [
    {
      "claim": "mRNA vaccines use lipid nanoparticles for delivery",
      "sources": ["https://nature.com/...", "https://pubmed.ncbi.nlm.nih.gov/..."],
      "verified": true,
      "verificationScore": 0.82,
      "consensusCount": 3,
      "consensusLevel": "strong"
    }
  ],
  "sources": [
    {
      "url": "https://nature.com/...",
      "title": "mRNA Vaccine Technology",
      "domain": "nature.com",
      "quote": "The lipid nanoparticle encapsulates...",
      "verified": true,
      "authority": 0.95
    }
  ],
  "confidence": 0.78,
  "contradictions": [],
  "trace": [
    {"step": "search", "duration_ms": 450},
    {"step": "fetch", "duration_ms": 1200},
    {"step": "extract", "duration_ms": 800},
    {"step": "verify", "duration_ms": 50},
    {"step": "answer", "duration_ms": 600}
  ]
}
```

## Verification Pipeline

1. **Web Search**: Tavily API searches for relevant pages
2. **Page Fetch**: Downloads and parses pages into clean text
3. **Claim Extraction**: Gemini 2.5 Flash extracts structured claims with source attribution
4. **BM25 Verification**: Sentence-level matching verifies each claim against source text
5. **Cross-Source Consensus**: Claims found in multiple sources get higher consensus scores
6. **Contradiction Detection**: Identifies conflicting claims across sources
7. **Domain Authority**: 10,000+ domains scored across 5 tiers with Bayesian dynamic blending
8. **Confidence Score**: 7-factor evidence-based score (not LLM self-assessed)

### Confidence Score Factors

- Source count (15%)
- Domain diversity (10%)
- Claim grounding ratio (10%)
- Citation depth (5%)
- Verification rate (25%)
- Domain authority average (20%)
- Consensus score (15%)
- Contradiction penalty applied when conflicts are detected

### Domain Authority Tiers

- Tier 1 (0.95): Government, academic institutions (.gov, .edu, who.int, nature.com)
- Tier 2 (0.85): Major news, established reference (reuters.com, wikipedia.org, bbc.com)
- Tier 3 (0.70): Quality tech/science publications (arxiv.org, techcrunch.com)
- Tier 4 (0.50): General web, blogs, forums
- Tier 5 (0.30): Content farms, low-quality aggregators

Dynamic scores improve over time using Bayesian cold-start smoothing from real verification data.

## MCP Tools (12 total)

| Tool | Description |
|------|-------------|
| browse_search | Search the web for information |
| browse_open | Fetch and parse a web page |
| browse_extract | Extract structured claims from a URL |
| browse_answer | Full pipeline: search + extract + cite |
| browse_compare | Compare raw LLM vs evidence-backed |
| browse_session_create | Create a research session |
| browse_session_ask | Research within a session |
| browse_session_recall | Query session knowledge |
| browse_session_share | Share a session publicly |
| browse_session_knowledge | Export session claims |
| browse_session_fork | Fork a shared session |
| browse_feedback | Submit result feedback |

## Python SDK

```python
import asyncio

from browseaidev import AsyncBrowseAIDev, BrowseAIDev

client = BrowseAIDev()

# Simple answer
result = client.answer("How does CRISPR work?")
print(f"Confidence: {result.confidence}")
for claim in result.claims:
    print(f"  [{claim.consensus_level}] {claim.claim}")

# Thorough mode
result = client.answer("Latest quantum computing breakthroughs", depth="thorough")

# Research session
session = client.create_session(topic="AI Safety")
r1 = session.ask("What is RLHF?")
r2 = session.ask("How does constitutional AI differ?")
knowledge = session.knowledge()

# Feedback
client.feedback(result_id="abc123", rating="good")


# Async: `await` is only valid inside a coroutine, so wrap the call
async def main():
    async_client = AsyncBrowseAIDev()
    result = await async_client.answer("query")
    return result


asyncio.run(main())
```

## Authentication

Three options:

1. **BYOK (Bring Your Own Keys)**: Pass `X-Tavily-Key` and `X-OpenRouter-Key` headers for unlimited usage
2. **BrowseAI Dev API Key**: Get a `bai_xxx` key from the dashboard; usage is tracked per key
3. **Demo mode**: No auth needed; limited to 5 queries/hour per IP

## Self-Hosting

MIT licensed. Clone the repo and deploy:

```bash
git clone https://github.com/BrowseAI-HQ/BrowseAI-Dev.git
cd BrowseAI-Dev
pnpm install
pnpm dev
```

Required env vars: `SERP_API_KEY` (Tavily), `OPENROUTER_API_KEY` (LLM). Optional: `SUPABASE_URL`, `SUPABASE_SERVICE_ROLE_KEY` (persistence).

## Links

- Website: https://browseai.dev
- Documentation: https://browseai.dev/docs
- Playground: https://browseai.dev/playground
- GitHub: https://github.com/BrowseAI-HQ/BrowseAI-Dev
- npm: https://www.npmjs.com/package/browseai-dev (renamed from browse-ai, old name still works)
- PyPI: https://pypi.org/project/browseaidev/ (renamed from browseai, old name still works)
- Agent Skills: https://github.com/BrowseAI-HQ/browseAIDev_Skills
- Discord: https://discord.gg/ubAuT4YQsT
- License: MIT
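
The seven weights listed under Confidence Score Factors sum to 100%, so one plausible reading is a weighted average of seven signals, each normalized to the range 0 to 1. The sketch below assumes exactly that. This document does not state how each factor is normalized or how large the contradiction penalty is, so the function name, the factor keys, and the flat `contradiction_penalty` of 0.05 are all hypothetical.

```python
# Weights copied from the Confidence Score Factors list; they sum to 1.0.
WEIGHTS = {
    "source_count": 0.15,
    "domain_diversity": 0.10,
    "claim_grounding": 0.10,
    "citation_depth": 0.05,
    "verification_rate": 0.25,
    "domain_authority": 0.20,
    "consensus": 0.15,
}


def confidence_score(factors, contradictions=0, contradiction_penalty=0.05):
    """Weighted average of factors in [0, 1], minus a penalty per contradiction.

    Assumption: a flat, hypothetical penalty per detected conflict; the real
    pipeline may apply a different rule.
    """
    missing = set(WEIGHTS) - set(factors)
    if missing:
        raise ValueError(f"missing factors: {sorted(missing)}")
    base = sum(WEIGHTS[name] * factors[name] for name in WEIGHTS)
    penalized = base - contradiction_penalty * contradictions
    # Clamp so the result stays a valid confidence between 0 and 1.
    return max(0.0, min(1.0, penalized))


# Illustrative inputs only; these are not values produced by the service.
example = {
    "source_count": 0.8,
    "domain_diversity": 0.6,
    "claim_grounding": 0.9,
    "citation_depth": 0.5,
    "verification_rate": 0.75,
    "domain_authority": 0.85,
    "consensus": 0.7,
}
print(round(confidence_score(example), 4))
print(round(confidence_score(example, contradictions=2), 4))
```

Because verification rate and domain authority carry 45% of the weight between them, a response with many sources but weak verification still scores low under this reading.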