>cogtrix v0.3.0

Web & Search

Cogtrix exposes a single canonical research tool: web_search. It fans a query out to multiple providers (DuckDuckGo always; Tavily, Exa, Brave, Google, SerpAPI, and SearXNG when their API keys are configured), fetches the top results, extracts page content with trafilatura, and returns structured Markdown: sources with ①②③… citation indices, an optional synthesis section, explicitly flagged disagreements between sources, and a coverage block reporting per-provider and per-fetch outcomes.

Architecture and design rationale: ADR-0056 (held in the private documentation submodule).
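The fan-out and dedup stages described above can be sketched as follows. This is a minimal illustration, not the Cogtrix implementation: the `fan_out` and `normalize_url` names, the provider-callable shape, and the normalisation rules (scheme and fragment ignored, host lowercased, trailing slash dropped) are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from urllib.parse import urlsplit


def normalize_url(url: str) -> str:
    """Build a comparison key: lowercase host plus path and query.

    Scheme and fragment are ignored and a trailing slash is dropped
    (illustrative rules, not the tool's actual dedup policy).
    """
    parts = urlsplit(url)
    key = f"{parts.netloc.lower()}{parts.path.rstrip('/')}"
    if parts.query:
        key += f"?{parts.query}"
    return key


def fan_out(query, providers):
    """Query every provider concurrently; return (distinct_hits, coverage).

    `providers` maps a provider name to a callable taking the query and
    returning a list of {"url": ...} dicts (hypothetical interface).
    """
    coverage = {}
    raw = []
    with ThreadPoolExecutor(max_workers=max(1, len(providers))) as pool:
        futures = {pool.submit(fn, query): name for name, fn in providers.items()}
        for future in as_completed(futures):
            name = futures[future]
            try:
                hits = future.result()
            except Exception as exc:
                # One failing provider must not sink the whole search.
                coverage[name] = f"failed: {exc}"
                continue
            coverage[name] = f"ok ({len(hits)} results)"
            raw.extend(hits)

    # Keep the first hit seen for each normalised URL.
    seen = set()
    distinct = []
    for hit in raw:
        key = normalize_url(hit["url"])
        if key not in seen:
            seen.add(key)
            distinct.append(hit)
    return distinct, coverage
```

The `coverage` dict plays the role of the per-provider part of the coverage block: a provider that raises is recorded rather than propagated, so the remaining providers still contribute results.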

web_search

Universal web research tool: multi-provider fan-out, fetch, extract, format.

Parameters:

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| query | string | Yes | | The research query. |
| depth | int | No | 3 | Top-K sources to fetch and extract (1–10). Higher values give more breadth at the cost of longer wall time. Set depth explicitly (5–10) for deep research. |
| region | string | No | "wt-wt" | Region hint for providers that accept one (e.g. DDG). |
| compact | bool | No | false | When true, drop per-source extracts and the Additional Sources tail (~5 KB vs ~18 KB output). |

Note on the depth default: the historical lxml GIL bottleneck that motivated lowering the default from 6 was removed in PR #1716. Extraction now runs in a ProcessPoolExecutor, so pages are parsed in true parallel; the in-process _LXML_LOCK is retained as an unused export for back-compat with callers that still import it. The default of 3 is now a latency choice, not a serialisation workaround.
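As an illustration of the parameter contract above, the following sketch validates arguments before a call is issued. The `WebSearchArgs` class is hypothetical and not part of Cogtrix; the defaults and the 1–10 range come from the parameter list, while rejecting an empty query is an assumption of this example.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class WebSearchArgs:
    """Hypothetical container mirroring the documented web_search parameters."""

    query: str
    depth: int = 3          # documented default
    region: str = "wt-wt"   # documented default
    compact: bool = False   # documented default

    def __post_init__(self):
        # Assumption: an empty query is treated as invalid.
        if not self.query.strip():
            raise ValueError("query must be a non-empty string")
        # Documented range for depth is 1-10.
        if not 1 <= self.depth <= 10:
            raise ValueError("depth must be between 1 and 10")

    def to_payload(self) -> dict:
        """Return the arguments as a plain dict, ready to serialise."""
        return asdict(self)


# A deep-research call trades latency for breadth by raising depth.
deep = WebSearchArgs(query="rust async runtimes compared", depth=8)
```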

Returns: Markdown blob with sections (in order):

  • # Research: <query> header.
  • ## Key findings — synthesised cross-source facts with [①②③…] citations. Stage 5 synthesis runs in-tool by default; the section is omitted only when synthesis is explicitly disabled or its deadline (10 s) expires.
  • ## Disagreements — emitted when sources state directly contradictory facts.
  • ## Gaps — aspects of the query the search couldn’t answer.
  • ## Sources — flat index of cited URLs with domain class + recency tag.
  • Per-source extract bodies (non-compact mode).
  • ## Additional sources — snippet-only tail of URLs that survived dedup but didn’t make top-K (non-compact mode).
  • ## Coverage — operator-facing summary: providers responded, raw vs distinct count, fetch outcomes, synthesis model + elapsed, total wall time.
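Because several sections are optional, a consumer of the returned Markdown may want to split it by heading and check which sections are present. The sketch below assumes that section headings appear exactly as listed above and that body text never starts a line with "## "; the `split_sections` helper and the sample text are illustrative only and are not real tool output.

```python
def split_sections(markdown: str) -> dict[str, str]:
    """Map each '## ' heading to its body; the '# ' header goes under 'title'."""
    sections: dict[str, str] = {}
    current = None
    buffer: list[str] = []
    for line in markdown.splitlines():
        if line.startswith("## "):
            # Close the previous section before opening the next one.
            if current is not None:
                sections[current] = "\n".join(buffer).strip()
            current, buffer = line[3:].strip(), []
        elif line.startswith("# ") and "title" not in sections:
            sections["title"] = line[2:].strip()
        elif current is not None:
            buffer.append(line)
    if current is not None:
        sections[current] = "\n".join(buffer).strip()
    return sections


# Hand-written sample in the documented shape (not captured from the tool).
SAMPLE = """# Research: example query

## Key findings
- Fact one [①②]

## Sources
① https://example.com

## Coverage
providers: 1
"""

sections = split_sections(SAMPLE)
has_disagreements = "Disagreements" in sections
```

Checking for a key such as "Disagreements" or "Key findings" is then enough to tell whether the tool emitted that optional section.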

Failure modes: The full reliability table is in ADR-0056. Key categories: validation-failed, blocked-robots, cross-domain-redirect, ssl-error, rate-limited, http-status, timeout. Every failure produces partial-but-useful output; the hard outer deadline is 25 s (raised from 15 s in PR #1687).
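The "partial-but-useful output under a hard outer deadline" behaviour can be sketched with asyncio. This is an illustration under stated assumptions, not the tool's code: `run_with_deadline` is a hypothetical helper, the 25 s default mirrors the documented deadline, and only the "timeout" label is taken from the category list; the generic "error: ..." label is invented for the example.

```python
import asyncio


async def run_with_deadline(jobs, deadline_s=25.0):
    """Run named coroutines concurrently and keep whatever finishes in time.

    `jobs` maps a name to a coroutine object. Returns (results, failures),
    where failures maps a name to a short category label.
    """
    if not jobs:
        return {}, {}
    tasks = {asyncio.create_task(coro): name for name, coro in jobs.items()}
    done, pending = await asyncio.wait(list(tasks), timeout=deadline_s)

    results, failures = {}, {}
    for task in pending:
        # Past the deadline: cancel and record the job as timed out.
        task.cancel()
        failures[tasks[task]] = "timeout"
    # Let cancelled tasks unwind so nothing is left dangling.
    await asyncio.gather(*pending, return_exceptions=True)

    for task in done:
        name = tasks[task]
        exc = task.exception()
        if exc is None:
            results[name] = task.result()
        else:
            failures[name] = f"error: {type(exc).__name__}"
    return results, failures
```

The key design point is that a slow or failing job only adds an entry to `failures`; results that arrived before the deadline are still returned.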

SSRF safety: Every fetch (including the robots.txt probe and every redirect hop) is DNS-pinned to the IP that _validate_url resolved up front — the connect target cannot diverge from the validated address. See src/tools/_http_fetch.py for the mechanism.
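The general idea of DNS pinning can be sketched as: resolve once, reject non-public addresses, and hand the caller the exact IP to connect to. The code below is an independent illustration and does not reproduce src/tools/_http_fetch.py or _validate_url; `resolve_and_pin`, `is_public_address`, and the policy of rejecting a host if any resolved address is non-public are assumptions of this example.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def is_public_address(ip_text: str) -> bool:
    """True only for addresses that are safe to fetch from the public internet."""
    ip = ipaddress.ip_address(ip_text)
    return not (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_multicast
        or ip.is_reserved
        or ip.is_unspecified
    )


def resolve_and_pin(url: str, resolver=socket.getaddrinfo):
    """Resolve the URL's host once and return the (ip, port) to connect to.

    The caller must connect to the returned IP (sending the original hostname
    for Host/SNI) instead of resolving again, so a later DNS answer cannot
    redirect the connection to an internal address.
    """
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError("only absolute http(s) URLs are allowed")
    port = parts.port or (443 if parts.scheme == "https" else 80)

    infos = resolver(parts.hostname, port, type=socket.SOCK_STREAM)
    addresses = [info[4][0] for info in infos]
    if not addresses:
        raise ValueError("host did not resolve")
    # Conservative policy: one non-public answer rejects the whole host.
    for address in addresses:
        if not is_public_address(address):
            raise ValueError(f"{address} is not a public address")
    return addresses[0], port
```

The same check would need to run again for every redirect hop and for the robots.txt probe, since each of those is a fresh URL with its own resolution.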

Retired legacy tools

The following tools were removed from the agent catalogue when web_search shipped. The underlying functions remain importable from their respective modules for power users and internal use; the agent simply no longer sees them as discoverable tools:

  • search_web (DuckDuckGo, see src/tools/web_search.py; search_news is also importable but is not part of the agent catalogue)
  • tavily_search (src/tools/tavily_search.py)
  • brave_search (src/tools/brave_search.py)
  • google_search (src/tools/google_search.py)
  • exa_search (src/tools/exa_search.py)
  • serpapi_search (src/tools/serpapi_search.py)
  • searxng_search (src/tools/searxng_search.py)

tavily_extract, exa_find_similar, and exa_get_contents remain in the catalogue — they cover use cases (URL-targeted extraction, semantic similarity) that web_search does not subsume.


Web & HTTP

http_get

Make HTTP GET requests.

Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| url | string | Yes | URL to request |
| headers | string | No | Request headers as JSON string |
| timeout | int | No | Timeout in seconds (default: 30) |

Returns: Response body and status code
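Note that headers is a JSON string, not an object. A small sketch of preparing the arguments, assuming a hypothetical `build_http_get_args` helper (how the arguments are actually passed to the tool depends on the calling agent and is not shown here):

```python
import json


def build_http_get_args(url, headers=None, timeout=30):
    """Assemble http_get arguments, encoding headers as a JSON string."""
    args = {"url": url, "timeout": timeout}
    if headers:
        # The tool expects a JSON-encoded string, so serialise the dict.
        args["headers"] = json.dumps(headers)
    return args


args = build_http_get_args(
    "https://example.com/api/items",
    headers={"Accept": "application/json"},
)
```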


http_post

Make HTTP POST requests with JSON data.

Requires Confirmation: Yes

Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| url | string | Yes | URL to request |
| data | string | Yes | Request body as JSON string |
| headers | string | No | Request headers as JSON string |
| timeout | int | No | Timeout in seconds (default: 30) |
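To make the parameter semantics concrete, here is a rough standard-library equivalent of what such a request looks like on the wire. It is a sketch only: `build_post_request` is a hypothetical helper, the tool's internals are not documented here, and the default Content-Type header is an assumption.

```python
import json
import urllib.request


def build_post_request(url, data, headers=None):
    """Build (but do not send) a JSON POST from string-encoded arguments."""
    # Fail early if the body is not valid JSON (raises json.JSONDecodeError).
    json.loads(data)

    merged = {"Content-Type": "application/json"}  # assumed default
    if headers:
        parsed = json.loads(headers)
        if not isinstance(parsed, dict):
            raise ValueError("headers must be a JSON object")
        merged.update(parsed)

    return urllib.request.Request(
        url,
        data=data.encode("utf-8"),
        headers=merged,
        method="POST",
    )


request = build_post_request(
    "https://example.com/api/items",
    data='{"name": "demo"}',
    headers='{"X-Trace-Id": "abc"}',
)
```

Sending it would be `urllib.request.urlopen(request, timeout=30)`; the sketch stops before that step so that it has no side effects.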

Weather

get_weather

Get current weather for any location.

Requires: OpenWeather API key (set in config or OPENWEATHER_API_KEY)

Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| location | string | Yes | City name or coordinates |
| units | string | No | Units: metric, imperial (default: metric) |

Returns:

{
  "temperature": 22,
  "feels_like": 24,
  "humidity": 65,
  "description": "partly cloudy",
  "wind_speed": 12
}
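A short sketch of consuming this result, assuming the fields shown above. The `summarize_weather` helper is hypothetical, and the unit symbols (°C for metric, °F for imperial) and the reading of humidity as a percentage are assumptions based on common OpenWeather conventions, not statements from this page.

```python
import json

# Assumed mapping from the units parameter to a temperature symbol.
UNIT_SYMBOLS = {"metric": "°C", "imperial": "°F"}


def summarize_weather(payload, units="metric"):
    """Turn a get_weather result (JSON string or dict) into one readable line."""
    data = json.loads(payload) if isinstance(payload, str) else payload
    symbol = UNIT_SYMBOLS[units]
    return (
        f"{data['description'].capitalize()}, "
        f"{data['temperature']}{symbol} "
        f"(feels like {data['feels_like']}{symbol}), "
        f"humidity {data['humidity']}%"
    )


sample = '{"temperature": 22, "feels_like": 24, "humidity": 65, "description": "partly cloudy", "wind_speed": 12}'
summary = summarize_weather(sample)
```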