Search
Cogtrix exposes a single canonical research tool: web_search. It fans a query out to multiple
providers (DuckDuckGo always; Tavily / Exa / Brave / Google / SerpAPI / SearXNG when their API
keys are configured), fetches the top results, extracts page content with trafilatura, and
returns a structured Markdown report: sources with ①②③… citation indices, an optional
synthesis section, disagreements between sources called out explicitly, and a coverage
block reporting per-provider and per-fetch outcomes.
Architecture and design rationale: ADR-0056 (held in the private documentation submodule).
web_search
Universal web research tool — multi-provider fan-out, fetch, extract, format.
Parameters:
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| query | string | Yes | — | The research query |
| depth | int | No | 3 | Top-K sources to fetch and extract (1–10). Higher values give more breadth at the cost of longer wall time. The historical lxml GIL bottleneck that motivated lowering the default from 6 was removed in PR #1716: extraction now runs in a ProcessPoolExecutor, so pages are parsed in true parallel, and the in-process _LXML_LOCK is retained only as an unused export for back-compat with callers that still import it. The default of 3 is now a latency choice, not a serialisation workaround. Set depth explicitly (5–10) for deep research. |
| region | string | No | "wt-wt" | Region hint for providers that accept one (e.g. DDG). |
| compact | bool | No | false | When true, drop per-source extracts and the Additional Sources tail (~5 KB vs ~18 KB output). |
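
The constraints in the table can be checked on the caller's side before a request is dispatched. The following is a minimal sketch, not Cogtrix's own validation code: `WebSearchArgs` and `to_payload` are hypothetical names, and only the parameter names, defaults, and the 1–10 range for depth are taken from the table above.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class WebSearchArgs:
    """Hypothetical container mirroring the documented web_search parameters."""

    query: str
    depth: int = 3          # documented default
    region: str = "wt-wt"   # documented default
    compact: bool = False   # documented default

    def __post_init__(self) -> None:
        if not self.query.strip():
            raise ValueError("query must be a non-empty string")
        if not 1 <= self.depth <= 10:
            raise ValueError("depth must be between 1 and 10")

    def to_payload(self) -> dict:
        """Return the arguments as a plain dict, e.g. for a tool-call request."""
        return asdict(self)


# Deep research: raise depth explicitly and keep the output small with compact.
args = WebSearchArgs(query="lxml GIL contention", depth=8, compact=True)
payload = args.to_payload()
```

Rejecting an out-of-range depth early avoids spending a tool call on a request that would fall into the validation-failed category.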
Returns: Markdown blob with sections (in order):
- `# Research: <query>` header.
- `## Key findings`: synthesised cross-source facts with `[①②③…]` citations. Stage 5 synthesis runs in-tool by default; the section is omitted only when synthesis is explicitly disabled or its deadline (10 s) expires.
- `## Disagreements`: emitted when sources state directly contradictory facts.
- `## Gaps`: aspects of the query the search couldn’t answer.
- `## Sources`: flat index of cited URLs with domain class + recency tag.
- Per-source extract bodies (non-compact mode).
- `## Additional sources`: snippet-only tail of URLs that survived dedup but didn’t make top-K (non-compact mode).
- `## Coverage`: operator-facing summary: providers responded, raw vs distinct count, fetch outcomes, synthesis model + elapsed, total wall time.
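
Because the report is plain Markdown with fixed `##` section headings, a caller can split it into sections and check which ones are present. This is a rough sketch under stated assumptions: `split_sections` is a hypothetical helper, the sample text is made up rather than real tool output, and extracted page bodies that themselves contain `## ` lines would need extra handling.

```python
import re


def split_sections(report: str) -> dict[str, str]:
    """Map each '## ' heading in a web_search report to its body text."""
    sections: dict[str, str] = {}
    current = None
    for line in report.splitlines():
        match = re.match(r"^## +(.*\S)\s*$", line)
        if match:
            current = match.group(1)
            sections[current] = ""
        elif current is not None:
            sections[current] += line + "\n"
    return {name: body.strip() for name, body in sections.items()}


# Illustrative report text (invented for this example).
sample = """# Research: example query
## Key findings
Fact A [①②]
## Sources
① https://example.com
## Coverage
providers: 2
"""

sections = split_sections(sample)
# "Key findings" is absent when synthesis was disabled or missed its deadline.
has_synthesis = "Key findings" in sections
```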
Failure modes: The full reliability table is in ADR-0056. Key categories: validation-failed,
blocked-robots, cross-domain-redirect, ssl-error, rate-limited, http-status,
timeout. Every failure produces partial-but-useful output; the hard outer deadline is
25 s (raised from 15 s in PR #1687).
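
The "partial-but-useful under a hard deadline" behaviour can be illustrated with a small asyncio pattern. This is only a sketch of the general technique, not the tool's implementation (see ADR-0056 for that): `run_with_deadline` and `fake_fetch` are hypothetical, and the short delays stand in for real fetches so the example runs offline.

```python
import asyncio
from typing import Awaitable


async def run_with_deadline(jobs: dict[str, Awaitable[str]], deadline_s: float) -> dict[str, str]:
    """Run jobs concurrently; report 'timeout' for any that miss the deadline."""
    tasks = {asyncio.ensure_future(job): name for name, job in jobs.items()}
    done, pending = await asyncio.wait(tasks.keys(), timeout=deadline_s)
    outcomes: dict[str, str] = {}
    for task in done:
        error = task.exception()
        outcomes[tasks[task]] = "ok" if error is None else type(error).__name__
    for task in pending:
        task.cancel()  # stop late work instead of waiting for it
        outcomes[tasks[task]] = "timeout"
    return outcomes


async def fake_fetch(delay_s: float) -> str:
    """Stand-in for a page fetch; sleeps instead of touching the network."""
    await asyncio.sleep(delay_s)
    return "body"


async def demo() -> dict[str, str]:
    jobs = {"fast": fake_fetch(0.01), "slow": fake_fetch(5.0)}
    return await run_with_deadline(jobs, deadline_s=0.2)


outcomes = asyncio.run(demo())
```

The slow job is cancelled and labelled rather than awaited, so the caller still receives the result of the fast job within the deadline.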
SSRF safety: Every fetch (including the robots.txt probe and every redirect hop) is
DNS-pinned to the IP that _validate_url resolved up front — the connect target cannot
diverge from the validated address. See src/tools/_http_fetch.py for the mechanism.
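
The idea behind DNS pinning can be sketched as: resolve once, validate the resolved address, then connect only to that address. The code below is an illustration of that idea and makes no claim about how `_validate_url` or src/tools/_http_fetch.py work internally; `pin_target` and `system_resolver` are hypothetical names, and the resolver is injectable so the check can be exercised without network access.

```python
import ipaddress
import socket
from typing import Callable
from urllib.parse import urlsplit


def system_resolver(host: str) -> str:
    """Resolve a hostname to one IP address using the system resolver."""
    return socket.getaddrinfo(host, None)[0][4][0]


def pin_target(url: str, resolve: Callable[[str], str] = system_resolver) -> tuple[str, str]:
    """Resolve the URL's host once and return (hostname, pinned_ip).

    Raises ValueError for non-HTTP(S) schemes and for addresses that are not
    globally routable (loopback, private, link-local, and so on).
    """
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError(f"unsupported URL: {url!r}")
    ip = ipaddress.ip_address(resolve(parts.hostname))
    if not ip.is_global:
        raise ValueError(f"refusing non-public address {ip} for {parts.hostname}")
    # The caller connects to pinned_ip and sends hostname as Host/SNI, so a
    # second DNS lookup cannot steer the connection to a different address.
    return parts.hostname, str(ip)
```

Applying the same check to the robots.txt probe and to each redirect hop closes the gap where a hostname validates as public on the first lookup and resolves to an internal address on a later one.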
Retired legacy tools
The following tools were removed from the agent catalogue when web_search shipped. The
underlying functions remain importable from their respective modules for power users and
internal use; the agent simply no longer sees them as discoverable tools:
- `search_web` (DuckDuckGo, see `src/tools/web_search.py`; `search_news` is also importable but is not part of the agent catalogue)
- `tavily_search` (`src/tools/tavily_search.py`)
- `brave_search` (`src/tools/brave_search.py`)
- `google_search` (`src/tools/google_search.py`)
- `exa_search` (`src/tools/exa_search.py`)
- `serpapi_search` (`src/tools/serpapi_search.py`)
- `searxng_search` (`src/tools/searxng_search.py`)

tavily_extract, exa_find_similar, and exa_get_contents remain in the catalogue —
they cover use cases (URL-targeted extraction, semantic similarity) that web_search
does not subsume.
Web & HTTP
http_get
Make HTTP GET requests.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| url | string | Yes | URL to request |
| headers | string | No | Request headers as JSON string |
| timeout | int | No | Timeout in seconds (default: 30) |
Returns: Response body and status code
http_post
Make HTTP POST requests with JSON data.
Requires Confirmation: Yes
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| url | string | Yes | URL to request |
| data | string | Yes | Request body as JSON string |
| headers | string | No | Request headers as JSON string |
| timeout | int | No | Timeout in seconds (default: 30) |
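
Both http_get and http_post take headers (and, for POST, data) as JSON strings rather than nested objects, which is easy to get wrong when building arguments by hand. The helpers below are a hypothetical sketch of how a caller might serialise them; `build_http_get_args` and `build_http_post_args` are not part of the tool catalogue, and the example URLs are placeholders.

```python
import json
from typing import Any, Optional


def build_http_get_args(url: str, headers: Optional[dict] = None, timeout: int = 30) -> dict:
    """Build http_get arguments; headers are serialised to a JSON string."""
    args: dict[str, Any] = {"url": url, "timeout": timeout}
    if headers:
        args["headers"] = json.dumps(headers)
    return args


def build_http_post_args(url: str, body: Any, headers: Optional[dict] = None,
                         timeout: int = 30) -> dict:
    """Build http_post arguments; body and headers become JSON strings."""
    args = build_http_get_args(url, headers, timeout)
    args["data"] = json.dumps(body)  # required parameter for http_post
    return args


get_args = build_http_get_args(
    "https://example.com/api/items",
    headers={"Accept": "application/json"},
    timeout=10,
)
post_args = build_http_post_args("https://example.com/api/items", {"name": "widget"})
```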
Weather
get_weather
Get current weather for any location.
Requires: OpenWeather API key (set in config or OPENWEATHER_API_KEY)
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| location | string | Yes | City name or coordinates |
| units | string | No | Units: metric, imperial (default: metric) |
Returns:
{
  "temperature": 22,
  "feels_like": 24,
  "humidity": 65,
  "description": "partly cloudy",
  "wind_speed": 12
}
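
A caller might load this result into a typed structure before using it. The sketch below assumes the tool hands back the JSON shown above as a string; `WeatherReport` and `summary` are hypothetical names, and no unit symbols are printed because the response itself does not carry them.

```python
import json
from dataclasses import dataclass


@dataclass
class WeatherReport:
    """Hypothetical typed view of the documented get_weather fields."""

    temperature: float
    feels_like: float
    humidity: int
    description: str
    wind_speed: float

    def summary(self) -> str:
        """One-line, unit-agnostic description of the conditions."""
        return (f"{self.description}, {self.temperature}° "
                f"(feels like {self.feels_like}°), humidity {self.humidity}%")


# Same values as the sample response above.
raw = ('{"temperature": 22, "feels_like": 24, "humidity": 65, '
       '"description": "partly cloudy", "wind_speed": 12}')
report = WeatherReport(**json.loads(raw))
line = report.summary()
```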