Releases: ihor-sokoliuk/mcp-searxng

v1.10.1

@ihor-sokoliuk ihor-sokoliuk released this 04 Jul 18:45

Fixed

  • USER_AGENT now applied to the /config and suggestions requests: The configured USER_AGENT header is now sent on the SearXNG /config instance-info fetch and on search-suggestion fetches. These two paths previously always used the default agent while the main search and web_url_read paths already honored USER_AGENT, so instances that filter or rate-limit by User-Agent behaved inconsistently. The header is now merged in one shared request-config helper covering every outbound instance request. (BUG-009, #145)
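
As a rough illustration of the shared request-config idea, the sketch below merges a configured User-Agent into the headers of any outbound instance request. The function name `buildRequestHeaders` and the `HeaderMap` type are hypothetical and are not taken from the project's source.

```typescript
// Hypothetical helper: one place that decides which headers every
// outbound instance request carries (search, /config, suggestions).
type HeaderMap = Record<string, string>;

function buildRequestHeaders(
  userAgent: string | undefined,
  extra: HeaderMap = {},
): HeaderMap {
  const headers: HeaderMap = { ...extra };
  const trimmed = userAgent?.trim();
  if (trimmed) {
    // Only override the default agent when a non-empty value is configured.
    headers["User-Agent"] = trimmed;
  }
  return headers;
}

// Both request paths go through the same helper, so they cannot drift apart.
const configHeaders = buildRequestHeaders("my-agent/1.0", { Accept: "application/json" });
const suggestionHeaders = buildRequestHeaders("my-agent/1.0");
```

Routing every path through one helper is what keeps a later-added request type from silently falling back to the default agent.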

Security

  • SSRF guard now blocks CGNAT and the remaining IANA special-purpose IPv4 ranges: The private-address guard that protects web_url_read — and the DNS-rebinding lookup hook that re-validates every resolved answer — previously only rejected RFC1918, loopback, link-local, and 0.0.0.0/8. It now also blocks CGNAT (100.64.0.0/10, Tailscale's default range plus container overlays and ISP CGNAT), the TEST-NET ranges, benchmarking (198.18.0.0/15), IETF protocol assignments (192.0.0.0/24), 6to4 relay anycast, multicast (224.0.0.0/4), and reserved/broadcast (240.0.0.0/4). All blocked ranges are consolidated into a single auditable CIDR table (RFC 6890) enforced at both the literal-hostname and DNS-resolved paths; IPv4-mapped IPv6 delegates here and is covered too. (SEC-024, #147)
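
A single CIDR table of this kind can be sketched as follows. This is a minimal IPv4-only illustration, not the project's code: the names `BLOCKED_IPV4_CIDRS` and `isBlockedIPv4` are hypothetical, IPv4-mapped IPv6 handling is left out, and the sketch assumes unparseable input should be rejected (fail closed).

```typescript
// Special-purpose IPv4 ranges named in the release note (see RFC 6890).
const BLOCKED_IPV4_CIDRS: readonly string[] = [
  "0.0.0.0/8",       // "this network"
  "10.0.0.0/8",      // RFC 1918
  "100.64.0.0/10",   // CGNAT shared address space
  "127.0.0.0/8",     // loopback
  "169.254.0.0/16",  // link-local
  "172.16.0.0/12",   // RFC 1918
  "192.0.0.0/24",    // IETF protocol assignments
  "192.0.2.0/24",    // TEST-NET-1
  "192.88.99.0/24",  // 6to4 relay anycast
  "192.168.0.0/16",  // RFC 1918
  "198.18.0.0/15",   // benchmarking
  "198.51.100.0/24", // TEST-NET-2
  "203.0.113.0/24",  // TEST-NET-3
  "224.0.0.0/4",     // multicast
  "240.0.0.0/4",     // reserved and broadcast
];

// Parse dotted-quad text into an unsigned 32-bit value, or null if malformed.
function ipv4ToInt(ip: string): number | null {
  const parts = ip.split(".");
  if (parts.length !== 4) return null;
  let value = 0;
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const octet = Number(part);
    if (octet > 255) return null;
    value = value * 256 + octet;
  }
  return value;
}

function inCidr(ip: number, cidr: string): boolean {
  const slash = cidr.indexOf("/");
  const base = ipv4ToInt(cidr.slice(0, slash));
  const prefixBits = Number(cidr.slice(slash + 1));
  if (base === null) return false;
  // Two addresses share a prefix when they fall in the same block of this size.
  const blockSize = 2 ** (32 - prefixBits);
  return Math.floor(ip / blockSize) === Math.floor(base / blockSize);
}

function isBlockedIPv4(ip: string): boolean {
  const value = ipv4ToInt(ip);
  if (value === null) return true; // assumption: fail closed on malformed input
  return BLOCKED_IPV4_CIDRS.some((cidr) => inCidr(value, cidr));
}
```

Keeping the ranges in one list means the literal-hostname check and the DNS-resolved check can call the same function, which is the auditability benefit the note describes.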

Full Changelog: v1.10.0...v1.10.1

v1.10.0

@ihor-sokoliuk ihor-sokoliuk released this 03 Jul 22:58

Added

  • Content-type-aware web_url_read: The URL reader now inspects the response Content-Type before converting. HTML is converted to markdown as before; JSON (application/json and *+json) is pretty-printed in a fenced block; and plain text, YAML, TOML, and XML are returned as readable fenced text. Binary, media, archive, and PDF responses are now rejected with a short hint instead of being decoded into unreadable bytes — fixing the case where fetching a PDF URL fed garbage to the model. Responses whose declared type is missing or generic are sniffed for a NUL byte in the first kilobyte and rejected if they look binary, which also catches binaries mislabeled as text/plain; anything textual continues through the existing HTML pipeline unchanged. (FEAT-045, #142, resolves #133)

  • Actionable errors when a SearXNG instance returns non-JSON: When a search gets a 200 response whose body is not JSON — an HTML results page because the instance never enabled format: json, or a Cloudflare/WAF interstitial — the error now names both fixes (enable - json under search.formats in the instance's settings.yml, or set SEARXNG_HTML_FALLBACK=true) while still including the response preview, instead of failing with an opaque "Invalid JSON format". (FEAT-053, #141, resolves #137)

  • Documented NODE_EXTRA_CA_CERTS for Windows and corporate-proxy TLS: A new "TLS / Corporate CA" section in CONFIGURATION.md explains that Linux and macOS auto-detect the system CA bundle, while Windows users behind a TLS-inspecting corporate proxy (Zscaler, Netskope, Palo Alto, Blue Coat) must export the proxy's root CA to PEM and point the standard Node.js NODE_EXTRA_CA_CERTS variable at it — with the PowerShell export steps and an explicit warning never to use the insecure NODE_TLS_REJECT_UNAUTHORIZED=0. No code change; the variable was already honored by Node/undici. (FEAT-054, #143, resolves #138)
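
The content-type handling described in the first item above could look roughly like the sketch below. It is an illustration under assumptions: the names `classifyContentType` and `looksBinary`, the exact set of text-like MIME types, and the treatment of `application/octet-stream` as "generic" are all hypothetical choices, not the project's actual rules.

```typescript
type BodyKind = "html" | "json" | "text" | "binary" | "unknown";

// Assumed list of non-text/* types that are still safe to show as fenced text.
const TEXT_LIKE_TYPES = new Set<string>([
  "application/xml",
  "application/yaml",
  "application/x-yaml",
  "application/toml",
]);

function classifyContentType(header: string | null): BodyKind {
  if (!header) return "unknown";
  // Drop parameters such as "; charset=utf-8" before comparing.
  const semi = header.indexOf(";");
  const mime = (semi === -1 ? header : header.slice(0, semi)).trim().toLowerCase();
  if (mime === "" || mime === "application/octet-stream") return "unknown";
  if (mime === "text/html" || mime === "application/xhtml+xml") return "html";
  if (mime === "application/json" || mime.endsWith("+json")) return "json";
  if (mime.startsWith("text/") || TEXT_LIKE_TYPES.has(mime)) return "text";
  return "binary"; // PDF, media, archives, and anything else unrecognized
}

// Sniff the first kilobyte for a NUL byte, which text formats do not contain.
function looksBinary(bytes: Uint8Array): boolean {
  const limit = Math.min(bytes.length, 1024);
  for (let i = 0; i < limit; i++) {
    if (bytes[i] === 0) return true;
  }
  return false;
}
```

A caller would typically reject `"binary"` outright and run `looksBinary` on `"unknown"` and plain-text responses before handing the body to the conversion pipeline.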

Full Changelog: v1.9.0...v1.10.0

v1.9.0

@ihor-sokoliuk ihor-sokoliuk released this 02 Jul 21:41

Added

  • Configurable Express trust proxy for HTTP mode (MCP_HTTP_TRUST_PROXY): When the Streamable HTTP transport runs behind a trusted reverse proxy, set MCP_HTTP_TRUST_PROXY so Express resolves the real client IP from X-Forwarded-For before computing rate-limit keys and request logs. Accepts true, a trusted hop count such as 1, or a subnet/preset such as loopback or 10.0.0.0/8; unset, false, or 0 disables it, which stays the secure default (enabling it without a real proxy in front lets clients spoof X-Forwarded-For). This is distinct from the outbound HTTP_PROXY / HTTPS_PROXY settings that govern this server's own requests. (FEAT-051, #140)
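
The accepted values map naturally onto the types Express takes for its `trust proxy` setting (boolean, hop count, or subnet/preset string). The parser below is a sketch of that mapping; the function name `parseTrustProxy` is hypothetical and the project may parse the variable differently.

```typescript
type TrustProxySetting = boolean | number | string;

function parseTrustProxy(raw: string | undefined): TrustProxySetting {
  const value = (raw ?? "").trim();
  const lower = value.toLowerCase();
  // Unset, "false", and "0" all keep the secure default (disabled).
  if (value === "" || lower === "false" || value === "0") return false;
  if (lower === "true") return true;
  // A bare integer is a trusted hop count.
  if (/^\d+$/.test(value)) return Number(value);
  // Anything else is passed through as a subnet or preset such as "loopback".
  return value;
}
```

In an Express app the result would be applied with something like `app.set("trust proxy", parseTrustProxy(process.env.MCP_HTTP_TRUST_PROXY))` before any rate-limit or logging middleware is registered.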

Fixed

  • HTTP session recovered after a server restart: The Streamable HTTP sessions map is in-memory, so a client that reused its mcp-session-id across a server restart got wedged — a fresh initialize still carried the stale header and fell through to 400 / -32000. initialize is now accepted regardless of any stale session header, and unknown session IDs on non-initialize POSTs return 404 / -32001 "Session not found" (matching the MCP SDK's own shape) so clients can detect a dead session and re-initialize. (BUG-010, #139)

  • Search JSON-parse errors keep the real response preview: A fetch response body is single-use, and the old path called response.text() in the catch after response.json() had already consumed it, so a JSON-parse failure always degraded to [Could not read response text]. The body is now read as text first and then parsed, so the error carries the actual response preview — making misconfigured or HTML-returning instances far easier to diagnose. (BUG-008, #131)
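
The read-as-text-then-parse fix in the last item above can be illustrated with a small helper. This is a sketch only: `parseSearchBody`, the `ParseOutcome` type, and the 200-character preview length are hypothetical.

```typescript
type ParseOutcome =
  | { ok: true; data: unknown }
  | { ok: false; preview: string };

// The caller reads the body exactly once with `await response.text()` and
// passes the string in, so the text is still available if parsing fails.
function parseSearchBody(bodyText: string, previewChars = 200): ParseOutcome {
  try {
    return { ok: true, data: JSON.parse(bodyText) };
  } catch {
    return { ok: false, preview: bodyText.slice(0, previewChars) };
  }
}
```

Because the helper works on a string rather than on the response object, there is no second read of a consumed body, which was the root cause of the old `[Could not read response text]` message.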

Security

  • SEARXNG_URL credentials redacted in errors, logs, and provenance: Embedded userinfo (user:pass@host) in SEARXNG_URL no longer leaks into model-visible error messages, client logs, or servedBy provenance. A shared redaction helper is now applied at every instance-URL emission point — the aggregate failover error, the ECONNREFUSED nested message, request/fallback logs, error context, and servedBy. (BUG-007, #136)
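
A redaction helper along these lines can be built on the standard `URL` class. The sketch below is illustrative: `redactInstanceUrl` is a hypothetical name, and returning a fixed placeholder for unparseable input is an assumption chosen so that a malformed URL can never leak credentials.

```typescript
function redactInstanceUrl(raw: string): string {
  try {
    const url = new URL(raw);
    // Clearing both fields removes the "user:pass@" userinfo on serialization.
    url.username = "";
    url.password = "";
    return url.toString();
  } catch {
    // Assumption: never echo a string we could not parse.
    return "[invalid URL]";
  }
}
```

Calling the same helper at every point where an instance URL is emitted (errors, logs, provenance) avoids having to remember the rule at each call site.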

Full Changelog: v1.8.0...v1.9.0

v1.8.0

@ihor-sokoliuk ihor-sokoliuk released this 23 Jun 19:45

Added

  • Multi-instance failover and optional parallel fanout for SEARXNG_URL: SEARXNG_URL now accepts several semicolon-separated SearXNG replica URLs that are treated as interchangeable. In the default failover mode a search tries each instance in order until one returns results; an instance with 3 consecutive hard failures is skipped for 60 seconds, while a 200 OK with an empty result set is treated as healthy and does not trigger cooldown. Set the new SEARXNG_FANOUT=true to instead query all healthy instances in parallel and merge results — deduplicated by canonical URL, keeping the highest-scoring copy and ordered by descending score. A single-URL SEARXNG_URL behaves exactly as before, so no configuration change is required. (FEAT-047, #128)

  • Capability discovery aggregated across all instances for filter guidance: searxng_instance_info and the categories/engines search parameters now aggregate live /config capabilities from every reachable configured instance instead of a single one. The tool reports common categories and engines (supported on every reachable instance, so safe for consistent multi-instance results) alongside best-effort available values, keeping filter guidance accurate when replicas differ in their enabled engines. A /config endpoint that fails is skipped for about 60 seconds, or retried immediately when searxng_instance_info is called with refresh=true. (FEAT-048, #130)
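
The failover cooldown and fanout merge from the first item above can be sketched as two small pieces. Everything here is illustrative: `InstanceHealth`, `mergeResults`, and the `ScoredResult` shape are hypothetical names, and the URL canonicalization rule is passed in as a parameter because the release note does not specify it.

```typescript
const FAILURE_THRESHOLD = 3; // consecutive hard failures before cooldown
const COOLDOWN_MS = 60000;   // skip window described in the note

class InstanceHealth {
  private failures = 0;
  private skipUntil = 0;

  isAvailable(now: number): boolean {
    return now >= this.skipUntil;
  }

  // A 200 OK counts as healthy even when the result set is empty.
  recordSuccess(): void {
    this.failures = 0;
  }

  recordHardFailure(now: number): void {
    this.failures += 1;
    if (this.failures >= FAILURE_THRESHOLD) {
      this.skipUntil = now + COOLDOWN_MS;
      this.failures = 0;
    }
  }
}

interface ScoredResult {
  url: string;
  title: string;
  score: number;
}

// Deduplicate by canonical URL, keep the highest-scoring copy,
// then order by descending score.
function mergeResults(
  batches: ScoredResult[][],
  canonicalize: (url: string) => string,
): ScoredResult[] {
  const best = new Map<string, ScoredResult>();
  for (const batch of batches) {
    for (const result of batch) {
      const key = canonicalize(result.url);
      const existing = best.get(key);
      if (!existing || result.score > existing.score) {
        best.set(key, result);
      }
    }
  }
  return Array.from(best.values()).sort((a, b) => b.score - a.score);
}
```

Passing the clock value in as `now` rather than reading it inside the class keeps the cooldown logic easy to test without waiting 60 seconds.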

Fixed

  • safesearch accepted as a string enum and honoring the instance default when omitted: safesearch is now declared as a string enum ("0", "1", "2") so MCP clients that send every tool argument as a string — notably Gemini and Antigravity — no longer fail schema validation. The schema default was also dropped, so omitting safesearch now falls back to each instance's server-side default instead of forcing a value. (BUG-006, #127)

  • Docker Compose HTTP transport reachable from the host: The HTTP transport in the provided docker-compose setup now binds to 0.0.0.0 instead of a loopback address, so the mapped port is reachable from the host rather than only from inside the container.
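
The safesearch change in the first item above amounts to a string-enum schema with no default, plus request building that leaves the parameter out when the caller omits it. The sketch below illustrates that shape; `safesearchSchema` and `buildQueryParams` are hypothetical names and the real tool schema may carry more fields.

```typescript
// JSON Schema fragment: a string enum with no "default" key.
const safesearchSchema = {
  type: "string",
  enum: ["0", "1", "2"],
} as const;

type Safesearch = (typeof safesearchSchema.enum)[number];

function isSafesearch(value: unknown): value is Safesearch {
  return value === "0" || value === "1" || value === "2";
}

function buildQueryParams(query: string, safesearch?: unknown): Record<string, string> {
  const params: Record<string, string> = { q: query, format: "json" };
  // Omitted: send nothing so the instance's server-side default applies.
  if (safesearch === undefined) return params;
  if (!isSafesearch(safesearch)) {
    throw new Error('safesearch must be "0", "1", or "2"');
  }
  params["safesearch"] = safesearch;
  return params;
}
```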

Full Changelog: v1.7.2...v1.8.0

v1.7.2

@ihor-sokoliuk ihor-sokoliuk released this 21 Jun 00:51

Security

  • Container image now runs as a non-root user (UID 1000): The published Docker image previously ran as root, so Kubernetes deployments using the runAsNonRoot: true pod security context were rejected at admission. The image now sets a numeric USER 1000 (the node account already present in the node:lts-alpine base), which satisfies runAsNonRoot without an additional runAsUser override and reduces the container's blast radius. No configuration change is required. (Reported by @nogweii, #122)

Full Changelog: v1.7.1...v1.7.2

v1.7.1

@ihor-sokoliuk ihor-sokoliuk released this 18 Jun 21:40

Security

  • DNS-resolved private-address SSRF in web_url_read blocked (GHSA-mrvx-jmjw-vggc): The URL reader previously validated only the literal hostname string, so a public-looking hostname that DNS-resolves to a private, loopback, or link-local address (for example a domain pointing at 127.0.0.1, at an address in 10.0.0.0/8, or at a cloud metadata endpoint such as 169.254.169.254) bypassed the SSRF guard. Direct (no-proxy) reads now validate every resolved DNS answer before connecting and pin the connection to the validated address, closing the DNS-rebinding window. The MCP_HTTP_ALLOW_PRIVATE_URLS=true opt-out still applies. When a URL-reader proxy is configured, the proxy performs DNS resolution, so those deployments must rely on egress/firewall controls (documented in SECURITY.md).
  • Unbounded response-body read in web_url_read capped (GHSA-xcqx-9jf5-w339): The page-size limit was advisory only — a server using chunked transfer encoding, a failing/absent HEAD response, or a body larger than its reported Content-Length could force the entire response into memory (denial of service). The body is now read through a bounded stream that enforces URL_READ_MAX_CONTENT_LENGTH_BYTES (default 5 MB) against the decompressed size and stops once the cap is exceeded, before any conversion or caching.
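
The bounded read in the second item above can be sketched as an accumulator that refuses to grow past the cap. `BoundedBodyBuffer` is a hypothetical name; a real implementation would feed it chunks from the decompressed response stream and abort the request when `push` throws.

```typescript
class BoundedBodyBuffer {
  private readonly maxBytes: number;
  private readonly chunks: Uint8Array[] = [];
  private total = 0;

  constructor(maxBytes: number) {
    this.maxBytes = maxBytes;
  }

  // Reject the chunk that would cross the cap instead of trusting
  // Content-Length, which a server can omit or misreport.
  push(chunk: Uint8Array): void {
    if (this.total + chunk.byteLength > this.maxBytes) {
      throw new Error(`Response body exceeds ${this.maxBytes} bytes`);
    }
    this.chunks.push(chunk);
    this.total += chunk.byteLength;
  }

  toBytes(): Uint8Array {
    const out = new Uint8Array(this.total);
    let offset = 0;
    for (const chunk of this.chunks) {
      out.set(chunk, offset);
      offset += chunk.byteLength;
    }
    return out;
  }
}
```

Because the check happens per chunk, memory use stays at or below the cap regardless of transfer encoding.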

Full Changelog: v1.7.0...v1.7.1

v1.7.0

@ihor-sokoliuk ihor-sokoliuk released this 18 Jun 17:31

✨ Added

  • HTML-search fallback (SEARXNG_HTML_FALLBACK=true) — opt-in compatibility mode for SearXNG instances that disable JSON output. When a search hits a 403/404 or a non-JSON response, it is automatically retried without format=json and results (title, URL, snippet) are parsed from the regular HTML results page and marked sourceFormat: "html". Triggers strictly on format rejections — never on 401, 5xx, network, or timeout errors. Enabling JSON on a SearXNG instance you control remains the recommended setup (see the README troubleshooting section).
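
The trigger rule can be written as a small predicate. This is a sketch under assumptions: `shouldRetryAsHtml` is a hypothetical name, and "non-JSON response" is interpreted here as a 2xx response whose body failed to parse as JSON. Network and timeout errors never reach this function because they produce no status.

```typescript
function shouldRetryAsHtml(
  fallbackEnabled: boolean, // SEARXNG_HTML_FALLBACK=true
  status: number,
  bodyIsJson: boolean,
): boolean {
  if (!fallbackEnabled) return false;
  // 403/404 are treated as the instance rejecting format=json.
  if (status === 403 || status === 404) return true;
  // A successful response that is not JSON (for example an HTML results page).
  const isSuccess = status >= 200 && status < 300;
  // 401 and 5xx fall through to false: they are not format rejections.
  return isSuccess && !bodyIsJson;
}
```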

🔒 Security

  • undici → 7.28.0 — resolves two HIGH advisories affecting 7.0.0–7.27.2: GHSA-vmh5-mc38-953g (TLS certificate validation bypass in the SOCKS5 ProxyAgent) and GHSA-pr7r-676h-xcf6 (cross-user information disclosure via shared-cache whitespace bypass).
  • form-data → 4.0.6 — clears a CRLF-injection advisory (GHSA-hmw2-7cc7-3qxx) in the test toolchain.

Full Changelog: v1.6.0...v1.7.0

v1.6.0

@ihor-sokoliuk ihor-sokoliuk released this 16 Jun 21:00

This release rolls up everything since v1.4.0. Note: 1.5.0 was published to npm and Docker Hub on 2026-06-12 but never received a GitHub release — those changes are included below alongside the new 1.6.0 work.

✨ Added

  • engines parameter on searxng_web_search — a comma-separated list (e.g. google,bing,duckduckgo) routes a search to specific SearXNG engines instead of the category defaults.
  • Validated & normalized categories / engines — values are trimmed and matched case-insensitively against the connected instance's live /config, and canonical names are sent to SearXNG. Unknown values are rejected up front with the available options listed, fixing silent search degradation from miscased names.
  • Configurable URL cache controls: CACHE_TTL_MS (default 24 h) and CACHE_MAX_ENTRIES (default 500).
  • Bounded URL cache eviction — entries track hit counts and use LFU eviction with oldest-entry tie-breaking.
  • searxng_suggestions tool — returns search autocomplete suggestions from the instance.
  • searxng_instance_info tool — discovers instance capabilities (engines, categories, languages, safe-search).
  • JSON response format: searxng_web_search accepts response_format ("text" | "json") for programmatic result processing.
  • Search metadata in text output — answers, spelling corrections, infoboxes, and suggestions surface alongside ranked results.
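
The bounded cache eviction listed above (LFU with oldest-entry tie-breaking) can be illustrated with a compact sketch. `LfuCache` is a hypothetical class, TTL expiry is left out for brevity, and resetting the hit count when a key is overwritten is an assumption.

```typescript
interface CacheEntry<V> {
  value: V;
  hits: number;
  insertedAt: number; // monotonic counter, lower means older
}

class LfuCache<V> {
  private readonly entries = new Map<string, CacheEntry<V>>();
  private readonly maxEntries: number;
  private clock = 0;

  constructor(maxEntries: number) {
    this.maxEntries = maxEntries;
  }

  has(key: string): boolean {
    return this.entries.has(key);
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    entry.hits += 1;
    return entry.value;
  }

  set(key: string, value: V): void {
    if (!this.entries.has(key) && this.entries.size >= this.maxEntries) {
      this.evictOne();
    }
    this.entries.set(key, { value, hits: 0, insertedAt: this.clock++ });
  }

  // Evict the entry with the fewest hits; among ties, the oldest one.
  private evictOne(): void {
    let victimKey: string | undefined;
    let victim: CacheEntry<V> | undefined;
    for (const [key, entry] of Array.from(this.entries)) {
      const fewerHits = victim !== undefined && entry.hits < victim.hits;
      const olderTie =
        victim !== undefined &&
        entry.hits === victim.hits &&
        entry.insertedAt < victim.insertedAt;
      if (victim === undefined || fewerHits || olderTie) {
        victimKey = key;
        victim = entry;
      }
    }
    if (victimKey !== undefined) this.entries.delete(victimKey);
  }
}
```

The linear scan on eviction is acceptable for a cap in the hundreds (the default CACHE_MAX_ENTRIES is 500); a larger cache would want a frequency-bucketed structure instead.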

🔧 Changed

  • URL cache TTL default raised from 60 s to 24 h within a running server (entries still expire/evict).

🐛 Fixed

  • Metadata (answers, corrections, infoboxes) is preserved in text output even when min_score filters out all web results.
  • Unresponsive engines are no longer listed in text output.
  • searxng_suggestions and searxng_instance_info now route through the configured search proxy and default TLS dispatcher.

🔒 Security

  • Least-privilege Docker workflow permissions: security-events: write is isolated to a dedicated image-scan job in both the publish and rebuild workflows, with id-token: write confined to the publish/sign job and workflow-level permissions kept read-only.
  • Patched bundled hono — pinned the transitive hono dependency to ≥ 4.12.25 (npm overrides) to resolve CVE-2026-54290 (CORS middleware origin reflection) in the published Docker image.

🏗️ Build / CI

  • Added a CI workflow running lint plus unit and integration tests on every pull request and push to main.

Full Changelog: v1.4.0...v1.6.0

v1.5.0

@ihor-sokoliuk ihor-sokoliuk released this 16 Jun 21:00

Backfilled release — 1.5.0 was published to npm and Docker Hub on 2026-06-12 but the GitHub release was missed at the time.

✨ Added

  • searxng_suggestions tool — returns search autocomplete suggestions from the SearXNG instance.
  • searxng_instance_info tool — discovers the connected instance's capabilities (enabled engines, supported categories, available languages, safe-search settings).
  • JSON response format: searxng_web_search accepts a response_format parameter ("text" | "json"); "json" returns raw structured data for programmatic processing.
  • Search metadata in text output: searxng_web_search text responses now include answers, spelling corrections, infoboxes, and autocomplete suggestions when the instance returns them.

🐛 Fixed

  • Metadata (answers, corrections, infoboxes) is preserved in text output even when min_score filters out all web results.
  • Unresponsive engines are no longer listed in text output.
  • searxng_suggestions and searxng_instance_info requests route through the configured search proxy and default TLS dispatcher.

Full Changelog: v1.4.0...v1.5.0

v1.4.0

@ihor-sokoliuk ihor-sokoliuk released this 12 Jun 03:16

Added

  • Result count control: num_results parameter on searxng_web_search (1–20) lets callers request only as many results as they need. SEARXNG_MAX_RESULTS env var sets an operator-level hard cap that applies even when num_results is omitted — useful for reducing token spend across all callers.

  • Token budget limits: SEARXNG_MAX_RESULT_CHARS env var truncates each search result snippet to a character limit (appending a truncation marker) before returning. URL_READ_MAX_CHARS env var sets a default maxLength for URL reads when the caller omits it. Both controls are recommended for local models with small context windows.

  • HEAD preflight for URL reader: A fast HEAD request is made before every URL fetch to check Content-Length. If the server reports a size above URL_READ_MAX_CONTENT_LENGTH_BYTES (default 5 MB), the download is blocked and a descriptive message with readHeadings/section pagination hints is returned instead of downloading an unbounded body.

  • categories parameter on searxng_web_search: Routes searches to specific SearXNG categories — general, news, images, videos, it, science, files, social media. Omitting the parameter uses the SearXNG instance default (general).

  • Configurable search defaults: SEARXNG_DEFAULT_LANGUAGE and SEARXNG_DEFAULT_SAFESEARCH env vars set operator-level defaults for language and safe-search level. Per-call parameters still take precedence. Invalid SEARXNG_DEFAULT_SAFESEARCH values (not 0, 1, or 2) are logged and ignored.

  • Configurable timeouts: SEARXNG_TIMEOUT_MS controls the search request timeout and FETCH_TIMEOUT_MS controls the URL reader fetch timeout (both default to 10000 ms).

  • Lite tool schemas (SEARXNG_LITE_TOOLS=true): When set, registers minimal query-only and url-only tool schemas instead of the full parameter list. Reduces context overhead for local models with small context windows while still forwarding any extra arguments the caller provides.
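
The interaction between num_results and SEARXNG_MAX_RESULTS in the first item above can be sketched as a single function. `effectiveResultCount` is a hypothetical name, and returning `undefined` when neither value is set (meaning "no limit applied by this layer") is an assumption.

```typescript
function effectiveResultCount(
  requested: number | undefined,   // per-call num_results (1 to 20)
  operatorCap: number | undefined, // SEARXNG_MAX_RESULTS, if configured
): number | undefined {
  if (
    requested !== undefined &&
    (!Number.isInteger(requested) || requested < 1 || requested > 20)
  ) {
    throw new RangeError("num_results must be an integer from 1 to 20");
  }
  // The operator cap applies even when the caller omits num_results.
  if (requested === undefined) return operatorCap;
  if (operatorCap === undefined) return requested;
  return Math.min(requested, operatorCap);
}
```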

Security

  • Pinned the npm trusted publishing installer step in the publish workflow to a full commit SHA to guard against tag-swap supply-chain attacks.

Full Changelog: v1.3.4...v1.4.0