Reader Mode (Legacy)

High-speed Markdown extraction for RAG pipelines. Converts noisy HTML into normalized, LLM-ready text.

Cost Efficiency: 1 credit per request
Latency: <400 ms (global average)
Output: Markdown (normalized text)

Core Capabilities

Reader Mode bypasses the heavy rendering pipeline (Chrome/Puppeteer) and uses a specialized Rust-based parser to strip navigation, ads, footers, and tracking scripts. It returns only the semantic content.

Noise Removal

Automatically filters sidebars, popups, and non-content DOM nodes.

Token Optimization

Reduces whitespace and formatting overhead by ~30% vs raw HTML.

Implementation

cURL Request

curl -X POST https://api.sentienceapi.com/v1/observe \
  -H "Authorization: Bearer sk_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://news.ycombinator.com",
    "mode": "read",
    "format": "markdown",
    "options": {
      "contentLimit": 50000
    }
  }'

url (Required)

Target HTTP/HTTPS URL.

mode (Required)

Must be set to "read".

format (Optional)

Output format: "markdown" (default) or "text".

options.contentLimit (Optional)

Maximum characters in content (default: 50000).
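
The same request can be assembled from Python. The sketch below is illustrative only: it uses the endpoint, headers, and parameters from the cURL example above, while the helper name build_read_request and its client-side validation are assumptions, not part of any official SDK. It builds the request object without sending it.

```python
import json
import urllib.request

# Endpoint taken from the cURL example above.
API_URL = "https://api.sentienceapi.com/v1/observe"


def build_read_request(target_url, api_key, fmt="markdown", content_limit=50000):
    """Build (but do not send) a Reader Mode request.

    Hypothetical helper: the validation here mirrors the parameter
    descriptions above and is not confirmed server behavior.
    """
    if not target_url.startswith(("http://", "https://")):
        raise ValueError("url must be an HTTP or HTTPS URL")
    if fmt not in ("markdown", "text"):
        raise ValueError('format must be "markdown" or "text"')

    payload = {
        "url": target_url,
        "mode": "read",  # required, must be "read"
        "format": fmt,
        "options": {"contentLimit": content_limit},
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send it, pass the returned object to urllib.request.urlopen (ideally with a timeout) and decode the body with json.load.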

Output Schema

JSON Response

{
  "status": "success",
  "url": "https://news.ycombinator.com",
  "title": "Hacker News",
  "content": "# Hacker News\n\nNew | Past | Comments | Ask | Show...\n\n## Top Stories\n\n### Show HN: I built a perception layer for AI agents\n\nWe're excited to share SentienceAPI...",
  "format": "markdown",
  "author": null,
  "published_date": null,
  "word_count": 1247,
  "reading_time_minutes": 6,
  "timestamp": "2025-12-12T10:30:00.123Z"
}
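
A response shaped like the one above can be loaded into a typed record before it enters a RAG pipeline. This is a minimal sketch: the field names come from the sample response, but the ReadResult class, the parse_read_response helper, and the assumption that any status other than "success" signals a failure are all illustrative, since the error format is not documented here.

```python
import json
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class ReadResult:
    # Field names mirror the sample JSON response above.
    status: str
    url: str
    title: Optional[str]
    content: str
    format: str
    author: Optional[str]
    published_date: Optional[str]
    word_count: int
    reading_time_minutes: int
    timestamp: str


def parse_read_response(raw: str) -> ReadResult:
    """Parse a Reader Mode JSON body into a ReadResult (hypothetical helper)."""
    data = json.loads(raw)
    # Assumption: a status other than "success" means the request failed.
    if data.get("status") != "success":
        raise RuntimeError(f"Reader Mode request failed: {data.get('status')!r}")
    # Unknown keys are ignored; missing keys become None.
    return ReadResult(**{f.name: data.get(f.name) for f in fields(ReadResult)})
```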

Normalization Engine

Whitespace Collapse

Multi-line breaks and tabs are compressed to single markdown breaks.

Before: "Title\n\n\n\n\nSubtitle"
After:  "Title\n\nSubtitle"
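
The effect of this rule can be approximated with a few regular expressions. The sketch below is not the Rust parser's implementation; the function name collapse_whitespace and the choice to turn tabs into single spaces are assumptions made for illustration.

```python
import re


def collapse_whitespace(text: str) -> str:
    """Approximate the whitespace collapse described above (illustrative only)."""
    # Assumption: tabs become single spaces.
    text = text.replace("\t", " ")
    # Squeeze runs of spaces into one.
    text = re.sub(r" {2,}", " ", text)
    # Drop trailing spaces at line ends.
    text = re.sub(r" *\n", "\n", text)
    # Three or more newlines become one blank line (a Markdown paragraph break).
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()


print(repr(collapse_whitespace("Title\n\n\n\n\nSubtitle")))  # 'Title\n\nSubtitle'
```

Note that squeezing spaces this way would also flatten indented code blocks, so a production normalizer would need to treat those separately.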

Link Consolidation

Decorative or empty links are stripped; semantic links are preserved.

Before: "Click > > [here](url) < <"
After:  "[Click here](url)"
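
One part of this rule, dropping empty or purely decorative links, can be sketched as follows. This is a simplified illustration under stated assumptions, not the engine's actual logic: a link counts as "decorative" here when its label contains no letter or digit, and the function name strip_decorative_links is hypothetical. It does not reproduce the label merging shown in the Before/After example.

```python
import re

# Inline Markdown link: [label](target). The lookbehind skips images (![alt](src)).
LINK = re.compile(r"(?<!!)\[([^\]]*)\]\(([^)\s]*)\)")


def strip_decorative_links(markdown: str) -> str:
    """Remove links whose label has no letters or digits (illustrative only)."""

    def replace(match):
        label = match.group(1)
        # Keep the link untouched if the label carries any alphanumeric text.
        if any(ch.isalnum() for ch in label):
            return match.group(0)
        # Empty or symbol-only label: drop the whole link.
        return ""

    return LINK.sub(replace, markdown)
```

For example, "[](https://a.example)" and "[»](https://c.example)" would be removed, while "[docs](https://b.example)" is preserved.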