Content Reading API

Extract page content as text, markdown, or raw HTML using the read() function.

Basic Usage

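from sentience import read

# `browser` is assumed to be an already-started browser session
# with a page loaded; it is not created in this snippet.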

# Get markdown content
result = read(browser, format="markdown")
print(result["content"])

# Get plain text
result = read(browser, format="text")
print(result["content"])

# Get raw HTML (for external processing)
result = read(browser, format="raw")
html = result["content"]
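
The raw format is described above as intended for external processing. As one illustration, the sketch below collects link targets from an HTML string using only Python's standard library. The names LinkCollector and extract_links are hypothetical helpers, not part of the SDK, and sample_html stands in for the result["content"] string that read(browser, format="raw") would return.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags (hypothetical helper, not part of the SDK)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Only anchor tags are of interest here.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.links.append(value)


def extract_links(html):
    """Return every non-empty href found in the given HTML string, in document order."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


# Stand-in for result["content"] from read(browser, format="raw").
sample_html = '<p>See <a href="https://example.com/a">A</a> and <a href="/b">B</a>.</p>'
links = extract_links(sample_html)
print(links)
```

Relative links such as "/b" are returned as written; resolving them against the page URL is left to the caller.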

Parameters

Python:

TypeScript:

Returns

Dict/object with:
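
The examples on this page read two keys from the result: "content" and "length" (reported in characters). The sketch below handles a result under the assumption that only those two keys are available; any other fields are not covered here. describe_result is a hypothetical helper, and sample_result is a hand-written stand-in for a real read() result.

```python
def describe_result(result):
    """Summarize a read() result, assuming only the "content" and "length" keys."""
    content = result["content"]
    # Assumption: "length" is a character count. Fall back to len(content) if it is absent.
    length = result.get("length", len(content))
    # Keep the preview on one line for logging.
    preview = content[:40].replace("\n", " ")
    return f"{length} characters, starts with: {preview!r}"


# Hand-written stand-in for a result returned by read(browser, format="markdown").
sample_result = {"content": "# Title\n\nBody text.", "length": 19}
summary = describe_result(sample_result)
print(summary)
```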

Format Options

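"raw" (default): the page's raw HTML, intended for external processing.

"text": the page content as plain text.

"markdown": the page content as markdown.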
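
When saving output, it can help to derive the file extension from the format name. The sketch below shows one way to do that; the EXTENSIONS mapping and the output_path helper are hypothetical and not part of the SDK. Only the three format names listed above are assumed to exist.

```python
from pathlib import Path

# Hypothetical mapping from read() format names to file extensions.
EXTENSIONS = {"raw": ".html", "text": ".txt", "markdown": ".md"}


def output_path(stem, fmt, directory="."):
    """Build a file path whose extension matches the requested format."""
    if fmt not in EXTENSIONS:
        raise ValueError(f"unknown format {fmt!r}; expected one of {sorted(EXTENSIONS)}")
    return Path(directory) / f"{stem}{EXTENSIONS[fmt]}"


print(output_path("page_content", "markdown"))
```

Rejecting unknown format names early keeps a typo such as "markdwon" from silently producing a file with the wrong extension.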

Example Use Cases

Extract article content:

browser.page.goto("https://example.com/article")
result = read(browser, format="markdown")
article_content = result["content"]
print(f"Article length: {result['length']} characters")
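
Once an article is available as markdown, its structure can be inspected with ordinary string tools. The sketch below builds a simple outline from ATX-style headings (lines starting with one to six # characters). The outline function is a hypothetical helper, and sample_markdown stands in for article_content. It does not skip # lines inside fenced code blocks, so treat it as a rough approximation.

```python
import re

# Matches lines such as "## Section title" (1 to 6 leading # characters).
HEADING = re.compile(r"^(#{1,6})[ \t]+(.+?)[ \t]*$", re.MULTILINE)


def outline(markdown):
    """Return (level, title) pairs for each ATX heading, in document order."""
    return [(len(m.group(1)), m.group(2)) for m in HEADING.finditer(markdown)]


# Stand-in for article_content from read(browser, format="markdown").
sample_markdown = "# Title\n\nIntro.\n\n## Part one\n\nText.\n\n## Part two\n"
for level, title in outline(sample_markdown):
    print("  " * (level - 1) + title)
```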

Extract text for analysis:

result = read(browser, format="text")
text_content = result["content"]
# Use with NLP libraries or text analysis tools
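
As a small example of text analysis, the sketch below counts the most frequent words using only the standard library. top_words is a hypothetical helper, and sample_text stands in for text_content. The tokenizer is deliberately naive (lowercase ASCII letters and apostrophes only), so a proper NLP library would be a better fit for non-English or punctuation-heavy text.

```python
import re
from collections import Counter


def top_words(text, n=3):
    """Return the n most common words as (word, count) pairs."""
    # Naive tokenizer: runs of ASCII letters and apostrophes, case-insensitive.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(n)


# Stand-in for text_content from read(browser, format="text").
sample_text = "The cat sat. The cat ran. The dog sat."
top = top_words(sample_text)
print(top)
```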

Save content to file:

result = read(browser, format="markdown")
with open("page_content.md", "w", encoding="utf-8") as f:
    f.write(result["content"])