Screenshot API

Capture screenshots of the current page, and search for text on the page to get its pixel coordinates.

screenshot() - Capture Screenshot

Captures a screenshot of the current page.

from sentience import screenshot
import base64

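# Capture a PNG screenshot (browser is an open SentienceBrowser instance)
data_url = screenshot(browser, format="png")

# Save to file: drop the data URL prefix (everything up to the first comma), then decode
image_data = base64.b64decode(data_url.split(",", 1)[1])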
with open("screenshot.png", "wb") as f:
    f.write(image_data)

# JPEG with quality control
data_url = screenshot(browser, format="jpeg", quality=85)

Parameters:

browser (SentienceBrowser): Browser instance
format (str/string): Image format, "png" or "jpeg" (default: "png"). In TypeScript, pass it in the options object.
quality (int/number): JPEG quality (1-100, default: 80). Ignored for PNG.

Returns: Base64-encoded data URL string
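
Because the return value is a data URL rather than raw bytes, it has to be split and decoded before it can be written to disk or handed to an image library. The sketch below is one way to do that, assuming the string has the usual "data:<mime type>;base64,<payload>" shape. The helper name data_url_to_bytes is hypothetical and not part of the SDK, and the sample string stands in for a real screenshot() result.

```python
import base64


def data_url_to_bytes(data_url: str) -> tuple[str, bytes]:
    """Split a base64 data URL into its MIME type and decoded bytes.

    Hypothetical helper, not part of the SDK.
    """
    header, sep, payload = data_url.partition(",")
    if not sep or not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a base64 data URL")
    mime_type = header[len("data:"):-len(";base64")]
    return mime_type, base64.b64decode(payload)


# Stand-in for a screenshot() return value: the 8-byte PNG signature, encoded
sample = "data:image/png;base64," + base64.b64encode(b"\x89PNG\r\n\x1a\n").decode("ascii")
mime_type, raw = data_url_to_bytes(sample)
```

Checking the MIME type before saving helps pick the right file extension when the same code path handles both PNG and JPEG captures.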

find_text_rect() / findTextRect() - Search for Text and Get Its Pixel Coordinates

Finds all occurrences of text on the page and returns their exact pixel coordinates. This is useful for locating UI elements by their visible text content without needing element IDs or selectors.

from sentience import SentienceBrowser, find_text_rect, click_rect

with SentienceBrowser() as browser:
    browser.page.goto("https://example.com")

    # Find "Sign In" button
    result = find_text_rect(browser, "Sign In")
    if result.status == "success" and result.results:
        first_match = result.results[0]
        print(f"Found at: ({first_match.rect.x}, {first_match.rect.y})")
        print(f"Size: {first_match.rect.width}x{first_match.rect.height}")
        print(f"In viewport: {first_match.in_viewport}")

    # Case-sensitive search
    result = find_text_rect(browser, "LOGIN", case_sensitive=True)

    # Whole word only (won't match "log" as part of "login")
    result = find_text_rect(browser, "log", whole_word=True)

    # Find and click the first visible match
    result = find_text_rect(browser, "Buy Now", max_results=5)
    if result.status == "success" and result.results:
        for match in result.results:
            if match.in_viewport:
                # Click in the center of the text
                click_rect(browser, {
                    "x": match.rect.x,
                    "y": match.rect.y,
                    "w": match.rect.width,
                    "h": match.rect.height
                })
                break

Parameters:

browser / page (SentienceBrowser/Page): Browser instance or Playwright Page
text (str/string): Text to search for (required)
case_sensitive / caseSensitive (bool, optional): If True, search is case-sensitive (default: False)
whole_word / wholeWord (bool, optional): If True, only match whole words surrounded by whitespace (default: False)
max_results / maxResults (int/number, optional): Maximum number of matches to return (default: 10, max: 100)
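
To make the interaction between these options concrete, here is a rough approximation of the matching rules applied to a plain Python string. It is not the SDK's implementation, which runs inside the browser. The function name find_offsets is hypothetical, and clamping max_results to the range 1-100 is an assumption based on the documented maximum.

```python
import re

MAX_RESULTS_CAP = 100  # documented upper bound for max_results


def find_offsets(haystack, text, case_sensitive=False, whole_word=False, max_results=10):
    """Return start offsets of matches, approximating the documented options."""
    pattern = re.escape(text)
    if whole_word:
        # "Whole word" here means bounded by whitespace or the string edges
        pattern = r"(?<!\S)" + pattern + r"(?!\S)"
    flags = 0 if case_sensitive else re.IGNORECASE
    # Assumption: out-of-range values are clamped rather than rejected
    limit = max(1, min(max_results, MAX_RESULTS_CAP))
    offsets = [m.start() for m in re.finditer(pattern, haystack, flags)]
    return offsets[:limit]


page_text = "log in to login Log"
default_hits = find_offsets(page_text, "log")
whole_word_hits = find_offsets(page_text, "log", whole_word=True)
case_hits = find_offsets(page_text, "log", case_sensitive=True)
```

With the defaults, "log" matches inside "login" and also matches "Log". Setting whole_word drops the "login" hit, and setting case_sensitive drops the "Log" hit.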

Returns: TextRectSearchResult with:

status (str/string): "success" or "error"
query (str/string): The search text that was queried
case_sensitive / caseSensitive (bool): Whether search was case-sensitive
whole_word / wholeWord (bool): Whether whole-word matching was used
matches (int/number): Number of matches found
results (list/array): List of text matches, each containing:
- text (str/string): The matched text
- rect (TextRect): Absolute rectangle coordinates (with scroll offset)
- viewport_rect / viewportRect (ViewportRect): Viewport-relative coordinates
- context (TextContext): Surrounding text (before/after)
- in_viewport / inViewport (bool): Whether visible in current viewport
viewport (Viewport): Current viewport dimensions and scroll position
error (str/string): Error message if status is "error"
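
A common way to consume this result is to check status, walk results, and act on the first match that is currently visible. The sketch below uses SimpleNamespace objects as stand-ins for a real TextRectSearchResult so it runs without a browser. The helper first_visible_center is hypothetical, and it relies only on the fields used in the example above (status, results, in_viewport, and rect with x, y, width, height).

```python
from types import SimpleNamespace


def first_visible_center(result):
    """Return the (x, y) center of the first in-viewport match, or None.

    Coordinates come from match.rect, so they are absolute page coordinates.
    """
    if result.status != "success":
        return None
    for match in result.results:
        if match.in_viewport:
            rect = match.rect
            return (rect.x + rect.width / 2, rect.y + rect.height / 2)
    return None


# Stand-in result: the first match is scrolled out of view, the second is visible
fake_result = SimpleNamespace(
    status="success",
    results=[
        SimpleNamespace(in_viewport=False, rect=SimpleNamespace(x=0, y=2000, width=80, height=20)),
        SimpleNamespace(in_viewport=True, rect=SimpleNamespace(x=100, y=40, width=60, height=20)),
    ],
)
center = first_visible_center(fake_result)
```

Returning None for both the error case and the no-visible-match case keeps the caller simple. Code that needs to tell the two apart can inspect result.error first.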

Use Cases:

Find buttons, links, or labels by their visible text
Locate UI elements without CSS selectors or element IDs
Get exact pixel coordinates for click automation
Search for specific text in dynamic content
Verify text visibility and position

Notes:

Returns both absolute coordinates (including scroll offset) and viewport-relative coordinates
Includes context text (20 chars before/after) to help identify matches
Respects page scroll position - use in_viewport to filter visible matches
Maximum 100 results to prevent performance issues
Does not consume API credits (runs locally in browser)
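
The relationship between the two coordinate systems can be sketched as follows: a viewport-relative rectangle is the absolute rectangle shifted by the current scroll offset. The Rect class and both helpers are hypothetical illustrations, not SDK types. Treating a match as visible when any part of it overlaps the viewport is an assumption, since the exact rule behind in_viewport is not stated here.

```python
from dataclasses import dataclass


@dataclass
class Rect:
    x: float
    y: float
    width: float
    height: float


def to_viewport(rect, scroll_x, scroll_y):
    """Shift an absolute (page) rectangle into viewport-relative coordinates."""
    return Rect(rect.x - scroll_x, rect.y - scroll_y, rect.width, rect.height)


def overlaps_viewport(vrect, viewport_width, viewport_height):
    """True if any part of a viewport-relative rectangle is on screen (assumed rule)."""
    return (
        vrect.x < viewport_width
        and vrect.x + vrect.width > 0
        and vrect.y < viewport_height
        and vrect.y + vrect.height > 0
    )


# A match 1200px down the page, with the page scrolled 1000px
absolute = Rect(x=100, y=1200, width=80, height=20)
relative = to_viewport(absolute, scroll_x=0, scroll_y=1000)
visible = overlaps_viewport(relative, viewport_width=1280, viewport_height=720)
```

In practice the SDK already returns both rectangles, so a conversion like this is mainly useful for reasoning about where a match will land after a scroll.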
