POST /v1/snapshot

Refine coarse UI geometry into a ranked, compact element list for agent decision making.

Overview

The SDK calls this endpoint internally. The local extension or Playwright session collects raw_elements, and the server returns elements with an importance ranking and visual cues.

Prefer the SDK: If you're integrating an agent, use the SDK Quick Start to get action execution and consistent snapshots automatically.

Request Format

curl -X POST https://api.sentienceapi.com/v1/snapshot \
  -H "Authorization: Bearer sk_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "viewport": { "width": 1920, "height": 1080 },
    "raw_elements": [
      {
        "id": 0,
        "tag": "button",
        "rect": { "x": 100, "y": 200, "width": 150, "height": 40 },
        "styles": { "z_index": "10", "cursor": "pointer" },
        "attributes": { "role": "button", "aria_label": "Sign in" },
        "text": "Sign in",
        "in_viewport": true,
        "is_occluded": false
      }
    ],
    "options": {
      "limit": 50,
      "filter": { "min_area": 100, "allowed_roles": ["button", "textbox"] }
    }
  }'
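
For callers who prefer Python over curl, the same request can be sketched with the standard library alone. The helper name build_snapshot_request is hypothetical and not part of any SDK. It only constructs the request object, so nothing is sent until you pass it to urllib.request.urlopen.

```python
import json
import urllib.request

API_URL = "https://api.sentienceapi.com/v1/snapshot"


def build_snapshot_request(api_key, url, viewport, raw_elements, options=None):
    """Build (but do not send) a POST request for /v1/snapshot.

    Hypothetical helper: it mirrors the curl example, nothing more.
    """
    payload = {"url": url, "viewport": viewport, "raw_elements": raw_elements}
    if options is not None:
        payload["options"] = options  # options is optional per the parameter table
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_snapshot_request(
    api_key="sk_live_...",  # placeholder, as in the curl example
    url="https://example.com",
    viewport={"width": 1920, "height": 1080},
    raw_elements=[
        {
            "id": 0,
            "tag": "button",
            "rect": {"x": 100, "y": 200, "width": 150, "height": 40},
            "styles": {"z_index": "10", "cursor": "pointer"},
            "attributes": {"role": "button", "aria_label": "Sign in"},
            "text": "Sign in",
            "in_viewport": True,
            "is_occluded": False,
        }
    ],
    options={"limit": 50, "filter": {"min_area": 100, "allowed_roles": ["button", "textbox"]}},
)

# To send it:
#   with urllib.request.urlopen(request) as resp:
#       body = json.load(resp)
```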

Request Parameters

Parameter	Type	Required	Description
`url`	`string`	Yes	The URL of the webpage being analyzed
`viewport`	`object`	Yes	Viewport dimensions with width and height
`raw_elements`	`array`	Yes	Array of raw element data from browser
`options`	`object`	No	Filtering and limiting options (limit, filter)

Response Format

{
  "status": "success",
  "url": "https://example.com",
  "viewport": { "width": 1920, "height": 1080 },
  "elements": [
    {
      "id": 0,
      "role": "button",
      "text": "Sign in",
      "importance": 85,
      "visual_cues": { "is_primary": true, "is_clickable": true },
      "bbox": { "x": 100, "y": 200, "width": 150, "height": 40 },
      "in_viewport": true,
      "is_occluded": false,
      "z_index": 10
    }
  ]
}

Response Fields

Field	Type	Description
`status`	`string`	Request status: "success" or "error"
`url`	`string`	The URL that was analyzed
`viewport`	`object`	Viewport dimensions used
`elements`	`array`	Ranked array of refined elements; the remaining rows describe the fields of each entry
`id`	`number`	Element identifier matching input
`role`	`string`	Semantic role (button, textbox, link, etc.)
`text`	`string`	Element text content
`importance`	`number`	Importance score from 0-100 (higher = more important)
`visual_cues`	`object`	Visual hints for agent decision-making
`bbox`	`object`	Bounding box with x, y, width, height
`in_viewport`	`boolean`	Whether element is visible in current viewport
`is_occluded`	`boolean`	Whether element is hidden by other elements
`z_index`	`number`	CSS z-index value
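
If you consume the raw JSON rather than the SDK, a small typed wrapper can make the fields above easier to work with. The Element dataclass and parse_elements function below are illustrative names, not part of the API or SDK. The sketch assumes each entry carries exactly the documented fields and silently ignores any others.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Element:
    """One entry of the `elements` array (field names follow the table above)."""
    id: int
    role: str
    text: str
    importance: float
    visual_cues: dict
    bbox: dict
    in_viewport: bool
    is_occluded: bool
    z_index: int


_FIELD_NAMES = {f.name for f in fields(Element)}


def parse_elements(response):
    """Convert the `elements` array of a response dict into Element objects."""
    return [
        # Drop keys this sketch does not know about instead of failing on them.
        Element(**{k: v for k, v in item.items() if k in _FIELD_NAMES})
        for item in response.get("elements", [])
    ]


sample_response = {
    "status": "success",
    "url": "https://example.com",
    "viewport": {"width": 1920, "height": 1080},
    "elements": [
        {
            "id": 0,
            "role": "button",
            "text": "Sign in",
            "importance": 85,
            "visual_cues": {"is_primary": True, "is_clickable": True},
            "bbox": {"x": 100, "y": 200, "width": 150, "height": 40},
            "in_viewport": True,
            "is_occluded": False,
            "z_index": 10,
        }
    ],
}

parsed = parse_elements(sample_response)
```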

Use Cases

When to Use This API

Use /v1/snapshot when you need fine-grained control:

Custom automation frameworks - Integrate with non-Playwright browsers
Research and experimentation - Test ranking algorithms
Specialized pipelines - Build custom visual AI workflows
Headless environments - Where SDK dependencies aren't available

When to Use the SDK

The SDK is recommended for most use cases because it:

✅ Handles snapshot collection automatically
✅ Provides action execution APIs (click, type, wait)
✅ Includes error handling and retries
✅ Maintains session consistency across actions

Get started with the SDK →

Best Practices

1. Always include viewport dimensions

Accurate coordinates depend on consistent viewport size. Use the same dimensions for snapshots and actions.
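
To illustrate why the viewport matters, here is a minimal sketch that derives a click point from a bbox and rejects it when it falls outside the viewport. It assumes bbox coordinates are viewport-relative pixels, which this page does not state explicitly, and click_point is a hypothetical helper name.

```python
def click_point(bbox, viewport):
    """Return the centre of bbox as (x, y), or None if it lies outside the viewport.

    Assumption: bbox uses viewport-relative pixel coordinates.
    """
    cx = bbox["x"] + bbox["width"] / 2
    cy = bbox["y"] + bbox["height"] / 2
    inside = 0 <= cx < viewport["width"] and 0 <= cy < viewport["height"]
    return (cx, cy) if inside else None


bbox = {"x": 100, "y": 200, "width": 150, "height": 40}

# Same viewport as the snapshot: the centre is a valid target.
same_viewport = click_point(bbox, {"width": 1920, "height": 1080})

# A narrower viewport at action time: the stored coordinates no longer fit.
narrow_viewport = click_point(bbox, {"width": 160, "height": 1080})
```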

2. Filter early to reduce costs

Use options.limit and options.filter to shrink the returned element list, which reduces token costs and improves response times:

{
  "options": {
    "limit": 20,
    "filter": {
      "min_area": 100,
      "allowed_roles": ["button", "textbox", "link"]
    }
  }
}
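
The same idea can be applied client-side before upload to shrink the request payload. The sketch below is an assumption about what the filter means, not the server's implementation: it treats min_area as rect width times height and matches allowed_roles against attributes.role. The function name prefilter_raw_elements is hypothetical.

```python
def prefilter_raw_elements(raw_elements, min_area=0, allowed_roles=None):
    """Drop raw elements that are too small or whose role is not allowed.

    Assumed semantics: area = rect.width * rect.height, role = attributes.role.
    """
    kept = []
    for element in raw_elements:
        rect = element["rect"]
        if rect["width"] * rect["height"] < min_area:
            continue  # too small to matter
        role = element.get("attributes", {}).get("role")
        if allowed_roles is not None and role not in allowed_roles:
            continue  # role not in the allow-list
        kept.append(element)
    return kept


raw = [
    {"id": 0, "rect": {"x": 100, "y": 200, "width": 150, "height": 40},
     "attributes": {"role": "button"}},
    {"id": 1, "rect": {"x": 0, "y": 0, "width": 5, "height": 5},
     "attributes": {"role": "button"}},
    {"id": 2, "rect": {"x": 10, "y": 300, "width": 80, "height": 20},
     "attributes": {"role": "link"}},
]

kept = prefilter_raw_elements(raw, min_area=100, allowed_roles=["button", "textbox"])
```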

3. Check occlusion before interacting

Always verify is_occluded: false before attempting to click or type:

if element.is_occluded:
    print(f"Element {element.id} is hidden by another element")
else:
    # Safe to interact
    click(browser, element.id)

4. Sort by importance score

Elements arrive pre-sorted by importance, so you rarely need to sort them yourself. To narrow the list further, filter by score:

# Get only high-importance elements
important_elements = [e for e in elements if e.importance > 70]
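
When working with the raw JSON response rather than SDK objects, the occlusion, viewport, and importance checks can be combined into one selection step. This is a minimal sketch operating on plain dicts. The name actionable_elements and the default threshold of 70 are illustrative choices, not API behavior.

```python
def actionable_elements(elements, min_importance=70, top_n=5):
    """Return up to top_n elements that are in view, unoccluded, and important.

    Works on the plain dicts found in the response's `elements` array.
    """
    candidates = [
        e for e in elements
        if e["in_viewport"] and not e["is_occluded"] and e["importance"] > min_importance
    ]
    # The response is already ranked; sorting again keeps the result correct
    # even if the caller passes a reordered list.
    return sorted(candidates, key=lambda e: e["importance"], reverse=True)[:top_n]


elements = [
    {"id": 0, "importance": 85, "in_viewport": True, "is_occluded": False},
    {"id": 1, "importance": 90, "in_viewport": True, "is_occluded": True},
    {"id": 2, "importance": 40, "in_viewport": True, "is_occluded": False},
    {"id": 3, "importance": 95, "in_viewport": True, "is_occluded": False},
]

best = actionable_elements(elements)
```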

Integration Example

Here's how the SDK uses this endpoint internally:

from sentience import SentienceBrowser, snapshot

with SentienceBrowser(api_key="sk_...") as browser:
    browser.page.goto("https://example.com")

    # This calls /v1/snapshot behind the scenes
    snap = snapshot(browser)

    # snap.elements contains the ranked results
    for element in snap.elements[:5]:  # Top 5 most important
        print(f"{element.role}: {element.text} (importance: {element.importance})")

Error Handling

If the request fails, you'll receive an error response:

{
  "status": "error",
  "error": "Invalid viewport dimensions",
  "message": "Viewport width and height must be positive integers"
}
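
One way to surface such responses in Python is to turn them into exceptions. The sketch below assumes only the three fields shown above (status, error, message). SnapshotError and ensure_success are hypothetical names.

```python
class SnapshotError(Exception):
    """Raised when /v1/snapshot returns status "error" (illustrative class)."""

    def __init__(self, error, message):
        super().__init__(f"{error}: {message}")
        self.error = error
        self.message = message


def ensure_success(response):
    """Return the response unchanged, or raise SnapshotError on an error status."""
    if response.get("status") == "error":
        raise SnapshotError(
            response.get("error", "Unknown error"),
            response.get("message", ""),
        )
    return response


error_response = {
    "status": "error",
    "error": "Invalid viewport dimensions",
    "message": "Viewport width and height must be positive integers",
}

try:
    ensure_success(error_response)
    caught = None
except SnapshotError as exc:
    caught = exc
```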

Common errors:

Invalid viewport dimensions - Check width/height are positive integers
Missing required field - Ensure url, viewport, and raw_elements are provided
Invalid element format - Verify raw_elements array structure
Rate limit exceeded - Reduce request frequency or upgrade plan
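
Several of the errors above can be caught before the request is sent. The following pre-flight check is a sketch based only on the rules stated on this page (required fields, positive integer viewport dimensions, raw_elements as an array). The server may enforce more, and validate_payload is a hypothetical helper.

```python
def validate_payload(payload):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []

    for field in ("url", "viewport", "raw_elements"):
        if field not in payload:
            problems.append(f"Missing required field: {field}")

    viewport = payload.get("viewport")
    if isinstance(viewport, dict):
        for key in ("width", "height"):
            value = viewport.get(key)
            # bool is a subclass of int in Python, so exclude it explicitly.
            if not isinstance(value, int) or isinstance(value, bool) or value <= 0:
                problems.append(f"viewport.{key} must be a positive integer")
    elif "viewport" in payload:
        problems.append("viewport must be an object")

    if "raw_elements" in payload and not isinstance(payload["raw_elements"], list):
        problems.append("raw_elements must be an array")

    return problems


good = {
    "url": "https://example.com",
    "viewport": {"width": 1920, "height": 1080},
    "raw_elements": [],
}
bad = {"viewport": {"width": 0, "height": 1080}}

good_problems = validate_payload(good)
bad_problems = validate_payload(bad)
```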

Next Steps

SDK Quick Start - Get started with the full SDK
Semantic Queries - Master element finding strategies
Action Execution - Learn how to interact with elements
