How SentienceAPI Works

From raw webpages to deterministic agent actions — with full observability.

Most tools help agents load pages or read content.
SentienceAPI helps agents act — reliably.

The Problem: In practice, agents fail far more often at execution than at reasoning

Modern LLMs are good at deciding what they want to do.
They are unreliable at deciding where to do it on a real webpage.

  • Clicking invisible or occluded elements
  • Guessing between similar buttons
  • Retrying with different prompts
  • Burning tokens on vision loops
  • Non-reproducible behavior across runs

Sentience fixes this by removing guesswork from action selection.

Where SentienceAPI Fits

SentienceAPI sits between the browser and your LLM.
It converts live webpages into action-ready targets that agents can select deterministically — without brittle selectors or probabilistic vision.

Your Agent (LLM + logic)
    Defines the goal as a natural-language intent
        ↓ natural language goal
Sentience SDK (Python / TS)
    Launches the browser, triggers snapshot(), executes actions
        ↓ raw geometry + DOM
Browser + WASM Extension
    Captures layout, visibility & occlusion, stable coordinates
        ↓ raw geometry data
Sentience Gateway (Cloud)
    Semantic grounding, heuristics + ML reranking, action-ready elements
        ↓ ranked, stable targets
Deterministic Actions
    Click / Type / Scroll, verifiable results, replayable steps
        ↓
Execution Traces & Studio
    Every snapshot and action is recorded; replay, diff, and debug any agent run; power CI-style validation

How It Works (One Pass)

1. Your Agent Defines the Goal

Your agent (LLM + logic) decides what it wants to do:

  • "Click the search input"
  • "Add the item to cart"
  • "Submit the form"

Sentience does not replace planning or reasoning.

2. The Sentience SDK Controls the Browser

Using the Sentience SDK (Python or TypeScript), your agent:

  • launches a real browser
  • navigates pages
  • requests a snapshot()

This snapshot is not HTML or pixels — it's structured geometry with semantic rankings.

3. Browser + WASM Capture Raw Geometry

A lightweight browser extension captures:

  • element bounding boxes (x, y, w, h)
  • visibility and occlusion
  • layout structure
  • stable coordinates

No inference. No guessing. Just ground truth.
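As an illustrative sketch of what one captured record might contain (the field names here are assumptions, not the extension's actual schema), geometry capture reduces each element to measured values:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ElementGeometry:
    """One captured element: measured values only, no inference."""
    node_id: int
    x: float          # bounding-box origin, CSS pixels
    y: float
    w: float          # bounding-box width
    h: float          # bounding-box height
    visible: bool     # inside the viewport and actually rendered
    occluded: bool    # covered by another element at its midpoint

    def center(self) -> tuple[float, float]:
        """A stable click coordinate: the bounding-box midpoint."""
        return (self.x + self.w / 2, self.y + self.h / 2)

search_box = ElementGeometry(node_id=42, x=320.0, y=88.0, w=480.0, h=36.0,
                             visible=True, occluded=False)
print(search_box.center())  # (560.0, 106.0)
```

Because every field is measured rather than inferred, two snapshots of the same page state yield identical records.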

4. The Sentience Gateway Grounds Actions

Raw geometry is sent to the Sentience Gateway, which:

  • assigns semantic roles (button, input, link, etc.)
  • computes visibility & occlusion explicitly
  • applies heuristics and ML reranking
  • produces a ranked, execution-ready action space

This is the intelligence layer.
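A toy version of the reranking idea, assuming a simple visibility-then-keyword-overlap score with document order as a deterministic tiebreak (the Gateway's actual heuristics and ML model are more involved):

```python
def rerank(elements: list[dict], intent_keywords: list[str]) -> list[dict]:
    """Order candidates: visible & unoccluded first, then by keyword
    overlap with the intent, with node order as a stable tiebreak."""
    def score(el):
        text = el["text"].lower()
        overlap = sum(kw in text for kw in intent_keywords)
        return (
            el["visible"] and not el["occluded"],  # actionable at all?
            overlap,                               # semantic match
            -el["node_id"],                        # earlier node wins ties
        )
    return sorted(elements, key=score, reverse=True)

candidates = [
    {"node_id": 1, "text": "Sign in", "visible": True, "occluded": False},
    {"node_id": 2, "text": "Sign in", "visible": True, "occluded": True},
    {"node_id": 3, "text": "Forgot password?", "visible": True, "occluded": False},
]
best = rerank(candidates, ["sign", "in"])[0]
print(best["node_id"])  # 1: visible, unoccluded, and matching the intent
```

Note that the occluded duplicate ranks last even though its text matches: geometry signals, not text alone, decide what is actionable.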

5. Deterministic Actions Are Executed

Your agent selects a target from the grounded action space:

  • click
  • type
  • scroll
  • wait

The SDK executes the action exactly where intended; in most cases, no retries are needed.
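The execution step can be sketched as a small deterministic dispatch table (the names here are illustrative, not the SDK's API): every action either maps to exactly one handler or fails loudly, never silently guessing.

```python
from typing import Callable

def do_click(target: str) -> str:
    """Placeholder handler: a real client would drive the browser here."""
    return f"clicked {target}"

def do_type(target: str, text: str = "") -> str:
    return f"typed {text!r} into {target}"

ACTIONS: dict[str, Callable[..., str]] = {
    "click": do_click,
    "type": do_type,
}

def execute(action: str, target: str, **kwargs) -> str:
    """Deterministic dispatch: unknown actions raise instead of guessing."""
    if action not in ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    return ACTIONS[action](target, **kwargs)

print(execute("click", "el_7"))  # clicked el_7
```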

Want to see this in action?

Run a live example using the Sentience SDK — no setup required.
👉 Try it live

What Makes This Different

Sentience vs Browser Infrastructure

Browser infrastructure gives you a place to run code.

Sentience gives your agent certainty about where to act.

Without grounded action selection, agents still guess.

Sentience vs Scrapers / Read APIs

Scrapers are excellent for:

  • reading
  • summarizing
  • RAG

They do not tell an agent:

  • what is clickable
  • what is visible
  • where it is on screen

Reading ≠ acting.

Sentience is built for agents that must interact.

What the Agent Actually Receives

Instead of pixels or raw DOM, your agent gets:

  • Ranked list of actionable elements: ordered by relevance and visibility
  • Stable geometry: consistent coordinates and bounding boxes
  • Explicit signals: visibility, occlusion, primary-action cues
  • Deterministic ordering: the same input produces the same output every time

This drastically reduces:

  • Token usage
  • Retries
  • Hallucinated actions
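To illustrate why this cuts token usage, here is a sketch (field names assumed, not the actual payload schema) of serializing only the actionable, already-ranked elements into compact prompt lines, instead of sending raw DOM or screenshots:

```python
def to_prompt_lines(elements: list[dict], limit: int = 20) -> list[str]:
    """Compact, deterministic serialization of the action space for the
    LLM: only visible, unoccluded elements, highest-ranked first."""
    actionable = [e for e in elements if e["visible"] and not e["occluded"]]
    return [
        f'[{e["id"]}] {e["role"]} "{e["text"]}" @({e["x"]},{e["y"]})'
        for e in actionable[:limit]
    ]

elements = [
    {"id": 3, "role": "button", "text": "Sign in",
     "visible": True, "occluded": False, "x": 512, "y": 300},
    {"id": 9, "role": "link", "text": "Hidden promo",
     "visible": False, "occluded": False, "x": 0, "y": -400},
]
print(to_prompt_lines(elements))
# ['[3] button "Sign in" @(512,300)']
```

The invisible element never reaches the prompt, so the model cannot hallucinate an action against it.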

Built-In Observability (Sentience Studio & Traces)

Every step is recorded automatically:

  • snapshots
  • ranked targets
  • chosen action
  • execution result

These traces power:

  • step-by-step replay
  • visual debugging
  • determinism diffing
  • CI-style validation

Nothing is hidden. Nothing is guessed.
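Determinism diffing over recorded traces can be sketched like this (the trace schema here is assumed for illustration): canonicalize each recorded step and report where two runs diverge.

```python
import json

def diff_traces(run_a: list[dict], run_b: list[dict]) -> list[int]:
    """Return indices of steps whose recorded state differs between two
    runs. An empty result means the runs were step-for-step identical."""
    mismatches = [
        i for i, (a, b) in enumerate(zip(run_a, run_b))
        # sort_keys gives a canonical serialization for comparison
        if json.dumps(a, sort_keys=True) != json.dumps(b, sort_keys=True)
    ]
    # Steps present in only one run also count as divergence.
    mismatches.extend(range(min(len(run_a), len(run_b)),
                            max(len(run_a), len(run_b))))
    return mismatches

run1 = [{"action": "click", "target": 42, "result": "ok"},
        {"action": "type", "target": 7, "result": "ok"}]
run2 = [{"action": "click", "target": 42, "result": "ok"},
        {"action": "type", "target": 9, "result": "ok"}]
print(diff_traces(run1, run2))  # [1]
```

In a CI-style check, an empty diff against a golden trace is the pass condition.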

When You Should Use Sentience

Sentience is designed for:

  • Agents that must act, not just read
  • Production workflows where retries are expensive
  • Systems that need auditability and replay
  • Teams debugging real-world agent failures

If your agent only reads text, Sentience is unnecessary.

If your agent must click, type, scroll, or submit — Sentience is the missing layer.

Try It Live

If you're building agents that must act, SentienceAPI is the missing layer.

Explore interactive SDK examples, or test the API directly with real automation scenarios.

Navigate to a login page, find email/password fields semantically, and submit the form.

# No selectors. No vision. Stable semantic targets.
from sentience import SentienceBrowser, snapshot, find, click, type_text, wait_for

# Initialize browser with API key
browser = SentienceBrowser(api_key="sk_live_...")
browser.start()

# Navigate to login page
browser.page.goto("https://example.com/login")

# PERCEPTION: Find elements semantically
snap = snapshot(browser)
email_field = find(snap, "role=textbox text~'email'")
password_field = find(snap, "role=textbox text~'password'")
submit_btn = find(snap, "role=button text~'sign in'")

# ACTION: Interact with the page
type_text(browser, email_field.id, "user@example.com")
type_text(browser, password_field.id, "secure_password")
click(browser, submit_btn.id)

# VERIFICATION: Wait for navigation
wait_for(browser, "role=heading text~'Dashboard'", timeout=5.0)

print("✅ Login successful!")
browser.close()

🎯 Semantic Discovery

Find elements by role, text, and visual cues, not fragile CSS selectors

⚡ Token Optimization

Intelligent filtering reduces token usage by up to 73% vs vision models

🔒 Deterministic

Same input produces same output every time, with no random failures

SentienceAPI focuses on execution intelligence. Browser runtimes and navigation engines are intentionally decoupled.