NEW in v0.3.0+ - The Agent Abstraction Layer provides natural language automation with 4 levels of control, from simple commands to full conversational AI.
The Sentience SDK offers multiple levels of abstraction for browser automation:
| Level | Use Case | Code Reduction | Requirements |
|---|---|---|---|
| Level 1: Raw Playwright | Maximum control, edge cases | 0% | None |
| Level 2: Direct SDK | Precise control, debugging | 80% | Sentience API key |
| Level 3: SentienceAgent | Quick automation, step-by-step | 95% | Sentience + LLM API keys |
| Level 4: ConversationalAgent | Complex tasks, chatbots | 99% | Sentience + LLM API keys |
Quick Tip: Start with Level 3 (SentienceAgent) for most automation tasks. Upgrade to Level 4 (ConversationalAgent) when you need multi-step planning or conversational interfaces.
Use Playwright directly with CSS selectors - no Sentience SDK or LLM required:

```python
from playwright.sync_api import sync_playwright

# Pure Playwright - no Sentience SDK
with sync_playwright() as p:
    browser = p.chromium.launch(headless=False)
    page = browser.new_page()

    # Navigate
    page.goto("https://amazon.com")

    # Find elements with CSS selectors
    page.locator('input[id="twotabsearchtextbox"]').fill("wireless mouse")
    page.locator('input[id="nav-search-submit-button"]').click()
    page.wait_for_selector('.s-result-item')

    # Click first result
    page.locator('.s-result-item').first.click()

    browser.close()
```

When to use Level 1:
Limitations:
Use Sentience SDK for semantic element finding without LLMs:
```python
from sentience import SentienceBrowser, snapshot, find, click, type_text, press

# Sentience SDK - semantic queries, no LLM
with SentienceBrowser(api_key="your_key") as browser:
    browser.page.goto("https://amazon.com")

    # Semantic element finding (no CSS selectors!)
    snap = snapshot(browser)
    search_box = find(snap, "role=textbox text~'search'")
    type_text(browser, search_box.id, "wireless mouse")
    press(browser, "Enter")

    # Wait for results
    snap = snapshot(browser)
    first_result = find(snap, "role=link importance>500")
    click(browser, first_result.id)
```

Benefits over Level 1:
When to use Level 2:
Use single natural language commands - the agent handles the rest:
```python
from sentience import SentienceBrowser, SentienceAgent
from sentience.llm import OpenAIProvider

# 1. Create browser and LLM provider
browser = SentienceBrowser(api_key="your_sentience_key")
llm = OpenAIProvider(api_key="your_openai_key", model="gpt-4o")

# 2. Create agent
agent = SentienceAgent(browser, llm)

# 3. Navigate and use natural language commands
browser.page.goto("https://amazon.com")
agent.act("Click the search box")
agent.act("Type 'wireless mouse' into the search field")
agent.act("Press Enter key")
agent.act("Click the first product result")

# Check token usage
print(f"Tokens used: {agent.get_token_stats()['total_tokens']}")
```

ONE command does everything - automatic planning and execution:
```python
from sentience import SentienceBrowser, ConversationalAgent
from sentience.llm import OpenAIProvider

# 1. Setup
browser = SentienceBrowser(api_key="your_sentience_key")
llm = OpenAIProvider(api_key="your_openai_key", model="gpt-4o")
agent = ConversationalAgent(browser, llm)

# 2. Natural language - agent plans and executes automatically
browser.page.goto("https://amazon.com")
response = agent.execute(
    "Search for wireless mouse and tell me the price of the top result"
)
print(response)  # "I found the top result for wireless mouse. It's priced at $24.99..."

# 3. Follow-up questions maintain context
follow_up = agent.chat("Add it to cart")
print(follow_up)

# 4. Get conversation summary
summary = agent.get_summary()
print(summary)
```

Both agents work with multiple LLM providers:

```python
# OpenAI (GPT-4, GPT-4o, etc.)
from sentience.llm import OpenAIProvider
llm = OpenAIProvider(api_key="sk_...", model="gpt-4o")
# Anthropic (Claude)
from sentience.llm import AnthropicProvider
llm = AnthropicProvider(api_key="sk_...", model="claude-3-5-sonnet-20241022")
# Google Gemini
from sentience.llm import GeminiProvider
llm = GeminiProvider(api_key="your_gemini_key", model="gemini-pro")
# GLM (ChatGLM, GLM-4, etc.)
from sentience.llm import GLMProvider
llm = GLMProvider(api_key="your_glm_key", model="glm-4")
# Local LLM (e.g., Qwen, Llama, etc.)
from sentience.llm import LocalLLMProvider
llm = LocalLLMProvider(base_url="http://localhost:8000/v1", model="Qwen/Qwen2.5-3B-Instruct")
```
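Any of these providers can be passed to `SentienceAgent` or `ConversationalAgent` in place of `OpenAIProvider`. As a minimal sketch, pairing the agent with the local provider (so there is no per-token LLM cost), assuming an OpenAI-compatible server is already running at the given `base_url`:

```python
from sentience import SentienceBrowser, SentienceAgent
from sentience.llm import LocalLLMProvider

# Local model served over an OpenAI-compatible API - no per-token LLM cost
llm = LocalLLMProvider(base_url="http://localhost:8000/v1", model="Qwen/Qwen2.5-3B-Instruct")

browser = SentienceBrowser(api_key="your_sentience_key")
agent = SentienceAgent(browser, llm)

browser.page.goto("https://amazon.com")
agent.act("Click the search box")
```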
Understanding the cost and complexity tradeoffs between levels:

Same task: "Search Amazon for wireless mouse and click first result"
| Level | Lines of Code | Complexity | Credits Used | LLM Tokens |
|---|---|---|---|---|
| Level 1 | ~15 lines | High (CSS selectors) | 0 | 0 |
| Level 2 | ~10 lines | Medium (semantic queries) | ~2-4 | 0 |
| Level 3 | ~5 lines | Low (natural language) | ~2-4 | ~1,500 |
| Level 4 | ~3 lines | Very Low (one command) | ~2-4 | ~2,500 |
Level 3: SentienceAgent - Manual step-by-step commands
Level 4: ConversationalAgent - Automatic planning
Sentience API credits are the same across all SDK levels (Levels 2-4); LLM costs apply only to Levels 3 and 4:
| Level | Sentience Credits | LLM Cost | Total Cost |
|---|---|---|---|
| Level 1 | $0 | $0 | $0 |
| Level 2 | $0.004 | $0 | $0.004 |
| Level 3 | $0.004 | $0.006 | $0.010 |
| Level 4 | $0.004 | $0.010 | $0.014 |
| Level 3 (Local LLM) | $0.004 | $0 | $0.004 |
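As a rough sanity check on the totals above, a minimal sketch using the per-task figures from the table (actual LLM pricing depends on your provider and model):

```python
# Illustrative per-task cost estimate for Level 3, using the figures in the table above
sentience_credits_cost = 0.004   # ~2-4 credits per task
llm_cost = 0.006                 # ~1,500 tokens with a hosted model such as gpt-4o

print(f"Level 3 (hosted LLM): ${sentience_credits_cost + llm_cost:.3f}")  # $0.010
print(f"Level 3 (local LLM):  ${sentience_credits_cost:.3f}")             # $0.004 - LLM cost drops to zero
```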
Cost Optimization Tips:
- Use `use_api=False` in snapshots to avoid credit usage (free tier)
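For example, a minimal sketch of the free-tier path, assuming `snapshot` accepts the `use_api` flag as described above:

```python
from sentience import SentienceBrowser, snapshot, find, click

with SentienceBrowser(api_key="your_key") as browser:
    browser.page.goto("https://amazon.com")

    # use_api=False keeps the snapshot on the free tier (no credits consumed)
    snap = snapshot(browser, use_api=False)
    search_box = find(snap, "role=textbox text~'search'")
    click(browser, search_box.id)
```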