San Francisco, CA
- On-site (5 days/week)
- Full-time
- Compensation: $180,000–$250,000 + competitive equity
About The Company
An early-stage, venture-backed AI startup building systems that operate computers the way humans do (navigating browsers, processing documents, and working through legacy systems) to automate the messiest enterprise finance operations. The company is going after the $300B+ business process outsourcing (BPO) industry that software historically couldn't touch, and is already live with enterprise customers ranging from $500M to $5B in revenue.
- Founded: 2025
- Team size: 6 people
- Industry: Applied AI / enterprise automation
The Role
Own the intelligence that powers the automation. You'll turn research into production across browser agent reliability, document understanding, and inference optimization — making the system more accurate and faster every week.
What You'll Be Doing
- Push core automation capabilities to state-of-the-art: UI interaction, unstructured-data parsing, and tool use.
- Build adaptive systems that self-heal when environments change.
- Design fine-tuning pipelines that learn from customer-specific workflows.
- Optimize latency across the stack via model selection, quantization, caching, and routing strategies.
- Improve browser agent reliability and document-understanding accuracy on real enterprise data.
Tech stack: Python, PyTorch, and modern ML frameworks; LLMs, agents, RAG, and fine-tuning; inference optimization (quantization, caching, routing).
Requirements
- Strong proficiency in Python and ML frameworks, particularly PyTorch.
- Applied ML/AI engineering experience at a strong company.
- Eval-and-metric mindset — thinks in terms of metrics that matter in production, not just benchmarks.
- Comfort with messy data and figuring out how to make it useful.
- Track record of shipping — can describe specific systems built end-to-end, not just research.
- Crisp communication about own work — can describe what they built in a few clear sentences without buzzwords.
- Based in San Francisco or willing to relocate; in-person 5 days a week.
Green Flags
- Real applied ML or AI engineering work at a respected Series A–D startup or selective technical org (calibration anchors: Ramp, Databricks, Scale, Stripe).
- Lab or research exposure (SAIL, BAIR, MIT CSAIL, or similar) paired with evidence of shipping, not just publishing — the combination is the highest-signal background.
- Recent momentum toward LLMs, agents, RAG, fine-tuning, or production ML systems; direct adjacency to the roadmap (browser agent reliability, document understanding, inference optimization).
- Experience with RL, retrieval systems, or agent-based systems.
- Cross-stack range: inference optimization, data pipelines, fine-tuning, and model monitoring.
- Published ML papers or significant OSS contributions.
Red Flags
- Resumes or LinkedIn profiles stuffed with 300–400 word descriptions full of buzzwords and keywords.
- Inability to clearly articulate what they actually built and how they thought through problems.
- Communication style that sounds like reading off a script or cue card.
Why Join
- Category-defining problem: AI that actually operates software end-to-end, aimed at a $300B+ market.
- Frontier research-to-production work on browser agents, document understanding, and inference optimization.
- Ground-floor ownership on a small SF team, owning the intelligence layer of the product.
- Live enterprise customers and strong early traction.
Details
- Location: San Francisco, CA
- Work policy: In-person, 5 days a week (relocation supported)
- Compensation: $180,000–$250,000 + equity
- Visa sponsorship: H-1B, O-1
- Employment type: Full-time