Senior RL (Reinforcement Learning) Engineer - San Francisco, CA

Staffworx Limited

United States

Information Technology

H-1B

OPT

Salary

Up to $500,000 per annum + equity

Job posted on May 30, 2026

Job Description:

Senior Reinforcement Learning Engineer
San Francisco, CA (FiDi) | On-site 5 days | Full-time
$300,000 -- $500,000 + Equity

A small, elite AI research team working on reinforcement learning in open-ended settings. The team includes researchers from leading PhD programmes and tier-one AI organisations. Early-stage, well-resourced, and moving quickly -- this is a genuine ground-floor opportunity with significant scope and impact.

The Role

We are hiring for a Senior RL Engineer to sit at the intersection of research and production engineering. You will own the translation of RL research ideas into reliable, measurable training systems and drive technically complex projects end to end.

Key Responsibilities

Build and improve RL training pipelines for language model-based agents
Implement reward functions, verifiers, environment interfaces, rollout pipelines, and evaluation harnesses
Design experiments to test whether RL methods are improving model behaviour, sample efficiency, robustness, or generalisation
Build monitoring tooling: regression tests, eval suites, and reward-hacking checks
Debug unstable training runs and diagnose learning dynamics failures across algorithms, rewards, data, infrastructure, and evals
Manage GPU clusters, distributed training, and compute efficiency
Build 0-to-1 systems for new RL workflows and harden them into reusable infrastructure
Own ambiguous technical problems from problem framing through to delivery

Requirements

Strong applied ML engineering background: shipped systems, open-source work, competitions, or early-stage startup experience
Hands-on experience scaling RL pipelines and debugging training issues
Familiarity with RL environments and large language models; diffusion model experience a plus
Python proficiency and strong working knowledge of PyTorch or JAX
Solid grounding in RL, supervised learning, optimisation, and modern deep learning
Independent, intellectually curious, and able to drive ambiguous problems to working solutions
Comfortable collaborating with researchers while holding high engineering standards
PhD not required -- strong applied experience equally valued

Package

Visa sponsorship available (H1B transfer, TN, OPT, O-1); existing US work authorisation preferred