Software Engineer, Model Inference, Mid-Level

Jobright.ai

San Francisco, California, United States

Information Technology

H-1B

Job posted on August 10, 2025


Job Description:

Jobright is an AI-powered career platform that helps job seekers discover the top opportunities in the US. We are NOT a staffing agency. Jobright does not hire directly for these positions. We connect you with verified openings from employers you can trust.
Job Summary: OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. They are seeking a Software Engineer, Model Inference to optimize AI models for high-volume, low-latency production environments, collaborating with researchers and engineers to enhance performance and efficiency.
Responsibilities:
• Work alongside machine learning researchers, engineers, and product managers to bring our latest technologies into production.
• Work alongside researchers to enable advanced research through awesome engineering.
• Introduce new techniques, tools, and architecture that improve the performance, latency, throughput, and efficiency of our model inference stack.
• Build tools to give us visibility into our bottlenecks and sources of instability and then design and implement solutions to address the highest priority issues.
• Optimize our code and fleet of Azure VMs to utilize every FLOP and every GB of GPU RAM of our hardware.
Qualifications:
Required:
• Have an understanding of modern ML architectures and an intuition for how to optimize their performance, particularly for inference.
• Own problems end-to-end, and are willing to pick up whatever knowledge you're missing to get the job done.
• Have at least 5 years of professional software engineering experience.
• Have or can quickly gain familiarity with PyTorch, NVIDIA GPUs and the software stacks that optimize them (e.g. NCCL, CUDA), as well as HPC technologies such as InfiniBand, MPI, NVLink, etc.
• Have experience architecting, building, observing, and debugging production distributed systems.
• Have needed to rebuild or substantially refactor production systems several times over due to rapidly increasing scale.
• Are self-directed and enjoy figuring out the most important problem to work on.
• Have a humble attitude, an eagerness to help your colleagues, and a desire to do whatever it takes to make the team succeed.
Preferred:
• Bonus points if you have worked on performance-critical distributed systems.
Company: OpenAI creates artificial intelligence technologies to assist with tasks and provide support for human activities. Founded in 2015, the company is headquartered in San Francisco, California, USA, with a team of 201-500 employees. The company is currently in the growth stage. OpenAI has a track record of offering H-1B sponsorships.
