Job Summary: Scale AI is transforming how organizations build and deploy AI, focusing on advancing generative AI initiatives. As a Data Infrastructure Engineer on the AI Infrastructure team, you will design and maintain scalable data platforms to support R&D and applied ML workloads, collaborating closely with various teams to enhance data quality and optimize infrastructure costs.

Responsibilities:
• Design, implement, and maintain scalable data platforms to support diverse R&D and applied ML workloads.
• Partner with ML researchers, product engineers, and operations teams to align data infrastructure with organizational goals.
• Collaborate with ML researchers to build data access tools that help advance the state of frontier post-training research.
• Participate in our team's on-call process to ensure the availability of our services.
• Own projects end-to-end, from requirements and scoping through design and implementation, in a highly collaborative and cross-functional environment.

Qualifications:

Required:
• 2+ years of experience building and operating large-scale distributed data systems that support ML workloads.
• Expertise in modern data platform technologies.
• Experience working with standard containerization and deployment technologies such as Kubernetes, Helm, Terraform, and Docker.
• Strong problem-solving skills and the ability to work effectively in a fast-paced, dynamic environment.

Preferred:
• Familiarity with ML development tools such as PyTorch, HuggingFace, or Weights & Biases.
• Experience with a variety of storage systems: object (S3), document (MongoDB), relational (Postgres), and distributed (Redis, Elasticsearch).
• Exposure to orchestration platforms like Temporal, Airflow, or AWS Step Functions.
• Experience supporting post-training workflows such as evaluation, fine-tuning, and RLHF in LLM systems.
• Experience working in a fast-moving startup or high-scale ML infra environment.

Company: Scale AI provides a data-oriented platform that assists in the development of AI applications. Founded in 2016, the company is headquartered in San Francisco, California, USA, and has 501-1000 employees. It is currently a late-stage company and has a track record of offering H-1B sponsorship.