Salary
💰 $250,000 - $380,000 per year
Tech Stack
Distributed Systems
About the role
- Design and maintain standardized dataset APIs, including for multimodal datasets too large to fit in memory
- Build proactive testing and scale-validation pipelines for dataset loading at GPU scale
- Integrate datasets into training and inference pipelines, collaborating with multimodal researchers and infra teams
- Document and maintain dataset interfaces for discoverability and consistent adoption
- Establish safeguards and validation systems to ensure reproducibility of standardized datasets
- Debug and resolve performance bottlenecks in distributed dataset loading (e.g., stragglers)
- Provide visualization and inspection tools to surface errors, bugs, or bottlenecks
- Work on LLM training and inference infrastructure to support massive-scale GPU/accelerator fleets
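The first responsibility above, a dataset API for data too large to fit in memory, is commonly handled by streaming records shard by shard rather than materializing the whole dataset. A minimal sketch of that pattern is below; the names (`ShardedDataset`, `read_shard`) and the in-memory "shards" are illustrative assumptions, not anything specified in this posting.

```python
from typing import Iterator

def read_shard(shard: list[dict]) -> Iterator[dict]:
    # Stand-in for lazily reading records from one on-disk shard
    # (e.g. a Parquet or WebDataset file in a real pipeline).
    yield from shard

class ShardedDataset:
    """Streams records shard by shard, so only one shard is live at a time."""

    def __init__(self, shards: list[list[dict]]):
        self.shards = shards

    def __iter__(self) -> Iterator[dict]:
        for shard in self.shards:
            # Each shard is opened, iterated, and released before the next,
            # keeping peak memory bounded by a single shard.
            yield from read_shard(shard)

# Usage: two tiny in-memory "shards" stand in for files on disk.
ds = ShardedDataset([[{"id": 0}, {"id": 1}], [{"id": 2}]])
print([r["id"] for r in ds])
```

The same iterator shape is what lets a data loader interleave shards across workers and validate throughput at GPU scale without ever holding the full dataset in RAM.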
Requirements
- Strong engineering fundamentals with experience in distributed systems, data pipelines, or infrastructure
- Experience building APIs, modular code, and scalable abstractions with attention to UX
- Comfortable debugging bottlenecks across large fleets of machines
- Collaborative, humble, and able to own foundational ML infrastructure
- Bonus: background in applied mathematics, probability, or the theory of distributed data systems
- Bonus: experience with GPU-scale distributed systems or dataset scaling for real-time data