Lead Software Engineer, AI Data Systems

Upwork

full-time

Posted on: 9/11/2025

Origin: 🇺🇸 United States

💰 $151,500 - $215,500 per year

Senior

Apache, AWS, Azure, Cloud, Google Cloud Platform, Pandas, Python, Spark

About the role

Collect high-quality training data and curate datasets for supervised, unsupervised, and reinforcement learning use cases
Build scalable featurization and preprocessing pipelines to transform raw data into structured inputs for AI/ML model development
Partner with ML engineers and researchers to define data requirements and production workflows for LLM-based agents and autonomous AI systems
Lead development of infrastructure enabling experimentation, evaluation, and deployment of machine learning models in production
Support orchestration and real-time inference pipelines using Python and cloud-native tools, ensuring low-latency and high availability
Mentor engineers and foster a high-performance, collaborative engineering culture
Drive cross-functional alignment with product, infrastructure, and research stakeholders on progress, goals, and architecture

Requirements

Strong software engineering background with deep experience in building data collection, transformation, and featurization pipelines at scale
Proficiency in Python, including async programming and concurrency tools
Experience with data-centric frameworks such as Pandas, Spark, or Apache Beam
Familiarity with ML model development workflows and infrastructure, including dataset versioning, experiment tracking, and model evaluation
Experience deploying and scaling AI systems in cloud environments such as AWS, GCP, or Azure
Proven success operating in highly ambiguous environments such as research labs, startups, or fast-paced product teams
Experience in startups, AI/ML research environments, or similarly dynamic settings is essential
Track record of working with or alongside high-caliber peers in top engineering teams, research groups, or startup ecosystems
Growth mindset, strong communication skills, and a commitment to inclusive collaboration and continuous learning