Responsibilities
Design and optimize systems for large-scale signal generation, indexing, serving, and applications
Build and maintain content feature lifecycle management, including generation, storage, sourcing, monitoring, and deprecation of unused features
Simplify the content feature development process by collaborating with ML data platform teams and improving tooling for generation, storage, and sourcing
Optimize and monitor signal pipelines for reliability, latency, and scalability
Develop infrastructure for training data pipelines, including logjoin optimization, streaming logjoin, data sampling, data shuffling, and window tuning
Build and maintain training data for new applications and ranking models, including experiments on long-term objectives such as user retention and creator affinity
Collaborate with ML engineers to improve training workflows (feature engineering, preprocessing, model iterations, evaluation, and inference)
Build training data monitoring and analysis tools with the Bento and data infrastructure teams, including SQL-based analysis, feature importance, discrepancy detection, and anomaly detection
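To make the logjoin and window-tuning responsibilities above concrete, here is a minimal, purely illustrative sketch (plain Python, with hypothetical record fields `request_id` and `ts`, and a hypothetical `window_secs` parameter — none of these come from Snap's actual stack): joining impression logs with engagement logs inside an attribution window, where unmatched impressions become negative training examples.

```python
from collections import defaultdict

def log_join(impressions, engagements, window_secs=600):
    """Join each engagement to its impression when it arrives within
    `window_secs` of the impression timestamp (a positive example);
    impressions with no engagement in the window become negatives."""
    by_key = defaultdict(list)
    for imp in impressions:
        by_key[imp["request_id"]].append(imp)

    joined, matched = [], set()
    for eng in engagements:
        for imp in by_key.get(eng["request_id"], []):
            # Window tuning: widening window_secs captures late engagements
            # at the cost of delaying label availability.
            if 0 <= eng["ts"] - imp["ts"] <= window_secs:
                joined.append({**imp, "label": 1})
                matched.add(imp["request_id"])

    # Unmatched impressions are emitted as negative examples.
    for rid, imps in by_key.items():
        if rid not in matched:
            joined.extend({**imp, "label": 0} for imp in imps)
    return joined
```

A streaming logjoin applies the same idea continuously, using watermarks instead of a closed batch to decide when a window can be finalized.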
Requirements
Strong programming skills in Python, Java, Scala, or C++
Strong problem-solving skills with a focus on system performance, data quality, and scalability
Deep understanding of distributed systems, data pipelines, and ML infrastructure
Experience with big data processing frameworks such as Spark, Flink, Dataflow, or Ray
Familiarity with feature engineering, signal pipelines, and model training workflows
Proven track record of operating highly available and reliable infrastructure at scale
Ability to proactively learn new concepts and apply them in a fast-paced environment
Strong collaboration skills with ML engineers, data scientists, and infrastructure teams
Bachelor’s degree in a technical field such as computer science, or equivalent experience
6+ years of post-Bachelor’s software development experience; or Master’s degree + 5+ years; or PhD + 2+ years
Experience building large-scale data or ML production systems, distributed systems, or big data processing systems
Preferred: Master’s/PhD, experience with feature platforms, logjoin optimization, training data systems, TensorFlow, PyTorch, Spark ML, signal pipelines, feature registries, retrieval systems, data quality monitoring
Preferred: hands-on experience with Snap internal tech stacks such as Robusta, Hashi, Dataflow, Feature Registry, Mixer, Retrieval Service, logjoin, and dcoll
Benefits
Paid parental leave
Comprehensive medical coverage
Emotional and mental health support programs
Compensation packages that let you share in Snap’s long-term success
This position is eligible for equity in the form of RSUs