Design and implement scalable, high-performance data processing pipelines capable of handling petabyte-scale telemetry data (logs, metrics, traces)
Build and optimize ML-driven data routing, filtering, and transformation engines to reduce customer data volumes by 80%+ while preserving critical insights
Develop real-time analytics and anomaly detection systems using advanced machine learning techniques and large language models
Architect cloud-native microservices and APIs that integrate seamlessly with major observability platforms (Splunk, Elastic, Datadog, New Relic)
Implement robust monitoring, alerting, and observability solutions for distributed systems operating at enterprise scale
Collaborate with Product and DevOps teams to translate customer requirements into technical solutions
Optimize system performance, cost efficiency, and reliability through continuous profiling, testing, and infrastructure improvements
Mentor junior engineers and contribute to engineering culture through code reviews, technical design discussions, and knowledge sharing
Stay current with emerging technologies in AI/ML, data engineering, and observability to drive innovation and competitive advantage
Requirements
5+ years of software engineering experience with focus on distributed systems, data engineering, or ML infrastructure in high-growth SaaS environments
Expert-level proficiency in Go, Rust, or Java, with a strong understanding of system design patterns and software architecture principles
Deep experience with cloud platforms (AWS, GCP, Azure) and container orchestration technologies (Kubernetes, Docker)
Proven track record in building and scaling data pipelines using technologies like Apache Kafka, Apache Spark, Apache Flink, or similar streaming frameworks
Strong background in database technologies, including both SQL (PostgreSQL, MySQL) and NoSQL (MongoDB, Cassandra, Redis) systems
Hands-on experience with machine learning frameworks (TensorFlow, PyTorch, scikit-learn) and MLOps practices for production ML systems
Expertise in observability and monitoring practices, with experience integrating platforms such as Prometheus, Grafana, or the ELK stack
Solid understanding of data formats, protocols, and standards used in enterprise observability (OpenTelemetry, StatsD, syslog, JSON, Parquet)
Experience with Infrastructure as Code tools (Terraform, CloudFormation) and CI/CD pipelines for automated deployment and testing
Strong analytical and problem-solving skills with the ability to optimize complex systems for performance, cost, and reliability
Excellent communication skills with experience collaborating across engineering, product, and customer-facing teams
Bachelor's degree in Computer Science, Engineering, or related field; advanced degree preferred
Benefits
Medical, Vision, Dental, 401(k), Commuter, Health and Dependent Care FSA
Unlimited PTO
Industry-leading gender-neutral parental leave
Paid company holidays
Paid sick time
Employee stock purchase program
Disability and life insurance
Employee assistance program
Gym membership reimbursement
Cell phone reimbursement
Numerous company-sponsored events, including regular happy hours and team-building activities