DevOps Engineer

• Design, implement, and maintain highly available, scalable, and secure cloud infrastructure for the Sweep Data platform and AI workloads using Infrastructure as Code practices
• Improve and expand the Datadog-based observability strategy for the Rails application and AI workloads
• Develop scalable infrastructure to support machine learning model training, deployment, and monitoring
• Participate in incident response and post-mortem reviews
• Support critical infrastructure scaling projects and high-traffic systems design
• Establish team processes including runbooks, workflows, and documentation
• Collaborate within the SRE guild and across engineering and AI/ML teams
• Manage day-to-day operations including on-call duties, capacity planning, and proactive system health monitoring
• Implement security measures and support enterprise customer security requirements, including bring-your-own-key (BYOK) encryption and data sovereignty compliance
• Contribute to maintaining SOC 2 Type 2 and ISO 27001 compliance
• Proactively improve systems and stay up to date with industry trends

Site Reliability Engineer

Associate Site Reliability Engineer

Senior Site Reliability Engineer

Site Reliability Engineer II – Network Operations

CI/CD Engineer