Comet

Senior DevOps Engineer

full-time

Origin: 🇺🇸 United States

Job Level

Senior

Tech Stack

AWS, Cloud, Docker, Go, Google Cloud Platform, Grafana, Java, Kubernetes, Linux, Open Source, Prometheus, Python, Terraform, Unix

About the role

  • Comet builds a development and experiment management platform for ML teams, used by Netflix, Uber, and others.
  • Design, implement, and manage scalable, secure, and reliable cloud-based infrastructure
  • Build and maintain CI/CD pipelines for efficient and consistent application delivery
  • Implement and manage Infrastructure as Code (IaC) to ensure consistency across environments
  • Drive adoption of best practices in automation, observability, and system reliability
  • Ensure security and compliance across infrastructure and deployments
  • Optimize cost management of cloud infrastructure
  • Collaborate with teams to improve processes and ensure operability
  • Troubleshoot, investigate, and resolve production issues affecting customers

Requirements

  • 5+ years of experience in a DevOps, SRE, or related role, including significant production experience
  • Proven remote work experience and strong collaboration skills in distributed teams
  • Deep understanding of DevOps practices, automation, CI/CD, and infrastructure-as-code
  • Passion for troubleshooting and root cause analysis
  • Strong experience with cloud platforms (AWS preferred, GCP a plus) and managing infrastructure with Terraform
  • Solid understanding of networking, security, and infrastructure best practices
  • Significant hands-on experience with containerization and orchestration (Docker, Kubernetes, Helm)
  • Experience with observability tools such as Prometheus, Grafana, or New Relic
  • Strong background in Linux/Unix system administration
  • Proficiency in scripting (Bash, Python)
  • Experience in software development (Java, Python, Go) is a plus
  • Knowledge of database management and performance optimization is a plus