CloudZero

Senior CloudOps Engineer

Full-time

Location Type: Hybrid

Location: Boston, Massachusetts, United States

About the role

  • Design and maintain Pulumi modules that provision reliable, cost-efficient cloud resources
  • Own infrastructure end to end, without clicking through consoles
  • Instrument systems so that failures surface quickly and debugging happens with data, not guesswork
  • Build observability into everything so you know about problems before customers do
  • Automate deployments, scaling, backups, and limit changes; if humans are doing it repeatedly, build a system to do it instead
  • Balance automation intelligently, building solutions to real problems rather than automating for its own sake
  • Help teams design resilient services, review architectures for operational complexity, and build deployment pipelines that enable safe and fast shipping
  • Optimize for cost and performance; CloudZero's business is helping others optimize cloud costs, and we should be exemplars of efficient cloud usage ourselves

Requirements

  • 3 to 5+ years of experience building and operating distributed systems in AWS
  • Strong skills in Python and Infrastructure as Code using Pulumi or Terraform
  • Experience with frontier AI models such as Claude, Codex, or Gemini
  • Hands-on experience with monitoring tools such as Prometheus or Datadog
  • Proven ability to debug production issues under pressure
  • Values thoughtful, reliable system design over reactive hero efforts
  • Strong documentation habits to support long-term team clarity and system stability
  • Ability to clearly explain complex technical issues to non-technical stakeholders
  • Excited to take ownership of infrastructure and solve operational challenges at scale

Benefits

  • Health insurance
  • 401(k)
  • Paid time off
  • Flexible work arrangements

Applicant Tracking System Keywords

Hard Skills & Tools
Python, Infrastructure as Code, Pulumi, Terraform, distributed systems, monitoring tools, debugging, observability, automation, cost optimization

Soft Skills
system design, documentation, communication, ownership, problem-solving, resilience, team collaboration, clarity, pressure management, thoughtfulness