Own end-to-end execution of the Laboratory Operations Software release process, ensuring smooth, timely releases with minimal downtime.
Continuously monitor the effectiveness of the release process and implement improvements to increase efficiency, reduce errors, and enhance overall quality.
Act as an internal consultant and subject matter expert, coaching individual product teams on best-in-class DevOps practices, including infrastructure-as-code (IaC), monitoring, logging, and security integration.
Embed with development teams to assess and improve DevOps maturity, delivery practices, and operational readiness.
Design and implement projects that support rapid growth in application complexity and enable innovation.
Provide hands-on guidance in CI/CD, cloud infrastructure usage, Kubernetes operations, and observability.
Help teams adopt existing infrastructure, platforms, and tooling provided by central Cloud / Platform teams.
Promote and reinforce technical standards, guardrails, and best practices that allow teams to operate autonomously while remaining compliant and secure.
Guide teams in applying organizational expectations around reliability, security, and cost management through automation rather than manual controls.
Serve as a feedback channel to central platform and cloud teams, sharing adoption challenges and improvement opportunities.
Continuously improve and automate infrastructure provisioning, configuration management, application deployment, and testing using tools such as Terraform, Kubernetes, and CI/CD pipelines.
Advocate for automation-first approaches to reduce operational toil and risk.
Partner with teams to define and implement Service Level Indicators (SLIs), Service Level Objectives (SLOs), and operational dashboards for their services.
Guide teams through incident response, post-incident reviews, and reliability improvements.
Identify systemic reliability issues and escalate platform-level concerns to the appropriate owning teams.
Drive capacity planning and performance tuning activities to ensure scalability and efficiency.
Provide expert-level support for complex infrastructure and deployment issues escalated by the product teams.
Assist teams in root cause analysis and long-term remediation.
Create and maintain clear, comprehensive documentation and runbooks covering the release process, CI/CD pipelines, and regression testing procedures.
Share best practices and lessons learned across teams to raise overall DevOps maturity.

Requirements

Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical experience.
7+ years of professional software engineering experience building production-grade systems with emphasis on automation, integrations and infrastructure tooling.
Excellent problem-solving skills with the ability to troubleshoot complex issues in a fast-paced environment.
Excellent communication, coaching, and collaboration skills, with the ability to work effectively across teams and convey technical concepts to non-technical stakeholders.
Deep understanding of Site Reliability Engineering (SRE) principles, including SLIs, SLOs, error budgets, and toil reduction.
Expertise in setting up and managing comprehensive monitoring, logging, and alerting systems.
Proven experience with incident response and leading post-incident review (post-mortem) processes.
Experience with capacity planning, performance analysis, and optimization of distributed systems.
Strong expertise in CI/CD tools (e.g., Jenkins, GitLab CI).
Practical experience building complex CI/CD pipelines.
Proficiency in at least one programming language (e.g., Java, Python).
Strong command of the AWS stack.
Proficiency in Docker, Kubernetes and Helm.
Experience working with SQL databases (e.g., MySQL, PostgreSQL).
Experience with version control systems (e.g., Git).
Experience working with Terraform.

Benefits

Comprehensive medical, dental, vision, life and disability plans for eligible employees and their dependents.
Free testing for employees and their immediate families.
Fertility care benefits.
Pregnancy and baby bonding leave.
401(k) benefits.
Commuter benefits.
Generous employee referral program.

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools

CI/CD, infrastructure-as-code, monitoring, logging, security integration, capacity planning, performance tuning, automation, programming language, database

Soft Skills

problem-solving, communication, coaching, collaboration