Senior Production Engineer
Veeam Software. Own complex and escalated production issues from support, and drive long-term fixes in collaboration with engineering, including code, configuration, and architecture changes.
Tech Stack
Tools & technologies
Azure, Cloud, Distributed Systems, Go, Grafana, Java, JavaScript, Prometheus, TypeScript
About the role
Key responsibilities & impact
- Own complex and escalated production issues from support, and drive long-term fixes in collaboration with engineering, including code, configuration, and architecture changes.
- Proactively identify and address risks uncovered during problem solving
- Lead production efficiency initiatives, and develop and maintain processes, runbooks, and knowledge base integrity
- Define, build and maintain production monitoring systems
- Continuously improve alerting to minimize noise and ensure actionable, well-documented runbooks.
- Define and maintain SLIs/SLOs for key services, and use error budgets to guide operational and product decisions.
- Turn manual processes into automation
- Own and drive post-mortem review process and actions arising from incident analysis.
- Collaborate with the support organization as an escalation point, and feed back knowledge and improvement recommendations.
- Collaborate with developers throughout the lifecycle of changes, from design through rollout and patch delivery, ensuring safe deployments and efficient incident mitigation.
- Participate in design reviews to ensure services are operable with minimal manual intervention in production (automation, safe deployments, clear runbooks), and share learnings through documentation and feedback.
Requirements
What you’ll need
- 3–5 years of experience in software engineering, site reliability, production engineering, or senior technical support roles operating distributed systems.
- Experience with log analysis and advanced troubleshooting
- Basic programming experience (e.g., JavaScript, Go, TypeScript, Java, or C#).
- Experience deploying and troubleshooting systems on public cloud platforms (Azure preferred).
- Familiarity with observability tooling (e.g., Elastic, Prometheus, Grafana, OpenTelemetry).
- Understanding of distributed systems, networking, automation and CI/CD.
Benefits
Comp & perks
- Unlimited paid time off, 12 paid holidays, plus 4 extra global VeeaMe Days for self-care and 24 paid volunteer hours annually through Veeam Cares
- Paid parental leave: 8 weeks for all parents, 16 weeks for birthing parents
- Medical, dental, and vision coverage starting on your first day
- Mental health support, therapy sessions, and digital wellness tools via our Employee Assistance Program
- 401(k) retirement plan with company matching contributions
- Fertility, adoption, and surrogacy support through Maven, plus paid volunteer time
- AirVet: 24/7 virtual veterinary care at no cost
- Legal services, identity protection, and supplemental health insurance options
- Tax-advantaged spending accounts for healthcare, dependent care, and commuting
- Opportunities to learn and grow through on-demand libraries (LinkedIn Learning, O’Reilly), mentoring, workshops, and learning events like our annual Global Day of Learning
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
log analysis, advanced troubleshooting, programming, JavaScript, Go, TypeScript, Java, C#, automation, CI/CD
Soft Skills
problem solving, collaboration, leadership, process development, documentation, incident analysis, risk management, communication, efficiency improvement, feedback