
Manager, Site Reliability Engineering
Dimensional Fund Advisors
full-time
Location Type: Hybrid
Location: Austin • Texas • United States
About the role
- Manage a global team of SREs, driving professional growth and operational excellence through coaching and mentorship.
- Own our monitoring strategy and keep the team apprised of our service’s health and performance indicators through dashboarding and alerting.
- Lead capacity planning and headroom management for the team's infrastructure to ensure we scale effectively.
- Collaborate with product and engineering teams to negotiate and manage error budgets, SLOs and SLIs.
- Drive the standardization of approaches, logging practices, and observability across the organization.
- Develop a strategy for intuitively navigable documentation and oversee its implementation.
- Act as the primary liaison between SRE, TPMs, DevOps, development teams and business stakeholders.
- Relentlessly pursue opportunities to eradicate toil through automation.
- Lead the debugging, troubleshooting, diagnosis, and resolution of incidents, ensuring rapid response and effective post-mortems.
Requirements
- Deep expertise in ELK, Prometheus, and Grafana.
- Proficiency in Python-based service development, Linux administration, and CI/CD.
- Experience with data flows using Airflow, dbt and Snowflake.
- Capability to write and run automated tests.
- Experience running software projects from ideation through design, implementation, deployment and operations.
- Demonstrated ability to be self-organized and self-driven with strong communication skills to influence cross-functional partners at all levels.
Benefits
- Dimensional offers a variety of programs to help take care of you, your family, and your career, including comprehensive benefits, educational initiatives, and special celebrations of our history, culture, and growth.
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
Python, Linux administration, CI/CD, ELK, Prometheus, Grafana, Airflow, dbt, Snowflake, automated testing
Soft Skills
coaching, mentorship, communication, self-organization, self-driven, influencing