Data Center Reliability Engineer
Phaidra
Data Center Reliability Engineer utilizing AI-powered tools for industrial automation and improving data center operations while collaborating with a remote team.
Posted 6/2/2026 • Full-time • Remote • Washington • 🇺🇸 United States • Junior, Mid-Level • 💰 $101,320 - $163,900 per year
Tech Stack
Tools & technologies: NumPy, Pandas, Python
About the role
Key responsibilities & impact
- Utilize existing data ingestion and delivery platforms to "teach" models to understand the physical world, filling a critical expertise gap in the data center industry.
- Use telemetry tools to analyze sensor data across mechanical (chillers, pumps) and electrical (UPS, switchgear, power feeds) systems to identify "failure signatures" for an LLM-driven monitoring tool.
- Act as a primary user of platforms, identifying gaps in current mechanisms and collaborating with Engineering to influence future features and data quality.
- Translate raw telemetry into "SME-level" logic and directions used by the LLM tool to guide data center operators in real-time.
- Cultivate deep domain expertise in all facets of data center infrastructure.
- Move from shadowing peers to directly supporting customers, using the platform to provide clear, data-backed direction on complex problems.
- Oversee pilot projects to test how the AI-driven SME tool interprets real-world stressors, ensuring the output is operationally realistic, accurate, and actionable.
- Remain agile and proactive in a fast-moving team environment.
Requirements
What you’ll need
- 2–3 years of relevant professional experience.
- Bachelor’s degree in Mechanical Engineering, Electrical Engineering, Control Theory, or a related field that provides a foundation in physical systems and thermodynamics.
- A deep, innate interest in using data to diagnose how and why systems fail. You are a "tinkerer" who prefers solving real-world problems over theoretical research.
- Strong Python skills and experience with data manipulation libraries (Pandas/NumPy) to perform custom analysis outside of standard tooling.
- Ability to explain complex diagnostic findings clearly and persuasively to both technical peers and non-domain stakeholders.
- A proven ability to look at a problem without preconceived notions and figure out solutions either independently or via team collaboration.
- Demonstrated commitment to Transparency, Collaboration, and Ownership—especially in environments where reliability and learning from failure are paramount.
Benefits
Comp & perks
- Fast-paced, team-oriented environment where your work directly shapes the company’s direction.
- We are a 100% remote company.
- Competitive compensation & meaningful equity.
- Outsized responsibilities & professional development.
- Training is foundational: functional, customer immersion, and development training.
- Medical, dental, and vision insurance (exact benefits vary by region).
- Unlimited paid time off, with a required minimum of 20 days per year.
- Paid parental leave (exact benefits vary by region).
- Flexible stipends to support your workspace, well-being, and continued professional development.
- Company MacBook.
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
Python, data manipulation, Pandas, NumPy, data analysis, telemetry analysis, mechanical systems, electrical systems, AI-driven tools, data ingestion
Soft Skills
problem-solving, communication, collaboration, transparency, ownership, adaptability, critical thinking, persuasiveness, independence, teamwork