Data Center Facility Telemetry & Controls Engineer
Lambda
Data center engineer responsible for telemetry and controls for AI infrastructure, including designing, deploying, and managing BMS and DCIM platforms across data centers.
Posted 6/23/2026 • Full-time • Remote • California • 🇺🇸 United States • Senior, Lead • 💰 $185,000 - $290,000 per year
Tech Stack
Tools & technologies
- Grafana
- Prometheus
About the role
Key responsibilities & impact
- Architect and manage BMS integration across colocation and Lambda-owned facilities, covering chillers, CRAHs, CDUs (Coolant Distribution Units), cooling towers, UPS systems, PDUs, and automatic transfer switches.
- Define standards for BMS point lists, naming conventions, control sequences, and integration protocols (BACnet, Modbus, SNMP, OPC-UA, RESTful APIs).
- Oversee commissioning and acceptance testing of new BMS deployments and CDU/TCS loop integrations for next-generation liquid-cooled GPU rack systems.
- Collaborate with colocation partners (Equinix, Digital Realty, and others) to ensure telemetry data flows from provider BMS/EPMS into Lambda's monitoring stack.
- Own the DCIM platform strategy and roadmap — evaluating, selecting, and implementing tooling for asset management, capacity planning, environmental monitoring, and power chain visibility.
- Develop and maintain real-time dashboards for PUE, thermal performance, stranded capacity, and cooling system efficiency across all Lambda sites.
- Build and maintain telemetry pipelines ingesting data from BMS, PDUs, in-rack sensors, CDUs, and network devices into centralized monitoring and alerting platforms (e.g., Prometheus, Grafana, InfluxDB, or equivalent).
- Define alarm thresholds and escalation workflows for critical facility events including high coolant temperatures, CDU inlet/outlet anomalies, leak detection, and power exceedances.
- Develop control strategies and setpoint frameworks for TCS (Thermal Control System) loops supporting direct liquid cooling at densities of 220–380 kW per rack.
- Evaluate and qualify CDU vendors on controls integration capabilities, telemetry exposure, and remote management interfaces.
- Define and enforce operational procedures for CDU commissioning, setpoint changes, loop pressure management, and fluid quality monitoring.
- Support design and construction coordination for liquid cooling infrastructure in new data center buildouts, ensuring BMS and controls readiness at Day 1.
- Establish and maintain facility event management processes, including on-call response protocols for facility telemetry anomalies.
- Lead root cause analysis for facility system failures and implement corrective actions to prevent recurrence.
- Partner with the data center operations team to maintain and refine emergency response runbooks tied to BMS alerts and automated controls.
- Drive continuous improvement in MTTR for facility-related events through better telemetry coverage and automated remediation.
- Manage BMS integrators, DCIM vendors, and control subcontractors from RFP through design, installation, commissioning, and ongoing support.
- Serve as the primary technical interface with colocation providers on all BMS/EPMS integration topics.
- Collaborate with Lambda's infrastructure engineering, construction, and procurement teams to align controls requirements with facility buildout timelines.
- Support due diligence and technical evaluation for new colocation sites and modular data center deployments from a telemetry and controls readiness perspective.
Requirements
What you’ll need
- 7+ years of experience in data center infrastructure engineering, with at least 4 years focused on BMS, DCIM, or controls systems in a hyperscale, colocation, or AI/HPC environment.
- Hands-on experience designing and integrating BMS for mission-critical facilities including UPS, PDU, CRAH/CRAC, chiller plant, cooling tower, and liquid cooling (CDU/in-row) systems.
- Strong working knowledge of industrial control protocols: BACnet IP/MS-TP, Modbus TCP/RTU, SNMP, DNP3, and modern API-based integrations.
- Demonstrated experience with DCIM platforms (Nlyte, Sunbird, Vertiv TRELLIS, or equivalent) including deployment, configuration, and ongoing administration.
- Experience with real-time telemetry stacks (Prometheus, InfluxDB, Grafana, or similar) applied to infrastructure monitoring use cases.
- Strong understanding of data center power and cooling systems, including PUE optimization, thermal management, and redundancy architectures (2N, N+1).
Benefits
Comp & perks
- Opportunity to shape the telemetry and controls architecture for one of the fastest-growing AI infrastructure platforms in the industry.
- Work with cutting-edge GPU infrastructure at rack densities at the frontier of what the industry has deployed.
- Collaborative environment with experienced infrastructure, construction, and vendor teams across a rapidly scaling global portfolio.
- Competitive compensation including salary, equity, and comprehensive benefits.
- Flexibility in work location with hybrid/remote options depending on facility portfolio needs.
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
BMS integration, DCIM, controls systems, telemetry data, cooling systems, real-time dashboards, control strategies, root cause analysis, automated remediation, capacity planning
Soft Skills
collaboration, leadership, communication, problem-solving, continuous improvement, project management, technical interface, operational procedures, event management, vendor management