Lambda

Data Center Facility Telemetry & Controls Engineer

Data center engineer responsible for telemetry and controls for AI infrastructure: designing, deploying, and managing BMS and DCIM platforms across data centers.

Posted 6/23/2026 • Full-time • Remote • California • 🇺🇸 United States • Senior, Lead • 💰 $185,000 - $290,000 per year

Tech Stack

Tools & technologies
Grafana, Prometheus

About the role

Key responsibilities & impact
  • Architect and manage BMS integration across colocation and Lambda-owned facilities, covering chillers, CRAHs, CDUs (Coolant Distribution Units), cooling towers, UPS systems, PDUs, and automatic transfer switches.
  • Define standards for BMS point lists, naming conventions, control sequences, and integration protocols (BACnet, Modbus, SNMP, OPC-UA, RESTful APIs).
  • Oversee commissioning and acceptance testing of new BMS deployments and CDU/TCS loop integrations for next-generation liquid-cooled GPU rack systems.
  • Collaborate with colocation partners (Equinix, Digital Realty, and others) to ensure telemetry data flows from provider BMS/EPMS into Lambda's monitoring stack.
  • Own the DCIM platform strategy and roadmap: evaluating, selecting, and implementing tooling for asset management, capacity planning, environmental monitoring, and power chain visibility.
  • Develop and maintain real-time dashboards for PUE, thermal performance, stranded capacity, and cooling system efficiency across all Lambda sites.
  • Build and maintain telemetry pipelines ingesting data from BMS, PDUs, in-rack sensors, CDUs, and network devices into centralized monitoring and alerting platforms (e.g., Prometheus, Grafana, InfluxDB, or equivalent).
  • Define alarm thresholds and escalation workflows for critical facility events including high coolant temperatures, CDU inlet/outlet anomalies, leak detection, and power exceedances.
  • Develop control strategies and setpoint frameworks for TCS (Thermal Control System) loops supporting direct liquid cooling at densities of 220–380 kW per rack.
  • Evaluate and qualify CDU vendors on controls integration capabilities, telemetry exposure, and remote management interfaces.
  • Define and enforce operational procedures for CDU commissioning, setpoint changes, loop pressure management, and fluid quality monitoring.
  • Support design and construction coordination for liquid cooling infrastructure in new data center buildouts, ensuring BMS and controls readiness at Day 1.
  • Establish and maintain facility event management processes, including on-call response protocols for facility telemetry anomalies.
  • Lead root cause analysis for facility system failures and implement corrective actions to prevent recurrence.
  • Partner with the data center operations team to maintain and refine emergency response runbooks tied to BMS alerts and automated controls.
  • Drive continuous improvement in MTTR for facility-related events through better telemetry coverage and automated remediation.
  • Manage BMS integrators, DCIM vendors, and control subcontractors, from RFP through design, installation, commissioning, and ongoing support.
  • Serve as the primary technical interface with colocation providers on all BMS/EPMS integration topics.
  • Collaborate with Lambda's infrastructure engineering, construction, and procurement teams to align controls requirements with facility buildout timelines.
  • Support due diligence and technical evaluation for new colocation sites and modular data center deployments from a telemetry and controls readiness perspective.

Requirements

What you’ll need
  • 7+ years of experience in data center infrastructure engineering, with at least 4 years focused on BMS, DCIM, or controls systems in a hyperscale, colocation, or AI/HPC environment.
  • Hands-on experience designing and integrating BMS for mission-critical facilities including UPS, PDU, CRAH/CRAC, chiller plant, cooling tower, and liquid cooling (CDU/in-row) systems.
  • Strong working knowledge of industrial control protocols: BACnet IP/MS-TP, Modbus TCP/RTU, SNMP, DNP3, and modern API-based integrations.
  • Demonstrated experience with DCIM platforms (Nlyte, Sunbird, Vertiv TRELLIS, or equivalent) including deployment, configuration, and ongoing administration.
  • Experience with real-time telemetry stacks (Prometheus, InfluxDB, Grafana, or similar) applied to infrastructure monitoring use cases.
  • Strong understanding of data center power and cooling systems, including PUE optimization, thermal management, and redundancy architectures (2N, N+1).

Benefits

Comp & perks
  • Opportunity to shape the telemetry and controls architecture for one of the fastest-growing AI infrastructure platforms in the industry.
  • Work with cutting-edge GPU infrastructure at rack densities at the frontier of what the industry has deployed.
  • Collaborative environment with experienced infrastructure, construction, and vendor teams across a rapidly scaling global portfolio.
  • Competitive compensation including salary, equity, and comprehensive benefits.
  • Flexibility in work location with hybrid/remote options depending on facility portfolio needs.

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
BMS integration, DCIM, controls systems, telemetry data, cooling systems, real-time dashboards, control strategies, root cause analysis, automated remediation, capacity planning
Soft Skills
collaboration, leadership, communication, problem-solving, continuous improvement, project management, technical interface, operational procedures, event management, vendor management