Flexential

Senior Platform Engineer

Flexential is seeking a Senior Platform Engineer to develop and manage critical IT platforms, with a focus on automation, observability, and high availability.

Posted 6/13/2026 • Full-time • Remote • Colorado • 🇺🇸 United States • Senior • 💰 $150,000 - $165,000 per year

Tech Stack

Tools & technologies
Ansible, Azure, Bootstrap, Cloud, Docker, Flux, Google Cloud Platform, Grafana, Kubernetes, Linux, Prometheus, Python, ServiceNow, TCP/IP, Terraform, Vault, VMware

About the role

Key responsibilities & impact
  • Design, develop, and operate automated, resilient, highly available, self-healing, secure platforms with native AI capabilities for IT needs, serving both internal and customer business capabilities.
  • Develop and manage the Observability OpenTelemetry Central Backend Stack: Grafana Enterprise, Mimir, Loki, Tempo, and Alertmanager on Kubernetes/RKE2 via Helm and GitLab CI/CD.
  • Build and manage IaC and CI/CD for automated provisioning and deployment, including Terraform modules for infrastructure/VM/storage provisioning, Ansible AWX playbooks for OS/app bootstrap, and ArgoCD and Helm for Kubernetes configuration.
  • Develop and manage the OpenTelemetry Prometheus scrape profile library, including SNMP exporters, REST API exporters, and cloud provider exporters (CloudWatch, Azure Monitor, GCP) for multiple device classes.
  • Develop AIOps capabilities on platforms, e.g. for observability use cases: anomaly detection integrations, event correlation rules in Alertmanager, and synthetic monitoring patterns to reduce alert noise.
  • Configure and maintain Zabbix auto-discovery: network range scanning, device classification, and Prometheus service discovery integration.
  • Build and harden Edge Stack deployments (Prometheus + OTel collector) per data center site using GitOps templates.
  • Integrate Alertmanager with ServiceNow: webhook routing, ticket enrichment, auto-close logic, and escalation policy configuration.
  • Maintain platform security: Conjur/CyberArk secret injection at runtime, mTLS between stack components, RBAC in Grafana Enterprise.
  • Author and maintain Grafana dashboards in JSON/GitLab: facility overview, network health, RED metrics, application telemetry.
  • Mentor mid-level engineers, lead code reviews, and establish engineering standards for the team.
  • Represent platform engineering in cross-functional architecture reviews and executive-level program updates.
  • Perform other duties as required and assigned.

Requirements

What you’ll need
  • DevOps / Automation - 5+ years in a production environment
  • Kubernetes (RKE2/k3s), Helm chart deployment, system services, Docker/containers
  • LGTM Stack Development and Configuration - 4+ years: Grafana, Mimir, Loki, Tempo configuration, tuning, dashboarding, and production operations; Prometheus required
  • Senior-level Python / Scripting frameworks - 5+ years, automation scripts, exporter development, GitLab pipeline scripting, REST API integrations
  • GitOps / CI/CD - 5+ years, GitLab CI/CD pipeline authoring; Terraform and Ansible as primary IaC tools; ArgoCD or Flux preferred
  • AIOps / Observability Engineering - 2+ years, Alertmanager rule authoring, anomaly detection integration, event correlation, noise reduction techniques
  • Working infrastructure (Linux/VM) management knowledge - 5+ years, Linux administration, VMware vCenter/VCF experience, NetApp storage management, network fundamentals (SNMP, TCP/IP)
  • Secrets Management - 2+ years, CyberArk/Conjur, HashiCorp Vault, or equivalent; runtime secret injection patterns
  • Minimal travel may be required

Benefits

Comp & perks
  • Medical, Telehealth, Dental and Vision
  • 401(k)
  • Health Savings Accounts (HSA) and Flexible Spending Accounts (FSA)
  • Life and AD&D
  • Short-Term and Long-Term Disability
  • Flex Paid Time Off (PTO)
  • Leave of Absence
  • Employee Assistance Program
  • Wellness Program
  • Rewards and Recognition Program
