
Principal Systems Engineer – IaaS, Hardware
Mara
full-time
Location Type: Hybrid
Location: Dallas • Texas • 🇺🇸 United States
Job Level
Lead
Tech Stack
Airflow, Ansible, Azure, Cloud, Go, Grafana, Kubernetes, NFS, OpenShift, OpenStack, Prometheus, Python, Spark, Terraform, VMware
About the role
- Architect and evolve the company’s IaaS platform across hybrid environments (on-premises, distributed), enabling secure and scalable compute foundations.
- Design, build, and maintain infrastructure automation frameworks using Terraform, Pulumi, and Ansible, including development of custom providers and modules.
- Define and enforce engineering standards for infrastructure provisioning, networking, and observability to ensure reliability, security, and consistency.
- Lead evaluation and integration of core technologies including OpenShift, Kubernetes, MAAS, and Ceph to optimize performance, cost, and maintainability.
- Drive multi-tenant PaaS initiatives and private cloud modernization leveraging OpenShift, Juju, and S3-compatible storage (Ceph, MinIO, TrueNAS).
- Collaborate with Data, ML, and Platform Engineering teams to align IaaS architecture with emerging workloads such as data pipelines, MLflow, and Airflow orchestration.
- Establish GitOps and CI/CD frameworks (ArgoCD, Helm, GitHub Actions, Azure DevOps) for consistent infrastructure delivery and configuration management.
- Lead capacity planning, HA/DR strategy, and monitoring/alerting design using Prometheus, Grafana, and Loki stacks.
- Partner with InfoSec to embed zero-trust, OIDC/SAML-based IAM, and secrets management best practices into the infrastructure lifecycle.
- Mentor engineers and contribute to organization-wide technical enablement through documentation, workshops, and community participation.
Requirements
- 10+ years of experience designing and operating large-scale infrastructure systems across on-prem and cloud environments.
- Proven expertise in Infrastructure as Code (Terraform, Pulumi, Ansible) with experience authoring reusable modules and providers.
- Deep understanding of hybrid and private cloud platforms (OpenShift, Juju, MAAS, OpenStack, VMware, Proxmox).
- Strong background in storage (Ceph, TrueNAS, S3, NFS) and networking (VLAN, VXLAN, SDN) for high-availability architectures.
- Demonstrated experience building GitOps-based deployment pipelines and maintaining production-grade Kubernetes environments.
- Familiarity with data and ML infrastructure integration; experience with MLflow, Airflow, Databricks, or Spark preferred.
- Strong proficiency in Python, Go, and Bash for automation and platform tooling.
- Excellent cross-functional leadership, communication, and mentorship skills.
Benefits
- Health insurance
- Retirement plans
- Paid time off
- Flexible work arrangements
- Professional development
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
Infrastructure as Code, Terraform, Pulumi, Ansible, OpenShift, Kubernetes, Ceph, Python, Go, Bash
Soft skills
leadership, communication, mentorship