Infrastructure Operations Engineer
Lightning AI
The Infrastructure Operations Engineer manages next-generation AI infrastructure across GPU systems at Lightning AI, collaborating with teams to improve operational efficiency and minimize incidents.
Posted 6/24/2026 • Full-time • Remote • California, New York, Washington • 🇺🇸 United States • Senior, Lead • 💰 $160,000 - $200,000 per year
Tech Stack
Tools & technologies: Ansible, AWS, Go, Kubernetes, Linux, NFS, Prometheus, Python, Terraform
About the role
Key responsibilities & impact
- At the direction of the Manager of Infrastructure Operations, design, build, and roll out new platforms and patterns to minimize incidents and enable customer-facing and internal features.
- Deploy updates and improvements to support both Voltage Park’s internal and end customer use cases.
- Collaborate with colleagues in Infrastructure Engineering, Network Operations, Customer Success and Software and Platform Development Teams.
- Participate in the on-call rotation, which is evenly distributed across all team members in a primary / secondary pattern: you serve as primary, then move to the secondary position.
Requirements
What you’ll need
- 8+ years working with Linux as a server / hosting platform; extra points for Ubuntu experience.
- 5+ years experience with AWS.
- 2+ years experience with Kubernetes and strong container fundamentals.
- 2+ years experience with Terraform and Ansible.
- 2+ years with network attached storage management (via NFS, Ceph, or other protocols). Extra points for experience with VAST storage systems.
- Experience with monitoring systems (Prometheus, ELK stack).
- Familiarity with the GitOps workflow.
- Software development experience using Python, Go, Bash, or other languages for automation and for connecting systems and APIs together.
- Deep networking fundamentals; extra points for experience with datacenter-level networks, 400Gb Ethernet, and InfiniBand.
- Experience building and delivering complex systems.
- Effective at navigating tradeoffs between design, risk, cost, and outcomes.
- Comfortable with navigating ambiguity.
- Strong written and oral communication.
Benefits
Comp & perks
- Comprehensive medical, dental and vision coverage (U.S.); Private medical and dental insurance (U.K.)
- Retirement and financial wellness support (U.S.); Pension contribution (U.K.)
- Generous paid time off, plus holidays
- Paid parental leave
- Professional development support
- Wellness and work-from-home stipends
- Flexible work environment