Staff Platform Engineer, AI/ML Infrastructure
Pfizer
Staff Platform Engineer providing technical leadership for AI/ML infrastructure at Pfizer. Focused on cloud platforms, deployment systems, and enterprise-scale generative AI applications.
Tech Stack
Tools & technologies: AWS, Azure, Cloud, Distributed Systems, DynamoDB, Google Cloud Platform, Grafana, Kubernetes, Microservices, Prometheus, Terraform
About the role
Key responsibilities & impact
- Provide technical leadership for the cloud platforms, deployment systems, and operational foundations that power enterprise-scale generative AI applications.
- Define and evolve the infrastructure architecture for AI/ML platforms running across AWS, Kubernetes, serverless, and containerized environments.
- Lead platform standards for reliability, scalability, observability, CI/CD, security, and developer enablement, while partnering closely with software engineering, AI engineering, security, and operations teams.
- Define and drive the technical strategy for AI/ML platform infrastructure supporting generative AI applications, LLM integrations, model routing, and enterprise AI services.
- Architect, build, and operate scalable cloud platforms using AWS services such as EKS, ECS Fargate, Lambda, DynamoDB, S3, OpenSearch, Secrets Manager, CloudWatch, ALB, and MWAA.
- Establish reusable infrastructure patterns using CloudFormation, Helm, and Terraform to support reliable multi-environment and multi-region deployments.
- Lead CI/CD architecture using GitHub Actions, reusable workflows, OIDC-based AWS authentication, automated quality gates, deployment promotion, and environment approvals.
- Design and improve observability across AI platforms, including CloudWatch dashboards, logs, alarms, Prometheus/Grafana, OpenSearch, Langfuse, and LLM-specific operational metrics.
- Build platform capabilities for GenAI workloads, including model availability monitoring.
- Partner with software engineering teams to improve deployment reliability, rollback strategies, health checks, autoscaling, load testing, and runtime performance.
- Define and enforce security and compliance practices for infrastructure, including IAM permission boundaries, Secrets Manager usage, secret scanning, audit logging, tagging standards, and change-management controls.
- Provide technical leadership for cost optimization, capacity planning, environment standardization, and operational resilience across development, test, production, and sandbox environments.
- Mentor engineers, review architecture and infrastructure designs, and influence platform engineering practices across teams.
Requirements
What you’ll need
- Bachelor’s degree in Computer Science, Engineering, Information Technology, or a related technical field, or equivalent practical experience.
- 7+ years of experience in DevOps, platform engineering, cloud infrastructure, site reliability engineering, or software engineering roles.
- Strong hands-on experience with AWS/Azure/GCP infrastructure and services, including container, serverless, networking, storage, observability, and security services.
- Experience designing and operating production systems on Kubernetes, ECS/Fargate, or comparable container orchestration platforms.
- Proficiency with infrastructure-as-code, especially CloudFormation, Terraform, Helm, or similar tooling.
- Strong CI/CD experience with GitHub Actions or similar platforms, including reusable workflows, automated testing, deployment gates, and cloud authentication.
- Experience building and operating observability solutions using CloudWatch, Prometheus/Grafana, OpenSearch, or similar tools.
- Strong understanding of cloud security practices, IAM, secrets management, least-privilege access, audit logging, and compliance requirements.
- Experience supporting distributed systems, microservices, APIs, asynchronous workloads, and multi-environment deployments.
- Demonstrated ability to lead technical design, mentor engineers, and influence engineering practices across teams.
Benefits
Comp & perks
- health care coverage
- retirement savings plans
- insurance benefits
- Employee Assistance Program
- wellness benefits
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
cloud platforms, infrastructure architecture, AI/ML platforms, AWS, Kubernetes, CI/CD, infrastructure-as-code, observability, security practices, cost optimization
Soft Skills
technical leadership, mentoring, influencing engineering practices, collaboration, problem-solving, communication, capacity planning, operational resilience, design review, team partnership