AI Support Engineer, Level II
Global Payments Inc.
Serve as the first line of defense for production AI incidents, ensuring rapid triage, root cause analysis, and resolution.
Tech Stack
Tools & technologies
- AWS
- Azure
- Cloud
- Google Cloud Platform
- Python
- Shell Scripting
About the role
Key responsibilities & impact
- Serve as the first line of defense for production AI incidents, ensuring rapid triage, root cause analysis, and resolution.
- Monitor system health and performance of deployed AI applications, agentic and RAG-based solutions, MCPs, and orchestration platforms.
- Track and investigate issues related to latency, failures, model drift, hallucination, prompt misbehavior, or broken integrations, escalating to the AI engineering group where appropriate.
- Collaborate with AI and platform engineers to implement observability, logging, and alerting best practices for all AI services.
- Build diagnostic tools, runbooks, and automated workflows to improve incident response time and reduce manual intervention.
- Maintain knowledge bases and playbooks for repeatable troubleshooting and knowledge transfer.
- Partner with governance and compliance teams to ensure incidents are documented and remediated in line with internal policy.
- Contribute to postmortems and continuous improvement efforts to harden production systems.
Requirements
What you’ll need
- 4+ years of experience in production support, software engineering, site reliability engineering (SRE), or DevOps, preferably supporting GenAI and/or ML systems.
- Strong understanding of cloud infrastructure (AWS, GCP) and AI observability tools (e.g., Fiddler AI, Arize AI, IBM watsonx.governance).
- Experience with LLM and GenAI systems (OpenAI, Azure OpenAI, Bedrock, Vertex AI, or similar).
- Familiarity with modern orchestration and agentic frameworks such as LangChain, LangGraph, AutoGen, or CrewAI.
- Proficiency in Python or shell scripting for automation and troubleshooting.
- Strong analytical, communication, and incident management skills.
- Bachelor’s degree in Computer Science, Engineering, or a related field.
- 1+ years of experience in AI/ML engineering, with a focus on Generative AI.
- Proficiency in programming languages such as Python.
- Strong understanding of Generative AI models (e.g., GPT, Transformer architectures) and experience in distilling, tuning and training them.
- Familiarity with Retrieval Augmented Generation (RAG) techniques and their implementation.
- Experience with agentic AI concepts and developing autonomous AI workflows.
- Hands-on experience with GCP Vertex AI, AWS Bedrock and SageMaker, and Snowflake Cortex platforms and their AI/ML capabilities.
- Experience building production-grade AI/ML systems at scale.
- Knowledge of MLOps practices, including model deployment and lifecycle management.
- Excellent problem-solving and analytical skills.
- Excellent communication and collaboration skills.
- Availability for on-call rotation and support.
Benefits
Comp & perks
- Health insurance
- 401(k) matching
- Paid time off
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
- Python
- shell scripting
- AI observability tools
- Generative AI models
- MLOps practices
- model deployment
- cloud infrastructure
- LLM systems
- orchestration frameworks
- automated workflows
Soft Skills
- analytical skills
- communication skills
- incident management skills
- problem-solving skills
- collaboration skills