
Machine Learning Engineer, AWS
CCT
full-time
Location Type: Hybrid
Location: Tulsa • Oklahoma • United States
About the role
- Build and maintain reproducible model training workflows on AWS (SageMaker, S3, Glue, etc.), making retraining, rollback, and experimentation routine rather than heroic.
- Deploy and operate real-time and batch inference services with full CI/CD pipelines, versioning, and safe rollout strategies (canary, shadow, A/B) so changes are deliberate and observable.
- Instrument production models for performance, data drift, latency, and errors — and automate retraining triggers when models drift out of tolerance.
- Maintain model lineage, auditability, and traceability to meet the compliance, governance, and reporting needs of the regulated gaming industry.
- Enforce least-privilege IAM, encryption, and secure data access patterns across the entire ML platform.
- Treat cost as a first-class engineering metric — right-size infrastructure, balance batch vs. real-time workloads, and continually reduce platform spend without sacrificing reliability.
- Collaborate with engineers, data scientists, and product teams to translate business problems into ML solutions, communicate tradeoffs clearly, and iterate based on feedback.
- Continuously explore new AWS services, ML frameworks, and deployment patterns to improve reliability, observability, and developer velocity on the ML platform.
Requirements
- 3+ years of experience in machine learning engineering, MLOps, or a closely related discipline.
- Hands-on experience with AWS ML and data services: SageMaker (training, endpoints, pipelines), S3, Lambda, Step Functions, CloudWatch, and MWAA (Apache Airflow).
- Experience working with time series data, including feature engineering, seasonality handling, and temporal train/test splits.
- Strong Python skills and familiarity with common ML frameworks (scikit-learn, PyTorch, XGBoost, or equivalent).
- Experience building and maintaining CI/CD pipelines for ML systems.
- Demonstrated ability to monitor and debug production ML systems — latency, drift, errors, and data quality — and drive issues to root cause.
- Comfort with SQL and working with structured data at scale.
- Able to work collaboratively across teams, assume positive intent, and communicate clearly with both technical and non-technical stakeholders.
- Track record of self-directed learning and technical growth in areas like AWS, ML frameworks, or deployment patterns.
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
machine learning engineering, MLOps, AWS SageMaker, AWS S3, AWS Lambda, AWS Step Functions, AWS CloudWatch, Python, SQL, CI/CD pipelines
Soft Skills
collaboration, communication, self-directed learning, problem-solving, technical growth, iteration based on feedback, monitoring, debugging, assume positive intent, clear communication