
Machine Learning Operations Engineer II
S&P Global
full-time
Location Type: Hybrid
Location: Cambridge • Massachusetts • New York • United States
Salary
💰 $130,000 - $175,000 per year
About the role
- Iterate on Kensho’s ML processes to develop tools, services, and frameworks that make every stage of the ML workflow robust, auditable, and usable.
- Work closely with ML engineers to understand their unique processes, identify pain points, and form effective solutions.
- Empower engineers with the stable tooling necessary to experiment rapidly and turn their research into demonstrable prototypes and mature products.
- Provide resources and training for ML teams on best practices, enabling them to efficiently productionize their work for use in high-value products and services.
- Evaluate, select, and champion open source and third-party solutions, driving their adoption across teams and integrating them into Kensho’s existing platform ecosystem.
- Ship scalable, efficient, and automated processes for model fine-tuning, reinforcement learning, and the evaluation of LLMs and agents.
- Improve LLM and agent observability to help monitor agentic applications in production, detecting performance, decay, and drift issues.
- Stay at the frontier by actively tracking emerging tools and frameworks, promoting best practices, and strengthening the technical expertise of the team with your unique skill set.
Requirements
- 2+ years of experience in ML infrastructure, ML Ops, ML engineering, or a similar field
- Experience managing distributed systems with Kubernetes.
- Understanding of cloud platforms (AWS). We use tools like EKS and managed ML services like Bedrock and SageMaker.
- Python proficiency (we are mostly a Python shop)
- Familiarity with distributed computing frameworks and workflow orchestration (e.g., Ray, Airflow)
- Familiarity with software engineering best practices in an ML context
- Some basic understanding of ML concepts, LLMs and agents
- Ability to debug distributed systems across infrastructure, networking and application layers
- Excellent communication skills to drive adoption of new tools and best practices across multiple teams
- Someone who’s very curious, driven, low-ego and eager to learn across a range of engineering disciplines, while being part of a fantastic team.
Benefits
- Medical, dental, and vision insurance with 100% company-paid premiums
- Unlimited Paid Time Off
- 26 weeks of 100% paid Parental Leave (paternity and maternity)
- 401(k) plan with 6% employer matching
- Generous company matching on donations to non-profit charities
- Up to $20,000 tuition assistance toward degree programs, plus up to $4,000/year for ongoing professional education such as industry conferences
- Plentiful snacks, drinks, and regularly catered lunches
- Dog-friendly office (CAM office)
- Bike sharing program memberships
- Compassion leave and elder care leave
- Mentoring and additional learning opportunities
- Opportunity to expand professional network and participate in conferences and events
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
machine learning, ML Ops, distributed systems, Kubernetes, AWS, Python, Ray, Airflow, model fine-tuning, reinforcement learning
Soft Skills
communication, curiosity, driven, low-ego, eager to learn