
ML Ops Architect
Tiger Analytics
full-time
Location Type: Remote
Location: Texas • United States
About the role
- Implement scalable and reliable systems leveraging cloud-based architectures, technologies and platforms to handle model inference at scale.
- Deploy and manage machine learning & data pipelines in production environments.
- Work on containerization and orchestration solutions for model deployment.
- Participate in fast iteration cycles, adapting to evolving project requirements.
- Collaborate as part of a cross-functional Agile team to create and enhance software that enables state-of-the-art big data and ML applications.
- Leverage CI/CD best practices, including test automation and monitoring, to ensure successful deployment of ML models and application code.
- Ensure that all code is well managed to reduce vulnerabilities, that models are well governed from a risk perspective, and that ML work follows best practices in Responsible and Explainable AI.
- Collaborate with data scientists, software engineers, data engineers, and other stakeholders to develop and implement best practices for MLOps, including CI/CD pipelines, version control, model versioning, monitoring, alerting, and automated model deployment.
- Manage and monitor machine learning infrastructure, ensuring high availability and performance.
- Implement robust monitoring and logging solutions for tracking model performance and system health.
- Monitor deployed models in real time, analyze performance data, and proactively identify and address issues before they degrade model performance.
- Troubleshoot and resolve production issues related to ML model deployment, performance, and scalability in a timely and efficient manner.
- Implement security best practices for machine learning systems and ensure compliance with data protection and privacy regulations.
- Collaborate with platform engineers to effectively manage cloud compute resources for ML model deployment, monitoring, and performance optimization.
- Develop and maintain documentation, standard operating procedures, and guidelines related to MLOps processes, tools, and best practices.
Requirements
- Master's or doctoral degree in computer science, electrical engineering, mathematics, or a similar field.
- Typically requires 7+ years of hands-on work experience developing and applying advanced analytics solutions in a corporate environment, including at least 4 years of experience programming in Python.
- At least 3 years of experience designing and building data-intensive solutions using distributed computing.
- At least 3 years of experience productionizing, monitoring, and maintaining models.
- Must-have skills:
- Understanding of the Azure stack, such as Azure Machine Learning, Azure Data Factory, Azure Databricks, Azure Kubernetes Service, and Azure Monitor.
- Demonstrated expertise in building and deploying AI/Machine Learning solutions at scale on a cloud platform such as AWS, Azure, or Google Cloud Platform.
- Experience in developing and maintaining APIs (e.g., REST).
- Experience specifying infrastructure and using infrastructure as code (e.g., Ansible, Terraform).
- Experience in designing, developing & scaling complex data & feature pipelines feeding ML models and evaluating their performance.
- Ability to work across the full stack and move fluidly between programming languages and MLOps technologies (e.g., Python, Spark, Databricks, GitHub, MLflow, Airflow).
- Expertise in Unix Shell scripting and dependency-driven job schedulers.
- Understanding of security and compliance requirements in ML infrastructure.
- Experience with visualization technologies (e.g., R Shiny, Streamlit, Plotly Dash, Tableau, Power BI).
- Familiarity with data privacy standards, methodologies, and best practices.
Benefits
- Significant career development opportunities exist as the company grows.
- The position offers a unique opportunity to be part of a small, fast-growing, challenging and entrepreneurial environment, with a high degree of individual responsibility.
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
Python, Machine Learning, Data Pipelines, Containerization, Orchestration, CI/CD, APIs, Infrastructure as Code, Unix Shell Scripting, Data Visualization
Soft Skills
Collaboration, Adaptability, Problem Solving, Communication, Documentation
Certifications
Master's Degree, Doctoral Degree