
Applied ML Engineer
Knowtex
Full-time
Location Type: Hybrid
Location: San Francisco • California • United States
About the role
- Productionize ML models for real-time clinical applications
- Optimize inference pipelines for low latency and high throughput
- Deploy and scale models using AWS-based infrastructure
- Build automated evaluation and regression testing frameworks for LLM outputs
- Implement monitoring systems for model performance and drift detection
- Collaborate with backend teams to integrate ML services into APIs and workflows
- Improve model efficiency through quantization, batching, caching, and optimization techniques
- Support specialty-level model evaluation and performance analysis
- Contribute to CI/CD workflows for ML deployment
Requirements
- 3–7+ years of experience in machine learning engineering or applied ML roles
- Strong proficiency in Python and PyTorch (or TensorFlow)
- Experience deploying ML models in production environments
- Familiarity with transformer architectures and large language models
- Experience with model optimization techniques (quantization, distillation, pruning)
- Experience working with cloud infrastructure (AWS preferred)
- Strong software engineering fundamentals and debugging skills
Benefits
- Meaningful equity compensation
- Unlimited PTO
- Premium health, dental, and vision coverage
- 401(k) plan
Applicant Tracking System Keywords
Hard Skills & Tools
machine learning engineering, Python, PyTorch, TensorFlow, model optimization, quantization, distillation, pruning, cloud infrastructure, AWS
Soft Skills
collaboration, debugging, performance analysis