Applied ML Engineer

Knowtex

Full-time

Posted on: 3/4/2026

Location Type: Hybrid

Location: San Francisco • California • United States

About the role

Productionize ML models for real-time clinical applications
Optimize inference pipelines for low latency and high throughput
Deploy and scale models using AWS-based infrastructure
Build automated evaluation and regression testing frameworks for LLM outputs
Implement monitoring systems for model performance and drift detection
Collaborate with Backend teams to integrate ML services into APIs and workflows
Improve model efficiency through quantization, batching, caching, and optimization techniques
Support specialty-level model evaluation and performance analysis
Contribute to CI/CD workflows for ML deployment

Requirements

3–7+ years of experience in machine learning engineering or applied ML roles
Strong proficiency in Python and PyTorch (or TensorFlow)
Experience deploying ML models in production environments
Familiarity with transformer architectures and large language models
Experience with model optimization techniques (quantization, distillation, pruning)
Experience working with cloud infrastructure (AWS preferred)
Strong software engineering fundamentals and debugging skills

Benefits

Applicant Tracking System Keywords

Hard Skills & Tools

machine learning engineering, Python, PyTorch, TensorFlow, model optimization, quantization, distillation, pruning, cloud infrastructure, AWS

Soft Skills

collaboration, debugging, performance analysis