Salary
💰 $65 - $75 per hour
Tech Stack
AWS, Azure, Cloud, Docker, Google Cloud Platform, Kubernetes, Python, PyTorch, TensorFlow
About the role
- Join our client’s AI/ML team to build and deploy state-of-the-art machine learning models that power intelligent experiences for millions of users.
- Work on everything from training custom models to optimizing inference pipelines, while collaborating with world-class researchers and engineers.
- Design and train custom neural networks for human-centered AI using PyTorch/TensorFlow.
- Fine-tune and adapt large language models (Llama, Claude, GPT) for domain-specific tasks.
- Conduct rigorous A/B testing and evaluation of model performance in production.
- Build and maintain scalable ML pipelines processing 10M+ daily inferences.
- Implement real-time model serving with <100ms p95 latency requirements.
- Design and optimize vector similarity search systems for multi-modal embeddings.
- Create robust data ingestion and feature engineering pipelines.
- Establish MLflow/Weights & Biases workflows for experiment tracking and model versioning.
- Build automated training and deployment pipelines using Kubernetes and Docker.
- Implement model monitoring, drift detection, and automated retraining systems.
- Optimize GPU utilization and cost efficiency for training and inference workloads.
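For context on the latency target above, "p95 latency" means the value that 95% of requests complete under. A minimal stdlib-only sketch (the sample latencies and the nearest-rank method here are illustrative, not part of the role's actual stack):

```python
# Sketch: check a batch of request latencies against a p95 < 100 ms target.
# The sample values below are invented for illustration.
import math

def p95(latencies_ms):
    """Return the 95th-percentile latency via the nearest-rank method."""
    ordered = sorted(latencies_ms)
    # Nearest rank: smallest index covering 95% of the sorted samples.
    idx = math.ceil(0.95 * len(ordered)) - 1
    return ordered[idx]

samples = [12, 18, 25, 31, 40, 55, 60, 72, 85, 140]  # ms, illustrative
print(p95(samples))        # the single slow outlier dominates p95
print(p95(samples) < 100)  # does this batch meet the target?
```

A single slow request out of ten pushes p95 past the target, which is why tail-latency work (batching, caching, model compression) matters for this role.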
Requirements
- 7+ years of hands-on ML experience with production model deployment
- Expert-level proficiency in Python and ML frameworks (PyTorch/TensorFlow/JAX)
- Deep understanding of transformer architectures, attention mechanisms, and modern NLP
- Experience with large-scale distributed training (multi-GPU, model/data parallelism)
- Strong background in statistics, linear algebra, and optimization theory
- Experience with MLOps tools: MLflow, Weights & Biases, Kubeflow, or similar platforms
- Proficiency with cloud ML services (AWS SageMaker, GCP Vertex AI, Azure ML)
- Docker and Kubernetes experience for containerized ML workloads
- Knowledge of model serving frameworks (TorchServe, TensorFlow Serving, Triton Inference Server)
- Hands-on experience with LLM fine-tuning, RLHF, and prompt engineering
- Understanding of retrieval-augmented generation (RAG) and vector databases
- Experience with multimodal models (vision-language, audio processing)
- Knowledge of model compression techniques (quantization, distillation, pruning)
- PhD in ML/AI, Computer Science, or equivalent industry experience (preferred)
- Publications in top-tier conferences (NeurIPS, ICML, ICLR, EMNLP) (preferred)
- Experience at AI-first companies or research labs (OpenAI, Anthropic, DeepMind, etc.) (preferred)
- Contributions to open-source ML projects with significant community adoption (preferred)
- Experience with edge deployment and mobile ML optimization (preferred)
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
machine learning, neural networks, PyTorch, TensorFlow, A/B testing, data ingestion, feature engineering, GPU optimization, large language models, transformer architectures
Certifications
PhD in ML/AI, Computer Science