Salary
💰 $224,000 - $356,500 per year
Tech Stack
Cloud, Open Source, Python, PyTorch
About the role
- Develop algorithms for AI/DL, data analytics, machine learning, or scientific computing
- Contribute to and advance the open source Megatron Core and NeMo Framework
- Solve large-scale, end-to-end AI training and inference challenges spanning the full model lifecycle, from initial orchestration and data pre-processing, through model training and tuning, to model deployment.
- Work at the intersection of computer architecture, libraries, frameworks, AI applications and the entire software stack.
- Innovate and improve model architectures, distributed training algorithms, and model parallel paradigms.
- Tune and optimize performance, and train and fine-tune models with mixed precision recipes on next-gen NVIDIA GPU architectures.
- Research, prototype, and develop robust and scalable AI tools and pipelines.
Requirements
- MS, PhD or equivalent experience in Computer Science, AI, Applied Math, or related fields and 10+ years of industry experience
- Experience with AI Frameworks (e.g. PyTorch, JAX), and/or inference and deployment environments (e.g. TRTLLM, vLLM, SGLang)
- Proficiency in Python programming, software design, debugging, performance analysis, test design and documentation
- Consistent record of working effectively across multiple engineering initiatives and improving AI libraries with new innovations
- Strong understanding of AI/Deep-Learning fundamentals and their practical applications