
Principal Machine Learning Engineer
BJAK
full-time
Location Type: Remote
Location: United States
About the role
- Build and own end-to-end ML pipelines spanning data, training, evaluation, inference, and deployment.
- Fine-tune and adapt models using state-of-the-art methods such as LoRA, QLoRA, SFT, DPO, and distillation.
- Architect and operate scalable inference systems, balancing latency, cost, and reliability.
- Design and maintain data systems for high-quality synthetic and real-world training data.
- Implement evaluation pipelines covering performance, robustness, safety, and bias, in partnership with research leadership.
- Own production deployment, including GPU optimization, memory efficiency, latency reduction, and scaling policies.
- Collaborate closely with application engineering to integrate ML systems cleanly into backend, mobile, and desktop products.
- Make pragmatic trade-offs and ship improvements quickly, learning from real usage.
- Work under real production constraints: latency, cost, reliability, and safety.
Requirements
- Strong background in deep learning and transformer-based architectures.
- Hands-on experience training, fine-tuning, or deploying large-scale ML models in production.
- Proficiency with at least one modern ML framework (e.g. PyTorch, JAX), and ability to learn others quickly.
- Experience with distributed training and inference frameworks (e.g. DeepSpeed, FSDP, Megatron, ZeRO, Ray).
- Strong software engineering fundamentals – you write robust, maintainable, production-grade systems.
- Experience with GPU optimization, including memory efficiency, quantization, and mixed precision.
- Comfort owning ambiguous, zero-to-one ML systems end-to-end.
- A bias toward shipping, learning fast, and improving systems through iteration.
- Experience with LLM inference frameworks such as vLLM, TensorRT-LLM, or FasterTransformer.
- Contributions to open-source ML or systems libraries.
- Background in scientific computing, compilers, or GPU kernels.
- Experience with RLHF pipelines (PPO, DPO, ORPO).
- Experience training or deploying multimodal or diffusion models.
- Experience with large-scale data processing (Apache Arrow, Spark, Ray).
Benefits
- Our organization is very flat and our team is small, highly motivated, and focused on engineering and product excellence. All members are expected to be hands-on and to contribute directly to the company’s mission.
Applicant Tracking System Keywords
Hard Skills & Tools
machine learning, deep learning, transformer-based architectures, model fine-tuning, large-scale ML models, GPU optimization, distributed training, evaluation pipelines, multimodal models, data processing
Soft Skills
collaboration, problem-solving, adaptability, iteration, ownership, pragmatic decision-making, learning fast, communication, trade-off analysis, working under constraints