Tech Stack
C++, Python, PyTorch
About the role
- Craft and develop robust inference software that can be scaled to multiple platforms for functionality and performance
- Performance analysis, optimization, and tuning for Large Language Models (LLMs)
- Closely follow academic developments in the field of artificial intelligence and bring new features into TensorRT-LLM
- Provide feedback on architecture and hardware design and development
- Collaborate across the company to guide the direction of deep learning inference, working with software, research and product teams
- Publish key results in scientific conferences
- Work on a fast-paced, delivery-focused team and communicate with stakeholders
Requirements
- Master's degree or higher in Computer Engineering, Computer Science, Applied Mathematics, or a related computing-focused field (or equivalent experience)
- 3+ years of relevant software development experience
- Excellent Python programming, software design, and software engineering skills
- Awareness of the latest developments in LLM architectures and LLM inference techniques
- Experience working with deep learning frameworks like PyTorch and HuggingFace
- Proactive and able to work without supervision
- Excellent written and oral communication skills in English
- (Preferred) Prior experience with an LLM inference framework (TensorRT-LLM, SGLang, vLLM, llama.cpp, MLC-LLM, etc.) or a DL compiler
- (Preferred) Experience with performance modeling, profiling, debugging, and code optimization of DL/HPC/high-performance applications
- (Preferred) Excellent C/C++ programming and software design skills, including debugging, performance analysis, and test design
- (Preferred) Architectural knowledge of CPU and GPU
- (Preferred) GPU programming experience (CUDA or OpenCL)
Benefits
- Competitive salaries
- Generous benefits package
- Exposure to the entire DL SW stack and professional development opportunities
- Opportunity to publish key results in scientific conferences
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
Python, C/C++, TensorRT-LLM, PyTorch, HuggingFace, performance modeling, debugging, code optimization, GPU programming, architectural knowledge of CPU and GPU
Soft skills
proactive, communication skills, collaboration, feedback provision, ability to work without supervision
Certifications
Master's degree in Computer Engineering, Master's degree in Computer Science, Master's degree in Applied Mathematics