Senior Kernel Optimization Engineer
Cerebras Systems
Kernel Engineer developing high-performance software for cutting-edge AI workloads at Cerebras Systems. Focus on optimizing and scaling deep learning operations for a massively parallel processor architecture.
Tech Stack
Tools & technologies: Assembly, Python, PyTorch, TensorFlow
About the role
Key responsibilities & impact
- Develop design specifications for new machine learning and linear algebra kernels and map them to the Cerebras WSE system using various parallel programming algorithms.
- Develop and debug a kernel library of highly optimized routines, written in low-level assembly and a custom C-like domain-specific language (CSL), that implement algorithms targeting the Cerebras hardware system.
- Use mathematical models and analysis to measure software performance and inform design decisions.
- Develop and integrate unit and system testing methodologies to verify correct functionality and performance of kernel libraries.
- Study emerging trends in machine learning applications and help evolve the kernel library architecture to address the computational challenges of state-of-the-art neural networks.
- Work with chip and system architects to optimize the instruction sets, microarchitecture, and I/O of next-generation systems.
Requirements
What you’ll need
- Bachelor’s, Master’s, or PhD degree, or foreign equivalent, in Computer Science, Computer Engineering, Mathematics, or a related field.
- Understanding of hardware architecture concepts — must be comfortable learning the details of a new hardware architecture.
- Skilled in C++ and Python programming languages.
- Good knowledge of library and/or API development best practices.
- Strong debugging skills and knowledge of debugging complex software stacks.
- Experience in kernel development and/or testing.
- Familiarity with parallel algorithms and distributed memory systems.
- Experience in programming accelerators such as GPUs and FPGAs.
- Familiarity with Machine Learning neural networks and frameworks such as TensorFlow and PyTorch.
- Familiarity with HPC kernels and their optimization.
Benefits
Comp & perks
- Build a breakthrough AI platform beyond the constraints of the GPU.
- Publish and open-source cutting-edge AI research.
- Work on one of the fastest AI supercomputers in the world.
- Enjoy job stability with startup vitality.
- Enjoy a simple, non-corporate work culture that respects individual beliefs.
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
C++, Python, low-level assembly, domain specific language, kernel development, debugging, parallel algorithms, distributed memory systems, HPC kernels optimization, Machine Learning
Soft Skills
problem-solving, analytical thinking, communication, collaboration
Certifications
Bachelor’s degree, Master’s degree, PhD