Intel Corporation

Inference Optimization Engineer – Local, Edge Runtime

Optimize inference engines for local and edge environments at Intel, with a focus on model performance and efficient hardware utilization.

Posted 6/16/2026 • Full-time • Santa Clara • Arizona, California, Oregon • 🇺🇸 United States • Mid-Level, Senior • 💰 $170,500 - $315,490 per year

Tech Stack

Tools & technologies
  • C++
  • Linux
  • Python

About the role

Key responsibilities & impact
  • Optimize inference engines (llama.cpp, vLLM) for constrained local and edge environments
  • Profile and optimize local inference for latency, throughput, and memory on edge hardware
  • Tune KV cache, continuous batching, and scheduling for interactive agent workloads
  • Drive quantization strategy and validate quality impact
  • Cut CPU overhead and improve engine startup, model load, and lifecycle
  • Benchmark across hardware tiers and publish performance comparisons
  • Upstream fixes and patches to open-source engines

Requirements

What you’ll need
  • BS/MS in CS, EE, Math or related STEM field
  • 5+ years of software development experience
  • Strong in C++ and/or Python; comfortable reading systems-level code
  • Understanding of how LLM inference works (attention, KV cache, decoding)
  • Experience profiling and optimizing real performance problems (CPU or GPU), with the ability to prove the speedup
  • Linux, build systems, and low-level debugging expertise
  • Hands-on with llama.cpp, vLLM, ggml, or similar engines (preferred)
  • Experience with GPU / accelerator programming (Vulkan, CUDA, SYCL, Metal) or SIMD / CPU kernels (preferred)
  • Familiarity with quantization formats and their quality trade-offs (preferred)
  • Open-source contributions to inference engines (preferred)

Benefits

Comp & perks
  • Competitive pay
  • Stock bonuses
  • Health insurance
  • Retirement plans
  • Vacation

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
  • C++
  • Python
  • LLM inference
  • KV cache
  • GPU programming
  • Vulkan
  • CUDA
  • SYCL
  • Metal
  • Low-level debugging
Certifications
  • BS in Computer Science
  • MS in Computer Science
  • BS in Electrical Engineering
  • MS in Electrical Engineering
  • BS in Mathematics
  • MS in Mathematics