
Principal Software Architect – High Performance Computing
Applied Materials
full-time
Location Type: Office
Location: Bangalore • India
About the role
- As an architect, you will have the opportunity to grow in high-performance computing, GPU compute infrastructure, complex system design, and low-level optimization to reduce cost of ownership
- You will analyze and partition workloads to the most appropriate compute unit, ensuring that tasks like AI inference and parallel processing run on specialized accelerators while serial tasks run on CPUs
- You will work closely with cross-functional teams, including algorithm engineers, product managers, and business stakeholders, to understand requirements and translate them into architectural and software designs that meet business needs
- You will code quick prototypes to validate your designs with real code and data
- You will be a subject matter expert who unblocks software engineers in the HPC domain
- You will profile the entire cluster as well as individual nodes to identify bottlenecks, and optimize workflows, code, and processes to improve cost of ownership
- Conduct performance tuning and capacity planning, monitoring GPU metrics (e.g., using NVIDIA DCGM) for reliability
- Evaluate and recommend appropriate technologies and frameworks to meet project requirements
- Lead the design and implementation of complex software components and systems
- Ensure that software systems are scalable, reliable, and maintainable
Requirements
- 12 to 18 years of experience in implementing robust, scalable, and secure infrastructure solutions combining diverse processors (CPUs, GPUs, FPGAs)
- Working experience with a GPU inference server such as NVIDIA Triton
- Very good knowledge of C/C++, data structures, algorithms, and complexity analysis
- Experience in developing distributed high-performance computing software using parallel programming frameworks such as MPI and UCX
- Experience in GPU programming using CUDA, OpenMP, OpenACC, OpenCL, etc.
- In-depth experience in multi-threading, thread synchronization, inter-process communication, and distributed computing fundamentals
- Experience in inter-process communication using shared memory and pipes
- Experience in performance profiling at the application and system level (e.g., Intel VTune, OProfile, perf, NVIDIA Nsight)
- Experience in low-level code optimization techniques such as vectorization and intrinsics, cache-aware programming, and lock-free data structures
- Familiarity with microservices architecture, containerization technologies (Docker/Singularity), and low-latency message queues
- Excellent problem-solving and analytical skills
- Strong communication and collaboration abilities
- Ability to mentor and coach junior team members
- Experience in Agile development methodologies
- Experience in HPC job scheduling and cluster management software (SLURM, Torque, LSF, etc.)
- Good knowledge of low-latency, high-throughput data transfer technologies (RDMA, RoCE, InfiniBand)
- Good knowledge of parallel processing and DAG execution frameworks such as Intel TBB Flow Graph, OpenCL/SYCL, etc.
Benefits
- Supportive work culture that encourages you to learn, develop, and grow your career
- Programs and support that encourage personal and professional growth
- Care for you at work, at home, or wherever you may go
Applicant Tracking System Keywords
Hard Skills & Tools
C/C++, Data structures, Algorithms, GPU programming, CUDA, OpenMP, OpenACC, OpenCL, Parallel programming frameworks, Performance profiling
Soft Skills
Problem-solving, Analytical skills, Communication, Collaboration, Mentoring, Coaching, Leadership