Infosys

Edge AI – Model Optimization Engineer

Edge AI/Model Optimization Engineer at NextGen, optimizing AI solutions for edge computing environments in collaboration with stakeholders. The role focuses on performance tuning and deployment of AI models.

Posted 5/22/2026 • Full-time • Aberdeen • Maryland • 🇺🇸 United States • Mid-Level • Senior

Tech Stack

Tools & technologies
Docker, Kubernetes, Linux, Python

About the role

Key responsibilities & impact
  • Evaluate candidate Large Language Models (LLMs), embedding models, and AI inference solutions for quality, latency, memory utilization, reliability, and operational performance on embedded GPU-enabled edge compute platforms, including the X9 Spider Mission Computer architecture.
  • Tune and optimize AI model runtime configurations for edge deployment, including quantization strategies, batching configurations, context window sizing, cache behavior, inference scheduling, and GPU memory utilization specific to operational edge hardware environments.
  • Collaborate with customer stakeholders to assess mission requirements and evaluate alternative edge compute platforms when operational demands exceed X9 Spider capabilities or when cost, performance, power, size, weight, or thermal tradeoffs require additional analysis.
  • Benchmark agentic AI workflows, inference pipelines, and model-serving architectures against target hardware constraints and operational performance thresholds.
  • Recommend model-selection, runtime, and configuration tradeoffs balancing mission effectiveness, latency, throughput, resource utilization, reliability, and operational sustainability.
  • Build and maintain repeatable performance and stress-testing frameworks for evaluating latency, throughput, tool-call overhead, failover behavior, degraded-resource conditions, and disconnected operational scenarios on edge compute platforms.
  • Package, deploy, validate, and sustain local model-serving components and inference services to support reliable operation within tactical and edge environments.
  • Collaborate with agent engineers, AI developers, and integration teams to validate that agent behavior, workflow reliability, and operational outcomes remain acceptable following model compression, quantization, runtime optimization, or hardware configuration changes.
  • Support deployment, troubleshooting, optimization, and sustainment activities for AI-enabled applications operating in edge, airborne, tactical, or disconnected operational environments.
  • Train customer technical personnel on supported model profiles, operational constraints, runtime tuning considerations, deployment limitations, troubleshooting procedures, and platform sustainment best practices.
  • Maintain technical documentation, benchmarking results, model validation reports, deployment procedures, optimization baselines, configuration guides, and operational support materials.
  • Support DevSecOps and CI/CD activities associated with AI model packaging, deployment automation, runtime validation, and operational release processes.

Requirements

What you’ll need
  • Bachelor’s degree in Computer Science, Electrical Engineering, Computer Engineering, Data Science, Artificial Intelligence, or related technical discipline.
  • 5+ years of experience supporting AI/ML deployment, model optimization, edge computing, GPU acceleration, or AI inference operations.
  • Experience deploying and optimizing LLMs, embedding models, or AI inference pipelines within resource-constrained or edge-compute environments.
  • Experience with GPU-enabled systems and inference optimization technologies such as CUDA, TensorRT, ONNX Runtime, vLLM, Ollama, or equivalent platforms.
  • Experience tuning AI runtime configurations including quantization, batching, caching, and memory optimization techniques.
  • Experience benchmarking AI models and operational workflows against hardware performance constraints.
  • Experience with Linux-based systems, containerized deployments, and orchestration technologies such as Docker and Kubernetes.
  • Familiarity with Python and AI/ML deployment frameworks commonly used for edge inference and operational AI systems.
  • Strong analytical, troubleshooting, and performance optimization skills.
  • Ability to communicate technical findings and operational tradeoffs effectively to technical and non-technical stakeholders.
  • An active security clearance is required.

Benefits

Comp & perks
  • Equal Opportunity Employer/Protected Veterans/Individuals with Disabilities

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
Large Language Models (LLMs), embedding models, AI inference solutions, GPU acceleration, model optimization, quantization, batching, memory optimization, benchmarking, Linux
Soft Skills
analytical skills, troubleshooting, performance optimization, communication, collaboration
Certifications
Bachelor’s degree in Computer Science, Bachelor’s degree in Electrical Engineering, Bachelor’s degree in Computer Engineering, Bachelor’s degree in Data Science, Bachelor’s degree in Artificial Intelligence, Active Security Clearance