Salary
💰 $110,000 - $120,000 per year
Tech Stack
AWS, Cloud, Distributed Systems, Docker, Google Cloud Platform, Kubernetes, Microservices, Python, PyTorch, SQL, TensorFlow
About the role
- Design, implement, and maintain end-to-end pipelines for LLM training, fine-tuning, validation, and deployment
- Build and optimize scalable infrastructure for large language model operations
- Deploy LLMs to production environments with prompt management, observability, serverless deployment, proper monitoring, scaling, and performance optimization
- Design, develop, and maintain RESTful API endpoints for LLM inference and model interactions
- Ensure API reliability, performance optimization, rate limiting, authentication, and comprehensive documentation
- Implement comprehensive monitoring solutions for model performance, drift detection, and system health metrics
- Research and evaluate emerging LLMOps techniques, tools, and methodologies
- Provide informed recommendations on technology choices, architecture decisions, and implementation strategies
- Establish and document best practices for LLM operations, deployment patterns, and governance frameworks
- Develop prototypes and POCs to validate new approaches and technologies
- Work closely with data scientists, ML engineers, DevOps teams, and product managers
- Create comprehensive documentation for systems, processes, and architectural decisions
- Mentor team members and share expertise through technical presentations and training sessions
- Optimize data preprocessing and feature engineering pipelines for LLM training and inference
- Implement data validation, quality checks, and lineage tracking for model training datasets
- Design efficient data storage and retrieval systems for large-scale model artifacts and training data
- Implement model governance frameworks including audit trails, compliance monitoring, and approval workflows
- Ensure secure model deployment practices, access controls, and data privacy measures
- Identify and mitigate risks associated with LLM deployment and operations
- Maintain development, staging, and production environments for LLM workflows
Requirements
- Bachelor’s degree (B.E./B.Tech./M.Tech.) in Computer Science, Statistics, Engineering, or a related field, or equivalent experience (exceptional candidates without formal degrees will be considered)
- 6-12 years of experience building production-quality software, with strong programming skills in Python and SQL (at least 5 years in Python)
- 2+ years of hands-on experience in LLMOps
- 1+ years of experience with machine learning operations, model deployment, and lifecycle management
- Proficiency with at least one major cloud provider (AWS or GCP) and their ML services
- Experience with Docker, Kubernetes, and container orchestration for ML workloads
- Strong experience in designing, building, and maintaining production-grade APIs for ML services
- Proficiency with Git, CI/CD pipelines, and DevOps practices
- Understanding of LLM architectures, training methodologies, and fine-tuning techniques
- Knowledge of ML pipeline design, model monitoring, and deployment strategies
- Understanding of distributed systems, scalability patterns, and microservices architecture
- Good-to-Have: Experience with HuggingFace Transformers, PyTorch, TensorFlow, or similar frameworks
- Good-to-Have: Knowledge of prompt optimization and RAG (Retrieval-Augmented Generation) architectures
- Good-to-Have: Experience with vector search