
ML Engineer, LLM, Google Cloud
Alpha Talent Solutions - Joelle Borg
full-time
Location Type: Remote
Location: Remote • 🇺🇸 United States
Job Level
Mid-Level, Senior
Tech Stack
Airflow, Cloud, Docker, gRPC, Kafka, Kubernetes, Python, PyTorch, RabbitMQ, TensorFlow
About the role
- Analyse business requirements for the desired output format and the logic the model must implement.
- Prepare datasets based on example texts: cleaning, annotation, creating training/validation splits.
- Train and fine-tune LLMs for specific use cases: configure training parameters; experiment with prompts, system instructions, input/output formats.
- Evaluate model quality: design and track metrics; create test scenarios and A/B experiments; ensure output format consistency and stability.
- Deploy models to Google Cloud (for example via Vertex AI, Cloud Run, Kubernetes, etc.).
- Develop services and APIs (REST/gRPC) that expose the model to other systems.
- Build automations and integrations that call the model: background jobs, queues, event-driven triggers; integration with internal services and databases.
- Implement MLOps pipelines: automate training / retraining workflows; version models and datasets; monitor model performance and quality in production.
- Document models, pipelines, APIs, and architectural decisions.
Requirements
- 3+ years of software development experience (preferably Python)
- Hands-on experience with ML / NLP: understanding of models, loss functions, training and validation workflows
- Practical experience with at least one ML framework: TensorFlow, PyTorch, Hugging Face, etc.
- Experience with Google Cloud: core services (Cloud Storage, IAM, VPC); ideally also Vertex AI, Cloud Run, Pub/Sub, or similar
- Experience deploying models into production (API services, containerization with Docker, CI/CD)
- Experience building and integrating REST APIs; confident working with JSON/JSONL, logging, and monitoring
- Understanding of how to design reliable and scalable systems (error handling, retries, queues, timeouts)
- Direct experience with LLMs: prompt engineering, few-shot learning, RAG
- Experience with MLOps tools (MLflow, Vertex AI Pipelines or equivalents)
- Experience with messaging/queue systems (Pub/Sub, Kafka, RabbitMQ) and workflow orchestration (Workflows, Airflow, etc.)
- Understanding of data security and handling sensitive information, including access control (IAM)
Benefits
- Highly competitive salary with quarterly bonuses
- Opportunity to work on a fully remote basis under a B2B service agreement
Applicant Tracking System Keywords
Hard skills
Python, ML, NLP, TensorFlow, PyTorch, Hugging Face, MLOps, REST APIs, Docker, CI/CD
Soft skills
analytical skills, problem-solving, communication, documentation, collaboration, attention to detail, organizational skills, adaptability, creativity, critical thinking