Critical Manufacturing

AI Engineer

full-time

Location Type: Hybrid

Location: Maia, Portugal

About the role

  • Develop MCP Servers
      • Implement and maintain Model Context Protocol (MCP) servers that connect language models to manufacturing domain tools and data sources
      • Optimize server performance and define clear interfaces for tool integration, ensuring models have safe, reliable access to business logic
      • Collaborate with team leads to map complex manufacturing workflows into structured tools and prompts
  • Build Model Observability and Telemetry Infrastructure
      • Design and implement comprehensive telemetry systems to track model behavior, token usage, latency, and cost in production
      • Create dashboards and alerting systems that give real-time visibility into model performance and anomalies
      • Instrument models to capture structured traces: prompts/system context, tool invocations, inputs/outputs, intermediate artifacts, and decision metadata
      • Contribute to standards for logging, tracing, and distributed observability across all AI systems
  • Develop Retraining and Continuous Improvement Pipelines
      • Build data collection pipelines that capture production interactions, model failures, and edge cases for retraining
      • Implement automated systems for evaluating model improvements and managing safe rollouts
      • Contribute to feedback loops that allow the platform to learn from real-world usage without manual intervention
  • Support Team Deliverables
      • Write clean, testable code and contribute to team codebases, documentation, and CI/CD processes
      • Participate in code reviews, technical design reviews, and troubleshooting production issues
      • Experiment with new tools and techniques under team guidance to improve AI system reliability
      • Promote the adoption of agentic coding across teams to accelerate delivery and increase throughput while maintaining quality and security standards
      • Design repositories, CI, and developer tooling that make agent-driven changes safe (linting, typed APIs, contract tests, golden tests, eval gates)
  • Ensure Production Reliability
      • Implement robust error handling, fallback strategies, and graceful degradation for AI systems
      • Monitor and tune AI systems for performance, uptime, and safety in manufacturing environments
      • Gather feedback from operations and product teams to refine tooling and server implementations

Requirements

  • At least 1 year of hands-on machine learning experience, including training and testing models, and a practical understanding of overfitting, generalization, and bias; plus a solid grasp of common model families (e.g., k-nearest neighbors, decision trees/random forests, support vector machines, linear/logistic regression, and basic neural networks)
  • At least 1 year of hands-on experience with LLMs in production or applied settings, including inference, prompt engineering, and evaluation; with a working understanding of how LLMs are configured and behave (e.g., temperature, top-p, max tokens, context windows, and tool/function calling)
  • Experience with agentic coding workflows or LLM-based code assistance, using tools that accelerate implementation, refactoring, and test generation while maintaining strong engineering rigor (reviews, testing, documentation, and CI discipline)
  • Familiarity with server development, APIs, and containerization (Docker/Kubernetes)
  • Strong problem-solving skills and comfort writing production code, including tests and documentation
  • Excellent software engineering fundamentals: version control, testing, code review, documentation
  • Ability to collaborate effectively in a team and work well under technical leadership
  • Excellent spoken and written English