
AI Developer, Python – Customer Care AI Platform
IONOS
full-time
Location Type: Hybrid
Location: Bucharest • Romania
About the role
- Design Agentic Workflows: Design and implement LLM-based systems that go beyond response generation, enabling structured tool usage, workflow orchestration, and secure interaction with internal services via MCP (Model Context Protocol).
- Build and Optimize RAG & CAG: Develop high-performance Retrieval-Augmented Generation and Context-Augmented Generation pipelines to ensure accurate, relevant, and low-latency responses. Continuously improve context management, ranking strategies, and grounding mechanisms to support complex, multi-step interactions.
- Voice Channel Mastery: Develop and optimize real-time Speech-to-Speech (S2S) pipelines, focusing on streaming architectures, latency reduction (including Time to First Word - TTFW) and maintaining a natural conversational flow.
- Evaluation, Quality & Alignment: Build and maintain an automated QA module, including LLM-as-a-judge patterns, to measure accuracy, safety, latency, and resolution quality at scale. Translate evaluation insights into systematic model and prompt improvements.
- Model Strategy & Hybrid Integration: Integrate and operate both commercial foundation models (e.g., OpenAI, Anthropic, Google) and open-source alternatives (e.g., Qwen, Kimi, DeepSeek, Moonshot, GLM), selecting and optimizing models based on performance, latency, cost, and use-case requirements.
Requirements
- Strong Python and/or Java Engineering Skills: Advanced-level Python development experience, including asynchronous programming (e.g., FastAPI, asyncio) and building high-performance, production-grade services. Experience with streaming architectures is a strong advantage.
- LLM Application & Multi-Agent Orchestration Experience: Hands-on experience building LLM-powered systems, including multi-step workflows, stateful agents, and tool invocation. Familiarity with orchestration frameworks such as LangChain, LlamaIndex, or LangGraph, particularly in building stateful, multi-turn agents.
- Advanced Retrieval & Context Management: Deep understanding of vector databases (e.g., Weaviate, Qdrant, pgvector, Elasticsearch), semantic search, embedding strategies, and re-ranking techniques. Experience designing and optimizing RAG pipelines.
- Real-Time & Low-Latency Systems: Experience in designing systems that operate under latency constraints, including streaming APIs, event-driven architectures, and performance optimization. Understanding of trade-offs between quality, cost, and response time.
- Evaluation-Driven Development: Experience in implementing evaluation frameworks for LLM-based systems, including automated QA pipelines and LLM-as-a-judge patterns.
- Familiarity with API Design: Knowledge of RESTful API design and OAuth2.
Benefits
- Access to local and international training, development, and growth opportunities, including e-learning platforms covering both technical and soft skills;
- Modern technologies, product responsibility;
- Flexible work schedule;
- Hybrid work option;
- Medical services package from one of two private providers;
- 25 vacation days per year;
- Substitute days off for public holidays that occur on the weekend;
- Meal tickets;
- Internal referral program;
- Team and networking events organized to promote a passionate, creative, and diverse culture;
- Summerfest and Winterfest parties;
- Of course, coffee, soft drinks, and fresh fruit are on us in the office.
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
Python, Java, asynchronous programming, FastAPI, asyncio, LLM application, multi-agent orchestration, RAG pipelines, vector databases, RESTful API design
Soft Skills
evaluation-driven development, performance optimization, systematic model translation