Deeplight AI

DataOps Engineer – Lakehouse Operations

full-time

Location Type: Hybrid

Location: Abu Dhabi • 🇦🇪 United Arab Emirates

Job Level

Mid-Level, Senior

Tech Stack

AWS, Kafka

About the role

  • ***Your responsibilities as a DataOps Engineer will include, but not be limited to:***
  • ***Monitoring & Alerting***
    - Monitor and act on incidents related to:
      - AWS Glue job executions
      - Current DMS and dbt pipelines
      - Kafka lag and streaming health
      - Data freshness SLA breaches
      - Data quality issues
      - Platform health alerts
    - Perform L1 triage or distribute incidents to L2/L3 teams as needed.
  • ***Incident Management***
    - Own the incident response process:
      - Initial triage and severity assessment.
      - Coordinate with development teams for resolution.
      - Create and assign JIRA tickets with full context.
      - Track incident resolution and closure.
      - Escalate high-priority or long-running incidents to management.
  • ***Root Cause Analysis & Prevention***
    - Conduct post-incident root cause analysis.
    - Maintain incident logs and post-mortem documentation.
    - Implement preventive measures for recurring issues in collaboration with dev teams.
  • ***Operational Reporting***
    - Manage the TV dashboard providing real-time status of critical flows (with color-coded indicators).
    - Deliver a daily 10–15 min operational review of the previous day's executions, open incidents, and follow-ups.
    - Share daily summary emails with the team.
  • ***Continuous Awareness***
    - Stay up to date on the status of all critical flows and remediation efforts.
    - Ensure proactive communication on risks and delays.

Requirements

  • ***You will have experience in:***
    - DataOps, DevOps, or data engineering roles, with a minimum of 5 years.
    - AWS Glue, DMS, dbt, and Kafka monitoring.
  • ***You should also have knowledge of:***
    - data freshness SLAs, data quality frameworks, and platform health monitoring.
    - incident management tools (e.g., JIRA) and alerting systems.
    - identifying ways to automate your work and repetitive tasks.
    - troubleshooting and triage processes.
    - managing multiple incidents and prioritizing effectively.
    - root cause analysis and preventive action planning.
    - communication and coordination.
    - working under pressure and maintaining operational discipline.

Benefits

  • **Benefits & Growth Opportunities:**
    - Competitive salary and performance bonuses
    - Comprehensive health insurance
    - Professional development and certification support
    - Opportunity to work on cutting-edge AI projects
    - Flexible working arrangements
    - Career advancement opportunities in a rapidly growing AI company
  • This role is based on the client site in Abu Dhabi 4 days per week, with 1 day (Friday) working from home.
  • This position offers a unique opportunity to shape the future of AI implementation while working with a talented team of professionals at the forefront of technological innovation. The successful candidate will play a crucial role in driving our company's success in delivering transformative AI solutions to our clients.
  • At DeepLight AI, we recognise that diversity drives innovation. We are committed to fostering an inclusive environment where individuals with different thinking styles can thrive and contribute their unique strengths to our specialised AI and data solutions.
  • Our goal is to ensure our application and interview process is accessible, predictable, and fair for all candidates.
  • If you require any specific adjustments to the application process, or any reasonable adjustments should you progress to the interview stage, please let us know. This information will be kept strictly confidential and will not impact hiring decisions.
  • By applying to Deeplight, you also agree that we may share your profile, where necessary, with external clients.

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard skills
DataOps, DevOps, data engineering, AWS Glue, DMS, dbt, Kafka, incident management, root cause analysis, operational reporting
Soft skills
communication, coordination, troubleshooting, prioritization, working under pressure, operational discipline