Tech Stack
Tools & technologies: Azure, Cloud, ETL, Pandas, PySpark, Python, SQL
About the role
Key responsibilities & impact
- Assist TSD with data products by providing highly skilled and authoritative expertise on data engineering methods and best practices, including code-first development approaches and modern pipeline design patterns.
- Design, implement, and maintain an efficient, secure, stable, and flexible data architecture that supports products and end-users, with all assets managed via source control.
- Design, implement, and maintain ELT/ETL pipelines for efficient processing of source data in Azure Synapse and Azure Machine Learning (using SDK V1 and SDK V2).
- Review, maintain, and improve existing architecture and pipelines, including periodic audits to address bottlenecks, deprecated dependencies, and architecture drift.
- Establish quality controls for maintaining all pipelines, and introduce error handling, logging mechanisms, and validation checks.
- Incorporate source control for all pipelines and data analytics codebases to enable iterative code development while ensuring data architecture stability.
- Optimize the ingestion, processing, and storage of a wide variety of datasets and data types, including modern columnar formats such as Parquet.
- Develop self-service capabilities for SBA OIG analysts to query and export data for investigations and audits.
- Coordinate with data scientists to ensure the architecture efficiently supports machine learning algorithms and data pipelines in Azure Machine Learning.
- Develop robust standard operating protocols (SOPs) dictating the authoring, development, validation, publishing, execution, and monitoring of all data pipelines and assets in the Azure environment.
- Provide detailed documentation of the data architecture, including data dictionaries, ER diagrams, and pipeline process maps.
- Maintain and expand the environment with additional datasets and services upon request, following a defined intake and testing process prior to production deployment.
- Stay current with emerging AI tools relevant to data engineering and contribute to exploratory efforts evaluating automation and LLM-assisted capabilities.
Requirements
What you’ll need
- Five (5) years of hands-on experience in maintaining SQL databases and conducting advanced operations in SQL and T-SQL
- Five (5) years of hands-on experience in designing, implementing, and maintaining ELT/ETL processes in cloud-based data analytics environments
- Three (3) years of hands-on experience working with Azure Synapse and Azure Machine Learning as part of a modern data stack
- Certifications preferred (DP-203 or equivalent)
- Manipulating data in Python. Pandas required. PySpark/Polars preferred.
- Experience developing reusable, modular code preferred.
- Implementing pipelines and infrastructure using code-first approaches (Python SDK, CLI, REST APIs, or IaC tooling)
- Implementing source control and CI/CD workflows
- Demonstrated familiarity with AI coding assistants and LLM integration patterns
Benefits
Comp & perks
- Healthcare and Insurance: medical, dental, vision, short- and long-term disability protection, basic life and AD&D insurance
- 401(k) Savings Plan
- Accrued Paid Time Off (PTO)
- Employee Recognition and Rewards
- Employee Referral Bonuses
