Salary
💰 $126,000 - $196,000 per year
Tech Stack
Airflow, AWS, Azure, Cloud, Distributed Systems, Python, Ruby, Ruby on Rails, SaltStack, Scala, Spark, Terraform
About the role
- Design and build scalable systems to extract, enrich, and process metadata from millions of documents, images, and audio files
- Leverage LLMs to integrate capabilities like summarization, classification, extraction, and enrichment into metadata pipelines
- Collaborate with cross-functional teams, including ML engineers and product managers, to deliver scalable, efficient, and reliable metadata solutions
- Optimize and refactor existing systems for performance, scalability, and reliability
- Ensure data accuracy, integrity, and quality through automated validation and monitoring
- Participate in code reviews to uphold best practices and high quality standards in the codebase
- Manage and maintain data pipelines, security, and infrastructure
- Work on cutting-edge generative AI and metadata enrichment problems at global scale
- Operate and scale systems that process hundreds of millions of documents and billions of images
Requirements
- 4+ years of professional software engineering experience
- Proficiency in Python, Scala, Ruby, or similar languages
- Proficiency with Python, Scala, Ruby on Rails, Airflow, Databricks, Spark, HTTP APIs, AWS (Lambda, ECS, SQS, ElastiCache, SageMaker, CloudWatch), Datadog, and Terraform
- Experience designing and building distributed systems at scale
- Hands-on experience building, deploying, and optimizing solutions using ECS, EKS, or AWS Lambda
- Experience with infrastructure-as-code tools like Terraform (or similar)
- Experience working with a public cloud provider (AWS, Azure, or Google Cloud)
- Familiarity with data processing frameworks like Spark or Databricks for large-scale workloads
- Proven ability to test, profile, and optimize systems for performance, scalability, and reliability
- Bachelor’s degree in Computer Science or equivalent professional experience
- Bonus: Experience working with LLMs or integrating ML models into production systems
- Must have primary residence in or near one of the following cities: Atlanta, Austin, Boston, Dallas, Denver, Chicago, Houston, Jacksonville, Los Angeles, Miami, New York City, Phoenix, Portland, Sacramento, Salt Lake City, San Diego, San Francisco, Seattle, Washington, D.C., Ottawa, Toronto, Vancouver, Mexico City