Sumble

Data Scientist / Machine Learning Engineer

full-time

Location: 🇺🇸 United States

Job Level

Mid-Level, Senior

Tech Stack

Cloud, Google Cloud Platform, Postgres, Python, PyTorch, React, TypeScript

About the role

  • About Us: Sumble's current focus is on acquiring, cleaning, and joining company-related data that integrates seamlessly with customers' data, enhancing go-to-market operations. Our long-term vision is to become the primary destination for accessing high-quality external data.
  • Our Team: We are a dedicated team of 9 engineers with experience at companies such as Google, Meta, Kaggle and Stack Overflow.
  • What you'll do: Fine-tuning small language models; Improving the quality of existing data using scalable approaches. Examples include making sure URLs are associated with the right company, HQ addresses are correct, and parent-subsidiary relationships are mapped, using techniques like LLM validation, SERP, and triangulating across sources; Adding new signals: this usually involves scrubbing and normalizing new signals and matching them to our existing ontology; Pushing solutions into production environments, which may involve touching data pipelines and/or backend systems.
  • More about Sumble:
      • Our Tech Stack: PyTorch, Hugging Face, Gemma models, LoRA, vLLM, SkyPilot, Marimo; Languages & Frameworks: Python, FastAPI, React, TypeScript; Cloud Platform: Google Cloud Platform (GCP); Databases: PostgreSQL, DuckDB; Infrastructure: Cloud Run
      • Challenges We Tackle: Transforming noisy datasets into high-quality data products; Running expensive analytics computations efficiently; Managing the complexity of a growing number of data sources, machine learning models, and large data operations
      • Join Us: If you're passionate about solving complex data challenges and excited by the opportunity to work with us on cutting-edge technologies, we'd love to hear from you.

Requirements

  • Located within US timezones
  • Committed to creating great products and experiences for our users

Invisible Technologies

Data Scientist

Junior · Mid · contract · $30–$30 · 🇺🇸 United States
Posted: 15 hours ago · Source: boards.greenhouse.io
AWS, Cloud, Google Cloud Platform, Numpy, Pandas, Python, PyTorch, Scikit-Learn, Spark

EverCommerce

Head of Data Infrastructure & Engineering

Lead · full-time · $190k–$225k / year · Colorado, Texas · 🇺🇸 United States
Posted: 15 hours ago · Source: evercommerce.wd1.myworkdayjobs.com
Airflow, Amazon Redshift, Apache, AWS, Azure, Cassandra, Cloud, Distributed Systems, Docker, Google Cloud Platform, Hadoop, Java, +8 more

Liaison

Data Scientist

Mid · Senior · full-time · $100k–$123k / year · 🇺🇸 United States
Posted: 20 hours ago · Source: boards.greenhouse.io
Python

Abercrombie & Fitch Co.

Data Scientist – Digital

Mid · Senior · full-time · $75k–$85k / year · Ohio · 🇺🇸 United States
Posted: 1 day ago · Source: jobs.smartrecruiters.com
Azure, Cloud, Python, SQL

biBerk Business Insurance

Data Scientist – Portfolio Strategy

Mid · Senior · full-time · $110k–$150k / year · 🇺🇸 United States
Posted: 1 day ago · Source: nationalindemnity.wd5.myworkdayjobs.com
Python, SQL