Featherless AI

Machine Learning Engineer – AI Architecture Research

full-time

Location Type: Remote

Location: Anywhere in the World

About the role

  • Research and develop new neural network architectures (e.g. alternatives or extensions to Transformers, recurrent / hybrid models, long-context systems)
  • Design and run architecture-level experiments (scaling laws, memory mechanisms, compute trade-offs)
  • Prototype models end-to-end — from research code to training-ready implementations
  • Collaborate with inference and systems engineers to ensure architectures are deployable and efficient
  • Analyze model behavior, failure modes, and inductive biases
  • Read, reproduce, and extend cutting-edge research papers
  • Contribute to internal research notes, benchmarks, and open-source efforts (where applicable)

Requirements

  • Strong background in machine learning fundamentals and deep learning
  • Hands-on experience implementing model architectures from scratch
  • Solid understanding of:
      • Attention mechanisms, RNNs, state-space models, or hybrid architectures
      • Training dynamics, scaling behavior, and optimization
      • Memory, latency, and compute constraints at the model level
  • Comfortable working in PyTorch or JAX
  • Ability to move fluidly between theory, experimentation, and engineering
  • Clear communicator who can explain architectural trade-offs

Nice to Have

  • Experience with non-Transformer architectures (RNN variants, SSMs, long-context models)
  • Background in research-driven startups or open-source ML projects
  • Experience with large-scale training or custom training loops
  • Publications, preprints, or notable research contributions
  • Familiarity with inference optimization and deployment constraints

Benefits

  • Competitive compensation + meaningful equity