
Senior Software Engineer, ML Platform
Parafin
full-time
Location Type: Hybrid
Location: San Francisco • California • United States
Salary
💰 $230,000 - $265,000 per year
About the role
- Turn notebooks into software. Decompose data scientist training/inference notebooks into reusable, tested components (libraries, pipelines, templates) with clear interfaces and documentation.
- Create developer-friendly ML abstractions. Build SDKs, CLIs, and templates that make it simple to define features, train/evaluate models, and deploy to batch or real-time targets with minimal boilerplate.
- Build our real-time ML inference platform. Stand up and scale low-latency model serving.
- Expand batch ML inference. Improve scheduling, parallelism, cost controls, observability, and failure/rollback for large-scale batch scoring and post-processing.
- Own and expand the feature store. Design offline/online feature definitions, support high read/write throughput, and ensure consistent offline/online semantics.
- Platform reliability and observability. Instrument training/inference for latency, throughput, accuracy, drift, data quality, and cost; build alerting and dashboards; drive incident response and postmortems.
- Underwriting infrastructure partnership. Support production batch and real-time underwriting systems alongside Data Science; collaborate on model interfaces, SLAs, safety checks, and product integrations.
Requirements
- 5+ years of software engineering experience, including experience on ML platform/MLOps systems (training, deployment, and/or feature pipelines).
- Strong Python; solid software design and testing fundamentals. Proficiency with SQL; hands-on Spark/PySpark experience.
- Knowledge of ML fundamentals—probability & statistics, supervised vs. unsupervised learning, bias/variance & regularization, feature engineering, model evaluation metrics, validation strategies, and production concerns like drift, stability, and monitoring.
- Expertise with modern data/ML stacks—AWS, Databricks (workflows, lakehouse, MLflow/registry, Model Serving), and Airflow (or equivalent orchestration).
- Experience building real-time systems (service design, caching, rate limiting, backpressure) and batch pipelines at scale.
- Practical knowledge of feature-store concepts (offline/online stores, backfills, point-in-time correctness), model registries, experiment tracking, and evaluation frameworks.
- Strong problem-solving skills and a proactive attitude toward ownership and platform health.
- Excellent communication and collaboration skills, especially in cross-functional settings.
Benefits
- Equity grant
- Medical, dental & vision insurance
- Work from home flexibility
- Unlimited PTO
- Commuter benefits
- Free lunches
- Paid parental leave
- 401(k)
- Employee assistance program
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
Python, SQL, Spark, PySpark, ML fundamentals, feature engineering, model evaluation metrics, drift monitoring, service design, batch pipelines
Soft skills
problem-solving, proactive attitude, communication, collaboration