
Lead Platform Engineer – Search Platform
TetraScience
full-time
Location Type: Remote
Location: United States
About the role
- Lead by example to architect and code the next-generation scientific search engine, building a system that can reason over billions of scientific data points—from chemical structures (SMILES) to unstructured lab documents and instrument data.
- Engineer sophisticated hybrid search pipelines that blend sparse (keyword), structured (metadata), and dense (vector) retrieval. You will go beyond out-of-the-box OpenSearch to design custom ranking logic, reciprocal rank fusion, and relevance tuning that surfaces the exact "needle in the haystack" for drug discovery.
- Own and operate the Search Platform infrastructure, ensuring high availability, scalability, performance, and observability across indexing, embedding generation, and query execution.
- Develop and maintain backend services and APIs in Python and TypeScript that power search capabilities for scientists, data engineers, and AI applications.
- Collaborate with Applied AI Scientists to integrate embeddings, transformer models, and chemical fingerprints into production search workflows.
- Architect and implement scientific entity resolution and knowledge graph pipelines to transform raw text into interconnected knowledge. You will design systems that extract and link chemical and biological entities (NER/NED) from unstructured documents, enabling the search engine to "understand" relationships between compounds, targets, and assays.
- Continuously improve search quality through evaluation metrics such as precision@K, recall@K, MRR, and relevance testing with real scientific use cases.
- Ensure security, compliance, and tenant isolation as part of operating search services in enterprise bio-pharma environments.
- Contribute to architectural decisions, technical strategy, and platform-wide improvements to accelerate scientific insight generation.
Requirements
- 10+ years of backend or platform engineering experience building distributed, production-grade systems.
- Hands-on experience with search technologies such as Elasticsearch/OpenSearch, Lucene, or vector databases.
- Strong understanding of semantic search concepts: embeddings, transformers, similarity scoring, ranking logic, relevance tuning, and hybrid retrieval.
- Expert-level coding skills in TypeScript and Python, building robust APIs and backend services.
- Experience building and operating microservices or search infrastructure on cloud platforms (AWS preferred), including containerization, CI/CD, observability, and performance tuning.
- Familiarity with scientific or unstructured data processing, such as documents, tables, analytical results, or experimental datasets.
- Strong problem-solving skills, with the ability to navigate ambiguous scientific workflows and translate them into engineered systems.
- Excellent communication and collaboration skills; comfortable working alongside scientists, AI researchers, and product teams.
- Exposure to NLP, LLMs, embedding generation, or retrieval-augmented workflows.
- Experience with large-scale data platforms such as Databricks, Lakehouse architectures, or distributed indexing systems.
Nice to Have
- Experience with cheminformatics tools and libraries (e.g., RDKit), including molecular fingerprints, similarity metrics, or substructure search.
- Prior experience implementing chemical search systems, such as SMILES parsing, normalization, or chemical indexing.
- Knowledge of vector databases or embedding stores (e.g., OpenSearch) to support semantic search and RAG.
Benefits
- 100% employer-paid benefits for all eligible employees and immediate family members
- Unlimited paid time off (PTO)
- 401K
- Flexible working arrangements, including remote work
- Company-paid life insurance, LTD/STD
- A culture of continuous improvement where you can grow your career and get coaching
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
Python, TypeScript, Elasticsearch, OpenSearch, Lucene, embeddings, transformers, semantic search, microservices, CI/CD
Soft skills
problem solving, communication, collaboration