Tech Stack
Tools & technologies: Distributed Systems, Go, Java, JavaScript, Python, Rust
About the role
Key responsibilities & impact
- Build and maintain large-scale web crawlers across diverse domains
- Design high-throughput, fault-tolerant systems for data collection (millions to billions of URLs/day)
- Handle anti-bot systems, rate limits, and dynamic/JS-heavy sites
- Develop pipelines for cleaning, deduplication, filtering, and normalization
- Construct and maintain datasets for research and model training
- Monitor crawl performance, coverage, and data quality; iterate quickly
- Collaborate with research teams to align data collection with modeling needs
- Optimize infrastructure for cost, latency, and reliability
Requirements
What you’ll need
- Strong programming experience in one or more of: Go, Rust, Python, Java, or C++
- Experience building web crawlers or large-scale data pipelines
- Solid understanding of HTTP, networking, and browser behavior
- Familiarity with distributed systems and parallel processing
- Experience working with large datasets (TB–PB scale preferred)
- Ability to debug unstable or adversarial environments
Benefits
Comp & perks
- Competitive salary
- Benefits and equity package
