Staff Replication Development Engineer
DDN
Staff Replication Development Engineer designing and developing enterprise-grade replication systems for an AI data platform, focused on reliable and secure disaster recovery for large-scale data systems.
Tech Stack
Tools & technologies: Distributed Systems, Go, Java, Rust, TCP/IP
About the role
Key responsibilities & impact
- Design and develop multi-threaded asynchronous replication systems with parallel streaming capabilities
- Build object-level delta replication with checkpointing and resume functionality
- Develop replication engines supporting bucket/share-level replication controls
- Implement secure data transfer mechanisms using TLS 1.3 with mutual authentication
- Ensure end-to-end data integrity through checksum validation and verification pipelines
- Design and implement manual failover workflows for disaster recovery scenarios
- Build and maintain REST APIs for replication configuration, control, and automation
- Develop metadata tracking and change detection systems to enable efficient replication
- Implement RPO visibility, alerting, and operational insights for replication status
- Contribute to monitoring dashboards focused on replication health and performance
- Ensure systems are designed for high availability, fault tolerance, and scalability
- Partner with QA teams to drive performance, resiliency, and scale validation
- Collaborate with backend, security, and platform teams to deliver end-to-end replication workflows
- Participate in debugging, production issue resolution, and continuous improvement of replication reliability
- Provide technical leadership, architectural guidance, and mentorship to the engineering team
Requirements
What you’ll need
- 8+ years of experience in distributed systems, storage systems, or backend software engineering
- Strong programming skills in one or more languages: C++, Go, Java, or Rust
- Experience designing and building data replication systems, data pipelines, or distributed data services
- Deep understanding of distributed systems concepts (consistency, availability, scalability, fault tolerance)
- Strong expertise in multi-threading, concurrency, and parallel processing
- Knowledge of networking protocols and secure communication (TCP/IP, HTTP/HTTPS, TLS)
- Experience implementing data integrity mechanisms (checksums, validation, consistency checks)
- Experience designing and building REST APIs and service-based architectures
- Familiarity with checkpointing, failure recovery, and retry mechanisms in distributed systems
- Basic understanding of observability concepts (metrics, logging, alerting)
- Strong debugging, problem-solving, and system design skills
Benefits
Comp & perks
- Health insurance
- Retirement plans
- Paid time off
- Flexible work arrangements
- Professional development