Tech Stack
Tools & technologies: AWS, Cloud, Distributed Systems, Docker, Go, Kubernetes, Postgres, Prometheus, Python, Terraform
About the role
Key responsibilities & impact
- Design, write, and ship production-grade code to fix bugs, improve performance, and increase reliability across multiple services.
- Tackle complex coding challenges in live services — requiring solid understanding of algorithms, data structures, and system architecture.
- Identify and resolve application issues by diving deep across the stack — including backend code, database interactions, and infrastructure components — with the ability to implement code-level fixes where needed.
- Profile APIs and services under load to identify bottlenecks and implement fixes at the code, database, or configuration level.
- Design and evaluate infrastructure solutions — weighing tradeoffs in architecture, tooling choices, and system configuration — and clearly document rationale for decisions made.
- Collaborate effectively with US-based engineering teams, including availability for overlap hours in the late evening IST to support real-time coordination on incidents, design reviews, and cross-functional initiatives.
- Communicate clearly and proactively — write structured updates, flag blockers early, and synthesize technical context for both engineering and non-engineering audiences.
- Build tools, automation, and test frameworks to simulate traffic, validate behavior under stress, and prevent regressions.
- Improve observability by contributing to logs, metrics, dashboards, alerts, and distributed tracing.
- Collaborate on defining and measuring SLIs, SLOs, and SLAs to align reliability goals with business outcomes.
- Contribute to the development and maintenance of incident response runbooks and help improve operational processes to minimize downtime.
- Participate in incident response and root cause analysis, and contribute to the engineering on-call rotation.
- Support cost optimization efforts.
- Research and evaluate new monitoring technologies or best practices to continuously improve system visibility and reliability.
Requirements
What you’ll need
- 7–8+ years of software development experience, with meaningful time in backend or SRE/infrastructure-adjacent roles.
- Strong coding skills in languages such as Python, Go, or similar.
- Proven ability to write and ship production code — and debug it under real-world conditions.
- Deep understanding of PostgreSQL or similar relational databases, including performance tuning and query optimization.
- Hands-on experience with distributed systems, containers (Docker), and Kubernetes in production — including making and defending architectural decisions, not just operating existing setups.
- Experience with observability and monitoring platforms like Datadog, CloudWatch, or Prometheus.
- Strong written and verbal communication skills — able to write clear technical proposals, postmortems, and async updates for distributed teams. Comfortable presenting tradeoffs and decisions to senior stakeholders.
- Comfortable working with US-based counterparts, including regular overlap hours in the late evening IST.
- Demonstrates a structured execution approach — breaks down ambiguous problems, tracks progress against milestones, and surfaces blockers before they become delays.
- Knowledge of SLIs, SLOs, and SLAs, and how they align with business objectives.
- Familiarity with cloud platforms (AWS preferred) and distributed systems.
- Comfortable working across unfamiliar codebases and resolving issues from code to infrastructure.
- Strong analytical, problem-solving, and collaboration skills.
- Prior experience in an IaC (Terraform/Terragrunt) or performance engineering role is a plus.
Benefits
Comp & perks
- Employer-paid group health insurance for you and your dependents
- 401(k) plan with employer match (or equivalent for non-US-based roles)
- Flexible paid time off
- Regular company-wide in-person events
- Home office stipend, and more!
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
Python, Go, PostgreSQL, Docker, Kubernetes, observability, monitoring, infrastructure as code, performance tuning, query optimization
Soft Skills
communication, collaboration, problem-solving, analytical skills, structured execution, technical writing, presentation skills, proactive communication, ability to work with distributed teams, ability to track progress
