BorderlessMind

Senior DevOps Engineer

full-time

Location Type: Remote

Location: United States

About the role

  • Operate and improve platform tools so product teams can ship reliably: triage tickets, fix build issues, and handle routine service requests (access, secrets, environment setup).
  • Maintain and extend self-service workflows (templates, golden paths) by updating docs, examples, and guardrails under guidance from senior engineers.
  • Perform day-to-day Kubernetes operations: deploy/update Helm charts, manage namespaces, diagnose rollout issues, and follow runbooks for incident response.
  • Support CI/CD pipelines (e.g., GitLab CI): keep pipelines green, add/adjust jobs, implement basic quality gates, and help teams adopt safer deploy strategies (blue/green, canary).
  • Monitor and operate the observability stack using Prometheus, Alertmanager, and Thanos; maintain alert rules, dashboards, and SLO/SLA indicators; help reduce alert noise and improve signal quality.
  • Assist with service instrumentation across the core observability pillars (tracing, logging, and metrics), with hands-on OpenTelemetry usage (collectors/SDKs) and related telemetry tooling.
  • Contribute to and improve documentation: runbooks, FAQs, onboarding guides, and standard operating procedures.
  • Participate in an on-call rotation as needed, with a well-defined escalation path; assist during incidents, ship small fixes, and capture learnings in docs.
  • Help with cost- and performance-minded housekeeping: right-size workloads, prune unused resources, and automate routine tasks where appropriate.

Requirements

  • 8+ years in a platform/SRE/DevOps or infrastructure role, with a strong bias toward automation and support.
  • Experience operating Kubernetes (or similar) and core ecosystem tools (Helm, Docker, Ingress NGINX, Argo Rollouts basics).
  • Hands-on CI/CD experience (preferably GitLab CI): writing/modifying jobs, artifacts, environments, and basic deployment strategies.
  • Scripting ability in Bash or Python (Go a plus) to automate repetitive tasks and improve runbooks.
  • Familiarity with AWS fundamentals (e.g., IAM, EC2/EKS, S3, CloudWatch/CloudTrail, Parameter Store/Secrets Manager).
  • Practical understanding of monitoring/observability (dashboards, logs, alerts) and how to use them for triage and remediation, including Prometheus/Alertmanager/Thanos and OpenTelemetry basics.
  • Comfortable working from tickets (Jira/ServiceNow), following change-management practices, and communicating clearly with stakeholders.
  • Highly preferred: Terraform experience, API integration experience (Java, Python, or Go), deeper Linux fundamentals, and exposure to insurance/financial services environments.

Benefits

  • We help make an impact by solving real problems through innovation, improved customer experiences, and the right technologies.
  • Advanced training opportunities.
Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools

Kubernetes, Helm, Docker, CI/CD, Bash, Python, AWS, Prometheus, OpenTelemetry, Terraform

Soft Skills

communication, automation, problem-solving, documentation, incident response, collaboration, triage, stakeholder engagement, performance optimization, change management