Software Engineer, Cloud Infrastructure – Multiple Seniority Levels
Beacon Venture Capital
Posted 5/15/2026 • full-time • San Carlos • California • 🇺🇸 United States • Senior • 💰 $130,000 - $230,000 per year
Tech Stack
Tools & technologies: Airflow, AWS, Cloud, DynamoDB, Postgres, Python, Terraform
About the role
Key responsibilities & impact
- Cloud Infrastructure Setup and Maintenance: Design, provision, and maintain AWS infrastructure using IaC tools such as AWS CDK or Terraform.
- Build CI/CD and testing for apps, infra, and ML pipelines using GitHub Actions, CodeBuild, and CodePipeline.
- Operate secure networking with VPCs, PrivateLink, and VPC endpoints. Manage IAM, KMS, Secrets Manager, and audit logging.
- LLM Platform and Runtime: Stand up and operate model endpoints using AWS Bedrock and/or SageMaker; evaluate when to use ECS/EKS, Lambda, or Batch for inference jobs.
- Build and maintain application services that call LLMs through clean APIs, with streaming, batching, and backoff strategies.
- Implement prompt and tool execution flows with LangChain or similar, including agent tools and function calling.
- RAG Data Systems and Vector Search: Design chunking and embedding pipelines for documents, time series, and multimedia. Orchestrate with Step Functions or Airflow.
- Operate vector search using OpenSearch Serverless, Aurora PostgreSQL with pgvector, or Pinecone. Tune recall, latency, and cost.
- Build and maintain knowledge bases and data syncs from S3, Aurora, DynamoDB, and external sources.
- Evaluation, Observability, and Cost Governance: Create offline and online eval harnesses for prompts, retrievers, and chains. Track quality, latency, and regression risk.
- Instrument model and app telemetry with CloudWatch and OpenTelemetry. Build token usage and cost dashboards with budgets and alerts.
- Add guardrails, rate limits, fallbacks, and provider routing for resilience.
- Safety, Privacy, and Compliance: Implement PII detection and redaction, access controls, content filters, and human-in-the-loop review where needed.
- Use Bedrock Guardrails or policy services to enforce safety standards. Maintain audit trails for regulated environments.
- Data Pipeline Construction: Build ingestion and processing pipelines for structured, unstructured, and multimedia data. Ensure integrity, lineage, and cataloging with Glue and Lake Formation.
- Optimize bulk data movement and storage in S3, Glacier, and tiered storage. Use Athena for ad-hoc analysis.
Requirements
What you’ll need
- End-to-End Ownership: Drives work from design through production, including on-call and continuous improvement.
- LLM Systems Experience: Shipped or operated LLM-powered applications in production. Familiar with RAG design, prompt versioning, and chain orchestration using LangChain or similar.
- AWS Depth: Strong with core AWS services such as VPC, IAM, KMS, CloudWatch, S3, ECS/EKS, Lambda, Step Functions, Bedrock, and SageMaker.
- Data Engineering Skills: Comfortable building ingestion and transformation pipelines in Python. Familiar with Glue, Athena, and event-driven patterns using EventBridge and SQS.
- Security Mindset: Applies least privilege, secrets management, network isolation, and compliance practices appropriate to sensitive data.
- Evaluation and Metrics: Uses quantitative evals, A/B testing, and live metrics to guide improvements.
- Clear Communication: Explains tradeoffs and aligns partners across product, security, and application engineering.
- Bonus Points: 4+ years working with serverless or container platforms on AWS.
- Experience with vector databases, OpenSearch, or pgvector at scale.
- Hands-on with Bedrock Guardrails, Knowledge Bases, or custom policy engines.
- Familiarity with GPU workloads, Triton Inference Server, or TensorRT-LLM.
- Experience with big data tools for large-scale processing and search.
- Background in aviation data or other safety-critical domains.
- DevOps or DevSecOps experience automating CI/CD for ML and app services.
Benefits
Comp & perks
- Healthcare: 100%* of employee medical premiums covered; 25% for dependents
- Time Off: 3 weeks PTO plus 13+ paid company holidays
- Stipends: Monthly phone and wellness benefits
- 401(k): Offered (no current employer match, but we are committed to enhancing this benefit in the future).
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
AWS, IaC, Terraform, GitHub Actions, CodeBuild, CodePipeline, Python, LangChain, Glue, Athena
Soft Skills
End-to-End Ownership, Clear Communication, Security Mindset, Evaluation and Metrics