
Principal Engineer
Nexus Cognitive
full-time
Location Type: Hybrid
Location: Atlanta • United States
About the role
- Own the architecture of NX1's core platform: defining how Spark, Trino, Iceberg, Ranger, Gravitino, and Keycloak are integrated, versioned, and deployed together
- Lead the design of NX1's multi-tenant security model — Ranger policy enforcement, Keycloak-based JWT authentication, Gravitino credential vending, and YuniKorn-based namespace isolation on Kubernetes
- Drive the Iceberg catalog strategy: Gravitino as Iceberg REST catalog, namespace/table lifecycle management, S3 credential vending, and Spark/Trino catalog configuration
- Author technical design documents (TDDs) for major features and integration efforts; lead cross-team technical reviews and drive adoption of architectural decisions
- Own the engineering team's integration posture with open-source upstream projects — evaluating major version upgrades, managing patches, and leading contribution efforts where NX1 has custom extensions
- Lead technical investigation of the most complex escalations from Forward Engineering — issues involving Spark/Trino query failures, Ranger policy conflicts, Keycloak token issues, or Gravitino catalog inconsistencies
- Define NX1's Kubernetes operator strategy: how platform components are deployed, upgraded, and managed via Helm and Terraform across customer environments
- Mentor Senior Software Engineers and set engineering standards for code quality, testing, observability, and release practices
- Partner with the Head of Engineering on technical roadmap planning, build vs. integrate decisions, and OSS component selection
- Interface directly with enterprise customers on the most complex architectural questions raised during Solutions Architect-led engagements
Requirements
- 10+ years of software engineering experience, including at least 3 years at the staff or principal engineer level working on distributed data systems
- Expert-level knowledge of Apache Spark — including Spark on Kubernetes, the Spark SQL engine, custom extensions, and performance tuning at scale
- Deep experience with Trino (formerly PrestoSQL): connector architecture, query planning, cost-based optimization, and integration with Hive Metastore or Iceberg REST catalog
- Hands-on production experience with Apache Iceberg: table format internals, partitioning strategies, compaction, snapshot expiry, and multi-engine compatibility (Spark + Trino)
- Strong experience with enterprise authentication and authorization: Keycloak (OIDC/JWT), Apache Ranger (policy engine, service definitions, plugins), and LDAP/Active Directory integration
- Experience with Apache Gravitino or comparable metadata/catalog platforms: catalog API design, Iceberg REST catalog implementation, and credential vending patterns
- Kubernetes and OpenShift expertise: operators, CRDs, RBAC, network policies, and multi-tenant namespace management; experience with YuniKorn or equivalent schedulers is a strong plus
- Proficiency in Java and/or Scala for Spark/Trino work; Python for tooling, automation, and data engineering integrations
- Experience with Helm chart authoring, Terraform module development, and GitOps-based platform lifecycle management
- Track record of leading engineering teams through complex open-source integrations, major version migrations, and enterprise customer deployments
Benefits
- A collaborative team culture built on curiosity and respect
- Challenging work where your contributions clearly matter
- A leadership team that invests in learning and development
- The opportunity to work at the intersection of cloud, data, and AI innovation
Applicant Tracking System Keywords
Hard Skills & Tools
Apache Spark, Trino, Apache Iceberg, Keycloak, Apache Ranger, Apache Gravitino, Kubernetes, OpenShift, Java, Scala
Soft Skills
leadership, mentoring, technical investigation, cross-team collaboration, architectural decision-making, technical roadmap planning, customer engagement, code quality standards, observability, release practices