Senior Platform Infrastructure Engineer

Cavnue
Summary
Join Cavnue, a company innovating the future of roadways through advanced technologies, and become a key member of our Infrastructure team as a Senior Platform Infrastructure Engineer. You will design, build, and maintain highly automated, scalable, and secure cloud and edge infrastructure, leveraging Infrastructure-as-Code practices and Kubernetes orchestration. Responsibilities include architecting and managing cloud and on-premise systems, using Terraform for infrastructure provisioning, deploying and administering Kubernetes clusters, developing automation scripts in Python or Golang, implementing monitoring and alerting solutions, and ensuring robust security practices. This hands-on role requires cross-functional collaboration and a commitment to continuous improvement. Cavnue offers a competitive salary, equity program, annual incentive program, and comprehensive benefits.
Requirements
- Experience: 5+ years of hands-on experience in infrastructure engineering, DevOps, or SRE roles, with a track record of operating production cloud environments at scale
- Infrastructure as Code: Strong experience using Terraform for infrastructure provisioning and configuration management in cloud environments
- Cloud Platforms: Proficiency in multi-cloud operations: Google Cloud Platform (GCP) is highly preferred; experience with Amazon Web Services (AWS) and/or Microsoft Azure is also acceptable
- Kubernetes: Deep understanding of Kubernetes (required), including experience setting up and managing Kubernetes clusters, deploying containerized applications, and debugging cluster and networking issues
- Programming & Automation: Ability to write clean, maintainable code for automation and tooling in Python and/or Golang. Experience building internal tools or services to eliminate manual work and integrate systems is a plus
- Networking Fundamentals: Familiarity with basic networking concepts and protocols (TCP/IP, DNS, load balancing, VLANs/VPCs, firewalls) and how they apply in cloud and hybrid environments
- On-Call Readiness: Willingness to take part in on-call rotations and proven skills in troubleshooting and resolving infrastructure incidents under pressure
- Command-Line Proficiency: Strong hands-on skills with Linux and command-line tools; you are comfortable using terminals and utilities (e.g. k9s for Kubernetes, tmux sessions, zsh or similar shells) to manage and debug systems efficiently
- Security Mindset: Knowledge of zero trust architecture principles and a habit of incorporating security best practices into infrastructure design (formal security certifications are not required)
- Communication & Collaboration: Excellent communication skills with the ability to work cross-functionally. You can collaborate in a fast-paced engineering organization, explain complex infrastructure concepts to team members, and contribute to a positive engineering culture
Responsibilities
- Design and implement cloud and edge infrastructure: Architect, deploy, and manage cloud and on-premise (edge) systems to support Cavnue's platform, ensuring high availability, scalability, and security
- Infrastructure as Code: Use Terraform to provision and manage infrastructure resources consistently across multiple cloud providers (GCP preferred, with AWS/Azure as needed), enabling reproducible and auditable infrastructure changes
- Kubernetes cluster management: Deploy, administer, and optimize Kubernetes clusters for containerized workloads. Handle cluster upgrades, scaling, and monitoring, and troubleshoot complex issues in production Kubernetes environments
- Automation and tooling: Develop robust automation scripts and internal tools/services in Python and/or Golang to automate routine tasks, integrate systems, and improve operational efficiency across the infrastructure
- Monitoring and reliability: Implement monitoring, logging, and alerting solutions to track system performance and reliability. Proactively tune systems and address bottlenecks to maintain smooth operation of critical services
- Security and zero trust: Embed security best practices into the infrastructure, enforcing zero trust architecture principles (e.g. least privilege, identity-based access) to protect systems and data. Work closely with security teams to remediate vulnerabilities and ensure compliance with company policies
- On-call and incident response: Participate in an on-call rotation during the team's initial growth phase, quickly responding to infrastructure incidents and leading efforts to restore service and perform root cause analysis
- Cross-functional collaboration: Work closely with all teams to understand application needs and translate them into scalable infrastructure solutions. Communicate clearly across teams and document designs and processes for broad understanding
- Continuous improvement: Stay up to date with emerging technologies and industry best practices in cloud infrastructure, DevOps, and platform engineering. Lead or contribute to infrastructure projects that enhance deployment speed, cost efficiency, and overall platform reliability
Benefits
- Medical, dental, and vision benefits
- Life insurance and disability insurance
- 401(k) with 4% company contribution - no waiting period
- Parental and adoption leave
- Fertility and infertility benefits
- Wellness perks including access to on-demand primary care, virtual health appointments, and online mental health therapy
- Generous PTO bank, including paid year-end holiday shutdown
- Company-sponsored lunches twice weekly (in office)
- Learning and development opportunities
- Top-of-the-line equipment