Senior Cloud Infrastructure Engineer

Waabi

💵 $158k-$269k
📍 Remote - United States, Canada

Summary

Join Waabi's Infrastructure team as a Cloud Infrastructure Engineer and collaborate with multidisciplinary engineers and research scientists to build the next generation of self-driving technology. You will work on improving ML infrastructure, overseeing cloud strategy, designing and implementing scalable cloud systems, and devising best practices for cloud usage. The role requires experience in cloud computing, software development, and working with public cloud platforms. Waabi offers competitive compensation, equity awards, health and wellness benefits, unlimited vacation, flexible hours, and more. This is an opportunity to contribute to a rapidly growing AI company and make a positive impact on the world. The ideal candidate is passionate about self-driving technologies and solving complex problems.

Requirements

  • BS, MS, or PhD in Computer Science or a similar technical field, or equivalent practical experience
  • 5+ years of relevant industry experience
  • Experience reading and developing production-quality software
  • Deep understanding of cloud compute and data storage for distributed training and inference workloads
  • Familiarity with the Python, Go, Rust, or C++ ecosystems
  • Experience working with public cloud platforms (AWS preferred)
  • Experience with infrastructure as code systems (Terraform preferred)
  • Experience in job scheduling and resource allocation
  • Experience with containers and container orchestration (e.g., Docker, ECS, Kubernetes)
  • Experience and high level of comfort working with Linux systems
  • Experience with building platform services that enable other teams to do their best work
  • Open-minded and collaborative team player with the willingness to help others
  • Passionate about self-driving technologies, solving hard problems, and creating innovative solutions
  • Experience working in an Agile/Scrum environment

Responsibilities

  • Work alongside a team of multidisciplinary Engineers and Research Scientists using an AI-first approach to enable safe self-driving at scale
  • Collaborate with cross-functional teams across the company to understand their growing needs and pain points in cloud usage
  • Propose cloud strategies for compute and data usage in training and simulation workloads
  • Design and implement scalable and resilient cloud infrastructure optimized for long-term reliability and adaptability
  • Devise and promote best practices for cloud usage in training and simulation environments, and oversee cloud strategy and usage across the whole company

Preferred Qualifications

  • Experience with on-premises servers, network equipment, and scale-out storage systems
  • Experience with CI/CD pipelines and release management
  • Experience with common ML tools, workflows, and frameworks (e.g., Kubeflow or MLflow)
  • Understanding of system performance tuning at the software, hardware, and network levels

Benefits

  • Competitive compensation and equity awards
  • Health and Wellness benefits encompassing Medical, Dental and Vision coverage (for full-time employees only)
  • Unlimited Vacation
  • Flexible hours and Work from Home support
  • Daily drinks, snacks and catered meals (when in office)
  • Regularly scheduled team-building activities and social events, held on-site, off-site, and virtually
