Remote ML Infrastructure Engineer

Cohere

πŸ“Remote - United Kingdom

Job highlights

Summary

Join us on our mission to scale intelligence and shape the future! We're looking for a team member to build the infrastructure and compute platform, which acts as the foundation for all members of technical staff at Cohere. The ideal candidate has 5+ years of engineering experience running production infrastructure at a large scale, with expertise in designing large, highly available distributed systems with Kubernetes and GPU workloads.

Requirements

  • 5+ years of engineering experience running production infrastructure at a large scale
  • Experience designing large, highly available distributed systems on Kubernetes and running GPU workloads on those clusters
  • Experience working with GCP, Azure, AWS and/or OCI
  • Experience designing, deploying, supporting, and troubleshooting complex Linux-based computing environments
  • Excellent collaboration and troubleshooting skills to build mission-critical systems, and ensure smooth operations and efficient teamwork
  • The grit and adaptability to solve complex technical challenges that evolve day to day

Responsibilities

Build the infrastructure and compute platform

Benefits

  • Full health and dental benefits
  • 100% Parental Leave top-up for 6 months for employees based in Canada, the US, and the UK
  • Weekly lunch stipend, in-office lunches & snacks
  • Remote-flexible work environment with offices in Toronto, New York, San Francisco and London
This job is filled or no longer available