Remote Distributed ML Systems Engineer

Together AI

💵 $160k-$230k
📍 Remote - United States

Job highlights

Summary

Join us in shaping the future at Together AI! We are seeking a Distributed ML Systems Engineer to design and build scalable machine learning systems that power our accelerated AI initiatives. This role involves developing large-scale, fault-tolerant distributed systems that handle high-load and high-performance requirements.

Requirements

  • 3+ years of experience in building large-scale, fault-tolerant, high-performance distributed systems
  • Strong programming skills in one or more of Python, Go, Rust, or C/C++
  • Excellent understanding of low-level operating system concepts, including multi-threading, memory management, networking, and storage, as well as performance and scale
  • Experience with cloud computing platforms (AWS, GCP, Azure, etc.) and large-scale infrastructure
  • Strong problem-solving skills and ability to work in a fast-paced environment

Responsibilities

  • Design and build large-scale, distributed machine learning systems that are fault-tolerant and high-performance
  • Develop and optimize distributed processing frameworks and storage systems
  • Collaborate with researchers, engineers, and product managers to integrate ML systems into our infrastructure
  • Conduct architecture and design reviews to ensure best practices in system design
  • Implement robust monitoring and logging systems to ensure the health and performance of our ML systems

Preferred Qualifications

  • Experience with Kubernetes
  • Experience with PyTorch

Benefits

  • Health insurance
  • Competitive compensation
