GenAI Optimization Engineer

Modular

πŸ“Remote - United States, Canada

Summary

Join Modular, a company revolutionizing AI infrastructure, and become part of a team building MAX, a modular platform that simplifies AI development and deployment. The E2E Optimizations team implements cutting-edge optimizations and research for auto-regressive text generation, image generation, and more. You will design, implement, and tune performance features, lead cross-functional projects, collaborate with subject matter experts, and contribute to the MAX tech stack. This role requires in-depth Python knowledge, 3+ years of experience in ML/DL/Generative AI, and experience with performance optimizations for Generative AI. The position offers competitive compensation, including stock options, and world-class benefits such as premier insurance plans, 401k matching, and flexible paid time off. Remote work is available for candidates based in the US and Canada.

Requirements

  • In-depth knowledge of the Python programming language
  • 3+ years of working experience in Machine Learning, Deep Learning, or Generative AI
  • Experience implementing framework-level performance optimizations for Generative AI use cases
  • Experience profiling and reducing latency in GenAI applications
  • Deep interest in machine learning technologies and use cases
  • Creativity and curiosity for solving complex problems, a team-oriented attitude that enables you to work well with others, and alignment with our culture

Responsibilities

  • Design, scope, implement, and tune performance features for Generative AI use cases in the MAX framework
  • Plan and lead cross-functional projects spanning multiple teams and domains
  • Collaborate with subject matter experts within Modular to enable features across different parts of the stack
  • Contribute to the MAX tech stack across multiple languages, including Mojo, Python, and C++
  • Monitor the latest research channels and identify potential opportunities for the MAX framework

Preferred Qualifications

  • Experience using Machine Learning frameworks like PyTorch, TensorFlow, etc.
  • CUDA/ROCm/Accelerator Programming and Optimization experience
  • Experience with LLVM/MLIR/Compilers
  • Experience working with distributed/parallel programming models and an understanding of parallel hardware

Benefits

  • Premier insurance plans
  • Up to 5% 401k matching
  • Flexible paid time off
