Kernel Engineer

Modular

💵 $113k-$242k
📍 Remote - United States, Canada

Summary

Join Modular, a company revolutionizing AI infrastructure, and become an AI Kernel Engineer. You will design and optimize high-performance ML kernels, utilize low-level programming (C/C++/Assembly), collaborate with cross-functional teams, and contribute to the development of future accelerators. This role requires in-depth C++ and low-level architectural performance knowledge, 4+ years of experience with complex code and systems, and experience with performance modeling and analysis. The position offers competitive compensation, including stock options, and world-class benefits such as premier insurance plans, 401k matching, and flexible paid time off. Remote work options are available for US and Canada-based candidates. Modular fosters a collaborative and supportive work environment.

Requirements

  • In-depth knowledge of C++ and low-level (micro)architectural performance is required
  • 4+ years of experience working on complex code and systems
  • Experience with performance modeling and performance data analysis
  • Understanding of parallelization techniques for ML/HPC acceleration
  • Deep interest in machine learning technologies and use cases
  • Creativity and curiosity for solving complex problems, a team-oriented attitude that enables you to work well with others, and alignment with our culture

Responsibilities

  • Design and optimize high-performance ML numeric and data manipulation kernels/operators
  • Utilize low-level C/C++/Assembly programming to achieve state-of-the-art performance. This work may also involve introducing new compiler and tooling support
  • Work with compiler, framework, runtime, and performance teams to deliver end-to-end performance that fully utilizes today's complex server and mobile systems
  • Collaborate with architects and hardware engineers to co-design future accelerators, including the ISA for new hardware features and the evolution of existing ISAs
  • Collaborate with machine learning researchers to guide system development for future ML trends

Preferred Qualifications

  • Some knowledge of compiler fundamentals is valuable, as is familiarity with kernel authoring paradigms (e.g., OpenMP, CUDA, Halide, Rise/Lift, or others)
  • Experience with performance profilers, performance data analysis tools, visualization tools, and debugging, or experience working with embedded systems
  • Experience working with distributed/parallel programming models and an understanding of parallel hardware
  • Experience developing firmware for accelerators and embedded programming
  • Experience with HPC programming and accelerator languages such as CUDA, OpenCL, or SYCL

Benefits

  • Premier insurance plans
  • Up to 5% 401k matching
  • Flexible paid time off
  • Competitive compensation
  • Stock options
  • Team-building events
  • Remote work options are available for US and Canada-based candidates
