Summary
The job is for an experienced ML acceleration engineer at Stack AV, a company focused on revolutionizing the trucking industry with autonomous solutions. The role involves analyzing and optimizing machine learning models to enhance performance and streamline deployment across various hardware platforms.
Requirements
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field
- 5+ years of experience, including GPU programming and optimization
- Strong analytical and problem-solving skills
- Excellent verbal and written communication skills, with the ability to convey complex technical concepts to non-technical stakeholders
Responsibilities
- Analyze and profile ML models to identify performance bottlenecks
- Use OSS tooling to enhance the platform for model profiling and optimization
- Automate exporting models to optimized formats (e.g., TensorRT) and deploying them
- Implement optimizations using CUDA, Triton, and custom kernels
- Collaborate with ML researchers to balance model accuracy and speed