Senior C++/Deep Learning Engineer, GPU Optimization
closed
Torc Robotics
💵 $177k-$212k
📍 Remote - United States
Summary
Join Torc and catapult your career with the company that helped pioneer autonomous technology, and the first AV software company with the vision to partner directly with a truck manufacturer.
Requirements
- Bachelor's degree in computer science, data science, artificial intelligence, or a related field with 6+ years of professional experience, or a master's degree with 3+ years of experience
- Mastery of modern C++ (C++14 or later) and Python, with the ability to write efficient and maintainable code for both performance and flexibility
- Familiarity with object-oriented software design patterns and their implementation in C++
- In-depth knowledge of CUDA programming and experience with optimizing deep learning kernels
- Excellent understanding of parallel computing (GPGPU) and high-performance computing (HPC) concepts
Responsibilities
- Optimize machine learning inference models for NVIDIA Orin execution
- Leverage data parallelism and CUDA programming
- Implement TensorRT plugins
- Stay abreast of the latest advancements in PyTorch, maximizing their potential for target hardware execution
- Collaborate with machine learning engineers to develop innovative and performant deep learning solutions
- Analyze and optimize deep learning inference using profiling and optimization tools, identifying and eliminating performance bottlenecks
- Contribute to the development of internal tools and libraries to further enhance deep learning performance on the target hardware
- Document your work clearly and concisely, sharing knowledge effectively with team members
Benefits
- 100% paid medical, dental, and vision premiums for full-time employees
- 401(k) plan with a 6% employer match
- Flexibility in schedule and generous paid vacation (available immediately after start date)
- Company-wide holiday office closures
- AD&D and life insurance