Data & ML Infrastructure Architect

Motional

💵 $205k-$282k
📍 Remote - United States

Summary

Join Motional as a Machine Learning & Data Infrastructure Architect to lead the technical vision and architecture for the systems powering our machine learning lifecycle. This mission-critical role involves shaping the infrastructure supporting terabytes of daily sensor data and petabyte-scale datasets for autonomous vehicle development. You will own the architecture of Motional’s ML data infrastructure, design and evolve infrastructure for petabyte-scale machine learning workflows, and architect high-throughput systems for distributed training. Responsibilities also include establishing robust data governance and collaborating with cross-functional teams. You will lead technical strategy and roadmap development, mentor engineers, and promote engineering excellence. This role requires 15+ years of software engineering experience with significant architecture-level ownership in ML, data infrastructure, or high-scale systems.

Requirements

  • 15+ years of meaningful software engineering experience, including significant architecture-level ownership in ML, data infrastructure, or high-scale systems
  • Proven experience leading the design of ML platforms that serve large-scale training and inference workloads
  • Deep technical fluency in distributed storage, high-volume data pipelines, and data compression strategies for ML use cases
  • Strong knowledge of Linux systems, Python, and C++ or similar performance-oriented languages
  • Experience operating in hybrid environments: bare metal, HPC, and public cloud (AWS/GCP/Azure)
  • Comfortable owning cross-org initiatives and influencing system-level design across autonomy, simulation, and platform teams

Responsibilities

  • Own the architecture of Motional’s ML data infrastructure, enabling scalable ingestion, storage, curation, and access for 100+ engineers and researchers across autonomy teams
  • Design and evolve infrastructure to support petabyte-scale machine learning workflows, including multimodal perception data, synthetic data, simulation output, and continuous training pipelines
  • Architect high-throughput systems for distributed training on large GPU clusters, driving significant improvements in utilization, throughput, and job efficiency
  • Establish robust data governance, observability, and retention strategies to ensure compliance, reproducibility, and long-term data utility
  • Collaborate cross-functionally with ML engineers, autonomy researchers, data engineers, and DevOps to ensure tight integration between infrastructure and user workflows
  • Lead technical strategy and roadmap development for the ML & Data Platform team, incorporating cutting-edge tools and best practices from industry and open source
  • Mentor and influence engineers across teams, promoting engineering excellence in distributed systems, ML platforms, and autonomy-scale data management

Preferred Qualifications

  • Prior work in robotics, autonomous vehicles, or safety-critical domains strongly preferred
  • Experience building or leading infrastructure at a top-tier ML/AI company or AV program
  • Background contributing to open-source ML or data infrastructure projects
  • Familiarity with ML experiment tracking, model evaluation pipelines, and versioned data systems

Benefits

Medical, dental, vision, 401(k) with a company match, health savings accounts, life insurance, pet insurance
