Staff Machine Learning Operations Engineer

SandboxAQ
Summary
Join SandboxAQ as a Staff MLOps Engineer and contribute to the advancement of AI solutions for global challenges. You will mature MLOps practices by building infrastructure and application code, embedding with R&D teams, and championing tool adoption. This hands-on role involves close collaboration with data engineers, software developers, and scientists to deliver cutting-edge AI solutions in chemistry and life sciences. The position offers the potential to lead a team of MLOps engineers. You will apply your expertise in automated training, evaluation, and maintenance of models, efficient inference systems, and dataset/model versioning. The ideal candidate has experience with both highly automated and ad hoc pipelines, thrives in fast-paced environments, and excels at problem-solving across the software stack.
Requirements
- 7+ years of experience with MLOps fundamentals
- Automated training, evaluation, and retraining loops
- Dataset and model versioning tools (Weights & Biases preferred)
- Systems for serving inference
- Experience building end-to-end ML pipelines using industry-standard tools
- Deep experience with at least one major cloud provider (GCP preferred)
- Experience with managing complex data governance requirements
- Familiarity with MLOps architectures and best practices for LLMs and agentic systems
- 3+ years of experience with infrastructure-as-code management of public cloud providers, including familiarity with Terraform (GCP preferred)
- Familiarity with building and maintaining CI/CD pipelines for ML systems
- 3+ years of experience with Python, with strong knowledge of software design principles
- Familiarity with building data pipelines or data processing systems at scale, including orchestration tools like Airflow
- Excellent communication and collaboration skills, with the ability to effectively influence a cross-functional team
Responsibilities
- Mature our MLOps practice by defining processes and building fundamental tooling
- Embed closely with R&D teams to assist in delivering project goals and drive adoption of practices and tooling
- Drive the design and implementation of complex, security-sensitive data processing and storage systems with complex tenancy and data isolation requirements
- Collaborate closely with the product team and internal stakeholders in all phases of software development to validate the solutions you propose and implement
- In collaboration with the rest of the engineering team, build and manage infrastructure for SandboxAQ's simulation and data platform
- Review code and participate in design and architectural discussions
Preferred Qualifications
- Domain experience in advanced materials, drug discovery, cheminformatics, or other areas of chemistry or biology, especially experience with AI systems applied to these domains
- Experience with AI applications in knowledge graphs
- Experience profiling and optimizing GPU usage in MLOps applications
Benefits
- Medical/dental/vision
- Family planning/fertility
- PTO (summer and winter breaks)
- Financial wellness resources
- 401(k) plans
- Competitive salaries
- Stock options depending on employment type
- Generous learning opportunities
