Senior Machine Learning Operations Engineer, Computer Vision

Dandy

πŸ“Remote

Summary

Join Dandy's rapidly growing Machine Learning team as a Senior MLOps Engineer and play a key role in the success of our team and company. You will be challenged to learn new technologies, establish best practices, and solve problems independently. We are creating next-generation experiences across the newly 3D-digitized dental stack with ML models, so our ML platform is critical to our success. As a Senior MLOps Engineer, you will drive the development of that platform, which supports the state-of-the-art machine learning models we are building to revolutionize the digital dental industry. You will collaborate with ML engineers and other stakeholders to design, implement, and maintain MLOps pipelines and cloud-based infrastructure. You will also develop and implement automation strategies and monitoring solutions to ensure high quality and compliance.

Requirements

  • 5+ years of software experience and 3+ years of MLOps engineering experience, preferably in a high-growth startup environment
  • Hands-on experience working with ML models for performance and training optimizations, hyperparameter tuning, model monitoring, evaluation, and benchmarking
  • Familiarity with ML frameworks and libraries (e.g., TensorFlow, PyTorch, scikit-learn)
  • Experience building and maintaining CI/CD pipelines with best practices
  • Familiarity with containerization tools (e.g., Docker, Kubernetes) and orchestration platforms (e.g., Kubeflow)
  • Comfort working in a highly agile, intensely iterative software development process
  • Self-motivated and self-managing, with a strong sense of ownership and excellent organizational skills
  • Ability to thrive in a remote-first organization

Responsibilities

  • In collaboration with ML engineers, design and implement MLOps pipelines for 2D & 3D dataset curation, model training, evaluation, optimization, and deployment
  • Manage and optimize cloud-based infrastructure for ML workloads, including scaling, resource allocation and cost management
  • Help engineer information feedback loops to continuously improve our machine learning models
  • Develop and implement automation strategies for model training, evaluation, optimization, and deployment to improve efficiency
  • Develop and manage monitoring solutions using GCP tools like Cloud Monitoring and Cloud Logging to track model performance, system health, and operational metrics
  • Ensure that ML operations comply with data security and privacy regulations, utilizing security features and best practices
  • Collaborate with other stakeholders within Engineering and Data to maintain a high bar for quality in a fast-paced, iterative environment

Preferred Qualifications

  • 1+ years of experience working directly with machine learning model training and evaluation
  • Hands-on experience with at least one major cloud platform (AWS, GCP, or Azure); experience with Google Cloud services (e.g., Vertex AI, BigQuery, Dataflow, Compute Engine, Kubernetes Engine) is a plus

Benefits

  • Healthcare
  • Dental
  • Mental health support
  • Parental planning resources
  • Retirement savings options
  • Generous paid time off
