Summary
Join Cobalt's growing Cloud Operations team as an experienced Machine Learning Engineer. You will build and maintain MLOps pipelines and infrastructure for a security-focused production environment, working closely with data scientists, data engineers, and platform engineers. Day to day, you will deploy and monitor ML models, build reusable pipelines, and manage feature stores, operationalizing machine learning to enhance AI-powered pentesting and analytics. The role requires strong skills in Python, ML libraries, data pipelines, cloud platforms, and data engineering concepts. The ideal candidate has experience with MLOps or ML infrastructure and a passion for cybersecurity.
Requirements
- 3+ years of experience in software engineering or DevOps roles, including at least 1 year focused on MLOps or ML infrastructure
- Strong background in deploying machine learning models to production, including model versioning, rollback, and performance tracking
- Advanced proficiency in Python, including common ML libraries (e.g., scikit-learn, MLflow, PyTorch, TensorFlow)
- Strong skills in building and maintaining data pipelines using tools like Apache Airflow, dbt, or similar
- Experience working with cloud platforms (preferably GCP) and infrastructure tools like Docker, Kubernetes, Terraform, or Pulumi
- Solid understanding of data engineering concepts such as batch and streaming ETL, data partitioning, and schema evolution
Responsibilities
- Design, build, and maintain reliable MLOps pipelines that support versioned, testable, and reproducible model training and deployment
- Develop CI/CD pipelines for model promotion, validation, canary testing, and rollback
- Automate model performance monitoring, logging, and alerting to maintain model health in production
- Collaborate with Data Engineering and Data Science teams to build and maintain data pipelines, feature stores, and high-quality training datasets
- Support the creation of ML-friendly data assets that meet latency, freshness, and accuracy requirements
- Integrate robust data validation, lineage tracking, and quality checks throughout the pipeline
- Define and manage scalable infrastructure for model training and inference using container orchestration platforms (e.g., Kubernetes)
- Apply infrastructure-as-code (IaC) principles to build reproducible environments for experimentation and production
- Ensure compliance with security and privacy best practices in model and data handling
- Work side-by-side with data scientists to enable fast experimentation while maintaining production-grade standards
- Facilitate efficient use of GPU/TPU resources, experiment tracking tools, and model registries
- Participate in planning, postmortems, and optimization of our ML platform to improve velocity and reliability
Preferred Qualifications
- Familiarity with cybersecurity, penetration testing workflows, or secure data handling practices
- Comfort working in an agile, fast-paced, and mission-driven startup environment
Benefits
- Grow in a passionate, rapidly expanding team at the forefront of the pentesting industry
- Work directly with experienced senior leaders with ongoing mentorship opportunities
- Earn competitive compensation and an attractive equity plan
- Save for the future with a 401(k) program (US)
- Benefit from medical, dental, vision, and life insurance (US)
- Wellness
- Work-from-home equipment & wifi
- Learning & development
- Make the most of our flexible, generous paid time off and paid parental leave