Machine Learning Ops Engineer at iHorizons

Summary

Join our team as an ML Ops Engineer and be responsible for designing, building, and maintaining scalable machine learning pipelines. You will deploy models to production, manage infrastructure, implement CI/CD, and ensure reliability and scalability. The role involves collaboration with data scientists, performance optimization, and managing containers and APIs. You will also be responsible for monitoring, security, and implementing disaster recovery plans. This position reports to the Manager – AI and requires a Bachelor's or Master's degree in a related field and 4+ years of experience.

Requirements

Bachelor’s or Master’s degree in Computer Science, Engineering, Data Science, or a related field
4+ years of proven experience as an ML Ops Engineer or in a similar role in a production environment
Experience with the Azure cloud platform
Experience with containerization technologies (Docker, Kubernetes)
Experience with API management tools (Kong)
Strong programming skills in Python
Proficiency in CI/CD tools
Familiarity with machine learning frameworks (TensorFlow, PyTorch)
Strong understanding of DevOps practices and principles
Excellent problem-solving skills and attention to detail
Strong communication and collaboration skills

Responsibilities

Design, build, and maintain scalable ML pipelines to ensure efficient data processing and model deployment
Develop and manage APIs to support machine learning models and services
Ensure seamless integration between machine learning models and external applications
Utilize API management tools to monitor and secure API calls, enforcing access control and data protection measures
Deploy machine learning models to various environments, including testing and production, ensuring seamless integration and functionality
Ensure the reliability, availability, and scalability of ML pipelines by implementing robust monitoring and alerting systems
Provision resources for pipeline operations, managing compute, storage, and networking to optimize performance and cost-efficiency
Develop and maintain CI/CD pipelines tailored for ML models and applications
Automate the build, test, and deployment processes
Utilize containerization technologies such as Docker and Kubernetes for deploying ML models, ensuring consistency and portability across environments
Manage and orchestrate containers effectively to optimize resource utilization and maintain scalability
Implement comprehensive monitoring and logging solutions to track the performance of ML models and pipelines, enabling proactive issue detection and resolution
Set up robust alerting systems to detect and respond to issues and anomalies promptly, minimizing downtime and performance degradation
Ensure compliance with security standards and regulations, implementing measures to protect data privacy and model security
Continuously monitor and optimize the performance of ML models and infrastructure, identifying and resolving bottlenecks to improve system efficiency
Respond to and resolve incidents related to ML operations promptly
Set up and manage both cloud and on-premises infrastructure to support ML operations
Optimize models and infrastructure for performance and scalability in production environments, ensuring efficient and reliable operations
Manage resource allocation to ensure cost-effective operations
Develop scripts and automation tools to streamline ML operations, automating repetitive tasks to improve operational efficiency
Implement backup and disaster recovery plans for ML models and data
Ensure data and model availability in case of failures
Conduct root cause analysis of incidents and implement preventive measures to reduce recurrence
Collaborate closely with data scientists and engineers throughout the ML lifecycle, from model development and testing to deployment and maintenance
Collaborate with data scientists and AI researchers to develop and test machine learning models
Provide support and guidance on best practices for ML operations, facilitating effective teamwork and knowledge sharing
Implement best practices for model versioning, testing, and validation

Preferred Qualifications

AWS experience

Remote

DevOps

Mid-level