Senior MLOps Engineer
closed
Tala
Remote - Mexico
Summary
Join Tala's mission to unleash the economic power of the Global Majority by designing and implementing scalable infrastructure for AI/ML systems as a Senior MLOps Engineer.
Requirements
- 4+ years of experience as a DevOps Engineer
- 1 year of previous experience managing AI/ML infrastructure in public cloud environments
- In-depth hands-on experience with at least one public cloud platform, preferably AWS
- Experience with Python or another programming language
- Experience with Docker and Kubernetes in production
- Experience with continuous deployment tools such as Jenkins or ArgoCD
- Experience with SaaS logging and monitoring tools such as Sumo Logic, Splunk, or Datadog
- Proficiency in English
Responsibilities
- Design, build, and maintain scalable and robust infrastructure for AI/ML (Artificial Intelligence / Machine Learning) systems, including cloud-based environments, containerization, and orchestration platforms
- Develop and implement CI/CD pipelines to automate the deployment, testing, and monitoring of AI/ML models and applications
- Evaluate and integrate new tools, technologies, and frameworks to improve the efficiency and effectiveness of our MLOps processes
- Design and manage continuous deployment using Kubernetes, ArgoCD, and Jenkins
- Maintain the related container and model registries
- Monitor infrastructure utilization and costs for model training and inference, including GPU utilization
- Monitor and troubleshoot AI/ML systems to ensure high availability, performance, and reliability
This job is filled or no longer available