Senior AI/ML Infrastructure Engineer
Lyra Health
Job highlights
Summary
Join Lyra Health as a Sr. AI/ML Infrastructure Engineer and contribute to building scalable infrastructure for training, evaluating, deploying, and monitoring machine learning models. This role involves developing generative AI services, creating data systems, deploying applications in Kubernetes clusters, and collaborating with machine learning engineers. The position requires 5+ years of experience in building production-level ML platforms and a strong understanding of ML models and principles. This role can be based in Burlingame, CA, or fully remote within the US. Lyra offers a competitive salary ($159,000 - $219,000), comprehensive healthcare coverage, equity, paid time off, parental leave, 401(k) benefits, and a monthly tech allowance.
Requirements
- 5+ years of industry experience building production-level ML platforms and infrastructure, including experience building ML systems and pipelines from the ground up
- Ability to write high-quality code in Python, Java or Scala
- Experience building production-ready RESTful APIs and scaling platforms in production to a large number of users
- A desire to own large parts of an ML Platform, with a strong understanding of ML models & principles
- Experience working with containers and deploying applications to Kubernetes
- Experience with LLMs and building infrastructure to support LLM applications
- Experience with relational and low-latency databases
- Experience with transforming data in both batch and streaming contexts
- A desire to learn new technologies quickly, and a proven track record of making quality vs. deadline tradeoffs in fast-paced environments
- Ability to scope out a large project and manage it through project delivery
- Strong communication skills and ability to generate consensus and buy-in within the team
- Organizational skills and the ability to simplify complex problems and prioritize what matters most for the sake of the team and the business
Responsibilities
- Be part of a team building scalable infrastructure to train, evaluate, deploy, run inference on, and monitor our ML models
- Build, deploy and maintain generative AI services & applications
- Create data systems to collect, clean, label and store data used for model features
- Deploy and manage various applications in our Kubernetes clusters
- Collaborate with machine learning engineers to build and support state-of-the-art experimentation platforms, training frameworks, and associated tools
- Work with stakeholders on requirements and solutions for ML infrastructure
- And of course, you will be coding every day!
Preferred Qualifications
- Experience working with highly sensitive data in a healthcare environment
- Experience working with ML frameworks such as PyTorch, scikit-learn, and XGBoost
- Experience working with MLOps tools such as MLflow, Kubeflow, and AWS SageMaker
- Experience building solutions on cloud infrastructure, particularly AWS
Benefits
- Comprehensive healthcare coverage (including medical, dental, vision, FSA/HSA, life and disability insurances)
- Lyra for Lyrians: coaching and therapy services
- Equity in the company through discretionary restricted stock units
- Competitive paid time off policies, including vacation, sick days, and company holidays
- Paid parental leave
- 401(k) retirement benefits
- Monthly tech allowance