Staff Data Scientist, Machine Learning

Valo Health
Summary
Join Valo Health as a Staff Data Scientist, Machine Learning and become a core member of a team building a computational platform for advancing new medicine research and development. You will design, develop, and apply machine learning models and pipelines using clinical and biomedical data. Collaboration with various scientific and engineering teams is crucial. This role requires extensive experience in machine learning, data science, and software development, along with proficiency in Python and relevant tools. The ideal candidate will possess a strong work ethic and the ability to manage multiple priorities. Valo Health offers a competitive salary and the opportunity to contribute to life-changing drug discoveries.
Requirements
- Degree in a quantitative field with 7+ (BS), 5+ (MS), or 3+ (PhD) years of post-degree experience or equivalent
- Broad experience in ML including supervised learning, unsupervised learning, dimensionality reduction, clustering, metrics, model selection, feature selection, and explainability (3+ years required)
- Demonstrated experience with ML on electronic health records (2+ years required)
- Proficient in Python (5+ years required) and experience with ML and data science packages (e.g., scikit-learn, statsmodels, scipy, MLlib)
- Experience with MLOps methodology such as workflow orchestration (e.g., Airflow, Prefect), experiment tracking (e.g., MLflow), containerization (e.g., Docker), and reproducible research (3+ years required)
- Experience with collaborative software development practices (e.g., source control with git, unit testing, code review, CI/CD) (3+ years required)
- Experience with large-scale data analytics engines (e.g., Spark or Dask) and working in cloud environments (e.g., AWS) (2+ years required)
- Experience with statistical methods such as hypothesis testing, longitudinal modeling, and time-to-event analysis
- Strong work ethic with a bias for execution and an ability to manage multiple priorities, ambiguity, and tight timelines. Ability to work effectively in teams or independently
Responsibilities
- Propose, design, and develop ML approaches on high-dimensional electronic health records and omics data leveraging Valo's proprietary platform (data assets and data science packages)
- Design, develop, and support ML pipelines, workbenches, and dashboards to enable users to solve scientific problems
- Develop well-designed, tested, and documented software packages
- Collaborate with cross-functional teams and stakeholders to derive user requirements, maintain alignment, and ensure the relevance and impact of models, analyses, and pipelines
- Be an active team member in code, design, and analysis review
Preferred Qualifications
- Experience with omics data is a plus
- Familiarity with the drug discovery and development process is a plus
Benefits
$175,000 – $235,000 USD