Staff Machine Learning Research Scientist at Crisis Text Line

Summary

Join Crisis Text Line's Research & Impact department as a Staff Machine Learning Research Scientist. You will lead machine learning initiatives to analyze over 300 million text messages, generating insights to optimize Crisis Text Line's service and improve mental health support. Leveraging NLP and ML models, you'll identify trends, build models, and visualize data to communicate findings to both technical and non-technical audiences. Collaboration with various teams and external partners is key. This role requires extensive experience in ML/AI/NLP and strong communication skills. Crisis Text Line offers a comprehensive benefits package, including paid time off, health insurance, retirement plan, parental leave, and various stipends.

Requirements

6+ years of combined training and experience in computational social science, natural language processing, computer science, or related disciplines
Theoretical and practical understanding of ML/AI/NLP models for both structured and unstructured data
Experience in programming languages used for data manipulation, computational statistics, distributed computing, and ML workflows (such as Python, Spark, R, MATLAB, C++, Java, Go), and with SQL
Ability to write clean and modular code, maintained with version control tools
Extensive hands-on experience with multiple NLP techniques (task-specific fine-tuning of large language models, text parsing, lemmatization, topic modeling, named entity recognition, text classification, relation extraction, sentiment analysis, etc.)
Skills in translating research results into non-technical insights for broad consumption
Reliable High-Speed Internet Required: Must have a stable high-speed internet connection to support seamless remote collaboration, virtual meetings, online job tasks, etc.

Responsibilities

Lead Machine Learning to Deliver New Research Insights for Crisis Text Line
ML/AI/NLP engineering and model development: lead the design and implementation of custom ML/deep learning/NLP pipelines to analyze conversational data. Pipelines will include data ingestion, preprocessing, feature generation, model selection and development, and fine-tuning. Using Python with frameworks like scikit-learn, spaCy, NLTK, Hugging Face, TensorFlow, PyTorch, Transformers, Spark, and/or similar tools, use traditional ML and Large Language Model (LLM)/transformer architectures and related strategies to analyze large datasets. This work will support research projects on mental health crises; briefs related to mental health, coping, and volunteering; and mental health and support disparities in the United States and globally
ML/AI/NLP model evaluation: contribute to the design and implementation of pipelines to evaluate model performance, accuracy, and reliability; and to evaluate and mitigate bias
Fully own model pipelines, from data collection and labeling to deployment
Perform statistical analyses (e.g., hypothesis testing, linear regression, logistic regression, linear mixed-effects models) in R or Python to contribute to the team’s scientific output
Use Spark and/or SQL to clean and transform data, join tables, and create automated ETL jobs to regularly update datasets, ensuring availability for analysis and reporting
Support or lead collaborative strategic research sprints, adhering to rigorous, industry-standard research methods and documentation practices, and ensuring research reproducibility
Visualize data (matplotlib, ggplot, plotly) to support internal and external communication of research findings
Write research grants, briefs, memos, technical reports, and scientific manuscripts for peer-reviewed publications
Communicate and share model performance and impact in a digestible way with the rest of the team and organization
Support efforts to communicate results for both technical and non-technical audiences, including ongoing management of organization-wide internal data insights and research requests as a member of the core workflow team
Lead ML/AI Documentation and Coding Practices on the Research and Impact Team
Implement industry-standard documentation and coding protocols and practices on the Research and Impact team
Provide technical assistance, code reviews, and mentorship to other members of the Research & Impact team related to ML/AI/NLP model development and implementation
Support the development or review of external-facing content referencing Crisis Text Line data and insights, as appropriate

Benefits

20 paid holidays, including federal holidays such as Juneteenth and Labor Day, Election Day, a holiday break from December 24 through January 1, 2 renewal days, and 2 floating holidays
Flexible paid time off, including: 15 vacation days, 3 personal days, 7 sick days
Medical, dental, and vision benefits for the staff member and family at no cost to the employee
403(b) retirement plan (the nonprofit equivalent of a 401(k)): 3% contribution by Crisis Text Line to support building financial wellness, regardless of personal contribution
12 weeks paid parental leave (after 6 months of employment)
Student loan repayment (after 2 years of continuous full-time service)
Family support through a virtual childcare platform
Stipends/Allowances: Mental health (Monthly), Internet Service (Monthly), Professional Development (Annual), Wellness (Annual), Home office setup (One time/First year)

Remote

Data

Senior
