Summary
Join DataKind as a Machine Learning Engineer and make a significant impact on student graduation rates. You will build and maintain machine learning pipelines for our innovative predictive analytics platform. This remote position, based anywhere in the U.S., offers a salary range of $106,000-$120,000. You will report to the Director of Data Science, Education, and collaborate with various teams. Responsibilities include designing, building, and maintaining machine learning pipelines, providing data support to partners, and contributing to DataKind's overall initiatives. The ideal candidate possesses extensive experience in machine learning, data engineering, and cloud computing.
Requirements
- Alignment with DataKind's mission and values, including our commitment to anti-racism
- Experience working across lines of difference (culture, identity, and time zone)
- At least 3 years of professional work experience in developing and deploying a machine learning product at scale
- Foundational understanding of machine learning and statistical methods for predictive modeling
- Expert in Python
- Experience with cloud computing (GCP preferred)
- Experience with databases (SQL, Postgres, PySpark, and/or other data query languages)
- Experience with Databricks or a similar data intelligence platform
- Experience with data warehousing, orchestration, integration, and ETL tools
- Experience with modern source code management and software repository systems (e.g., Git)
- Experience documenting and implementing RESTful APIs
- Proven track record of successfully managing full-lifecycle machine learning implementation projects with multiple stakeholders
- Solid understanding of software engineering principles and best practices, and of the data science project lifecycle
- Comfort and skill in communicating highly technical information to semi- and non-technical audiences
- Self-motivated, results-driven, and persistent in the face of challenges
Responsibilities
- Design, build, test, and maintain machine learning pipeline architectures (70%)
- Produce high-quality, reusable code for data ingestion, validation, and processing pipelines
- Architect and implement end-to-end ML pipelines including training, retraining, and inference systems for schools using the SST
- Design and build APIs to easily access, integrate, and manage data from different sources
- Ensure data infrastructure is in compliance with data governance and security policies
- Create comprehensive documentation for data infrastructure and ML pipelines, tailored for both technical and non-technical stakeholders
- Advance internal analytics reporting and automation capabilities as needed
- Manage initial data lifecycle processes for new school onboarding including ingestion, transfer, audit, and validation
- Collaborate with data platform partners on integration and data transfer pipelines
- Provide technical guidance to partners on how to share data formatted in alignment with our data model and with appropriate data governance measures
- Address partner concerns regarding data security and ensure their specific requirements are satisfied
- Support data science initiatives through processing, cleaning, and analyzing data as needed
- Support other data team members through code reviews and knowledge sharing across products
- Collaborate with the Product, Engineering, and Research teams to ensure seamless integration and alignment of work
- Effectively communicate project status and manage expectations with internal teams and partner organizations
- Maintain accurate and current project information in project management tools like Asana
Preferred Qualifications
- Experience integrating data from SaaS providers
- Experience in the nonprofit sector and/or in a small startup organization
- Experience in scaling machine learning products, handling data quality and volume
- Certifications in cloud computing
- Advanced experience in machine learning: confident in applying, tuning, and evaluating a wide variety of algorithms
- Experience with software development and/or web-dev work (frontends, dashboards, etc.)
- Track record of strong technical writing for a variety of audiences
- Proven track record of (internal or external) client service orientation
Benefits
- Flexibility and Time Off. Enjoy genuine flexibility that goes beyond adjustable hours. We build in shared time off, company-wide recharge days, bi-weekly meeting-free days, and flexible PTO (with a minimum of 20 vacation days encouraged annually)
- Comprehensive Wellness Support. We care for your total wellbeing with 100% employer-paid medical, vision, and dental benefits for employees (72% for dependents), a wellness reimbursement program for the activities and purchases that matter to you, and 12 weeks of paid parental leave when you need it most
- A Culture of Growth. Every team member receives professional development funding each year, alongside mentorship and advancement opportunities. We invest in your future with a 401(k) plan with 5% employer matching
- Meaningful Connection. Despite being distributed across time zones, we value coming together in person for conferences, strategic planning, and our annual staff retreat