Staff ML Engineer

Oportun
Summary
Join Oportun, a mission-driven fintech company, as a Staff ML Engineer and build self-serve platforms that combine real-time ML deployment with advanced data engineering. In this role you will design microservices-based solutions on Kubernetes and Docker; create APIs and backend services in Python and FastAPI; architect platforms for real-time ML inference on AWS SageMaker and Databricks; build and optimize ETL/ELT pipelines with PySpark and Pandas; design scalable, distributed data pipelines on AWS that integrate a range of data stores; implement data lake and data warehouse solutions; and deliver robust CI/CD pipelines with Jenkins and GitHub Actions.
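For candidates who want a concrete flavor of the work, here is a minimal sketch of the kind of FastAPI backend service described above, used to submit and monitor ML pipeline runs. The endpoint names, request model, and in-memory registry are illustrative assumptions for this sketch, not Oportun's actual API.

```python
# Minimal sketch of a FastAPI service for managing ML pipeline runs.
# Endpoint names and the in-memory store are illustrative assumptions.
from uuid import uuid4

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI(title="ml-platform-api")

# Hypothetical in-memory registry standing in for a real metadata store
# (e.g., DynamoDB or PostgreSQL in the stack named above).
RUNS: dict[str, dict] = {}

class PipelineRun(BaseModel):
    pipeline: str
    params: dict = {}

@app.post("/runs")
def create_run(run: PipelineRun) -> dict:
    """Register a new pipeline run and return its id."""
    run_id = str(uuid4())
    RUNS[run_id] = {"pipeline": run.pipeline, "params": run.params, "status": "queued"}
    return {"run_id": run_id, **RUNS[run_id]}

@app.get("/runs/{run_id}")
def get_run(run_id: str) -> dict:
    """Fetch the status of a previously submitted run."""
    if run_id not in RUNS:
        raise HTTPException(status_code=404, detail="run not found")
    return RUNS[run_id]
```

In practice a service like this would persist run metadata to one of the data stores listed below and be deployed through the platform's CI/CD pipeline.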
Requirements
- 10-15 years of experience in platform engineering, backend engineering, DevOps, or data engineering roles
- 5 years of experience as an architect building platforms that scale
- Hands-on experience with real-time ML model deployment and data engineering workflows
- Strong expertise in core Python and experience with Pandas, PySpark, and FastAPI
- Proficiency with containerization and orchestration tools such as Docker and Kubernetes (K8s)
- Advanced knowledge of AWS services like SageMaker, Lambda, DynamoDB, EC2, and S3
- Proven experience building and optimizing distributed data pipelines using Databricks and PySpark
- Solid understanding of databases such as MongoDB, DynamoDB, MariaDB, and PostgreSQL
- Proficiency with CI/CD tools like Jenkins, GitHub Actions, and related automation frameworks
- Hands-on experience with observability tools like New Relic for monitoring and troubleshooting
Responsibilities
- Design and build self-serve platforms that support real-time ML deployment and robust data engineering workflows
- Develop microservices-based solutions using Kubernetes and Docker for scalability, fault tolerance, and efficiency
- Create APIs and backend services using Python and FastAPI to manage and monitor ML workflows and data pipelines
- Architect and implement platforms for real-time ML inference using tools like AWS SageMaker and Databricks
- Enable model versioning, monitoring, and lifecycle management with observability tools such as New Relic
- Build and optimize ETL/ELT pipelines for data preprocessing, transformation, and storage using PySpark and Pandas (a minimal sketch follows this list)
- Develop and manage feature stores to ensure consistent, high-quality data for ML model training and deployment
- Design scalable, distributed data pipelines on platforms like AWS, integrating data stores such as DynamoDB, PostgreSQL, MongoDB, and MariaDB
- Implement data lake and data warehouse solutions to support advanced analytics and ML workflows
- Design and implement robust CI/CD pipelines using Jenkins, GitHub Actions, and other tools for automated deployments and testing
- Automate data validation and monitoring processes to ensure high-quality and consistent data workflows
- Create and maintain detailed technical documentation, including high-level and low-level architecture designs
- Collaborate with cross-functional teams to gather requirements and deliver solutions that align with business goals
- Participate in Agile processes such as sprint planning, daily standups, and retrospectives using tools like Jira
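For illustration only, here is a minimal PySpark sketch of the kind of ETL step described in the responsibilities above: read raw events, clean and aggregate them, and write partitioned output. The S3 paths, column names, and schema are assumptions made for the example, not details of Oportun's actual pipelines.

```python
# Illustrative PySpark ETL step: read raw events, clean and aggregate,
# write partitioned Parquet. Paths, columns, and schema are assumptions.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("etl-sketch").getOrCreate()

raw = spark.read.json("s3://example-bucket/raw/events/")  # hypothetical path

clean = (
    raw.dropDuplicates(["event_id"])            # dedupe on an assumed key
       .filter(F.col("amount").isNotNull())     # drop incomplete records
       .withColumn("event_date", F.to_date("event_ts"))
)

daily = clean.groupBy("event_date", "customer_id").agg(
    F.sum("amount").alias("total_amount"),
    F.count("*").alias("event_count"),
)

# Partition by date so downstream ML and analytics jobs can prune efficiently.
daily.write.mode("overwrite").partitionBy("event_date").parquet(
    "s3://example-bucket/curated/daily_activity/"
)
```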