Senior Staff Data Engineer

Oportun

πŸ“Remote - India

Summary

Join Oportun as a Senior Staff Data Engineer and lead the development of our data infrastructure. You will architect and deploy end-to-end MLOps pipelines, implement CI/CD pipelines, and optimize ML training workloads, collaborating with diverse teams to craft solutions and elevate our data engineering capabilities. This pivotal role requires extensive experience in data engineering, MLOps, and cloud platforms. Your expertise will help advance our products and deliver a positive impact for our clients.

Requirements

  • Requires 12+ years of related data engineering experience with a Bachelor's degree in Computer Science, or a Master's degree with an equivalent combination of education and experience
  • Experience in MLOps, ML infrastructure, or ML platform engineering
  • Expert in Databricks, with hands-on experience in MLflow, model serving, and Delta Lake
  • Strong background in CI/CD automation for ML models using GitHub Actions and Terraform
  • Proficiency in Python, Spark, and SQL, specifically for ML pipeline automation
  • Experience with feature stores (Databricks Feature Store) for ML feature standardization
  • Extensive experience building end-to-end data engineering infrastructure for complex, large-scale applications
  • Proven record of technical leadership, guiding engineering teams to strong outcomes and innovation
  • Deep expertise in data engineering architectures and frameworks for batch and stream processing, such as the Hadoop ecosystem, the Medallion architecture, and Databricks or equivalent data warehouse / data lake platforms, along with Python / PySpark programming
  • Thorough comprehension of software engineering principles, version control (Git), and collaborative development workflows
  • Adeptness with cloud platforms (AWS / Azure / GCP) and utilization of cloud-native services for crafting robust data engineering infrastructure
  • Track record of successfully integrating DevOps practices, continuous integration, and continuous deployment (CI/CD) pipelines
  • Strong problem-solving skills and the ability to work through complex technical challenges
  • Excellent communication skills, capable of fostering effective collaboration across diverse teams and stakeholders

Responsibilities

  • Model Deployment & Infrastructure Automation
  • Architect and deploy end-to-end MLOps pipelines on Databricks for automated model training, deployment, and versioning
  • Implement CI/CD pipelines for ML models using GitHub Actions and Terraform
  • ML Model Orchestration & Management
  • Leverage Databricks MLflow for model tracking, versioning, and lifecycle management
  • Deploy models using Databricks Model Serving and Amazon SageMaker endpoints
  • Build real-time ML inference pipelines with Databricks Jobs and Delta Live Tables
  • Optimize Spark-based ML training workloads for scalability and cost efficiency
  • Model Monitoring & Observability
  • Implement automated model monitoring for drift detection, bias tracking, and performance degradation
  • Set up logging, alerting, and monitoring using Databricks Unity Catalog, MLflow, and New Relic
  • Enable automated model rollback strategies based on performance degradation thresholds
  • Implement A/B testing and shadow deployments for model validation before full rollout
  • Security, Compliance & Governance
  • Ensure model security and compliance with GDPR, SOC 2, HIPAA, and FTC standards
  • Use Unity Catalog and lakehouse governance for access control, model lineage, and auditability
  • Establish data encryption, identity and access management (IAM), and secure model serving practices
  • Enforce reproducibility and explainability using MLflow and Databricks lakehouse governance
  • Scaling & Performance Optimization
  • Optimize ML training and inference costs by tuning Databricks clusters, Spark jobs, and Delta Lake performance
  • Scale MLOps workflows to handle large volumes of data and concurrent model deployments
  • Automate GPU/TPU resource allocation for high-performance training workloads
  • Set the strategic vision and lead the implementation of a cutting-edge data infrastructure roadmap, encompassing all facets as highlighted above
  • Provide exceptional technical leadership, mentoring, and guidance to a team of data engineers, fostering a culture of continuous learning and innovation
  • Collaborate closely with data scientists to translate complex model requirements into optimized data pipelines, ensuring high-quality data processing and integration
  • Spearhead the establishment of best practices for model versioning, experiment tracking, and model evaluation to ensure transparency and reproducibility
  • Engineer automated CI/CD pipelines that facilitate seamless deployment, monitoring, and continuous optimization for code and configurations in data engineering
  • Define and refine performance benchmarks, and optimize data infrastructure to achieve peak correctness, availability, cost efficiency, scalability, and robustness
  • Take ownership and responsibility as a highly motivated self-starter while working in a collaborative, interdependent team environment
  • Work with multiple teams of data engineers to design, develop, and test major software and data systems components using an agile, scrum methodology
  • Drive strong data engineering practices around product development execution, operational excellence in observability, quality, reliability, and developer efficiency
  • Remain at the forefront of industry trends and emerging technologies, expertly integrating the latest advancements into our data ecosystem

Preferred Qualifications

  • Experience or knowledge in the financial services domain
  • Experience with one or more data processing frameworks on AWS is a strong plus
  • Ability to handle multiple competing priorities in a fast-paced environment

