Senior Staff Data Engineer

Oportun

πŸ“Remote - India

Summary

Join Oportun as a Senior Staff Data Engineer and lead the development of our data infrastructure. You will architect and deploy end-to-end MLOps pipelines, implement CI/CD pipelines, and optimize ML training workloads, collaborating with diverse teams to craft solutions and elevate our data engineering capabilities. This pivotal role requires extensive experience in data engineering, MLOps, and cloud platforms. Your expertise will help advance our products and deliver a positive impact for our clients.

Requirements

  • Requires 12+ years of related data engineering experience with a Bachelor's degree in Computer Science, or a Master's degree with an equivalent combination of education and experience
  • Experience in MLOps, ML infrastructure, or ML platform engineering
  • Expert in Databricks, with hands-on experience in MLflow, model serving, and Delta Lake
  • Strong background in CI/CD automation for ML models using GitHub Actions and Terraform
  • Proficiency in Python, Spark, and SQL, specifically for ML pipeline automation
  • Experience with feature stores (Databricks Feature Store) for ML feature standardization
  • Extensive experience building end-to-end data engineering infrastructure for complex, large-scale applications
  • Proven record of technical leadership, guiding engineering teams to strong outcomes and innovation
  • Deep expertise in data engineering architectures and frameworks for batch and stream processing, such as the Hadoop ecosystem, the Medallion architecture, and Databricks or equivalent data warehouse / data lake platforms, along with Python / PySpark programming
  • Thorough comprehension of software engineering principles, version control (Git), and collaborative development workflows
  • Adeptness with cloud platforms (AWS / Azure / GCP) and utilization of cloud-native services for crafting robust data engineering infrastructure
  • Track record of successfully integrating DevOps practices, continuous integration, and continuous deployment (CI/CD) pipelines
  • Strong problem-solving skills and the ability to work through complex technical challenges
  • Excellent communication skills, capable of fostering effective collaboration across diverse teams and stakeholders

Responsibilities

  • Model Deployment & Infrastructure Automation
  • Architect and deploy end-to-end MLOps pipelines on Databricks for automated model training, deployment, and versioning
  • Implement CI/CD pipelines for ML models using GitHub Actions and Terraform
  • ML Model Orchestration & Management
  • Leverage Databricks MLflow for model tracking, versioning, and lifecycle management
  • Deploy models using Databricks Model Serving and Amazon SageMaker endpoints
  • Build real-time ML inference pipelines with Databricks Jobs and Delta Live Tables
  • Optimize Spark-based ML training workloads for scalability and cost efficiency
  • Model Monitoring & Observability
  • Implement automated model monitoring for drift detection, bias tracking, and performance degradation
  • Set up logging, alerting, and monitoring using Databricks Unity Catalog, MLflow, and New Relic
  • Enable automated model rollback strategies based on performance degradation thresholds
  • Implement A/B testing and shadow deployments for model validation before full rollout
  • Security, Compliance & Governance
  • Ensure model security and compliance with GDPR, SOC 2, HIPAA, and FTC standards
  • Use Unity Catalog and lakehouse governance for access control, model lineage, and auditability
  • Establish data encryption, identity and access management (IAM), and secure model serving practices
  • Enforce reproducibility and explainability using MLflow and Databricks lakehouse governance
  • Scaling & Performance Optimization
  • Optimize ML training and inference costs by tuning Databricks clusters, Spark jobs, and Delta Lake performance
  • Scale MLOps workflows to handle large volumes of data and concurrent model deployments
  • Automate GPU/TPU resource allocation for high-performance training workloads
  • Set the strategic vision and lead the implementation of a cutting-edge data infrastructure roadmap, encompassing all facets as highlighted above
  • Provide exceptional technical leadership, mentoring, and guidance to a team of data engineers, fostering a culture of continuous learning and innovation
  • Collaborate closely with data scientists to translate complex model requirements into optimized data pipelines, ensuring high-quality data processing and integration
  • Spearhead the establishment of best practices for model versioning, experiment tracking, and model evaluation to ensure transparency and reproducibility
  • Engineer automated CI/CD pipelines that facilitate seamless deployment, monitoring, and continuous optimization for code and configurations in data engineering
  • Define and refine performance benchmarks, and optimize data infrastructure to achieve peak correctness, availability, cost efficiency, scalability, and robustness
  • Take ownership and responsibility as a highly motivated self-starter while working in a collaborative, interdependent team environment
  • Work with multiple teams of data engineers to design, develop, and test major software and data systems components using an agile, scrum methodology
  • Drive strong data engineering practices around product development execution, operational excellence in observability, quality, reliability, and developer efficiency
  • Remain at the forefront of industry trends and emerging technologies, expertly integrating the latest advancements into our data ecosystem

Preferred Qualifications

  • Experience or knowledge in the financial services domain
  • Experience with one or more data processing frameworks on AWS is a strong plus
  • Ability to handle multiple competing priorities in a fast-paced environment

