Remote Databricks Engineer

LightFeather

💵 $100k-$160k
📍 Remote - Worldwide

Summary

LightFeather is seeking a skilled Databricks Engineer to join their team. The successful candidate will design, implement, and optimize data pipelines using Databricks, collaborate with stakeholders, and ensure seamless data flow and efficient data processing. This is a full-time remote position.

Requirements

  • US Citizenship
  • Active clearance at the Public Trust level or higher; IRS clearance preferred
  • Bachelor’s degree or equivalent experience preferred
  • 5+ years of hands-on experience with Databricks, including designing and managing large-scale data pipelines
  • Proficiency in ETL tools and techniques, with a strong understanding of data integration from sources like Google Analytics (GA4), Splunk, and Medallion
  • Solid experience with SQL, Python, and Spark for data processing and transformation
  • Familiarity with cloud platforms such as AWS, Azure, or Google Cloud, with a focus on their data services
  • Experience with other big data technologies such as Apache Airflow
  • Knowledge of data warehousing concepts and best practices
  • Familiarity with data visualization tools such as Tableau, Power BI, or Looker
  • Proven experience in designing and deploying Databricks infrastructure on cloud platforms, preferably AWS
  • Deep understanding of Apache Spark, Delta Lake, and their integration within the Databricks environment (see the sketch after this list)
  • Proficient in Terraform for implementing infrastructure as code (IaC) solutions
  • Strong expertise in Python, especially in developing notebooks for data analysis within Databricks
  • Demonstrated ability to design and implement complex data pipelines with ETL processes for large-scale data aggregation and analysis
  • Knowledge of best practices for infrastructure scaling and data management, with a keen focus on security and robustness
  • Strong problem-solving skills and the ability to troubleshoot complex data issues
  • Excellent communication and collaboration skills to work effectively with cross-functional teams
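
As a rough illustration of the Spark and Delta Lake integration named above, a minimal PySpark ingest sketch might look like the following. The source path and table names are hypothetical placeholders, and in a Databricks notebook the `spark` session is already provided.

```python
# Minimal sketch of a Spark + Delta Lake write inside Databricks.
# All paths and table names below are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F

# In a Databricks notebook `spark` already exists; the builder is only
# needed when running outside that environment (Delta support must then
# be configured separately).
spark = SparkSession.builder.appName("delta-ingest-sketch").getOrCreate()

# Read raw JSON events from an assumed landing zone.
raw = spark.read.format("json").load("/mnt/raw/events/")

# Light transformation: derive a partition date and drop duplicate events.
cleaned = (
    raw.withColumn("event_date", F.to_date("event_ts"))
       .dropDuplicates(["event_id"])
)

# Append into a Delta table, partitioned for downstream query pruning.
(
    cleaned.write.format("delta")
    .mode("append")
    .partitionBy("event_date")
    .saveAsTable("analytics.events_bronze")
)
```

Partitioning by a derived date column is one common choice here; the right partition key depends on the downstream query patterns.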

Responsibilities

  • Develop and maintain ETL processes to extract, transform, and load data from various sources into Databricks
  • Design and implement data pipelines and workflows using Databricks, ensuring scalability, reliability, and performance
  • Collaborate with data scientists, analysts, and other stakeholders to understand data requirements and provide appropriate data solutions
  • Develop and maintain Python notebooks within Databricks for data analysis and processing, optimizing data workflows for efficiency and accuracy
  • Optimize and tune data processing jobs for performance and cost-efficiency
  • Ensure data quality and consistency through robust data validation and cleansing techniques (see the sketch after this list)
  • Monitor and troubleshoot data pipeline issues, ensuring timely resolution and minimal downtime
  • Leverage Terraform for infrastructure as code (IaC) practices to automate and manage infrastructure provisioning and scaling
  • Stay updated with the latest trends and advancements in data engineering and Databricks technologies
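
As a rough illustration of the validation and cleansing responsibility above, a minimal PySpark sketch might split records on simple rules and quarantine the failures. The table names and rules are hypothetical placeholders.

```python
# Minimal sketch of rule-based validation and cleansing in PySpark.
# Table names are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("dq-sketch").getOrCreate()

df = spark.table("analytics.events_bronze")

# Simple validity rule: required identifiers and timestamps must be present.
is_valid = F.col("event_id").isNotNull() & F.col("event_ts").isNotNull()

valid = df.filter(is_valid).dropDuplicates(["event_id"])
rejected = df.filter(~is_valid)

# Quarantine rejected rows for inspection instead of dropping them silently.
rejected.write.format("delta").mode("append").saveAsTable("analytics.events_rejected")

# Publish the cleansed set for analysts.
valid.write.format("delta").mode("overwrite").saveAsTable("analytics.events_silver")
```

Routing rejects to a quarantine table, rather than silently dropping them, keeps the pipeline auditable when data quality issues surface.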

Benefits

  • You'll be part of a team dedicated to meaningful impact, working on solutions that address mission-critical needs
  • Experience variety, fulfillment, and the opportunity to work with some of the best in the industry
  • LightFeather is committed to fostering a diverse and inclusive environment for all employees, regardless of race, color, religion, sex, sexual orientation, gender identity, national origin, veteran, or disability status

This job is filled or no longer available.