Principal Data Engineer

SpyCloud

📍Remote - United States

Summary

Join SpyCloud as a Principal Data Engineer and lead the architectural transformation of our data systems, building a scalable, AWS-native Lakehouse architecture. You will build and optimize data pipelines, design systems for machine learning and large language models, and ensure data quality, security, and reliability. This hands-on role requires strong leadership, collaboration, and technical expertise in data engineering and AWS technologies. You will mentor data engineers and champion best practices. The position offers a competitive salary and benefits package, including flexible and remote-friendly work options.

Requirements

  • Bachelor’s or Master’s degree in Computer Science, Engineering, or a related STEM field
  • 12+ years of experience in data engineering or closely related disciplines
  • 2+ years in a technical leadership or principal engineering role
  • Experience building and operating Lakehouse or large-scale distributed data lake architectures in AWS
  • Knowledge of distributed systems as they pertain to data storage and computing
  • Hands-on experience preparing and managing data for machine learning and LLM workflows
  • Strong development skills in Python and Go, especially in building data-intensive systems
  • Practical experience using DynamoDB and other AWS-native data services at scale
  • Effective communication and collaboration across technical and non-technical stakeholders
  • Proven ability to mentor and grow engineering talent and lead large-scale initiatives

Responsibilities

  • Redesign and modernize data architecture for scalability, performance, and AI-readiness
  • Build and optimize real-time and batch data pipelines using AWS tools (e.g., Lambda, S3, DynamoDB)
  • Design systems to prepare and serve high-quality data for ML models and LLM applications
  • Optimize cost and performance of daily data ingestion, transformation, and storage workflows
  • Work with Product and Engineering teams to ensure data solutions meet product requirements
  • Evaluate and implement modern tooling for data processing, ML data workflows, and observability
  • Guide and mentor data engineers, championing best practices and technical excellence
  • Ensure operational data quality, security, and reliability across systems
  • Lead long-term data architecture planning aligned with company and AI-driven goals

Preferred Qualifications

  • Familiarity with AI/ML tools such as SageMaker, LangChain, or vector databases
  • Understanding of data governance, privacy, and compliance in AI contexts
  • Comfort working in fast-paced environments with evolving product and technical requirements

Benefits

  • 401(k) with Employer Contribution
  • Health, Vision, and Dental Insurance
  • Health Savings Account (HSA) available with Employer Contribution
  • Employer Paid Life, Short-term, and Long-term Disability Insurance
  • Generous PTO Plan and 16 paid holidays per year
  • Retirement Savings Plan with Employer Contribution
  • Employer Provided Private Health Insurance and Healthcare Cashplan
  • Employer Paid Life Insurance and Income Replacement
  • Generous Holiday Plan and 14 paid holidays per year
