Software Engineer II

Scribd
Summary
Join Scribd's ML Data Engineering team as a Software Engineer II and help build and optimize scalable systems for metadata extraction and processing. You will design and develop data pipelines, collaborate with cross-functional teams, and ensure data quality and integrity. This role requires 4+ years of backend software engineering experience, proficiency in a programming language such as Python or Scala, and experience with a cloud provider such as AWS. Scribd offers competitive compensation, including comprehensive benefits, equity ownership, and flexible work arrangements through Scribd Flex. Occasional in-person attendance is required. The company prioritizes a culture of GRIT (Goals, Results, Innovation, Team).
Requirements
- 4+ years of experience in backend software engineering, with hands-on work in developing data pipelines and building and deploying your own infrastructure
- Proficient in one or more programming languages, such as Python, Scala, or Ruby
- Experience working with a public cloud provider (AWS, Azure, or Google Cloud)
- Hands-on experience building, deploying, and optimizing solutions using ECS, EKS, or AWS Lambda
- Experience working with systems at scale
- Proven ability to test and optimize systems for performance and scalability
- Bachelor's in CS or equivalent professional experience
Responsibilities
- Design and develop data pipelines to extract, enrich, and process metadata from millions of documents, images, and other content types
- Collaborate with cross-functional teams, including ML engineers and product managers, to deliver scalable, efficient, and reliable metadata solutions
- Build and maintain systems that operate at a massive scale, handling hundreds of millions of documents and billions of images
- Optimize and refactor existing systems for performance, scalability, and reliability
- Ensure data accuracy, integrity, and quality through automated validation and monitoring
- Participate in code reviews, ensuring best practices are followed and maintaining high-quality standards in the codebase
- Manage and maintain data pipelines, security, and infrastructure
Preferred Qualifications
- Bonus points if you have hands-on experience with large-scale data processing frameworks such as Apache Spark, Databricks, or similar tools
- Bonus points if you have experience working with Machine Learning systems
Benefits
- Healthcare Insurance Coverage (Medical/Dental/Vision): 100% paid for employees
- 12 weeks paid parental leave
- Short-term/long-term disability plans
- 401k/RSP matching
- Onboarding stipend for home office peripherals + accessories
- Tuition Reimbursement
- Learning & Development programs
- Quarterly stipend for Wellness, Connectivity & Comfort
- Mental Health support & resources
- Free subscription to Scribd + gift memberships for friends & family
- Referral Bonuses
- Book Benefit
- Sabbaticals
- Company-wide events
- Team engagement budgets
- Vacation & Personal Days
- Paid Holidays (+ winter break)
- Flexible Sick Time
- Volunteer Day
- Company-wide Employee Resource Groups and programs that foster an inclusive and diverse workplace