Lead Software Engineer - AI Data Systems

Upwork
Summary
Join Upwork's growing AI team as a Lead Software Engineer (AI Data Systems) and build the critical infrastructure powering the future of intelligent, agent-driven systems. You will be responsible for collecting high-quality training data, building scalable featurization pipelines, and delivering performant systems to support model training and inference at scale. This role requires a T-shaped engineer with broad skills and deep expertise in at least one technical area. Experience in startups or AI/ML research environments is essential. As a foundational hire, you will work in Python and help shape both the technology and the team. If you are passionate about AI infrastructure and thrive in early-stage environments, this is the opportunity to help set the standard for how agentic systems are built and deployed at scale.
Requirements
- Strong software engineering background with deep experience in building data collection, transformation, and featurization pipelines at scale
- Proficiency in Python, including async programming and concurrency tools, as well as data-centric frameworks such as Pandas, Spark, or Apache Beam
- Familiarity with ML model development workflows and infrastructure, including dataset versioning, experiment tracking, and model evaluation
- Experience deploying and scaling AI systems in cloud environments such as AWS, GCP, or Azure
- Proven success operating in highly ambiguous environments such as research labs, startups, or fast-paced product teams
- A track record of working alongside high-caliber peers in top engineering teams, research groups, or startup ecosystems
- Growth mindset, strong communication skills, and a commitment to inclusive collaboration and continuous learning
Responsibilities
- Design and implement systems to collect and curate high-quality training datasets for supervised, unsupervised, and reinforcement learning use cases
- Build scalable featurization and preprocessing pipelines to transform raw data into structured inputs for AI/ML model development
- Partner with ML engineers and researchers to define data requirements and production workflows that support LLM-based agents and autonomous AI systems
- Lead the development of infrastructure that enables experimentation, evaluation, and deployment of machine learning models in production environments
- Support orchestration and real-time inference pipelines using Python and modern cloud-native tools, ensuring low latency and high availability
- Mentor engineers and foster a high-performance, collaborative engineering culture grounded in technical excellence and curiosity
- Drive cross-functional alignment with product, infrastructure, and research stakeholders, ensuring clarity on progress, goals, and architecture
Benefits
- Comprehensive medical coverage for you and your family
- Unlimited PTO
- A 401(k) plan with matching
- 12 weeks of paid parental leave
- An Employee Stock Purchase Plan