
Machine Learning Engineer – Speech Processing

CloudWalk
Summary
Join the R&D team at CloudWalk, a fintech company reimagining financial services, as an ML Engineer specializing in automatic speech recognition (ASR) and audio processing. You will prototype voice-to-action features that create intuitive customer experiences. You will design and implement the speech processing core, collaborate with mobile developers on integration, and work with a performance-focused ML engineer to build efficient pipelines. This role blends deep speech and ML knowledge with a builder's mindset, turning research into user-facing prototypes. The ideal candidate has hands-on expertise in ASR and real-time audio processing, a strong machine learning foundation, and Python proficiency. CloudWalk fosters a collaborative environment where curiosity and adaptability are highly valued.
Requirements
- Speech & Audio Processing: Deep, hands-on experience with ASR and real-time audio processing. You know how to build streaming pipelines and handle noise and interruptions
- Machine Learning Expertise: Solid foundation in machine learning and deep learning - particularly for sequential data, with working knowledge of architectures like RNNs, CNNs, or Transformers in audio contexts
- Python & ML Engineering: Strong proficiency in Python, with the ability to build, debug, and tune ML pipelines using frameworks like PyTorch or TensorFlow
- MLOps Tooling: Experience with tools like MLflow, Weights & Biases, or other systems for model tracking, versioning, and lifecycle management
- Experimental Mindset: Ability to design and run meaningful experiments, interpret results carefully, and iterate based on what the data says - not assumptions
- Curiosity & Adaptability: A deep drive to explore new technologies, ask better questions, and stay flexible in the face of evolving requirements
Responsibilities
- Design and implement the speech processing core of our prototypes - from ASR model selection to inference pipelines and tuning
- Collaborate with a mobile developer to integrate speech input into working prototypes
- Work closely with a performance-focused ML engineer to ensure low-latency, robust streaming pipelines under real-world constraints like noise and spotty connectivity
Preferred Qualifications
- Conversational AI: Experience designing or experimenting with dialogue systems, intent workflows, or agent-like architectures
- Frontend Development: Understanding of how user interfaces are built (especially in Flutter) and interest in connecting backend intelligence to user-facing experiences
- Cloud Deployment Experience: Familiarity with deploying and managing ML models on Google Cloud
- Systems Efficiency: Awareness of strategies for performance tuning, edge deployment, or handling constrained environments (connectivity, device limits, etc.)


