Machine Learning Engineer – Speech Processing

CloudWalk

📍Remote

Summary

Join CloudWalk R&D as an ML Engineer specializing in automatic speech recognition (ASR) and audio processing. CloudWalk is a fintech company reimagining financial services. You will prototype voice-to-action features that create intuitive customer experiences: designing and implementing the speech processing core, collaborating with mobile developers on integration, and working with a performance-focused ML engineer to keep pipelines efficient. The role blends deep speech and ML knowledge with a builder's mindset, turning research into user-facing prototypes. The ideal candidate has hands-on expertise in ASR and real-time audio processing, a strong machine learning foundation, and proficiency in Python. CloudWalk fosters a collaborative environment where curiosity and adaptability are highly valued.

Requirements

  • Speech & Audio Processing: Deep, hands-on experience with ASR and real-time audio processing. You know how to build streaming pipelines and handle noise and interruptions
  • Machine Learning Expertise: Solid foundation in machine learning and deep learning, particularly for sequential data, with working knowledge of architectures like RNNs, CNNs, or Transformers in audio contexts
  • Python & ML Engineering: Strong proficiency in Python, with the ability to build, debug, and tune ML pipelines using frameworks like PyTorch or TensorFlow
  • MLOps Tooling: Experience with tools like MLflow, Weights & Biases, or other systems for model tracking, versioning, and lifecycle management
  • Experimental Mindset: Ability to design and run meaningful experiments, interpret results carefully, and iterate based on what the data says rather than on assumptions
  • Curiosity & Adaptability: A deep drive to explore new technologies, ask better questions, and stay flexible in the face of evolving requirements

Responsibilities

  • Design and implement the speech processing core of our prototypes - from ASR model selection to inference pipelines and tuning
  • Collaborate with a mobile developer to integrate speech input into working prototypes
  • Work closely with a performance-focused ML engineer to ensure low-latency, robust streaming pipelines under real-world constraints like noise and spotty connectivity

Preferred Qualifications

  • Conversational AI: Experience designing or experimenting with dialogue systems, intent workflows, or agent-like architectures
  • Frontend Development: Understanding of how user interfaces are built (especially in Flutter) and interest in connecting backend intelligence to user-facing experiences
  • Cloud Deployment Experience: Familiarity with deploying and managing ML models on Google Cloud
  • Systems Efficiency: Awareness of strategies for performance tuning, edge deployment, or handling constrained environments (connectivity, device limits, etc.)
