ML Engineer

Vosyn
Summary
Join Vosyn, a pre-seed AI startup, as a Senior ML Engineer and Subject Matter Expert (SME) to provide strategic guidance and technical leadership for our speech-to-speech (S2S) pipeline. This fully remote contract position (10-15 hours/week) pays $250 per hour. You will collaborate with the VosynCore team to solve complex challenges in text-to-speech (TTS) model development and optimization. Your expertise will be crucial in driving project progress and ensuring our S2S pipeline meets industry standards. You will mentor the team and contribute to the development of advanced deep learning models for audio processing. This is a unique opportunity to contribute to a fast-growing global organization and leave your mark on the future of AI.
Requirements
- 5+ years of proven experience in machine learning development focused on audio generation and TTS systems
- Extensive expertise in audio signal processing, particularly for human voices
- Deep experience with TTS models, including waveform generation and spectrogram-based methods
- Proven expertise in tuning TTS models for duration control and speech characteristics
- Strong proficiency in Python and machine learning frameworks such as PyTorch
- Experience with advanced deep learning models like WaveNet and transformer-based architectures
- Demonstrated experience in distributed training and deployment of ML models on cloud platforms
- Strong understanding of evaluation metrics for TTS systems
- Proven ability to provide technical leadership and actionable guidance
- Excellent communication and mentoring skills
Responsibilities
- Provide expert-level advice and mentorship on the architecture, training, and production of text-to-speech (TTS) models
- Guide the implementation of robust testing methodologies for TTS models using industry-standard evaluations such as Mean Opinion Score (MOS) testing
- Share expertise in distributed training, monitoring, and deployment of large-scale ML models on cloud platforms
- Lead latency optimization initiatives in real-time systems for high-quality speech-to-speech conversion
- Provide guidance on tuning TTS models for precise control over speech characteristics
- Share in-depth knowledge of various TTS model architectures and waveform generation methods
- Mentor the team on implementing advanced deep learning models for audio processing
- Guide the development of transformer architectures for complex TTS models
Preferred Qualifications
- Experience with real-time audio processing systems
- Background in speech synthesis research
- Experience with multilingual and multi-accent TTS
- Experience with ML model optimization techniques
- Publications or patents in related fields
Benefits
- Remote-first culture with flexible working arrangements