Publications

UmbraTTS: Adapting Text-to-Speech to Environmental Contexts with Flow Matching.
Neta Glazer, Aviv Navon, Yael Segal-Feldman, Aviv Shamsian, Hilit Segev, Asaf Buchnick, Menachem Pirchi, Gill Hetz, Joseph Keshet.
Workshop on Machine Learning for Audio, ICML 2025.
[paper] [demo]


FlowTSE: Target Speaker Extraction with Flow Matching.
Aviv Navon, Aviv Shamsian, Yael Segal-Feldman, Neta Glazer, Gill Hetz, Joseph Keshet.
Interspeech 2025.
[paper] [demo]


Whisper in Medusa’s Ear: Multi-head Efficient Decoding for Transformer-based ASR.
Yael Segal-Feldman, Aviv Shamsian, Aviv Navon, Gill Hetz, Joseph Keshet.
ICASSP 2025.
[paper] [code] [blog] [demo]


WhisperNER: Unified Open Named Entity and Speech Recognition.
Gil Ayache, Menachem Pirchi, Aviv Navon, Aviv Shamsian, Gill Hetz, Joseph Keshet.
[paper] [code] [demo] [models]


Keyword-guided adaptation of automatic speech recognition.
Aviv Shamsian, Aviv Navon, Neta Glazer, Gill Hetz, Joseph Keshet.
Interspeech 2024.
[paper]


Open-vocabulary keyword-spotting with adaptive instance normalizations.
Aviv Navon, Aviv Shamsian, Neta Glazer, Gill Hetz, Joseph Keshet.
ICASSP 2024.
[paper]


Combining Language Models For Specialized Domains: A Colorful Approach.
Daniel Eitan, Menachem Pirchi, Neta Glazer, Shai Meital, Gil Ayache, Gidon Krendel, Aviv Shamsian, Aviv Navon, Gill Hetz, Joseph Keshet.
[paper]


More publications will be added soon. Stay tuned.