Overview
Speech and audio are among the most natural, information-dense, and temporally grounded forms of human communication, yet they remain under-explored in mainstream large language model (LLM) research. While recent multimodal progress has expanded LLMs beyond text, speech and audio introduce fundamentally different challenges, including continuous temporal structure, multi-scale semantics, paralinguistic cues, interaction dynamics, and safety risks unique to audio, such as impersonation and audio deepfakes.
The workshop centers on architectural design, training paradigms, data creation, alignment, and evaluation methodologies for large audio-language models, spanning speech, non-speech sounds, music, and mixed audio environments. SALMA aims to bring the NLP and speech/audio communities together around shared architectures, objectives, benchmarks, and evaluation practices.
Important Dates
All deadlines are 11:59 PM UTC-12:00 (“Anywhere on Earth”).
| Event | Date |
|---|---|
| Paper Submission Deadline | July 27, 2026 |
| ARR Commitment Deadline | August 26, 2026 |
| Acceptance Notifications | September 9, 2026 |
| Camera-ready Deadline | September 23, 2026 |
| Workshop Date | During EMNLP 2026, October 24–29 |
News
- April 2026 – Website launched. Stay tuned for updates!
- The Call for Papers is now open! See the Call for Papers page for details.