ISCA/ITG Workshop on Diversity in Large Speech and Language Models

Date

Start:	20.02.2025
End:	20.02.2025

Event language

Venue

Berlin

Description

Machine learning techniques have conquered many different tasks in speech and natural language processing, such as speech recognition, information extraction, text and speech generation, and human-machine interaction using natural language or speech (chatbots). Modern techniques typically rely on large models for representing general knowledge of one or several languages (Large Language Models, LLMs), or for representing speech and general audio characteristics. These models have been trained on large amounts of speech and language data, typically including web content. When humans interact with such technologies, the effectiveness of the interaction depends on the extent to which they use the same type of language the models were trained on or, in other words, on whether the models are able to generalize to the language humans use when interacting with the technology. This may lead to gradual adaptation in human speech and language production, and users who do not adapt may be excluded from efficient use of such technologies. In addition, because commercial model development follows market needs, under-represented languages, dialects, and sociolects may receive lower priority. Furthermore, for many less widely spoken languages the necessary data is not available, which will widen the digital divide in the use of speech and language technology.

The workshop sets out to discuss this problem based on scientific contributions from the perspective of computer science and linguistics (including computational linguistics and NLP).
Topics we aim to address include, but are not limited to:

User diversity: Which aspects of human speech and language production affect the performance of large foundation models? In which way, and for which tasks?
Language use: How well do large language models cope with different languages, dialects, and sociolects? How do they deal with code-switching?
Human adaptation: How does the use of large language models affect language comprehension, as well as speech and language production? Which alignment effects occur, and in which time spans?
Model adaptation: How do models need to be designed to better cope with speech and language diversity? How do training and fine-tuning affect model performance?
Inclusion: What data and technologies are necessary to better cope with diversity in large speech and language models?

We invite both experimental and position papers addressing these and other related research questions.

The workshop will consist of a number of oral presentations and discussion panels. Accepted speakers are invited to submit a short or long paper, which will be published online after the workshop.

Organizer

ITG Informationstechnische Gesells. im VDE

Co-organizers

Technische Universität Berlin, Humboldt-Universität zu Berlin
German Research Center for Artificial Intelligence (DFKI) Berlin

Remarks

Important dates:

Abstract submission: 13 December 2024
Notification of acceptance: 10 January 2025
Short or long paper submission: 14 February 2025
