Speech and Computer : 26th International Conference, SPECOM 2024, Belgrade, Serbia, November 25–28, 2024, Proceedings, Part I / edited by Alexey Karpov, Vlado Delić.

Springer Nature - Springer Computer Science (R0) eBooks 2025 English International Available online

Format:
Book
Author/Creator:
Karpov, Alexey.
Contributor:
Delić, Vlado.
Series:
Lecture Notes in Artificial Intelligence, 2945-9141 ; 15299
Language:
English
Subjects (All):
Artificial intelligence.
Image processing--Digital techniques.
Image processing.
Computer vision.
Computer engineering.
Computer networks.
Application software.
Artificial Intelligence.
Computer Imaging, Vision, Pattern Recognition and Graphics.
Computer Engineering and Networks.
Computer and Information Systems Applications.
Local Subjects:
Artificial Intelligence.
Computer Imaging, Vision, Pattern Recognition and Graphics.
Computer Engineering and Networks.
Computer and Information Systems Applications.
Physical Description:
1 online resource (0 pages)
Edition:
1st ed. 2025.
Place of Publication:
Cham : Springer Nature Switzerland : Imprint: Springer, 2025.
Summary:
The two-volume set LNAI 15299 and 15300 constitutes the refereed proceedings of the 26th International Conference on Speech and Computer, SPECOM 2024, held in Belgrade, Serbia, during November 25–28, 2024. The 53 full papers included in these proceedings were carefully reviewed and selected from 90 submissions. The book also contains two invited talks in full paper length. The papers are organized in the following topical sections: Volume I: Invited papers; automatic speech recognition; speech and language resources; speech synthesis and perception; and speech processing for medicine. Volume II: Computational paralinguistics; affective computing; speaker recognition; digital speech processing; natural language processing.
Contents:
Invited Papers
Preserving Language Heritage Through Speech Technology: The Case of Upper Sorbian
Retrospective and Perspectives of TTS & STT Technology Development and Implementation for South Slavic Under-Resourced Languages
Automatic Speech Recognition
Comparison of Well- and Lower-Resourced Self-Training in ASR
Towards a Livvi-Karelian End-to-End ASR System
Advances in OpenASR21 Evaluation with Increased Temporal Resolution for Speech Self-Supervised Learning Models
Benchmarking Whisper under Diverse Audio Transformations and Real-time Constraints
AutoMode-ASR: Learning to Select ASR Systems for Better Quality and Cost
Pre-Training and Adverse Audio Samples for Data-Efficient Wake Word Detection
Cross-Lingual Summarization of Speech-to-Speech Translation: A Baseline
Speech and Language Resources
The ParlaSpeech Collection of Automatically Generated Speech and Text Corpora from Parliamentary Proceedings
ESC Corpus of Spoken Russian: Everyday Student Conversations Captured through Continuous Speech Recording in Natural Communicative Environments
OpenAV: Bilingual Dataset for Audio-Visual Voice Control of a Computer for Hand Disabled People
Bulgarian Speech Resources in the CHILDES System
Multiword Units in Russian Everyday Speech: Empirical Classification and Corpus-Based Studies
Neurophysiological Correlates of Textual Modulation in Visual Stimuli: An Experimental Study of Russian and English Memes
Speech Synthesis and Perception
End-to-End Speech Synthesis for the Serbian Language Based on Tacotron
ChildTinyTalks (CTT): A Benchmark Dataset and Baseline for Expressive Child Speech Synthesis
Multidimensional Rhythm: Comparing Rhythmic Properties of Australian and New Zealand Monologues
Influence of Linguistic and Sociolinguistic Factors on Speech Rate Perception
Human and Machine Keyphrase Perception in Russian Text and Speech
Assessment of Children’s Ability to Manifest Emotions in Facial Expressions, Voice and Speech by Humans, Automatic, and on a Likert Scale
Speech Processing for Medicine
Investigating the Utility of wav2vec 2.0 Hidden Layers for Detecting Multiple Sclerosis
Cross-Cultural Automatic Depression Detection based on Audio Signals
Depression Classification using Token Merging-based Speech Spectrotemporal Transformer
Detecting Depression from Audio Data
Binary and Multiclass Classification of Dysphonia Using Whisper Encoder and One-Dimensional Convolutional Neural Network
Approach to Assessing the Quality of Syllable Pronunciation by Patients in the Process of Speech Rehabilitation Based on Comparison with Healthy Speakers
A Comparative Study for Contextualized Spoken Answer Classification in German Medical Questionnaires.
Notes:
Description based on publisher supplied metadata and other sources.
ISBN:
9783031779619
3031779614
OCLC:
1474242802
