
Speech and Computer : 24th International Conference, SPECOM 2022, Gurugram, India, November 14–16, 2022, Proceedings / edited by S. R. Mahadeva Prasanna, Alexey Karpov, K. Samudravijaya, Shyam S. Agrawal.

SpringerLink Books Lecture Notes In Computer Science (LNCS) (1997-2024) Available online

Format:
Book
Contributor:
Prasanna, S. R. Mahadeva, editor.
Series:
Lecture Notes in Artificial Intelligence, 2945-9141 ; 13721
Language:
English
Subjects (All):
Artificial intelligence.
Computer engineering.
Computer networks.
Application software.
Image processing--Digital techniques.
Image processing.
Computer vision.
Artificial Intelligence.
Computer Engineering and Networks.
Computer and Information Systems Applications.
Computer Imaging, Vision, Pattern Recognition and Graphics.
Local Subjects:
Artificial Intelligence.
Computer Engineering and Networks.
Computer and Information Systems Applications.
Computer Imaging, Vision, Pattern Recognition and Graphics.
Physical Description:
1 online resource (737 pages)
Edition:
1st ed. 2022.
Place of Publication:
Cham : Springer International Publishing : Imprint: Springer, 2022.
Summary:
This book constitutes the proceedings of the 24th International Conference on Speech and Computer, SPECOM 2022, held as a hybrid event in Gurugram, India, in November 2022. The 51 full and 9 short papers presented in this volume were carefully reviewed and selected from 99 submissions. The papers present current research in the area of computer speech processing including audio signal processing, automatic speech recognition, speaker recognition, computational paralinguistics, speech synthesis, sign language and multimodal processing, and speech and language resources.
Contents:
Thematic Diversity of Everyday Russian Discourse: a Case Study Based on the ORD corpus
Neural Embedding Extractors for Text-Independent Speaker Verification
Deep Speaker Embeddings based Online Diarization
Overlapped Speech Detection Using AM-FM based Time-Frequency Representations
Significance of Dimensionality Reduction in CNN-based Vowel Classification from Imagined Speech using Electroencephalogram Signals
Study of Speech Recognition System Based on Transformer and Connectionist Temporal Classification Models for Low Resource Language
An Initial Study on Birdsong Re-synthesis using Neural Vocoders
Speech Music Overlap Detection using Spectral Peak Evolutions
Influence of Accented Speech in Automatic Speech Recognition: A Case Study on Assamese L1 Speakers Speaking Code Switched Hindi-English
ClusterVote: Automatic Summarization Dataset Construction with Document Clusters
Comparing Unsupervised Detection Algorithms for Audio Adversarial Examples
Celtic English Continuum in Pitch Patterns of Spontaneous Talk: Evidence of Long-Term Contacts
Coherence Based Automatic Essay Scoring Using Sentence Embedding and Recurrent Neural Networks
Analysis of Automatic Evaluation Metric on Low-Resourced Language: BERTScore Vs BLEU Score
DyCoDa: A Multi-Modal Data Collection of Multi-User Remote Survival Game Recordings
On the Use of Ensemble X-Vector Embeddings for Improved Sleepiness Detection
Multiresolution Decomposition Analysis via Wavelet Transforms for Audio Deepfake Detection
Automatic Rhythm and Speech Rate Analysis of Mising Spontaneous Speech
An Electroglottographic Method for Assessing the Emotional State of the Speaker
Significance of Distance on Pop Noise for Voice Liveness Detection
CRIM’s Speech Recognition System for OpenASR21 Evaluation with Conformer and Voice Activity Detector Embeddings
Joint Changes in First and Second Formants of /a/, /i/, /u/ Vowels in Babble Noise - a New Statistical Approach
Comparing NLP Solutions for the Disambiguation of French Heterophonic Homographs for End-to-End TTS Systems
Detection of Speech Related Disorders by Pre-Trained Embedding Models Extracted Biomarkers
Multi-Label Dysfluency Classification
Harnessing Uncertainty - Multi-Label Dysfluency Classification with Uncertain Labels
Continuous Wavelet Transform for Severity-Level Classification of Dysarthria
Significance of Energy Features for Severity Classification of Dysarthria
An Analytic Study on Clustering-based Pseudo-Labels for Self-Supervised Deep Speaker Verification
Investigation of Transfer Learning for End-to-End Russian Speech Recognition
Prosodic Features of Verbal Irony in Russian and French: Universal vs. Language-Specific
Categorization of Threatening Speech Acts
Assessment of Speech Quality During Speech Rehabilitation Based on the Solution of the Classification Problem
Multi-level Fusion of Fisher Vector Encoded BERT and wav2vec 2.0 Embeddings for Native Language Identification
Fake Speech Detection using OpenSMILE Features
Nonverbal Constituents of Argumentative Discourse: Gesture and Prosody Interaction
Classifying Mahout and Social Interactions of Asian Elephants based on Trumpet Calls
Recognition of the Emotional State of Children with Down Syndrome by Video, Audio and Text Modalities: Human and Automatic
Fake Speech Detection using Modulation Spectrogram
Self-Configuring Genetic Programming Feature Generation in Affect Recognition Tasks
A Multi-Modal Approach to Mining Intent from Code-Mixed Hindi-English Calls in the Hyperlocal-Delivery Domain
Importance of Supra-Segmental Information and Self-Supervised Framework for Spoken Language Diarization Task
Low-resource Emotional Speech Synthesis: Transfer Learning, Data requirements and Adversarial Training
Fuzzy Classifier For Speech Assessment in Speech Rehabilitation
Analysis-by-Synthesis Modeling of Bengali Intonation
Neural Network Based Curve Fitting to Enhance the Intelligibility of Dysarthric Speech
Retrieval-based Dialogue Agents
Forensic Identification of Foreign-Language Speakers by the Method of Structural-Melodic Analysis of Phonograms
Logistics Translator. Concept Vision on Future Interlanguage Computer Assisted Translation
Analysis of Time-Averaged Feature Extraction Techniques on Infant Cry Classification
Should We Believe Our Eyes or Our Ears? Processing Incongruent Audiovisual Stimuli by Russian Listeners
Emotional Speech Recognition Based on Lip-Reading
Exploring The Use of Machine Learning for Resume Recommendations
The Role of Pause in Interaction: A Case of Polylogue
Dictionary with the Evaluation of Positivity/Negativity Degree of the Russian Words
Effects of Depth of Field on Focus using a Virtual Reality Escape Room
Dynamics of Frequency Characteristics of Visually Evoked Potentials of Electroencephalography During the Work with Brain-Computer Interfaces
Device Robust Acoustic Scene Classification using Adaptive Noise Reduction and Convolutional Recurrent Attention Neural Network
Comparison of Word Embeddings of Unaligned Audio and Text Data Using Persistent Homology
Low-Cost Training of Speech Recognition System for Hindi ASR Challenge 2022.
Notes:
Includes bibliographical references and index.
Other Format:
Print version: Prasanna, S. R. Mahadeva Speech and Computer
ISBN:
9783031209802
303120980X

