Sound

Authors and titles for recent submissions

[ total of 47 entries: 1-25 | 26-47 ]

Thu, 25 Nov 2021

[1]  arXiv:2111.12588 [pdf, other]
Title: Towards Cross-Cultural Analysis using Music Information Dynamics
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[2]  arXiv:2111.12531 [pdf, ps, other]
Title: Non-Intrusive Binaural Speech Intelligibility Prediction from Discrete Latent Representations
Comments: 4 pages + 1 refs; 1 figure; submitted to IEEE SPL (pending review)
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[3]  arXiv:2111.12331 [pdf, other]
Title: An MAP Estimation for Between-Class Variance
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[4]  arXiv:2111.12326 [pdf, other]
Title: A Study on Decoupled Probabilistic Linear Discriminant Analysis
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[5]  arXiv:2111.12324 [pdf, other]
Title: How Speech is Recognized to Be Emotional - A Study Based on Information Decomposition
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[6]  arXiv:2111.12124 [pdf, ps, other]
Title: Towards Learning Universal Audio Representations
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[7]  arXiv:2111.12566 (cross-list from q-bio.QM) [pdf, other]
Title: Acoustical Analysis of Speech Under Physical Stress in Relation to Physical Activities and Physical Literacy
Comments: Submitted to Speech Prosody 2022
Subjects: Quantitative Methods (q-bio.QM); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[8]  arXiv:2111.12516 (cross-list from eess.AS) [pdf, other]
Title: LightSAFT: Lightweight Latent Source Aware Frequency Transform for Source Separation
Comments: MDX Workshop @ ISMIR 2021, 7 pages, 1 figure
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[9]  arXiv:2111.12277 (cross-list from eess.AS) [pdf, other]
Title: One-shot Voice Conversion For Style Transfer Based On Speaker Adaptation
Comments: Submitted to ICASSP 2022
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[10]  arXiv:2111.12203 (cross-list from eess.AS) [pdf, other]
Title: KUIELab-MDX-Net: A Two-Stream Neural Network for Music Demixing
Comments: MDX Workshop @ ISMIR 2021, 7 pages, 3 figures
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)

Wed, 24 Nov 2021

[11]  arXiv:2111.11859 [pdf]
Title: Longitudinal Speech Biomarkers for Automated Alzheimer's Detection
Journal-ref: Frontiers in Computer Science, 08 April 2021
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS); Quantitative Methods (q-bio.QM)
[12]  arXiv:2111.11773 [pdf, other]
Title: Upsampling layers for music source separation
Comments: Demo page: this http URL
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
[13]  arXiv:2111.11755 [pdf, other]
Title: Guided-TTS: Text-to-Speech with Untranscribed Speech
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
[14]  arXiv:2111.11737 [pdf]
Title: ADTOF: A large dataset of non-synthetic music for automatic drum transcription
Comments: Proceedings of the 22nd International Society for Music Information Retrieval Conference, ISMIR, Online, pp. 818-824
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[15]  arXiv:2111.11636 [pdf]
Title: Music Classification: Beyond Supervised Learning, Towards Real-world Applications
Comments: This is a web book written for a tutorial session of the 22nd International Society for Music Information Retrieval Conference, Nov 8-12, 2021. Please visit this https URL for the original, web book format
Subjects: Sound (cs.SD); Information Retrieval (cs.IR); Audio and Speech Processing (eess.AS)
[16]  arXiv:2111.12028 (cross-list from cs.CL) [pdf]
Title: Romanian Speech Recognition Experiments from the ROBIN Project
Comments: 12 pages, 3 figures, ConsILR2020
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[17]  arXiv:2111.11882 (cross-list from eess.AS) [pdf, other]
Title: Dataset of Spatial Room Impulse Responses in a Variable Acoustics Room for Six Degrees-of-Freedom Rendering and Analysis
Comments: 3 pages, 3 figures, 2 tables
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Signal Processing (eess.SP)
[18]  arXiv:2111.11831 (cross-list from eess.AS) [pdf, other]
Title: SpeechMoE2: Mixture-of-Experts Model with Improved Routing
Comments: 5 pages, 1 figure. Submitted to ICASSP 2022
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[19]  arXiv:2111.11703 (cross-list from cs.LG) [pdf, other]
Title: A Contextual Latent Space Model: Subsequence Modulation in Melodic Sequence
Authors: Taketo Akama
Comments: 22nd International Society for Music Information Retrieval Conference (ISMIR), 2021; 8 pages
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[20]  arXiv:2111.11606 (cross-list from eess.AS) [pdf, other]
Title: Effect of noise suppression losses on speech distortion and ASR performance
Comments: submitted to ICASSP 2022
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)

Tue, 23 Nov 2021 (showing first 5 of 10 entries)

[21]  arXiv:2111.11063 [pdf, other]
Title: Comparing the Accuracy of Deep Neural Networks (DNN) and Convolutional Neural Network (CNN) in Music Genre Recognition (MGR): Experiments on Kurdish Music
Comments: 8 pages, 5 figures, 3 tables
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[22]  arXiv:2111.11023 [pdf, ps, other]
Title: Multi-Channel Multi-Speaker ASR Using 3D Spatial Feature
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[23]  arXiv:2111.10897 [pdf, other]
Title: Health Monitoring of Industrial machines using Scene-Aware Threshold Selection
Comments: 5 pages, 4 figures, 1 Table
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Signal Processing (eess.SP)
[24]  arXiv:2111.10783 [pdf]
Title: Automatic Detection of Depression from Stratified Samples of Audio Data
Comments: 30 pages, 6 figures
Subjects: Sound (cs.SD); Human-Computer Interaction (cs.HC); Audio and Speech Processing (eess.AS)
[25]  arXiv:2111.10639 [pdf, other]
Title: Implicit Acoustic Echo Cancellation for Keyword Spotting and Device-Directed Speech Detection
Comments: Submitted to ICASSP 2022
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)