Sound

Authors and titles for recent submissions

[ total of 41 entries: 1-25 | 26-41 ]

Tue, 17 May 2022

[1]  arXiv:2205.07711 [pdf, other]
Title: Transferability of Adversarial Attacks on Synthetic Speech Detection
Comments: 5 pages, submitted to Interspeech 2022
Subjects: Sound (cs.SD); Cryptography and Security (cs.CR); Audio and Speech Processing (eess.AS)
[2]  arXiv:2205.07682 [pdf, ps, other]
Title: L3-Net Deep Audio Embeddings to Improve COVID-19 Detection from Smartphone Data
Comments: accepted for IEEE SMARTCOMP 2022
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[3]  arXiv:2205.07450 [pdf, other]
Title: PRISM: Pre-trained Indeterminate Speaker Representation Model for Speaker Diarization and Speaker Verification
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[4]  arXiv:2205.07319 [pdf]
Title: cMelGAN: An Efficient Conditional Generative Model Based on Mel Spectrograms
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[5]  arXiv:2205.07646 (cross-list from cs.CL) [pdf, other]
Title: A Fast Attention Network for Joint Intent Detection and Slot Filling on Edge Devices
Comments: 9 pages, 4 figures
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[6]  arXiv:2205.07390 (cross-list from eess.AS) [pdf, other]
Title: Learning Representations for New Sound Classes With Continual Self-Supervised Learning
Comments: Submitted to IEEE Signal Processing Letters
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Signal Processing (eess.SP)
[7]  arXiv:2205.07301 (cross-list from cs.GR) [pdf, other]
Title: Conditional Vector Graphics Generation for Music Cover Images
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[8]  arXiv:2205.07211 (cross-list from eess.AS) [pdf, other]
Title: GenerSpeech: Towards Style Transfer for Generalizable Out-Of-Domain Text-to-Speech Synthesis
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[9]  arXiv:2205.07180 (cross-list from eess.AS) [pdf, other]
Title: Learning Lip-Based Audio-Visual Speaker Embeddings with AV-HuBERT
Comments: Submitted to Interspeech
Subjects: Audio and Speech Processing (eess.AS); Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD)
[10]  arXiv:2205.07100 (cross-list from cs.CL) [pdf, other]
Title: Multiformer: A Head-Configurable Transformer-Based Model for Direct Speech Translation
Comments: NAACL-SRW 2022
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[11]  arXiv:2205.07086 (cross-list from eess.AS) [pdf, other]
Title: Collar-aware Training for Streaming Speaker Change Detection in Broadcast Speech
Comments: Accepted to Speaker Odyssey 2022
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[12]  arXiv:2205.06963 (cross-list from cs.CL) [pdf, other]
Title: Improved Consistency Training for Semi-Supervised Sequence-to-Sequence ASR via Speech Chain Reconstruction and Self-Transcribing
Comments: Submitted to INTERSPEECH 2022
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[13]  arXiv:2205.06931 (cross-list from eess.AS) [pdf, other]
Title: Task splitting for DNN-based acoustic echo and noise removal
Comments: submitted to IWAENC 2022
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)

Mon, 16 May 2022

[14]  arXiv:2205.06799 [pdf, other]
Title: The ACM Multimedia 2022 Computational Paralinguistics Challenge: Vocalisations, Stuttering, Activity, & Mosquitoes
Comments: 5 pages, part of the ACM Multimedia 2022 Grand Challenge "The ACM Multimedia 2022 Computational Paralinguistics Challenge (ComParE 2022)"
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[15]  arXiv:2205.06655 (cross-list from cs.CL) [pdf, other]
Title: Unified Modeling of Multi-Domain Multi-Device ASR Systems
Comments: Submitted to Interspeech 2022
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)

Fri, 13 May 2022

[16]  arXiv:2205.06066 [pdf, other]
Title: Data-aided Underwater Acoustic Ray Propagation Modeling
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[17]  arXiv:2205.06053 [pdf, other]
Title: Unified Source-Filter GAN with Harmonic-plus-Noise Source Excitation Generation
Comments: Submitted to INTERSPEECH 2022
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[18]  arXiv:2205.05871 [pdf, other]
Title: Towards Robust Unsupervised Disentanglement of Sequential Data -- A Case Study Using Music Audio
Comments: The paper is accepted to IJCAI 2022
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[19]  arXiv:2205.06182 (cross-list from cs.CL) [pdf, other]
Title: Improved Meta Learning for Low Resource Speech Recognition
Comments: Published in IEEE ICASSP 2022
Journal-ref: ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022, pp. 4798-4802
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[20]  arXiv:2205.06157 (cross-list from eess.AS) [pdf, other]
Title: Training Strategies for Own Voice Reconstruction in Hearing Protection Devices using an In-ear Microphone
Comments: Submitted to IWAENC 2022
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[21]  arXiv:2205.05949 (cross-list from eess.AS) [pdf, other]
Title: Automated Audio Captioning: an Overview of Recent Progress and New Challenges
Comments: Submitted to EURASIP Journal on Audio Speech and Music Processing in April
Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Multimedia (cs.MM); Sound (cs.SD)
[22]  arXiv:2205.05785 (cross-list from eess.AS) [pdf, other]
Title: Real-Time Packet Loss Concealment With Mixed Generative and Predictive Model
Comments: Submitted to INTERSPEECH 2022
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[23]  arXiv:2205.05764 (cross-list from cs.LG) [pdf, other]
Title: Deep Learning and Synthetic Media
Comments: Forthcoming in Synthese (please cite published version)
Subjects: Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[24]  arXiv:2205.05684 (cross-list from eess.AS) [pdf, other]
Title: A Closer Look at Audio-Visual Multi-Person Speech Recognition and Active Speaker Selection
Comments: arXiv admin note: text overlap with arXiv:2205.05586
Subjects: Audio and Speech Processing (eess.AS); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Sound (cs.SD)

Thu, 12 May 2022 (showing first 1 of 13 entries)

[25]  arXiv:2205.05580 [pdf, other]
Title: Scream Detection in Heavy Metal Music
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
