Publications

Last updated: 2 May 2025

Note: PDFs are for personal use only. Copyrights are reserved by the publishers.

You can find my Google Scholar profile here.

Preprint(s)

J. Luo, H. Phan, L. Wang, J. Reiss. Heterogeneous bimodal attention fusion for speech emotion recognition, arXiv preprint arXiv:2503.06405, 2025 [PDF]

J. Luo, H. Phan, L. Wang, J. Reiss. Bimodal Connection Attention Fusion for Speech Emotion Recognition, arXiv preprint arXiv:2503.05858, 2025 [PDF]

N. D. T. Nguyen, H. Phan, K. Mikkelsen, P. Kidmose. Single-word Auditory Attention Classification Using Deep Learning Models, arXiv preprint arXiv:2410.19793, 2024 [PDF]

N. D. T. Nguyen, H. Phan, S. Geirnaert, K. Mikkelsen, P. Kidmose. AADNet: An End-to-End Deep Learning Model for Auditory Attention Decoding, arXiv preprint arXiv:2410.13059, 2024 [PDF]

H. A. Just, M. Jin, A. K. Sahu, H. Phan, R. Jia, Data-Centric Human Preference Optimization with Rationales, arXiv preprint arXiv:2407.14477, 2024 [PDF]

O. Y. Chén, D. T. Vũ, C. S. Diaz, J. S. Bodelet, H. Phan, G. Allali, V.-D. Nguyen, H. Cao, X. He, Y. Müller, B. Zhi, H. Shou, H. Zhang, W. He, X. Wang, M. Munafò, N. L. Trung, G. Nagels, P. Ryvlin, G. Pantaleo. Residual Partial Least Squares Learning: Brain Cortical Thickness Simultaneously Predicts Eight Non-pairwise-correlated Behavioural and Disease Outcomes in Alzheimer’s Disease. bioRxiv, DOI:2024.03.11.584383, 2024 [PDF]

2025

K.-P. Huang, S.-W. Yang, H. Phan, B.-R. Lu, B. Kim, S. Macha, Q. Tang, S. Ghosh, H.-Y. Lee, C.-C. Kao, C. Wang, IMPACT: Iterative Mask-based Parallel Decoding for Text-to-Audio Generation with Diffusion Modeling. International Conference on Machine Learning (ICML), 2025
(Accepted)

S.-W. Yang, B. Kim, K.-P. Huang, Q. Tang, H. Phan, S. Ghosh, B.-R. Lu, H. Sundar, S. Ghosh, H.-Y. Lee, C.-C. Kao, C. Wang, Generative Audio Language Modeling with Continuous-valued Tokens and Masked Next-Token Prediction. International Conference on Machine Learning (ICML), 2025
(Accepted)

J. Liang, X. Liu, W. Wang, M. D. Plumbley, H. Phan, E. Benetos, Acoustic Prompt Tuning: Empowering Large Language Models with Audition Capabilities, IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP), 2025 [Preprint]
(Accepted)

J. Wang, X. Wang, X. Ning, Y. Lin, H. Phan, Z. Jia. Subject-Adaptation Salient Wave Detection Network for Multimodal Sleep Stage Classification. IEEE Journal of Biomedical and Health Informatics (JBHI), vol. 29, no. 3, 2025 [PDF]

B. Zhai, H. Duan, Y. Guan, H. Phan, W. L. Woo, DSleepNet: Disentanglement Learning for Personal Attribute-agnostic Three-stage Sleep Classification Using Wearable Sensing Data. IEEE Journal of Biomedical and Health Informatics (JBHI), 2025 [PDF]

B. Kim, A. Bydlon, Q. Tang, H. Phan, C.-C. Kao, T. Zhang, C. Wang. Effective Techniques for Scaling Audio Encoder Pretraining. Proc. 50th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025 [PDF]

S. Singh, E. Benetos, H. Phan, D. Stowell. LHGNN: Local-Higher Order Graph Neural Networks For Audio Classification and Tagging. Proc. 50th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025 [PDF]

2024

J. Liang, I. Nolasco, B. Ghani, H. Phan, E. Benetos, D. Stowell. Mind the Domain Gap: a Systematic Analysis on Bioacoustic Sound Event Detection. Proc. EUSIPCO, 2024 [PDF]

J. Luo, H. Phan, L. Wang, J. Reiss. Enhanced speech emotion recognition incorporating speaker-sensitive interactions in conversations. Proc. IEEE International Conference on Multimedia & Expo (ICME), 2024 [PDF]

M. Chen, H. Zhang, Y. Li, J. Luo, W. Wu, Z. Ma, P. Bell, C. Lai, J. Reiss, L. Wang, P. C. Woodland, X. Chen, H. Phan, T. Hain. 1st Place Solution to Odyssey Emotion Recognition Challenge Task1: Tackling Class Imbalance Problem. Proc. Odyssey, 2024 [Preprint] [PDF]

M. Baumert, H. Phan. A perspective on automated rapid eye movement sleep assessment. Journal of Sleep Research, article ID e14223, 2024 [PDF]

P. Autthasan, R. Chaisaen, H. Phan, M. De Vos, T. Wilaiprasitporn. MixNet: Joining Force of Classical and Modern Approaches toward The Comprehensive Pipeline in Motor Imagery EEG Classification. IEEE Internet of Things Journal, 2024 [PDF]

J. Liang, H. Zhang, H. Liu, Y. Cao, Q. Kong, X. Liu, W. Wang, M. D. Plumbley, H. Phan, E. Benetos. WavCraft: Audio Editing and Generation with Large Language Models. ICLR 2024 Workshop on Large Language Model (LLM) Agents, 2024 [Preprint] [Code]

H. Phan, B. Kim, V. Nguyen, A. Bydlon, Q. Tang, C.-C. Kao, C. Wang. Cross-triggering Issue in Audio Event Detection and Mitigation. Proc. 49th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024 [PDF]

J. Liang, H. Phan, E. Benetos. Learning from Taxonomy: Multi-label Few-Shot Classification for Everyday Sound Recognition. Proc. 49th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024 [Preprint][PDF][Dataset]

V.-D. Nguyen, H. Phan, O. Y. Chén, A. Mansour, A. Coatanhay, T. Marsault. On the relationship between Vogler algorithm derivation and parabolic equation for multiple knife edge diffraction. IEEE Transactions on Antennas and Propagation, vol. 72, no. 5, pp. 4682-4686, 2024

K. Kontras, C. Chatzichristos, H. Phan, J. Suykens, M. De Vos. CoRe-Sleep: A Multimodal Fusion Framework for Time Series Robust to Imperfect Modalities. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 2024 [Preprint] [PDF]

S. Singh, C. J. Steinmetz, E. Benetos, H. Phan, D. Stowell, ATGNN: Audio Tagging Graph Neural Network, IEEE Signal Processing Letters, 2024 [Preprint] [PDF]

2023

O. Y. Chén, J. S. Bodelet, R. G. Saraiva, H. Phan, J. Di, G. Nagels, T. Schwantje, H. Cao, J. Gou, J. M. Reinen, B. Xiong, B. Zhi, X. Wang, M. de Vos. The roles, challenges, and merits of the p value. Patterns, vol. 4, no. 12, article no. 100878, 2023 [PDF]

C. Vahidi, S. Singh, G. Fazekas, E. Benetos, D. Stowell, H. Phan, M. Lagrange. Perceptual Musical Similarity Metric Learning with Graph Neural Networks. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2023 [PDF]

D. Ngo, L. Pham, H. Phan, D. Jarchi, S. Kolozali. An Inception-Residual-based Architecture with Multi-objective Loss for Detecting Respiratory Anomalies. IEEE 25th International Workshop on Multimedia Signal Processing (MMSP), 2023 [PDF]

D. Ngo, L. Pham, H. Phan, M. Tran, D. Jarchi. A Deep Learning Architecture with Spatio-Temporal Focusing for Detecting Respiratory Anomalies. 2023 IEEE Biomedical Circuits and Systems Conference (BioCAS), 2023 [PDF]

H. Phan, K. P. Lorenzen, E. Heremans, O. Y. Chén, M. C. Tran, P. Koch, A. Mertins, M. Baumert, K. Mikkelsen, M. De Vos. L-SeqSleepNet: Whole-cycle Long Sequence Modelling for Automatic Sleep Staging. IEEE Journal of Biomedical and Health Informatics (JBHI), vol. 27, no. 10, pp. 1-10, 2023 [PDF][Code]

J. Luo, H. Phan, L. Wang, J. Reiss. Mutual Cross-Attention in Dyadic Fusion Networks for Audio-Video Emotion Recognition. Proc. Affective Computing + Intelligent Interaction (ACII), 2023 [PDF]
(Spotlight)

R. Pandey, S. Jaiswal, H. Phan, S. Nannuru. Improving Audio Event Localization via Derivative Prediction. Proc. EUSIPCO, 2023 [PDF]

J. Liang, X. Liu, H. Liu, H. Phan, E. Benetos, M. Plumbley, W. Wang. Adapting Language-Audio Models as Few-Shot Audio Learners. Proc. INTERSPEECH, 2023 [PDF]

J. Luo, H. Phan, J. D. Reiss. Fine-tuned RoBERTa Model with a CNN-LSTM Network for Conversational Emotion Recognition. Proc. INTERSPEECH, 2023 [PDF]

I. Ghinassi, M. Purver, H. Phan, C. Newell. Exploring Pre-Trained Neural Audio Representations for Audio Topic Segmentation. Proc. IEEE International Conference on Multimedia & Expo (ICME), 2023 [PDF]

H. Phan, E. Heremans, O. Chén, P. Koch, A. Mertins, M. De Vos. Improving Automatic Sleep Staging via Temporal Smoothness Regularization. Proc. 48th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023 [PDF]

J. Luo, H. Phan, J. D. Reiss. Cross-modal fusion techniques for utterance-level emotion recognition from text and speech. Proc. 48th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023 [PDF][Preprint]

M. Comunità, C. J. Steinmetz, H. Phan, J. D. Reiss. Modelling black-box audio effects with time-varying feature modulation. Proc. 48th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023 [PDF][Preprint]

O. Y. Chén, F. Lipsmeier, H. Phan, F. Dondelinger, A. Creagh, C. Gossens, M. Lindemann, M. de Vos. Personalized Longitudinal Assessment of Multiple Sclerosis Using Smartphones. IEEE Journal of Biomedical and Health Informatics (JBHI), 2023 [PDF] [Preprint]

M. Baumert, S. Hartmann, H. Phan. Automatic sleep staging for the young and the old – evaluating age bias in deep learning. Sleep Medicine, 2023 [PDF]

K. Borup, P. Kidmose, H. Phan, K. Mikkelsen. Automatic Sleep Scoring using Patient-Specific Ensemble Models and Knowledge Distillation for Ear-EEG Data. Biomedical Signal Processing and Control, vol. 81, Article ID 104496, 2023 [PDF]

2022

H. Phan, A. Mertins, M. Baumert. Pediatric Automatic Sleep Staging: A comparative study of state-of-the-art deep learning methods. IEEE Transactions on Biomedical Engineering, vol. 69, no. 12, pp. 3612-3622, 2022 [PDF][Preprint]

O. Y. Chén, H. Phan, H. Cao, T. Qian, M. De Vos. Probing Potential Priming: Defining, Quantifying, and Testing the Causal Priming Effect Using the Potential Outcomes Framework. Frontiers in Psychology, vol. 13, Article ID 724498, 2022 [PDF]

R. Li, J. Liang, H. Phan. Few-Shot Bioacoustic Event Detection: Enhanced Classifiers for Prototypical Networks. 7th Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE 2022), 2022 [PDF]

J. Liang, H. Phan, E. Benetos. Leveraging Label Hierarchies for Few-shot Everyday Sound Recognition. 7th Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE 2022), 2022 [PDF]

E. R. M. Heremans, T. Osselaer, N. Seeuws, H. Phan, D. Testelmans, M. De Vos. Data augmentation in semi-supervised adversarial domain adaptation for EEG-based sleep staging. IEEE-EMBS International Conference on Biomedical and Health Informatics (IEEE BHI), 2022 [PDF]

S. Singh, H. Phan, E. Benetos. HyperNetworks for Sound Event Detection: A Proof-Of-Concept. 30th European Signal Processing Conference (EUSIPCO 2022), 2022 [PDF]

H. Phan, K. B. Mikkelsen, O. Y. Chén, P. Koch, A. Mertins, M. De Vos. SleepTransformer: Automatic Sleep Staging with Interpretability and Uncertainty Quantification. IEEE Transactions on Biomedical Engineering, vol. 69, no. 8, pp. 2456-2467, 2022 [Preprint] [PDF] [Code]

H. Phan, O. Y. Chén, M. C. Tran, P. Koch, A. Mertins, and M. De Vos. XSleepNet: Multi-View Sequential Model for Automatic Sleep Staging. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), vol. 44, no. 9, pp. 5903-5915, 2022 [Preprint][PDF][Code]

V.-D. Nguyen, H. Phan, A. Mansour, A. Coatanhay, T. Marsault. Deep learning based higher-order approximation for multiple knife edge diffraction. IEEE International Symposium on Antennas and Propagation and USNC-URSI Radio Science Meeting, 2022 [PDF]

E. R. M. Heremans, H. Phan, A. H. Ansari, P. Borzée, B. Buyse, D. Testelmans, M. De Vos. Feature matching as improved transfer learning technique for wearable EEG. Biomedical Signal Processing and Control, vol. 78, article ID 104009, 2022 [Preprint][PDF]

E. R. M. Heremans, H. Phan, P. Borzée, B. Buyse, D. Testelmans, M. De Vos. From unsupervised to semi-supervised adversarial domain adaptation in EEG-based sleep staging. Journal of Neural Engineering, vol. 19, article ID 036044, 2022 [PDF]

M. Comunità, H. Phan, J. D. Reiss. Neural Synthesis of Footsteps Sound Effects with Generative Adversarial Networks. 152nd Audio Engineering Society Convention, 2022 [PDF][Preprint]

H. Phan, K. Mikkelsen. Automatic Sleep Staging of EEG Signals: Recent Development, Challenges, and Future Directions. Physiological Measurement (Topical Review), vol. 43, article ID 04TR01, 2022 [PDF]

H. Phan, T. N. T. Nguyen, P. Koch, A. Mertins. Polyphonic audio event detection: multi-label or multi-class classification problem? Proc. 47th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 8877-8881, 2022 [Preprint] [PDF]

T. N. T. Nguyen, D. L. Jones, K. N. Watcharasupat, H. Phan, W.-S. Gan. SALSA-Lite: A Fast and Effective Feature for Polyphonic Sound Event Localization and Detection with Microphone Arrays. Proc. 47th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 716-720, 2022 [Preprint] [PDF]

K. B. Mikkelsen, H. Phan, M. L. Rank, M. C. Hemmsen, M. de Vos, P. Kidmose. Sleep monitoring using ear-centered setups: Investigating the influence from electrode configurations. IEEE Transactions on Biomedical Engineering (TBME), vol. 69, no. 5, pp. 1564-1572, 2022 [Preprint][PDF]
(Featured Article)

P. Autthasan, R. Chaisaen, T. Sudhawiyangkul, P. Rangpong, S. Kiatthaveephong, N. Dilokthanakul, G. Bhakdisongkhram, H. Phan, C. Guan, and T. Wilaiprasitporn. MIN2Net: End-to-End Multi-Task Learning for Subject-Independent Motor Imagery EEG Classification. IEEE Transactions on Biomedical Engineering, vol. 69, no. 6, pp. 2105-2118, 2022 [Preprint][PDF]

2021

T. T. H. Duong, P.-L. Nguyen, H.-S. Nguyen, D.-C. Nguyen, H. Phan, N. Q. K. Duong. Speaker Count: A new building block for speaker diarization. Proc. 13th Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2021 [PDF]

P. Koch, K. Mohammad-Zadeh, M. Maass, M. Dreier, O. Thomsen, T. J. Parbs, H. Phan, and A. Mertins. sEMG-Based Hand Movement Regression via Prediction of Joint Angles. Proc. 43rd Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2021 [PDF]

L. Pham, H. Phan, A. Schindler, R. King, A. Mertins, and I. McLoughlin. Inception-Based Network and Multi-Spectrogram Ensemble Applied to Predicting Respiratory Anomalies and Lung Diseases. Proc. 43rd Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2021 [PDF]

H. Phan, H. L. Nguyen, O. Y. Chén, L. Pham, P. Koch, I. McLoughlin, and A. Mertins. Multi-view Audio and Music Classification. Proc. 46th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021 [PDF]

H. Phan, H. L. Nguyen, O. Y. Chén, P. Koch, N. Q. K. Duong, I. McLoughlin, and A. Mertins. Self-Attention Generative Adversarial Network for Speech Enhancement. Proc. 46th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021 [PDF][CODE]

T. N. T. Nguyen, N. K. Nguyen, H. Phan, L. Pham, K. Ooi, D. L. Jones, W.-S. Gan. A General Network Architecture for Sound Event Localization and Detection Using Transfer Learning and Recurrent Neural Network. Proc. 46th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021 [PDF][CODE]

V.-D. Nguyen, H. Phan, A. Mansour, A. Coatanhay, and T. Marsault. On the proof of recursive Vogler algorithm for multiple knife-edge diffraction. IEEE Transactions on Antennas and Propagation (TAP), vol. 69, no. 9, pp. 3617-3622, 2021 [PDF]

N. Banluesombatkul, P. Ouppaphan, P. Leelaarporn, P. Lakhan, B. Chaitusaney, N. Jaimchariyatam, E. Chuangsuwanich, W. Chen, H. Phan, N. Dilokthanakul, T. Wilaiprasitporn. MetaSleepLearner: A Pilot Study on Fast Adaptation of Bio-signals-Based Sleep Stage Classifier to New Individual Subject Using Meta-Learning. IEEE Journal of Biomedical and Health Informatics (JBHI), vol. 25, no. 6, pp. 1949-1963, 2021 [PDF]

L. Pham, H. Phan, R. Palaniappan, A. Mertins, I. McLoughlin. CNN-MoE based framework for classification of respiratory anomalies and lung disease detection. IEEE Journal of Biomedical and Health Informatics (JBHI), 2021 [Preprint] [PDF]

H. Phan, O. Y. Chén, P. Koch, Z. Lu, I. McLoughlin, A. Mertins, and M. De Vos. Towards More Accurate Automatic Sleep Staging via Deep Transfer Learning. IEEE Transactions on Biomedical Engineering (TBME), vol. 68, no. 6, pp. 1787-1798, 2021 [Preprint][PDF][CODE]

L. Pham, H. Phan, T. Nguyen, R. Palaniappan, A. Mertins, I. McLoughlin. Robust Acoustic Scene Classification using a Multi-Spectrogram Encoder-Decoder Framework. Digital Signal Processing, vol. 110, Article ID 102943, 2021 [PDF]

O. Y. Chén, H. Cao, H. Phan, G. Nagels, J. M. Reinen, J. Gou, T. Qian, J. Di, J. Prince, T. D. Cannon, M. de Vos. Identifying Neural Signatures Mediating Behavioral Symptoms and Psychosis Onset: High-Dimensional Whole Brain Functional Mediation Analysis. NeuroImage, vol. 226, Article ID 117508, 2021 [PDF]

2020

H. Phan, L. Pham, P. Koch, N. Q. K. Duong, I. McLoughlin, and A. Mertins. On Multitask Loss Function for Audio Event Detection and Localization. 5th Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE 2020), 2020 [PDF]

H. Phan, I. V. McLoughlin, L. Pham, O. Y. Chén, P. Koch, M. De Vos, and A. Mertins. Improving GANs for Speech Enhancement. IEEE Signal Processing Letters, vol. 27, pp. 1700-1704, 2020 [PDF] [CODE]

P. Koch, M. Dreier, M. Maass, H. Phan, and A. Mertins. RNN with Stacked Architecture for sEMG based Sequence-to-Sequence Hand Gesture Recognition. 28th European Signal Processing Conference (EUSIPCO 2020), 2020

L. Pham, I. McLoughlin, H. Phan, M. Tran, T. Nguyen, R. Palaniappan. Robust Deep Learning Framework For Predicting Respiratory Anomalies and Diseases. 42nd Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC 2020), 2020 [PDF]

P. Koch, M. Dreier, A. Larsen, T. J. Parbs, M. Maaß, H. Phan, A. Mertins. Regression of Hand Movements from sEMG Data with Recurrent Neural Networks. 42nd Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC 2020), 2020

L. Pham, I. McLoughlin, H. Phan, R. Palaniappan, A. Mertins. Deep Feature Embedding and Hierarchical Classification for Audio Scene Classification. 2020 International Joint Conference on Neural Networks (IJCNN), 2020 [PDF]

H. Phan, L. Pham, P. Koch, N. Q. K. Duong, I. McLoughlin, A. Mertins. Audio Event Detection and Localization with Multitask Regression Network. Detection and Classification of Acoustic Scenes and Events (DCASE 2020) Technical Report, 2020 [PDF]

H. Phan, K. Mikkelsen, O. Y. Chén, P. Koch, A. Mertins, P. Kidmose, M. De Vos. Personalized Automatic Sleep Staging with Single-Night Data: a Pilot Study with KL-Divergence Regularization. Physiological Measurement, vol. 41, Article ID 064004, 2020 [PDF]

O. Y. Chén, F. Lipsmeier, H. Phan, J. Prince, K. I. Taylor, C. Gossens, M. Lindemann, and M. De Vos. Building a Machine-learning Framework to Remotely Assess Parkinson’s Disease Using Smartphones. IEEE Transactions on Biomedical Engineering (TBME), vol. 67, no. 12, pp. 3491-3500, 2020 [PDF]

V.-D. Nguyen, H. Phan, A. Mansour, and A. Coatanhay. VoglerNet: Multiple Knife-Edge Diffraction Using Deep Neural Network. 14th European Conference on Antennas and Propagation (EuCAP 2020), 2020

I. McLoughlin, Z. Xie, Y. Song, H. Phan, and P. Ramaswamy. Time-frequency feature fusion for noise-robust audio event classification. Circuits, Systems, and Signal Processing (CSSP): 39(3), pp. 1672–1687, 2020 [PDF]

2019

H. Phan, F. Andreotti, N. Cooray, O. Y. Chén, and M. De Vos. SeqSleepNet: End-to-End Hierarchical Recurrent Neural Network for Sequence-to-Sequence Automatic Sleep Staging. IEEE Transactions on Neural Systems and Rehabilitation Engineering (TNSRE): 27(3), pp. 400-410, 2019 [PDF][CODE]
(IEEE-EMBS Best Paper Award 2019-2020)(Cover featured in the published issue: Link)

H. Phan, O. Y. Chén, L. Pham, P. Koch, M. De Vos, I. McLoughlin, and A. Mertins. Spatio-Temporal Attention Pooling for Audio Scene Classification. 20th Annual Conference of the International Speech Communication Association (INTERSPEECH), 2019 [PDF]

L. Pham, I. McLoughlin, H. Phan, and R. Palaniappan. A Robust Framework For Acoustic Scene Classification. 20th Annual Conference of the International Speech Communication Association (INTERSPEECH), 2019

H. Phan, O. Y. Chén, P. Koch, A. Mertins, and M. De Vos. Deep Transfer Learning for Single-Channel Automatic Sleep Staging with Channel Mismatch. 27th European Signal Processing Conference (EUSIPCO), 2019 [PDF]

P. Koch, M. Dreier, M. Böhme, M. Maass, H. Phan, and A. Mertins. Inhomogeneously Stacked RNN for Recognizing Hand Gestures from Magnetometer Data. 27th European Signal Processing Conference (EUSIPCO), 2019 [PDF]

H. Phan, O. Y. Chén, P. Koch, A. Mertins, and M. De Vos. Fusion of End-to-End Deep Learning Models for Sequence-to-Sequence Sleep Staging. 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2019

P. Koch, M. Dreier, M. Maass, M. Böhme, H. Phan, and A. Mertins. A Recurrent Neural Network for Hand Gesture Recognition based on Accelerometer Data. 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2019

H. Phan, O. Y. Chén, P. Koch, L. Pham, I. McLoughlin, A. Mertins, and M. De Vos. Beyond Equal-Length Snippets: How Long is Sufficient to Recognize an Audio Scene? 2019 AES Conference on Audio Forensics, no. 16, 2019 [PDF]

L. Pham, I. McLoughlin, H. Phan, R. Palaniappan, and Y. Lang. Bag-of-Features Models Based on C-DNN Network for Acoustic Scene Classification. 2019 AES Conference on Audio Forensics, no. 12, 2019

H. Phan, O. Y. Chén, P. Koch, L. Pham, I. McLoughlin, A. Mertins, and M. De Vos. Unifying Isolated and Overlapping Audio Event Detection with Multi-Label Multi-Task Convolutional Recurrent Neural Networks. Proc. 44th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 51-55, 2019 [PDF]

P. Koch, N. Brügge, H. Phan, M. Maass, and A. Mertins. Forked Recurrent Neural Network for Hand Gesture Classification Using Inertial Measurement Data. Proc. 44th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 2877-2881, 2019

H. Phan, F. Andreotti, N. Cooray, O. Y. Chén, and M. De Vos. Joint Classification and Prediction CNN Framework for Automatic Sleep Stage Classification. IEEE Transactions on Biomedical Engineering (TBME): 66(5), pp. 1285-1296, 2019 [PDF][CODE]

O. Y. Chén, H. Cao, J. Reinen, T. Qian, J. Gou, H. Phan, M. De Vos, and T. Cannon. Resting-State Brain Information Flow Predicts Cognitive Flexibility in Humans. Nature Scientific Reports: 9, Article number 3879, 2019 [PDF]

2018

I. McLoughlin, Y. Song, L. Pham, R. Palaniappan, H. Phan, Y. Lang. Early Detection of Continuous and Partial Audio Events Using CNN. In Proceedings of 19th Annual Conference of the International Speech Communication Association (INTERSPEECH), pp. 3314-3318, 2018 [PDF]

P. Koch, H. Phan, M. Maass, F. Katzberg, R. Mazur, and A. Mertins. Recurrent Neural Networks with Weighting Loss for Early Prediction of Hand Movements. In Proceedings of 26th European Signal Processing Conference (EUSIPCO), 2018 [PDF]

F. Andreotti, H. Phan, M. De Vos. Visualising Convolutional Neural Network Decisions in Automatic Sleep Scoring. In Proceedings of Joint Workshop on Artificial Intelligence in Health (AIH), 2018 [PDF]

H. Phan, F. Andreotti, N. Cooray, O. Y. Chén, and M. De Vos. Automatic Sleep Stage Classification Using Single-Channel EEG: Learning Sequential Features with Attention-Based Recurrent Neural Networks. In Proceedings of 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pp. 1452-1455, 2018 [PDF]

H. Phan, F. Andreotti, N. Cooray, O. Y. Chén, and M. De Vos. DNN Filter Bank Improves 1-Max Pooling CNN for Single-Channel EEG Automatic Sleep Stage Classification. In Proceedings of 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pp. 453-456, 2018 [PDF]

F. Andreotti, H. Phan, N. Cooray, C. Lo, M. T. M. Hu, and M. De Vos. Multichannel Sleep Stage Classification and Transfer Learning Using Convolutional Neural Networks. In Proceedings of 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pp. 171-174, 2018 [PDF]

P. Koch, H. Phan, M. Maass, F. Katzberg, and A. Mertins. Recurrent Neural Network Based Early Prediction of Future Hand Movements. In Proceedings of 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2018 [PDF]

H. Phan, P. Koch, I. McLoughlin, and A. Mertins. Enabling Early Audio Event Detection with Neural Networks. In Proceedings of 43rd IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 141-145, 2018 [PDF]

H. Phan, M. Krawczyk-Becker, T. Gerkmann, and A. Mertins. Weighted and Multi-Task Loss for Rare Audio Event Detection. In Proceedings of 43rd IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 336-340, 2018 [PDF]

2017

Q. H. Phan. Audio Event Detection, Classification, and Beyond. Ph.D. Thesis, University of Lübeck, August 2017 [PDF]
(summa cum laude) (Bernd Fischer award)

I. McLoughlin, H. Zhang, Z. Xie, Y. Song, W. Xiao, and H. Phan. Continuous Robust Sound Event Classification Using Time-Frequency Features and Deep Learning. PLoS ONE: 12(9), Article ID e0182309, 2017 [PDF]

H. Phan, L. Hertel, M. Maass, P. Koch, R. Mazur, and A. Mertins. Improved Audio Scene Classification based on Label-Tree Embeddings and Convolutional Neural Networks. IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP): 25(6), pp. 1278-1290, 2017 [PDF] [CODE]

M. Maass, M. Ahlborg, A. Bakenecker, F. Katzberg, H. Phan, T. M. Buzug, and A. Mertins. A Trajectory Study for Obtaining MPI System Matrices in a Compressed-Sensing Framework. International Journal on Magnetic Particle Imaging (IJMPI): 3(2), Article ID 1706005, 2017 [PDF]

H. Phan, P. Koch, F. Katzberg, M. Maass, R. Mazur, I. McLoughlin, and A. Mertins. What Makes Audio Event Detection Harder than Classification? In Proceedings of 25th European Signal Processing Conference (EUSIPCO), pp. 2739-2743, 2017 [PDF]

H. Phan, P. Koch, F. Katzberg, M. Maass, R. Mazur, and A. Mertins. Audio Scene Classification with Deep Recurrent Neural Networks. In Proceedings of 18th Annual Conference of the International Speech Communication Association (INTERSPEECH), pp. 3043-3047, 2017 [PDF] [CODE]

P. Koch, H. Phan, M. Maass, F. Katzberg, and A. Mertins. Early Prediction of Future Hand Movements Using sEMG Data. In Proceedings of 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pp. 54-57, 2017 [PDF]

H. Phan, P. Koch, L. Hertel, M. Maass, R. Mazur, and A. Mertins. CNN-LTE: A Class of 1-X Pooling Convolutional Neural Networks on Label Tree Embeddings for Audio Scene Classification. In Proceedings of 42nd IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 136-140, 2017 [PDF] [CODE]

R. Mazur, F. Katzberg, H. Phan, and A. Mertins. Room Equalization Based on Measurements of Moving Microphones. In Proceedings of 5th Joint Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA), pp. 121-125, 2017

H. Phan, M. Krawczyk-Becker, T. Gerkmann, and A. Mertins. DNN and CNN with Weighted and Multi-Task Loss Functions for Audio Event Detection, Technical Report: Detection and Classification of Audio Scenes and Events (DCASE 2017), 2017 [PDF]

H. Phan, P. Koch, F. Katzberg, M. Maass, R. Mazur, and A. Mertins. Attention-Based CNN with Generalized Label Tree Embedding for Audio Scene Classification, Technical Report: Detection and Classification of Audio Scenes and Events (DCASE 2017), 2017 [PDF]

2016

H. Phan, L. Hertel, M. Maass, R. Mazur, and A. Mertins. Learning Representations for Nonspeech Audio Events through Their Similarities to Speech Patterns. IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP): 24(4), pp. 807-822, 2016 [PDF]

M. Maass, K. Bente, M. Ahlborg, H. Medimagh, H. Phan, T. M. Buzug, and A. Mertins. Optimized Compression of MPI System Matrices Using a Symmetry-Preserving Secondary Orthogonal Transform. International Journal on Magnetic Particle Imaging (IJMPI): 2(1), Article ID 1607002, 2016 [PDF]

H. Phan, L. Hertel, M. Maass, P. Koch, and A. Mertins. Label Tree Embeddings for Acoustic Scene Classification. In Proceedings of 24th ACM Multimedia (ACMMM), pp. 486-490, 2016 [PDF] [CODE]

H. Phan, L. Hertel, M. Maass, and A. Mertins. Robust Audio Event Recognition with 1-Max Pooling Convolutional Neural Networks. In Proceedings of 17th Annual Conference of the International Speech Communication Association (INTERSPEECH), pp. 3653-3657, 2016 [PDF] [CODE]

L. Hertel, H. Phan, and A. Mertins. Comparing Time and Frequency Domain for Audio Event Recognition Using Deep Learning. In Proceedings of IEEE International Joint Conference on Neural Networks (IJCNN), pp. 3407-3411, 2016 [PDF]

H. Phan, M. Maass, L. Hertel, R. Mazur, I. McLoughlin, and A. Mertins. Learning Compact Structural Representations for Audio Events Using Regressor Banks. In Proceedings of 41st IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 211-215, 2016 [PDF]
(SPS Travel Grant)

M. Maass, K. Bente, M. Ahlborg, H. Medimagh, H. Phan, T. M. Buzug, and A. Mertins. Compression of FFP System Matrix with a Special Sampling Rate on the Lissajous Trajectory. In Proceedings of 6th International Workshop on Magnetic Particle Imaging (IWMPI), 2016

L. Hertel, H. Phan, and A. Mertins. Classifying Variable-Length Audio Files with All-Convolutional Networks and Masked Global Pooling. Technical Report: Detection and Classification of Audio Scenes and Events (DCASE 2016), 2016 [PDF]

H. Phan, L. Hertel, M. Maass, P. Koch, and A. Mertins. CaR-FOREST: Joint Classification-Regression Decision Forests for Overlapping Audio Event Detection. Technical Report: Detection and Classification of Audio Scenes and Events (DCASE 2016), 2016 [PDF]

H. Phan, L. Hertel, M. Maass, P. Koch, and A. Mertins. CNN-LTE: a Class of 1-X Pooling Convolutional Neural Networks on Label Tree Embeddings for Audio Scene Recognition. Technical Report: Detection and Classification of Audio Scenes and Events (DCASE 2016), 2016 [PDF]

2015

H. Phan, M. Maaß, R. Mazur, and A. Mertins. Random Regression Forests for Acoustic Event Detection and Classification. IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP): 23(1), pp. 20-31, 2015 [PDF] [CODE]

M. Maass, H. Phan, A. Möller, and A. Mertins. Cosine-Sine Modulated Filter Banks for Motion Estimation and Correction. In Proceedings of International Conference on Advanced Concepts for Intelligent Vision Systems (ACIVS), pp. 195-204, 2015 [PDF]

H. Phan, M. Maass, L. Hertel, R. Mazur, and A. Mertins. A Multi-Channel Fusion Framework for Audio Event Detection. In Proceedings of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 1-5, 2015 [PDF]

H. Phan, L. Hertel, M. Maass, R. Mazur, and A. Mertins. Representing Nonspeech Audio Signals through Speech Classification Models. In Proceedings of 16th Annual Conference of the International Speech Communication Association (INTERSPEECH), pp. 3441-3445, 2015 [PDF]
(Best Student Paper Award Finalist) (ISCA Travel Grant)

H. Phan, L. Hertel, M. Maass, R. Mazur, and A. Mertins. Audio Phrases for Audio Event Recognition. In Proceedings of 23rd European Signal Processing Conference (EUSIPCO), pp. 2546-2550, 2015 [PDF]

R. Mazur, H. Phan, and A. Mertins. A Clustering Approach for Solving the Spatial Aliasing Problem in Convolutive Blind Source Separation. In Proceedings of 20th International Conference on Digital Signal Processing (DSP), pp. 679-683, 2015

H. Phan, M. Maass, R. Mazur, and A. Mertins. Early Event Detection in Audio Streams. In Proceedings of IEEE International Conference on Multimedia and Expo (ICME), pp. 1-6, 2015 [PDF]
(Top 15% paper)

2014

H. Phan, M. Maass, R. Mazur, and A. Mertins. Acoustic Event Detection and Localization with Regression Forests. In Proceedings of 15th Annual Conference of the International Speech Communication Association (INTERSPEECH), pp. 2524-2528, 2014 [PDF] [DEMO]
(ISCA Travel Grant)

H. Phan and A. Mertins. Exploring Superframe Co-occurrence for Acoustic Event Recognition. In Proceedings of 22nd European Signal Processing Conference (EUSIPCO), pp. 631-635, 2014 [PDF]

M. Maass, H. Phan, and A. Mertins. Design of Cosine-Sine Modulated Filter Banks without DC Leakage. In Proceedings of 19th International Conference on Digital Signal Processing (DSP), pp. 486-491, 2014 [PDF]

H. Phan and A. Mertins. A Voting-based Technique for Acoustic Event-Specific Detection. In Proceedings of 40th Annual German Congress on Acoustics (DAGA), 2014 [PDF]

2013

Q.-H. Phan, S.-L. Tan, and I. McLoughlin. GPS Multipath Mitigation: A Nonlinear Regression Approach. GPS Solutions: 17(3), pp. 371-380, 2013 [PDF]

Q.-H. Phan, S.-L. Tan, I. McLoughlin, and D.-L. Vu. A Unified Framework for GPS Code and Carrier-Phase Multipath Mitigation using Support Vector Regression. Advances in Artificial Neural Systems, 2013 [PDF]

H. Phan, Q. Do, T.-L. Do, and D.-L. Vu. Metric Learning for Automatic Sleep Stage Classification. In Proceedings of 35th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pp. 5025-5028, 2013 [PDF] [CODE]

2012

Q. H. Phan. Nonlinear regression approach for GPS multipath mitigation: from code to carrier-phase measurements. M.Eng. Thesis, Nanyang Technological University, 2012 [PDF]

2011

Q.-H. Phan and S.-L. Tan. Mitigation of GPS Periodic Multipath Using Nonlinear Regression. In Proceedings of 19th European Signal Processing Conference (EUSIPCO), pp. 1795-1799, 2011 [PDF]

Huy Phan