I am currently a Senior Researcher at Tencent Singapore under the ‘Project Up’ (青云计划) Talent Programme, specializing in Speech Synthesis (Speech GenAI). Previously, I obtained my Ph.D. from the National University of Singapore (NUS), where I was supervised by Prof. Li Haizhou and Prof. Mike Z. Shou. Before joining Tencent, I worked as a Senior Research Engineer at the Agency for Science, Technology and Research (A*STAR), Singapore, focusing on Speech-LLM (MERaLiON Team), Speech Anti-Spoofing (Speech DeepFake Detection), and Speaker Recognition.
My research interests include text-to-speech, speaker recognition, anti-spoofing, and speech-LLMs. I have published more than 20 papers in top international AI conferences and journals, including NeurIPS, IEEE TIFS, IEEE TASLP, EMNLP, IEEE SPL, IEEE ICASSP, and INTERSPEECH.
🔍 Research Areas
Speech Processing: Text-to-Speech (TTS), Speaker Recognition, Speech Foundation Model, Anti-spoofing / DeepFake Detection, Target Speaker Extraction
Multi-modal Processing: Speech-LLM, Audio-visual
Algorithms: Self-supervised Learning, Disentanglement Learning
🎓 Education
- 2021.01 - 2025.01, Ph.D. in Speech Processing, National University of Singapore (NUS), Singapore.
- 2018.07 - 2019.06, M.Sc. in Electronic and Computer Engineering (Specialization in Computer Engineering), National University of Singapore (NUS), Singapore.
💼 Work Experience
- 2025.02 - Now, Senior Researcher, ‘Project Up’ (青云计划) Talent Programme, Tencent, Singapore.
- 2020.09 - 2025.02, Senior Research Engineer, Agency for Science, Technology and Research (A*STAR), Singapore.
- 2019.06 - 2020.08, AI Scientist, PENSEES R&D Center, Singapore.
📜 Publications
⭐ Selected Publications
- T. Liu, D. T. Truong, R. K. Das, K. A. Lee, H. Li, Nes2Net: A Lightweight Nested Architecture for Foundation Model Driven Speech Anti-spoofing, IEEE Transactions on Information Forensics and Security (TIFS), 2025. 🔗[IEEE, arXiv, Code & Models]
- T. Liu, K. A. Lee, Q. Wang, H. Li, Golden Gemini is All You Need: Finding the Sweet Spots for Speaker Verification, IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), 2024. 🔗[IEEE, arXiv, pre-trained models & simple inference, train & test code, ONNX]
- T. Liu, K. A. Lee, Q. Wang, H. Li, Disentangling Voice and Content with Self-Supervision for Speaker Recognition, Advances in Neural Information Processing Systems (NeurIPS), 2023. 🔗[NeurIPS, arXiv]
- T. Liu, R. K. Das, K. A. Lee, H. Li, MFA: TDNN with multi-scale frequency-channel attention for text-independent speaker verification with short utterances, IEEE ICASSP, oral, 2022. 🔗[IEEE, arXiv, code]
- [🌊MERaLiON🦁] M. Huzaifah*, G. Lin*, T. Liu*, H. B. Sailor*, K. M. Tan*, T. K. Vangani*, Q. Wang*, J. H. M. Wong*, N. F. Chen, A. T. Aw (MERaLiON Team), MERaLiON-SpeechEncoder: Towards a Speech Foundation Model for Singapore and Beyond, 2024. 🔗[arXiv, 🤗Hugging Face]
Speech-LLM Pretraining & Reasoning
- Q. Wang, H. B. Sailor, T. Liu, …, Benchmarking Contextual and Paralinguistic Reasoning in Speech-LLMs: A Case Study with In-the-Wild Data, EMNLP (findings), 2025. 🔗[arXiv, ACL Anthology]
- Q. Wang, H. B. Sailor, J. H. Wong, T. Liu, …, Incorporating Contextual Paralinguistic Understanding in Large Speech-Language Models, IEEE ASRU, 2025. 🔗[IEEE (soon), arXiv]
- Q. Wang, H. B. Sailor, T. Liu, A. T. Aw, Contextual Paralinguistic Data Creation for Multi-Modal Speech-LLM: Data Condensation and Spoken QA Generation, INTERSPEECH, 2025. 🔗[ISCA, arXiv, dataset(🤗Hugging Face)]
- [🌊MERaLiON🦁] M. Huzaifah*, G. Lin*, T. Liu*, H. B. Sailor*, K. M. Tan*, T. K. Vangani*, Q. Wang*, J. H. M. Wong*, N. F. Chen, A. T. Aw (MERaLiON Team), MERaLiON-SpeechEncoder: Towards a Speech Foundation Model for Singapore and Beyond, 2024. 🔗[arXiv, 🤗Hugging Face]
- (Under Review) L. Xue, …, T. Liu, …, Audio-FLAN: A Preliminary Release, 2025. 🔗[arXiv, Dataset(🤗Hugging Face), Github]
Speaker Recognition & Target Speaker Extraction
- (Best Paper Award) R. Tao, Z. Shi, Y. Jiang, T. Liu, H. Li, Voice Conversion Augmentation for Speaker Recognition on Defective Datasets, APSIPA ASC, 2025. 🔗[arXiv, IEEE Xplore]
- Y. Ma, S. Wang, T. Liu, H. Li, PhiNet: Speaker Verification with Phonetic Interpretability, IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), 2025. 🔗[IEEE Xplore]
- T. Liu, R. Tao, Q. Wang, …, Interpolating Speaker Identities in Embedding Space for Data Expansion, APSIPA ASC, 2025. 🔗[arXiv, IEEE Xplore]
- Y. Ma, S. Wang, T. Liu, H. Li, ExPO: Explainable Phonetic Trait-Oriented Network for Speaker Verification, IEEE Signal Processing Letters (SPL), 2025. 🔗[IEEE, arXiv, code]
- K. Zhang, M. Borsdorf, T. Liu, S. Wang, Y. Wei, H. Li, Speaker Extraction with Verification of Present and Absent Target Speakers, Journal of Shanghai Jiaotong University (Science), 2025. 🔗[Paper Link]
- T. Liu, Advances in Robust and Practical Speaker Verification, 2025.
- T. Liu, K. A. Lee, Q. Wang, H. Li, Golden Gemini is All You Need: Finding the Sweet Spots for Speaker Verification, IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), 2024. 🔗[IEEE, arXiv, pre-trained models & simple inference, train & test code, ONNX]
- T. Liu, K. A. Lee, Q. Wang, H. Li, Disentangling Voice and Content with Self-Supervision for Speaker Recognition, Advances in Neural Information Processing Systems (NeurIPS), 2023. 🔗[NeurIPS, arXiv]
- Q. Wang, K. A. Lee, T. Liu, Incorporating Uncertainty from Speaker Embedding Estimation to Speaker Verification, IEEE ICASSP, oral, 2023. 🔗[IEEE, arXiv]
- T. Liu, R. K. Das, K. A. Lee, H. Li, MFA: TDNN with multi-scale frequency-channel attention for text-independent speaker verification with short utterances, IEEE ICASSP, oral, 2022. 🔗[IEEE, arXiv, code]
- T. Liu, R. K. Das, K. A. Lee, H. Li, Neural acoustic-phonetic approach for speaker verification with phonetic attention mask, IEEE Signal Processing Letters (SPL), 2022. 🔗[IEEE]
- Q. Wang, K. A. Lee, T. Liu, Scoring of Large-Margin Embeddings for Speaker Verification: Cosine or PLDA?, INTERSPEECH, oral, 2022. 🔗[ISCA, arXiv]
- T. Liu, R. K. Das, M. C. Madhavi, S. Shen, H. Li, Speaker-Utterance Dual Attention for Speaker and Utterance Verification, INTERSPEECH, 2020. 🔗[ISCA, arXiv]
- T. Liu, M. C. Madhavi, R. K. Das, H. Li, A Unified Framework for Speaker and Utterance Verification, INTERSPEECH, 2019. 🔗[ISCA, code]
Speech DeepFake Detection / Anti-spoofing
<!--
- D. T. Truong, T. Liu, …, Addressing Gradient Misalignment in Data-Augmented Training for Robust Speech Deepfake Detection, IEEE ICASSP, 2026. 🔗[arXiv]
-->
- T. Liu, D. T. Truong, R. K. Das, K. A. Lee, H. Li, Nes2Net: A Lightweight Nested Architecture for Foundation Model Driven Speech Anti-spoofing, IEEE Transactions on Information Forensics and Security (TIFS), 2025. 🔗[IEEE, arXiv, Code & Models]
- T. Liu, L. Zhang, R. K. Das, Y. Ma, R. Tao, H. Li, How Do Neural Spoofing Countermeasures Detect Partially Spoofed Audio?, INTERSPEECH, oral, 2024. 🔗[ISCA, arXiv]
- Z. Pan, T. Liu, H. B. Sailor, Q. Wang, Attentive Merging of Hidden Embeddings from Pre-trained Speech Model for Anti-spoofing Detection, INTERSPEECH, oral, 2024. 🔗[ISCA, arXiv, code]
- T. Liu, I. Kukanov, Z. Pan, Q. Wang, H. B. Sailor, K. A. Lee, Towards Quantifying and Reducing Language Mismatch Effects in Cross-lingual Speech Anti-Spoofing, IEEE SLT, 2024. 🔗[IEEE, arXiv]
- A. Guragain*, T. Liu*, Z. Pan, H. B. Sailor, Q. Wang, Speech Foundation Model Ensembles for the Controlled Singing Voice Deepfake Detection (CtrSVDD) Challenge 2024, IEEE SLT, (co-first author*), 2024. 🔗[IEEE, arXiv, code]
<!--
- D. T. Truong, T. Liu, …, QAMO: Quality-aware Multi-centroid One-class Learning For Speech Deepfake Detection, under review. 🔗[arXiv]
-->
Audio-Visual
- J. Meng, H. B. Sailor, Q. Wang, T. Liu, …, Exploring audio-visual fusion methods in foundation model-based deception detection, APSIPA ASC, 2025. 🔗[IEEE Xplore]
💻 Open Source Contributions
- WeSpeaker
- AudioLLM
- Audio-FLAN
- Nes2Net
- MERaLiON [🤗Hugging Face]
🌟 Others
🥇 Awards🥈🥉🏅🎖️
- 🥇Best Paper Award (only one paper awarded), APSIPA ASC, 2025 (co-author)
- 🥈Silver Award, Tencent AI Application Award, Tencent, 2025
- 🥇Best Student Paper Award (1 recipient annually), Pattern Recognition and Machine Intelligence Association (PREMIA), 2024
- 🥇Best Student Paper Award (2 recipients annually), Nanyang Speech Technology Forum (NSTF), 2024
- 🥉3rd place winner out of 49 teams in the Controlled Singing Voice Deepfake Detection (CtrSVDD) Challenge 2024 @ IEEE SLT, 2024
- 🥇Best Paper Award in 2023 International Doctoral Forum (CUHK & Microsoft), 2023
- 🥇Travel Grant Award (Top Tier), Chinese and Oriental Languages Information Processing Society (COLIPS), 2023
- 🎖️5th place winner out of 128 teams in NIST Face Recognition Vendor Test (FRVT) - Face Mask Effects, 2020
- Outstanding Achievement Award under the Research Domain, PENSEES Singapore, 2020
🏛️ Academic Service
- Virtual Session Chair @ IEEE IALP, 2023
- Program Committee @ IEEE ISCSLP, 2022
- Volunteer @ IEEE ICASSP, 2022
📝 Reviewer
Journal/Transactions/Letters:
- Neural Networks ’25
- IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP) ’25
- IEEE Transactions on Dependable and Secure Computing (TDSC) ’25
- IEEE Transactions on Biometrics, Behavior, and Identity Science (TBIOM) ’25
- Computer Speech & Language ’24~’25
- IEEE Signal Processing Letters (SPL) ’23~’25
Conference:
- Conference on Neural Information Processing Systems (NeurIPS) ’25
- IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ’23~’25
- INTERSPEECH ’23~’25
- International Joint Conference on Neural Networks (IJCNN) ’25
- IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) ’25
- IEEE Spoken Language Technology Workshop (SLT) ’24
- ASVspoof5 ’24
- IEEE ISCSLP ’22