Mr Kaiwen Zheng
- Graduate Teaching Assistant (School of Computing Science)
Publications
2026
Fu, Junchen, Deng, Wenhao, Zheng, Kaiwen, Arapakis, Ioannis, Ye, Yu, Ni, Yongxin, Jose, Joemon M. ORCID: https://orcid.org/0000-0001-9228-1759 and Ge, Xuri
(2026)
Benchmarking Multimodal Large Language Models for missing modality completion in product catalogues.
Pattern Recognition, 180(A),
114020.
(doi: 10.1016/j.patcog.2026.114020)
Ye, Yu, Fu, Junchen, Song, Yu, Zheng, Kaiwen and Jose, Joemon M. ORCID: https://orcid.org/0000-0001-9228-1759
(2026)
A reproducibility study of multimodal embeddings for recommender systems.
International Journal of Multimedia Information Retrieval,
(Accepted for Publication)
Fu, Junchen, Ge, Xuri, Zheng, Kaiwen, Karatzoglou, Alexandros, Arapakis, Ioannis, Xin, Xin, Ni, Yongxin and Jose, Joemon M. ORCID: https://orcid.org/0000-0001-9228-1759
(2026)
LLMpopcorn: exploring LLMs as assistants for popular micro-video generation.
In: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2026), Barcelona, Spain, 04-08 May 2026,
pp. 12007-12011.
ISBN 9798331567026
(doi: 10.1109/ICASSP55912.2026.11463708)
Zheng, Kaiwen, Fu, Junchen, Xu, Songpei, He, Yaoqin, Jose, Joemon M. ORCID: https://orcid.org/0000-0001-9228-1759, Han, Hu and Ge, Xuri ORCID: https://orcid.org/0000-0002-3925-4951
(2026)
Focal-RegionFace: Generating Fine-Grained Multi-attribute Descriptions for Arbitrarily Selected Face Focal Regions.
In: 16th ACM International Conference on Multimedia Retrieval (ICMR 2026), Amsterdam, The Netherlands, 16-19 June 2026,
(Accepted for Publication)
Ge, Xuri, Zhang, Tianshuo, Li, Ruihan, Ye, Hui, Zheng, Kaiwen, Fu, Junchen, Huo, Da, Jose, Joemon M. ORCID: https://orcid.org/0000-0001-9228-1759 and Han, Hu
(2026)
A Comprehensive Review of Multimodal Facial State Analysis: Tasks, Methods, and Resources.
In: IEEE International Conference on Multimedia and Expo (ICME 2026), Bangkok, Thailand, 05-09 July 2026,
(Accepted for Publication)
Ye, Yu, Fu, Junchen, Song, Yu, Zheng, Kaiwen and Jose, Joemon ORCID: https://orcid.org/0000-0001-9228-1759
(2026)
Are Multimodal Embeddings Truly Beneficial for Recommendation? A Deep Dive into Whole vs. Individual Modalities.
In: 48th European Conference on Information Retrieval (ECIR 2026), Delft, The Netherlands, 30 March - 1 April 2026,
pp. 66-81.
ISBN 9783032213235
(doi: 10.1007/978-3-032-21324-2_5)
2025
Zheng, Kaiwen, Ge, Xuri ORCID: https://orcid.org/0000-0002-3925-4951, Fu, Junchen, Peng, Jun and Jose, Joemon ORCID: https://orcid.org/0000-0001-9228-1759
(2025)
Multimodal Representation Learning Techniques for Comprehensive Facial State Analysis.
In: 2025 IEEE International Conference on Multimedia and Expo (ICME), Nantes, France, 30 Jun - 04 Jul 2025,
ISBN 9798331594954
(doi: 10.1109/ICME59968.2025.11208908)
Parker, Deven ORCID: https://orcid.org/0000-0003-0467-5294, Zheng, Kaiwen, Gamer, Michael and Jose, Joemon M. ORCID: https://orcid.org/0000-0001-9228-1759
(2025)
Unlocking 18th- and 19th-century playbills with AI: an experiment in qualitative data categorization.
Umanistica Digitale, 9(21),
pp. 33-83.
(doi: 10.60923/issn.2532-8816/21718)
Fu, Junchen, Ge, Xuri, Xin, Xin, Karatzoglou, Alexandros, Arapakis, Ioannis, Zheng, Kaiwen, Ni, Yongxin and Jose, Joemon M. ORCID: https://orcid.org/0000-0001-9228-1759
(2025)
Efficient and effective adaptation of multimodal foundation models in sequential recommendation.
IEEE Transactions on Knowledge and Data Engineering,
(doi: 10.1109/TKDE.2025.3608071)
(Early Online Publication)
He, Yaoqin, Fu, Junchen, Zheng, Kaiwen, Xu, Songpei, Chen, Fuhai, Li, Jie, Jose, Joemon M. ORCID: https://orcid.org/0000-0001-9228-1759 and Ge, Xuri
(2025)
Double-Filter: Efficient Fine-tuning of Pre-trained Vision-Language Models via Patch&Layer Filtering.
In: ICML 2025, Vancouver, Canada, 13-19 July 2025,
Ge, Xuri, Li, Linqing, Xu, Songpei, Zheng, Kaiwen, He, Yaoqin, Fu, Junchen and Jose, Joemon M. ORCID: https://orcid.org/0000-0001-9228-1759
(2025)
The DenseCap-Guided Attention Network For Image-Text Matching.
In: ACM Web Conference 2025, Sydney, Australia, 28 April - 2 May 2025,
pp. 2153-2160.
ISBN 9798400713316
(doi: 10.1145/3701716.3717564)
Liu, Zhiyu, Fu, Junchen, Zheng, Kaiwen and Jose, Joemon M. ORCID: https://orcid.org/0000-0001-9228-1759
(2025)
Exploring Multimodal Pre-trained Models for Speech Emotion Recognition.
In: ACM Web Conference 2025, Sydney, Australia, 28 April - 2 May 2025,
pp. 2176-2180.
ISBN 9798400713316
(doi: 10.1145/3701716.3717561)