Article Info
Unifying Modalities: A Comparative Analysis of Bilinear Pooling Fusion Techniques for Multimodal Fake News Detection
Idza Aisara Norabid, Masita Jalil, Rozniza Ali, Ahed Mleih Al-Sbou, Noor Hafhizah Abd Rahim
dx.doi.org/10.17576/apjitm-2026-1501-22
Abstract
The spread of fake news on social media has created a need for multimodal detection techniques that combine textual and image data. This paper presents a comparative study of fusion approaches for fine-grained multimodal fake news detection, using BERT to represent textual features and ResNet models (ResNet18 and ResNet50) to represent image features. The comparison covers four fusion techniques: Multimodal Factorized Bilinear Pooling (MFB), Multimodal Compact Bilinear Pooling (MCB), and their self-attention-enhanced variants. Experiments are conducted on nine subsets of the Fakeddit dataset of varying sizes to evaluate how performance scales with dataset size. The findings show that bilinear pooling techniques achieve better accuracy, particularly on larger datasets. Among the approaches tested, MFB consistently achieves strong and stable performance, while MCB also performs well but scores slightly lower than MFB across all experiments. In addition, ResNet50 tends to outperform ResNet18 when trained on larger datasets. The main contribution of this study is an assessment of the benefits and limitations of the four fusion techniques, which provides useful guidelines for developing more reliable automated systems for multimodal fake news detection.
Keywords
Deep learning, Fake news detection, Fine-grained, Fusion technique, Multimodal
Area
Pattern Recognition

