FURI | Summer 2020

Real Time Multimodal Classification for Social Media Notifications

The aim of this research is to create a model that takes an image, text, or both as input and classifies the content of the message into categories in real time. Previous researchers have applied early, late, and common-space fusion to similar problems. In this work, when an image is provided, the model first generates a caption for it and appends that caption to the text input, whether or not any text was supplied. The combined text is then fed into a text classifier. The resulting model outperformed the text-only classifier on inputs containing images of everyday life. However, it performed worse on creepy or gory images, because the caption generator was not trained on that kind of content.
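The caption-then-classify flow described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual implementation: the function names, the `toy_captioner` and `toy_text_classifier` stand-ins, and the category labels are all hypothetical, and a real system would plug in a trained caption generator and text classifier.

```python
from typing import Callable, Optional


def build_classifier_input(
    text: Optional[str],
    image: Optional[bytes],
    captioner: Callable[[bytes], str],
) -> str:
    """Combine the message text with a generated image caption, if any."""
    parts = []
    if text and text.strip():
        parts.append(text.strip())
    if image is not None:
        # The caption is appended whether or not text was supplied.
        parts.append(captioner(image).strip())
    return " ".join(parts)


def classify_message(
    text: Optional[str],
    image: Optional[bytes],
    captioner: Callable[[bytes], str],
    text_classifier: Callable[[str], str],
) -> str:
    """Route image, text, or both through a single text classifier."""
    combined = build_classifier_input(text, image, captioner)
    if not combined:
        raise ValueError("at least one of text or image is required")
    return text_classifier(combined)


# Hypothetical stand-ins for the trained models (assumptions for illustration).
def toy_captioner(image: bytes) -> str:
    return "a dog playing in a park"


def toy_text_classifier(text: str) -> str:
    return "pets" if "dog" in text.lower() else "other"


if __name__ == "__main__":
    print(classify_message("look at this", b"fake-image", toy_captioner, toy_text_classifier))
    print(classify_message("meeting at noon", None, toy_captioner, toy_text_classifier))
```

Because the image only reaches the classifier through its caption, the classifier can be no better than the captions it receives. That is consistent with the weaker results reported on image types the caption generator was not trained on.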

Student researcher

Mertay Dayanc

Computer science

Hometown: Ankara, Incek, Turkey

Graduation date: Fall 2021