TY - JOUR
AU - Blandfort, Philipp
AU - Patton, Desmond U.
AU - Frey, William R.
AU - Karaman, Svebor
AU - Bhargava, Surabhi
AU - Lee, Fei-Tzin
AU - Varia, Siddharth
AU - Kedzie, Chris
AU - Gaskell, Michael B.
AU - Schifanella, Rossano
AU - McKeown, Kathleen
AU - Chang, Shih-Fu
PY - 2019/07/06
Y2 - 2024/03/28
TI - Multimodal Social Media Analysis for Gang Violence Prevention
JF - Proceedings of the International AAAI Conference on Web and Social Media
JA - ICWSM
VL - 13
IS - 01
SE - Full Papers
DO - 10.1609/icwsm.v13i01.3214
UR - https://ojs.aaai.org/index.php/ICWSM/article/view/3214
SP - 114-124
AB - <p>Gang violence is a severe issue in major cities across the U.S. and recent studies have found evidence of social media communications that can be linked to such violence in communities with high rates of exposure to gang activity. In this paper we partnered computer scientists with social work researchers, who have domain expertise in gang violence, to analyze how public tweets with images posted by youth who mention gang associations on Twitter can be leveraged to automatically detect psychosocial factors and conditions that could potentially assist social workers and violence outreach workers in prevention and early intervention programs. To this end, we developed a rigorous methodology for collecting and annotating tweets. We gathered 1,851 tweets and accompanying annotations related to visual concepts and the <em>psychosocial codes</em>: <em>aggression</em>, <em>loss</em>, and <em>substance use</em>. These codes are relevant to social work interventions, as they represent possible pathways to violence on social media. We compare various methods for classifying tweets into these three classes, using only the text of the tweet, only the image of the tweet, or both modalities as input to the classifier. In particular, we analyze the usefulness of mid-level visual concepts and the role of different modalities for this tweet classification task. 
Our experiments show that individually, text information dominates classification performance of the <em>loss</em> class, while image information dominates the <em>aggression</em> and <em>substance use</em> classes. Our multimodal approach provides a very promising improvement (18% relative in mean average precision) over the best single modality approach. Finally, we also illustrate the complexity of understanding social media data and elaborate on open challenges. The annotated dataset will be made available for research with strong ethical protection mechanisms.</p>
ER -