From Materialization to (Globalizing) Digitalization / Part 2

7.02.2019
10:00

Talk: Antoine Mazières and Telmo Menezes (Centre Marc Bloch), research on "Computational Intelligence for the Social Sciences"

This session continues the discussion "From Materialization to (Globalizing) Digitalization".

Over the last few decades, information and communication technologies have dramatically changed public spaces. Digital media now play a central role across different scales, from the political communications of world leaders and news reporting to the personal sphere of family, friendship and dating. The new digital public space is a highly complex socio-technical system that emerged in a decentralized way. Its dynamics are far from being fully understood, but it is clearly already causing profound social changes.

While empirical data used to be scarce and required great effort to obtain, we now have the opposite problem of being flooded with it. One thing has not changed: the full richness of human interactions is contained in texts, their meaning and their context. On the one hand, it is impossible for researchers to read and analyze even a small fraction of such corpora. On the other hand, full human-level understanding of text is still a distant dream, even considering recent advances in Artificial Intelligence. It is, however, possible to create scientific instruments that bridge this gap, using AI to extract and structure some of the meaning and providing researchers with insights that can guide their exploration and help them validate their hypotheses.

In that sense, Telmo will present a project [1] he has been developing for several years: a knowledge representation formalism that we refer to as semantic hypergraphs, along with a framework of algorithms to transform text into this representation. We demonstrate knowledge inference from a multi-year corpus of international news, identifying claims and expressions of conflicts, along with their participating actors and topics.
We show how this higher-order knowledge can be used to provide effective overviews of large natural language corpora. We demonstrate actor-centric summarization of conflicts, comparison of the topics of claims between actors, and networks of conflicts between actors in the context of a given topic.

Then, Antoine will present the Namograph project [2], a textual analysis framework for surnames built to estimate their geographical origins. Applying this model in the context of discrimination studies in France, we were able to make several estimates of how representative the surname origins of members of various social and occupational groups in France are. This presentation will guide the audience through the process of data analysis and its limitations in reaching a proper qualitative assessment of its original intent.

References for the presented research:

[1] Menezes, T. and Roth, C. (now). GraphBrain. http://graphbrain.org/

[2] Mazieres, A. and Roth, C. (2018). Large-scale diversity estimation through surname origin inference. Bulletin of Sociological Methodology, 139. https://namograph.antonomase.fr/