ICAR

Interactions Corpus Apprentissage Représentations
14 Projects, page 1 of 3
  • Funder: French National Research Agency (ANR) Project Code: ANR-15-CE38-0011
    Funder Contribution: 635,186 EUR

    The SoSweet project focuses on the synchronic variation and diachronic evolution of the variety of French used on Twitter. The Web has entered all areas of our social life. Since language is central to our social interactions, it is legitimate to ask how the Web has become a factor acting on language. This question is all the more timely as the recent rise of novel digital services opens up new areas of expression, which support new linguistic behaviors. In particular, social media such as Twitter provide channels of communication through which speakers/writers use their language in ways that differ from standard written and oral forms. The result is the emergence of new varieties of language. A characteristic of these varieties is that they exhibit large variability among communities of speakers and high innovation rates. A scientific description must take this variability into account and explain how social forces and technical constraints regulate its dynamics. The main goal of SoSweet is to provide a detailed account of the links between linguistic variation and social structure on Twitter, both synchronically and diachronically. Through this specific example, and aware of its bias, we aim to provide a more detailed understanding of the dynamic links between individuals, social structure, and language variation and change. Traditional methods are not suitable for addressing these questions. First, Twitter requires redefining fundamental concepts such as “addressee” or the public/private communication distinction. Second, while sociolinguistic studies are based on small samples, we will base our analysis on a corpus of 500 million tweets combined with the social network of the 10 million users who authored these tweets, complemented by socio-demographic data. This mass of data leads us to rely heavily on computational methods from different areas.
The SoSweet project will therefore adopt a strongly interdisciplinary position, at the crossroads of social media linguistics, sociolinguistics, natural language processing (NLP) and network science. Existing NLP tools are designed for standard forms of language and exhibit a drastic loss of accuracy when applied to social media varieties. To define appropriate tools, descriptions of these varieties are needed, yet producing those descriptions itself requires tools. We will address this circularity in an interdisciplinary way, by working simultaneously on linguistic description and on NLP tool development. For its part, network science provides us with tools for studying massive data from complex networks of users, through graph theory and computational modeling. The scientific program of SoSweet has been designed to favor interdisciplinary work, as the four work packages (management, data collection and enrichment, variation and evolution analysis, outreach) involve all partners. The project will last 48 months. It involves four teams, each a leader in its own field of research. The principal investigator, Icar, specializes in corpus linguistics and computer-mediated interaction. Icar will carry out the tasks of unifying linguistic evidence (empirical and theoretical) with social clues (extracted from a massive network of sociological relations). Lidilem is in charge of adapting the sociolinguistic framework to the case of variation and communication on Twitter. Alpage, specialized in natural language processing, takes care of the linguistic enrichment part, which provides the other partners with normalized and structurally enriched forms of text. Alpage is also responsible for providing a distributional analysis of our corpus, by means of various forms of word clustering, in order to define sociolinguistic variants in the tweets.
Inria DANTE, specialized in the exploration of massive graph structures, will lead the crucial network analysis and will work on jointly integrating the sociological network and the linguistic distributional network of lexical relations.

  • Funder: French National Research Agency (ANR) Project Code: ANR-21-CE38-0005
    Funder Contribution: 210,166 EUR

    The C-maphore project aims to create innovative learning materials to improve phonemic awareness as well as reading and writing in a native language (L1) and a foreign language (L2), through in-person and online teaching. We will test the effect of 1) personalized, manipulable external representations of phonemes made possible by digital tools and 2) prototypes for explicit phonemic awareness tasks and the improvement of phonographemic awareness. This project meets the need for learning materials expressed by teachers and speech specialists, a need highlighted by the pandemic lockdown. The experiment will allow us to improve the tools, but also to collect data that we will analyze to better understand the role of external representations, taking into account the state of the art on cognition (synaesthesia, mirror neurons) and the processes at work when learners discover the writing system of a language.

  • Funder: French National Research Agency (ANR) Project Code: ANR-20-CE38-0009
    Funder Contribution: 564,805 EUR

    The MOBILES project aims to document, understand and support the spatial and language learning practices of international students hosted in French higher education. The originality of the project lies in analysing the learning process during a long-term immersion stay, through the lens of spatial practices using digital tools. The project will (1) analyse the students’ spatial practices, i.e. shed light on the learning opportunities offered by the context; (2) design a map of the city as it is practiced, by means of a cartographic interface that allows heterogeneous sources of data to be combined and explored both quantitatively and qualitatively; (3) examine ways in which recommendation systems based on users’ participation can be set up to support the learning goals.

  • Funder: French National Research Agency (ANR) Project Code: ANR-19-CE27-0024
    Funder Contribution: 600,767 EUR

    The LiPoL research program aims to exploit the potential of the edition of the Tale of Baybars (Presses de l'Ifpo, 2000-2020, 18 volumes) in order to give new impetus to research on popular Levantine literature and on Middle Arabic, its dominant linguistic register. By relying on "popular culture and its linguistic, artistic and literary expressions" as a structuring theme, LiPoL brings together four French institutional partners, including an Institute for Research Abroad. It aims to create a new dynamic capable of accompanying the disciplinary evolution that Arab and Islamic studies will experience following the deep socio-political changes experienced by the countries of the Middle East since the beginning of 2010. Organized along four axes (Digital, Language, Society and Aesthetics), it brings together researchers from different fields of the Humanities in order to study the linguistic, literary, historical and cultural dimensions of the text together. The program includes a major digital component: both the critical edition of the Tale and the handwritten notebooks of storytellers on which it relies (about 60 000 folios digitized so far) will be made available online to the scientific community in open access. Composed of over ten million characters, the Tale of Baybars is by far the largest Middle Arabic corpus ever published, which makes it a unique object of research in the field of Arabic studies. By contributing to the safeguarding of an endangered heritage, this digital collection of manuscripts should serve as the basis for a Digital Library of Popular Levantine Literature, which will be expanded as new handwritten notebooks from the Near East come to light. A collaborative work tool, to be trialed during the first 24 months of the project, will allow researchers to submit translations or comments online while participating in the editing of these manuscripts.

  • Funder: French National Research Agency (ANR) Project Code: ANR-08-FRAL-0006
    Funder Contribution: 160,000 EUR

    The two largest machine-readable text corpora of Medieval French have been built in two independent, though cooperating, projects in France (ENS-LSH Lyon: ICAR, UMR 5191) and Germany (ILR, University of Stuttgart). The central theme of this project is the syntactic annotation of both corpora in a joint project, according to shared principles, so that researchers can base their research on both resources. The project will use and, if necessary, develop tools for automatic and manual annotation at both the morphological and syntactic levels. It will keep tools and resources as system-independent as possible and thus ensure their reusability in future research. The project will be an important contribution to closing the gap between English and French medieval corpus resources and will have an important impact on future work in diachronic syntax. The collaborating scholars world-wide will embed the annotation project in their diachronic linguistic research on various syntactic subjects. The added value of the German-French cooperation over two national projects is the simultaneous, coordinated work on two different corpora according to shared standards. Only an intense, well-structured cooperation between both groups will provide a resource on such a scale and with such an extensive chronological span (7 centuries). The ability to process such a large, syntactically enriched corpus will make it possible to discover many new evolutionary processes and to verify existing theories.
