CNRS DELEGATION REGIONALE PARIS A

Country: France

17 Projects, page 1 of 4
  • Funder: French National Research Agency (ANR) Project Code: ANR-06-BLAN-0032
    Funder Contribution: 207,700 EUR
  • Funder: French National Research Agency (ANR) Project Code: ANR-06-ROBO-0003
    Funder Contribution: 537,153 EUR
  • Funder: French National Research Agency (ANR) Project Code: ANR-09-BLAN-0170
    Funder Contribution: 723,252 EUR

    PERSEE: "PERceptual Scheme for 2D&3D vidE(E)o Coding"

    Context and positioning: Our digital age has seen a large deployment of video coding standards. The latest standard, H.264/AVC, follows a classical coding model. At the same time, research is receiving new impetus from the emergence of formats beyond HDTV, aimed at immersive displays that allow panoramic viewing, interactivity and 3DTV (omni-directional video, free-viewpoint video, and stereoscopic or multi-view video). A quantum leap in subjective quality is required before these formats can enable a truly immersive experience for the viewer. In this competitive context, the project aims first at advancing knowledge in perceptual modeling, video processing and coding, and computer vision, and second at developing a content-based and perceptually driven representation and coding paradigm for 2D and 3D visual content.

    Scientific and technical description: Efforts are currently dedicated to the compression of multi-view sequences, and ISO/MPEG has defined the MVC (Multi-View Coding) format, which aims at capturing the redundancy between the different views. Apart from insufficient coding gain, MVC suffers from functional limitations when virtual viewpoints have to be rendered at the receiver. It should also be noted that visual quality is an even more crucial issue in 3D video than in 2D video. The targeted "perceptually friendly" 2D and 3D representation and coding paradigm will build upon different models and techniques that have evolved in recent years and require further research. To achieve the required next-generation coding performance, we propose to work towards a content-based and perceptually driven representation and coding paradigm that combines perceptual models, texture analysis/synthesis, waveform coding, and a rate-visual quality optimization framework.
    The first scientific objective of the project will thus be to define a representation of 2D and 3D visual content that takes the best possible account of perceptual models and perceptual quality metrics, rather than relying on the ubiquitous mean square error distortion measure. The resulting framework would lay the foundations for an efficient perceptual coding scheme for 2D and 3D (multi-view plus depth) visual content.

    Organisation and programme: The project involves a close collaboration between four complementary academic partners with recognized expertise in the field: IRCCyN-Nantes, Image and Video-Communication team (perceptual modeling); INRIA-Rennes, TEMICS team (spatio-temporal texture analysis); IETR-Rennes, IMAGE team (3D content representation and compression); and LTCI-TelecomParisTech, Multimedia group (2D content representation and compression). The project is structured in 7 main tasks: 1. Coordination; 2. Perceptual modeling for 2D and 3D video coding; 3. Spatio-temporal texture analysis and synthesis; 4. 2D content representation and compression; 5. 3D content representation and compression; 6. Integration within a common software platform; 7. Perceptual and subjective performance assessment.

    Results exploitation: The scientific results will be disseminated as soon as they are available. A further goal is to exploit the results through future collaborations with other partners of the 'pôles de compétitivité' (competitiveness clusters). The framework, integrating the most promising tools, is intended to lay the foundations for technologies that could then be presented as candidates for international standardization (namely in response to the forthcoming ISO/ITU call, which is expected to launch the H.265 standardization phase).

  • Funder: French National Research Agency (ANR) Project Code: ANR-09-BLAN-0370
    Funder Contribution: 299,863 EUR

    The BINAAHR (BINaural Active Audition for Humanoid Robots) project proposes a psychoacoustic approach to active audition and its validation on a humanoid robotic platform. Algorithms and hardware and software solutions will be developed to define auditory functions that integrate the robot's motion and proprioception, first from a robotics approach and then according to the theory of sensorimotor contingencies developed by some members of the consortium. The aim is to evaluate these approaches in terms of the robustness and accuracy of the resulting solutions in uncontrolled, variable and dynamic robotics environments. On the basis of these functions, modalities for auditory human-robot interaction will also be designed and integrated.

  • Funder: French National Research Agency (ANR) Project Code: ANR-09-BLAN-0171
    Funder Contribution: 361,518 EUR

    Despite factors that at first glance might make a language appear less important, and thus unnecessary as a target for human language technologies (HLT), there are good reasons for developing speech recognition systems for literally all of the world's languages. First, the diversity of languages is the basis of the world's rich cultural diversity. In today's world, however, languages are frequently disappearing. The ongoing extinction of many languages is caused in part by a switch to more prevalent languages that might give their speakers an economic advantage. The lack of HLT systems for these languages accelerates their extinction, whereas HLT could help to stop this trend by making less prevalent languages more attractive to their original speakers. A second reason why HLT should be available for all languages is that the political impact of a language can be very volatile. In today's world, language is one of the few remaining barriers that hinder human-to-human interaction. Events such as armed conflicts or natural disasters might make it important to be able to communicate with speakers of a less prevalent language, e.g. for humanitarian workers in a disaster area. Here, readily available technology such as speech translation systems can be highly beneficial. Such technology might be far from perfect, but when the alternative in an emergency is having no translation system at all for an unknown language, an imperfect system will be of great use. Therefore, HLT needs to be developed especially for under-resourced languages. Nowadays, almost all techniques and methods in spoken language technologies, in particular automatic speech recognition (ASR) systems, use statistical approaches.
However, given the statistical nature of these methods, a large amount of resources (vocabularies, text corpora, transcribed speech corpora, phonetic dictionaries) is required to train models and to test the performance of the systems. Consequently, building an ASR system for a new language currently requires a large speech corpus containing hours of signals recorded by hundreds of speakers (for acoustic modeling) and a text corpus with millions of words (for language modeling). These crucial resources are not directly available for under-resourced languages. Thus, a methodology for building them rapidly is necessary and, in the meantime, so are strategies that exploit a minimal amount of resources. From a scientific point of view, the interest and originality of this project lie in proposing viable innovative methods that go far beyond the simple retraining or adaptation of acoustic and linguistic models. Consequently, a significant breakthrough is needed to develop ASR systems for under-resourced languages. For instance, we plan to question the use of the word as the fundamental unit for language modeling: low-complexity language models could be obtained by using sub-word units (morphemes, syllables or even characters), which might be of interest when little training data is available. Concerning acoustic modeling, the originality of this project lies in the proposal of large-coverage multilingual acoustic models based on our knowledge of the world's speech sound systems. Another goal is to explore more 'language-independent' ASR systems that would adapt themselves (with no or very little supervision) to the audio stream at their input. From an operational point of view, this project aims at providing a free, open-source ASR development kit for under-resourced languages.
This goal is realistic since elements of it have already been developed in preliminary form by some partners of this consortium: text data collection and filtering tools (LIG, LIA), ASR training and decoding tools (LIA), and tools for phone mapping from a source to a target language (LIG). We plan to distribute and evaluate such a development kit by deploying ASR systems for new under-resourced languages with very poor resources (Khmer and Lao, for instance). If successful, this project would lead to a dynamic user group for our development kit, composed of research teams or individuals doing ASR research for their own language. It is also important to note that some under-resourced languages could show very strong economic potential as they develop: Bengali, Malay and Vietnamese, for instance, are among the 20 most spoken languages in the world. Some other languages may be of great interest for governmental or non-governmental projects involving global security (see the example of Iraqi dialect in US DARPA projects) or humanitarian issues. Finally, with the aim of saving some endangered languages (some of them mostly spoken and not written), the ability to rapidly develop ASR systems to transcribe them is an important step towards their preservation and would facilitate access to audio content in these languages (notably languages from Africa).
