Laboratoire de Recherche en Informatique
Facility, Gif-sur-Yvette, France
Research output, citation impact, and the most-cited recent papers from Laboratoire de Recherche en Informatique (France). Aggregated across the NobleBlocks index of 300M+ scholarly works.
Top-cited papers from Laboratoire de Recherche en Informatique
We have previously shown correction of X-linked severe combined immunodeficiency [SCID-X1, also known as gamma chain (gamma(c)) deficiency] in 9 out of 10 patients by retrovirus-mediated gamma(c) gene transfer into autologous CD34 bone marrow cells. However, almost 3 years after gene therapy, uncontrolled exponential clonal proliferation of mature T cells (with gammadelta+ or alphabeta+ T cell receptors) has occurred in the two youngest patients. Both patients' clones showed retrovirus vector integration in proximity to the LMO2 proto-oncogene promoter, leading to aberrant transcription and expression of LMO2. Thus, retrovirus vector insertion can trigger deregulated premalignant cell proliferation with unexpected frequency, most likely driven by retrovirus enhancer activity on the LMO2 gene promoter.
Several recent advances to the state of the art in image classification benchmarks have come from better configurations of existing techniques rather than novel approaches to feature learning. Traditionally, hyper-parameter optimization has been the job of humans because they can be very efficient in regimes where only a few trials are possible. Presently, computer clusters and GPU processors make it possible to run more trials and we show that algorithmic approaches can find better results. We present hyper-parameter optimization results on tasks of training neural networks and deep belief networks (DBNs). We optimize hyper-parameters using random search and two new greedy sequential methods based on the expected improvement criterion. Random search has been shown to be sufficiently efficient for learning neural networks for several datasets, but we show it is unreliable for training DBNs. The sequential algorithms are applied to the most difficult DBN learning problems from [1] and find significantly better results than the best previously reported. This work contributes novel techniques for making response surface models $P(y|x)$ in which many elements of hyper-parameter assignment ($x$) are known to be irrelevant given particular values of other elements.
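For readers unfamiliar with the random search baseline mentioned in the abstract above, here is a minimal sketch. The search space, the parameter names (`learning_rate`, `n_hidden`) and the toy objective are hypothetical stand-ins chosen for illustration; they are not the paper's experimental setup, and the sequential expected-improvement methods are not shown.

```python
import math
import random


def random_search(objective, space, n_trials, seed=0):
    """Draw n_trials independent configurations and keep the best one."""
    rng = random.Random(seed)
    best_cfg, best_loss = None, math.inf
    for _ in range(n_trials):
        # Each hyper-parameter is sampled independently from its own prior.
        cfg = {name: sampler(rng) for name, sampler in space.items()}
        loss = objective(cfg)
        if loss < best_loss:
            best_cfg, best_loss = cfg, loss
    return best_cfg, best_loss


# Hypothetical search space: a log-uniform learning rate and an integer width.
space = {
    "learning_rate": lambda rng: 10 ** rng.uniform(-5, 0),
    "n_hidden": lambda rng: rng.randint(16, 1024),
}


def toy_objective(cfg):
    # Stand-in for a validation loss (assumption); smallest near
    # learning_rate = 1e-2 and n_hidden = 256.
    lr_term = (math.log10(cfg["learning_rate"]) + 2) ** 2
    width_term = (cfg["n_hidden"] / 256 - 1) ** 2
    return lr_term + width_term


best_cfg, best_loss = random_search(toy_objective, space, n_trials=200)
print(best_cfg, best_loss)
```

Because every trial is independent, the loop parallelizes trivially; the trade-off is that no trial learns from earlier ones, which is the gap sequential methods aim to close.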
DESCRIPTION: VARNA is a tool for the automated drawing, visualization and annotation of the secondary structure of RNA, designed as a companion software for web servers and databases. FEATURES: VARNA implements four drawing algorithms, supports input/output using the classic formats dbn, ct, bpseq and RNAML and exports the drawing as five picture formats, either pixel-based (JPEG, PNG) or vector-based (SVG, EPS and XFIG). It also allows manual modification and structural annotation of the resulting drawing using either an interactive point and click approach, within a web server or through command-line arguments. AVAILABILITY: VARNA is a free software, released under the terms of the GPLv3.0 license and available at http://varna.lri.fr. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
We describe a new method for use in the process of co-designing technologies with users called technology probes. Technology probes are simple, flexible, adaptable technologies with three interdisciplinary goals: the social science goal of understanding the needs and desires of users in a real-world setting, the engineering goal of field-testing the technology, and the design goal of inspiring users and researchers to think about new technologies. We present the results of designing and deploying two technology probes, the messageProbe and the videoProbe, with diverse families in France, Sweden, and the U.S. We conclude with our plans for creating new technologies for and with families based on our experiences.
This article aims to provide an introductory survey on quantum random walks. Starting from a physical effect to illustrate the main ideas we will introduce quantum random walks, review some of their properties and outline their striking differences to classical walks. We will touch upon both physical effects and computer science applications, introducing some of the main concepts and language of present day quantum information science in this context. We will mention recent developments in this new area and outline some open questions.
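One of the differences from classical walks that the survey above refers to can be illustrated numerically. The sketch below simulates a discrete-time walk on a line with a Hadamard coin, a standard textbook example, and compares its spread with the $\sqrt{t}$ spread of an unbiased classical walk. The function names and the symmetric initial coin state are illustrative choices, not taken from the article.

```python
import math


def hadamard_walk(steps):
    """Return position probabilities after `steps` steps of a Hadamard walk."""
    size = 2 * steps + 1          # positions -steps..steps, origin at index `steps`
    h = 1 / math.sqrt(2)
    left = [0j] * size            # amplitudes with the coin pointing left
    right = [0j] * size           # amplitudes with the coin pointing right
    left[steps] = h               # symmetric initial coin state (|L> + i|R>)/sqrt(2)
    right[steps] = 1j * h
    for _ in range(steps):
        new_left = [0j] * size
        new_right = [0j] * size
        for pos in range(size):
            a, b = left[pos], right[pos]
            if a == 0 and b == 0:
                continue          # nothing to propagate from this site
            # Hadamard coin flip, then a shift conditioned on the coin.
            new_left[pos - 1] += h * (a + b)
            new_right[pos + 1] += h * (a - b)
        left, right = new_left, new_right
    return [abs(a) ** 2 + abs(b) ** 2 for a, b in zip(left, right)]


def spread(probs):
    """Standard deviation of the position distribution."""
    origin = (len(probs) - 1) // 2
    mean = sum(p * (i - origin) for i, p in enumerate(probs))
    second = sum(p * (i - origin) ** 2 for i, p in enumerate(probs))
    return math.sqrt(second - mean ** 2)


t = 100
print("quantum spread:  ", spread(hadamard_walk(t)))
print("classical spread:", math.sqrt(t))
```

The quantum spread grows linearly in the number of steps, whereas the classical one grows as its square root.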
Quantum random walks on graphs have been shown to display many interesting properties, including exponentially fast hitting times when compared with their classical counterparts. However, it is still unclear how to use these novel properties to gain an algorithmic speedup over classical algorithms. In this paper, we present a quantum search algorithm based on the quantum random-walk architecture that provides such a speedup. It will be shown that this algorithm performs an oracle search on a database of N items with $O(\sqrt{N})$ calls to the oracle, yielding a speedup similar to other quantum search algorithms. It appears that the quantum random-walk formulation has considerable flexibility, presenting interesting opportunities for development of other, possibly novel quantum algorithms.
We propose an end-to-end framework for the dense, pixelwise classification of satellite imagery with convolutional neural networks (CNNs). In our framework, CNNs are directly trained to produce classification maps out of the input images. We first devise a fully convolutional architecture and demonstrate its relevance to the dense classification problem. We then address the issue of imperfect training data through a two-step training approach: CNNs are first initialized by using a large amount of possibly inaccurate reference data, and then refined on a small amount of accurately labeled data. To complete our framework, we design a multiscale neuron module that alleviates the common tradeoff between recognition and precise localization. A series of experiments show that our networks consider a large amount of context to provide fine-grained classification maps.
Phylogeny.fr, created in 2008, has been designed to facilitate the execution of phylogenetic workflows, and is nowadays widely used. However, since its development, user needs have evolved, new tools and workflows have been published, and the number of jobs has increased dramatically, thus promoting new practices, which motivated its refactoring. We developed NGPhylogeny.fr to be more flexible in terms of tools and workflows, easily installable, and more scalable. It integrates numerous tools in their latest version (e.g. TNT, FastME, MrBayes, etc.) as well as new ones designed in the last ten years (e.g. PhyML, SMS, FastTree, trimAl, BOOSTER, etc.). These tools cover a large range of usage (sequence searching, multiple sequence alignment, model selection, tree inference and tree drawing) and a large panel of standard methods (distance, parsimony, maximum likelihood and Bayesian). They are integrated in workflows, which have been already configured ('One click'), can be customized ('Advanced'), or are built from scratch ('A la carte'). Workflows are managed and run by an underlying Galaxy workflow system, which makes workflows more scalable in terms of number of jobs and size of data. NGPhylogeny.fr is deployable on any server or personal computer, and is freely accessible at https://ngphylogeny.fr.
The European Space Agency’s Planck satellite was launched on 14 May 2009, and has been surveying the sky stably and continuously since 13 August 2009. Its performance is well in line with expectations, and it will continue to gather scientific data until the end of its cryogenic lifetime. We give an overview of the history of Planck in its first year of operations, and describe some of the key performance aspects of the satellite. This paper is part of a package submitted in conjunction with Planck’s Early Release Compact Source Catalogue, the first data product based on Planck to be released publicly. The package describes the scientific performance of the Planck payload, and presents results on a variety of astrophysical topics related to the sources included in the Catalogue, as well as selected topics on diffuse emission.
Despite the impressive progress made by SAT solvers over the last ten years, only a few works have tried to understand why Conflict Directed Clause Learning (CDCL) algorithms are so strong and efficient on most industrial applications. We report in this work a key observation of CDCL solver behavior on this family of benchmarks and explain it by an unsuspected side effect of their particular clause learning scheme. This new paradigm allows us to solve an important, still open, question: how to design a fast, static, accurate, and predictive measure of the pertinence of newly learnt clauses. Our paper concludes with empirical evidence showing that our new learning scheme improves state-of-the-art results by an order of magnitude on both SAT and UNSAT industrial problems.
An all-sky map of the apparent temperature and optical depth of thermal dust emission is constructed using the Planck-HFI (350 μm to 2 mm) and IRAS (100 μm) data. The optical depth maps are correlated with tracers of the atomic (H I) and molecular gas traced by CO. The correlation with the column density of observed gas is linear in the lowest column density regions at high Galactic latitudes. At high $N_{\mathrm{H}}$, the correlation is consistent with that of the lowest $N_{\mathrm{H}}$, for a given choice of the CO-to-H$_2$ conversion factor. In the intermediate $N_{\mathrm{H}}$ range, a departure from linearity is observed, with the dust optical depth in excess of the correlation. This excess emission is attributed to thermal emission by dust associated with a dark gas phase, undetected in the available H I and CO surveys. The 2D spatial distribution of the dark gas in the solar neighbourhood ($|b_{II}| > 10°$) is shown to extend around known molecular regions traced by CO. The average dust emissivity in the H I phase in the solar neighbourhood is found to be $\tau_D/N_{\mathrm{H}}^{\mathrm{tot}} = 5.2 \times 10^{-26}$ cm$^2$ at 857 GHz. It follows roughly a power law distribution with a spectral index $\beta = 1.8$ all the way down to 3 mm, although the SED flattens slightly in the millimetre. Taking into account the spectral shape of the dust optical depth, the emissivity is consistent with previous values derived from FIRAS measurements at high latitudes within 10%. The threshold for the existence of the dark gas is found at $N_{\mathrm{H}}^{\mathrm{tot}} = (8.0 \pm 0.58) \times 10^{20}$ H cm$^{-2}$ ($A_V = 0.4$ mag). Assuming the same high frequency emissivity for the dust in the atomic and the molecular phases leads to an average $X_{\mathrm{CO}} = (2.54 \pm 0.13) \times 10^{20}$ H$_2$ cm$^{-2}$/(K km s$^{-1}$). The mass of dark gas is found to be 28% of the atomic gas and 118% of the CO emitting gas in the solar neighbourhood. The Galactic latitude distribution shows that its mass fraction is relatively constant down to a few degrees from the Galactic plane.
A possible explanation for the dark gas lies in a dark molecular phase, where H$_2$ survives photodissociation but CO does not. The observed transition for the onset of this phase in the solar neighbourhood ($A_V = 0.4$ mag) appears consistent with recent theoretical predictions. It is also possible that up to half of the dark gas could be in atomic form, due to optical depth effects in the H I measurements.
This book develops algorithms for using formulaic representations of real numbers. Techniques are illustrated by a corresponding program.
We present the first all-sky sample of galaxy clusters detected blindly by the Planck satellite through the Sunyaev-Zeldovich (SZ) effect from its six highest frequencies. This early SZ (ESZ) sample is comprised of 189 candidates, which have a high signal-to-noise ratio ranging from 6 to 29. Its high reliability (purity above 95%) is further ensured by an extensive validation process based on Planck internal quality assessments and by external cross-identification and follow-up observations. Planck provides the first measured SZ signal for about 80% of the 169 previously known ESZ clusters. Planck furthermore releases 30 new cluster candidates, amongst which 20 meet the ESZ signal-to-noise selection criterion. At the submission date, twelve of the 20 ESZ candidates were confirmed as new clusters, with eleven confirmed using XMM-Newton snapshot observations, most of them with disturbed morphologies and low luminosities. The ESZ clusters are mostly at moderate redshifts (86% with $z$ below 0.3) and span more than a decade in mass, up to the rarest and most massive clusters with masses above $1 \times 10^{15}\,M_\odot$.
Quantifying and comparing performance of numerical optimization algorithms is an important aspect of research in search and optimization. However, this task turns out to be tedious and difficult to realize even in the single-objective case – at least if one is willing to accomplish it in a scientifically decent and rigorous way. The COCO software used for the BBOB workshops (2009, 2010 and 2012) furnishes most of this tedious task for its participants: (1) choice and implementation of a well-motivated single-objective benchmark function testbed, (2) design of an experimental set-up, (3) generation of data output for (4) post-processing and presentation of the results in graphs and tables. What remains to be done for practitioners is to allocate CPU-time, run their favorite black-box real-parameter optimizer in a few dimensions a few hundreds of times and execute the provided post-processing scripts. Two testbeds are provided, noise-free functions and noisy functions, and practitioners can freely choose any or all of them. The post-processing provides a quantitative performance assessment in graphs and tables, categorized by function properties like multi-modality, ill-conditioning, global structure, separability, etc. This document describes the experimental setup and touches the question of how the results are displayed. The benchmark function definitions, source code of the benchmark functions and for the post-processing and this report are available at
Throughout the history of mathematics, concepts of number and space have been tightly intertwined. We tested the hypothesis that cortical circuits for spatial attention contribute to mental arithmetic in humans. We trained a multivariate classifier algorithm to infer the direction of an eye movement, left or right, from the brain activation measured in the posterior parietal cortex. Without further training, the classifier then generalized to an arithmetic task. Its left versus right classification could be used to sort out subtraction versus addition trials, whether performed with symbols or with sets of dots. These findings are consistent with the suggestion that mental arithmetic co-opts parietal circuitry associated with spatial coding.
Our goal is to define a list of tasks for graph visualization that has enough detail and specificity to be useful to 1) designers who want to improve their system and 2) evaluators who want to compare graph visualization systems. In this paper, we suggest a list of tasks we believe are commonly encountered while analyzing graph data. We define graph specific objects and demonstrate how all complex tasks could be seen as a series of low-level tasks performed on those objects. We believe that our taxonomy, associated with benchmark datasets and specific tasks, would help evaluators generalize results collected through a series of controlled experiments.
Quantifying and comparing performance of optimization algorithms is one important aspect of research in search and optimization. However, this task turns out to be tedious and difficult to realize even in the single-objective case – at least if one is willing to accomplish it in a scientifically decent and rigorous way. The BBOB 2009 workshop will furnish most of this tedious task for its participants: (1) choice and implementation of a well-motivated single-objective benchmark function testbed, (2) design of an experimental set-up, (3) generation of data output for (4) post-processing and presentation of the results in graphs and tables. What remains to be done for the participants is to allocate CPU-time, run their favorite black-box real-parameter optimizer in a few dimensions a few hundreds of times and execute the provided post-processing script afterwards. Two testbeds are provided, • noise-free functions
This paper analyzes a $(1,\lambda)$-Evolution Strategy, a randomized comparison-based adaptive search algorithm, optimizing a linear function with a linear constraint. The algorithm uses resampling to handle the constraint. Two cases are investigated: first the case where the step-size is constant, and second the case where the step-size is adapted using cumulative step-size adaptation. We exhibit for each case a Markov chain describing the behaviour of the algorithm. Stability of the chain implies, by applying a law of large numbers, either convergence or divergence of the algorithm. Divergence is the desired behaviour. In the constant step-size case, we show stability of the Markov chain and prove the divergence of the algorithm. In the cumulative step-size adaptation case, we prove stability of the Markov chain in the simplified case where the cumulation parameter equals 1, and discuss steps to obtain similar results for the full (default) algorithm where the cumulation parameter is smaller than 1. The stability of the Markov chain allows us to deduce geometric divergence or convergence, depending on the dimension, constraint angle, population size and damping parameter, at a rate that we estimate. Our results complement previous studies where stability was assumed.
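The constant step-size case described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the paper's exact formulation: the search space is two-dimensional, the objective to maximize is the first coordinate, a point `x` is taken to be feasible when `x[0]*cos(theta) + x[1]*sin(theta) <= 0`, and the function name and default parameters are hypothetical. Cumulative step-size adaptation is not shown.

```python
import math
import random


def one_comma_lambda_es(theta, lam=10, sigma=1.0, n_iter=200, seed=1):
    """(1,lambda)-ES with constant step size and resampling (2-D sketch)."""
    rng = random.Random(seed)
    normal = (math.cos(theta), math.sin(theta))   # constraint normal (assumption)

    def feasible(x):
        return x[0] * normal[0] + x[1] * normal[1] <= 0.0

    x = (-normal[0], -normal[1])                  # feasible starting point
    history = [x]
    for _ in range(n_iter):
        offspring = []
        for _ in range(lam):
            # Resampling: redraw the mutation until the offspring is feasible.
            while True:
                y = (x[0] + sigma * rng.gauss(0.0, 1.0),
                     x[1] + sigma * rng.gauss(0.0, 1.0))
                if feasible(y):
                    break
            offspring.append(y)
        # Comma selection: the parent is discarded and replaced by the
        # best offspring under the linear objective f(x) = x[0].
        x = max(offspring, key=lambda y: y[0])
        history.append(x)
    return history


history = one_comma_lambda_es(theta=math.pi / 4)
print("f at start:", history[0][0], "f at end:", history[-1][0])
```

In this sketch the objective value keeps growing over the run, which corresponds to the divergence the abstract calls the desired behaviour; how fast it grows depends on `theta`, `lam` and `sigma`.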
We present a model for building, visualizing, and interacting with multiscale representations of information visualization techniques using hierarchical aggregation. The motivation for this work is to make visual representations more visually scalable and less cluttered. The model allows for augmenting existing techniques with multiscale functionality, as well as for designing new visualization and interaction techniques that conform to this new class of visual representations. We give some examples of how to use the model for standard information visualization techniques such as scatterplots, parallel coordinates, and node-link diagrams, and discuss existing techniques that are based on hierarchical aggregation. This yields a set of design guidelines for aggregated visualizations. We also present a basic vocabulary of interaction techniques suitable for navigating these multiscale visualizations.