Institut de Recherche en Informatique et Systèmes Aléatoires
facilityRennes, Brittany, France
Research output, citation impact, and the most-cited recent papers from Institut de Recherche en Informatique et Systèmes Aléatoires (France). Aggregated across the NobleBlocks index of 300M+ scholarly works.
Top-cited papers from Institut de Recherche en Informatique et Systèmes Aléatoires
The aim of this paper is to address recognition of natural human actions in diverse and realistic video settings. This challenging but important subject has mostly been ignored in the past due to several problems one of which is the lack of realistic and annotated video datasets. Our first contribution is to address this limitation and to investigate the use of movie scripts for automatic annotation of human actions in videos. We evaluate alternative methods for action retrieval from scripts and show benefits of a text-based classifier. Using the retrieved action samples for visual learning, we next turn to the problem of action classification in video. We present a new method for video classification that builds upon and extends several recent ideas including local space-time features, space-time pyramids and multi-channel non-linear SVMs. The method is shown to improve state-of-the-art results on the standard KTH action dataset by achieving 91.8% accuracy. Given the inherent problem of noisy labels in automatic annotation, we particularly investigate and show high tolerance of our method to annotation errors in the training set. We finally apply the method to learning and classifying challenging action classes in movies and show promising results.
In this paper, we discuss the evaluation of blind audio source separation (BASS) algorithms. Depending on the exact application, different distortions can be allowed between an estimated source and the wanted true source. We consider four different sets of such allowed distortions, from time-invariant gains to time-varying filters. In each case, we decompose the estimated source into a true source part plus error terms corresponding to interferences, additive noise, and algorithmic artifacts. Then, we derive a global performance measure using an energy ratio, plus a separate performance measure for each error term. These measures are computed and discussed on the results of several BASS problems with various difficulty levels
In this paper we review classification algorithms used to design brain-computer interface (BCI) systems based on electroencephalography (EEG). We briefly present the commonly employed algorithms and describe their critical properties. Based on the literature, we compare them in terms of performance and provide guidelines to choose the suitable classification algorithm(s) for a specific BCI.
This paper is the first of a two-part series on the topic of visual servo control using computer vision data in the servo loop to control the motion of a robot. In this paper, we describe the basic techniques that are by now well established in the field. We first give a general overview of the formulation of the visual servo control problem. We then describe the two archetypal visual servo control schemes: image-based and position-based visual servo control. Finally, we discuss performance and stability issues that pertain to these two schemes, motivating the second article in the series, in which we consider advanced techniques
A wavelet network concept, which is based on wavelet transform theory, is proposed as an alternative to feedforward neural networks for approximating arbitrary nonlinear functions. The basic idea is to replace the neurons by ;wavelons', i.e., computing units obtained by cascading an affine transform and a multidimensional wavelet. Then these affine transforms and the synaptic weights must be identified from possibly noise corrupted input/output data. An algorithm of backpropagation type is proposed for wavelet network training, and experimental results are reported.
The determination of upper bounds on execution times, commonly called worst-case execution times (WCETs), is a necessary step in the development and validation process for hard real-time systems. This problem is hard if the underlying processor architecture has components, such as caches, pipelines, branch prediction, and other speculative components. This article describes different approaches to this problem and surveys several commercially available tools 1 and research prototypes.
Tractography based on non-invasive diffusion imaging is central to the study of human brain connectivity. To date, the approach has not been systematically validated in ground truth studies. Based on a simulated human brain data set with ground truth tracts, we organized an open international tractography challenge, which resulted in 96 distinct submissions from 20 research groups. Here, we report the encouraging finding that most state-of-the-art algorithms produce tractograms containing 90% of the ground truth bundles (to at least some extent). However, the same tractograms contain many more invalid than valid bundles, and half of these invalid bundles occur systematically across research groups. Taken together, our results demonstrate and confirm fundamental ambiguities inherent in tract reconstruction based on orientation information alone, which need to be considered when interpreting tractography and connectivity results. Our approach provides a novel framework for estimating reliability of tractography and encourages innovation to address its current limitations.
Vision-based control in robotics based on considering a vision system as a specific sensor dedicated to a task and included in a control servo loop is described. Once the necessary modeling stage is performed, the framework becomes one of automatic control, and stability and robustness questions arise. State-of-the-art visual servoing is reviewed, and the basic concepts for modeling the concerned interactions are given. The interaction screw is thus defined in a general way, and the application to images follows. Starting from the concept of task function, the general framework of the control is described, and stability results are recalled. The concept of the hybrid task is presented and then applied to visual sensors. Simulation and experimental results are presented, and guidelines for future work are drawn in the conclusion.< <ETX xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">></ETX>
A critical issue in image restoration is the problem of noise removal while keeping the integrity of relevant image information. Denoising is a crucial step to increase image quality and to improve the performance of all the tasks needed for quantitative imaging analysis. The method proposed in this paper is based on a 3-D optimized blockwise version of the nonlocal (NL)-means filter (Buades, , 2005). The NL-means filter uses the redundancy of information in the image under study to remove the noise. The performance of the NL-means filter has been already demonstrated for 2-D images, but reducing the computational burden is a critical aspect to extend the method to 3-D images. To overcome this problem, we propose improvements to reduce the computational complexity. These different improvements allow to drastically divide the computational time while preserving the performances of the NL-means filter. A fully automated and optimized version of the NL-means filter is then presented. Our contributions to the NL-means filter are: 1) an automatic tuning of the smoothing parameter; 2) a selection of the most relevant voxels; 3) a blockwise implementation; and 4) a parallelized computation. Quantitative validation was carried out on synthetic datasets generated with BrainWeb (Collins, , 1998). The results show that our optimized NL-means filter outperforms the classical implementation of the NL-means filter, as well as two other classical denoising methods [anisotropic diffusion (Perona and Malik, 1990)] and total variation minimization process (Rudin, , 1992) in terms of accuracy (measured by the peak signal-to-noise ratio) with low computation time. Finally, qualitative results on real data are presented.
DNA metabarcoding offers new perspectives in biodiversity research. This recently developed approach to ecosystem study relies heavily on the use of next-generation sequencing (NGS) and thus calls upon the ability to deal with huge sequence data sets. The obitools package satisfies this requirement thanks to a set of programs specifically designed for analysing NGS data in a DNA metabarcoding context. Their capacity to filter and edit sequences while taking into account taxonomic annotation helps to set up tailor-made analysis pipelines for a broad range of DNA metabarcoding applications, including biodiversity surveys or diet analyses. The obitools package is distributed as an open source software available on the following website: http://metabarcoding.org/obitools. A Galaxy wrapper is available on the GenOuest core facility toolshed: http://toolshed.genouest.org.
This article is the second of a two-part tutorial on visual servo control. In this tutorial, we have only considered velocity controllers. It is convenient for most of classical robot arms. However, the dynamics of the robot must of course be taken into account for high speed task, or when we deal with mobile nonholonomic or underactuated robots. As for the sensor, geometrical features coming from a classical perspective camera is considered. Features related to the image motion or coming from other vision sensors necessitate to revisit the modeling issues to select adequate visual features. Finally, fusing visual features with data coming from other sensors at the level of the control scheme will allow to address new research topics
A discrete-time analysis of the orthogonal frequency division multiplex/offset QAM (OFDM/OQAM) multicarrier modulation technique, leading to a modulated transmultiplexer, is presented. The conditions of discrete orthogonality are established with respect to the polyphase components of the OFDM/OQAM prototype filter, which is assumed to be symmetrical and with arbitrary length. Fast implementation schemes of the OFDM/OQAM modulator and demodulator are provided, which are based on the inverse fast Fourier transform. Non-orthogonal prototypes create intersymbol and interchannel interferences (ISI and ICI) that, in the case of a distortion-free transmission, are expressed by a closed-form expression. A large set of design examples is presented for OFDM/OQAM systems with the number of subcarriers going from four up to 2048, which also allows a comparison between different approaches to get well-localized prototypes.
The purpose of this correspondence is to generalize a result by Donoho and Huo and Elad and Bruckstein on sparse representations of signals in a union of two orthonormal bases for R/sup N/. We consider general (redundant) dictionaries for R/sup N/, and derive sufficient conditions for having unique sparse representations of signals in such dictionaries. The special case where the dictionary is given by the union of L/spl ges/2 orthonormal bases for R/sup N/ is studied in more detail. In particular, it is proved that the result of Donoho and Huo, concerning the replacement of the /spl lscr//sup 0/ optimization problem with a linear programming problem when searching for sparse representations, has an analog for dictionaries that may be highly redundant.
We propose an approach to vision-based robot control, called 2 1/2 D visual servoing, which avoids the respective drawbacks of classical position-based and image-based visual servoing. Contrary to the position-based visual servoing, our scheme does not need any geometric three-dimensional model of the object. Furthermore and contrary to image-based visual servoing, our approach ensures the convergence of the control law in the whole task space. 2 1/2 D visual servoing is based on the estimation of the partial camera displacement from the current to the desired camera poses at each iteration of the control law. Visual features and data extracted from the partial displacement allow us to design a decoupled control law controlling the six camera DOFs. The robustness of our visual servoing scheme with respect to camera calibration errors is also analyzed: the necessary and sufficient conditions for local asymptotic stability are easily obtained. Then, due to the simple structure of the system, sufficient conditions for global asymptotic stability are established. Finally, experimental results with an eye-in-hand robotic system confirm the improvement in the stability and convergence domain of the 2 1/2 D visual servoing with respect to classical position-based and image-based visual servoing.
The Critical Assessment of Metagenome Interpretation (CAMI) community initiative presents results from its first challenge, a rigorous benchmarking of software for metagenome assembly, binning and taxonomic profiling. Methods for assembly, taxonomic profiling and binning are key to interpreting metagenome data, but a lack of consensus about benchmarking complicates performance assessment. The Critical Assessment of Metagenome Interpretation (CAMI) challenge has engaged the global developer community to benchmark their programs on highly complex and realistic data sets, generated from ∼700 newly sequenced microorganisms and ∼600 novel viruses and plasmids and representing common experimental setups. Assembly and genome binning programs performed well for species represented by individual genomes but were substantially affected by the presence of related strains. Taxonomic profiling and binning programs were proficient at high taxonomic ranks, with a notable performance decrease below family level. Parameter settings markedly affected performance, underscoring their importance for program reproducibility. The CAMI results highlight current challenges but also provide a roadmap for software selection to answer specific research questions.
Twelve years ago, Proceedings of the IEEE devoted a special section to the synchronous languages. This paper discusses the improvements, difficulties, and successes that have occured with the synchronous languages since then. Today, synchronous languages have been established as a technology of choice for modeling, specifying, validating, and implementing real-time embedded applications. The paradigm of synchrony has emerged as an engineer-friendly design method based on mathematically sound tools.
Domain adaptation is one of the most challenging tasks of modern data analytics. If the adaptation is done correctly, models built on a specific data representation become more robust when confronted to data depicting the same classes, but described by another observation system. Among the many strategies proposed, finding domain-invariant representations has shown excellent properties, in particular since it allows to train a unique classifier effective in all domains. In this paper, we propose a regularized unsupervised optimal transportation model to perform the alignment of the representations in the source and target domains. We learn a transportation plan matching both PDFs, which constrains labeled samples of the same class in the source domain to remain close during transport. This way, we exploit at the same time the labeled samples in the source and the distributions observed in both domains. Experiments on toy and challenging real visual adaptation examples show the interest of the method, that consistently outperforms state of the art approaches. In addition, numerical experiments show that our approach leads to better performances on domain invariant deep learning features and can be easily adapted to the semi-supervised case where few labeled samples are available in the target domain.
BACKGROUND: The prevailing paradigm of host-parasite evolution is that arms races lead to increasing specialisation via genetic adaptation. Insect herbivores are no exception and the majority have evolved to colonise a small number of closely related host species. Remarkably, the green peach aphid, Myzus persicae, colonises plant species across 40 families and single M. persicae clonal lineages can colonise distantly related plants. This remarkable ability makes M. persicae a highly destructive pest of many important crop species. RESULTS: To investigate the exceptional phenotypic plasticity of M. persicae, we sequenced the M. persicae genome and assessed how one clonal lineage responds to host plant species of different families. We show that genetically identical individuals are able to colonise distantly related host species through the differential regulation of genes belonging to aphid-expanded gene families. Multigene clusters collectively upregulate in single aphids within two days upon host switch. Furthermore, we demonstrate the functional significance of this rapid transcriptional change using RNA interference (RNAi)-mediated knock-down of genes belonging to the cathepsin B gene family. Knock-down of cathepsin B genes reduced aphid fitness, but only on the host that induced upregulation of these genes. CONCLUSIONS: Previous research has focused on the role of genetic adaptation of parasites to their hosts. Here we show that the generalist aphid pest M. persicae is able to colonise diverse host plant species in the absence of genetic specialisation. This is achieved through rapid transcriptional plasticity of genes that have duplicated during aphid evolution.
The BioMart Community Portal (www.biomart.org) is a community-driven effort to provide a unified interface to biomedical databases that are distributed worldwide. The portal provides access to numerous database projects supported by 30 scientific organizations. It includes over 800 different biological datasets spanning genomics, proteomics, model organisms, cancer data, ontology information and more. All resources available through the portal are independently administered and funded by their host organizations. The BioMart data federation technology provides a unified interface to all the available data. The latest version of the portal comes with many new databases that have been created by our ever-growing community. It also comes with better support and extensibility for data analysis and visualization tools. A new addition to our toolbox, the enrichment analysis tool is now accessible through graphical and web service interface. The BioMart community portal averages over one million requests per day. Building on this level of service and the wealth of information that has become available, the BioMart Community Portal has introduced a new, more scalable and cheaper alternative to the large data stores maintained by specialized organizations.
This paper presents an overview of a state-of-the-art text-independent speaker verification system. First, an introduction proposes a modular scheme of the training and test phases of a speaker verification system. Then, the most commonly speech parameterization used in speaker verification, namely, cepstral analysis, is detailed. Gaussian mixture modeling, which is the speaker modeling technique used in most systems, is then explained. A few speaker modeling alternatives, namely, neural networks and support vector machines, are mentioned. Normalization of scores is then explained, as this is a very important step to deal with real-world data. The evaluation of a speaker verification system is then detailed, and the detection error trade-off (DET) curve is explained. Several extensions of speaker verification are then enumerated, including speaker tracking and segmentation by speakers. Then, some applications of speaker verification are proposed, including on-site applications, remote applications, applications relative to structuring audio information, and games. Issues concerning the forensic area are then recalled, as we believe it is very important to inform people about the actual performance and limitations of speaker verification systems. This paper concludes by giving a few research trends in speaker verification for the next couple of years.