GIPSA-Lab
facilityGrenoble, Auvergne-Rhône-Alpes, France
Research output, citation impact, and the most-cited recent papers from GIPSA-Lab (France). Aggregated across the NobleBlocks index of 300M+ scholarly works.
Top-cited papers from GIPSA-Lab
Imaging spectrometers measure electromagnetic energy scattered in their instantaneous field view in hundreds or thousands of spectral channels with higher spectral resolution than multispectral cameras. Imaging spectrometers are therefore often referred to as hyperspectral cameras (HSCs). Higher spectral resolution enables material identification via spectroscopic analysis, which facilitates countless applications that require identifying materials in scenarios unsuitable for classical spectroscopic analysis. Due to low spatial resolution of HSCs, microscopic material mixing, and multiple scattering, spectra measured by HSCs are mixtures of spectra of materials in a scene. Thus, accurate estimation requires unmixing. Pixels are assumed to be mixtures of a few materials, called endmembers. Unmixing involves estimating all or some of: the number of endmembers, their spectral signatures, and their abundances at each pixel. Unmixing is a challenging, ill-posed inverse problem because of model inaccuracies, observation noise, environmental conditions, endmember variability, and data set size. Researchers have devised and investigated many models searching for robust, stable, tractable, and accurate unmixing algorithms. This paper presents an overview of unmixing methods from the time of Keshava and Mustard's unmixing tutorial to the present. Mixing models are first discussed. Signal-subspace, geometrical, statistical, sparsity-based, and spatial-contextual unmixing algorithms are described. Mathematical problems and potential solutions are described. Algorithm characteristics are illustrated experimentally.
OBJECTIVE: Most current electroencephalography (EEG)-based brain-computer interfaces (BCIs) are based on machine learning algorithms. There is a large diversity of classifier types that are used in this field, as described in our 2007 review paper. Now, approximately ten years after this review publication, many new algorithms have been developed and tested to classify EEG signals in BCIs. The time is therefore ripe for an updated review of EEG classification algorithms for BCIs. APPROACH: We surveyed the BCI and machine learning literature from 2007 to 2017 to identify the new classification approaches that have been investigated to design BCIs. We synthesize these studies in order to present such algorithms, to report how they were used for BCIs, what were the outcomes, and to identify their pros and cons. MAIN RESULTS: We found that the recently designed classification algorithms for EEG-based BCIs can be divided into four main categories: adaptive classifiers, matrix and tensor classifiers, transfer learning and deep learning, plus a few other miscellaneous classifiers. Among these, adaptive classifiers were demonstrated to be generally superior to static ones, even with unsupervised adaptation. Transfer learning can also prove useful although the benefits of transfer learning remain unpredictable. Riemannian geometry-based methods have reached state-of-the-art performances on multiple BCI problems and deserve to be explored more thoroughly, along with tensor-based methods. Shrinkage linear discriminant analysis and random forests also appear particularly useful for small training samples settings. On the other hand, deep learning methods have not yet shown convincing improvement over state-of-the-art BCI methods. SIGNIFICANCE: This paper provides a comprehensive overview of the modern classification algorithms used in EEG-based BCIs, presents the principles of these methods and guidelines on when and how to use them. It also identifies a number of challenges to further advance EEG classification in BCI.
Hyperspectral remote sensing technology has advanced significantly in the past two decades. Current sensors onboard airborne and spaceborne platforms cover large areas of the Earth surface with unprecedented spectral, spatial, and temporal resolutions. These characteristics enable a myriad of applications requiring fine identification of materials or estimation of physical parameters. Very often, these applications rely on sophisticated and complex data analysis methods. The sources of difficulties are, namely, the high dimensionality and size of the hyperspectral data, the spectral mixing (linear and nonlinear), and the degradation mechanisms associated to the measurement process such as noise and atmospheric effects. This paper presents a tutorial/overview cross section of some relevant hyperspectral data analysis methods and algorithms, organized in six main topics: data fusion, unmixing, classification, target detection, physical parameter retrieval, and fast computing. In all topics, we describe the state-of-the-art, provide illustrative examples, and point to future challenges and research directions.
Convolutional neural networks (CNNs) have been attracting increasing attention in hyperspectral (HS) image classification due to their ability to capture spatial-spectral feature representations. Nevertheless, their ability in modeling relations between the samples remains limited. Beyond the limitations of grid sampling, graph convolutional networks (GCNs) have been recently proposed and successfully applied in irregular (or nongrid) data representation and analysis. In this article, we thoroughly investigate CNNs and GCNs (qualitatively and quantitatively) in terms of HS image classification. Due to the construction of the adjacency matrix on all the data, traditional GCNs usually suffer from a huge computational cost, particularly in large-scale remote sensing (RS) problems. To this end, we develop a new minibatch GCN (called miniGCN hereinafter), which allows to train large-scale GCNs in a minibatch fashion. More significantly, our miniGCN is capable of inferring out-of-sample data without retraining networks and improving classification performance. Furthermore, as CNNs and GCNs can extract different types of HS features, an intuitive solution to break the performance bottleneck of a single model is to fuse them. Since miniGCNs can perform batchwise network training (enabling the combination of CNNs and GCNs), we explore three fusion strategies: additive fusion, elementwise multiplicative fusion, and concatenation fusion to measure the obtained performance gain. Extensive experiments, conducted on three HS data sets, demonstrate the advantages of miniGCNs over GCNs and the superiority of the tested fusion strategies with regard to the single CNN or GCN models. The codes of this work will be available at https://github.com/danfenghong/IEEE_TGRS_GCN for the sake of reproducibility.
ISBN = 978-0-12-374726-6 http://www.elsevier.com/wps/find/bookdescription.cws_home/717222/description
Classification and identification of the materials lying over or beneath the earth's surface have long been a fundamental but challenging research topic in geoscience and remote sensing (RS), and have garnered a growing concern owing to the recent advancements of deep learning techniques. Although deep networks have been successfully applied in single-modality-dominated classification tasks, yet their performance inevitably meets the bottleneck in complex scenes that need to be finely classified, due to the limitation of information diversity. In this work, we provide a baseline solution to the aforementioned difficulty by developing a general multimodal deep learning (MDL) framework. In particular, we also investigate a special case of multi-modality learning (MML)-cross-modality learning (CML) that exists widely in RS image classification applications. By focusing on “what,” “where,” and “how” to fuse, we show different fusion strategies as well as how to train deep networks and build the network architecture. Specifically, five fusion architectures are introduced and developed, further being unified in our MDL framework. More significantly, our framework is not only limited to pixel-wise classification tasks but also applicable to spatial information modeling with convolutional neural networks (CNNs). To validate the effectiveness and superiority of the MDL framework, extensive experiments related to the settings of MML and CML are conducted on two different multimodal RS data sets. Furthermore, the codes and data sets will be available at https://github.com/danfenghong/IEEE_TGRS_MDL-RS, contributing to the RS community.
Recent advances in spectral-spatial classification of hyperspectral images are presented in this paper. Several techniques are investigated for combining both spatial and spectral information. Spatial information is extracted at the object (set of pixels) level rather than at the conventional pixel level. Mathematical morphology is first used to derive the morphological profile of the image, which includes characteristics about the size, orientation, and contrast of the spatial structures present in the image. Then, the morphological neighborhood is defined and used to derive additional features for classification. Classification is performed with support vector machines (SVMs) using the available spectral information and the extracted spatial information. Spatial postprocessing is next investigated to build more homogeneous and spatially consistent thematic maps. To that end, three presegmentation techniques are applied to define regions that are used to regularize the preliminary pixel-wise thematic map. Finally, a multiple-classifier (MC) system is defined to produce relevant markers that are exploited to segment the hyperspectral image with the minimum spanning forest algorithm. Experimental results conducted on three real hyperspectral images with different spatial and spectral resolutions and corresponding to various contexts are presented. They highlight the importance of spectral-spatial strategies for the accurate classification of hyperspectral images and validate the proposed methods.
Pansharpening aims at fusing a multispectral and a panchromatic image, featuring the result of the processing with the spectral resolution of the former and the spatial resolution of the latter. In the last decades, many algorithms addressing this task have been presented in the literature. However, the lack of universally recognized evaluation criteria, available image data sets for benchmarking, and standardized implementations of the algorithms makes a thorough evaluation and comparison of the different pansharpening techniques difficult to achieve. In this paper, the authors attempt to fill this gap by providing a critical description and extensive comparisons of some of the main state-of-the-art pansharpening methods. In greater details, several pansharpening algorithms belonging to the component substitution or multiresolution analysis families are considered. Such techniques are evaluated through the two main protocols for the assessment of pansharpening results, i.e., based on the full- and reduced-resolution validations. Five data sets acquired by different satellites allow for a detailed comparison of the algorithms, characterization of their performances with respect to the different instruments, and consistency of the two validation procedures. In addition, the implementation of all the pansharpening techniques considered in this paper and the framework used for running the simulations, comprising the two validation procedures and the main assessment indexes, are collected in a MATLAB toolbox that is made available to the community.
In various disciplines, information about the same phenomenon can be acquired from different types of detectors, at different conditions, in multiple experiments or subjects, among others. We use the term “modality” for each such acquisition framework. Due to the rich characteristics of natural phenomena, it is rare that a single modality provides complete knowledge of the phenomenon of interest. The increasing availability of several modalities reporting on the same system introduces new degrees of freedom, which raise questions beyond those related to exploiting each modality separately. As we argue, many of these questions, or “challenges,” are common to multiple domains. This paper deals with two key issues: “why we need data fusion” and “how we perform it.” The first issue is motivated by numerous examples in science and technology, followed by a mathematical framework that showcases some of the benefits that data fusion provides. In order to address the second issue, “diversity” is introduced as a key concept, and a number of data-driven solutions based on matrix and tensor decompositions are discussed, emphasizing how they account for diversity across the data sets. The aim of this paper is to provide the reader, regardless of his or her community of origin, with a taste of the vastness of the field, the prospects, and the opportunities that it holds.
Hyperspectral (HS) images are characterized by approximately contiguous spectral information, enabling the fine identification of materials by capturing subtle spectral discrepancies. Owing to their excellent locally contextual modeling ability, convolutional neural networks (CNNs) have been proven to be a powerful feature extractor in HS image classification. However, CNNs fail to mine and represent the sequence attributes of spectral signatures well due to the limitations of their inherent network backbone. To solve this issue, we rethink HS image classification from a sequential perspective with transformers, and propose a novel backbone network called \ul{SpectralFormer}. Beyond band-wise representations in classic transformers, SpectralFormer is capable of learning spectrally local sequence information from neighboring bands of HS images, yielding group-wise spectral embeddings. More significantly, to reduce the possibility of losing valuable information in the layer-wise propagation process, we devise a cross-layer skip connection to convey memory-like components from shallow to deep layers by adaptively learning to fuse "soft" residuals across layers. It is worth noting that the proposed SpectralFormer is a highly flexible backbone network, which can be applicable to both pixel- and patch-wise inputs. We evaluate the classification performance of the proposed SpectralFormer on three HS datasets by conducting extensive experiments, showing the superiority over classic transformers and achieving a significant improvement in comparison with state-of-the-art backbone networks. The codes of this work will be available at https://github.com/danfenghong/IEEE_TGRS_SpectralFormer for the sake of reproducibility.
In this paper, a fast algorithm for overcomplete sparse decomposition, called SL0, is proposed. The algorithm is essentially a method for obtaining sparse solutions of underdetermined systems of linear equations, and its applications include underdetermined sparse component analysis (SCA), atomic decomposition on overcomplete dictionaries, compressed sensing, and decoding real field codes. Contrary to previous methods, which usually solve this problem by minimizing the <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">l</i> <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">1</sup> norm using linear programming (LP) techniques, our algorithm tries to directly minimize the <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">l</i> <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">1</sup> norm. It is experimentally shown that the proposed algorithm is about two to three orders of magnitude faster than the state-of-the-art interior-point LP solvers, while providing the same (or better) accuracy.
An array of biomagnetometers may be used to measure the spatio-temporal neuromagnetic field or magnetoencephalogram (MEG) produced by neural activity in the brain. A popular model for the neural activity produced in response to a given sensory stimulus is a set of current dipoles, where each dipole represents the primary current associated with the combined activation of a large number of neurons located in a small volume of the brain. An important problem in the interpretation of MEG data from evoked response experiments is the localization of these neural current dipoles. We present here a linear algebraic framework for three common spatio-temporal dipole models: i) unconstrained dipoles, ii) dipoles with a fixed location, and iii) dipoles with a fixed orientation and location. In all cases, we assume that the location, orientation, and magnitude of the dipoles are unknown. With a common model, we show how the parameter estimation problem may be decomposed into the estimation of the time invariant parameters using nonlinear least-squares minimization, followed by linear estimation of the associated time varying parameters. A subspace formulation is presented and used to derive a suboptimal least-squares subspace scanning method. The resulting algorithm is a special case of the well-known MUltiple SIgnal Classification (MUSIC) method, in which the solution (multiple dipole locations) is found by scanning potential locations using a simple one dipole model. Principal components analysis (PCA) dipole fitting has also been used to individually fit single dipoles in a multiple dipole problem. Analysis is presented here to show why PCA dipole fitting will fail in general, whereas the subspace method presented here will generally succeed. Numerically efficient means of calculating the cost functions are presented, and problems of model order selection and missing moments are discussed. Results from a simulation and a somatosensory experiment are presented.
A method is proposed for the classification of urban hyperspectral data with high spatial resolution. The approach is an extension of previous approaches and uses both the spatial and spectral information for classification. One previous approach is based on using several principal components (PCs) from the hyperspectral data and building several morphological profiles (MPs). These profiles can be used all together in one extended MP. A shortcoming of that approach is that it was primarily designed for classification of urban structures and it does not fully utilize the spectral information in the data. Similarly, the commonly used pixelwise classification of hyperspectral data is solely based on the spectral content and lacks information on the structure of the features in the image. The proposed method overcomes these problems and is based on the fusion of the morphological information and the original hyperspectral data, i.e., the two vectors of attributes are concatenated into one feature vector. After a reduction of the dimensionality, the final classification is achieved by using a support vector machine classifier. The proposed approach is tested in experiments on ROSIS data from urban areas. Significant improvements are achieved in terms of accuracies when compared to results obtained for approaches based on the use of MPs based on PCs only and conventional spectral classification. For instance, with one data set, the overall accuracy is increased from 79% to 83% without any feature reduction and to 87% with feature reduction. The proposed approach also shows excellent results with a limited training set.
Hyperspectral imagery collected from airborne or satellite sources inevitably suffers from spectral variability, making it difficult for spectral unmixing to accurately estimate abundance maps. The classical unmixing model, the linear mixing model (LMM), generally fails to handle this sticky issue effectively. To this end, we propose a novel spectral mixture model, called the augmented linear mixing model (ALMM), to address spectral variability by applying a data-driven learning strategy in inverse problems of hyperspectral unmixing. The proposed approach models the main spectral variability (i.e., scaling factors) generated by variations in illumination or typography separately by means of the endmember dictionary. It then models other spectral variabilities caused by environmental conditions (e.g., local temperature and humidity, atmospheric effects) and instrumental configurations (e.g., sensor noise), as well as material nonlinear mixing effects, by introducing a spectral variability dictionary. To effectively run the data-driven learning strategy, we also propose a reasonable prior knowledge for the spectral variability dictionary, whose atoms are assumed to be low-coherent with spectral signatures of endmembers, which leads to a well-known low-coherence dictionary learning problem. Thus, a dictionary learning technique is embedded in the framework of spectral unmixing so that the algorithm can learn the spectral variability dictionary and estimate the abundance maps simultaneously. Extensive experiments on synthetic and real datasets are performed to demonstrate the superiority and effectiveness of the proposed method in comparison with previous state-of-the-art methods.
Classification of hyperspectral data with high spatial resolution from urban areas is discussed. An approach has been proposed which is based on using several principal components from the hyperspectral data and build morphological profiles. These profiles can be used all together in one extended morphological profile. A shortcoming of the approach is that it is primarily designed for classification of urban structures and it does not fully utilize the spectral information in the data. Similarly, a pixel-wise classification solely based on the spectral content can be performed, but it lacks information on the structure of the features in the image. An extension is proposed in this paper in order to overcome these dual problems. The proposed method is based on the data fusion of the morphological information and the original hyperspectral data: the two vectors of attributes are concatenated. After a reduction of the dimensionality using Decision Boundary Feature Extraction, the final classification is achieved using a Support Vector Machines classifier. The proposed approach is tested in experiments on ROSIS data from urban areas. Significant improvements are achieved in terms of accuracies when compared to results of approaches based on the use of morphological profiles based on PCs only and conventional spectral classification.
Learning-based infrared small object detection methods currently rely heavily on the classification backbone network. This tends to result in tiny object loss and feature distinguishability limitations as the network depth increases. Furthermore, small objects in infrared images are frequently emerged bright and dark, posing severe demands for obtaining precise object contrast information. For this reason, we in this paper propose a simple and effective "U-Net in U-Net" framework, UIU-Net for short, and detect small objects in infrared images. As the name suggests, UIU-Net embeds a tiny U-Net into a larger U-Net backbone, enabling the multi-level and multi-scale representation learning of objects. Moreover, UIU-Net can be trained from scratch, and the learned features can enhance global and local contrast information effectively. More specifically, the UIU-Net model is divided into two modules: the resolution-maintenance deep supervision (RM-DS) module and the interactive-cross attention (IC-A) module. RM-DS integrates Residual U-blocks into a deep supervision network to generate deep multi-scale resolution-maintenance features while learning global context information. Further, IC-A encodes the local context information between the low-level details and high-level semantic features. Extensive experiments conducted on two infrared single-frame image datasets, i.e., SIRST and Synthetic datasets, show the effectiveness and superiority of the proposed UIU-Net in comparison with several state-of-the-art infrared small object detection methods. The proposed UIU-Net also produces powerful generalization performance for video sequence infrared small object datasets, e.g., ATR ground/air video sequence dataset. The codes of this work are available openly at https://github.com/danfenghong/IEEE.
In January 2006, the Data Fusion Committee of the IEEE Geoscience and Remote Sensing Society launched a public contest for pansharpening algorithms, which aimed to identify the ones that perform best. Seven research groups worldwide participated in the contest, testing eight algorithms following different philosophies [component substitution, multiresolution analysis (MRA), detail injection, etc.]. Several complete data sets from two different sensors, namely, QuickBird and simulated Pleiades, were delivered to all participants. The fusion results were collected and evaluated, both visually and objectively. Quantitative results of pansharpening were possible owing to the availability of reference originals obtained either by simulating the data collected from the satellite sensor by means of higher resolution data from an airborne platform, in the case of the Pleiades data, or by first degrading all the available data to a coarser resolution and saving the original as the reference, in the case of the QuickBird data. The evaluation results were presented during the special session on data fusion at the 2006 international geoscience and remote sensing symposium in Denver, and these are discussed in further detail in this paper. Two algorithms outperform all the others, the visual analysis being confirmed by the quantitative evaluation. These two methods share the same philosophy: they basically rely on MRA and employ adaptive models for the injection of high-pass details.
This paper describes the OpenViBE software platform which enables researchers to design, test, and use brain–computer interfaces (BCIs). BCIs are communication systems that enable users to send commands to computers solely by means of brain activity. BCIs are gaining interest among the virtual reality (VR) community since they have appeared as promising interaction devices for virtual environments (VEs). The key features of the platform are (1) high modularity, (2) embedded tools for visualization and feedback based on VR and 3D displays, (3) BCI design made available to non-programmers thanks to visual programming, and (4) various tools offered to the different types of users. The platform features are illustrated in this paper with two entertaining VR applications based on a BCI. In the first one, users can move a virtual ball by imagining hand movements, while in the second one, they can control a virtual spaceship using real or imagined foot movements. Online experiments with these applications together with the evaluation of the platform computational performances showed its suitability for the design of VR applications controlled with a BCI. OpenViBE is a free software distributed under an open-source license.
This paper presents a new classification framework for brain-computer interface (BCI) based on motor imagery. This framework involves the concept of Riemannian geometry in the manifold of covariance matrices. The main idea is to use spatial covariance matrices as EEG signal descriptors and to rely on Riemannian geometry to directly classify these matrices using the topology of the manifold of symmetric and positive definite (SPD) matrices. This framework allows to extract the spatial information contained in EEG signals without using spatial filtering. Two methods are proposed and compared with a reference method [multiclass Common Spatial Pattern (CSP) and Linear Discriminant Analysis (LDA)] on the multiclass dataset IIa from the BCI Competition IV. The first method, named minimum distance to Riemannian mean (MDRM), is an implementation of the minimum distance to mean (MDM) classification algorithm using Riemannian distance and Riemannian mean. This simple method shows comparable results with the reference method. The second method, named tangent space LDA (TSLDA), maps the covariance matrices onto the Riemannian tangent space where matrices can be vectorized and treated as Euclidean objects. Then, a variable selection procedure is applied in order to decrease dimensionality and a classification by LDA is performed. This latter method outperforms the reference method increasing the mean classification accuracy from 65.1% to 70.2%.
oatao 14340