Structure et Instabilité des Génomes
Facility: Paris, Île-de-France, France
Research output, citation impact, and the most-cited recent papers from Structure et Instabilité des Génomes (France). Aggregated across the NobleBlocks index of 300M+ scholarly works.
Top-cited papers from Structure et Instabilité des Génomes
Summary 1. Species distribution models are increasingly used to address questions in conservation biology, ecology and evolution. The most effective species distribution models require data on both species presence and the available environmental conditions (known as background or pseudo‐absence data) in the area. However, there is still no consensus on how and where to sample these pseudo‐absences and how many. 2. In this study, we conducted a comprehensive comparative analysis based on simple simulated species distributions to propose guidelines on how, where and how many pseudo‐absences should be generated to build reliable species distribution models. Depending on the quantity and quality of the initial presence data (unbiased vs. climatically or spatially biased), we assessed the relative effect of the method for selecting pseudo‐absences (random vs. environmentally or spatially stratified) and their number on the predictive accuracy of seven common modelling techniques (regression, classification and machine‐learning techniques). 3. When using regression techniques, the method used to select pseudo‐absences had the greatest impact on the model’s predictive accuracy. Randomly selected pseudo‐absences yielded the most reliable distribution models. Models fitted with a large number of pseudo‐absences but equally weighted to the presences (i.e. the weighted sum of presence equals the weighted sum of pseudo‐absence) produced the most accurate predicted distributions. For classification and machine‐learning techniques, the number of pseudo‐absences had the greatest impact on model accuracy, and averaging several runs with fewer pseudo‐absences than for regression techniques yielded the most predictive models. 4. Overall, we recommend the use of a large number (e.g. 10 000) of pseudo‐absences with equal weighting for presences and absences when using regression techniques (e.g. generalised linear model and generalised additive model); averaging several runs (e.g. 10) with fewer pseudo‐absences (e.g. 100) with equal weighting for presences and absences with multiple adaptive regression splines and discriminant analyses; and using the same number of pseudo‐absences as available presences (averaging several runs if few pseudo‐absences) for classification techniques such as boosted regression trees, classification trees and random forest. In addition, we recommend the random selection of pseudo‐absences when using regression techniques and the random selection of geographically and environmentally stratified pseudo‐absences when using classification and machine‐learning techniques.
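The equal-weighting recommendation in this summary can be illustrated with a minimal Python sketch. It draws pseudo-absences at random from a pool of background cells and assigns case weights so that the weighted sum of presences equals the weighted sum of pseudo-absences. The function name, the integer cell identifiers, the weight of 1 per presence, and the exclusion of presence cells from the candidate pool are all assumptions made for illustration, not details taken from the paper.

```python
import random


def sample_pseudo_absences(presence_cells, background_cells, n_pseudo, seed=0):
    """Randomly draw pseudo-absences and return (cell, label, weight) rows.

    Hypothetical helper: each presence gets weight 1 and each pseudo-absence
    gets weight n_presences / n_pseudo, so both classes carry the same
    total weight.
    """
    rng = random.Random(seed)
    # Assumption of this sketch: known presence cells are not eligible.
    presence_set = set(presence_cells)
    candidates = [c for c in background_cells if c not in presence_set]
    if n_pseudo > len(candidates):
        raise ValueError("not enough background cells to sample from")
    pseudo = rng.sample(candidates, n_pseudo)

    pseudo_weight = len(presence_cells) / n_pseudo
    rows = [(cell, 1, 1.0) for cell in presence_cells]
    rows += [(cell, 0, pseudo_weight) for cell in pseudo]
    return rows


presences = list(range(50))        # 50 presence cells (made-up identifiers)
background = list(range(20000))    # 20 000 candidate background cells
rows = sample_pseudo_absences(presences, background, n_pseudo=10000)

presence_total = sum(w for _, label, w in rows if label == 1)
pseudo_total = sum(w for _, label, w in rows if label == 0)
print(presence_total, round(pseudo_total, 6))
```

The resulting weights could then be passed as case weights to whichever regression routine is in use. Drawing `n_pseudo` equal to the number of presences instead gives a weight of 1 everywhere, which matches the recommendation for classification techniques.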
CRISPOR.org is a web tool for genome editing experiments with the CRISPR-Cas9 system. It finds guide RNAs in an input sequence and ranks them according to different scores that evaluate potential off-targets in the genome of interest and predict on-target activity. The list of genomes is continuously expanded, with more than 150 genomes added in the last two years. CRISPOR aims to provide a comprehensive solution covering the selection, cloning and expression of guide RNAs, as well as the primers needed for testing guide activity and potential off-targets. Recent developments include batch design for genome-wide CRISPR and saturation screens, creating custom oligonucleotides for guide cloning and the design of next generation sequencing primers to test for off-target mutations. CRISPOR is available from http://crispor.org, including the full source code of the website and a stand-alone, command-line version.
Characterization of the genetic landscape of Alzheimer's disease (AD) and related dementias (ADD) provides a unique opportunity for a better understanding of the associated pathophysiological processes. We performed a two-stage genome-wide association study totaling 111,326 clinically diagnosed/'proxy' AD cases and 677,663 controls. We found 75 risk loci, of which 42 were new at the time of analysis. Pathway enrichment analyses confirmed the involvement of amyloid/tau pathways and highlighted microglia implication. Gene prioritization in the new loci identified 31 genes that were suggestive of new genetically associated processes, including the tumor necrosis factor alpha pathway through the linear ubiquitin chain assembly complex. We also built a new genetic risk score associated with the risk of future AD/dementia or progression from mild cognitive impairment to AD/dementia. The improvement in prediction led to a 1.6- to 1.9-fold increase in AD risk from the lowest to the highest decile, in addition to effects of age and the APOE ε4 allele.
BACKGROUND: The success of the CRISPR/Cas9 genome editing technique depends on the choice of the guide RNA sequence, which is facilitated by various websites. Despite the importance and popularity of these algorithms, it is unclear to what extent their predictions agree with actual measurements. RESULTS: We conduct the first independent evaluation of CRISPR/Cas9 predictions. To this end, we collect data from eight SpCas9 off-target studies and compare them with the sites predicted by popular algorithms. We identify problems in one implementation but find that sequence-based off-target predictions are very reliable, identifying most off-targets with mutation rates above 0.1%, while the number of false positives can be largely reduced with a cutoff on the off-target score. We also evaluate on-target efficiency prediction algorithms against available datasets. The correlation between the predictions and the guide activity varies considerably, especially for zebrafish. Together with novel data from our labs, we find that the optimal on-target efficiency prediction model strongly depends on whether the guide RNA is expressed from a U6 promoter or transcribed in vitro. We further demonstrate that the best predictions can significantly reduce the time spent on guide screening. CONCLUSIONS: To make these guidelines easily accessible to anyone planning a CRISPR genome editing experiment, we built a new website (http://crispor.org) that predicts off-targets and helps select and clone efficient guide sequences for more than 120 genomes using different Cas9 proteins and the eight efficiency scoring systems evaluated here.
Research on the relationship between the architecture of ecological networks and community stability has mainly focused on one type of interaction at a time, making difficult any comparison between different network types. We used a theoretical approach to show that the network architecture favoring stability fundamentally differs between trophic and mutualistic networks. A highly connected and nested architecture promotes community stability in mutualistic networks, whereas the stability of trophic networks is enhanced in compartmented and weakly connected architectures. These theoretical predictions are supported by a meta-analysis on the architecture of a large series of real pollination (mutualistic) and herbivory (trophic) networks. We conclude that strong variations in the stability of architectural patterns constrain ecological networks toward different architectures, depending on the type of interaction.
Abstract. Accurate assessment of anthropogenic carbon dioxide (CO2) emissions and their redistribution among the atmosphere, ocean, and terrestrial biosphere – the “global carbon budget” – is important to better understand the global carbon cycle, support the development of climate policies, and project future climate change. Here we describe data sets and methodology to quantify the five major components of the global carbon budget and their uncertainties. Fossil CO2 emissions (EFF) are based on energy statistics and cement production data, while emissions from land use and land-use change (ELUC), mainly deforestation, are based on land use and land-use change data and bookkeeping models. Atmospheric CO2 concentration is measured directly and its growth rate (GATM) is computed from the annual changes in concentration. The ocean CO2 sink (SOCEAN) and terrestrial CO2 sink (SLAND) are estimated with global process models constrained by observations. The resulting carbon budget imbalance (BIM), the difference between the estimated total emissions and the estimated changes in the atmosphere, ocean, and terrestrial biosphere, is a measure of imperfect data and understanding of the contemporary carbon cycle. All uncertainties are reported as ±1σ. For the last decade available (2008–2017), EFF was 9.4±0.5 GtC yr−1, ELUC 1.5±0.7 GtC yr−1, GATM 4.7±0.02 GtC yr−1, SOCEAN 2.4±0.5 GtC yr−1, and SLAND 3.2±0.8 GtC yr−1, with a budget imbalance BIM of 0.5 GtC yr−1 indicating overestimated emissions and/or underestimated sinks. For the year 2017 alone, the growth in EFF was about 1.6 % and emissions increased to 9.9±0.5 GtC yr−1. Also for 2017, ELUC was 1.4±0.7 GtC yr−1, GATM was 4.6±0.2 GtC yr−1, SOCEAN was 2.5±0.5 GtC yr−1, and SLAND was 3.8±0.8 GtC yr−1, with a BIM of 0.3 GtC.
The global atmospheric CO2 concentration reached 405.0±0.1 ppm averaged over 2017. For 2018, preliminary data for the first 6–9 months indicate a renewed growth in EFF of +2.7 % (range of 1.8 % to 3.7 %) based on national emission projections for China, the US, the EU, and India and projections of gross domestic product corrected for recent changes in the carbon intensity of the economy for the rest of the world. The analysis presented here shows that the mean and trend in the five components of the global carbon budget are consistently estimated over the period of 1959–2017, but discrepancies of up to 1 GtC yr−1 persist for the representation of semi-decadal variability in CO2 fluxes. A detailed comparison among individual estimates and the introduction of a broad range of observations show (1) no consensus in the mean and trend in land-use change emissions, (2) a persistent low agreement among the different methods on the magnitude of the land CO2 flux in the northern extra-tropics, and (3) an apparent underestimation of the CO2 variability by ocean models, originating outside the tropics. This living data update documents changes in the methods and data sets used in this new global carbon budget and the progress in understanding the global carbon cycle compared with previous publications of this data set (Le Quéré et al., 2018, 2016, 2015a, b, 2014, 2013). All results presented here can be downloaded from https://doi.org/10.18160/GCP-2018.
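The budget imbalance defined in this abstract, total emissions minus the changes in the atmosphere, ocean and land, can be checked with a few lines of Python. The function and variable names below are illustrative, and the inputs are the rounded central values quoted in the abstract, so the result is only an approximation of the published figure.

```python
def budget_imbalance(e_ff, e_luc, g_atm, s_ocean, s_land):
    """B_IM = (E_FF + E_LUC) - (G_ATM + S_OCEAN + S_LAND), in GtC per year."""
    return (e_ff + e_luc) - (g_atm + s_ocean + s_land)


# Rounded central values quoted in the abstract (GtC per year).
decade_2008_2017 = budget_imbalance(9.4, 1.5, 4.7, 2.4, 3.2)
year_2017 = budget_imbalance(9.9, 1.4, 4.6, 2.5, 3.8)

print(round(decade_2008_2017, 1), round(year_2017, 1))
```

With these rounded inputs the sketch gives about 0.6 and 0.4 GtC per year, one tenth above the reported 0.5 and 0.3. The gap is presumably a rounding effect, since the published imbalance would have been computed from unrounded components.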
ABSTRACT We review Seewave, new software for analysing and synthesizing sounds. Seewave is free and works on a wide variety of operating systems as an extension of the R operating environment. Its current 67 functions allow the user to achieve time, amplitude and frequency analyses, to estimate quantitative differences between sounds, and to generate new sounds for playback experiments. Thanks to its implementation in the R environment, Seewave is fully modular. All functions can be combined for complex data acquisition and graphical output, they can be part of important scripts for batch processing and they can be modified ad libitum. New functions can also be written, making Seewave a truly open-source tool.
In 2010, the international community, under the auspices of the Convention on Biological Diversity, agreed on 20 biodiversity-related "Aichi Targets" to be achieved within a decade. We provide a comprehensive mid-term assessment of progress toward these global targets using 55 indicator data sets. We projected indicator trends to 2020 using an adaptive statistical framework that incorporated the specific properties of individual time series. On current trajectories, results suggest that despite accelerating policy and management responses to the biodiversity crisis, the impacts of these efforts are unlikely to be reflected in improved trends in the state of biodiversity by 2020. We highlight areas of societal endeavor requiring additional efforts to achieve the Aichi Targets, and provide a baseline against which to assess future progress.
MOTIVATION: The cell nucleus is a highly organized cellular organelle that contains the genetic material. The study of nuclear architecture has become an important field of cellular biology. Extracting quantitative data from 3D fluorescence imaging helps understand the functions of different nuclear compartments. However, such approaches are limited by the requirement for processing and analyzing large sets of images. RESULTS: Here, we describe Tools for Analysis of Nuclear Genome Organization (TANGO), an image analysis tool dedicated to the study of nuclear architecture. TANGO is a coherent framework allowing biologists to perform the complete analysis process of 3D fluorescence images by combining two environments: ImageJ (http://imagej.nih.gov/ij/) for image processing and quantitative analysis and R (http://cran.r-project.org) for statistical processing of measurement results. It includes an intuitive user interface providing the means to precisely build a segmentation procedure and set-up analyses, without possessing programming skills. TANGO is a versatile tool able to process large sets of images, allowing quantitative study of nuclear organization. AVAILABILITY: TANGO is composed of two programs: (i) an ImageJ plug-in and (ii) a package (rtango) for R. They are both free and open source, available (http://biophysique.mnhn.fr/tango) for Linux, Microsoft Windows and Macintosh OSX. Distribution is under the GPL v.2 licence. CONTACT: thomas.boudier@snv.jussieu.fr SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
century. Yet, most species remain unknown or unstudied, while others attract most of the public, scientific and government attention. Although known to be detrimental, this taxonomic bias continues to be pervasive in the scientific literature, but is still poorly studied and understood. Here, we used 626 million occurrences from the Global Biodiversity Information Facility (GBIF), the biggest biodiversity data portal, to characterize the taxonomic bias in biodiversity data. We also investigated how societal preferences and taxonomic research relate to biodiversity data gathering. For each species belonging to 24 taxonomic classes, we used the number of publications from Web of Science and the number of web pages from Bing searches to approximate research activity and societal preferences. Our results show that societal preferences, rather than research activity, strongly correlate with taxonomic bias, which leads us to assert that scientists should advertise less charismatic species and develop societal initiatives (e.g. citizen science) that specifically target neglected organisms. Ensuring that biodiversity is representatively sampled while this is still possible is an urgent prerequisite for achieving efficient conservation plans and a global understanding of our surrounding environment.
Zika virus is known to be transmitted by mosquitoes. The authors report the sexual transmission of Zika virus to a woman in Paris from a man who had recently traveled from Brazil.
Biological invasion is increasingly recognized as one of the greatest threats to biodiversity. Using ensemble forecasts from species distribution models to project future suitable areas of 100 of the world's worst invasive species, as defined by the International Union for the Conservation of Nature, we show that both climate and land use changes will likely cause drastic species range shifts. Looking at potential spatial aggregation of invasive species, we identify three future hotspots of invasion in Europe, northeastern North America, and Oceania. We also emphasize that some regions could lose a significant number of invasive alien species, creating opportunities for ecosystem restoration. From the list of 100, scenarios of potential range distributions show a consistent shrinking for invasive amphibians and birds, while for aquatic and terrestrial invertebrates distributions are projected to substantially increase in most cases. Given the harmful impacts these invasive species currently have on ecosystems, these species will likely dramatically influence the future of biodiversity.
One of the oldest challenges in ecology is to understand the processes that underpin the composition of communities. Historically, an obvious way in which to describe community compositions has been diversity in terms of the number and abundances of species. However, the failure to reject contradictory models has led to communities now being characterized by trait and phylogenetic diversities. Our objective here is to demonstrate how species, trait and phylogenetic diversity can be combined together from large to local spatial scales to reveal the historical, deterministic and stochastic processes that impact the compositions of local communities. Research in this area has recently been advanced by the development of mathematical measures that incorporate trait dissimilarities and phylogenetic relatedness between species. However, measures of trait diversity have been developed independently of phylogenetic measures and conversely most of the phylogenetic diversity measures have been developed independently of trait diversity measures. This has led to semantic confusions particularly when classical ecological and evolutionary approaches are integrated so closely together. Consequently, we propose a unified semantic framework and demonstrate the importance of the links among species, phylogenetic and trait diversity indices. Furthermore, species, trait and phylogenetic diversity indices differ in the ways they can be used across different spatial scales. The connections between large-scale, regional and local processes allow the consideration of historical factors in addition to local ecological deterministic or stochastic processes. Phylogenetic and trait diversity have been used in large-scale analyses to determine how historical and/or environmental factors affect both the formation of species assemblages and patterns in species richness across latitude or elevation gradients. 
Both phylogenetic and trait diversity have been used at different spatial scales to identify the relative impacts of ecological deterministic processes such as environmental filtering and limiting similarity from alternative processes such as random speciation and extinction, random dispersal and ecological drift. Measures of phylogenetic diversity combine phenotypic and genetic diversity and have the potential to reveal both the ecological and historical factors that impact local communities. Consequently, we demonstrate that, when used in a comparative way, species, trait and phylogenetic structures have the potential to reveal essential details that might act simultaneously in the assembly of species communities. We highlight potential directions for future research. These might include how variation in trait and phylogenetic diversity alters with spatial distances, the role of trait and phylogenetic diversity in global-scale gradients, the connections between traits and phylogeny, the importance of trait rarity and independent evolutionary history in community assembly, the loss of trait and phylogenetic diversity due to human impacts, and the mathematical developments of biodiversity indices including within-species variations.
Biodiversity assessment remains one of the most difficult challenges encountered by ecologists and conservation biologists. This task is becoming even more urgent with the current increase of habitat loss. Many methods, from rapid biodiversity assessments (RBA) to all-taxa biodiversity inventories (ATBI), have been developed for decades to estimate local species richness. However, these methods are costly and invasive. Several animals (birds, mammals, amphibians, fishes and arthropods) produce sounds when moving, communicating or sensing their environment. Here we propose a new concept and method to describe biodiversity. We suggest forgoing the species or morphospecies identification used by ATBI and RBA respectively, and instead tackling the problem at another evolutionary unit, the community level. We also propose that a part of diversity can be estimated and compared through a rapid acoustic analysis of the sound produced by animal communities. We produced alpha and beta diversity indexes that we first tested with 540 simulated acoustic communities. The alpha index, which measures acoustic entropy, shows a logarithmic correlation with the number of species within the acoustic community. The beta index, which estimates both temporal and spectral dissimilarities, is linearly linked to the number of unshared species between acoustic communities. We then applied both indexes to two closely spaced Tanzanian dry lowland coastal forests. For this small sample, the indexes reveal a lower acoustic diversity in the most disturbed forest, and acoustic dissimilarities between the two forests suggest that degradation could have significantly decreased and modified community composition. Our results demonstrate for the first time that an indicator of biological diversity can be reliably obtained in a non-invasive way and with a limited sampling effort. This new approach may facilitate the appraisal of animal diversity at large spatial and temporal scales.
Sequence-specific nucleases like TALENs and the CRISPR/Cas9 system have greatly expanded the genome editing possibilities in model organisms such as zebrafish. Both systems have recently been used to create knock-out alleles with great efficiency, and TALENs have also been successfully employed in knock-in of DNA cassettes at defined loci via homologous recombination (HR). Here we report CRISPR/Cas9-mediated knock-in of DNA cassettes into the zebrafish genome at a very high rate by homology-independent double-strand break (DSB) repair pathways. After co-injection of a donor plasmid with a short guide RNA (sgRNA) and Cas9 nuclease mRNA, concurrent cleavage of donor plasmid DNA and the selected chromosomal integration site resulted in efficient targeted integration of donor DNA. We successfully employed this approach to convert eGFP into Gal4 transgenic lines, and the same plasmids and sgRNAs can be applied in any species where eGFP lines were generated as part of enhancer and gene trap screens. In addition, we show the possibility of easily targeting DNA integration at endogenous loci, thus greatly facilitating the creation of reporter and loss-of-function alleles. Due to its simplicity, flexibility, and very high efficiency, our method greatly expands the repertoire for genome editing in zebrafish and can be readily adapted to many other organisms.
The order Cetartiodactyla includes cetaceans (whales, dolphins and porpoises) that are found in all oceans and seas, as well as in some rivers, and artiodactyls (ruminants, pigs, peccaries, hippos, camels and llamas) that are present on all continents, except Antarctica and until recent invasions, Australia. There are currently 332 recognized cetartiodactyl species, which are classified into 132 genera and 22 families. Most phylogenetic studies have focused on deep relationships, and no comprehensive time-calibrated tree for the group has been published yet. In this study, 128 new complete mitochondrial genomes of Cetartiodactyla were sequenced and aligned with those extracted from nucleotide databases. Our alignment includes 14,902 unambiguously aligned nucleotide characters for 210 taxa, representing 183 species, 107 genera, and all cetartiodactyl families. Our mtDNA data produced a statistically robust tree, which is largely consistent with previous classifications. However, a few taxa were found to be para- or polyphyletic, including the family Balaenopteridae, as well as several genera and species. Accordingly, we propose several taxonomic changes in order to render the classification compatible with our molecular phylogeny. In some cases, the results can be interpreted as possible taxonomic misidentification or evidence for mtDNA introgression. The existence of three new cryptic species of Ruminantia should therefore be confirmed by further analyses using nuclear data. We estimate divergence times using Bayesian relaxed molecular clock models. The deepest nodes appeared very sensitive to prior assumptions leading to unreliable estimates, primarily because of the misleading effects of rate heterogeneity, saturation and divergent outgroups. In addition, we detected that Whippomorpha contains slow-evolving taxa, such as large whales and hippos, as well as fast-evolving taxa, such as river dolphins. 
Our results nevertheless indicate that the evolutionary history of cetartiodactyls was punctuated by four main phases of rapid radiation during the Cenozoic era: the sudden occurrence of the three extant lineages within Cetartiodactyla (Cetruminantia, Suina and Tylopoda); the basal diversification of Cetacea during the Early Oligocene; and two radiations that involve Cetacea and Pecora, one at the Oligocene/Miocene boundary and the other in the Middle Miocene. In addition, we show that the high species diversity now observed in the families Bovidae and Cervidae accumulated mainly during the Late Miocene and Plio-Pleistocene.
Abstract Nucleic acids can be selectively recognized by a large number of natural and synthetic ligands. Oligonucleotides provide the highest specificity of recognition. They can bind to a complementary single‐stranded sequence by forming Watson‐Crick hydrogen bonds. They can also recognize the major groove of double‐helical DNA at specific sequences by forming Hoogsteen or reverse Hoogsteen hydrogen bonds with purine bases of the Watson‐Crick base pairs, resulting in a triple helix. Triple‐helix formation through oligonucleotide binding to DNA is a sequence‐specific interaction involving primarily homopurine·homopyrimidine sequences in the double‐helical target. Extending the range of recognition sequences remains a challenge to chemists. Both thermodynamic and kinetic parameters for triplex formation have been determined. These parameters indicate, for example, that triple‐helix formation is a much slower process than duplex formation. Nuclease‐resistant oligonucleotides synthesized with the α‐anomers of nucleosides (instead of the natural β‐anomers) also form triple helices with double‐stranded DNA. Triple‐helix‐forming oligonucleotides can be modified, for example, by attaching DNA intercalating agents to enhance their binding affinity. They may also be modified with reagents that induce irreversible reactions in their target sequence upon chemical or photochemical activation. Thus, artificial nucleases can be developed with very high sequence specificity on megabase‐size DNA. Furthermore, triple‐helix‐forming oligonucleotides can be used to selectively control gene expression. When bound to the regulatory region(s) of specific genes they may prevent activation (or repression) of transcription. When binding occurs near or downstream from the transcription initiation site, elongation of the transcript may be inhibited. Therefore, the potential exists for developing new gene‐blocking agents with therapeutic applications in the treatment of gene disorders.
Assessing trait responses to environmental gradients requires the simultaneous analysis of the information contained in three tables: L (species distribution across samples), R (environmental characteristics of samples), and Q (species traits). Among the available methods, the so-called fourth-corner and RLQ methods are two appealing alternatives that provide a direct way to test and estimate trait-environment relationships. Both methods are based on the analysis of the fourth-corner matrix, which crosses traits and environmental variables weighted by species abundances. However, they differ greatly in their outputs: RLQ is a multivariate technique that provides ordination scores to summarize the joint structure among the three tables, whereas the fourth-corner method mainly tests for individual trait-environment relationships (i.e., one trait and one environmental variable at a time). Here, we illustrate how the complementarity between these two methods can be exploited to promote new ecological knowledge and to improve the study of trait-environment relationships. After a short description of each method, we apply them to real ecological data to present their different outputs and provide hints about the gain resulting from their combined use.
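The fourth-corner matrix described here, which crosses traits and environmental variables weighted by species abundances, can be sketched as a product of the three tables. The pure-Python example below assumes R is samples × environmental variables, L is samples × species, and Q is species × traits, and converts L to relative abundances before multiplying. The orientation of the tables, the toy data, and the omission of the centring, standardisation and permutation tests used in practice are all simplifying assumptions of this sketch.

```python
def transpose(m):
    """Return the transpose of a list-of-lists matrix."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Multiply two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def fourth_corner(R, L, Q):
    """Cross environment (R) and traits (Q) through abundances (L): R^T P Q.

    P is L rescaled to relative abundances, so each entry of the result is an
    abundance-weighted average of (environment value x trait value).
    """
    total = sum(sum(row) for row in L)
    P = [[value / total for value in row] for row in L]
    return matmul(matmul(transpose(R), P), Q)


# Toy data (hypothetical): 3 samples, 2 species, 1 variable, 1 trait.
R = [[10.0], [15.0], [20.0]]      # e.g. temperature per sample
L = [[4, 0], [2, 2], [0, 4]]      # abundance of each species per sample
Q = [[1.0], [3.0]]                # e.g. body size per species

D = fourth_corner(R, L, Q)
print(D)
```

Each cell of `D` pairs one environmental variable with one trait, which is the unit the fourth-corner method tests individually and RLQ summarises jointly through ordination.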
Range shifts of many species are now documented as a response to global warming. But whether these observed changes are occurring fast enough remains uncertain and hardly quantifiable. Here, we developed a simple framework to measure change in community composition in response to climate warming. This framework is based on a community temperature index (CTI) that directly reflects, for a given species assemblage, the balance between low- and high-temperature dwelling species. Using data from the French breeding bird survey, we first found a strong increase in CTI over the last two decades revealing that birds are rapidly tracking climate warming. This increase corresponds to a 91 km northward shift in bird community composition, which is much higher than previous estimates based on changes in species range edges. During the same period, temperature increase corresponds to a 273 km northward shift in temperature. Change in community composition was thus insufficient to keep up with temperature increase: birds are lagging approximately 182 km behind climate warming. Our method is applicable to any taxa with large-scale survey data, using either abundance or occurrence data. This approach can be further used to test whether different delays are found across groups or in different land-use contexts.
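A minimal Python sketch of the two calculations in this abstract follows. It assumes the community temperature index is computed as the abundance-weighted mean of a per-species temperature value; the abstract only says the index reflects the balance between low- and high-temperature dwelling species, so that formula, the species names and the temperature values are assumptions for illustration. The lag is the difference between the two northward shifts reported in the abstract.

```python
def community_temperature_index(abundances, species_temperature):
    """Abundance-weighted mean of per-species temperature values.

    Assumed formulation: both arguments are dicts keyed by species name.
    """
    total = sum(abundances.values())
    if total == 0:
        raise ValueError("assemblage has no individuals")
    weighted = sum(n * species_temperature[sp] for sp, n in abundances.items())
    return weighted / total


# Hypothetical species temperature values (degrees C) and site counts.
species_temperature = {"species_a": 10.0, "species_b": 14.0}
site_counts = {"species_a": 3, "species_b": 1}
cti = community_temperature_index(site_counts, species_temperature)

# Northward shifts reported in the abstract, in kilometres.
temperature_shift_km = 273
community_shift_km = 91
lag_km = temperature_shift_km - community_shift_km

print(cti, lag_km)
```

Passing presence/absence values (0 or 1) instead of counts gives an occurrence-based version, in line with the abstract's note that either abundance or occurrence data can be used.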
Guanine-rich DNA strands can fold in vitro into non-canonical DNA structures called G-quadruplexes. These structures may be very stable under physiological conditions. Evidence suggests that G-quadruplex structures may act as 'knots' within genomic DNA, and it has been hypothesized that proteins may have evolved to remove these structures. The first indication of how G-quadruplex structures could be unfolded enzymatically came in the late 1990s with reports that some well-known duplex DNA helicases resolved these structures in vitro. Since then, the number of studies reporting G-quadruplex DNA unfolding by helicase enzymes has rapidly increased. The present review aims to present a general overview of the helicase/G-quadruplex field.