NobleBlocks

Walter and Eliza Hall Institute of Medical Research

Nonprofit · Melbourne, Australia

Research output, citation impact, and the most-cited recent papers from Walter and Eliza Hall Institute of Medical Research (Australia). Aggregated across the NobleBlocks index of 300M+ scholarly works.

Total works
19.2K
Citations
4.5M
h-index
727
i10-index
33.2K
Also known as
Walter and Eliza Hall Institute; Walter and Eliza Hall Institute of Medical Research

Top-cited papers from Walter and Eliza Hall Institute of Medical Research

edgeR: a Bioconductor package for differential expression analysis of digital gene expression data
Mark D. Robinson, Davis J. McCarthy, Gordon K. Smyth
2009 · Bioinformatics · 44.3K citations · doi:10.1093/bioinformatics/btp616

SUMMARY: It is expected that emerging digital gene expression (DGE) technologies will overtake microarray technologies in the near future for many functional genomics applications. One of the fundamental data analysis tasks, especially for gene expression studies, involves determining whether there is evidence that counts for a transcript or exon are significantly different across experimental conditions. edgeR is a Bioconductor software package for examining differential expression of replicated count data. An overdispersed Poisson model is used to account for both biological and technical variability. Empirical Bayes methods are used to moderate the degree of overdispersion across transcripts, improving the reliability of inference. The methodology can be used even with the most minimal levels of replication, provided at least one phenotype or experimental condition is replicated. The software may have other applications beyond sequencing data, such as proteome peptide count data. AVAILABILITY: The package is freely available under the LGPL licence from the Bioconductor web site (http://bioconductor.org).
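For orientation, the "overdispersed Poisson model" mentioned in this abstract is commonly written as a negative binomial mean-variance relationship. The notation below (y_{gi} for the count of transcript g in sample i, \mu_{gi} for its mean, \phi_g for the dispersion) is introduced here for illustration and does not appear in the abstract.

```latex
\operatorname{E}(y_{gi}) = \mu_{gi}, \qquad
\operatorname{Var}(y_{gi}) = \mu_{gi} + \phi_g \, \mu_{gi}^{2}
```

With \phi_g = 0 the variance reduces to the Poisson case; the empirical Bayes step described in the abstract moderates the transcript-wise estimates of \phi_g.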

limma powers differential expression analyses for RNA-sequencing and microarray studies
Matthew E. Ritchie, Belinda Phipson, Di Wu, Yifang Hu +3 more
2015 · Nucleic Acids Research · 42.6K citations · doi:10.1093/nar/gkv007

limma is an R/Bioconductor software package that provides an integrated solution for analysing data from gene expression experiments. It contains rich features for handling complex experimental designs and for information borrowing to overcome the problem of small sample sizes. Over the past decade, limma has been a popular choice for gene discovery through differential expression analyses of microarray and high-throughput PCR data. The package contains particularly strong facilities for reading, normalizing and exploring such data. Recently, the capabilities of limma have been significantly expanded in two important directions. First, the package can now perform both differential expression and differential splicing analyses of RNA sequencing (RNA-seq) data. All the downstream analysis tools previously restricted to microarray data are now available for RNA-seq as well. These capabilities allow users to analyse both RNA-seq and microarray data with very similar pipelines. Second, the package is now able to go past the traditional gene-wise expression analyses in a variety of ways, analysing expression profiles in terms of co-regulated sets of genes or in terms of higher-order expression signatures. This provides enhanced possibilities for biological interpretation of gene expression differences. This article reviews the philosophy and design of the limma package, summarizing both new and historical features, with an emphasis on recent enhancements and features that have not been previously described.

featureCounts: an efficient general purpose program for assigning sequence reads to genomic features
Yang Liao, Gordon K. Smyth, Wei Shi
2013 · Bioinformatics · 28.7K citations · doi:10.1093/bioinformatics/btt656

MOTIVATION: Next-generation sequencing technologies generate millions of short sequence reads, which are usually aligned to a reference genome. In many applications, the key information required for downstream analysis is the number of reads mapping to each genomic feature, for example to each exon or each gene. The process of counting reads is called read summarization. Read summarization is required for a great variety of genomic analyses but has so far received relatively little attention in the literature. RESULTS: We present featureCounts, a read summarization program suitable for counting reads generated from either RNA or genomic DNA sequencing experiments. featureCounts implements highly efficient chromosome hashing and feature blocking techniques. It is considerably faster than existing methods (by an order of magnitude for gene-level summarization) and requires far less computer memory. It works with either single or paired-end reads and provides a wide range of options appropriate for different sequencing applications. AVAILABILITY AND IMPLEMENTATION: featureCounts is available under GNU General Public License as part of the Subread (http://subread.sourceforge.net) or Rsubread (http://www.bioconductor.org) software packages.
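Read summarization as described in this abstract can be illustrated with a toy sketch. This is not the featureCounts implementation: the function name, the tuple layouts, and the rule of leaving ambiguous reads unassigned are assumptions made for the example, and the per-chromosome dictionary is only a loose stand-in for the chromosome hashing the abstract mentions.

```python
from collections import defaultdict


def summarize_reads(reads, features):
    """Count reads per feature (toy read summarization).

    reads:    list of (chrom, start, end), 1-based inclusive coordinates
    features: list of (feature_id, chrom, start, end), same convention
    Returns (counts_per_feature, number_of_unassigned_reads).
    """
    # Group features by chromosome so each read is only compared
    # against features on its own chromosome.
    by_chrom = defaultdict(list)
    for fid, chrom, start, end in features:
        by_chrom[chrom].append((start, end, fid))

    counts = {fid: 0 for fid, _chrom, _start, _end in features}
    unassigned = 0
    for chrom, rstart, rend in reads:
        # Two closed intervals overlap when each starts before the other ends.
        hits = [fid for fstart, fend, fid in by_chrom.get(chrom, [])
                if fstart <= rend and rstart <= fend]
        if len(hits) == 1:
            counts[hits[0]] += 1
        else:
            # Assumption for this sketch: reads with no overlapping feature,
            # or with more than one, are left unassigned.
            unassigned += 1
    return counts, unassigned


features = [("geneA", "chr1", 100, 200),
            ("geneB", "chr1", 300, 400),
            ("geneC", "chr2", 50, 150)]
reads = [("chr1", 150, 180),   # inside geneA
         ("chr1", 190, 310),   # spans geneA and geneB
         ("chr1", 500, 550),   # overlaps nothing
         ("chr2", 140, 160),   # overlaps geneC
         ("chr1", 350, 360)]   # inside geneB
counts, unassigned = summarize_reads(reads, features)
print(counts, unassigned)
```

The linear scan over a chromosome's features keeps the sketch short; a production tool would use an indexed lookup instead.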

Bioconductor: open software development for computational biology and bioinformatics
Robert Gentleman, Vincent J. Carey, Douglas M. Bates, Ben Bolstad +4 more
2004 · Genome Biology · 12.5K citations · doi:10.1186/gb-2004-5-10-r80

The Bioconductor project is an initiative for the collaborative creation of extensible software for computational biology and bioinformatics. The goals of the project include: fostering collaborative development and widespread use of innovative software, reducing barriers to entry into interdisciplinary scientific research, and promoting the achievement of remote reproducibility of research results. We describe details of our aims and methods, identify current challenges, compare Bioconductor to other open bioinformatics projects, and provide working examples.

Linear Models and Empirical Bayes Methods for Assessing Differential Expression in Microarray Experiments
Gordon K. Smyth
2004 · Statistical Applications in Genetics and Molecular Biology · 12.0K citations · doi:10.2202/1544-6115.1027

The problem of identifying differentially expressed genes in designed microarray experiments is considered. Lonnstedt and Speed (2002) derived an expression for the posterior odds of differential expression in a replicated two-color experiment using a simple hierarchical parametric model. The purpose of this paper is to develop the hierarchical model of Lonnstedt and Speed (2002) into a practical approach for general microarray experiments with arbitrary numbers of treatments and RNA samples. The model is reset in the context of general linear models with arbitrary coefficients and contrasts of interest. The approach applies equally well to both single channel and two color microarray experiments. Consistent, closed form estimators are derived for the hyperparameters in the model. The estimators proposed have robust behavior even for small numbers of arrays and allow for incomplete data arising from spot filtering or spot quality weights. The posterior odds statistic is reformulated in terms of a moderated t-statistic in which posterior residual standard deviations are used in place of ordinary standard deviations. The empirical Bayes approach is equivalent to shrinkage of the estimated sample variances towards a pooled estimate, resulting in far more stable inference when the number of arrays is small. The use of moderated t-statistics has the advantage over the posterior odds that the number of hyperparameters which need to be estimated is reduced; in particular, knowledge of the non-null prior for the fold changes is not required. The moderated t-statistic is shown to follow a t-distribution with augmented degrees of freedom. The moderated t inferential approach extends to accommodate tests of composite null hypotheses through the use of moderated F-statistics. The performance of the methods is demonstrated in a simulation study. Results are presented for two publicly available data sets.
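The shrinkage and the moderated t-statistic described in this abstract can be written out compactly. The symbols below are introduced here for illustration and are not part of the abstract: s_g^2 is the ordinary sample variance for gene g on d_g residual degrees of freedom, s_0^2 and d_0 are the prior (pooled) variance and prior degrees of freedom, \hat\beta_{gj} is the estimated coefficient or contrast, and v_{gj} is its unscaled variance.

```latex
\tilde{s}_g^{2} = \frac{d_0 \, s_0^{2} + d_g \, s_g^{2}}{d_0 + d_g},
\qquad
\tilde{t}_{gj} = \frac{\hat{\beta}_{gj}}{\tilde{s}_g \sqrt{v_{gj}}},
\qquad
\tilde{t}_{gj} \sim t_{\,d_0 + d_g} \ \text{under the null hypothesis}
```

The posterior variance \tilde{s}_g^2 is a weighted average of the pooled and gene-specific variances, which is the shrinkage the abstract refers to, and d_0 + d_g is the "augmented" degrees of freedom.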

A scaling normalization method for differential expression analysis of RNA-seq data
Mark D. Robinson, Alicia Oshlack
2010 · Genome Biology · 8.5K citations · doi:10.1186/gb-2010-11-3-r25

The fine detail provided by sequencing-based transcriptome surveys suggests that RNA-seq is likely to become the platform of choice for interrogating steady state RNA. In order to discover biologically important changes in expression, we show that normalization continues to be an essential step in the analysis. We outline a simple and effective method for performing normalization and show dramatically improved results for inferring differential expression in simulated and publicly available data sets.

A comparison of normalization methods for high density oligonucleotide array data based on variance and bias
Benjamin M. Bolstad, Rafael A. Irizarry, Magnus Åstrand, Terence P. Speed
2003 · Bioinformatics · 8.4K citations · doi:10.1093/bioinformatics/19.2.185

MOTIVATION: When running experiments that involve multiple high density oligonucleotide arrays, it is important to remove sources of variation between arrays of non-biological origin. Normalization is a process for reducing this variation. It is common to see non-linear relations between arrays and the standard normalization provided by Affymetrix does not perform well in these situations. RESULTS: We present three methods of performing normalization at the probe intensity level. These methods are called complete data methods because they make use of data from all arrays in an experiment to form the normalizing relation. These algorithms are compared to two methods that make use of a baseline array: a one number scaling based algorithm and a method that uses a non-linear normalizing relation by comparing the variability and bias of an expression measure. Two publicly available datasets are used to carry out the comparisons. The simplest and quickest complete data method is found to perform favorably. AVAILABILITY: Software implementing all three of the complete data normalization methods is available as part of the R package Affy, which is a part of the Bioconductor project http://www.bioconductor.org. SUPPLEMENTARY INFORMATION: Additional figures may be found at http://www.stat.berkeley.edu/~bolstad/normalize/index.html

Integrated genomic analyses of ovarian carcinoma
Debra Bell, Andrew Berchuck, Michael J. Birrer +4 more
2011 · Nature · 8.1K citations · doi:10.1038/nature10166

A catalogue of molecular aberrations that cause ovarian cancer is critical for developing and deploying therapies that will improve patients’ lives. The Cancer Genome Atlas project has analysed messenger RNA expression, microRNA expression, promoter methylation and DNA copy number in 489 high-grade serous ovarian adenocarcinomas and the DNA sequences of exons from coding genes in 316 of these tumours. Here we report that high-grade serous ovarian cancer is characterized by TP53 mutations in almost all tumours (96%); low prevalence but statistically recurrent somatic mutations in nine further genes including NF1, BRCA1, BRCA2, RB1 and CDK12; 113 significant focal DNA copy number aberrations; and promoter methylation events involving 168 genes. Analyses delineated four ovarian cancer transcriptional subtypes, three microRNA subtypes, four promoter methylation subtypes and a transcriptional signature associated with survival duration, and shed new light on the impact that tumours with BRCA1/2 (BRCA1 or BRCA2) and CCNE1 aberrations have on survival. Pathway analyses suggested that homologous recombination is defective in about half of the tumours analysed, and that NOTCH and FOXM1 signalling are involved in serous ovarian cancer pathophysiology.

Gene ontology analysis for RNA-seq: accounting for selection bias
Matthew D. Young, Matthew J. Wakefield, Gordon K. Smyth, Alicia Oshlack
2010 · Genome Biology · 7.8K citations · doi:10.1186/gb-2010-11-2-r14

We present GOseq, an application for performing Gene Ontology (GO) analysis on RNA-seq data. GO analysis is widely used to reduce complexity and highlight biological processes in genome-wide expression studies, but standard methods give biased results on RNA-seq data due to over-detection of differential expression for long and highly expressed transcripts. Application of GOseq to a prostate cancer data set shows that GOseq dramatically changes the results, highlighting categories more consistent with the known biology.

voom: precision weights unlock linear model analysis tools for RNA-seq read counts
Charity W. Law, Yunshun Chen, Wei Shi, Gordon K. Smyth
2014 · Genome Biology · 6.7K citations · doi:10.1186/gb-2014-15-2-r29

New normal linear modeling strategies are presented for analyzing read counts from RNA-seq experiments. The voom method estimates the mean-variance relationship of the log-counts, generates a precision weight for each observation and enters these into the limma empirical Bayes analysis pipeline. This opens access for RNA-seq analysts to a large body of methodology developed for microarrays. Simulation studies show that voom performs as well as or better than count-based RNA-seq methods even when the data are generated according to the assumptions of the earlier methods. Two case studies illustrate the use of linear modeling and gene set testing methods.

Guidelines for the use and interpretation of assays for monitoring autophagy (3rd edition)
Daniel J. Klionsky, Kotb Abdelmohsen, Akihisa Abe, Md. Joynal Abedin +4 more
2016 · Autophagy · 6.0K citations · doi:10.1080/15548627.2015.1100356


Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variation
Davis J. McCarthy, Yunshun Chen, Gordon K. Smyth
2012 · Nucleic Acids Research · 5.8K citations · doi:10.1093/nar/gks042

A flexible statistical framework is developed for the analysis of read counts from RNA-Seq gene expression studies. It provides the ability to analyse complex experiments involving multiple treatment conditions and blocking variables while still taking full account of biological variation. Biological variation between RNA samples is estimated separately from the technical variation associated with sequencing technologies. Novel empirical Bayes methods allow each gene to have its own specific variability, even when there are relatively few biological replicates from which to estimate such variability. The pipeline is implemented in the edgeR package of the Bioconductor project. A case study analysis of carcinoma data demonstrates the ability of generalized linear model methods (GLMs) to detect differential expression in a paired design, and even to detect tumour-specific expression changes. The case study demonstrates the need to allow for gene-specific variability, rather than assuming a common dispersion across genes or a fixed relationship between abundance and variability. Genewise dispersions de-prioritize genes with inconsistent results and allow the main analysis to focus on changes that are consistent between biological replicates. Parallel computational approaches are developed to make non-linear model fitting faster and more reliable, making the application of GLMs to genomic data more convenient and practical. Simulations demonstrate the ability of adjusted profile likelihood estimators to return accurate estimators of biological variability in complex situations. When variation is gene-specific, empirical Bayes estimators provide an advantageous compromise between the extremes of assuming common dispersion or separate genewise dispersion. The methods developed here can also be applied to count data arising from DNA-Seq applications, including ChIP-Seq for epigenetic marks and DNA methylation analyses.

The Bcl-2 Protein Family: Arbiters of Cell Survival
Jerry M. Adams, Suzanne Cory
1998 · Science · 5.4K citations · doi:10.1126/science.281.5381.1322

Bcl-2 and related cytoplasmic proteins are key regulators of apoptosis, the cell suicide program critical for development, tissue homeostasis, and protection against pathogens. Those most similar to Bcl-2 promote cell survival by inhibiting adapters needed for activation of the proteases (caspases) that dismantle the cell. More distant relatives instead promote apoptosis, apparently through mechanisms that include displacing the adapters from the pro-survival proteins. Thus, for many but not all apoptotic signals, the balance between these competing activities determines cell fate. Bcl-2 family members are essential for maintenance of major organ systems, and mutations affecting them are implicated in cancer.

New Substructure Filters for Removal of Pan Assay Interference Compounds (PAINS) from Screening Libraries and for Their Exclusion in Bioassays
Jonathan B. Baell, Georgina A. Holloway
2010 · Journal of Medicinal Chemistry · 3.9K citations · doi:10.1021/jm901137j

This report describes a number of substructural features which can help to identify compounds that appear as frequent hitters (promiscuous compounds) in many biochemical high throughput screens. The compounds identified by such substructural features are not recognized by filters commonly used to identify reactive compounds. Even though these substructural features were identified using only one assay detection technology, such compounds have been reported to be active from many different assays. In fact, these compounds are increasingly prevalent in the literature as potential starting points for further exploration, whereas they may not be.

The Subread aligner: fast, accurate and scalable read mapping by seed-and-vote
Yang Liao, Gordon K. Smyth, Wei Shi
2013 · Nucleic Acids Research · 3.3K citations · doi:10.1093/nar/gkt214

Read alignment is an ongoing challenge for the analysis of data from sequencing technologies. This article proposes an elegantly simple multi-seed strategy, called seed-and-vote, for mapping reads to a reference genome. The new strategy chooses the mapped genomic location for the read directly from the seeds. It uses a relatively large number of short seeds (called subreads) extracted from each read and allows all the seeds to vote on the optimal location. When the read length is <160 bp, overlapping subreads are used. More conventional alignment algorithms are then used to fill in detailed mismatch and indel information between the subreads that make up the winning voting block. The strategy is fast because the overall genomic location has already been chosen before the detailed alignment is done. It is sensitive because no individual subread is required to map exactly, nor are individual subreads constrained to map close by other subreads. It is accurate because the final location must be supported by several different subreads. The strategy extends easily to find exon junctions, by locating reads that contain sets of subreads mapping to different exons of the same gene. It scales up efficiently for longer reads.
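The seed-and-vote idea in this abstract can be sketched with a toy example. This is not the Subread implementation: the function names, the 4-base subread length, the step size, and the example sequences are all assumptions chosen to keep the illustration small, and the later fill-in alignment between subreads is omitted.

```python
from collections import Counter, defaultdict


def build_index(reference, k):
    """Map every k-mer in the reference to the positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def seed_and_vote(read, index, k, step):
    """Return (best_read_start, votes) chosen by subread voting.

    Subreads of length k are taken every `step` bases; step < k gives
    overlapping subreads. Each exact subread hit votes for the read start
    position it implies. Returns (None, 0) when no subread maps.
    """
    votes = Counter()
    for offset in range(0, len(read) - k + 1, step):
        subread = read[offset:offset + k]
        for pos in index.get(subread, ()):
            votes[pos - offset] += 1  # implied start of the whole read
    if not votes:
        return None, 0
    return votes.most_common(1)[0]


reference = "TTGACCGTAGGCTAACGTTCAGGATCCGATAGCTTGCAAT"
index = build_index(reference, k=4)
# reference[10:26] with one mismatch (T -> A at read position 7)
read = "GCTAACGATCAGGATC"
print(seed_and_vote(read, index, k=4, step=2))  # (10, 5)
```

In this run one subread fails to map because of the mismatch and one maps to a spurious location, yet the five remaining subreads agree on position 10. That is the sense in which no individual subread needs to map exactly while the final location is still supported by several subreads.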

Pan-cancer analysis of whole genomes
Lauri A. Aaltonen, Federico Abascal, Adam Abeshouse, Hiroyuki Aburatani +4 more
2020 · Nature · 3.3K citations · doi:10.1038/s41586-020-1969-6

Cancer is driven by genetic change, and the advent of massively parallel sequencing has enabled systematic documentation of this variation at the whole-genome scale [1–3]. Here we report the integrative analysis of 2,658 whole-cancer genomes and their matching normal tissues across 38 tumour types from the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA). We describe the generation of the PCAWG resource, facilitated by international data sharing using compute clouds. On average, cancer genomes contained 4–5 driver mutations when combining coding and non-coding genomic elements; however, in around 5% of cases no drivers were identified, suggesting that cancer driver discovery is not yet complete. Chromothripsis, in which many clustered structural variants arise in a single catastrophic event, is frequently an early event in tumour evolution; in acral melanoma, for example, these events precede most somatic point mutations and affect several cancer-associated genes simultaneously. Cancers with abnormal telomere maintenance often originate from tissues with low replicative activity and show several mechanisms of preventing telomere attrition to critical levels. Common and rare germline variants affect patterns of somatic mutation, including point mutations, structural variants and somatic retrotransposition. A collection of papers from the PCAWG Consortium describes non-coding mutations that drive cancer beyond those in the TERT promoter [4]; identifies new signatures of mutational processes that cause base substitutions, small insertions and deletions and structural variation [5,6]; analyses timings and patterns of tumour evolution [7]; describes the diverse transcriptional consequences of somatic mutation on splicing, expression levels, fusion genes and promoter activity [8,9]; and evaluates a range of more-specialized features of cancer genomes [8,10–18].

The R package Rsubread is easier, faster, cheaper and better for alignment and quantification of RNA sequencing reads
Yang Liao, Gordon K. Smyth, Wei Shi
2019 · Nucleic Acids Research · 3.2K citations · doi:10.1093/nar/gkz114

We present Rsubread, a Bioconductor software package that provides high-performance alignment and read counting functions for RNA-seq reads. Rsubread is based on the successful Subread suite with the added ease-of-use of the R programming environment, creating a matrix of read counts directly as an R object ready for downstream analysis. It integrates read mapping and quantification in a single package and has no software dependencies other than R itself. We demonstrate Rsubread's ability to detect exon-exon junctions de novo and to quantify expression at the level of either genes, exons or exon junctions. The resulting read counts can be input directly into a wide range of downstream statistical analyses using other Bioconductor packages. Using SEQC data and simulations, we compare Rsubread to TopHat2, STAR and HTSeq as well as to counting functions in the Bioconductor infrastructure packages. We consider the performance of these tools on the combined quantification task starting from raw sequence reads through to summary counts, and in particular evaluate the performance of different combinations of alignment and counting algorithms. We show that Rsubread is faster and uses less memory than competitor tools and produces read count summaries that more accurately correlate with true values.

Cancer Stem Cells—Perspectives on Current Status and Future Directions: AACR Workshop on Cancer Stem Cells
Michael F. Clarke, John E. Dick, Peter B. Dirks, Connie J. Eaves +4 more
2006 · Cancer Research · 3.1K · doi:10.1158/0008-5472.can-06-3126

A workshop was convened by the AACR to discuss the rapidly emerging cancer stem cell model for tumor development and progression. The meeting participants were charged with evaluating data suggesting that cancers develop from a small subset of cells with self-renewal properties analogous to organ

A communal catalogue reveals Earth’s multiscale microbial diversity
Luke Thompson, Jon G. Sanders, Daniel McDonald, Amnon Amir +4 more
2017 · Nature · 2.9K · doi:10.1038/nature24621

Our growing awareness of the microbial world's importance and diversity contrasts starkly with our limited understanding of its fundamental structure. Despite recent advances in DNA sequencing, a lack of standardized protocols and common analytical frameworks impedes comparisons among studies, hindering the development of global inferences about microbial life on Earth. Here we present a meta-analysis of microbial community samples collected by hundreds of researchers for the Earth Microbiome Project. Coordinated protocols and new analytical methods, particularly the use of exact sequences instead of clustered operational taxonomic units, enable bacterial and archaeal ribosomal RNA gene sequences to be followed across multiple studies and allow us to explore patterns of diversity at an unprecedented scale. The result is both a reference database giving global context to DNA sequence data and a framework for incorporating data from future studies, fostering increasingly complete characterization of Earth's microbial diversity.

Detection and localization of surgically resectable cancers with a multi-analyte blood test
Joshua D. Cohen, Lu Li, Yuxuan Wang, Christopher J. Thoburn +4 more
2018 · Science · 2.8K · doi:10.1126/science.aar3247

Earlier detection is key to reducing cancer deaths. Here, we describe a blood test that can detect eight common cancer types through assessment of the levels of circulating proteins and mutations in cell-free DNA. We applied this test, called CancerSEEK, to 1005 patients with nonmetastatic, clinically detected cancers of the ovary, liver, stomach, pancreas, esophagus, colorectum, lung, or breast. CancerSEEK tests were positive in a median of 70% of the eight cancer types. The sensitivities ranged from 69 to 98% for the detection of five cancer types (ovary, liver, stomach, pancreas, and esophagus) for which there are no screening tests available for average-risk individuals. The specificity of CancerSEEK was greater than 99%: only 7 of 812 healthy controls scored positive. In addition, CancerSEEK localized the cancer to a small number of anatomic sites in a median of 83% of the patients.
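
The specificity figure quoted in this abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not code from the paper: the variable names are assumptions, and it takes every healthy control who did not score positive to be a true negative.

```python
# Specificity = true negatives / all healthy controls.
# Figures taken from the abstract: 812 healthy controls, 7 scored positive.
healthy_controls = 812
false_positives = 7

# Assumption: every control who did not score positive is a true negative.
true_negatives = healthy_controls - false_positives
specificity = true_negatives / healthy_controls

print(f"specificity = {true_negatives}/{healthy_controls} = {specificity:.4f}")
# prints "specificity = 805/812 = 0.9914"
```

A value of about 99.1% is consistent with the abstract's statement that specificity was greater than 99%.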