German Research Centre for Artificial Intelligence
Funder · Kaiserslautern, Germany
Research output, citation impact, and the most-cited recent papers from German Research Centre for Artificial Intelligence (Germany). Aggregated across the NobleBlocks index of 300M+ scholarly works.
Top-cited papers from German Research Centre for Artificial Intelligence
While the Transformer architecture has become the de-facto standard for natural language processing tasks, its applications to computer vision remain limited. In vision, attention is either applied in conjunction with convolutional networks, or used to replace certain components of convolutional networks while keeping their overall structure in place. We show that this reliance on CNNs is not necessary and a pure transformer applied directly to sequences of image patches can perform very well on image classification tasks. When pre-trained on large amounts of data and transferred to multiple mid-sized or small image recognition benchmarks (ImageNet, CIFAR-100, VTAB, etc.), Vision Transformer (ViT) attains excellent results compared to state-of-the-art convolutional networks while requiring substantially fewer computational resources to train.
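The phrase "sequences of image patches" can be made concrete with a small sketch. The snippet below is only an illustration, not the authors' implementation: the function name `image_to_patches`, the toy 4x4 single-channel image, and the patch size of 2 are all assumptions chosen for readability. It shows only how an image might be cut into non-overlapping patches and flattened into a row-major sequence. The learned linear projection, position embeddings, and the Transformer encoder are omitted.

```python
def image_to_patches(image, patch_size):
    """Split an H x W image (a list of rows) into a row-major list of flattened patches."""
    height = len(image)
    width = len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be divisible by patch_size")
    patches = []
    # Walk over the top-left corner of every non-overlapping patch.
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten one patch_size x patch_size block into a single vector.
            patch = [
                image[row][col]
                for row in range(top, top + patch_size)
                for col in range(left, left + patch_size)
            ]
            patches.append(patch)
    return patches


# Hypothetical toy input: a 4x4 image whose pixel values are 0..15.
toy_image = [[row * 4 + col for col in range(4)] for row in range(4)]
sequence = image_to_patches(toy_image, patch_size=2)
print(len(sequence), sequence[0])
```

In this toy case the image becomes a sequence of four vectors of length four, and a sequence model would consume that list in place of word tokens.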
Recognizing lines of unconstrained handwritten text is a challenging task. The difficulty of segmenting cursive or overlapping characters, combined with the need to exploit surrounding context, has led to low recognition rates for even the best current recognizers. Most recent progress in the field has been made either through improved preprocessing or through advances in language modeling. Relatively little work has been done on the basic recognition algorithms. Indeed, most systems rely on the same hidden Markov models that have been used for decades in speech and handwriting recognition, despite their well-known shortcomings. This paper proposes an alternative approach based on a novel type of recurrent neural network, specifically designed for sequence labeling tasks where the data is hard to segment and contains long-range bidirectional interdependencies. In experiments on two large unconstrained handwriting databases, our approach achieves word recognition accuracies of 79.7 percent on online data and 74.1 percent on offline data, significantly outperforming a state-of-the-art HMM-based system. In addition, we demonstrate the network's robustness to lexicon size, measure the individual influence of its hidden layers, and analyze its use of context. Last, we provide an in-depth discussion of the differences between the network and HMMs, suggesting reasons for the network's superior performance.
As of today, the fifth generation (5G) mobile communication system has been rolled out in many countries and the number of 5G subscribers already reaches a very large scale. It is time for academia and industry to shift their attention towards the next generation. At this crossroad, an overview of the current state of the art and a vision of future communications are definitely of interest. This article thus aims to provide a comprehensive survey to draw a picture of the sixth generation (6G) system in terms of drivers, use cases, usage scenarios, requirements, key performance indicators (KPIs), architecture, and enabling technologies. First, we attempt to answer the question of “Is there any need for 6G?” by shedding light on its key driving factors, in which we predict the explosive growth of mobile traffic until 2030, and envision potential use cases and usage scenarios. Second, the technical requirements of 6G are discussed and compared with those of 5G with respect to a set of KPIs in a quantitative manner. Third, the state-of-the-art 6G research efforts and activities from representative institutions and countries are summarized, and a tentative roadmap of definition, specification, standardization, and regulation is projected. Then, we identify a dozen potential technologies and introduce their principles, advantages, challenges, and open research issues. Finally, the conclusions are drawn to paint a picture of “What 6G may look like?” This survey is intended to serve as an enlightening guideline to spur interest and further investigations for subsequent research and development of 6G communications systems.
BACKGROUND: Genomewide association studies can be used to identify disease-relevant genomic regions, but interpretation of the data is challenging. The FTO region harbors the strongest genetic association with obesity, yet the mechanistic basis of this association remains elusive. METHODS: We examined epigenomic data, allelic activity, motif conservation, regulator expression, and gene coexpression patterns, with the aim of dissecting the regulatory circuitry and mechanistic basis of the association between the FTO region and obesity. We validated our predictions with the use of directed perturbations in samples from patients and from mice and with endogenous CRISPR-Cas9 genome editing in samples from patients. RESULTS: Our data indicate that the FTO allele associated with obesity represses mitochondrial thermogenesis in adipocyte precursor cells in a tissue-autonomous manner. The rs1421085 T-to-C single-nucleotide variant disrupts a conserved motif for the ARID5B repressor, which leads to derepression of a potent preadipocyte enhancer and a doubling of IRX3 and IRX5 expression during early adipocyte differentiation. This results in a cell-autonomous developmental shift from energy-dissipating beige (brite) adipocytes to energy-storing white adipocytes, with a reduction in mitochondrial thermogenesis by a factor of 5, as well as an increase in lipid storage. Inhibition of Irx3 in adipose tissue in mice reduced body weight and increased energy dissipation without a change in physical activity or appetite. Knockdown of IRX3 or IRX5 in primary adipocytes from participants with the risk allele restored thermogenesis, increasing it by a factor of 7, and overexpression of these genes had the opposite effect in adipocytes from nonrisk-allele carriers. 
Repair of the ARID5B motif by CRISPR-Cas9 editing of rs1421085 in primary adipocytes from a patient with the risk allele restored IRX3 and IRX5 repression, activated browning expression programs, and restored thermogenesis, increasing it by a factor of 7. CONCLUSIONS: Our results point to a pathway for adipocyte thermogenesis regulation involving ARID5B, rs1421085, IRX3, and IRX5, which, when manipulated, had pronounced pro-obesity and anti-obesity effects. (Funded by the German Research Center for Environmental Health and others.)
This paper addresses the lack of a commonly used, standard dataset and established benchmarking problems for physical activity monitoring. A new dataset - recorded from 18 activities performed by 9 subjects, wearing 3 IMUs and a HR-monitor - is created and made publicly available. Moreover, 4 classification problems are benchmarked on the dataset, using a standard data processing chain and 5 different classifiers. The benchmark shows the difficulty of the classification tasks and exposes new challenges for physical activity monitoring.
International challenges have become the de facto standard for comparative assessment of image analysis algorithms. Although segmentation is the most widely investigated medical image processing task, the various challenges have been organized to focus only on specific clinical tasks. We organized the Medical Segmentation Decathlon (MSD), a biomedical image analysis challenge in which algorithms compete in a multitude of both tasks and modalities, to investigate the hypothesis that a method capable of performing well on multiple tasks will generalize well to a previously unseen task and potentially outperform a custom-designed solution. MSD results confirmed this hypothesis; moreover, the MSD winner continued generalizing well to a wide range of other clinical problems for the next two years. Three main conclusions can be drawn from this study: (1) state-of-the-art image segmentation algorithms generalize well when retrained on unseen tasks; (2) consistent algorithmic performance across multiple tasks is a strong surrogate of algorithmic generalizability; (3) the training of accurate AI segmentation models is now commoditized to scientists that are not versed in AI model training.
Modern enterprise applications are currently undergoing a complete paradigm shift away from traditional transactional processing to combined analytical and transactional processing. This challenge of combining two opposing query types in a single database management system results in additional requirements for transaction management as well. In this paper, we discuss our approach to achieve high throughput for transactional query processing while allowing concurrent analytical queries. We present our approach to distributed snapshot isolation and optimized two-phase commit protocols.
A decade after its introduction, Industrie 4.0 has been established globally as the dominant paradigm for the digital transformation of the manufacturing industry. Amalgamating research-based results and practical experience from the German industry, this contribution reviews the progress made in implementing Industrie 4.0 and identifies future fields of action from a technological and application-oriented perspective. Putting the human in the center, Industrie 4.0 is the basis for data-based value creation, innovative business models, and agile forms of organization. Today, in the German manufacturing industry, the Internet of Things and cyber–physical production systems are a reality in newly built factories, and the connectivity of machinery has been significantly increased in existing factories. Now, the trends of industrial AI, edge computing up to the edge cloud, 5G in the factory, team robotics, autonomous intralogistics systems, and trustworthy data infrastructures must be leveraged to strengthen resilience, sovereignty, semantic interoperability, and sustainability. This enables the creation of digital innovation ecosystems that ensure long-term adaptability in a volatile economic and geopolitical environment. In sum, this review represents a comprehensive assessment of the status quo and identifies what is needed in the future to reap the rewards of the groundwork done in the first ten years of Industrie 4.0.
We investigate whether a classifier can continuously authenticate users based on the way they interact with the touchscreen of a smart phone. We propose a set of 30 behavioral touch features that can be extracted from raw touchscreen logs and demonstrate that different users populate distinct subspaces of this feature space. In a systematic experiment designed to test how this behavioral pattern exhibits consistency over time, we collected touch data from users interacting with a smart phone using basic navigation maneuvers, i.e., up–down and left–right scrolling. We propose a classification framework that learns the touch behavior of a user during an enrollment phase and is able to accept or reject the current user by monitoring interaction with the touch screen. The classifier achieves a median equal error rate of 0% for intrasession authentication, 2%–3% for intersession authentication, and below 4% when the authentication test was carried out one week after the enrollment phase. While our experimental findings disqualify this method as a standalone authentication mechanism for long-term authentication, it could be implemented as a means to extend screen-lock time or as a part of a multimodal biometric authentication system.
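The equal error rate (EER) quoted above is the operating point at which the false acceptance rate and the false rejection rate coincide. The sketch below is one common way to approximate it from two lists of classifier scores. It is not the authors' evaluation code: the function name, the convention that higher scores mean "more likely the enrolled user", and the example scores are all assumptions made for illustration.

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Approximate the EER by sweeping every observed score as a decision threshold."""
    thresholds = sorted(set(genuine_scores) | set(impostor_scores))
    best_gap, best_eer = None, None
    for threshold in thresholds:
        # A sample is accepted when its score is at or above the threshold.
        far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
        frr = sum(s < threshold for s in genuine_scores) / len(genuine_scores)
        gap = abs(far - frr)
        # Keep the threshold where the two error rates are closest to each other.
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer


# Hypothetical scores: one genuine sample scores lower than one impostor sample.
genuine = [0.9, 0.8, 0.7, 0.4]
impostor = [0.6, 0.3, 0.2, 0.1]
eer = equal_error_rate(genuine, impostor)
print(f"EER = {eer:.2%}")
```

When the genuine and impostor scores are perfectly separable, some threshold yields zero errors of both kinds and this function returns 0, which corresponds to the intrasession figure reported in the abstract.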
The development of Industry 4.0 will be accompanied by changing tasks and demands for the human in the factory. As the most flexible entity in cyber-physical production systems, workers will be faced with a large variety of jobs ranging from specification and monitoring to verification of production strategies. Through technological support it is guaranteed that workers can realize their full potential and adopt the role of strategic decision-makers and flexible problem-solvers. The use of established interaction technologies and metaphors from the consumer goods market seems to be promising. This paper demonstrates solutions for the technological assistance of workers, which implement the representation of a cyber-physical world and the therein occurring interactions in the form of intelligent user interfaces. Besides technological means, the paper points out the requirement for adequate qualification strategies, which will create the required, inter-disciplinary understanding for Industry 4.0.
The vision of the 4th industrial revolution describes the realization of the Internet of Things within the context of the factory to realize a significantly higher flexibility and adaptability of production systems. Driven by politics and research, most of the automation technology providers in Germany have meanwhile recognized the potential of Industry 4.0 and provide first solutions. However, the solutions presented so far represent vendor-specific or isolated production systems. In order to make Industry 4.0 a success, these proprietary approaches must be replaced by open and standardized solutions. For this reason, the SmartFactoryKL has realized a very first multi-vendor and highly modular production system as a sample reference for Industry 4.0. This contribution gives an overview of the current status of the SmartFactoryKL initiative to build a highly modular, multi-vendor production line based on common concepts and standardization activities. The findings and experiences of this multi-vendor project are documented as an outline for further research on highly modular production lines.
The Lean Production paradigm has become the major approach to create highly efficient processes in industry since the early 1990s. After the sudden end of the Computer Integrated Manufacturing (CIM) era, which was ultimately doomed to fail due to the unmanageable complexity of the required automation technology, the Lean approach was successful because of its high effectiveness in reducing complexity and avoiding non-value-creating process steps. Today, the term Industry 4.0 describes a vision of future production. Many people are at least skeptical or even hostile towards this new approach. This position paper gives an overview of existing combinations of Lean Production and automation technology, also called Lean Automation. Furthermore, it discusses major Industry 4.0 cornerstones and links them to the well-proven Lean approach. Examples of combining both are smart watches for supporting the Andon principle or Cyber Physical Systems (CPS) for a flexible Kanban production scheduling.
SEMAINE has created a large audiovisual database as a part of an iterative approach to building Sensitive Artificial Listener (SAL) agents that can engage a person in a sustained, emotionally colored conversation. Data used to build the agents came from interactions between users and an “operator” simulating a SAL agent, in different configurations: Solid SAL (designed so that operators displayed an appropriate nonverbal behavior) and Semi-automatic SAL (designed so that users' experience approximated interacting with a machine). We then recorded user interactions with the developed system, Automatic SAL, comparing the most communicatively competent version to versions with reduced nonverbal skills. High quality recording was provided by five high-resolution, high-framerate cameras, and four microphones, recorded synchronously. Recordings total 150 participants, for a total of 959 conversations with individual SAL characters, lasting approximately 5 minutes each. Solid SAL recordings are transcribed and extensively annotated: 6-8 raters per clip traced five affective dimensions and 27 associated categories. Other scenarios are labeled on the same pattern, but less fully. Additional information includes FACS annotation on selected extracts, identification of laughs, nods, and shakes, and measures of user engagement with the automatic system. The material is available through a web-accessible database.
Traditional distance and density-based anomaly detection techniques are unable to detect periodic and seasonality related point anomalies which occur commonly in streaming data, leaving a big gap in time series anomaly detection in the current era of the IoT. To address this problem, we present a novel deep learning-based anomaly detection approach (DeepAnT) for time series data, which is equally applicable to the non-streaming cases. DeepAnT is capable of detecting a wide range of anomalies, i.e., point anomalies, contextual anomalies, and discords in time series data. In contrast to the anomaly detection methods where anomalies are learned, DeepAnT uses unlabeled data to capture and learn the data distribution that is used to forecast the normal behavior of a time series. DeepAnT consists of two modules: time series predictor and anomaly detector. The time series predictor module uses a deep convolutional neural network (CNN) to predict the next time stamp on the defined horizon. This module takes a window of time series (used as a context) and attempts to predict the next time stamp. The predicted value is then passed to the anomaly detector module, which is responsible for tagging the corresponding time stamp as normal or abnormal. DeepAnT can be trained even without removing the anomalies from the given data set. Generally, in deep learning-based approaches, a lot of data are required to train a model. In DeepAnT, by contrast, a model can be trained on a relatively small data set while achieving good generalization capabilities due to the effective parameter sharing of the CNN. As the anomaly detection in DeepAnT is unsupervised, it does not rely on anomaly labels at the time of model generation.
Therefore, this approach can be directly applied to real-life scenarios where it is practically impossible to label a big stream of data coming from heterogeneous sensors comprising both normal and anomalous points. We have performed a detailed evaluation of 15 algorithms on 10 anomaly detection benchmarks, which contain a total of 433 real and synthetic time series. Experiments show that DeepAnT outperforms the state-of-the-art anomaly detection methods in most of the cases, while performing on par with others.
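The two-module structure described above (a predictor that forecasts the next time stamp from a context window, followed by a detector that tags the time stamp as normal or abnormal) can be sketched in a few lines. This is a deliberately simplified illustration and not DeepAnT itself: the CNN predictor is swapped for a plain window mean, the anomaly score is assumed to be the absolute difference between prediction and observation, and the names `predict_next` and `detect_anomalies`, the window size, and the threshold are all hypothetical.

```python
def predict_next(window):
    """Stand-in predictor: the mean of the context window (NOT the paper's CNN)."""
    return sum(window) / len(window)


def detect_anomalies(series, window_size, threshold):
    """Flag indices whose observed value deviates from the forecast by more than threshold."""
    flagged = []
    for t in range(window_size, len(series)):
        window = series[t - window_size:t]      # context preceding time stamp t
        predicted = predict_next(window)        # predictor module
        score = abs(series[t] - predicted)      # assumed anomaly score
        if score > threshold:                   # detector module
            flagged.append(t)
    return flagged


# Hypothetical series: a flat signal with a single spike at index 5.
series = [10.0, 10.0, 10.0, 10.0, 10.0, 25.0, 10.0, 10.0, 10.0, 10.0]
flagged = detect_anomalies(series, window_size=3, threshold=6.0)
print(flagged)
```

No labels are used anywhere in the sketch, which mirrors the unsupervised setting the abstract describes. The spike also leaks into the next few context windows, which is one reason a learned predictor is likely to be more robust than this naive mean.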
A worldwide movement in advanced manufacturing countries is seeking to reinvigorate (and revolutionize) the industrial and manufacturing core competencies with the use of the latest advances in information and communications technology. Visual computing plays an important role as the "glue factor" in complete solutions. This article positions visual computing in its intrinsic crucial role for Industrie 4.0 and provides a general, broad overview and points out specific directions and scenarios for future research.
While applications for mobile devices have become extremely important in the last few years, little public information exists on mobile application usage behavior. We describe a large-scale deployment-based research study that logged detailed application usage information from over 4,100 users of Android-powered mobile devices. We present two types of results from analyzing this data: basic descriptive statistics and contextual descriptive statistics. In the case of the former, we find that the average session with an application lasts less than a minute, even though users spend almost an hour a day using their phones. Our contextual findings include those related to time of day and location. For instance, we show that news applications are most popular in the morning and games are at night, but communication applications dominate through most of the day. We also find that despite the variety of apps available, communication applications are almost always the first used upon a device's waking from sleep. In addition, we discuss the notion of a virtual application sensor, which we used to collect the data.
Most paralinguistic analysis tasks are lacking agreed-upon evaluation procedures and comparability, in contrast to more ‘traditional’ disciplines in speech analysis. The INTERSPEECH 2010 Paralinguistic Challenge shall help overcome the usually low compatibility of results, by addressing three selected subchallenges. In the Age Sub-Challenge, the age of speakers has to be determined in four groups. In the Gender Sub-Challenge, a three-class classification task has to be solved and finally, the Affect Sub-Challenge asks for speakers’ interest in ordinal representation. This paper introduces the conditions, the Challenge corpora “aGender” and “TUM AVIC” and standard feature sets that may be used. Further, baseline results are given.
Brain plasticity as a neurobiological reflection of individuality is difficult to capture in animal models. Inspired by behavioral-genetic investigations of human monozygotic twins reared together, we obtained dense longitudinal activity data on 40 inbred mice living in one large enriched environment. The exploratory activity of the mice diverged over time, resulting in increasing individual differences with advancing age. Individual differences in cumulative roaming entropy, indicating the active coverage of territory, correlated positively with individual differences in adult hippocampal neurogenesis. Our results show that factors unfolding or emerging during development contribute to individual differences in structural brain plasticity and behavior. The paradigm introduced here serves as an animal model for identifying mechanisms of plasticity underlying nonshared environmental contributions to individual differences in behavior.
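The abstract does not spell out how roaming entropy is computed. One plausible reading, given that it is said to indicate the active coverage of territory, is the Shannon entropy of how an animal's recorded positions are distributed over the available locations. The sketch below follows that assumption and further assumes normalization by the logarithm of the number of locations, so that values fall between 0 and 1. The function name and the example location labels are hypothetical.

```python
import math
from collections import Counter


def roaming_entropy(visits, num_locations):
    """Normalized Shannon entropy of visit frequencies over num_locations sites (assumed definition)."""
    counts = Counter(visits)
    total = len(visits)
    # Shannon entropy of the empirical distribution over visited locations.
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    # Divide by the maximum possible entropy so the result lies in [0, 1].
    return entropy / math.log(num_locations)


# Hypothetical detections at four antenna locations labeled "a" to "d".
stay_at_home = ["a"] * 8                  # always detected at the same place
explorer = ["a", "b", "c", "d"] * 2       # evenly spread over all places
print(roaming_entropy(stay_at_home, 4), roaming_entropy(explorer, 4))
```

Under this reading, an animal that never leaves one location scores 0 and one that covers all locations evenly scores 1.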
Ondřej Bojar, Christian Federmann, Mark Fishel, Yvette Graham, Barry Haddow, Matthias Huck, Philipp Koehn, Christof Monz. Proceedings of the Third Conference on Machine Translation: Shared Task Papers. 2018.