Stevens Institute of Technology

University · Hoboken, United States

Research output, citation impact, and the most-cited recent papers from Stevens Institute of Technology (United States). Aggregated across the NobleBlocks index of 300M+ scholarly works.

Total works

23.2K

Citations

942.3K

h-index

307

i10-index

15.8K
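
For readers unfamiliar with the two indices above, the following is a minimal sketch of how they are conventionally defined: the h-index is the largest h such that h works have at least h citations each, and the i10-index counts works with at least 10 citations. The function names and the sample citation list are hypothetical and are not taken from the NobleBlocks index.

```python
def h_index(citations):
    """Largest h such that h works have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def i10_index(citations):
    """Number of works with at least 10 citations."""
    return sum(1 for count in citations if count >= 10)


# Hypothetical per-paper citation counts, for illustration only.
sample = [25, 12, 10, 8, 3, 0]
print(h_index(sample), i10_index(sample))  # prints "4 3"
```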

Top-cited papers from Stevens Institute of Technology

Learning from Imbalanced Data

Haibo He, Edwardo A. Garcia

2009 · IEEE Transactions on Knowledge and Data Engineering · 9.9K citations · doi:10.1109/tkde.2008.239

With the continuous expansion of data availability in many large-scale, complex, and networked systems, such as surveillance, security, Internet, and finance, it becomes critical to advance the fundamental understanding of knowledge discovery and analysis from raw data to support decision-making processes. Although existing knowledge discovery and data engineering techniques have shown great success in many real-world applications, the problem of learning from imbalanced data (the imbalanced learning problem) is a relatively new challenge that has attracted growing attention from both academia and industry. The imbalanced learning problem is concerned with the performance of learning algorithms in the presence of underrepresented data and severe class distribution skews. Due to the inherent complex characteristics of imbalanced data sets, learning from such data requires new understandings, principles, algorithms, and tools to transform vast amounts of raw data efficiently into information and knowledge representation. In this paper, we provide a comprehensive review of the development of research in learning from imbalanced data. Our focus is to provide a critical review of the nature of the problem, the state-of-the-art technologies, and the current assessment metrics used to evaluate learning performance under the imbalanced learning scenario. Furthermore, in order to stimulate future research in this field, we also highlight the major opportunities and challenges, as well as potential important research directions for learning from imbalanced data.
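
The abstract above refers to assessment metrics for imbalanced learning without listing them. As a hedged illustration, the sketch below computes several metrics that are commonly used in this setting (precision, recall, F1, and G-mean) next to plain accuracy. The choice of metrics, the function name, and the confusion-matrix counts are assumptions made for illustration, not a summary of the paper.

```python
import math


def imbalance_metrics(tp, fp, fn, tn):
    """Compute common metrics from confusion-matrix counts.

    The positive class is taken to be the minority class.
    Ratios with an empty denominator are reported as 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    g_mean = math.sqrt(recall * specificity)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "g_mean": g_mean,
    }


# Hypothetical case: 10 minority and 990 majority examples, and a
# classifier that predicts "majority" for everything.
always_majority = imbalance_metrics(tp=0, fp=0, fn=10, tn=990)
print(always_majority)
```

In this hypothetical case accuracy is 0.99 while recall, F1, and G-mean are all 0, which is the kind of gap that makes accuracy alone a poor guide under severe class skew.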

Salp Swarm Algorithm: A bio-inspired optimizer for engineering design problems

Seyedali Mirjalili, Amir H. Gandomi, Seyedeh Zahra Mirjalili +4 more

2017 · Advances in Engineering Software · 5.0K citations · doi:10.1016/j.advengsoft.2017.07.002

This work proposes two novel optimization algorithms called Salp Swarm Algorithm (SSA) and Multi-objective Salp Swarm Algorithm (MSSA) for solving optimization problems with single and multiple objectives. The main inspiration of SSA and MSSA is the swarming behaviour of salps when navigating and foraging in oceans. These two algorithms are tested on several mathematical optimization functions to observe and confirm their effective behaviours in finding the optimal solutions for optimization problems. The results on the mathematical functions show that the SSA algorithm is able to improve the initial random solutions effectively and converge towards the optimum. The results of MSSA show that this algorithm can approximate Pareto optimal solutions with high convergence and coverage. The paper also considers solving several challenging and computationally expensive engineering design problems (e.g. airfoil design and marine propeller design) using SSA and MSSA. The results of the real case studies demonstrate the merits of the algorithms proposed in solving real-world problems with difficult and unknown search spaces.

ADASYN: Adaptive synthetic sampling approach for imbalanced learning

Haibo He, Yang Bai, Edwardo A. Garcia, Shutao Li

2008 · 4.5K citations · doi:10.1109/ijcnn.2008.4633969

This paper presents a novel adaptive synthetic (ADASYN) sampling approach for learning from imbalanced data sets. The essential idea of ADASYN is to use a weighted distribution for different minority class examples according to their level of difficulty in learning, where more synthetic data is generated for minority class examples that are harder to learn compared to those minority examples that are easier to learn. As a result, the ADASYN approach improves learning with respect to the data distributions in two ways: (1) reducing the bias introduced by the class imbalance, and (2) adaptively shifting the classification decision boundary toward the difficult examples. Simulation analyses on several machine learning data sets show the effectiveness of this method across five evaluation metrics.
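
The weighting idea in the abstract above can be sketched roughly as follows. This is a simplified, pure-Python illustration and not the authors' reference implementation: it scores each minority example by the share of majority examples among its k nearest neighbours, turns those scores into a distribution, and interpolates toward nearby minority examples. The function name, the parameters `k` and `beta`, the uniform fallback when no minority point has majority neighbours, and the choice to interpolate toward the k nearest minority neighbours are all assumptions.

```python
import math
import random


def adasyn_sketch(X, y, minority=1, k=5, beta=1.0, seed=0):
    """Return synthetic minority samples, weighted toward hard examples."""
    rng = random.Random(seed)
    min_idx = [i for i, label in enumerate(y) if label == minority]
    n_min = len(min_idx)
    n_maj = len(X) - n_min
    # Number of samples to generate; beta=1.0 targets a balanced set.
    total = int((n_maj - n_min) * beta)
    if total <= 0 or n_min < 2:
        return []

    # Difficulty score: fraction of majority points among the k nearest
    # neighbours (searched over the whole data set).
    ratios = []
    for i in min_idx:
        others = sorted((j for j in range(len(X)) if j != i),
                        key=lambda j: math.dist(X[i], X[j]))
        neighbours = others[:k]
        ratios.append(sum(y[j] != minority for j in neighbours) / len(neighbours))

    norm = sum(ratios)
    # Assumption: fall back to uniform weights if every score is zero.
    weights = [r / norm for r in ratios] if norm else [1 / n_min] * n_min

    synthetic = []
    for i, w in zip(min_idx, weights):
        mates = sorted((j for j in min_idx if j != i),
                       key=lambda j: math.dist(X[i], X[j]))[:k]
        # Rounding means the final count can differ slightly from `total`.
        for _ in range(round(w * total)):
            j = rng.choice(mates)
            lam = rng.random()
            synthetic.append([a + lam * (b - a) for a, b in zip(X[i], X[j])])
    return synthetic
```

Minority points surrounded by majority points receive most of the synthetic samples, while points deep inside a minority cluster receive few or none.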

Lateral Collinearity and Misleading Results in Variance-Based SEM: An Illustration and Recommendations

Ned Kock, Gary S. Lynn

2012 · Journal of the Association for Information Systems · 3.6K citations · doi:10.17705/1jais.00302

Variance-based structural equation modeling is extensively used in information systems research, and many related findings may have been distorted by hidden collinearity. This is a problem that may extend to multivariate analyses, in general, in the field of information systems as well as in many other fields. In multivariate analyses, collinearity is usually assessed as a predictor-predictor relationship phenomenon, where two or more predictors are checked for redundancy. This type of assessment addresses vertical, or “classic”, collinearity. However, another type of collinearity may also exist, here called “lateral” collinearity. It refers to predictor-criterion collinearity. Lateral collinearity problems are exemplified based on an illustrative variance-based structural equation modeling analysis. The analysis employs WarpPLS 2.0, with the results double-checked with other statistical analysis software tools. It is shown that standard validity and reliability tests do not properly capture lateral collinearity. A new approach for the assessment of both vertical and lateral collinearity in variance-based structural equation modeling is proposed and demonstrated in the context of the illustrative analysis.

Analysis of Classical Statistical Mechanics by Means of Collective Coordinates

J. K. Percus, George J. Yevick

1958 · Physical Review · 2.8K citations · doi:10.1103/physrev.110.1

The three-dimensional classical many-body system is approximated by the use of collective coordinates, through the assumed knowledge of two-body correlation functions. The resulting approximate statistical state is used to obtain the two-body correlation function. Thus, a self-consistent formulation is available for determining the correlation function. Then, the self-consistent integral equation is solved in virial expansion, and the thermodynamic quantities of the system thereby ascertained. The first three virial coefficients are exactly reproduced, while the fourth is nearly correct, as evidenced by numerical results for the case of hard spheres.

LeGO-LOAM: Lightweight and Ground-Optimized Lidar Odometry and Mapping on Variable Terrain

Tixiao Shan, Brendan Englot

2018 · 2.1K citations · doi:10.1109/iros.2018.8594299

We propose a lightweight and ground-optimized lidar odometry and mapping method, LeGO-LOAM, for realtime six degree-of-freedom pose estimation with ground vehicles. LeGO-LOAM is lightweight, as it can achieve realtime pose estimation on a low-power embedded system. LeGO-LOAM is ground-optimized, as it leverages the presence of a ground plane in its segmentation and optimization steps. We first apply point cloud segmentation to filter out noise, and feature extraction to obtain distinctive planar and edge features. A two-step Levenberg-Marquardt optimization method then uses the planar and edge features to solve different components of the six degree-of-freedom transformation across consecutive scans. We compare the performance of LeGO-LOAM with a state-of-the-art method, LOAM, using datasets gathered from variable-terrain environments with ground vehicles, and show that LeGO-LOAM achieves similar or better accuracy with reduced computational expense. We also integrate LeGO-LOAM into a SLAM framework to eliminate the pose estimation error caused by drift, which is tested using the KITTI dataset.

LIO-SAM: Tightly-coupled Lidar Inertial Odometry via Smoothing and Mapping

Tixiao Shan, Brendan Englot, Drew Meyers, Wei Wang +2 more

2020 · 2.0K citations · doi:10.1109/iros45743.2020.9341176

We propose a framework for tightly-coupled lidar inertial odometry via smoothing and mapping, LIO-SAM, that achieves highly accurate, real-time mobile robot trajectory estimation and map-building. LIO-SAM formulates lidar-inertial odometry atop a factor graph, allowing a multitude of relative and absolute measurements, including loop closures, to be incorporated from different sources as factors into the system. The estimated motion from inertial measurement unit (IMU) pre-integration de-skews point clouds and produces an initial guess for lidar odometry optimization. The obtained lidar odometry solution is used to estimate the bias of the IMU. To ensure high performance in real-time, we marginalize old lidar scans for pose optimization, rather than matching lidar scans to a global map. Scan-matching at a local scale instead of a global scale significantly improves the real-time performance of the system, as does the selective introduction of keyframes, and an efficient sliding window approach that registers a new keyframe to a fixed-size set of prior "sub-keyframes." The proposed method is extensively evaluated on datasets gathered from three platforms over various scales and environments.

Perturbation Methods in Applied Mathematics

Julian D. Cole, Lawrence E. Levine

1973 · IEEE Transactions on Systems Man and Cybernetics · 1.8K citations · doi:10.1109/tsmc.1973.4309322

1 Introduction.- 2 Limit Process Expansions Applied to Ordinary Differential Equations.- 3 Multiple-Variable Expansion Procedures.- 4 Applications to Partial Differential Equations.- 5 Examples from Fluid Mechanics.- Author Index.

Robust Anomaly Detection for Multivariate Time Series through Stochastic Recurrent Neural Network

Ya Su, Youjian Zhao, Chenhao Niu, Rong Liu +2 more

2019 · 1.5K citations · doi:10.1145/3292500.3330672

Industry devices (i.e., entities) such as server machines, spacecrafts, engines, etc., are typically monitored with multivariate time series, whose anomaly detection is critical for an entity's service quality management. However, due to the complex temporal dependence and stochasticity of multivariate time series, their anomaly detection remains a big challenge. This paper proposes OmniAnomaly, a stochastic recurrent neural network for multivariate time series anomaly detection that works well robustly for various devices. Its core idea is to capture the normal patterns of multivariate time series by learning their robust representations with key techniques such as stochastic variable connection and planar normalizing flow, reconstruct input data by the representations, and use the reconstruction probabilities to determine anomalies. Moreover, for a detected entity anomaly, OmniAnomaly can provide interpretations based on the reconstruction probabilities of its constituent univariate time series. The evaluation experiments are conducted on two public datasets from aerospace and a new server machine dataset (collected and released by us) from an Internet company. OmniAnomaly achieves an overall F1-Score of 0.86 in three real-world datasets, significantly outperforming the best performing baseline method by 0.09. The interpretation accuracy for OmniAnomaly is up to 0.89.

Statistical Mechanics of Rigid Spheres

Howard Reiss, H. L. Frisch, Joel L. Lebowitz

1959 · The Journal of Chemical Physics · 1.4K citations · doi:10.1063/1.1730361

An equilibrium theory of rigid sphere fluids is developed based on the properties of a new distribution function G(r) which measures the density of rigid sphere molecules in contact with a rigid sphere solute of arbitrary size. A number of exact relations which describe rather fully the functional form of G(r) are derived. These are based on both geometrical considerations and the virial theorem. A knowledge of G(a) where a is the diameter of a rigid sphere enables one to arrive at the equation of state. The resulting analytical expression which is exact up to the third virial coefficient gives the fourth virial coefficient within 3% and the fifth, insofar as it is known, within 5%. Furthermore over the entire range of fluid density, the equation of state derived from theory agrees with that computed using machine methods. Theory also gives an expression for the surface tension of a hard sphere fluid in contact with a perfectly repelling wall. The dependence of surface tension on curvature is also given. The expressions obtained correlate nicely with those adduced by other thermodynamic and statistical mechanical theories. They also suggest that macroscopic consideration on surface tension can sometimes be successfully extrapolated to molecular dimensions.

A convolutional neural network cascade for face detection

Haoxiang Li, Zhe Lin, Xiaohui Shen, Jonathan Brandt +1 more

2015 · 1.4K citations · doi:10.1109/cvpr.2015.7299170

In real-world face detection, large visual variations, such as those due to pose, expression, and lighting, demand an advanced discriminative model to accurately differentiate faces from the backgrounds. Consequently, effective models for the problem tend to be computationally prohibitive. To address these two conflicting challenges, we propose a cascade architecture built on convolutional neural networks (CNNs) with very powerful discriminative capability, while maintaining high performance. The proposed CNN cascade operates at multiple resolutions, quickly rejects the background regions in the fast low resolution stages, and carefully evaluates a small number of challenging candidates in the last high resolution stage. To improve localization effectiveness, and reduce the number of candidates at later stages, we introduce a CNN-based calibration stage after each of the detection stages in the cascade. The output of each calibration stage is used to adjust the detection window position for input to the subsequent stage. The proposed method runs at 14 FPS on a single CPU core for VGA-resolution images and 100 FPS using a GPU, and achieves state-of-the-art detection performance on two public face detection benchmarks.

A Definition of Systems Thinking: A Systems Approach

Ross Arnold, Jon Wade

2015 · Procedia Computer Science · 1.3K citations · doi:10.1016/j.procs.2015.03.050

This paper proposes a definition of systems thinking for use in a wide variety of disciplines, with particular emphasis on the development and assessment of systems thinking educational efforts. The definition was derived from a review of the systems thinking literature combined with the application of systems thinking to itself. Many different definitions of systems thinking can be found throughout the systems community, but key components of a singular definition can be distilled from the literature. This researcher considered these components both individually and holistically, then proposed a new definition of systems thinking that integrates these components as a system. The definition was tested for fidelity against a System Test and against three widely accepted system archetypes. Systems thinking is widely believed to be critical in handling the complexity facing the world in the coming decades; however, it still resides in the educational margins. In order for this important skill to receive mainstream educational attention, a complete definition is required. Such a definition has not yet been established. This research is an attempt to rectify this deficiency by providing such a definition.

Deep Models Under the GAN

Briland Hitaj, Giuseppe Ateniese, Fernando Pérez‐Cruz

2017 · 1.3K citations · doi:10.1145/3133956.3134012

Deep Learning has recently become hugely popular in machine learning for its ability to solve end-to-end learning systems, in which the features and the classifiers are learned simultaneously, providing significant improvements in classification accuracy in the presence of highly-structured and large databases.

The Anthropic Cosmological Principle

John D. Barrow, Frank J. Tipler, James L. Anderson

1987 · Physics Today · 1.3K citations · doi:10.1063/1.2820190

Is there any connection between the vastness of the universes of stars and galaxies and the existence of life on a small planet out in the suburbs of the Milky Way? This book shows that there is. In their classic work, John Barrow and Frank Tipler examine the question of Mankind's place in the Universe, taking the reader on a tour of many scientific disciplines and offering fascinating insights into issues such as the nature of life, the search for extraterrestrial intelligence, and the past history and fate of our universe.

Sudden Death of Entanglement

Ting Yu, J. H. Eberly

2009 · Science · 1.1K citations · doi:10.1126/science.1167343

A new development in the dynamical behavior of elementary quantum systems is the surprising discovery that correlation between two quantum units of information called qubits can be degraded by environmental noise in a way not seen previously in studies of dissipation. This new route for dissipation attacks quantum entanglement, the essential resource for quantum information as well as the central feature in the Einstein-Podolsky-Rosen so-called paradox and in discussions of the fate of Schrödinger's cat. The effect has been labeled ESD, which stands for early-stage disentanglement or, more frequently, entanglement sudden death. We review recent progress in studies focused on this phenomenon.

The transverse force on a spinning sphere moving in a viscous fluid

S. I. Rubinow, Joseph B. Keller

1961 · Journal of Fluid Mechanics · 1.1K citations · doi:10.1017/s0022112061000640

The flow about a spinning sphere moving in a viscous fluid is calculated for small values of the Reynolds number. With this solution the force and torque on the sphere are computed. It is found that in addition to the drag force determined by Stokes, the sphere experiences a force ${\bf F}_L$ orthogonal to its direction of motion. This force is given by ${\bf F}_L = \pi a^3 \rho \Omega \times {\bf V}[1 + O(R)]$. Here $a$ is the radius of the sphere, $\Omega$ is its angular velocity, ${\bf V}$ is its velocity, $\rho$ is the fluid density and $R$ is the Reynolds number, $R = \rho \mu^{-1} Va$. For small values of $R$, the transverse force is independent of the viscosity $\mu$. This force is in such a direction as to account for the curving of a pitched baseball, the long range of a spinning golf ball, etc. It is used as a basis for the discussion of the flow of a suspension of spheres through a tube. The calculation involves the Stokes and Oseen expansions. A representation of solutions of the Oseen equations in terms of two scalar functions is also presented.
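
As a worked illustration of the transverse force formula quoted in the abstract above, the sketch below evaluates the leading-order term, pi a^3 rho (Omega x V), and the Reynolds number R = rho V a / mu for one set of inputs. The function names and the numerical values (a small sphere in a water-like fluid, chosen so that R is small) are assumptions for illustration, and the O(R) correction is ignored.

```python
import math


def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def transverse_force(radius, density, omega, velocity):
    """Leading-order transverse force: pi * a^3 * rho * (Omega x V)."""
    coeff = math.pi * radius ** 3 * density
    return tuple(coeff * component for component in cross(omega, velocity))


def reynolds_number(density, viscosity, speed, radius):
    """R = rho * V * a / mu, as defined in the abstract."""
    return density * speed * radius / viscosity


# Hypothetical inputs in SI units.
a = 1e-4                 # sphere radius, m
rho = 1000.0             # fluid density, kg/m^3
mu = 1e-3                # dynamic viscosity, Pa*s
omega = (0.0, 0.0, 10.0)  # angular velocity, rad/s (about z)
v = (1e-3, 0.0, 0.0)      # velocity, m/s (along x)

print(reynolds_number(rho, mu, 1e-3, a))   # about 0.1
print(transverse_force(a, rho, omega, v))  # points along +y
```

With spin about z and motion along x, the force points along y, perpendicular to the direction of motion, and the viscosity does not enter the force expression at this order.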

On the Convergence of FedAvg on Non-IID Data

Xiang Li, Kaixuan Huang, Wenhao Yang, Shusen Wang +1 more

2019 · arXiv (Cornell University) · 1.0K citations · doi:10.48550/arxiv.1907.02189

Federated learning enables a large amount of edge computing devices to jointly learn a model without data sharing. As a leading algorithm in this setting, Federated Averaging (\texttt{FedAvg}) runs Stochastic Gradient Descent (SGD) in parallel on a small subset of the total devices and averages the sequences only once in a while. Despite its simplicity, it lacks theoretical guarantees under realistic settings. In this paper, we analyze the convergence of \texttt{FedAvg} on non-iid data and establish a convergence rate of $\mathcal{O}(\frac{1}{T})$ for strongly convex and smooth problems, where $T$ is the number of SGDs. Importantly, our bound demonstrates a trade-off between communication-efficiency and convergence rate. As user devices may be disconnected from the server, we relax the assumption of full device participation to partial device participation and study different averaging schemes; low device participation rate can be achieved without severely slowing down the learning. Our results indicate that heterogeneity of data slows down the convergence, which matches empirical observations. Furthermore, we provide a necessary condition for \texttt{FedAvg} on non-iid data: the learning rate $η$ must decay, even if full-gradient is used; otherwise, the solution will be $Ω(η)$ away from the optimal.
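
The role of learning-rate decay described in the abstract above can be illustrated with a toy model. This is a minimal sketch and not the paper's experimental setup: each device holds a one-dimensional quadratic objective with its own curvature and minimizer (a stand-in for non-iid data), every device participates in every round, and full gradients are used. The function names, the two device objectives, and the two learning-rate schedules are assumptions.

```python
def local_gd(w, a, c, lr, steps):
    """Run `steps` full-gradient steps on f(w) = 0.5 * a * (w - c)**2."""
    for _ in range(steps):
        w -= lr * a * (w - c)
    return w


def fedavg(devices, rounds, local_steps, lr_schedule, w0=0.0):
    """Each round: local training on every device, then plain averaging."""
    w = w0
    for t in range(rounds):
        lr = lr_schedule(t)
        updates = [local_gd(w, a, c, lr, local_steps) for a, c in devices]
        w = sum(updates) / len(updates)
    return w


# Hypothetical devices as (curvature a, local minimizer c).
devices = [(1.0, 0.0), (3.0, 4.0)]
# Minimizer of the average objective: sum(a * c) / sum(a) = 3.0.
w_star = sum(a * c for a, c in devices) / sum(a for a, _ in devices)

w_fixed = fedavg(devices, rounds=2000, local_steps=5, lr_schedule=lambda t: 0.1)
w_decay = fedavg(devices, rounds=2000, local_steps=5,
                 lr_schedule=lambda t: 0.2 / (1 + t))

print(w_star, w_fixed, w_decay)
```

In this toy setting the fixed learning rate settles at a point visibly away from the optimum when several local steps are taken per round, while the decaying schedule ends close to it, which is consistent with the necessary condition stated in the abstract.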

Measuring Corporate Culture Using Machine Learning

Kai Li, Feng Mai, Rui Shen, Xinyan Yan

2020 · Review of Financial Studies · 966 citations · doi:10.1093/rfs/hhaa079

We create a culture dictionary using one of the latest machine learning techniques (the word embedding model) and 209,480 earnings call transcripts. We score the five corporate cultural values of innovation, integrity, quality, respect, and teamwork for 62,664 firm-year observations over the period 2001–2018. We show that an innovative culture is broader than the usual measures of corporate innovation: R&D expenses and the number of patents. Moreover, we show that corporate culture correlates with business outcomes, including operational efficiency, risk-taking, earnings management, executive compensation design, firm value, and deal making, and that the culture-performance link is more pronounced in bad times. Finally, we present suggestive evidence that corporate culture is shaped by major corporate events, such as mergers and acquisitions.

Hands-on, simulated, and remote laboratories

Jing Ma, Jeffrey V. Nickerson

2006 · ACM Computing Surveys · 965 citations · doi:10.1145/1132960.1132961

Laboratory-based courses play a critical role in scientific education. Automation is changing the nature of these laboratories, and there is a long-running debate about the value of hands-on versus simulated laboratories. In addition, the introduction of remote laboratories adds a third category to the debate. Through a review of the literature related to these labs in education, the authors draw several conclusions about the state of current research. The debate over different technologies is confounded by the use of different educational objectives as criteria for judging the laboratories: Hands-on advocates emphasize design skills, while remote lab advocates focus on conceptual understanding. We observe that the boundaries among the three labs are blurred in the sense that most laboratories are mediated by computers, and that the psychology of presence may be as important as technology. We also discuss areas for future research.

Microplasmas and applications

K. Becker, Karl H. Schoenbach, J. G. Eden

2006 · Journal of Physics D Applied Physics · 902 citations · doi:10.1088/0022-3727/39/3/r01

Atmospheric-pressure, non-equilibrium plasmas are susceptible to instabilities and, in particular, to arcing (glow-to-arc transition). Spatially confining the plasma to dimensions of 1 mm or less is a promising approach to the generation and maintenance of stable glow discharges at atmospheric pressure. Often referred to as microdischarges or microplasmas, these weakly-ionized discharges represent a new and fascinating realm of plasma science, where issues such as the possible breakdown of 'pd scaling' and the role of boundary-dominated phenomena come to the fore. Microplasmas are generated under conditions that promote the efficient production of transient molecular species such as the rare gas excimers, which generally are formed by three-body collisions. Pulsed excitation on a sub-microsecond time scale results in microplasmas with significant shifts in both the temperatures and energy distribution functions associated with the ions and electrons. This allows for the selective production of chemically reactive species and opens the door to a wide range of new applications of microplasmas. The implementation of semiconductor and microelectronics and MEMS microfabrication techniques has resulted in the realization of microplasma arrays as large as 250,000 devices. Fabricated in silicon or ceramics with characteristic device dimensions as small as 10 µm and at packing densities up to $10^4$ cm$^{-2}$, these arrays offer optical and electrical characteristics well suited for applications in medical diagnostics, displays and environmental sensing. Several microplasma device structures, including their fundamental properties and selected applications, will be discussed.
