Latent class analysis has been used to model measurement error, to identify flawed survey questions, and to estimate mode effects. Using data from a survey of University of Maryland alumni together with alumni records, we evaluate this technique to determine its usefulness for detecting bad questions in the survey context. Two sets of latent class analysis models are applied in this evaluation: latent class models with three indicators and latent class models with two indicators under different assumptions about prevalence and error rates. Our results indicate that the latent class analysis approach produces good qualitative results for the latent class models: the item that the model deemed the worst was indeed the worst according to the true scores. However, the approach yields weaker quantitative estimates of the error rates for a given item.
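As a hedged sketch of the technique the abstract evaluates (not the paper's own models or data), the following fits a two-class latent class model with three binary indicators by expectation-maximization; all data, class counts, and starting values here are illustrative assumptions.

```python
import math
import random

# Illustrative sketch only: EM for a two-class latent class model with three
# binary indicators. The simulated respondents below are NOT the Maryland
# alumni data; prevalence and response probabilities are made up.
random.seed(0)

true_prev = 0.4
true_p = [[0.1, 0.2, 0.15],   # P(yes | class 0) for each indicator
          [0.8, 0.9, 0.85]]   # P(yes | class 1)
data = []
for _ in range(500):
    c = 1 if random.random() < true_prev else 0
    data.append([1 if random.random() < true_p[c][j] else 0 for j in range(3)])

# EM from naive starting values.
prev, p = 0.5, [[0.3, 0.3, 0.3], [0.7, 0.7, 0.7]]
loglik_trace = []
for _ in range(50):
    # E-step: posterior probability of class 1 for each respondent.
    post, ll = [], 0.0
    for y in data:
        lik = []
        for c in (0, 1):
            pc = prev if c == 1 else 1 - prev
            for j in range(3):
                pc *= p[c][j] if y[j] else 1 - p[c][j]
            lik.append(pc)
        ll += math.log(lik[0] + lik[1])
        post.append(lik[1] / (lik[0] + lik[1]))
    loglik_trace.append(ll)
    # M-step: re-estimate prevalence and conditional response probabilities.
    prev = sum(post) / len(data)
    for c in (0, 1):
        w = [q if c == 1 else 1 - q for q in post]
        for j in range(3):
            p[c][j] = sum(wi * yy[j] for wi, yy in zip(w, data)) / sum(w)
```

The fitted `p[c][j]` play the role of per-item error rates; EM guarantees that the log-likelihood trace is non-decreasing.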
This paper presents an approach to generalize the concept of isogeometric analysis by allowing different spaces for the parameterization of the computational domain and for the approximation of the solution field. The method inherits the main advantage of isogeometric analysis, i.e., it preserves the original exact computer-aided design geometry (for example, given by nonuniform rational B-splines), but allows pairing it with an approximation space that is more suitable or flexible for analysis, for example, T-splines, LR-splines, (truncated) hierarchical B-splines, and PHT-splines. This generalization offers the advantage of adaptive local refinement without the need to reparameterize the domain, and therefore without weakening the link with the computer-aided design model. We demonstrate the use of the method with different choices of geometry and field spaces and show that, despite the failure of the standard patch test, the optimal convergence rate is achieved for nonnested spaces.
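The B-spline bases underlying NURBS and their hierarchical and T-spline generalizations can be evaluated with the Cox-de Boor recursion; the sketch below (knot vector and degree are illustrative choices, not tied to the paper) verifies the partition-of-unity property that such bases satisfy.

```python
# Hedged sketch: Cox-de Boor recursion for B-spline basis functions, the
# building block of the NURBS/T-spline/hierarchical spaces discussed above.

def bspline_basis(i, k, t, knots):
    """Value of the i-th B-spline basis function of degree k at parameter t."""
    if k == 0:
        return 1.0 if knots[i] <= t < knots[i + 1] else 0.0
    left = right = 0.0
    if knots[i + k] != knots[i]:
        left = ((t - knots[i]) / (knots[i + k] - knots[i])
                * bspline_basis(i, k - 1, t, knots))
    if knots[i + k + 1] != knots[i + 1]:
        right = ((knots[i + k + 1] - t) / (knots[i + k + 1] - knots[i + 1])
                 * bspline_basis(i + 1, k - 1, t, knots))
    return left + right

# Open (clamped) cubic knot vector on [0, 1]; 7 basis functions.
knots = [0, 0, 0, 0, 0.25, 0.5, 0.75, 1, 1, 1, 1]
degree = 3
n_basis = len(knots) - degree - 1  # = 7
# Partition of unity: the basis functions sum to 1 at every interior parameter.
sums = [sum(bspline_basis(i, degree, t, knots) for i in range(n_basis))
        for t in (0.1, 0.3, 0.6, 0.9)]
```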
We explore T-splines, a generalization of NURBS enabling local refinement, as a basis for isogeometric analysis. We review T-splines as a surface design methodology and then develop them for engineering analysis applications. We test T-splines on some elementary two-dimensional and three-dimensional fluid and structural analysis problems and attain good results in all cases. We summarize the current status of T-splines, their limitations, and future possibilities.
Lagrangian analysis is a powerful way to analyse the output of ocean circulation models and other ocean velocity data such as from altimetry. In the Lagrangian approach, large sets of virtual particles are integrated within the three-dimensional, time-evolving velocity fields. Over several decades, a variety of tools and methods for this purpose have emerged. Here, we review the state of the art in the field of Lagrangian analysis of ocean velocity data, starting from a fundamental kinematic framework and with a focus on large-scale open ocean applications. Beyond the use of explicit velocity fields, we consider the influence of unresolved physics and dynamics on particle trajectories. We comprehensively list and discuss the tools currently available for tracking virtual particles. We then showcase some of the innovative applications of trajectory data, and conclude with some open questions and an outlook. The overall goal of this review paper is to reconcile some of the different techniques and methods in Lagrangian ocean analysis, while recognising the rich diversity of codes that have emerged and continue to emerge, and the challenges of the coming age of petascale computing.
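The core kinematic step, integrating a virtual particle through a velocity field, can be sketched as follows; the analytic solid-body-rotation field and step sizes are illustrative assumptions (real tools interpolate gridded model velocities in space and time).

```python
import math

# Minimal sketch of Lagrangian particle tracking: RK4 time stepping of one
# virtual particle in an analytic 2D velocity field u = -y, v = x
# (solid-body rotation about the origin).

def velocity(x, y):
    return -y, x

def rk4_step(x, y, dt):
    k1 = velocity(x, y)
    k2 = velocity(x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1])
    k3 = velocity(x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1])
    k4 = velocity(x + dt * k3[0], y + dt * k3[1])
    return (x + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6,
            y + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6)

# Advect a particle through a quarter rotation: (1, 0) should reach (0, 1).
x, y, steps = 1.0, 0.0, 200
dt = (math.pi / 2) / steps
for _ in range(steps):
    x, y = rk4_step(x, y, dt)
```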
We consider the problem of performing interpretable classification in the high-dimensional setting, in which the number of features is very large and the number of observations is limited. This setting has been studied extensively in the chemometrics literature, and more recently has become commonplace in biological and medical applications. In this setting, a traditional approach involves performing feature selection before classification. We propose sparse discriminant analysis, a method for performing linear discriminant analysis with a sparseness criterion imposed such that classification and feature selection are performed simultaneously. Sparse discriminant analysis is based on the optimal scoring interpretation of linear discriminant analysis, and can be extended to perform sparse discrimination via mixtures of Gaussians if boundaries between classes are nonlinear or if subgroups are present within each class. Our proposal also provides low-dimensional views of the discriminative directions.
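The optimal-scoring view mentioned above can be sketched in the two-class case: the discriminant direction is obtained by regressing a class score on the features, and an l1-style shrinkage zeroes out uninformative features. This is an illustration of the idea only, not the paper's algorithm; the data, the simple soft-threshold standing in for the lasso penalty, and the threshold value are all assumptions.

```python
import numpy as np

# Hedged sketch: two-class LDA via optimal scoring (least squares on class
# scores), with a soft-threshold as a stand-in for the lasso penalty so that
# classification and feature selection happen together. Illustrative data:
# only feature 0 separates the classes; features 1-4 are noise.
rng = np.random.default_rng(0)

n = 200
y = np.repeat([1.0, -1.0], n // 2)      # class scores (+1 / -1)
X = rng.standard_normal((n, 5))
X[:, 0] += 3.0 * y                      # informative feature

Xc = X - X.mean(axis=0)
beta, *_ = np.linalg.lstsq(Xc, y - y.mean(), rcond=None)

lam = 0.1                               # assumed shrinkage level
beta_sparse = np.sign(beta) * np.maximum(np.abs(beta) - lam, 0.0)

selected = np.flatnonzero(beta_sparse)  # features kept by the shrinkage
accuracy = np.mean(np.sign(Xc @ beta_sparse) == y)
```

The shrinkage drops the noise features while the retained direction still separates the classes.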
Principal component analysis (PCA) is widely used in data processing and dimensionality reduction. However, PCA suffers from the fact that each principal component is a linear combination of all the original variables, which often makes the results difficult to interpret. We introduce a new method called sparse principal component analysis (SPCA) using the lasso (elastic net) to produce modified principal components with sparse loadings. We first show that PCA can be formulated as a regression-type optimization problem; sparse loadings are then obtained by imposing the lasso (elastic net) constraint on the regression coefficients. Efficient algorithms are proposed to fit our SPCA models for both regular multivariate data and gene expression arrays. We also give a new formula to compute the total variance of modified principal components. As illustrations, SPCA is applied to real and simulated data with encouraging results.
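The effect of the l1 penalty can be illustrated in miniature: compute the leading principal component, then soft-threshold its loadings so that variables with small contributions drop to exactly zero. This is only an illustration of why lasso-style penalties yield sparse loadings; the actual SPCA algorithm solves a penalized regression problem, and the data and threshold below are assumptions.

```python
import numpy as np

# Hedged sketch: dense leading-PC loadings via SVD, then a lasso-style
# soft-threshold to produce exact zeros. Illustrative data: three columns
# share a common signal, two columns are low-variance noise.
rng = np.random.default_rng(1)

n = 300
signal = rng.standard_normal(n)
X = np.column_stack(
    [signal + 0.1 * rng.standard_normal(n) for _ in range(3)]
    + [0.1 * rng.standard_normal(n) for _ in range(2)])
Xc = X - X.mean(axis=0)

_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
v = Vt[0]                               # dense loadings: nonzero everywhere

lam = 0.2                               # assumed penalty level
v_sparse = np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)
v_sparse /= np.linalg.norm(v_sparse)

n_zero = int(np.sum(v_sparse == 0))     # noise columns get exactly zero loading
```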
Aiming at solving large-scale optimization problems, this paper studies distributed optimization methods based on the alternating direction method of multipliers (ADMM). By formulating the optimization problem as a consensus problem, the ADMM can be used to solve the consensus problem in a fully parallel fashion over a computer network with a star topology. However, traditional synchronized computation does not scale well with the problem size, as the speed of the algorithm is limited by the slowest workers. This is particularly true in a heterogeneous network where the computing nodes experience different computation and communication delays. In this paper, we propose an asynchronous distributed ADMM (AD-ADMM), which can effectively improve the time efficiency of distributed optimization. Our main interest lies in analyzing the convergence conditions of the AD-ADMM, under the popular partially asynchronous model, which is defined based on a maximum tolerable delay of the network. Specifically, by considering general and possibly non-convex cost functions, we show that the AD-ADMM is guaranteed to converge to the set of Karush-Kuhn-Tucker (KKT) points as long as the algorithm parameters are chosen appropriately according to the network delay. We further illustrate that the asynchrony of the ADMM has to be handled with care, as slightly modifying the implementation of the AD-ADMM can jeopardize the algorithm convergence, even under the standard convex setting.
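The synchronous consensus baseline that the paper improves on can be sketched on a toy problem; everything below (local quadratic costs, `rho`, iteration count) is an illustrative assumption, and the paper's asynchronous variant relaxes the wait-for-all consensus update marked in the comments.

```python
# Minimal sketch of synchronous consensus ADMM (scaled form): N "workers"
# each hold a local cost f_i(x) = (x - a_i)^2, so the consensus solution is
# the average of the a_i.
a = [1.0, 2.0, 3.0, 4.0]          # local data held by each worker
rho = 1.0                         # assumed penalty parameter
x = [0.0] * len(a)                # local primal variables
u = [0.0] * len(a)                # scaled dual variables
z = 0.0                           # consensus variable

for _ in range(200):
    # Local x-updates (run in parallel on the workers in a real deployment):
    # x_i = argmin (x - a_i)^2 + (rho/2)(x - z + u_i)^2
    x = [(2 * ai + rho * (z - ui)) / (2 + rho) for ai, ui in zip(a, u)]
    # Master z-update: in the synchronous scheme the master must wait for
    # every worker here, so the slowest worker sets the pace.
    z = sum(xi + ui for xi, ui in zip(x, u)) / len(a)
    # Dual updates.
    u = [ui + xi - z for ui, xi in zip(u, x)]
```

Here `z` converges to the average of the `a_i` (2.5); the asynchronous AD-ADMM lets the master update `z` after hearing from only a subset of workers, subject to the maximum tolerable delay.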
We propose to combine cepstrum and nonlinear time–frequency (TF) analysis to study multiple component oscillatory signals with time-varying frequency and amplitude and with time-varying non-sinusoidal oscillatory pattern. The concept of cepstrum is applied to eliminate the wave-shape function influence on the TF analysis, and we propose a new algorithm, named de-shape synchrosqueezing transform (de-shape SST). The mathematical model, adaptive non-harmonic model, is introduced and the de-shape SST algorithm is theoretically analyzed. In addition to simulated signals, several different physiological, musical and biological signals are analyzed to illustrate the proposed algorithm.
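The cepstrum concept the abstract builds on can be illustrated directly: for an oscillation with a non-sinusoidal wave-shape, the log-magnitude spectrum contains a comb of harmonics, and the cepstrum (inverse FFT of the log spectrum) converts that comb into a peak at the oscillation's period. The signal below is an illustrative assumption; this is the plain cepstrum, not the de-shape SST algorithm itself.

```python
import numpy as np

# Hedged illustration of the cepstrum of a non-sinusoidal oscillation:
# fundamental with period 50 samples plus two harmonics (the "wave-shape").
N, period = 1000, 50
n = np.arange(N)
s = (np.cos(2 * np.pi * n / period)
     + 0.5 * np.cos(4 * np.pi * n / period)
     + 0.3 * np.cos(6 * np.pi * n / period))

# Real cepstrum: inverse FFT of the log-magnitude spectrum (small epsilon
# guards against log of zero).
log_spectrum = np.log(np.abs(np.fft.rfft(s)) + 1e-8)
cepstrum = np.fft.irfft(log_spectrum, N)

# Away from the origin, the cepstral peak sits at the period in samples.
quefrencies = np.arange(25, 76)
peak = int(quefrencies[np.argmax(cepstrum[25:76])])
```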
While estimation of the marginal (total) causal effect of a point exposure on an outcome is arguably the most common objective of experimental and observational studies in the health and social sciences, in recent years, investigators have also become increasingly interested in mediation analysis. Specifically, upon evaluating the total effect of the exposure, investigators routinely wish to make inferences about the direct and indirect pathways of the exposure's effect, operating through or around a mediator variable that occurs after the exposure and prior to the outcome. Although powerful semiparametric methodologies have been developed to analyze observational studies that produce doubly robust and highly efficient estimates of the marginal total causal effect, similar methods for mediation analysis are currently lacking. Thus, this paper develops a general semiparametric framework for obtaining inferences about so-called marginal natural direct and indirect causal effects, while appropriately accounting for a large number of pre-exposure confounding factors for the exposure and the mediator variables. Our analytic framework is particularly appealing, because it gives new insights on issues of efficiency and robustness in the context of mediation analysis. In particular, we propose new multiply robust locally efficient estimators of the marginal natural indirect and direct causal effects, and develop a novel doubly robust sensitivity analysis framework for the assumption of ignorability of the mediator variable.
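In the simplest linear, unconfounded setting (a far more restrictive model than the paper's semiparametric framework), the natural direct and indirect effects reduce to the classic product-of-coefficients formulas, which the sketch below recovers from simulated data with assumed coefficients a = 0.5, b = 0.7, c = 0.3.

```python
import numpy as np

# Hedged sketch, NOT the paper's estimators: with M = a*A + e and
# Y = c*A + b*M + e (no confounding), the natural indirect effect is a*b
# and the natural direct effect is c. Two least-squares fits recover both.
rng = np.random.default_rng(2)

n = 20000
A = rng.standard_normal(n)                    # exposure
M = 0.5 * A + rng.standard_normal(n)          # mediator
Y = 0.3 * A + 0.7 * M + rng.standard_normal(n)

a_hat = np.linalg.lstsq(A[:, None], M, rcond=None)[0][0]
c_hat, b_hat = np.linalg.lstsq(np.column_stack([A, M]), Y, rcond=None)[0]

nie = a_hat * b_hat   # natural indirect effect, true value 0.35
nde = c_hat           # natural direct effect, true value 0.30
```

The paper's contribution is precisely to move beyond this parametric special case, with estimators that remain valid when some of the working models are misspecified.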
We introduce here Momocs, a package intended to ease and popularize modern morphometrics with R, and particularly outline analysis, which aims to extract quantitative variables from shapes. It mostly hinges on the functions published in the book entitled Modern Morphometrics Using R by Claude (2008). From outline extraction from raw data to multivariate analysis, Momocs provides an integrated and convenient toolkit for students and researchers who are, or may become, interested in describing shape and its variation. The methods implemented so far in Momocs are introduced through a simple case study that tests whether two sets of bottles have different shapes.
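The kind of quantitative variables outline analysis extracts can be sketched outside R (this is a generic radius-based Fourier illustration in Python, not the Momocs API, and the ellipse outline is an illustrative assumption): a closed outline is reduced to the Fourier spectrum of its centered radius function, whose harmonics serve as shape variables.

```python
import numpy as np

# Hedged sketch of radius-based outline analysis: sample a closed outline,
# take radii from the centroid, and use Fourier harmonic amplitudes as
# shape descriptors. For an ellipse the radius varies twice per revolution,
# so the second harmonic carries the most power.
theta = np.linspace(0, 2 * np.pi, 256, endpoint=False)
a_axis, b_axis = 2.0, 1.0
x, y = a_axis * np.cos(theta), b_axis * np.sin(theta)  # outline points
r = np.hypot(x, y)                                     # radii from centroid

coeffs = np.abs(np.fft.rfft(r - r.mean()))
dominant = int(np.argmax(coeffs[1:10])) + 1            # strongest harmonic
```

Comparing such harmonic spectra between two groups of outlines (e.g., the two sets of bottles) is then a standard multivariate problem.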