14 results for Low dimensional topology

at Duke University


Relevance:

80.00%

Publisher:

Abstract:

Constant technological advances have caused a data explosion in recent years. Accordingly, modern statistical and machine learning methods must be adapted to deal with complex and heterogeneous data types. This is particularly true for analyzing biological data. For example, DNA sequence data can be viewed as categorical variables, with each nucleotide taking one of four categories. Gene expression data, depending on the quantification technology, may be continuous numbers or counts. With the advancement of high-throughput technology, such data have become unprecedentedly abundant. Therefore, efficient statistical approaches are crucial in this big data era.

Previous statistical methods for big data often aim to find low dimensional structures in the observed data. For example, a factor analysis model assumes a latent Gaussian-distributed multivariate vector; under this assumption, the factor model produces a low-rank estimate of the covariance of the observed variables. Another example is the latent Dirichlet allocation model for documents, in which the mixture proportions of topics are represented by a Dirichlet-distributed variable. This dissertation proposes several novel extensions of these statistical methods to address challenges in big data. The proposed methods are applied in multiple real-world applications, including construction of condition-specific gene co-expression networks, estimation of shared topics among newsgroups, analysis of promoter sequences, analysis of political-economic risk data, and estimation of population structure from genotype data.
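
The low-rank-plus-diagonal covariance implied by a factor model can be sketched in a few lines. This is a generic illustration; the dimensions and loading matrix below are hypothetical, not taken from the dissertation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: p observed variables, k << p latent factors.
p, k = 50, 3
Lambda = rng.normal(size=(p, k))   # factor loadings (made up for illustration)
psi = np.full(p, 0.5)              # idiosyncratic noise variances

# Under the model x = Lambda @ z + eps with z ~ N(0, I_k), eps ~ N(0, diag(psi)),
# the implied covariance is low rank plus diagonal:
Sigma = Lambda @ Lambda.T + np.diag(psi)

# The shared-structure part has rank at most k, far below p.
rank_loading = np.linalg.matrix_rank(Lambda @ Lambda.T)
```

The point of the factorization is that Sigma is summarized by p*k + p parameters instead of p*(p+1)/2, which is what makes the estimate tractable for high-dimensional data.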

Relevance:

80.00%

Publisher:

Abstract:

Subspaces and manifolds are two powerful models for high dimensional signals. Subspaces model linear correlation and are a good fit to signals generated by physical systems, such as frontal images of human faces and multiple sources impinging on an antenna array. Manifolds model sources that are not linearly correlated, but where signals are determined by a small number of parameters. Examples are images of human faces under different poses or expressions, and handwritten digits with varying styles. However, there will always be some degree of model mismatch between the subspace or manifold model and the true statistics of the source. This dissertation exploits subspace and manifold models as prior information in various signal processing and machine learning tasks.

A near-low-rank Gaussian mixture model measures proximity to a union of linear or affine subspaces. This simple model can effectively capture the signal distribution when each class is near a subspace. This dissertation studies how the pairwise geometry between these subspaces affects classification performance. When model mismatch is vanishingly small, the probability of misclassification is determined by the product of the sines of the principal angles between subspaces. When the model mismatch is more significant, the probability of misclassification is determined by the sum of the squares of the sines of the principal angles. Reliability of classification is derived in terms of the distribution of signal energy across principal vectors. Larger principal angles lead to smaller classification error, motivating a linear transform that optimizes principal angles. This linear transformation, termed TRAIT, also preserves some specific features in each class, being complementary to a recently developed Low Rank Transform (LRT). Moreover, when the model mismatch is more significant, TRAIT shows superior performance compared to LRT.
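
The principal angles that drive these error exponents can be computed from the singular values of the product of orthonormal bases. The following is a generic numerical sketch (the matrices and dimensions are placeholders), not the TRAIT transform itself:

```python
import numpy as np

rng = np.random.default_rng(1)

def principal_angles(A, B):
    """Principal angles (radians) between the column spans of A and B."""
    Qa, _ = np.linalg.qr(A)
    Qb, _ = np.linalg.qr(B)
    s = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    return np.arccos(np.clip(s, -1.0, 1.0))

# Two random 3-dimensional subspaces of R^20 (hypothetical example).
A = rng.normal(size=(20, 3))
B = rng.normal(size=(20, 3))
theta = principal_angles(A, B)

# The abstract's two regimes involve sines of these angles:
prod_sines = np.prod(np.sin(theta))        # small-mismatch error exponent
sum_sq_sines = np.sum(np.sin(theta) ** 2)  # large-mismatch error exponent
```

Larger angles make both quantities larger, which is consistent with the abstract's observation that larger principal angles lead to smaller classification error.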

The manifold model enforces a constraint on the freedom of data variation. Learning features that are robust to data variation is very important, especially when the training set is small. A learning machine with a large number of parameters, e.g., a deep neural network, can describe a very complicated data distribution well. However, it is also more likely to be sensitive to small perturbations of the data, and to suffer degraded performance when generalizing to unseen (test) data.

From the perspective of the complexity of function classes, such a learning machine has a huge capacity (complexity), which tends to overfit. The manifold model provides a way of regularizing the learning machine so as to reduce the generalization error and therefore mitigate overfitting. Two overfitting-prevention approaches are proposed, one from the perspective of data variation, the other from capacity/complexity control. In the first approach, the learning machine is encouraged to make decisions that vary smoothly for data points in local neighborhoods on the manifold. In the second approach, a graph adjacency matrix is derived for the manifold, and the learned features are encouraged to align with the principal components of this adjacency matrix. Experimental results on benchmark datasets demonstrate a clear advantage of the proposed approaches when the training set is small.
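
The smoothness idea in the first approach is commonly implemented as a graph-Laplacian penalty on the learned features. This is a generic sketch with a hypothetical kNN graph, not the dissertation's exact regularizer:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical data: n points in R^d, kNN adjacency as a crude manifold proxy.
n, d, k = 30, 5, 4
X = rng.normal(size=(n, d))

dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
np.fill_diagonal(dists, np.inf)
W = np.zeros((n, n))
for i in range(n):
    W[i, np.argsort(dists[i])[:k]] = 1.0   # connect each point to its k neighbors
W = np.maximum(W, W.T)                     # symmetrize the adjacency
L = np.diag(W.sum(axis=1)) - W             # combinatorial graph Laplacian

def smoothness_penalty(F):
    """trace(F^T L F) == 0.5 * sum_ij W_ij ||F_i - F_j||^2."""
    return np.trace(F.T @ L @ F)
```

Adding this penalty to a training loss encourages feature maps F that vary slowly across neighboring points on the graph; a constant feature map incurs zero penalty.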

Stochastic optimization makes it possible to track a slowly varying subspace underlying streaming data. By approximating local neighborhoods with affine subspaces, a slowly varying manifold can be tracked efficiently as well, even with corrupted and noisy data. The more local neighborhoods are used, the better the approximation, but the higher the computational complexity. A multiscale approximation scheme is proposed, in which the local approximating subspaces are organized in a tree structure. Splitting and merging of the tree nodes then allows efficient control of the number of neighborhoods. The deviation of each datum from the learned model is estimated, yielding a series of statistics for anomaly detection. This framework extends the classical changepoint detection technique, which only works for one-dimensional signals. Simulations and experiments highlight the robustness and efficacy of the proposed approach in detecting an abrupt change in an otherwise slowly varying low-dimensional manifold.
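
A minimal, hypothetical illustration of subspace tracking on streaming data, using a projected Oja-style stochastic update rather than the dissertation's multiscale tree scheme; the per-sample residual norm plays the role of the deviation statistic mentioned above:

```python
import numpy as np

rng = np.random.default_rng(3)

def oja_track(stream, k, lr=0.05):
    """Track a k-dimensional subspace from streaming data with Oja-style updates."""
    d = stream.shape[1]
    U = np.linalg.qr(rng.normal(size=(d, k)))[0]   # random initial basis
    residuals = []
    for x in stream:
        proj = U @ (U.T @ x)
        residuals.append(np.linalg.norm(x - proj))  # deviation statistic
        U += lr * np.outer(x - proj, U.T @ x)       # stochastic gradient step
        U, _ = np.linalg.qr(U)                      # re-orthonormalize the basis
    return U, np.array(residuals)

# Synthetic stream drawn from a fixed 2D subspace of R^10 (made-up example).
d, k, n = 10, 2, 500
basis = np.linalg.qr(rng.normal(size=(d, k)))[0]
stream = rng.normal(size=(n, k)) @ basis.T
U, res = oja_track(stream, k)
# The residuals shrink as the tracked basis aligns with the true subspace;
# a sudden jump in the residual would flag a change in the underlying model.
```

A changepoint detector would then monitor this residual series rather than the raw high-dimensional stream.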

Relevance:

80.00%

Publisher:

Abstract:

Intriguing lattice dynamics has been predicted for aperiodic crystals that contain incommensurate substructures. Here we report inelastic neutron scattering measurements of phonon and magnon dispersions in Sr14Cu24O41, which contains incommensurate one-dimensional (1D) chain and two-dimensional (2D) ladder substructures. Two distinct acoustic phonon-like modes, corresponding to the sliding motion of one sublattice against the other, are observed for atomic motions polarized along the incommensurate axis. In the long wavelength limit, it is found that the sliding mode shows a remarkably small energy gap of 1.7-1.9 meV, indicating very weak interactions between the two incommensurate sublattices. The measurements also reveal a gapped and steep linear magnon dispersion of the ladder sublattice. The high group velocity of this magnon branch and weak coupling with acoustic phonons can explain the large magnon thermal conductivity in Sr14Cu24O41 crystals. In addition, the magnon specific heat is determined from the measured total specific heat and phonon density of states, and exhibits a Schottky anomaly due to gapped magnon modes of the spin chains. These findings offer new insights into the phonon and magnon dynamics and thermal transport properties of incommensurate magnetic crystals that contain low-dimensional substructures.

Relevance:

80.00%

Publisher:

Abstract:

Bayesian methods offer a flexible and convenient probabilistic learning framework to extract interpretable knowledge from complex and structured data. Such methods can characterize dependencies among multiple levels of hidden variables and share statistical strength across heterogeneous sources. In the first part of this dissertation, we develop two dependent variational inference methods for full posterior approximation in non-conjugate Bayesian models through hierarchical mixture- and copula-based variational proposals, respectively. The proposed methods move beyond the widely used factorized approximation to the posterior and provide generic applicability to a broad class of probabilistic models with minimal model-specific derivations. In the second part of this dissertation, we design probabilistic graphical models to accommodate multimodal data, describe dynamical behaviors and account for task heterogeneity. In particular, the sparse latent factor model is able to reveal common low-dimensional structures from high-dimensional data. We demonstrate the effectiveness of the proposed statistical learning methods on both synthetic and real-world data.

Relevance:

30.00%

Publisher:

Abstract:

We present a fiber-optic interferometric system for measuring depth-resolved scattering in two angular dimensions using Fourier-domain low-coherence interferometry. The system is a unique hybrid of the Michelson and Sagnac interferometer topologies. The collection arm of the interferometer is scanned in two dimensions to detect angular scattering from the sample, which can then be analyzed to determine the structure of the scatterers. A key feature of the system is the full control of polarization of both the illumination and the collection fields, allowing for polarization-sensitive detection, which is essential for two-dimensional angular measurements. System performance is demonstrated using a double-layer microsphere phantom. Experimental data from samples with different sizes and acquired with different polarizations show excellent agreement with Mie theory, producing structural measurements with subwavelength accuracy.

Relevance:

30.00%

Publisher:

Abstract:

The goal of this work is to analyze three-dimensional dispersive metallic photonic crystals (PCs) and to find a structure that can provide a bandgap and a high cutoff frequency. The determination of the band structure of a PC with dispersive materials is an expensive nonlinear eigenvalue problem; in this work we propose a rational-polynomial method to convert such a nonlinear eigenvalue problem into a linear eigenvalue problem. The spectral element method is extended to rapidly calculate the band structure of three-dimensional PCs consisting of realistic dispersive materials modeled by Drude and Drude-Lorentz models. Exponential convergence is observed in the numerical experiments. Numerical results show that, at the low frequency limit, metallic materials are similar to a perfect electric conductor, where the simulation results tend to be the same as perfect electric conductor PCs. Band structures of the scaffold structure and semi-woodpile structure metallic PCs are investigated. It is found that band structures of semi-woodpile PCs have a very high cutoff frequency as well as a bandgap between the lowest two bands and the higher bands.
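
The paper's rational-polynomial method is not reproduced here, but the general idea of converting a polynomial eigenvalue problem into a linear one can be illustrated with the standard companion linearization of a quadratic eigenvalue problem; the matrices below are random placeholders, not a photonic-crystal model:

```python
import numpy as np

rng = np.random.default_rng(5)

def linearize_qep(M, C, K):
    """Companion linearization of (lam^2 M + lam C + K) x = 0
    into the linear pencil A z = lam B z with z = [x, lam*x]."""
    n = M.shape[0]
    Z, I = np.zeros((n, n)), np.eye(n)
    A = np.block([[-K, Z], [Z, I]])
    B = np.block([[C, M], [I, Z]])
    return A, B

n = 4
M = np.eye(n)
C = rng.normal(size=(n, n))
K = rng.normal(size=(n, n))
A, B = linearize_qep(M, C, K)

# Solve the now-linear generalized eigenproblem A z = lam B z.
lam = np.linalg.eigvals(np.linalg.solve(B, A))

# Every eigenvalue of the pencil satisfies det(lam^2 M + lam C + K) = 0.
resid = [abs(np.linalg.det(l**2 * M + l * C + K)) for l in lam]
```

The same doubling trick generalizes: a degree-q polynomial (or rational) dependence on the eigenvalue yields a linear problem of q times the size, which standard eigensolvers handle directly.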

Relevance:

30.00%

Publisher:

Abstract:

The ability of diffuse reflectance spectroscopy to extract the quantitative biological composition of tissues has been used to discern tissue types in both preclinical and clinical cancer studies. Typically, diffuse reflectance spectroscopy systems are designed for single-point measurements. Clinically, an imaging system would provide valuable spatial information on tissue composition. While it is feasible to build a multiplexed fiber-optic-probe-based spectral imaging system, such systems suffer from drawbacks with respect to cost and size. To address these drawbacks, we developed a compact and low-cost system using a broadband light source with an 8-slot filter wheel for illumination and silicon photodiodes for detection. The spectral imaging system was tested on a set of tissue-mimicking liquid phantoms, which yielded an optical property extraction accuracy of 6.40 +/- 7.78% for the absorption coefficient (μa) and 11.37 +/- 19.62% for the wavelength-averaged reduced scattering coefficient (μs').

Relevance:

30.00%

Publisher:

Abstract:

We introduce an approach to the design of three-dimensional transformation optical (TO) media based on a generalized quasiconformal mapping approach. The generalized quasiconformal TO (QCTO) approach enables the design of media that can, in principle, be broadband and low loss, while controlling the propagation of waves with arbitrary angles of incidence and polarization. We illustrate the method in the design of a three-dimensional carpet ground plane cloak and of a flattened Luneburg lens. Ray-trace studies provide a confirmation of the performance of the QCTO media, while also revealing the limited performance of index-only versions of these devices.

Relevance:

30.00%

Publisher:

Abstract:

The recent emergence of human connectome imaging has led to high demand for angular and spatial resolution in diffusion magnetic resonance imaging (MRI). While there has been significant growth in high angular resolution diffusion imaging, the improvement in spatial resolution is still limited by a number of technical challenges, such as low signal-to-noise ratio and high motion artifacts. As a result, the benefit of high spatial resolution in whole-brain connectome imaging has not been fully evaluated in vivo. In this brief report, the impact of spatial resolution was assessed in a newly acquired whole-brain three-dimensional diffusion tensor imaging data set with an isotropic spatial resolution of 0.85 mm. It was found that the delineation of short cortical association fibers and the definition of fiber pathway endings at the gray/white matter boundary are drastically improved, both of which will help construct a more accurate structural map of the human brain connectome.

Relevance:

30.00%

Publisher:

Abstract:

PURPOSE: X-ray computed tomography (CT) is widely used, both clinically and preclinically, for fast, high-resolution anatomic imaging; however, compelling opportunities exist to expand its use in functional imaging applications. For instance, spectral information combined with nanoparticle contrast agents enables quantification of tissue perfusion levels, while temporal information details cardiac and respiratory dynamics. The authors propose and demonstrate a projection acquisition and reconstruction strategy for 5D CT (3D+dual energy+time) which recovers spectral and temporal information without substantially increasing radiation dose or sampling time relative to anatomic imaging protocols. METHODS: The authors approach the 5D reconstruction problem within the framework of low-rank and sparse matrix decomposition. Unlike previous work on rank-sparsity constrained CT reconstruction, the authors establish an explicit rank-sparse signal model to describe the spectral and temporal dimensions. The spectral dimension is represented as a well-sampled time and energy averaged image plus regularly undersampled principal components describing the spectral contrast. The temporal dimension is represented as the same time and energy averaged reconstruction plus contiguous, spatially sparse, and irregularly sampled temporal contrast images. Using a nonlinear, image-domain filtration approach, which the authors refer to as rank-sparse kernel regression, the authors transfer image structure from the well-sampled time and energy averaged reconstruction to the spectral and temporal contrast images. This regularization strategy strictly constrains the reconstruction problem while approximately separating the temporal and spectral dimensions. Separability results in a highly compressed representation for the 5D data in which projections are shared between the temporal and spectral reconstruction subproblems, enabling substantial undersampling. 
The authors solved the 5D reconstruction problem using the split Bregman method and GPU-based implementations of backprojection, reprojection, and kernel regression. Using a preclinical mouse model, the authors apply the proposed algorithm to study myocardial injury following radiation treatment of breast cancer. RESULTS: Quantitative 5D simulations are performed using the MOBY mouse phantom. Twenty data sets (ten cardiac phases, two energies) are reconstructed with 88 μm, isotropic voxels from 450 total projections acquired over a single 360° rotation. In vivo 5D myocardial injury data sets acquired in two mice injected with gold and iodine nanoparticles are also reconstructed with 20 data sets per mouse using the same acquisition parameters (dose: ∼60 mGy). For both the simulations and the in vivo data, the reconstruction quality is sufficient to perform material decomposition into gold and iodine maps to localize the extent of myocardial injury (gold accumulation) and to measure cardiac functional metrics (vascular iodine). Their 5D CT imaging protocol represents a 95% reduction in radiation dose per cardiac phase and energy and a 40-fold decrease in projection sampling time relative to their standard imaging protocol. CONCLUSIONS: Their 5D CT data acquisition and reconstruction protocol efficiently exploits the rank-sparse nature of spectral and temporal CT data to provide high-fidelity reconstruction results without increased radiation dose or sampling time.
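
The low-rank and sparse matrix decomposition at the heart of this framework can be illustrated, in a greatly simplified form, by alternating the proximal operators of the nuclear norm and the l1 norm. The data, thresholds, and iteration count below are illustrative, and this sketch is not the authors' rank-sparse kernel regression:

```python
import numpy as np

rng = np.random.default_rng(4)

def svt(M, tau):
    """Singular value thresholding: proximal step for the nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def shrink(M, tau):
    """Soft thresholding: proximal step for the l1 norm."""
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)

def low_rank_sparse(Y, lam=0.1, tau=0.5, iters=100):
    """Alternate proximal steps to split Y into low-rank L plus sparse S."""
    L = np.zeros_like(Y)
    S = np.zeros_like(Y)
    for _ in range(iters):
        L = svt(Y - S, tau)      # fit the smooth, shared background
        S = shrink(Y - L, lam)   # absorb sparse deviations (e.g., contrast changes)
    return L, S

# Toy data: a rank-2 background plus a few large sparse spikes (made up).
background = rng.normal(size=(40, 2)) @ rng.normal(size=(2, 40))
spikes = np.zeros((40, 40))
spikes[rng.integers(0, 40, 20), rng.integers(0, 40, 20)] = 10.0
L, S = low_rank_sparse(background + spikes)
```

In the 5D CT setting the analogous split lets the well-sampled average image carry the shared structure while undersampled projections need only resolve the sparse spectral and temporal contrast.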

Relevance:

30.00%

Publisher:

Abstract:

Successful, efficient, and safe turbine design requires a thorough understanding of the underlying physical phenomena. This research investigates the physical understanding of, and the parameters highly correlated with, flutter, an aeroelastic instability prevalent among low pressure turbine (LPT) blades in both aircraft engines and power turbines. The modern way of determining whether a certain cascade of LPT blades is susceptible to flutter is through time-expensive computational fluid dynamics (CFD) codes. These codes converge to a solution satisfying the Eulerian conservation equations subject to the boundary conditions of a nodal domain consisting of fluid and solid wall particles. Most detailed CFD codes are accompanied by cryptic turbulence models, meticulous grid constructions, and elegant boundary condition enforcements, all with one goal in mind: determine the sign (and therefore stability) of the aerodynamic damping. The main question asked by the aeroelastician is, "is it positive or negative?" This type of thought process eventually gives rise to a black-box effect, leaving physical understanding behind. Therefore, the first part of this research aims to understand and reveal the physics behind LPT flutter, in addition to several related topics including acoustic resonance effects. Part of this initial numerical investigation is completed using an influence coefficient approach to study the variation of the work-per-cycle contributions of neighboring cascade blades to a reference airfoil. The second part of this research introduces new discoveries regarding the relationship between steady aerodynamic loading and negative aerodynamic damping. 
Using validated CFD codes as computational wind tunnels, a multitude of low-pressure turbine flutter parameters, such as reduced frequency, mode shape, and interblade phase angle, will be scrutinized across various airfoil geometries and steady operating conditions to reach new design guidelines regarding the influence of steady aerodynamic loading and LPT flutter. Many pressing topics influencing LPT flutter including shocks, their nonlinearity, and three-dimensionality are also addressed along the way. The work is concluded by introducing a useful preliminary design tool that can estimate within seconds the entire aerodynamic damping versus nodal diameter curve for a given three-dimensional cascade.

Relevance:

30.00%

Publisher:

Abstract:

Light rainfall is the baseline input to the annual water budget in mountainous landscapes throughout the tropics and at mid-latitudes. In the Southern Appalachians, the contribution from light rainfall ranges from 50-60% during wet years to 80-90% during dry years, with convective activity and tropical cyclone input providing most of the interannual variability. The Southern Appalachians are a region characterized by rich biodiversity that is vulnerable to land use/land cover changes due to its proximity to a rapidly growing population. The persistent near-surface moisture and associated microclimates observed in this region have been well documented since the colonization of the area in terms of species health, fire frequency, and overall biodiversity. The overarching objective of this research is to elucidate the microphysics of light rainfall and the dynamics of low-level moisture in the inner region of the Southern Appalachians during the warm season, with a focus on orographically mediated processes. The overarching research hypothesis is that the physical processes leading to and governing the life cycle of orographic fog, low-level clouds, and precipitation, and their interactions, are strongly tied to landform, land cover, and the diurnal cycles of flow patterns, radiative forcing, and surface fluxes at the ridge-valley scale. The following science questions are addressed specifically: 1) How do orographic clouds and fog affect the hydrometeorological regime from event to annual scale, as a function of terrain characteristics and land cover? 2) What are the source areas, governing processes, and relevant time scales of near-surface moisture convergence patterns in the region? 3) What are the four-dimensional microphysical and dynamical characteristics, including variability and controlling factors and processes, of fog and light rainfall? 
The research was conducted with two major components: 1) ground-based high-quality observations using multi-sensor platforms, and 2) interpretive numerical modeling guided by analysis of the in situ data collection. Findings illuminate a high level of spatial (down to the ridge scale) and temporal (from event to annual scale) heterogeneity in observations, and a significant impact on the hydrological regime as a result of seeder-feeder interactions among fog, low-level clouds, and stratiform rainfall that enhance coalescence efficiency and lead to significantly higher rainfall rates at the land surface. Specifically, results show that concurrent fog presence can enhance an event's short-term accumulation by up to one order of magnitude. Results also show that events are modulated strongly by terrain characteristics including elevation, slope, geometry, and land cover. These factors produce interactions between highly localized flows and gradients of temperature and moisture on one hand and larger-scale circulations on the other. The resulting observations of drop size distributions (DSDs) and rainfall patterns are stratified by region and altitude and exhibit clear diurnal and seasonal cycles.

Relevance:

30.00%

Publisher:

Abstract:

A tenet of modern radiotherapy (RT) is to identify the treatment target accurately, following which the high-dose treatment volume may be expanded into the surrounding tissues in order to create the clinical and planning target volumes. Respiratory motion can induce errors in target volume delineation and dose delivery in radiation therapy for thoracic and abdominal cancers. Historically, radiotherapy treatment planning in the thoracic and abdominal regions has used 2D or 3D images acquired under uncoached free-breathing conditions, irrespective of whether the target tumor is moving or not. Once the gross target volume has been delineated, standard margins are commonly added in order to account for motion. However, these generic margins do not usually take the target motion trajectory into consideration, which may lead to under- or over-estimation of motion, with the subsequent risk of missing the target during treatment or irradiating excessive normal tissue; this introduces systematic errors into treatment planning and delivery. In clinical practice, four-dimensional (4D) imaging has become popular for RT motion management. It provides temporal information about tumor and organ-at-risk motion, and it permits patient-specific treatment planning. The most common contemporary imaging technique for identifying tumor motion is 4D computed tomography (4D-CT). However, CT has poor soft tissue contrast and involves an ionizing radiation hazard. In the last decade, 4D magnetic resonance imaging (4D-MRI) has become an emerging tool for imaging respiratory motion, especially in the abdomen, because of its superior soft-tissue contrast. Recently, several 4D-MRI techniques have been proposed, including prospective and retrospective approaches. Nevertheless, 4D-MRI techniques face several challenges: 1) suboptimal and inconsistent tumor contrast with large inter-patient variation; 2) relatively low temporal-spatial resolution; and 3) the lack of a reliable respiratory surrogate. 
In this research work, novel 4D-MRI techniques applying MRI weightings not used in existing 4D-MRI techniques, including T2/T1-weighted, T2-weighted, and diffusion-weighted MRI, were investigated. A result-driven retrospective phase sorting method was proposed and applied to both the image space and the k-space of MR imaging. Novel image-based respiratory surrogates were developed, improved, and evaluated.

Relevance:

30.00%

Publisher:

Abstract:

Highlights of Data Expedition:

• Students explored daily observations of local climate data spanning the past 35 years.
• Topological Data Analysis (TDA) provides cutting-edge tools for studying the geometry of data in arbitrarily high dimensions.
• Using TDA tools, students discovered intrinsic dynamical features of the data and learned how to quantify periodic phenomena in a time series.
• Since nature invariably produces noisy data that rarely has exact periodicity, students also considered the theoretical basis of almost-periodicity and even invented and tested new mathematical definitions of almost-periodic functions.

Summary

The dataset we used for this data expedition comes from the Global Historical Climatology Network (GHCN). "GHCN-Daily is an integrated database of daily climate summaries from land surface stations across the globe." Source: https://www.ncdc.noaa.gov/oa/climate/ghcn-daily/ We focused on the daily maximum and minimum temperatures from January 1, 1980 to April 1, 2015 collected at RDU International Airport. Through a guided series of exercises designed to be performed in Matlab, students explore these time series, initially by direct visualization and basic statistical techniques. Then students are guided through a special sliding-window construction which transforms a time series into a high-dimensional geometric curve. These high-dimensional curves can be visualized by projecting down to lower dimensions, as in Figure 1; however, our focus here was to use persistent homology to study the high-dimensional embedding directly. The shape of these curves carries meaningful information, but how one describes the "shape" of data depends on the scale at which the data is considered, and choosing the appropriate scale is rarely obvious. Persistent homology overcomes this obstacle by allowing us to quantitatively study geometric features of the data across multiple scales. 
Through this data expedition, students are introduced to numerically computing persistent homology using the Rips collapse algorithm and to interpreting the results. In the specific context of sliding-window constructions, 1-dimensional persistent homology can reveal the nature of periodic structure in the original data. I created a special technique to study how these high-dimensional sliding-window curves form loops in order to quantify the periodicity. Students are guided through this construction and learn how to visualize and interpret this information. Climate data is extremely complex (as anyone who has suffered from a bad weather prediction can attest), and numerous variables play a role in determining our daily weather and temperatures. This complexity, coupled with imperfections of measuring devices, results in very noisy data, which causes the annual seasonal periodicity to be far from exact. To this end, I have students explore existing theoretical notions of almost-periodicity and test them on the data. They find that some existing definitions are inadequate in this context. Hence I challenged them to invent new mathematics by proposing and testing their own definitions. The students rose to the challenge and suggested a number of creative definitions. While autocorrelation and spectral methods based on Fourier analysis are often used to explore periodicity, the construction here provides an alternative paradigm for quantifying periodic structure in almost-periodic signals using tools from topological data analysis.
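
The sliding-window construction described above is straightforward to sketch (here in Python rather than the Matlab used in the expedition); the window dimension and delay below are arbitrary illustrative choices, not the values used in class:

```python
import numpy as np

def sliding_window(x, dim, tau=1):
    """Embed a 1D series as points (x[t], x[t+tau], ..., x[t+(dim-1)*tau])."""
    n = len(x) - (dim - 1) * tau
    return np.stack([x[i : i + (dim - 1) * tau + 1 : tau] for i in range(n)])

# A noiseless periodic signal traces a closed loop in the embedding;
# 1-dimensional persistent homology would detect that loop as a
# long-lived class, which is how periodicity is quantified.
t = np.linspace(0, 4 * np.pi, 200)
cloud = sliding_window(np.sin(t), dim=3, tau=5)
```

Feeding such a point cloud to a persistent homology package (e.g., a Vietoris-Rips implementation) yields the persistence diagrams whose prominent 1-dimensional features measure how loop-like, and hence how periodic, the original signal is.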