621 resultados para Dimensionality
Resumo:
We introduce a framework for population analysis of white matter tracts based on diffusion-weighted images of the brain. The framework enables extraction of fibers from high angular resolution diffusion images (HARDI); clustering of the fibers based partly on prior knowledge from an atlas; representation of the fiber bundles compactly using a path following points of highest density (maximum density path; MDP); and registration of these paths together using geodesic curve matching to find local correspondences across a population. We demonstrate our method on 4-Tesla HARDI scans from 565 young adults to compute localized statistics across 50 white matter tracts based on fractional anisotropy (FA). Experimental results show increased sensitivity in the determination of genetic influences on principal fiber tracts compared to the tract-based spatial statistics (TBSS) method. Our results show that the MDP representation reveals important parts of the white matter structure and considerably reduces the dimensionality over comparable fiber matching approaches.
Resumo:
Despite longstanding concern with the dimensionality of the service quality construct as measured by ServQual and IS-ServQual instruments, variations on the IS-ServQual instrument have been enduringly prominent in both academic research and practice in the field of IS. We explain the continuing popularity of the instrument based on the salience of the item set for predicting overall customer satisfaction, suggesting that the preoccupation with the dimensions has been a distraction. The implicit mutual exclusivity of the items suggests a more appropriate conceptualization of IS-ServQual as a formative index. This conceptualization resolves the paradox in IS-ServQual research, that of how an instrument with such well-known and well-documented weaknesses continue to be very influential and widely used by academics and practitioners. A formative conceptualization acknowledges and addresses the criticisms of IS-ServQual, while simultaneously explaining its enduring salience by focusing on the items rather than the “dimensions.” By employing an opportunistic sample and adopting the most recent IS-ServQual instrument published in a leading IS journal (virtually, any valid IS- ServQual sample in combination with a previously tested instrument variant would suffice for study purposes), we demonstrate that when re-specified as both first-order and second-order formatives, IS-ServQual has good model quality metrics and high predictive power on customer satisfaction. We conclude that this formative specification has higher practical use and is more defensible theoretically.
Resumo:
Document clustering is one of the prominent methods for mining important information from the vast amount of data available on the web. However, document clustering generally suffers from the curse of dimensionality. Providentially in high dimensional space, data points tend to be more concentrated in some areas of clusters. We take advantage of this phenomenon by introducing a novel concept of dynamic cluster representation named as loci. Clusters’ loci are efficiently calculated using documents’ ranking scores generated from a search engine. We propose a fast loci-based semi-supervised document clustering algorithm that uses clusters’ loci instead of conventional centroids for assigning documents to clusters. Empirical analysis on real-world datasets shows that the proposed method produces cluster solutions with promising quality and is substantially faster than several benchmarked centroid-based semi-supervised document clustering methods.
Resumo:
This thesis studies document signatures, which are small representations of documents and other objects that can be stored compactly and compared for similarity. This research finds that document signatures can be effectively and efficiently used to both search and understand relationships between documents in large collections, scalable enough to search a billion documents in a fraction of a second. Deliverables arising from the research include an investigation of the representational capacity of document signatures, the publication of an open-source signature search platform and an approach for scaling signature retrieval to operate efficiently on collections containing hundreds of millions of documents.
Resumo:
The development of techniques for scaling up classifiers so that they can be applied to problems with large datasets of training examples is one of the objectives of data mining. Recently, AdaBoost has become popular among machine learning community thanks to its promising results across a variety of applications. However, training AdaBoost on large datasets is a major problem, especially when the dimensionality of the data is very high. This paper discusses the effect of high dimensionality on the training process of AdaBoost. Two preprocessing options to reduce dimensionality, namely the principal component analysis and random projection are briefly examined. Random projection subject to a probabilistic length preserving transformation is explored further as a computationally light preprocessing step. The experimental results obtained demonstrate the effectiveness of the proposed training process for handling high dimensional large datasets.
Resumo:
In this paper we tackle the problem of efficient video event detection. We argue that linear detection functions should be preferred in this regard due to their scalability and efficiency during estimation and evaluation. A popular approach in this regard is to represent a sequence using a bag of words (BOW) representation due to its: (i) fixed dimensionality irrespective of the sequence length, and (ii) its ability to compactly model the statistics in the sequence. A drawback to the BOW representation, however, is the intrinsic destruction of the temporal ordering information. In this paper we propose a new representation that leverages the uncertainty in relative temporal alignments between pairs of sequences while not destroying temporal ordering. Our representation, like BOW, is of a fixed dimensionality making it easily integrated with a linear detection function. Extensive experiments on CK+, 6DMG, and UvA-NEMO databases show significant performance improvements across both isolated and continuous event detection tasks.
Resumo:
A comparative study of the electric-field induced hopping transport probes the effective dimensionality (D) in bulk and ultrathin films of single-wall carbon nanotubes (SWNTs). The values of the scaling function exponents for the electroconductance are found to be consistent with that in three-dimensional and two-dimensional systems. The significant difference in threshold voltage in these two types of SWNTs is a consequence of the variation in the number of energetically favorable sites available for charge carriers to hop by using the energy from the field. Furthermore, a modification to the magnetotransport is observed under high electric-fields.
Resumo:
Spatial dimensionality affects the degree of confinement when an electron-hole pair is squeezed from one or more dimensions approaching the bulk exciton Bohr radius (alpha(B)) limit. The etectron-hole interaction in zero-dimensional (0D) dots, one-dimensional (1D) rods/wires, and two-dimensional (2D) wells/sheets should be enhanced by the increase in confinement dimensions in the order 0D > 1D > 2D. We report the controlled synthesis of PbS nanomateriats with 0D, 1D, and 2D forms retaining at least one dimension in the strongly confined regime far below alpha(B) (similar to 10 nm for PbS) and provide evidence through varying the exciton-phonon coupling strength that the degree of confinement is systematically weakened by the loss of confinement dimension. Geometry variations show distinguishable far-field optical polarizations, which could find useful applications in polarization-sensitive devices.
Resumo:
State-of-the-art image-set matching techniques typically implicitly model each image-set with a Gaussian distribution. Here, we propose to go beyond these representations and model image-sets as probability distribution functions (PDFs) using kernel density estimators. To compare and match image-sets, we exploit Csiszar´ f-divergences, which bear strong connections to the geodesic distance defined on the space of PDFs, i.e., the statistical manifold. Furthermore, we introduce valid positive definite kernels on the statistical manifold, which let us make use of more powerful classification schemes to match image-sets. Finally, we introduce a supervised dimensionality reduction technique that learns a latent space where f-divergences reflect the class labels of the data. Our experiments on diverse problems, such as video-based face recognition and dynamic texture classification, evidence the benefits of our approach over the state-of-the-art image-set matching methods.
Resumo:
Identifying unusual or anomalous patterns in an underlying dataset is an important but challenging task in many applications. The focus of the unsupervised anomaly detection literature has mostly been on vectorised data. However, many applications are more naturally described using higher-order tensor representations. Approaches that vectorise tensorial data can destroy the structural information encoded in the high-dimensional space, and lead to the problem of the curse of dimensionality. In this paper we present the first unsupervised tensorial anomaly detection method, along with a randomised version of our method. Our anomaly detection method, the One-class Support Tensor Machine (1STM), is a generalisation of conventional one-class Support Vector Machines to higher-order spaces. 1STM preserves the multiway structure of tensor data, while achieving significant improvement in accuracy and efficiency over conventional vectorised methods. We then leverage the theory of nonlinear random projections to propose the Randomised 1STM (R1STM). Our empirical analysis on several real and synthetic datasets shows that our R1STM algorithm delivers comparable or better accuracy to a state-of-the-art deep learning method and traditional kernelised approaches for anomaly detection, while being approximately 100 times faster in training and testing.
Resumo:
Eight new open-framework inorganic-organic hybrid compounds based on indium have been synthesized employing hydrothermal methods. All of the compounds have InO6, C2O4, and HPO3/HPO4/SO4 units connected to form structures of different dimensionality Thus, the compounds have zero- (I), two- (II, III, IV, V, VII, and VIII), and three-dimensionally (VI) extended networks. The formation of the first zero-dimensional hybrid compound is noteworthy In addition, concomitant polymorphic structures have been observed in the present study. The molecular compound, I, was found to be reactive, and the transformation studies in the presence of a base (pyridine) give rise to the polymorphic structures of II and III, while the addition of an acid (H3PO3) gives rise to a new indium phosphite with a pillared layer structure (T1). Preliminary density functional theory calculations suggest that the stabilities of the polymorphs are different, with one of the forms (II) being preferred over the other, which is consistent with the observed experimental behavior. The oxalate units perform more than one role in the present structures. Thus, the oxalate units connect two In centers to satisfy the coordination requirements as well as to achieve charge balance in compounds II, IV, and VI. The terminal oxalate units observed in compounds I, IV, and V suggest the possibility of intermediate structures. Both in-plane and out-of-plane connectivity of the oxalate units were observed in compound VI. The 31 compounds have been characterized by powder X-ray diffraction, IR spectroscopy, thermogravimetric analysis, and P-31 NMR studies.
Resumo:
In this thesis, the solar wind-magnetosphere-ionosphere coupling is studied observationally, with the main focus on the ionospheric currents in the auroral region. The thesis consists of five research articles and an introductory part that summarises the most important results reached in the articles and places them in a wider context within the field of space physics. Ionospheric measurements are provided by the International Monitor for Auroral Geomagnetic Effects (IMAGE) magnetometer network, by the low-orbit CHAllenging Minisatellite Payload (CHAMP) satellite, by the European Incoherent SCATter (EISCAT) radar, and by the Imager for Magnetopause-to-Aurora Global Exploration (IMAGE) satellite. Magnetospheric observations, on the other hand, are acquired from the four spacecraft of the Cluster mission, and solar wind observations from the Advanced Composition Explorer (ACE) and Wind spacecraft. Within the framework of this study, a new method for determining the ionospheric currents from low-orbit satellite-based magnetic field data is developed. In contrast to previous techniques, all three current density components can be determined on a matching spatial scale, and the validity of the necessary one-dimensionality approximation, and thus, the quality of the results, can be estimated directly from the data. The new method is applied to derive an empirical model for estimating the Hall-to-Pedersen conductance ratio from ground-based magnetic field data, and to investigate the statistical dependence of the large-scale ionospheric currents on solar wind and geomagnetic parameters. Equations describing the amount of field-aligned current in the auroral region, as well as the location of the auroral electrojets, as a function of these parameters are derived. Moreover, the mesoscale (10-1000 km) ionospheric equivalent currents related to two magnetotail plasma sheet phenomena, bursty bulk flows and flux ropes, are studied. Based on the analysis of 22 events, the typical equivalent current pattern related to bursty bulk flows is established. For the flux ropes, on the other hand, only two conjugate events are found. As the equivalent current patterns during these two events are not similar, it is suggested that the ionospheric signatures of a flux rope depend on the orientation and the length of the structure, but analysis of additional events is required to determine the possible ionospheric connection of flux ropes.
Resumo:
The output of a laser is a high frequency propagating electromagnetic field with superior coherence and brightness compared to that emitted by thermal sources. A multitude of different types of lasers exist, which also translates into large differences in the properties of their output. Moreover, the characteristics of the electromagnetic field emitted by a laser can be influenced from the outside, e.g., by injecting an external optical field or by optical feedback. In the case of free-running solitary class-B lasers, such as semiconductor and Nd:YVO4 solid-state lasers, the phase space is two-dimensional, the dynamical variables being the population inversion and the amplitude of the electromagnetic field. The two-dimensional structure of the phase space means that no complex dynamics can be found. If a class-B laser is perturbed from its steady state, then the steady state is restored after a short transient. However, as discussed in part (i) of this Thesis, the static properties of class-B lasers, as well as their artificially or noise induced dynamics around the steady state, can be experimentally studied in order to gain insight on laser behaviour, and to determine model parameters that are not known ab initio. In this Thesis particular attention is given to the linewidth enhancement factor, which describes the coupling between the gain and the refractive index in the active material. A highly desirable attribute of an oscillator is stability, both in frequency and amplitude. Nowadays, however, instabilities in coupled lasers have become an active area of research motivated not only by the interesting complex nonlinear dynamics but also by potential applications. In part (ii) of this Thesis the complex dynamics of unidirectionally coupled, i.e., optically injected, class-B lasers is investigated. An injected optical field increases the dimensionality of the phase space to three by turning the phase of the electromagnetic field into an important variable. This has a radical effect on laser behaviour, since very complex dynamics, including chaos, can be found in a nonlinear system with three degrees of freedom. The output of the injected laser can be controlled in experiments by varying the injection rate and the frequency of the injected light. In this Thesis the dynamics of unidirectionally coupled semiconductor and Nd:YVO4 solid-state lasers is studied numerically and experimentally.
Resumo:
An efficient and statistically robust solution for the identification of asteroids among numerous sets of astrometry is presented. In particular, numerical methods have been developed for the short-term identification of asteroids at discovery, and for the long-term identification of scarcely observed asteroids over apparitions, a task which has been lacking a robust method until now. The methods are based on the solid foundation of statistical orbital inversion properly taking into account the observational uncertainties, which allows for the detection of practically all correct identifications. Through the use of dimensionality-reduction techniques and efficient data structures, the exact methods have a loglinear, that is, O(nlog(n)), computational complexity, where n is the number of included observation sets. The methods developed are thus suitable for future large-scale surveys which anticipate a substantial increase in the astrometric data rate. Due to the discontinuous nature of asteroid astrometry, separate sets of astrometry must be linked to a common asteroid from the very first discovery detections onwards. The reason for the discontinuity in the observed positions is the rotation of the observer with the Earth as well as the motion of the asteroid and the observer about the Sun. Therefore, the aim of identification is to find a set of orbital elements that reproduce the observed positions with residuals similar to the inevitable observational uncertainty. Unless the astrometric observation sets are linked, the corresponding asteroid is eventually lost as the uncertainty of the predicted positions grows too large to allow successful follow-up. Whereas the presented identification theory and the numerical comparison algorithm are generally applicable, that is, also in fields other than astronomy (e.g., in the identification of space debris), the numerical methods developed for asteroid identification can immediately be applied to all objects on heliocentric orbits with negligible effects due to non-gravitational forces in the time frame of the analysis. The methods developed have been successfully applied to various identification problems. Simulations have shown that the methods developed are able to find virtually all correct linkages despite challenges such as numerous scarce observation sets, astrometric uncertainty, numerous objects confined to a limited region on the celestial sphere, long linking intervals, and substantial parallaxes. Tens of previously unknown main-belt asteroids have been identified with the short-term method in a preliminary study to locate asteroids among numerous unidentified sets of single-night astrometry of moving objects, and scarce astrometry obtained nearly simultaneously with Earth-based and space-based telescopes has been successfully linked despite a substantial parallax. Using the long-term method, thousands of realistic 3-linkages typically spanning several apparitions have so far been found among designated observation sets each spanning less than 48 hours.
Resumo:
Due to their non-stationarity, finite-horizon Markov decision processes (FH-MDPs) have one probability transition matrix per stage. Thus the curse of dimensionality affects FH-MDPs more severely than infinite-horizon MDPs. We propose two parametrized 'actor-critic' algorithms to compute optimal policies for FH-MDPs. Both algorithms use the two-timescale stochastic approximation technique, thus simultaneously performing gradient search in the parametrized policy space (the 'actor') on a slower timescale and learning the policy gradient (the 'critic') via a faster recursion. This is in contrast to methods where critic recursions learn the cost-to-go proper. We show w.p 1 convergence to a set with the necessary condition for constrained optima. The proposed parameterization is for FHMDPs with compact action sets, although certain exceptions can be handled. Further, a third algorithm for stochastic control of stopping time processes is presented. We explain why current policy evaluation methods do not work as critic to the proposed actor recursion. Simulation results from flow-control in communication networks attest to the performance advantages of all three algorithms.