811 resultados para GALAXIES, CLUSTERING
Resumo:
We propose a novel technique for conducting robust voice activity detection (VAD) in high-noise recordings. We use Gaussian mixture modeling (GMM) to train two generic models; speech and non-speech. We then score smaller segments of a given (unseen) recording against each of these GMMs to obtain two respective likelihood scores for each segment. These scores are used to compute a dissimilarity measure between pairs of segments and to carry out complete-linkage clustering of the segments into speech and non-speech clusters. We compare the accuracy of our method against state-of-the-art and standardised VAD techniques to demonstrate an absolute improvement of 15% in half-total error rate (HTER) over the best performing baseline system and across the QUT-NOISE-TIMIT database. We then apply our approach to the Audio-Visual Database of American English (AVDBAE) to demonstrate the performance of our algorithm in using visual, audio-visual or a proposed fusion of these features.
Resumo:
Genetic correlation (rg) analysis determines how much of the correlation between two measures is due to common genetic influences. In an analysis of 4 Tesla diffusion tensor images (DTI) from 531 healthy young adult twins and their siblings, we generalized the concept of genetic correlation to determine common genetic influences on white matter integrity, measured by fractional anisotropy (FA), at all points of the brain, yielding an NxN genetic correlation matrix rg(x,y) between FA values at all pairs of voxels in the brain. With hierarchical clustering, we identified brain regions with relatively homogeneous genetic determinants, to boost the power to identify causal single nucleotide polymorphisms (SNP). We applied genome-wide association (GWA) to assess associations between 529,497 SNPs and FA in clusters defined by hubs of the clustered genetic correlation matrix. We identified a network of genes, with a scale-free topology, that influences white matter integrity over multiple brain regions.
Resumo:
Imaging genetics aims to discover how variants in the human genome influence brain measures derived from images. Genome-wide association scans (GWAS) can screen the genome for common differences in our DNA that relate to brain measures. In small samples, GWAS has low power as individual gene effects are weak and one must also correct for multiple comparisons across the genome and the image. Here we extend recent work on genetic clustering of images, to analyze surface-based models of anatomy using GWAS. We performed spherical harmonic analysis of hippocampal surfaces, automatically extracted from brain MRI scans of 1254 subjects. We clustered hippocampal surface regions with common genetic influences by examining genetic correlations (r(g)) between the normalized deformation values at all pairs of surface points. Using genetic correlations to cluster surface measures, we were able to boost effect sizes for genetic associations, compared to clustering with traditional phenotypic correlations using Pearson's r.
Resumo:
To understand factors that affect brain connectivity and integrity, it is beneficial to automatically cluster white matter (WM) fibers into anatomically recognizable tracts. Whole brain tractography, based on diffusion-weighted MRI, generates vast sets of fibers throughout the brain; clustering them into consistent and recognizable bundles can be difficult as there are wide individual variations in the trajectory and shape of WM pathways. Here we introduce a novel automated tract clustering algorithm based on label fusion - a concept from traditional intensity-based segmentation. Streamline tractography generates many incorrect fibers, so our top-down approach extracts tracts consistent with known anatomy, by mapping multiple hand-labeled atlases into a new dataset. We fuse clustering results from different atlases, using a mean distance fusion scheme. We reliably extracted the major tracts from 105-gradient high angular resolution diffusion images (HARDI) of 198 young normal twins. To compute population statistics, we use a pointwise correspondence method to match, compare, and average WM tracts across subjects. We illustrate our method in a genetic study of white matter tract heritability in twins.
Resumo:
Automatic labeling of white matter fibres in diffusion-weighted brain MRI is vital for comparing brain integrity and connectivity across populations, but is challenging. Whole brain tractography generates a vast set of fibres throughout the brain, but it is hard to cluster them into anatomically meaningful tracts, due to wide individual variations in the trajectory and shape of white matter pathways. We propose a novel automatic tract labeling algorithm that fuses information from tractography and multiple hand-labeled fibre tract atlases. As streamline tractography can generate a large number of false positive fibres, we developed a top-down approach to extract tracts consistent with known anatomy, based on a distance metric to multiple hand-labeled atlases. Clustering results from different atlases were fused, using a multi-stage fusion scheme. Our "label fusion" method reliably extracted the major tracts from 105-gradient HARDI scans of 100 young normal adults. © 2012 Springer-Verlag.
Resumo:
We introduce a framework for population analysis of white matter tracts based on diffusion-weighted images of the brain. The framework enables extraction of fibers from high angular resolution diffusion images (HARDI); clustering of the fibers based partly on prior knowledge from an atlas; representation of the fiber bundles compactly using a path following points of highest density (maximum density path; MDP); and registration of these paths together using geodesic curve matching to find local correspondences across a population. We demonstrate our method on 4-Tesla HARDI scans from 565 young adults to compute localized statistics across 50 white matter tracts based on fractional anisotropy (FA). Experimental results show increased sensitivity in the determination of genetic influences on principal fiber tracts compared to the tract-based spatial statistics (TBSS) method. Our results show that the MDP representation reveals important parts of the white matter structure and considerably reduces the dimensionality over comparable fiber matching approaches.
Resumo:
Environmental acoustic recordings can be used to perform avian species richness surveys, whereby a trained ornithologist can observe the species present by listening to the recording. This could be made more efficient by using computational methods for iteratively selecting the richest parts of a long recording for the human observer to listen to, a process known as “smart sampling”. This allows scaling up to much larger ecological datasets. In this paper we explore computational approaches based on information and diversity of selected samples. We propose to use an event detection algorithm to estimate the amount of information present in each sample. We further propose to cluster the detected events for a better estimate of this amount of information. Additionally, we present a time dispersal approach to estimating diversity between iteratively selected samples. Combinations of approaches were evaluated on seven 24-hour recordings that have been manually labeled by bird watchers. The results show that on average all the methods we have explored would allow annotators to observe more new species in fewer minutes compared to a baseline of random sampling at dawn.
Resumo:
Multicentric carpotarsal osteolysis (MCTO) is a rare skeletal dysplasia characterized by aggressive osteolysis, particularly affecting the carpal and tarsal bones, and is frequently associated with progressive renal failure. Using exome capture and next-generation sequencing in five unrelated simplex cases of MCTO, we identified previously unreported missense mutations clustering within a 51 base pair region of the single exon of MAFB, validated by Sanger sequencing. A further six unrelated simplex cases with MCTO were also heterozygous for previously unreported mutations within this same region, as were affected members of two families with autosomal-dominant MCTO. MAFB encodes a transcription factor that negatively regulates RANKL-induced osteoclastogenesis and is essential for normal renal development. Identification of this gene paves the way for development of novel therapeutic approaches for this crippling disease and provides insight into normal bone and kidney development.
Resumo:
(The American Journal of Human Genetics, 90, 494–501; March 9, 2012) In the published version of this article, the amino acid alteration caused by c.161C>T should have been notated as p.Ser54Leu and not p.Pro54Leu. The wild-type amino acid is incorrectly notated in the main text, in Table 2, and in Figure 4. The authors regret this error. Additionally, The Journal regrets that this erratum, originally requested in 2012, was not published in a timely fashion.
Resumo:
This thesis has investigated how to cluster a large number of faces within a multi-media corpus in the presence of large session variation. Quality metrics are used to select the best faces to represent a sequence of faces; and session variation modelling improves clustering performance in the presence of wide variations across videos. Findings from this thesis contribute to improving the performance of both face verification systems and the fully automated clustering of faces from a large video corpus.
Resumo:
The work reported hen was motivated by a desire to verify the existence of structure - specifically MP-rich clusters induced by sodium bromide (NaBr) in the ternary liquid mixture 3-methylpyridine (Mf) + water(W) + NaBr. We present small-angle X-ray scattering (SAXS) measurements in this mixture. These measurements were obtained at room temperature (similar to 298 K) in the one-phase region (below the relevant lower consolute points, T(L)s) at different values of X (i.e., X = 0.02 - 0.17), where X is the weight fraction of NaBr in the mixture. Cluster-size distribution, estimated on the assumption that the clusters are spherical, shows systematic behaviour in that the peak of the distribution shifts rewards larger values of cluster radius as X increases. The largest spatial extent of the clusters (similar to 4.5 nm) is seen at X = 0.17. Data analysis assuming arbitrary shapes and sizes of clusters gives a limiting value of cluster size (- 4.5 nm) that is not very sensitive to X. It is suggested that the cluster size determined may not be the same as the usual critical-point fluctuations far removed from the critical point (T-L). The influence of the additional length scale due to clustering is discussed from the standpoint of crossover from Ising to mean-field critical behaviour, when moving away from the T-L.
Resumo:
Document clustering is one of the prominent methods for mining important information from the vast amount of data available on the web. However, document clustering generally suffers from the curse of dimensionality. Providentially in high dimensional space, data points tend to be more concentrated in some areas of clusters. We take advantage of this phenomenon by introducing a novel concept of dynamic cluster representation named as loci. Clusters’ loci are efficiently calculated using documents’ ranking scores generated from a search engine. We propose a fast loci-based semi-supervised document clustering algorithm that uses clusters’ loci instead of conventional centroids for assigning documents to clusters. Empirical analysis on real-world datasets shows that the proposed method produces cluster solutions with promising quality and is substantially faster than several benchmarked centroid-based semi-supervised document clustering methods.
Resumo:
The light distribution in the disks of many galaxies is ‘lopsided’ with a spatial extent much larger along one half of a galaxy than the other, as seen in M101. Recent observations show that the stellar disk in a typical spiral galaxy is significantly lopsided, indicating asymmetry in the disk mass distribution. The mean amplitude of lopsidedness is 0.1, measured as the Fourier amplitude of the m=1 component normalized to the average value. Thus, lopsidedness is common, and hence it is important to understand its origin and dynamics. This is a new and exciting area in galactic structure and dynamics, in contrast to the topic of bars and two-armed spirals (m=2) which has been extensively studied in the literature. Lopsidedness is ubiquitous and occurs in a variety of settings and tracers. It is seen in both stars and gas, in the outer disk and the central region, in the field and the group galaxies. The lopsided amplitude is higher by a factor of two for galaxies in a group. The lopsidedness has a strong impact on the dynamics of the galaxy, its evolution, the star formation in it, and on the growth of the central black hole and on the nuclear fuelling. We present here an overview of the observations that measure the lopsided distribution, as well as the theoretical progress made so far to understand its origin and properties. The physical mechanisms studied for its origin include tidal encounters, gas accretion and a global gravitational instability. The related open, challenging problems in this emerging area are discussed.
Resumo:
n this paper, a multistage evolutionary scheme is proposed for clustering in a large data base, like speech data. This is achieved by clustering a small subset of the entire sample set in each stage and treating the cluster centroids so obtained as samples, together with another subset of samples not considered previously, as input data to the next stage. This is continued till the whole sample set is exhausted. The clustering is accomplished by constructing a fuzzy similarity matrix and using the fuzzy techniques proposed here. The technique is illustrated by an efficient scheme for voiced-unvoiced-silence classification of speech.
Resumo:
The extragalactic diffuse emission at gamma-ray energies has interesting cosmological implications since these photons suffer little or no attenuation during their propagation from the site of origin. The emission could originate from either truly diffuse processes or from unresolved point sources such as AGNs, normal galaxies and starburst galaxies. Here, we examine the unresolved point source origin of the extragalactic gamma-ray background emission from normal galaxies and starburst galaxies. gamma-ray emission from normal galaxies is primarily coming from cosmic-ray interactions with interstellar matter and radiation (similar to 90%) along with a small contribution from discrete point sources (similar to 10%). Starburst galaxies are expected to have enhanced supernovae activity which leads to higher cosmic-ray densities, making starburst galaxies sufficiently luminous at gamma-ray energies to be detected by the current gamma-ray mission(Fermi Gamma-ray Space Telescope).