6 results for Feature space
in University of Queensland eSpace - Australia
Abstract:
In this paper, we present a novel indexing technique called Multi-scale Similarity Indexing (MSI) to index an image's multiple features in a single one-dimensional structure. For both text and visual feature spaces, the similarity between a point and the center of a local partition in the individual space is used as the indexing key, where similarity values in different features are distinguished by different scales. A single indexing tree can then be built on these keys. Based on the property that relevant images have similar similarity values to the center of the same local partition in any feature space, a certain number of irrelevant images can be quickly pruned using the triangle inequality on the indexing keys. To remove the dimensionality curse that exists in high-dimensional structures, we propose a new technique called Local Bit Stream (LBS). LBS transforms an image's text and visual feature representations into simple, uniform and effective bit stream (BS) representations based on the local partition's center. Such BS representations are small in size and fast to compare, since only bit operations are involved. By comparing the common bits in two BSs, most irrelevant images can be filtered immediately. To integrate multiple features effectively, we also investigated the following evidence combination techniques: Certainty Factor, Dempster-Shafer Theory, Compound Probability, and Linear Combination. Our extensive experiments showed that a single one-dimensional index on multiple features greatly improves on multiple indices on multiple features. Our LBS method outperforms sequential scan on high-dimensional space by an order of magnitude. Certainty Factor and Dempster-Shafer Theory perform best in combining multiple similarities from the corresponding multiple features.
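Two of the evidence combination techniques named in this abstract can be illustrated with a minimal sketch. It assumes that each per-feature similarity is already scaled to [0, 1] and can be treated directly as a non-negative certainty factor, which the abstract does not state. The function names and example values are hypothetical. The certainty factor rule shown is the classic MYCIN-style parallel combination, not necessarily the exact variant the authors used.

```python
from functools import reduce


def combine_certainty_factors(cf_a: float, cf_b: float) -> float:
    """MYCIN-style combination of two non-negative certainty factors."""
    if not (0.0 <= cf_a <= 1.0 and 0.0 <= cf_b <= 1.0):
        raise ValueError("this sketch only handles certainty factors in [0, 1]")
    # Each new piece of evidence closes part of the remaining gap to 1.
    return cf_a + cf_b * (1.0 - cf_a)


def linear_combination(similarities, weights):
    """Weighted average of per-feature similarities."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * s for w, s in zip(weights, similarities)) / total


# Hypothetical similarities of one image to a query in two feature spaces.
text_sim, visual_sim = 0.6, 0.5

cf_score = reduce(combine_certainty_factors, [text_sim, visual_sim])
lin_score = linear_combination([text_sim, visual_sim], [0.5, 0.5])

print(f"certainty factor: {cf_score:.2f}, linear: {lin_score:.2f}")
```

The certainty factor rule rewards agreement between features (two moderate similarities yield a combined score higher than either), whereas the linear combination stays between the two inputs.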
Abstract:
This paper considers a model-based approach to the clustering of tissue samples on a very large number of genes from microarray experiments. It is a nonstandard problem in parametric cluster analysis because the dimension of the feature space (the number of genes) is typically much greater than the number of tissues. Frequently in practice, clinical data are also available on the cases from which the tissue samples were obtained. Here we investigate how to use the clinical data in conjunction with the microarray gene expression data to cluster the tissue samples. We propose two mixture model-based approaches in which the number of components in the mixture model corresponds to the number of clusters to be imposed on the tissue samples. One approach specifies the components of the mixture model to be the conditional distributions of the microarray data given the clinical data, with the mixing proportions also conditioned on the latter data. The other takes the components of the mixture model to represent the joint distributions of the clinical and microarray data. The approaches are demonstrated on some breast cancer data, as studied recently in van't Veer et al. (2002).
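The two formulations described in this abstract can be written out as follows. The notation is an assumption introduced here for illustration: y denotes the microarray expression vector of a tissue sample, x the clinical data for the same case, g the number of components (clusters), π_i the mixing proportions, and f_i the component densities. The abstract itself does not fix any symbols or a parametric form for f_i.

```latex
% Approach 1: components are conditional densities of the microarray data
% given the clinical data; mixing proportions also depend on x.
f(y \mid x) = \sum_{i=1}^{g} \pi_i(x)\, f_i(y \mid x)

% Approach 2: components are joint densities of clinical and microarray data.
f(x, y) = \sum_{i=1}^{g} \pi_i\, f_i(x, y)
```

In either case a tissue sample would typically be assigned to the component with the highest estimated posterior probability, which is the usual convention in mixture model-based clustering.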
Abstract:
In this paper, we present a novel indexing technique called Multi-scale Similarity Indexing (MSI) to index an image's multiple features in a single one-dimensional structure. For both text and visual feature spaces, the similarity between a point and the center of a local partition in the individual space is used as the indexing key, where similarity values in different features are distinguished by different scales. A single indexing tree can then be built on these keys. Based on the property that relevant images have similar similarity values to the center of the same local partition in any feature space, a certain number of irrelevant images can be quickly pruned using the triangle inequality on the indexing keys. To remove the "dimensionality curse" that exists in high-dimensional structures, we propose a new technique called Local Bit Stream (LBS). LBS transforms an image's text and visual feature representations into simple, uniform and effective bit stream (BS) representations based on the local partition's center. Such BS representations are small in size and fast to compare, since only bit operations are involved. By comparing the common bits in two BSs, most irrelevant images can be filtered immediately. Our extensive experiments showed that a single one-dimensional index on multiple features greatly improves on multiple indices on multiple features. Our LBS method outperforms sequential scan on high-dimensional space by an order of magnitude.
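The idea of pruning with the triangle inequality on a one-dimensional key can be sketched as below. This is not the authors' MSI implementation. It assumes Euclidean distance in place of the paper's similarity measure, a sorted list in place of an indexing tree, and an offset scheme in the style of iDistance (key = partition number × scale + distance to the partition center) to keep partitions apart in one key space. All names and data are hypothetical.

```python
import bisect
import math


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_index(points, centers, scale):
    """Return one sorted list of (key, point_id) covering every partition."""
    entries = []
    for pid, p in enumerate(points):
        dists = [dist(p, c) for c in centers]
        part = min(range(len(centers)), key=dists.__getitem__)
        if dists[part] >= scale:
            raise ValueError("scale must exceed every point-to-center distance")
        # The partition offset keeps keys of different partitions apart.
        entries.append((part * scale + dists[part], pid))
    entries.sort()
    return entries


def range_query(entries, points, centers, scale, query, radius):
    """Return (ids within radius of query, number of exact distance checks)."""
    hits, checked = [], 0
    for part, center in enumerate(centers):
        dq = dist(query, center)
        # Triangle inequality: |d(q, c) - d(p, c)| <= d(q, p), so a point p
        # within `radius` of q must have d(p, c) in [dq - radius, dq + radius].
        low = part * scale + max(dq - radius, 0.0)
        high = part * scale + min(dq + radius, scale)
        start = bisect.bisect_left(entries, (low, -1))
        stop = bisect.bisect_right(entries, (high, len(points)))
        for _, pid in entries[start:stop]:
            checked += 1  # candidates still need an exact distance check
            if dist(query, points[pid]) <= radius:
                hits.append(pid)
    return sorted(hits), checked


points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
centers = [(0, 0), (10, 10)]
scale = 100.0

entries = build_index(points, centers, scale)
hits, checked = range_query(entries, points, centers, scale, (0.5, 0.0), 0.6)
print(hits, checked)
```

In this toy example only the three points of the first partition are compared exactly; the second partition is skipped entirely because none of its keys fall inside the admissible key range.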
Abstract:
Conventionally, research on document classification focuses on improving the learning capabilities of classifiers. Nevertheless, according to our observation, the effectiveness of classification is limited by the suitability of the document representation. Intuitively, the more features that are used in a representation, the more comprehensively documents are represented. However, if a representation contains too many irrelevant features, the classifier suffers not only from the curse of high dimensionality, but also from overfitting. To address this problem of the suitability of document representations, we present a classifier-independent approach to measuring the effectiveness of document representations. Our approach utilises a labelled document corpus to estimate the distribution of documents in the feature space. By looking at documents in this way, we can clearly identify the contributions made by different features toward document classification. Some experiments have been performed to show how the effectiveness is evaluated. Our approach can be used as a tool to assist feature selection, dimensionality reduction and document classification.
Abstract:
We propose a novel interpretation and usage of Neural Networks (NN) in modeling physiological signals, which are allowed to be nonlinear and/or nonstationary. The method consists of training an NN for the k-step prediction of a physiological signal, and then examining the connection-weight-space (CWS) of the NN to extract information about the signal generator mechanism. We define a novel feature, Normalized Vector Separation (gamma(ij)), to measure the separation of two arbitrary states i and j in the CWS and use it to track the state changes of the generating system. The performance of the method is examined via synthetic signals and clinical EEG. Synthetic data indicate that gamma(ij) can track the system down to an SNR of 3.5 dB. Clinical data obtained from three patients undergoing carotid endarterectomy of the brain showed that EEG could be modeled (within a root-mean-squared error of 0.01) by the proposed method, and the blood perfusion state of the brain could be monitored via gamma(ij), with small NNs having no more than 21 connection weights altogether.
Abstract:
This paper reports on a total electron content space weather study of the nighttime Weddell Sea Anomaly, overlooked by previously published TOPEX/Poseidon climate studies, and of the nighttime ionosphere during the 1996/1997 southern summer. To ascertain the morphology of spatial TEC distribution over the oceans in terms of hourly, geomagnetic, longitudinal and summer-winter variations, the TOPEX TEC, magnetic, and published neutral wind velocity data are utilized. To understand the underlying physical processes, the TEC results are combined with inclination and declination data plus global magnetic field-line maps. To investigate spatial and temporal TEC variations, geographic/magnetic latitudes and local times are computed. As results show, the nighttime Weddell Sea Anomaly is a large (∼1,600 (°)²; ∼22 million km² estimated for a steady ionosphere) space weather feature. Extending between 200°E and 300°E (geographic), it is an ionization enhancement peaking at 50°S–60°S/250°E–270°E and continuing beyond 66°S. It develops where the spacing between the magnetic field lines is wide/medium, easterly declination is large-medium (20°–50°), and inclination is optimum (∼55°S). Its development and hourly variations are closely correlated with wind speed variations. There is a noticeable (∼43%) reduction in its average area during the high magnetic activity period investigated. Southern summer nighttime TECs follow closely the variations of declination and field-line configuration and therefore introduce a longitudinal division of four (Indian, western/eastern Pacific, Atlantic). Northern winter nighttime TECs measured over a limited area are rather uniform longitudinally because of the small declination variation. TOPEX maps depict the expected strong asymmetry in TEC distribution about the magnetic dip equator.