968 results for Unsupervised classification
Abstract:
Information theoretic active learning has been widely studied for probabilistic models. For simple regression an optimal myopic policy is easily tractable. However, for other tasks and with more complex models, such as classification with nonparametric models, the optimal solution is harder to compute. Current approaches make approximations to achieve tractability. We propose an approach that expresses information gain in terms of predictive entropies, and apply this method to the Gaussian Process Classifier (GPC). Our approach makes minimal approximations to the full information theoretic objective. Our experimental performance compares favourably to many popular active learning algorithms, and has equal or lower computational complexity. We compare well to decision theoretic approaches also, which are privy to more information and require much more computational time. Secondly, by developing further a reformulation of binary preference learning to a classification problem, we extend our algorithm to Gaussian Process preference learning.
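As a rough illustration of writing information gain in terms of predictive entropies, the sketch below uses the standard mutual-information identity I[y; f | x, D] = H[y | x, D] - E_f[H[y | x, f]] for a binary classifier. The use of Monte Carlo posterior samples, the function names and the toy numbers are assumptions made for this example. It is not the approximation the authors apply to the GPC.

```python
import numpy as np

def binary_entropy(p):
    """Elementwise entropy (in nats) of a Bernoulli(p) variable."""
    p = np.clip(p, 1e-12, 1 - 1e-12)  # avoid log(0)
    return -(p * np.log(p) + (1 - p) * np.log(1 - p))

def information_gain(prob_samples):
    """Estimate H[y|x,D] - E_f[H[y|x,f]] for each candidate input.

    prob_samples has shape (S, N): entry [s, n] is p(y=1 | x_n, f_s),
    where f_s is the s-th sample from the posterior over the latent
    function (assumed to be supplied by some other routine).
    """
    prob_samples = np.asarray(prob_samples, dtype=float)
    # Entropy of the averaged (marginal) prediction.
    marginal_entropy = binary_entropy(prob_samples.mean(axis=0))
    # Average entropy of the individual per-sample predictions.
    expected_conditional_entropy = binary_entropy(prob_samples).mean(axis=0)
    return marginal_entropy - expected_conditional_entropy

# Toy input: 2 posterior samples, 3 candidate points (made-up numbers).
samples = np.array([
    [0.5, 0.9, 0.1],   # predictions under posterior sample 1
    [0.5, 0.9, 0.9],   # predictions under posterior sample 2
])
scores = information_gain(samples)
best = int(np.argmax(scores))  # index of the candidate to query next
```

In this toy input the first candidate is uncertain under every sample and the second is confidently predicted by both, so neither yields information about the latent function. Only the third, on which the samples disagree, receives a positive score.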
Abstract:
The taxonomy of the douc and snub-nosed langurs has changed several times during the 20th century. The controversy over the systematic position of these animals has been due in part to difficulties in studying them: both the doucs and the snub-nosed langurs are rare in the wild and are generally poorly represented in institutional collections. This review is based on a detailed examination of relatively large numbers of specimens of most of the species of langurs concerned. An attempt was made to draw upon as many types of information as were available in order to make an assessment of the phyletic relationships between the langur species under discussion. Toward this end, quantitative and qualitative features of the skeleton, specific features of visceral anatomy and characteristics of the pelage were utilized. The final data matrix comprised 178 characters. The matrix was analyzed using the program Hennig86. The results of the analysis support the following conclusions: (1) that the douc and snub-nosed langurs are generically distinct and should be referred to as species of Pygathrix and Rhinopithecus, respectively; (2) that the Tonkin snub-nosed langur be placed in its own subgenus as Rhinopithecus (Presbytiscus) avunculus and that the Chinese snub-nosed langur thus be placed in the subgenus Rhinopithecus (Rhinopithecus); (3) that four extant species of Rhinopithecus be recognized: R. (Rhinopithecus) roxellana Milne Edwards, 1870; R. (Rhinopithecus) bieti Milne Edwards, 1897; R. (Rhinopithecus) brelichi Thomas, 1903, and R. (Presbytiscus) avunculus Dollman, 1912; (4) that the Chinese snub-nosed langurs fall into northern and southern subgroups divided by the Yangtze river; (5) that R. lantianensis Hu and Qi, 1978, is a valid fossil species, and (6) the precise affinities and taxonomic status of the fossil species R. tingianus Matthew and Granger, 1923, are unclear because the type specimen is a subadult.
Abstract:
In order to study the differentiation of Asian colobines, 14 variables measured on 123 skulls, including Rhinopithecus, Presbytis, Presbytiscus (Rhinopithecus avunculus), Pygathrix and Nasalis, were analyzed by one-way, cluster and discriminant function analyses. Information on paleoenvironmental changes in China and southeast Asia since the late Tertiary was used to examine the influences of migratory routes and range of distribution in Asian colobines. A cladogram for 6 genera of Asian colobines was constructed from the results of various analyses. Some new points or revisions were suggested: (1) Following one of two migratory routes, ancient species of Asian colobines perhaps passed through Xizang (Tibet) along the northern bank of the Tethys sea and through the Heng Duan Shan regions of Yunnan into Vietnam. An ancient landmass linking Yunnan and Xizang was already present on the east bank of the Tethys sea. Accordingly, Asian colobines would have two centers of evolutionary origin: Sundaland and the Heng Duan Shan regions of China. (2) Pygathrix shares more cranial features with Presbytiscus than with Rhinopithecus. This differs somewhat from the conclusion reached by Groves. (3) Nasalis (karyotype: 2n = 48) may be the most primitive genus among Asian colobines. Certain features shared with Rhinopithecus, e.g. large body size, terrestrial activity and limb proportions, can be interpreted as symplesiomorphic characters. (4) Rhinopithecus, with respect to craniofacial features, is a special case among Asian colobines. It combines a high degree of evolutionary specialization with retention of some primitive features thought to have been present in the ancestral Asian colobine.
Abstract:
In this paper, a method for incorporating linguistic information about single-word and compound verbs is proposed as a first step towards an SMT model based on linguistically classified phrases. By replacing these verb structures with the base form of the head verb, we achieve better statistical word alignment performance, and are able to better estimate the translation model and to generalize to unseen verb forms during translation. Preliminary experiments for the English-Spanish language pair are performed, and future research lines are detailed. © 2005 Association for Computational Linguistics.
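The idea of collapsing a verb structure to the base form of its head verb can be sketched as a preprocessing pass over tokens before word alignment. Everything below is a hypothetical toy: the tiny lexicon, the auxiliary list and the function name are invented for illustration, and a real system would rely on a tagger or parser rather than lookup tables.

```python
# Hypothetical toy lexicon: inflected verb form -> base form.
BASE_FORM = {
    "eats": "eat", "ate": "eat", "eaten": "eat", "eating": "eat",
    "goes": "go", "went": "go", "gone": "go", "going": "go",
}
KNOWN_VERBS = set(BASE_FORM) | set(BASE_FORM.values())

# Hypothetical list of auxiliaries that may precede a head verb.
AUXILIARIES = {"has", "have", "had", "is", "are", "was", "were",
               "been", "being", "will", "would"}

def normalize_verbs(tokens):
    """Replace each (auxiliaries + head verb) group by the head's base form."""
    out = []
    pending_aux = []  # auxiliaries seen since the last non-auxiliary token
    for tok in tokens:
        low = tok.lower()
        if low in AUXILIARIES:
            pending_aux.append(tok)
        elif low in KNOWN_VERBS:
            # Head verb found: drop the auxiliaries, emit the base form.
            out.append(BASE_FORM.get(low, low))
            pending_aux = []
        else:
            # Auxiliaries not followed by a known verb are kept unchanged.
            out.extend(pending_aux)
            pending_aux = []
            out.append(tok)
    out.extend(pending_aux)
    return out

compound = normalize_verbs("she has been eating apples".split())
single = normalize_verbs("they went home".split())
copula = normalize_verbs("he is happy".split())
```

This toy pass does not handle adverbs that fall between an auxiliary and the head verb ("has already eaten"), which is one reason a linguistic analysis is needed in practice.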
Abstract:
The amount of original imaging information produced yearly has grown tremendously over the last decade in all industries, owing to technological breakthroughs in digital imaging and electronic storage. This trend is affecting the construction industry as well, where digital cameras and image databases are gradually replacing traditional photography. Owners demand complete site photograph logs, and engineers store thousands of images for each project for use in a number of construction management tasks, such as monitoring an activity's progress and keeping evidence of the "as built" in case any disputes arise. So far, retrieval has been done manually, with the user responsible for classifying images according to specific rules that serve a limited number of construction management tasks. New methods that, with the guidance of the user, can automatically classify and retrieve construction site images are being developed and promise to remove the heavy burden of manually indexing images. In this paper, both the existing methods and a novel image retrieval method developed by the authors for the classification and retrieval of construction site images are described and compared. Specifically, a number of examples are used to present their advantages and limitations. The results of this comparison demonstrate that the content-based image retrieval method developed by the authors can reduce the overall time spent on the classification and retrieval of construction images while giving the user the flexibility to retrieve images according to different classification schemes.
Abstract:
Data quality (DQ) assessment can be significantly enhanced with the use of the right DQ assessment methods, which provide automated solutions to assess DQ. The range of DQ assessment methods is very broad: from data profiling and semantic profiling to data matching and data validation. This paper gives an overview of current methods for DQ assessment and classifies the DQ assessment methods into an existing taxonomy of DQ problems. Specific examples of the placement of each DQ method in the taxonomy are provided and illustrate why the method is relevant to the particular taxonomy position. The gaps in the taxonomy, where no current DQ methods exist, show where new methods are required and can guide future research and DQ tool development.
Abstract:
McCullagh and Yang (2006) suggest a family of classification algorithms based on Cox processes. We further investigate the log Gaussian variant which has a number of appealing properties. Conditioned on the covariates, the distribution over labels is given by a type of conditional Markov random field. In the supervised case, computation of the predictive probability of a single test point scales linearly with the number of training points and the multiclass generalization is straightforward. We show new links between the supervised method and classical nonparametric methods. We give a detailed analysis of the pairwise graph representable Markov random field, which we use to extend the model to semi-supervised learning problems, and propose an inference method based on graph min-cuts. We give the first experimental analysis on supervised and semi-supervised datasets and show good empirical performance.
Abstract:
Correct classification of the different stages of the metabolic cycle, as a means of identifying the cell cycle, is significant in both human development and clinical diagnostics. However, no fully satisfactory method for classifying the metabolic cycle has yet been found. This paper tentatively puts forward an automatic classification method for the metabolic cycle based on biomimetic pattern recognition (BPR). For the three phases of the yeast metabolic cycle, the correct classification rates reach 90%, 100% and 100%, respectively.
Abstract:
To effectively improve the classification performance of neural networks, an architecture for a fuzzy neural network with fuzzy inputs was first proposed. Next, a cost function of fuzzy outputs and non-fuzzy targets was defined. A learning algorithm for adjusting the weights was then derived from this cost function. The fuzzy neural network was then inverted, and a fuzzified inversion algorithm was proposed. Finally, computer simulations on real-world pattern classification problems examine the effectiveness of the proposed approach. The experimental results show that the proposed approach has the merits of high learning efficiency, high classification accuracy and high generalization capability.