929 results for Pattern-search methods


Relevance:

30.00% 30.00%

Publisher:

Abstract:

"C00-2118-0048."

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Natural language processing faces a representation problem: because the traditional bag-of-words model represents documents and words in a single matrix, that matrix tends to be extremely sparse. To address this problem, methods have emerged that represent words with a distributed representation in a smaller, more compact space, with the added property of relating words semantically. This work applies the skip-gram model to a set of documents obtained through the Media Cloud Brasil project in order to explore relationships and find patterns that make the content easier to understand.
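As a rough illustration of the skip-gram idea mentioned in this abstract, the sketch below generates the (center, context) word pairs that a skip-gram model is trained on. The function name `skipgram_pairs`, the `window` parameter, and the sample sentence are assumptions made for this example; the abstract does not describe the authors' preprocessing or training setup.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for every token within `window` positions.

    Hypothetical helper for illustration only; not the authors' pipeline.
    """
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


# Made-up sample sentence; each word is paired with its immediate neighbours.
sentence = "o modelo representa palavras".split()
for center, context in skipgram_pairs(sentence, window=1):
    print(center, "->", context)
```

A trained model would then learn word vectors by predicting each context word from its center word; that training step is omitted here.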

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Thesis (Master's)--University of Washington, 2016-06

Relevance:

30.00% 30.00%

Publisher:

Abstract:

In this Comment on Feng's paper [Phys. Rev. A 63, 052308 (2001)], we show that Grover's algorithm may be performed exactly using the gate set given, provided that small changes are made to the gate sequence. An analytic expression is presented for the probability of success of Grover's algorithm when an arbitrary unitary operator U is used in place of the Hadamard gate.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

With the rapid increase in both centralized video archives and distributed WWW video resources, content-based video retrieval is gaining importance. To support such applications efficiently, content-based video indexing must be addressed. Typically, each video is represented by a sequence of frames. Due to the high dimensionality of frame representation and the large number of frames, video indexing introduces an additional degree of complexity. In this paper, we address the problem of content-based video indexing and propose an efficient solution, called the Ordered VA-File (OVA-File), based on the VA-file. OVA-File is a hierarchical structure and has two novel features: 1) partitioning the whole file into slices such that only a small number of slices are accessed and checked during k Nearest Neighbor (kNN) search and 2) efficient handling of insertions of new vectors into the OVA-File, such that the average distance between the new vectors and those approximations near that position is minimized. To facilitate a search, we present an efficient approximate kNN algorithm named Ordered VA-LOW (OVA-LOW) based on the proposed OVA-File. OVA-LOW first chooses possible OVA-Slices by ranking the distances between their corresponding centers and the query vector, and then visits all approximations in the selected OVA-Slices to compute the approximate kNN. The number of possible OVA-Slices is controlled by a user-defined parameter delta. By adjusting delta, OVA-LOW provides a trade-off between the query cost and the result quality. Query by video clip consisting of multiple frames is also discussed. Extensive experimental studies using real video data sets were conducted, and the results showed that our methods can yield a significant speed-up over an existing VA-file-based method and iDistance while maintaining high query result quality. Furthermore, by incorporating temporal correlation of video content, our methods achieved even better performance.
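The two-phase search described in this abstract (rank slices by the distance from their centers to the query, then scan only the selected slices) can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the OVA-LOW implementation: each slice is modelled as a hypothetical `(center, vectors)` tuple, raw vectors stand in for the VA-file bit approximations, and the name `approximate_knn` is invented for the example.

```python
import heapq
import math


def approximate_knn(query, slices, k, delta):
    """Approximate kNN over pre-partitioned slices.

    slices: list of (center, vectors) tuples (assumed layout, not the OVA-File format).
    delta:  number of slices to visit, chosen by nearest center.
    """
    # Phase 1: rank slices by the distance between their center and the query.
    ranked = sorted(slices, key=lambda s: math.dist(query, s[0]))
    # Phase 2: scan every vector in the `delta` closest slices only.
    candidates = [v for _, vectors in ranked[:delta] for v in vectors]
    return heapq.nsmallest(k, candidates, key=lambda v: math.dist(query, v))


slices = [
    ((0.0, 0.0), [(0.0, 1.0), (1.0, 0.0)]),
    ((10.0, 10.0), [(9.0, 9.0), (10.0, 11.0)]),
    ((5.0, 5.0), [(2.0, 2.0), (5.0, 6.0)]),
]
print(approximate_knn((0.9, 0.1), slices, k=1, delta=1))
print(approximate_knn((0.9, 0.1), slices, k=3, delta=2))
```

Raising `delta` scans more slices, which costs more but makes it less likely that a true neighbour is missed; this is the cost/quality trade-off the abstract attributes to delta.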

Relevance:

30.00% 30.00%

Publisher:

Abstract:

We consider the statistical problem of catalogue matching from a machine learning perspective with the goal of producing probabilistic outputs, and using all available information. A framework is provided that unifies two existing approaches to producing probabilistic outputs in the literature, one based on combining distribution estimates and the other based on combining probabilistic classifiers. We apply both of these to the problem of matching the HI Parkes All Sky Survey radio catalogue with large positional uncertainties to the much denser SuperCOSMOS catalogue with much smaller positional uncertainties. We demonstrate the utility of probabilistic outputs by a controllable completeness and efficiency trade-off and by identifying objects that have high probability of being rare. Finally, possible biasing effects in the output of these classifiers are also highlighted and discussed.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

We have developed an alignment-free method that calculates phylogenetic distances using a maximum-likelihood approach for a model of sequence change on patterns that are discovered in unaligned sequences. To evaluate the phylogenetic accuracy of our method, and to conduct a comprehensive comparison of existing alignment-free methods (freely available as Python package decaf+py at http://www.bioinformatics.org.au), we have created a data set of reference trees covering a wide range of phylogenetic distances. Amino acid sequences were evolved along the trees and input to the tested methods; from their calculated distances we inferred trees whose topologies we compared to the reference trees. We find our pattern-based method statistically superior to all other tested alignment-free methods. We also demonstrate the general advantage of alignment-free methods over an approach based on automated alignments when sequences violate the assumption of collinearity. Similarly, we compare methods on empirical data from an existing alignment benchmark set that we used to derive reference distances and trees. Our pattern-based approach yields distances that show a linear relationship to reference distances over a substantially longer range than other alignment-free methods. The pattern-based approach outperforms the other alignment-free methods, and its phylogenetic accuracy is statistically indistinguishable from that of alignment-based distances.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Rapid economic development has occurred during the past few decades in China, with the Yangtze River Delta (YRD) area as one of the most progressive areas. Urbanization, industrialization, and agricultural and aquaculture activities result in extensive production and application of chemicals. Organohalogen contaminants (OHCs) have been widely used as, e.g., pesticides, flame retardants and plasticizers. They are persistent and bioaccumulative, and pose a potential threat to ecosystem and human health. However, limited research has been conducted in the YRD with respect to environmental exposure to chemicals. The main objective of this thesis is to investigate the contamination level, distribution pattern and sources of OHCs in the YRD. Wildlife from different habitats is used to indicate the environmental pollution situation, and selected matrices are evaluated for use in long-term biomonitoring to determine the environmental stress the contamination may cause. In addition, a method is developed for dicofol analysis. Moreover, a specific effort is made to introduce statistical power analysis to assist in optimal sampling design. The thesis results show extensive contamination of OHCs in wildlife in the YRD. High concentrations of chlorinated paraffins (CPs) are reported in wildlife, in particular in terrestrial species (i.e. short-tailed mamushi snake and peregrine falcon). Impurities and byproducts of pentachlorophenol products, i.e. polychlorinated diphenyl ethers (PCDEs) and hydroxylated polychlorinated diphenyl ethers (OH-PCDEs), are identified and reported for the first time in eggs from black-crowned night heron and whiskered tern. High concentrations of octachlorodibenzo-p-dioxin (OCDD) are determined in these samples. The toxic equivalents (TEQs) of polychlorinated dibenzo-p-dioxins (PCDDs) and polychlorinated dibenzofurans (PCDFs) are at mean levels of 300 and 520 pg TEQ g⁻¹ lw (WHO2005 TEQ) in eggs from the two bird species, respectively. This is two orders of magnitude higher than the European Union (EU) regulatory limit for chicken eggs. Also, a novel pattern of polychlorinated biphenyls (PCBs), with octa- to decaCBs contributing as much as 20% of total PCBs, is reported in birds. The legacy POPs show a common characteristic of relatively high levels of organochlorine pesticides (i.e. DDT, hexachlorocyclohexanes (HCHs) and Mirex), indicating historic applications. In contrast, rather low concentrations are found for industrial chemicals such as PCBs and polybrominated diphenyl ethers (PBDEs). A refined and improved analytical method is developed to separate dicofol from its major decomposition compound, 4,4’-dichlorobenzophenone, so that dicofol itself can be assessed. Statistical power analysis demonstrates that sampling of sedentary species should be consistently spread over a larger area to monitor temporal trends of contaminants in a robust manner. The results presented in this thesis show high CP and OCDD concentrations in wildlife. The levels and patterns of OHCs in the YRD differ from those in other well-studied areas of the world. This is likely due to the extensive production and use of chemicals in the YRD. The results strongly signal the need for research biomonitoring programs that meet the current situation of the YRD. Such programs will contribute to the management of chemicals and the environment in the YRD, with the potential to grow into the human health sector and to expand to China as a whole.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

A new approach to optimisation is introduced based on a precise probabilistic statement of what is ideally required of an optimisation method. It is convenient to express the formalism in terms of the control of a stationary environment. This leads to an objective function for the controller which unifies the objectives of exploration and exploitation, thereby providing a quantitative principle for managing this trade-off. This is demonstrated using a variant of the multi-armed bandit problem. This approach opens new possibilities for optimisation algorithms, particularly by using neural network or other adaptive methods for the adaptive controller. It also opens possibilities for deepening understanding of existing methods. The realisation of these possibilities requires research into practical approximations of the exact formalism.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Bayesian techniques have been developed over many years in a range of different fields, but have only recently been applied to the problem of learning in neural networks. As well as providing a consistent framework for statistical pattern recognition, the Bayesian approach offers a number of practical advantages including a potential solution to the problem of over-fitting. This chapter aims to provide an introductory overview of the application of Bayesian methods to neural networks. It assumes the reader is familiar with standard feed-forward network models and how to train them using conventional techniques.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This article reviews the statistical methods that have been used to study the planar distribution, and especially clustering, of objects in histological sections of brain tissue. The objective of these studies is usually quantitative description, comparison between patients or correlation between histological features. Objects of interest such as neurones, glial cells, blood vessels or pathological features such as protein deposits appear as sectional profiles in a two-dimensional section. These objects may not be randomly distributed within the section but exhibit a spatial pattern, a departure from randomness either towards regularity or clustering. The methods described include simple tests of whether the planar distribution of a histological feature departs significantly from randomness using randomized points, lines or sample fields and more complex methods that employ grids or transects of contiguous fields, and which can detect the intensity of aggregation and the sizes, distribution and spacing of clusters. The usefulness of these methods in understanding the pathogenesis of neurodegenerative diseases such as Alzheimer's disease and Creutzfeldt-Jakob disease is discussed. © 2006 The Royal Microscopical Society.
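One widely used simple test of this kind is the variance-to-mean ratio (index of dispersion) of object counts across sample fields. The sketch below is offered as an example of the general idea, not as the specific procedure reviewed in the article; the function name and the sample counts are invented for illustration.

```python
from statistics import mean, variance


def variance_mean_ratio(counts):
    """Index of dispersion for counts of objects per sample field.

    Roughly 1 for a random (Poisson) pattern, below 1 for a regular
    pattern, above 1 for a clustered pattern.
    """
    return variance(counts) / mean(counts)


# Hypothetical counts of a histological feature in equal-sized fields.
regular = [2, 2, 2, 2]
clustered = [0, 0, 8, 0]
print(variance_mean_ratio(regular))
print(variance_mean_ratio(clustered))
```

Whether a ratio differs significantly from 1 still has to be judged with a separate statistical test, which this sketch leaves out.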

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Analyzing geographical patterns by collocating events, objects or their attributes has a long history in surveillance and monitoring, and is particularly applied in environmental contexts, such as ecology or epidemiology. The identification of patterns or structures at some scales can be addressed using spatial statistics, particularly marked point process methodologies. Classification and regression trees are also related to this goal of finding "patterns" by deducing the hierarchy of influence of variables on a dependent outcome. Such variable selection methods have been applied to spatial data, but often without explicitly acknowledging the spatial dependence. Many methods routinely used in exploratory point pattern analysis are second-order statistics, used in a univariate context, though there is also a wide literature on modelling methods for multivariate point pattern processes. This paper proposes an exploratory approach for multivariate spatial data using higher-order statistics built from co-occurrences of events or marks given by the point processes. A spatial entropy measure, derived from these multinomial distributions of co-occurrences at a given order, constitutes the basis of the proposed exploratory methods. © 2010 Elsevier Ltd.
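To make the idea of an entropy built from co-occurrences more concrete, here is a minimal sketch for the simplest case. The abstract does not give the paper's exact definition, so everything specific here is an assumption: co-occurrences are taken at order 2 (pairs of points), two points co-occur when they lie within a fixed distance `r`, mark pairs are unordered, and the measure is the Shannon entropy (natural logarithm) of the resulting multinomial distribution.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_entropy(points, r):
    """Shannon entropy of mark-pair co-occurrences within distance r.

    points: list of (x, y, mark) tuples (assumed input layout).
    Returns 0.0 when no pair of points lies within r.
    """
    counts = Counter()
    for (x1, y1, m1), (x2, y2, m2) in combinations(points, 2):
        if math.hypot(x1 - x2, y1 - y2) <= r:
            # Unordered pair of marks, e.g. ("a", "b") regardless of order.
            counts[tuple(sorted((m1, m2)))] += 1
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts.values())


# Made-up marked point pattern with two mark types.
points = [(0, 0, "a"), (1, 0, "b"), (0, 1, "a"), (10, 10, "b")]
print(cooccurrence_entropy(points, r=1.5))
```

Low entropy means a few mark combinations dominate the neighbourhoods; high entropy means combinations are spread more evenly. Higher orders would count triples or larger groups instead of pairs.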