953 resultados para Artificial Intelligence, Constraint Programming, set variables, representation
Resumo:
Automatic signature verification is a well-established and an active area of research with numerous applications such as bank check verification, ATM access, etc. This paper proposes a novel approach to the problem of automatic off-line signature verification and forgery detection. The proposed approach is based on fuzzy modeling that employs the Takagi-Sugeno (TS) model. Signature verification and forgery detection are carried out using angle features extracted from box approach. Each feature corresponds to a fuzzy set. The features are fuzzified by an exponential membership function involved in the TS model, which is modified to include structural parameters. The structural parameters are devised to take account of possible variations due to handwriting styles and to reflect moods. The membership functions constitute weights in the TS model. The optimization of the output of the TS model with respect to the structural parameters yields the solution for the parameters. We have also derived two TS models by considering a rule for each input feature in the first formulation (Multiple rules) and by considering a single rule for all input features in the second formulation. In this work, we have found that TS model with multiple rules is better than TS model with single rule for detecting three types of forgeries; random, skilled and unskilled from a large database of sample signatures in addition to verifying genuine signatures. We have also devised three approaches, viz., an innovative approach and two intuitive approaches using the TS model with multiple rules for improved performance. (C) 2004 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
Resumo:
The research literature on metalieuristic and evolutionary computation has proposed a large number of algorithms for the solution of challenging real-world optimization problems. It is often not possible to study theoretically the performance of these algorithms unless significant assumptions are made on either the algorithm itself or the problems to which it is applied, or both. As a consequence, metalieuristics are typically evaluated empirically using a set of test problems. Unfortunately, relatively little attention has been given to the development of methodologies and tools for the large-scale empirical evaluation and/or comparison of metaheuristics. In this paper, we propose a landscape (test-problem) generator that can be used to generate optimization problem instances for continuous, bound-constrained optimization problems. The landscape generator is parameterized by a small number of parameters, and the values of these parameters have a direct and intuitive interpretation in terms of the geometric features of the landscapes that they produce. An experimental space is defined over algorithms and problems, via a tuple of parameters for any specified algorithm and problem class (here determined by the landscape generator). An experiment is then clearly specified as a point in this space, in a way that is analogous to other areas of experimental algorithmics, and more generally in experimental design. Experimental results are presented, demonstrating the use of the landscape generator. In particular, we analyze some simple, continuous estimation of distribution algorithms, and gain new insights into the behavior of these algorithms using the landscape generator.
Resumo:
With the rapid increase in both centralized video archives and distributed WWW video resources, content-based video retrieval is gaining its importance. To support such applications efficiently, content-based video indexing must be addressed. Typically, each video is represented by a sequence of frames. Due to the high dimensionality of frame representation and the large number of frames, video indexing introduces an additional degree of complexity. In this paper, we address the problem of content-based video indexing and propose an efficient solution, called the Ordered VA-File (OVA-File) based on the VA-file. OVA-File is a hierarchical structure and has two novel features: 1) partitioning the whole file into slices such that only a small number of slices are accessed and checked during k Nearest Neighbor (kNN) search and 2) efficient handling of insertions of new vectors into the OVA-File, such that the average distance between the new vectors and those approximations near that position is minimized. To facilitate a search, we present an efficient approximate kNN algorithm named Ordered VA-LOW (OVA-LOW) based on the proposed OVA-File. OVA-LOW first chooses possible OVA-Slices by ranking the distances between their corresponding centers and the query vector, and then visits all approximations in the selected OVA-Slices to work out approximate kNN. The number of possible OVA-Slices is controlled by a user-defined parameter delta. By adjusting delta, OVA-LOW provides a trade-off between the query cost and the result quality. Query by video clip consisting of multiple frames is also discussed. Extensive experimental studies using real video data sets were conducted and the results showed that our methods can yield a significant speed-up over an existing VA-file-based method and iDistance with high query result quality. Furthermore, by incorporating temporal correlation of video content, our methods achieved much more efficient performance.
Resumo:
Objective: Inpatient length of stay (LOS) is an important measure of hospital activity, health care resource consumption, and patient acuity. This research work aims at developing an incremental expectation maximization (EM) based learning approach on mixture of experts (ME) system for on-line prediction of LOS. The use of a batchmode learning process in most existing artificial neural networks to predict LOS is unrealistic, as the data become available over time and their pattern change dynamically. In contrast, an on-line process is capable of providing an output whenever a new datum becomes available. This on-the-spot information is therefore more useful and practical for making decisions, especially when one deals with a tremendous amount of data. Methods and material: The proposed approach is illustrated using a real example of gastroenteritis LOS data. The data set was extracted from a retrospective cohort study on all infants born in 1995-1997 and their subsequent admissions for gastroenteritis. The total number of admissions in this data set was n = 692. Linked hospitalization records of the cohort were retrieved retrospectively to derive the outcome measure, patient demographics, and associated co-morbidities information. A comparative study of the incremental learning and the batch-mode learning algorithms is considered. The performances of the learning algorithms are compared based on the mean absolute difference (MAD) between the predictions and the actual LOS, and the proportion of predictions with MAD < 1 day (Prop(MAD < 1)). The significance of the comparison is assessed through a regression analysis. Results: The incremental learning algorithm provides better on-line prediction of LOS when the system has gained sufficient training from more examples (MAD = 1.77 days and Prop(MAD < 1) = 54.3%), compared to that using the batch-mode learning. The regression analysis indicates a significant decrease of MAD (p-value = 0.063) and a significant (p-value = 0.044) increase of Prop(MAD
Resumo:
Quantile computation has many applications including data mining and financial data analysis. It has been shown that an is an element of-approximate summary can be maintained so that, given a quantile query d (phi, is an element of), the data item at rank [phi N] may be approximately obtained within the rank error precision is an element of N over all N data items in a data stream or in a sliding window. However, scalable online processing of massive continuous quantile queries with different phi and is an element of poses a new challenge because the summary is continuously updated with new arrivals of data items. In this paper, first we aim to dramatically reduce the number of distinct query results by grouping a set of different queries into a cluster so that they can be processed virtually as a single query while the precision requirements from users can be retained. Second, we aim to minimize the total query processing costs. Efficient algorithms are developed to minimize the total number of times for reprocessing clusters and to produce the minimum number of clusters, respectively. The techniques are extended to maintain near-optimal clustering when queries are registered and removed in an arbitrary fashion against whole data streams or sliding windows. In addition to theoretical analysis, our performance study indicates that the proposed techniques are indeed scalable with respect to the number of input queries as well as the number of items and the item arrival rate in a data stream.
Resumo:
In many advanced applications, data are described by multiple high-dimensional features. Moreover, different queries may weight these features differently; some may not even specify all the features. In this paper, we propose our solution to support efficient query processing in these applications. We devise a novel representation that compactly captures f features into two components: The first component is a 2D vector that reflects a distance range ( minimum and maximum values) of the f features with respect to a reference point ( the center of the space) in a metric space and the second component is a bit signature, with two bits per dimension, obtained by analyzing each feature's descending energy histogram. This representation enables two levels of filtering: The first component prunes away points that do not share similar distance ranges, while the bit signature filters away points based on the dimensions of the relevant features. Moreover, the representation facilitates the use of a single index structure to further speed up processing. We employ the classical B+-tree for this purpose. We also propose a KNN search algorithm that exploits the access orders of critical dimensions of highly selective features and partial distances to prune the search space more effectively. Our extensive experiments on both real-life and synthetic data sets show that the proposed solution offers significant performance advantages over sequential scan and retrieval methods using single and multiple VA-files.
Resumo:
An important and common problem in microarray experiments is the detection of genes that are differentially expressed in a given number of classes. As this problem concerns the selection of significant genes from a large pool of candidate genes, it needs to be carried out within the framework of multiple hypothesis testing. In this paper, we focus on the use of mixture models to handle the multiplicity issue. With this approach, a measure of the local FDR (false discovery rate) is provided for each gene. An attractive feature of the mixture model approach is that it provides a framework for the estimation of the prior probability that a gene is not differentially expressed, and this probability can subsequently be used in forming a decision rule. The rule can also be formed to take the false negative rate into account. We apply this approach to a well-known publicly available data set on breast cancer, and discuss our findings with reference to other approaches.
Resumo:
As empresas que almejam garantir e melhorar sua posição dentro de em um mercado cada vez mais competitivo precisam estar sempre atualizadas e em constante evolução. Na busca contínua por essa evolução, investem em projetos de Pesquisa & Desenvolvimento (P&D) e em seu capital humano para promover a criatividade e a inovação organizacional. As pessoas têm papel fundamental no desenvolvimento da inovação, mas para que isso possa florescer de forma constante é preciso comprometimento e criatividade para a geração de ideias. Criatividade é pensar o novo; inovação é fazer acontecer. Porém, encontrar pessoas com essas qualidades nem sempre é tarefa fácil e muitas vezes é preciso estimular essas habilidades e características para que se tornem efetivamente criativas. Os cursos de graduação podem ser uma importante ferramenta para trabalhar esses aspectos, características e habilidades, usando métodos e práticas de ensino que auxiliem no desenvolvimento da criatividade, pois o ambiente ensino-aprendizagem pesa significativamente na formação das pessoas. O objetivo deste estudo é de identificar quais fatores têm maior influência sobre o desenvolvimento da criatividade em um curso de graduação em administração, analisando a influência das práticas pedagógicas dos docentes e as barreiras internas dos discentes. O referencial teórico se baseia principalmente nos trabalhos de Alencar, Fleith, Torrance e Wechsler. A pesquisa transversal de abordagem quantitativa teve como público-alvo os alunos do curso de Administração de uma universidade confessional da Grande São Paulo, que responderam 465 questionários compostos de três escalas. Para as práticas docentes foi adaptada a escala de Práticas Docentes em relação à Criatividade. Para as barreiras internas foi adaptada a escala de Barreiras da Criatividade Pessoal. Para a análise da percepção do desenvolvimento da criatividade foi construída e validada uma escala baseada no referencial de características de uma pessoa criativa. As análises estatísticas descritivas e fatoriais exploratórias foram realizadas no software Statistical Package for the Social Sciences (SPSS), enquanto as análises fatoriais confirmatórias e a mensuração da influência das práticas pedagógicas e das barreiras internas sobre a percepção do desenvolvimento da criatividade foram realizadas por modelagem de equação estrutural utilizando o algoritmo Partial Least Squares (PLS), no software Smart PLS 2.0. Os resultados apontaram que as práticas pedagógicas e as barreiras internas dos discentes explicam 40% da percepção de desenvolvimento da criatividade, sendo as práticas pedagógicas que exercem maior influencia. A pesquisa também apontou que o tipo de temática e o período em que o aluno está cursando não têm influência sobre nenhum dos três construtos, somente o professor influencia as práticas pedagógicas.
Resumo:
As empresas que almejam garantir e melhorar sua posição dentro de em um mercado cada vez mais competitivo precisam estar sempre atualizadas e em constante evolução. Na busca contínua por essa evolução, investem em projetos de Pesquisa & Desenvolvimento (P&D) e em seu capital humano para promover a criatividade e a inovação organizacional. As pessoas têm papel fundamental no desenvolvimento da inovação, mas para que isso possa florescer de forma constante é preciso comprometimento e criatividade para a geração de ideias. Criatividade é pensar o novo; inovação é fazer acontecer. Porém, encontrar pessoas com essas qualidades nem sempre é tarefa fácil e muitas vezes é preciso estimular essas habilidades e características para que se tornem efetivamente criativas. Os cursos de graduação podem ser uma importante ferramenta para trabalhar esses aspectos, características e habilidades, usando métodos e práticas de ensino que auxiliem no desenvolvimento da criatividade, pois o ambiente ensino-aprendizagem pesa significativamente na formação das pessoas. O objetivo deste estudo é de identificar quais fatores têm maior influência sobre o desenvolvimento da criatividade em um curso de graduação em administração, analisando a influência das práticas pedagógicas dos docentes e as barreiras internas dos discentes. O referencial teórico se baseia principalmente nos trabalhos de Alencar, Fleith, Torrance e Wechsler. A pesquisa transversal de abordagem quantitativa teve como público-alvo os alunos do curso de Administração de uma universidade confessional da Grande São Paulo, que responderam 465 questionários compostos de três escalas. Para as práticas docentes foi adaptada a escala de Práticas Docentes em relação à Criatividade. Para as barreiras internas foi adaptada a escala de Barreiras da Criatividade Pessoal. Para a análise da percepção do desenvolvimento da criatividade foi construída e validada uma escala baseada no referencial de características de uma pessoa criativa. As análises estatísticas descritivas e fatoriais exploratórias foram realizadas no software Statistical Package for the Social Sciences (SPSS), enquanto as análises fatoriais confirmatórias e a mensuração da influência das práticas pedagógicas e das barreiras internas sobre a percepção do desenvolvimento da criatividade foram realizadas por modelagem de equação estrutural utilizando o algoritmo Partial Least Squares (PLS), no software Smart PLS 2.0. Os resultados apontaram que as práticas pedagógicas e as barreiras internas dos discentes explicam 40% da percepção de desenvolvimento da criatividade, sendo as práticas pedagógicas que exercem maior influencia. A pesquisa também apontou que o tipo de temática e o período em que o aluno está cursando não têm influência sobre nenhum dos três construtos, somente o professor influencia as práticas pedagógicas.
Resumo:
Yorick Wilks is a central figure in the fields of Natural Language Processing and Artificial Intelligence. His influence extends to many areas and includes contributions to Machines Translation, word sense disambiguation, dialogue modeling and Information Extraction. This book celebrates the work of Yorick Wilks in the form of a selection of his papers which are intended to reflect the range and depth of his work. The volume accompanies a Festschrift which celebrates his contribution to the fields of Computational Linguistics and Artificial Intelligence. The papers include early work carried out at Cambridge University, descriptions of groundbreaking work on Machine Translation and Preference Semantics as well as more recent works on belief modeling and computational semantics. The selected papers reflect Yorick’s contribution to both practical and theoretical aspects of automatic language processing.
Resumo:
We have recently developed a principled approach to interactive non-linear hierarchical visualization [8] based on the Generative Topographic Mapping (GTM). Hierarchical plots are needed when a single visualization plot is not sufficient (e.g. when dealing with large quantities of data). In this paper we extend our system by giving the user a choice of initializing the child plots of the current plot in either interactive, or automatic mode. In the interactive mode the user interactively selects ``regions of interest'' as in [8], whereas in the automatic mode an unsupervised minimum message length (MML)-driven construction of a mixture of GTMs is used. The latter is particularly useful when the plots are covered with dense clusters of highly overlapping data projections, making it difficult to use the interactive mode. Such a situation often arises when visualizing large data sets. We illustrate our approach on a data set of 2300 18-dimensional points and mention extension of our system to accommodate discrete data types.
Resumo:
In Information Filtering (IF) a user may be interested in several topics in parallel. But IF systems have been built on representational models derived from Information Retrieval and Text Categorization, which assume independence between terms. The linearity of these models results in user profiles that can only represent one topic of interest. We present a methodology that takes into account term dependencies to construct a single profile representation for multiple topics, in the form of a hierarchical term network. We also introduce a series of non-linear functions for evaluating documents against the profile. Initial experiments produced positive results.
Resumo:
Health and safety policies may be regarded as the cornerstone for positive prevention of occupational accidents and diseases. The Health and Safety at Work, etc Act 1974 makes it a legal duty for employers to prepare and revise a written statement of a general policy with respect to the health and safety at work of employees as well as the organisation and arrangements for carrying out that policy. Despite their importance and the legal equipment to prepare them, health and safety policies have been found, in a large number of plastics processing companies (particularly small companies), to be poorly prepared, inadequately implemented and monitored. An important cause of these inadequacies is the lack of necessary health and safety knowledge and expertise to prepare, implement and monitor policies. One possible way of remedying this problem is to investigate the feasibility of using computers to develop expert system programs to simulate the health and safety (HS) experts' task of preparing the policies and assisting companies implement and monitor them. Such programs use artificial intelligence (AI) techniques to solve this sort of problems which are heuristic in nature and require symbolic reasoning. Expert systems have been used successfully in a variety of fields such as medicine and engineering. An important phase in the feasibility of development of such systems is the engineering of knowledge which consists of identifying the knowledge required, eliciting, structuring and representing it in an appropriate computer programming language.
Resumo:
In today's market, the global competition has put manufacturing businesses in great pressures to respond rapidly to dynamic variations in demand patterns across products and changing product mixes. To achieve substantial responsiveness, the manufacturing activities associated with production planning and control must be integrated dynamically, efficiently and cost-effectively. This paper presents an iterative agent bidding mechanism, which performs dynamic integration of process planning and production scheduling to generate optimised process plans and schedules in response to dynamic changes in the market and production environment. The iterative bidding procedure is carried out based on currency-like metrics in which all operations (e.g. machining processes) to be performed are assigned with virtual currency values, and resource agents bid for the operations if the costs incurred for performing them are lower than the currency values. The currency values are adjusted iteratively and resource agents re-bid for the operations based on the new set of currency values until the total production cost is minimised. A simulated annealing optimisation technique is employed to optimise the currency values iteratively. The feasibility of the proposed methodology has been validated using a test case and results obtained have proven the method outperforming non-agent-based methods.