948 results for probabilistic model


60.00% 60.00%



Most algorithms for discovering frequent patterns in data streams assume that the system can process every incoming transaction without delay and without dropping any. This assumption is often impractical given the inherent characteristics of data stream environments. Under high load in particular, system resources are often insufficient to process the incoming transactions. The resulting latency in turn reduces the applicability of the data mining models produced, which often have only a small window of opportunity. We propose a load shedding algorithm to address this issue. The algorithm adaptively detects overload situations and drops transactions from data streams using a probabilistic model. We tested the algorithm on both synthetic and real-life datasets to verify its feasibility.
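As a rough illustration of probabilistic load shedding in general, the sketch below drops each transaction with a probability that grows as the arrival rate exceeds the processing capacity. The abstract does not describe the authors' overload detector or drop model, so the rule used here (drop probability = 1 - capacity / arrival rate) and the names `drop_probability` and `shed` are assumptions made purely for illustration.

```python
import random


def drop_probability(arrival_rate, capacity):
    """Fraction of transactions to drop so the kept load matches capacity.

    Assumed rule (not taken from the abstract): no shedding while the
    arrival rate is within capacity, otherwise drop the excess fraction.
    """
    if arrival_rate <= capacity:
        return 0.0
    return 1.0 - capacity / arrival_rate


def shed(transactions, arrival_rate, capacity, rng=None):
    """Keep each transaction independently with probability 1 - p."""
    rng = rng or random.Random()
    p = drop_probability(arrival_rate, capacity)
    return [t for t in transactions if rng.random() >= p]


if __name__ == "__main__":
    # Hypothetical load: 400 transactions/s arriving, 100/s can be processed.
    kept = shed(range(1000), arrival_rate=400, capacity=100)
    print(len(kept), "of 1000 transactions kept")
```

Dropping uniformly at random keeps the sketch short; a real system might instead weight the drop decision by how much each transaction contributes to the frequent patterns being mined.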





There is currently no consensus as to how “acceptable risk” should be defined in emergency service response. Attempts to address this have relied upon the assumption that a probabilistic model of risk can be calculated and that acceptable levels of risk can be determined. Examples of this process can be seen in a number of emergency services, e.g. dynamic risk assessment utilised by a number of fire services.





The inherent variability in incoming material and process conditions in sheet metal forming makes quality control and the maintenance of consistency extremely difficult. A single FEM simulation can successfully predict the formability of a given system, but because it is numerically deterministic it cannot capture the variability of an actual production process. This paper investigates a probabilistic analytical model in which the variation of five input parameters and their relationship to the sensitivity of springback in a stamping process is examined. A range of sheet tensions is investigated, simulating different operating windows in an attempt to highlight robust regions where the distribution of springback is small. A series of FEM simulations was also performed using AutoForm Sigma v4.04, both to compare with the findings of the analytical model and to validate its assumptions.

Results show that an increase in sheet tension not only decreases springback but, more importantly, reduces the sensitivity of the process to variation. A relative sensitivity analysis was performed to identify the most influential parameters and the changes in sensitivity at various sheet tensions. Variation in the material parameters, yield stress and n-value, was the most influential cause of springback variation, whereas process input parameters such as friction had only a small effect. The probabilistic model presented allows manufacturers to develop a more comprehensive assessment of the success of their forming processes by capturing the effects of inherent variation.





Human age estimation from face images is an interesting yet challenging research topic that has emerged in recent years. This paper extends our previous work on facial age estimation (a linear method named AGES). To match the nonlinear nature of the human aging process, a new algorithm named KAGES is proposed, based on a nonlinear subspace trained on aging patterns, which are defined as sequences of individual face images sorted in time order. Both the training and test (age estimation) processes of KAGES rely on a probabilistic model of KPCA. In the experimental results, KAGES performs better not only than all the compared algorithms but also than human observers in age estimation. The results are, however, sensitive to parameter choice, and future research challenges are identified.





Collaborative filtering is an effective recommendation technique in which the preference of an individual can potentially be predicted from the preferences of other members. Early algorithms often relied on strong locality in the preference data; that is, a user's preference for a particular item can be predicted from a small subset of other users with similar tastes or of other items with similar properties. More recently, dimensionality reduction techniques have proved to be equally competitive, and these are based on co-occurrence patterns rather than locality. This paper explores and extends a probabilistic model known as the Boltzmann Machine for collaborative filtering tasks. It seamlessly integrates both similarity and co-occurrence in a principled manner. In particular, we study parameterisation options to deal with the ordinal nature of the preferences, and propose joint modelling of both the user-based and item-based processes. Experiments on moderate and large-scale movie recommendation show that our framework rivals existing well-known methods.





Using film grammar as the underpinning, we study the extraction of structures in video based on color, using a wide configuration of clustering methods combined with existing and new similarity measures. We study the visualisation of these structures, which we call Scene-Cluster Temporal Charts, and show how they can bring out the interweaving of different themes and settings in a film. We also extract color events that filmmakers use to draw/force a viewer's attention to a shot/scene. This is done by first extracting a set of colors used rarely in film, and then building a probabilistic model for color event detection. We demonstrate with experimental results from ten movies that our algorithms are effective in the extraction of both scene-cluster temporal charts and color events.





We introduce a new method for face recognition using a versatile probabilistic model known as the Restricted Boltzmann Machine (RBM). In particular, we propose to regularise the standard data likelihood learning with an information-theoretic distance metric defined on intra-personal images. This results in an effective face representation which captures the regularities in the face space and minimises the intra-personal variations. In addition, our method allows easy incorporation of multiple feature sets with a controllable level of sparsity. Our experiments on a high-variation dataset show that the proposed method is competitive against other metric learning rivals. We also investigated the RBM method under a variety of settings, including fusing facial parts and utilising localised feature detectors under varying resolutions. In particular, the accuracy is boosted from 71.8% with the standard whole-face pixels to 99.2% with a combination of facial parts, localised feature extractors and appropriate resolutions.





The objective of this work is to test the application of a probabilistic graphical model, generically known as Bayesian networks, to develop computational models that can help in understanding problems and/or in forecasting economic variables. To this end, a problem widely addressed in the literature was chosen, and the established theoretical and experimental results were compared with those obtained using the proposed technique. A model was therefore built to classify the trend of "country risk" for Brazil from a database of macroeconomic and financial variables. The EMBI+ (Emerging Markets Bond Index Plus) was adopted as the risk measure because it is an indicator widely used by the market.





Black Sigatoka (Mycosphaerella fijiensis) threatens commercial banana plantations in all producing areas of the world and causes quantitative and qualitative damage to production, resulting in serious financial losses. It is necessary to study the vulnerability of plants at different developmental stages and the climatic conditions that favor the occurrence of the disease. The objective of this work was to develop a probabilistic model based on polynomial functions that represents the risk of occurrence of Black Sigatoka as a function of the vulnerability arising from factors intrinsic to the plant and to the environment. A case study was carried out in a commercial banana plantation located in Jacupiranga, Vale do Ribeira, SP, considering weekly monitoring of the state of disease evolution, time series of meteorological data, and remote sensing data. Georeferenced maps of Black Sigatoka risk were generated for different times of the year. A model for estimating disease evolution from satellite images was obtained with a coefficient of determination R² equal to 0.9. The methodology was developed to detect times and places that combine conditions favorable to the occurrence of Black Sigatoka and can be applied, with appropriate adjustments, in different locations to assess the risk of the disease in banana-producing regions.





Female broiler breeder productivity depends on thermal comfort, which is directly related to the microclimate inside the housing. This research aimed to monitor the behavior of female broiler breeders using radio-frequency technology, with injectable transponders and readers, in the different microclimates inside a small-scale distorted housing model. Eight birds with electronic identification were used. Three readers were placed at three different points inside the model: on the floor of the nest, in the passage beside the lateral wall, and below the water facility. Dry bulb (DBT), wet bulb (WBT) and black globe (BGT) temperatures were measured continuously. The results indicate a distinct behavioral pattern of the birds in relation to environmental exposure during the experiment. Three probabilistic models of behavior were developed from the recorded data: for passage use, FP = 1.10 - 0.244 ln(DBT); for water facility use, FB = 0.398 + 0.00866 DBT; and for nest use, FN = 2.22 - 0.272 DBT + 0.011 DBT² - 0.000144 DBT³.
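The three fitted models can be evaluated directly, as in the minimal sketch below. It assumes that DBT is expressed in degrees Celsius (the abstract does not state the unit), that ln is the natural logarithm, and that the coefficient printed as "0,011" is 0.011 with a decimal comma. The function names are illustrative only.

```python
import math


def passage_use(dbt):
    """FP = 1.10 - 0.244 ln(DBT); requires dbt > 0."""
    return 1.10 - 0.244 * math.log(dbt)


def water_use(dbt):
    """FB = 0.398 + 0.00866 DBT."""
    return 0.398 + 0.00866 * dbt


def nest_use(dbt):
    """FN = 2.22 - 0.272 DBT + 0.011 DBT^2 - 0.000144 DBT^3."""
    return 2.22 - 0.272 * dbt + 0.011 * dbt**2 - 0.000144 * dbt**3


if __name__ == "__main__":
    # Hypothetical dry bulb temperatures, assumed to be in degrees Celsius.
    for dbt in (20.0, 25.0, 30.0):
        print(dbt, passage_use(dbt), water_use(dbt), nest_use(dbt))
```

The abstract gives no validity range for the fits, so values computed far from the temperatures observed in the experiment should be treated with caution.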





Graduate Program in Cartographic Sciences - FCT





Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)





Next-generation sequencers such as the Illumina and SOLiD platforms generate a large amount of data, commonly more than 10 gigabytes of text files. In particular, the SOLiD platform allows multiple samples to be sequenced in a single run (called a multiplex run) by means of a tagging system called Barcode. This feature requires a computational process to separate the data by sample, because the sequencer delivers the mixture of all samples in a single output. This process must be reliable in order to avoid mix-ups that could compromise subsequent analyses. In this context, the present work proposes the development of a probabilistic model capable of characterizing the tagging system used in multiplex sequencing. The results corroborated the adequacy of the model obtained, which makes it possible, among other things, to identify faults in a step of the sequencing process; to adapt and develop new sample preparation protocols; to assign a Degree of Confidence to the generated data; and to guide a filtering process that respects the characteristics of each sequencing run, without arbitrarily discarding useful sequences.





Background: One goal of gene expression profiling is to identify signature genes that robustly distinguish different types or grades of tumors. Several tumor classifiers based on expression profiling have been proposed using microarray techniques. Because the probabilistic models of microarray and SAGE technologies differ in important ways, suitable techniques are needed to select specific genes from SAGE measurements.

Results: A new framework is proposed to select specific genes that distinguish different biological states based on the analysis of SAGE data. The framework applies the bolstered error to identify strong genes that separate the biological states in a feature space defined by the gene expression of a training set. Credibility intervals defined from a probabilistic model of SAGE measurements are used to identify, among all gene groups selected by the strong genes method, the genes that distinguish the different states most reliably. A score that takes into account the credibility and the bolstered error values is proposed to rank the groups of genes considered. Results obtained using SAGE data from gliomas are presented, corroborating the introduced methodology.

Conclusion: The model representing counting data, such as SAGE, provides additional statistical information that allows a more robust analysis. This additional information is incorporated in the methodology described in the paper. The introduced method is suitable for identifying signature genes that lead to a good separation of the biological states using SAGE, and may be adapted for other counting methods such as Massively Parallel Signature Sequencing (MPSS) or the recent Sequencing-By-Synthesis (SBS) technique. Some of the genes identified by the proposed method may be useful for generating classifiers.