915 resultados para Model Classification


Relevância:

30.00% 30.00%

Publicador:

Resumo:

A new isotherm is proposed here for adsorption of condensable vapors and gases on nonporous materials having type II isotherms according to the Brunauer-Deming-Deming-Teller (BDDH) classification. The isotherm combines the recent molecular-continuum model in the multilayer region, with other widely used models for sub-monolayer coverage, some of which satisfy the requirement of a Henry's law asymptote. The model is successfully tested using isotherm data for nitrogen adsorption on nonporous silica, carbon and alumina, as well as benzene and hexane adsorption on nonporous carbon. Based on the data fits, out of several different alternative choices of model for the monolayer region, the Freundlich and the Unilan models are found to be the most successful when combined with the multilayer model to predict the whole isotherm. The hybrid model is consequently applicable over a wide pressure range. (C) 2000 Elsevier Science B.V. All rights reserved.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper, we present a fuzzy approach to the Reed-Frost model for epidemic spreading taking into account uncertainties in the diagnostic of the infection. The heterogeneities in the infected group is based on the clinical signals of the individuals (symptoms, laboratorial exams, medical findings, etc.), which are incorporated into the dynamic of the epidemic. The infectivity level is time-varying and the classification of the individuals is performed through fuzzy relations. Simulations considering a real problem with data of the viral epidemic in a children daycare are performed and the results are compared with a stochastic Reed-Frost generalization.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This study evaluated the use of Raman spectroscopy to identify the spectral differences between normal (N), benign hyperplasia (BPH) and adenocarcinoma (CaP) in fragments of prostate biopsies in vitro with the aim of developing a spectral diagnostic model for tissue classification. A dispersive Raman spectrometer was used with 830 nm wavelength and 80 mW excitation. Following Raman data collection and tissue histopathology (48 fragments diagnosed as N, 43 as BPH and 14 as CaP), two diagnostic models were developed in order to extract diagnostic information: the first using PCA and Mahalanobis analysis techniques and the second one a simplified biochemical model based on spectral features of cholesterol, collagen, smooth muscle cell and adipocyte. Spectral differences between N, BPH and CaP tissues, were observed mainly in the Raman bands associated with proteins, lipids, nucleic and amino acids. The PCA diagnostic model showed a sensitivity and specificity of 100%, which indicates the ability of PCA and Mahalanobis distance techniques to classify tissue changes in vitro. Also, it was found that the relative amount of collagen decreased while the amount of cholesterol and adipocyte increased with severity of the disease. Smooth muscle cell increased in BPH tissue. These characteristics were used for diagnostic purposes.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Motivation: This paper introduces the software EMMIX-GENE that has been developed for the specific purpose of a model-based approach to the clustering of microarray expression data, in particular, of tissue samples on a very large number of genes. The latter is a nonstandard problem in parametric cluster analysis because the dimension of the feature space (the number of genes) is typically much greater than the number of tissues. A feasible approach is provided by first selecting a subset of the genes relevant for the clustering of the tissue samples by fitting mixtures of t distributions to rank the genes in order of increasing size of the likelihood ratio statistic for the test of one versus two components in the mixture model. The imposition of a threshold on the likelihood ratio statistic used in conjunction with a threshold on the size of a cluster allows the selection of a relevant set of genes. However, even this reduced set of genes will usually be too large for a normal mixture model to be fitted directly to the tissues, and so the use of mixtures of factor analyzers is exploited to reduce effectively the dimension of the feature space of genes. Results: The usefulness of the EMMIX-GENE approach for the clustering of tissue samples is demonstrated on two well-known data sets on colon and leukaemia tissues. For both data sets, relevant subsets of the genes are able to be selected that reveal interesting clusterings of the tissues that are either consistent with the external classification of the tissues or with background and biological knowledge of these sets.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The majority of the world's population now resides in urban environments and information on the internal composition and dynamics of these environments is essential to enable preservation of certain standards of living. Remotely sensed data, especially the global coverage of moderate spatial resolution satellites such as Landsat, Indian Resource Satellite and Systeme Pour I'Observation de la Terre (SPOT), offer a highly useful data source for mapping the composition of these cities and examining their changes over time. The utility and range of applications for remotely sensed data in urban environments could be improved with a more appropriate conceptual model relating urban environments to the sampling resolutions of imaging sensors and processing routines. Hence, the aim of this work was to take the Vegetation-Impervious surface-Soil (VIS) model of urban composition and match it with the most appropriate image processing methodology to deliver information on VIS composition for urban environments. Several approaches were evaluated for mapping the urban composition of Brisbane city (south-cast Queensland, Australia) using Landsat 5 Thematic Mapper data and 1:5000 aerial photographs. The methods evaluated were: image classification; interpretation of aerial photographs; and constrained linear mixture analysis. Over 900 reference sample points on four transects were extracted from the aerial photographs and used as a basis to check output of the classification and mixture analysis. Distinctive zonations of VIS related to urban composition were found in the per-pixel classification and aggregated air-photo interpretation; however, significant spectral confusion also resulted between classes. In contrast, the VIS fraction images produced from the mixture analysis enabled distinctive densities of commercial, industrial and residential zones within the city to be clearly defined, based on their relative amount of vegetation cover. The soil fraction image served as an index for areas being (re)developed. The logical match of a low (L)-resolution, spectral mixture analysis approach with the moderate spatial resolution image data, ensured the processing model matched the spectrally heterogeneous nature of the urban environments at the scale of Landsat Thematic Mapper data.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Complete small subunit ribosomal RNA gene (ssrDNA) and partial (D1-D3) large subunit ribosomal RNA gene (lsrDNA) sequences were used to estimate the phylogeny of the Digenea via maximum parsimony and Bayesian inference. Here we contribute 80 new ssrDNA and 124 new lsrDNA sequences. Fully complementary data sets of the two genes were assembled from newly generated and previously published sequences and comprised 163 digenean taxa representing 77 nominal families and seven aspidogastrean outgroup taxa representing three families. Analyses were conducted on the genes independently as well as combined and separate analyses including only the higher plagiorchiidan taxa were performed using a reduced-taxon alignment including additional characters that could not be otherwise unambiguously aligned. The combined data analyses yielded the most strongly supported results and differences between the two methods of analysis were primarily in their degree of resolution. The Bayesian analysis including all taxa and characters, and incorporating a model of nucleotide substitution (general-time-reversible with among-site rate heterogeneity), was considered the best estimate of the phylogeny and was used to evaluate their classification and evolution. In broad terms, the Digenea forms a dichotomy that is split between a lineage leading to the Brachylaimoidea, Diplostomoidea and Schistosomatoidea (collectively the Diplostomida nomen novum (nom. nov.)) and the remainder of the Digenea (the Plagiorchiida), in which the Bivesiculata nom. nov. and Transversotremata nom. nov. form the two most basal lineages, followed by the Hemiurata. The remainder of the Plagiorchiida forms a large number of independent lineages leading to the crown clade Xiphidiata nom. nov. that comprises the Allocreadioidea, Gorgoderoidea, Microphalloidea and Plagiorchioidea, which are united by the presence of a penetrating stylet in their cercariae. Although a majority of families and to a lesser degree, superfamilies are supported as currently defined, the traditional divisions of the Echinostomida, Plagiorchiida and Strigeida were found to comprise non-natural assemblages. Therefore, the membership of established higher taxa are emended, new taxa erected and a revised, phylogenetically based classification proposed and discussed in light of ontogeny, morphology and taxonomic history. (C) 2003 Australian Society for Parasitology Inc. Published by Elsevier Science Ltd. All rights reserved.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

An energy-based swing hammer mill model has been developed for coke oven feed preparation. it comprises a mechanistic power model to determine the dynamic internal recirculation and a perfect mixing mill model with a dual-classification function to mimic the operations of crusher and screen. The model parameters were calibrated using a pilot-scale swing hammer mill at various operating conditions. The effects of the underscreen configurations and the feed sizes on hammer mill operations were demonstrated through the fitted model parameters. Relationships between the model parameters and the machine configurations were established. The model was validated using the independent experimental data of single lithotype coal tests with the same BJD pilot-scale hammer mill and full operation audit data of an industrial hammer mill. The outcome of the energy-based swing hammer mill model is the capability to simulate the impact of changing blends of coal or mill configurations and operating conditions on product size distribution. Alternatively, the model can be used to select the machine settings required to achieve a desired product. (C) 2003 Elsevier Science B.V. All rights reserved.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In music genre classification, most approaches rely on statistical characteristics of low-level features computed on short audio frames. In these methods, it is implicitly considered that frames carry equally relevant information loads and that either individual frames, or distributions thereof, somehow capture the specificities of each genre. In this paper we study the representation space defined by short-term audio features with respect to class boundaries, and compare different processing techniques to partition this space. These partitions are evaluated in terms of accuracy on two genre classification tasks, with several types of classifiers. Experiments show that a randomized and unsupervised partition of the space, used in conjunction with a Markov Model classifier lead to accuracies comparable to the state of the art. We also show that unsupervised partitions of the space tend to create less hubs.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper describes a methodology that was developed for the classification of Medium Voltage (MV) electricity customers. Starting from a sample of data bases, resulting from a monitoring campaign, Data Mining (DM) techniques are used in order to discover a set of a MV consumer typical load profile and, therefore, to extract knowledge regarding to the electric energy consumption patterns. In first stage, it was applied several hierarchical clustering algorithms and compared the clustering performance among them using adequacy measures. In second stage, a classification model was developed in order to allow classifying new consumers in one of the obtained clusters that had resulted from the previously process. Finally, the interpretation of the discovered knowledge are presented and discussed.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

OBJECTIVE: To develop a Charlson-like comorbidity index based on clinical conditions and weights of the original Charlson comorbidity index. METHODS: Clinical conditions and weights were adapted from the International Classification of Diseases, 10th revision and applied to a single hospital admission diagnosis. The study included 3,733 patients over 18 years of age who were admitted to a public general hospital in the city of Rio de Janeiro, southeast Brazil, between Jan 2001 and Jan 2003. The index distribution was analyzed by gender, type of admission, blood transfusion, intensive care unit admission, age and length of hospital stay. Two logistic regression models were developed to predict in-hospital mortality including: a) the aforementioned variables and the risk-adjustment index (full model); and b) the risk-adjustment index and patient's age (reduced model). RESULTS: Of all patients analyzed, 22.3% had risk scores >1, and their mortality rate was 4.5% (66.0% of them had scores >1). Except for gender and type of admission, all variables were retained in the logistic regression. The models including the developed risk index had an area under the receiver operating characteristic curve of 0.86 (full model), and 0.76 (reduced model). Each unit increase in the risk score was associated with nearly 50% increase in the odds of in-hospital death. CONCLUSIONS: The risk index developed was able to effectively discriminate the odds of in-hospital death which can be useful when limited information is available from hospital databases.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Não existe uma definição única de processo de memória de longo prazo. Esse processo é geralmente definido como uma série que possui um correlograma decaindo lentamente ou um espectro infinito de frequência zero. Também se refere que uma série com tal propriedade é caracterizada pela dependência a longo prazo e por não periódicos ciclos longos, ou que essa característica descreve a estrutura de correlação de uma série de longos desfasamentos ou que é convencionalmente expressa em termos do declínio da lei-potência da função auto-covariância. O interesse crescente da investigação internacional no aprofundamento do tema é justificado pela procura de um melhor entendimento da natureza dinâmica das séries temporais dos preços dos ativos financeiros. Em primeiro lugar, a falta de consistência entre os resultados reclama novos estudos e a utilização de várias metodologias complementares. Em segundo lugar, a confirmação de processos de memória longa tem implicações relevantes ao nível da (1) modelação teórica e econométrica (i.e., dos modelos martingale de preços e das regras técnicas de negociação), (2) dos testes estatísticos aos modelos de equilíbrio e avaliação, (3) das decisões ótimas de consumo / poupança e de portefólio e (4) da medição de eficiência e racionalidade. Em terceiro lugar, ainda permanecem questões científicas empíricas sobre a identificação do modelo geral teórico de mercado mais adequado para modelar a difusão das séries. Em quarto lugar, aos reguladores e gestores de risco importa saber se existem mercados persistentes e, por isso, ineficientes, que, portanto, possam produzir retornos anormais. O objetivo do trabalho de investigação da dissertação é duplo. Por um lado, pretende proporcionar conhecimento adicional para o debate da memória de longo prazo, debruçando-se sobre o comportamento das séries diárias de retornos dos principais índices acionistas da EURONEXT. Por outro lado, pretende contribuir para o aperfeiçoamento do capital asset pricing model CAPM, considerando uma medida de risco alternativa capaz de ultrapassar os constrangimentos da hipótese de mercado eficiente EMH na presença de séries financeiras com processos sem incrementos independentes e identicamente distribuídos (i.i.d.). O estudo empírico indica a possibilidade de utilização alternativa das obrigações do tesouro (OT’s) com maturidade de longo prazo no cálculo dos retornos do mercado, dado que o seu comportamento nos mercados de dívida soberana reflete a confiança dos investidores nas condições financeiras dos Estados e mede a forma como avaliam as respetiva economias com base no desempenho da generalidade dos seus ativos. Embora o modelo de difusão de preços definido pelo movimento Browniano geométrico gBm alegue proporcionar um bom ajustamento das séries temporais financeiras, os seus pressupostos de normalidade, estacionariedade e independência das inovações residuais são adulterados pelos dados empíricos analisados. Por isso, na procura de evidências sobre a propriedade de memória longa nos mercados recorre-se à rescaled-range analysis R/S e à detrended fluctuation analysis DFA, sob abordagem do movimento Browniano fracionário fBm, para estimar o expoente Hurst H em relação às séries de dados completas e para calcular o expoente Hurst “local” H t em janelas móveis. Complementarmente, são realizados testes estatísticos de hipóteses através do rescaled-range tests R/S , do modified rescaled-range test M - R/S e do fractional differencing test GPH. Em termos de uma conclusão única a partir de todos os métodos sobre a natureza da dependência para o mercado acionista em geral, os resultados empíricos são inconclusivos. Isso quer dizer que o grau de memória de longo prazo e, assim, qualquer classificação, depende de cada mercado particular. No entanto, os resultados gerais maioritariamente positivos suportam a presença de memória longa, sob a forma de persistência, nos retornos acionistas da Bélgica, Holanda e Portugal. Isto sugere que estes mercados estão mais sujeitos a maior previsibilidade (“efeito José”), mas também a tendências que podem ser inesperadamente interrompidas por descontinuidades (“efeito Noé”), e, por isso, tendem a ser mais arriscados para negociar. Apesar da evidência de dinâmica fractal ter suporte estatístico fraco, em sintonia com a maior parte dos estudos internacionais, refuta a hipótese de passeio aleatório com incrementos i.i.d., que é a base da EMH na sua forma fraca. Atendendo a isso, propõem-se contributos para aperfeiçoamento do CAPM, através da proposta de uma nova fractal capital market line FCML e de uma nova fractal security market line FSML. A nova proposta sugere que o elemento de risco (para o mercado e para um ativo) seja dado pelo expoente H de Hurst para desfasamentos de longo prazo dos retornos acionistas. O expoente H mede o grau de memória de longo prazo nos índices acionistas, quer quando as séries de retornos seguem um processo i.i.d. não correlacionado, descrito pelo gBm(em que H = 0,5 , confirmando- se a EMH e adequando-se o CAPM), quer quando seguem um processo com dependência estatística, descrito pelo fBm(em que H é diferente de 0,5, rejeitando-se a EMH e desadequando-se o CAPM). A vantagem da FCML e da FSML é que a medida de memória de longo prazo, definida por H, é a referência adequada para traduzir o risco em modelos que possam ser aplicados a séries de dados que sigam processos i.i.d. e processos com dependência não linear. Então, estas formulações contemplam a EMH como um caso particular possível.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In cluster analysis, it can be useful to interpret the partition built from the data in the light of external categorical variables which are not directly involved to cluster the data. An approach is proposed in the model-based clustering context to select a number of clusters which both fits the data well and takes advantage of the potential illustrative ability of the external variables. This approach makes use of the integrated joint likelihood of the data and the partitions at hand, namely the model-based partition and the partitions associated to the external variables. It is noteworthy that each mixture model is fitted by the maximum likelihood methodology to the data, excluding the external variables which are used to select a relevant mixture model only. Numerical experiments illustrate the promising behaviour of the derived criterion. © 2014 Springer-Verlag Berlin Heidelberg.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Thesis submitted to the Instituto Superior de Estatística e Gestão de Informação da Universidade Nova de Lisboa in partial fulfillment of the requirements for the Degree of Doctor of Philosophy in Information Management – Geographic Information Systems

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Trabalho apresentado no âmbito do Mestrado em Engenharia Informática, como requisito parcial para obtenção do grau de Mestre em Engenharia Informática

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In cluster analysis, it can be useful to interpret the partition built from the data in the light of external categorical variables which are not directly involved to cluster the data. An approach is proposed in the model-based clustering context to select a number of clusters which both fits the data well and takes advantage of the potential illustrative ability of the external variables. This approach makes use of the integrated joint likelihood of the data and the partitions at hand, namely the model-based partition and the partitions associated to the external variables. It is noteworthy that each mixture model is fitted by the maximum likelihood methodology to the data, excluding the external variables which are used to select a relevant mixture model only. Numerical experiments illustrate the promising behaviour of the derived criterion.