979 results for clustering techniques


Relevance:

20.00% 20.00%

Publisher:

Abstract:

In cluster analysis, it can be useful to interpret the partition built from the data in the light of external categorical variables that are not directly involved in clustering the data. An approach is proposed in the model-based clustering context to select a number of clusters that both fits the data well and takes advantage of the potential illustrative ability of the external variables. This approach makes use of the integrated joint likelihood of the data and the partitions at hand, namely the model-based partition and the partitions associated with the external variables. Notably, each mixture model is fitted to the data by maximum likelihood, excluding the external variables, which are used only to select a relevant mixture model. Numerical experiments illustrate the promising behaviour of the derived criterion. © 2014 Springer-Verlag Berlin Heidelberg.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In the present paper we focus on the performance of clustering algorithms, using indices of paired agreement to measure the accordance between clusters and an a priori known structure. We specifically propose a method to correct all indices considered for agreement by chance; the adjusted indices are meant to provide a realistic measure of clustering performance. The proposed method enables the correction of virtually any index, overcoming previous limitations known in the literature, and provides very precise results. We use simulated datasets under diverse scenarios and discuss the pertinence of our proposal, which is particularly relevant when poorly separated clusters are considered. Finally, we compare the performance of the EM and KMeans algorithms within each of the simulated scenarios and conclude that EM generally yields the best results.
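As a point of reference for what "correcting an index for agreement by chance" means, the sketch below computes the classic Hubert-Arabie adjusted Rand index, one well-known chance-corrected paired-agreement index. It is not the correction method proposed in the paper, which the abstract does not specify; the function name `adjusted_rand_index` and the toy labelings are illustrative assumptions.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Hubert-Arabie adjusted Rand index for two labelings of the same points."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("both labelings must cover the same points")
    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        # Fewer than two points: the partitions agree trivially.
        return 1.0

    # Contingency counts: points per (reference cluster, found cluster) pair.
    joint_counts = Counter(zip(labels_true, labels_pred))
    true_counts = Counter(labels_true)
    pred_counts = Counter(labels_pred)

    # Pairs of points grouped together in both partitions, and in each one alone.
    together_both = sum(comb(c, 2) for c in joint_counts.values())
    together_true = sum(comb(c, 2) for c in true_counts.values())
    together_pred = sum(comb(c, 2) for c in pred_counts.values())

    # Agreement expected by chance with cluster sizes held fixed.
    expected = together_true * together_pred / total_pairs
    maximum = (together_true + together_pred) / 2
    if maximum == expected:
        # Degenerate case: both partitions are trivial (one cluster or all singletons).
        return 1.0
    return (together_both - expected) / (maximum - expected)


if __name__ == "__main__":
    reference = [0, 0, 1, 1]
    print(adjusted_rand_index(reference, [1, 1, 0, 0]))  # same partition, labels renamed
    print(adjusted_rand_index(reference, [0, 1, 0, 1]))  # partition that splits every reference pair
```

The adjusted value is 1 for identical partitions regardless of label names, is close to 0 for agreement at the chance level, and can be negative when agreement is worse than chance.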

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Work presented within the scope of the Master's programme in Informatics Engineering, as a partial requirement for obtaining the degree of Master in Informatics Engineering

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper analyses the provision of auxiliary clinical services that are typically carried out within the hospital. We estimate a flexible cost function for the three most important (cost-wise) diagnostic techniques and therapeutic services in Portuguese hospitals: Clinical Pathology, Medical Imaging, and Physical Medicine and Rehabilitation. Our objective in carrying out this estimation is the evaluation of economies of scale and scope in the provision of these services. For all services, we find evidence of ray economies of scale and some evidence of economies of scope. These results have important policy implications and can be related to the ongoing discussion of where and how hospitals should provide these services.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Dissertation presented to obtain the degree of Doctor in Electrical and Computer Engineering from the Universidade Nova de Lisboa, Faculdade de Ciências e Tecnologia

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The search for patterns in data in order to form groups is known as data clustering, and it is one of the most frequently performed tasks in data mining and pattern recognition. This dissertation addresses the concept of entropy and uses algorithms with entropic criteria to cluster biomedical data. The use of entropy for clustering is relatively recent; it arises from an attempt to exploit the ability of entropy to extract higher-order information from the data distribution, either to use it as the criterion for forming groups (clusters) or to complement or improve existing algorithms in search of better results. Some studies involving algorithms based on entropic criteria have shown positive results in the analysis of real data. In this work, several algorithms based on entropic criteria were explored, along with their applicability to biomedical data, in an attempt to assess how well suited these algorithms are to this type of data. The results of the tested algorithms are compared with those obtained by more "conventional" algorithms such as k-means, spectral clustering algorithms and a density-based algorithm.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Dissertation presented as a partial requirement for obtaining the degree of Master in Geographic Information Science and Systems

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Recent studies of mobile Web trends show the continued explosion of mobile-friendly content. However, the large number and heterogeneity of mobile devices pose several challenges for Web programmers, who want automatic delivery of context and adaptation of the content to mobile devices. Hence, the device detection phase assumes an important role in this process. In this chapter, the authors compare the most used approaches for mobile device detection. Based on this study, they present an architecture for detecting and delivering uniform m-Learning content to students in a Higher School. The authors focus mainly on the XML device capabilities repository and on the REST API Web Service for dealing with device data. For the former, the authors detail the respective capabilities schema and present a new caching approach. For the latter, they present an extension of the current API for dealing with it. Finally, the authors validate their approach by presenting the overall data and statistics collected through the Google Analytics service, in order to better understand the adherence to the mobile Web interface, its evolution over time, and the main weaknesses.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Doctorate in Conservation and Restoration, specialty in Theory, History and Techniques

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Mathematical models and statistical analysis are key instruments in soil science research, as they can describe and/or predict the current state of a soil system. These tools allow us to explore the behavior of soil-related processes and properties as well as to generate new hypotheses for future experimentation. A good model and analysis of variations in soil properties, one that permits us to draw suitable conclusions and to estimate spatially correlated variables at unsampled locations, clearly depends on the amount and quality of the data and on the robustness of the techniques and estimators. In turn, the quality of the data depends on a competent data collection procedure and on capable laboratory analytical work. Following the standard soil sampling protocols available, soil samples should be collected according to key points such as a convenient spatial scale, landscape homogeneity (or non-homogeneity), land color, soil texture, land slope and solar exposure. Obtaining good quality data from forest soils is predictably expensive, as it is labor intensive and demands considerable manpower and equipment both in field work and in laboratory analysis. Also, the sampling scheme to be used in a data collection procedure in the forest field is not simple to design, as the sampling strategies chosen are strongly dependent on soil taxonomy. In fact, a sampling grid cannot be followed if rocks are found at the planned collecting depth, if no soil at all is found, or if large trees bar the soil collection. Considering this, a proficient design of a soil data sampling campaign in the forest field is not always a simple process and sometimes represents a truly huge challenge. In this work, we present some difficulties that occurred during two experiments on forest soil that were conducted in order to study the spatial variation of some soil physical-chemical properties.
Two different sampling protocols were considered for monitoring two types of forest soils located in NW Portugal: umbric regosol and lithosol. Two different pieces of sampling equipment were also used: a manual auger and a shovel. Both scenarios were analyzed, and the results led us to conclude that monitoring forest soil for mathematical and statistical investigation requires a data collection sampling procedure compatible with established protocols, but that a pre-defined grid assumption often fails when the variability of the soil property is not uniform in space. In this case, the sampling grid should be conveniently adapted from one part of the landscape to another, and this fact should be taken into consideration in the mathematical procedure.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In the present study, three techniques for obtaining outer membrane enriched fractions from Yersinia pestis were evaluated. The techniques analysed were differential solubilization of the cytoplasmic membrane with Sarkosyl or Triton X-100, and centrifugation in sucrose density gradients. Sodium dodecyl-sulfate polyacrylamide gel electrophoresis (SDS-PAGE) of the outer membranes isolated by the different methods resulted in similar protein patterns. The measurement of NADH-dehydrogenase and succinate dehydrogenase (inner membrane enzymes) indicated that the outer membrane preparations obtained by the three methods were pure enough for analytical studies. In addition, preliminary evidence on the potential use of outer membrane proteins for the identification of geographic variants of Y. pestis wild isolates is presented.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Reliable flow simulation software is essential for determining an optimal injection strategy in Liquid Composite Molding processes. Several methodologies can be implemented in standard software in order to reduce CPU time, and post-processing techniques might be one of them. Post-processing a finite element solution is a well-known procedure, which consists in a recalculation of the originally obtained quantities such that the rate of convergence increases without the need for expensive remeshing techniques. Post-processing is especially effective in problems where better accuracy is required for derivatives of nodal variables in regions where a Dirichlet essential boundary condition is imposed strongly. Previous works examined the influence of the smoothness of a non-homogeneous Dirichlet condition imposed on a smooth front. However, a rather non-smooth boundary is usually obtained at each time step of the infiltration process due to discretization, and direct application of post-processing techniques then does not improve the final results as expected. The new contribution of this paper lies in an improvement of the standard methodology. The improved results clearly show that the recalculated flow front is closer to the “exact” one, is smoother than the previous one, and improves local disturbances of the “exact” solution.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Post-processing a finite element solution is a well-known technique, which consists in a recalculation of the originally obtained quantities such that the rate of convergence increases without the need for expensive remeshing techniques. Post-processing is especially effective in problems where better accuracy is required for derivatives of nodal variables in regions where a Dirichlet essential boundary condition is imposed strongly. Consequently, such an approach can be exceptionally good in the modelling of resin infiltration under a quasi steady-state assumption by remeshing techniques and with explicit time integration, because only the free-front normal velocities are necessary to advance the resin front to the next position. The new contribution is the post-processing analysis and implementation of the free-boundary velocities of mesolevel infiltration analysis. Such an implementation ensures better accuracy even on coarser meshes, which in turn reduces the computational time, also through the possibility of employing larger time steps.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Clustering ensemble methods produce a consensus partition of a set of data points by combining the results of a collection of base clustering algorithms. In the evidence accumulation clustering (EAC) paradigm, the clustering ensemble is transformed into a pairwise co-association matrix, thus avoiding the label correspondence problem, which is intrinsic to other clustering ensemble schemes. In this paper, we propose a consensus clustering approach based on the EAC paradigm, which is not limited to crisp partitions and fully exploits the nature of the co-association matrix. Our solution determines probabilistic assignments of data points to clusters by minimizing a Bregman divergence between the observed co-association frequencies and the corresponding co-occurrence probabilities expressed as functions of the unknown assignments. We additionally propose an optimization algorithm to find a solution under any double-convex Bregman divergence. Experiments on both synthetic and real benchmark data show the effectiveness of the proposed approach.
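The co-association matrix at the heart of the EAC paradigm can be sketched in a few lines: entry (i, j) is the fraction of base partitions in which points i and j share a cluster. The sketch below covers only this construction step, not the Bregman-divergence optimization proposed in the paper; the function name `co_association` and the toy ensemble are illustrative assumptions.

```python
def co_association(partitions):
    """Return the pairwise co-association matrix of an ensemble of partitions.

    `partitions` is a list of labelings; each labeling assigns one cluster
    label to every data point, in the same point order.
    """
    if not partitions:
        raise ValueError("the ensemble must contain at least one partition")
    n_points = len(partitions[0])
    if any(len(labels) != n_points for labels in partitions):
        raise ValueError("every partition must label the same points")

    counts = [[0] * n_points for _ in range(n_points)]
    for labels in partitions:
        for i in range(n_points):
            for j in range(n_points):
                # Only co-membership matters, so label names never need matching.
                if labels[i] == labels[j]:
                    counts[i][j] += 1

    n_partitions = len(partitions)
    return [[c / n_partitions for c in row] for row in counts]


if __name__ == "__main__":
    ensemble = [
        [0, 0, 1, 1],
        [1, 1, 0, 0],  # same grouping as the first partition, labels swapped
        [0, 1, 1, 1],
    ]
    for row in co_association(ensemble):
        print(row)
```

Because only co-membership within each partition is counted, the first two partitions contribute identically even though their label names differ, which is why this representation sidesteps the label correspondence problem mentioned in the abstract.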

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Demand response can play a very relevant role in the context of power systems with an intensive use of distributed energy resources, of which intermittent renewable sources are a significant part. More active consumer participation can help improve system reliability and decrease or defer the required investments. Adequate use and management of demand response is even more important in competitive electricity markets. However, experience shows that it is difficult to make adequate use of demand response in this context, which points to the need for research work in this area. The most important difficulties seem to be caused by inadequate business models and by inadequate management of demand response programs. This paper contributes to developing methodologies and a computational infrastructure able to provide the involved players with adequate decision support on the design and use of demand response programs and contracts. The presented work uses DemSi, a demand response simulator that has been developed by the authors to simulate demand response actions and programs, and which includes realistic power system simulation. It includes an optimization module for the application of demand response programs and contracts using deterministic and metaheuristic approaches. The proposed methodology is an important improvement to the simulator, while providing adequate tools for the adoption of demand response programs by the involved players. A machine learning method based on clustering and classification techniques, resulting in a rule base concerning the use of DR programs and contracts, is also used. A case study concerning the use of demand response in an incident situation is presented.