959 results for Clustering methods
Abstract:
Typing methods to evaluate isolates in relation to their phenotypical and molecular characteristics are essential in epidemiological studies. In this study, Candida albicans biotypes were determined before and after storage in order to verify their stability. Twenty C. albicans isolates were typed by Randomly Amplified Polymorphic DNA (RAPD), production of phospholipase and proteinase exoenzymes (enzymotyping) and morphotyping before and after 180 days of storage in Sabouraud dextrose agar (SDA) and sterilised distilled water. Before storage, 19 RAPD patterns, two enzymotypes and eight morphotypes were identified. The fragment patterns obtained by RAPD were not significantly altered after storage. In contrast, the majority of the isolates changed their enzymotype and morphotype after storage. RAPD typing provided the best discriminatory index (DI) among isolates (DI = 0.995) and maintained the profile identified, thereby confirming its utility in epidemiological surveys. Based on the low reproducibility observed after storage in SDA and distilled water by morphotyping (DI = 0.853) and enzymotyping (DI = 0.521), the use of these techniques is not recommended on stored isolates.
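The abstract does not say how the discriminatory index was calculated. As a minimal sketch, assuming the widely used Hunter-Gaston form of Simpson's index of diversity, the DI can be computed from the type label assigned to each isolate. The function name and the pattern labels below are hypothetical. Twenty isolates falling into 19 patterns necessarily means 18 unique patterns plus one pattern shared by two isolates, and under this assumption the sketch gives a value that rounds to the reported 0.995.

```python
from collections import Counter


def discriminatory_index(type_labels):
    """Simpson's index of diversity in the Hunter-Gaston form (assumed here).

    DI = 1 - sum(n_j * (n_j - 1)) / (N * (N - 1)),
    where n_j is the number of isolates of type j and N the total number.
    """
    n = len(type_labels)
    if n < 2:
        raise ValueError("need at least two isolates")
    counts = Counter(type_labels).values()
    same_type_pairs = sum(c * (c - 1) for c in counts)
    return 1.0 - same_type_pairs / (n * (n - 1))


# Hypothetical labels: 18 unique patterns plus one pattern shared by two isolates.
labels = [f"pattern_{i}" for i in range(18)] + ["pattern_18", "pattern_18"]
di = discriminatory_index(labels)
print(round(di, 3))
```

A DI close to 1 means that two isolates drawn at random are very likely to be assigned to different types.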
Abstract:
The supervised pattern recognition methods K-Nearest Neighbors (KNN), stepwise discriminant analysis (SDA), and soft independent modelling of class analogy (SIMCA) were employed in this work to investigate the relationship between the molecular structure of 27 cannabinoid compounds and their analgesic activity. Previous analyses using two unsupervised pattern recognition methods (principal component analysis, PCA, and hierarchical cluster analysis, HCA) were performed, and five descriptors were selected as the most relevant for the analgesic activity of the compounds studied: R (3) (charge density on substituent at position C(3)), Q (1) (charge on atom C(1)), A (surface area), log P (logarithm of the partition coefficient) and MR (molecular refractivity). The supervised pattern recognition methods (SDA, KNN, and SIMCA) were employed to construct a reliable model that can predict the analgesic activity of new cannabinoid compounds and to validate our previous study. The results obtained using the SDA, KNN, and SIMCA methods agree perfectly with our previous model. Comparing the SDA, KNN, and SIMCA results with the PCA and HCA ones, we found that all multivariate statistical methods classified the cannabinoid compounds studied into three groups in exactly the same way: active, moderately active, and inactive.
Abstract:
This paper critically assesses several loss allocation methods based on the type of competition each method promotes. This understanding assists in determining which method will promote more efficient network operations when implemented in deregulated electricity industries. The methods addressed in this paper include the pro rata [1], proportional sharing [2], loss formula [3], incremental [4], and a new method proposed by the authors of this paper, which is loop-based [5]. These methods are tested on a modified Nordic 32-bus network, where different case studies of different operating points are investigated. The varying results obtained for each allocation method at different operating points make it possible to distinguish methods that promote unhealthy competition from those that encourage better system operation.
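Of the methods listed, pro rata allocation is the simplest to illustrate. The sketch below assumes its usual textbook form, in which total network losses are shared among participants in proportion to their power, regardless of where they sit in the network. The function name, bus names, and numbers are hypothetical and are not taken from the paper or from the Nordic 32-bus case studies.

```python
def pro_rata_allocation(total_loss_mw, power_mw):
    """Share total losses in proportion to each participant's power (MW).

    This is an assumed, simplified form of pro rata allocation: it uses
    only the power magnitudes and ignores network location entirely.
    """
    total_power = sum(power_mw.values())
    if total_power <= 0:
        raise ValueError("total power must be positive")
    return {name: total_loss_mw * p / total_power for name, p in power_mw.items()}


# Hypothetical example: two loads sharing 8 MW of losses.
loads = {"bus_a": 100.0, "bus_b": 300.0}
shares = pro_rata_allocation(8.0, loads)
print(shares)
```

Because the allocated share depends only on the power level, two participants with equal power receive equal loss charges even if one is electrically far from generation. This is the kind of incentive effect that a comparison across allocation methods is meant to expose.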
Abstract:
We propose quadrature rules for the approximation of line integrals possessing logarithmic singularities and show their convergence. In some instances a superconvergence rate is demonstrated.
Abstract:
Data mining is the process of identifying valid, implicit, previously unknown, potentially useful and understandable information from large databases. It is an important step in the process of knowledge discovery in databases (Olaru & Wehenkel, 1999). In a data mining process, input data can be structured, semi-structured, or unstructured. Data can be text, categorical or numerical values. One of the important characteristics of data mining is its ability to deal with data that are large in volume, distributed, time-variant, noisy, and high-dimensional. A large number of data mining algorithms have been developed for different applications. For example, association rules mining can be useful for market basket problems, clustering algorithms can be used to discover trends in unsupervised learning problems, classification algorithms can be applied in decision-making problems, and sequential and time series mining algorithms can be used in predicting events, fault detection, and other supervised learning problems (Vapnik, 1999). Classification is among the most important tasks in data mining, particularly for applications in engineering fields. Together with regression, classification is used mainly for predictive modelling. A number of classification algorithms are in use in practice. According to Sebastiani (2002), the main classification algorithms can be categorized as: decision tree and rule-based approaches such as C4.5 (Quinlan, 1996); probability methods such as the Bayesian classifier (Lewis, 1998); on-line methods such as Winnow (Littlestone, 1988) and CVFDT (Hulten 2001); neural network methods (Rumelhart, Hinton & Williams, 1986); example-based methods such as k-nearest neighbors (Duda & Hart, 1973); and SVM (Cortes & Vapnik, 1995). Other important techniques for classification tasks include Associative Classification (Liu et al., 1998) and Ensemble Classification (Tumer, 1996).
Abstract:
There are many techniques for electricity market price forecasting. However, most of them are designed for expected price analysis rather than price spike forecasting. An effective method of predicting the occurrence of spikes has not yet been reported in the literature. In this paper, a data mining based approach is presented to give a reliable forecast of the occurrence of price spikes. Combined with the spike value prediction techniques developed by the same authors, the proposed approach aims at providing a comprehensive tool for price spike forecasting. Feature selection techniques are first described to identify the attributes relevant to the occurrence of spikes. A brief introduction to the classification techniques is given for completeness. Two algorithms, support vector machine and probability classifier, are chosen as the spike occurrence predictors and are discussed in detail. Realistic market data are used to test the proposed model, with promising results.
Abstract:
The artificial dissipation effects in some solutions obtained with a Navier-Stokes flow solver are demonstrated. The solvers were used to calculate the flow of an artificially dissipative fluid, which is a fluid having dissipative properties which arise entirely from the solution method itself. This was done by setting the viscosity and heat conduction coefficients in the Navier-Stokes solvers to zero everywhere inside the flow, while at the same time applying the usual no-slip and thermal conducting boundary conditions at solid boundaries. An artificially dissipative flow solution is found where the dissipation depends entirely on the solver itself. If the difference between the solutions obtained with the viscosity and thermal conductivity set to zero and their correct values is small, it is clear that the artificial dissipation is dominating and the solutions are unreliable.
Abstract:
Recent efforts in the characterization of air-water flow properties have included some clustering process analysis. A cluster of bubbles is defined as a group of two or more bubbles, with a distinct separation from other bubbles before and after the cluster. The present paper compares the results of clustering processes in two hydraulic structures: a large-size dropshaft and a hydraulic jump in a rectangular horizontal channel. The comparison highlighted some significant differences in cluster production and structure. Both dropshaft and hydraulic jump flows are complex turbulent shear flows, and a clustering index may provide some measure of the bubble-turbulence interactions and associated energy dissipation.
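The cluster definition above can be sketched as a one-pass grouping over a sequence of bubbles. The abstract does not state what counts as a "distinct separation", so the sketch assumes a fixed gap threshold (`max_gap`) between the end of one bubble and the start of the next. The threshold, the function name, and the sample intervals are all hypothetical.

```python
def find_clusters(bubbles, max_gap):
    """Group consecutive bubbles separated by at most max_gap.

    bubbles: list of (start, end) intervals sorted by start, e.g. the times
             at which a probe tip enters and leaves each bubble.
    max_gap: assumed separation threshold; a larger gap ends the current group.
    Returns only groups of two or more bubbles, following the definition
    of a cluster given in the abstract.
    """
    clusters = []
    current = []
    for bubble in bubbles:
        # A gap larger than the threshold closes the current group.
        if current and bubble[0] - current[-1][1] > max_gap:
            if len(current) >= 2:
                clusters.append(current)
            current = []
        current.append(bubble)
    if len(current) >= 2:
        clusters.append(current)
    return clusters


# Hypothetical signal: a pair of bubbles, one isolated bubble, then a group of three.
bubbles = [(0.0, 1.0), (1.2, 2.0), (5.0, 5.5), (9.0, 9.4), (9.5, 10.0), (10.1, 10.6)]
clusters = find_clusters(bubbles, max_gap=0.5)
print([len(c) for c in clusters])
```

Summary quantities such as the number of clusters or the mean number of bubbles per cluster can then be derived from the returned groups.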
Abstract:
Conferences that deliver interactive sessions designed to enhance physician participation, such as role play, small discussion groups, workshops, hands-on training, problem- or case-based learning and individualised training sessions, are effective for physician education.
Abstract:
An investigation was undertaken to test the effectiveness of two procedures for recording boundaries and plot positions for scientific studies on farms on Leyte Island, the Philippines. The accuracy of a Garmin 76 Global Positioning System (GPS) unit and a compass and chain was checked under the same conditions. Tree canopies interfered with the ability of the satellite signal to reach the GPS and therefore the GPS survey was less accurate than the compass and chain survey. Where a high degree of accuracy is required, a compass and chain survey remains the most effective method of surveying land underneath tree canopies, provided operator error is minimised. For a large number of surveys and thus large amounts of data, a GPS is more appropriate than a compass and chain survey because data are easily uploaded into a Geographic Information System (GIS). However, under dense canopies where satellite signals cannot reach the GPS, it may be necessary to revert to a compass survey or a combination of both methods.
Abstract:
The ability to predict leaf area and leaf area index is crucial in crop simulation models that predict crop growth and yield. Previous studies have shown existing methods of predicting leaf area to be inadequate when applied to a broad range of cultivars with different numbers of leaves. The objectives of the study were to (i) develop generalised methods of modelling individual and total plant leaf area, and leaf senescence, that do not require constants that are specific to environments and/or genotypes, (ii) re-examine the base, optimum, and maximum temperatures for calculation of thermal time for leaf senescence, and (iii) assess the method of calculation of individual leaf area from leaf length and leaf width in experimental work. Five cultivars of maize differing widely in maturity and adaptation were planted in October 1994 in south-eastern Queensland, and grown under non-limiting conditions of water and plant nutrient supplies. Additional data for maize plants with low total leaf number (12-17) grown at Katumani Research Centre, Kenya, were included to extend the range in the total leaf number per plant. The equation for the modified (slightly skewed) bell curve could be generalised for modelling individual leaf area, as all coefficients in it were related to total leaf number. Use of coefficients for individual genotypes can be avoided, and individual and total plant leaf area can be calculated from total leaf number. A single, logistic equation, relying on maximum plant leaf area and thermal time from emergence, was developed to predict leaf senescence. The base, optimum, and maximum temperatures for calculation of thermal time for leaf senescence were 8, 34, and 40 degrees C, and apply for the whole crop-cycle when used in modelling of leaf senescence. Thus, the modelling of leaf production and senescence is simplified, improved, and generalised. Consequently, the modelling of leaf area index (LAI) and variables that rely on LAI will be improved. 
For experimental purposes, we found that the calculation of leaf area from leaf length and leaf width remains appropriate, though the relationship differed slightly from previously published equations.
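The cardinal temperatures reported for leaf senescence (base 8, optimum 34, maximum 40 degrees C) can be turned into a thermal time calculation. The abstract does not give the response function, so the sketch below assumes a common broken-linear form: thermal time rises linearly from the base to the optimum temperature, then falls linearly to zero at the maximum. The use of daily mean temperature and the function names are also assumptions.

```python
# Cardinal temperatures for leaf senescence reported in the abstract (degrees C).
T_BASE, T_OPT, T_MAX = 8.0, 34.0, 40.0


def daily_thermal_time(mean_temp_c):
    """Daily thermal time (degree-days) from a daily mean temperature.

    Assumed broken-linear response: zero at or below T_BASE and at or above
    T_MAX, rising linearly up to T_OPT, then declining linearly to T_MAX.
    """
    if mean_temp_c <= T_BASE or mean_temp_c >= T_MAX:
        return 0.0
    if mean_temp_c <= T_OPT:
        return mean_temp_c - T_BASE
    return (T_OPT - T_BASE) * (T_MAX - mean_temp_c) / (T_MAX - T_OPT)


def accumulated_thermal_time(daily_mean_temps_c):
    """Sum daily thermal time over a sequence of daily mean temperatures."""
    return sum(daily_thermal_time(t) for t in daily_mean_temps_c)


# Hypothetical three-day temperature record (degrees C).
total = accumulated_thermal_time([20.0, 25.0, 45.0])
print(total)
```

Thermal time accumulated from emergence in this way is the kind of driving variable that the logistic senescence equation described above relies on; the coefficients of that equation are not given in the abstract and are not reproduced here.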