895 results for discriminant analysis and cluster analysis


Relevance: 100.00%

Abstract:

Small-angle X-ray scattering (SAXS) images of normal breast tissue and of benign and malignant breast tumour tissues, fixed in formalin, were measured over the momentum transfer range 0.063 nm⁻¹ ≤ q ≤ 2.720 nm⁻¹, where q = 4π sin(θ/2)/λ. Four intrinsic parameters were extracted from the scattering profiles (the SAXS images reduced to one dimension), and three further parameters were derived from combinations of them. All parameters, intrinsic and derived, were subjected to discriminant analysis. The area of diffuse scatter in the momentum transfer range 0.50 ≤ q ≤ 0.56 nm⁻¹, the ratio between the areas of the fifth-order axial and third-order lateral peaks, and the third-order axial spacing provided the most significant information for diagnosis (p < 0.001). Combining these three parameters made it possible to classify human breast tissues as normal, benign lesion or malignant lesion with a sensitivity of 83% and a specificity of 100%.
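
As a reminder of what the two reported figures measure, sensitivity and specificity are ratios of confusion-matrix counts. The sketch below uses made-up counts (not the study's data) and assumes a binary "lesion vs. normal" split, which the abstract does not spell out.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of truly positive cases that are classified as positive."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of truly negative cases that are classified as negative."""
    return tn / (tn + fp)


# Hypothetical counts, NOT taken from the study.
tp, fn, tn, fp = 10, 2, 8, 0

print(f"sensitivity = {sensitivity(tp, fn):.0%}")
print(f"specificity = {specificity(tn, fp):.0%}")
```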

Relevance: 100.00%

Abstract:

The supervised pattern recognition methods K-nearest neighbors (KNN), stepwise discriminant analysis (SDA), and soft independent modelling of class analogy (SIMCA) were employed to investigate the relationship between the molecular structure of 27 cannabinoid compounds and their analgesic activity. In a previous analysis using two unsupervised pattern recognition methods, principal component analysis (PCA) and hierarchical cluster analysis (HCA), five descriptors were selected as the most relevant for the analgesic activity of the compounds studied: R3 (charge density on the substituent at position C3), Q1 (charge on atom C1), A (surface area), log P (logarithm of the partition coefficient) and MR (molecular refractivity). The supervised methods (SDA, KNN, and SIMCA) were used to construct a reliable model able to predict the analgesic activity of new cannabinoid compounds and to validate our previous study. The results obtained with SDA, KNN, and SIMCA agree perfectly with our previous model: all of the multivariate statistical methods classified the cannabinoid compounds into the same three groups (active, moderately active, and inactive).
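
Of the three supervised methods named above, KNN is the simplest to illustrate. The sketch below is a minimal majority-vote classifier over Euclidean distance; the two-descriptor data and the choice k = 3 are hypothetical and are not the study's descriptors or settings.

```python
from collections import Counter
import math


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (feature_vector, label) pairs; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical, already-scaled two-descriptor data (NOT from the study).
train = [
    ((0.10, 0.20), "active"),
    ((0.20, 0.10), "active"),
    ((0.15, 0.25), "active"),
    ((1.00, 1.10), "moderately active"),
    ((1.10, 0.90), "moderately active"),
    ((0.90, 1.00), "moderately active"),
    ((2.00, 2.10), "inactive"),
    ((2.10, 1.90), "inactive"),
    ((1.90, 2.00), "inactive"),
]

print(knn_predict(train, (0.95, 1.05)))
```

In practice the descriptors would be standardised first, since Euclidean distance is sensitive to the scale of each variable.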

Relevance: 100.00%

Abstract:

A multivariate model using hierarchical clustering and discriminant analysis is used to identify clusters of community opportunity and community vulnerability across Australia's mega metropolitan regions. Variables used in the model measure aspects of structural economic change, occupational change, human capital, income, unemployment, family/household disadvantage, and housing stress. A nine-cluster solution is used to categorise communities across metropolitan space. Significant between-city variations in the incidence of these clusters of opportunity and vulnerability are apparent, suggesting the emergence of marked differentiation between Australia's mega metropolitan regions in their adjustments to changing economic and social conditions. JEL classification: C49, R11, R12.

Relevance: 100.00%

Abstract:

Pine forests constitute some of the most important renewable resources, supplying the timber, paper and chemical industries, among other functions. Characterization of the volatiles emitted by different Pinus species has proven to be an important tool for decoding the process of host tree selection by herbivorous insects, some of which cause serious economic damage to pines. Variations in the relative composition of the bouquet of semiochemicals are responsible for the outcome of different biological processes, such as mate finding, egg-laying site recognition and host selection. The volatiles present in phloem samples of four pine species, P. halepensis, P. sylvestris, P. pinaster and P. pinea, were identified and characterized with the aim of finding possible host-plant attractants for native pests, such as the bark beetle Tomicus piniperda. The volatile compounds emitted by the phloem samples were extracted by headspace solid-phase microextraction, using a 2 cm 50/30 µm divinylbenzene/carboxen/polydimethylsiloxane StableFlex fiber, and analyzed by high-resolution gas chromatography with flame ionization detection on non-polar and chiral column phases. The components of the volatile fraction were identified by mass spectrometry using time-of-flight and quadrupole mass analyzers. The estimated relative composition was used to discriminate among pine species by means of cluster and principal component analysis. It can be concluded that pine species can be discriminated on the basis of the monoterpene emissions of phloem samples.

Relevance: 100.00%

Abstract:

Morphometric variability was studied among shrimp populations of the genus Palaemonetes Heller, 1869 from seven lakes in the Amazon Basin (Huanayo and Urcococha, in Peru; Amanã, Mamirauá, Camaleão, Cristalino and Iruçanga, in Brazil), presumably belonging to Palaemonetes carteri Gordon, 1935 and Palaemonetes ivonicus Holthuis, 1950. The morphometric studies were based on ratios computed from the morphometric characters. Multivariate analyses (Principal Components Analysis, PCA; Discriminant Function Analysis; and Cluster Analysis) were applied to the ratios. Intra- and interpopulation variations in the number of rostrum teeth and in the number of spines on the appendix masculina were analyzed through descriptive statistics and bivariate analysis (Spearman rank correlation test). The results indicated wide plasticity and overlap in the studied ratios between populations. The Principal Components Analysis was not able to separate the populations, revealing large intrapopulation plasticity and strong interpopulation similarity in the studied ratios. Although the Discriminant Function Analysis was not able to fully discriminate the populations, they could be allocated to three subgroups: 1) Cristalino and Iruçanga; 2) Huanayo, Urcococha and Camaleão; and 3) Mamirauá and Amanã. The first two groups were morphometrically separated from each other, whereas the third overlapped strongly with the other two. The Cluster Analysis confirmed the separation of the first two subgroups and indicated that the first and third groups were closely related. The rostrum teeth and the number of spines on the appendix masculina showed large intrapopulation variation and strong overlap among the studied populations, regardless of species.

Relevance: 100.00%

Abstract:

The Spanish savings banks have attracted considerable interest in the scientific arena, especially subsequent to the disappearance of the regulatory constraints during the second decade of the 1980s. Nonetheless, we identify a lack of research on the mainstream paths followed by strategic groups and on the analysis of total factor productivity. Therefore, on the basis of the resource-based view of the firm and cluster analysis, we use changes in structure and performance ratios to identify the strategic groups extant in the sector. We obtain a three-way division, which we link with different input-output specifications defining strategic paths. On the basis of these three dissimilar approaches we then compute and decompose a Hicks-Moorsteen total factor productivity index. The results suggest an interesting interpretation under a multi-strategic approach, and also show the drawbacks of employing cluster analysis within a complex strategic environment. Moreover, we propose an ex-post method of analysing the outcomes of the decomposed total factor productivity index that could be merged with non-traditional techniques of forming strategic groups, such as cognitive approaches.

Relevance: 100.00%

Abstract:

Cuticular hydrocarbons of larvae of individual strains of Anopheles gambiae sensu stricto were investigated using gas-liquid chromatography. Biomedical discriminant analysis involving multivariate statistics suggests that there were clear hydrocarbon differences between the Gambian (G3), the Nigerian (16CSS and its malathion-resistant substrain, REFMA) and the Tanzanian (KWA) strains. The high degree of segregation (95%) in hydrocarbons among the four strains investigated indicates that further analysis is needed to understand hydrocarbon variation in samples of An. gambiae, especially from areas where these populations co-exist.

Relevance: 100.00%

Abstract:

Discriminant analysis was used to identify, at the species level, eggs of Capillaria spp. found in organic remains from an archaeological site in Patagonia, Argentina, dated to 6,540 ± 110 years before present. In order to distinguish eggshell morphology, 149 eggs were measured and grouped into four arbitrary subsets. The analysis, applied to egg width and length, discriminated the eggs into different morphotypes (Wilks' lambda = 0.381, p < 0.05). The correlation analysis suggests that width was the most important variable for discriminating among the Capillaria spp. egg morphotypes (Pearson coefficient = 0.950, p < 0.05). The study of eggshell patterns, the relative frequency in the sample, and the morphometric data allowed us to correlate the four morphotypes with Capillaria species.
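
For readers unfamiliar with the statistic quoted above, Wilks' lambda is the ratio det(W)/det(T) of the within-group to the total sums-of-squares-and-cross-products matrices; values near 0 indicate strong group separation and values near 1 indicate none. The following is a minimal two-variable sketch, and the (width, length) measurements in it are hypothetical, not the study's.

```python
def mean(rows):
    """Column means of a list of (x, y) pairs."""
    n = len(rows)
    return (sum(x for x, _ in rows) / n, sum(y for _, y in rows) / n)


def sscp(rows, center):
    """2x2 sums-of-squares-and-cross-products matrix of `rows` about `center`."""
    sxx = sum((x - center[0]) ** 2 for x, _ in rows)
    syy = sum((y - center[1]) ** 2 for _, y in rows)
    sxy = sum((x - center[0]) * (y - center[1]) for x, y in rows)
    return [[sxx, sxy], [sxy, syy]]


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def wilks_lambda(groups):
    """Wilks' lambda = det(W) / det(T) for two variables."""
    all_rows = [row for group in groups for row in group]
    total = sscp(all_rows, mean(all_rows))
    within = [[0.0, 0.0], [0.0, 0.0]]
    for group in groups:
        s = sscp(group, mean(group))  # scatter about the group's own mean
        for i in range(2):
            for j in range(2):
                within[i][j] += s[i][j]
    return det2(within) / det2(total)


# Hypothetical (width, length) measurements in micrometres, NOT the study's data.
groups = [
    [(30.0, 60.0), (31.0, 62.0), (29.5, 59.0)],
    [(34.0, 66.0), (35.0, 65.0), (33.5, 67.0)],
]
print(round(wilks_lambda(groups), 3))
```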

Relevance: 100.00%

Abstract:

Compositional data naturally arises from the scientific analysis of the chemical composition of archaeological material such as ceramic and glass artefacts. Data of this type can be explored using a variety of techniques, from standard multivariate methods such as principal components analysis and cluster analysis, to methods based upon the use of log-ratios. The general aim is to identify groups of chemically similar artefacts that could potentially be used to answer questions of provenance. This paper will demonstrate work in progress on the development of a documented library of methods, implemented using the statistical package R, for the analysis of compositional data. R is an open source package that makes available very powerful statistical facilities at no cost. We aim to show how, with the aid of statistical software such as R, traditional exploratory multivariate analysis can easily be used alongside, or in combination with, specialist techniques of compositional data analysis. The library has been developed from a core of basic R functionality, together with purpose-written routines arising from our own research (for example that reported at CoDaWork'03). In addition, we have included other appropriate publicly available techniques and libraries that have been implemented in R by other authors. Available functions range from standard multivariate techniques through to various approaches to log-ratio analysis and zero replacement. We also discuss and demonstrate a small selection of relatively new techniques that have hitherto been little-used in archaeometric applications involving compositional data. The application of the library to the analysis of data arising in archaeometry will be demonstrated; results from different analyses will be compared; and the utility of the various methods discussed.
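
To make "log-ratio analysis and zero replacement" concrete, here is a minimal sketch of one common pairing: simple multiplicative zero replacement followed by the centred log-ratio (clr) transform. It is written in Python purely for illustration and is not the authors' R library; the function names, the replacement value `delta` and the oxide percentages are all assumptions.

```python
import math


def closure(parts):
    """Rescale a composition so that its parts sum to 1."""
    total = sum(parts)
    return [p / total for p in parts]


def replace_zeros(parts, delta=1e-3):
    """Multiplicative zero replacement for a composition that sums to 1.

    Each zero becomes `delta`; non-zero parts are shrunk so the sum stays 1.
    """
    n_zero = sum(1 for p in parts if p == 0)
    return [delta if p == 0 else p * (1 - n_zero * delta) for p in parts]


def clr(parts):
    """Centred log-ratio: log of each part minus the mean of the logs."""
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


# Hypothetical oxide percentages for one artefact (NOT real data); one part is zero.
raw = [72.0, 14.0, 9.0, 5.0, 0.0]
composition = replace_zeros(closure(raw))
transformed = clr(composition)
print([round(v, 3) for v in transformed])
```

The clr values sum to zero by construction, and they can be passed to standard methods such as principal components or cluster analysis.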

Relevance: 100.00%

Abstract:

A new quantitative approach to mandibular sexual dimorphism, based on computer-aided image analysis and elliptical Fourier analysis of the mandibular outline in lateral view, is presented. The method was applied to a series of 117 dentulous mandibles from 69 male and 48 female individuals native to Rhenish countries. Statistical discriminant analysis of the elliptical Fourier harmonics demonstrated a significant sexual dimorphism in 97.1% of males and 91.7% of females, i.e. in a higher proportion than in previous studies using classical metrical approaches. This original method opens interesting perspectives for increasing the accuracy of sex identification in current anthropological practice and in forensic procedures.

Relevance: 100.00%

Abstract:

Farm planning requires an assessment of the soil class. Research suggests that the Diagnosis and Recommendation Integrated System (DRIS) has the capacity to evaluate the nutritional status of coffee plantations, regardless of environmental conditions. Additionally, the use of DRIS could reduce the costs of farm planning. This study evaluated the relationship between the soil class and the nutritional status of coffee plants (Coffea canephora Pierre) using the Critical Level (CL) and DRIS methods, based on two multivariate statistical methods (discriminant and multidimensional scaling analyses). Over three consecutive years, yield and foliar nutrient concentrations (N, P, K, Ca, Mg, S, B, Zn, Mn, Fe and Cu) were obtained from coffee plantations cultivated in Espírito Santo state. Discriminant analysis showed that the soil class was an important factor determining the nutritional status of the coffee plants. The separation of groups by the CL method was not as effective as that by the DRIS method. The two-dimensional analysis of Euclidean distances did not show the same relationship between plant nutritional status and soil class. Multidimensional scaling analysis by the CL method indicated that 93.3% of the crops grouped into one cluster, whereas the DRIS method split the fields more evenly into three clusters. The DRIS method thus proved to be more consistent than the CL method for grouping coffee plantations by soil class.

Relevance: 100.00%

Abstract:

One major methodological problem in analysis of sequence data is the determination of costs from which distances between sequences are derived. Although this problem is currently not optimally dealt with in the social sciences, it has some similarity with problems that have been solved in bioinformatics for three decades. In this article, the authors propose an optimization of substitution and deletion/insertion costs based on computational methods. The authors provide an empirical way of determining costs for cases, frequent in the social sciences, in which theory does not clearly promote one cost scheme over another. Using three distinct data sets, the authors tested the distances and cluster solutions produced by the new cost scheme in comparison with solutions based on cost schemes associated with other research strategies. The proposed method performs well compared with other cost-setting strategies, while it alleviates the justification problem of cost schemes.
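
To show where substitution and insertion/deletion costs enter the distance between two sequences, the sketch below implements the standard dynamic-programming (optimal matching) distance. It does not implement the authors' cost-optimisation method; the states (E, U, S) and all cost values are hypothetical.

```python
def om_distance(a, b, sub_cost, indel=1.0):
    """Minimum total cost of turning sequence `a` into sequence `b`.

    `sub_cost` maps a frozenset of two distinct states to a substitution cost;
    `indel` is the cost of one insertion or one deletion.
    """
    n, m = len(a), len(b)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i * indel  # delete the first i states of a
    for j in range(1, m + 1):
        d[0][j] = j * indel  # insert the first j states of b
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            x, y = a[i - 1], b[j - 1]
            sub = 0.0 if x == y else sub_cost[frozenset((x, y))]
            d[i][j] = min(
                d[i - 1][j] + indel,    # deletion
                d[i][j - 1] + indel,    # insertion
                d[i - 1][j - 1] + sub,  # substitution or match
            )
    return d[n][m]


# Hypothetical states and costs: E = employed, U = unemployed, S = in study.
costs = {
    frozenset("EU"): 2.0,
    frozenset("ES"): 1.5,
    frozenset("US"): 1.0,
}
print(om_distance("EEUU", "EEEU", costs))
print(om_distance("ES", "EE", costs))
```

Changing the entries of `costs` or the value of `indel` changes the resulting distances and hence any cluster solution built on them, which is why the choice of cost scheme needs justification.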

Relevance: 100.00%

Abstract:

Artifacts are present in most electroencephalography (EEG) recordings, making the data difficult to interpret or analyze. In this paper a cleaning procedure based on a multivariate extension of empirical mode decomposition is applied to raw EEG data to improve their quality. A synchrony measure is then computed on both the raw and the cleaned data in order to compare classification rates. Two classifiers are used: linear discriminant analysis and neural networks. In both cases, the classification rate improves by about 20%.

Relevance: 100.00%

Abstract:

A new analytical method was developed to non-destructively determine the pH and degree of polymerisation (DP) of cellulose in fibres in 19th- and 20th-century painting canvases, and to identify the fibre type: cotton, linen, hemp, ramie or jute. The method is based on NIR spectroscopy and multivariate data analysis; a reference collection of 199 historical canvas samples was used for calibration and validation. The reference collection was analysed destructively using microscopy and chemical analytical methods. Partial least squares regression was used to build quantitative methods to determine pH and DP, and linear discriminant analysis was used to determine the fibre type. To interpret the chemical information obtained, an expert assessment panel developed a categorisation system to identify canvases that may not be fit to withstand excessive mechanical stress, e.g. transportation. The limiting DP for this category was found to be 600. With the new method and categorisation system, the canvases of 12 Dalí paintings from the Fundació Gala-Salvador Dalí (Figueres, Spain) were non-destructively analysed for pH, DP and fibre type, and their fitness was determined, which informs conservation recommendations. The study demonstrates that collection-wide canvas condition surveys can be performed efficiently and non-destructively, which could significantly improve collection management.

Relevance: 100.00%

Abstract:

This study aimed to develop a methodology based on multivariate statistical analysis (principal components and cluster analysis) in order to identify the most representative variables in studies of minimum streamflow regionalization, and to optimize the identification of hydrologically homogeneous regions for the Doce river basin. Ten variables were used, each determined for every one of the 61 gauging stations: three dependent variables indicative of minimum streamflow (Q7,10, Q90 and Q95) and seven independent variables describing climatic and morphometric characteristics of the basin (total annual rainfall, Pa; total semiannual rainfall of the dry and of the rainy season, Pss and Psc; watershed drainage area, Ad; length of the main river, Lp; total length of the rivers, Lt; and average watershed slope, SL). The principal component analysis indicated that SL was the least representative variable for the study, so it was discarded. The most representative independent variables were Ad and Psc. The best divisions into hydrologically homogeneous regions for the three flow characteristics studied were obtained using the Mahalanobis similarity matrix and the complete linkage clustering method. The cluster analysis enabled the identification of four hydrologically homogeneous regions in the Doce river basin.
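
The combination of Mahalanobis distance and complete linkage mentioned above can be sketched in a few lines. This is an illustrative two-variable version only: the station values are hypothetical, the covariance is estimated from all points pooled together (an assumption, since the abstract does not say how it was estimated), and three clusters are requested rather than the study's four.

```python
def covariance_2d(points):
    """Sample variances and covariance (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return sxx, sxy, syy


def mahalanobis(p, q, cov):
    """Mahalanobis distance between p and q, using the explicit 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = p[0] - q[0], p[1] - q[1]
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return d2 ** 0.5


def complete_linkage(points, n_clusters):
    """Agglomerate until n_clusters remain; returns lists of point indices."""
    cov = covariance_2d(points)
    clusters = [[i] for i in range(len(points))]

    def gap(a, b):
        # Complete linkage: distance between the two farthest members.
        return max(mahalanobis(points[i], points[j], cov) for i in a for j in b)

    while len(clusters) > n_clusters:
        pairs = (
            (x, y)
            for x in range(len(clusters))
            for y in range(x + 1, len(clusters))
        )
        a, b = min(pairs, key=lambda pair: gap(clusters[pair[0]], clusters[pair[1]]))
        clusters[a] += clusters[b]
        del clusters[b]
    return clusters


# Hypothetical stations: (drainage area in km^2, rainy-season rainfall in mm).
stations = [
    (100, 800), (120, 820), (110, 790),
    (900, 810), (950, 790), (920, 830),
    (500, 1400), (520, 1380), (480, 1420),
]
print(complete_linkage(stations, 3))
```

Because the Mahalanobis distance rescales each variable by the covariance, drainage area and rainfall contribute comparably even though they are measured in very different units.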