922 results for principal component regression


Relevance:

80.00%

Abstract:

Laser desorption ionisation mass spectrometry (LDI-MS) has proven to be an excellent analytical method for the forensic analysis of inks on a questioned document. The ink can be analysed directly on its substrate (paper), which offers a fast method of analysis because sample preparation is kept to a minimum and, more importantly, damage to the document is minimised. LDI-MS has also previously been reported to provide a high power of discrimination in the statistical comparison of ink samples and has the potential to be introduced as part of routine ink analysis. This paper looks into the methodology further and statistically evaluates the reproducibility and the influence of paper on black gel pen ink LDI-MS spectra by comparing spectra of three different black gel pen inks on three different paper substrates. Although generally minimal, the influences of sample homogeneity and paper type were found to be sample dependent. This should be taken into account to avoid the risk of false differentiation of black gel pen ink samples. Other statistical approaches such as principal component analysis (PCA) proved to be a good alternative to correlation coefficients for the comparison of whole mass spectra.

Relevance:

80.00%

Abstract:

Objectives Exposure assessment to a single pesticide does not capture the complexity of the occupational exposure. Recently, pesticide use patterns analysis has emerged as an alternative to study these exposures. The aim of this study is to identify the pesticide use pattern among flower growers in Mexico participating in the study on the endocrine and reproductive effects associated with pesticide exposure. Methods A cross-sectional study was carried out to gather retrospective information on pesticide use applying a questionnaire to the person in charge of the participating flower growing farms. Information about seasonal frequency of pesticide use (rainy and dry) for the years 2004 and 2005 was obtained. Principal components analysis was performed. Results Complete information was obtained for 88 farms and 23 pesticides were included in the analysis. Six principal components were selected, which explained more than 70% of the data variability. The identified pesticide use patterns during both years were: 1. fungicides benomyl, carbendazim, thiophanate and metalaxyl (both seasons), including triadimephon during the rainy season, chlorotalonyl and insecticide permethrin during the dry season; 2. insecticides oxamyl, biphenthrin and fungicide iprodione (both seasons), including insecticide methomyl during the dry season; 3. fungicide mancozeb and herbicide glyphosate (only during the rainy season); 4. insecticides metamidophos and parathion (both seasons); 5. insecticides omethoate and methomyl (only rainy season); and 6. insecticides abamectin and carbofuran (only dry season). Some pesticides do not show a clear pattern of seasonal use during the studied years. Conclusions The principal component analysis is useful to summarise a large set of exposure variables into smaller groups of exposure patterns, identifying the mixtures of pesticides in the occupational environment that may have an interactive effect on a particular health effect.
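The component-selection step described in this abstract (keeping the leading components that together explain more than 70% of the variability) can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name `components_for_variance` and the eigenvalues are hypothetical, and the eigenvalues are assumed to come from a PCA that has already been fitted.

```python
def components_for_variance(eigenvalues, threshold=0.70):
    """Return the smallest number of leading components whose cumulative
    share of the total variance exceeds `threshold`."""
    total = sum(eigenvalues)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    # PCA eigenvalues are component variances; sort largest first.
    ordered = sorted(eigenvalues, reverse=True)
    cumulative = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total > threshold:
            return k
    return len(ordered)


# Hypothetical eigenvalues, not taken from the study.
example_eigenvalues = [5.0, 3.0, 1.0, 1.0]
n_components = components_for_variance(example_eigenvalues)
```

With the hypothetical values above, the first component explains 50% and the first two explain 80%, so two components are retained.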

Relevance:

80.00%

Abstract:

This study examines how structural determinants influence intermediary factors of child health inequities and how they operate through the communities where children live. In particular, we explore individual, family and community level characteristics associated with a composite indicator that quantitatively measures intermediary determinants of early childhood health in Colombia. We use data from the 2010 Colombian Demographic and Health Survey (DHS). Adopting the conceptual framework of the Commission on Social Determinants of Health (CSDH), three dimensions related to child health are represented in the index: behavioural factors, psychosocial factors and health system. In order to generate the weights of the variables and take into account the discrete nature of the data, principal component analysis (PCA) using polychoric correlations is employed in the index construction. Weighted multilevel models are used to examine community effects. The results show that the effect of household SES is attenuated when community characteristics are included, indicating the importance that the level of community development may have in mediating individual and family characteristics. The findings indicate that there is significant variance in intermediary determinants of child health between communities, especially for those determinants linked to the health system, even after controlling for individual, family and community characteristics. These results likely reflect that whilst the community context can exert a greater influence on intermediary factors linked directly to health, in the case of psychosocial factors and parents' behaviours, the family context can be more important. This underlines the importance of distinguishing between community and family intervention programmes.

Relevance:

80.00%

Abstract:

In an earlier investigation (Burger et al., 2000) five sediment cores near the Rodrigues Triple Junction in the Indian Ocean were studied applying classical statistical methods (fuzzy c-means clustering, linear mixing model, principal component analysis) for the extraction of endmembers and evaluating the spatial and temporal variation of geochemical signals. Three main factors of sedimentation were expected by the marine geologists: a volcano-genetic, a hydro-hydrothermal and an ultra-basic factor. The display of fuzzy membership values and/or factor scores versus depth provided consistent results for two factors only; the ultra-basic component could not be identified. The reason for this may be that only traditional statistical methods were applied, i.e. the untransformed components were used and the cosine-theta coefficient as similarity measure.

During the last decade considerable progress in compositional data analysis was made and many case studies were published using new tools for exploratory analysis of these data. Therefore it makes sense to check if the application of suitable data transformations, reduction of the D-part simplex to two or three factors and visual interpretation of the factor scores would lead to a revision of earlier results and to answers to open questions. In this paper we follow the lines of a paper of R. Tolosana-Delgado et al. (2005) starting with a problem-oriented interpretation of the biplot scattergram, extracting compositional factors, ilr-transformation of the components and visualization of the factor scores in a spatial context: The compositional factors will be plotted versus depth (time) of the core samples in order to facilitate the identification of the expected sources of the sedimentary process.

Key words: compositional data analysis, biplot, deep sea sediments
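The ilr-transformation mentioned in this abstract maps a D-part composition to D-1 unconstrained real coordinates. The sketch below uses the pivot-coordinate form of the ilr, which is one common choice of basis; the abstract does not say which basis the authors used, so this is an assumption, and the function names `clr` and `ilr_pivot` are illustrative only.

```python
import math


def clr(composition):
    """Centred log-ratio: log of each part minus the mean log of all parts."""
    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def ilr_pivot(composition):
    """Isometric log-ratio in pivot coordinates (an assumed basis choice).

    Coordinate i compares part i with the geometric mean of the parts
    that follow it. All parts must be strictly positive.
    """
    logs = [math.log(x) for x in composition]
    coords = []
    for i in range(len(logs) - 1):
        rest = logs[i + 1:]
        r = len(rest)
        mean_rest = sum(rest) / r  # log of the geometric mean of the rest
        coords.append(math.sqrt(r / (r + 1)) * (logs[i] - mean_rest))
    return coords


# Hypothetical 3-part composition (proportions summing to 1).
coords = ilr_pivot([0.2, 0.3, 0.5])
```

Because the transform only uses log-ratios, rescaling the composition (for example from proportions to percentages) leaves the coordinates unchanged, and standard multivariate tools such as PCA can then be applied to them.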

Relevance:

80.00%

Abstract:

In order to obtain a high-resolution Pleistocene stratigraphy, eleven continuously cored boreholes, 100 to 220 m deep, were drilled in the northern part of the Po Plain by Regione Lombardia in the last five years. Quantitative provenance analysis (QPA, Weltje and von Eynatten, 2004) of Pleistocene sands was carried out by using multivariate statistical analysis (principal component analysis, PCA, and similarity analysis) on an integrated data set, including high-resolution bulk petrography and heavy-mineral analyses on Pleistocene sands and of 250 major and minor modern rivers draining the southern flank of the Alps from West to East (Garzanti et al, 2004; 2006). Prior to the onset of major Alpine glaciations, metamorphic and quartzofeldspathic detritus from the Western and Central Alps was carried from the axial belt to the Po basin longitudinally parallel to the SouthAlpine belt by a trunk river (Vezzoli and Garzanti, 2008). This scenario rapidly changed during the marine isotope stage 22 (0.87 Ma), with the onset of the first major Pleistocene glaciation in the Alps (Muttoni et al, 2003). PCA and similarity analysis from core samples show that the longitudinal trunk river at this time was shifted southward by the rapid southward and westward progradation of transverse alluvial river systems fed from the Central and Southern Alps. Sediments were transported southward by braided river systems, and glacial sediments transported by Alpine valley glaciers invaded the alluvial plain.

Key words: Detrital modes; Modern sands; Provenance; Principal Components Analysis; Similarity, Canberra Distance; palaeodrainage
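The key words name the Canberra distance as the similarity measure. A minimal sketch of its standard definition follows; how the authors applied it to the petrographic and heavy-mineral data is not stated, and the sample values below are hypothetical.

```python
def canberra_distance(a, b):
    """Canberra distance: sum over variables of |a_i - b_i| / (|a_i| + |b_i|).

    Terms where both values are zero contribute nothing (0/0 treated as 0).
    """
    if len(a) != len(b):
        raise ValueError("both samples must have the same number of variables")
    total = 0.0
    for x, y in zip(a, b):
        denominator = abs(x) + abs(y)
        if denominator > 0:
            total += abs(x - y) / denominator
    return total


# Hypothetical mineral percentages for two sand samples.
distance = canberra_distance([60.0, 30.0, 10.0], [50.0, 30.0, 20.0])
```

Each term is bounded by 1, so rare components (such as scarce heavy minerals) weigh as much as abundant ones, which is one reason this distance is used for compositional comparisons.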

Relevance:

80.00%

Abstract:

Ecological niche modelling was used to predict the potential geographical distribution of Rhodnius nasutus Stål and Rhodnius neglectus Lent in Brazil and to investigate the niche divergence between these morphologically similar triatomine species. The distribution of R. neglectus covered mainly the cerrado of Central Brazil, but the prediction maps also revealed its occurrence in transitional areas within the caatinga, Pantanal and Amazon biomes. The potential distribution of R. nasutus covered the Northeastern Region of Brazil in the semi-arid caatinga and the Maranhão babaçu forests. Clear ecological niche differences between these species were observed. R. nasutus occurred more in warmer and drier areas than R. neglectus. In the principal component analysis, PC1 was correlated with altitude and temperature (mainly temperature in the coldest and driest months) and PC2 with vegetation index and precipitation. The prediction maps support potential areas of co-occurrence for these species in the Maranhão babaçu forests and in caatinga/cerrado transitional areas, mainly in the state of Piauí. Entomologists engaged in Chagas disease vector surveillance should be aware that R. neglectus and R. nasutus can occur in the same localities of Northeastern Brazil. Thus, the identification of bugs in these areas should be improved by applying morphometrical and/or molecular methods.

Relevance:

80.00%

Abstract:

To classify mosquito species based on common features of their habitats, samples were obtained fortnightly between June 2001-October 2003 in the subtropical province of Chaco, Argentina. Data on the type of larval habitat, nature of the habitat (artificial or natural), size, depth, location related to sunlight, distance to the neighbouring houses, type of substrate, organic material, vegetation and algae type and their presence were collected. Data on the permanence, temperature, pH, turbidity, colour, odour and movement of the larval habitat's water were also collected. From the cluster analysis, three groups of species associated by their degree of habitat similarity were obtained and are listed below. Group 1 consisted of Aedes aegypti. Group 2 consisted of Culex imitator, Culex davisi, Wyeomyia muehlensi and Toxorhynchites haemorrhoidalis separatus. Within group 3, two subgroups are distinguished: A (Psorophora ferox, Psorophora cyanescens, Psorophora varinervis, Psorophora confinnis, Psorophora cingulata, Ochlerotatus hastatus-oligopistus, Ochlerotatus serratus, Ochlerotatus scapularis, Culex intrincatus, Culex quinquefasciatus, Culex pilosus, Ochlerotatus albifasciatus, Culex bidens) and B (Culex maxi, Culex eduardoi, Culex chidesteri, Uranotaenia lowii, Uranotaenia pulcherrima, Anopheles neomaculipalpus, Anopheles triannulatus, Anopheles albitarsis, Uranotaenia apicalis, Mansonia humeralis and Aedeomyia squamipennis). Principal component analysis indicates that the size of the larval habitats and the presence of aquatic vegetation are the main characteristics that explain the variation among different species. In contrast, water permanence is second in importance. Water temperature, pH and the type of larval habitat are less important in explaining the clustering of species.

Relevance:

80.00%

Abstract:

In this work, a new method is proposed to estimate in real time the quality of the final product in batch processes. The method reduces the time needed to obtain quality results from laboratory analyses. A principal component analysis (PCA) model built with historical data under normal operating conditions is used to discern whether a finished batch is normal or not. A fault signature is computed for abnormal batches and passed through a classification model for its estimation. The study proposes a method to use the information from contribution plots based on fault signatures, where the indicators represent the behaviour of the variables throughout the process in its different stages. A data set composed of the fault signatures of historical abnormal batches is built to search for patterns and to train the classification models that estimate the results of future batches. The proposed methodology has been applied to a sequencing batch reactor (SBR). Several classification algorithms are tested to demonstrate the possibilities of the proposed methodology.

Relevance:

80.00%

Abstract:

In many socially monogamous birds, both partners perform extrapair copulations (EPC). As this behaviour potentially inflicts direct costs on females, they are currently hypothesized to search for genetic benefits for descendants, either as 'good' or 'complementary' genes. Although these hypotheses have found some support, several studies failed to find any beneficial consequence of EPC, and whether this behaviour is adaptive to females is subject to discussion. Here, we test these two hypotheses in a natural population of blue tits by accounting for the effect of most parameters known to potentially affect extrapair fertilization. Results suggest that female body mass affected the type of extrapair genetic benefits obtained. Heavy females obtained extrapair fertilizations when their social male was of low quality (as reflected by sexual display) and produced larger extrapair than within-pair chicks. Lean females obtained extrapair fertilizations when their social mate was genetically similar, thereby producing more heterozygous extrapair chicks. Our results suggest that mating patterns may be condition-dependent.

Relevance:

80.00%

Abstract:

Although the reported aetiological agent of cutaneous leishmaniasis (CL) in Sri Lanka is Leishmania donovani, the sandfly vector remains unknown. Ninety-five sandflies, 60 females and 35 males, collected in six localities in the district of Matale, central Sri Lanka, close to current active transmission foci of CL were examined for taxonomically relevant characteristics. Eleven diagnostic morphological characters for female sandflies were compared with measurements described for Indian and Sri Lankan sandflies, including the now recognised Phlebotomus argentipes sensu lato species complex. The mean morphometric measurements of collected female sandflies differed significantly from published values for P. argentipes morphospecies B, now re-identified as Phlebotomus annandalei from Delft Island and northern Sri Lanka, from recently re-identified P. argentipes s.s. sibling species and from Phlebotomus glaucus. Furthermore, analysis of underlying variation in the morphometric data through principal component analysis also illustrated differences between the population described herein and previously recognised members of the P. argentipes species complex. Collectively, these results suggest that a morphologically distinct population, perhaps most closely related to P. glaucus of the P. argentipes s.l. species complex, exists in areas of active CL transmission. Thus, research is required to determine the ability of this population of flies to transmit cutaneous leishmaniasis.

Relevance:

80.00%

Abstract:

Analyzing functional data often leads to finding common factors, for which functional principal component analysis proves to be a useful tool to summarize and characterize the random variation in a function space. The representation in terms of eigenfunctions is optimal in the sense of L-2 approximation. However, the eigenfunctions are not always directed towards an interesting and interpretable direction in the context of functional data and thus could obscure the underlying structure. To overcome such difficulty, an alternative to functional principal component analysis is proposed that produces directed components which may be more informative and easier to interpret. These structural components are similar to principal components, but are adapted to situations in which the domain of the function may be decomposed into disjoint intervals such that there is effectively independence between intervals and positive correlation within intervals. The approach is demonstrated with synthetic examples as well as real data. Properties for special cases are also studied.

Relevance:

80.00%

Abstract:

Imaging mass spectrometry (IMS) represents an innovative tool in the cancer research pipeline, which is increasingly being used in clinical and pharmaceutical applications. The unique properties of the technique, especially the amount of data generated, make the handling of data from multiple IMS acquisitions challenging. This work presents a histology-driven IMS approach aiming to identify discriminant lipid signatures from the simultaneous mining of IMS data sets from multiple samples. The feasibility of the developed workflow is evaluated on a set of three human colorectal cancer liver metastasis (CRCLM) tissue sections. Lipid IMS on tissue sections was performed using MALDI-TOF/TOF MS in both negative and positive ionization modes after 1,5-diaminonaphthalene matrix deposition by sublimation. The combination of both positive and negative acquisition results was performed during data mining to simplify the process and interrogate a larger lipidome into a single analysis. To reduce the complexity of the IMS data sets, a sub data set was generated by randomly selecting a fixed number of spectra from a histologically defined region of interest, resulting in a 10-fold data reduction. Principal component analysis confirmed that the molecular selectivity of the regions of interest is maintained after data reduction. Partial least-squares and heat map analyses demonstrated a selective signature of the CRCLM, revealing lipids that are significantly up- and down-regulated in the tumor region. This comprehensive approach is thus of interest for defining disease signatures directly from IMS data sets by the use of combinatory data mining, opening novel routes of investigation for addressing the demands of the clinical setting.
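The data-reduction step in this abstract (randomly selecting a fixed number of spectra from a region of interest to obtain a 10-fold reduction) can be sketched as below. This is a minimal illustration under stated assumptions: spectra are addressed by integer index, the function name `subsample_indices` and the fixed seed are hypothetical, and the authors' actual software is not described in the abstract.

```python
import random


def subsample_indices(n_spectra, reduction_factor=10, seed=0):
    """Pick n_spectra // reduction_factor distinct spectrum indices at random.

    A fixed seed makes the selection reproducible (an illustrative choice).
    """
    if reduction_factor < 1:
        raise ValueError("reduction_factor must be at least 1")
    rng = random.Random(seed)
    n_selected = n_spectra // reduction_factor
    # sample() draws without replacement, so no spectrum is picked twice.
    return sorted(rng.sample(range(n_spectra), n_selected))


# Hypothetical region of interest containing 5000 spectra.
selected = subsample_indices(5000)
```

Whether the reduced set still represents the region can then be checked, as the abstract describes, by comparing PCA results before and after the reduction.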

Relevance:

80.00%

Abstract:

BACKGROUND Differences in the distribution of genotypes between individuals of the same ethnicity are an important confounding factor commonly undervalued in typical association studies conducted in radiogenomics. OBJECTIVE To evaluate the genotypic distribution of SNPs in a wide set of Spanish prostate cancer patients in order to determine the homogeneity of the population and to disclose potential bias. DESIGN, SETTING AND PARTICIPANTS A total of 601 prostate cancer patients from Andalusia, Basque Country, Canary Islands and Catalonia were genotyped for 10 SNPs located in 6 different genes associated with DNA repair: XRCC1 (rs25487, rs25489, rs1799782), ERCC2 (rs13181), ERCC1 (rs11615), LIG4 (rs1805388, rs1805386), ATM (rs17503908, rs1800057) and P53 (rs1042522). The SNP genotyping was performed in a Biotrove OpenArray® NT Cycler. OUTCOME MEASUREMENTS AND STATISTICAL ANALYSIS Comparisons of genotypic and allelic frequencies among populations, as well as haplotype analyses, were carried out using the web-based environment SNPator. Principal component analysis was performed using the SnpMatrix and XSnpMatrix classes and methods implemented as an R package. Non-supervised hierarchical clustering of SNPs was performed using MultiExperiment Viewer. RESULTS AND LIMITATIONS We observed that the genotype distribution of 4 out of 10 SNPs was statistically different among the studied populations, with the greatest differences between Andalusia and Catalonia. These observations were confirmed in cluster analysis, principal component analysis and in the differential distribution of haplotypes among the populations. Because tumor characteristics have not been taken into account, it is possible that some polymorphisms may influence tumor characteristics in the same way that they may pose a risk factor for other disease characteristics.
CONCLUSION Differences in the distribution of genotypes within different populations of the same ethnicity could be an important confounding factor responsible for the lack of validation of SNPs associated with radiation-induced toxicity, especially when extensive meta-analyses with subjects from different countries are carried out.

Relevance:

80.00%

Abstract:

BACKGROUND Functional brain images such as Single-Photon Emission Computed Tomography (SPECT) and Positron Emission Tomography (PET) have been widely used to guide clinicians in the diagnosis of Alzheimer's Disease (AD). However, the subjectivity involved in their evaluation has favoured the development of Computer Aided Diagnosis (CAD) systems. METHODS A novel combination of feature extraction techniques is proposed to improve the diagnosis of AD. Firstly, Regions of Interest (ROIs) are selected by means of a t-test carried out on 3D Normalised Mean Square Error (NMSE) features restricted to be located within a predefined brain activation mask. In order to address the small sample-size problem, the dimension of the feature space was further reduced by Large Margin Nearest Neighbours using a rectangular matrix (LMNN-RECT), Principal Component Analysis (PCA) or Partial Least Squares (PLS) (the latter two also analysed with an LMNN transformation). Regarding the classifiers, kernel Support Vector Machines (SVMs) and LMNN using Euclidean, Mahalanobis and Energy-based metrics were compared. RESULTS Several experiments were conducted in order to evaluate the proposed LMNN-based feature extraction algorithms and their benefits as: i) a linear transformation of the PLS or PCA reduced data, ii) a feature reduction technique, and iii) a classifier (with Euclidean, Mahalanobis or Energy-based methodology). The system was evaluated by means of k-fold cross-validation, yielding accuracy, sensitivity and specificity values of 92.78%, 91.07% and 95.12% (for SPECT) and 90.67%, 88% and 93.33% (for PET), respectively, when an NMSE-PLS-LMNN feature extraction method was used in combination with an SVM classifier, thus outperforming recently reported baseline methods. CONCLUSIONS All the proposed methods turned out to be a valid solution for the presented problem. One of the advances is the robustness of the LMNN algorithm, which not only provides a higher separation rate between the classes but also makes (in combination with NMSE and PLS) the variation of this rate more stable. In addition, their generalization ability is another advance, since several experiments were performed on two image modalities (SPECT and PET).
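The accuracy, sensitivity and specificity figures reported in this abstract follow the standard definitions based on confusion-matrix counts, sketched below. The counts used here are purely illustrative toy values, not the study's data, and the function name `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Compute accuracy, sensitivity and specificity from confusion counts.

    tp/fn: diseased subjects classified correctly/incorrectly.
    tn/fp: healthy subjects classified correctly/incorrectly.
    """
    total = tp + fn + tn + fp
    if total == 0 or (tp + fn) == 0 or (tn + fp) == 0:
        raise ValueError("both classes must contain at least one subject")
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Toy counts for illustration only (not from the study).
metrics = diagnostic_metrics(tp=9, fn=1, tn=8, fp=2)
```

In a k-fold cross-validation such counts are typically accumulated over the held-out folds before the three ratios are computed.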

Relevance:

80.00%

Abstract:

Natural fluctuations in soil microbial communities are poorly documented because of the inherent difficulty of performing a simultaneous analysis of the relative abundances of multiple populations over a long time period. Yet, it is important to understand the magnitude of community composition variability as a function of natural influences (e.g., temperature, plant growth, or rainfall) because this forms the reference or baseline against which external disturbances (e.g., anthropogenic emissions) can be judged. In addition, defining baseline fluctuations in complex microbial communities may help to understand at which point the systems become unbalanced and cannot return to their original composition. In this paper, we examined the seasonal fluctuations in the bacterial community of an agricultural soil used for regular plant crop production by using terminal restriction fragment length polymorphism (T-RFLP) profiling of the amplified 16S ribosomal ribonucleic acid (rRNA) gene diversity. Cluster and statistical analysis of T-RFLP data showed that soil bacterial communities fluctuated very little during the seasons (similarity indices between 0.835 and 0.997), with insignificant variations in 16S rRNA gene richness and diversity indices. Despite overall insignificant fluctuations, between 8 and 30% of all terminal restriction fragments changed their relative intensity significantly among consecutive time samples. To determine the magnitude of community variations induced by external factors, soil samples were subjected to either inoculation with a pure bacterial culture, addition of the herbicide mecoprop, or addition of nutrients. All treatments resulted in statistically measurable changes in the T-RFLP profiles of the communities. Addition of nutrients or of bacteria plus mecoprop resulted in a bacterial composition that did not return to the original profile within 14 days. We propose that at less than 70% similarity in T-RFLP, the bacterial communities risk drifting apart to inherently different states.