832 results for PRINCIPAL COMPONENTS
Abstract:
New biotechnologies make it possible to obtain information for characterizing genetic material from multiple markers, whether molecular, morphological, or both. Ordering genetic material by exploring multidimensional patterns of variability is approached through various multivariate analysis techniques. Multivariate dimension-reduction techniques (DRT) and their graphical representation are of substantial importance for visualizing multivariate data in low-dimensional spaces, since they make it easier to interpret the interrelationships among variables (markers) and among the cases or observations under analysis. Principal Component Analysis, Principal Coordinate Analysis, and Generalized Procrustes Analysis are all DRT applicable to data from molecular and/or morphological markers. Minimum Spanning Trees and biplots are techniques for obtaining geometric representations of DRT results. This work describes these multivariate techniques and illustrates their application to two data sets, molecular and morphological, used to characterize fungal genetic material.
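To make the minimum spanning tree idea concrete, here is a minimal sketch that links observations through their shortest pairwise distances. It assumes a small, invented distance matrix among four hypothetical accessions (not data from the study) and uses Prim's algorithm, one standard way to build such a tree; the abstract does not say which algorithm the authors used.

```python
def minimum_spanning_tree(dist):
    """Return MST edges (i, j, weight) for a symmetric distance matrix (Prim's algorithm)."""
    n = len(dist)
    in_tree = {0}  # start from the first observation
    edges = []
    while len(in_tree) < n:
        # Pick the shortest edge that joins the current tree to a node outside it.
        weight, i, j = min(
            (dist[i][j], i, j)
            for i in in_tree
            for j in range(n)
            if j not in in_tree
        )
        edges.append((i, j, weight))
        in_tree.add(j)
    return edges


# Hypothetical pairwise distances between four accessions (illustrative only).
distances = [
    [0.0, 1.0, 4.0, 3.0],
    [1.0, 0.0, 2.0, 5.0],
    [4.0, 2.0, 0.0, 1.5],
    [3.0, 5.0, 1.5, 0.0],
]
mst_edges = minimum_spanning_tree(distances)
total_length = sum(weight for _, _, weight in mst_edges)
```

In practice the distances would come from the reduced-dimension coordinates or from the marker data, and the resulting edges can be drawn over the ordination plot to show which observations are nearest neighbours.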
Abstract:
The record of eolian deposition on the Ontong Java Plateau (OJP) since the Oligocene (approximately 33 Ma) has been investigated using dust grain size, dust flux, and dust mineralogy, with the goal of interpreting the paleoclimatology and paleometeorology of the western equatorial Pacific. Studies of modern dust dispersal in the Pacific have indicated that the equatorial regions receive contributions from both the Northern Hemisphere westerly winds and the equatorial easterlies; limited meteorological data suggest that low-altitude westerlies could also transport dust to OJP from proximal sources in the western Pacific. Previous studies have established the characteristics of the grain-size, flux, and mineralogy records of dust deposited in the North Pacific by the mid-latitude westerlies and in the eastern equatorial Pacific by the low-latitude easterlies since the Oligocene. By comparing the OJP records with the well-defined records of the mid-latitude westerlies and the low-latitude easterlies, the importance of multiple sources of dust to OJP can be recognized. OJP dust is composed of quartz, illite, kaolinite/chlorite, plagioclase feldspar, smectite, and heulandite. Mineral abundance profiles and principal components analysis (PCA) of the mineral abundance data have been used to identify assemblages of minerals that covary through all or part of the OJP record. Abundances of quartz, illite, and kaolinite/chlorite covary throughout the interval studied, defining a mineralogical assemblage supplied from Asia. Some plagioclase and smectite were also supplied as part of this assemblage during the late Miocene and Pliocene/Pleistocene, but other source areas have supplied significant amounts of plagioclase, smectite, and heulandite to OJP since the Oligocene. OJP dust is generally coarser than dust deposited by the Northern Hemisphere westerlies or the equatorial easterlies, and it accumulates more rapidly by 1-2 orders of magnitude. 
These relationships indicate the importance of the local sources on dust deposition at OJP. The grain-size and flux records of OJP dust do not exhibit most of the events observed in the corresponding records of the Northern Hemisphere westerlies or the equatorial easterlies, because these features are masked by the mixing of dust from several sources at OJP. The abundance record of the Asian dust assemblage at OJP, however, does contain most of the features characteristic of dust flux by means of the Northern Hemisphere westerlies, indicating that the paleoclimatic and paleometeorologic signal of a particular source area and wind system can be preserved in areas well beyond the region dominated by that source and those winds. Identifying such a signal requires "unmixing" the various dust assemblages, which can be accomplished by combining grain-size, flux, and mineralogic data.
Abstract:
Microbial communities and their associated metabolic activity in marine sediments have a profound impact on global biogeochemical cycles. Their composition and structure are attributed to geochemical and physical factors, but finding direct correlations has remained a challenge. Here we show a significant statistical relationship between variation in geochemical composition and prokaryotic community structure within deep-sea sediments. We obtained comprehensive geochemical data from two gravity cores near the hydrothermal vent field Loki's Castle at the Arctic Mid-Ocean Ridge, in the Norwegian-Greenland Sea. Geochemical properties in the rift valley sediments exhibited strong centimeter-scale stratigraphic variability. Microbial populations were profiled by pyrosequencing from 15 sediment horizons (59,364 16S rRNA gene tags), quantitatively assessed by qPCR, and phylogenetically analyzed. Although the same taxa were generally present in all samples, their relative abundances varied substantially among horizons and fluctuated between Bacteria- and Archaea-dominated communities. By independently summarizing covariance structures of the relative abundance data and geochemical data, using principal components analysis, we found a significant correlation between changes in geochemical composition and changes in community structure. Differences in organic carbon and mineralogy shaped the relative abundance of microbial taxa. We used correlations to build hypotheses about energy metabolisms, particularly of the Deep Sea Archaeal Group, specific Deltaproteobacteria, and sediment lineages of potentially anaerobic Marine Group I Archaea. We demonstrate that total prokaryotic community structure can be directly correlated to geochemistry within these sediments, thus enhancing our understanding of biogeochemical cycling and our ability to predict metabolisms of uncultured microbes in deep-sea sediments.
Abstract:
Benthic foraminifers were studied from lower Paleocene through upper Oligocene sections from Sites 747 and 748. The composition of the benthic foraminifer species suggests a middle to lower bathyal (600-2000 m) paleodepth during the Neogene and a probable upper abyssal (2000-3000 m) paleodepth during the Paleocene at Site 747. Site 748 is thought to have remained at middle to lower bathyal paleodepths throughout the Cenozoic. Principal component analysis distinguished four major benthic foraminifer assemblages: (1) a Paleocene Stensioina beccariiformis assemblage at Sites 747 and 748, (2) an early Eocene Nuttallides truempyi assemblage at lower bathyal Site 747, (3) an early through middle Eocene Stilostomella-Lenticulina assemblage at middle bathyal Site 748, and (4) a latest Eocene through Oligocene Cibicidoides-Astrononion pusillum assemblage at both sites. Major benthic foraminifer changes, as indicated by the principal components and first and last appearances, occurred at or close to the Paleocene/Eocene boundary, and in the late Eocene close to the middle/late Eocene boundary.
Abstract:
Late Pleistocene sea level has been reconstructed from ocean sediment core data using a wide variety of proxies and models. However, the accuracy of individual reconstructions is limited by measurement error, local variations in salinity and temperature, and assumptions particular to each technique. Here we present a sea level stack (average) which increases the signal-to-noise ratio of individual reconstructions. Specifically, we perform principal component analysis (PCA) on seven records from 0-430 ka and five records from 0-798 ka. The first principal component, which we use as the stack, describes ~80 % of the variance in the data and is similar using either five or seven records. After scaling the stack based on Holocene and Last Glacial Maximum (LGM) sea level estimates, the stack agrees to within 5 m with isostatically adjusted coral sea level estimates for Marine Isotope Stages 5e and 11 (125 and 400 ka, respectively). When we compare the sea level stack with the δ18O of benthic foraminifera, we find that sea level change accounts for about 40 % of the total orbital-band variance in benthic δ18O, compared to a 65 % contribution during the LGM-to-Holocene transition. Additionally, the second and third principal components of our analyses reflect differences between proxy records associated with spatial variations in the δ18O of seawater.
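The stacking idea can be sketched as follows: standardize each record, take the first principal component as the stack, report the fraction of variance it describes, and rescale it linearly between two tie points. This is only an illustration under stated assumptions. The three records are invented, the leading component is found by power iteration (the abstract does not say how the PCA was computed), and the tie-point levels of 0 m and -130 m are placeholder values rather than the estimates used by the authors.

```python
import math


def standardize(series):
    """Scale one record to zero mean and unit (sample) variance."""
    mean = sum(series) / len(series)
    sd = math.sqrt(sum((x - mean) ** 2 for x in series) / (len(series) - 1))
    return [(x - mean) / sd for x in series]


def first_principal_component(records, iterations=500):
    """Return (loadings, scores, explained_fraction) for the leading PC.

    records: equally sampled, standardized series (one list per record).
    The leading eigenvector of the covariance matrix is found by power iteration.
    """
    p, n = len(records), len(records[0])
    cov = [[sum(a * b for a, b in zip(records[i], records[j])) / (n - 1)
            for j in range(p)] for i in range(p)]
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p)) for i in range(p))
    total_variance = sum(cov[i][i] for i in range(p))
    scores = [sum(v[i] * records[i][t] for i in range(p)) for t in range(n)]
    return v, scores, eigenvalue / total_variance


def scale_to_sea_level(scores, i_high, i_low, level_high, level_low):
    """Linearly map the stack so that two tie points match assumed sea levels (m)."""
    slope = (level_high - level_low) / (scores[i_high] - scores[i_low])
    return [level_low + slope * (s - scores[i_low]) for s in scores]


# Three invented proxy records sharing one underlying signal (illustrative only).
signal = [0.0, -0.3, -0.8, -1.0, -0.6, -0.1]
raw = [
    [2.0 * s + e for s, e in zip(signal, [0.05, -0.02, 0.03, 0.00, -0.04, 0.01])],
    [0.5 * s + e for s, e in zip(signal, [-0.01, 0.02, 0.00, 0.01, 0.02, -0.02])],
    [1.2 * s + e for s, e in zip(signal, [0.02, 0.03, -0.05, 0.02, 0.00, -0.03])],
]
records = [standardize(r) for r in raw]
loadings, stack, explained = first_principal_component(records)

# Assumed tie points: index 0 = present day (0 m), index 3 = glacial maximum (-130 m).
sea_level = scale_to_sea_level(stack, 0, 3, 0.0, -130.0)
```

Because the rescaling is fixed by two tie points, the arbitrary sign of the principal component does not affect the result.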
Abstract:
As the ocean undergoes acidification, marine organisms will become increasingly exposed to reduced pH, yet variability in many coastal settings complicates our ability to accurately estimate pH exposure for those organisms that are difficult to track. Here we present shell-based geochemical proxies that reflect pH exposure from laboratory and field settings in larvae of the mussels Mytilus californianus and M. galloprovincialis. Laboratory-based proxies were generated from shells precipitated at pH 7.51 to 8.04. U/Ca, Sr/Ca, and multielemental signatures represented as principal components varied with pH for both species. Of these, U/Ca was the best predictor of pH and did not vary with larval size, with semidiurnal pH fluctuations, or with oxygen concentration. Field applications of U/Ca were tested with mussel larvae reared in situ at both known and unknown pH conditions. Larval shells precipitated in a region of greater upwelling had higher U/Ca, and these U/Ca values corresponded well with the laboratory-derived U/Ca-pH proxy. Retention of the larval shell after settlement in molluscs allows use of this geochemical proxy to assess ocean acidification effects on marine populations.
Abstract:
Non-destructive, visual evaluation and mechanical testing techniques were used to assess the structural properties of 374 samples of chestnut (Castanea sativa). Principal components analysis was applied to establish and interpret correlations among the measured variables: modulus of elasticity, bending strength and density. The static modulus of elasticity presented higher correlation values than those obtained using non-destructive methods. Bending strength presented low correlations with the non-destructive parameters, but there was some relation to the different knot ratios defined. The relationship was stronger with the most widely used ratio, CKDR. No significant correlations were observed between any of the variables and density.
Abstract:
Data from an attitudinal survey and a stated preference ranking experiment conducted at two urban European interchanges (i.e. City-HUBs) in Madrid (Spain) and Thessaloniki (Greece) show that the importance that City-HUB users attach to the intermodal infrastructure varies strongly as a function of their perceptions of time spent in the interchange (i.e. intermodal transfer and waiting time). A principal components analysis allocates respondents (i.e. City-HUB users) to two classes with substantially different perceptions of the time saved when making a transfer and of how time is used while waiting.
Abstract:
The application of the Electro-Mechanical Impedance (EMI) method for damage detection in Structural Health Monitoring has noticeably increased in recent years. The EMI method uses piezoelectric transducers to directly measure the mechanical properties of the host structure, obtaining the so-called impedance measurement, which is highly influenced by variations in the dynamic parameters of the structure. These measurements usually contain a large number of frequency points, as well as a high number of dimensions, since each frequency range swept can be considered an independent variable. This makes this kind of data hard to handle, increasing computational costs and making the analysis substantially time-consuming. For that reason, Principal Component Analysis (PCA)-based data compression has been employed in this work to enhance the analysis capability of the raw data. Furthermore, a Support Vector Machine (SVM), which has been widely used in the machine learning and pattern recognition fields, has been applied in this study to model any existing pattern in the PCA-compressed data, using just the first two Principal Components. Different known non-damaged and damaged measurements of an experimentally tested beam were used as training input data for the SVM algorithm, and the same number of cases measured in beams with unknown structural health conditions were used as test input data. Thus, the purpose of this work is to demonstrate how, with a few impedance measurements of a beam as raw data, its health status can be determined based on pattern recognition procedures.
Abstract:
Large-scale circulation patterns (ENSO, NAO) have been shown to have a significant impact on seasonal weather, and therefore on crop yield, over many parts of the world (Garnett and Khandekar, 1992; Aasa et al., 2004; Rozas and Garcia-Gonzalez, 2012). In this study, we analyze the influence of large-scale circulation patterns and regional climate on the principal components of maize yield variability in the Iberian Peninsula (IP) using reanalysis datasets. Additionally, we investigate the modulation of these relationships by multidecadal patterns. The study analyzes long time series of maize yield that depend only on climate, computed with the crop model CERES-maize (Jones and Kiniry, 1986) included in the Decision Support System for Agrotechnology Transfer (DSSAT v.4.5).
Abstract:
Voice pathologies have recently become a social problem of some significance. Urban pollution, habits such as smoking, the use of air conditioning, and so on all contribute to it. The problem is more relevant for professionals who use their voice frequently, such as announcers, singers, teachers, or telephone operators. For all these reasons, diagnostic support techniques that can draw clinical conclusions from a voice sample recorded with a microphone are of particular interest, as opposed to invasive techniques that involve exploration with laryngoscopes, fiberscopes, or video endoscopes, which are in any case far more uncomfortable for patients because they require partially inserting the instrument through the throat, in procedures regarded as surgical. Within the former group of techniques, much progress has been made in a relatively short period. As regards the diagnosis of pathologies, over the last fifteen years we have moved from working mainly with parameters extracted from the voice signal, in both the time and frequency domains, and with scales built from subjective assessments by experts, to working also with parameters derived from estimates of the glottal source. The importance of using the glottal source lies, broadly speaking, in the fact that it is a signal directly linked to the state of the speaker's laryngeal structure and that it is generally less influenced by the vocal tract than the voice signal is. The vocal tract is known to be more closely related to the spoken message, and its presence hinders the detection of vocal pathology. These estimates of the glottal source were obtained through inverse filtering techniques developed by our research group. 
We have also gained a deeper understanding of the nature of the glottal signal: we can decompose it and relate it to biomechanical parameters of the vocal folds themselves, obtaining estimates of quantities such as the mass, energy loss, or elasticity of the fold body and cover, among others. The components of the glottal source also give rise to the so-called biometric parameters, related to the shape of the signal, which in themselves constitute a biometric signature of the individual. We will also work with temporal parameters, related to the different stages observed within the glottal signal during a phonation cycle. Finally, we will consider classical perturbation and energy parameters of the signal. In short, we now have a considerable number of glottal parameters that make up a multidimensional statistical basis intended to discriminate people with pathological or dysphonic voices from those with healthy or normophonic voices. This doctoral thesis addresses several questions. First, these new parameters need to be analyzed carefully, so we will offer a complete statistical description of them. We will also study questions such as the distribution of the parameters according to criteria such as statistical normality, paying special attention to the difference between the distributions of healthy subjects and those of subjects with vocal pathology. To do so we will use various statistical techniques: generation of descriptive statistics and diagrams, normality tests, and a range of hypothesis tests, both parametric and nonparametric, that consider the difference between the groups of healthy people and the groups of people with a voice-related pathology. 
In addition, we are interested in finding statistical relationships among the parameters, in order to eliminate possible redundancies in the model, reduce the dimensionality of the problem, and establish a criterion of relative importance among the parameters in terms of their discriminating power for the pathological/healthy criterion. To this end, statistical techniques such as Bivariate Linear Correlation and Factor Analysis based on Principal Components will be applied. Finally, we will use the well-known classification technique of Discriminant Analysis, applied to different combinations of parameters and factors, to determine which of them offer the most promising success rates. The experiments used a balanced and robust database of two hundred subjects, one hundred female and one hundred male, with an equally balanced proportion of subjects with and without vocal pathology. One of the software applications designed to collect the samples is also presented in this thesis. The statistical studies carried out will allow us to identify the parameters that contribute most to detecting the presence of vocal pathology. Some of the studies will also allow us to rank the parameters by their importance for detection. We will further conclude that it is sometimes advisable to reduce the dimensionality of the parameters in order to improve detection rates. Finally, the detection rates themselves are perhaps the most important conclusion of the work. 
All the analyses in this work will be carried out for each of the two genders, in line with several previous studies showing that the male and female genders should be treated independently because of the organic differences observed between them. However, for the detection of vocal pathology we will also consider working with the unified database, verifying that the success rates are likewise high.
Abstract
Voice pathologies have recently become a social problem of some concern. Pollution in cities, smoking habits, air conditioning, etc. contribute to it. This problem is more relevant for professionals who use their voice frequently: speakers, singers, teachers, actors, telemarketers, etc. Therefore, techniques that can draw conclusions from a sample of the recorded voice are of particular interest for diagnosis, as opposed to invasive ones involving exploration with laryngoscopes, fiberscopes or video endoscopes, which are much less comfortable for patients. Voice quality analysis has come a long way in a relatively short period of time. With regard to the diagnosis of diseases, in the last fifteen years we have gone from working primarily with parameters extracted from the voice signal (in both time and frequency domains) and with scales drawn from subjective assessments by experts to producing more accurate evaluations with estimates derived from the glottal source. The importance of using the glottal source resides broadly in that this signal is linked to the state of the speaker's laryngeal structure. Unlike the voice signal (phonated speech), the glottal source, if suitably reconstructed using adaptive lattices, may be less influenced by the vocal tract. 
As is well known, the vocal tract is related to the articulation of the spoken message, and its influence complicates the process of voice pathology detection, unlike when using the reconstructed glottal source, where vocal tract influence has been almost completely removed. The estimates of the glottal source have been obtained through inverse filtering techniques developed by our research group. We have also looked more deeply into the nature of the glottal signal, dissecting it and relating it to the biomechanical parameters of the vocal folds, obtaining several estimates of items such as mass, loss or elasticity of the cover and body of the vocal fold, among others. The components of the glottal source also give rise to the so-called biometric parameters, related to the shape of the signal, which are themselves a biometric signature of the individual. We will also work with temporal parameters related to the different stages that are observed in the glottal signal during a cycle of phonation. Finally, we will take into consideration classical perturbation and energy parameters. In short, we now have a considerable number of glottal parameters forming a multidimensional statistical basis, designed to discriminate people with pathologic or dysphonic voices from those who do not show pathology. This thesis addresses several issues: first, a careful analysis of these new parameters is required, so we will offer a complete statistical description of them. We will also discuss issues such as the distribution of the parameters, considering criteria such as their statistical normality. We will take special care in analyzing the difference between the distributions from healthy subjects and those from pathological subjects. To reach these goals we will use different statistical techniques such as generation of descriptive items and diagrams, tests for normality, and hypothesis testing, both parametric and nonparametric. 
These latter techniques consider the difference between the groups of healthy subjects and the groups of people with a voice-related illness. In addition, we are interested in finding statistical relationships between parameters, for several reasons: to eliminate possible redundancies in the model, to reduce the dimensionality of the problem, and to establish a criterion of relative importance among the parameters in terms of their discriminatory power for the pathological/healthy criterion. To this end, statistical techniques such as Bivariate Linear Correlation and Factor Analysis based on Principal Components will be applied. Finally, we will use the well-known Discriminant Analysis classification technique, applied to different combinations of parameters and factors, to determine which of these combinations offer the most promising success rates. To perform the experiments we have used a balanced and robust database consisting of two hundred speakers, one hundred male and one hundred female, with subjects with and without vocal pathology equally represented. A computer application designed to carry out the collection of samples is also presented in this thesis. The different statistical analyses performed will allow us to determine which parameters contribute most decisively to the detection of vocal pathology. Some of the analyses will even allow us to present a ranking of the parameters based on their importance for the detection of vocal pathology. We will also conclude that it is sometimes desirable to perform a dimensionality reduction in order to improve the detection rates. Finally, detection rates themselves are perhaps the most important conclusion of the work. 
All the analyses presented in this work have been performed for each of the two genders, in agreement with previous studies showing that male and female genders should be treated independently because of the functional differences observed between them. However, with regard to the detection of vocal pathology, we will also consider the possibility of working with the unified database, verifying that the success rates obtained are also high.
Abstract:
Fourier transform-infrared/statistics models demonstrate that the malignant transformation of morphologically normal human ovarian and breast tissues involves the creation of a high degree of structural modification (disorder) in DNA, before restoration of order in distant metastases. Order–disorder transitions were revealed by methods including principal components analysis of infrared spectra in which DNA samples were represented by points in two-dimensional space. Differences between the geometric sizes of clusters of points and between their locations revealed the magnitude of the order–disorder transitions. Infrared spectra provided evidence for the types of structural changes involved. Normal ovarian DNAs formed a tight cluster comparable to that of normal human blood leukocytes. The DNAs of ovarian primary carcinomas, including those that had given rise to metastases, had a high degree of disorder, whereas the DNAs of distant metastases from ovarian carcinomas were relatively ordered. However, the spectra of the metastases were more diverse than those of normal ovarian DNAs in regions assigned to base vibrations, implying increased genetic changes. DNAs of normal female breasts were substantially disordered (e.g., compared with the human blood leukocytes) as were those of the primary carcinomas, whether or not they had metastasized. The DNAs of distant breast cancer metastases were relatively ordered. These findings evoke a unified theory of carcinogenesis in which the creation of disorder in the DNA structure is an obligatory process followed by the selection of ordered, mutated DNA forms that ultimately give rise to metastases.
Abstract:
Recent experiments using electrical and N-methyl-d-aspartate microstimulation of the spinal cord gray matter and cutaneous stimulation of the hindlimb of spinalized frogs have provided evidence for a modular organization of the frog’s spinal cord circuitry. A “module” is a functional unit in the spinal cord circuitry that generates a specific motor output by imposing a specific pattern of muscle activation. The output of a module can be characterized as a force field: the collection of the isometric forces generated at the ankle over different locations in the leg’s workspace. Different modules can be combined independently so that their force fields linearly sum. The goal of this study was to ascertain whether the force fields generated by the activation of supraspinal structures could result from combinations of a small number of modules. We recorded a set of force fields generated by the electrical stimulation of the vestibular nerve in seven frogs, and we performed a principal component analysis to study the dimensionality of this set. We found that 94% of the total variation of the data is explained by the first five principal components, a result that indicates that the dimensionality of the set of fields evoked by vestibular stimulation is low. This result is compatible with the hypothesis that vestibular fields are generated by combinations of a small number of spinal modules.
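The dimensionality argument rests on the cumulative fraction of variance explained by the leading components. A minimal sketch of that calculation follows, assuming an invented eigenvalue spectrum (not the study's data) chosen so that five components pass a 94% threshold.

```python
def cumulative_explained_variance(eigenvalues):
    """Cumulative fraction of total variance, with eigenvalues taken in descending order."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running, fractions = 0.0, []
    for value in ordered:
        running += value
        fractions.append(running / total)
    return fractions


def components_needed(eigenvalues, threshold):
    """Smallest number of leading components whose cumulative fraction reaches the threshold."""
    fractions = cumulative_explained_variance(eigenvalues)
    for k, fraction in enumerate(fractions, start=1):
        if fraction >= threshold:
            return k
    return len(fractions)


# Invented eigenvalue spectrum of a covariance matrix (illustrative only).
eigenvalues = [40.0, 25.0, 15.0, 9.0, 6.0, 2.0, 2.0, 1.0]
fractions = cumulative_explained_variance(eigenvalues)
k = components_needed(eigenvalues, 0.94)
```

A small `k` relative to the number of measured variables is what supports describing the data set as low-dimensional.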
Abstract:
This paper describes a variety of statistical methods for obtaining precise quantitative estimates of the similarities and differences in the structures of semantic domains in different languages. The methods include comparing mean correlations within and between groups, principal components analysis of interspeaker correlations, and analysis of variance of speaker by question data. Methods for graphical displays of the results are also presented. The methods give convergent results that are mutually supportive and equivalent under suitable interpretation. The methods are illustrated on the semantic domain of emotion terms in a comparison of the semantic structures of native English and native Japanese speaking subjects. We suggest that, in comparative studies concerning the extent to which semantic structures are universally shared or culture-specific, both similarities and differences should be measured and compared rather than placing total emphasis on one or the other polar position.
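The first of these methods, comparing mean correlations within and between groups, can be sketched as below. The response vectors and the group labels "A" and "B" are invented for illustration, and Pearson correlation between speakers is an assumption; the abstract does not specify the correlation measure.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equally long response vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mean_correlations(responses, groups):
    """Return (mean within-group r, mean between-group r) over all speaker pairs."""
    within, between = [], []
    for a, b in combinations(range(len(responses)), 2):
        r = pearson(responses[a], responses[b])
        (within if groups[a] == groups[b] else between).append(r)
    return sum(within) / len(within), sum(between) / len(between)


# Invented judgements by six speakers on six questions (illustrative only).
responses = [
    [1, 2, 3, 4, 5, 6],
    [1, 3, 2, 4, 6, 5],
    [2, 1, 3, 5, 4, 6],
    [6, 4, 5, 1, 3, 2],
    [5, 6, 4, 2, 1, 3],
    [6, 5, 4, 3, 1, 2],
]
groups = ["A", "A", "A", "B", "B", "B"]
within_mean, between_mean = mean_correlations(responses, groups)
```

If the two groups shared one semantic structure, the within-group and between-group means would be similar; a clearly higher within-group mean points to group-specific structure.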
Abstract:
The global amino acid compositions as deduced from the complete genomic sequences of six thermophilic archaea, two thermophilic bacteria, 17 mesophilic bacteria and two eukaryotic species were analysed by hierarchical clustering and principal components analysis. Both methods showed an influence of several factors on amino acid composition. Although GC content has a dominant effect, thermophilic species can be identified by their global amino acid compositions alone. This study presents a careful statistical analysis of factors that affect amino acid composition and also yielded specific features of the average amino acid composition of thermophilic species. Moreover, we introduce the first example of a ‘compositional tree’ of species that takes into account not only homologous proteins, but also proteins unique to particular species. We expect this simple yet novel approach to be a useful additional tool for the study of phylogeny at the genome level.
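As a rough illustration of the input to such an analysis, the sketch below computes a global amino acid composition vector for each species by pooling all of its protein sequences, then compares two species by Euclidean distance. The sequences are short invented strings, and the choice of Euclidean distance is an assumption; the abstract does not state which distance was used for the clustering.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def global_composition(proteins):
    """Fraction of each standard amino acid over all proteins of one species."""
    counts = Counter()
    for sequence in proteins:
        counts.update(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    return [counts[a] / total for a in AMINO_ACIDS]


def euclidean(u, v):
    """Euclidean distance between two composition vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5


# Invented toy proteomes (illustrative only).
proteome_a = ["MKKLEEKR", "MEEKLKRE"]
proteome_b = ["MAGSLLAV", "MGASVLLA"]
comp_a = global_composition(proteome_a)
comp_b = global_composition(proteome_b)
distance = euclidean(comp_a, comp_b)
```

The resulting 20-dimensional vectors, one per species, are the kind of table that hierarchical clustering or principal components analysis would then take as input.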