85 results for Automatic Analysis of Multivariate Categorical Data Sets


Relevance:

100.00%

Publisher:

Abstract:

Rainfall can be modeled as a spatially correlated random field superimposed on a background mean value; therefore, geostatistical methods are appropriate for the analysis of rain gauge data. Nevertheless, there are certain typical features of these data that must be taken into account to produce useful results, including the generally non-Gaussian mixed distribution, the inhomogeneity and low density of observations, and the temporal and spatial variability of spatial correlation patterns. Many studies show that rigorous geostatistical analysis performs better than other available interpolation techniques for rain gauge data. Important elements are the use of climatological variograms and the appropriate treatment of rainy and nonrainy areas. Benefits of geostatistical analysis for rainfall include ease of estimating areal averages, estimation of uncertainties, and the possibility of using secondary information (e.g., topography). Geostatistical analysis also facilitates the generation of ensembles of rainfall fields that are consistent with a given set of observations, allowing for a more realistic exploration of errors and their propagation in downstream models, such as those used for agricultural or hydrological forecasting. This article provides a review of geostatistical methods used for kriging, exemplified where appropriate by daily rain gauge data from Ethiopia.
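As an illustration of the kriging step that this abstract reviews, the following is a minimal ordinary-kriging sketch in plain Python. It is not taken from the article: the exponential variogram, its parameter values (`nugget`, `sill`, `range_`) and the function names are assumptions chosen for the example, and the sketch ignores the rainy/nonrainy treatment and climatological variograms that the abstract identifies as important.

```python
import math


def exponential_variogram(h, nugget=0.0, sill=1.0, range_=50.0):
    """Semivariance at lag distance h (assumed exponential model); gamma(0) = 0."""
    if h == 0.0:
        return 0.0
    return nugget + (sill - nugget) * (1.0 - math.exp(-h / range_))


def _distance(p, q):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def _solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        partial = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - partial) / m[r][r]
    return x


def ordinary_kriging(points, values, target, variogram=exponential_variogram):
    """Return (estimate, kriging_variance, weights) at `target`.

    Solves the ordinary-kriging system in variogram form:
        [Gamma  1] [w ]   [gamma_0]
        [1^T    0] [mu] = [1      ]
    where the last row forces the weights to sum to one.
    """
    n = len(points)
    a = [[variogram(_distance(points[i], points[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [variogram(_distance(p, target)) for p in points] + [1.0]
    solution = _solve(a, b)
    weights, mu = solution[:n], solution[n]
    estimate = sum(w * v for w, v in zip(weights, values))
    variance = sum(w * g for w, g in zip(weights, b[:n])) + mu
    return estimate, variance, weights


# Hypothetical gauge locations (km) and daily totals (mm), for illustration only.
gauges = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
rain = [1.0, 2.0, 3.0, 4.0]
estimate, variance, weights = ordinary_kriging(gauges, rain, (5.0, 5.0))
```

Because the weights are constrained to sum to one, the estimate is unbiased for an unknown constant mean, and the returned kriging variance gives the kind of uncertainty estimate the abstract lists among the benefits of geostatistical analysis.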

Relevance:

100.00%

Publisher:

Abstract:

We provide a unified framework for a range of linear transforms that can be used for the analysis of terahertz spectroscopic data, with particular emphasis on their application to the measurement of leaf water content. The use of linear transforms for filtering, regression, and classification is discussed. For illustration, a classification problem involving leaves at three stages of drought and a prediction problem involving simulated spectra are presented. Issues resulting from scaling the data set are discussed. Using Lagrange multipliers, we arrive at the transform that yields the maximum separation between the spectra and show that this optimal transform is equivalent to computing the Euclidean distance between the samples. The optimal linear transform is compared with the average for all the spectra as well as with the Karhunen–Loève transform to discriminate a wet leaf from a dry leaf. We show that taking several principal components into account is equivalent to defining new axes in which data are to be analyzed. The procedure shows that the coefficients of the Karhunen–Loève transform are well suited to the process of classification of spectra. This is in line with expectations, as these coefficients are built from the statistical properties of the data set analyzed.

Relevance:

100.00%

Publisher:

Abstract:

Overall phylogenetic relationships within the genus Pelargonium (Geraniaceae) were inferred based on DNA sequences from mitochondrial(mt)-encoded nad1 b/c exons and from chloroplast(cp)-encoded trnL (UAA) 5' exon-trnF (GAA) exon regions using two species of Geranium and Sarcocaulon vanderetiae as outgroups. The group II intron between nad1 exons b and c was found to be absent from the Pelargonium, Geranium, and Sarcocaulon sequences presented here as well as from Erodium, which is the first recorded loss of this intron in angiosperms. Separate phylogenetic analyses of the mtDNA and cpDNA data sets produced largely congruent topologies, indicating linkage between mitochondrial and chloroplast genome inheritance. Simultaneous analysis of the combined data sets yielded a well-resolved topology with high clade support exhibiting a basic split into small and large chromosome species, the first group containing two lineages and the latter three. One large chromosome lineage (x = 11) comprises species from sections Myrrhidium and Chorisma and is sister to a lineage comprising P. mutans (x = 11) and species from section Jenkinsonia (x = 9). Sister to these two lineages is a lineage comprising species from sections Ciconium (x = 9) and Subsucculentia (x = 10). Cladistic evaluation of this pattern suggests that x = 11 is the ancestral basic chromosome number for the genus.

Relevance:

100.00%

Publisher:

Abstract:

It is thought that speciation in phytophagous insects is often due to colonization of novel host plants, because radiations of plant and insect lineages are typically asynchronous. Recent phylogenetic comparisons have supported this model of diversification for both insect herbivores and specialized pollinators. An exceptional case where contemporaneous plant-insect diversification might be expected is the obligate mutualism between fig trees (Ficus species, Moraceae) and their pollinating wasps (Agaonidae, Hymenoptera). The ubiquity and ecological significance of this mutualism in tropical and subtropical ecosystems has long intrigued biologists, but the systematic challenge posed by >750 interacting species pairs has hindered progress toward understanding its evolutionary history. In particular, taxon sampling and analytical tools have been insufficient for large-scale co-phylogenetic analyses. Here, we sampled nearly 200 interacting pairs of fig and wasp species from across the globe. Two supermatrices were assembled: on average, wasps had sequences from 77% of six genes (5.6 kb), figs had sequences from 60% of five genes (5.5 kb), and overall 850 new DNA sequences were generated for this study. We also developed a new analytical tool, Jane 2, for event-based phylogenetic reconciliation analysis of very large data sets. Separate Bayesian phylogenetic analyses for figs and fig wasps under relaxed molecular clock assumptions indicate Cretaceous diversification of crown groups and contemporaneous divergence for nearly half of all fig and pollinator lineages. Event-based co-phylogenetic analyses further support the co-diversification hypothesis. Biogeographic analyses indicate that the present-day distribution of fig and pollinator lineages is consistent with a Eurasian origin and subsequent dispersal, rather than with Gondwanan vicariance. Overall, our findings indicate that the fig-pollinator mutualism represents an extreme case among plant-insect interactions of coordinated dispersal and long-term co-diversification.

Relevance:

100.00%

Publisher:

Abstract:

The aim of this study was to determine whether geographical differences impact the composition of bacterial communities present in the airways of cystic fibrosis (CF) patients attending CF centers in the United States or United Kingdom. Thirty-eight patients were matched on the basis of clinical parameters into 19 pairs comprised of one U.S. and one United Kingdom patient. Analysis was performed to determine what, if any, bacterial correlates could be identified. Two culture-independent strategies were used: terminal restriction fragment length polymorphism (T-RFLP) profiling and 16S rRNA clone sequencing. Overall, 73 different terminal restriction fragment lengths were detected, ranging from 2 to 10 for U.S. and 2 to 15 for United Kingdom patients. The statistical analysis of T-RFLP data indicated that patient pairing was successful and revealed substantial transatlantic similarities in the bacterial communities. A small number of bands was present in the vast majority of patients in both locations, indicating that these are species common to the CF lung. Clone sequence analysis also revealed that a number of species not traditionally associated with the CF lung were present in both sample groups. The species number per sample was similar, but differences in species presence were observed between sample groups. Cluster analysis revealed geographical differences in bacterial presence and relative species abundance. Overall, the U.S. samples showed tighter clustering with each other compared to that of United Kingdom samples, which may reflect the lower diversity detected in the U.S. sample group. The impact of cross-infection and biogeography is considered, and the implications for treating CF lung infections also are discussed.

Relevance:

100.00%

Publisher:

Abstract:

This paper presents a novel approach to the automatic classification of very large data sets composed of terahertz pulse transient signals, highlighting their potential use in biochemical, biomedical, pharmaceutical and security applications. Two different types of THz spectra are considered in the classification process. First, a binary classification study of poly-A and poly-C ribonucleic acid samples is performed. This is then contrasted with a difficult multi-class classification problem of spectra from six different powder samples that, although fairly indistinguishable in the optical spectrum, possess a few discernible spectral features in the terahertz part of the spectrum. Classification is performed using a complex-valued extreme learning machine algorithm that takes into account features in both the amplitude and the phase of the recorded spectra. Classification speed and accuracy are contrasted with those achieved using a support vector machine classifier. The study systematically compares the classifier performance achieved after adopting different Gaussian kernels when separating amplitude and phase signatures. The two signatures are presented as feature vectors for both training and testing purposes. The study confirms the utility of complex-valued extreme learning machine algorithms for classification of the very large data sets generated with current terahertz imaging spectrometers. The classifier can take into consideration heterogeneous layers within an object, as would be required within a tomographic setting, and is sufficiently robust to detect patterns hidden inside noisy terahertz data sets. The proposed study opens up the opportunity for the establishment of complex-valued extreme learning machine algorithms as new chemometric tools that will assist the wider proliferation of terahertz sensing technology for chemical sensing, quality control, security screening and clinical diagnosis. Furthermore, the proposed algorithm should also be very useful in other applications requiring the classification of very large data sets.

Relevance:

100.00%

Publisher:

Abstract:

One of the most pervasive assumptions about human brain evolution is that it involved relative enlargement of the frontal lobes. We show that this assumption is without foundation. Analysis of five independent data sets using correctly scaled measures and phylogenetic methods reveals that the size of human frontal lobes, and of specific frontal regions, is as expected relative to the size of other brain structures. Recent claims for relative enlargement of human frontal white matter volume, and for relative enlargement shared by all great apes, seem to be mistaken. Furthermore, using a recently developed method for detecting shifts in evolutionary rates, we find that the rate of change in relative frontal cortex volume along the phylogenetic branch leading to humans was unremarkable and that other branches showed significantly faster rates of change. Although absolute and proportional frontal region size increased rapidly in humans, this change was tightly correlated with corresponding size increases in other areas and whole brain size, and with decreases in frontal neuron densities. The search for the neural basis of human cognitive uniqueness should therefore focus less on the frontal lobes in isolation and more on distributed neural networks.

Relevance:

100.00%

Publisher:

Abstract:

We discuss the modeling of dielectric responses of electromagnetically excited networks which are composed of a mixture of capacitors and resistors. Such networks can be employed as lumped-parameter circuits to model the response of composite materials containing conductive and insulating grains. The dynamics of the excited network systems are studied using a state space model derived from a randomized incidence matrix. Time and frequency domain responses from synthetic data sets generated from state space models are analyzed for the purpose of estimating the fraction of capacitors in the network. Good results were obtained by using either the time-domain response to a pulse excitation or impedance data at selected frequencies. A chemometric framework based on a Successive Projections Algorithm (SPA) enables the construction of multiple linear regression (MLR) models which can efficiently determine the ratio of conductive to insulating components in composite material samples. The proposed method avoids restrictions commonly associated with Archie’s law, the application of percolation theory or Kohlrausch-Williams-Watts models and is applicable to experimental results generated by either time domain transient spectrometers or continuous-wave instruments. Furthermore, it is quite generic and applicable to tomography, acoustics as well as other spectroscopies such as nuclear magnetic resonance, electron paramagnetic resonance and, therefore, should be of general interest across the dielectrics community.

Relevance:

100.00%

Publisher:

Abstract:

A multivariate fit to the variation in global mean surface air temperature anomaly over the past half century is presented. The fit procedure allows for the effect of response time on the waveform, amplitude and lag of each radiative forcing input, and each is allowed to have its own time constant. It is shown that the contribution of solar variability to the temperature trend since 1987 is small and downward; the best estimate is -1.3% and the 2σ confidence level sets an uncertainty range of -0.7 to -1.9%. The result is the same if one quantifies the solar variation using galactic cosmic ray fluxes (for which the analysis can be extended back to 1953) or the most accurate total solar irradiance data composite. The rise in the global mean air surface temperatures is predominantly associated with a linear increase that represents the combined effects of changes in anthropogenic well-mixed greenhouse gases and aerosols, although, in recent decades, there is also a considerable contribution by a relative lack of major volcanic eruptions. The best estimate is that the anthropogenic factors contribute 75% of the rise since 1987, with an uncertainty range (set by the 2σ confidence level using an AR(1) noise model) of 49–160%; thus, the uncertainty is large, but we can state that at least half of the temperature trend comes from the linear term and that this term could explain the entire rise. The results are consistent with the Intergovernmental Panel on Climate Change (IPCC) estimates of the changes in radiative forcing (given for 1961–1995) and are here combined with those estimates to find the response times, equilibrium climate sensitivities and pertinent heat capacities (i.e. the depth into the oceans to which a given radiative forcing variation penetrates) of the quasi-periodic (decadal-scale) input forcing variations. As shown by previous studies, the decadal-scale variations do not penetrate as deeply into the oceans as the longer term drifts and have shorter response times. Hence, conclusions about the response to century-scale forcing changes (and hence the associated equilibrium climate sensitivity and the temperature rise commitment) cannot be made from studies of the response to shorter period forcing changes.
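To make the role of a response time constant concrete, the sketch below computes how a sinusoidal forcing is attenuated and delayed by a first-order linear system. The abstract does not state the functional form of the response, so the single-time-constant model (tau * dT/dt + T = S * F(t)), the function name and the example values are assumptions made for illustration only.

```python
import math


def first_order_response(period, tau):
    """Amplitude factor and lag for a sinusoid passed through a first-order system.

    Assumes tau * dT/dt + T = S * F(t) with F(t) sinusoidal; `period` and `tau`
    must be in the same time units. The returned amplitude factor multiplies
    the equilibrium response S * F, and the lag is in those same time units.
    """
    omega = 2.0 * math.pi / period
    amplitude_factor = 1.0 / math.sqrt(1.0 + (omega * tau) ** 2)
    lag = math.atan(omega * tau) / omega
    return amplitude_factor, lag


# Hypothetical example: an 11-year quasi-periodic forcing and a 2-year time constant.
factor, lag_years = first_order_response(period=11.0, tau=2.0)
```

Under this assumed model, a longer time constant both damps and delays the response to a forcing of fixed period, and the lag can never exceed a quarter of the period.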

Relevance:

100.00%

Publisher:

Abstract:

Baking and 2-g mixograph analyses were performed for 55 cultivars (19 spring and 36 winter wheat) from various quality classes from the 2002 harvest in Poland. An instrumented 2-g direct-drive mixograph was used to study the mixing characteristics of the wheat cultivars. A number of parameters were extracted automatically from each mixograph trace and correlated with baking volume and flour quality parameters (protein content and high molecular weight glutenin subunit [HMW-GS] composition by SDS-PAGE) using multiple linear regression statistical analysis. Principal component analysis of the mixograph data discriminated between four flour quality classes, and predictions of baking volume were obtained using several selected mixograph parameters, chosen using a best subsets regression routine, giving R² values of 0.862–0.866. In particular, three new spring wheat strains (CHD 502a-c) recently registered in Poland were highly discriminated and predicted to give high baking volume on the basis of two mixograph parameters: peak bandwidth and 10-min bandwidth.

Relevance:

100.00%

Publisher:

Abstract:

Background: Robot-mediated therapies offer entirely new approaches to neurorehabilitation. In this paper we present the results obtained from trialling the GENTLE/S neurorehabilitation system, assessed using the upper limb section of the Fugl-Meyer (FM) outcome measure. Methods: We demonstrate the design of our clinical trial and its results, analysed using a novel statistical approach based on a multivariate analytical model. This paper provides the rationale for using multivariate models in robot-mediated clinical trials and draws conclusions from the clinical data gathered during the GENTLE/S study. Results: The FM outcome measures recorded during the baseline (8 sessions), robot-mediated therapy (9 sessions) and sling-suspension (9 sessions) phases were analysed using a multiple regression model. The results indicate positive but modest recovery trends favouring both interventions used in the GENTLE/S clinical trial. The modest recovery shown occurred at a time late after stroke when changes are not clinically anticipated. Conclusion: This study has applied a new method for analysing clinical data obtained from rehabilitation robotics studies. Although the data obtained during the clinical trial are multivariate, multipoint and progressive in nature, the multiple regression model used showed great potential for drawing conclusions from this study. An important conclusion is that the intervention and control phases both caused changes over a period of 9 sessions in comparison to the baseline. This might indicate that the use of new challenging and motivational therapies can influence the outcome of therapies at a point when clinical changes are not expected. Further work is required to investigate the effects arising from early intervention, longer exposure and intensity of the therapies. Finally, more function-oriented robot-mediated therapies or sling-suspension therapies are needed to clarify the effects resulting from each intervention for stroke recovery.

Relevance:

100.00%

Publisher:

Abstract:

We are developing computational tools supporting the detailed analysis of the dependence of neural electrophysiological response on dendritic morphology. We approach this problem by combining simulations of faithful models of neurons (experimental real-life morphological data with known models of channel kinetics) with algorithmic extraction of morphological and physiological parameters and statistical analysis. In this paper, we present a novel method for the automatic recognition of spike trains in voltage traces, which eliminates the need for human intervention. This enables classification of waveforms with consistent criteria across all the analyzed traces and so amounts to a reduction of the noise in the data. This method allows for automatic extraction of relevant physiological parameters necessary for further statistical analysis. In order to illustrate the usefulness of this procedure for analyzing voltage traces, we characterized the influence of the somatic current injection level on several electrophysiological parameters in a set of modeled neurons. This application suggests that such algorithmic processing of physiological data extracts parameters in a form suitable for further investigation of structure-activity relationships in single neurons.
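The abstract does not describe how spikes are recognised, so the following is only a generic sketch of automatic spike detection by upward threshold crossing, not the authors' method. The function name, the threshold of 0 mV and the refractory window are hypothetical choices for the example.

```python
def detect_spikes(trace, threshold=0.0, refractory=1):
    """Return sample indices at which `trace` crosses `threshold` upward.

    A crossing at index i means trace[i - 1] < threshold <= trace[i].
    Crossings closer than `refractory` samples to the previously accepted
    spike are ignored, so one noisy spike is not counted twice.
    """
    spikes = []
    for i in range(1, len(trace)):
        crossed = trace[i - 1] < threshold <= trace[i]
        if crossed and (not spikes or i - spikes[-1] >= refractory):
            spikes.append(i)
    return spikes


# Hypothetical membrane potential samples in mV, for illustration only.
voltage = [-65.0, -60.0, 10.0, 30.0, -70.0, -65.0, 5.0, 20.0, -65.0]
spike_indices = detect_spikes(voltage, threshold=0.0)
spike_count = len(spike_indices)
```

Applying the same threshold and refractory window to every trace is what gives the consistent criteria that the abstract credits with reducing noise; parameters such as spike count or inter-spike interval can then be derived from the returned indices.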

Relevance:

100.00%

Publisher:

Abstract:

Variability in the strength of the stratospheric Lagrangian mean meridional or Brewer-Dobson circulation and horizontal mixing into the tropics over the past three decades is examined using observations of stratospheric mean age of air and ozone. We use a simple representation of the stratosphere, the tropical leaky pipe (TLP) model, guided by mean meridional circulation and horizontal mixing changes in several reanalysis data sets and chemistry climate model (CCM) simulations, to help elucidate reasons for the observed changes in stratospheric mean age and ozone. We find that the TLP model is able to accurately simulate multiyear variability in ozone following recent major volcanic eruptions and the early 2000s sea surface temperature changes, as well as the lasting impact on mean age of relatively short-term circulation perturbations. We also find that the best quantitative agreement with the observed mean age and ozone trends over the past three decades is found assuming a small strengthening of the mean circulation in the lower stratosphere, a moderate weakening of the mean circulation in the middle and upper stratosphere, and a moderate increase in the horizontal mixing into the tropics. The mean age trends are strongly sensitive to trends in the horizontal mixing into the tropics, and the uncertainty in the mixing trends causes uncertainty in the mean circulation trends. Comparisons of the mean circulation and mixing changes suggested by the measurements with those from a recent suite of CCM runs reveal significant differences that may have important implications for the accurate simulation of future stratospheric climate.

Relevance:

100.00%

Publisher:

Abstract:

An analysis method for diffusion tensor (DT) magnetic resonance imaging data is described, which, contrary to the standard method (multivariate fitting), does not require a specific functional model for diffusion-weighted (DW) signals. The method uses principal component analysis (PCA) under the assumption of a single fibre per pixel. PCA and the standard method were compared using simulations and human brain data. The two methods were equivalent in determining fibre orientation. PCA-derived fractional anisotropy and DT relative anisotropy had similar signal-to-noise ratio (SNR) and dependence on fibre shape. PCA-derived mean diffusivity had similar SNR to the respective DT scalar, and it depended on fibre anisotropy. Appropriate scaling of the PCA measures resulted in very good agreement between PCA and DT maps. In conclusion, the assumption of a specific functional model for DW signals is not necessary for characterization of anisotropic diffusion in a single fibre.

Relevance:

100.00%

Publisher:

Abstract:

The Advanced Along-Track Scanning Radiometer (AATSR) was launched on Envisat in March 2002. The AATSR instrument is designed to retrieve precise and accurate global sea surface temperature (SST) that, combined with the large data set collected from its predecessors, ATSR and ATSR-2, will provide a long-term record of SST data spanning more than 15 years. This record can be used for independent monitoring and detection of climate change. The AATSR validation programme has successfully completed its initial phase. The programme involves validation of the AATSR-derived SST values using in situ radiometers, in situ buoys and global SST fields from other data sets. The results of the initial programme presented here will demonstrate that the AATSR instrument is currently close to meeting its scientific objective of determining global SST to an accuracy of 0.3 K (one sigma). For night time data, the analysis gives a warm bias of between +0.04 K (0.28 K) for buoys and +0.06 K (0.20 K) for radiometers, with slightly higher errors observed for day time data, showing warm biases of between +0.02 K (0.39 K) for buoys and +0.11 K (0.33 K) for radiometers. They show that the ATSR series of instruments continues to be the world leader in delivering accurate space-based observations of SST, which is a key climate parameter.
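Validation figures of the form "bias (spread)" are typically obtained from matched pairs of satellite and in situ measurements. The sketch below shows one way such statistics could be computed; it assumes that the bracketed values in the abstract are standard deviations of the differences, which the abstract does not state explicitly, and the function name and sample temperatures are hypothetical.

```python
from statistics import mean, stdev


def validation_stats(satellite_sst, in_situ_sst):
    """Return (bias, spread) of satellite minus in situ SST for matched pairs.

    bias   : mean difference in K; positive means the satellite reads warm.
    spread : sample standard deviation of the differences in K.
    """
    if len(satellite_sst) != len(in_situ_sst):
        raise ValueError("matched pairs are required")
    differences = [s - r for s, r in zip(satellite_sst, in_situ_sst)]
    return mean(differences), stdev(differences)


# Hypothetical matched pairs in kelvin, for illustration only.
satellite = [290.1, 291.3, 289.9]
buoys = [290.0, 291.0, 290.0]
bias, spread = validation_stats(satellite, buoys)
```

A positive bias corresponds to the "warm bias" wording in the abstract, and the spread can be compared against an accuracy target such as the 0.3 K (one sigma) objective.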