933 results for multivariate Methoden
Abstract:
We propose a family of multivariate heavy-tailed distributions that allow variable marginal amounts of tailweight. The originality comes from introducing multidimensional instead of univariate scale variables for the mixture of scaled Gaussian family of distributions. In contrast to most existing approaches, the derived distributions can account for a variety of shapes and have a simple tractable form with a closed-form probability density function whatever the dimension. We examine a number of properties of these distributions and illustrate them in the particular case of Pearson type VII and t tails. For these latter cases, we provide maximum likelihood estimation of the parameters and illustrate their modelling flexibility on simulated and real data clustering examples.
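The idea of one scale variable per dimension, rather than a single shared one, can be sketched in a few lines. The construction below is an illustrative assumption, not the authors' exact parameterisation: each rotated axis gets its own Gamma-distributed weight, so each axis has t-like tails with its own degrees of freedom. The function name `sample_multiple_scaled_t` and all parameter values are hypothetical.

```python
import math
import random

def sample_multiple_scaled_t(mu, theta, scales, dofs, rng):
    """Draw one 2-D sample with a separate tail weight per rotated axis.

    Assumed construction (illustrative only):
        X = mu + D * diag(sqrt(a_d / w_d)) * z
    where D is a rotation by `theta`, z is standard normal, and
    w_d ~ Gamma(shape=nu_d / 2, scale=2 / nu_d) independently per axis,
    so that z_d / sqrt(w_d) follows a Student t with nu_d degrees of freedom.
    """
    y = []
    for a, nu in zip(scales, dofs):
        w = rng.gammavariate(nu / 2.0, 2.0 / nu)  # per-dimension scale variable
        z = rng.gauss(0.0, 1.0)
        y.append(math.sqrt(a / w) * z)
    c, s = math.cos(theta), math.sin(theta)
    return (mu[0] + c * y[0] - s * y[1], mu[1] + s * y[0] + c * y[1])

rng = random.Random(0)
# Axis 0 is heavy-tailed (nu = 2.5), axis 1 is close to Gaussian (nu = 30).
samples = [
    sample_multiple_scaled_t((0.0, 0.0), 0.0, (1.0, 1.0), (2.5, 30.0), rng)
    for _ in range(5000)
]
largest_on_heavy_axis = max(abs(x) for x, _ in samples)
largest_on_light_axis = max(abs(y) for _, y in samples)
print(largest_on_heavy_axis, largest_on_light_axis)
```

With a single shared weight, both axes would necessarily have the same tail behaviour; drawing one weight per axis is what lets the marginal tailweights differ.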
Abstract:
BACKGROUND: In order to rapidly and efficiently screen potential biofuel feedstock candidates for quintessential traits, robust high-throughput analytical techniques must be developed and honed. The traditional methods of measuring lignin syringyl/guaiacyl (S/G) ratio can be laborious, involve hazardous reagents, and/or be destructive. Vibrational spectroscopy can furnish high-throughput instrumentation without the limitations of the traditional techniques. Spectral data from mid-infrared, near-infrared, and Raman spectroscopies were combined with S/G ratios, obtained using pyrolysis molecular beam mass spectrometry, from 245 different eucalypt and Acacia trees across 17 species. Iterations of spectral processing allowed the assembly of robust predictive models using partial least squares (PLS). RESULTS: The PLS models were rigorously evaluated using three different randomly generated calibration and validation sets for each spectral processing approach. Root mean standard errors of prediction for validation sets were lowest for models built from Raman (0.13 to 0.16) and mid-infrared (0.13 to 0.15) spectral data, while near-infrared spectroscopy led to less accurate predictions (0.18 to 0.21). Correlation coefficients (r) for the validation sets followed a similar pattern: Raman (0.89 to 0.91), mid-infrared (0.87 to 0.91), and near-infrared (0.79 to 0.82). These statistics indicate that Raman and mid-infrared spectroscopy led to the most accurate predictions of S/G ratio in a diverse consortium of feedstocks. CONCLUSION: Eucalypts present an attractive option for biofuel and biochemical production. Given the assortment of over 900 different species of Eucalyptus and Corymbia, in addition to various species of Acacia, it is necessary to isolate those possessing ideal biofuel traits.
This research has demonstrated the validity of vibrational spectroscopy to efficiently partition different potential biofuel feedstocks according to lignin S/G ratio, significantly reducing experiment and analysis time and expense while providing non-destructive, accurate, global, predictive models encompassing a diverse array of feedstocks.
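The two validation statistics quoted above can be illustrated with a minimal sketch. It assumes the reported prediction error is the usual root mean square error of prediction, and the S/G values below are hypothetical, not taken from the study.

```python
import math

def rmsep(reference, predicted):
    """Root mean square error of prediction over a validation set."""
    n = len(reference)
    return math.sqrt(sum((p - r) ** 2 for r, p in zip(reference, predicted)) / n)

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical validation set: reference S/G ratios and model predictions.
reference_sg = [1.8, 2.1, 2.4, 2.9, 3.3]
predicted_sg = [1.9, 2.0, 2.6, 2.8, 3.4]

error = rmsep(reference_sg, predicted_sg)
r = pearson_r(reference_sg, predicted_sg)
print(round(error, 3), round(r, 3))
```

A lower RMSEP and a higher r both point to a better model, which is the basis on which the abstract ranks Raman and mid-infrared ahead of near-infrared.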
Abstract:
Close to one half of the LHC events are expected to be due to elastic or inelastic diffractive scattering. Still, predictions based on extrapolations of experimental data at lower energies differ by large factors in estimating the relative rate of diffractive event categories at LHC energies. By identifying diffractive events, detailed studies on proton structure can be carried out. The combined forward physics objects (rapidity gaps, forward multiplicity and transverse energy flows) can be used to efficiently classify proton-proton collisions. Data samples recorded by the forward detectors, with a simple extension, will allow first estimates of the single diffractive (SD), double diffractive (DD), central diffractive (CD), and non-diffractive (ND) cross sections. The approach, which uses the measurement of inelastic activity in forward and central detector systems, is complementary to the detection and measurement of leading beam-like protons. In this investigation, three different multivariate analysis approaches are assessed in classifying forward physics processes at the LHC. It is shown that with gene expression programming, neural networks and support vector machines, diffraction can be efficiently identified within a large sample of simulated proton-proton scattering events. The event characteristics are visualized by using the self-organizing map algorithm.
Abstract:
The basic characteristic of a chaotic system is its sensitivity to infinitesimal changes in its initial conditions. A limit to predictability in a chaotic system arises mainly due to this sensitivity and also due to the ineffectiveness of the model to reveal the underlying dynamics of the system. In the present study, an attempt is made to quantify the uncertainties involved and thereby improve the predictability by adopting a multivariate nonlinear ensemble prediction. Daily rainfall data of the Malaprabha basin, India, for the period 1955-2000 is used for the study. It is found to exhibit a low dimensional chaotic nature with the dimension varying from 5 to 7. A multivariate phase space is generated, considering a climate data set of 16 variables. The chaotic nature of each of these variables is confirmed using the false nearest neighbor method. The redundancy, if any, of this atmospheric data set is further removed by employing principal component analysis (PCA), thereby reducing it to eight principal components (PCs). This multivariate series (rainfall along with eight PCs) is found to exhibit a low dimensional chaotic nature with dimension 10. Nonlinear prediction employing the local approximation method is done using the univariate series (rainfall alone) and the multivariate series for different combinations of embedding dimensions and delay times. The uncertainty in initial conditions is thus addressed by reconstructing the phase space using different combinations of parameters. The ensembles generated from multivariate predictions are found to be better than those from univariate predictions. The uncertainty in predictions is decreased, or in other words predictability is increased, by adopting multivariate nonlinear ensemble prediction. The restriction on predictability of a chaotic series can thus be altered by quantifying the uncertainty in the initial conditions and also by including other possible variables, which may influence the system.
(C) 2011 Elsevier B.V. All rights reserved.
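The ensemble idea described above, reconstructing the phase space under several embedding dimensions and delay times and forecasting from each, can be sketched for the univariate case. This is a minimal illustration that assumes a zeroth-order local approximation (averaging the successors of the nearest neighbours); the study may use a different local model. The logistic map stands in for the rainfall series, and all function names and settings are hypothetical.

```python
import math

def delay_embed(series, dim, delay):
    """Return (t, vector) pairs with vector = [x(t), x(t-delay), ..., x(t-(dim-1)*delay)]."""
    start = (dim - 1) * delay
    return [(t, [series[t - k * delay] for k in range(dim)])
            for t in range(start, len(series))]

def local_predict(series, dim, delay, k=3):
    """One-step forecast: mean successor of the k nearest neighbours of the last delay vector."""
    vectors = delay_embed(series, dim, delay)
    _, last_vec = vectors[-1]
    candidates = vectors[:-1]  # every candidate has a known successor series[t + 1]
    candidates.sort(key=lambda tv: math.dist(tv[1], last_vec))
    nearest = candidates[:k]
    return sum(series[t + 1] for t, _ in nearest) / len(nearest)

def ensemble_forecast(series, settings, k=3):
    """One forecast per (embedding dimension, delay time) combination."""
    return [local_predict(series, dim, delay, k) for dim, delay in settings]

# Stand-in chaotic series (logistic map), not rainfall data.
x = [0.4]
for _ in range(499):
    x.append(3.9 * x[-1] * (1.0 - x[-1]))

train, truth = x[:-1], x[-1]
ensemble = ensemble_forecast(train, [(2, 1), (3, 1), (3, 2), (4, 1)])
print(ensemble, truth)
```

The spread of the ensemble members gives a rough picture of how much the forecast depends on the reconstruction parameters, which is the uncertainty the abstract sets out to quantify.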
Abstract:
Granger causality is increasingly being applied to multi-electrode neurophysiological and functional imaging data to characterize directional interactions between neurons and brain regions. For a multivariate dataset, one might be interested in different subsets of the recorded neurons or brain regions. According to the current estimation framework, for each subset, one conducts a separate autoregressive model fitting process, introducing the potential for unwanted variability and uncertainty. In this paper, we propose a multivariate framework for estimating Granger causality. It is based on spectral density matrix factorization and offers the advantage that the estimation of such a matrix needs to be done only once for the entire multivariate dataset. For any subset of recorded data, Granger causality can be calculated through factorizing the appropriate submatrix of the overall spectral density matrix.
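For a pair of channels taken from the full dataset, the calculation can be written out with the standard Geweke frequency-domain measure. This is a sketch under the assumption that the relevant $2 \times 2$ block of the spectral density matrix has been factorized into a transfer function $H(f)$ and a noise covariance $\Sigma$; the symbols are introduced here for illustration and are not taken from the abstract.

```latex
% Factorization of the 2x2 spectral submatrix
S(f) = H(f)\,\Sigma\,H^{*}(f), \qquad
\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{12} & \Sigma_{22} \end{pmatrix}

% Granger causality from channel 2 to channel 1 at frequency f
I_{2 \to 1}(f) = \ln
\frac{S_{11}(f)}
     {S_{11}(f) - \left( \Sigma_{22} - \dfrac{\Sigma_{12}^{2}}{\Sigma_{11}} \right) \lvert H_{12}(f) \rvert^{2}}
```

The denominator is the part of the power of channel 1 that is intrinsic to channel 1, so the ratio exceeds one only when channel 2 contributes power through $H_{12}(f)$.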
Abstract:
We consider refined versions of Markov chains related to juggling introduced by Warrington. We further generalize the construction to juggling with arbitrary heights as well as infinitely many balls, which are expressed more succinctly in terms of Markov chains on integer partitions. In all cases, we give explicit product formulas for the stationary probabilities. The normalization factor in one case can be explicitly written as a homogeneous symmetric polynomial. We also refine and generalize enriched Markov chains on set partitions. Lastly, we prove that in one case, the stationary distribution is attained in bounded time.
Abstract:
Biomolecular recognition underlying drug-target interactions is determined by both binding affinity and specificity. Whilst quantification of binding efficacy is possible, determining specificity remains a challenge, as it requires affinity data for multiple targets with the same ligand dataset. Thus, understanding the interaction space by mapping the target space to model its complementary chemical space through computational techniques is desirable. In this study, the active site architecture of the FabD drug target in two apicomplexan parasites, viz. Plasmodium falciparum (PfFabD) and Toxoplasma gondii (TgFabD), is explored, followed by consensus docking calculations and identification of the fifteen best hit compounds, most of which are found to be derivatives of natural products. Subsequently, machine learning techniques were applied to molecular descriptors of six FabD homologs and sixty ligands to induce distinct multivariate partial least squares models. The biological space of FabD mapped by the various chemical entities explains their interaction space in general. It also highlights the selective variations in FabD of apicomplexan parasites with respect to that of the host. Furthermore, chemometric models revealed the principal chemical scaffolds in PfFabD and TgFabD as pyrrolidines and imidazoles, respectively, which render target specificity and improve binding affinity in combination with other functional descriptors conducive to the design and optimization of the leads.
Abstract:
Climate change in response to a change in external forcing can be understood in terms of fast response to the imposed forcing and slow feedback associated with surface temperature change. Previous studies have investigated the characteristics of fast response and slow feedback for different forcing agents. Here we examine to what extent fast response and slow feedback derived from time-mean results of climate model simulations can be used to infer total climate change. To achieve this goal, we develop a multivariate regression model of climate change, in which the change in a climate variable is represented by a linear combination of its sensitivity to CO2 forcing, solar forcing, and change in global mean surface temperature. We derive the parameters of the regression model using time-mean results from a set of HadCM3L climate model step-forcing simulations, and then use the regression model to emulate HadCM3L-simulated transient climate change. Our results show that the regression model emulates well the HadCM3L-simulated temporal evolution and spatial distribution of climate change, including surface temperature, precipitation, runoff, soil moisture, cloudiness, and radiative fluxes under transient CO2 and/or solar forcing scenarios. Our findings suggest that temporal and spatial patterns of total change for the climate variables considered here can be represented well by the sum of fast response and slow feedback. Furthermore, by using a simple 1-D heat-diffusion climate model, we show that the temporal and spatial characteristics of climate change under transient forcing scenarios can be emulated well using information from step-forcing simulations alone.
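The regression described above can be sketched as an ordinary least-squares fit with three predictors. The example assumes the simplest reading of the abstract: the change in a variable equals a CO2 coefficient times CO2 forcing, plus a solar coefficient times solar forcing, plus a feedback coefficient times the change in global mean surface temperature. All numbers are hypothetical and are not HadCM3L output.

```python
def solve_linear(a, b):
    """Solve a small dense system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_coefficients(predictors, response):
    """Least-squares fit of response ~ sum_j coef_j * predictor_j (no intercept)."""
    k = len(predictors[0])
    xtx = [[sum(row[i] * row[j] for row in predictors) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(predictors, response)) for i in range(k)]
    return solve_linear(xtx, xty)

def emulate(coefs, f_co2, f_solar, delta_t):
    """Fast response to each forcing plus slow feedback tied to surface temperature change."""
    return coefs[0] * f_co2 + coefs[1] * f_solar + coefs[2] * delta_t

# Hypothetical time-mean results: (CO2 forcing, solar forcing, surface temperature change).
rows = [(3.7, 0.0, 0.0), (0.0, 3.7, 0.0), (3.7, 0.0, 3.0), (0.0, 3.7, 3.2), (3.7, 3.7, 6.0)]
assumed_coefs = (-0.02, -0.005, 0.06)  # used only to generate synthetic responses
response = [emulate(assumed_coefs, *row) for row in rows]

coefs = fit_coefficients(rows, response)
transient_change = emulate(coefs, 1.5, 0.0, 1.1)  # emulate one transient state
print(coefs, transient_change)
```

Once fitted, the same three coefficients are applied to any transient combination of forcing and temperature change, which is what makes the emulation cheap compared with rerunning the climate model.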
Abstract:
Fish assemblage structure of Maryland's coastal lagoon complex was analyzed for spatial and seasonal patterns for the period 1991-2000. Data were made available by the Maryland Department of Natural Resources from their MD Coastal Bays Finfish Survey. Dominant species from separate trawl and seine surveys included blue crab Callinectes sapidus (erroneously included here as a "fish" due to its dominance and commercial importance), bay anchovy Anchoa mitchilli, spot Leiostomus xanthurus, silver perch Bairdiella chrysoura, and Atlantic menhaden Brevoortia tyrannus. Ninety-four fish species were identified in the two surveys, a diversity substantially higher than other survey records for Middle Atlantic Bight estuarine and lagoon systems (richness=26 to 78 species). Total species richness for the trawl survey was highest in Chincoteague and lowest in Assawoman and Sinepuxent. On the other hand, mean richness per tow (-area) and the related Shannon-Wiener Diversity Index were significantly higher in the northern two bays (Assawoman and Isle of Wight Bays) than in the two southern bays (Chincoteague and Sinepuxent Bays). For the seine survey, effort-adjusted diversity indices were significantly lower for Chincoteague Bay than for the other three bays. Higher relative abundances were observed in the northern bays than in the southern bays. The trawl survey exhibited the lowest catch-per-site in Sinepuxent Bay and the highest in Assawoman Bay. The seine survey had the lowest catch-per-site in Chincoteague Bay, while the other three embayments were of similar magnitude. There was clear seasonality in assemblage structure, with peak abundance and diversity in the summer compared to other seasons. Blue crabs in particular showed a c. 2-fold decline in relative abundance from early summer to fall, which is likely attributable to harvest removals (i.e., an exploitation rate of c. 50%).
Seagrass coverage, although increasing over the course of the 10-year survey, did not have obvious effects on species diversity and abundance across or within the embayments, although it did have positive associations with two important species: bay anchovy and summer flounder Paralichthys dentatus. Atlantic menhaden were most dominant in Assawoman Bay, which could be related to the higher primary production typically observed in this bay in comparison to the other three. (PDF contains 99 pages)
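The per-tow richness and Shannon-Wiener index mentioned above can be computed as in the following sketch. The catch counts are hypothetical, and the natural logarithm is assumed because the abstract does not state the base used.

```python
import math

def species_richness(counts):
    """Number of species with at least one individual in the sample."""
    return sum(1 for n in counts.values() if n > 0)

def shannon_wiener(counts):
    """Shannon-Wiener index H' = -sum(p_i * ln p_i) over species proportions p_i."""
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values() if n > 0)

# Hypothetical catch from a single tow (individuals per species).
tow = {"bay anchovy": 120, "spot": 40, "silver perch": 25, "Atlantic menhaden": 15}

richness = species_richness(tow)
diversity = shannon_wiener(tow)
print(richness, round(diversity, 3))
```

The index is highest when individuals are spread evenly across species, so a tow dominated by one species scores lower than an even tow with the same richness.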
Abstract:
Investigations on the avoidance reactions of pelagic schooling fish (herring and sprat) triggered by an approaching fishery vessel were carried out during the 378th cruise of FRC "Solea" from 25 September to 3 October 1995 in the Arkona Sea, southern Baltic. An EK 500/BI500 echosounder system with a 38 kHz transducer mounted on a towed body as well as a 120 kHz hull-mounted transducer was used. Fish densities were measured synchronously under the ship and at lateral distances from the ship, the latter by the transducer of the towed body. By these means the variation of fish densities up to a certain distance from the ship can be determined. The advantage of using an echo integrating system for these measurements is that it also works for non-schooling fish and under conditions where schooling fish disperse (e.g. at night).