967 results for Data matrix
Abstract:
In a seminal paper, Aitchison and Lauder (1985) introduced classical kernel density estimation techniques in the context of compositional data analysis. They gave two options for the choice of the kernel to be used in the kernel estimator. One of these kernels is based on the use of the alr transformation on the simplex S^D jointly with the normal distribution on R^(D-1). However, the authors themselves recognized that this method has some deficiencies. A method for overcoming these difficulties, based on recent developments in compositional data analysis and multivariate kernel estimation theory and combining the ilr transformation with the normal density with a full bandwidth matrix, was recently proposed in Martín-Fernández, Chacón and Mateu-Figueras (2006). Here we present an extensive simulation study that compares both methods in practice, thus exploring the finite-sample behaviour of both estimators.
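The ilr-plus-full-bandwidth idea described in this abstract can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function names are hypothetical, the Helmert-type ilr basis is one of many valid orthonormal bases, and the normal-reference bandwidth rule is an assumption standing in for whatever selector the cited paper uses. The density is returned in ilr coordinates.

```python
import numpy as np

def ilr_basis(D):
    """Orthonormal Helmert-type basis of the clr hyperplane, shape (D-1, D)."""
    V = np.zeros((D - 1, D))
    for i in range(1, D):
        V[i - 1, :i] = 1.0 / i
        V[i - 1, i] = -1.0
        V[i - 1] *= np.sqrt(i / (i + 1.0))
    return V

def ilr(X):
    """Map compositions (rows, strictly positive parts) to ilr coordinates."""
    X = np.asarray(X, dtype=float)
    logX = np.log(X)
    clr = logX - logX.mean(axis=1, keepdims=True)
    return clr @ ilr_basis(X.shape[1]).T

def normal_reference_bandwidth(data):
    """Full bandwidth matrix from a normal-reference rule (an assumed choice)."""
    n, d = data.shape
    factor = (4.0 / (d + 2.0)) ** (2.0 / (d + 4.0)) * n ** (-2.0 / (d + 4.0))
    return factor * np.atleast_2d(np.cov(data, rowvar=False))

def kde_full_bandwidth(points, data, H):
    """Gaussian kernel density estimate with full bandwidth matrix H."""
    d = data.shape[1]
    H_inv = np.linalg.inv(H)
    norm = 1.0 / np.sqrt((2.0 * np.pi) ** d * np.linalg.det(H))
    diff = points[:, None, :] - data[None, :, :]
    quad = np.einsum("ijk,kl,ijl->ij", diff, H_inv, diff)
    return norm * np.exp(-0.5 * quad).mean(axis=1)

# Hypothetical 3-part compositions drawn from a Dirichlet distribution.
rng = np.random.default_rng(0)
compositions = rng.dirichlet([2.0, 3.0, 4.0], size=50)
coords = ilr(compositions)
H = normal_reference_bandwidth(coords)
density = kde_full_bandwidth(coords, coords, H)
```

Because the ilr transformation is an isometry, a full (non-diagonal) bandwidth matrix in ilr coordinates lets the kernel follow the correlation structure of the data rather than the axes of an arbitrary basis.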
Abstract:
The quantitative estimation of Sea Surface Temperatures (SST) from fossil assemblages is a fundamental issue in palaeoclimatic and palaeoceanographic investigations. The Modern Analogue Technique, a widely adopted method based on direct comparison of fossil assemblages with modern coretop samples, was revised with the aim of conforming it to compositional data analysis. The new CODAMAT method was developed by adopting the Aitchison metric as the distance measure. Modern coretop datasets are characterised by a large number of zeros. The zeros were replaced using a Bayesian approach based on a posterior estimation of the parameter of the multinomial distribution. The number of modern analogues from which to reconstruct the SST was determined by means of a multiple approach considering the Proxies correlation matrix, the Standardized Residual Sum of Squares and the Mean Squared Distance. This new CODAMAT method was applied to the planktonic foraminiferal assemblages of a core recovered in the Tyrrhenian Sea. Key words: Modern analogues, Aitchison distance, Proxies correlation matrix, Standardized Residual Sum of Squares
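As a rough illustration of the distance measure named in this abstract, the Aitchison distance between two compositions is the Euclidean distance between their centred log-ratio (clr) vectors. The sketch below assumes zeros have already been replaced; the function names and the sample values are hypothetical and are not taken from the CODAMAT study.

```python
import math

def clr(x):
    """Centred log-ratio transform of a strictly positive composition."""
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]

def aitchison_distance(x, y):
    """Euclidean distance between the clr vectors of x and y."""
    cx, cy = clr(x), clr(y)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(cx, cy)))

def nearest_analogues(fossil, modern, k):
    """Indices of the k modern samples closest to the fossil sample."""
    ranked = sorted(range(len(modern)),
                    key=lambda i: aitchison_distance(fossil, modern[i]))
    return ranked[:k]

# Hypothetical assemblages (species proportions), zeros already replaced.
modern_samples = [[0.1, 0.2, 0.7], [0.3, 0.3, 0.4], [0.6, 0.3, 0.1]]
fossil_sample = [0.28, 0.32, 0.40]
closest = nearest_analogues(fossil_sample, modern_samples, k=2)
```

The distance is scale invariant, so counts and proportions of the same assemblage give the same result.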
Abstract:
Populations of Lesser Scaup (Aythya affinis) have declined markedly in North America since the early 1980s. When considering alternatives for achieving population recovery, it would be useful to understand how the rate of population growth is functionally related to the underlying vital rates and which vital rates affect population growth rate the most if changed (which need not be those that influenced historical population declines). To establish a more quantitative basis for learning about life history and population dynamics of Lesser Scaup, we summarized published and unpublished estimates of vital rates recorded between 1934 and 2005, and developed matrix life-cycle models with these data for females breeding in the boreal forest, prairie-parklands, and both regions combined. We then used perturbation analysis to evaluate the effect of changes in a variety of vital-rate statistics on finite population growth rate and abundance. Similar to Greater Scaup (Aythya marila), our modeled population growth rate for Lesser Scaup was most sensitive to unit and proportional change in adult female survival during the breeding and non-breeding seasons, but much less so to changes in fecundity parameters. Interestingly, population growth rate was also highly sensitive to unit and proportional changes in the mean of nesting success, duckling survival, and juvenile survival. Given the small samples of data for key aspects of the Lesser Scaup life cycle, we recommend additional research on vital rates that demonstrate a strong effect on population growth and size (e.g., adult survival probabilities). Our life-cycle models should be tested and regularly updated in the future to simultaneously guide science and management of Lesser Scaup populations in an adaptive context.
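The perturbation analysis mentioned in this abstract is commonly carried out through the sensitivities and elasticities of the dominant eigenvalue of the projection matrix. The sketch below uses a two-stage matrix with made-up vital rates; these numbers are hypothetical and are not the Lesser Scaup estimates, and the function name is an assumption.

```python
import numpy as np

def growth_rate_sensitivity(A):
    """Return (lambda, sensitivities, elasticities) for projection matrix A.

    Sensitivity s_ij = v_i * w_j / <v, w>, where w and v are the right and
    left eigenvectors of the dominant eigenvalue lambda.
    Elasticity e_ij = (a_ij / lambda) * s_ij.
    """
    A = np.asarray(A, dtype=float)
    values, right = np.linalg.eig(A)
    i = np.argmax(values.real)
    lam = values[i].real
    w = np.abs(right[:, i].real)      # stable stage distribution
    values_left, left = np.linalg.eig(A.T)
    j = np.argmax(values_left.real)
    v = np.abs(left[:, j].real)       # reproductive values
    S = np.outer(v, w) / (v @ w)
    E = A * S / lam
    return lam, S, E

# Hypothetical matrix: row 0 = fecundity, row 1 = survival (juvenile, adult).
A = np.array([[0.0, 0.6],
              [0.4, 0.7]])
lam, S, E = growth_rate_sensitivity(A)
```

Sensitivities measure the effect of a unit change in a matrix entry on the finite growth rate, while elasticities measure the effect of a proportional change and sum to one.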
Abstract:
Two wavelet-based control variable transform schemes are described and are used to model some important features of forecast error statistics for use in variational data assimilation. The first is a conventional wavelet scheme and the other is an approximation of it. Their ability to capture the position and scale-dependent aspects of covariance structures is tested in a two-dimensional latitude-height context. This is done by comparing the covariance structures implied by the wavelet schemes with those found from the explicit forecast error covariance matrix, and with a non-wavelet-based covariance scheme used currently in an operational assimilation scheme. Qualitatively, the wavelet-based schemes show potential at modeling forecast error statistics well without giving preference to either position or scale-dependent aspects. The degree of spectral representation can be controlled by changing the number of spectral bands in the schemes, and the smallest number of bands that achieves adequate results is found for the model domain used. Evidence is found of a trade-off between the localization of features in positional and spectral spaces when the number of bands is changed. By examining implied covariance diagnostics, the wavelet-based schemes are found, on the whole, to give results that are closer to the diagnostics found from the explicit matrix than are those of the non-wavelet scheme. Even though the nature of the covariances has the right qualities in spectral space, variances are found to be too low at some wavenumbers and vertical correlation length scales are found to be too long at most scales. The wavelet schemes are found to be good at resolving variations in position and scale-dependent horizontal length scales, although the length scales reproduced are usually too short. The second of the wavelet-based schemes is often found to be better than the first in some important respects, but, unlike the first, it has no exact inverse transform.
Abstract:
The complexity inherent in climate data makes it necessary to introduce more than one statistical tool to the researcher to gain insight into the climate system. Empirical orthogonal function (EOF) analysis is one of the most widely used methods to analyze weather/climate modes of variability and to reduce the dimensionality of the system. Simple structure rotation of EOFs can enhance interpretability of the obtained patterns but cannot provide anything more than temporal uncorrelatedness. In this paper, an alternative rotation method based on independent component analysis (ICA) is considered. The ICA is viewed here as a method of EOF rotation. Starting from an initial EOF solution, rather than rotating the loadings toward simplicity, ICA seeks a rotation matrix that maximizes the independence between the components in the time domain. If the underlying climate signals have an independent forcing, one can expect to find loadings with interpretable patterns whose time coefficients have properties that go beyond the simple noncorrelation observed in EOFs. The methodology is presented and an application to the monthly mean sea level pressure (SLP) field is discussed. Among the rotated (to independence) EOFs, the North Atlantic Oscillation (NAO) pattern, an Arctic Oscillation–like pattern, and a Scandinavian-like pattern have been identified. There is the suggestion that the NAO is an intrinsic mode of variability independent of the Pacific.
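The initial EOF solution that this abstract takes as the starting point for rotation can be obtained from a singular value decomposition of the anomaly matrix. The sketch below shows only that first step on synthetic data; the ICA rotation itself is not shown, and the function name and array layout (time by space) are assumptions.

```python
import numpy as np

def eof_analysis(field, n_modes):
    """EOF decomposition of a (time, space) data matrix via SVD.

    Returns the spatial patterns (n_modes, space), the principal component
    time series (time, n_modes), and the explained variance fractions.
    """
    anomalies = field - field.mean(axis=0)        # remove the time mean
    U, s, Vt = np.linalg.svd(anomalies, full_matrices=False)
    eofs = Vt[:n_modes]
    pcs = U[:, :n_modes] * s[:n_modes]
    variance_fraction = (s ** 2 / np.sum(s ** 2))[:n_modes]
    return eofs, pcs, variance_fraction

# Synthetic stand-in for a gridded monthly field: 120 months, 30 grid points.
rng = np.random.default_rng(42)
field = rng.standard_normal((120, 30))
eofs, pcs, variance_fraction = eof_analysis(field, n_modes=3)
```

The principal components produced this way are mutually uncorrelated but not necessarily independent, which is the gap the ICA-based rotation is meant to address.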
Abstract:
Cross-hole anisotropic electrical and seismic tomograms of fractured metamorphic rock have been obtained at a test site where extensive hydrological data were available. A strong correlation between electrical resistivity anisotropy and seismic compressional-wave velocity anisotropy has been observed. Analysis of core samples from the site reveals that the shale-rich rocks have fabric-related average velocity anisotropy of between 10% and 30%. The cross-hole seismic data are consistent with these values, indicating that observed anisotropy might be principally due to the inherent rock fabric rather than to the aligned sets of open fractures. One region with velocity anisotropy greater than 30% has been modelled as aligned open fractures within an anisotropic rock matrix and this model is consistent with available fracture density and hydraulic transmissivity data from the boreholes and the cross-hole resistivity tomography data. However, in general the study highlights the uncertainties that can arise, due to the relative influence of rock fabric and fluid-filled fractures, when using geophysical techniques for hydrological investigations.
Abstract:
With its highly fluctuating ion production, matrix-assisted laser desorption/ionization (MALDI) poses many practical challenges for its application in mass spectrometry. Instrument tuning and quantitative ion abundance measurements using ion signal alone depend on a stable ion beam. Liquid MALDI matrices have been shown to be a promising alternative to the commonly used solid matrices. Their application in areas where a stable ion current is essential has been discussed, but only limited data have been provided to demonstrate their practical use and advantages in the formation of stable MALDI ion beams. In this article we present experimental data showing high MALDI ion beam stability over more than two orders of magnitude at high analytical sensitivity (low femtomole amount prepared) for quantitative peptide abundance measurements and instrument tuning in a MALDI Q-TOF mass spectrometer. Samples were deposited on an inexpensive conductive hydrophobic surface and shrunk to droplets <10 nL in size. By using a sample droplet <10 nL it was possible to acquire data from a single irradiated spot for roughly 10,000 shots with little variation in ion signal intensity at a laser repetition rate of 5-20 Hz.
Abstract:
We have combined several key sample preparation steps for the use of a liquid matrix system to provide high analytical sensitivity in automated ultraviolet matrix-assisted laser desorption/ionisation mass spectrometry (UV-MALDI-MS). This new sample preparation protocol employs a matrix-mixture which is based on the glycerol matrix-mixture described by Sze et al. The low-femtomole sensitivity that is achievable with this new preparation protocol enables proteomic analysis of protein digests comparable to solid-state matrix systems. For automated data acquisition and analysis, the MALDI performance of this liquid matrix surpasses the conventional solid-state MALDI matrices. Besides the inherent general advantages of liquid samples for automated sample preparation and data acquisition, the use of the presented liquid matrix significantly reduces the extent of unspecific ion signals in peptide mass fingerprints compared to typically used solid matrices, such as 2,5-dihydroxybenzoic acid (DHB) or alpha-cyano-hydroxycinnamic acid (CHCA). In particular, matrix and low-mass ion signals and ion signals resulting from cation adduct formation are dramatically reduced. Consequently, the confidence level of protein identification by peptide mass mapping of in-solution and in-gel digests is generally higher.
Abstract:
We have combined several key sample preparation steps for the use of a liquid matrix system to provide high analytical sensitivity in automated ultraviolet matrix-assisted laser desorption/ionisation mass spectrometry (UV-MALDI-MS). This new sample preparation protocol employs a matrix-mixture which is based on the glycerol matrix-mixture described by Sze et al. (J. Am. Soc. Mass Spectrom. 1998, 9, 166-174). The low-femtomole sensitivity that is achievable with this new preparation protocol enables proteomic analysis of protein digests comparable to solid-state matrix systems. For automated data acquisition and analysis, the MALDI performance of this liquid matrix surpasses the conventional solid-state MALDI matrices. Besides the inherent general advantages of liquid samples for automated sample preparation and data acquisition, the use of the presented liquid matrix significantly reduces the extent of unspecific ion signals in peptide mass fingerprints compared to typically used solid matrices, such as 2,5-dihydroxybenzoic acid (DHB) or alpha-cyano-hydroxycinnamic acid (CHCA). In particular, matrix and low-mass ion signals and ion signals resulting from cation adduct formation are dramatically reduced. Consequently, the confidence level of protein identification by peptide mass mapping of in-solution and in-gel digests is generally higher.
Abstract:
The molecular structures of NbOBr3, NbSCl3, and NbSBr3 have been determined by gas-phase electron diffraction (GED) at nozzle-tip temperatures of 250 °C, taking into account the possible presence of NbOCl3 as a contaminant in the NbSCl3 sample and NbOBr3 in the NbSBr3 sample. The experimental data are consistent with trigonal-pyramidal molecules having C3v symmetry. Infrared spectra of molecules trapped in argon or nitrogen matrices were recorded and exhibit the characteristic fundamental stretching modes for C3v species. Well resolved isotopic fine structure (35Cl and 37Cl) was observed for NbSCl3, and for NbOCl3 which occurred as an impurity in the NbSCl3 spectra. Quantum mechanical calculations of the structures and vibrational frequencies of the four YNbX3 molecules (Y = O, S; X = Cl, Br) were carried out at several levels of theory, most importantly B3LYP DFT with either the Stuttgart RSC ECP or Hay-Wadt (n + 1) ECP VDZ basis set for Nb and the 6-311G* basis set for the nonmetal atoms. Theoretical values for the bond lengths are 0.01-0.04 Å longer than the experimental ones of type r_a, in accord with general experience, but the bond angles, with theoretical minus experimental differences of only 1.0-1.5°, are notably accurate. Symmetrized force fields were also calculated. The experimental bond lengths (r_g/Å) and angles (∠α/deg) with estimated 2σ uncertainties from GED are as follows. NbOBr3: r(Nb=O) = 1.694(7), r(Nb-Br) = 2.429(2), ∠(O=Nb-Br) = 107.3(5), ∠(Br-Nb-Br) = 111.5(5). NbSBr3: r(Nb=S) = 2.134(10), r(Nb-Br) = 2.408(4), ∠(S=Nb-Br) = 106.6(7), ∠(Br-Nb-Br) = 112.2(6). NbSCl3: r(Nb=S) = 2.120(10), r(Nb-Cl) = 2.271(6), ∠(S=Nb-Cl) = 107.8(12), ∠(Cl-Nb-Cl) = 111.1(11).
Abstract:
Event-related functional magnetic resonance imaging (efMRI) has emerged as a powerful technique for detecting brains' responses to presented stimuli. A primary goal in efMRI data analysis is to estimate the Hemodynamic Response Function (HRF) and to locate activated regions in human brains when specific tasks are performed. This paper develops new methodologies that are important improvements not only to parametric but also to nonparametric estimation and hypothesis testing of the HRF. First, an effective and computationally fast scheme for estimating the error covariance matrix for efMRI is proposed. Second, methodologies for estimation and hypothesis testing of the HRF are developed. Simulations support the effectiveness of our proposed methods. When applied to an efMRI dataset from an emotional control study, our method reveals more meaningful findings than the popular methods offered by AFNI and FSL. (C) 2008 Elsevier B.V. All rights reserved.
Abstract:
We propose a unified data modeling approach that is equally applicable to supervised regression and classification applications, as well as to unsupervised probability density function estimation. A particle swarm optimization (PSO) aided orthogonal forward regression (OFR) algorithm based on leave-one-out (LOO) criteria is developed to construct parsimonious radial basis function (RBF) networks with tunable nodes. Each stage of the construction process determines the center vector and diagonal covariance matrix of one RBF node by minimizing the LOO statistics. For regression applications, the LOO criterion is chosen to be the LOO mean square error, while the LOO misclassification rate is adopted in two-class classification applications. By adopting the Parzen window estimate as the desired response, the unsupervised density estimation problem is transformed into a constrained regression problem. This PSO aided OFR algorithm for tunable-node RBF networks is capable of constructing very parsimonious RBF models that generalize well, and our analysis and experimental results demonstrate that the algorithm is computationally even simpler than the efficient regularization assisted orthogonal least square algorithm based on LOO criteria for selecting fixed-node RBF models. Another significant advantage of the proposed learning procedure is that it does not have learning hyperparameters that have to be tuned using costly cross validation. The effectiveness of the proposed PSO aided OFR construction procedure is illustrated using several examples taken from regression and classification, as well as density estimation applications.
Abstract:
Objectives: Our objective was to test the performance of CA125 in classifying serum samples from a cohort of malignant and benign ovarian cancers and age-matched healthy controls and to assess whether combining information from matrix-assisted laser desorption/ionization (MALDI) time-of-flight profiling could improve diagnostic performance. Materials and Methods: Serum samples from women with ovarian neoplasms and healthy volunteers were subjected to CA125 assay and MALDI time-of-flight mass spectrometry (MS) profiling. Models were built from training data sets using discriminatory MALDI MS peaks in combination with CA125 values and tested for their ability to classify blinded test samples. These were compared with models using CA125 threshold levels from 193 patients with ovarian cancer, 290 with benign neoplasm, and 2236 postmenopausal healthy controls. Results: Using a CA125 cutoff of 30 U/mL, an overall sensitivity of 94.8% (96.6% specificity) was obtained when comparing malignancies versus healthy postmenopausal controls, whereas a cutoff of 65 U/mL provided a sensitivity of 83.9% (99.6% specificity). High classification accuracies were obtained for early-stage cancers (93.5% sensitivity). Reasons for high accuracies include recruitment bias, restriction to postmenopausal women, and inclusion of only primary invasive epithelial ovarian cancer cases. The combination of MS profiling information with CA125 did not significantly improve the specificity/accuracy compared with classifications on the basis of CA125 alone. Conclusions: We report unexpectedly good performance of serum CA125 using threshold classification in discriminating healthy controls and women with benign masses from those with invasive ovarian cancer. This highlights the dependence of diagnostic tests on the characteristics of the study population and the crucial need for authors to provide sufficient relevant details to allow comparison. Our study also shows that MS profiling information adds little to diagnostic accuracy. This finding is in contrast with other reports and shows the limitations of serum MS profiling for biomarker discovery and as a diagnostic tool.
Abstract:
This paper introduces a method for simulating multivariate samples that have exact means, covariances, skewness and kurtosis. We introduce a new class of rectangular orthogonal matrix which is fundamental to the methodology and we call these matrices L matrices. They may be deterministic, parametric or data specific in nature. The target moments determine the L matrix; infinitely many random samples with the same exact moments may then be generated by multiplying the L matrix by arbitrary random orthogonal matrices. This methodology is thus termed “ROM simulation”. Considering certain elementary types of random orthogonal matrices, we demonstrate that they generate samples with different characteristics. ROM simulation has applications to many problems that are resolved using standard Monte Carlo methods. However, no parametric assumptions are required (unless parametric L matrices are used), so there is no sampling error caused by the discrete approximation of a continuous distribution, which is a major source of error in standard Monte Carlo simulations. For illustration, we apply ROM simulation to determine the value-at-risk of a stock portfolio.
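The basic mechanism described in this abstract, a rectangular matrix with orthonormal columns multiplied by a random orthogonal matrix, can be illustrated for the first two moments only. The sketch below is not the paper's construction: the rectangular matrix is built here from a QR factorisation of centred Gaussian noise (an assumption), skewness and kurtosis are not targeted, and all names and numbers are hypothetical.

```python
import numpy as np

def random_orthogonal(n, rng):
    """Random n x n orthogonal matrix from the QR factorisation of noise."""
    Q, R = np.linalg.qr(rng.standard_normal((n, n)))
    return Q * np.sign(np.diag(R))

def exact_moment_sample(mu, cov, m, rng):
    """Draw m observations whose sample mean and covariance equal mu and cov.

    Requires m > len(mu) and a positive definite cov. The covariance is
    matched under the usual (m - 1) normalisation.
    """
    mu = np.asarray(mu, dtype=float)
    cov = np.asarray(cov, dtype=float)
    n = mu.size
    Z = rng.standard_normal((m, n))
    Z -= Z.mean(axis=0)                  # columns now sum to zero
    L, _ = np.linalg.qr(Z)               # m x n, L.T @ L = I, columns sum to zero
    R = random_orthogonal(n, rng)        # arbitrary rotation keeps the moments
    A = np.linalg.cholesky(cov).T        # A.T @ A = cov
    return mu + np.sqrt(m - 1) * L @ R @ A

# Hypothetical target moments for three assets.
target_mean = np.array([0.01, 0.02, 0.00])
target_cov = np.array([[0.04, 0.01, 0.00],
                       [0.01, 0.09, 0.02],
                       [0.00, 0.02, 0.16]])
rng = np.random.default_rng(1)
sample = exact_moment_sample(target_mean, target_cov, m=50, rng=rng)
```

Each new random orthogonal matrix yields a different sample with the same mean vector and covariance matrix, which is why the moments carry no sampling error.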
Abstract:
The background error covariance matrix, B, is often used in variational data assimilation for numerical weather prediction as a static and hence poor approximation to the fully dynamic forecast error covariance matrix, Pf. In this paper the concept of an Ensemble Reduced Rank Kalman Filter (EnRRKF) is outlined. In the EnRRKF the forecast error statistics in a subspace defined by an ensemble of states forecast by the dynamic model are found. These statistics are merged in a formal way with the static statistics, which apply in the remainder of the space. The combined statistics may then be used in a variational data assimilation setting. It is hoped that the nonlinear error growth of small-scale weather systems will be accurately captured by the EnRRKF, to produce accurate analyses and ultimately improved forecasts of extreme events.