959 results for Statistical parameters
Abstract:
The discrete-time Markov chain is commonly used to describe changes of health states for chronic diseases in a longitudinal study. Statistical inferences on comparing treatment effects or on finding determinants of disease progression usually require estimation of transition probabilities. In many situations, when the outcome data have some missing observations or the variable of interest (called a latent variable) cannot be measured directly, the estimation of transition probabilities becomes more complicated. In the latter case, a surrogate variable that is easier to access and can gauge the characteristics of the latent one is usually used for data analysis. This dissertation research proposes methods to analyze longitudinal data (1) that have a categorical outcome with missing observations or (2) that use complete or incomplete surrogate observations to analyze the categorical latent outcome. For (1), different missing mechanisms were considered for empirical studies using methods that include the EM algorithm, Monte Carlo EM and a procedure that is not a data augmentation method. For (2), the hidden Markov model with the forward-backward procedure was applied for parameter estimation. This method was also extended to cover the computation of standard errors. The proposed methods were demonstrated with the schizophrenia example. The relevance to public health, the strengths and limitations, and possible future research were also discussed.
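As a rough illustration of the forward half of the forward-backward procedure mentioned in this abstract, the sketch below computes the likelihood of an observed surrogate sequence under a discrete hidden Markov model. The function name, the two-state setup and all probabilities are hypothetical values chosen for illustration; they are not taken from the dissertation.

```python
def forward_likelihood(init, trans, emit, observations):
    """Return P(observations) under a discrete hidden Markov model.

    init[i]     : probability of starting in hidden state i
    trans[i][j] : probability of moving from hidden state i to j
    emit[i][o]  : probability of observing symbol o in hidden state i
    """
    n_states = len(init)
    # alpha[i] = P(observations so far, hidden state i at the current time)
    alpha = [init[i] * emit[i][observations[0]] for i in range(n_states)]
    for obs in observations[1:]:
        alpha = [
            emit[j][obs] * sum(alpha[i] * trans[i][j] for i in range(n_states))
            for j in range(n_states)
        ]
    return sum(alpha)


# Hypothetical two-state model (illustrative numbers only).
init = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
emit = [[0.9, 0.1], [0.2, 0.8]]
likelihood = forward_likelihood(init, trans, emit, [0, 1, 0])
```

In a full estimation routine this recursion would be paired with a backward pass and would usually be rescaled or run in log space to avoid underflow on long sequences.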
Abstract:
Mineralogic, petrographic, and geochemical analyses of sediments recovered from two Leg 166 Ocean Drilling Program cores on the western slope of Great Bahama Bank (308 m and 437 m water depth) are used to characterize early marine diagenesis of these shallow-water, periplatform carbonates. The most pronounced diagenetic products are well-lithified intervals found almost exclusively in glacial lowstand deposits and interpreted to have formed at or near the seafloor (i.e., hardgrounds). Hardground cements are composed of high-Mg calcite (~14 mol% MgCO3), and exhibit textures typically associated with seafloor cementation. Geochemically, hardgrounds are characterized by increased δ18O and Mg contents and decreased δ13C, Sr, and Na contents relative to their less lithified counterparts. Despite being deposited in shallow waters that are supersaturated with the common carbonate minerals, it is clear that these sediments are also undergoing shallow subsurface diagenesis. Calculation of saturation states shows that pore waters become undersaturated with aragonite within the upper 10 m at both sites. Dissolution, and likely recrystallization, of metastable carbonates is manifested by increases in interstitial water Sr and Sr/Ca profiles with depth. We infer that the reduction in mineral saturation states and subsequent dissolution are being driven by the oxidation of organic matter in this Fe-poor carbonate system. Precipitation of burial diagenetic phases is indicated by the down-core appearance of dolomite and corresponding decrease in interstitial water Mg, and the presence of low-Mg calcite cements observed in scanning electron microscope photomicrographs.
Abstract:
The purpose of this study was (1) to attempt to find a stratification of the snowpack in order to help remove ambiguities in dating the snow layers by standard methods, and (2) to verify the depth at which the transition between firn and ice occurs. Clearly the first goal was missed, the structural information in a temperate firn being strongly smoothed out in time. Interesting details such as horizontal ice lenses and layers of "cold snow" were revealed, however. In spite of strong variations of density, the gravimetric density ρG and the ice density ρI, computed from point density, are identical for the firn pack between Z = 2.0 m and 6.0 m: ρ(ice) = (0.522 ± 0.034) × 10³ kg/m³. The ice density of 0.8 × 10³ kg/m³, the assumed transition between firn and ice, was found to occur at a depth of Z = 19 m. Even at this level, rather important variations in density may be localized. Between Z = 19 m and 21 m, the ice density varies from 0.774 × 10³ to 0.860 × 10³ kg/m³.
Abstract:
Parameters in the photosynthesis-irradiance (P-E) relationship of phytoplankton were measured at weekly to bi-weekly intervals for 20 yr at 6 stations on the Rhode River, Maryland (USA). Variability in the light-saturated photosynthetic rate, PBmax, was partitioned into interannual, seasonal, and spatial components. The seasonal component of the variance was greatest, followed by interannual and then spatial. Physiological models of PBmax based on balanced growth or photoacclimation predicted the overall mean and most of the range, but not individual observations, and failed to capture important features of the seasonal and interannual variability. PBmax correlated most strongly with temperature and the concentration of dissolved inorganic carbon (IC), with lesser correlations with chlorophyll a, diffuse attenuation coefficient, and a principal component of the species composition. In statistical models, temperature and IC correlated best with the seasonal pattern, but temperature peaked in late July, out of phase with PBmax, which peaked in September, coincident with the maximum in monthly averaged IC concentration. In contrast with the seasonal pattern, temperature did not contribute to interannual variation, which instead was governed by IC and the additional lesser correlates. Spatial variation was relatively weak and uncorrelated with ancillary measurements. The results demonstrate that both the overall distribution of PBmax and its relationship with environmental correlates may vary from year to year. Coefficients in empirical statistical models became stable after including 7 to 10 yr of data. The main correlates of PBmax are amenable to automated monitoring, so that future estimates of primary production might be made without labor-intensive incubations.
Abstract:
Pragmatism is the leading motivation of regularization. We can understand regularization as a modification of the maximum-likelihood estimator so that a reasonable answer can be given in an unstable or ill-posed situation. To mention some typical examples, this happens when fitting parametric or non-parametric models with more parameters than data or when estimating large covariance matrices. Regularization is also commonly used to improve the bias-variance tradeoff of an estimation. Thus, the definition of regularization is quite general, and, although the introduction of a penalty is probably the most popular type, it is just one out of multiple forms of regularization. In this dissertation, we focus on the applications of regularization for obtaining sparse or parsimonious representations, where only a subset of the inputs is used. A particular form of regularization, L1-regularization, plays a key role in reaching sparsity. Most of the contributions presented here revolve around L1-regularization, although other forms of regularization are explored (also pursuing sparsity in some sense). In addition to presenting a compact review of L1-regularization and its applications in statistical and machine learning, we devise methodology for regression, supervised classification and structure induction of graphical models. Within the regression paradigm, we focus on kernel smoothing learning, proposing techniques for kernel design that are suitable for high dimensional settings and sparse regression functions. We also present an application of regularized regression techniques for modeling the response of biological neurons. Supervised classification advances deal, on the one hand, with the application of regularization for obtaining a naïve Bayes classifier and, on the other hand, with a novel algorithm for brain-computer interface design that uses group regularization in an efficient manner. 
Finally, we present a heuristic for inducing structures of Gaussian Bayesian networks using L1-regularization as a filter.
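To make concrete why L1-regularization yields sparse representations, the sketch below shows the soft-thresholding operator. For a least-squares problem with an orthonormal design and an L1 penalty, the penalized solution is obtained by applying this operator to each unpenalized coefficient. This is a standard textbook result used here for illustration, not a method taken from the dissertation; the coefficient values and penalty are hypothetical.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; small values become exactly 0."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


# Hypothetical unpenalized coefficients (illustrative numbers only).
coefficients = [3.0, -0.4, 0.9, -2.5]
sparse = [soft_threshold(c, 1.0) for c in coefficients]
```

Coefficients whose magnitude does not exceed the penalty are set exactly to zero, which is how only a subset of the inputs ends up being used.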
Abstract:
Alzheimer's disease (AD) is the most common cause of dementia. Over the last few years, a considerable effort has been devoted to exploring new biomarkers. Nevertheless, a better understanding of brain dynamics is still required to optimize therapeutic strategies. In this regard, the characterization of mild cognitive impairment (MCI) is crucial, due to the high conversion rate from MCI to AD. However, only a few studies have focused on the analysis of magnetoencephalographic (MEG) rhythms to characterize AD and MCI. In this study, we assess the ability of several parameters derived from information theory to describe spontaneous MEG activity from 36 AD patients, 18 MCI subjects and 26 controls. Three entropies (Shannon, Tsallis and Rényi entropies), one disequilibrium measure (based on Euclidean distance ED) and three statistical complexities (based on Lopez Ruiz–Mancini–Calbet complexity LMC) were used to estimate the irregularity and statistical complexity of MEG activity. Statistically significant differences between AD patients and controls were obtained with all parameters (p < 0.01). In addition, statistically significant differences between MCI subjects and controls were achieved by ED and LMC (p < 0.05). In order to assess the diagnostic ability of the parameters, a linear discriminant analysis with a leave-one-out cross-validation procedure was applied. The accuracies reached 83.9% and 65.9% to discriminate AD and MCI subjects from controls, respectively. Our findings suggest that MCI subjects exhibit an intermediate pattern of abnormalities between normal aging and AD. Furthermore, the proposed parameters provide a new description of brain dynamics in AD and MCI.
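For readers unfamiliar with the three entropies named in this abstract, the sketch below gives their standard definitions for a discrete probability distribution. The abstract does not state how the distribution is derived from the MEG signal, so the input used here is a hypothetical uniform distribution, and the function names and the entropic index `q` are illustrative choices.

```python
import math


def shannon_entropy(p):
    """Shannon entropy (natural log) of a discrete distribution p."""
    return -sum(x * math.log(x) for x in p if x > 0)


def tsallis_entropy(p, q):
    """Tsallis entropy of order q; reduces to Shannon entropy as q -> 1."""
    if q == 1:
        return shannon_entropy(p)
    return (1.0 - sum(x ** q for x in p if x > 0)) / (q - 1.0)


def renyi_entropy(p, q):
    """Renyi entropy of order q; reduces to Shannon entropy as q -> 1."""
    if q == 1:
        return shannon_entropy(p)
    return math.log(sum(x ** q for x in p if x > 0)) / (1.0 - q)


# Hypothetical distribution: four equally likely outcomes.
uniform = [0.25, 0.25, 0.25, 0.25]
h_shannon = shannon_entropy(uniform)
h_tsallis = tsallis_entropy(uniform, 2)
h_renyi = renyi_entropy(uniform, 2)
```

For a uniform distribution the Shannon and Rényi entropies coincide at the logarithm of the number of outcomes, which gives a simple sanity check for any implementation.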
Abstract:
A reliability analysis method is proposed that starts with the identification of all variables involved. These are divided into five groups: (a) variables fixed by codes, such as loads and strength project values, and their corresponding partial safety coefficients, (b) geometric variables defining the dimensions of the main elements involved, (c) the cost variables, including the possible damages caused by failure, (d) the random variables, such as loads, strength, etc., and (e) the variables defining the statistical model, such as the family of distributions and its corresponding parameters. Once the variables are known, the Π-theorem is used to obtain a minimum equivalent set of non-dimensional variables, which is used to define the limit states. This allows a reduction in the number of variables involved and a better understanding of their coupling effects. Two minimum cost criteria are used for selecting the project dimensions. One is based on a bounded probability of failure, and the other on a total cost, including the damages of the possible failure. Finally, the method is illustrated by means of an application.
Abstract:
In Operational Modal Analysis (OMA) of a structure, the data acquisition process may be repeated many times. In these cases, the analyst has several similar records for the modal analysis of the structure that have been obtained at different time instants (multiple records). The solution obtained varies from one record to another, sometimes considerably. The differences are due to several reasons: statistical errors of estimation, changes in the external forces (unmeasured forces) that modify the output spectra, appearance of spurious modes, etc. Combining the results of the different individual analyses is not straightforward. To solve the problem, we propose to make the joint estimation of the parameters using all the records. This can be done in a very simple way using state space models and computing the estimates by maximum likelihood. The method provides a single result for the modal parameters that combines optimally all the records.
Abstract:
Computing the modal parameters of large structures in Operational Modal Analysis often requires processing data from multiple, non-simultaneously recorded setups of sensors. These setups share some sensors in common, the so-called reference sensors that are fixed for all the measurements, while the other sensors are moved from one setup to the next. One possibility is to process the setups separately, which results in different modal parameter estimates for each setup. Then the reference sensors are used to merge or glue the different parts of the mode shapes to obtain global modes, while the natural frequencies and damping ratios are usually averaged. In this paper we present a state space model that can be used to process all setups at once, so that the global mode shapes are obtained automatically and only a single value for the natural frequency and damping ratio of each mode is computed. We also present how this model can be estimated using maximum likelihood and the Expectation Maximization algorithm. We apply this technique to real data measured at a footbridge.
Abstract:
The helix-coil transition equilibrium of polypeptides in aqueous solution was studied by molecular dynamics simulation. The peptide growth simulation method was introduced to generate dynamic models of polypeptide chains in a statistical (random) coil or an alpha-helical conformation. The key element of this method is to build up a polypeptide chain during the course of a molecular transformation simulation, successively adding whole amino acid residues to the chain in a predefined conformation state (e.g., alpha-helical or statistical coil). Thus, oligopeptides of the same length and composition, but having different conformations, can be incrementally grown from a common precursor, and their relative conformational free energies can be calculated as the difference between the free energies for growing the individual peptides. This affords a straightforward calculation of the Zimm-Bragg sigma and s parameters for helix initiation and helix growth. The calculated sigma and s parameters for the polyalanine alpha-helix are in reasonable agreement with the experimental measurements. The peptide growth simulation method is an effective way to study quantitatively the thermodynamics of local protein folding.
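The link between a computed free-energy difference and an equilibrium parameter can be sketched with the standard thermodynamic relation K = exp(-ΔG / RT). In the Zimm-Bragg model, s is conventionally interpreted as the equilibrium constant for adding one helical residue to an existing helix, so a growth free energy would map to s in this way. The abstract does not give the exact expressions the authors used, and the free-energy value and temperature below are hypothetical.

```python
import math

GAS_CONSTANT = 8.314462618e-3  # kJ / (mol K)


def equilibrium_constant(delta_g_kj_per_mol, temperature_k):
    """Convert a free-energy difference into an equilibrium constant."""
    return math.exp(-delta_g_kj_per_mol / (GAS_CONSTANT * temperature_k))


# Hypothetical free energy for adding one helical residue (illustrative only).
s_parameter = equilibrium_constant(-0.5, 300.0)
```

A negative free-energy difference gives a value above 1, meaning helix growth is favored under these assumed conditions; a positive one gives a value below 1.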
Abstract:
Commercial explosives behave non-ideally in rock blasting. A direct and convenient measure of non-ideality is the detonation velocity. In this study, an alternative model fitted to experimental unconfined detonation velocity data is proposed and the effect of confinement on the detonation velocity is modelled. Unconfined data of several explosives showing various levels of non-ideality were successfully modelled. The effect of confinement on detonation velocity was modelled empirically based on field detonation velocity measurements. Confined detonation velocity is a function of the ideal detonation velocity, the unconfined detonation velocity at a given blasthole diameter, and rock stiffness. For a given explosive and charge diameter, detonation velocity increases as confinement increases. The confinement model is implemented in a simple engineering-based non-ideal detonation model. A number of simulations are carried out and analysed to predict the explosive performance parameters for the adopted blasting conditions.
Abstract:
Grass pollen is an important risk factor for allergic rhinitis and asthma in Australia and is the most prevalent pollen component of the aerospora of Brisbane, accounting for 71.6% of the annual airborne pollen load. A 5-year (June 1994-May 1999) monitoring program shows the grass pollen season to occur during the summer and autumn months (December-April); however, the timing of onset and intensity of the season vary from year to year. During the pollen season, Poaceae counts exceeding 30 grains m⁻³ were recorded on 244 days and coincided with maximum temperatures of 28.1 ± 2.0 °C. In this study, statistical associations between atmospheric grass pollen loads and several weather parameters, including maximum temperature, minimum temperature and precipitation, were investigated. Spearman's correlation analysis demonstrated that daily grass pollen counts were positively associated (P < 0.0001) with maximum and minimum temperature during each sampling year. Precipitation, although considered a less important daily factor (P < 0.05), was observed to remove pollen grains from the atmosphere during significant periods of rainfall. This study provides the first insight into the influence of meteorological variables, in particular temperature, on atmospheric Poaceae pollen counts in Brisbane. An awareness of these associations is critical for the prevention and management of allergy and asthma for atopic individuals within this region.
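Spearman's correlation, the statistic used in this abstract, is the Pearson correlation computed on ranks rather than raw values. The sketch below implements it with average ranks for ties. The temperature and pollen values are made-up illustrative numbers, not data from the study, and the function names are hypothetical.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    spread_x = sum((a - mean_x) ** 2 for a in rx) ** 0.5
    spread_y = sum((b - mean_y) ** 2 for b in ry) ** 0.5
    return cov / (spread_x * spread_y)


# Made-up daily values for illustration only.
max_temperature_c = [24.0, 26.5, 27.0, 29.5, 31.0]
pollen_grains_m3 = [5, 12, 18, 40, 75]
rho = spearman_rho(max_temperature_c, pollen_grains_m3)
```

Because only ranks matter, any strictly increasing relationship gives a coefficient of 1 regardless of how nonlinear it is, which suits skewed count data such as daily pollen loads.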
Abstract:
Statistical tests of Load-Unload Response Ratio (LURR) signals are carried out in order to verify the statistical robustness of the previous studies using the Lattice Solid Model (MORA et al., 2002b). In each case, 24 groups of samples with the same macroscopic parameters (tidal perturbation amplitude A, period T and tectonic loading rate k) but different particle arrangements are employed. Results of uni-axial compression experiments show that before the normalized time of catastrophic failure, the ensemble average LURR value rises significantly, in agreement with the observations of high LURR prior to large earthquakes. In shearing tests, two parameters are found to control the correlation between earthquake occurrence and tidal stress. One is A/(kT), which controls the phase shift between the peak seismicity rate and the peak amplitude of the perturbation stress. With an increase of this parameter, the phase shift is found to decrease. The other parameter, AT/k, controls the height of the probability density function (Pdf) of modeled seismicity. As this parameter increases, the Pdf becomes sharper and narrower, indicating strong triggering. Statistical studies of LURR signals in shearing tests also suggest that, except in strong triggering cases, where LURR cannot be calculated due to poor data in unloading cycles, the larger events are more likely to occur in higher LURR periods than the smaller ones, supporting the LURR hypothesis.