995 results for Statistical bias
Abstract:
Increasingly, regression models are used when residuals are spatially correlated. Prominent examples include studies in environmental epidemiology to understand the chronic health effects of pollutants. I consider the effects of residual spatial structure on the bias and precision of regression coefficients, developing a simple framework in which to understand the key issues and derive informative analytic results. When the spatial residual is induced by an unmeasured confounder, regression models with spatial random effects and closely-related models such as kriging and penalized splines are biased, even when the residual variance components are known. Analytic and simulation results show how the bias depends on the spatial scales of the covariate and the residual; bias is reduced only when there is variation in the covariate at a scale smaller than the scale of the unmeasured confounding. I also discuss how the scales of the residual and the covariate affect efficiency and uncertainty estimation when the residuals can be considered independent of the covariate. In an application on the association between black carbon particulate matter air pollution and birth weight, controlling for large-scale spatial variation appears to reduce bias from unmeasured confounders, while increasing uncertainty in the estimated pollution effect.
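To make the scale argument concrete, here is a minimal simulation sketch (not from the paper; a 1-D toy in which a ridge-penalized polynomial trend stands in for a spatial random effect, and the covariate and confounder scales are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
s = np.sort(rng.uniform(0.0, 1.0, n))      # 1-D "spatial" locations

conf = np.sin(2 * np.pi * s)               # unmeasured smooth (large-scale) confounder
beta = 1.0                                 # true covariate effect

# polynomial basis standing in for a penalized spline / spatial random effect
S = np.vander(s, 9, increasing=True)

def fit_beta(x, y, lam=10.0):
    """Penalized least squares: spatial coefficients are shrunk (ridge-style),
    mimicking a spatial random effect; the covariate itself is unpenalized."""
    X = np.column_stack([x, S])
    pen = np.diag([0.0, 0.0] + [lam] * (S.shape[1] - 1))  # no penalty on x, intercept
    return np.linalg.solve(X.T @ X + pen, X.T @ y)[0]

for label, fine in (("large-scale covariate", 0.0), ("fine-scale covariate", 1.0)):
    x = np.sin(2 * np.pi * s + 0.7) + fine * rng.normal(size=n)
    y = beta * x + conf + rng.normal(scale=0.5, size=n)
    print(f"{label}: beta_hat = {fit_beta(x, y):+.3f} (truth = {beta})")
```

In this toy, the shrunken spatial term under-adjusts when the covariate varies only at the confounder's scale, so the covariate absorbs part of the confounding; variation at a finer scale is what allows the coefficient to be recovered.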
Abstract:
Exposimeters are increasingly applied in bioelectromagnetic research to determine personal radiofrequency electromagnetic field (RF-EMF) exposure. The main advantages of exposimeter measurements are their convenient handling for study participants and the large amount of personal exposure data, which can be obtained for several RF-EMF sources. However, the large proportion of measurements below the detection limit is a challenge for data analysis. With the robust ROS (regression on order statistics) method, summary statistics can be calculated by fitting an assumed distribution to the observed data. We used a preliminary sample of 109 weekly exposimeter measurements from the QUALIFEX study to compare summary statistics computed by robust ROS with a naïve approach, where values below the detection limit were replaced by the value of the detection limit. For the total RF-EMF exposure, differences between the naïve approach and the robust ROS were moderate for the 90th percentile and the arithmetic mean. However, exposure contributions from minor RF-EMF sources were considerably overestimated with the naïve approach. This results in an underestimation of the exposure range in the population, which may bias the evaluation of potential exposure-response associations. We conclude from our analyses that summary statistics of exposimeter data calculated by robust ROS are more reliable and more informative than estimates based on a naïve approach. Nevertheless, estimates of source-specific medians or even lower percentiles depend on the assumed data distribution and should be considered with caution.
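As a rough illustration of the comparison described here (hypothetical lognormal exposures, a single detection limit, and Blom plotting positions; real exposimeter data have multiple limits), a minimal robust-ROS sketch next to the naive substitution:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
true = rng.lognormal(mean=-1.0, sigma=1.0, size=200)   # simulated exposures
dl = 0.2                                               # single detection limit
n = true.size
n_cens = int(np.sum(true < dl))                        # number of non-detects
detected = np.sort(true[true >= dl])

# naive approach: replace non-detects by the detection limit itself
naive = np.where(true < dl, dl, true)

# robust ROS: regress log(detects) on normal scores at their plotting
# positions, then impute the censored values from the fitted line
pp_det = (np.arange(n_cens, n) + 1 - 0.375) / (n + 0.25)    # Blom positions
slope, intercept, *_ = stats.linregress(stats.norm.ppf(pp_det), np.log(detected))
pp_cens = (np.arange(n_cens) + 1 - 0.375) / (n + 0.25)
imputed = np.exp(intercept + slope * stats.norm.ppf(pp_cens))
ros = np.concatenate([imputed, detected])

for name, x in (("naive", naive), ("robust ROS", ros), ("true", true)):
    print(f"{name:>10}: mean={x.mean():.3f}  p90={np.percentile(x, 90):.3f}")
```

The heavier the censoring, the more the naive mean is pulled up toward the detection limit, which is the overestimation of minor sources the abstract describes.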
Abstract:
OBJECTIVES: The STAndards for Reporting studies of Diagnostic accuracy (STARD) for investigators and editors and the Quality Assessment of Diagnostic Accuracy Studies (QUADAS) for reviewers and readers offer guidelines for the quality and reporting of test accuracy studies. These guidelines address and propose some solutions to two major threats to validity: spectrum bias and test review bias. STUDY DESIGN AND SETTING: Using a clinical example, we demonstrate that these solutions fail and propose an alternative solution that concomitantly addresses both sources of bias. We also derive formulas that prove the generality of our arguments. RESULTS: A logical extension of our ideas is to extend STARD item 23 by adding a requirement for multivariable statistical adjustment using information collected in QUADAS items 1, 2, and 12 and STARD items 3-5, 11, 15, and 18. CONCLUSION: We recommend reporting not only variation of diagnostic accuracy across subgroups (STARD item 23) but also the effects of the multivariable adjustments on test performance. We also suggest that the QUADAS be supplemented by an item addressing the appropriateness of statistical methods, in particular whether multivariable adjustments have been included in the analysis.
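As a hedged sketch of what such a multivariable adjustment could look like (hypothetical data and coefficients; a logistic model of the test result on disease status, a spectrum covariate, and their interaction), covariate-specific sensitivity falls out of the fitted model:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 2000
disease = rng.binomial(1, 0.3, n)
severity = rng.normal(size=n) + 0.8 * disease        # spectrum covariate

# the test is more often positive in severe disease: a spectrum effect
logit_p = -2.0 + 2.5 * disease + 1.0 * severity * disease
test_pos = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

# multivariable adjustment: regress the test result on disease status,
# the covariate, and their interaction
X = sm.add_constant(np.column_stack([disease, severity, disease * severity]))
fit = sm.Logit(test_pos, X).fit(disp=False)

# covariate-specific sensitivity: P(test positive | diseased, severity = s)
for s in (-1.0, 0.0, 1.0):
    eta = fit.params @ np.array([1.0, 1.0, s, s])
    print(f"severity={s:+.1f}: adjusted sensitivity = {1 / (1 + np.exp(-eta)):.2f}")
```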
Abstract:
PURPOSE Segmentation of the proximal femur in digital antero-posterior (AP) pelvic radiographs is required to create a three-dimensional model of the hip joint for use in planning and treatment. However, manually extracting the femoral contour is tedious and prone to subjective bias, while automatic segmentation must accommodate poor image quality, anatomical structure overlap, and femur deformity. A new method was developed for femur segmentation in AP pelvic radiographs. METHODS Using manual annotations on 100 AP pelvic radiographs, a statistical shape model (SSM) and a statistical appearance model (SAM) of the femur contour were constructed. The SSM and SAM were used to segment new AP pelvic radiographs with a three-stage approach. At initialization, the mean SSM model is coarsely registered to the femur in the AP radiograph through a scaled rigid registration. The Mahalanobis distance defined on the SAM is employed as the search criterion for each suggested landmark location, and dynamic programming is used to eliminate ambiguities. After all landmarks are assigned, a regularized non-rigid registration deforms the current mean shape of the SSM to produce a new segmentation of the proximal femur. The second and third stages are iterated to convergence. RESULTS A set of 100 clinical AP pelvic radiographs (not used for training) was evaluated. The mean segmentation error was [Formula: see text], requiring [Formula: see text] s per case when implemented in Matlab. The influence of the initialization on segmentation results was tested by six clinicians, demonstrating no significant difference. CONCLUSIONS A fast, robust and accurate method for femur segmentation in digital AP pelvic radiographs was developed by combining the SSM and SAM with dynamic programming. This method can be extended to the segmentation of other bony structures such as the pelvis.
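The Mahalanobis search step can be pictured with a small sketch (hypothetical profiles and model parameters, not the authors' code): for a single landmark, each candidate intensity profile is scored against the SAM's mean and covariance, and the cheapest candidate is proposed before dynamic programming resolves ambiguities across landmarks.

```python
import numpy as np

def best_candidate(profiles, mean_profile, cov):
    """Score candidate intensity profiles for one landmark by squared
    Mahalanobis distance to the appearance model; return the best index."""
    d = profiles - mean_profile
    cov_inv = np.linalg.inv(cov)
    costs = np.einsum("ij,jk,ik->i", d, cov_inv, d)
    return int(np.argmin(costs)), costs

rng = np.random.default_rng(8)
mean_profile = rng.normal(size=7)                    # learned SAM mean profile
cov = 0.1 * np.eye(7)                                # learned SAM covariance
candidates = mean_profile + rng.normal(0.0, 1.0, (20, 7))
idx, costs = best_candidate(candidates, mean_profile, cov)
print(f"chosen candidate: {idx}, cost: {costs[idx]:.2f}")
```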
Abstract:
Complex diseases such as cancer result from multiple genetic changes and environmental exposures. Owing to the rapid development of genotyping and sequencing technologies, we are now able to assess the causal effects of many genetic and environmental factors more accurately. Genome-wide association studies have localized many causal genetic variants predisposing to certain diseases. However, these studies explain only a small portion of the heritability of these diseases. More advanced statistical models are urgently needed to identify and characterize additional genetic and environmental factors and their interactions, which will enable us to better understand the causes of complex diseases. In the past decade, thanks to increasing computational capabilities and novel statistical developments, Bayesian methods have been widely applied in genetics and genomics research, demonstrating advantages over some standard approaches in certain research areas. Gene-environment and gene-gene interaction studies are among the areas where Bayesian methods can fully exploit their strengths. This dissertation focuses on developing new Bayesian statistical methods for analyzing data with complex gene-environment and gene-gene interactions, and on extending existing methods for gene-environment interactions to other related areas. It includes three parts: (1) deriving a Bayesian variable selection framework for hierarchical gene-environment and gene-gene interactions; (2) developing Bayesian Natural and Orthogonal Interaction (NOIA) models for gene-environment interactions; and (3) extending two Bayesian statistical methods developed for gene-environment interaction studies to related problems such as adaptively borrowing historical data. We propose a Bayesian hierarchical mixture model framework that allows us to investigate genetic and environmental main effects, gene-gene interactions (epistasis), and gene-environment interactions in the same model. It is well known that, in many practical situations, there is a natural hierarchical structure between the main effects and interactions in a linear model. Here we propose a model that incorporates this hierarchical structure into the Bayesian mixture model, so that irrelevant interaction effects can be removed more efficiently, resulting in more robust, parsimonious, and powerful models. We evaluate both 'strong hierarchical' and 'weak hierarchical' models, which require that both or at least one of the main effects of the interacting factors be present for the interaction to be included in the model. Extensive simulation results show that the proposed strong and weak hierarchical mixture models control the proportion of false positive discoveries and provide a powerful approach for identifying predisposing main effects and interactions in studies with complex gene-environment and gene-gene interactions. We also compare these two models with an 'independent' model that does not impose the hierarchical constraint and observe their superior performance in most of the situations considered. The proposed models are applied to real data analyses of gene-environment interactions in lung cancer and cutaneous melanoma case-control studies. Bayesian statistical models can naturally incorporate useful prior information into the modeling process.
Moreover, the Bayesian mixture model outperforms the multivariate logistic model in parameter estimation and variable selection in most cases. Our proposed models impose hierarchical constraints, which further improve the Bayesian mixture model by reducing the proportion of false positive findings among the identified interactions while successfully recovering the reported associations. This is practically appealing when investigating causal factors among a moderate number of candidate genetic and environmental factors together with a relatively large number of interactions. The natural and orthogonal interaction (NOIA) models of genetic effects were previously developed to provide an analysis framework in which the estimated effects for a quantitative trait are statistically orthogonal regardless of whether Hardy-Weinberg equilibrium (HWE) holds within loci. Ma et al. (2012) developed a NOIA model for gene-environment interaction studies and showed its advantages over the usual functional model in detecting true main effects and interactions. In this project, we propose a novel Bayesian statistical model that combines the Bayesian hierarchical mixture model with the NOIA statistical model and the usual functional model. The proposed Bayesian NOIA model detects non-null effects with more power and higher marginal posterior probabilities. We also review two Bayesian statistical models (a Bayesian empirical shrinkage-type estimator and Bayesian model averaging) developed for gene-environment interaction studies. Inspired by these models, we develop two novel statistical methods that handle related problems such as borrowing data from historical studies. The proposed methods are analogous to the gene-environment interaction methods in the way they balance statistical efficiency and bias within a unified model. In extensive simulation studies, we compare the operating characteristics of the proposed models with existing models, including the hierarchical meta-analysis model. The results show that the proposed approaches adaptively borrow historical data in a data-driven way. These novel models may have a broad range of statistical applications in both genetic/genomic and clinical studies.
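A minimal sketch of the hierarchy rule itself (schematic only; in the actual method this constraint lives inside a Bayesian mixture prior over inclusion indicators): given sampled main-effect indicators, the strong rule admits an interaction only when both parent main effects are in the model, the weak rule when at least one is.

```python
import itertools
import numpy as np

rng = np.random.default_rng(3)
p = 5
gamma_main = rng.binomial(1, 0.4, p).astype(bool)    # sampled main-effect indicators

def allowed_interactions(gamma_main, rule="strong"):
    """Candidate (i, j) interactions compatible with the hierarchy rule:
    'strong' requires both main effects present, 'weak' at least one."""
    pairs = itertools.combinations(range(len(gamma_main)), 2)
    if rule == "strong":
        return [(i, j) for i, j in pairs if gamma_main[i] and gamma_main[j]]
    return [(i, j) for i, j in pairs if gamma_main[i] or gamma_main[j]]

print("main effects in model:", np.flatnonzero(gamma_main))
print("strong hierarchy:", allowed_interactions(gamma_main, "strong"))
print("weak hierarchy:  ", allowed_interactions(gamma_main, "weak"))
```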
Abstract:
Of the large clinical trials evaluating screening mammography efficacy, none included women ages 75 and older. Recommendations on an upper age limit at which to discontinue screening are based on indirect evidence and are not consistent. Screening mammography is evaluated here using observational data from the SEER-Medicare linked database. Measuring the benefit of screening mammography is difficult due to the impact of lead-time bias, length bias, and over-detection. The underlying conceptual model divides the disease into two stages: pre-clinical (T0) and symptomatic (T1) breast cancer. Treating the times in these phases as a pair of dependent bivariate observations, (t0, t1), estimates are derived to describe the distribution of this random vector. To quantify the effect of screening mammography, statistical inference is made about the parameters that correspond to the marginal distribution of the symptomatic phase duration (T1). The estimated hazard ratio of death from breast cancer, comparing women with screen-detected tumors to those whose tumors were detected at symptom onset, is 0.36 (0.30, 0.42), indicating a benefit among the screen-detected cases.
Abstract:
The operator effect is a well-known methodological bias that has been quantified in some taphonomic studies. However, the replicability effect, i.e., the use of taphonomic attributes as a replicable scientific method, has not been considered until now. Here, we quantify this replicability bias for the first time using different multivariate statistical techniques, testing whether the operator effect is related to the replicability effect. We analyzed the results reported by 15 operators working on the same dataset. Each operator analyzed 30 biological remains (bivalve shells) from five different sites, considering the attributes fragmentation, edge rounding, corrasion, bioerosion, and secondary color. The operator effect followed the same pattern reported in previous studies, with poorer correspondence for attributes having more than two levels of damage categories. However, the operator effect did not appear to be related to the replicability effect, because nearly all operators found differences among sites. Although the binary attribute bioerosion exhibited 83% correspondence among operators, it was the taphonomic attribute with the highest dispersion among operators (28%). We therefore conclude that binary attributes, despite reducing the operator effect, diminish replicability, resulting in different interpretations of concordant data. We found that a variance of nearly 8% among operators was enough to generate a different taphonomic interpretation in a Q-mode cluster analysis. The results reported here show that the statistical method employed influences the level of replicability and comparability of a study, and that making results available may be a valid alternative for reducing bias.
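The style of Q-mode comparison used here can be sketched as follows (hypothetical scores and scipy hierarchical clustering, not the study's data): operators are clustered by the similarity of their damage scoring, and a simple correspondence index records how often each operator matches the modal score.

```python
import numpy as np
from scipy import stats
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

rng = np.random.default_rng(4)
# hypothetical scores: 15 operators x 30 shells for one attribute (levels 0-3)
scores = rng.integers(0, 4, size=(15, 30))

# Q-mode view: cluster operators by the similarity of their scoring
tree = linkage(pdist(scores, metric="cityblock"), method="average")
groups = fcluster(tree, t=2, criterion="maxclust")
print("operator groups:", groups)

# simple correspondence: share of shells where an operator matches the modal score
modal = stats.mode(scores, axis=0).mode.ravel()
print("correspondence per operator:", (scores == modal).mean(axis=1).round(2))
```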
Abstract:
Pragmatism is the leading motivation for regularization. We can understand regularization as a modification of the maximum-likelihood estimator so that a reasonable answer can be given in an unstable or ill-posed situation. To mention some typical examples, this happens when fitting parametric or non-parametric models with more parameters than data, or when estimating large covariance matrices. Regularization is also commonly used to improve the bias-variance tradeoff of an estimator. The definition of regularization is therefore quite general, and although the introduction of a penalty is probably the most popular type, it is just one of multiple forms of regularization. In this dissertation, we focus on applications of regularization for obtaining sparse or parsimonious representations, where only a subset of the inputs is used. A particular form of regularization, L1-regularization, plays a key role in reaching sparsity. Most of the contributions presented here revolve around L1-regularization, although other forms of regularization are explored (also pursuing sparsity in some sense). In addition to presenting a compact review of L1-regularization and its applications in statistics and machine learning, we devise methodology for regression, supervised classification, and structure induction of graphical models. Within the regression paradigm, we focus on kernel smoothing, proposing kernel design techniques suitable for high-dimensional settings and sparse regression functions. We also present an application of regularized regression techniques for modeling the response of biological neurons. The supervised classification advances deal, on the one hand, with the application of regularization for obtaining a naïve Bayes classifier and, on the other hand, with a novel algorithm for brain-computer interface design that uses group regularization in an efficient manner. Finally, we present a heuristic for inducing the structure of Gaussian Bayesian networks using L1-regularization as a filter.
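A minimal sketch of the central tool (scikit-learn's Lasso on simulated data, not the thesis code): with more parameters than observations, the L1 penalty still recovers a sparse coefficient vector.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(5)
n, p = 100, 200                        # more parameters than data points
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:5] = 2.0                         # sparse ground truth: 5 active inputs
y = X @ beta + rng.normal(size=n)

fit = Lasso(alpha=0.2).fit(X, y)
print("nonzero coefficients:", np.flatnonzero(fit.coef_))
```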
Abstract:
The present report summarizes the research work performed during the last 4.5 years on the sources of lidar bias due to complex terrain. This work corresponds to one task of the remote sensing work package of the FP7 WAUDIT project. Furthermore, field data from the wind velocity measurement campaigns of the FP7 SafeWind project have been used in this report. The main objective of this research work is to determine the terrain effects on the lidar bias in the measured wind velocity. With this knowledge, it is possible to propose a lidar bias correction methodology. This methodology is based on an estimation of the wind field variations within the lidar scan volume. The wind field variations are calculated from RANS simulations performed for the Alaiz test site. The methodology is validated against real-scale measurements recorded during an eight-month measurement campaign at the Alaiz test site. Firstly, the mathematical framework of the lidar sensing principle is introduced and an overview of the state of the art is presented.
The experimental part includes the study of two different but complementary experiments. The first experiment was a measurement campaign performed in flat terrain, at the DTU Wind Energy Høvsøre test site, while the second was performed in complex terrain at the CENER Alaiz test site. Exactly the same two lidar devices, based on continuous-wave and pulsed-wave systems, were used in the two consecutive measurement campaigns, making this a relevant experiment in the context of wind resource assessment. The wind velocity was sensed by the lidars and by standard cup anemometers and wind vanes installed on a met mast. The met mast sensors are considered the reference wind velocity measurements. The first analysis of the experimental data is dedicated to identifying the main sources of lidar bias present in the 10-minute average values. The purpose is to identify the bias magnitude introduced by different atmospheric conditions and by the non-uniform wind flow resulting from the terrain irregularities. The lidar bias is studied as a function of several statistical properties of the wind flow, such as the tilt angle, turbulence intensity, vertical velocity, atmospheric stability, and the terrain characteristics. The aim of this exercise is to use this knowledge to define useful lidar bias data filters. Then, a methodology to correct the lidar bias caused by non-uniform wind flow is proposed, based on the initial mathematical analysis of the lidar measurements. The proposed lidar bias correction methodology has been developed focusing on the continuous-wave lidar system. In a final step, the proposed methodology is validated with the data of the complex terrain measurement campaign. The methodology makes use of the wind field variations obtained from the RANS analysis. The results are presented and discussed. An advantage of this methodology is that the wind field properties at the Alaiz test site can be studied in more detail, based on the instantaneous measurements of the CW lidar. Within the project framework, the day-to-day work was done at CENER, with close guidance and support from the UPM, including an exchange period of 1.5 months. During this exchange period, the mathematical analysis of the lidar sensing of the wind velocity was defined. Furthermore, the effects of non-uniform wind fields on the lidar bias were analytically derived, after making some simplifying assumptions. Moreover, there was an important cooperation with DTU Wind Energy, including a secondment period of 1.5 months. During the secondment at DTU Wind Energy, the author was introduced to the design of an experimental measurement campaign in flat terrain, the site assessment of obstacles and terrain conditions, data acquisition and processing, and the study and reporting of measurement analyses.
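The mechanism behind the complex-terrain bias can be sketched in a few lines (an illustrative toy with assumed geometry and gradient values, not the report's RANS-based method): a conically scanning lidar retrieval assumes homogeneous flow over the scan circle, so a linear horizontal gradient in wind speed aliases into the retrieved components, here as a spurious vertical velocity.

```python
import numpy as np

theta = np.deg2rad(30.0)                       # cone half-angle from vertical
phi = np.linspace(0.0, 2.0 * np.pi, 50, endpoint=False)   # azimuth samples
h = 100.0                                      # measurement height (m)
r = h * np.tan(theta)                          # scan-circle radius (m)

x = r * np.cos(phi)                            # positions on the scan circle
u = 8.0 + 0.02 * x                             # mean wind + horizontal gradient
v = np.zeros_like(u)
w = np.zeros_like(u)

# line-of-sight velocities seen by the conically scanning beam
vlos = (u * np.cos(phi) + v * np.sin(phi)) * np.sin(theta) + w * np.cos(theta)

# homogeneous-flow retrieval: least-squares fit of A + B*cos(phi) + C*sin(phi)
G = np.column_stack([np.ones_like(phi), np.cos(phi), np.sin(phi)])
A, B, C = np.linalg.lstsq(G, vlos, rcond=None)[0]
u_hat, v_hat, w_hat = B / np.sin(theta), C / np.sin(theta), A / np.cos(theta)

print(f"speed at circle centre: 8.00 m/s, retrieved: {np.hypot(u_hat, v_hat):.2f} m/s")
print(f"spurious vertical velocity: {w_hat:.2f} m/s")
```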
Abstract:
The use of presence/absence data in wildlife management and biological surveys is widespread. There is a growing interest in quantifying the sources of error associated with these data. Using simulated data, we show that false-negative errors (failure to record a species when in fact it is present) can have a significant impact on statistical estimation of habitat models. We then introduce an extension of logistic modeling, the zero-inflated binomial (ZIB) model, that permits estimation of the rate of false-negative errors and correction of estimates of the probability of occurrence by using repeated visits to the same site. Our simulations show that even relatively low rates of false negatives bias statistical estimates of habitat effects. With three repeated visits the method eliminates the bias, but estimates are relatively imprecise. Six repeated visits improve precision to levels comparable to those achieved with conventional statistics in the absence of false-negative errors. In general, when error rates are less than or equal to 50%, greater efficiency is gained by adding more sites, whereas when error rates are greater than 50% it is better to increase the number of repeated visits. We highlight the flexibility of the method with three case studies, clearly demonstrating the effect of false-negative errors for a range of commonly used survey methods.
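A minimal sketch of the ZIB likelihood and its fit (simulated sites and visits; a logit parameterization keeps the probabilities valid for an unconstrained optimizer) shows how repeated visits let the model separate true absence from non-detection:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import binom

rng = np.random.default_rng(6)
n_sites, T = 200, 3                       # sites and repeated visits per site
psi_true, p_true = 0.6, 0.7               # occupancy and per-visit detection
occupied = rng.binomial(1, psi_true, n_sites)
y = rng.binomial(T, p_true * occupied)    # detections per site (0 when absent)

def nll(params):
    """Negative log-likelihood of the zero-inflated binomial model."""
    psi, p = 1.0 / (1.0 + np.exp(-params))
    lik = np.where(
        y == 0,
        (1.0 - psi) + psi * (1.0 - p) ** T,   # absent, or present but never seen
        psi * binom.pmf(y, T, p),             # present and detected k times
    )
    return -np.sum(np.log(lik))

res = minimize(nll, x0=[0.0, 0.0], method="Nelder-Mead")
psi_hat, p_hat = 1.0 / (1.0 + np.exp(-res.x))
naive = np.mean(y > 0)                    # occupancy estimate ignoring false negatives
print(f"psi_hat={psi_hat:.2f} (true {psi_true}), p_hat={p_hat:.2f}, naive={naive:.2f}")
```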
Abstract:
The informational properties of biological systems are the subject of much debate and research. I present a general argument in favor of the existence and central importance of information in organisms, followed by a case study of the genetic code (specifically, codon bias) and the translation system from the perspective of information. The codon biases of 831 Bacteria and Archaea are analyzed and modeled as points in a 64-dimensional statistical space. The major results are that (1) codon bias evolution does not follow canonical patterns, and (2) the use of coding space in organisms is a subset of the total possible coding space. These findings imply that codon bias is a unique adaptive mechanism that owes its existence to organisms' use of information in representing genes, and that there is a particularly biological character to the resulting biased coding and information use.
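Representing an organism as a point in that 64-dimensional space amounts to computing a codon-frequency vector; a minimal sketch (toy sequence, fixed lexicographic codon order):

```python
from collections import Counter
from itertools import product

def codon_usage(seq):
    """64-dimensional codon frequency vector in fixed lexicographic order."""
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codons)
    order = ["".join(c) for c in product("ACGT", repeat=3)]
    total = max(len(codons), 1)
    return [counts[c] / total for c in order]

vec = codon_usage("ATGGCTGCAGCTAAATAA")
print(len(vec), sum(vec))        # 64 dimensions, frequencies summing to 1.0
```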
Abstract:
Several deterministic and probabilistic methods are used to evaluate the probability of seismically induced soil liquefaction. Probabilistic models usually carry uncertainty in the model itself and in the parameters used to develop it, and these model uncertainties vary from one statistical model to another. Most of the model uncertainties are epistemic and can be addressed through appropriate knowledge of the statistical model. One such epistemic uncertainty in evaluating liquefaction potential with a probabilistic model such as logistic regression is sampling bias: the difference between the class distribution in the sample used to develop the statistical model and the true population distribution of liquefaction and non-liquefaction instances. Recent studies have shown that sampling bias can significantly affect the probability predicted by a statistical model. To address this epistemic uncertainty, a new approach was developed for evaluating the probability of seismically induced soil liquefaction, using logistic regression in combination with the Hosmer-Lemeshow statistic. This approach was used to estimate the population (true) ratio of liquefaction to non-liquefaction instances in the most up-to-date standard penetration test (SPT) and cone penetration test (CPT) case histories. Other model uncertainties, such as the distribution of the explanatory variables and their significance, were also addressed using the Kolmogorov-Smirnov test and the Wald statistic, respectively. Moreover, based on the estimated population distribution, logistic regression equations were proposed to calculate the probability of liquefaction for both SPT- and CPT-based case histories. Additionally, the proposed probability curves were compared with existing probability curves based on SPT and CPT case histories.
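A standard way to adjust a fitted logistic model for such a sample/population mismatch is the prior correction of the intercept (in the sense of King and Zeng); this is shown as a hedged stand-in, not the paper's Hosmer-Lemeshow-based procedure:

```python
import numpy as np

def corrected_probability(eta, sample_rate, population_rate):
    """Adjust a logistic model's linear predictor for sampling bias via the
    standard prior correction of the intercept."""
    offset = np.log(
        (1.0 - population_rate) / population_rate
        * sample_rate / (1.0 - sample_rate)
    )
    return 1.0 / (1.0 + np.exp(-(eta - offset)))

# a case-history sample with 50% liquefaction cases, but an assumed
# population share of 20% liquefaction instances (hypothetical numbers)
eta = 0.8      # linear predictor from the fitted logistic model
print(f"corrected probability: {corrected_probability(eta, 0.5, 0.2):.3f}")
```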
Abstract:
The objective of this study was to review growth curves for Turner syndrome, evaluate their methodological and statistical quality, and suggest potential growth curves for clinical practice guidelines. The search was carried out in the Medline and Embase databases. Of 1,006 references identified, 15 were included. The studies constructed curves for weight, height, weight/height, body mass index, head circumference, height velocity, leg length, and sitting height. Sample sizes ranged from 47 to 1,565 girls (total = 6,273) aged 0 to 24 years, born between 1950 and 2006. The number of measurements ranged from 580 to 9,011 (total = 28,915). Most studies showed strengths such as sample size, exclusion of growth hormone and androgen use, and analysis of confounding variables. However, weaknesses included growth curves restricted to height, lack of information about selection bias, and limited reporting of distributional properties and smoothing methods. In conclusion, we observe the need to construct an international growth reference for girls with Turner syndrome, in order to support clinical practice guidelines.
Abstract:
Characteristics of precipitation estimated from 145,194 reflectivity fields, spanning 827 days between 1998 and 2003, obtained from the São Paulo Weather Radar (RSP) were analyzed. Events were classified by precipitation intensity into convective (EC) and stratiform (EE). Morphologically, five types of systems were identified: isolated convection (CI), sea breeze (BM), squall lines (LI), scattered bands (BD), and cold fronts (FF). Convective events dominate in spring and summer, and stratiform events in autumn and winter. CI and BM peaked between October and March, while FF peaked from April to September. BD occur throughout the year, and LI were absent only in June and July. A point comparison between precipitation measured by telemetry and estimated by radar showed, in most cases, a positive bias of the RSP for 10-, 30-, and 60-minute accumulations. In order to integrate the radar precipitation estimates with the telemetric network measurements through statistical objective analysis, spatial correlation structures as a function of distance were obtained from the radar precipitation fields for rainfall accumulations of 15, 30, 60, and 120 minutes for the five types of precipitating systems characterized. The mean spatial correlation curves of all precipitation events of each system were fitted with sixth-order polynomial functions. The results indicate significant differences in the spatial correlation structure among the precipitating systems.
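Fitting the mean correlation-versus-distance curve with a sixth-order polynomial, as described, is a short exercise with numpy (illustrative stand-in correlations, not the radar data):

```python
import numpy as np

# hypothetical mean spatial correlations of 30-minute rain accumulations
dist = np.arange(0.0, 101.0, 10.0)      # separation distance (km)
corr = np.exp(-dist / 40.0)             # stand-in for an observed mean curve

coeffs = np.polyfit(dist, corr, deg=6)  # sixth-order polynomial, as in the study
fitted = np.polyval(coeffs, dist)
print("residuals:", np.round(fitted - corr, 3))
```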
Abstract:
This study evaluates the performance of seasonal forecasts from the regional climate model RegCM3, nested in the CPTEC/COLA global model. The RegCM3 forecasts used 60 km horizontal resolution over a domain covering a large part of South America. The RegCM3 and CPTEC/COLA forecasts were evaluated against rainfall and air temperature analyses from the Climate Prediction Center (CPC) and the National Centers for Environmental Prediction (NCEP), respectively. Between May 2005 and July 2007, 27 seasonal forecasts of rainfall and air temperature (except for CPTEC/COLA temperature, with 26 forecasts) were evaluated over three regions of Brazil: Northeast (NDE), Southeast (SDE), and South (SUL). The RegCM3 forecasts were also compared with the climatologies of the analyses. According to the statistical indices (bias, correlation coefficient, root mean square error, and coefficient of efficiency), in all three regions (NDE, SDE, and SUL) the seasonal rainfall predicted by RegCM3 is closer to observations than that predicted by CPTEC/COLA. Moreover, RegCM3 is also a better predictor of seasonal rainfall than the mean of the observations in the three regions. For temperature, the RegCM3 forecasts are superior to CPTEC/COLA in the NDE and SUL areas, while CPTEC/COLA is superior in SDE. Finally, the RegCM3 rainfall and temperature forecasts are closer to the observations than the observed climatology. These results indicate the potential of RegCM3 for seasonal forecasting, which should be explored in the future through ensemble forecasting.
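The four verification indices named above can be computed with a short helper (hypothetical forecast-observation pairs; the coefficient of efficiency is taken to be the Nash-Sutcliffe form):

```python
import numpy as np

def verification(forecast, observed):
    """Bias, correlation, RMSE, and Nash-Sutcliffe coefficient of efficiency."""
    err = forecast - observed
    bias = err.mean()
    corr = np.corrcoef(forecast, observed)[0, 1]
    rmse = np.sqrt((err ** 2).mean())
    ce = 1.0 - (err ** 2).sum() / ((observed - observed.mean()) ** 2).sum()
    return bias, corr, rmse, ce

rng = np.random.default_rng(7)
obs = rng.gamma(2.0, 2.0, 27)            # e.g., 27 seasonal rainfall values
fc = obs + rng.normal(0.0, 1.0, 27)      # a hypothetical forecast
bias, corr, rmse, ce = verification(fc, obs)
print(f"bias={bias:.2f}, r={corr:.2f}, rmse={rmse:.2f}, CE={ce:.2f}")
```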