918 results for Parametric and semiparametric methods
Abstract:
An increasing focus in evolutionary biology is on the interplay between mesoscale ecological and evolutionary processes such as population demographics, habitat tolerance, and especially geographic distribution, as potential drivers responsible for patterns of diversification and extinction over geologic time. However, few studies to date connect organismal processes such as survival and reproduction through mesoscale patterns to long-term macroevolutionary trends. In my dissertation, I investigate how the mechanism of seed dispersal, mediated through geographic range size, influences diversification rates in the Rosales (Plantae: Anthophyta). In my first chapter, I validate the phylogenetic comparative methods that I use in my second and third chapters. Available state speciation and extinction (SSE) models make assumptions about evolution that are known from fossil data to be false. I show, however, that as long as net diversification rates remain positive, a condition likely true for the Rosales, these violations of SSE’s assumptions do not cause significantly biased results. With SSE methods validated, my second chapter reconstructs three associations that appear to increase diversification rate for Rosalean genera: (1) herbaceous habit; (2) a three-way interaction combining animal dispersal, high within-genus species richness, and geographic range on multiple continents; (3) a four-way interaction combining woody habit with the other three characteristics of (2). I suggest that the three- and four-way interactions represent colonization ability and resulting extinction resistance in the face of late Cenozoic climate change; however, there are other possibilities as well that I hope to investigate in future research. My third chapter reconstructs the phylogeographic history of the Rosales using both non-fossil-assisted SSE methods and fossil-informed traditional phylogeographic analysis.
Ancestral state reconstructions indicate that the Rosaceae diversified in North America while the other Rosalean families diversified elsewhere, possibly in Eurasia. SSE is able to successfully identify groups of genera that were likely to have been ancestrally widespread, but has poorer taxonomic resolution than methods that use fossil data. In conclusion, these chapters together suggest several potential causal links between organismal, mesoscale, and geologic scale processes, but further work will be needed to test the hypotheses that I raise here.
Abstract:
Background: Statistical analysis of DNA microarray data provides a valuable diagnostic tool for the investigation of genetic components of diseases. To take advantage of the multitude of available data sets and analysis methods, it is desirable to combine both different algorithms and data from different studies. Applying ensemble learning, consensus clustering and cross-study normalization methods for this purpose in an almost fully automated process and linking different analysis modules together under a single interface would simplify many microarray analysis tasks. Results: We present ArrayMining.net, a web-application for microarray analysis that provides easy access to a wide choice of feature selection, clustering, prediction, gene set analysis and cross-study normalization methods. In contrast to other microarray-related web-tools, multiple algorithms and data sets for an analysis task can be combined using ensemble feature selection, ensemble prediction, consensus clustering and cross-platform data integration. By interlinking different analysis tools in a modular fashion, new exploratory routes become available, e.g. ensemble sample classification using features obtained from a gene set analysis and data from multiple studies. The analysis is further simplified by automatic parameter selection mechanisms and linkage to web tools and databases for functional annotation and literature mining. Conclusion: ArrayMining.net is a free web-application for microarray analysis combining a broad choice of algorithms based on ensemble and consensus methods, using automatic parameter selection and integration with annotation databases.
Abstract:
The purpose of this study was to examine the relationship between the structure of jobs and burnout, and to assess to what extent, if any, this relationship was moderated by individual coping methods. This study was grounded in Karasek's (1998) Job Demand-Control-Support theory of work stress as well as Maslach and Leiter's (1993) theory of burnout. Coping was examined as a moderator based on the conceptualization of Lazarus and Folkman (1984). Two overarching questions framed this study: (a) What is the relationship between job structure, as operationalized by job title, and burnout across different occupations in support services in a large municipal school district? and (b) To what extent do individual differences in coping methods moderate this relationship? This was a cross-sectional study of county public school bus drivers, bus aides, mechanics, and clerical workers (N = 253) at three bus depot locations within the same district, using validated survey instruments for data collection. Hypotheses were tested using simultaneous regression analyses. Findings indicated that there were statistically significant and relevant relationships among the variables of interest: job demands, job control, burnout, and ways of coping. There was a relationship between job title and physical job demands. There was no evidence to support a relationship between job title and psychological demands. Furthermore, there was a relationship between physical demands and both emotional exhaustion and personal accomplishment, which are key indicators of burnout. Results indicated that individual ways of coping acted as a significant moderator between job structure, operationalized by job title, and individual employee burnout, adding empirical evidence to the occupational stress literature. Based on the findings, there are implications for theory, research, and practice.
For theory and research, the findings suggest the importance of incorporating transactional models in the study of occupational stress. In the area of practice, the findings highlight the importance of enriching jobs, increasing job control, and providing individual-level training related to stress reduction.
Abstract:
Geophysical surveying and geoelectrical methods are effective for studying permafrost distribution and conditions in polar environments. Geoelectrical methods are particularly suited to studying the spatial distribution of permafrost because of its high electrical resistivity in comparison with that of soil or rock above 0 °C. In the South Shetland Islands permafrost is considered to be discontinuous up to elevations of 20–40 m a.s.l., changing to continuous at higher altitudes. There are no specific data about the distribution of permafrost in Byers Peninsula, on Livingston Island, which is the largest ice-free area in the South Shetland Islands. With the purpose of better understanding the occurrence of permanently frozen conditions in this area, a geophysical survey using an electrical resistivity tomography (ERT) methodology was conducted during the January 2015 field season, combined with geomorphological and ecological studies. Three overlapping electrical resistivity tomographies of 78 m each were done along the same profile, which ran from the coast to the highest raised beaches. The three electrical resistivity tomographies are combined in an electrical resistivity model which represents the distribution of the electrical resistivity of the ground to depths of about 13 m along 158 m. Several patches of high electrical resistivity were found, and interpreted as patches of sporadic permafrost. The lower limits of sporadic to discontinuous permafrost in the area are confirmed by the presence of permafrost-related landforms nearby. There is a close correspondence between moss patches and permafrost patches along the geoelectrical transect.
Abstract:
The Earth we know today was not always so. Over millions of years it has undergone significant changes brought about by numerous geological phenomena that tend toward its equilibrium, some of internal origin, creating new geological formations, and others of external origin, smoothing formations previously created. From the tectonic standpoint, Angola is located in a relatively stable area, which gives it a certain privilege when compared with some Asian or even American countries, where earthquakes and volcanic eruptions occur quite often. However, the same cannot be said of external geodynamic phenomena such as ravines, which in recent years have taken shape in many provinces, especially due to anthropogenic activity, giving rise to geological hazards and increasing the risk of damage to buildings and other infrastructure, direct or indirect losses in economic activities, and loss of human lives. We understand that reducing these risks starts, in particular, with their identification, so that preventive measures can then be taken. This work is the result of research carried out by the authors through courses on soil erosion and on the stabilization of soils subject to erosion phenomena, held by the Engineering Laboratory of Angola (LEA). For this work, we consulted cartographic data and the literature, listened to some provincial representatives and local residents, and observed some affected areas in loco. The results allow us to infer that the main provinces affected by the ravine phenomenon are located in the Central and Northern highlands, as well as in the eastern region, and more recently in Cuando-Cubango province, without ruling out other regions, such as Luanda and Cabinda [1]. Regarding the causes, we can say that the ravines in Angola are primarily due to the combination of three natural factors: climate, topography and type of soil [2].
When anthropogenic activity is added, namely the execution of construction works, obstruction of the drainage system, exploration of minerals, agriculture and fires, the phenomenon intensifies, often requiring immediate action. These interventions can be made through structural or engineering measures and through stabilization measures on the degraded soil cover [3]. We present an example of stabilization measures through the planting of a local vegetation species, Pennisetum purpureum. It is expected that the results may contribute to a better understanding of the causes of the ravine phenomenon in Angola and that the adopted stabilization method can be adapted in other affected provinces in order to prevent and contain the ravines.
Abstract:
The book Worldwide Wound Healing - Innovation in Natural and Conventional Methods develops a set of themes on the healing and treatment of complex wounds through evidence-based practice with innovations in the use of natural and conventional methods. It is an innovative way that promotes the integration of conventional and natural perspectives in wound healing, with a unique focus on the quality of life of the patient.
Abstract:
This study aimed to compare four methods of establishing mixed swards of Tangola grass and forage peanut (Arachis pintoi).
Abstract:
The aim of this work is to study some density estimation techniques and to apply them to the segmentation of medical images. Medical images are used to help diagnose tumor diseases as well as to plan and deliver treatment. A computer image is an array of values representing colors on some scale. The smallest element of the image to which it is possible to assign a value is called a pixel. Segmentation is the process of dividing the image into portions through the classification of each pixel. The simplest way of classifying is by thresholding, given the number of portions and the threshold values. Another method is to construct a histogram of the pixel values and assign a portion to each peak. The threshold is the mean between two peaks. As the histogram does not form a smooth curve, it is difficult to distinguish true peaks from random variation. Density estimation methods allow the estimation of a smooth curve. Image data can be considered as a mixture of different densities. In this project, parametric and nonparametric methods for density estimation are addressed and some of them are applied to CT image data.
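The peak-and-threshold idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian kernel, the fixed bandwidth, the function names and the synthetic two-intensity "image" are all assumptions made for illustration. The sketch smooths the pixel-value distribution with a kernel density estimate, takes the local maxima of the smooth curve as peaks, and places each threshold at the mean of two adjacent peaks.

```python
import numpy as np

def gaussian_kde_curve(values, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate of `values` on `grid`."""
    diffs = (grid[:, None] - values[None, :]) / bandwidth
    kernel = np.exp(-0.5 * diffs ** 2) / np.sqrt(2.0 * np.pi)
    return kernel.mean(axis=1) / bandwidth

def peak_positions(density, grid):
    """Return the grid positions of strict interior local maxima."""
    interior = density[1:-1]
    is_peak = (interior > density[:-2]) & (interior > density[2:])
    return grid[1:-1][is_peak]

def segment(image, bandwidth=8.0, grid_size=256):
    """Label each pixel by the density peak whose interval contains it."""
    pixels = image.ravel().astype(float)
    grid = np.linspace(pixels.min(), pixels.max(), grid_size)
    density = gaussian_kde_curve(pixels, grid, bandwidth)
    peaks = peak_positions(density, grid)
    # Threshold = mean of two adjacent peaks, as described in the abstract.
    thresholds = (peaks[:-1] + peaks[1:]) / 2.0
    labels = np.digitize(image, thresholds)
    return labels, peaks, thresholds

# Synthetic stand-in for image data (assumption): two tissue intensities.
rng = np.random.default_rng(0)
image = np.concatenate(
    [rng.normal(60, 5, 2000), rng.normal(160, 5, 2000)]
).reshape(40, 100)
labels, peaks, thresholds = segment(image)
```

The bandwidth controls how much random variation is smoothed away; a value that is too small brings back spurious peaks, and one that is too large can merge true ones.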
Abstract:
BACKGROUND: We sought to characterize the impact that hepatitis C virus (HCV) infection has on CD4 cells during the first 48 weeks of antiretroviral therapy (ART) in previously ART-naive human immunodeficiency virus (HIV)-infected patients. METHODS: The HIV/AIDS Drug Treatment Programme at the British Columbia Centre for Excellence in HIV/AIDS distributes all ART in this Canadian province. Eligible individuals were those whose first-ever ART included 2 nucleoside reverse transcriptase inhibitors and either a protease inhibitor or a nonnucleoside reverse transcriptase inhibitor and who had a documented result for HCV antibody testing. Outcomes were binary events (time to an increase of ≥75 CD4 cells/mm3 or an increase of ≥10% in the percentage of CD4 cells in the total T cell population [CD4 cell fraction]) and continuous repeated measures. Statistical analyses used parametric and nonparametric methods, including multivariate mixed-effects linear regression analysis and Cox proportional hazards analysis. RESULTS: Of 1186 eligible patients, 606 (51%) were positive and 580 (49%) were negative for HCV antibodies. HCV antibody-positive patients were slower to have an absolute (P<.001) and a fraction (P = .02) CD4 cell event. In adjusted Cox proportional hazards analysis (controlling for age, sex, baseline absolute CD4 cell count, baseline pVL, type of ART initiated, AIDS diagnosis at baseline, adherence to ART regimen, and number of CD4 cell measurements), HCV antibody-positive patients were less likely to have an absolute CD4 cell event (adjusted hazard ratio [AHR], 0.84 [95% confidence interval [CI], 0.72-0.98]) and somewhat less likely to have a CD4 cell fraction event (AHR, 0.89 [95% CI, 0.70-1.14]) than HCV antibody-negative patients.
In multivariate mixed-effects linear regression analysis, HCV antibody-negative patients had increases of an average of 75 cells in the absolute CD4 cell count and 4.4% in the CD4 cell fraction, compared with 20 cells and 1.1% in HCV antibody-positive patients, during the first 48 weeks of ART, after adjustment for time-updated pVL, number of CD4 cell measurements, and other factors. CONCLUSION: HCV antibody-positive HIV-infected patients may have an altered immunologic response to ART.
Abstract:
Deterministic safety analysis (DSA) is the procedure used to design safety-related systems, structures and components in nuclear plants. DSA is based on computational simulations of a set of hypothetical accidents representative of the installation, called design basis scenarios (DBS). Regulatory bodies specify a set of safety magnitudes that must be calculated in the simulations, and establish regulatory acceptance criteria (RAC), which are restrictions that the values of those magnitudes must satisfy. Methodologies for performing DSA can be of 2 types: conservative or realistic. Conservative methodologies use markedly pessimistic, and therefore relatively simple, predictive models and assumptions. They do not need to include an uncertainty analysis of their results. Realistic methodologies are based on realistic, generally mechanistic, assumptions and predictive models, and are supplemented with an uncertainty analysis of their main results. They are also called BEPU (“Best Estimate Plus Uncertainty”) methodologies. In them, uncertainty is represented, basically, in a probabilistic manner. For conservative methodologies, the RAC are simply restrictions on calculated values of the safety magnitudes, which must remain confined to an “acceptance region” of their range. For BEPU methodologies, the RAC cannot be so simple, because the safety magnitudes are now uncertain variables. The thesis develops the way of introducing uncertainty into the RAC. Basically, confinement to the same acceptance region, established by the regulator, is maintained. Strict fulfillment is not required, however, but rather a high level of certainty. In the adopted formalism, this is understood as a “high probability level”, and this probability corresponds to the calculation uncertainty of the safety magnitudes.
Such uncertainty can be regarded as originating in the inputs to the calculation model and propagated through that model. The uncertain inputs include the initial and boundary conditions of the calculation, and the empirical model parameters, which are used to incorporate the uncertainty due to the imperfection of the model. Fulfillment of the RAC is therefore required with a probability not less than a value P0 close to 1 and defined by the regulator (probability or coverage level). However, the calculation uncertainty of the magnitude is not the only uncertainty present. Even if a model (its basic equations) is perfectly known, the input-output mapping it produces is imperfectly known (unless the model is very simple). The uncertainty due to ignorance about the action of the model is called epistemic; it can also be said to be uncertainty about the propagation. The consequence is that the probability of fulfilling the RAC cannot be perfectly known; it is an uncertain magnitude. This justifies another term used here for this epistemic uncertainty: metauncertainty. The RAC must incorporate both types of uncertainty: that of the calculation of the safety magnitude (here called aleatory) and that of the calculation of the probability (called epistemic or metauncertainty). The two uncertainties can be introduced in two ways: separated or combined. In both cases, the RAC becomes a probabilistic criterion. If the uncertainties are separated, a second-order probability is used; if they are combined, a single probability is used. If the second-order probability is employed, the regulator must impose a second level of fulfillment, referring to the epistemic uncertainty. It is called the regulatory confidence level, and must be a number close to 1. The pair formed by the two regulatory levels (of probability and of confidence) is called the regulatory tolerance level.
The thesis argues that the best way to construct the BEPU RAC is by separating the uncertainties, for two reasons. First, experts advocate the separate treatment of aleatory and epistemic uncertainty. Second, the separated RAC is (except in exceptional cases) more conservative than the combined RAC. The BEPU RAC is nothing other than a hypothesis about a probability distribution, and it is checked statistically. In the thesis, the statistical methods for checking the BEPU RAC are classified into 3 categories, according to whether they are based on the construction of tolerance regions, on quantile estimates or on probability estimates (either of fulfillment or of exceedance of regulatory limits). According to a recently proposed nomenclature, the first two categories correspond to Q-methods, and the third to P-methods. The purpose of the classification is not to make an inventory of the different methods in each category, which are very numerous and varied, but to relate the different categories and cite the most used methods and those best regarded from the regulatory standpoint. Special mention is made of the most used method to date: the nonparametric method of Wilks, together with its extension, due to Wald, to the multidimensional case. Its counterpart P-method, the Clopper-Pearson interval, typically ignored in the BEPU realm, is described. In this context, the problem of the computational cost of uncertainty analysis is mentioned. The Wilks, Wald and Clopper-Pearson methods require the random sample used to have a minimum size, which is larger the higher the required tolerance level. The sample size is an indicator of computational cost, because each sample element is a value of the safety magnitude, which requires a calculation with predictive models.
Special emphasis is placed on the computational cost when the safety magnitude is multidimensional; that is, when the RAC is a multiple criterion. It is shown that, when the different components of the magnitude are obtained from the same calculation, the multidimensional character introduces no additional computational cost. This proves false a common belief in the BEPU realm: that the multidimensional problem can only be tackled through the Wald extension, whose computational cost grows with the dimension of the problem. In the case (which sometimes occurs) where each component of the magnitude is calculated independently of the others, the influence of the dimension on the cost cannot be avoided. The first BEPU methodologies propagated uncertainties through a surrogate model (metamodel or emulator) of the predictive model or code. The purpose of the metamodel is not its predictive capability, far inferior to that of the original model, but to replace the latter exclusively in the propagation of uncertainties. To this end, the metamodel must be built with the input parameters that contribute most to the uncertainty of the result, and that requires a prior importance or sensitivity analysis. Because of its simplicity, the surrogate model involves hardly any computational cost, and can be studied exhaustively, for example by means of random samples. Consequently, the epistemic uncertainty or metauncertainty disappears, and the BEPU criterion for metamodels becomes a simple probability. In a quick summary, the regulator will more readily accept the statistical methods that need the fewest assumptions; exact ones more than approximate ones; nonparametric more than parametric, and frequentist more than Bayesian. The BEPU criterion is based on a second-order probability.
The probability that the safety magnitudes lie in the acceptance region can be regarded not only as a probability of success or a degree of fulfillment of the RAC. It also has a metric interpretation: it represents a distance (within the range of the magnitudes) from the calculated magnitude to the regulatory acceptance limits. This interpretation gives rise to a definition proposed by this thesis: that of the probabilistic safety margin. Given a scalar safety magnitude with an upper acceptance limit, the safety margin (SM) between two of its values A and B is defined as the probability that A is less than B, obtained from the uncertainties of A and B. The probabilistic definition of SM has several advantages: it is dimensionless, it can be combined according to the laws of probability, and it is easily generalized to several dimensions. Moreover, it does not satisfy the symmetric property. The term safety margin can be applied to different situations: distance from a calculated magnitude to a regulatory limit (licensing margin); distance from the real value of the magnitude to its calculated value (analytical margin); distance from a regulatory limit to the damage threshold of a barrier (barrier margin). This idea of representing distances (in the range of safety magnitudes) by means of probabilities can be applied to the study of conservativeness. The analytical margin can be interpreted as the degree of conservativeness (DG) of the calculation methodology. Using probability, the conservativeness of tolerance limits of a magnitude can be quantified, and conservativeness indicators can be established that serve to compare different methods of constructing tolerance limits and regions. A topic that has never been addressed rigorously is the validation of BEPU methodologies.
Like any other calculation tool, a methodology must be validated before it can be applied to licensing analyses, by comparing its predictions with real values of the safety magnitudes. Such a comparison can only be made in accident scenarios for which measured values of the safety magnitudes exist, and that occurs, basically, in experimental facilities. The ultimate goal of establishing the RAC is to verify that they are fulfilled for the real values of the safety magnitudes, and not only for their calculated values. The thesis shows that a sufficient condition for this ultimate goal is the joint fulfillment of 2 criteria: the licensing BEPU RAC and an analogous criterion applied to validation. The validation criterion must be demonstrated in experimental scenarios and extrapolated to nuclear plants. The licensing criterion requires a minimum value (P0) of the probabilistic licensing margin; the validation criterion requires a minimum value of the analytical margin (the DG). These minimum levels are basically complementary; the higher one is, the lower the other. Current regulatory practice imposes a high value on the licensing margin, which means that the required DG is small. Adopting lower values for P0 implies a weaker requirement on fulfillment of the RAC and, in exchange, a stronger requirement on the DG of the methodology. It is important to stress that the higher the minimum value of the margin (licensing or analytical), the higher the computational cost of demonstrating it. The computational efforts are thus also complementary: if one of the levels is high (which makes the criterion more demanding to fulfill), the computational cost increases. If a medium value of P0 is adopted, the required DG is also medium, so the methodology does not have to be very conservative, and the total computational cost (licensing plus validation) can be optimized.
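The probabilistic safety margin defined in this abstract, the probability that a value A of a safety magnitude is less than a value B, can be estimated by simple Monte Carlo sampling. The following is only a sketch under stated assumptions: the independent normal distributions, their parameters and the variable names are hypothetical and are not taken from the thesis.

```python
import numpy as np

def probabilistic_safety_margin(a_samples, b_samples):
    """Estimate SM(A, B) = P(A < B) from paired random samples of A and B."""
    a = np.asarray(a_samples, dtype=float)
    b = np.asarray(b_samples, dtype=float)
    return float(np.mean(a < b))

# Hypothetical uncertainties (assumption): a calculated magnitude A and a
# limit or threshold B, both modeled here as independent normal variables.
rng = np.random.default_rng(seed=1)
n = 100_000
calculated_value = rng.normal(loc=1000.0, scale=50.0, size=n)
threshold_value = rng.normal(loc=1200.0, scale=30.0, size=n)

sm_ab = probabilistic_safety_margin(calculated_value, threshold_value)
sm_ba = probabilistic_safety_margin(threshold_value, calculated_value)
```

For continuous variables the two estimates add up to 1, which illustrates why the margin from A to B generally differs from the margin from B to A: the measure is dimensionless but not symmetric.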
ABSTRACT Deterministic Safety Analysis (DSA) is the procedure used in the design of safety-related systems, structures and components of nuclear power plants (NPPs). DSA is based on computational simulations of a set of hypothetical accidents of the plant, named Design Basis Scenarios (DBS). Nuclear regulatory authorities require the calculation of a set of safety magnitudes, and define the regulatory acceptance criteria (RAC) that must be fulfilled by them. Methodologies for performing DSA can be categorized as conservative or realistic. Conservative methodologies make use of pessimistic models and assumptions, and are relatively simple. They do not need an uncertainty analysis of their results. Realistic methodologies are based on realistic (usually mechanistic) predictive models and assumptions, and need to be supplemented with uncertainty analyses of their results. They are also termed BEPU (“Best Estimate Plus Uncertainty”) methodologies, and are typically based on a probabilistic representation of the uncertainty. For conservative methodologies, the RAC are simply the restriction of calculated values of safety magnitudes to “acceptance regions” defined on their range. For BEPU methodologies, the RAC cannot be so simple, because the safety magnitudes are now uncertain. In the present Thesis, the inclusion of uncertainty in RAC is studied. Basically, the restriction to the acceptance region must be fulfilled “with a high certainty level”. Specifically, a high probability of fulfillment is required. The calculation uncertainty of the magnitudes is considered as propagated from inputs through the predictive model. Uncertain inputs include model empirical parameters, which store the uncertainty due to the model imperfection. The fulfillment of the RAC is required with a probability not less than a value P0 close to 1 and defined by the regulator (probability or coverage level). Calculation uncertainty is not the only one involved. Even if a model (i.e.
the basic equations) is perfectly known, the input-output mapping produced by the model is imperfectly known (unless the model is very simple). This ignorance is called epistemic uncertainty, and it is associated with the process of propagation. In fact, it is propagated to the probability of fulfilling the RAC. Another term used in the Thesis for this epistemic uncertainty is metauncertainty. The RAC must include the two types of uncertainty: one for the calculation of the magnitude (aleatory uncertainty); the other one, for the calculation of the probability (epistemic uncertainty). The two uncertainties can be taken into account in a separate fashion, or can be combined. In any case the RAC becomes a probabilistic criterion. If uncertainties are separated, a second-order probability is used; if both are combined, a single probability is used. In the first case, the regulator must define a level of fulfillment for the epistemic uncertainty, termed regulatory confidence level, as a value close to 1. The pair of regulatory levels (probability and confidence) is termed the regulatory tolerance level. The Thesis concludes that the adequate way of setting the BEPU RAC is by separating the uncertainties. There are two reasons to do so: experts recommend the separation of aleatory and epistemic uncertainty; and the separated RAC is in general more conservative than the joint RAC. The BEPU RAC is a hypothesis on a probability distribution, and must be statistically tested. The Thesis classifies the statistical methods to verify the RAC fulfillment in 3 categories: methods based on tolerance regions, on quantile estimators and on probability (of success or failure) estimators. The former two have been termed Q-methods, whereas those in the third category are termed P-methods. The purpose of our categorization is not to make an exhaustive survey of the very numerous existing methods.
Rather, the goal is to relate the three categories and examine the most used methods from a regulatory standpoint. The most used method, due to Wilks, and its extension to multidimensional variables (due to Wald) deserve special mention. The P-method counterpart of Wilks’ method is the Clopper-Pearson interval, typically ignored in the BEPU realm. The problem of the computational cost of an uncertainty analysis is tackled. Wilks’, Wald’s and Clopper-Pearson methods require a minimum sample size, which is a growing function of the tolerance level. The sample size is an indicator of the computational cost, because each element of the sample must be calculated with the predictive models (codes). When the RAC is a multiple criterion, the safety magnitude becomes multidimensional. When all its components are output of the same calculation, the multidimensional character does not introduce additional computational cost. In this way, a widespread idea in the BEPU realm, stating that the multi-D problem can only be tackled with the Wald extension, is proven to be false. When the components of the magnitude are independently calculated, the influence of the problem dimension on the cost cannot be avoided. The earliest BEPU methodologies performed the uncertainty propagation through a surrogate model of the code, also termed emulator or metamodel. The goal of a metamodel is not the predictive capability, clearly worse than that of the original code, but the capacity to propagate uncertainties with a lower computational cost. The emulator must contain the input parameters contributing the most to the output uncertainty, and this requires a previous importance analysis. The surrogate model is practically inexpensive to run, so that it can be exhaustively analyzed through Monte Carlo. Therefore, the epistemic uncertainty due to sampling will be reduced to almost zero, and the BEPU RAC for metamodels includes a simple probability.
The regulatory authority will tend to accept statistical methods that need a minimum of assumptions: exact, nonparametric and frequentist methods rather than approximate, parametric and Bayesian methods, respectively. The BEPU RAC is based on a second-order probability. The probability of the safety magnitudes being inside the acceptance region is a success probability and can be interpreted as a degree of fulfillment of the RAC. Furthermore, it has a metric interpretation, as a distance (in the range of magnitudes) from calculated values of the magnitudes to regulatory acceptance limits. A probabilistic definition of safety margin (SM) is proposed in the Thesis. The SM from a value A to another value B of a safety magnitude is defined as the probability that A is less severe than B, obtained from the uncertainties of A and B. The probabilistic definition of SM has several advantages: it is nondimensional, ranges in the interval (0,1) and can be easily generalized to multiple dimensions. Furthermore, probabilistic SMs are combined according to the laws of probability. A basic property is that probabilistic SMs are not symmetric. There are several types of SM: the distance from a calculated value to a regulatory limit (licensing margin), from the real value to the calculated value of a magnitude (analytical margin), or from the regulatory limit to the damage threshold (barrier margin). These representations of distances (in the magnitudes’ range) as probabilities can be applied to the quantification of conservativeness. Analytical margins can be interpreted as the degree of conservativeness (DG) of the computational methodology. Conservativeness indicators are established in the Thesis, which are useful for comparing different methods of constructing tolerance limits and regions. One topic has not been rigorously tackled to date: the validation of BEPU methodologies.
Before being applied in licensing, methodologies must be validated on the basis of comparisons between their predictions and real values of the safety magnitudes. Real data are obtained, basically, in experimental facilities. The ultimate goal of establishing RAC is to verify that real values (in addition to calculated values) fulfill them. In the Thesis it is proved that a sufficient condition for this goal is the conjunction of two criteria: the BEPU RAC and an analogous criterion for validation. This last criterion must be proved in experimental scenarios and extrapolated to NPPs. The licensing RAC requires a minimum value (P0) of the probabilistic licensing margin; the validation criterion requires a minimum value of the analytical margin (i.e., of the DG). These minimum values are basically complementary: the higher one of them is, the lower the other. Regulatory practice sets a high value on the licensing margin, so that the required DG is low. The possible adoption of lower values for P0 would imply a weaker requirement on RAC fulfillment and, on the other hand, a stronger requirement on the conservativeness of the methodology. It is important to highlight that a higher minimum value of the licensing or analytical margin entails a higher computational cost. Therefore, the computational efforts are also complementary. If medium levels are adopted, the required DG is also medium, and the methodology does not need to be very conservative. The total computational effort (licensing plus validation) could be optimized.
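The probabilistic safety margin described in this abstract, the probability that a value A is less severe than a value B, can be illustrated numerically. The sketch below assumes, purely for illustration, that A and B are independent and normally distributed and that "less severe" means "smaller"; under those assumptions the margin has a closed form. The function name and all numbers are hypothetical and do not come from the Thesis.

```python
import math


def safety_margin_normal(mean_a: float, sd_a: float,
                         mean_b: float, sd_b: float) -> float:
    """Probabilistic margin from A to B, taken here as P(A < B).

    Assumption: A and B are independent normal variables, so that
    B - A is normal with mean (mean_b - mean_a) and variance
    sd_a**2 + sd_b**2.
    """
    z = (mean_b - mean_a) / math.sqrt(sd_a ** 2 + sd_b ** 2)
    # Standard normal CDF written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical example: a calculated magnitude A and a regulatory limit B.
margin_a_to_b = safety_margin_normal(1000.0, 50.0, 1200.0, 10.0)
margin_b_to_a = safety_margin_normal(1200.0, 10.0, 1000.0, 50.0)

if __name__ == "__main__":
    print(f"margin from A to B: {margin_a_to_b:.5f}")
    print(f"margin from B to A: {margin_b_to_a:.5f}")
```

The two values differ, which illustrates the lack of symmetry: for continuous variables the margin from B to A is the complement of the margin from A to B.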
Semiparametric estimates of the supply and demand effects of disability on labor force participation
Abstract:
This paper modifies and uses the semiparametric methods of Ichimura and Lee (1991) on standard cross-section data to decompose the effect of disability on labor force participation into a demand and a supply effect. It shows that straightforward use of Ichimura and Lee leads to meaningless results while imposing monotonicity on the unknown function leads to substantial results. The paper finds that supply effects dominate the demand effects of disability.
Abstract:
Thesis (Ph.D.)--University of Washington, 2015
Abstract:
The study of variable stars is an important topic in modern astrophysics. Since the invention of powerful telescopes and high-resolution CCDs, variable star data have been accumulating on the order of petabytes. This huge amount of data needs many automated methods as well as human experts. This thesis is devoted to the analysis of astronomical time series data of variable stars and hence belongs to the interdisciplinary field of astrostatistics. For an observer on Earth, stars whose apparent brightness changes over time are called variable stars. The variation in brightness may be regular (periodic), quasi-periodic (semi-periodic) or irregular (aperiodic) and has various causes. In some cases the variation is due to internal thermonuclear processes, and such stars are generally known as intrinsic variables; in other cases it is due to external processes, such as eclipses or rotation, and such stars are known as extrinsic variables. Intrinsic variables can be further grouped into pulsating variables, eruptive variables and flare stars. Extrinsic variables are grouped into eclipsing binary stars and chromospherical stars. Pulsating variables can in turn be classified into Cepheid, RR Lyrae, RV Tauri, Delta Scuti, Mira, etc. The eruptive or cataclysmic variables are novae, supernovae, etc., which occur rarely and are not periodic phenomena. Most of the other variations are periodic in nature. Variable stars can be observed in many ways, such as photometry, spectrophotometry and spectroscopy. A sequence of photometric observations of a variable star produces time series data containing time, magnitude and error. The plot of a variable star's apparent magnitude against time is known as a light curve. If the time series data are folded on a period, the plot of apparent magnitude against phase is known as a phased light curve. The unique shape of the phased light curve is characteristic of each type of variable star.
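Folding a time series on a period can be sketched in a few lines. The example assumes the usual convention that the phase is the fractional part of (t - epoch) / period; the function name, the reference epoch and the sample values are hypothetical.

```python
def fold_light_curve(times, magnitudes, period, epoch=0.0):
    """Return (phase, magnitude) pairs sorted by phase, with phase in [0, 1).

    Assumption: phase = fractional part of (t - epoch) / period.
    """
    if period <= 0:
        raise ValueError("period must be positive")
    folded = []
    for t, m in zip(times, magnitudes):
        phase = ((t - epoch) / period) % 1.0
        if phase >= 1.0:  # guard against floating-point round-up to 1.0
            phase = 0.0
        folded.append((phase, m))
    return sorted(folded)


if __name__ == "__main__":
    # Hypothetical observations: times in days, apparent magnitudes.
    times = [0.0, 0.7, 1.9, 3.1, 4.4]
    magnitudes = [12.1, 12.6, 12.5, 12.6, 12.4]
    for phase, mag in fold_light_curve(times, magnitudes, period=1.2):
        print(f"phase={phase:.3f}  magnitude={mag}")
```

Plotting magnitude against the returned phase gives the phased light curve; if the trial period is wrong, the points scatter instead of tracing a single curve.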
One way to identify and classify the type of a variable star is visual inspection of the phased light curve by an expert. In recent years, automated algorithms have been used to classify groups of variable stars with the help of computers. Research on variable stars can be divided into stages such as observation, data reduction, data analysis, modeling and classification. Modeling of variable stars helps to determine their short-term and long-term behaviour, to construct theoretical models (e.g., the Wilson-Devinney model for eclipsing binaries) and to derive stellar properties such as mass, radius, luminosity, temperature, internal and external structure, chemical composition and evolution. Classification requires the determination of basic parameters such as period, amplitude and phase, as well as some derived parameters. Of these, the period is the most important parameter, since wrong periods can lead to sparse light curves and misleading information. Time series analysis is a method of applying mathematical and statistical tests to data in order to quantify the variation, understand the nature of time-varying phenomena, gain physical understanding of the system and predict its future behavior. Astronomical time series usually suffer from unevenly spaced time instants, varying error conditions and the possibility of big gaps. For ground-based observations this is due to the daily variation of daylight and to weather conditions, while observations from space may suffer from the impact of cosmic ray particles. Many large-scale astronomical surveys, such as MACHO, OGLE, EROS, ROTSE, PLANET, Hipparcos, MISAO, NSVS, ASAS, Pan-STARRS, Kepler, ESA, Gaia, LSST and CRTS, provide time series data of variable stars, even though their primary purpose is not variable star observation.
The Center for Astrostatistics at Pennsylvania State University was established to help the astronomical community with statistical tools for harvesting and analysing archival data. Most of these surveys release their data to the public for further analysis. There exist many period search algorithms for astronomical time series analysis, which can be classified into parametric methods (which assume some underlying distribution for the data) and non-parametric methods (which do not assume any statistical model, such as a Gaussian). Many of the parametric methods are based on variations of the discrete Fourier transform, such as the Generalised Lomb-Scargle periodogram (GLSP) by Zechmeister (2009) and Significant Spectrum (SigSpec) by Reegen (2007). Non-parametric methods include Phase Dispersion Minimisation (PDM) by Stellingwerf (1978) and the cubic spline method by Akerlof (1994). Even though most of these methods can be automated, none of them can fully recover the true periods. Wrong period detection can have several causes, such as power leakage to other frequencies, which is due to the finite total interval, finite sampling interval and finite amount of data. Another problem is aliasing, which is due to the influence of regular sampling. In addition, spurious periods appear because of long gaps, and power flow to harmonic frequencies is an inherent problem of Fourier methods. Hence obtaining the exact period of a variable star from its time series data is still a difficult problem for huge databases subjected to automation. As Matthew Templeton, AAVSO, states, “Variable star data analysis is not always straightforward; large-scale, automated analysis design is non-trivial”. Derekas et al. (2007) and Deb et al. (2010) state, “The processing of huge amount of data in these databases is quite challenging, even when looking at seemingly small issues such as period determination and classification”.
It would benefit the variable star community if basic parameters such as period, amplitude and phase were obtained more accurately when huge time series databases are subjected to automation. In the present thesis, the theory of four popular period search methods is studied, the strengths and weaknesses of these methods are evaluated by applying them to two survey databases, and finally a modified form of the cubic spline method is introduced to confirm the exact period of a variable star. To classify newly discovered variable stars and enter them in the “General Catalogue of Variable Stars” or other databases such as the “Variable Star Index”, the characteristics of the variability have to be quantified in terms of variable star parameters.
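To make the idea of a non-parametric period search concrete, the following is a minimal sketch in the spirit of Phase Dispersion Minimisation: for each trial period the data are folded, the phases are split into bins, and the pooled within-bin variance is divided by the total variance. It is a simplified illustration with plain non-overlapping bins and synthetic data, not the modified cubic spline method of the thesis and not a full implementation of Stellingwerf's algorithm. All names and numbers are hypothetical.

```python
import math
import random


def pdm_theta(times, magnitudes, period, n_bins=10):
    """Pooled within-bin variance of the folded data over the total variance.

    Values well below 1 indicate that the trial period orders the data
    into a coherent phased light curve.
    """
    n = len(magnitudes)
    mean = sum(magnitudes) / n
    total_var = sum((m - mean) ** 2 for m in magnitudes) / (n - 1)

    bins = [[] for _ in range(n_bins)]
    for t, m in zip(times, magnitudes):
        phase = (t / period) % 1.0
        bins[min(int(phase * n_bins), n_bins - 1)].append(m)

    pooled_ss, dof = 0.0, 0
    for members in bins:
        if len(members) < 2:
            continue  # a variance needs at least two points
        bin_mean = sum(members) / len(members)
        pooled_ss += sum((m - bin_mean) ** 2 for m in members)
        dof += len(members) - 1
    if dof == 0 or total_var == 0.0:
        raise ValueError("not enough data to compute the statistic")
    return (pooled_ss / dof) / total_var


def best_period(times, magnitudes, trial_periods, n_bins=10):
    """Trial period with the smallest dispersion statistic."""
    return min(trial_periods, key=lambda p: pdm_theta(times, magnitudes, p, n_bins))


# Synthetic, unevenly sampled sinusoid with a known period (assumed data).
rng = random.Random(0)
true_period = 2.5
times = sorted(rng.uniform(0.0, 100.0) for _ in range(300))
magnitudes = [math.sin(2.0 * math.pi * t / true_period) + rng.gauss(0.0, 0.1)
              for t in times]
trial_periods = [1.0 + 0.01 * k for k in range(301)]  # 1.00 to 4.00
found_period = best_period(times, magnitudes, trial_periods)

if __name__ == "__main__":
    print(f"true period = {true_period}, recovered period = {found_period:.2f}")
```

The trial range is deliberately restricted so that integer multiples of the true period, which also give a small dispersion, are excluded; on real survey data that choice is one of the places where automated searches go wrong.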
Abstract:
In many applications the observed data can be viewed as a censored high dimensional full data random variable X. By the curse of dimensionality it is typically not possible to construct estimators that are asymptotically efficient at every probability distribution in a semiparametric censored data model of such a high dimensional censored data structure. We provide a general method for construction of one-step estimators that are efficient at a chosen submodel of the full-data model, are still well behaved off this submodel and can be chosen to always improve on a given initial estimator. These one-step estimators rely on good estimators of the censoring mechanism and thus will require a parametric or semiparametric model for the censoring mechanism. We present a general theorem that provides a template for proving the desired asymptotic results. We illustrate the general one-step estimation methods by constructing locally efficient one-step estimators of marginal distributions and regression parameters with right-censored data, current status data and bivariate right-censored data, in all models allowing the presence of time-dependent covariates. The conditions of the asymptotics theorem are rigorously verified in one of the examples and the key condition of the general theorem is verified for all examples.
Abstract:
2000 Mathematics Subject Classification: 62N01, 62N05, 62P10, 92D10, 92D30.