971 results for ALS data-set
Abstract:
From a life-course perspective, this study examines the intergenerational mobility of men and women in the cohorts 1929-31, 1939-41 and 1949-51. To what extent did the expansion of the public service open up opportunities for mobility? To what extent has the public service, as a special structure distinct from the private sector, expanded its function as a "mobility channel"? Do the institutional rules for recruiting and allocating labor that are characteristic of the public service modify this function? The empirical analyses draw on longitudinal data from the life-course project (Lebensverlaufsprojekt) at the Max Planck Institute for Human Development (Max-Planck-Institut für Bildungsforschung) in Berlin. Increasing inequality by social origin and by education determines a large share of the chances of intergenerational mobility. The expansion of state employment meant that, across successive cohorts, it was above all those career entrants able to enter the public service who moved up. Making up later for upward moves missed at career entry is hardly possible, and employment in the public service does not achieve this either. There are no sector-specific differences in the probability of intergenerational upward mobility over the course of a career. Because of the protection of vested rights (Besitzstandswahrung), state employees face a markedly lower risk of downward mobility than private-sector employees. The state sector has expanded its function as a channel of upward mobility for career entrants and guarantees its long-term employees the status they have attained. The public service is thus a further structural principle of social inequality.
Abstract:
Brain tumor is one of the most aggressive types of cancer in humans, with an estimated median survival time of 12 months and only 4% of patients surviving more than 5 years after diagnosis. Until recently, brain tumor prognosis has been based only on clinical information such as tumor grade and patient age, but there are reports indicating that molecular profiling of gliomas can reveal subgroups of patients with distinct survival rates. We hypothesize that coupling molecular profiling of brain tumors with clinical information might improve predictions of patient survival time and, consequently, better guide future treatment decisions. To evaluate this hypothesis, the general goal of this research is to build models for survival prediction of glioma patients using DNA molecular profiles (U133 Affymetrix gene expression microarrays) along with clinical information. First, a predictive Random Forest model is built for binary outcomes (i.e. short- vs. long-term survival), and a small subset of genes whose expression values can be used to predict survival time is selected. Next, a new statistical methodology is developed for predicting time-to-death outcomes using Bayesian ensemble trees. Because of the large heterogeneity observed within the prognostic classes obtained by the Random Forest model, prediction can be improved by relating time-to-death to the gene expression profile directly. We propose a Bayesian ensemble model for survival prediction that is appropriate for high-dimensional data such as gene expression data. Our approach is based on the ensemble "sum-of-trees" model, which is flexible enough to incorporate additive and interaction effects between genes. We specify a fully Bayesian hierarchical approach and illustrate our methodology for the CPH, Weibull, and AFT survival models. We overcome the lack of conjugacy by using a latent variable formulation to model the covariate effects, which decreases computation time for model fitting. Also, our proposed models provide a model-free way to select important predictive prognostic markers based on controlling false discovery rates. We compare the performance of our methods with baseline reference survival methods and apply our methodology to an unpublished data set of brain tumor survival times and gene expression data, selecting genes potentially related to the development of the disease under study. A closing discussion compares the results obtained by the Random Forest and Bayesian ensemble methods from biological and clinical perspectives and highlights the statistical advantages and disadvantages of the new methodology in the context of DNA microarray data analysis.
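The abstract does not say how its false-discovery-rate control is implemented. As a stand-in only, the sketch below shows the standard Benjamini-Hochberg step-up procedure applied to per-gene p-values; this is a swapped-in, commonly used technique, not the Bayesian selection rule the authors propose, and the p-values are made up for illustration.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return the indices of hypotheses rejected at FDR level q.

    Standard Benjamini-Hochberg step-up rule: find the largest rank k
    such that p_(k) <= q * k / m, then reject the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= q * rank / m:
            cutoff_rank = rank  # keep the largest rank that passes
    return sorted(order[:cutoff_rank])


# Hypothetical p-values for ten genes (not data from the study).
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
selected = benjamini_hochberg(p, q=0.05)
print(selected)
```

Because the rule is step-up, a gene whose p-value misses its own threshold can still be selected when a larger p-value further down the ranking passes its threshold.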
Abstract:
The radar reflectivity of an ice-sheet bed is a primary measurement for discriminating between thawed and frozen beds. Uncertainty in englacial radar attenuation and its spatial variation introduces corresponding uncertainty in estimates of basal reflectivity. Radar attenuation is proportional to ice conductivity, which depends on the concentrations of acid and sea-salt chloride and the temperature of the ice. We synthesize published conductivity measurements to specify an ice-conductivity model and find that some of the dielectric properties of ice at radar frequencies are not yet well constrained. Using depth profiles of ice-core chemistry and borehole temperature and an average of the experimental values for the dielectric properties, we calculate an attenuation rate profile for Siple Dome, West Antarctica. The depth-averaged modeled attenuation rate at Siple Dome (20.0 ± 5.7 dB km⁻¹) is somewhat lower than the value derived from radar profiles (25.3 ± 1.1 dB km⁻¹). Pending more experimental data on the dielectric properties of ice, we can match the modeled and radar-derived attenuation rates by an adjustment to the value for the pure ice conductivity that is within the range of reported values. Alternatively, using the pure ice dielectric properties derived from the most extensive single data set, the modeled depth-averaged attenuation rate is 24.0 ± 2.2 dB km⁻¹. This work shows how to calculate englacial radar attenuation using ice chemistry and temperature data and establishes a basis for mapping spatial variations in radar attenuation across an ice sheet.
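The calculation outlined in this abstract can be sketched roughly as follows. This is a minimal illustration, assuming an Arrhenius-type temperature dependence for each conductivity term and the standard low-loss relation between conductivity and attenuation. Every parameter value in `PARAMS`, the relative permittivity of 3.2 and the three-sample profile are illustrative placeholders, not values taken from the abstract.

```python
import math

K_BOLTZMANN_EV = 8.617333262e-5  # Boltzmann constant in eV/K
EPS0 = 8.8541878128e-12          # vacuum permittivity in F/m
C_LIGHT = 299_792_458.0          # speed of light in m/s

# ASSUMPTION: illustrative placeholder parameters, not values from the abstract.
PARAMS = {
    "t_ref": 251.0,        # reference temperature, K
    "sigma_pure": 7.2e-6,  # pure-ice conductivity at t_ref, S/m
    "mu_acid": 3.2,        # molar conductivity of acid, S/m per mol/L
    "mu_sscl": 0.43,       # molar conductivity of sea-salt chloride, S/m per mol/L
    "e_pure": 0.55,        # activation energy of pure ice, eV
    "e_acid": 0.20,        # activation energy of the acid term, eV
    "e_sscl": 0.19,        # activation energy of the sea-salt chloride term, eV
}


def arrhenius(t_kelvin, activation_ev, t_ref):
    """Arrhenius scaling factor relative to the reference temperature."""
    return math.exp(activation_ev / K_BOLTZMANN_EV * (1.0 / t_ref - 1.0 / t_kelvin))


def conductivity(t_kelvin, acid_molar, sscl_molar, p=PARAMS):
    """Ice conductivity (S/m) from temperature (K) and impurity concentrations (mol/L)."""
    t_ref = p["t_ref"]
    return (p["sigma_pure"] * arrhenius(t_kelvin, p["e_pure"], t_ref)
            + p["mu_acid"] * acid_molar * arrhenius(t_kelvin, p["e_acid"], t_ref)
            + p["mu_sscl"] * sscl_molar * arrhenius(t_kelvin, p["e_sscl"], t_ref))


def attenuation_db_per_km(sigma, eps_r=3.2):
    """One-way attenuation rate (dB/km) in the low-loss limit; eps_r is assumed."""
    alpha_np_per_m = sigma / (2.0 * EPS0 * math.sqrt(eps_r) * C_LIGHT)
    return alpha_np_per_m * (20.0 / math.log(10.0)) * 1000.0


def depth_averaged_attenuation(profile, p=PARAMS):
    """Mean attenuation rate over equally spaced (T, acid, ssCl) samples."""
    rates = [attenuation_db_per_km(conductivity(t, a, c, p)) for t, a, c in profile]
    return sum(rates) / len(rates)


# Hypothetical three-sample profile: (temperature K, acid mol/L, ssCl mol/L).
profile = [(248.0, 1.0e-6, 2.0e-6), (255.0, 1.5e-6, 2.0e-6), (265.0, 2.0e-6, 1.0e-6)]
print(round(depth_averaged_attenuation(profile), 1), "dB/km")
```

Keeping the parameters in one dictionary makes it easy to repeat the calculation with a different pure-ice conductivity, which is the kind of adjustment the abstract describes.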
Abstract:
Results from the Zurich study have shown lasting associations between sport practice and mental health. The effects are pronounced in people with pre-existing mental health problems. This analysis aims to replicate these results with the large Swiss Household Panel (SHP) data set and to provide more differentiated results. The analysis covered the 1999-2003 interviews and included 3891 stayers, i.e., participants who were interviewed in all years. The outcome variables are depression / blues / anxiety, weakness / weariness, sleeping problems, and energy / optimism. Confounding variables include sex, age, education level, and citizenship. The analyses were carried out with mixed models (depression, optimism) and GEE models (weakness, sleep). About 60% of the SHP participants practise an individual or team sport weekly or daily. A similar proportion frequently engages in physical activity (for at least half an hour) that leaves them slightly breathless. There are slight age-specific differences but also noteworthy regional differences. Practice of sport is clearly associated with self-reported depressive symptoms, optimism and weakness. This holds even after some relevant confounders – sex, educational level and citizenship – were introduced into the model. However, no relevant interaction effects with time could be shown. Moreover, models with direct associations commonly fitted better than models with lagged variables, indicating that delayed effects of sport practice on self-reported psychological complaints are less important. Different model variants emerged for specific subgroups, for example, participants with a high vs. low initial activity level. Lack of sport practice is an interesting marker for serious psychological symptoms and mental disorders. The background of this association may differ between subgroups and should stimulate further investigation in this area.
Abstract:
We present a new thermodynamic activity-composition model for di-trioctahedral chlorite in the system FeO–MgO–Al2O3–SiO2–H2O that is based on the Holland–Powell internally consistent thermodynamic data set. The model is formulated in terms of four linearly independent end-members: amesite, clinochlore, daphnite and sudoite. These account for the most important crystal-chemical substitutions in chlorite: the Fe–Mg, Tschermak and di-trioctahedral substitutions. The ideal part of end-member activities is modeled with a mixing-on-site formalism, and non-ideality is described by a macroscopic symmetric (regular) formalism. The symmetric interaction parameters were calibrated using a set of 271 published chlorite analyses for which robust independent temperature estimates are available. In addition, adjustment of the standard state thermodynamic properties of sudoite was required to accurately reproduce experimental brackets involving sudoite. This new model was tested by calculating representative P–T sections for metasediments at low temperatures (<400 °C), in particular sudoite- and chlorite-bearing metapelites from Crete. Comparison between the calculated mineral assemblages and field data shows that the new model is able to predict the coexistence of chlorite and sudoite at low metamorphic temperatures. The predicted lower limit of the chloritoid stability field is also in better agreement with petrological observations. For practical applications to metamorphic and hydrothermal environments, two new semi-empirical chlorite geothermometers named Chl(1) and Chl(2) were calibrated based on the chlorite + quartz + water equilibrium (2 clinochlore + 3 sudoite = 4 amesite + 4 H2O + 7 quartz). The Chl(1) thermometer requires knowledge of the (Fe3+/ΣFe) ratio in chlorite and predicts correct temperatures for a range of redox conditions. The Chl(2) geothermometer, which assumes that all iron in chlorite is ferrous, has been applied to partially recrystallized detrital chlorite from the Zone houillère in the French Western Alps.
Abstract:
Academic and industrial research in the late 90s has brought about an exponential explosion of DNA sequence data. Automated expert systems are being created to help biologists extract patterns, trends and links from this ever-deepening ocean of information. Two such systems, aimed at retrieving and subsequently utilizing phylogenetically relevant information, have been developed in this dissertation, the major objective of which was to automate the often difficult and confusing phylogenetic reconstruction process. Popular phylogenetic reconstruction methods, such as distance-based methods, attempt to find an optimal tree topology (one that reflects the relationships among related sequences and their evolutionary history) by searching through the topology space. Various compromises between fast (but incomplete) and exhaustive (but computationally prohibitive) search heuristics have been suggested. The second chapter of this dissertation describes an intelligent compromise algorithm that relies on a flexible “beam” search principle from the Artificial Intelligence domain and uses pre-computed local topology reliability information to adjust the beam search space continuously. However, sometimes even a (virtually) complete distance-based method is inferior to the significantly more elaborate (and computationally expensive) maximum likelihood (ML) method. In fact, depending on the nature of the sequence data in question, either method might prove superior. Therefore, it is difficult (even for an expert) to tell a priori which phylogenetic reconstruction method, whether distance-based, ML or maybe maximum parsimony (MP), should be chosen for any particular data set. A number of factors, often hidden, influence the performance of a method. For example, it is generally understood that for a phylogenetically “difficult” data set more sophisticated methods (e.g., ML) tend to be more effective and thus should be chosen. However, it is the interplay of many factors that one needs to consider in order to avoid choosing an inferior method (potentially a costly mistake, both in terms of computational expense and in terms of reconstruction accuracy). Chapter III of this dissertation details a phylogenetic reconstruction expert system that selects the proper method automatically. It uses a classifier (a Decision Tree-inducing algorithm) to map a new data set to the proper phylogenetic reconstruction method.
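The beam search principle mentioned in this abstract can be illustrated with a generic, fixed-width sketch. It is not the dissertation's algorithm: the abstract describes a beam that is adjusted continuously from local topology reliability information, whereas the function below keeps a constant `width`. The toy states and scores are hypothetical and stand in for partial tree topologies and their fit.

```python
import heapq


def beam_search(start, expand, score, width, steps):
    """Generic beam search: keep only the `width` best candidates per step."""
    beam = [start]
    best = start
    for _ in range(steps):
        candidates = [child for state in beam for child in expand(state)]
        if not candidates:
            break
        beam = heapq.nlargest(width, candidates, key=score)
        if score(beam[0]) > score(best):
            best = beam[0]
    return best


# Hypothetical toy search space: two choices ("a"/"b"), then two more ("x"/"y").
SCORES = {
    (): 0,
    ("a",): 5, ("b",): 4,
    ("a", "x"): 6, ("a", "y"): 5,
    ("b", "x"): 10, ("b", "y"): 1,
}


def expand(state):
    """Return the successors of a partial solution."""
    if len(state) == 0:
        return [state + (c,) for c in "ab"]
    if len(state) == 1:
        return [state + (c,) for c in "xy"]
    return []


def score(state):
    return SCORES[state]


greedy = beam_search((), expand, score, width=1, steps=2)
wider = beam_search((), expand, score, width=2, steps=2)
print(greedy, wider)
```

With `width=1` the search is greedy and commits to the locally best first step; widening the beam keeps the runner-up alive long enough to find the better final state, which is the trade-off between fast and exhaustive search that the abstract refers to.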
Abstract:
Lovell and Rouse (LR) have recently proposed a modification of the standard DEA model that overcomes the infeasibility problem often encountered in computing super-efficiency. In the LR procedure one appropriately scales up the observed input vector (or scales down the output vector) of the relevant super-efficient firm, thereby usually creating its inefficient surrogate. An alternative procedure proposed in this paper uses the directional distance function introduced by Chambers, Chung, and Färe and the resulting Nerlove-Luenberger (NL) measure of super-efficiency. The fact that the directional distance function combines features of both an input-oriented and an output-oriented model generally leads to a more complete ranking of the observations than either of the oriented models. An added advantage of this approach is that the NL super-efficiency measure is unique and does not depend on any arbitrary choice of a scaling parameter. A data set on international airlines from Coelli, Perelman, and Griffel-Tatje (2002) is utilized in an illustrative empirical application.
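To make the directional-distance idea concrete, here is a minimal sketch for a deliberately simplified case: one input, one output, constant returns to scale, and a direction vector equal to the evaluated firm's own observed input and output. All of these are assumptions made for illustration; the abstract does not state the paper's technology or direction vector. Under these assumptions the linear program has a closed form: the largest β such that a reference firm (excluding the evaluated one) can produce (1 + β)·y from (1 − β)·x.

```python
def nl_super_efficiency(firms, k):
    """Directional-distance score of firm k against all OTHER firms.

    firms: list of (input, output) pairs with positive values.
    Assumes one input, one output and constant returns to scale, so the
    reference frontier is y = r_best * x with r_best the best output/input
    ratio among the other firms. Solving
        (1 + beta) * y_k = r_best * (1 - beta) * x_k
    gives beta = (r_best - r_k) / (r_best + r_k).
    In this sketch beta < 0 marks a super-efficient firm, beta > 0 an
    inefficient one.
    """
    x_k, y_k = firms[k]
    r_best = max(y / x for j, (x, y) in enumerate(firms) if j != k)
    r_k = y_k / x_k
    return (r_best - r_k) / (r_best + r_k)


# Hypothetical firms as (input, output); not the airline data from the paper.
firms = [(2.0, 6.0), (4.0, 8.0), (5.0, 5.0)]
for k in range(len(firms)):
    print(k, round(nl_super_efficiency(firms, k), 3))
```

Because the score contracts the input and expands the output at the same time, a single number ranks efficient and inefficient firms on the same scale, with no scaling parameter to choose.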
Abstract:
Despite the extensive work on currency mismatches, research on the determinants and effects of maturity mismatches is scarce. In this paper I show that emerging market maturity mismatches are adversely affected by capital inflows and price volatilities. Furthermore, I find that banks with low maturity mismatches are more profitable during crisis periods but less profitable otherwise. The latter result implies that banks face a tradeoff between higher returns and risk; hence channeling short-term capital into long-term loans is caused by cronyism and implicit guarantees rather than the depth of the financial market. The positive relationship between maturity mismatches and price volatility, on the other hand, shows that banks in countries with high exchange rate and interest rate volatilities cannot, or choose not to, hedge themselves. These results follow from a panel regression on a data set I constructed by merging bank-level data with aggregate data. This offers an advantage over traditional studies, which focus only on aggregate data.
Abstract:
This data set contains a time series of plant height measurements (vegetative and reproductive) from the main experiment plots of a large grassland biodiversity experiment (the Jena Experiment; see further details below). In addition, data on species-specific plant heights for the main experiment are available from 2002. In the main experiment, 82 grassland plots of 20 x 20 m were established from a pool of 60 species belonging to four functional groups (grasses, legumes, tall and small herbs). In May 2002, varying numbers of plant species from this species pool were sown into the plots to create a gradient of plant species richness (1, 2, 4, 8, 16 and 60 species) and functional richness (1, 2, 3, 4 functional groups). Plots were maintained by bi-annual weeding and mowing. 1. Plant height was generally recorded twice a year, just before biomass harvest (during peak standing biomass in late May and in late August). Methods of measuring height have varied somewhat over the years. In earlier years the stretched plant height was measured, while in later years the standing height without stretching the plant was measured. Vegetative height was measured either as the height of the highest leaf or as the length of the main axis of non-flowering plants. Regenerative height was measured either as the height of the highest flower on a plant or as the height of the main axis of flowering plants. Sampled plants were either randomly selected in the core area of plots or taken along transects at defined distances. For details refer to the description of individual years. Starting in 2006, the plots of the management experiment, which altered mowing frequency and fertilized subplots (see further details in the general description of the Jena Experiment), were also sampled. 2. Species-specific plant height was recorded twice in 2002: in late July (vegetative height) and just before biomass harvest during peak standing biomass in late August (vegetative and regenerative height). For each plot and each sown species in the species pool, 3 plant individuals (if present) from the central area of the plots were randomly selected and used to measure vegetative height (non-flowering individuals) and regenerative height (flowering individuals) as stretched height. Provided are the means over the three measurements per plant species per plot.
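The final aggregation step (means over the measured individuals per species per plot) can be sketched as below. The record layout, the plot and species labels, and the use of centimetres are hypothetical; the abstract does not specify a file format or units.

```python
from collections import defaultdict
from statistics import mean


def mean_height_per_species_per_plot(records):
    """Average heights over the measured individuals for each (plot, species).

    records: iterable of (plot, species, height) tuples, one per individual.
    Groups with fewer than three individuals are averaged over what is present.
    """
    groups = defaultdict(list)
    for plot, species, height in records:
        groups[(plot, species)].append(height)
    return {key: mean(values) for key, values in groups.items()}


# Hypothetical measurements in cm: (plot, species, stretched height).
records = [
    ("P1", "species_a", 10.0),
    ("P1", "species_a", 12.0),
    ("P1", "species_a", 14.0),
    ("P1", "species_b", 30.0),
    ("P2", "species_a", 8.0),
    ("P2", "species_a", 9.0),
]
means = mean_height_per_species_per_plot(records)
print(means[("P1", "species_a")])
```

Grouping on the (plot, species) pair rather than on species alone keeps the plot-level values separate, matching the "per plant species per plot" wording of the abstract.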