77 results for model selection in binary regression
Abstract:
Although neuroimaging research has evidenced specific responses to visual food stimuli based on their nutritional quality (e.g., energy density, fat content), brain processes underlying portion size selection remain largely unexplored. We identified spatio-temporal brain dynamics in response to meal images varying in portion size during a task of ideal portion selection for prospective lunch intake and expected satiety. Brain responses to meal portions judged by the participants as 'too small', 'ideal' and 'too big' were measured by means of electroencephalographic (EEG) recordings in 21 normal-weight women. During an early stage of meal viewing (105-145 ms), the data showed an incremental increase of the head-surface global electric field strength (quantified via global field power; GFP) as portion judgments ranged from 'too small' to 'too big'. Estimations of neural source activity revealed that the brain regions underlying this effect were located in the insula, middle frontal gyrus and middle temporal gyrus, and are similar to those reported in previous studies investigating responses to changes in food nutritional content. In contrast, during a later stage (230-270 ms), GFP was maximal for the 'ideal' relative to the 'non-ideal' portion sizes. Greater neural source activity to 'ideal' vs. 'non-ideal' portion sizes was observed in the inferior parietal lobule, superior temporal gyrus and mid-posterior cingulate gyrus. Collectively, our results provide evidence that several brain regions involved in attention and adaptive behavior track 'ideal' meal portion sizes as early as 230 ms during visual encounter. That is, responses do not show an increase paralleling the amount of food viewed (and, by extension, the amount of reward), but are shaped by regulatory mechanisms.
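For readers unfamiliar with the measure, GFP at each time sample is simply the standard deviation of the potential across all electrodes (Lehmann and Skrandies' classic definition). A minimal NumPy sketch of that computation; the array shape and data below are hypothetical, not the study's recordings:

```python
import numpy as np

def global_field_power(eeg):
    """Global field power: the spatial standard deviation of the scalp
    potential across electrodes at each time sample.

    eeg : array of shape (n_channels, n_samples)
    """
    # Re-reference to the average across channels, then take the
    # root-mean-square over channels at every time point.
    centered = eeg - eeg.mean(axis=0, keepdims=True)
    return np.sqrt((centered ** 2).mean(axis=0))

# Hypothetical example: 64 channels, 500 time samples
rng = np.random.default_rng(0)
gfp = global_field_power(rng.normal(size=(64, 500)))  # shape (500,)
```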
Abstract:
Educational institutions are considered a keystone for the establishment of a meritocratic society. They supposedly serve two functions: an educational function that promotes learning for all, and a selection function that sorts individuals into different programs, and ultimately social positions, based on individual merit. We study how the function of selection relates to support for assessment practices known to harm vs. benefit lower-status students, through the perceived justice principles underlying these practices. We study two assessment practices: normative assessment, focused on ranking and social comparison and known to hinder the success of lower-status students, and formative assessment, focused on learning and improvement and known to benefit lower-status students. Normative assessment is usually perceived as relying on an equity principle, with rewards allocated based on merit, and should thus appear positively associated with the function of selection. Formative assessment is usually perceived as relying on corrective justice that aims to ensure equality of outcomes by considering students' needs, which makes it less suitable for the function of selection. A questionnaire measuring these constructs was administered to university students. Results showed that believing that education is intended to select the best students positively predicts support for normative assessment, through increased perception of its reliance on equity, and negatively predicts support for formative assessment, through reduced perception of its ability to establish corrective justice. This study suggests that the belief in the function of selection as inherent to educational institutions can contribute to the reproduction of social inequalities: it prevents a change from assessment practices known to disadvantage lower-status students (normative assessment) to more favorable practices (formative assessment), and promotes matching beliefs in justice principles.
Abstract:
We investigate what processes may underlie heterogeneity in social preferences. We address this question by examining participants' decisions and associated response times across 12 mini-ultimatum games. Using a finite mixture model and cross-validating its classification with a response time analysis, we identified four groups of responders: one group takes little to no account of the proposed split or the foregone allocation and swiftly accepts any positive offer; two groups process primarily the objective properties of the allocations (fairness and kindness) and need more time the more properties need to be examined; and a fourth group, which takes more time than the others, appears to take into account what they would have proposed had they been put in the role of the proposer. We discuss implications of this joint decision-response time analysis.
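As an illustration of the generic machinery (not the authors' exact specification), a finite mixture fitted to log response times, with the number of latent groups chosen by BIC, might look like this in Python; all data below are simulated:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
# Simulated log response times from two hypothetical responder types:
# fast near-unconditional accepters and slower deliberators
log_rt = np.concatenate([rng.normal(-0.2, 0.3, 200),
                         rng.normal(0.9, 0.4, 100)])[:, None]

# Fit mixtures with 1-5 components and keep the BIC-best model
fits = [GaussianMixture(n_components=k, random_state=0).fit(log_rt)
        for k in range(1, 6)]
best = min(fits, key=lambda m: m.bic(log_rt))
labels = best.predict(log_rt)  # latent group assignment per decision
```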
Abstract:
The signalling function of melanin-based colouration is debated. Sexual selection theory states that ornaments should be costly to produce, maintain, wear or display in order to signal quality honestly to potential mates or competitors. An increasing number of studies support the hypothesis that the degree of melanism covaries with aspects of body condition (e.g. body mass or immunity), which has contributed to changing the initial perception that melanin-based colour ornaments entail no costs. Indeed, the expression of many (but not all) melanin-based colour traits is weakly sensitive to the environment but strongly heritable, suggesting that these colour traits are relatively cheap to produce and maintain, and thus raising the question of how such colour traits could signal quality honestly. Here I review the production, maintenance and wearing/displaying costs that can generate a correlation between melanin-based colouration and body condition, and consider other evolutionary mechanisms that can also lead to covariation between colour and body condition. Because genes controlling melanic traits can affect numerous phenotypic traits, pleiotropy could also explain a linkage between body condition and colouration. Pleiotropy may result in differently coloured individuals signalling different aspects of quality that are maintained by frequency-dependent selection or local adaptation. Colouration may therefore not signal absolute quality to potential mates or competitors (e.g. dark males may not achieve a higher fitness than pale males); otherwise genetic variation would be rapidly depleted by directional selection. As a consequence, selection on heritable melanin-based colouration may not always be directional, but mate choice may be conditional on environmental conditions (i.e. context-dependent sexual selection). Despite the interest of evolutionary biologists in the adaptive value of melanin-based colouration, its actual role in sexual selection is still poorly understood.
Abstract:
This paper presents the general regression neural network (GRNN) as a nonlinear regression method for the interpolation of monthly wind speeds in complex Alpine orography. The GRNN is trained on data from Swiss meteorological networks to learn the statistical relationship between topographic features and wind speed. Terrain convexity, slope and exposure are taken into account by extracting features from the digital elevation model at different spatial scales using specialised convolution filters. A database of gridded monthly wind speeds is then constructed by applying the GRNN in prediction mode over the period 1968-2008. This study demonstrates that using topographic features as inputs to the GRNN significantly reduces cross-validation errors relative to low-dimensional models integrating only geographical coordinates and terrain height for the interpolation of wind speed. The spatial predictability of wind speed is found to be lower in summer than in winter due to more complex and weaker wind-topography relationships. The relevance of these relationships is studied using an adaptive version of the GRNN algorithm, which selects the useful terrain features by eliminating the noisy ones. This research provides a framework for extending low-dimensional interpolation models to high-dimensional spaces by integrating additional features that account for topographic conditions at multiple spatial scales.
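A GRNN is essentially a Nadaraya-Watson kernel regression: the prediction at a query point is a Gaussian-kernel-weighted average of the training targets. A minimal sketch under that reading; the feature set, bandwidth, and data below are illustrative, not the paper's configuration:

```python
import numpy as np

def grnn_predict(X_train, y_train, X_query, sigma=1.0):
    """General regression neural network (Specht-style) prediction:
    kernel-weighted average of training targets."""
    # Squared Euclidean distances between query and training points
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=-1)
    w = np.exp(-d2 / (2.0 * sigma ** 2))      # Gaussian kernel weights
    return (w @ y_train) / w.sum(axis=1)      # weighted mean per query

# Hypothetical inputs: multi-scale terrain features (convexity, slope, ...)
rng = np.random.default_rng(2)
X = rng.normal(size=(500, 5))
y = 2.0 * X[:, 0] - X[:, 1] + rng.normal(scale=0.1, size=500)  # synthetic wind speed
print(grnn_predict(X, y, X[:3], sigma=0.8))
```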
Abstract:
The role of land cover change as a significant component of global change has become increasingly recognized in recent decades. Large databases measuring land cover change, and the data that can potentially be used to explain the observed changes, are also becoming more commonly available. When developing statistical models to investigate observed changes, it is important to be aware that the chosen sampling strategy and modelling techniques can influence results. We present a comparison of three sampling strategies and two forms of grouped logistic regression models (multinomial and ordinal) in the investigation of patterns of successional change after agricultural land abandonment in Switzerland. Results indicated that both ordinal and nominal transitional change occurs in the landscape and that the use of different sampling regimes and modelling techniques as investigative tools yields different results. Synthesis and applications. Our multimodel inference successfully identified a set of consistently selected indicators of land cover change, which can be used to predict further change, including annual average temperature, the number of already overgrown neighbouring areas of land and distance to historically destructive avalanche sites. This allows for more reliable decision making and planning with respect to landscape management. Although both model approaches gave similar results, ordinal regression yielded more parsimonious models that identified the important predictors of land cover change more efficiently. This approach is therefore preferable where the land cover change pattern can be interpreted as an ordinal process; otherwise, multinomial logistic regression is a viable alternative.
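To make the contrast concrete, here is a hedged sketch of fitting both model forms to simulated successional-stage data with statsmodels; the predictors, scales, and effect sizes are invented for illustration:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.miscmodels.ordinal_model import OrderedModel

rng = np.random.default_rng(3)
n = 400
temp = rng.normal(8, 2, n)    # annual average temperature (invented scale)
neigh = rng.poisson(2, n)     # overgrown neighbouring parcels (invented)
latent = 0.5 * temp + 0.4 * neigh + rng.logistic(size=n)
stage = np.digitize(latent, np.quantile(latent, [0.4, 0.75]))  # ordered 0/1/2
X = pd.DataFrame({"temp": temp, "neigh": neigh})

multinom = sm.MNLogit(stage, sm.add_constant(X)).fit(disp=0)
ordinal = OrderedModel(stage, X, distr="logit").fit(method="bfgs", disp=0)
# The ordinal model spends fewer parameters on the same predictors
print(multinom.aic, ordinal.aic)
```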
Abstract:
Background: Multiple logistic regression is precluded from many practical applications in ecology that aim to predict the geographic distributions of species because it requires absence data, which are rarely available or are unreliable. In order to use multiple logistic regression, many studies have simulated "pseudo-absences" through a number of strategies, but it is unknown how the choice of strategy influences models and their geographic predictions of species. In this paper we evaluate the effect of several prevailing pseudo-absence strategies on the predictions of the geographic distribution of a virtual species whose "true" distribution and relationship to three environmental predictors was predefined. We evaluated the effect of using (a) real absences, (b) pseudo-absences selected randomly from the background, and (c) two-step approaches: pseudo-absences selected from low-suitability areas predicted by either Ecological Niche Factor Analysis (ENFA) or BIOCLIM. We compared how the choice of pseudo-absence strategy affected model fit, predictive power, and information-theoretic model selection results. Results: Models built with true absences had the best predictive power and best discriminatory power, and the "true" model (the one that contained the correct predictors) was supported by the data according to AIC, as expected. Models based on random pseudo-absences had among the lowest fit, but yielded the second highest AUC value (0.97), and the "true" model was also supported by the data. Models based on two-step approaches had intermediate fit, the lowest predictive power, and the "true" model was not supported by the data. Conclusion: If ecologists wish to build parsimonious GLM models that will allow them to make robust predictions, a reasonable approach is to use a large number of randomly selected pseudo-absences and perform model selection based on an information-theoretic approach. However, the resulting models can be expected to have limited fit.
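A hedged sketch of the "random pseudo-absence plus information-theoretic selection" recipe; the virtual species, predictors, and sample sizes below are all invented:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
# Virtual landscape: three environmental predictors, only two of which matter
env = rng.normal(size=(10000, 3))
p_true = 1 / (1 + np.exp(-(1.5 * env[:, 0] - env[:, 1])))
presences = env[rng.random(10000) < 0.1 * p_true]              # sparse sightings
pseudo_abs = env[rng.choice(10000, size=2000, replace=False)]  # random background

X = np.vstack([presences, pseudo_abs])
y = np.r_[np.ones(len(presences)), np.zeros(len(pseudo_abs))]

# Compare candidate logistic GLMs by AIC; the "true" model uses x1 + x2
for cols in [(0,), (0, 1), (0, 1, 2)]:
    fit = sm.GLM(y, sm.add_constant(X[:, list(cols)]),
                 family=sm.families.Binomial()).fit()
    print(cols, round(fit.aic, 1))
```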
Abstract:
Over thirty years ago, Leamer (1983) - among many others - expressed doubts about the quality and usefulness of empirical analyses for the economic profession by stating that "hardly anyone takes data analyses seriously. Or perhaps more accurately, hardly anyone takes anyone else's data analyses seriously" (p. 37). Improvements in data quality, more robust estimation methods and the evolution of better research designs seem to make that assertion no longer justifiable (see Angrist and Pischke (2010) for a recent response to Leamer's essay). The economic profession and policy makers alike often rely on empirical evidence as a means to investigate policy-relevant questions. The approach of using scientifically rigorous and systematic evidence to identify policies and programs that are capable of improving policy-relevant outcomes is known under the increasingly popular notion of evidence-based policy. Evidence-based economic policy often relies on randomized or quasi-natural experiments in order to identify causal effects of policies. These can require relatively strong assumptions or raise concerns of external validity. In the context of this thesis, potential concerns are, for example, endogeneity of policy reforms with respect to the business cycle in the first chapter, the trade-off between precision and bias in the regression-discontinuity setting in chapter 2, or non-representativeness of the sample due to self-selection in chapter 3. While the identification strategies are very useful for gaining insights into the causal effects of specific policy questions, transforming the evidence into concrete policy conclusions can be challenging. Policy development should therefore rely on the systematic evidence of a whole body of research on a specific policy question rather than on a single analysis. In this sense, this thesis cannot and should not be viewed as a comprehensive analysis of specific policy issues but rather as a first step towards a better understanding of certain aspects of a policy question. The thesis applies new and innovative identification strategies to policy-relevant and topical questions in the fields of labor economics and behavioral environmental economics. Each chapter relies on a different identification strategy. In the first chapter, we employ a difference-in-differences approach to exploit the quasi-experimental change in the entitlement to the maximum unemployment benefit duration to identify the medium-run effects of reduced benefit durations on post-unemployment outcomes. Shortening benefit duration carries a double dividend: it generates fiscal benefits without deteriorating the quality of job matches. On the contrary, shortened benefit durations improve medium-run earnings and employment, possibly through containing the negative effects of skill depreciation or stigmatization. While the first chapter provides only indirect evidence on the underlying behavioral channels, in the second chapter I develop a novel approach that makes it possible to learn about the relative importance of the two key margins of job search - reservation wage choice and search effort. In the framework of a standard non-stationary job search model, I show how the exit rate from unemployment can be decomposed in a way that is informative about reservation wage movements over the unemployment spell.
The empirical analysis relies on a sharp discontinuity in unemployment benefit entitlement, which can be exploited in a regression-discontinuity approach to identify the effects of extended benefit durations on unemployment and survivor functions. I find evidence that calls for an important role of reservation wage choices in job search behavior. This can have direct implications for the optimal design of unemployment insurance policies. The third chapter - while thematically detached from the other chapters - addresses one of the major policy challenges of the 21st century: climate change and resource consumption. Many governments have recently put energy efficiency on top of their agendas. While pricing instruments aimed at regulating energy demand have often been found to be short-lived and difficult to enforce politically, the focus of energy conservation programs has shifted towards behavioral approaches - such as the provision of information or social norm feedback. The third chapter describes a randomized controlled field experiment in which we discuss the effectiveness of different types of feedback on residential electricity consumption. We find that detailed and real-time feedback caused persistent electricity reductions on the order of 3 to 5% of daily electricity consumption. Social norm information can also generate substantial electricity savings when designed appropriately. The findings suggest that behavioral approaches constitute an effective and relatively cheap way of improving residential energy efficiency.
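For the difference-in-differences design of the first chapter, the canonical two-group, two-period estimating equation is an OLS regression with a treatment-by-post interaction. A hedged sketch on simulated data; the variable names and the 0.05 "true effect" are invented, not the thesis's specification:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(5)
n = 2000
df = pd.DataFrame({
    "treated": rng.integers(0, 2, n),  # entitlement group hit by the reform
    "post": rng.integers(0, 2, n),     # observed after the reform
})
# Simulated log earnings with a true interaction effect of +0.05
df["log_earn"] = (0.02 * df["treated"] + 0.01 * df["post"]
                  + 0.05 * df["treated"] * df["post"]
                  + rng.normal(0, 0.2, n))

did = smf.ols("log_earn ~ treated * post", data=df).fit(cov_type="HC1")
print(did.params["treated:post"])  # the difference-in-differences estimate
```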
Abstract:
In this paper we study the relevance of multiple kernel learning (MKL) for the automatic selection of time series inputs. Recently, MKL has gained great attention in the machine learning community due to its flexibility in modelling complex patterns and performing feature selection. In general, MKL constructs the kernel as a weighted linear combination of basis kernels, exploiting different sources of information. An efficient algorithm wrapping a Support Vector Regression model for optimizing the MKL weights, named SimpleMKL, is used for the analysis. In this sense, MKL performs feature selection by discarding inputs/kernels with low or zero weights. The proposed approach is tested with simulated linear and nonlinear time series (autoregressive, Hénon and Lorenz series).
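The core construction is K = Σ_m d_m K_m with d_m ≥ 0: inputs whose kernels receive (near-)zero weight are effectively discarded. A hedged sketch with fixed, illustrative weights and an SVR on the combined precomputed kernel; SimpleMKL would optimize the d_m jointly with the SVR, an optimization omitted here:

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVR

rng = np.random.default_rng(6)
# Simulated AR(2) series; each candidate lag gets its own basis kernel
x = np.zeros(300)
for t in range(2, 300):
    x[t] = 0.6 * x[t - 1] - 0.3 * x[t - 2] + rng.normal(scale=0.1)
lags = np.column_stack([x[3 - k:300 - k] for k in (1, 2, 3)])  # lags 1-3
y = x[3:]

basis = [rbf_kernel(lags[:, [m]]) for m in range(3)]  # one kernel per input
d = np.array([0.5, 0.4, 0.1])               # illustrative weights, sum to 1;
K = sum(w * Km for w, Km in zip(d, basis))  # SimpleMKL learns these instead
svr = SVR(kernel="precomputed").fit(K, y)
print(svr.predict(K[:5]))  # in-sample predictions for the first five points
```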
Abstract:
Aim: This study used data from temperate forest communities to assess: (1) five different stepwise selection methods with generalized additive models, (2) the effect of weighting absences to ensure a prevalence of 0.5, (3) the effect of limiting absences beyond the environmental envelope defined by presences, (4) four different methods for incorporating spatial autocorrelation, and (5) the effect of integrating an interaction factor defined by a regression tree on the residuals of an initial environmental model. Location: State of Vaud, western Switzerland. Methods: Generalized additive models (GAMs) were fitted using the grasp package (generalized regression analysis and spatial predictions, http://www.cscf.ch/grasp). Results: Model selection based on cross-validation appeared to be the best compromise between model stability and performance (parsimony) among the five methods tested. Weighting absences returned models that performed better than models fitted with the original sample prevalence. This appeared to be mainly due to the impact of very low prevalence values on evaluation statistics. Removing zeroes beyond the range of presences on the main environmental gradients changed the set of selected predictors, and potentially their response curve shapes. Moreover, removing zeroes slightly improved model performance and stability when compared with the baseline model on the same data set. Incorporating a spatial trend predictor improved model performance and stability significantly. Even better models were obtained when local spatial autocorrelation was included. A novel approach to including interactions proved to be an efficient way to account for interactions between all predictors at once. Main conclusions: Models and spatial predictions of 18 forest communities were significantly improved by using either: (1) cross-validation as a model selection method, (2) weighted absences, (3) limited absences, (4) predictors accounting for spatial autocorrelation, or (5) a factor variable accounting for interactions between all predictors. The final choice of modelling strategy should depend on the nature of the available data and the specific study aims. Statistical evaluation is useful in searching for the best modelling practice. However, one should not neglect to consider the shapes and interpretability of response curves, as well as the resulting spatial predictions, in the final assessment.
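The absence-weighting step in point (2) amounts to down-weighting absences so that presences and absences carry equal total weight. A hedged sketch with a plain logistic GLM standing in for the paper's GAMs; all data below are simulated:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
X = sm.add_constant(rng.normal(size=(1000, 2)))  # two environmental predictors
y = (rng.random(1000) < 0.08).astype(float)      # low-prevalence community

# Give each absence a weight such that the weighted prevalence equals 0.5
n_pres, n_abs = y.sum(), len(y) - y.sum()
w = np.where(y == 1, 1.0, n_pres / n_abs)

# var_weights accepts the non-integer weights used here
weighted = sm.GLM(y, X, family=sm.families.Binomial(), var_weights=w).fit()
unweighted = sm.GLM(y, X, family=sm.families.Binomial()).fit()
```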
Abstract:
Many of the most interesting questions ecologists ask lead to analyses of spatial data. Yet, perhaps confused by the large number of statistical models and fitting methods available, many ecologists seem to believe this is best left to specialists. Here, we describe the issues that need consideration when analysing spatial data and illustrate them using simulation studies. Our comparative analysis uses methods including generalized least squares, spatial filters, wavelet revised models, conditional autoregressive models and generalized additive mixed models to estimate regression coefficients from synthetic but realistic data sets, including some which violate standard regression assumptions. We assess the performance of each method using two measures, together with statistical error rates for model selection. Methods that performed well included the generalized least squares family of models and a Bayesian implementation of the conditional autoregressive model. Ordinary least squares also performed adequately in the absence of model selection, but had poorly controlled Type I error rates and so did not show the improvements in performance under model selection seen with the methods above. Removing large-scale spatial trends in the response led to poor performance. These are empirical results; hence extrapolation of these findings to other situations should be performed cautiously. Nevertheless, our simulation-based approach provides much stronger evidence for comparative analysis than assessments based on single or small numbers of data sets, and should be considered a necessary foundation for statements of this type in future.
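As an example of the kind of estimator compared, generalized least squares with an assumed exponential spatial covariance can be run directly in statsmodels; everything below (range parameter, coordinates, effect size) is invented for illustration:

```python
import numpy as np
import statsmodels.api as sm
from scipy.spatial.distance import cdist

rng = np.random.default_rng(8)
coords = rng.uniform(0, 10, size=(200, 2))  # site coordinates
X = sm.add_constant(rng.normal(size=(200, 1)))

# Exponential spatial covariance with an assumed range of 2 units
Sigma = np.exp(-cdist(coords, coords) / 2.0)
errors = np.linalg.cholesky(Sigma) @ rng.normal(size=200)
y = X @ np.array([1.0, 0.5]) + errors       # true slope = 0.5

ols = sm.OLS(y, X).fit()                    # ignores the autocorrelation
gls = sm.GLS(y, X, sigma=Sigma).fit()       # models it explicitly
print(ols.bse[1], gls.bse[1])  # OLS typically misstates the uncertainty
```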
Abstract:
OBJECTIVE: To determine the influence of nebulizer types and nebulization modes on bronchodilator delivery in a mechanically ventilated pediatric lung model. DESIGN: In vitro laboratory study. SETTING: Research laboratory of a university hospital. INTERVENTIONS: Using albuterol as a marker, three nebulizer types (jet nebulizer, ultrasonic nebulizer, and vibrating-mesh nebulizer) were tested in three nebulization modes in a nonhumidified bench model mimicking the ventilatory pattern of a 10-kg infant. The amounts of albuterol deposited on the inspiratory filters (inhaled drug) at the end of the endotracheal tube, on the expiratory filters, and remaining in the nebulizers or in the ventilator circuit were determined. The particle size distribution of the nebulizers was also measured. MEASUREMENTS AND MAIN RESULTS: The inhaled drug was 2.8% ± 0.5% for the jet nebulizer, 10.5% ± 2.3% for the ultrasonic nebulizer, and 5.4% ± 2.7% for the vibrating-mesh nebulizer in intermittent nebulization during the inspiratory phase (p < 0.01). The most efficient nebulizer was the vibrating-mesh nebulizer in continuous nebulization (13.3% ± 4.6%, p < 0.01). Depending on the nebulizer, a variable but important part of the albuterol was observed to remain in the nebulizers (jet and ultrasonic nebulizers), or to be expired or lost in the ventilator circuit (all nebulizers). Only small particles (range 2.39-2.70 µm) reached the end of the endotracheal tube. CONCLUSIONS: Important differences between nebulizer types and nebulization modes were seen for albuterol deposition at the end of the endotracheal tube in an in vitro pediatric ventilator-lung model. New aerosol devices, such as ultrasonic and vibrating-mesh nebulizers, were more efficient than the jet nebulizer.
Abstract:
A stringent branch-site codon model was used to detect positive selection in vertebrate evolution. We show that the test is robust to the large evolutionary distances involved. Positive selection was detected in 77% of the 884 genes studied. Most positive selection concerns a few sites on a single branch of the phylogenetic tree: between 0.9% and 4.7% of sites are affected by positive selection, depending on the branch. No functional category was overrepresented among genes under positive selection. Surprisingly, whole-genome duplication had no effect on the prevalence of positive selection, whether the fish-specific genome duplication or the two rounds at the origin of vertebrates. Thus, positive selection has not been limited to a few gene classes or to specific evolutionary events such as duplication, but has been pervasive during vertebrate evolution.
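The branch-site test itself is a likelihood ratio test: twice the log-likelihood difference between the alternative model (allowing ω > 1 on a foreground branch) and the null is compared with a chi-squared reference. A minimal sketch, with log-likelihoods that would in practice come from software such as PAML's codeml; the numbers here are made up:

```python
from scipy.stats import chi2

# Hypothetical log-likelihoods of the null and alternative branch-site models
lnL_null, lnL_alt = -23456.7, -23450.2

lrt = 2.0 * (lnL_alt - lnL_null)
p_value = chi2.sf(lrt, df=1)  # chi2(1) is the conservative reference
print(lrt, p_value)
```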
Disentangling the effects of key innovations on the diversification of Bromelioideae (Bromeliaceae).
Abstract:
The evolution of key innovations, novel traits that promote diversification, is often seen as a major driver of the unequal distribution of species richness within the tree of life. In this study, we aim to determine the factors underlying the extraordinary radiation of the subfamily Bromelioideae, one of the most diverse clades within the neotropical plant family Bromeliaceae. Based on an extended molecular phylogenetic data set, we examine the effect of two putative key innovations, Crassulacean acid metabolism (CAM) and the water-impounding tank, on speciation and extinction rates. To this aim, we develop a novel Bayesian implementation of the phylogenetic comparative method of binary state speciation and extinction (BiSSE), which enables hypothesis testing by Bayes factors and accommodates uncertainty in model selection through Bayesian model averaging. Both CAM and the tank habit were found to correlate with increased net diversification, thus fulfilling the criteria for key innovations. Our analyses further revealed that CAM photosynthesis is correlated with a twofold increase in speciation rate, whereas the evolution of the tank primarily affected extinction rates, which were found to be five times lower in tank-forming lineages than in tank-less clades. These differences are discussed in the light of biogeography, ecology, and past climate change.
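The Bayes-factor and model-averaging logic reduces to a few lines once (log) marginal likelihoods for the competing BiSSE variants are in hand; the numbers below are purely illustrative:

```python
import numpy as np

# Hypothetical log marginal likelihoods, e.g. from stepping-stone sampling:
# M0 = trait-independent diversification, M1 = trait-dependent (BiSSE)
log_ml = np.array([-1050.3, -1042.8])

log_bf = log_ml[1] - log_ml[0]       # log Bayes factor of M1 over M0
post = np.exp(log_ml - log_ml.max())
post /= post.sum()                   # posterior model probabilities
                                     # (equal prior model odds assumed)

# Bayesian model averaging of a per-model quantity, e.g. the speciation
# rate difference between character states (values invented):
delta_lambda = np.array([0.0, 0.12])
print(log_bf, post, float(post @ delta_lambda))
```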
Abstract:
Excessive exposure to solar ultraviolet (UV) radiation is the main cause of skin cancer. Specific prevention should be further developed to target overexposed or highly vulnerable populations, but a better characterisation of anatomical UV exposure patterns is needed for such specific prevention. The aim was to develop a regression model for predicting the UV exposure ratio (ER, the ratio between the anatomical dose and the corresponding ground-level dose) for each body site without requiring individual measurements. A 3D numeric model (SimUVEx) was used to compute the ER for various body sites and postures. A multiple fractional polynomial regression analysis was performed to identify predictors of the ER. The regression model used simulation data, and its performance was tested on an independent data set. Two input variables were sufficient to explain the ER: the cosine of the maximal daily solar zenith angle and the fraction of the sky visible from the body site. The regression model was in good agreement with the simulated ER (R² = 0.988). Relative errors of up to +20% and -10% were found in daily dose predictions, whereas an average relative error of only 2.4% (-0.03% to 5.4%) was found in yearly dose predictions. The regression model accurately predicts the ER and UV doses on the basis of readily available data, such as global UV erythemal irradiance measured at ground-surface stations or inferred from satellite information. It makes the development of exposure data on a wide temporal and geographical scale possible and opens broad perspectives for epidemiological studies and skin cancer prevention.
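A hedged sketch of the regression idea: a fractional-polynomial-style design built from the two stated predictors, fit by ordinary least squares. The powers, coefficients, and data below are invented; the published model reports R² = 0.988 on its own data:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(9)
n = 500
cos_sza = rng.uniform(0.2, 1.0, n)   # cosine of the maximal daily solar zenith angle
sky_view = rng.uniform(0.1, 1.0, n)  # fraction of sky visible from the body site
er = 0.9 * sky_view * np.sqrt(cos_sza) + rng.normal(0, 0.02, n)  # synthetic ER

# Candidate fractional-polynomial terms of the two predictors
X = sm.add_constant(np.column_stack([
    np.sqrt(cos_sza), cos_sza, sky_view, sky_view * np.sqrt(cos_sza)]))
fit = sm.OLS(er, X).fit()
print(fit.rsquared)
```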