923 results for model selection in binary regression
Abstract:
The three-dimensional structure of human uropepsin complexed with pepstatin has been modelled using human pepsin as a template. Uropepsin is an aspartic proteinase found in urine, produced in the form of pepsinogen A in the gastric mucosa. The structure is bilobal, consisting of two predominantly β-sheet lobes which, as observed in other aspartic proteinases, are related by a pseudo-twofold axis. A structural comparison between the binary complexes pepsin:pepstatin and uropepsin:pepstatin is discussed. (C) 2001 Academic Press.
Abstract:
The dispersion relations along the principal symmetry directions in BCC lithium-sodium alloys are calculated using second-order perturbation theory. The local modified Hoshino-Young model potential was used for lithium and the local Harrison model potential for sodium. The phonon density of states, the root mean square displacements and (Θ-T) curves are also calculated. In the absence of experimental data, only the theoretical predictions are presented here.
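As a toy analogue of such a dispersion relation, the textbook one-dimensional monatomic chain can be sketched as follows. This is not the paper's second-order perturbation calculation for BCC alloys; the force constant `C`, mass `M` and lattice spacing `a` are illustrative placeholders.

```python
import math

def chain_dispersion(q, a=1.0, C=1.0, M=1.0):
    """Phonon frequency omega(q) for a 1D monatomic chain with
    nearest-neighbour force constant C and atomic mass M:
    omega(q) = sqrt(4C/M) * |sin(q*a/2)|."""
    return math.sqrt(4.0 * C / M) * abs(math.sin(0.5 * q * a))

# Sample the branch over half the Brillouin zone, q in [0, pi/a].
qs = [i * math.pi / 10 for i in range(11)]
omegas = [chain_dispersion(q) for q in qs]
```

The frequency vanishes at the zone centre and reaches its maximum at the zone boundary, the qualitative shape any acoustic branch shares.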
Abstract:
Feature selection aims to find the most important information in a given set of features. Since this task can be posed as an optimization problem, the combinatorial growth of possible solutions may make an exhaustive search infeasible. In this paper we propose a new nature-inspired feature selection technique based on the Charged System Search (CSS), which has never before been applied in this context. The wrapper approach combines the exploration power of CSS with the speed of the Optimum-Path Forest classifier to find the set of features that maximizes accuracy on a validation set. Experiments conducted on four public datasets demonstrate that the proposed approach can outperform some well-known swarm-based techniques. © 2013 Springer-Verlag.
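The wrapper idea itself can be sketched independently of the specific metaheuristic: candidate feature subsets are scored by the validation accuracy of a fast classifier. This sketch substitutes random search for CSS and a nearest-centroid classifier for the Optimum-Path Forest; both substitutions are assumptions, not the paper's method.

```python
import random

def nearest_centroid_accuracy(train, valid, features):
    """Accuracy on `valid` of a nearest-centroid classifier trained on
    `train`, using only the selected feature indices."""
    if not features:
        return 0.0
    centroids, counts = {}, {}
    for x, y in train:
        c = centroids.setdefault(y, [0.0] * len(features))
        for k, f in enumerate(features):
            c[k] += x[f]
        counts[y] = counts.get(y, 0) + 1
    for y in centroids:
        centroids[y] = [v / counts[y] for v in centroids[y]]
    hits = 0
    for x, y in valid:
        pred = min(centroids,
                   key=lambda lab: sum((x[f] - centroids[lab][k]) ** 2
                                       for k, f in enumerate(features)))
        hits += (pred == y)
    return hits / len(valid)

def wrapper_select(train, valid, n_features, n_iters=200, seed=0):
    """Randomized wrapper search over feature subsets (a stand-in for
    the CSS metaheuristic): keep the subset with best validation accuracy."""
    rng = random.Random(seed)
    best_mask, best_acc = None, -1.0
    for _ in range(n_iters):
        mask = [f for f in range(n_features) if rng.random() < 0.5]
        acc = nearest_centroid_accuracy(train, valid, mask)
        if acc > best_acc:
            best_mask, best_acc = mask, acc
    return best_mask, best_acc
```

On synthetic data where one feature separates the classes and another is noise, the search reliably retains the informative feature.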
Resumo:
Binary stars are frequent in the universe: about 50% of known main-sequence stars are located in multiple star systems (Abt, 1979). Even so, binary systems are widely regarded as second-rate sites for exoplanets and habitable zones, owing to the difficulty of detection and to strong perturbations that could prevent planet formation and long-term stability. In this work we show that planets in binary star systems can have regular orbits and remain in the habitable zone. We introduce a stability criterion based on the solution of the restricted three-body problem and apply it to describe the short-period planar and three-dimensional stability zones of S-type orbits around each star of the Alpha Centauri system. We also develop a semi-analytical secular model to study the long-term dynamics of fictitious planets in the habitable zones of those stars, and verify that planets in the habitable zone would be in regular orbits for any eccentricity and for inclinations to the binary orbital plane of up to 35 degrees. We show as well that the short-period oscillations of the semi-major axis are about 100 times greater than Earth's, but that the planet would nevertheless remain inside the habitable zone at all times.
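A widely used criterion in the same spirit (though not the one derived in this work) is the empirical fit of Holman & Wiegert (1999) for the critical semi-major axis of S-type orbits. The sketch below applies it with approximate, illustrative Alpha Centauri AB parameters; the coefficients and parameter values should be checked against the original sources before use.

```python
def critical_semimajor_axis(a_bin, e_bin, mu):
    """Holman & Wiegert (1999) empirical critical semi-major axis for
    stable S-type orbits, in the same units as the binary separation
    a_bin.  mu = m2 / (m1 + m2); the fit was derived for
    0.1 <= mu <= 0.9 and moderate binary eccentricities e_bin."""
    f = (0.464 - 0.380 * mu - 0.631 * e_bin
         + 0.586 * mu * e_bin + 0.150 * e_bin ** 2
         - 0.198 * mu * e_bin ** 2)
    return f * a_bin

# Approximate Alpha Centauri AB values (a ~ 23.4 AU, e ~ 0.52,
# mu ~ 0.45 for orbits around alpha Cen A) -- illustrative only.
a_crit = critical_semimajor_axis(23.4, 0.52, 0.45)
```

This yields a critical semi-major axis of roughly 2-3 AU, consistent with the habitable-zone distances discussed for this system.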
Recurrent antitopographic inhibition mediates competitive stimulus selection in an attention network
Abstract:
Topographically organized neurons represent multiple stimuli within complex visual scenes and compete for subsequent processing in higher visual centers. The underlying neural mechanisms of this process have long been elusive. We investigate an experimentally constrained model of a midbrain structure: the optic tectum and the reciprocally connected nucleus isthmi. We show that a recurrent antitopographic inhibition mediates the competitive stimulus selection between distant sensory inputs in this visual pathway. This recurrent antitopographic inhibition is fundamentally different from surround inhibition in that it projects to all locations of its input layer except the locus from which it receives input. At a larger scale, the model shows how a focal top-down input from a forebrain region, the arcopallial gaze field, biases the competitive stimulus selection via the combined activation of a local excitation and the recurrent antitopographic inhibition. Our findings reveal circuit mechanisms of competitive stimulus selection and should motivate a search for anatomical implementations of these mechanisms in a range of vertebrate attentional systems.
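The key connectivity motif can be sketched in a minimal rate model: each locus receives its feedforward input minus inhibition pooled from every other locus, with no self-inhibition. This is only a caricature of the tectum-isthmi model; the gain, learning-rate and step-count parameters are invented for illustration.

```python
def antitopographic_competition(inputs, g=0.8, steps=300, lr=0.1):
    """Relax a rate model in which locus i is driven by inputs[i]
    minus recurrent inhibition summed over all OTHER loci
    (antitopographic: the self term is excluded, unlike surround
    inhibition).  Returns the final activity at each locus."""
    x = [0.0] * len(inputs)
    for _ in range(steps):
        total = sum(x)
        for i, inp in enumerate(inputs):
            # Inhibition pooled from every locus except i itself.
            drive = inp - g * (total - x[i])
            x[i] += lr * (max(0.0, drive) - x[i])
    return x

# Three competing stimuli of different strengths.
acts = antitopographic_competition([1.0, 0.4, 0.6])
```

With sufficiently strong pooled inhibition the dynamics implement winner-take-all selection: only the locus with the strongest input remains active.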
Abstract:
The Receiver Operating Characteristic (ROC) curve is a prominent tool for characterizing the accuracy of a continuous diagnostic test. To account for factors that might influence test accuracy, various ROC regression methods have been proposed. However, as in any regression analysis, when the assumed models do not fit the data well, these methods may yield invalid and misleading results. To date, practical model-checking techniques suitable for validating existing ROC regression models have not been available. In this paper, we develop cumulative-residual-based procedures to graphically and numerically assess the goodness of fit of some commonly used ROC regression models, and show how specific components of these models can be examined within this framework. We derive asymptotic null distributions for the residual process and discuss resampling procedures to approximate these distributions in practice. We illustrate our methods with a dataset from the Cystic Fibrosis registry.
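As background to any ROC regression check, the empirical accuracy summary itself is simple to compute: the area under the empirical ROC curve equals the Mann-Whitney probability that a diseased subject's test value exceeds a healthy subject's. This baseline quantity (not the paper's residual process) can be sketched as:

```python
def empirical_auc(diseased, healthy):
    """Empirical AUC: the proportion of (diseased, healthy) pairs in
    which the diseased subject's continuous test value is larger,
    counting ties as 1/2.  Equals the area under the empirical ROC
    curve (Mann-Whitney statistic)."""
    pairs = 0.0
    for d in diseased:
        for h in healthy:
            if d > h:
                pairs += 1.0
            elif d == h:
                pairs += 0.5
    return pairs / (len(diseased) * len(healthy))
```

A perfectly separating test gives AUC 1.0; an uninformative test hovers near 0.5.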
Abstract:
Intermodal rail/road freight transport constitutes an alternative to long-haul road transport for the distribution of large volumes of goods. The paper introduces the intermodal transportation problem for the tactical planning of mode and service selection. In rail mode, shippers either book train capacity on a per-unit basis or charter entire block trains. Road mode is used for short-distance haulage to intermodal terminals and for direct shipments to customers. We analyze the competition between road and intermodal transportation with regard to freight consolidation and service cost on a model basis. The approach is applied to the distribution system of an industrial company serving customers in eastern Europe. The case study investigates the impact of transport cost and consolidation on the optimal modal split.
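A toy version of the mode/service choice can illustrate why consolidation drives the modal split: for a given volume, compare direct road, per-unit rail booking, and chartering full block trains with the remainder booked per unit. All rates and capacities below are invented, and the real tactical planning model is of course richer than a per-shipment minimum.

```python
def cheapest_mode(units, road_per_unit, rail_per_unit, train_charter,
                  train_capacity):
    """Return (mode, cost) for shipping `units` load units, comparing
    direct road haulage, per-unit rail booking, and chartered block
    trains (full trains plus per-unit booking for the remainder)."""
    options = {
        "road": units * road_per_unit,
        "rail_per_unit": units * rail_per_unit,
    }
    full, rest = divmod(units, train_capacity)
    # Charter as many full trains as the volume fills; book the rest.
    options["block_train"] = full * train_charter + rest * rail_per_unit
    return min(options.items(), key=lambda kv: kv[1])
```

With a large consolidated volume the chartered block train wins; below one trainload, per-unit booking remains cheaper, which is the consolidation effect the case study quantifies.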
Abstract:
When considering data from many trials, it is likely that some of them present a markedly different intervention effect or exert an undue influence on the summary results. We develop a forward search algorithm for identifying outlying and influential studies in meta-analysis models. The forward search algorithm starts by fitting the hypothesized model to a small subset of likely outlier-free studies and proceeds by adding, one by one, the studies determined to be closest to the model fitted to the existing set. As each study is added, plots of estimated parameters and measures of fit are monitored, and outliers are identified by sharp changes in these forward plots. We apply the proposed outlier detection method to two real data sets: a meta-analysis of 26 studies that examines the effect of writing-to-learn interventions on academic achievement, adjusting for three possible effect modifiers, and a meta-analysis of 70 studies that compares a fluoride toothpaste treatment to placebo for preventing dental caries in children. A simple simulated example is used to illustrate the steps of the proposed methodology, and a small-scale simulation study is conducted to evaluate the performance of the proposed method. Copyright © 2016 John Wiley & Sons, Ltd.
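The forward search loop can be sketched for the simplest case, a fixed-effect (inverse-variance) meta-analysis. The initialization rule (studies nearest the median effect) and the closeness measure here are simplifying assumptions, not the paper's exact procedure, but the monitoring idea is the same: an outlier enters last and produces a sharp jump in the estimate path.

```python
def pooled_effect(studies):
    """Fixed-effect (inverse-variance weighted) pooled estimate from
    a list of (effect, variance) pairs."""
    w = [1.0 / v for _, v in studies]
    return sum(wi * y for wi, (y, _) in zip(w, studies)) / sum(w)

def forward_search(studies, n_start=3):
    """Grow the study set one at a time, always adding the study whose
    effect is closest to the current pooled estimate.  Returns the
    order of entry and the monitored estimate path."""
    med = sorted(y for y, _ in studies)[len(studies) // 2]
    order = sorted(range(len(studies)),
                   key=lambda i: abs(studies[i][0] - med))
    inside, outside = order[:n_start], order[n_start:]
    path = [pooled_effect([studies[i] for i in inside])]
    while outside:
        est = path[-1]
        nxt = min(outside, key=lambda i: abs(studies[i][0] - est))
        outside.remove(nxt)
        inside.append(nxt)
        path.append(pooled_effect([studies[i] for i in inside]))
    return inside, path
```

On a small example with one aberrant study, the outlier is the last to enter and the final step of the path shows the tell-tale sharp change.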
Abstract:
Random Forests™ is reported to be one of the most accurate classification algorithms for complex data analysis. It shows excellent performance even when most predictors are noisy and the number of variables is much larger than the number of observations. In this thesis, Random Forests was applied to a large-scale lung cancer case-control study, a novel way of automatically selecting prognostic factors was proposed, and a synthetic positive control was used to validate the Random Forests method. Throughout this study we showed that Random Forests can handle a large number of weak input variables without overfitting and can account for non-additive interactions between these input variables. Random Forests can also be used for variable selection without being adversely affected by collinearities. Random Forests can handle large-scale data sets without rigorous data preprocessing and has a robust variable-importance ranking measure. We propose a novel variable selection method, in the context of Random Forests, that uses the data noise level as the cut-off value to determine the subset of important predictors. This new approach enhances the ability of the Random Forests algorithm to automatically identify important predictors in complex data; the cut-off value can also be adjusted based on the results of the synthetic positive control experiments. When the data set had a high variables-to-observations ratio, Random Forests complemented the established logistic regression. This study suggests that Random Forests is recommended for such high-dimensional data: one can use Random Forests to select the important variables, and then use logistic regression or Random Forests itself to estimate the effect sizes of the predictors and to classify new observations. We also found that the mean decrease in accuracy is a more reliable variable-ranking measure than the mean decrease in Gini.
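The noise-level cut-off can be sketched independently of any forest implementation: append a synthetic pure-noise probe to the data before training, then keep only the variables whose importance exceeds the probe's score. The importance values below are hypothetical stand-ins for mean-decrease-in-accuracy scores from a fitted forest; variable names are invented.

```python
def select_above_noise(importances, probe_name="synthetic_noise"):
    """Keep variables whose importance exceeds that of a synthetic
    noise probe added before training; the probe's score estimates the
    importance a pure-noise variable can attain by chance.  Returns
    the selected names, most important first."""
    cutoff = importances[probe_name]
    return sorted((name for name, score in importances.items()
                   if name != probe_name and score > cutoff),
                  key=lambda n: -importances[n])

# Hypothetical mean-decrease-in-accuracy scores from a fitted forest.
scores = {"smoking_years": 0.21, "age": 0.08, "shoe_size": 0.01,
          "synthetic_noise": 0.02}
selected = select_above_noise(scores)
```

Variables that score no better than noise ("shoe_size" here) are dropped; the survivors can then be passed to logistic regression or a refitted forest, as the thesis suggests.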
Abstract:
Geostrophic surface velocities can be derived from the gradients of the mean dynamic topography-the difference between the mean sea surface and the geoid. Therefore, independently observed mean dynamic topography data are valuable input parameters and constraints for ocean circulation models. For a successful fit to observational dynamic topography data, not only the mean dynamic topography on the particular ocean model grid is required, but also information about its inverse covariance matrix. The calculation of the mean dynamic topography from satellite-based gravity field models and altimetric sea surface height measurements, however, is not straightforward. For this purpose, we previously developed an integrated approach to combining these two different observation groups in a consistent way without using the common filter approaches (Becker et al. in J Geodyn 59(60):99-110, 2012, doi:10.1016/j.jog.2011.07.0069; Becker in Konsistente Kombination von Schwerefeld, Altimetrie und hydrographischen Daten zur Modellierung der dynamischen Ozeantopographie, 2012, http://nbn-resolving.de/nbn:de:hbz:5n-29199). Within this combination method, the full spectral range of the observations is considered. Further, it allows the direct determination of the normal equations (i.e., the inverse of the error covariance matrix) of the mean dynamic topography on arbitrary grids, which is one of the requirements for ocean data assimilation. In this paper, we report progress through selection and improved processing of altimetric data sets. We focus on the preprocessing steps of along-track altimetry data from Jason-1 and Envisat to obtain a mean sea surface profile. During this procedure, a rigorous variance propagation is accomplished, so that, for the first time, the full covariance matrix of the mean sea surface is available. 
The combination of the mean profile and a combined GRACE/GOCE gravity field model yields a mean dynamic topography model for the North Atlantic Ocean that is characterized by a defined set of assumptions. We show that including the geodetically derived mean dynamic topography with the full error structure in a 3D stationary inverse ocean model improves modeled oceanographic features over previous estimates.
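The opening relation (geostrophic surface velocities from mean dynamic topography gradients) follows the standard balance u = -(g/f) ∂η/∂y, v = (g/f) ∂η/∂x. A central-difference sketch on a toy grid, with an illustrative mid-latitude Coriolis parameter, looks like:

```python
def geostrophic_velocity(eta, dx, dy, f=1e-4, g=9.81):
    """Geostrophic surface velocity from a mean dynamic topography
    grid eta[j][i] (j: y index, i: x index, spacings dy/dx in metres)
    via central differences at interior points:
    u = -(g/f) * d(eta)/dy,  v = (g/f) * d(eta)/dx."""
    ny, nx = len(eta), len(eta[0])
    u = [[0.0] * nx for _ in range(ny)]
    v = [[0.0] * nx for _ in range(ny)]
    for j in range(1, ny - 1):
        for i in range(1, nx - 1):
            deta_dy = (eta[j + 1][i] - eta[j - 1][i]) / (2.0 * dy)
            deta_dx = (eta[j][i + 1] - eta[j][i - 1]) / (2.0 * dx)
            u[j][i] = -(g / f) * deta_dy
            v[j][i] = (g / f) * deta_dx
    return u, v

# Toy MDT sloping only in y: 1 mm rise per km, on a 1 km grid.
eta = [[0.001 * j for i in range(4)] for j in range(4)]
u, v = geostrophic_velocity(eta, 1000.0, 1000.0)
```

A purely meridional topography slope produces a purely zonal flow, illustrating why MDT gradients (and their error covariances) constrain the circulation model.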