990 results for threshold random variable
Abstract:
Random Forests™ is reported to be one of the most accurate classification algorithms for complex data analysis. It shows excellent performance even when most predictors are noisy and the number of variables is much larger than the number of observations. In this thesis, Random Forests was applied to a large-scale lung cancer case-control study, a novel way of automatically selecting prognostic factors was proposed, and a synthetic positive control was used to validate the Random Forests method. Throughout this study we showed that Random Forests can deal with a large number of weak input variables without overfitting, can account for non-additive interactions between these input variables, and can be used for variable selection without being adversely affected by collinearities. Random Forests handles large-scale data sets without rigorous data preprocessing and has a robust variable importance ranking measure. We propose a novel variable selection method in the context of Random Forests that uses the data noise level as the cut-off value to determine the subset of important predictors. This new approach enhanced the ability of the Random Forests algorithm to automatically identify important predictors in complex data. The cut-off value can also be adjusted based on the results of the synthetic positive control experiments. When the data set had a high variables-to-observations ratio, Random Forests complemented the established logistic regression, suggesting that Random Forests is recommended for such high-dimensional data: one can use Random Forests to select the important variables and then use logistic regression, or Random Forests itself, to estimate the effect sizes of the predictors and to classify new observations. We also found that the mean decrease in accuracy is a more reliable variable ranking measure than the mean decrease in Gini.
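A minimal sketch of the noise-level cut-off idea described above, using scikit-learn rather than the thesis code: synthetic pure-noise columns are appended to a hypothetical predictor matrix, and only real predictors whose permutation importance (a mean-decrease-in-accuracy analogue) exceeds the largest noise importance are kept. The data, column counts, and parameter values are illustrative assumptions.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 30))                    # hypothetical predictors
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=500) > 0).astype(int)

# Append synthetic pure-noise columns to estimate the data noise level.
noise = rng.normal(size=(500, 10))
X_aug = np.hstack([X, noise])

forest = RandomForestClassifier(n_estimators=500, random_state=0).fit(X_aug, y)
imp = permutation_importance(forest, X_aug, y, n_repeats=20, random_state=0)

# Cut-off: the largest importance reached by any pure-noise column.
cutoff = imp.importances_mean[X.shape[1]:].max()
selected = np.where(imp.importances_mean[:X.shape[1]] > cutoff)[0]
print("selected predictors:", selected)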
Abstract:
Division of labor is a widely studied aspect of colony behavior of social insects. Division of labor models indicate how individuals distribute themselves in order to perform different tasks simultaneously. However, models that study division of labor from a dynamical system point of view cannot be found in the literature. In this paper, we define a division of labor model as a discrete-time dynamical system, in order to study the equilibrium points and their properties related to convergence and stability. By making use of this analytical model, an adaptive algorithm based on division of labor can be designed to satisfy dynamic criteria. In this way, we have designed and tested an algorithm that varies the response thresholds in order to modify the dynamic behavior of the system. This behavior modification allows the system to adapt to specific environmental and collective situations, making the algorithm a good candidate for distributed control applications. The variable threshold algorithm is based on specialization mechanisms. It is able to achieve asymptotically stable behavior of the system in different environments, independently of the number of individuals. The algorithm has been successfully tested under several initial conditions and numbers of individuals.
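The paper's exact dynamical system is not reproduced in the abstract; the following is only a minimal sketch of a classical response-threshold rule with threshold adaptation (specialization), in the spirit described above. The single-task setup and all parameter values are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(1)
n, steps = 20, 200               # individuals, time steps (illustrative)
theta = rng.uniform(1, 10, n)    # response thresholds, one per individual
s = 0.0                          # task stimulus
delta, alpha = 1.0, 3.0          # stimulus growth rate / work efficiency
xi, phi = 0.1, 0.05              # threshold learning / forgetting rates

for t in range(steps):
    # Threshold rule: probability of engaging grows with the stimulus, drops with theta.
    p = s**2 / (s**2 + theta**2)
    active = rng.random(n) < p
    # Discrete-time stimulus dynamics: constant demand minus work performed.
    s = max(0.0, s + delta - alpha * active.mean())
    # Variable thresholds (specialization): active individuals lower theta, idle ones raise it.
    theta = np.clip(theta - xi * active + phi * (~active), 0.1, 10.0)

print("final fraction active:", active.mean())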
Abstract:
The reliability of measurement refers to unsystematic error in observed responses. Investigations of the prevalence of random error in stated estimates of willingness to pay (WTP) are important to an understanding of why tests of validity in CV can fail. However, published reliability studies have tended to adopt empirical methods that have practical and conceptual limitations when applied to WTP responses. This contention is supported by a review of contingent valuation reliability studies that demonstrates important limitations of existing approaches to WTP reliability. It is argued that empirical assessments of the reliability of contingent values may be better dealt with by using multiple indicators to measure the latent WTP distribution. This latent variable approach is demonstrated with data obtained from a WTP study for stormwater pollution abatement. Attitude variables were employed as a way of assessing the reliability of open-ended WTP (with benchmarked payment cards) for stormwater pollution abatement. The results indicated that participants' decisions to pay were reliably measured, but not the magnitude of the WTP bids. This finding highlights the need to better discern what is actually being measured in WTP studies. (C) 2003 Elsevier B.V. All rights reserved.
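The paper's latent variable model would typically be fitted with structural-equation software; as a much simpler illustration of using multiple attitude indicators to assess reliability, the sketch below computes Cronbach's alpha for a set of simulated indicators and correlates their mean with log WTP. The data, the number of indicators, and the error levels are hypothetical.

import numpy as np

def cronbach_alpha(items):
    # items: (n_respondents, k_indicators) array of attitude item scores
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_var = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var / total_var)

rng = np.random.default_rng(2)
latent = rng.normal(size=300)                             # hypothetical latent attitude
items = latent[:, None] + rng.normal(0, 0.8, (300, 4))    # four observed attitude indicators
wtp = np.exp(0.5 * latent + rng.normal(0, 1.0, 300))      # hypothetical open-ended WTP bids

print("Cronbach alpha:", round(cronbach_alpha(items), 2))
print("corr(indicator score, log WTP):", round(np.corrcoef(items.mean(1), np.log(wtp))[0, 1], 2))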
Abstract:
This paper considers a stochastic SIR (susceptible-infective-removed) epidemic model in which individuals may make infectious contacts in two ways, both within 'households' (which for ease of exposition are assumed to have equal size) and along the edges of a random graph describing additional social contacts. Heuristically-motivated branching process approximations are described, which lead to a threshold parameter for the model and methods for calculating the probability of a major outbreak, given few initial infectives, and the expected proportion of the population who are ultimately infected by such a major outbreak. These approximate results are shown to be exact as the number of households tends to infinity by proving associated limit theorems. Moreover, simulation studies indicate that these asymptotic results provide good approximations for modestly-sized finite populations. The extension to unequal sized households is discussed briefly.
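For intuition about the two routes of infection described above, here is a discrete-generation (Reed-Frost-style) caricature of the model: equal-sized households plus a sparse random graph of additional social contacts. It is not the branching-process approximation used in the paper, and the population size and per-contact infection probabilities are illustrative assumptions.

import numpy as np
import networkx as nx

rng = np.random.default_rng(3)
m, h = 200, 4                                  # households, household size (equal sizes)
n = m * h
household = np.repeat(np.arange(m), h)
G = nx.gnp_random_graph(n, 3.0 / n, seed=3)    # sparse random graph of social contacts

p_house, p_graph = 0.4, 0.1                    # illustrative infection probabilities
status = np.zeros(n, dtype=int)                # 0 = susceptible, 1 = infective, 2 = removed
status[rng.integers(n)] = 1

while (status == 1).any():
    new = []
    for i in np.where(status == 1)[0]:
        # Household contacts, then graph contacts; each infective is infectious for one generation.
        mates = np.where((household == household[i]) & (status == 0))[0]
        new += [j for j in mates if rng.random() < p_house]
        new += [j for j in G.neighbors(i) if status[j] == 0 and rng.random() < p_graph]
        status[i] = 2
    status[list(set(new))] = 1

print("final size proportion:", (status == 2).mean())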
Abstract:
In this paper we determine the local and global resilience of random graphs G(n,p) (p >> n^{-1}) with respect to the property of containing a cycle of length at least (1 - alpha)n. Roughly speaking, given alpha > 0, we determine the smallest r_g(G, alpha) with the property that almost surely every subgraph of G = G(n,p) having more than r_g(G, alpha)|E(G)| edges contains a cycle of length at least (1 - alpha)n (global resilience). We also obtain, for alpha < 1/2, the smallest r_l(G, alpha) such that any H ⊆ G having deg_H(v) larger than r_l(G, alpha) deg_G(v) for all v ∈ V(G) contains a cycle of length at least (1 - alpha)n (local resilience). The results above are in fact proved in the more general setting of pseudorandom graphs.
Abstract:
Efficient automatic protein classification is of central importance in genomic annotation. As an independent way to check the reliability of the classification, we propose a statistical approach to test if two sets of protein domain sequences coming from two families of the Pfam database are significantly different. We model protein sequences as realizations of Variable Length Markov Chains (VLMC) and we use the context trees as a signature of each protein family. Our approach is based on a Kolmogorov-Smirnov-type goodness-of-fit test proposed by Balding et al. [Limit theorems for sequences of random trees (2008), DOI: 10.1007/s11749-008-0092-z]. The test statistic is a supremum over the space of trees of a function of the two samples; its computation grows, in principle, exponentially fast with the maximal number of nodes of the potential trees. We show how to transform this problem into a max-flow problem over a related graph, which can be solved using a Ford-Fulkerson algorithm in polynomial time in that number. We apply the test to 10 randomly chosen protein domain families from the seed of the Pfam-A database (high quality, manually curated families). The test shows that the distributions of context trees coming from different families are significantly different. We emphasize that this is a novel mathematical approach to validate the automatic clustering of sequences in any context. We also study the performance of the test via simulations on Galton-Watson related processes.
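The construction of the flow network from the two samples of context trees is specific to the paper and not reproduced here; the sketch below only illustrates the final computational step, a maximum-flow computation solved with an augmenting-path (Ford-Fulkerson/Edmonds-Karp) solver via networkx, on a toy network with assumed capacities.

import networkx as nx
from networkx.algorithms.flow import edmonds_karp

# Toy flow network; the actual construction from the two context-tree samples
# is paper-specific and not reproduced here.
G = nx.DiGraph()
G.add_edge("s", "a", capacity=3.0)
G.add_edge("s", "b", capacity=2.0)
G.add_edge("a", "b", capacity=1.0)
G.add_edge("a", "t", capacity=2.5)
G.add_edge("b", "t", capacity=2.0)

# Edmonds-Karp is a polynomial-time augmenting-path (Ford-Fulkerson) implementation.
flow_value, flow_dict = nx.maximum_flow(G, "s", "t", flow_func=edmonds_karp)
print("max flow:", flow_value)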
Abstract:
The adaptive process in motor learning was examined in terms of the effects of varying amounts of constant practice performed before random practice. Participants pressed five response keys sequentially, the last one coincident with the lighting of a final visual stimulus provided by a complex coincident timing apparatus. Different visual stimulus speeds were used during the random practice. 33 children (M age=11.6 yr.) were randomly assigned to one of three experimental groups: constant-random, constant-random 33%, and constant-random 66%. The constant-random group practiced constantly until they reached a performance stabilization criterion of three consecutive trials within 50 msec. of error. The other two groups had additional constant practice of 33 and 66%, respectively, of the number of trials needed to achieve the stabilization criterion. All three groups performed 36 trials under random practice; in the adaptation phase, they practiced at a visual stimulus speed different from that adopted in the stabilization phase. Global performance measures were absolute, constant, and variable errors, and movement pattern was analyzed by relative timing and overall movement time. There was no group difference in relation to global performance measures and overall movement time. However, differences between the groups were observed in movement pattern, since the constant-random 66% group changed its relative timing performance in the adaptation phase.
Abstract:
PURPOSE: To evaluate the impact of atypical retardation patterns (ARP) on detection of progressive retinal nerve fiber layer (RNFL) loss using scanning laser polarimetry with variable corneal compensation (VCC). DESIGN: Observational cohort study. METHODS: The study included 377 eyes of 221 patients with a median follow-up of 4.0 years. Images were obtained annually with the GDx VCC (Carl Zeiss Meditec Inc, Dublin, California, USA), along with optic disc stereophotographs and standard automated perimetry (SAP) visual fields. Progression was determined by the Guided Progression Analysis software for SAP and by masked assessment of stereophotographs by expert graders. The typical scan score (TSS) was used to quantify the presence of ARPs on GDx VCC images. Random coefficients models were used to evaluate the relationship between ARP and RNFL thickness measurements over time. RESULTS: Thirty-eight eyes (10%) showed progression over time on visual fields, stereophotographs, or both. Changes in TSS scores from baseline were significantly associated with changes in RNFL thickness measurements in both progressing and nonprogressing eyes. Each 1 unit increase in TSS score was associated with a 0.19 μm decrease in RNFL thickness measurement (P < .001) over time. CONCLUSIONS: ARPs had a significant effect on detection of progressive RNFL loss with the GDx VCC. Eyes with large amounts of atypical patterns, large fluctuations in these patterns over time, or both may show changes in measurements that can appear falsely as glaucomatous progression or can mask true changes in the RNFL. (Am J Ophthalmol 2009;148:155-163. (C) 2009 by Elsevier Inc. All rights reserved.)
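A random coefficients (mixed-effects) model of the kind described above can be sketched with statsmodels. The long-format data, the column names (rnfl, years, tss_change, eye), and the effect sizes below are simulated assumptions, not the study data, and the model is simplified to a random intercept and slope in time per eye.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
n_eyes, n_visits = 50, 5
df = pd.DataFrame({
    "eye": np.repeat(np.arange(n_eyes), n_visits),     # hypothetical eye identifier
    "years": np.tile(np.arange(n_visits), n_eyes),      # annual visits
})
df["tss_change"] = rng.normal(0, 2, len(df))            # change in TSS from baseline (simulated)
df["rnfl"] = (55 - 0.3 * df["years"] - 0.19 * df["tss_change"]
              + np.repeat(rng.normal(0, 3, n_eyes), n_visits)
              + rng.normal(0, 1, len(df)))

# Random coefficients model: random intercept and time slope per eye,
# with TSS change entering as a time-varying covariate.
model = smf.mixedlm("rnfl ~ years + tss_change", df, groups=df["eye"], re_formula="~years")
print(model.fit().summary())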
Abstract:
A finite-element method is used to study the elastic properties of random three-dimensional porous materials with highly interconnected pores. We show that Young's modulus, E, is practically independent of Poisson's ratio of the solid phase, nu(s), over the entire solid fraction range, and Poisson's ratio, nu, becomes independent of nu(s) as the percolation threshold is approached. We represent this behaviour of nu in a flow diagram. This interesting but approximate behaviour is very similar to the exactly known behaviour in two-dimensional porous materials. In addition, the behaviour of nu versus nu(s) appears to imply that information in the dilute porosity limit can affect behaviour in the percolation threshold limit. We summarize the finite-element results in terms of simple structure-property relations, instead of tables of data, to make it easier to apply the computational results. Without using accurate numerical computations, one is limited to various effective medium theories and rigorous approximations like bounds and expansions. The accuracy of these equations is unknown for general porous media. To verify a particular theory it is important to check that it predicts both isotropic elastic moduli, i.e. prediction of Young's modulus alone is necessary but not sufficient. The subtleties of Poisson's ratio behaviour actually provide a very effective method for showing differences between the theories and demonstrating their ranges of validity. We find that for moderate- to high-porosity materials, none of the analytical theories is accurate and, at present, numerical techniques must be relied upon.
Abstract:
The risk of cardiac events in patients undergoing major noncardiac surgery is dependent on their clinical characteristics and the results of stress testing. The purpose of this study was to develop a composite approach to defining levels of risk and to examine whether different approaches to prophylaxis influenced this prediction of outcome. One hundred forty-five consecutive patients (aged 68 +/- 9 years, 79 men) with >1 clinical risk variable were studied with standard dobutamine-atropine stress echo before major noncardiac surgery. Risk levels were stratified according to the presence of ischemia (new or worsening wall motion abnormality), ischemic threshold (heart rate at development of ischemia), and number of clinical risk variables. Patients were followed for perioperative events (during hospital admission) and death or infarction over the subsequent 16 +/- 10 months. Ten perioperative events occurred in 105 patients who proceeded to surgery (10%, 95% confidence interval [CI] 5% to 17%), 40 being cancelled because of cardiac or other risk. No ischemia was identified in 56 patients, 1 of whom (1.8%) had a perioperative infarction. Of the 49 patients with ischemia, 22 (45%) had 1 or 2 clinical risk factors; 2 (9%, 95% CI 1% to 29%) had events. Another 15 patients had a high ischemic threshold and 3 or 4 risk factors; 3 (20%, 95% CI 4% to 48%) had events. Twelve patients had a low ischemic threshold and 3 or 4 risk factors; 4 (33%, 95% CI 10% to 65%) had events. Preoperative myocardial revascularization was performed in only 3 patients, none of whom had events. Perioperative and long-term events occurred despite the use of beta blockers; 7 of 41 beta blocker-treated patients had a perioperative event (17%, 95% CI 7% to 32%); these treated patients were at higher anticipated risk than untreated patients (20 +/- 24% vs 10 +/- 19%, p = 0.02). The total event rate over late follow-up was 13%, and was predicted by dobutamine-atropine stress echo results and heart rate response. (C) 2002 by Excerpta Medica, Inc.
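As a schematic restatement of the composite strata reported above (not a validated clinical tool), the rule could be encoded as follows; the boolean ischemic-threshold flag stands in for the heart-rate cut, which is not specified in the abstract, and the observed event rates are quoted only as comments.

def risk_stratum(ischemia: bool, high_ischemic_threshold: bool, n_risk_factors: int) -> str:
    """Schematic restatement of the strata reported in the study abstract.

    Observed perioperative event rates in that cohort (illustrative only):
    no ischemia ~2%; ischemia with 1-2 risk factors ~9%;
    high-threshold ischemia with 3-4 risk factors ~20%;
    low-threshold ischemia with 3-4 risk factors ~33%.
    """
    if not ischemia:
        return "low risk"
    if n_risk_factors <= 2:
        return "intermediate risk"
    return "high risk" if high_ischemic_threshold else "highest risk"

print(risk_stratum(ischemia=True, high_ischemic_threshold=False, n_risk_factors=3))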
Abstract:
One of the main implications of the efficient market hypothesis (EMH) is that expected future returns on financial assets are not predictable if investors are risk neutral. In this paper we argue that financial time series offer more information than this hypothesis seems to suggest. In particular, we postulate that runs of very large returns can be predictable over small time periods. In order to prove this we propose a TAR(3,1)-GARCH(1,1) model that is able to describe two different types of extreme events: a first type generated by large-uncertainty regimes, where runs of extremes are not predictable, and a second type where extremes come from isolated dread/joy events. This model is new in the literature on nonlinear processes. Its novelty resides in two features that make it different from previous TAR methodologies: the regimes are motivated by the occurrence of extreme values, and the threshold variable is defined by the shock affecting the process in the preceding period. In this way the model is able to uncover dependence and clustering of extremes in high- as well as low-volatility periods. The model is tested with data on General Motors stock prices corresponding to two crises that had a substantial impact on financial markets worldwide: the Black Monday of October 1987 and September 11th, 2001. By analyzing the periods around these crises we find evidence of statistical significance of our model, and thereby of predictability of extremes, for September 11th but not for Black Monday. These findings support the hypotheses of a big negative event producing runs of negative returns in the first case, and of the burst of a worldwide stock market bubble in the second example.
JEL classification: C12; C15; C22; C51
Keywords and phrases: asymmetries, crises, extreme values, hypothesis testing, leverage effect, nonlinearities, threshold models
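A minimal simulation sketch of a three-regime threshold AR(1) with GARCH(1,1) errors, where, as in the model described above, the regime is selected by the shock of the preceding period. The coefficients and threshold values are illustrative assumptions, not the estimates from the paper.

import numpy as np

rng = np.random.default_rng(5)
T = 2000
phi = {"low": -0.3, "mid": 0.05, "high": -0.3}   # AR(1) coefficients per regime (illustrative)
r_lo, r_hi = -1.5, 1.5                           # thresholds on the lagged shock
omega, a1, b1 = 0.05, 0.08, 0.90                 # GARCH(1,1) parameters (illustrative)

y = np.zeros(T)
eps = np.zeros(T)
h = np.full(T, omega / (1 - a1 - b1))            # conditional variance, started at its unconditional value

for t in range(1, T):
    h[t] = omega + a1 * eps[t - 1] ** 2 + b1 * h[t - 1]
    # Regime chosen by the preceding period's shock (the threshold variable).
    regime = "low" if eps[t - 1] < r_lo else ("high" if eps[t - 1] > r_hi else "mid")
    eps[t] = np.sqrt(h[t]) * rng.standard_normal()
    y[t] = phi[regime] * y[t - 1] + eps[t]

print("sample kurtosis:", round(((y - y.mean()) ** 4).mean() / y.var() ** 2, 2))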
Abstract:
We study the concept of propagation connectivity on random 3-uniform hypergraphs. This concept is inspired by a simple linear time algorithm for solving instances of certain constraint satisfaction problems. We derive upper and lower bounds for the propagation connectivity threshold, and point out some algorithmic implications.
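Under one common formulation of propagation connectivity (marking spreads through a hyperedge once two of its three vertices are marked, and the hypergraph is propagation connected if some starting pair of vertices eventually marks every vertex), a brute-force check on a small random 3-uniform hypergraph can be sketched as follows. This is not the linear-time algorithm alluded to above, and the instance size and edge count are illustrative.

import itertools
import random

def propagates(n, edges, start):
    # Greedy closure: a hyperedge with exactly two marked vertices marks its third vertex.
    marked = set(start)
    changed = True
    while changed:
        changed = False
        for e in edges:
            if len(marked.intersection(e)) == 2:
                marked |= set(e)
                changed = True
    return len(marked) == n

random.seed(6)
n, m = 12, 40                                                # small illustrative instance
edges = [frozenset(random.sample(range(n), 3)) for _ in range(m)]

# Propagation connected iff some starting pair of vertices marks every vertex.
print(any(propagates(n, edges, pair) for pair in itertools.combinations(range(n), 2)))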
Abstract:
In this paper we consider extensions of smooth transition autoregressive (STAR) models to situations where the threshold is a time-varying function of variables that affect the separation of regimes of the time series under consideration. Our specification is motivated by the observation that unusually high/low values for an economic variable may sometimes be best thought of in relative terms. State-dependent logistic STAR and contemporaneous-threshold STAR models are introduced and discussed. These models are also used to investigate the dynamics of U.S. short-term interest rates, where the threshold is allowed to be a function of past output growth and inflation.
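A minimal simulation sketch of a two-regime logistic STAR process in which the threshold is time-varying, being a linear function of lagged stand-ins for output growth and inflation. All coefficients and the exogenous series are illustrative assumptions, not the paper's specification or estimates.

import numpy as np

rng = np.random.default_rng(7)
T = 500
# Illustrative exogenous state variables standing in for past output growth and inflation.
growth = rng.normal(0.5, 1.0, T)
inflation = rng.normal(2.0, 1.0, T)

gamma = 5.0                      # smoothness of the logistic transition
c0, c1, c2 = 0.0, 0.5, 0.3       # coefficients of the time-varying threshold (illustrative)
phi_low, phi_high = 0.9, 0.4     # AR(1) coefficients in the two regimes

y = np.zeros(T)
for t in range(1, T):
    c_t = c0 + c1 * growth[t - 1] + c2 * inflation[t - 1]   # state-dependent threshold
    G = 1.0 / (1.0 + np.exp(-gamma * (y[t - 1] - c_t)))     # logistic transition weight
    y[t] = ((1 - G) * phi_low + G * phi_high) * y[t - 1] + 0.3 * rng.standard_normal()

print("sample mean and std:", round(y.mean(), 3), round(y.std(), 3))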
Abstract:
This article analyzes empirically the main existing theories on income and population city growth: increasing returns to scale, locational fundamentals, and random growth. To do this we implement a threshold nonlinearity test that extends standard linear growth regression models to a dataset of urban, climatological, and macroeconomic variables for 1,175 U.S. cities. Our analysis reveals the existence of increasing returns when per-capita income levels are beyond $19,264. Despite this, income growth is mostly explained by social and locational fundamentals. Population growth also exhibits two distinct equilibria determined by a threshold value of 116,300 inhabitants, beyond which city population grows at a higher rate. Income and population growth do not go hand in hand, implying an optimal level of population beyond which income growth stagnates or deteriorates.
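A minimal sketch of estimating a single income threshold by grid search, splitting a linear growth regression at each candidate value and minimizing the pooled sum of squared residuals. The simulated data, the trimming fractions, and the two-regime specification are assumptions; the paper's threshold nonlinearity test additionally involves inference on whether the threshold exists at all.

import numpy as np

rng = np.random.default_rng(8)
n = 1000
income = rng.uniform(5_000, 40_000, n)               # hypothetical per-capita income levels
growth = np.where(income > 19_264, 0.03 + 2e-7 * income, 0.01) + rng.normal(0, 0.01, n)

def ssr_at(tau):
    # Fit a separate linear regression of growth on income in each regime and pool the SSR.
    ssr = 0.0
    for mask in (income <= tau, income > tau):
        X = np.column_stack([np.ones(mask.sum()), income[mask]])
        beta, res, *_ = np.linalg.lstsq(X, growth[mask], rcond=None)
        ssr += res[0] if res.size else ((growth[mask] - X @ beta) ** 2).sum()
    return ssr

# Grid search over candidate thresholds, trimming the tails so both regimes keep observations.
grid = np.quantile(income, np.linspace(0.15, 0.85, 200))
tau_hat = grid[np.argmin([ssr_at(t) for t in grid])]
print("estimated income threshold:", round(float(tau_hat)))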
Abstract:
Analyzing the relationship between the baseline value and subsequent change of a continuous variable is a frequent matter of inquiry in cohort studies. These analyses are surprisingly complex, particularly if only two waves of data are available. It is unclear to non-biostatisticians where the complexity of this analysis lies and which statistical method is adequate. With the help of simulated longitudinal data of body mass index in children, we review statistical methods for the analysis of the association between the baseline value and subsequent change, assuming linear growth with time. Key issues in such analyses are mathematical coupling, measurement error, variability of change between individuals, and regression to the mean. Ideally, it is better to rely on multiple repeated measurements at different times, and a linear random effects model is a standard approach if more than two waves of data are available. If only two waves of data are available, our simulations show that Blomqvist's method - which consists of adjusting the estimated regression coefficient of observed change on baseline value for the measurement error variance - provides accurate estimates. The adequacy of the methods to assess the relationship between the baseline value and subsequent change depends on the number of data waves, the availability of information on measurement error, and the variability of change between individuals.
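A minimal sketch of a Blomqvist-type correction as described above, assuming the measurement error variance is known: the naive slope of observed change on observed baseline is rescaled using that variance. The simulated BMI-like data and the error standard deviation are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(9)
n = 2000
true_baseline = rng.normal(18, 2, n)                            # hypothetical true BMI at wave 1
true_change = 1.0 - 0.2 * (true_baseline - 18) + rng.normal(0, 0.5, n)
sigma_e = 0.8                                                   # assumed known measurement error SD
y1 = true_baseline + rng.normal(0, sigma_e, n)                  # observed wave-1 value
y2 = true_baseline + true_change + rng.normal(0, sigma_e, n)    # observed wave-2 value

d = y2 - y1
# Naive slope of observed change on observed baseline: biased because the wave-1
# measurement error appears (with opposite signs) in both the outcome and the predictor.
b_naive = np.cov(d, y1)[0, 1] / np.var(y1, ddof=1)

# Blomqvist-type correction: rescale the naive slope using the measurement error variance.
var_y1 = np.var(y1, ddof=1)
b_corrected = (b_naive * var_y1 + sigma_e**2) / (var_y1 - sigma_e**2)

print("naive slope:", round(b_naive, 3), " corrected slope:", round(b_corrected, 3))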