923 results for random oracle
On degeneracy and invariances of random fields paths with applications in Gaussian process modelling
Abstract:
We study pathwise invariances and degeneracies of random fields with motivating applications in Gaussian process modelling. The key idea is that a number of structural properties one may wish to impose a priori on functions boil down to degeneracy properties under well-chosen linear operators. We first show in a second-order set-up that almost sure degeneracy of random field paths under some class of linear operators defined in terms of signed measures can be controlled through the first two moments. A special focus is then put on the Gaussian case, where these results are revisited and extended to further linear operators thanks to state-of-the-art representations. Several degeneracy properties are tackled, including random fields with symmetric paths, centred paths, harmonic paths, or sparse paths. The proposed approach delivers a number of promising results and perspectives in Gaussian process modelling. In a first numerical experiment, it is shown that dedicated kernels can be used to infer an axis of symmetry. Our second numerical experiment deals with conditional simulations of a solution to the heat equation, and it is found that adapted kernels notably enable improved predictions of non-linear functionals of the field such as its maximum.
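As a minimal sketch of how a kernel can encode symmetric paths, the Python snippet below symmetrises a base kernel over a reflection about an axis. It is an illustration under stated assumptions, not the authors' implementation: the one-dimensional input, the squared-exponential base kernel, and the names `rbf`, `reflect` and `symmetrised_kernel` are all hypothetical choices. The construction is the covariance of (f(x) + f(s(x))) / 2 when f has the base kernel and s is the reflection.

```python
import math


def rbf(x, y, lengthscale=1.0):
    """Squared-exponential base kernel (an assumed choice for illustration)."""
    return math.exp(-0.5 * ((x - y) / lengthscale) ** 2)


def reflect(x, axis=0.0):
    """Reflect a 1-D point about the location `axis`."""
    return 2.0 * axis - x


def symmetrised_kernel(x, y, axis=0.0, base=rbf):
    """Covariance of (f(x) + f(reflect(x))) / 2 when f has covariance `base`.

    Averaging the base kernel over the reflection of each argument yields a
    kernel that is unchanged when either argument is reflected about `axis`.
    """
    sx, sy = reflect(x, axis), reflect(y, axis)
    return 0.25 * (base(x, y) + base(sx, y) + base(x, sy) + base(sx, sy))


if __name__ == "__main__":
    # The two values agree up to floating-point rounding.
    print(symmetrised_kernel(0.3, 1.2, axis=0.5))
    print(symmetrised_kernel(reflect(0.3, axis=0.5), 1.2, axis=0.5))
```

In this sketch `axis` plays the role of a kernel hyperparameter, so in principle it could be estimated alongside the lengthscale, which is one way to read the abstract's remark about inferring an axis of symmetry.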
Abstract:
Theory on plant succession predicts a temporal increase in the complexity of spatial community structure and of competitive interactions: initially random occurrences of early colonising species shift towards spatially and competitively structured plant associations in later successional stages. Here we use long-term data on early plant succession in a German post-mining area to disentangle the importance of random colonisation, habitat filtering, and competition on the temporal and spatial development of plant community structure. We used species co-occurrence analysis and a recently developed method for assessing competitive strength and hierarchies (transitive versus intransitive competitive orders) in multispecies communities. We found that species turnover decreased through time within interaction neighbourhoods, but increased through time outside interaction neighbourhoods. Successional change did not lead to modular community structure. After accounting for species richness effects, the strength of competitive interactions and the proportion of transitive competitive hierarchies increased through time. Although effects of habitat filtering were weak, random colonisation and subsequent competitive interactions had strong effects on community structure. Because competitive strength and transitivity were poorly correlated with soil characteristics, there was little evidence for context-dependent competitive strength associated with intransitive competitive hierarchies.
Abstract:
This paper examines how preference correlation and intercorrelation combine to influence the length of a decentralized matching market's path to stability. In simulated experiments, marriage markets with various preference specifications begin at an arbitrary matching of couples and proceed toward stability via the random mechanism proposed by Roth and Vande Vate (1990). The results of these experiments reveal that fundamental preference characteristics are critical in predicting how long the market will take to reach a stable matching. In particular, intercorrelation and correlation are shown to have an exponential impact on the number of blocking pairs that must be randomly satisfied before stability is attained. The magnitude of the impact is dramatically different, however, depending on whether preferences are positively or negatively intercorrelated.
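The random mechanism referred to above can be sketched in a few lines of Python: start from an arbitrary matching, pick one blocking pair uniformly at random, match that pair (leaving their former partners single), and repeat until no blocking pair remains. The sketch assumes equal numbers of men and women with complete strict preference lists, so that being single is the worst outcome for everyone. The function names, the uniform choice among blocking pairs, and the `max_steps` guard are illustrative assumptions, not details taken from the paper.

```python
import random


def ranks(prefs):
    """ranks[i][j] = position of j in agent i's list (0 = most preferred)."""
    return [{partner: r for r, partner in enumerate(row)} for row in prefs]


def blocking_pairs(wife, husband, rank_m, rank_w):
    """All (man, woman) pairs who both prefer each other to their current state."""
    n = len(wife)
    found = []
    for m in range(n):
        for w in range(n):
            if wife[m] == w:
                continue
            # Assumption: any partner is preferred to being single.
            m_gains = wife[m] is None or rank_m[m][w] < rank_m[m][wife[m]]
            w_gains = husband[w] is None or rank_w[w][m] < rank_w[w][husband[w]]
            if m_gains and w_gains:
                found.append((m, w))
    return found


def random_path_to_stability(pref_m, pref_w, wife, rng, max_steps=100_000):
    """Satisfy randomly chosen blocking pairs until the matching is stable.

    Returns the final matching (wife[m] for each man m) and the number of
    blocking pairs that were satisfied along the way.
    """
    n = len(pref_m)
    rank_m, rank_w = ranks(pref_m), ranks(pref_w)
    wife = list(wife)
    husband = [None] * n
    for m, w in enumerate(wife):
        if w is not None:
            husband[w] = m
    for steps in range(max_steps):
        blocks = blocking_pairs(wife, husband, rank_m, rank_w)
        if not blocks:
            return wife, steps
        m, w = rng.choice(blocks)
        # The former partners of m and w, if any, become single.
        if wife[m] is not None:
            husband[wife[m]] = None
        if husband[w] is not None:
            wife[husband[w]] = None
        wife[m], husband[w] = w, m
    raise RuntimeError("no stable matching reached within max_steps")


if __name__ == "__main__":
    rng = random.Random(0)
    n = 5
    pref_m = [rng.sample(range(n), n) for _ in range(n)]
    pref_w = [rng.sample(range(n), n) for _ in range(n)]
    matching, steps = random_path_to_stability(pref_m, pref_w, list(range(n)), rng)
    print(matching, steps)
```

Repeating the run over many random seeds and preference profiles, and recording the returned step count, would give the kind of path-length distribution the abstract describes; how preferences are correlated is left to the caller.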
Abstract:
We present a framework for fitting multiple random walks to animal movement paths consisting of ordered sets of step lengths and turning angles. Each step and turn is assigned to one of a number of random walks, each characteristic of a different behavioral state. Behavioral state assignments may be inferred purely from movement data or may include the habitat type in which the animals are located. Switching between different behavioral states may be modeled explicitly using a state transition matrix estimated directly from data, or switching probabilities may take into account the proximity of animals to landscape features. Model fitting is undertaken within a Bayesian framework using the WinBUGS software. These methods allow for identification of different movement states using several properties of observed paths and lead naturally to the formulation of movement models. Analysis of relocation data from elk released in east-central Ontario, Canada, suggests a biphasic movement behavior: elk are either in an "encamped" state in which step lengths are small and turning angles are high, or in an "exploratory" state, in which daily step lengths are several kilometers and turning angles are small. Animals encamp in open habitat (agricultural fields and opened forest), but the exploratory state is not associated with any particular habitat type.
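To make the two-state picture concrete, the following Python sketch simulates a path forward from a state transition matrix, with each state drawing its own step lengths and turning angles. It is only a forward simulation, not the Bayesian fitting procedure described above, and every number in it is a hypothetical placeholder. The Weibull step-length and von Mises turning-angle distributions are likewise assumptions made for illustration.

```python
import math
import random

# Hypothetical per-state parameters: Weibull scale/shape for step length (km)
# and von Mises mean/concentration for turning angle (radians).
STATES = {
    "encamped":    {"scale": 0.1, "shape": 1.0, "mu": math.pi, "kappa": 0.5},
    "exploratory": {"scale": 3.0, "shape": 2.0, "mu": 0.0,     "kappa": 4.0},
}

# Hypothetical state transition matrix: TRANSITION[current][next].
TRANSITION = {
    "encamped":    {"encamped": 0.9, "exploratory": 0.1},
    "exploratory": {"encamped": 0.2, "exploratory": 0.8},
}


def simulate_path(n_steps, rng, start_state="encamped"):
    """Simulate (x, y, state) tuples; `state` governs the next step taken."""
    x, y, heading = 0.0, 0.0, 0.0
    state = start_state
    path = [(x, y, state)]
    for _ in range(n_steps):
        params = STATES[state]
        step = rng.weibullvariate(params["scale"], params["shape"])
        turn = rng.vonmisesvariate(params["mu"], params["kappa"])
        heading = (heading + turn) % (2.0 * math.pi)
        x += step * math.cos(heading)
        y += step * math.sin(heading)
        # Draw the next behavioural state from the current state's row.
        row = TRANSITION[state]
        state = rng.choices(list(row), weights=list(row.values()))[0]
        path.append((x, y, state))
    return path


if __name__ == "__main__":
    for point in simulate_path(10, random.Random(1)):
        print(point)
```

With these placeholder values the encamped state produces short steps with frequent reversals and the exploratory state produces long, nearly straight steps, mirroring the qualitative contrast reported for the elk data.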
Abstract:
Objective. To determine the accuracy of the urine protein:creatinine ratio (pr:cr) in predicting 300 mg of protein in a 24-hour urine collection in pregnant patients with suspected preeclampsia. Methods. A systematic review was performed. Articles were identified through electronic databases, and relevant citations were found by hand searching textbooks and review articles. Included studies evaluated patients for suspected preeclampsia with a 24-hour urine sample and a pr:cr. Only English-language articles were included. Studies that included patients with chronic illness such as chronic hypertension, diabetes mellitus or renal impairment were excluded from the review. Two researchers extracted accuracy data for pr:cr relative to a gold standard of 300 mg of protein in a 24-hour sample, as well as population and study characteristics. The data were analyzed and summarized in tabular and graphical form. Results. Sixteen studies were identified, of which only three met our inclusion criteria, with 510 total patients. The studies evaluated different cut-points for positivity of pr:cr, from 130 mg/g to 700 mg/g. Sensitivities and specificities for a pr:cr of 130-150 mg/g were 90-93% and 33-65%, respectively; for a pr:cr of 300 mg/g, 81-95% and 52-80%, respectively; and for a pr:cr of 600-700 mg/g, 85-87% and 96-97%, respectively. Conclusion. The value of a random pr:cr to exclude preeclampsia is limited because even low levels of pr:cr (130-150 mg/g) may miss up to 10% of patients with significant proteinuria. A pr:cr of more than 600 mg/g may obviate a 24-hour collection.
Abstract:
Random Forests™ is reported to be one of the most accurate classification algorithms in complex data analysis. It shows excellent performance even when most predictors are noisy and the number of variables is much larger than the number of observations. In this thesis, Random Forests was applied to a large-scale lung cancer case-control study. A novel way of automatically selecting prognostic factors was proposed, and a synthetic positive control was used to validate the Random Forests method. Throughout this study we showed that Random Forests can deal with a large number of weak input variables without overfitting. It can account for non-additive interactions between these input variables. Random Forests can also be used for variable selection without being adversely affected by collinearities. Random Forests can deal with large-scale data sets without rigorous data preprocessing, and it has a robust variable importance ranking measure. We propose a novel variable selection method in the context of Random Forests that uses the data noise level as the cut-off value to determine the subset of important predictors. This new approach enhanced the ability of the Random Forests algorithm to automatically identify important predictors for complex data. The cut-off value can also be adjusted based on the results of the synthetic positive control experiments. When the data set had a high variables-to-observations ratio, Random Forests complemented the established logistic regression. This study suggested that Random Forests be recommended for such high-dimensional data. One can use Random Forests to select the important variables and then use logistic regression or Random Forests itself to estimate the effect size of the predictors and to classify new observations. We also found that the mean decrease in accuracy is a more reliable variable ranking measure than the mean decrease in Gini.
Abstract:
Gastroesophageal reflux disease is a common condition affecting 25 to 40% of the population and causes significant morbidity in the U.S., accounting for at least 9 million office visits to physicians with estimated annual costs of $10 billion. Previous research has not clearly established whether infection with Helicobacter pylori, a known cause of peptic ulcer, atrophic gastritis and non-cardia adenocarcinoma of the stomach, is associated with gastroesophageal reflux disease. This study is a secondary analysis of data collected in a cross-sectional study of a random sample of adult residents of Ciudad Juarez, Mexico, that was conducted in 2004 (Prevalence and Determinants of Chronic Atrophic Gastritis Study or CAG study, Dr. Victor M. Cardenas, Principal Investigator). In this study, the presence of gastroesophageal reflux disease was based on responses to the previously validated Spanish Language Dyspepsia Questionnaire. Responses to this questionnaire indicating the presence of gastroesophageal reflux symptoms and disease were compared with the presence of H. pylori infection as measured by culture, histology and rapid urease test, and with findings of upper endoscopy (i.e., hiatus hernia and erosive and atrophic esophagitis). The prevalence ratio was calculated using bivariate, stratified and multivariate negative binomial logistic regression analyses in order to assess the relation between active H. pylori infection and the prevalence of gastroesophageal reflux typical syndrome and disease, while controlling for known risk factors of gastroesophageal reflux disease such as obesity. In a random sample of 174 adults, 48 (27.6%) of the study participants had typical reflux syndrome and only 5% (or 9/174) had gastroesophageal reflux disease per se according to the Montreal consensus, which defines reflux syndromes and disease based on whether the symptoms are perceived as troublesome by the subject. There was no association between H. pylori infection and typical reflux syndrome or gastroesophageal reflux disease. However, we found that in this Northern Mexican population, there was a moderate association (Prevalence Ratio=2.5; 95% CI=1.3, 4.7) between obesity (≥30 kg/m²) and typical reflux syndrome. Management and prevention of obesity will significantly curb the growing numbers of persons affected by gastroesophageal reflux symptoms and disease in Northern Mexico.
Abstract:
This paper presents an algorithm for generating scale-free networks with adjustable clustering coefficient. The algorithm is based on a random walk procedure combined with a triangle generation scheme which takes into account genetic factors; this way, preferential attachment and clustering control are implemented using only local information. Simulations are presented which support the validity of the scheme, characterizing its tuning capabilities.
Abstract:
We establish a refined version of the Second Law of Thermodynamics for Langevin stochastic processes describing mesoscopic systems driven by conservative or non-conservative forces and interacting with thermal noise. The refinement is based on Monge-Kantorovich optimal mass transport and becomes relevant for processes far from the quasi-stationary regime. The general discussion is illustrated by a numerical analysis of the optimal memory erasure protocol for a model of a micron-size particle manipulated by optical tweezers.