136 results for bootstrapping
Abstract:
Scenarios for the emergence or bootstrap of a lexicon involve the repeated interaction between at least two agents who must reach a consensus on how to name N objects using H words. Here we consider minimal models of two types of learning algorithms: cross-situational learning, in which the individuals determine the meaning of a word by looking for something in common across all observed uses of that word, and supervised operant conditioning learning, in which there is strong feedback between individuals about the intended meaning of the words. Despite the stark differences between these learning schemes, we show that they yield the same communication accuracy in the limits of large N and H, which coincides with the result of the classical occupancy problem of randomly assigning N objects to H words.
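The occupancy-problem limit above can be probed with a toy simulation. The sketch below is illustrative only: it assumes a particular accuracy convention (the listener guesses uniformly among all objects sharing the heard word), which is a plausible stand-in for, not necessarily identical to, the paper's observable.

```python
import random

def random_naming_accuracy(n_objects, n_words, trials=2000, seed=0):
    """Assign each of N objects a word drawn uniformly from H words,
    then score communication accuracy as the chance a listener recovers
    the intended object by guessing uniformly among all objects that
    share the heard word (an assumed convention for illustration)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        words = [rng.randrange(n_words) for _ in range(n_objects)]
        counts = {}
        for w in words:
            counts[w] = counts.get(w, 0) + 1
        # average over which object the speaker tries to name
        total += sum(1.0 / counts[w] for w in words) / n_objects
    return total / trials
```

With a single word the accuracy collapses to 1/N, and with many more words than objects it approaches 1, matching the intuition behind the large-N, large-H limit.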
Abstract:
We analyse the finite-sample behaviour of two second-order bias-corrected alternatives to the maximum-likelihood estimator of the parameters in a multivariate normal regression model with general parametrization proposed by Patriota and Lemonte [A. G. Patriota and A. J. Lemonte, Bias correction in a multivariate normal regression model with general parameterization, Stat. Prob. Lett. 79 (2009), pp. 1655-1662]. The two finite-sample corrections we consider are the conventional second-order bias-corrected estimator and the bootstrap bias correction. We present numerical results comparing the performance of these estimators. Our results reveal that the analytical bias correction outperforms numerical bias corrections obtained from bootstrapping schemes.
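The bootstrap bias correction the comparison refers to can be sketched generically: estimate the bias as the mean of the bootstrap replicates minus the original estimate, then subtract it. The normal-variance MLE below is a standard textbook example of a downward-biased estimator, not the multivariate regression setting of the paper.

```python
import random
import statistics

def bootstrap_bias_corrected(sample, estimator, n_boot=500, seed=0):
    """Nonparametric bootstrap bias correction:
    corrected = 2 * theta_hat - mean(bootstrap replicates)."""
    rng = random.Random(seed)
    theta_hat = estimator(sample)
    reps = []
    for _ in range(n_boot):
        resample = [rng.choice(sample) for _ in sample]
        reps.append(estimator(resample))
    return 2.0 * theta_hat - statistics.fmean(reps)

def var_mle(xs):
    """Normal-variance MLE (divides by n), biased downward."""
    m = statistics.fmean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)
```

For the variance MLE the correction inflates the estimate by roughly a factor (1 + 1/n), mirroring the usual n/(n-1) adjustment.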
Abstract:
In order to extend previous SAR and QSAR studies, 3D-QSAR analysis has been performed using CoMFA and CoMSIA approaches applied to a set of 39 alpha-(N)-heterocyclic carboxaldehyde thiosemicarbazones with their inhibitory activity values (IC(50)) evaluated against ribonucleotide reductase (RNR) of H.Ep.-2 cells (human epidermoid carcinoma), taken from selected literature. Both rigid and field alignment methods, taking the unsubstituted 2-formylpyridine thiosemicarbazone in its syn conformation as template, have been used to generate multiple predictive CoMFA and CoMSIA models derived from training sets and validated with the corresponding test sets. Acceptable predictive correlation coefficients (Q(cv)(2) from 0.360 to 0.609 for CoMFA and Q(cv)(2) from 0.394 to 0.580 for CoMSIA models) with high fitted correlation coefficients (r(2) from 0.881 to 0.981 for CoMFA and r(2) from 0.938 to 0.993 for CoMSIA models) and low standard errors (s from 0.135 to 0.383 for CoMFA and s from 0.098 to 0.240 for CoMSIA models) were obtained. More precise CoMFA and CoMSIA models have been derived considering the subset of thiosemicarbazones (TSC) substituted only at the 5-position of the pyridine ring (n=22). Reasonable predictive correlation coefficients (Q(cv)(2) from 0.486 to 0.683 for CoMFA and Q(cv)(2) from 0.565 to 0.791 for CoMSIA models) with high fitted correlation coefficients (r(2) from 0.896 to 0.997 for CoMFA and r(2) from 0.991 to 0.998 for CoMSIA models) and very low standard errors (s from 0.040 to 0.179 for CoMFA and s from 0.029 to 0.068 for CoMSIA models) were obtained. The stability of the CoMFA and CoMSIA models was further assessed by bootstrapping analysis. For the two sets, the generated CoMSIA models showed, in general, better statistics than the corresponding CoMFA models. The analysis of the CoMFA and CoMSIA contour maps suggests that a hydrogen bond acceptor near the nitrogen of the pyridine ring can enhance inhibitory activity.
This observation agrees with literature data suggesting that the pyridine nitrogen lone pair can complex with the iron ion, leading to species that inhibit RNR. The derived CoMFA and CoMSIA models contribute to understanding the structural features of this class of TSC as antitumor agents in terms of steric, electrostatic, hydrophobic, hydrogen bond donor and hydrogen bond acceptor fields, as well as to the rational design of inhibitors of this key enzyme.
Abstract:
The context of this report and the IRIDIA laboratory are described in the preface. Evolutionary Robotics and the box-pushing task are presented in the introduction. The building of a test system supporting Evolutionary Robotics experiments is then detailed. This system is made of a robot simulator and a Genetic Algorithm. It is used to explore the possibility of evolving box-pushing behaviours. The bootstrapping problem is explained, and a novel approach for dealing with it is proposed, with results presented. Finally, ideas for extending this approach are presented in the conclusion.
Abstract:
This paper deals with the testing of autoregressive conditional duration (ACD) models by gauging the distance between the parametric density and hazard rate functions implied by the duration process and their non-parametric estimates. We derive the asymptotic justification using the functional delta method for fixed and gamma kernels, and then investigate the finite-sample properties through Monte Carlo simulations. Although our tests display some size distortion, bootstrapping suffices to correct the size without compromising their excellent power. We show the practical usefulness of such testing procedures for the estimation of intraday volatility patterns.
Abstract:
Asset allocation decisions and value at risk calculations rely strongly on volatility estimates. Volatility measures such as rolling window, EWMA, GARCH and stochastic volatility are used in practice. GARCH and EWMA type models that incorporate the dynamic structure of volatility and are capable of forecasting future behavior of risk should perform better than constant, rolling window volatility models. For the same asset the model that is the ‘best’ according to some criterion can change from period to period. We use the reality check test to verify whether one model outperforms others over a class of re-sampled time-series data. The test is based on re-sampling the data using stationary bootstrapping. For each re-sample we check the ‘best’ model according to two criteria and analyze the distribution of the performance statistics. We compare constant volatility, EWMA and GARCH models using a quadratic utility function and a risk management measurement as comparison criteria. No model consistently outperforms the benchmark.
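The stationary bootstrap used to re-sample the time series is the Politis-Romano scheme: the resample is a concatenation of blocks whose starting points are uniform over the series and whose lengths are geometrically distributed, wrapping around the end of the series. A minimal sketch (the block-length parameter p is a user choice, here arbitrary):

```python
import random

def stationary_bootstrap(series, p=0.1, seed=0):
    """One stationary-bootstrap resample: geometric block lengths with
    mean 1/p, uniform block starts, circular wrap-around."""
    rng = random.Random(seed)
    n = len(series)
    out = []
    while len(out) < n:
        start = rng.randrange(n)
        length = 1
        while rng.random() >= p:   # geometric(p) block length
            length += 1
        for k in range(length):
            out.append(series[(start + k) % n])
            if len(out) == n:
                break
    return out
```

Repeating this across many resamples and recomputing the performance statistic on each yields the distribution that the reality check compares against.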
Abstract:
Paratelmatobius and Scythrophrys are leptodactylid frogs endemic to the Brazilian Atlantic forest and their close phylogenetic relationship was recently inferred in an analysis that included Paratelmatobius sp. and S. sawayae. To investigate the interspecific relationships among Paratelmatobius and Scythrophrys species, we analyzed a mitochondrial region (approximately 2.4 kb) that included the ribosomal genes 12S and 16S and the tRNAval in representatives of all known localities of these genera and in 54 other species. Maximum parsimony inferences were done using PAUP* and support for the clades was evaluated by bootstrapping. A cytogenetic analysis using Giemsa staining, C-banding and silver staining was also done for those populations of Paratelmatobius not included in previous cytogenetic studies of this genus in order to assess their karyotype differentiation. Our results suggested that Paratelmatobius and Scythrophrys formed a clade strongly supported by bootstrapping, which corroborated their very close phylogenetic relationship. Among the Paratelmatobius species, two clades were identified and corroborated the groups P. mantiqueira and P. cardosoi previously proposed based on morphological characters. The karyotypes of Paratelmatobius sp. 2 and Paratelmatobius sp. 3 described here had diploid chromosome number 2n = 24 and showed many similarities with karyotypes of other Paratelmatobius representatives. The cytogenetic data and the phylogenetic analysis allowed the proposal/corroboration of several hypotheses for the karyotype differentiation within Paratelmatobius and Scythrophrys. Namely, the telocentric pair No. 4 represented a synapomorphy of P. cardosoi and Paratelmatobius sp. 2, while chromosome pair No. 5 with interstitial C-bands could be interpreted as a synapomorphy of the P. cardosoi group. The NOR-bearing chromosome No. 10 in the karyotype of P. poecilogaster was considered homeologous to chromosome No. 
10 in the karyotype of Scythrophrys sp., chromosome No. 9 in the karyotype of Paratelmatobius sp. 1, chromosome No. 8 in the karyotypes of Paratelmatobius sp. 2 and of Paratelmatobius sp. 3, and chromosome No. 7 in the karyotype of P. cardosoi. A hypothesis for the evolutionary divergence of these NOR-bearing chromosomes, which probably involved events like gain in heterochromatin, was proposed.
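The bootstrapping used here to evaluate clade support works by resampling alignment columns with replacement to build pseudo-replicate character matrices, re-running the tree search on each, and reporting the fraction of replicate trees that recover a given clade. A sketch of the resampling step only (the tree search itself would be done in PAUP* or similar software):

```python
import random

def bootstrap_replicate(alignment, seed=0):
    """One bootstrap pseudo-replicate of a sequence alignment: sample
    columns (characters) with replacement, keeping taxa rows in step."""
    rng = random.Random(seed)
    n_cols = len(alignment[0])
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return [''.join(seq[c] for c in cols) for seq in alignment]
```

Each replicate has the same dimensions as the original matrix; only the multiset of columns changes.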
Abstract:
Gastric cancer is the second leading cause of cancer-related death worldwide. The identification of new cancer biomarkers is necessary to reduce the mortality rates through the development of new screening assays and early diagnosis, as well as new target therapies. In this study, we performed a proteomic analysis of noncardia gastric neoplasias of individuals from Northern Brazil. The proteins were analyzed by two-dimensional electrophoresis and mass spectrometry. For the identification of differentially expressed proteins, we used statistical tests with bootstrap resampling to control the type I error in the multiple comparison analyses. We identified 111 proteins involved in gastric carcinogenesis. The computational analysis revealed several proteins involved in the energy production processes and reinforced the Warburg effect in gastric cancer. ENO1 and HSPB1 expression were further evaluated. ENO1 was selected due to its role in aerobic glycolysis that may contribute to the Warburg effect. Although we observed two up-regulated spots of ENO1 in the proteomic analysis, the mean expression of ENO1 was reduced in gastric tumors by western blot. However, mean ENO1 expression seems to increase in more invasive tumors. This lack of correlation between the proteomic and western blot analyses may be due to the presence of other ENO1 spots that present a slightly reduced expression, but with a high impact on the mean protein expression. In neoplasias, HSPB1 is induced by cellular stress to protect cells against apoptosis. In the present study, HSPB1 presented an elevated protein and mRNA expression in a subset of gastric cancer samples. However, no association was observed between HSPB1 expression and clinicopathological characteristics. Here, we identified several possible biomarkers of gastric cancer in individuals from Northern Brazil. 
These biomarkers may be useful for the assessment of prognosis and stratification for therapy if validated in larger clinical study sets.
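A bootstrap test of the kind the study describes can be sketched for a single two-sample comparison: shift both groups to a common mean so the null hypothesis holds, resample within groups, and count exceedances of the observed mean difference. This is one common construction, not necessarily the authors' exact procedure (which handles many proteins jointly to control type I error):

```python
import random
import statistics

def bootstrap_pvalue(a, b, n_boot=2000, seed=0):
    """Two-sample bootstrap test of equal means: center each group on
    the pooled mean (imposing the null), resample within groups, and
    count how often the resampled |mean difference| reaches the
    observed one."""
    rng = random.Random(seed)
    obs = abs(statistics.fmean(a) - statistics.fmean(b))
    pooled = statistics.fmean(list(a) + list(b))
    a0 = [x - statistics.fmean(a) + pooled for x in a]
    b0 = [x - statistics.fmean(b) + pooled for x in b]
    hits = 0
    for _ in range(n_boot):
        ra = [rng.choice(a0) for _ in a0]
        rb = [rng.choice(b0) for _ in b0]
        if abs(statistics.fmean(ra) - statistics.fmean(rb)) >= obs:
            hits += 1
    return (hits + 1) / (n_boot + 1)
```

For many simultaneous comparisons, a max-statistic variant of this resampling is a standard way to keep the family-wise type I error controlled.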
Abstract:
This thesis concerns artificially intelligent natural language processing systems that are capable of learning the properties of lexical items (properties like verbal valency or inflectional class membership) autonomously while fulfilling the tasks for which they were deployed in the first place. Many of these tasks require a deep analysis of language input, which can be characterized as a mapping of utterances in a given input C to a set S of linguistically motivated structures with the help of linguistic information encoded in a grammar G and a lexicon L: G + L + C → S (1) The idea that underlies intelligent lexical acquisition systems is to modify this schematic formula in such a way that the system is able to exploit the information encoded in S to create a new, improved version of the lexicon: G + L + S → L' (2) Moreover, the thesis claims that a system can only be considered intelligent if it does not just make maximum usage of the learning opportunities in C, but if it is also able to revise falsely acquired lexical knowledge. So, one of the central elements in this work is the formulation of a set of criteria for intelligent lexical acquisition systems subsumed under one paradigm: the Learn-Alpha design rule. The thesis describes the design and quality of a prototype for such a system, whose acquisition components have been developed from scratch and built on top of one of the state-of-the-art Head-driven Phrase Structure Grammar (HPSG) processing systems. The quality of this prototype is investigated in a series of experiments, in which the system is fed with extracts of a large English corpus. While the idea of using machine-readable language input to automatically acquire lexical knowledge is not new, we are not aware of a system that fulfills Learn-Alpha and is able to deal with large corpora. 
To illustrate four major challenges in constructing such a system: a) the high number of possible structural descriptions caused by highly underspecified lexical entries demands a parser with a very effective ambiguity management system, b) the automatic construction of concise lexical entries out of a bulk of observed lexical facts requires a special technique of data alignment, c) the reliability of these entries depends on the system's decision on whether it has seen 'enough' input, and d) general properties of language might render some lexical features indeterminable if the system tries to acquire them with too high a precision. The cornerstone of this dissertation is the motivation and development of a general theory of automatic lexical acquisition that is applicable to every language and independent of any particular theory of grammar or lexicon. This work is divided into five chapters. The introductory chapter first contrasts three different and mutually incompatible approaches to (artificial) lexical acquisition: cue-based queries, head-lexicalized probabilistic context free grammars and learning by unification. Then the postulation of the Learn-Alpha design rule is presented. The second chapter outlines the theory that underlies Learn-Alpha and exposes all the related notions and concepts required for a proper understanding of artificial lexical acquisition. Chapter 3 develops the prototyped acquisition method, called ANALYZE-LEARN-REDUCE, a framework which implements Learn-Alpha. The fourth chapter presents the design and results of a bootstrapping experiment conducted on this prototype: lexeme detection, learning of verbal valency, categorization into nominal count/mass classes, selection of prepositions and sentential complements, among others. The thesis concludes with a review of the conclusions and motivation for further improvements as well as proposals for future research on the automatic induction of lexical features.
Abstract:
Little is known about the learning of the skills needed to perform ultrasound- or nerve stimulator-guided peripheral nerve blocks. The aim of this study was to compare the learning curves of residents trained in ultrasound guidance versus residents trained in nerve stimulation for axillary brachial plexus block. Ten residents with no previous experience with using ultrasound received ultrasound training and another ten residents with no previous experience with using nerve stimulation received nerve stimulation training. The novices' learning curves were generated by retrospective data analysis from our electronic anaesthesia database. Individual success rates were pooled, and the institutional learning curve was calculated using a bootstrapping technique in combination with a Monte Carlo simulation procedure. The skills required to perform successful ultrasound-guided axillary brachial plexus block can be learnt faster and lead to a higher final success rate compared to nerve stimulator-guided axillary brachial plexus block.
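A bootstrapping construction of an institutional learning curve can be sketched as resampling residents' pooled 0/1 success records with replacement and reading off a percentile interval at each case number. This is a minimal sketch under that reading; the published method has more moving parts (the Monte Carlo step, smoothing, etc.):

```python
import random
import statistics

def learning_curve_ci(sequences, case_idx, n_boot=1000, seed=0):
    """Bootstrap a pooled success rate at one case number: resample
    residents (rows of 0/1 outcomes) with replacement, recompute the
    mean success rate at that case, and return a percentile 95% CI."""
    rng = random.Random(seed)
    rates = []
    for _ in range(n_boot):
        sample = [rng.choice(sequences) for _ in sequences]
        rates.append(statistics.fmean(s[case_idx] for s in sample))
    rates.sort()
    return rates[int(0.025 * n_boot)], rates[int(0.975 * n_boot)]
```

Sweeping `case_idx` over the case numbers traces out the institutional learning curve with its uncertainty band.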
Abstract:
Estimation of the number of mixture components (k) is an unsolved problem. Available methods for estimating k include bootstrapping the likelihood ratio test statistic and optimizing a variety of validity functionals such as AIC, BIC/MDL, and ICOMP. We investigate minimization of the distance between the fitted mixture model and the true density as a method for estimating k. The distances considered are Kullback-Leibler (KL) and L2. We estimate these distances using cross validation. A reliable estimate of k is obtained by voting over B estimates of k corresponding to B cross validation estimates of distance. This estimation method with KL distance is very similar to the Monte Carlo cross-validated likelihood methods discussed by Smyth (2000). With focus on univariate normal mixtures, we present simulation studies that compare the cross validated distance method with AIC, BIC/MDL, and ICOMP. We also apply the cross validation estimate of distance approach, along with the AIC, BIC/MDL and ICOMP approaches, to data from an osteoporosis drug trial in order to find groups that respond differentially to treatment.
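The voting step described above is simple to state in code: each cross-validation split contributes one vote for the k that minimizes its distance estimate, and the modal k wins. The distance values themselves would come from fitting mixtures on training folds and scoring held-out data; in the sketch below they are stand-in numbers:

```python
from collections import Counter

def vote_for_k(distance_by_split):
    """Each split is a dict mapping candidate k to its estimated
    distance (KL or L2) between the fitted mixture and the truth;
    every split votes for its argmin, and the modal vote wins."""
    votes = [min(d, key=d.get) for d in distance_by_split]
    return Counter(votes).most_common(1)[0][0]
```

Voting makes the final choice of k robust to individual splits whose distance estimate happens to be noisy.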
Abstract:
This event study investigates the impact of the Japanese nuclear disaster in Fukushima-Daiichi on the daily stock prices of French, German, Japanese, and U.S. nuclear utility and alternative energy firms. Hypotheses regarding the (cumulative) abnormal returns based on a three-factor model are analyzed through joint tests by multivariate regression models and bootstrapping. Our results show significant abnormal returns for Japanese nuclear utility firms during the one-week event window and the subsequent four-week post-event window. Furthermore, while French and German nuclear utility and alternative energy stocks exhibit significant abnormal returns during the event window, we cannot confirm abnormal returns for U.S. stocks.
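An event-study bootstrap of this kind can be sketched with a one-factor market model (the study itself uses a three-factor model, simplified here for brevity): fit the factor regression on an estimation window, cumulate abnormal returns over the event window, and bootstrap estimation-window residuals to get a p-value for CAR = 0.

```python
import random
import statistics

def car_bootstrap_pvalue(est_ret, est_mkt, ev_ret, ev_mkt,
                         n_boot=2000, seed=0):
    """Fit r = a + b*m by OLS on the estimation window, compute the
    cumulative abnormal return (CAR) over the event window, then
    bootstrap estimation-window residuals to test CAR = 0."""
    mbar = statistics.fmean(est_mkt)
    rbar = statistics.fmean(est_ret)
    b = sum((m - mbar) * (r - rbar) for m, r in zip(est_mkt, est_ret)) \
        / sum((m - mbar) ** 2 for m in est_mkt)
    a = rbar - b * mbar
    resid = [r - (a + b * m) for r, m in zip(est_ret, est_mkt)]
    car = sum(r - (a + b * m) for r, m in zip(ev_ret, ev_mkt))
    rng = random.Random(seed)
    L, hits = len(ev_ret), 0
    for _ in range(n_boot):
        fake_car = sum(rng.choice(resid) for _ in range(L))
        if abs(fake_car) >= abs(car):
            hits += 1
    return (hits + 1) / (n_boot + 1)
```

A small p-value indicates the event-window returns are abnormal relative to what the estimation-window residuals could produce by chance.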
Abstract:
Peatlands are widely exploited archives of paleoenvironmental change. We developed and compared multiple transfer functions to infer peatland depth to the water table (DWT) and pH based on testate amoeba (percentages, or presence/absence), bryophyte presence/absence, and vascular plant presence/absence data from sub-alpine peatlands in the SE Swiss Alps in order to 1) compare the performance of single-proxy vs. multi-proxy models and 2) assess the performance of presence/absence models. Bootstrapping cross-validation showed that the best performing single-proxy transfer functions for both DWT and pH were those based on bryophytes. The best performing transfer functions overall for DWT were those based on combined testate amoebae percentages, bryophytes and vascular plants; and, for pH, those based on testate amoebae and bryophytes. The comparison of DWT and pH inferred from testate amoeba percentages and presence/absence data showed similar general patterns but differences in the magnitude and timing of some shifts. These results show new directions for paleoenvironmental research, 1) suggesting that it is possible to build good-performing transfer functions using presence/absence data, although with some loss of accuracy, and 2) supporting the idea that multi-proxy inference models may improve paleoecological reconstruction. The performance of multi-proxy and single-proxy transfer functions should be further compared on paleoecological data.
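A transfer function with bootstrap cross-validation, as evaluated in studies of this kind, can be sketched with weighted averaging: each taxon's optimum is the abundance-weighted mean of the environmental variable, and a sample's inference is the abundance-weighted mean of its taxa's optima. This is a generic WA formulation without deshrinking, assumed for illustration rather than taken from the paper:

```python
import math
import random

def wa_transfer(train_spec, train_env, test_spec):
    """Weighted-averaging transfer function: taxon optima from the
    training set, then abundance-weighted predictions for test samples."""
    n_taxa = len(train_spec[0])
    optima = []
    for j in range(n_taxa):
        w = sum(row[j] for row in train_spec)
        optima.append(sum(row[j] * e for row, e in zip(train_spec, train_env)) / w
                      if w > 0 else 0.0)
    preds = []
    for row in test_spec:
        tot = sum(row)
        preds.append(sum(a * o for a, o in zip(row, optima)) / tot if tot else 0.0)
    return preds

def bootstrap_rmsep(spec, env, n_boot=200, seed=0):
    """Bootstrap cross-validation: resample training sites with
    replacement, score each out-of-bag site with the resample's model,
    and pool the squared errors into an RMSEP."""
    rng = random.Random(seed)
    errs = []
    n = len(spec)
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        oob = [i for i in range(n) if i not in set(idx)]
        if not oob:
            continue
        preds = wa_transfer([spec[i] for i in idx], [env[i] for i in idx],
                            [spec[i] for i in oob])
        errs.extend((p - env[i]) ** 2 for p, i in zip(preds, oob))
    return math.sqrt(sum(errs) / len(errs)) if errs else float('nan')
```

Swapping abundances for 0/1 presence/absence in `spec` is exactly the degraded-input scenario the study evaluates; the same code runs unchanged.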