966 results for Random Forests Classifier
Abstract:
This work presents a new general purpose classifier named Averaged Extended Tree Augmented Naive Bayes (AETAN), which is based on combining the advantageous characteristics of Extended Tree Augmented Naive Bayes (ETAN) and Averaged One-Dependence Estimator (AODE) classifiers. We describe the main properties of the approach and algorithms for learning it, along with an analysis of its computational time complexity. Empirical results with numerous data sets indicate that the new approach is superior to ETAN and AODE in terms of both zero-one classification accuracy and log loss. It also compares favourably against weighted AODE and hidden Naive Bayes. The learning phase of the new approach is slower than that of its competitors, while the time complexity for the testing phase is similar. Such characteristics suggest that the new classifier is ideal in scenarios where online learning is not required.
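The two evaluation criteria named in this abstract, zero-one classification accuracy and log loss, can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names, the probability-dictionary format and the example predictions are all assumptions made for illustration.

```python
import math

def zero_one_accuracy(probs, labels):
    """Fraction of cases whose most probable class equals the true label."""
    hits = sum(1 for p, y in zip(probs, labels) if max(p, key=p.get) == y)
    return hits / len(labels)

def log_loss(probs, labels, eps=1e-12):
    """Mean negative log-probability assigned to the true class."""
    # eps guards against log(0) when a classifier assigns zero probability.
    total = sum(math.log(max(p[y], eps)) for p, y in zip(probs, labels))
    return -total / len(labels)

# Hypothetical predictions: one dict of class probabilities per test case.
probs = [{"a": 0.9, "b": 0.1}, {"a": 0.4, "b": 0.6}, {"a": 0.7, "b": 0.3}]
labels = ["a", "b", "b"]

print(zero_one_accuracy(probs, labels))
print(log_loss(probs, labels))
```

Accuracy only checks whether the top-ranked class is right, whereas log loss also penalises confident wrong probabilities, which is presumably why the abstract reports both.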
Abstract:
What is meant by the term random? Do we understand how to identify which type of randomisation to use in our future research projects? We, as researchers, often explain randomisation to potential research participants as being a 50/50 chance of selection to either an intervention or control group, akin to drawing numbers out of a hat. Is this an accurate explanation? And are all methods of randomisation equal? This paper aims to guide the researcher through the different techniques used to randomise participants with examples of how they can be used in educational research.
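As a rough illustration of why randomisation methods are not all equal, the Python sketch below contrasts two standard techniques, simple and block randomisation. The abstract does not say which techniques the paper covers, so the choice of these two, the function names, the arm labels and the block size are assumptions.

```python
import random

ARMS = ["intervention", "control"]

def simple_randomisation(participants, rng):
    """Assign each participant independently, like a fair coin toss."""
    return {p: rng.choice(ARMS) for p in participants}

def block_randomisation(participants, rng, block_size=4):
    """Assign in shuffled blocks that hold equal numbers of each arm."""
    if block_size % 2:
        raise ValueError("block_size must be even")
    allocation = {}
    for start in range(0, len(participants), block_size):
        block = ARMS * (block_size // 2)
        rng.shuffle(block)
        for p, arm in zip(participants[start:start + block_size], block):
            allocation[p] = arm
    return allocation

rng = random.Random(0)  # fixed seed so the sketch is reproducible
people = [f"P{i}" for i in range(1, 13)]
simple = simple_randomisation(people, rng)
blocked = block_randomisation(people, rng)

print(sum(arm == "intervention" for arm in simple.values()))
print(sum(arm == "intervention" for arm in blocked.values()))
```

Simple randomisation gives every participant a 50/50 chance but can leave the groups unequal in size. Block randomisation keeps the arms balanced whenever the number of participants is a multiple of the block size.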
Abstract:
Urothelial cancer (UC) is highly recurrent and can progress from a non-invasive subtype (NMIUC) to a more aggressive muscle-invasive subtype (MIUC) that invades the muscle tissue layer of the bladder. We present a proof-of-principle study showing that network-based features of gene pairs can be used to improve classifier performance and the functional analysis of urothelial cancer gene expression data. In the first step of our procedure, each individual sample of a UC gene expression dataset is inflated with gene-pair expression ratios that are defined by a given network structure. In the second step, an elastic net feature selection procedure for network-based signatures is applied to discriminate between NMIUC and MIUC samples. We performed repeated random subsampling cross-validation in three independent datasets. The network signatures were characterized by a functional enrichment analysis and studied for the enrichment of known cancer genes. We observed that network-based gene signatures from meta-collections of protein-protein interaction (PPI) databases such as CPDB, and from the PPI databases HPRD and BioGrid, improved classification performance compared with single-gene-based signatures. The network-based signatures derived from PPI databases showed a prominent enrichment of cancer genes (e.g., TP53, TRIM27 and HNRNPA2B1). We provide a novel integrative approach to large-scale gene expression analysis for the identification and development of novel diagnostic targets in bladder cancer. Further, our method allowed us to link cancer gene associations to network-based expression signatures that are not observed in gene-based expression signatures.
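The first step described above, inflating each sample with gene-pair expression ratios taken from a network, might look roughly like the Python sketch below. The function name, the feature-naming scheme, the handling of missing genes and zero denominators, and all expression values and edges are hypothetical; the abstract does not give these details.

```python
def inflate_with_pair_ratios(sample, edges):
    """Return a copy of `sample` extended with one ratio feature per network edge."""
    inflated = dict(sample)
    for gene_a, gene_b in edges:
        # Assumption: skip edges with a missing gene or a zero denominator.
        if gene_a in sample and gene_b in sample and sample[gene_b] != 0:
            inflated[f"{gene_a}/{gene_b}"] = sample[gene_a] / sample[gene_b]
    return inflated

# Hypothetical expression values and network edges (not real PPI data).
sample = {"TP53": 8.0, "TRIM27": 4.0, "GENE_X": 2.0}
edges = [("TP53", "TRIM27"), ("TRIM27", "GENE_X"), ("TP53", "MISSING")]

inflated = inflate_with_pair_ratios(sample, edges)
print(sorted(inflated))
```

The inflated samples would then be passed to the feature selection step. The elastic net stage is not shown here.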
Abstract:
The pinewood nematode, Bursaphelenchus xylophilus, is the causal agent of pine wilt disease, a serious threat to native pine forests in eastern Asia (Japan, Korea, China and Taiwan) and some parts of North America (USA, Canada and Mexico). In 1999, this nematode was found and identified for the first time in Portugal and in Europe. The detection of this quarantine pest in Portugal has shown the need to know more about the distribution of Bursaphelenchus spp. in coniferous trees in Europe, in order to describe the geographic range of the species and to act quickly if the nematode is unintentionally introduced into other European regions. Pine forest is widely distributed in Turkey, which increases the number of susceptible host trees for the pinewood nematode. For these reasons, some regions of Turkey were surveyed for the presence of the nematode. Three different species of Bursaphelenchus were found. However, B. xylophilus was not detected. The detection of B. mucronatus, which is biologically and morphologically very similar to B. xylophilus, is very important. The presence of this species indicates that B. xylophilus could spread easily in the conifer forests of Turkey. In a study conducted to determine the pathogenicity of B. mucronatus, 80% of P. sylvestris seedlings wilted. Biological characteristics of M. galloprovincialis were compared with those of M. carolinensis, the North American vector, and some were found to be similar.
Abstract:
The pinewood nematode (PWN), Bursaphelenchus xylophilus, has a wide distribution in North America and is present throughout most of the territories of Canada and the United States. During the last century, this species has been transported by man to several non-native regions of the world, in association with trade and the global flow of forest products. To date, this invasive species has been reported from Asia (PR China, Japan, Korea and Taiwan) and, more recently, from Europe (Portugal). Because of its impact on the native pine forests of these regions, this nematode species, the causal agent of pine wilt disease, is of great economic importance worldwide. In Portugal, the distribution of the PWN has been confined to a relatively small area (500 000 ha) south of Lisbon (Setúbal Peninsula); however, it is one of the most serious threats to pine forests in the country and in the EU. Until recently, no consensus had emerged on the possible pathway of the PWN introduction into Portugal. Several hypotheses have been put forward to explain this introduction, such as an origin in endemic areas where the nematode occurs naturally (North America), or in non-endemic areas where the nematode behaves as an invasive species (East Asia).
Random amplified polymorphic DNA (RAPD-PCR) and satellite DNA (satDNA) techniques were used to assess the level of genetic variability and the genetic relationships among several isolates of the PWN, representative of the entire affected area in Portugal. For RAPD-PCR, 24 Portuguese isolates were included, plus two additional isolates of B. xylophilus, one from North America and one from East Asia. B. mucronatus was used as an out-group. Twenty-eight random primers generated a total of 640 DNA fragments. With satDNA, 206 MspI sequence repeats were obtained from 21 Portuguese isolates of B. xylophilus. Both molecular methods revealed a high genetic similarity among the Portuguese isolates, and the low level of genetic diversity strongly suggests that they were dispersed recently from a single introduction originating in East Asia. The lack of an apparent relationship between the genetic variability and the geographic distribution of the PWN within the affected area suggests that the nematode has spread uniformly across that area since its establishment, probably following the natural distribution and expansion of the insect vector.
Abstract:
Preference-based measures of health have become widely used in health economics for measuring Health-Related Quality of Life. Hence, the development of such measures has been a major concern for researchers throughout the world. This study aims to model health state preference data using a new preference-based measure of health (the SF-6D) and to suggest alternative models for predicting health state utilities using fixed and random effects models. It also seeks to investigate the problems found in the SF-6D and to suggest possible changes to it.
Abstract:
The problem of Small Area Estimation is how to produce reliable estimates of domain characteristics when the sample size within a domain is very small or even zero.
Abstract:
We propose the hypothesis that all of the United Kingdom (UK) is likely to be affected by spores of Ganoderma sp., an important plant pathogen. We suggest that the main sources of this pathogen, which acts as a bioaerosol, are the widely scattered woodlands in the country, although remote sources must not be neglected. The hypothesis is based on related studies of bioaerosols and is supported by new observations from a non-forest site and by model calculations. Hourly concentrations of Ganoderma sp. spores were measured from 2006 to 2010 using a 7-day volumetric spore trap in the city of Worcester. The concentrations peak during the night and early in the morning. This suggests that the main spore sources are located a few hours away in terms of air mass transport, which carries the spores into urban areas. Back-trajectory analysis was applied to determine the location of Ganoderma sp. spore sources. The analysis demonstrated that 78% of the air masses reached Worcester from within a 180° arc from East to West. Three episodes were selected for detailed investigation. They revealed that, during these episodes, air masses always passed over the main UK woodlands before arriving in Worcester, regardless of their origin, although long-distance transport might be possible under certain conditions. Our studies suggest that the sources of UK Ganoderma sp. spores are mainly to be found within the UK. Hence, research and mitigation strategies in the UK should focus mainly on national sources, without neglecting the contribution from long-distance transport.
Abstract:
This paper presents a biased random-key genetic algorithm for the resource constrained project scheduling problem. The chromosome representation of the problem is based on random keys. Active schedules are constructed using a priority-rule heuristic in which the priorities of the activities are defined by the genetic algorithm. A forward-backward improvement procedure is applied to all solutions. The chromosomes supplied by the genetic algorithm are adjusted to reflect the solutions obtained by the improvement procedure. The heuristic is tested on a set of standard problems taken from the literature and compared with other approaches. The computational results validate the effectiveness of the proposed algorithm.
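The idea of decoding a random-key chromosome into a schedule can be sketched in Python as follows. This is an illustration under stated assumptions, not the paper's method: it uses one random key per activity as its priority, a single renewable resource, and a serial schedule generation scheme in place of the paper's priority-rule heuristic. It omits both the genetic algorithm itself and the forward-backward improvement step. The function name and the four-activity instance are hypothetical.

```python
def decode(keys, durations, demands, preds, capacity):
    """Turn random keys into start times with a serial schedule generation scheme.

    Assumes every demand is at most `capacity`; otherwise the search below
    would never find a feasible start time.
    """
    n = len(keys)
    start, finish = {}, {}
    usage = {}  # time period -> resource units already in use
    while len(start) < n:
        # Eligible activities: unscheduled, with all predecessors scheduled.
        eligible = [j for j in range(n)
                    if j not in start and all(p in finish for p in preds[j])]
        j = max(eligible, key=lambda a: keys[a])  # highest key goes first
        t = max((finish[p] for p in preds[j]), default=0)
        # Delay the start until the resource limit holds for the whole duration.
        while any(usage.get(u, 0) + demands[j] > capacity
                  for u in range(t, t + durations[j])):
            t += 1
        for u in range(t, t + durations[j]):
            usage[u] = usage.get(u, 0) + demands[j]
        start[j], finish[j] = t, t + durations[j]
    return start, max(finish.values())

# Hypothetical instance: 4 activities, one resource with 3 units.
durations = [3, 2, 2, 1]
demands = [2, 2, 1, 1]
preds = [[], [], [0], [1]]  # activity 2 follows 0, activity 3 follows 1

start_a, makespan_a = decode([0.9, 0.2, 0.5, 0.7], durations, demands, preds, 3)
start_b, makespan_b = decode([0.2, 0.9, 0.5, 0.7], durations, demands, preds, 3)
print(start_a, makespan_a)
print(start_b, makespan_b)
```

Because every key vector decodes to a feasible schedule, a genetic algorithm can recombine chromosomes freely without repair steps. Different keys give different makespans, which is the quantity the search would try to minimise.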
Abstract:
This paper presents a genetic algorithm for the Resource Constrained Project Scheduling Problem (RCPSP). The chromosome representation of the problem is based on random keys. The schedule is constructed using a heuristic priority rule in which the priorities of the activities are defined by the genetic algorithm. The heuristic generates parameterized active schedules. The approach was tested on a set of standard problems taken from the literature and compared with other approaches. The computational results validate the effectiveness of the proposed algorithm.
Abstract:
In this paper we study the changes that occurred in some forest soil properties after a prescribed fire. The research focused on changes in soil pH, soil moisture and soil organic matter content over a two-year span, from 2008 to 2009. The study site is located in Anjos, Vieira do Minho municipality, a forest site that has suffered from recurrent wildfires for several decades. Furze (Ulex sp.), broom (Cytisus sp.), gorse (Chamaespartum tridentatum) and a few scattered adult pines (Pinus sylvestris) are the predominant vegetation in the study area. The average height of the shrub vegetation is around 1.5 m. The prescribed fire was conducted by the National Forestry Authority (AFN) in November 2008. Fuzzy Boolean Nets (FBN) were used to evaluate the changes in soil parameters in comparison with adjacent spots where: i) no fire had been recorded since 1998; ii) fire was recorded in 2008; and iii) vegetation had been pruned by mechanical cutting in Spring, six months before the prescribed fire. Results suggest that, in the particular case of the studied site, Anjos, the observed changes in soil properties cannot be attributed to the prescribed fire.
Abstract:
Magical ideation and belief in the paranormal are considered to represent a trait-like characteristic; people either believe or they do not. Yet anecdotes indicate that exposure to an anomalous event can turn skeptics into believers. This transformation is likely to be accompanied by altered cognitive functioning, such as impaired judgments of event likelihood. Here, we investigated whether exposure to an anomalous event changes individuals' explicit traditional (religious) and non-traditional (e.g., paranormal) beliefs, as well as cognitive biases that have previously been associated with non-traditional beliefs, e.g., repetition avoidance when producing random numbers in a mental dice task. In a classroom, 91 students saw a magic demonstration after their psychology lecture. Before the demonstration, half of the students were told that the performance would be given by a conjuror (magician group) and the other half that it would be given by a psychic (psychic group). The instruction influenced participants' explanations of the anomalous event. Participants in the magician group were more likely than those in the psychic group to explain the event through conjuring abilities, while the reverse was true for psychic abilities. Moreover, these explanations correlated positively with their prior traditional and non-traditional beliefs. Finally, we observed that the psychic group showed more repetition avoidance than the magician group, and this effect was the same whether assessed before or after the magic demonstration. We conclude that pre-existing beliefs and contextual suggestions both influence people's interpretations of anomalous events and the associated cognitive biases. Beliefs and associated cognitive biases are likely flexible well into adulthood and change with actual life events.
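To make "repetition avoidance" in a mental dice task concrete, the Python sketch below compares the share of immediate repeats in a produced sequence with the rate expected from a fair die. The abstract does not say how the study scored this bias, so the measure, the function name and the example sequence are assumptions.

```python
def repetition_rate(rolls):
    """Share of consecutive pairs in which the same number appears twice."""
    pairs = list(zip(rolls, rolls[1:]))
    return sum(a == b for a, b in pairs) / len(pairs)

# Expected rate of immediate repeats for a fair six-sided die.
CHANCE_RATE = 1 / 6

# Hypothetical sequence produced by one participant.
rolls = [3, 5, 1, 6, 2, 4, 4, 1, 3, 6, 2, 5, 1]

rate = repetition_rate(rolls)
print(rate, rate < CHANCE_RATE)
```

A rate clearly below one in six would indicate that the participant repeats numbers less often than a truly random process would.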
Abstract:
This paper presents a new theory of random consumer demand. The primitive is a collection of probability distributions, rather than a binary preference. Various assumptions constrain these distributions, including analogues of common assumptions about preferences such as transitivity, monotonicity and convexity. Two results establish a complete representation of theoretically consistent random demand. The purpose of this theory of random consumer demand is application to empirical consumer demand problems. To this end, the theory has several desirable properties. It is intrinsically stochastic, so the econometrician can apply it directly without adding extrinsic randomness in the form of residuals. Random demand is parsimoniously represented by a single function on the consumption set. Finally, we have a practical method for statistical inference based on the theory, described in McCausland (2004), a companion paper.