971 results for Decision trees


Relevance:

60.00%

Publisher:

Abstract:

This paper discusses issues in Ukraine's new three-level pension system. First, it presents a mathematical model for calculating the optimal size of contributions to a non-state pension fund. Next, the non-state pension fund chooses an asset management company; for this step the paper proposes an approach based on Kohonen networks to classify the asset management companies operating in the Ukrainian market. Once chosen, the asset management company receives the pension contributions of the fund's participants and has to invest them profitably. The paper proposes an approach for choosing the most profitable investment project using decision trees. The new pension system was ratified into law only four years ago and is still developing, which is why the authors consider the topic timely.
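The abstract does not say how the investment decision tree is built. As a rough illustration, assuming a classic decision-analysis tree (one decision node whose branches are chance nodes), the most profitable project is the one with the highest expected payoff. The project names, probabilities and payoffs below are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    probability: float  # chance that this scenario occurs
    payoff: float       # net return in this scenario (arbitrary currency units)


def expected_value(outcomes):
    """Expected payoff of one chance node; probabilities must sum to 1."""
    total_probability = sum(o.probability for o in outcomes)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(o.probability * o.payoff for o in outcomes)


def best_project(projects):
    """Decision node: return the project name with the highest expected payoff."""
    return max(projects, key=lambda name: expected_value(projects[name]))


# Hypothetical projects, invented purely for illustration.
projects = {
    "bonds": [Outcome(1.0, 5.0)],
    "real_estate": [Outcome(0.6, 12.0), Outcome(0.4, -2.0)],
    "equities": [Outcome(0.5, 20.0), Outcome(0.5, -12.0)],
}
choice = best_project(projects)
```

With these made-up numbers the expected payoffs are 5.0, 6.4 and 4.0, so `choice` is `"real_estate"`. A real analysis would also have to account for risk tolerance, which a pure expected-value rule ignores.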

Abstract:

ACM Computing Classification System (1998): I.4.9, I.4.10.

Abstract:

Empirical studies of education programs and systems, by nature, rely upon use of student outcomes that are measurable. Often, these come in the form of test scores. However, in light of growing evidence about the long-run importance of other student skills and behaviors, the time has come for a broader approach to evaluating education. This dissertation undertakes experimental, quasi-experimental, and descriptive analyses to examine social, behavioral, and health-related mechanisms of the educational process. My overarching research question is simply, which inside- and outside-the-classroom features of schools and educational interventions are most beneficial to students in the long term? Furthermore, how can we apply this evidence toward informing policy that could effectively reduce stark social, educational, and economic inequalities?

The first study of three assesses mechanisms by which the Fast Track project, a randomized intervention in the early 1990s for high-risk children in four communities (Durham, NC; Nashville, TN; rural PA; and Seattle, WA), reduced delinquency, arrests, and health and mental health service utilization in adolescence through young adulthood (ages 12-20). A decomposition of treatment effects indicates that about a third of Fast Track’s impact on later crime outcomes can be accounted for by improvements in social and self-regulation skills during childhood (ages 6-11), such as prosocial behavior, emotion regulation and problem solving. These skills proved less valuable for the prevention of mental and physical health problems.

The second study contributes new evidence on how non-instructional investments – such as increased spending on school social workers, guidance counselors, and health services – affect multiple aspects of student performance and well-being. Merging several administrative data sources spanning the 1996-2013 school years in North Carolina, I use an instrumental variables approach to estimate the extent to which local expenditure shifts affect students’ academic and behavioral outcomes. My findings indicate that exogenous increases in spending on non-instructional services not only reduce student absenteeism and disciplinary problems (important predictors of long-term outcomes) but also significantly raise student achievement, in similar magnitude to corresponding increases in instructional spending. Furthermore, subgroup analyses suggest that investments in student support personnel such as social workers, health services, and guidance counselors, in schools with concentrated low-income student populations could go a long way toward closing socioeconomic achievement gaps.

The third study examines individual pathways that lead to high school graduation or dropout. It employs a variety of machine learning techniques, including decision trees, random forests with bagging and boosting, and support vector machines, to predict student dropout using longitudinal administrative data from North Carolina. I consider a large set of predictor measures from grades three through eight including academic achievement, behavioral indicators, and background characteristics. My findings indicate that the most important predictors include eighth grade absences, math scores, and age-for-grade as well as early reading scores. Support vector classification (with a high cost parameter and low gamma parameter) predicts high school dropout with the highest overall validity in the testing dataset at 90.1 percent followed by decision trees with boosting and interaction terms at 89.5 percent.

Abstract:

Social structure is a key determinant of population biology and is central to the way animals exploit their environment. The risk of predation is often invoked as an important factor influencing the evolution of social structure in cetaceans and other mammals, but little direct information is available about how cetaceans actually respond to predators or other perceived threats. The playback of sounds to an animal is a powerful tool for assessing behavioral responses to predators, but quantifying behavioral responses to playback experiments requires baseline knowledge of normal behavioral patterns and variation. The central goal of my dissertation is to describe baseline foraging behavior for the western Atlantic short-finned pilot whales (Globicephala macrorhynchus) and examine the role of social organization in their response to predators. To accomplish this, I used multi-sensor digital acoustic tags (DTAGs), satellite-linked time-depth recorders (SLTDR), and playback experiments to study foraging behavior and behavioral response to predators in pilot whales. Fine scale foraging strategies and population level patterns were identified by estimating the body size and examining the location and movement around feeding events using data collected with DTAGs deployed on 40 pilot whales in summers of 2008-2014 off the coast of Cape Hatteras, North Carolina. Pilot whales were found to forage throughout the water column and performed feeding buzzes at depths ranging from 29 to 1176 meters. The results indicated potential habitat segregation in foraging depth in short-finned pilot whales with larger individuals foraging on average at deeper depths. Calculated aerobic dive limit for large adult males was approximately 6 minutes longer than that of females and likely facilitated the difference in foraging depth. Furthermore, the buzz frequency and speed around feeding attempts indicate that this population of pilot whales is likely targeting multiple small prey items.
Using these results, I built decision trees to inform foraging dive classification in coarse, long-term dive data collected with SLTDRs deployed on 6 pilot whales in the summers of 2014 and 2015 in the same area off the coast of North Carolina. I used these long-term foraging records to compare diurnal foraging rates and depths, as well as classify bouts with a maximum likelihood method, and evaluate behavioral aerobic dive limits (ADLB) through examination of dive durations and inter-dive intervals. Dive duration was the best predictor of foraging, with dives >400.6 seconds classified as foraging, and a 96% classification accuracy. There were no diurnal patterns in foraging depth or rates and average duration of bouts was 2.94 hours with maximum bout durations lasting up to 14 hours. The results indicated that pilot whales forage in relatively long bouts, and the ADLB values indicate that pilot whales rarely, if ever, exceed their aerobic limits. To evaluate the response to predators, I used controlled playback experiments to examine the behavioral responses of 10 of the tagged short-finned pilot whales off Cape Hatteras, North Carolina and 4 Risso’s dolphins (Grampus griseus) off Southern California to the calls of mammal-eating killer whales (MEK). Both species responded to a subset of MEK calls with increased movement, swim speed and increased cohesion of the focal groups, but the two species exhibited different directional movement and vocal responses. Pilot whales increased their call rate and approached the sound source, but Risso’s dolphins exhibited no change in their vocal behavior and moved in a rapid, directed manner away from the source. Thus, at least to a subset of mammal-eating killer whale calls, these two study species reacted in a manner that is consistent with their patterns of social organization.
Pilot whales, which live in relatively permanent groups bound by strong social bonds, responded in a manner that built on their high levels of social cohesion. In contrast, Risso’s dolphins exhibited an exaggerated flight response and moved rapidly away from the sound source. The fact that both species responded strongly to a select number of MEK calls, suggests that structural features of signals play critical contextual roles in the probability of response to potential threats in odontocete cetaceans.
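The dive-duration rule reported above (dives longer than 400.6 seconds classified as foraging) amounts to a decision tree with a single split. The sketch below applies that rule and shows one simple way such a threshold could be found from labeled dives. It is an illustration only: the fitting procedure is an assumption rather than the author's method, and the sample dives are invented.

```python
FORAGING_THRESHOLD_S = 400.6  # split value reported in the abstract


def classify_dive(duration_s, threshold_s=FORAGING_THRESHOLD_S):
    """Apply the one-split rule: longer dives are labeled as foraging."""
    return "foraging" if duration_s > threshold_s else "non-foraging"


def best_threshold(durations, labels):
    """Fit a one-split tree by trying midpoints between distinct sorted durations.

    labels are booleans (True = foraging). Returns (threshold, accuracy).
    """
    values = sorted(set(durations))
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best = (None, -1.0)
    for t in candidates:
        correct = sum((d > t) == is_foraging for d, is_foraging in zip(durations, labels))
        accuracy = correct / len(durations)
        if accuracy > best[1]:
            best = (t, accuracy)
    return best


# Invented example dives (seconds) with hand-assigned labels.
durations = [120.0, 250.0, 310.0, 390.0, 420.0, 600.0, 800.0, 950.0]
labels = [False, False, False, False, True, True, True, True]
threshold, accuracy = best_threshold(durations, labels)
```

On this toy sample the search returns a threshold of 405.0 with accuracy 1.0. Real tag data would be noisier, which is consistent with the 96% accuracy reported above.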

Abstract:

This work proposes a study of brain signals applied to BCI (Brain-Computer Interface) systems, using Decision Trees and analyzing those trees in light of the neurosciences. Processing the data requires five phases: data acquisition, preprocessing, feature extraction, classification, and validation. All five phases are covered in this work, with emphasis on classification and validation. Classification uses the Artificial Intelligence technique known as Decision Trees, which is recognized in the literature as one of the simplest and most successful forms of learning algorithm. Validation, in turn, is grounded in studies from neuroscience, the set of disciplines that study the nervous system: its structure, development, functioning, evolution, relationship to behavior and the mind, and its disorders. The results obtained are promising, although preliminary, since they may help explain some brain processes better through an automated approach.

Abstract:

Objectives: Dietary fibre (DF) is one of the dietary components that contributes strongly to health, particularly that of the gastrointestinal system. This work therefore evaluated how sociodemographic variables such as age, gender, level of education, living environment and country relate to levels of knowledge about dietary fibre (KADF), its sources and its effects on human health, using a validated scale.

Study design: Cross-sectional study.

Methods: A methodological study was conducted with 6010 participants residing in 10 countries on different continents (Europe, America, Africa). The instrument was a self-administered questionnaire designed to collect information on knowledge about dietary fibre. It was used to validate a scale (KADF), whose model was used in the present work to identify the best predictors of knowledge. The statistical tools used were basic descriptive statistics, decision trees, and inferential analysis (t-test for independent samples with Levene's test, and one-way ANOVA with post hoc multiple comparisons).

Results: The best predictor for all three types of knowledge evaluated (about DF, about its sources and about its effects on human health) was always the country, meaning that social, cultural and/or political conditions greatly determine the level of knowledge. The tests also showed statistically significant differences in the three types of knowledge for all sociodemographic variables evaluated: age, gender, level of education, living environment and country.

Conclusions: To improve the level of knowledge, planned actions should not be designed generically to reach all sectors of the population; different methodologies must be designed for different audiences so as to provide effective health education.

Abstract:

The current Amazon landscape consists of heterogeneous mosaics formed by interactions between the original forest and productive activities. Recognizing and quantifying the characteristics of these landscapes is essential for understanding agricultural production chains, assessing the impact of policies, and planning future actions. Our main objective was to construct the regionalization of agricultural production for Rondônia State (Brazilian Amazon) at the municipal level. We adopted a decision tree approach, using land use maps derived from remote sensing data (PRODES and TerraClass) combined with socioeconomic data. The decision trees allowed us to allocate municipalities to one of five agricultural production systems: (i) coexistence of livestock production and intensive agriculture; (ii) semi-intensive beef and milk production; (iii) semi-intensive beef production; (iv) intensive beef and milk production; and (v) intensive beef production. These production systems are, respectively, linked to mechanized agriculture (i), traditional cattle farming with low management, with (ii) or without (iii) a significant presence of dairy farming, and to more intensive livestock farming with (iv) or without (v) a significant presence of dairy farming. The municipalities and associated production systems were then characterized using a wide variety of quantitative metrics grouped into four dimensions: (i) agricultural production; (ii) economics; (iii) territorial configuration; and (iv) social characteristics. We found that production systems linked to mechanized agriculture predominate in the south of the state, while intensive farming is mainly found in the center of the state. Semi-intensive livestock farming is mainly located close to the southwest frontier and in the north of the state, where human occupation of the territory is not fully consolidated. This distributional pattern reflects the origins of the agricultural production system of Rondônia.
Moreover, the characterization of the production systems provides insights into the pattern of occupation of the Amazon and the socioeconomic consequences of continuing agricultural expansion.

Abstract:

Methodology: Simulation; linear and logistic discriminant analysis; classification trees; radial basis function neural networks

Abstract:

The problem of modelling water quality in reservoirs can be approached from several viewpoints. This work draws on problem-solving methodologies from the field of Artificial Intelligence, using tools such as Decision Trees, Artificial Neural Networks and the Nearest-Neighbour method. Current methods for assessing water quality are very restrictive because they cannot gauge water quality in real time. Forecasting models based on Knowledge Discovery in Databases techniques proved to be an alternative, supporting a proactive stance that may contribute decisively to diagnosing, preserving and rehabilitating reservoirs. In the course of the work, unsupervised learning was used to study the dynamics of the reservoirs, and two distinct behaviours, related to the time of year, are described.

Abstract:

The present Dissertation shows how recent statistical analysis tools and open datasets can be exploited to improve modelling accuracy in two distinct yet interconnected domains of flood hazard (FH) assessment. In the first Part, unsupervised artificial neural networks are employed as regional models for sub-daily rainfall extremes. The models aim to learn a robust relation to estimate locally the parameters of Gumbel distributions of extreme rainfall depths for any sub-daily duration (1-24h). The predictions depend on twenty morphoclimatic descriptors. A large study area in north-central Italy is adopted, where 2238 annual maximum series are available. Validation is performed over an independent set of 100 gauges. Our results show that multivariate ANNs may remarkably improve the estimation of percentiles relative to the benchmark approach from the literature, where Gumbel parameters depend on mean annual precipitation. Finally, we show that the very nature of the proposed ANN models makes them suitable for interpolating predicted sub-daily rainfall quantiles across space and time-aggregation intervals. In the second Part, decision trees are used to combine a selected blend of input geomorphic descriptors for predicting FH. Relative to existing DEM-based approaches, this method is innovative, as it relies on the combination of three characteristics: (1) simple multivariate models, (2) a set of exclusively DEM-based descriptors as input, and (3) an existing FH map as reference information. First, the methods are applied to northern Italy, represented with the MERIT DEM (∼90m resolution), and second, to the whole of Italy, represented with the EU-DEM (25m resolution). 
The results show that multivariate approaches may (a) significantly improve the delineation of flood-prone areas relative to a selected univariate approach, (b) provide accurate predictions of expected inundation depths, (c) produce encouraging results in extrapolation, (d) complete the information in imperfect reference maps, and (e) conveniently convert binary maps into a continuous representation of FH.
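The first part of this abstract estimates Gumbel parameters locally. Once those are known, the rainfall depth for a chosen return period follows from the inverse of the Gumbel cumulative distribution function. The sketch below assumes the usual location/scale parameterisation, F(x) = exp(-exp(-(x - location) / scale)); the parameter values are hypothetical and are not taken from the dissertation.

```python
import math


def gumbel_cdf(x, location, scale):
    """Probability that the annual maximum does not exceed x."""
    return math.exp(-math.exp(-(x - location) / scale))


def gumbel_quantile(location, scale, return_period_years):
    """Depth exceeded on average once every `return_period_years` years."""
    if return_period_years <= 1:
        raise ValueError("return period must be greater than 1 year")
    non_exceedance = 1.0 - 1.0 / return_period_years
    return location - scale * math.log(-math.log(non_exceedance))


# Hypothetical parameters for one duration at one site (mm).
depth_100yr = gumbel_quantile(location=30.0, scale=8.0, return_period_years=100)
```

With these illustrative parameters the 100-year depth is about 66.8 mm, and feeding it back into `gumbel_cdf` returns 0.99, as expected.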

Abstract:

Quark-gluon plasma (QGP) is a state of matter predicted by quantum chromodynamics. One of the main goals of the ALICE experiment at the LHC is to study strongly interacting matter and the properties of the QGP through collisions of ultra-relativistic heavy ions. A thorough understanding of these properties requires the same measurements on smaller colliding systems (proton-proton and proton-ion collisions) as a reference. Recent analyses of data collected by ALICE have shown that our understanding of heavy-quark hadronization mechanisms is incomplete, because the data obtained in pp and p-Pb collisions cannot be reproduced by models based on results from e+e− and ep collisions. For this reason, new theoretical and phenomenological models able to reproduce the experimental measurements have been proposed. The uncertainties associated with these new experimental measurements do not currently allow a clear test of the validity of the various proposed models. In the coming years it will therefore be essential to improve the precision of these measurements; on the other hand, estimating the number of particles of each species produced in a collision can be extremely complicated. In this thesis, the number of Λc baryons produced in a data sample was obtained using machine learning techniques capable of learning patterns and of distinguishing signal candidates from background candidates. Three different implementations of a Boosted Decision Trees (BDT) algorithm were also compared, and the best-performing one was used to reconstruct the Λc baryon in pp collisions collected by the ALICE experiment.
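The abstract does not describe the BDT implementations that were compared. To illustrate the boosting idea itself, the sketch below implements plain AdaBoost over one-split trees (stumps) on a single variable. This is an assumption chosen for brevity, not the algorithm or software used in the thesis, and the toy sample is invented. Labels are +1 for signal and -1 for background.

```python
import math


def stump_predict(x, threshold, sign):
    """One-split tree: returns `sign` above the threshold, `-sign` otherwise."""
    return sign if x > threshold else -sign


def fit_stump(xs, ys, weights):
    """Find the threshold and orientation with the lowest weighted error."""
    values = sorted(set(xs))
    thresholds = [values[0] - 1.0] + [(a + b) / 2 for a, b in zip(values, values[1:])]
    best = None
    for t in thresholds:
        for sign in (1, -1):
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if stump_predict(x, t, sign) != y)
            if best is None or err < best[0]:
                best = (err, t, sign)
    return best


def fit_adaboost(xs, ys, rounds):
    """Train `rounds` stumps, reweighting misclassified events after each one."""
    weights = [1.0 / len(xs)] * len(xs)
    ensemble = []
    for _ in range(rounds):
        err, t, sign = fit_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1.0 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, t, sign))
        weights = [w * math.exp(-alpha * y * stump_predict(x, t, sign))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def bdt_score(ensemble, x):
    """Weighted vote of all stumps; positive means signal-like."""
    return sum(alpha * stump_predict(x, t, sign) for alpha, t, sign in ensemble)


# Invented toy sample: signal sits in a window, so no single cut separates it.
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ys = [-1, -1, 1, 1, 1, 1, -1, -1]
model = fit_adaboost(xs, ys, rounds=3)
predictions = [1 if bdt_score(model, x) > 0 else -1 for x in xs]
```

On this toy sample a single stump misclassifies two events, while three boosted stumps classify all eight correctly. Physics analyses typically use deeper trees, many input variables and an independent test sample.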

Abstract:

Recent analyses of data collected by ALICE show that our understanding of heavy-flavour hadronization is still incomplete, because measurements made in pp, p-Pb and Pb-Pb collisions cannot be reproduced by theoretical models based on other collision types such as e+e−. In particular, the results seem to indicate that the universality principle, which assumes that the fragmentation functions of quarks and gluons are independent of the type of interacting system, does not hold. For this reason, new theoretical and phenomenological models have been developed that reproduce the experimental data more or less accurately. These models differ from one another mainly at low values of transverse momentum pT. Data analysis at low pT is therefore of fundamental importance, since it makes it possible to discriminate between models that can actually reproduce the experimental data and those that cannot. It can also provide experimental confirmation of the physical phenomena on which a given model is based. In this thesis, the number of Λc+ baryons (yield) produced in pp collisions at √s = 13 TeV was extracted in the transverse momentum range 0 < pT(Λc+) < 1 GeV/c. A machine learning technique was used, based on a Boosted Decision Trees (BDT) algorithm implemented in the TMVA package, to identify and remove a large part of the statistical background and considerably simplify the analysis proper. The reliability of the measurement was checked by extracting the yield with two different approaches: first by modelling the combinatorial background with an analytic function, and then with a statistical template built ad hoc using the track-rotation technique.

Abstract:

One of the most ambitious and interesting goals of computer science, especially in the field of artificial intelligence, is to enable a computer to reason in a way similar to a human being. Recent successes with deep neural networks, especially in natural language text processing, have encouraged the study of new techniques for tackling this problem, starting with deductive reasoning, the simplest and most linear form of logical reasoning. The fundamental question behind this thesis is: how can a neural network based on the Transformer architecture be used to advance the state of the art in deductive reasoning over natural language? In the first part of this work I present an in-depth study of some recent technologies that have tackled this problem with successful insights. This analysis shows that integrating neural networks with more traditional symbolic techniques is particularly effective. In the second part I focus on the ProofWriter architecture, which has the merit of being relatively simple and intuitive while performing on a par with its competitors. This closer look highlights the ability of T5 models, with the support of the HuggingFace framework, to produce several alternative answers, among which the correct one can then be searched for externally. In the third and final part I provide a prototype showing how this technique can be used to enrich ProofWriter-like systems with symbolic approaches based on linguistic notions, knowledge specific to the application domain, or plain common sense. The result is a significant improvement in accuracy over the original ProofWriter and, above all, a demonstration that this capability of T5 models can be exploited to improve their performance.

Abstract:

The top quark is one of the fundamental particles of the Standard Model and is observed at the LHC in the highest-energy collisions. In particular, the top-antitop pair (tt̄) is produced via the strong interaction from gluon-gluon (gg) events or from quark-antiquark (qq̄) collisions. The different production mechanisms lead to pairs with different properties: one example is the spin state of the tt̄ pair, which near the production threshold is more strongly correlated in gg events. A study aiming to measure the size of these correlations is therefore significantly helped by a method for discriminating the resulting pairs according to their production channel. The work presented here aims to obtain a tool for making this distinction using multivariate analysis techniques. Such methods are often applied to separate a signal from a background that hampers the analysis, in this case gg and qq̄ events respectively; this is known as a classification problem. The performance of several analysis algorithms was studied, examining the distributions of numerous variables associated with the tt̄ pair production process. The best one was then selected on the basis of its efficiency in recognizing signal events and its rejection of background events. For this work the best-performing algorithm is Boosted Decision Trees, which takes a sample with an initial purity of 0.81 to a final purity of 0.92, at the cost of an efficiency reduced to 0.74.
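The three figures quoted at the end of this abstract pin down the background pass rate, under two assumptions that the abstract does not state explicitly: purity means S / (S + B), and the 0.74 efficiency is the fraction of signal (gg) events kept. A short sketch of that arithmetic follows; the 1000-event sample is hypothetical and serves only as a cross-check.

```python
def purity(signal, background):
    """Fraction of selected events that are signal."""
    return signal / (signal + background)


def implied_background_efficiency(purity_before, purity_after, signal_efficiency):
    """Background pass rate implied by the purity before and after a selection.

    Solves purity_after = eS / (eS + bB) for b, with S/B fixed by purity_before
    and e the signal efficiency.
    """
    signal_to_background = purity_before / (1.0 - purity_before)
    return signal_efficiency * signal_to_background * (1.0 - purity_after) / purity_after


eps_bkg = implied_background_efficiency(0.81, 0.92, 0.74)

# Cross-check on a hypothetical sample of 1000 events at purity 0.81.
signal_in, background_in = 810.0, 190.0
purity_out = purity(0.74 * signal_in, eps_bkg * background_in)
```

Under these assumptions roughly 27% of background (qq̄) events survive the selection, which corresponds to a background rejection of about 73%.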