966 resultados para data Mining


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Trata da aplicação de ferramentas de Data Mining e do conceito de Data Warehouse à coleta e análise de dados obtidos a partir das ações da Secretaria de Estado da Educação de São Paulo. A variável dependente considerada na análise é o resultado do rendimento das escolas estaduais obtido através das notas de avaliação do SARESP (prova realizada no estado de São Paulo). O data warehouse possui ainda dados operacionais e de ações já realizadas, possibilitando análise de influência nos resultados

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Variations in the phenotypic expression of heterozygous beta thalassemia reflect the formation of different populations. To better understand the profile of heterozygous beta-thalassemia of the Brazilian population, we aimed at establishing parameters to direct the diagnosis of carriers and calculate the frequency from information stored in an electronic database. Using a Data Mining tool, we evaluated information on 10,960 blood samples deposited in a relational database. Over the years, improved diagnostic technology has facilitated the elucidation of suspected beta thalassemia heterozygote cases with an average frequency of 3.5% of referred cases. We also found that the Brazilian beta thalassemia trait has classic increases of Hb A2 and Hb F (60%), mainly caused by mutations in beta zero thalassemia, especially in the southeast of the country.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This article introduces the software program called EthoSeq, which is designed to extract probabilistic behavioral sequences (tree-generated sequences, or TGSs) from observational data and to prepare a TGS-species matrix for phylogenetic analysis. The program uses Graph Theory algorithms to automatically detect behavioral patterns within the observational sessions. It includes filtering tools to adjust the search procedure to user-specified statistical needs. Preliminary analyses of data sets, such as grooming sequences in birds and foraging tactics in spiders, uncover a large number of TGSs which together yield single phylogenetic trees. An example of the use of the program is our analysis of felid grooming sequences, in which we have obtained 1,386 felid grooming TGSs for seven species, resulting in a single phylogeny. These results show that behavior is definitely useful in phylogenetic analysis. EthoSeq simplifies and automates such analyses, uncovers much of the hidden patterns of long behavioral sequences, and prepares this data for further analysis with standard phylogenetic programs. We hope it will encourage many empirical studies on the evolution of behavior.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The increase in the number of spatial data collected has motivated the development of geovisualisation techniques, aiming to provide an important resource to support the extraction of knowledge and decision making. One of these techniques are 3D graphs, which provides a dynamic and flexible increase of the results analysis obtained by the spatial data mining algorithms, principally when there are incidences of georeferenced objects in a same local. This work presented as an original contribution the potentialisation of visual resources in a computational environment of spatial data mining and, afterwards, the efficiency of these techniques is demonstrated with the use of a real database. The application has shown to be very interesting in interpreting obtained results, such as patterns that occurred in a same locality and to provide support for activities which could be done as from the visualisation of results. © 2013 Springer-Verlag.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Background: Once multi-relational approach has emerged as an alternative for analyzing structured data such as relational databases, since they allow applying data mining in multiple tables directly, thus avoiding expensive joining operations and semantic losses, this work proposes an algorithm with multi-relational approach. Methods: Aiming to compare traditional approach performance and multi-relational for mining association rules, this paper discusses an empirical study between PatriciaMine - an traditional algorithm - and its corresponding multi-relational proposed, MR-Radix. Results: This work showed advantages of the multi-relational approach in performance over several tables, which avoids the high cost for joining operations from multiple tables and semantic losses. The performance provided by the algorithm MR-Radix shows faster than PatriciaMine, despite handling complex multi-relational patterns. The utilized memory indicates a more conservative growth curve for MR-Radix than PatriciaMine, which shows the increase in demand of frequent items in MR-Radix does not result in a significant growth of utilized memory like in PatriciaMine. Conclusion: The comparative study between PatriciaMine and MR-Radix confirmed efficacy of the multi-relational approach in data mining process both in terms of execution time and in relation to memory usage. Besides that, the multi-relational proposed algorithm, unlike other algorithms of this approach, is efficient for use in large relational databases.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The increase in new electronic devices had generated a considerable increase in obtaining spatial data information; hence these data are becoming more and more widely used. As well as for conventional data, spatial data need to be analyzed so interesting information can be retrieved from them. Therefore, data clustering techniques can be used to extract clusters of a set of spatial data. However, current approaches do not consider the implicit semantics that exist between a region and an object’s attributes. This paper presents an approach that enhances spatial data mining process, so they can use the semantic that exists within a region. A framework was developed, OntoSDM, which enables spatial data mining algorithms to communicate with ontologies in order to enhance the algorithm’s result. The experiments demonstrated a semantically improved result, generating more interesting clusters, therefore reducing manual analysis work of an expert.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The reproductive performance of cattle may be influenced by several factors, but mineral imbalances are crucial in terms of direct effects on reproduction. Several studies have shown that elements such as calcium, copper, iron, magnesium, selenium, and zinc are essential for reproduction and can prevent oxidative stress. However, toxic elements such as lead, nickel, and arsenic can have adverse effects on reproduction. In this paper, we applied a simple and fast method of multi-element analysis to bovine semen samples from Zebu and European classes used in reproduction programs and artificial insemination. Samples were analyzed by inductively coupled plasma spectrometry (ICP-MS) using aqueous medium calibration and the samples were diluted in a proportion of 1:50 in a solution containing 0.01% (vol/vol) Triton X-100 and 0.5% (vol/vol) nitric acid. Rhodium, iridium, and yttrium were used as the internal standards for ICP-MS analysis. To develop a reliable method of tracing the class of bovine semen, we used data mining techniques that make it possible to classify unknown samples after checking the differentiation of known-class samples. Based on the determination of 15 elements in 41 samples of bovine semen, 3 machine-learning tools for classification were applied to determine cattle class. Our results demonstrate the potential of support vector machine (SVM), multilayer perceptron (MLP), and random forest (RF) chemometric tools to identify cattle class. Moreover, the selection tools made it possible to reduce the number of chemical elements needed from 15 to just 8.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Multi-element analysis of honey samples was carried out with the aim of developing a reliable method of tracing the origin of honey. Forty-two chemical elements were determined (Al, Cu, Pb, Zn, Mn, Cd, Tl, Co, Ni, Rb, Ba, Be, Bi, U, V, Fe, Pt, Pd, Te, Hf, Mo, Sn, Sb, P, La, Mg, I, Sm, Tb, Dy, Sd, Th, Pr, Nd, Tm, Yb, Lu, Gd, Ho, Er, Ce, Cr) by inductively coupled plasma mass spectrometry (ICP-MS). Then, three machine learning tools for classification and two for attribute selection were applied in order to prove that it is possible to use data mining tools to find the region where honey originated. Our results clearly demonstrate the potential of Support Vector Machine (SVM), Multilayer Perceptron (MLP) and Random Forest (RF) chemometric tools for honey origin identification. Moreover, the selection tools allowed a reduction from 42 trace element concentrations to only 5. (C) 2012 Elsevier Ltd. All rights reserved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Abstract Background Once multi-relational approach has emerged as an alternative for analyzing structured data such as relational databases, since they allow applying data mining in multiple tables directly, thus avoiding expensive joining operations and semantic losses, this work proposes an algorithm with multi-relational approach. Methods Aiming to compare traditional approach performance and multi-relational for mining association rules, this paper discusses an empirical study between PatriciaMine - an traditional algorithm - and its corresponding multi-relational proposed, MR-Radix. Results This work showed advantages of the multi-relational approach in performance over several tables, which avoids the high cost for joining operations from multiple tables and semantic losses. The performance provided by the algorithm MR-Radix shows faster than PatriciaMine, despite handling complex multi-relational patterns. The utilized memory indicates a more conservative growth curve for MR-Radix than PatriciaMine, which shows the increase in demand of frequent items in MR-Radix does not result in a significant growth of utilized memory like in PatriciaMine. Conclusion The comparative study between PatriciaMine and MR-Radix confirmed efficacy of the multi-relational approach in data mining process both in terms of execution time and in relation to memory usage. Besides that, the multi-relational proposed algorithm, unlike other algorithms of this approach, is efficient for use in large relational databases.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

[ES]En este artículo se describe la experiencia de la aplicación de técnicas de EDM (clustering) a un curso disponible en la plataforma Ude@ de la Universidad de Antioquia. El objetivo es clasificar los patrones de interacción de los estudiantes a partir de la información almacenada en la base de datos de la plataforma Moodle. Para ello, se generan informes sobre el uso de los recursos y la autoevaluación que permiten analizar el comportamiento y los patrones de navegación de los estudiantes durante el uso del LMS (Learning Management System).

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Il trauma cranico é tra le piú importanti patologie traumatiche. Ogni anno 250 pazienti ogni 100.000 abitanti vengono ricoverati in Italia per un trauma cranico. La mortalitá é di circa 17 casi per 100.000 abitanti per anno. L’Italia si trova in piena “media” Europea considerando l’incidenza media in Europa di 232 casi per 100.000 abitanti ed una mortalitá di 15 casi per 100.000 abitanti. Degli studi hanno indicato come una terapia anticoagulante é uno dei principali fattori di rischio di evolutiviá di una lesione emorragica. Al contrario della terapia anticoagulante, il rischio emorragico correlato ad una terapia antiaggregante é a tutt’oggi ancora in fase di verifica. Il problema risulta rilevante in particolare nella popolazione occidentale in quanto l’impiego degli antiaggreganti é progressivamente sempre piú diffuso. Questo per la politica di prevenzione sostenuta dalle linee guida nazionali e internazionali in termini di prevenzione del rischio cardiovascolare, in particolare nelle fasce di popolazione di etá piú avanzata. Per la prima volta, é stato dimostrato all’ospedale di Forlí[1], su una casistica sufficientemente ampia, che la terapia cronica con antiaggreganti, per la preven- zione del rischio cardiovascolare, puó rivelarsi un significativo fattore di rischio di complicanze emorragiche in un soggetto con trauma cranico, anche di grado lieve. L’ospedale per approfondire e convalidare i risultati della ricerca ha condotto, nell’anno 2009, una nuova indagine. La nuova indagine ha coinvolto oltre l’ospedale di Forlí altri trentuno centri ospedalieri italiani. Questo lavoro di ricerca vuole, insieme ai ricercatori dell’ospedale di Forlí, verificare: “se una terapia con antiaggreganti influenzi l’evolutivitá, in senso peggiorativo, di una lesione emorragica conseguente a trauma cranico lieve - moderato - severo in un soggetto adulto”, grazie ai dati raccolti dai centri ospedalieri nel 2009. Il documento é strutturato in due parti. La prima parte piú teorica, vuole fissare i concetti chiave riguardanti il contesto della ricerca e la metodologia usata per analizzare i dati. Mentre, la seconda parte piú pratica, vuole illustrare il lavoro fatto per rispondere al quesito della ricerca. La prima parte é composta da due capitoli, che sono: • Il capitolo 1: dove sono descritti i seguenti concetti: cos’é un trauma cra- nico, cos’é un farmaco di tipo anticoagulante e cos’é un farmaco di tipo antiaggregante; • Il capitolo 2: dove é descritto cos’é il Data Mining e quali tecniche sono state usate per analizzare i dati. La seconda parte é composta da quattro capitoli, che sono: • Il capitolo 3: dove sono state descritte: la struttura dei dati raccolti dai trentadue centri ospedalieri, la fase di pre-processing e trasformazione dei dati. Inoltre in questo capitolo sono descritti anche gli strumenti utilizzati per analizzare i dati; • Il capitolo 4: dove é stato descritto come é stata eseguita l’analisi esplorativa dei dati. • Il capitolo 5: dove sono descritte le analisi svolte sui dati e soprattutto i risultati che le analisi, grazie alle tecniche di Data Mining, hanno prodotto per rispondere al quesito della ricerca; • Il capitolo 6: dove sono descritte le conclusioni della ricerca. Per una maggiore comprensione del lavoro sono state aggiunte due appendici. La prima tratta del software per data mining Weka, utilizzato per effettuare le analisi. Mentre, la seconda tratta dell’implementazione dei metodi per la creazione degli alberi decisionali.