6 resultados para educational data mining
em AMS Tesi di Dottorato - Alm@DL - Università di Bologna
Resumo:
Advances in biomedical signal acquisition systems for motion analysis have led to lowcost and ubiquitous wearable sensors which can be used to record movement data in different settings. This implies the potential availability of large amounts of quantitative data. It is then crucial to identify and to extract the information of clinical relevance from the large amount of available data. This quantitative and objective information can be an important aid for clinical decision making. Data mining is the process of discovering such information in databases through data processing, selection of informative data, and identification of relevant patterns. The databases considered in this thesis store motion data from wearable sensors (specifically accelerometers) and clinical information (clinical data, scores, tests). The main goal of this thesis is to develop data mining tools which can provide quantitative information to the clinician in the field of movement disorders. This thesis will focus on motor impairment in Parkinson's disease (PD). Different databases related to Parkinson subjects in different stages of the disease were considered for this thesis. Each database is characterized by the data recorded during a specific motor task performed by different groups of subjects. The data mining techniques that were used in this thesis are feature selection (a technique which was used to find relevant information and to discard useless or redundant data), classification, clustering, and regression. The aims were to identify high risk subjects for PD, characterize the differences between early PD subjects and healthy ones, characterize PD subtypes and automatically assess the severity of symptoms in the home setting.
Resumo:
In the last years, Intelligent Tutoring Systems have been a very successful way for improving learning experience. Many issues must be addressed until this technology can be defined mature. One of the main problems within the Intelligent Tutoring Systems is the process of contents authoring: knowledge acquisition and manipulation processes are difficult tasks because they require a specialised skills on computer programming and knowledge engineering. In this thesis we discuss a general framework for knowledge management in an Intelligent Tutoring System and propose a mechanism based on first order data mining to partially automate the process of knowledge acquisition that have to be used in the ITS during the tutoring process. Such a mechanism can be applied in Constraint Based Tutor and in the Pseudo-Cognitive Tutor. We design and implement a part of the proposed architecture, mainly the module of knowledge acquisition from examples based on first order data mining. We then show that the algorithm can be applied at least two different domains: first order algebra equation and some topics of C programming language. Finally we discuss the limitation of current approach and the possible improvements of the whole framework.
Resumo:
Precision horticulture and spatial analysis applied to orchards are a growing and evolving part of precision agriculture technology. The aim of this discipline is to reduce production costs by monitoring and analysing orchard-derived information to improve crop performance in an environmentally sound manner. Georeferencing and geostatistical analysis coupled to point-specific data mining allow to devise and implement management decisions tailored within the single orchard. Potential applications range from the opportunity to verify in real time along the season the effectiveness of cultural practices to achieve the production targets in terms of fruit size, number, yield and, in a near future, fruit quality traits. These data will impact not only the pre-harvest but their effect will extend to the post-harvest sector of the fruit chain. Chapter 1 provides an updated overview on precision horticulture , while in Chapter 2 a preliminary spatial statistic analysis of the variability in apple orchards is provided before and after manual thinning; an interpretation of this variability and how it can be managed to maximize orchard performance is offered. Then in Chapter 3 a stratification of spatial data into management classes to interpret and manage spatial variation on the orchard is undertaken. An inverse model approach is also applied to verify whether the crop production explains environmental variation. In Chapter 4 an integration of the techniques adopted before is presented. A new key for reading the information gathered within the field is offered. The overall goal of this Dissertation was to probe into the feasibility, the desirability and the effectiveness of a precision approach to fruit growing, following the lines of other areas of agriculture that already adopt this management tool. As existing applications of precision horticulture already had shown, crop specificity is an important factor to be accounted for. This work focused on apple because of its importance in the area where the work was carried out, and worldwide.
Resumo:
In this thesis the evolution of the techno-social systems analysis methods will be reported, through the explanation of the various research experience directly faced. The first case presented is a research based on data mining of a dataset of words association named Human Brain Cloud: validation will be faced and, also through a non-trivial modeling, a better understanding of language properties will be presented. Then, a real complex system experiment will be introduced: the WideNoise experiment in the context of the EveryAware european project. The project and the experiment course will be illustrated and data analysis will be displayed. Then the Experimental Tribe platform for social computation will be introduced . It has been conceived to help researchers in the implementation of web experiments, and aims also to catalyze the cumulative growth of experimental methodologies and the standardization of tools cited above. In the last part, three other research experience which already took place on the Experimental Tribe platform will be discussed in detail, from the design of the experiment to the analysis of the results and, eventually, to the modeling of the systems involved. The experiments are: CityRace, about the measurement of human traffic-facing strategies; laPENSOcosì, aiming to unveil the political opinion structure; AirProbe, implemented again in the EveryAware project framework, which consisted in monitoring air quality opinion shift of a community informed about local air pollution. At the end, the evolution of the technosocial systems investigation methods shall emerge together with the opportunities and the threats offered by this new scientific path.
Resumo:
Information is nowadays a key resource: machine learning and data mining techniques have been developed to extract high-level information from great amounts of data. As most data comes in form of unstructured text in natural languages, research on text mining is currently very active and dealing with practical problems. Among these, text categorization deals with the automatic organization of large quantities of documents in priorly defined taxonomies of topic categories, possibly arranged in large hierarchies. In commonly proposed machine learning approaches, classifiers are automatically trained from pre-labeled documents: they can perform very accurate classification, but often require a consistent training set and notable computational effort. Methods for cross-domain text categorization have been proposed, allowing to leverage a set of labeled documents of one domain to classify those of another one. Most methods use advanced statistical techniques, usually involving tuning of parameters. A first contribution presented here is a method based on nearest centroid classification, where profiles of categories are generated from the known domain and then iteratively adapted to the unknown one. Despite being conceptually simple and having easily tuned parameters, this method achieves state-of-the-art accuracy in most benchmark datasets with fast running times. A second, deeper contribution involves the design of a domain-independent model to distinguish the degree and type of relatedness between arbitrary documents and topics, inferred from the different types of semantic relationships between respective representative words, identified by specific search algorithms. The application of this model is tested on both flat and hierarchical text categorization, where it potentially allows the efficient addition of new categories during classification. Results show that classification accuracy still requires improvements, but models generated from one domain are shown to be effectively able to be reused in a different one.
Resumo:
Autism Spectrum Disorders (ASDs) describe a set of neurodevelopmental disorders. ASD represents a significant public health problem. Currently, ASDs are not diagnosed before the 2nd year of life but an early identification of ASDs would be crucial as interventions are much more effective than specific therapies starting in later childhood. To this aim, cheap an contact-less automatic approaches recently aroused great clinical interest. Among them, the cry and the movements of the newborn, both involving the central nervous system, are proposed as possible indicators of neurological disorders. This PhD work is a first step towards solving this challenging problem. An integrated system is presented enabling the recording of audio (crying) and video (movements) data of the newborn, their automatic analysis with innovative techniques for the extraction of clinically relevant parameters and their classification with data mining techniques. New robust algorithms were developed for the selection of the voiced parts of the cry signal, the estimation of acoustic parameters based on the wavelet transform and the analysis of the infant’s general movements (GMs) through a new body model for segmentation and 2D reconstruction. In addition to a thorough literature review this thesis presents the state of the art on these topics that shows that no studies exist concerning normative ranges for newborn infant cry in the first 6 months of life nor the correlation between cry and movements. Through the new automatic methods a population of control infants (“low-risk”, LR) was compared to a group of “high-risk” (HR) infants, i.e. siblings of children already diagnosed with ASD. A subset of LR infants clinically diagnosed as newborns with Typical Development (TD) and one affected by ASD were compared. The results show that the selected acoustic parameters allow good differentiation between the two groups. This result provides new perspectives both diagnostic and therapeutic.