11 resultados para Web Data Mining
em BORIS: Bern Open Repository and Information System - Berna - Suiça
Resumo:
Traditionally, ontologies describe knowledge representation in a denotational, formalized, and deductive way. In addition, in this paper, we propose a semiotic, inductive, and approximate approach to ontology creation. We define a conceptual framework, a semantics extraction algorithm, and a first proof of concept applying the algorithm to a small set of Wikipedia documents. Intended as an extension to the prevailing top-down ontologies, we introduce an inductive fuzzy grassroots ontology, which organizes itself organically from existing natural language Web content. Using inductive and approximate reasoning to reflect the natural way in which knowledge is processed, the ontology’s bottom-up build process creates emergent semantics learned from the Web. By this means, the ontology acts as a hub for computing with words described in natural language. For Web users, the structural semantics are visualized as inductive fuzzy cognitive maps, allowing an initial form of intelligence amplification. Eventually, we present an implementation of our inductive fuzzy grassroots ontology Thus,this paper contributes an algorithm for the extraction of fuzzy grassroots ontologies from Web data by inductive fuzzy classification.
Resumo:
Index tracking has become one of the most common strategies in asset management. The index-tracking problem consists of constructing a portfolio that replicates the future performance of an index by including only a subset of the index constituents in the portfolio. Finding the most representative subset is challenging when the number of stocks in the index is large. We introduce a new three-stage approach that at first identifies promising subsets by employing data-mining techniques, then determines the stock weights in the subsets using mixed-binary linear programming, and finally evaluates the subsets based on cross validation. The best subset is returned as the tracking portfolio. Our approach outperforms state-of-the-art methods in terms of out-of-sample performance and running times.
Resumo:
SMARTDIAB is a platform designed to support the monitoring, management, and treatment of patients with type 1 diabetes mellitus (T1DM), by combining state-of-the-art approaches in the fields of database (DB) technologies, communications, simulation algorithms, and data mining. SMARTDIAB consists mainly of two units: 1) the patient unit (PU); and 2) the patient management unit (PMU), which communicate with each other for data exchange. The PMU can be accessed by the PU through the internet using devices, such as PCs/laptops with direct internet access or mobile phones via a Wi-Fi/General Packet Radio Service access network. The PU consists of an insulin pump for subcutaneous insulin infusion to the patient and a continuous glucose measurement system. The aforementioned devices running a user-friendly application gather patient's related information and transmit it to the PMU. The PMU consists of a diabetes data management system (DDMS), a decision support system (DSS) that provides risk assessment for long-term diabetes complications, and an insulin infusion advisory system (IIAS), which reside on a Web server. The DDMS can be accessed from both medical personnel and patients, with appropriate security access rights and front-end interfaces. The DDMS, apart from being used for data storage/retrieval, provides also advanced tools for the intelligent processing of the patient's data, supporting the physician in decision making, regarding the patient's treatment. The IIAS is used to close the loop between the insulin pump and the continuous glucose monitoring system, by providing the pump with the appropriate insulin infusion rate in order to keep the patient's glucose levels within predefined limits. The pilot version of the SMARTDIAB has already been implemented, while the platform's evaluation in clinical environment is being in progress.
Resumo:
The web is continuously evolving into a collection of many data, which results in the interest to collect and merge these data in a meaningful way. Based on that web data, this paper describes the building of an ontology resting on fuzzy clustering techniques. Through continual harvesting folksonomies by web agents, an entire automatic fuzzy grassroots ontology is built. This self-updating ontology can then be used for several practical applications in fields such as web structuring, web searching and web knowledge visualization.A potential application for online reputation analysis, added value and possible future studies are discussed in the conclusion.
Resumo:
Smart homes for the aging population have recently started attracting the attention of the research community. The "health state" of smart homes is comprised of many different levels; starting with the physical health of citizens, it also includes longer-term health norms and outcomes, as well as the arena of positive behavior changes. One of the problems of interest is to monitor the activities of daily living (ADL) of the elderly, aiming at their protection and well-being. For this purpose, we installed passive infrared (PIR) sensors to detect motion in a specific area inside a smart apartment and used them to collect a set of ADL. In a novel approach, we describe a technology that allows the ground truth collected in one smart home to train activity recognition systems for other smart homes. We asked the users to label all instances of all ADL only once and subsequently applied data mining techniques to cluster in-home sensor firings. Each cluster would therefore represent the instances of the same activity. Once the clusters were associated to their corresponding activities, our system was able to recognize future activities. To improve the activity recognition accuracy, our system preprocessed raw sensor data by identifying overlapping activities. To evaluate the recognition performance from a 200-day dataset, we implemented three different active learning classification algorithms and compared their performance: naive Bayesian (NB), support vector machine (SVM) and random forest (RF). Based on our results, the RF classifier recognized activities with an average specificity of 96.53%, a sensitivity of 68.49%, a precision of 74.41% and an F-measure of 71.33%, outperforming both the NB and SVM classifiers. Further clustering markedly improved the results of the RF classifier. An activity recognition system based on PIR sensors in conjunction with a clustering classification approach was able to detect ADL from datasets collected from different homes. Thus, our PIR-based smart home technology could improve care and provide valuable information to better understand the functioning of our societies, as well as to inform both individual and collective action in a smart city scenario.