903 results for Data-driven knowledge acquisition
                                
Abstract:
Failure to detect patients at risk of attempting suicide can result in tragic consequences. Identifying risks earlier and more accurately helps prevent serious incidents occurring and is the objective of the GRiST clinical decision support system (CDSS). One of the problems it faces is high variability in the type and quantity of data submitted for patients, who are assessed in multiple contexts along the care pathway. Although GRiST identifies up to 138 patient cues to collect, only about half of them are relevant for any one patient and their roles may not be for risk evaluation but more for risk management. This paper explores the data collection behaviour of clinicians using GRiST to see whether it can elucidate which variables are important for risk evaluations and when. The GRiST CDSS is based on a cognitive model of human expertise manifested by a sophisticated hierarchical knowledge structure or tree. This structure is used by the GRiST interface to provide top-down controlled access to the patient data. Our research explores relationships between the answers given to these higher-level 'branch' questions to see whether they can help direct assessors to the most important data, depending on the patient profile and assessment context. The outcome is a model for dynamic data collection driven by the knowledge hierarchy. It has potential for improving other clinical decision support systems operating in domains with high dimensional data that are only partially collected and in a variety of combinations.
                                
Abstract:
The management and sharing of complex data, information and knowledge is a fundamental and growing concern in the water industry and other industries for a variety of reasons. For example, risks and uncertainties associated with climate and other changes require knowledge to prepare for a range of future scenarios and potential extreme events. Formal ways in which knowledge can be established and managed can help deliver efficiencies in acquisition, structuring and filtering to provide only the essential aspects of the knowledge really needed. Ontologies are a key technology for this knowledge management. The construction of ontologies is a considerable overhead on any knowledge management programme. Hence current computer science research is investigating generating ontologies automatically from documents using text mining and natural language techniques. As an example of this, results from application of the Text2Onto tool to stakeholder documents for a project on sustainable water cycle management in new developments are presented. It is concluded that by adopting ontological representations sooner rather than later in an analytical process, decision makers will be able to make better use of highly knowledgeable systems containing automated services to ensure that sustainability considerations are included. © 2010 The authors.
                                
Abstract:
* The work is partially supported by Grant no. NIP917 of the Ministry of Science and Education – Republic of Bulgaria.
                                
Abstract:
A major drawback of artificial neural networks is their black-box character. Rule extraction algorithms are therefore becoming more and more important for explaining what a trained network has learned. In this paper, we use a method for symbolic knowledge extraction from neural networks once they have been trained on the desired function. The method is based on the weights of the trained neural network. It allows knowledge extraction from neural networks with continuous inputs and output, as well as rule extraction. An example application is shown, based on extracting the average load demand of a power plant.
                                
Abstract:
Purpose – Traditionally, most studies focus on institutionalized management-driven actors to understand technology management innovation. The purpose of this paper is to argue that there is a need for research to study the nature and role of dissident non-institutionalized actors (i.e. outsourced web designers and rapid application software developers). The authors propose that through online social knowledge sharing, non-institutionalized actors’ solution-finding tensions enable technology management innovation. Design/methodology/approach – A synthesis of the literature and an analysis of the data (21 interviews) provided insights in three areas of solution-finding tensions enabling management innovation. The authors frame the analysis on the peripherally deviant work and the nature of the ways that dissident non-institutionalized actors deviate from the original contracted objectives of their clients (understood as the firm). Findings – The findings provide insights into the productive role of solution-finding tensions in enabling opportunities for management service innovation. Furthermore, deviant practices that leverage non-institutionalized actors’ online social knowledge to fulfill customers’ requirements are not interpreted negatively, but as a positive willingness to proactively explore alternative paths. Research limitations/implications – The findings demonstrate the importance of dissident non-institutionalized actors in technology management innovation. However, this work is based on a single country (USA) and additional research is needed to validate and generalize the findings in other cultural and institutional settings. Originality/value – This paper provides new insights into the perceptions of dissident non-institutionalized actors in the practice of IT managerial decision making.
The work departs from, but also extends, the previous literature, demonstrating that peripherally deviant work in solution-finding practice creates tensions, enabling management innovation between IT providers and users.
                                
Abstract:
In this article I argue that the study of the linguistic aspects of epistemology has become unhelpfully focused on the corpus-based study of hedging and that a corpus-driven approach can help to improve upon this. Through focusing on a corpus of texts from one discourse community (that of genetics) and identifying frequent tri-lexical clusters containing highly frequent lexical items identified as keywords, I undertake an inductive analysis identifying patterns of epistemic significance. Several of these patterns are shown to be hedging devices and the whole corpus frequencies of the most salient of these, candidate and putative, are then compared to the whole corpus frequencies for comparable wordforms and clusters of epistemic significance. Finally I interviewed a ‘friendly geneticist’ in order to check my interpretation of some of the terms used and to get an expert interpretation of the overall findings. In summary I argue that the highly unexpected patterns of hedging found in genetics demonstrate the value of adopting a corpus-driven approach and constitute an advance in our current understanding of how to approach the relationship between language and epistemology.
                                
Abstract:
A rich collection of Heteroptera extracted with Berlese funnels by Dr. I. Loksa between 1953 and 1974 in Hungary has been examined. Altogether 157 true bug species have been identified. The great majority of them have been found in very low numbers; there are only 27 species of which more than 10 adult individuals have been found. Some species considered to be rare or very rare in Hungary have been collected in relatively great numbers (Ceratocombus coleoptratus, Cryptostemma pusillimum, C. waltli, Acalypta carinata, A. platycheila, Loricula ruficeps, Myrmedobia exilis). The three families that are more or less rich in species and have the highest ratio of extracted species were Rhyparochromidae, Tingidae and Nabidae. Of these, the family Rhyparochromidae has been found to be the most diverse and most characteristic at ground level. Individuals of the families Tingidae, Hebridae and Rhyparochromidae have been found in the greatest numbers. The occurrence of the lace bug Campylosteira orientalis Horváth, 1881 in Hungary has been verified by a voucher specimen. With respect to the environmental changes across the country, parallel changes have been observed in the zoogeographical distribution of the ground-living bugs.
                                
Abstract:
Rich material of Heteroptera extracted with Berlese funnels by Dr. I. Loksa between 1953 and 1974 in Hungary has been examined. Altogether 157 true bug species have been identified. The ground-living heteropteran assemblages collected in different plant communities, substrata, phytogeographical provinces and seasons have been compared with multivariate methods. Because of the unequal number of samples, the objects have been standardized with stochastic simulation. Several true bug species have been collected in almost all of the plant communities. However, characteristic ground-living heteropteran assemblages have been found in numerous Hungarian plant community types. Leaf litter and debris seem to have characteristic bug assemblages. Some differences have also been recognised between the bug fauna of mosses growing on different surfaces. Most of the species have been found in all of the great phytogeographical provinces of Hungary. Most of the high-dominance species collected can be found at ground level almost throughout the year. Specimens of many other species have been collected with Berlese funnels in spring, autumn and/or winter. The diversities of the ground-living heteropteran assemblages of the examined objects have also been compared.
                                
Abstract:
As the third part of a series of papers on the ground-living true bugs of Hungary, the species belonging to the lace bug genus Acalypta Westwood, 1840 (Insecta: Heteroptera: Tingidae) were studied. Extensive materials collected with Berlese funnels during about 20 years all over Hungary were identified. Based on these sporadic data of many years, faunistic notes are given on some Hungarian species. The seasonal occurrence of the species is discussed. The numbers of specimens of different Acalypta species collected in diverse plant communities are compared with multivariate methods. Materials collected with pitfall traps between 1979 and 1982 at Bugac, Kiskunság National Park were also processed. In this area, only A. marginata and A. gracilis occurred, both in great numbers. The temporal changes of the populations are discussed. Significant differences could be observed between the microhabitat distributions of the two species: both species occurred in very low numbers in traps placed in patches colonized by dune-slack purple moorgrass meadow; Acalypta gracilis distinctly preferred the Pannonic dune open grassland patches; A. marginata occurred in almost equal numbers in Pannonic dune open grassland and in Pannonic sand puszta patches.
                                
Abstract:
The primary aim of this dissertation is to develop data mining tools for knowledge discovery in biomedical data when multiple (homogeneous or heterogeneous) sources of data are available. The central hypothesis is that, when information from multiple sources of data is used appropriately and effectively, knowledge discovery can be better achieved than what is possible from only a single source. Recent advances in high-throughput technology have enabled biomedical researchers to generate large volumes of diverse types of data on a genome-wide scale. These data include DNA sequences, gene expression measurements, and much more; they provide the motivation for building analysis tools to elucidate the modular organization of the cell. The challenges include efficiently and accurately extracting information from the multiple data sources, representing the information effectively, developing analytical tools, and interpreting the results in the context of the domain. The first part considers the application of feature-level integration to design classifiers that discriminate between soil types. The machine learning tools, SVM and KNN, were used to successfully distinguish between several soil samples. The second part considers clustering using multiple heterogeneous data sources. The resulting Multi-Source Clustering (MSC) algorithm was shown to perform better than clustering methods that use only a single data source or a simple feature-level integration of heterogeneous data sources. The third part proposes a new approach to effectively incorporate incomplete data into clustering analysis. Adapted from the K-means algorithm, the Generalized Constrained Clustering (GCC) algorithm makes use of incomplete data in the form of constraints to perform exploratory analysis. Novel approaches for extracting constraints were proposed. For sufficiently large constraint sets, the GCC algorithm outperformed the MSC algorithm.
The last part considers the problem of providing a theme-specific environment for mining multi-source biomedical data. The database called PlasmoTFBM, focusing on gene regulation of Plasmodium falciparum, contains diverse information and has a simple interface to allow biologists to explore the data. It provided a framework for comparing different analytical tools for predicting regulatory elements and for designing useful data mining tools. The conclusion is that the experiments reported in this dissertation strongly support the central hypothesis.
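As an illustration of the feature-level integration mentioned in the abstract above, the following is a minimal sketch, not the dissertation's implementation. It assumes that feature-level integration means concatenating the per-sample feature vectors from each source before classification, and it uses a plain k-nearest-neighbour vote. The function names (`integrate_features`, `knn_predict`) and the toy data are hypothetical, and no feature scaling is applied, which a real analysis would likely need.

```python
import math
from collections import Counter

def integrate_features(*sources):
    """Feature-level integration (assumed here to mean concatenation):
    join the per-sample vectors from every source into one vector."""
    n = len(sources[0])
    if any(len(s) != n for s in sources):
        raise ValueError("all sources must describe the same samples")
    return [sum((list(s[i]) for s in sources), []) for i in range(n)]

def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k nearest training samples (Euclidean distance)."""
    ranked = sorted(zip(train_x, train_y), key=lambda p: math.dist(p[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Made-up toy data: two sources describing the same four samples.
source_a = [[0.1, 0.2], [0.0, 0.3], [0.9, 0.8], [1.0, 0.7]]
source_b = [[5.0], [5.2], [1.0], [0.8]]
labels = ["type1", "type1", "type2", "type2"]

train_x = integrate_features(source_a, source_b)
query = integrate_features([[0.95, 0.75]], [[0.9]])[0]
prediction = knn_predict(train_x, labels, query, k=3)
print(prediction)
```

Concatenation is the simplest way to combine sources; the abstract reports that the MSC algorithm performed better than this kind of simple integration for clustering.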
                                
Abstract:
With the proliferation of multimedia data and ever-growing requests for multimedia applications, there is an increasing need for efficient and effective indexing, storage and retrieval of multimedia data, such as graphics, images, animation, video, audio and text. Due to the special characteristics of multimedia data, Multimedia Database Management Systems (MMDBMSs) have emerged and attracted great research attention in recent years. Though much research effort has been devoted to this area, it is still far from maturity and there exist many open issues. In this dissertation, with the focus of addressing three of the essential challenges in developing the MMDBMS, namely, semantic gap, perception subjectivity and data organization, a systematic and integrated framework is proposed with video database and image database serving as the testbed. In particular, the framework addresses these challenges separately yet coherently from three main aspects of an MMDBMS: multimedia data representation, indexing and retrieval. In terms of multimedia data representation, the key to addressing the semantic gap issue is to intelligently and automatically model the mid-level representation and/or semi-semantic descriptors besides the extraction of the low-level media features. The data organization challenge is mainly addressed by the aspect of media indexing where various levels of indexing are required to support the diverse query requirements. In particular, the focus of this study is to facilitate the high-level video indexing by proposing a multimodal event mining framework associated with temporal knowledge discovery approaches. With respect to the perception subjectivity issue, advanced techniques are proposed to support users' interaction and to effectively model users' perception from the feedback at both the image-level and object-level.
                                
Abstract:
This study assessed the civic knowledge, skills, and attitudes of Hispanic eighth grade students in Miami-Dade County Public Schools (M-DCPS), Florida. Three hundred sixty-one Hispanic students of Cuban (253), Colombian (57), and Nicaraguan (51) ancestry from 10 middle schools participated in the study. Two hundred twenty-eight students were from low socio-economic status (SES) background, and 133 were of middle SES background. There were 136 boys and 225 girls. The International Association for the Evaluation of Educational Achievement Civic Education Student Questionnaire was used to collect data. The instrument assessed the students’ civic knowledge, skills, and attitudes. Multivariate analysis of variance was used to test for differences in the civic knowledge, skills, and attitudes of participants based on ancestry, SES, and gender. The findings indicated that there was no significant difference in the civic knowledge, skills, and attitudes of Hispanic eighth grade students that were of Cuban, Colombian, and Nicaraguan ancestry. There was no significant difference in the civic skills and in five of the civic attitude scales for students from low SES families compared to those from middle SES families. However, there was a significant difference in the civic knowledge and in the civic attitude concerning classroom discussions and participation based on SES. The civic knowledge of middle SES students was higher than that of low SES students. Furthermore, middle SES Hispanic students displayed a higher mean score for the civic attitude of classroom discussions and participation than low SES students. There was no significant difference in the civic knowledge and in five of the civic attitude scales between boys and girls. However, there was a significant difference in the civic skills and the civic attitude of support for women’s rights between boys and girls. Hispanic girls displayed a higher mean score in civic skills than Hispanic boys.
Furthermore, the mean score of civic attitude of support for women’s rights for Hispanic girls was higher than that of Hispanic boys. It was concluded that Cuban, Colombian, or Nicaraguan participants did not demonstrate differences in civic attitudes and levels of civic knowledge and skills that eighth grade students possessed. In addition, when compared to boys, girls demonstrated a higher level of civic skills and a greater support for women’s rights and participation in politics and their roles in politics. Moreover, SES was demonstrated to be a key factor in the acquisition of civic knowledge, regardless of ancestry.
                                
                                
Abstract:
A major portion of hurricane-induced economic loss originates from damage to building structures. The damage to building structures is typically grouped into three main categories: exterior, interior, and contents damage. Although the latter two types of damage, in most cases, cause more than 50% of the total loss, little has been done to investigate the physical damage process and unveil the interdependence of interior damage parameters. Building interior and contents damage is mainly due to wind-driven rain (WDR) intrusion through building envelope defects, breaches, and other functional openings. The limited research and the resulting knowledge gaps are for the most part due to the complexity of damage phenomena during hurricanes and the lack of established measurement methodologies to quantify rainwater intrusion. This dissertation focuses on devising methodologies for large-scale experimental simulation of tropical cyclone WDR and measurements of rainwater intrusion to acquire benchmark test-based data for the development of a hurricane-induced building interior and contents damage model. Target WDR parameters derived from tropical cyclone rainfall data were used to simulate the WDR characteristics at the Wall of Wind (WOW) facility. The proposed WDR simulation methodology presents detailed procedures for selection of the type and number of nozzles, formulated based on a tropical cyclone WDR study. The simulated WDR was later used to experimentally investigate the mechanisms of rainwater deposition/intrusion in buildings. Test-based datasets of two rainwater intrusion parameters that quantify the distribution of direct impinging raindrops and surface runoff rainwater over the building surface, the rain admittance factor (RAF) and the surface runoff coefficient (SRC), respectively, were developed using common shapes of low-rise buildings.
The dataset was applied to a newly formulated WDR estimation model to predict the volume of rainwater ingress through envelope openings such as wall and roof deck breaches and window sill cracks. The validation of the new model using experimental data indicated reasonable estimation of rainwater ingress through envelope defects and breaches during tropical cyclones. The WDR estimation model and experimental dataset of WDR parameters developed in this dissertation work can be used to enhance the prediction capabilities of existing interior damage models such as the Florida Public Hurricane Loss Model (FPHLM).
                                
Abstract:
The exponential growth of studies on the biological response to ocean acidification over the last few decades has generated a large amount of data. To facilitate data comparison, a data compilation hosted at the data publisher PANGAEA was initiated in 2008 and is updated on a regular basis (doi:10.1594/PANGAEA.149999). By January 2015, a total of 581 data sets (over 4 000 000 data points) from 539 papers had been archived. Here we present the developments of this data compilation five years since its first description by Nisumaa et al. (2010). Most of the study sites from which data are archived are still in the Northern Hemisphere, and the number of archived data sets from studies in the Southern Hemisphere and polar oceans is still relatively low. Data from 60 studies that investigated the response of a mix of organisms or natural communities were all added after 2010, indicating a welcome shift from the study of individual organisms to communities and ecosystems. The initial imbalance of considerably more data archived on calcification and primary production than on other processes has improved. There is also a clear tendency towards more data archived from multifactorial studies after 2010. For easier and more effective access to ocean acidification data, the ocean acidification community is strongly encouraged to contribute to the data archiving effort, and help develop standard vocabularies describing the variables and define best practices for archiving ocean acidification data.
 
                    