960 resultados para Semi-supervised clustering
Resumo:
This paper describes our semi-automatic keyword based approach for the four topics of Information Extraction from Microblogs Posted during Disasters task at Forum for Information Retrieval Evaluation (FIRE) 2016. The approach consists three phases.
Clustering of Protein Structures Using Hydrophobic Free Energy And Solvent Accessibility of Proteins
Resumo:
We introduce K-tree in an information retrieval context. It is an efficient approximation of the k-means clustering algorithm. Unlike k-means it forms a hierarchy of clusters. It has been extended to address issues with sparse representations. We compare performance and quality to CLUTO using document collections. The K-tree has a low time complexity that is suitable for large document collections. This tree structure allows for efficient disk based implementations where space requirements exceed that of main memory.
Resumo:
Based on Newmark-β method, a structural vibration response is predicted. Through finding the appropriate control force parameters within certain ranges to optimize the objective function, the predictive control of the structural vibration is achieved. At the same time, the numerical simulation analysis of a two-storey frame structure with magneto-rheological (MR) dampers under earthquake records is carried out, and the parameter influence on structural vibration reduction is discussed. The results demonstrate that the semi-active control based on Newmark-β predictive algorithm is better than the classical control strategy based on full-state feedback control and has remarkable advantages of structural vibration reduction and control robustness.
Resumo:
This paper describes the approach taken to the XML Mining track at INEX 2008 by a group at the Queensland University of Technology. We introduce the K-tree clustering algorithm in an Information Retrieval context by adapting it for document clustering. Many large scale problems exist in document clustering. K-tree scales well with large inputs due to its low complexity. It offers promising results both in terms of efficiency and quality. Document classification was completed using Support Vector Machines.
Resumo:
Objective: To summarise the extent to which narrative text fields in administrative health data are used to gather information about the event resulting in presentation to a health care provider for treatment of an injury, and to highlight best practise approaches to conducting narrative text interrogation for injury surveillance purposes.----- Design: Systematic review----- Data sources: Electronic databases searched included CINAHL, Google Scholar, Medline, Proquest, PubMed and PubMed Central.. Snowballing strategies were employed by searching the bibliographies of retrieved references to identify relevant associated articles.----- Selection criteria: Papers were selected if the study used a health-related database and if the study objectives were to a) use text field to identify injury cases or use text fields to extract additional information on injury circumstances not available from coded data or b) use text fields to assess accuracy of coded data fields for injury-related cases or c) describe methods/approaches for extracting injury information from text fields.----- Methods: The papers identified through the search were independently screened by two authors for inclusion, resulting in 41 papers selected for review. Due to heterogeneity between studies metaanalysis was not performed.----- Results: The majority of papers reviewed focused on describing injury epidemiology trends using coded data and text fields to supplement coded data (28 papers), with these studies demonstrating the value of text data for providing more specific information beyond what had been coded to enable case selection or provide circumstantial information. Caveats were expressed in terms of the consistency and completeness of recording of text information resulting in underestimates when using these data. Four coding validation papers were reviewed with these studies showing the utility of text data for validating and checking the accuracy of coded data. Seven studies (9 papers) described methods for interrogating injury text fields for systematic extraction of information, with a combination of manual and semi-automated methods used to refine and develop algorithms for extraction and classification of coded data from text. Quality assurance approaches to assessing the robustness of the methods for extracting text data was only discussed in 8 of the epidemiology papers, and 1 of the coding validation papers. All of the text interrogation methodology papers described systematic approaches to ensuring the quality of the approach.----- Conclusions: Manual review and coding approaches, text search methods, and statistical tools have been utilised to extract data from narrative text and translate it into useable, detailed injury event information. These techniques can and have been applied to administrative datasets to identify specific injury types and add value to previously coded injury datasets. Only a few studies thoroughly described the methods which were used for text mining and less than half of the studies which were reviewed used/described quality assurance methods for ensuring the robustness of the approach. New techniques utilising semi-automated computerised approaches and Bayesian/clustering statistical methods offer the potential to further develop and standardise the analysis of narrative text for injury surveillance.