280 results for cosmologia, clustering, AP-test
in Queensland University of Technology - ePrints Archive
Abstract:
The continuous growth of XML data poses a major challenge for XML data management. The need to process large amounts of XML data complicates many applications, such as information retrieval and data integration. One way of simplifying this problem is to break the massive amount of data into smaller groups by applying clustering techniques. However, XML clustering is an intricate task that may involve processing both the structure and the content of XML data in order to identify similar XML data. This research presents four clustering methods: two that use the structure of XML documents and two that use both the structure and the content. The two structural clustering methods have different data models: one is based on a path model and the other on a tree model. These methods employ rigid similarity measures which aim to identify corresponding elements between documents with different or similar underlying structure. The two clustering methods that use both structural and content information differ in how the structure and content similarities are combined. One method calculates the document similarity as a linear weighted combination of the structure and content similarities; its content similarity is based on a semantic kernel. The other method calculates the distance between documents by a non-linear combination of the structure and content of XML documents using a semantic kernel. Empirical analysis shows that the structure-only clustering method based on the tree model is more scalable than the structure-only clustering method based on the path model, because the similarity measure for the tree model does not need to visit the parents of an element many times. Experimental results also show that the clustering methods perform better with the inclusion of content information on most test document collections.
To further the research, the structural clustering method based on the tree model is extended and employed in XML transformation. The experimental results show that the proposed transformation process is faster than the traditional transformation system, which translates and converts the source XML documents sequentially. In addition, the schema matching process of XML transformation produces a better matching result in a shorter time.
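The linear weighted combination of structure and content similarity mentioned in the abstract can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' method: Jaccard overlap of root-to-node paths stands in for the structural measure, plain cosine similarity over term counts stands in for the semantic kernel, and the weight `lam` and all function names are hypothetical.

```python
import math

def structure_similarity(paths_a, paths_b):
    """Jaccard overlap of two path sets (assumed stand-in for the structural measure)."""
    union = paths_a | paths_b
    if not union:
        return 0.0
    return len(paths_a & paths_b) / len(union)

def content_similarity(terms_a, terms_b):
    """Cosine similarity of term-count dicts (assumed stand-in for the semantic kernel)."""
    dot = sum(count * terms_b.get(term, 0) for term, count in terms_a.items())
    norm_a = math.sqrt(sum(c * c for c in terms_a.values()))
    norm_b = math.sqrt(sum(c * c for c in terms_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def combined_similarity(doc_a, doc_b, lam=0.5):
    """Linear weighting: lam * structure + (1 - lam) * content, with lam in [0, 1]."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    s = structure_similarity(doc_a["paths"], doc_b["paths"])
    c = content_similarity(doc_a["terms"], doc_b["terms"])
    return lam * s + (1.0 - lam) * c

# Hypothetical documents, each reduced to a path set and a bag of terms.
doc_a = {"paths": {"/a/b", "/a/c"}, "terms": {"xml": 2, "cluster": 1}}
doc_b = {"paths": {"/a/b", "/a/d"}, "terms": {"xml": 1, "tree": 1}}

if __name__ == "__main__":
    print(combined_similarity(doc_a, doc_b, lam=0.5))
```

Setting `lam` to 1 reduces the measure to structure only, and setting it to 0 reduces it to content only, which is one way to compare the structure-only and combined variants on the same collection.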
Abstract:
Objective To investigate the epidemic characteristics of human cutaneous anthrax (CA) in China, detect the spatiotemporal clusters at the county level for preemptive public health interventions, and evaluate the differences in the epidemiological characteristics within and outside clusters. Methods CA cases reported during 2005–2012 from the national surveillance system were evaluated at the county level using the space-time scan statistic. Comparative analysis of the epidemic characteristics within and outside identified clusters was performed using the χ² test or the Kruskal-Wallis test. Results The 30–39 years age group had the highest incidence of CA, and the fatality rate increased with age, with persons ≥70 years showing a fatality rate of 4.04%. Seasonality analysis showed that most CA cases occurred between May/June and September/October of each year. The primary spatiotemporal cluster contained 19 counties from June 2006 to May 2010 and was mainly located straddling the borders of Sichuan, Gansu, and Qinghai provinces. In these high-risk areas, CA cases were predominantly found among younger, local, male shepherds who lived on agriculture and stockbreeding, and were characterized by high morbidity, low mortality and a shorter period from illness onset to diagnosis. Conclusion CA was geographically and persistently clustered in southwestern China during 2005–2012, with notable differences in the epidemic characteristics within and outside spatiotemporal clusters; this demonstrates the necessity for CA interventions such as enhanced surveillance, health education, and mandatory and standard decontamination or disinfection procedures to be geographically targeted to the areas identified in this study.
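The within-versus-outside comparison named in the Methods can be illustrated with a Pearson χ² statistic on a 2×2 contingency table. The sketch below is a plain-Python illustration only; the function name is hypothetical and the counts are invented placeholders, not data from the study.

```python
def pearson_chi_square(table):
    """Pearson chi-square statistic for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    if grand_total == 0:
        raise ValueError("table must contain at least one observation")
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / grand_total
            if expected == 0:
                raise ValueError("a row or column has no observations")
            statistic += (observed - expected) ** 2 / expected
    return statistic

# Hypothetical counts (NOT from the study): rows are cases inside and outside
# clusters, columns are two levels of some characteristic (e.g. sex).
example_table = [[10, 20], [30, 40]]

if __name__ == "__main__":
    chi2 = pearson_chi_square(example_table)
    # For a 2 x 2 table there is 1 degree of freedom; the 5% critical value is 3.841.
    print(chi2, chi2 > 3.841)
```

For ordinal or continuous characteristics, such as the interval from illness onset to diagnosis, a rank-based test like Kruskal-Wallis would be the natural counterpart, as the abstract indicates.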
Clustering of Protein Structures Using Hydrophobic Free Energy And Solvent Accessibility of Proteins
Abstract:
In a range test, one party holds a ciphertext and needs to test whether the message encrypted in the ciphertext is within a certain interval range. In this paper, a range test protocol is proposed, where the party holding the ciphertext asks another party holding the private key of the encryption algorithm to help him. These two parties run the protocol to implement the test. The test returns TRUE if and only if the encrypted message is within the certain interval range. If the two parties do not conspire, no information about the encrypted message is revealed from the test except what can be deduced from the test result. Advantages of the new protocol over the existing related techniques are that it achieves correctness, soundness, flexibility, high efficiency and privacy simultaneously.