389 results for "Large datasets"
Abstract:
At Eurocrypt’04, Freedman, Nissim and Pinkas introduced the fuzzy private matching problem, defined as follows. Given two parties, each holding a set of vectors with T integer components, fuzzy private matching securely tests whether each vector of one set matches some vector of the other set in at least t components, where t < T. In the conclusion of their paper, they asked whether it is possible to design a fuzzy private matching protocol whose communication complexity avoids the binomial factor (T choose t). We answer their question in the affirmative by presenting a protocol based on homomorphic encryption, combined with the novel notion of a share-hiding error-correcting secret sharing scheme, which we show how to implement with efficient decoding using interleaved Reed-Solomon codes. This scheme may be of independent interest. Our protocol is provably secure against passive adversaries and is more efficient than previous protocols for certain parameter values.
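As a point of reference for the threshold condition described above, the following minimal Python sketch evaluates the underlying (non-private) matching predicate on plaintext vectors. The homomorphic encryption and secret-sharing machinery of the actual protocol is not represented here; the function name and the example vectors are illustrative assumptions only.

```python
def fuzzy_match(x, y, t):
    """Return True if vectors x and y agree in at least t components.

    This is only the plaintext predicate that a fuzzy private matching
    protocol evaluates securely; the cryptographic layers are omitted.
    """
    assert len(x) == len(y)
    return sum(a == b for a, b in zip(x, y)) >= t

# Example: T = 5 components, threshold t = 3
print(fuzzy_match([1, 2, 3, 4, 5], [1, 2, 0, 4, 9], t=3))  # True (3 matching components)
```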
Abstract:
Next Generation Sequencing (NGS) has revolutionised molecular biology, resulting in an explosion of data sets and an increasing role in clinical practice. Such applications necessarily require rapid identification of the organism as a prelude to annotation and further analysis. NGS data consist of a substantial number of short sequence reads, given context through downstream assembly and annotation, a process requiring reads consistent with the assumed species or species group. Highly accurate results have been obtained for restricted sets using SVM classifiers, but such methods are difficult to parallelise and success depends on careful attention to feature selection. This work examines the problem at very large scale, using a mix of synthetic and real data with a view to determining the overall structure of the problem and the effectiveness of parallel ensembles of simpler classifiers (principally random forests) in addressing the challenges of large scale genomics.
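As an illustration of the kind of parallel ensemble classification the abstract refers to, the sketch below trains a random forest on simple k-mer count features. The use of scikit-learn, the choice of k = 3, and the toy reads and species labels are all assumptions made for illustration; they do not reproduce the paper's actual pipeline or data.

```python
from collections import Counter
from itertools import product

from sklearn.ensemble import RandomForestClassifier

def kmer_features(read, k=3):
    """Count occurrences of each k-mer over the DNA alphabet in a read."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(read[i:i + k] for i in range(len(read) - k + 1))
    return [counts.get(km, 0) for km in kmers]

# Toy reads with hypothetical species labels (real inputs would be NGS reads).
reads = ["ACGTACGTAC", "TTTTAACCGG", "ACGTTGCAAC", "GGGGTTTTCC"]
labels = ["speciesA", "speciesB", "speciesA", "speciesB"]

X = [kmer_features(r) for r in reads]
clf = RandomForestClassifier(n_estimators=100, n_jobs=-1)  # trees are trained in parallel
clf.fit(X, labels)
print(clf.predict([kmer_features("ACGTACGTTG")]))
```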
Abstract:
Enterprises, both public and private, have rapidly begun to exploit the benefits of enterprise resource planning (ERP) combined with business analytics and “open data sets”, which are often outside the control of the enterprise, in order to gain further efficiencies, build new service operations and increase business activity. In many cases, these business activities are based around relevant software systems hosted in a “cloud computing” environment. “Garbage in, garbage out”, or “GIGO”, is a term dating from the 1960s that has long been used to describe problems of unqualified dependency on information systems. A more pertinent variation arose somewhat later, namely “garbage in, gospel out”, signifying that with large-scale information systems, such as ERP and the use of open datasets in a cloud environment, the ability to verify the authenticity of the data sets used may be almost impossible, resulting in dependence upon questionable results. Illicit data set “impersonation” becomes a reality. At the same time, the ability to audit such results may be an important requirement, particularly in the public sector. This paper discusses the need for enhancement of identity, reliability, authenticity and audit services, including naming and addressing services, in this emerging environment and analyses some current technologies on offer that may be appropriate. However, severe limitations in addressing these requirements have been identified, and the paper proposes further research work in the area.
Abstract:
Enterprise resource planning (ERP) systems are rapidly being combined with “big data” analytics processes and publicly available “open data sets”, which are usually outside the arena of the enterprise, to expand activity through better service to current clients as well as to identify new opportunities. Moreover, these activities are now largely based around relevant software systems hosted in a “cloud computing” environment. The more than 50-year-old phrase expressing mistrust in computer systems, namely “garbage in, garbage out” or “GIGO”, is used to describe problems of unqualified and unquestioning dependency on information systems. However, a more relevant GIGO interpretation arose somewhat later, namely “garbage in, gospel out”, signifying that with large-scale information systems based around ERP and open datasets as well as “big data” analytics, particularly in a cloud environment, the ability to verify the authenticity and integrity of the data sets used may be almost impossible. In turn, this may easily result in decision making based upon questionable results which are unverifiable. Illicit “impersonation” of, and modifications to, legitimate data sets may become a reality, while at the same time the ability to audit any derived results of analysis may be an important requirement, particularly in the public sector. The pressing need for enhancement of identity, reliability, authenticity and audit services, including naming and addressing services, in this emerging environment is discussed in this paper. Some appropriate technologies currently on offer are also examined. However, severe limitations in addressing the problems identified are found, and the paper proposes further necessary research work for the area. (Note: This paper is based on an earlier unpublished paper/presentation “Identity, Addressing, Authenticity and Audit Requirements for Trust in ERP, Analytics and Big/Open Data in a ‘Cloud’ Computing Environment: A Review and Proposal” presented to the Department of Accounting and IT, College of Management, National Chung Chen University, 20 November 2013.)
Abstract:
Objective: To evaluate the effectiveness and robustness of Anonym, a tool for de-identifying free-text health records based on conditional random field classifiers informed by linguistic and lexical features, as well as features extracted by pattern-matching techniques. De-identification of personal health information in electronic health records is essential for the sharing and secondary usage of clinical data. De-identification tools that adapt to different sources of clinical data are attractive, as they would require minimal intervention to guarantee high effectiveness.
Methods and Materials: The effectiveness and robustness of Anonym are evaluated across multiple datasets, including the widely adopted Integrating Biology and the Bedside (i2b2) dataset, used for evaluation in a de-identification challenge. The datasets used here vary in the type of health records, the source of the data, and their quality, with one of the datasets containing optical character recognition errors.
Results: Anonym identifies and removes up to 96.6% of personal health identifiers (recall) with a precision of up to 98.2% on the i2b2 dataset, outperforming the best system proposed in the i2b2 challenge. The effectiveness of Anonym across datasets is found to depend on the amount of information available for training.
Conclusion: The findings show that Anonym is comparable to the best approach from the 2006 i2b2 shared task. Anonym is easy to retrain with new datasets; if retrained, the system is robust to variations in training size, data type and quality, in the presence of sufficient training data.
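For readers unfamiliar with how recall and precision figures such as those quoted above are computed, the following sketch shows the standard calculation over sets of annotated identifier spans. The exact-span matching convention and the (document_id, start, end) tuple format are assumptions for illustration, not details taken from the paper's evaluation protocol.

```python
def precision_recall(true_phi, predicted_phi):
    """Compute precision and recall of predicted PHI spans against gold spans.

    Both arguments are sets of (document_id, start, end) tuples; exact-span
    matching is assumed, which is one of several conventions used in
    de-identification evaluations.
    """
    tp = len(true_phi & predicted_phi)   # correctly identified identifiers
    fp = len(predicted_phi - true_phi)   # spurious predictions
    fn = len(true_phi - predicted_phi)   # missed identifiers
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

gold = {("doc1", 0, 4), ("doc1", 10, 18), ("doc2", 5, 9)}
pred = {("doc1", 0, 4), ("doc1", 10, 18), ("doc2", 20, 24)}
print(precision_recall(gold, pred))  # (0.666..., 0.666...)
```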
Abstract:
We present efficient protocols for private set disjointness tests. We begin with an intuitive construction that applies Sylvester matrices. Unfortunately, this simple construction is insecure, as it reveals information about the cardinality of the intersection; more specifically, it discloses a lower bound on it. Using Lagrange interpolation, we then provide a protocol for the honest-but-curious case that reveals no additional information. Finally, we describe a protocol that is secure against malicious adversaries; it applies a verification test to detect misbehaving participants. Both protocols require O(1) rounds of communication. Our protocols are more efficient than previous protocols in terms of communication and computation overhead. Unlike previous protocols, whose security relies on computational assumptions, our protocols provide information-theoretic security. To our knowledge, our protocols are the first to be designed without generic secure function evaluation. More importantly, they are the most efficient protocols for private disjointness tests in the malicious adversary case.
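Since the honest-but-curious protocol above relies on Lagrange interpolation, the sketch below shows the basic interpolation step over a prime field, as used in Shamir-style secret sharing. It is a minimal illustration only; the masking layers of the actual disjointness protocol are omitted, and the example polynomial and prime are invented for the demo.

```python
def lagrange_interpolate_at_zero(points, p):
    """Evaluate at x = 0 the unique polynomial through the given points, mod prime p.

    points: list of (x_i, y_i) pairs with distinct x_i. This is the standard
    reconstruction step in Shamir-style secret sharing; requires Python 3.8+
    for the three-argument pow (modular inverse).
    """
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * (-xj)) % p          # numerator of the Lagrange basis at x = 0
                den = (den * (xi - xj)) % p      # denominator of the Lagrange basis
        total = (total + yi * num * pow(den, -1, p)) % p
    return total

# Example: shares of the secret 7 under f(x) = 7 + 3x over GF(11)
print(lagrange_interpolate_at_zero([(1, 10), (2, 2)], p=11))  # 7
```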
Abstract:
There are limited studies describing patient meal preferences in hospital; however, these data are critical for developing menus that address satisfaction and nutrition whilst balancing resources. This quality study aimed to determine preferences for meals and snacks to inform a comprehensive menu revision in a large (929 bed) tertiary public hospital. The method was based on Vivanti et al. (2008), with data collected by two final year dietetic students. The first survey comprised 72 questions and achieved a response rate of 68% (n = 192), while the second, more focused at 47 questions, achieved a higher response rate of 93% (n = 212). Findings showed over half the patients reporting poor or less than normal appetite, 20% describing taste issues, over a third with a length of stay (LOS) >7 days, a third with a Malnutrition Screening Tool (MST) score ≥ 2, and less than half eating only from the general menu. Soup, followed by toast, was most frequently reported as eaten at home when unwell, and whilst the most common response was not missing any foods when in hospital (25%), steak was the most commonly missed item. Hot breakfasts were desired by the majority (63%), with over half preferring toast (even if cold). In relation to snacks, nearly half (48%) wanted something more substantial than tea/coffee/biscuits, with sandwiches (54%) and soup (33%) being suggested. Sandwiches at the evening meal were not popular (6%). Difficulties with using cutlery and meal size selection were identified as issues. Findings from this study had high utility and supported a collaborative and evidence-based approach to a successful major menu change for the hospital.
Abstract:
This study explores people's risk taking behaviour after having suffered large real-world losses following a natural disaster. Using the margins of the 2011 Australian floods (Brisbane) as a natural experimental setting, we find that homeowners who were victims of the floods and face large losses in property values are 50% more likely to opt for a risky gamble -- a scratch card giving a small chance of a large gain ($500,000) -- than for a sure amount of comparable value ($10). This finding is consistent with prospect theory predictions regarding the adoption of a risk-seeking attitude after a loss.
Abstract:
This is an exploratory study into the effectiveness of embedding custom-made audiovisual case studies (AVCS) to enhance the student learning experience. This paper describes a project that used AVCS for a large, diverse cohort of undergraduate students enrolled in an International Business course. The study makes a number of key contributions to advancing learning and teaching within the discipline. AVCS provide first-hand reporting of the case material, allowing students to improve their understanding from both verbal and nonverbal cues. The paper demonstrates how AVCS can be embedded in a student-centred teaching approach to capture students’ interest and to promote a deep approach to learning by providing real-world, authentic experience.
Abstract:
Business processes are an important instrument for understanding and improving how companies provide goods and services to customers. Accordingly, many companies have documented their business processes in detail, often as Event-driven Process Chains (EPCs). Unfortunately, in many cases the resulting EPCs are rather complex, so that the overall process logic is hidden in low-level process details. This paper proposes abstraction mechanisms for process models that aim to reduce their complexity while keeping the overall process structure. We assume that functions are annotated with efforts and splits with probabilities; this information is used to separate important process parts from less important ones. Real-world process models are used to validate the approach.
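As a rough illustration of how effort and probability annotations can be propagated when abstracting process fragments, the sketch below aggregates a sequence of functions and an exclusive (XOR) split block. These are generic aggregation rules chosen for illustration; they are not taken from the paper.

```python
def sequence_effort(efforts):
    """Abstracting a sequence of functions: the efforts simply add up."""
    return sum(efforts)

def xor_block_effort(branches):
    """Abstracting an exclusive (XOR) split block: the expected effort is the
    probability-weighted sum of the branch efforts.

    branches: list of (probability, effort) pairs whose probabilities sum to 1.
    """
    return sum(p * e for p, e in branches)

# Example: a split taken with probabilities 0.7 / 0.3 over branches costing 10 and 40
print(sequence_effort([5, 10, 5]))               # 20
print(xor_block_effort([(0.7, 10), (0.3, 40)]))  # 19.0
```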
Abstract:
Background: Large segmental defects in bone do not heal well and present clinical challenges. This study investigated modulation of the mechanical environment as a means of improving bone healing in the presence of bone morphogenetic protein (BMP)-2. Although the influence of mechanical forces on the healing of fractures is well established, no previous studies, to our knowledge, have described their influence on the healing of large segmental defects. We hypothesized that bone healing would be improved by initial, low-stiffness fixation of the defect, followed by high-stiffness fixation during the healing process. We call this reverse dynamization.
Methods: A rat model of a critical-sized femoral defect was used. External fixators were constructed to provide different degrees of stiffness and, importantly, the ability to change stiffness during the healing process in vivo. Healing of the critical-sized defects was initiated by the implantation of 11 μg of recombinant human BMP (rhBMP)-2 on a collagen sponge. Groups of rats receiving BMP-2 were allowed to heal with low, medium, and high-stiffness fixators, as well as under conditions of reverse dynamization, in which the stiffness was changed from low to high at two weeks. Healing was assessed at eight weeks with use of radiographs, histological analysis, microcomputed tomography, dual x-ray absorptiometry, and mechanical testing.
Results: Under constant stiffness, the low-stiffness fixator produced the best healing after eight weeks. However, reverse dynamization provided considerable improvement, resulting in a marked acceleration of the healing process by all of the criteria of this study. The histological data suggest that this was the result of intramembranous, rather than endochondral, ossification.
Conclusions: Reverse dynamization accelerated healing in the presence of BMP-2 in the rat femur and is worthy of further investigation as a means of improving the healing of large segmental bone defects.
Clinical Relevance: These data provide the basis of a novel, simple, and inexpensive way to improve the healing of critical-sized defects in long bones. Reverse dynamization may also be applicable to other circumstances in which bone healing is problematic.
Abstract:
Anatase TiO2 nanocrystals were painted onto H-titanate nanofibers using an aqueous solution of titanyl sulfate. The anatase nanocrystals were bonded solidly onto the titanate fibers through the formation of coherent interfaces at which oxygen atoms are shared by the nanocrystals and the fiber. This approach allowed us to create large anatase surfaces on the nanofibers, which are active in photocatalytic reactions. The method was also applied successfully to coat anatase nanocrystals onto the surfaces of fly ash and layered clay. The painted nanofibers exhibited a much higher catalytic activity for the photocatalytic degradation of sulforhodamine B and the selective oxidation of benzylamine to the corresponding imine (with a product selectivity >99%) under UV irradiation than both the parent H-titanate nanofibers and a commercial TiO2 powder, P25. We found that gold nanoparticles supported on H-titanate nanofibers showed no catalytic activity for the reduction of nitrobenzene to azoxybenzene, whereas gold nanoparticles supported on the painted nanofibers and on P25 could efficiently reduce nitrobenzene to azoxybenzene as the sole product under visible light irradiation. These results differ from those obtained with a gold-on-ZrO2 photocatalyst, for which azoxybenzene was an intermediate that was quickly converted to azobenzene. Evidently, the support material significantly affects the product selectivity of the nitrobenzene reduction. Finally, the new photocatalysts could be easily dispersed into and separated from a liquid because of their fibril morphology, an important advantage for practical applications.
Abstract:
In this paper we describe the use and evaluation of CubIT, a multi-user, very large-scale presentation and collaboration framework. CubIT is installed at the Queensland University of Technology’s (QUT) Cube facility. The “Cube” is an interactive visualisation facility made up of five very large-scale interactive multi-panel wall displays, each consisting of up to twelve 55-inch multi-touch screens (48 screens in total), with massive projected display screens situated above the display panels. The paper outlines the unique design challenges, features, use and evaluation of CubIT. The system was built to make the Cube facility accessible to QUT’s academic and student population. CubIT enables users to easily upload and share their own media content, and allows multiple users to simultaneously interact with the Cube’s wall displays. The features of CubIT are implemented via three user interfaces: a multi-touch interface running on the wall displays, a mobile phone and tablet application, and a web-based content management system. The evaluation reveals issues around the public use and functional scope of the system.
Abstract:
Hunter argues that cognitive science models of human thinking explain how analogical reasoning and precedential reasoning operate in law. He offers an explanation of why various legal theories are so limited and calls for greater attention to what is actually happening when lawyers and judges reason, by analogy, with precedent.
Abstract:
This paper describes a novel system for the automatic classification of images obtained from Anti-Nuclear Antibody (ANA) pathology tests on Human Epithelial type 2 (HEp-2) cells using the Indirect Immunofluorescence (IIF) protocol. The IIF protocol on HEp-2 cells has been the hallmark method for identifying the presence of ANAs, due to its high sensitivity and the large range of antigens that can be detected. However, it suffers from numerous shortcomings, such as being subjective as well as time and labour intensive. Computer Aided Diagnostic (CAD) systems have been developed to address these problems by automatically classifying a HEp-2 cell image into one of its known patterns (e.g. speckled, homogeneous). Most existing CAD systems use hand-picked features to represent a HEp-2 cell image, which may only work in limited scenarios. We propose a novel automatic cell image classification method termed Cell Pyramid Matching (CPM), which comprises regional histograms of visual words coupled with the Multiple Kernel Learning framework. We present a study of several variations of generating the histograms and show the efficacy of the system on two publicly available datasets: the ICPR HEp-2 cell classification contest dataset and the SNPHEp-2 dataset.
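To give a concrete sense of what "regional histograms of visual words" can look like, here is a small sketch that concatenates a whole-image histogram with per-region histograms over a grid of image regions. The grid size, data layout, and function name are illustrative assumptions; they do not reproduce the exact CPM configuration or the Multiple Kernel Learning stage described in the paper.

```python
import numpy as np

def regional_histograms(word_ids, positions, image_size, num_words, grid=2):
    """Concatenate a whole-image histogram of visual words with per-region
    histograms over a grid x grid split of the cell image.

    word_ids: visual-word index (in [0, num_words)) of each local descriptor.
    positions: (x, y) location of each descriptor in the image.
    """
    w, h = image_size
    hists = [np.bincount(word_ids, minlength=num_words)]  # level 0: whole image
    for gx in range(grid):
        for gy in range(grid):
            in_region = [wid for wid, (x, y) in zip(word_ids, positions)
                         if gx * w / grid <= x < (gx + 1) * w / grid
                         and gy * h / grid <= y < (gy + 1) * h / grid]
            hists.append(np.bincount(np.array(in_region, dtype=int),
                                     minlength=num_words))
    return np.concatenate(hists)

# Toy example: 4 descriptors assigned to a 5-word vocabulary on a 100x100 image
feats = regional_histograms(word_ids=[0, 2, 2, 4],
                            positions=[(10, 10), (80, 20), (30, 70), (90, 90)],
                            image_size=(100, 100), num_words=5)
print(feats.shape)  # (25,) = (1 global + 4 regions) * 5 words
```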