940 resultados para Data anonymization and sanitization


Relevância:

100.00% 100.00%

Publicador:

Resumo:

After nearly fifteen years of the open access (OA) movement and its hard-fought struggle for a more open scholarly communication system, publishers are realizing that business models can be both open and profitable. Making journal articles available on an OA license is becoming an accepted strategy for maximizing the value of content to both research communities and the businesses that serve them. The first blog in this two-part series celebrating Data Innovation Day looks at the role that data-innovation is playing in the shift to open access for journal articles.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The quality of data collection methods selected and the integrity of the data collected are integral tot eh success of a study. This chapter focuses on data collection and study validity. After reading the chapter, readers should be able to define types of data collection methods in quantitative research; list advantages and disadvantages of each method; discuss factors related to internal and external validity; critically evaluate data collection methods and discuss the need to operationalise variables of interest for data collection.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper firstly presents the benefits and critical challenges on the use of Bluetooth and Wi-Fi for crowd data collection and monitoring. The major challenges include antenna characteristics, environment’s complexity and scanning features. Wi-Fi and Bluetooth are compared in this paper in terms of architecture, discovery time, popularity of use and signal strength. Type of antennas used and the environment’s complexity such as trees for outdoor and partitions for indoor spaces highly affect the scanning range. The aforementioned challenges are empirically evaluated by “real” experiments using Bluetooth and Wi-Fi Scanners. The issues related to the antenna characteristics are also highlighted by experimenting with different antenna types. Novel scanning approaches including Overlapped Zones and Single Point Multi-Range detection methods will be then presented and verified by real-world tests. These novel techniques will be applied for location identification of the MAC IDs captured that can extract more information about people movement dynamics.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Data in germplasm collections contain a mixture of data types; binary, multistate and quantitative. Given the multivariate nature of these data, the pattern analysis methods of classification and ordination have been identified as suitable techniques for statistically evaluating the available diversity. The proximity (or resemblance) measure, which is in part the basis of the complementary nature of classification and ordination techniques, is often specific to particular data types. The use of a combined resemblance matrix has an advantage over data type specific proximity measures. This measure accommodates the different data types without manipulating them to be of a specific type. Descriptors are partitioned into their data types and an appropriate proximity measure is used on each. The separate proximity matrices, after range standardisation, are added as a weighted average and the combined resemblance matrix is then used for classification and ordination. Germplasm evaluation data for 831 accessions of groundnut (Arachis hypogaea L.) from the Australian Tropical Field Crops Genetic Resource Centre, Biloela, Queensland were examined. Data for four binary, five ordered multistate and seven quantitative descriptors have been documented. The interpretative value of different weightings - equal and unequal weighting of data types to obtain a combined resemblance matrix - was investigated by using principal co-ordinate analysis (ordination) and hierarchical cluster analysis. Equal weighting of data types was found to be more valuable for these data as the results provided a greater insight into the patterns of variability available in the Australian groundnut germplasm collection. The complementary nature of pattern analysis techniques enables plant breeders to identify relevant accessions in relation to the descriptors which distinguish amongst them. This additional information may provide plant breeders with a more defined entry point into the germplasm collection for identifying sources of variability for their plant improvement program, thus improving the utilisation of germplasm resources.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

While data quality has been identified as a critical factor associated with enterprise resource planning (ERP) failure, the relationship between ERP stakeholders, the information they require and its relationship to ERP outcomes continues to be poorly understood. Applying stakeholder theory to the problem of ERP performance, we put forward a framework articulating the fundamental differences in the way users differentiate between ERP data quality and utility. We argue that the failure of ERPs to produce significant organisational outcomes can be attributed to conflict between stakeholder groups over whether the data contained within an ERP is of adequate ‘quality’. The framework provides guidance as how to manage data flows between stakeholders, offering insight into each of their specific data requirements. The framework provides support for the idea that stakeholder affiliation dictates the assumptions and core values held by individuals, driving their data needs and their perceptions of data quality and utility.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This research proposes the development of interfaces to support collaborative, community-driven inquiry into data, which we refer to as Participatory Data Analytics. Since the investigation is led by local communities, it is not possible to anticipate which data will be relevant and what questions are going to be asked. Therefore, users have to be able to construct and tailor visualisations to their own needs. The poster presents early work towards defining a suitable compositional model, which will allow users to mix, match, and manipulate data sets to obtain visual representations with little-to-no programming knowledge. Following a user-centred design process, we are subsequently planning to identify appropriate interaction techniques and metaphors for generating such visual specifications on wall-sized, multi-touch displays.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

To the Editor—In a recent review article in Infection Control and Hospital Epidemiology, Umscheid et al1 summarized published data on incidence rates of catheter-associated bloodstream infection (CABSI), catheter-associated urinary tract infection (CAUTI), surgical site infection (SSI), and ventilator- associated pneumonia (VAP); estimated how many cases are preventable; and calculated the savings in hospital costs and lives that would result from preventing all preventable cases. Providing these estimates to policy makers, political leaders, and health officials helps to galvanize their support for infection prevention programs. Our concern is that important limitations of the published studies on which Umscheid and colleagues built their findings are incompletely addressed in this review. More attention needs to be drawn to the techniques applied to generate these estimates...

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This chapter describes decentralized data fusion algorithms for a team of multiple autonomous platforms. Decentralized data fusion (DDF) provides a useful basis with which to build upon for cooperative information gathering tasks for robotic teams operating in outdoor environments. Through the DDF algorithms, each platform can maintain a consistent global solution from which decisions may then be made. Comparisons will be made between the implementation of DDF using two probabilistic representations. The first, Gaussian estimates and the second Gaussian mixtures are compared using a common data set. The overall system design is detailed, providing insight into the overall complexity of implementing a robust DDF system for use in information gathering tasks in outdoor UAV applications.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A tag-based item recommendation method generates an ordered list of items, likely interesting to a particular user, using the users past tagging behaviour. However, the users tagging behaviour varies in different tagging systems. A potential problem in generating quality recommendation is how to build user profiles, that interprets user behaviour to be effectively used, in recommendation models. Generally, the recommendation methods are made to work with specific types of user profiles, and may not work well with different datasets. In this paper, we investigate several tagging data interpretation and representation schemes that can lead to building an effective user profile. We discuss the various benefits a scheme brings to a recommendation method by highlighting the representative features of user tagging behaviours on a specific dataset. Empirical analysis shows that each interpretation scheme forms a distinct data representation which eventually affects the recommendation result. Results on various datasets show that an interpretation scheme should be selected based on the dominant usage in the tagging data (i.e. either higher amount of tags or higher amount of items present). The usage represents the characteristic of user tagging behaviour in the system. The results also demonstrate how the scheme is able to address the cold-start user problem.

Relevância:

100.00% 100.00%

Publicador:

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Honig and Samuelsson (2014) and Delmar (2015) recently had an exchange in this journal related to a replication-and-extension attempt of two papers which originally arrived at different conclusions based on the same data set. This commentary provides further clarification on the issues and links the debate to broader issues scholarly culture and practices in entrepreneurship research.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Decision-making is such an integral aspect in health care routine that the ability to make the right decisions at crucial moments can lead to patient health improvements. Evidence-based practice, the paradigm used to make those informed decisions, relies on the use of current best evidence from systematic research such as randomized controlled trials. Limitations of the outcomes from randomized controlled trials (RCT), such as “quantity” and “quality” of evidence generated, has lowered healthcare professionals’ confidence in using EBP. An alternate paradigm of Practice-Based Evidence has evolved with the key being evidence drawn from practice settings. Through the use of health information technology, electronic health records (EHR) capture relevant clinical practice “evidence”. A data-driven approach is proposed to capitalize on the benefits of EHR. The issues of data privacy, security and integrity are diminished by an information accountability concept. Data warehouse architecture completes the data-driven approach by integrating health data from multi-source systems, unique within the healthcare environment.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

High-stakes testing is changing what it means to be a ‘good teacher’ in the contemporary school. This paper uses Deleuze and Guattari's ideas on the control society and dividuation in the context of National Assessment Program Literacy and Numeracy (NAPLAN) testing in Australia to suggest that the database generates new understandings of the ‘good teacher’. Media reports are used to look at how teachers are responding to the high-stakes database through manipulating the data. This article argues that manipulating the data is a regrettable, but logical, response to manifestations of teaching where only the data counts.