282 results for Data anonymization and sanitization


Relevance:

100.00%

Publisher:

Abstract:

Monitoring and assessing environmental health is becoming increasingly important as human activity and climate change place greater pressure on global biodiversity. Acoustic sensors provide the ability to collect data passively, objectively and continuously across large areas for extended periods of time. While these factors make acoustic sensors attractive as autonomous data collectors, there are significant issues associated with large-scale data manipulation and analysis. We present our current research into techniques for analysing large volumes of acoustic data effectively and efficiently. We provide an overview of a novel online acoustic environmental workbench and discuss a number of approaches to scaling analysis of acoustic data: collaboration, manual, automatic and human-in-the-loop analysis.

Relevance:

100.00%

Publisher:

Abstract:

Several authors stress that data provides a crucial foundation for operational, tactical and strategic decisions (e.g., Redman 1998; Tee et al. 2007). Data provides the basis for decision making, as data collection and processing are typically associated with reducing uncertainty in order to make more effective decisions (Daft and Lengel 1986). While the first wave of Information Systems/Information Technology (IS/IT) investments in organizations improved data collection, restricted computational capacity and limited processing power created challenges (Simon 1960). Fifty years on, capacity and processing problems are increasingly less relevant; in fact, the opposite problem exists. Determining data relevance and usefulness is complicated by increased data capture and storage capacity, as well as continual improvements in information processing capability. As the IT landscape changes, businesses are inundated with ever-increasing volumes of data from both internal and external sources, available on both an ad-hoc and a real-time basis. More data, however, does not necessarily translate into more effective and efficient organizations, nor does it increase the likelihood of better or timelier decisions. This raises questions about what data managers require to assist their decision-making processes.

Relevance:

100.00%

Publisher:

Abstract:

Monitoring environmental health is becoming increasingly important as human activity and climate change place greater pressure on global biodiversity. Acoustic sensors provide the ability to collect data passively, objectively and continuously across large areas for extended periods. While these factors make acoustic sensors attractive as autonomous data collectors, there are significant issues associated with large-scale data manipulation and analysis. We present our current research into techniques for analysing large volumes of acoustic data efficiently. We provide an overview of a novel online acoustic environmental workbench and discuss a number of approaches to scaling analysis of acoustic data: online collaboration, manual, automatic and human-in-the-loop analysis.

Relevance:

100.00%

Publisher:

Abstract:

Members of the World Trade Organisation (WTO) are obliged to implement the Agreement on Trade-Related Aspects of Intellectual Property Rights 1994 (TRIPS), which establishes minimum standards for the protection and enforcement of intellectual property rights. Almost two decades after TRIPS was adopted at the conclusion of the Uruguay Round of trade negotiations, it is widely accepted that intellectual property systems in developing and least-developed countries must be consistent with, and serve, their development needs and objectives. In adopting the Development Agenda in 2007, the World Intellectual Property Organisation (WIPO) emphasised the importance to developing and least-developed countries of being able to obtain access to knowledge and technology and to participate in collaborations and exchanges with research and scientific institutions in other countries. Access to knowledge, information and technology is crucial if creativity and innovation are to be fostered in developing and least-developed countries. It is particularly important that developing and least-developed countries give effect to their TRIPS obligations by implementing intellectual property systems and adopting intellectual property management practices that enable them to benefit from knowledge flows and support their engagement in international research and science collaborations. However, developing and least-developed countries did not participate in the deliberations leading to the adoption in 2004 by Organisation for Economic Co-operation and Development (OECD) member countries of the Ministerial Declaration on Access to Research Data from Public Funding, nor have they formulated policies on access to publicly funded research outputs such as those developed by the National Institutes of Health in the United States, the United Kingdom Research Councils or the Australian National Health and Medical Research Council. These issues are considered from the viewpoint of Malaysia, a developing country whose economy has grown strongly in recent years. In the absence of an established policy covering access to the outputs of publicly funded research, data sharing and licensing practices in Malaysia remain fragmented. Obtaining access to research data requires arrangements to be negotiated with individual data owners and custodians. Given the potential for restrictions on access to impact negatively on scientific progress and development in Malaysia, measures are required to ensure that access to knowledge and research results is facilitated. This paper proposes a policy framework for Malaysia's public research universities that recognises intellectual property rights while enabling the open access to research data that is essential for innovation and development. It also considers how intellectual property rights in research data can be managed in order to give effect to the policy's open access objectives.

Relevance:

100.00%

Publisher:

Abstract:

The Queensland University of Technology (QUT) Library, like many other academic and research institution libraries in Australia, has been collaborating with a range of academic and service provider partners to develop research data management services and collections. Three main strategies are being employed, and an overview of the process, infrastructure, usage and benefits of each of these service aspects is provided. The development of processes and infrastructure to facilitate the strategic identification and management of QUT-developed datasets has been a major focus. A number of Australian National Data Service (ANDS) sponsored projects - including Seeding the Commons, Metadata Hub/Store, Data Capture and Gold Standard Record Exemplars - have provided, or will provide, QUT with a data registry system, linkages to storage, processes for identifying and describing datasets, and a degree of academic awareness. QUT supports open access and has established a culture for making its research outputs available via the QUT ePrints institutional repository. Incorporating open access research datasets into the library collections is an equally important aspect of facilitating the adoption of data-centric eResearch methods. Some datasets are available commercially, and the library has collaborated with QUT researchers, particularly in the QUT Business School, to identify and procure a rapidly growing range of financial datasets to support research. The library undertakes licensing and uses the Library Resource Allocation to pay for the subscriptions. It is a new area of collection development, with much still to be learned. The final strategy discussed is the library acting as a "data broker": QUT Library has been working with researchers to identify these datasets and to undertake licensing, payment and access as a centrally supported service on behalf of researchers.

Relevance:

100.00%

Publisher:

Abstract:

This research is a step forward in developing a data integration framework for Electronic Health Records. The outcome of the research is a conceptual and logical data warehousing model for integrating Cardiac Surgery electronic data records. The thesis investigates the main obstacles to healthcare data integration and proposes a data warehousing model suitable for integrating fragmented data in a Cardiac Surgery Unit.
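As a purely illustrative aside, conceptual and logical warehouse models of this kind are often realised as a star schema. The sketch below is a minimal, hypothetical example in Python using sqlite3; the table and column names (dim_patient, fact_surgery, bypass_minutes, and so on) are our assumptions for illustration, not the model the thesis actually proposes.

```python
import sqlite3

# Hypothetical star-schema sketch: dimension tables for patients and
# procedures, and a fact table integrating measures that previously lived
# in fragmented source systems.
DDL = """
CREATE TABLE dim_patient (
    patient_key    INTEGER PRIMARY KEY,
    sex            TEXT,
    birth_year     INTEGER
);
CREATE TABLE dim_procedure (
    procedure_key  INTEGER PRIMARY KEY,
    procedure_name TEXT,    -- e.g. 'CABG', 'valve replacement'
    source_system  TEXT     -- which fragmented source system supplied it
);
CREATE TABLE fact_surgery (
    surgery_key    INTEGER PRIMARY KEY,
    patient_key    INTEGER REFERENCES dim_patient(patient_key),
    procedure_key  INTEGER REFERENCES dim_procedure(procedure_key),
    surgery_date   TEXT,
    bypass_minutes REAL,    -- example clinical measure
    outcome_code   TEXT
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(DDL)  # one integrated warehouse view over fragmented data
```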

Relevance:

100.00%

Publisher:

Abstract:

In contemporary game development circles the 'game making jam' has become an important rite of passage and baptism event, an exploration space, and a central indie lifestyle affirmation and community event. Game jams have recently become a focus for design researchers interested in the creative process. In this paper we tell the story of an established local game jam and our various documentation and data collection methods. We present the beginnings of the current project, which seeks to map the creative teams and their process in the space of the challenge, and which aims to enable participants to be more than the objects of the data collection. A perceived issue is that typical documentation approaches are 'about' the event as opposed to 'made by' the participants; they are thus at odds with the spirit of the jam as a phenomenon and do not really access the rich playful potential of participant experience. In the data collection and visualisation projects described here, we focus on using collected data to re-include the participants in telling stories about their experiences of the event as a place-based experience. Our goal is to find a means to encourage the production of 'anecdata' - data based on individual storytelling that is subjective, malleable, and resists collection via formal mechanisms - and to enable mimesis, or active narrating, on the part of the participants. We present a concept design for data as game, based on the logic of early medieval maps, and we reflect on how we could enable participation in the data collection itself.

Relevance:

100.00%

Publisher:

Abstract:

Big Data presents many challenges related to volume, whether one is interested in studying past datasets or, even more problematically, attempting to work with live streams of data. The most obvious challenge, in a 'noisy' environment such as contemporary social media, is to collect the pertinent information: be that information for a specific study, tweets which can inform emergency services or other responders to an ongoing crisis, or information which gives an advantage to those involved in prediction markets. Often, such a process is iterative, with keywords and hashtags changing with the passage of time, and both collection and analytic methodologies need to be continually adapted to respond to this changing information. While many of the data sets collected and analyzed are pre-formed, that is, they are built around a particular keyword, hashtag, or set of authors, they still contain a large volume of information, much of which is unnecessary for the current purpose and/or potentially useful for future projects. Accordingly, this panel considers methods for separating and combining data to optimize big data research and report findings to stakeholders.

The first paper considers possible coding mechanisms for incoming tweets during a crisis, taking a large stream of incoming tweets and selecting those which need to be immediately placed in front of responders for manual filtering and possible action. The paper suggests two solutions: content analysis and user profiling. In the former case, aspects of the tweet are assigned a score to assess its likely relationship to the topic at hand and the urgency of the information, whilst the latter attempts to identify those users who are either serving as amplifiers of information or are known as an authoritative source. Through these techniques, the information contained in a large dataset can be filtered down to match the expected capacity of emergency responders, and knowledge of the core keywords or hashtags relating to the current event is constantly refined for future data collection.

The second paper is also concerned with identifying significant tweets, but in this case tweets relevant to a particular prediction market: tennis betting. As increasing numbers of professional sportsmen and sportswomen create Twitter accounts to communicate with their fans, information is being shared regarding injuries, form and emotions which has the potential to impact on future results. As has already been demonstrated with leading US sports, such information is extremely valuable. Tennis, like American Football (NFL) and Baseball (MLB), has paid subscription services which manually filter incoming news sources, including tweets, for information valuable to gamblers, gambling operators, and fantasy sports players. However, whilst such services are still niche operations, much of the value of information is lost by the time it reaches one of them. The paper thus considers how information could be filtered from Twitter user lists and hashtag or keyword monitoring, assessing the value of the source, the information, and the prediction markets to which it may relate.

The third paper examines methods for collecting Twitter data and following changes in an ongoing, dynamic social movement, such as the Occupy Wall Street movement. It involves the development of technical infrastructure to collect the tweets and make them available for exploration and analysis. A strategy to respond to changes in the social movement is also required, or the resulting tweets will only reflect the discussions and strategies the movement used at the time the keyword list was created; in a way, keyword creation is part strategy and part art. In this paper we describe strategies for the creation of a social media archive, specifically of tweets related to the Occupy Wall Street movement, and methods for continuing to adapt data collection strategies as the movement's presence on Twitter changes over time. We also discuss the opportunities and methods to extract smaller slices of data from an archive of social media data to support a multitude of research projects in multiple fields of study.

The common theme amongst these papers is that of constructing a data set, filtering it for a specific purpose, and then using the resulting information to aid future data collection. The intention is that, through the papers presented and the subsequent discussion, the panel will inform the wider research community not only about the objectives and limitations of data collection, live analytics, and filtering, but also about current and in-development methodologies that could be adopted by those working with such datasets, and how such approaches could be customized depending on the project stakeholders.
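To make the first paper's two solutions concrete, here is a minimal, hypothetical sketch of content analysis combined with user profiling for crisis tweet triage. The keyword lists, weights, account names, and threshold logic are all invented for illustration; they are not the coding mechanisms the paper itself proposes.

```python
# Hypothetical sketch of content analysis plus user profiling for crisis
# tweet triage; all terms, weights, and account names below are assumptions.
TOPIC_TERMS = {"flood": 2.0, "evacuate": 3.0, "trapped": 4.0}
URGENCY_TERMS = {"help": 2.0, "urgent": 3.0, "now": 1.0}
TRUSTED_AUTHORS = {"emergency_service_account"}  # assumed authoritative sources

def score_tweet(text: str, author: str) -> float:
    """Score a tweet for topic relevance and urgency, boosted by authority."""
    low = text.lower()
    relevance = sum(w for term, w in TOPIC_TERMS.items() if term in low)
    urgency = sum(w for term, w in URGENCY_TERMS.items() if term in low)
    authority = 2.0 if author in TRUSTED_AUTHORS else 1.0
    return (relevance + urgency) * authority

def triage(tweets: list[tuple[str, str]], capacity: int) -> list[tuple[str, str]]:
    """Pass on only as many tweets as responders can manually review."""
    ranked = sorted(tweets, key=lambda t: score_tweet(*t), reverse=True)
    return ranked[:capacity]

print(triage([("Family trapped by flood, send help now", "someone"),
              ("Lovely weather today", "someone_else")], capacity=1))
```

The capacity parameter reflects the panel's point that filtering should match the expected review capacity of responders, while the term dictionaries would be the artefacts that get refined between collection cycles.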

Relevance:

100.00%

Publisher:

Abstract:

Environmental monitoring is becoming critical as human activity and climate change place greater pressures on biodiversity, leading to an increasing need for data to make informed decisions. Acoustic sensors can help collect data across large areas for extended periods, making them attractive in environmental monitoring. However, managing and analysing large volumes of environmental acoustic data is a great challenge, and this is consequently hindering the effective utilization of the large datasets collected. This paper presents an overview of our current techniques for collecting, storing and analysing large volumes of acoustic data efficiently, accurately, and cost-effectively.

Relevance:

100.00%

Publisher:

Abstract:

A variety of sustainable development research efforts and related activities are attempting to reconcile the issues of conserving our natural resources without limiting economic motivation, while also improving our social equity and quality of life. Land use/land cover change, occurring on a global scale, is an aggregate of local land use decisions and profoundly impacts our environment. It is therefore the local decision-making process that should be the eventual target of many of the ongoing data collection and research efforts which strive toward supporting a sustainable future. Satellite imagery is a primary source of data upon which to build a core data set for use by researchers in analyzing this global change. A process is necessary to link global change research, utilizing satellite imagery, to the local land use decision-making process. One example of this is the NASA-sponsored Regional Data Center (RDC) prototype. The RDC approach is an attempt to integrate science and technology at the community level. The anticipated result of this complex interaction between the research and decision-making communities will be realized in the form of long-term benefits to the public.

Relevance:

100.00%

Publisher:

Abstract:

Carcinoma ex pleomorphic adenoma (Ca ex PA) is a carcinoma arising from a primary or recurrent benign pleomorphic adenoma. It often poses a diagnostic challenge to clinicians and pathologists. This study reviews the literature and highlights current clinical and molecular perspectives on this entity. The most common clinical presentation of Ca ex PA is a firm mass in the parotid gland. The proportion of adenoma and carcinoma components determines the macroscopic features of this neoplasm. The entity is difficult to diagnose pre-operatively; pathologic assessment is the gold standard for making the diagnosis. Treatment for Ca ex PA often involves an ablative surgical procedure, which may be followed by radiotherapy. Overall, patients with Ca ex PA have a poor prognosis. Accurate diagnosis and aggressive surgical management of patients presenting with Ca ex PA can increase their survival rates. Molecular studies have revealed that the development of Ca ex PA follows a multi-step model of carcinogenesis, with progressive loss of heterozygosity at chromosomal arms 8q, then 12q and finally 17p. There are specific candidate genes in these regions that are associated with particular stages in the progression of Ca ex PA. In addition, many genes which regulate tumour suppression, cell cycle control, growth factors and cell-cell adhesion play a role in the development and progression of Ca ex PA. It is hoped that these molecular data can provide clues for the diagnosis and management of the disease.

Relevance:

100.00%

Publisher:

Abstract:

This paper describes a safety data recording and analysis system that has been developed to capture safety occurrences, including precursors, using high-definition forward-facing video from train cabs and data from other train-borne systems. The paper describes the data processing model and how events detected through data analysis are related to an underlying socio-technical model of accident causation. The integrated approach to safety data recording and analysis ensures that systemic factors which condition, influence or potentially contribute to an occurrence are captured for both safety occurrences and precursor events, providing a rich tapestry of antecedent causal factors that can significantly improve learning around accident causation. This can ultimately benefit railways through the development of targeted and more effective countermeasures, better risk models and more effective use and prioritization of safety funds. Level crossing occurrences are a key focus in this paper, with data analysis scenarios describing causal factors around near-miss occurrences. The paper concludes with a discussion of how the system can also be applied to other types of railway safety occurrences.
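As a hedged illustration of how a detected occurrence might be linked to antecedent causal factors in such a system, consider the sketch below. The factor categories, field names, and example values are our assumptions for illustration, not the schema of the system the paper describes.

```python
# Illustrative sketch only: a record linking a detected safety occurrence to
# socio-technical causal factors; categories and fields are assumptions.
from dataclasses import dataclass, field
from typing import List

CAUSAL_CATEGORIES = {"equipment", "environment", "procedure",
                     "human_factors", "organisational"}

@dataclass
class SafetyOccurrence:
    occurrence_id: str
    kind: str          # e.g. "level_crossing_near_miss"
    video_ref: str     # forward-facing cab video segment
    causal_factors: List[str] = field(default_factory=list)

    def tag(self, category: str, note: str) -> None:
        """Attach an antecedent causal factor under a known category."""
        if category not in CAUSAL_CATEGORIES:
            raise ValueError(f"unknown causal category: {category}")
        self.causal_factors.append(f"{category}: {note}")

event = SafetyOccurrence("OCC-001", "level_crossing_near_miss", "cab_cam_0142.mp4")
event.tag("environment", "low sun angle reduced driver visibility")
event.tag("procedure", "crossing approach speed above advisory")
print(event.causal_factors)
```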

Relevance:

100.00%

Publisher:

Abstract:

PURPOSE: Every health care sector, including hospice/palliative care, needs to systematically improve services using patient-defined outcomes. This study uses data from the national Australian Palliative Care Outcomes Collaboration to determine whether hospice/palliative care patients' outcomes, and the consistency of those outcomes, have improved over the last 3 years.
METHODS: Data were analysed by clinical phase (stable, unstable, deteriorating, terminal). Patient-level data included the Symptom Assessment Scale and the Palliative Care Problem Severity Score. Nationally collected point-of-care data were anchored to the period July-December 2008 and subsequently compared to this baseline across six 6-month reporting cycles for all services that submitted data in every time period (n = 30), using individual longitudinal multi-level random coefficient models.
RESULTS: Data were analysed for 19,747 patients (46% female; 85% cancer; 27,928 episodes of care; 65,463 phases). There were significant improvements across all domains (symptom control, family care, psychological and spiritual care) except pain. Simultaneously, the interquartile ranges decreased, jointly indicating that better and more consistent patient outcomes were being achieved.
CONCLUSION: These are the first national hospice/palliative care symptom control performance data to demonstrate improvements in clinical outcomes at a service level as a result of routine data collection and systematic feedback.
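For readers unfamiliar with the model family named in the methods, the sketch below fits a longitudinal multi-level random coefficient model (random intercept and slope per service) with statsmodels on invented data. The variable names and toy data are assumptions for illustration, not the PCOC dataset or the study's actual analysis code.

```python
# Minimal sketch of a longitudinal multi-level random coefficient model,
# fitted on invented data; not the PCOC analysis itself.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_services, n_cycles = 30, 7          # baseline + six 6-month reporting cycles
rows = []
for s in range(n_services):
    slope = rng.normal(-0.2, 0.05)    # per-service improvement trajectory
    for cycle in range(n_cycles):
        rows.append({"service": s, "cycle": cycle,
                     "symptom_score": 5 + slope * cycle + rng.normal(0, 0.5)})
df = pd.DataFrame(rows)

# Random intercept and random slope for `cycle` within each service.
model = smf.mixedlm("symptom_score ~ cycle", df,
                    groups=df["service"], re_formula="~cycle")
print(model.fit().summary())
```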

Relevance:

100.00%

Publisher:

Abstract:

This research is a step forward in improving the accuracy of detecting anomalies in a data graph representing connectivity between people in an online social network. The proposed hybrid methods are based on fuzzy machine learning techniques utilising different types of structural input features. The methods are presented within a multi-layered framework which provides the full set of requirements needed for finding anomalies in data graphs generated from online social networks, including data modelling and analysis, labelling, and evaluation.
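By way of illustration only, structural input features of the kind the thesis mentions can be computed from a graph as in the sketch below. The feature set, the toy graph, and the crisp z-score flagging rule are our assumptions, standing in for (not reproducing) the fuzzy hybrid methods actually proposed.

```python
# Illustrative structural features for graph anomaly detection; the fuzzy
# membership layer of the thesis would replace the crisp threshold here.
import networkx as nx
import numpy as np

G = nx.barabasi_albert_graph(200, 2, seed=1)    # stand-in for a social graph
G.add_edges_from((0, i) for i in range(1, 60))  # make node 0 structurally odd

features = {
    n: (G.degree(n),
        nx.clustering(G, n),
        nx.average_neighbor_degree(G, nodes=[n])[n])
    for n in G
}

# Flag nodes whose degree is far from the mean as candidate anomalies.
degrees = np.array([f[0] for f in features.values()])
z = (degrees - degrees.mean()) / degrees.std()
anomalies = [n for n, score in zip(features, z) if abs(score) > 3]
print(anomalies)
```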

Relevance:

100.00%

Publisher:

Abstract:

In this chapter, we draw out the relevant themes from a range of critical scholarship in the small body of digital media and software studies work that has focused on the politics of Twitter data and the sociotechnical means by which access is regulated. We highlight in particular the contested relationships between social media research (in both academic and non-academic contexts) and the data wholesale, retail, and analytics industries that feed on it. In the second major section of the chapter we discuss in detail the pragmatic edge of these politics, in terms of what kinds of scientific research are and are not possible in the current political economy of Twitter data access. Finally, at the end of the chapter we return to the much broader implications of these issues for the politics of knowledge, demonstrating how the apparently microscopic question of how the Twitter API mediates access to Twitter data actually inscribes and influences the macro level of the global political economy of science itself, by re-inscribing institutional and traditional disciplinary privilege. We conclude with some speculations about future developments in data rights and data philanthropy that may at least mitigate some of these negative impacts.