818 results for big data
Abstract:
The General Election for the 56th United Kingdom Parliament was held on 7 May 2015. Tweets related to UK politics, not only those with the specific hashtag "#GE2015", were collected between 1 March and 31 May 2015. The resulting dataset contains over 28 million tweets, totalling 118 GB uncompressed or 15 GB compressed. This study describes the method used to collect the tweets, presents some analysis, including a political sentiment index, and outlines interesting research directions on Big Social Data based on Twitter microblogging.
Abstract:
This paper presents an overview of the Mobile Data Challenge (MDC), a large-scale research initiative aimed at generating innovations around smartphone-based research, as well as community-based evaluation of mobile data analysis methodologies. First, we review the Lausanne Data Collection Campaign (LDCC), an initiative to collect a unique longitudinal smartphone dataset for the MDC. Then, we introduce the Open and Dedicated Tracks of the MDC, describe the specific datasets used in each of them, discuss the key design and implementation aspects introduced in order to generate privacy-preserving and scientifically relevant mobile data resources for wider use by the research community, and summarize the main research trends found among the 100+ challenge submissions. We conclude by discussing the main lessons learned from the participation of several hundred researchers worldwide in the MDC Tracks.
Abstract:
Monitoring urban growth and land-use change is an important issue for sustainable infrastructure planning. Rapid urban development, sprawl and increasing population pressure, particularly in developing nations, are resulting in deterioration of infrastructure facilities, loss of productive agricultural lands and open spaces, pollution, health hazards and micro-climatic changes. To address these issues effectively, it is crucial to collect up-to-date and accurate data and to monitor the changing environment at regular intervals. This chapter discusses the role of geospatial technologies in mapping and monitoring the changing environment and urban structure, and their usefulness for sustainable infrastructure planning and provision.
Abstract:
Organisations within the not-for-profit sector provide services to individuals and groups that government and for-profit organisations cannot or will not consider. The not-for-profit sector has come to be a vibrant and rich agglomeration of services and programs that operate under a myriad of philosophical stances, service orientations, client groupings and operational capacities. In Australia these organisations and services provide social support and service assistance to many people in the community, often targeting their assistance to the most difficult of clients. Initially, in undertaking this role, the not-for-profit sector received limited sponsorship from government. Over time governments assumed greater responsibility in the form of service grants to particular groups: ‘the worthy poor’. More recently, they have entered into contractual service agreements with the not-for-profit sector, which specify the nature of the outcomes to be achieved and, to a degree, the way in which the services will be provided. A consequence of this growing shift to a more marketised model of service contracting, often offered under the label of enhanced collaborative practice, has been increased competitiveness between agencies that had previously worked well together (Keast and Brown, 2006). Another trend emerging from the market approach is the entrance of for-profit providers. These larger organisations have higher levels of organisational capacity, with considerable organisational slack that allows them to adopt new service roles. Shaped almost as ‘shadow governments’, they appear to be strongly preferred by governments looking for greater accountability for outcomes and an easier way to control interaction with the conventional not-for-profit sector.
The question is whether governments’ apparent preference for larger organisational arrangements will lead to the demise of the vibrancy of the not-for-profit sector and affect service provision to those people who fall outside the remit of the new service providers. To address this issue, this paper uses information gleaned from a state-wide survey of not-for-profit organisations in Queensland, Australia, which covered organisational size, operational scope, funding arrangements and governance/management approaches. This information is supplemented by qualitative data derived from 17 focus groups and 120 interviews conducted over ten years of study of this sector. The findings contribute to a greater understanding of the practice and theory of the future provision of social services.
Abstract:
It is a big challenge to acquire correct user profiles for personalized text classification, since users may be unsure when describing their interests. Traditional approaches to user profiling adopt machine learning (ML) to automatically discover classification knowledge from explicit user feedback describing personal interests. However, in many cases the accuracy of ML-based methods cannot be significantly improved because of the term independence assumption and the uncertainties associated with them. This paper presents a novel relevance feedback approach for personalized text classification. It applies data mining to discover knowledge from relevant and non-relevant text, and constrains specific knowledge with reasoning rules to eliminate some conflicting information. We also developed a Dempster-Shafer (DS) approach as the means to utilise the specific knowledge to build high-quality data models for classification. Experiments conducted on Reuters Corpus Volume 1 and TREC topics show that the proposed technique achieves encouraging performance in comparison with state-of-the-art relevance feedback models.
Abstract:
Objective: To describe the characteristics of patients presenting to Emergency Departments (EDs) within Queensland, Australia with injuries due to assault with a glass implement (‘glassing’) and to set this within the broader context of presentations due to alcohol-related violence. Methods: Analysis of prospectively collected ED injury surveillance data collated by the Queensland Injury Surveillance Unit (QISU) between 1999 and 2011. Cases of injury due to alcohol-related violence were identified and analysed using coded fields supplemented with qualitative data contained within the injury description text. Descriptive statistics were used to assess the characteristics of injury presentations due to alcohol-related violence. Violence included interpersonal violence and aggression (verbal aggression and object violence). Results: A total of 4629 cases were studied. The study population was predominantly male (72%) and aged 18 to 24 (36%), with males in this age group comprising more than a quarter of the study population (28%). Nine percent of alcohol-related assault injuries were a consequence of ‘glassing’. The home was the most common location for alcohol-related violence (31%) and alcohol-related ‘glassings’ (33%). Overall, the most common glass object involved was a bottle (75%); within licensed venues, however, an even mix of drinking glasses (44%) and glass bottles (45%) was identified. Conclusions: Contrary to the public perception generated by the media, ‘glassing’ incidents, particularly at licensed venues, constitute a relatively small proportion of all alcohol-related violence. The current study highlights the predominance of young men among those injured following alcohol-related violence, identifying a key group within the population at which to aim prevention strategies.
Abstract:
After nearly fifteen years of the open access (OA) movement and its hard-fought struggle for a more open scholarly communication system, publishers are realizing that business models can be both open and profitable. Making journal articles available under an OA license is becoming an accepted strategy for maximizing the value of content to both research communities and the businesses that serve them. The first blog in this two-part series celebrating Data Innovation Day looks at the role that data innovation is playing in the shift to open access for journal articles.
Abstract:
The objective of this chapter is to provide an overview of traffic data collection that can and should be used for the calibration and validation of traffic simulation models. There are big differences in availability of data from different sources. Some types of data such as loop detector data are widely available and used. Some can be measured with additional effort, for example, travel time data from GPS probe vehicles. Some types such as trajectory data are available only in rare situations such as research projects.
Abstract:
Reality television, alongside shows such as Q&A – which may be Reality TV in all but name – frequently drives social media conversations about the Australian television industry. Big Brother, currently screening on Channel 9, is consistently among the shows with the highest levels of chatter in that regard. The precise Facebook data is hard to quantify but the Official Big Brother page boasts 805,400 likes and more than 59,000 comments since the start of the series, suggesting it has established a firm presence on that platform too...
Abstract:
Magnetic behavior of soils can seriously hamper the performance of geophysical sensors. Currently, we have little understanding of the types of minerals responsible for the magnetic behavior, as well as their distribution in space and evolution through time. This study investigated the magnetic characteristics and mineralogy of Fe-rich soils developed on basaltic substrate in Hawaii. We measured the spatial distribution of magnetic susceptibility (χlf) and frequency dependence (χfd%) across three test areas in a well-developed eroded soil on Kaho'olawe and in two young soils on the Big Island of Hawaii. X-ray diffraction spectroscopy, X-ray fluorescence spectroscopy (XRF), chemical dissolution, thermal analysis, and temperature-dependent magnetic studies were used to characterize soil development and mineralogy for samples from soil pits on Kaho'olawe, surface samples from all three test areas, and unweathered basalt from the Big Island of Hawaii. The measurements show a general increase in magnetic properties with increasing soil development. The XRF Fe data ranged from 13% for fresh basalt and young soils on the Big Island to 58% for material from the B horizon of Kaho'olawe soils. Dithionite-extractable and oxalate-extractable Fe percentages increase with soil development and correlate with χlf and χfd%, respectively. Results from the temperature-dependent susceptibility measurements show that the high soil magnetic properties observed in geophysical surveys on Kaho'olawe are entirely due to neoformed minerals. The results of our studies have implications for the existing soil survey of Kaho'olawe and help identify methods to characterize magnetic minerals in tropical soils.
Abstract:
Large volumes of heterogeneous health data held in silos pose a big challenge when exploring for information to support evidence-based decision making and ensure quality outcomes. In this paper, we present a proof of concept for adopting data warehousing technology to aggregate and analyse disparate health data in order to understand the impact of various lifestyle factors on obesity. We present a practical model for data warehousing, with a detailed explanation, which can be adopted similarly for studying various other health issues.
Abstract:
Over recent decades, Australian piggeries have commonly employed anaerobic ponds to treat effluent to a standard suitable for recycling for shed flushing purposes and for irrigation onto nearby agricultural land. Anaerobic ponds are generally sized according to the Rational Design Standard (RDS) developed by Barth (1985), resulting in large ponds, which can be expensive to construct, occupy large land areas, and are difficult and expensive to desludge, potentially disrupting the whole piggery operation. Limited anecdotal and scientific evidence suggests that anaerobic ponds that are undersized according to the RDS operate satisfactorily, without excessive odour emission, impaired biological function or high rates of solids accumulation. Based on these observations, this paper questions the validity of rigidly applying the principles of the RDS and presents a number of alternative design approaches resulting in smaller, more highly loaded ponds that are easier and cheaper to construct and manage. Based on limited data on pond odour emission, it is suggested that higher pond loading rates may reduce overall odour emission by decreasing the pond volume and surface area. Other management options that could be implemented to reduce pond volumes include permeable pond covers, various solids separation methods, and bio-digesters with impermeable covers, used in conjunction with biofilters and/or systems designed for biogas recovery. To ensure that new effluent management options are accepted by regulatory authorities, it is important for researchers to address both industry and regulator concerns and uncertainties regarding new technology, and to demonstrate, beyond reasonable doubt, that new technologies do not increase the risk of adverse impacts on the environment or community amenity.
Further development of raw research outcomes to produce relatively simple, practical guidelines and implementation tools also increases the potential for acceptance and implementation of new technology by regulators and industry.
Abstract:
We share our experience in planning, designing and deploying a wireless sensor network covering an area of one square kilometre. Environmental data such as soil moisture, temperature, barometric pressure, and relative humidity are collected in this area, situated in the semi-arid region of Karnataka, India. We hope that information derived from this data will help the marginal farmer improve his farming practices. After establishing the need for such a project, we show the big picture of such a data gathering network, the software architecture we have used, the range measurements needed for determining the sensor density, and the packaging issues that seem to play a crucial role in field deployments. Our field deployment experiences include designing for intermittent grid power, enhancing software tools to aid quicker and more effective deployment, and dealing with flash memory corruption. The first results on data gathering look encouraging.