98 results for data types and operators


Relevance:

100.00%

Abstract:

Online social networks make it easier for people to find and communicate with other people based on shared interests, values, membership in particular groups, etc. Common social networks such as Facebook and Twitter have hundreds of millions or even billions of users scattered all around the world sharing interconnected data. Users demand low-latency access not only to their own data but also to their friends' data, which is often very large (e.g. videos and pictures). However, social network service providers have limited monetary capital and cannot store every piece of data everywhere to minimise users' data access latency. Geo-distributed cloud services with virtually unlimited capabilities are suitable for storing large-scale social network data in different geographical locations. This paper addresses key problems including how to optimally store and replicate these huge datasets and how to distribute requests to different datacenters. A novel genetic algorithm-based approach is used to find a near-optimal number of replicas for every user's data and a near-optimal placement of those replicas, minimising monetary cost while satisfying latency requirements for all users. Experiments on a large Facebook dataset demonstrate our technique's effectiveness in outperforming other representative placement and replication strategies.
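The abstract above describes, at a high level, a genetic algorithm searching over replica counts and placements. As a rough illustration of that idea (not the authors' implementation), the following sketch encodes a placement as a user-by-datacenter boolean matrix and evolves it against a hypothetical cost model with a latency penalty; all parameters, costs and the SLA threshold are invented.

```python
# Illustrative sketch only: a toy genetic algorithm for replica placement.
# The cost model, latency matrix and parameters are hypothetical.
import random

N_USERS, N_DCS = 20, 5
STORAGE_COST = [1.0, 1.2, 0.8, 1.5, 0.9]            # per-replica cost per datacenter
LATENCY = [[random.uniform(20, 200) for _ in range(N_DCS)] for _ in range(N_USERS)]
LATENCY_SLA = 100.0                                  # ms; each user needs a replica within this

def fitness(placement):
    """Lower is better: total storage cost plus a penalty per SLA violation."""
    cost = sum(STORAGE_COST[d]
               for u in range(N_USERS) for d in range(N_DCS) if placement[u][d])
    for u in range(N_USERS):
        replicas = [d for d in range(N_DCS) if placement[u][d]]
        if not replicas or min(LATENCY[u][d] for d in replicas) > LATENCY_SLA:
            cost += 1000.0                           # heavy penalty: user unserved in time
    return cost

def random_individual():
    return [[random.random() < 0.3 for _ in range(N_DCS)] for _ in range(N_USERS)]

def crossover(a, b):
    """Uniform crossover over users: the child copies each user's row from one parent."""
    return [list(ra) if random.random() < 0.5 else list(rb) for ra, rb in zip(a, b)]

def mutate(ind, rate=0.02):
    for row in ind:
        for d in range(N_DCS):
            if random.random() < rate:
                row[d] = not row[d]

population = [random_individual() for _ in range(40)]
for generation in range(200):
    population.sort(key=fitness)
    survivors = population[:10]                      # elitist truncation selection
    children = [crossover(random.choice(survivors), random.choice(survivors))
                for _ in range(30)]
    for child in children:
        mutate(child)
    population = survivors + children

best = min(population, key=fitness)
print("best cost:", fitness(best))
```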

Relevance:

100.00%

Abstract:

With emerging trends for Internet of Things (IoT) and Smart Cities, complex data transformation, aggregation and visualization problems are becoming increasingly common. These tasks support improved business intelligence, analytics and end-user access to data. However, in most cases developers of these tasks are presented with challenging problems including noisy data, diverse data formats, data modeling and increasing demand for sophisticated visualization support. This paper describes our experiences with just such problems in the context of Household Travel Survey data integration and harmonization. We describe a common approach for addressing these harmonization problems. We then discuss a set of lessons learned from our experience that we hope will be useful to others embarking on similar problems. We also identify several key directions and needs for future research and practical support in this area.
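As a rough illustration of the kind of harmonization the abstract describes (mapping diverse survey formats onto a common schema), here is a minimal sketch; the column names, mode vocabulary and unit conversions are invented for illustration and are not taken from the paper.

```python
# Illustrative sketch: harmonizing two hypothetical travel-survey extracts
# into a common schema. All column names and codes are invented.
import pandas as pd

survey_a = pd.DataFrame({"resp_id": [1, 2], "travel_mode": ["CAR", "BUS"],
                         "trip_km": [12.5, 3.0]})
survey_b = pd.DataFrame({"person": [7, 8], "mode_cd": [1, 3],
                         "dist_miles": [4.2, 0.9]})

MODE_MAP_B = {1: "car", 2: "walk", 3: "bus"}        # survey B codes -> shared vocabulary

def harmonize_a(df):
    return pd.DataFrame({"respondent_id": df["resp_id"],
                         "mode": df["travel_mode"].str.lower(),
                         "distance_km": df["trip_km"]})

def harmonize_b(df):
    return pd.DataFrame({"respondent_id": df["person"],
                         "mode": df["mode_cd"].map(MODE_MAP_B),
                         "distance_km": df["dist_miles"] * 1.60934})  # miles -> km

combined = pd.concat([harmonize_a(survey_a), harmonize_b(survey_b)], ignore_index=True)
print(combined)
```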

Relevance:

100.00%

Abstract:

The complex mutualistic relationship between the cleaner fish (Labroides dimidiatus) and their 'clients' in many reef systems throughout the world has been the subject of debate and research interest for decades. Game-theory models have long struggled to explain how the mixed strategies of cheating and honesty might have evolved in such a system, and while significant efforts have been made theoretically, demonstrating the nature of this relationship empirically remains an important research challenge. Using the experimental framework of behavioural syndromes, we sought to quantitatively assess the relationship between personality and the feeding ecology of cleaner fish to provide novel insights into the underlying mechanistic basis of cheating in cleaner-client interactions. First, we observed and filmed cleaner fish interactions with heterospecifics, movement patterns and general feeding ecology in the wild. We then captured and measured all focal individuals and tested them for individual consistency in measures of activity, exploration and risk taking (boldness) in the laboratory. Our results suggest that a syndrome incorporating aspects of personality and foraging effort is a central component of the behavioural ecology of L. dimidiatus on the Great Barrier Reef. We found that individuals that exhibited greater feeding effort tended to cheat proportionately less and move over smaller distances relative to bolder, more active and exploratory individuals. Our study demonstrates for the first time that individual differences in personality might be mechanistically involved in explaining how the mixed strategies of cheating and honesty persist in cleaner fish mutualisms.

Relevance:

100.00%

Abstract:

Recommendation systems adopt various techniques to recommend ranked lists of items, helping users identify the items that best fit their personal tastes. Among the various recommendation algorithms, user- and item-based collaborative filtering methods have been very successful in both industry and academia. More recently, the rapid growth of the Internet and e-commerce applications has created great challenges for recommendation systems, as the number of users and the amount of available online information grow ever faster. These challenges include producing high-quality recommendations per second for millions of users and items, achieving high coverage under data sparsity, and increasing the scalability of recommendation systems. To obtain higher-quality recommendations under data sparsity, in this paper we propose a novel method to compute the similarity of different users based on side information, beyond user-item rating information, drawn from various online recommendation and review sites. Furthermore, we take users' special interests into consideration and combine three types of information (users, items, user-items) to predict the ratings of items. We then propose FUIR, a novel recommendation algorithm that fuses user and item information, to generate recommendation results for target users. We evaluate FUIR on three data sets, and the experimental results demonstrate that it is effective on sparse rating data and can produce higher-quality recommendations.
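FUIR's exact fusion of user, item and side information is defined in the paper itself. Purely as an illustration of the general idea of augmenting sparse rating-based similarity with side information, here is a minimal sketch; the blending weight and the toy data are assumptions.

```python
# Illustrative sketch: augmenting rating-based user similarity with side
# information (e.g. tags or demographic attributes). The weighting scheme
# is an assumption for illustration, not the FUIR algorithm itself.
import numpy as np

def cosine(u, v):
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom else 0.0

def user_similarity(ratings_u, ratings_v, side_u, side_v, alpha=0.7):
    """Blend rating similarity with side-information similarity.

    Only co-rated items contribute to the rating term, which keeps the
    measure meaningful when the rating matrix is sparse.
    """
    co_rated = (ratings_u > 0) & (ratings_v > 0)
    rating_sim = cosine(ratings_u[co_rated], ratings_v[co_rated]) if co_rated.any() else 0.0
    side_sim = cosine(side_u, side_v)
    return alpha * rating_sim + (1 - alpha) * side_sim

# Toy example: 5 items, 3 binary side features (e.g. genre preferences).
u_ratings = np.array([5, 0, 3, 0, 1], dtype=float)
v_ratings = np.array([4, 2, 0, 0, 1], dtype=float)
u_side = np.array([1, 0, 1], dtype=float)
v_side = np.array([1, 1, 1], dtype=float)
print(user_similarity(u_ratings, v_ratings, u_side, v_side))
```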

Relevance:

100.00%

Abstract:

Evolutionary algorithms (EAs) have recently been suggested as candidates for solving big data optimisation problems that involve a very large number of variables and need to be analysed in a short period of time. However, EAs face scalability issues when dealing with big data problems. Moreover, the performance of EAs critically hinges on the parameter values and operator types used, so it is impossible to design a single EA that outperforms all others on every problem instance. To address these challenges, we propose a heterogeneous framework that integrates a cooperative co-evolution method with various types of memetic algorithms. We use the cooperative co-evolution method to split the big problem into sub-problems in order to increase the efficiency of the solving process. The sub-problems are then solved using various heterogeneous memetic algorithms. The proposed framework adaptively assigns, for each solution, different operators, parameter values and a local search algorithm to efficiently explore and exploit the search space of the given problem instance. The performance of the proposed algorithm is assessed using the Big Data 2015 competition benchmark problems, which contain data with and without noise. Experimental results demonstrate that the proposed algorithm performs better with the cooperative co-evolution method than without it. Furthermore, it obtained very competitive results, if not better, for all tested instances when compared with other algorithms, while using lower computational time.
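As a rough illustration of the cooperative co-evolution step described above (splitting a high-dimensional problem into variable groups optimised against a shared context vector), here is a minimal sketch; the toy objective and the mutation-based inner optimiser stand in for the paper's heterogeneous memetic algorithms.

```python
# Illustrative sketch of cooperative co-evolution: split a high-dimensional
# problem into variable groups and optimise each group against a shared
# context vector. Objective, grouping and inner loop are hypothetical.
import random

DIM, GROUPS = 100, 10

def objective(x):
    return sum(v * v for v in x)            # toy separable objective (sphere)

groups = [list(range(g, DIM, GROUPS)) for g in range(GROUPS)]
context = [random.uniform(-5, 5) for _ in range(DIM)]  # best-known full solution

for cycle in range(50):
    for idx in groups:
        # Optimise only this group's variables, holding the rest of the
        # context vector fixed (a simple mutation-based inner loop).
        for _ in range(20):
            trial = context[:]
            for i in idx:
                trial[i] += random.gauss(0, 0.5)
            if objective(trial) < objective(context):
                context = trial

print("final objective:", objective(context))
```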

Relevance:

100.00%

Abstract:

This paper reports on the development of the Humanities Networked Infrastructure (HuNI), a service which aggregates data from thirty Australian data sources and makes them available for use by researchers across the humanities and creative arts, and more widely by the general public. We discuss the methods used by HuNI to aggregate data, as well as the conceptual framework which has shaped the design of HuNI’s Data Model around six core entity types. Two of the key functions available to users of HuNI – building collections and creating links – are discussed, together with their design rationale.
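Purely as a hypothetical illustration of an aggregation model built around a small set of core entity types, with user-created collections and links of the kind the abstract mentions, here is a sketch; the entity-type names and fields are assumptions, not HuNI's actual schema.

```python
# Hypothetical sketch of an aggregated data model with six core entity
# types and user-created links, in the spirit of the HuNI design above.
# Entity-type names and fields are illustrative assumptions.
from dataclasses import dataclass, field

ENTITY_TYPES = {"person", "organisation", "place", "event", "work", "concept"}

@dataclass
class Entity:
    entity_id: str
    entity_type: str          # one of ENTITY_TYPES
    label: str
    source: str               # contributing data source

    def __post_init__(self):
        if self.entity_type not in ENTITY_TYPES:
            raise ValueError(f"unknown entity type: {self.entity_type}")

@dataclass
class Link:
    """A user-asserted relationship between two entities."""
    from_id: str
    to_id: str
    relation: str             # free-text or vocabulary term
    created_by: str

@dataclass
class Collection:
    """A user-curated set of entity ids."""
    name: str
    owner: str
    members: set = field(default_factory=set)

florey = Entity("p1", "person", "Howard Florey", "sourceA")
melbourne = Entity("pl1", "place", "Melbourne", "sourceB")
link = Link("p1", "pl1", "born_in", "researcher42")
coll = Collection("Australian scientists", "researcher42", {"p1"})
```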

Relevance:

100.00%

Abstract:

Research on the health and wellbeing benefits of contact with animals and plants indicates that the natural environment may have significant positive psychological and physiological effects on human health and wellbeing. Studies have demonstrated that children function better cognitively and emotionally in 'green' environments and engage in more creative play there. In Australia, as well as internationally, many schools appear to be incorporating nature-based activities into their curricula, mostly via sustainability education. Although these programs appear to be successful, few have been evaluated, particularly in terms of the potential benefits to health and wellbeing. This paper reports on a pilot survey investigating the mental health benefits of contact with nature for primary school children in Melbourne, Australia. A survey of principals and teachers was conducted in urban primary schools within a 20 km radius of Melbourne. As well as gathering data on the types and extent of environmental and other nature-based activities in the sample schools, the survey included items addressing principals' and teachers' perceptions of the potential effects of these activities on children's mental health and wellbeing. Despite a lower-than-expected response rate, some interesting findings emerged. Although preliminary, the results indicate that participants' perceptions of the benefits to mental health and wellbeing from participation in hands-on nature-based activities at their school are positive and encompass many aspects of mental health.

Relevance:

100.00%

Abstract:

Automating software engineering has been the dream of software engineers for decades. To make this dream come true, data mining can play an important role. Our recent research has shown that to increase productivity and reduce the cost of software development, it is essential to have an effective and efficient mechanism to store, manage and utilize existing software resources, and thus to automate software analysis, testing and evaluation and to make use of existing software for new problems. This paper firstly provides a brief overview of traditional data mining, followed by a presentation on data mining in a broader sense. Secondly, it presents the idea and technology of the software warehouse as an innovative approach to managing software resources, borrowing from the idea of the data warehouse: software assets are systematically accumulated, deposited, retrieved, packaged, managed and utilized, driven by data mining and OLAP technologies. Thirdly, we present the concepts and technologies of data mining and the data matrix, including the software warehouse, and their applications to software engineering. Perspectives on the role of the software warehouse and software mining in modern software development are addressed. We expect that the results will lead to a streamlined, highly efficient software development process and enhanced productivity in response to the modern challenges of designing and developing software applications.

Relevance:

100.00%

Abstract:

This paper conducts a productivity and efficiency analysis of banks operating in Australia since the deregulation of the Australian financial system in the early 1980s. Applying data envelopment analysis (DEA) with a moving window, Malmquist indices are determined in order to investigate the levels of, and changes in, the efficiency of Australian banks over the period from 1983 to 2001. The DEA window analysis is adopted to relieve the small-sample problem that has proved problematic in previous studies of the Australian banking sector. The particular window used in this case has been carefully designed to ensure the robustness of the efficiency scores to changes in the window width. A second-stage regression is conducted using the unconditional bootstrap approach suggested by Xue and Harker (1999) to overcome the dependency and heteroskedasticity of DEA efficiency scores. The empirical results demonstrate the effect of deregulation on the performance of individual banks, banks of different organizational types, and the entire Australian banking sector.
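For readers unfamiliar with DEA, the following sketch computes input-oriented CCR efficiency scores by linear programming on invented bank data; the paper's window analysis, Malmquist decomposition and bootstrap regression are not reproduced here.

```python
# Illustrative sketch: input-oriented CCR DEA efficiency scores via linear
# programming. The bank input/output data below are invented.
import numpy as np
from scipy.optimize import linprog

# rows = DMUs (banks); columns = inputs / outputs
X = np.array([[20.0, 300], [15.0, 200], [30.0, 500], [25.0, 350]])  # inputs: staff, assets
Y = np.array([[100.0], [80.0], [160.0], [110.0]])                   # output: loans

def ccr_efficiency(o):
    """Efficiency of DMU o: min theta such that a lambda-weighted composite
    of all DMUs uses no more than theta of o's inputs and produces at least
    o's outputs. Decision variables: [theta, lambda_1 .. lambda_n]."""
    n = X.shape[0]
    c = np.r_[1.0, np.zeros(n)]                    # minimise theta
    A_in = np.c_[-X[o], X.T]                       # X^T lambda - theta * x_o <= 0
    A_out = np.c_[np.zeros(Y.shape[1]), -Y.T]      # -Y^T lambda <= -y_o
    A_ub = np.vstack([A_in, A_out])
    b_ub = np.r_[np.zeros(X.shape[1]), -Y[o]]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] + [(0, None)] * n, method="highs")
    return res.fun

for o in range(X.shape[0]):
    print(f"bank {o}: efficiency = {ccr_efficiency(o):.3f}")
```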

Relevance:

100.00%

Abstract:

Past research has identified differences between online- and mail-collected responses to the same survey, but differences in the demographics of respondents have also been noted, making the cause of the variation unclear. In the research reported here, responses to the same questionnaire, delivered via mail and internet surveys, were demographically matched across a range of variables. This removed the impact of response differences caused by age, gender, type of product consumed and length of customer relationship. Across all the different question types and response scales, significant differences were still found between mail and online respondents, even when data were ipsatised. Notably, online respondents were far less likely to use the end-points of the scale, perhaps indicating issues with the online collection methodology. The conclusion is that the two methods of data collection cannot be assumed to be directly interchangeable, and that the method used can lead to different results if not managed carefully.
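Ipsatisation, mentioned above, is commonly implemented as within-respondent centring and scaling, so that comparisons are not driven by individual response-style offsets; a minimal sketch on invented data:

```python
# Illustrative sketch of ipsatising survey responses: centre (and scale)
# each respondent's answers by that respondent's own mean and standard
# deviation. The ratings below are invented.
import numpy as np

responses = np.array([[5, 4, 5, 3],     # each row: one respondent's ratings
                      [2, 1, 2, 1],
                      [4, 4, 3, 5]], dtype=float)

row_means = responses.mean(axis=1, keepdims=True)
row_stds = responses.std(axis=1, keepdims=True)
ipsatised = (responses - row_means) / np.where(row_stds == 0, 1, row_stds)
print(ipsatised)
```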

Relevance:

100.00%

Abstract:

There has been much debate over the optimal format for scales, particularly in regard to two key issues: the labelling of points and the overall length of response scales. This paper reviews the evidence regarding the advantages of different scale types and lengths, and provides guidance as to which scale types suit different research objectives. Using a direct comparison of 400 responses on 5-point and 11-point scales to the same question, by the same people, we examine some of the important differences previously found and then illustrate the impact they have on data quality and usability. Our conclusion, based on past research and our own analysis, is that longer, balanced and unlabelled scales offer the maximum flexibility and reliability in the majority of cases.
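One simple way to compare paired responses collected on scales of different lengths (assumed here purely for illustration; the paper's own analysis may differ) is to map both onto a common 0-1 range:

```python
# Illustrative sketch: putting paired 5-point and 11-point responses onto
# a common 0-1 range. The mapping and sample data are assumptions.
import numpy as np

def rescale(x, k):
    """Linearly map a response on a 1..k scale to [0, 1]."""
    return (np.asarray(x, dtype=float) - 1) / (k - 1)

five_point = [4, 2, 5, 3, 1]          # same respondents, same question
eleven_point = [9, 3, 11, 6, 1]

a, b = rescale(five_point, 5), rescale(eleven_point, 11)
print("mean difference:", (a - b).mean())
print("correlation:", np.corrcoef(a, b)[0, 1])
```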

Relevance:

100.00%

Abstract:

Aim. This paper reports a study to determine nurses' levels of agreement using a standard 5-point triage scale and to explore the influence of task properties and subjectivity on decision-making consistency.

Background. Triage scales are used to define time-to-treatment in hospital emergency departments. Studies of the inter-rater reliability of these scales using paper-based simulation methods report varying levels of consistency. Understanding how various components of the decision task and individual perceptions of the case influence agreement is critical to the development of strategies to improve consistency of triage.

Method. Simulations were constructed from naturalistic observation, and cue types and frequencies were classified. Data collection was conducted in 2002, and the final response rate was 41·3%. Participants were asked to allocate an urgency code to each of 12 scenarios using the Australasian Triage Scale, and to provide estimates of case complexity, levels of certainty and available information. Data were analysed descriptively, and agreement between raters was calculated using kappa. The influence of task properties and of participants' subjective estimates of case complexity, certainty and available information on agreement was explored using a general linear model.

Findings. Agreement among raters varied from moderate to poor (κ = 0·18–0·64). Participants' subjective estimates of the level of available information were found to influence consistency of triage by a statistically significant amount (F = 5·68; p ≤ 0·01).

Conclusions. Strategies employed to optimize consistency of triage should focus on improving the quality of the simulations that are used. In particular, attention should be paid to the development of interactive simulations that will accommodate individual differences in information-seeking behaviour.
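For reference, inter-rater agreement of the kind reported above (κ = 0·18–0·64) can be computed, for a pair of raters, with Cohen's kappa; a minimal sketch on invented triage ratings:

```python
# Illustrative sketch: Cohen's kappa for agreement between two raters
# assigning triage categories. The ratings below are invented; the study
# itself used multiple raters and reported kappa in the range 0.18-0.64.
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Two raters assigning Australasian Triage Scale categories (1-5) to 10 cases.
a = [1, 2, 2, 3, 3, 4, 4, 5, 5, 2]
b = [1, 2, 3, 3, 2, 4, 4, 5, 4, 2]
print(f"kappa = {cohens_kappa(a, b):.2f}")   # approx 0.62 for these data
```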


Relevance:

100.00%

Abstract:

Microarray data classification is one of the most important emerging clinical applications in the medical community, and machine learning algorithms are most frequently used to complete this task. We selected one of the state-of-the-art kernel-based algorithms, the support vector machine (SVM), to classify microarray data. As a large number of kernels are available, a significant research question is: which kernel is best for patient diagnosis based on microarray data classification using SVM? We first suggest three solutions based on data visualization and quantitative measures. The proposed solutions are then tested on different types of microarray problems. Finally, we found that the rule-based approach is most useful for automatic kernel selection for SVM when classifying microarray data.
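The paper's solutions are based on data visualization and quantitative measures; a simpler, purely empirical baseline is to compare kernels by cross-validation. A minimal sketch on synthetic microarray-like data (not the paper's method or data):

```python
# Illustrative sketch: comparing SVM kernels by cross-validated accuracy.
# Synthetic data stands in for a real microarray data set.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Microarray-like shape: few samples, many features.
X, y = make_classification(n_samples=80, n_features=2000, n_informative=20,
                           random_state=0)

for kernel in ("linear", "rbf", "poly", "sigmoid"):
    model = make_pipeline(StandardScaler(), SVC(kernel=kernel))
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{kernel:8s} accuracy = {scores.mean():.3f} +/- {scores.std():.3f}")
```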

Relevance:

100.00%

Abstract:

Background:  Whether calcium supplementation can reduce osteoporotic fractures is uncertain. We did a meta-analysis to include all the randomised trials in which calcium, or calcium in combination with vitamin D, was used to prevent fracture and osteoporotic bone loss.

Methods:  We identified 29 randomised trials (n=63 897) using electronic databases, supplemented by a hand-search of reference lists, review articles, and conference abstracts. All randomised trials that recruited people aged 50 years or older were eligible. The main outcomes were fractures of all types and percentage change in bone-mineral density from baseline. Data were pooled by use of a random-effects model.

Findings:  In trials that reported fracture as an outcome (17 trials, n=52 625), treatment was associated with a 12% risk reduction in fractures of all types (risk ratio 0·88, 95% CI 0·83–0·95; p=0·0004). In trials that reported bone-mineral density as an outcome (23 trials, n=41 419), the treatment was associated with a reduced rate of bone loss of 0·54% (0·35–0·73; p<0·0001) at the hip and 1·19% (0·76–1·61%; p<0·0001) in the spine. The fracture risk reduction was significantly greater (24%) in trials in which the compliance rate was high (p<0·0001). The treatment effect was better with calcium doses of 1200 mg or more than with doses less than 1200 mg (0·80 vs 0·94; p=0·006), and with vitamin D doses of 800 IU or more than with doses less than 800 IU (0·84 vs 0·87; p=0·03).

Interpretation:  Evidence supports the use of calcium, or calcium in combination with vitamin D supplementation, in the preventive treatment of osteoporosis in people aged 50 years or older. For best therapeutic effect, we recommend minimum doses of 1200 mg of calcium, and 800 IU of vitamin D (for combined calcium plus vitamin D supplementation).
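For reference, random-effects pooling of risk ratios of the kind used above is often implemented with the DerSimonian-Laird estimator; a minimal sketch on invented trial data (not the 29 trials from this meta-analysis):

```python
# Illustrative sketch: DerSimonian-Laird random-effects pooling of log risk
# ratios. The trial effect sizes and variances below are invented.
import math

# (log risk ratio, variance of log risk ratio) for each hypothetical trial
trials = [(-0.15, 0.010), (-0.05, 0.020), (-0.20, 0.015), (-0.10, 0.008)]

theta = [t for t, _ in trials]
v = [s for _, s in trials]
w = [1 / vi for vi in v]                             # fixed-effect weights

fixed = sum(wi * ti for wi, ti in zip(w, theta)) / sum(w)
Q = sum(wi * (ti - fixed) ** 2 for wi, ti in zip(w, theta))
k = len(trials)
# DerSimonian-Laird between-trial variance estimate
tau2 = max(0.0, (Q - (k - 1)) / (sum(w) - sum(wi ** 2 for wi in w) / sum(w)))

w_star = [1 / (vi + tau2) for vi in v]               # random-effects weights
pooled = sum(wi * ti for wi, ti in zip(w_star, theta)) / sum(w_star)
se = math.sqrt(1 / sum(w_star))

rr = math.exp(pooled)
ci = (math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se))
print(f"pooled RR = {rr:.2f} (95% CI {ci[0]:.2f}-{ci[1]:.2f}), tau^2 = {tau2:.4f}")
```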