895 results for Data linkage


Relevance:

20.00%

Publisher:

Abstract:

This research is a step forward in improving the accuracy of anomaly detection in a data graph representing connectivity between people in an online social network. The proposed hybrid methods are based on fuzzy machine learning techniques utilising different types of structural input features. The methods are presented within a multi-layered framework that covers the full set of requirements for finding anomalies in data graphs generated from online social networks, including data modelling and analysis, labelling, and evaluation.

Relevance:

20.00%

Publisher:

Abstract:

In this paper, we summarize our recent work in analyzing and predicting behaviors in sports using spatiotemporal data. We specifically focus on two recent works: 1) predicting the location of a shot in tennis using Hawk-Eye tennis data, and 2) clustering spatiotemporal plays in soccer from a professional league to discover the ways in which teams get a shot on goal.

Relevance:

20.00%

Publisher:

Abstract:

This paper presents a single-pass algorithm for mining discriminative itemsets in data streams using a novel data structure and the tilted-time window model. Discriminative itemsets are defined as itemsets that are frequent in one data stream and whose frequency in that stream is much higher than in the rest of the streams in the dataset. To keep the size of the data structure manageable, we propose a pruning process that results in a compact tree structure containing the discriminative itemsets. Empirical analysis shows that the proposed method has sound time and space complexity.
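The definition of a discriminative itemset given above can be illustrated with a small brute-force sketch. This is not the single-pass tree algorithm the abstract describes; it is a batch illustration of the definition only. The function names, the `min_support` and `min_ratio` thresholds, and the cap on itemset size are all assumptions introduced here for illustration.

```python
from collections import Counter
from itertools import combinations


def itemset_counts(transactions, max_size=2):
    """Count every itemset of up to max_size items across the transactions."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for k in range(1, max_size + 1):
            counts.update(combinations(items, k))
    return counts


def discriminative_itemsets(target, rest, min_support=0.5, min_ratio=2.0, max_size=2):
    """Return itemsets frequent in `target` whose relative frequency there is
    at least `min_ratio` times their relative frequency in `rest`.

    Both thresholds are hypothetical parameters, not values from the paper.
    """
    target_counts = itemset_counts(target, max_size)
    rest_counts = itemset_counts(rest, max_size)
    result = {}
    for itemset, count in target_counts.items():
        f_target = count / len(target)
        if f_target < min_support:
            continue  # not frequent enough in the target stream
        f_rest = rest_counts.get(itemset, 0) / len(rest) if rest else 0.0
        # Itemsets absent from the other streams get an infinite ratio.
        ratio = f_target / f_rest if f_rest > 0 else float("inf")
        if ratio >= min_ratio:
            result[itemset] = ratio
    return result


target_stream = [["a", "b"], ["a", "b", "c"], ["a", "c"], ["a", "b"]]
other_streams = [["a"], ["c"], ["a", "c"], ["b", "c"]]
found = discriminative_itemsets(target_stream, other_streams)
```

Here `("c",)` is frequent in the target stream but even more frequent elsewhere, so it is excluded, while `("a", "b")` never occurs in the other streams and is kept.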

Relevance:

20.00%

Publisher:

Abstract:

Problem addressed: Wrist-worn accelerometers are associated with greater compliance. However, validated algorithms for predicting activity type from wrist-worn accelerometer data are lacking. This study compared the activity recognition rates of an activity classifier trained on acceleration signals collected on the wrist and hip. Methodology: 52 children and adolescents (mean age 13.7 ± 3.1 years) completed 12 activity trials that were categorized into 7 activity classes: lying down, sitting, standing, walking, running, basketball, and dancing. During each trial, participants wore an ActiGraph GT3X+ tri-axial accelerometer on the right hip and the non-dominant wrist. Features were extracted from 10-s windows and entered into a regularized logistic regression model using R (glmnet, L1 penalty). Results: Classification accuracy for the hip and wrist was 91.0% ± 3.1% and 88.4% ± 3.0%, respectively. The hip model exhibited excellent classification accuracy for sitting (91.3%), standing (95.8%), walking (95.8%), and running (96.8%); acceptable classification accuracy for lying down (88.3%) and basketball (81.9%); and modest accuracy for dance (64.1%). The wrist model exhibited excellent classification accuracy for sitting (93.0%), standing (91.7%), and walking (95.8%); acceptable classification accuracy for basketball (86.0%); and modest accuracy for running (78.8%), lying down (74.6%) and dance (69.4%). Potential impact: Both the hip and wrist algorithms achieved acceptable classification accuracy, allowing researchers to use either placement for activity recognition.
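The step of extracting features from 10-s windows can be sketched as follows. The abstract does not say which features were used or at what sampling rate, so the four summary statistics, the `sample_rate_hz` parameter, and the function name are assumptions made for illustration; the study itself fitted its model in R, whereas this sketch uses Python and stops before the classifier.

```python
from statistics import mean, pstdev


def window_features(signal, sample_rate_hz, window_s=10):
    """Split a 1-D acceleration signal into non-overlapping windows and
    summarise each window. Samples that do not fill a whole window are dropped.
    """
    size = int(sample_rate_hz * window_s)
    features = []
    for start in range(0, len(signal) - size + 1, size):
        window = signal[start:start + size]
        features.append({
            "mean": mean(window),
            "std": pstdev(window),  # population standard deviation
            "min": min(window),
            "max": max(window),
        })
    return features


# Synthetic 25 s signal at an assumed 2 Hz: a still period, an oscillating
# period, then 5 s of leftover samples that do not fill a window.
signal = [0.0] * 20 + [1.0, -1.0] * 10 + [0.5] * 10
features = window_features(signal, sample_rate_hz=2)
```

Each dictionary in `features` would become one row of the design matrix passed to a regularized classifier.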

Relevance:

20.00%

Publisher:

Abstract:

Rapid recursive estimation of hidden Markov model (HMM) parameters is important in applications that place an emphasis on the early availability of reasonable estimates (e.g. for change detection) rather than the provision of longer-term asymptotic properties (such as convergence, convergence rate, and consistency). In the context of vision-based aircraft (image-plane) heading estimation, this paper suggests and evaluates the short-data estimation properties of 3 recursive HMM parameter estimation techniques (a recursive maximum likelihood estimator, an online EM HMM estimator, and a relative entropy based estimator). On both simulated and real data, our studies illustrate the feasibility of rapid recursive heading estimation, but also demonstrate the need for careful step-size design of HMM recursive estimation techniques when these techniques are intended for use in applications where short-data behaviour is paramount.

Relevance:

20.00%

Publisher:

Abstract:

The upstream oil & gas industry has been contending with massive data sets and monolithic files for many years, but “Big Data”—that is, the ability to apply more sophisticated types of analytical tools to information in a way that extracts new insights or creates new forms of value—is a relatively new concept that has the potential to significantly re-shape the industry. Despite the impressive amount of value that is being realized by Big Data technologies in other parts of the marketplace, however, much of the data collected within the oil & gas sector tends to be discarded, ignored, or analyzed in a very cursory way. This paper examines existing data management practices in the upstream oil & gas industry, and compares them to practices and philosophies that have emerged in organizations that are leading the Big Data revolution. The comparison shows that, in companies that are leading the Big Data revolution, data is regarded as a valuable asset. The presented evidence also shows, however, that this is usually not true within the oil & gas industry insofar as data is frequently regarded there as descriptive information about a physical asset rather than something that is valuable in and of itself. The paper then discusses how upstream oil & gas companies could potentially extract more value from data, and concludes with a series of specific technical and management-related recommendations to this end.

Relevance:

20.00%

Publisher:

Abstract:

Heterogeneous health data is a critical issue when managing health information for quality decision making processes. In this paper we examine the efficient aggregation of lifestyle information through a data warehousing architecture lens. We present a proof of concept for a clinical data warehouse architecture that enables evidence based decision making processes by integrating and organising disparate data silos in support of healthcare services improvement paradigms.

Relevance:

20.00%

Publisher:

Abstract:

Identifying product families has been considered an effective way to accommodate the increasing product varieties across diverse market niches. In this paper, we propose a novel framework for identifying product families by using a similarity measure for common product design data, the BOM (Bill of Materials), based on data mining techniques such as frequent mining and clustering. For calculating the similarity between BOMs, a novel Extended Augmented Adjacency Matrix (EAAM) representation is introduced that captures information not only on the content and topology but also on the frequent structural dependency among the various parts of a product design. These EAAM representations of BOMs are compared to calculate the similarity between products and used as a clustering input to group the product families. When applied to real-life manufacturing data, the proposed framework outperforms a current baseline that uses orthogonal Procrustes for grouping product families.

Relevance:

20.00%

Publisher:

Abstract:

Interpolation techniques for spatial data have been applied frequently in various fields of geosciences. Although most conventional interpolation methods assume that it is sufficient to use first- and second-order statistics to characterize random fields, researchers have now realized that these methods cannot always provide reliable interpolation results, since geological and environmental phenomena tend to be very complex, presenting non-Gaussian distributions and/or non-linear inter-variable relationships. This paper proposes a new approach to the interpolation of spatial data, which can be applied with great flexibility. Suitable cross-variable higher-order spatial statistics are developed to measure the spatial relationship between the random variable at an unsampled location and those in its neighbourhood. Given the computed cross-variable higher-order spatial statistics, the conditional probability density function (CPDF) is approximated via polynomial expansions and is then used to determine the interpolated value at the unsampled location as an expectation. In addition, the uncertainty associated with the interpolation is quantified by constructing prediction intervals of interpolated values. The proposed method is applied to a mineral deposit dataset, and the results demonstrate that it outperforms kriging methods in uncertainty quantification. The introduction of the cross-variable higher-order spatial statistics noticeably improves the quality of the interpolation since it enriches the information that can be extracted from the observed data, and this benefit is substantial when working with data that are sparse or have non-trivial dependence structures.

Relevance:

20.00%

Publisher:

Abstract:

The Echology: Making Sense of Data initiative seeks to break new ground in arts practice by asking artists to innovate with respect to a) the possible forms of data representation in public art and b) the artist's role in engaging publics on environmental sustainability in new urban developments. Initiated by ANAT and Carbon Arts in 2011, Echology has seen three artists selected by national competition in 2012 for Lend Lease sites across Australia. In 2013 commissioning of one of these works, the Mussel Choir by Natalie Jeremijenko, began in Melbourne's Victoria Harbour development. This emerging practice of data-driven and environmentally engaged public artwork presents multiple challenges to established systems of public arts production and management, at the same time as offering up new avenues for artists to forge new modes of collaboration. The experience of Echology and, in particular, the Mussel Choir is examined here to reveal opportunities for expansion of this practice through identification of the factors that lead to a resilient 'ecology of partnership' between stakeholders that include science and technology researchers, education providers, city administrators, and urban developers.

Relevance:

20.00%

Publisher:

Abstract:

Discovering the means to prevent and cure schizophrenia is a vision that motivates many scientists. But in order to achieve this goal, we need to understand its neurobiological basis. The emergent metadiscipline of cognitive neuroscience fields an impressive array of tools that can be marshaled towards achieving this goal, including powerful new methods of imaging the brain (both structural and functional) as well as assessments of perceptual and cognitive capacities based on psychophysical procedures, experimental tasks and models developed by cognitive science. We believe that the integration of data from this array of tools offers the greatest possibilities and potential for advancing understanding of the neural basis of not only normal cognition but also the cognitive impairments that are fundamental to schizophrenia. Since sufficient expertise in the application of these tools and methods rarely resides in a single individual, or even a single laboratory, collaboration is a key element in this endeavor. Here, we review some of the products of our integrative efforts in collaboration with our colleagues on the East Coast of Australia and Pacific Rim. This research focuses on the neural basis of executive function deficits and impairments in early auditory processing in patients using various combinations of performance indices (from perceptual and cognitive paradigms), ERPs, fMRI and sMRI. In each case, integration of two or more sources of information provides more information than any one source alone by revealing new insights into structure-function relationships. Furthermore, the addition of other imaging methodologies (such as DTI) and approaches (such as computational models of cognition) offers new horizons in human brain imaging research and in understanding human behavior.

Relevance:

20.00%

Publisher:

Abstract:

Although there are many potential new insights to be gained through advancing research on the clients of male sex workers, significant social, ethical and methodological challenges to accessing this population exist. This research project case explores our attempts to recruit a population that does not typically form a cohesive or coherent 'community' and often avoids self-identifying to mitigate the stigma attached to buying sex. We used an arms-length recruitment campaign that focussed on directing potential participants to our study website, which could in turn lead them to participate in an anonymous telephone interview. Barriers to reaching male sex-work clients, however, demanded the evolution of our recruitment strategy. New technologies are part of the solution to accessing a hard-to-reach population, but they only work if researchers engage responsively. We also show how we conducted an in-depth interview with a client and discuss the value of using secondary data.

Relevance:

20.00%

Publisher:

Abstract:

A tag-based item recommendation method generates an ordered list of items likely to interest a particular user, using the user's past tagging behaviour. However, users' tagging behaviour varies across tagging systems. A potential problem in generating quality recommendations is how to build user profiles that interpret user behaviour so that it can be used effectively in recommendation models. Generally, recommendation methods are made to work with specific types of user profiles, and may not work well with different datasets. In this paper, we investigate several tagging data interpretation and representation schemes that can lead to building an effective user profile. We discuss the various benefits a scheme brings to a recommendation method by highlighting the representative features of user tagging behaviours on a specific dataset. Empirical analysis shows that each interpretation scheme forms a distinct data representation which eventually affects the recommendation result. Results on various datasets show that an interpretation scheme should be selected based on the dominant usage in the tagging data (i.e. either a higher number of tags or a higher number of items present). The usage represents the characteristic of user tagging behaviour in the system. The results also demonstrate how the scheme is able to address the cold-start user problem.

Relevance:

20.00%

Publisher:

Abstract:

Bactrocera papayae Drew & Hancock, Bactrocera philippinensis Drew & Hancock, Bactrocera carambolae Drew & Hancock, and Bactrocera invadens Drew, Tsuruta & White are four horticultural pest tephritid fruit fly species that are highly similar, morphologically and genetically, to the destructive pest, the Oriental fruit fly, Bactrocera dorsalis (Hendel) (Diptera: Tephritidae). This similarity has rendered the discovery of reliable diagnostic characters problematic, which, in view of the economic importance of these taxa and the international trade implications, has resulted in ongoing difficulties for many areas of plant protection and food security. Consequently, a major international collaborative and integrated multidisciplinary research effort was initiated in 2009 to build upon existing literature with the specific aim of resolving biological species limits among B. papayae, B. philippinensis, B. carambolae, B. invadens and B. dorsalis to overcome constraints to pest management and international trade. Bactrocera philippinensis has recently been synonymized with B. papayae as a result of this initiative and this review corroborates that finding; however, the other names remain in use. While consistent characters have been found to reliably distinguish B. carambolae from B. dorsalis, B. invadens and B. papayae, no such characters have been found to differentiate the latter three putative species. We conclude that B. carambolae is a valid species and that the remaining taxa, B. dorsalis, B. invadens and B. papayae, represent the same species. Thus, we consider B. dorsalis (Hendel) as the senior synonym of B. papayae Drew and Hancock syn.n. and B. invadens Drew, Tsuruta & White syn.n. A redescription of B. dorsalis is provided. Given the agricultural importance of B. dorsalis, this taxonomic decision will have significant global plant biosecurity implications, affecting pest management, quarantine, international trade, postharvest treatment and basic research. 
Throughout the paper, we emphasize the value of independent and multidisciplinary tools in delimiting species, particularly in complicated cases involving morphologically cryptic taxa.

Relevance:

20.00%

Publisher:

Abstract:

This thesis presents a novel program parallelization technique that incorporates both dynamic and static scheduling. It utilizes a problem-specific pattern developed from prior knowledge of the targeted problem abstraction. Suitable for solving complex parallelization problems such as data-intensive all-to-all comparison constrained by memory, the technique delivers more robust and faster task scheduling than state-of-the-art techniques. The technique achieves good performance in data-intensive bioinformatics applications.