Digital Library

211 results for Data structures (Computer science)

An efficient tagging data interpretation and representation scheme for item recommendation

Relevance:

100.00% 100.00%

Publisher:

Abstract:

A tag-based item recommendation method generates an ordered list of items likely to interest a particular user, using the user's past tagging behaviour. However, users' tagging behaviour varies across tagging systems. A potential problem in generating quality recommendations is how to build user profiles that interpret user behaviour so that it can be used effectively in recommendation models. Generally, recommendation methods are designed to work with specific types of user profiles and may not work well with different datasets. In this paper, we investigate several tagging data interpretation and representation schemes that can lead to building an effective user profile. We discuss the various benefits a scheme brings to a recommendation method by highlighting the representative features of user tagging behaviours on a specific dataset. Empirical analysis shows that each interpretation scheme forms a distinct data representation, which eventually affects the recommendation result. Results on various datasets show that an interpretation scheme should be selected based on the dominant usage in the tagging data (i.e. either a higher number of tags or a higher number of items present). The usage represents the characteristic of user tagging behaviour in the system. The results also demonstrate how the scheme is able to address the cold-start user problem.

Memory management and parallelization of data intensive all-to-all comparison in shared-memory systems

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This thesis presents a novel program parallelization technique that incorporates both dynamic and static scheduling. It uses a problem-specific pattern developed from prior knowledge of the targeted problem abstraction. Suitable for solving complex parallelization problems such as data-intensive all-to-all comparison constrained by memory, the technique delivers more robust and faster task scheduling than state-of-the-art techniques. The technique achieves good performance in data-intensive bioinformatics applications.

Practical analysis of big acoustic sensor data for environmental monitoring

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Monitoring the environment with acoustic sensors is an effective method for understanding changes in ecosystems. Extensive monitoring can produce large-scale, ecologically relevant datasets that inform environmental policy. The collection of acoustic sensor data is a solved problem; the current challenge is the management and analysis of raw audio data to produce useful datasets for ecologists. This paper presents the applied research we use to analyze big acoustic datasets. Its core contribution is the presentation of practical large-scale acoustic data analysis methodologies. We describe details of the data workflows we use to provide both citizen scientists and researchers with practical access to large volumes of ecoacoustic data. Finally, we propose a work-in-progress large-scale architecture for analysis driven by a hybrid cloud-and-local production-grade website.

Navigating pathways for academic staff development : implications for institutions and academic ranks

Relevance:

100.00% 100.00%

Publisher:

Abstract:

BACKGROUND As engineering schools adopt outcomes-focused learning approaches in response to government expectations and industry requirements of graduates capable of learning and applying knowledge in different contexts, university academics must be capable of developing and delivering programs that meet these requirements. Those academics are increasingly facing challenges in progressing their research and also acquiring different skill sets to meet the learning and teaching requirements. PURPOSE The goal of this study was to identify the types of development and support structures in place for academic staff, especially early career ones, and examine how the type of institution and the rank or role of the staff member affects these structures. DESIGN/METHOD We conducted semi-structured interviews with 21 individuals in a range of positions pertaining to teaching and learning in engineering education. Open coding was used to identify main themes from the guiding questions raised in the interviews and refined to address themes relevant to the development of institutional staff. The interview data was then analysed based on the type of institution and the rank/role of the participant. RESULTS While development programs that focus on improving teaching and learning are available, the approach to using these types of programs differed based on staff perspective. Fewer academics, regardless of rank/role, had knowledge of support structures related to other areas of scholarship, e.g. disciplinary research, educational research, learning the institutional culture. The type of institution also impacted how they weighted and encouraged multiple forms of scholarship. We found that academic staff holding higher ranking positions, e.g. dean or associate dean, were concerned not only with the success of their respective programs, but also with how to promote other academic staff participation throughout the process. CONCLUSIONS The findings from this study extend the premise that developing effective academic staff ultimately leads to more effective institutions and successful graduates, and that accomplishing this requires staff buy-in at multiple stages of instructional and program development. Staff and administration developing approaches for educational innovation together (Besterfield-Sacre et al., 2014) and getting buy-in from all academic staff to invest in engineering education development will ultimately lead to more successful engineering graduates.

Robust clustering of multi-type relational data via a heterogeneous manifold ensemble

Relevance:

100.00% 100.00%

Publisher:

Abstract:

High-Order Co-Clustering (HOCC) methods have attracted considerable attention in recent years because of their ability to cluster multiple types of objects simultaneously using all available information. During the clustering process, HOCC methods exploit object co-occurrence information, i.e., inter-type relationships amongst different types of objects, as well as object affinity information, i.e., intra-type relationships amongst the same types of objects. However, it is difficult to learn accurate intra-type relationships in the presence of noise and outliers. Existing HOCC methods consider the p nearest neighbours based on Euclidean distance for the intra-type relationships, which leads to incomplete and inaccurate intra-type relationships. In this paper, we propose a novel HOCC method that incorporates multiple subspace learning with a heterogeneous manifold ensemble to learn complete and accurate intra-type relationships. Multiple subspace learning reconstructs the similarity between any pair of objects that belong to the same subspace. The heterogeneous manifold ensemble is created based on two types of intra-type relationships, learnt using a p-nearest-neighbour graph and multiple subspace learning. Moreover, to ensure the robustness of the clustering process, we introduce a sparse error matrix into the matrix decomposition and develop a novel iterative algorithm. Empirical experiments show that the proposed method achieves improved results over state-of-the-art HOCC methods for FScore and NMI.

Proportions, Percentages, PPM: Do The Molecular Biosciences Treat Compositional Data Right?

Relevance:

100.00% 100.00%

Publisher:

On the need for an international effort to capture, share and use crystallization screening data

Relevance:

100.00% 100.00%

Publisher:

Abstract:

When crystallization screening is conducted, many outcomes are observed, but typically the only trial recorded in the literature is the condition that yielded the crystal(s) used for subsequent diffraction studies. The initial hit that was optimized and the results of all the other trials are lost. These missing results contain information that would be useful for an improved general understanding of crystallization. This paper provides a report of a crystallization data exchange (XDX) workshop organized by several international large-scale crystallization screening laboratories to discuss how this information may be captured and utilized. A group that administers a significant fraction of the world's crystallization screening results was convened, together with chemical and structural data informaticians and computational scientists who specialize in creating and analysing large disparate data sets. The development of a crystallization ontology for the crystallization community was proposed. This paper (by the attendees of the workshop) provides the thoughts and rationale leading to this conclusion. This is brought to the attention of the wider audience of crystallographers so that they are aware of these early efforts and can contribute to the process going forward. © 2012 International Union of Crystallography. All rights reserved.

Compositional data analysis (CoDA) approaches to distance in information retrieval

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Many techniques in information retrieval produce counts from a sample, and it is common to analyse these counts as proportions of the whole - term frequencies are a familiar example. Proportions carry only relative information and are not free to vary independently of one another: for the proportion of one term to increase, one or more others must decrease. These constraints are hallmarks of compositional data. While there has long been discussion in other fields of how such data should be analysed, to our knowledge, Compositional Data Analysis (CoDA) has not been considered in IR. In this work we explore compositional data in IR through the lens of distance measures, and demonstrate that common measures, naïve to compositions, have some undesirable properties which can be avoided with composition-aware measures. As a practical example, these measures are shown to improve clustering. Copyright 2014 ACM.
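The abstract does not name the measures it compares, so the following is only a sketch of the general idea, assuming the Aitchison distance as the composition-aware measure (a standard CoDA choice, not necessarily the authors'). It shows that two term-count vectors with identical proportions are at distance zero under the Aitchison distance, while a Euclidean distance on the raw counts separates them. The function names are illustrative, and all parts are assumed to be strictly positive (zero counts would need a replacement strategy first).

```python
import math


def closure(counts):
    """Rescale a vector of positive counts so that its parts sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]


def clr(parts):
    """Centred log-ratio transform: log of each part minus the mean log."""
    logs = [math.log(p) for p in parts]  # assumes every part is > 0
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def euclidean(a, b):
    """Ordinary Euclidean distance, naive to compositional constraints."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def aitchison(a, b):
    """Aitchison distance: Euclidean distance between clr-transformed vectors."""
    return euclidean(clr(a), clr(b))


# Hypothetical term counts for two documents of different length
# but identical relative term usage.
short_doc = [10, 20, 70]
long_doc = [20, 40, 140]

same_composition = aitchison(short_doc, long_doc)   # scale-invariant
raw_count_gap = euclidean(short_doc, long_doc)      # depends on document length
```

Because the clr transform subtracts the mean log, multiplying every count by a constant leaves the transformed vector unchanged, which is why the sketch can be applied to raw counts or to proportions with the same result.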

What patient information allows us to make accurate predictions of outcome?

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Only some of the information contained in a medical record will be useful to the prediction of patient outcome. We describe a novel method for selecting those outcome predictors which allow us to reliably discriminate between adverse and benign end results. Using the area under the receiver operating characteristic as a nonparametric measure of discrimination, we show how to calculate the maximum discrimination attainable with a given set of discrete valued features. This upper limit forms the basis of our feature selection algorithm. We use the algorithm to select features (from maternity records) relevant to the prediction of failure to progress in labour. The results of this analysis motivate investigation of those predictors of failure to progress relevant to parous and nulliparous sub-populations.
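The abstract does not give the calculation, so the sketch below illustrates one standard way to obtain such an upper limit on a sample; it is an assumption, not the authors' algorithm. Records that share the same combination of discrete feature values cannot be told apart, so the best any scoring rule can do is rank those combinations by their observed fraction of adverse outcomes. The function name and data layout are hypothetical.

```python
from collections import defaultdict


def max_auc(rows, labels):
    """Largest empirical AUC attainable from the given discrete features.

    rows   -- sequence of tuples of discrete feature values
    labels -- sequence of 1 (adverse outcome) or 0 (benign outcome)
    """
    cells = defaultdict(lambda: [0, 0])  # feature combination -> [n_pos, n_neg]
    for row, y in zip(rows, labels):
        cells[tuple(row)][0 if y == 1 else 1] += 1

    n_pos = sum(p for p, _ in cells.values())
    n_neg = sum(n for _, n in cells.values())
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one adverse and one benign record")

    # Visit cells from lowest to highest fraction of adverse outcomes.
    ordered = sorted(cells.values(), key=lambda c: c[0] / (c[0] + c[1]))

    concordant = 0.0
    neg_below = 0  # benign records in cells ranked lower so far
    for p, n in ordered:
        # Adverse records here outrank all benign records below them,
        # and tie (half credit) with benign records in the same cell.
        concordant += p * neg_below + 0.5 * p * n
        neg_below += n
    return concordant / (n_pos * n_neg)
```

A greedy feature selection could then, for example, add the feature that raises this bound the most at each step; that loop is omitted here because the abstract does not describe it.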

A data mining based method for discovery of web services and their compositions

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Due to the availability of a huge number of Web services, finding an appropriate Web service according to the requirements of a service consumer is still a challenge. Moreover, sometimes a single Web service is unable to fully satisfy the requirements of the service consumer. In such cases, combinations of multiple inter-related Web services can be utilised. This paper proposes a method that first utilises a semantic kernel model to find related services and then models these related Web services as nodes of a graph. An all-pair shortest-path algorithm is applied to find the best compositions of Web services that are semantically related to the service consumer requirement. Finally, individual and composite Web services are recommended for a service request. Empirical evaluation confirms that the proposed method significantly improves the accuracy of service discovery in comparison to traditional keyword-based discovery methods.
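The abstract does not say which all-pairs shortest-path algorithm is used. As a sketch, the example below assumes Floyd-Warshall and a hypothetical graph in which nodes are services and each edge weight is a made-up semantic distance between one service's output and another's input. The service names and weights are illustrative only.

```python
import math


def floyd_warshall(nodes, edges):
    """All-pairs shortest paths; edges maps (source, target) to a weight."""
    dist = {u: {v: (0.0 if u == v else math.inf) for v in nodes} for u in nodes}
    nxt = {u: {v: None for v in nodes} for u in nodes}
    for (u, v), w in edges.items():
        dist[u][v] = w
        nxt[u][v] = v
    for k in nodes:
        for i in nodes:
            for j in nodes:
                # Route i -> j through k when that is strictly shorter.
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
                    nxt[i][j] = nxt[i][k]
    return dist, nxt


def composition(nxt, source, target):
    """Rebuild the chain of services from source to target, or [] if none."""
    if source == target:
        return [source]
    if nxt[source][target] is None:
        return []
    chain = [source]
    while source != target:
        source = nxt[source][target]
        chain.append(source)
    return chain


# Hypothetical services and semantic distances (lower means more related).
services = ["geocode", "weather", "forecast_pdf"]
links = {
    ("geocode", "weather"): 0.2,
    ("weather", "forecast_pdf"): 0.3,
    ("geocode", "forecast_pdf"): 0.9,
}
dist, nxt = floyd_warshall(services, links)
best_chain = composition(nxt, "geocode", "forecast_pdf")
```

In this toy graph the two-step chain through `weather` costs less than the direct link, so it would be the composition returned for a request that starts at `geocode` and needs `forecast_pdf`.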

A data analytics application assessing pavement deflection factors for a road network

Relevance:

100.00% 100.00%

Publisher:

Integrating field survey data with satellite image data to improve shallow water seagrass maps : the role of AUV and snorkeller surveys

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Repeatable and accurate seagrass mapping is required for understanding seagrass ecology and supporting management decisions. For shallow (< 5 m) seagrass habitats, these maps can be created by integrating high spatial resolution imagery with field survey data. Field survey data for seagrass is often collected via snorkelling or diving. However, these methods are limited by environmental and safety considerations. Autonomous Underwater Vehicles (AUVs) are used increasingly to collect field data for habitat mapping, albeit mostly in deeper waters (> 20 m). Here we demonstrate and evaluate the use and potential advantages of AUV field data collection for calibration and validation of seagrass habitat mapping of shallow waters (< 5 m), from multispectral satellite imagery. The study was conducted in the seagrass habitats of the Eastern Banks (142 km²), Moreton Bay, Australia. In the field, georeferenced photos of the seagrass were collected along transects via snorkelling or an AUV. Photos from both collection methods were analysed manually for seagrass species composition and then used as calibration and validation data to map seagrass using an established semi-automated object based mapping routine. A comparison of the relative advantages and disadvantages of AUV and snorkeller collected field data sets and their influence on the mapping routine was conducted. AUV data collection was more consistent, repeatable and safer in comparison to snorkeller transects. Inclusion of deeper water AUV data resulted in mapping of a larger extent of seagrass (~7 km², 5% of study area) in the deeper waters of the site. Although overall map accuracies did not differ considerably, inclusion of the AUV data from deeper water transects corrected errors in seagrass mapped at depths to 5 m, but where the bottom is visible on satellite imagery. Our results demonstrate that further development of AUV technology is justified for the monitoring of seagrass habitats in ongoing management programs.

Automated sensory data alignment for environmental and epidermal change monitoring

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In this paper we present research adapting a state-of-the-art condition-invariant robotic place recognition algorithm to the role of automated inter- and intra-image alignment of sensor observations of environmental and skin change over time. The approach involves inverting the typical criteria placed upon navigation algorithms in robotics; we exploit rather than attempt to fix the limited camera viewpoint invariance of such algorithms, showing that approximate viewpoint repetition is realistic in a wide range of environments and medical applications. We demonstrate the algorithms automatically aligning challenging visual data from a range of real-world applications: ecological monitoring of environmental change; aerial observation of natural disasters including flooding, tsunamis and bushfires; and tracking wound recovery and sun damage over time. We also present a prototype active guidance system for enforcing viewpoint repetition. We hope to provide an interesting case study for how traditional research criteria in robotics can be inverted to provide useful outcomes in applied situations.

Representing team behaviours from noisy data using player role

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Due to their unobtrusive nature, vision-based approaches to tracking sports players have been preferred over wearable sensors, as they do not require the players to be instrumented for each match. Unfortunately, due to heavy occlusion between players, variation in resolution and pose, and fluctuating illumination conditions, tracking players continuously is still an unsolved vision problem. For tasks like clustering and retrieval, having noisy data (i.e. missing and false player detections) is problematic as it generates discontinuities in the input data stream. One method of circumventing this issue is to use an occupancy map, where the field is discretised into a series of zones and a count of player detections in each zone is obtained. A series of frames can then be concatenated to represent a set-play or example of team behaviour. A problem with this approach, though, is that the compressibility is low (i.e. the variability in the feature space is incredibly high). In this paper, we propose the use of a bilinear spatiotemporal basis model using a role representation to clean up the noisy detections, which operates in a low-dimensional space. To evaluate our approach, we used a fully instrumented field-hockey pitch with 8 fixed high-definition (HD) cameras, applied the approach to approximately 200,000 frames of data from a state-of-the-art real-time player detector, and compared the results to manually labeled data.
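The occupancy-map baseline described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the grid size, the coordinate convention (metres from one corner of the pitch) and the function names are assumptions, and the 91.4 m by 55 m dimensions are those of a standard field-hockey pitch.

```python
def occupancy_map(detections, length, width, n_cols, n_rows):
    """Count player detections per zone for a single frame.

    detections -- iterable of (x, y) positions in metres
    length, width -- pitch dimensions in metres
    n_cols, n_rows -- number of zones along the length and the width
    """
    grid = [[0] * n_cols for _ in range(n_rows)]
    for x, y in detections:
        if not (0 <= x <= length and 0 <= y <= width):
            continue  # discard detections that fall outside the pitch
        col = min(int(x / length * n_cols), n_cols - 1)
        row = min(int(y / width * n_rows), n_rows - 1)
        grid[row][col] += 1
    return grid


def play_feature(frames, length, width, n_cols, n_rows):
    """Concatenate the flattened occupancy maps of consecutive frames."""
    feature = []
    for detections in frames:
        for row in occupancy_map(detections, length, width, n_cols, n_rows):
            feature.extend(row)
    return feature


# Hypothetical detections: two valid positions and one false detection
# far outside the pitch.
frame = [(10.0, 10.0), (80.0, 50.0), (200.0, 5.0)]
zones = occupancy_map(frame, length=91.4, width=55.0, n_cols=4, n_rows=2)
vector = play_feature([frame, frame], 91.4, 55.0, 4, 2)
```

The feature length grows as zones per frame times number of frames, which hints at why the abstract calls the representation hard to compress: a missed or false detection changes counts directly, and the same behaviour can land in different zones from one play to the next.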

A multidimensional collaborative filtering fusion approach with dimensionality reduction

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Multidimensional data have received increasing attention in recent years from researchers seeking to create better recommender systems. Additional metadata provides algorithms with more details for better understanding the interaction between users and items. While neighbourhood-based Collaborative Filtering (CF) approaches and latent factor models each tackle this task effectively in their own way, they utilize only different partial structures of the data. In this paper, we seek to delve into different types of relations in data and to understand the interaction between users and items more holistically. We propose a generic multidimensional CF fusion approach for top-N item recommendations. The proposed approach is capable of incorporating not only localized user-user and item-item relations but also latent interactions between all dimensions of the data. Experimental results show that the proposed approach significantly improves recommendation accuracy.
