306 results for data models

in Queensland University of Technology - ePrints Archive


Relevance:

100.00%

Publisher:

Abstract:

Here we present a sequential Monte Carlo approach to Bayesian sequential design for the incorporation of model uncertainty. The methodology is demonstrated through the development and implementation of two model discrimination utilities, mutual information and total separation, but it can also be applied more generally if one has different experimental aims. A sequential Monte Carlo algorithm is run for each rival model (in parallel) and provides a convenient estimate of the marginal likelihood of each model given the data, which can be used for model comparison and in the evaluation of utility functions. A major benefit of this approach is that it requires very little problem-specific tuning and is also computationally efficient when compared to full Markov chain Monte Carlo approaches. This research is motivated by applications in drug development and chemical engineering.
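The abstract above notes that per-model marginal likelihood estimates can be used for model comparison. A minimal sketch of that last step follows, assuming the log marginal likelihoods have already been estimated by some means. The function name `posterior_model_probs` and the example numbers are hypothetical, and this is not the authors' sequential Monte Carlo algorithm, only the standard Bayes-rule normalisation applied to its output.

```python
import math


def posterior_model_probs(log_evidences, priors=None):
    """Turn log marginal likelihoods into posterior model probabilities.

    log_evidences: one log p(data | model) value per rival model.
    priors: prior model probabilities (assumed uniform if omitted).
    """
    n = len(log_evidences)
    if priors is None:
        priors = [1.0 / n] * n
    # Work in log space and subtract the maximum to avoid underflow.
    log_joint = [le + math.log(p) for le, p in zip(log_evidences, priors)]
    peak = max(log_joint)
    weights = [math.exp(lj - peak) for lj in log_joint]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical log evidences for two rival models.
probs = posterior_model_probs([-1000.0, -1000.0 - math.log(3.0)])
```

Subtracting the largest log value before exponentiating keeps the calculation stable even when the evidences are far too small to represent directly.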

Relevance:

70.00%

Publisher:

Abstract:

Two distinct maintenance-data-models are studied: a government Enterprise Resource Planning (ERP) maintenance-data-model, and the Software Engineering Industries (SEI) maintenance-data-model. The objective is to: (i) determine whether the SEI maintenance-data-model is sufficient in the context of ERP (by comparing with an ERP case), (ii) identify whether the ERP maintenance-data-model in this study has adequately captured the essential and common maintenance attributes (by comparing with the SEI), and (iii) propose a new ERP maintenance-data-model as necessary. Our findings suggest that: (i) there are variations to the SEI model in an ERP context, and (ii) there is room for improvement in our ERP case’s maintenance-data-model. Thus, a new ERP maintenance-data-model capturing the fundamental ERP maintenance attributes is proposed. This model is imperative for: (i) enhancing the reporting and visibility of maintenance activities, (ii) monitoring maintenance problems, resolutions and performance, and (iii) helping maintenance managers to better manage maintenance activities and make well-informed maintenance decisions.

Relevance:

70.00%

Publisher:

Abstract:

An educational priority of many nations is to enhance mathematical learning in early childhood. One area in need of special attention is that of statistics. This paper argues for a renewed focus on statistical reasoning in the beginning school years, with opportunities for children to engage in data modelling activities. Such modelling involves investigations of meaningful phenomena, deciding what is worthy of attention (i.e., identifying complex attributes), and then progressing to organising, structuring, visualising, and representing data. Results are reported from the first year of a three-year longitudinal study in which three classes of first-grade children and their teachers engaged in activities that required the creation of data models. The theme of “Looking after our Environment,” a component of the children’s science curriculum at the time, provided the context for the activities. Findings focus on how the children dealt with given complex attributes and how they generated their own attributes in classifying broad data sets, and the nature of the models the children created in organising, structuring, and representing their data.

Relevance:

70.00%

Publisher:

Abstract:

It is a big challenge to acquire correct user profiles for personalized text classification, since users may be unsure when describing their interests. Traditional approaches to user profiling adopt machine learning (ML) to automatically discover classification knowledge from explicit user feedback describing personal interests. However, the accuracy of ML-based methods cannot be significantly improved in many cases due to the term independence assumption and the uncertainties associated with them. This paper presents a novel relevance feedback approach for personalized text classification. It applies data mining to discover knowledge from relevant and non-relevant text, and constrains specific knowledge by reasoning rules to eliminate some conflicting information. We also developed a Dempster-Shafer (DS) approach as the means to utilise the specific knowledge to build high-quality data models for classification. The experimental results obtained on Reuters Corpus Volume 1 and TREC topics show that the proposed technique achieves encouraging performance in comparison with state-of-the-art relevance feedback models.

Relevance:

60.00%

Publisher:

Abstract:

Search log data is multidimensional data consisting of a number of searches by multiple users with many searched parameters. This data can be used to identify a user’s interest in an item or object being searched. Identifying the highest interests of a Web user from his or her search log data is a complex process. Based on a user’s previous searches, most recommendation methods employ two-dimensional models to find relevant items. Such items are then recommended to the user. Two-dimensional data models, when used to mine knowledge from such multidimensional data, may not be able to give good mappings between a user and his or her searches. The major problem with such models is that they are unable to find the latent relationships that exist between different searched dimensions. In this research work, we utilize tensors to model the various searches made by a user. Such a high-dimensional data model is then used to extract the relationships between various dimensions and to find the prominent searched components. To achieve this, we have used popular tensor decomposition methods such as PARAFAC, Tucker and HOSVD. All experiments and evaluations are carried out on real datasets, which clearly show the effectiveness of tensor models in finding prominent searched components in comparison to other widely used two-dimensional data models. Such top-rated searched components are then given as recommendations to users.
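To make the tensor idea in the abstract above concrete, here is a minimal sketch of a higher-order SVD (HOSVD) written with NumPy. The three-way layout (users × car make × price band) and the random counts are assumptions made for illustration; the abstract does not give the actual dimensions, data or implementation used by the authors.

```python
import numpy as np


def unfold(tensor, mode):
    """Matricize a tensor: bring `mode` to the front, flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def mode_product(tensor, matrix, mode):
    """Multiply `tensor` along `mode` by `matrix` (shape: new_dim x old_dim)."""
    return np.moveaxis(np.tensordot(matrix, tensor, axes=(1, mode)), 0, mode)


def hosvd(tensor):
    """Return (core, factors): one orthonormal factor matrix per mode."""
    factors = []
    for mode in range(tensor.ndim):
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u)
    core = tensor
    for mode, u in enumerate(factors):
        core = mode_product(core, u.T, mode)
    return core, factors


def reconstruct(core, factors):
    """Rebuild the tensor from its core and factor matrices."""
    out = core
    for mode, u in enumerate(factors):
        out = mode_product(out, u, mode)
    return out


# Hypothetical search-count tensor: 4 users x 3 car makes x 2 price bands.
rng = np.random.default_rng(0)
searches = rng.integers(0, 5, size=(4, 3, 2)).astype(float)
core, factors = hosvd(searches)
# Entries of the first column of a factor matrix with the largest magnitude
# indicate which values of that dimension dominate the leading component.
dominant_make = int(np.argmax(np.abs(factors[1][:, 0])))
```

Unlike a users × items matrix, the factor matrices here are computed per dimension from the full three-way array, which is what lets relationships across dimensions surface.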

Relevance:

60.00%

Publisher:

Abstract:

Handling information overload online is a big challenge from the user's point of view, especially when the number of websites is growing rapidly due to growth in e-commerce and other related activities. Personalization based on user needs is the key to solving the problem of information overload. Personalization methods help in identifying relevant information that a user may like. User profiles and object profiles are the important elements of a personalization system. When creating user and object profiles, most of the existing methods adopt two-dimensional similarity methods based on vector or matrix models in order to find inter-user and inter-object similarity. Moreover, for recommending similar objects to users, personalization systems use the users-users, items-items and users-items similarity measures. In most cases, similarity measures such as Euclidean, Manhattan, cosine and many others based on vector or matrix methods are used to find the similarities. Web logs are high-dimensional datasets, consisting of multiple users and multiple searches with many attributes each. Two-dimensional data analysis methods may often overlook latent relationships that may exist between users and items. In contrast to other studies, this thesis utilises tensors, the high-dimensional data models, to build user and object profiles and to find the inter-relationships between users-users and users-items. To create an improved personalized Web system, this thesis proposes to build three types of profiles: individual user, group users and object profiles utilising decomposition factors of tensor data models. A hybrid recommendation approach utilising group profiles (forming the basis of a collaborative filtering method) and object profiles (forming the basis of a content-based method) in conjunction with individual user profiles (forming the basis of a model-based approach) is proposed for making effective recommendations.
A tensor-based clustering method is proposed that utilises the outcomes of popular tensor decomposition techniques such as PARAFAC, Tucker and HOSVD to group similar instances. An individual user profile, showing the user's highest interest, is represented by the top dimension values, extracted from the component matrix obtained after tensor decomposition. A group profile, showing similar users and their highest interest, is built by clustering similar users based on tensor decomposed values. A group profile is represented by the top association rules (containing various unique object combinations) that are derived from the searches made by the users of the cluster. An object profile is created to represent similar objects clustered on the basis of their similarity of features. Depending on the category of a user (known, anonymous or frequent visitor to the website), any of the profiles or their combinations is used for making personalized recommendations. A ranking algorithm is also proposed that utilizes the personalized information to order and rank the recommendations. The proposed methodology is evaluated on data collected from a real life car website. Empirical analysis confirms the effectiveness of recommendations made by the proposed approach over other collaborative filtering and content-based recommendation approaches based on two-dimensional data analysis methods.
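The vector-based similarity measures named in the abstract above (Euclidean, Manhattan and cosine) can be sketched in a few lines. The profile vectors below are hypothetical, and the code illustrates only the two-dimensional baseline measures the thesis compares against, not its tensor-based method.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan_distance(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical search-count profiles of two users over three items.
user_a = [3, 0, 1]
user_b = [6, 0, 2]
similarity = cosine_similarity(user_a, user_b)
```

Note that the two distances grow with dissimilarity while cosine similarity shrinks, so they cannot be swapped without inverting the comparison.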

Relevance:

60.00%

Publisher:

Abstract:

Due to the development of XML and other data models such as OWL and RDF, sharing data is an increasingly common task, since these data models allow simple syntactic translation of data between applications. However, in order for data to be shared semantically, there must be a way to ensure that concepts are the same. One approach is to employ commonly used schemas, called standard schemas, which help guarantee that syntactically identical objects have semantically similar meanings. As a result of the spread of data sharing, there has been widespread adoption of standard schemas in a broad range of disciplines and for a wide variety of applications within a very short period of time. However, standard schemas are still in their infancy and have not yet matured or been thoroughly evaluated. It is imperative that the data management research community takes a closer look at how well these standard schemas have fared in real-world applications to identify not only their advantages, but also the operational challenges that real users face. In this paper, we both examine the usability of standard schemas in a comparison that spans multiple disciplines, and describe our first step at resolving some of these issues in our Semantic Modeling System. We evaluate our Semantic Modeling System through a careful case study of the use of standard schemas in architecture, engineering, and construction, which we conducted with domain experts. We discuss how our Semantic Modeling System can help with the broader problem and also discuss a number of challenges that still remain.

Relevance:

60.00%

Publisher:

Abstract:

The continuous growth of XML data poses a great concern in the area of XML data management. The need for processing large amounts of XML data brings complications to many applications, such as information retrieval, data integration and many others. One way of simplifying this problem is to break the massive amount of data into smaller groups by applying clustering techniques. However, XML clustering is an intricate task that may involve the processing of both the structure and the content of XML data in order to identify similar XML data. This research presents four clustering methods: two methods utilizing the structure of XML documents and the other two utilizing both the structure and the content. The two structural clustering methods have different data models. One is based on a path model and the other is based on a tree model. These methods employ rigid similarity measures which aim to identify corresponding elements between documents with different or similar underlying structure. The two clustering methods that utilize both the structural and content information vary in terms of how the structure and content similarity are combined. One clustering method calculates the document similarity by using a linear weighting combination strategy of structure and content similarities. The content similarity in this clustering method is based on a semantic kernel. The other method calculates the distance between documents by a non-linear combination of the structure and content of XML documents using a semantic kernel. Empirical analysis shows that the structure-only clustering method based on the tree model is more scalable than the structure-only clustering method based on the path model, as the tree similarity measure for the tree model does not need to visit the parents of an element many times. Experimental results also show that the clustering methods perform better with the inclusion of the content information on most test document collections.
To further the research, the structural clustering method based on tree model is extended and employed in XML transformation. The results from the experiments show that the proposed transformation process is faster than the traditional transformation system that translates and converts the source XML documents sequentially. Also, the schema matching process of XML transformation produces a better matching result in a shorter time.
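One of the clustering methods in the abstract above combines structure and content similarity through a linear weighting strategy. A minimal sketch of such a combination follows, assuming both similarities are already scaled to [0, 1]. The function name, the `weight` parameter and its default are hypothetical, since the abstract does not state the weights or the exact formula used.

```python
def combined_similarity(structure_sim, content_sim, weight=0.5):
    """Linear combination of two similarity scores.

    weight: share given to structure similarity (assumed to lie in [0, 1]);
    the remainder goes to content similarity.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie between 0 and 1")
    return weight * structure_sim + (1.0 - weight) * content_sim


# Hypothetical scores for one pair of XML documents.
score = combined_similarity(structure_sim=0.8, content_sim=0.4, weight=0.25)
```

Setting `weight` to 1.0 recovers a structure-only comparison, which makes it easy to check how much the content information contributes.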

Relevance:

60.00%

Publisher:

Abstract:

This paper explores the impact that extreme weather events can have on communities. Using the Brisbane floods of 2011 to examine the recovery operations, the paper highlights the effectiveness of recovery and rebuilding in already strong and resilient communities. Our research has shown that communities which have a strong sense of identity, as well as organized places to meet, develop resilient networks that come into play in times of crisis. The increasing trend of the fly-in/fly-out (FIFO) or drive-in/drive-out (DIDO) workforce to service regional areas has undermined the resilience of existing communities. The first hint of this occurs with community groups not knowing who their neighbours are. The paper is based on research examining the needs of groups in regional communities with the goal to better equip regional communities with the capacity to respond positively to change (and crisis) through innovative, evidence-based policies, resilience strategies and tools. Part of this process was to build an evidence-base to address a range of challenges associated with the place-based environments and the sharing of information systems within communities and decision makers. The first part of the paper explores the context in which communities have been required to mobilize in response to crises; the issues that have galvanized a common purpose; and the methods by which these communities shared their knowledge. The second part of the paper examines how communities could plan for and mitigate natural disasters in the future by developing better decision making tools. The paper defines the requirements for information systems that will link data models of built infrastructure with data from the disaster and response plans. These will then form the basis for the use of social media to coordinate activities between official crews and the public to improve response coordination and provide the technology that could reduce the time required to allow communities to resume some semblance of normality.

Relevance:

60.00%

Publisher:

Abstract:

This paper reviews the use of multi-agent systems (MAS) to model the impacts of high levels of photovoltaic (PV) system penetration in distribution networks and presents some preliminary data obtained from the Perth Solar City high penetration PV trial. The Perth Solar City trial consists of a low voltage distribution feeder supplying 75 customers, of whom 29 have rooftop photovoltaic systems. Data is collected from smart meters at each consumer's premises, from data loggers at the transformer low voltage (LV) side and from a nearby distribution network SCADA measurement point on the high voltage (HV) side of the transformer. The data will be used to progressively develop MAS models.

Relevance:

60.00%

Publisher:

Abstract:

The von Bertalanffy growth model is extended to incorporate explanatory variables. The generalized model includes the switched growth model and the seasonal growth model as special cases, and can also be used to assess the tagging effect on growth. Distribution-free and consistent estimating functions are constructed for estimation of growth parameters from tag-recapture data in which age at release is unknown. This generalizes the work of James (1991, Biometrics 47 1519-1530) who considered the classical model and allowed for individual variability in growth. A real dataset from barramundi (Lates calcarifer) is analysed to estimate the growth parameters and possible effect of tagging on growth.
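For orientation, the classical von Bertalanffy curve that the abstract above generalizes can be sketched as follows, together with the increment form commonly used when only length at release and time at liberty are known (as in tag-recapture data). The parameter values are hypothetical, and the paper's extension with explanatory variables and its estimating functions are not reproduced here.

```python
import math


def vb_length(age, l_inf, k, t0=0.0):
    """Classical von Bertalanffy length at a given age.

    l_inf: asymptotic length, k: growth rate, t0: theoretical age at length 0.
    """
    return l_inf * (1.0 - math.exp(-k * (age - t0)))


def vb_increment(length_at_release, time_at_liberty, l_inf, k):
    """Expected growth between release and recapture; age is not needed."""
    return (l_inf - length_at_release) * (1.0 - math.exp(-k * time_at_liberty))


# Hypothetical parameters: asymptotic length 120 cm, growth rate 0.2 per year.
released = vb_length(age=3.0, l_inf=120.0, k=0.2)
growth = vb_increment(released, time_at_liberty=1.5, l_inf=120.0, k=0.2)
```

The increment form depends only on the length at release and the elapsed time, which is why it suits data in which age at release is unknown.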

Relevance:

40.00%

Publisher:

Abstract:

Habitat models are widely used in ecology; however, there are relatively few studies of rare species, primarily because of a paucity of survey records and a lack of robust means of assessing the accuracy of modelled spatial predictions. We investigated the potential of compiled ecological data in developing habitat models for Macadamia integrifolia, a vulnerable mid-stratum tree endemic to lowland subtropical rainforests of southeast Queensland, Australia. We compared the performance of two binomial models, Classification and Regression Trees (CART) and Generalised Additive Models (GAM), with Maximum Entropy (MAXENT) models developed from (i) presence records and available absence data and (ii) presence records and background data. The GAM model was the best performer across the range of evaluation measures employed; however, all models were assessed as potentially useful for informing in situ conservation of M. integrifolia. A significant loss in the amount of M. integrifolia habitat has occurred (p < 0.05), with only 37% of former habitat (pre-clearing) remaining in 2003. Remnant patches are significantly smaller, have larger edge-to-area ratios and are more isolated from each other compared to pre-clearing configurations (p < 0.05). Whilst the network of suitable habitat patches is still largely intact, there are numerous smaller patches that are more isolated in the contemporary landscape compared with their connectedness before clearing. These results suggest that in situ conservation of M. integrifolia may be best achieved through a landscape approach that considers the relative contribution of small remnant habitat fragments to the species as a whole, as well as facilitating connectivity among the entire network of habitat patches.