932 results for data complexity
Abstract:
This report is an update of an earlier one produced in January 2010 (see Carrington et al. 2010), which remains available as an ePrint through the project's home page. The report focuses on our examination of extant data sourced with respect to personally and socially risky behaviour associated with males living in regional and remote Australia and available in public databases at the time of production.
Abstract:
This report is an update of an earlier one produced in January 2010 (see Carrington et al. 2010), which remains available as an ePrint through the project's home page. The report considers extant data sourced with respect to some of the consequences of violent acts, incidents, harms and risky behaviour involving males living in regional and remote Australia and available in public databases at the time of production.
Abstract:
Various time-memory tradeoff attacks on stream ciphers have been proposed over the years. However, the claimed success of these attacks assumes that the initialisation process of the stream cipher is one-to-one. Some stream cipher proposals do not have a one-to-one initialisation process. In this paper, we examine the impact of this on the success of time-memory-data tradeoff attacks. Under these circumstances, some attacks are more successful than previously claimed while others are less so. The conditions for both cases are established.
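To make the tradeoff setting concrete, the following is a minimal sketch of a time-memory-data tradeoff in the Babbage-Golic style against a toy, hypothetical keystream generator (the hash-based keystream_prefix function, the key size and the table size are illustrative assumptions, not the ciphers analysed in the paper). Because the toy initialisation is deliberately not one-to-one, distinct (key, IV) pairs can map to the same keystream prefix, which is exactly the situation whose effect on attack success the paper examines.

```python
import os
import hashlib
from collections import defaultdict

def keystream_prefix(key: bytes, iv: bytes, n: int = 8) -> bytes:
    """Toy stand-in for stream-cipher initialisation followed by n bytes of keystream.
    Deliberately not one-to-one: distinct (key, IV) pairs can collide on the same output."""
    return hashlib.sha256(key + iv).digest()[:n]

def build_table(num_entries: int, key_len: int = 3, iv: bytes = b"\x00" * 4) -> dict:
    """Offline phase: map keystream prefixes back to candidate keys (memory M)."""
    table = defaultdict(list)
    for _ in range(num_entries):
        key = os.urandom(key_len)
        table[keystream_prefix(key, iv)].append(key)
    return table

def online_attack(table: dict, observed_prefixes) -> list:
    """Online phase: look up each observed keystream prefix (the 'data' D in the tradeoff);
    any collision with a table entry yields candidate keys."""
    hits = []
    for prefix in observed_prefixes:
        hits.extend(table.get(prefix, []))
    return hits

if __name__ == "__main__":
    iv = b"\x00" * 4
    table = build_table(num_entries=50_000)          # memory M
    secret_key = os.urandom(3)                       # toy 2^24 key space
    observed = [keystream_prefix(secret_key, iv)]    # D observed keystream samples
    print("candidate keys recovered:", online_attack(table, observed))
```

In a realistic attack the table would cover a far larger fraction of the state space and the online phase would use many observed keystream segments; the sketch only illustrates how non-injective initialisation lets several keys collide on one table entry.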
Abstract:
With the increasing number of XML documents in varied domains, it has become essential to identify ways of finding interesting information from these documents. Data mining techniques have been used to derive this information. Mining of XML documents is affected by the document model adopted, owing to the semi-structured nature of these documents. Hence, in this chapter we present an overview of the various models of XML documents, how these models have been used for mining, and some of the issues and challenges in these models. In addition, this chapter provides some insights into future models of XML documents for effectively capturing their two important features, namely structure and content, for mining.
Abstract:
This special issue of the Journal of Urban Technology brings together five articles based on presentations given at the Street Computing workshop held on 24 November 2009 in Melbourne in conjunction with the Australian Computer-Human Interaction conference (OZCHI 2009). Our own article introduces the Street Computing vision and explores its potential, challenges and foundations. In order to do so, we first look at the currently available sources of information and discuss their link to existing research efforts. Section 2 then introduces the notion of Street Computing and our research approach in more detail. Section 3 looks beyond the core concept itself and summarises related work in this field of interest.
Abstract:
This chapter argues that evolutionary economics should be founded upon complex systems theory rather than neo-Darwinian analogies concerning natural selection, which focus on supply side considerations and competition amongst firms and technologies. It suggests that conceptions such as production and consumption functions should be replaced by network representations, in which the preferences or, more correctly, the aspirations of consumers are fundamental and, as such, the primary drivers of economic growth. Technological innovation is viewed as a process that is intermediate between these aspirational networks, and the organizational networks in which goods and services are produced. Consumer knowledge becomes at least as important as producer knowledge in determining how economic value is generated. It becomes clear that the stability afforded by connective systems of rules is essential for economic flexibility to exist, but that too many rules result in inert and structurally unstable states. In contrast, too few rules result in a more stable state, but at a low level of ordered complexity. Economic evolution from this perspective is explored using random and scale free network representations of complex systems.
Abstract:
In this paper we present a sequential Monte Carlo algorithm for Bayesian sequential experimental design applied to generalised non-linear models for discrete data. The approach is computationally convenient in that the information from newly observed data can be incorporated through a simple re-weighting step. We also consider a flexible parametric model for the stimulus-response relationship, together with a newly developed hybrid design utility that can produce more robust estimates of the target stimulus in the presence of substantial model and parameter uncertainty. The algorithm is applied to hypothetical clinical trial and bioassay scenarios. In the discussion, potential generalisations of the algorithm are suggested that could extend its applicability to a wider variety of scenarios.
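As a rough illustration of the re-weighting idea only (not the authors' actual algorithm, models or design utility), the sketch below maintains parameter particles for an assumed logistic stimulus-response model and updates their weights when a new binary observation arrives; all names, priors and values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Particles approximate the posterior over (intercept, slope) of a toy logistic model.
particles = rng.normal(loc=0.0, scale=2.0, size=(5000, 2))   # draws from an assumed prior
weights = np.full(len(particles), 1.0 / len(particles))

def likelihood(theta, dose, response):
    """Bernoulli likelihood of one observation under a logistic stimulus-response curve."""
    p = 1.0 / (1.0 + np.exp(-(theta[:, 0] + theta[:, 1] * dose)))
    return p if response == 1 else 1.0 - p

def reweight(weights, particles, dose, response):
    """Incorporate a new observation through a simple importance re-weighting step."""
    w = weights * likelihood(particles, dose, response)
    return w / w.sum()

# Example: observe a positive response at dose 0.7, then estimate the posterior mean.
weights = reweight(weights, particles, dose=0.7, response=1)
print("posterior mean of (intercept, slope):", weights @ particles)
```

In a full sequential design algorithm this step would alternate with choosing the next stimulus via a design utility and with occasional resampling to counter weight degeneracy.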
Abstract:
Several authors stress that data provides a crucial foundation for operational, tactical and strategic decisions (e.g., Redman 1998, Tee et al. 2007). Data provides the basis for decision making, as data collection and processing are typically associated with reducing uncertainty in order to make more effective decisions (Daft and Lengel 1986). While the first series of Information Systems/Information Technology (IS/IT) investments in organizations improved data collection, restricted computational capacity and limited processing power created challenges (Simon 1960). Fifty years on, capacity and processing problems are increasingly less relevant; in fact, the opposite problem exists. Determining data relevance and usefulness is complicated by increased data capture and storage capacity, as well as continual improvements in information processing capability. As the IT landscape changes, businesses are inundated with ever-increasing volumes of data from both internal and external sources, available on both an ad-hoc and real-time basis. More data, however, does not necessarily translate into more effective and efficient organizations, nor does it increase the likelihood of better or timelier decisions. This raises questions about what data managers require to assist their decision-making processes.
Abstract:
In the last few years we have observed a proliferation of approaches for clustering XML documents and schemas based on their structure and content. The presence of such a large number of approaches is due to the different applications requiring the XML data to be clustered. These applications need data in the form of similar contents, tags, paths, structures and semantics. In this paper, we first outline the application contexts in which clustering is useful, then we survey the approaches proposed so far according to the abstract representation of data (instances or schema), the identified similarity measure, and the clustering algorithm. This presentation leads to a taxonomy in which the current approaches can be classified and compared. We aim to introduce an integrated view that is useful when comparing XML data clustering approaches, when developing a new clustering algorithm, and when implementing an XML clustering component. Finally, the paper moves on to a description of future trends and research issues that still need to be addressed.
Abstract:
BACKGROUND: Emergency departments (EDs) are critical to the management of acute illness and injury, and to the provision of health system access. However, EDs have become increasingly congested due to increased demand, increased complexity of care and blocked access to ongoing care (access block). Congestion has clinical and organisational implications. This paper aims to describe the factors that appear to influence demand for ED services, and their interrelationships, as the basis for further research into the role of private hospital EDs. DATA SOURCES: Multiple databases (PubMed, ProQuest, Academic Search Elite and Science Direct) and relevant journals were searched using terms related to EDs and emergency health needs. Literature pertaining to emergency department utilisation worldwide was identified, and articles were selected for further examination on the basis of their relevance and significance to ED demand. RESULTS: Factors influencing ED demand can be categorised into those describing the health needs of patients, those predisposing a patient to seek help, and those relating to policy factors such as provision of services and insurance status. This paper describes the factors influencing ED presentations and proposes a novel conceptual map of their interrelationships. CONCLUSION: This review has explored the factors contributing to the growing demand for ED care, the influence these factors have on ED demand, and their interrelationships as depicted in the conceptual model.
Abstract:
With the growing number of XML documents on the Web it becomes essential to effectively organise these XML documents in order to retrieve useful information from them. A possible solution is to apply clustering to the XML documents to discover knowledge that promotes effective data management, information retrieval and query processing. However, many issues arise in discovering knowledge from these types of semi-structured documents due to their heterogeneity and structural irregularity. Most of the existing research on clustering techniques focuses on only one feature of the XML documents, this being either their structure or their content, due to scalability and complexity problems. The knowledge gained in the form of clusters based on the structure or the content alone is not suitable for real-life datasets. It therefore becomes essential to include both the structure and the content of XML documents in order to improve the accuracy and meaning of the clustering solution. However, the inclusion of both these kinds of information in the clustering process results in a huge overhead for the underlying clustering algorithm because of the high dimensionality of the data. The overall objective of this thesis is to address these issues by: (1) proposing methods to utilise frequent pattern mining techniques to reduce the dimensionality; (2) developing models to effectively combine the structure and content of XML documents; and (3) utilising the proposed models in clustering. This research first determines the structural similarity in the form of frequent subtrees and then uses these frequent subtrees to represent the constrained content of the XML documents in order to determine the content similarity. A clustering framework with two types of models, implicit and explicit, is developed. The implicit model uses a Vector Space Model (VSM) to combine the structure and the content information. The explicit model uses a higher-order model, namely a 3-order Tensor Space Model (TSM), to explicitly combine the structure and the content information. This thesis also proposes a novel incremental technique to decompose large-sized tensor models and utilises the decomposed solution for clustering the XML documents. The proposed framework and its components were extensively evaluated on several real-life datasets exhibiting extreme characteristics to understand the usefulness of the proposed framework in real-life situations. Additionally, this research evaluates the outcome of the clustering process on the collection selection problem in information retrieval on the Wikipedia dataset. The experimental results demonstrate that the proposed frequent pattern mining and clustering methods outperform the related state-of-the-art approaches. In particular, the proposed framework of utilising frequent structures for constraining the content shows an improvement in accuracy over content-only and structure-only clustering results. The scalability experiments conducted on large-scale datasets clearly show the strengths of the proposed methods over state-of-the-art methods. In particular, this thesis contributes to effectively combining the structure and the content of XML documents for clustering, in order to improve the accuracy of the clustering solution. In addition, it contributes by addressing the research gaps in frequent pattern mining to generate efficient and concise frequent subtrees with various node relationships that could be used in clustering.
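A heavily simplified sketch of the implicit-model idea described above: structural features (parent/child tag pairs stand in for the frequent subtrees used in the thesis) and content terms are placed in a single vector space and then clustered. The feature extraction, weighting and toy documents are assumptions for illustration, not the thesis's actual pipeline or datasets.

```python
from xml.etree import ElementTree as ET
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

def doc_to_tokens(xml_string: str) -> str:
    """Represent one XML document as 'structure tokens' (parent/child tag pairs,
    a crude stand-in for frequent subtrees) plus its text content tokens."""
    root = ET.fromstring(xml_string)
    tokens = []
    for parent in root.iter():
        for child in parent:
            tokens.append(f"STRUCT_{parent.tag}/{child.tag}")
        if parent.text and parent.text.strip():
            tokens.extend(parent.text.lower().split())
    return " ".join(tokens)

docs = [
    "<movie><title>Alien sequel</title><genre>sci-fi</genre></movie>",
    "<movie><title>Space horror</title><genre>sci-fi</genre></movie>",
    "<recipe><name>Pasta</name><step>boil water</step></recipe>",
]

# Build a combined structure+content vector space model and cluster it.
vectors = TfidfVectorizer(token_pattern=r"\S+").fit_transform([doc_to_tokens(d) for d in docs])
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)
print(labels)
```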
Abstract:
This paper argues for a renewed focus on statistical reasoning in the beginning school years, with opportunities for children to engage in data modelling. Results are reported from the first year of a 3-year longitudinal study in which three classes of first-grade children (6-year-olds) and their teachers engaged in data modelling activities. The theme of Looking after our Environment, part of the children's science curriculum, provided the task context. The goals for the two activities addressed here included engaging children in core components of data modelling, namely, selecting attributes, structuring and representing data, identifying variation in data, and making predictions from given data. Results include the various ways in which children represented and re-represented collected data, including attribute selection, and the metarepresentational competence they displayed in doing so. The "data lenses" through which the children dealt with informal inference (variation and prediction) are also reported.
Abstract:
In response to the need to leverage private finance and the lack of competition in some parts of the Australian public sector infrastructure market, especially in the very large economic infrastructure sector procured using Public Private Partnerships, the Australian Federal government has demonstrated its desire to attract new sources of in-bound foreign direct investment (FDI). This paper aims to report on progress towards an investigation into the determinants of multinational contractors' willingness to bid for Australian public sector major infrastructure projects. This research deploys Dunning's eclectic theory for the first time in terms of in-bound FDI by multinational contractors into Australia. Elsewhere, the authors have developed Dunning's principal hypothesis to suit the context of this research and to address a weakness in this hypothesis, which is based on a nominal approach to the factors in Dunning's eclectic framework and fails to speak to the relative explanatory power of these factors. In this paper, a first-stage test of the authors' development of Dunning's hypothesis is presented by way of an initial review of secondary data vis-à-vis the selected sector (roads and bridges) in Australia (as the host location) and with respect to four selected home countries (China, Japan, Spain and the US). In doing so, the next stage in the research method concerning sampling and case studies is also further developed and described in this paper. In conclusion, the extent to which the initial review of secondary data suggests the relative importance of the factors in the eclectic framework is considered. It is noted that more robust conclusions are expected following the future planned stages of the research, including primary data from the case studies and a global survey of the world's largest contractors, which is briefly previewed. Finally, beyond the theoretical contributions expected from the overall approach taken to developing and testing Dunning's framework, other expected contributions concerning research method and practical implications are mentioned.
Abstract:
A rule-based approach for classifying previously identified medical concepts in clinical free text into an assertion category is presented. There are six different categories of assertions for the task: Present, Absent, Possible, Conditional, Hypothetical and Not associated with the patient. The assertion classification algorithms were largely based on extending the popular NegEx and ConText algorithms. In addition, the clinical terminology SNOMED CT and other publicly available dictionaries were used to classify assertions which did not fit the NegEx/ConText model. The data for this task include discharge summaries from Partners HealthCare and from Beth Israel Deaconess Medical Centre, as well as discharge summaries and progress notes from University of Pittsburgh Medical Centre. The set consists of 349 discharge reports, each with pairs of ground-truth concept and assertion files for system development, and 477 reports for evaluation. The system's performance on the evaluation data set was 0.83, 0.83 and 0.83 for recall, precision and F1-measure, respectively. Although the rule-based system shows promise, further improvements can be made by incorporating machine learning approaches.
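A minimal, hypothetical sketch of a NegEx/ConText-style rule of the kind described: trigger phrases found within a fixed word window before a concept mention select one of the assertion categories. The trigger lists, window size and category names below are illustrative assumptions and are far smaller and simpler than the actual NegEx/ConText resources or the system evaluated.

```python
# Tiny illustrative trigger lists (the real NegEx/ConText lexicons are much larger).
TRIGGERS = {
    "absent": ["no", "denies", "without", "negative for"],
    "possible": ["possible", "probable", "suspected"],
    "hypothetical": ["if", "should she develop", "return if"],
    "conditional": ["on exertion", "when walking"],
    "not_patient": ["family history of", "mother had"],
}

def classify_assertion(sentence: str, concept: str, window: int = 6) -> str:
    """Assign an assertion category to a concept based on trigger phrases
    appearing within a fixed word window before the concept mention."""
    words = sentence.lower().split()
    c_words = concept.lower().split()
    concept_pos = None
    for i in range(len(words) - len(c_words) + 1):
        if words[i:i + len(c_words)] == c_words:
            concept_pos = i
            break
    if concept_pos is None:
        return "present"
    context = " ".join(words[max(0, concept_pos - window):concept_pos])
    for category, phrases in TRIGGERS.items():
        if any(p in context for p in phrases):
            return category
    return "present"

print(classify_assertion("The patient denies any chest pain on admission", "chest pain"))
# -> 'absent'
```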
Abstract:
Purpose---The aim of this study is to identify complexity measures for building projects in the People's Republic of China (PRC). Design/Methodology/Approach---A three-round Delphi questionnaire survey was conducted to identify the key parameters that measure the degree of project complexity. A complexity index (CI) was developed based on the identified measures and their relative importance. Findings---Six key measures of project complexity were identified, namely: (1) building structure and function; (2) construction method; (3) urgency of the project schedule; (4) project size/scale; (5) geological condition; and (6) neighbouring environment. Practical implications---These complexity measures help stakeholders assess the degree of project complexity and better manage the potential risks induced by different levels of project complexity. Originality/Value---The findings provide insightful perspectives for defining and understanding project complexity. For stakeholders, understanding and addressing complexity helps to improve project planning and implementation.
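A hedged sketch of how a complexity index of this kind is commonly computed: a weighted sum of per-measure scores, with weights reflecting the relative importance of each measure. The weights and example scores below are placeholders, not the values derived from the Delphi survey reported in the study.

```python
# Hypothetical relative-importance weights for the six identified measures
# (placeholders only; the study derives actual weights from the Delphi rounds).
weights = {
    "building_structure_and_function": 0.22,
    "construction_method": 0.18,
    "urgency_of_schedule": 0.17,
    "project_size_scale": 0.16,
    "geological_condition": 0.14,
    "neighbouring_environment": 0.13,
}

def complexity_index(scores: dict) -> float:
    """Weighted sum of per-measure scores (assumed to be rated on a 1-5 scale),
    normalised by the total weight so the index also lies on a 1-5 scale."""
    return sum(weights[m] * scores[m] for m in weights) / sum(weights.values())

example_project = {
    "building_structure_and_function": 4,
    "construction_method": 3,
    "urgency_of_schedule": 5,
    "project_size_scale": 4,
    "geological_condition": 2,
    "neighbouring_environment": 3,
}
print(f"Complexity index: {complexity_index(example_project):.2f}")
```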