148 results for Cluster Analysis. Information Theory. Entropy. Cross Information Potential. Complex Data
Abstract:
IT-supported field data management benefits on-site construction management by improving accessibility to information and promoting efficient communication between project team members. However, most on-site safety inspections still rely heavily on subjective judgment and manual reporting processes, so observers’ experience often determines the quality of risk identification and control. This study aims to develop a methodology to efficiently retrieve safety-related information so that safety inspectors can easily access the relevant site safety information for safer decision making. The proposed methodology consists of three stages: (1) development of a comprehensive safety database which contains information on risk factors, accident types, impact of accidents and safety regulations; (2) identification of relationships among different risk factors based on statistical analysis methods; and (3) user-specified information retrieval using data mining techniques for safety management. This paper presents the overall methodology and preliminary results of the first-stage research conducted with 101 accident investigation reports.
Abstract:
This work identifies the limitations of n-way data analysis techniques in multidimensional stream data, such as Internet chat room communications data, and establishes a link between data collection and performance of these techniques. Its contributions are twofold. First, it extends data analysis to multiple dimensions by constructing n-way data arrays known as high-order tensors. Chat room tensors are generated by a simulator which collects and models actual communication data. The accuracy of the model is determined by the Kolmogorov-Smirnov goodness-of-fit test, which compares the simulation data with the observed (real) data. Second, a detailed computational comparison is performed to test several data analysis techniques, including SVD [1], and multi-way techniques including Tucker1, Tucker3 [2], and PARAFAC [3].
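As an illustration of the goodness-of-fit step mentioned in this abstract, the following is a minimal sketch of a two-sample Kolmogorov-Smirnov statistic in plain Python. The abstract does not say which form of the test was used, so the two-sample form is an assumption, and the function name `ks_statistic` and the sample values are hypothetical.

```python
from bisect import bisect_right


def ks_statistic(simulated, observed):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    a, b = sorted(simulated), sorted(observed)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    largest_gap = 0.0
    # The empirical CDFs only change at sample points, so checking those suffices.
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


# Hypothetical inter-message gaps in seconds (not data from the study).
simulated_gaps = [1.0, 2.0, 3.0, 4.0]
observed_gaps = [1.5, 2.5, 3.5, 4.5]
print(ks_statistic(simulated_gaps, observed_gaps))
```

A small statistic suggests the simulated data follow the observed distribution closely; turning it into a p-value requires the KS distribution, which is omitted here.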
Abstract:
Identifying the design features that impact construction is essential to developing cost effective and constructible designs. The similarity of building components is a critical design feature that affects method selection, productivity, and ultimately construction cost and schedule performance. However, there is limited understanding of what constitutes similarity in the design of building components and limited computer-based support to identify this feature in a building product model. This paper contributes a feature-based framework for representing and reasoning about component similarity that builds on ontological modelling, model-based reasoning and cluster analysis techniques. It describes the ontology we developed to characterize component similarity in terms of the component attributes, the direction, and the degree of variation. It also describes the generic reasoning process we formalized to identify component similarity in a standard product model based on practitioners' varied preferences. The generic reasoning process evaluates the geometric, topological, and symbolic similarities between components, creates groupings of similar components, and quantifies the degree of similarity. We implemented this reasoning process in a prototype cost estimating application, which creates and maintains cost estimates based on a building product model. Validation studies of the prototype system provide evidence that the framework is general and enables a more accurate and efficient cost estimating process.
Abstract:
The linguistic turn within philosophy has recently gained increased attention within social sciences. It can be seen as an attempt to investigate traditional philosophical problems by analysing the linguistic expressions used for these investigations. More generally, the phenomenon of language itself must be considered because of its (constitutional) impact on the investigation of phenomena in social sciences. In order to understand the consequences of the linguistic turn, its origins in philosophy are important and will be discussed. Within social sciences the linguistic turn already had significant impact. As an example, we will therefore discuss what directions the linguistic turn enabled for organizational analysis. Information Systems as a discipline must face the consequences of the linguistic turn as well. We will discuss how the linguistic framework introduced impacts the development of knowledge management and that of managerial and organizational support systems. This example shows what different perspectives the linguistic turn can provide for investigations within Information Systems. In addition, we will briefly outline the impact of the linguistic turn with respect to methodologies in Information Systems research.
Abstract:
Recent advances in the area of ‘Transformational Government’ position the citizen at the centre of focus. This paradigm shift from a department-centric to a citizen-centric focus requires governments to re-think their approach to service delivery, thereby decreasing costs and increasing citizen satisfaction. The introduction of franchises as a virtual business layer between the departments and their citizens is intended to provide a solution. Franchises are structured to address the needs of citizens independent of internal departmental structures. For delivering services online, governments pursue the development of a One-Stop Portal, which structures information and services through those franchises. Thus, each franchise can be mapped to a specific service bundle, which groups together services that are deemed to be of relevance to a specific citizen need. This study focuses on the development and evaluation of these service bundles. In particular, two research questions guide the line of investigation of this study: Research Question 1: What methods can be used by governments to identify service bundles as part of governmental One-Stop Portals? Research Question 2: How can the quality of service bundles in governmental One-Stop Portals be evaluated? The first research question concerns suitable methods for identifying service bundles. A literature review was conducted to conceptualise the service bundling task in general. As a consequence, a 4-layer model of service bundling and a morphological box were created, detailing characteristics that are of relevance when identifying service bundles. Furthermore, a literature review of Decision-Support Systems was conducted to identify approaches of relevance in different bundling scenarios. These initial findings were complemented by targeted studies of multiple leading governments in the e-government domain, as well as with a local expert in the field.
Here, the aim was to identify the current status of online service delivery and service bundling in practice. These findings led to the conceptualisation of two service bundle identification methods applicable in the context of the Queensland Government. The first is a provider-driven approach based on service description languages, attributes, and relationships between services. The second is a citizen-driven approach based on analysing the outcomes of content identification and grouping workshops with citizens. Both methods were then applied and evaluated in practice. The conceptualisation of the provider-driven method for service bundling required the initial specification of relevant attributes that could be used to identify similarities between services, called relationships; these relationships then formed the basis for the identification of service bundles. This study conceptualised and defined seven relationships, namely ‘Co-location’, ‘Resource’, ‘Co-occurrence’, ‘Event’, ‘Consumer’, ‘Provider’, and ‘Type’. The relationships, and the bundling method itself, were applied and refined as part of six Action Research cycles in collaboration with the Queensland Government. The findings show that attributes and relationships can be used effectively as a means for bundle identification, if distinct decision rules are in place to prescribe how services are to be identified. For the conceptualisation of the citizen-driven method, insights from the case studies led to the decision to involve citizens through card sorting activities. Based on an initial list of services relevant for a certain franchise, participating citizens grouped services according to their liking. The card sorting activity, as well as the required analysis and aggregation of the individual card sorting results, was analysed in depth as part of this study.
A framework was developed that can be used as a decision-support tool to assist with the decision of what card sorting analysis method should be utilised in a given scenario. The characteristic features associated with card sorting in a government context led to the decision to utilise statistical analysis approaches, such as cluster analysis and factor analysis, to aggregate card sorting results. The second research question asks how the quality of service bundles can be assessed. An extensive literature review was conducted focussing on bundle, portal, and e-service quality. It was found that different studies use different constructs, terminology, and units of analysis, which makes comparing these models a difficult task. As a direct result, a framework was conceptualised, that can be used to position past and future studies in this research domain. Complementing the literature review, interviews conducted as part of the case studies with leaders in e-government, indicated that, typically, satisfaction is evaluated for the overall portal once the portal is online, but quality tests are not conducted during the development phase. Consequently, a research model which appropriately defines perceived service bundle quality would need to be developed from scratch. Based on existing theory, such as Theory of Reasoned Action, Expectation Confirmation Theory, and Theory of Affordances, perceived service bundle quality was defined as an inferential belief. Perceived service bundle quality was positioned within the nomological net of services. Based on the literature analysis on quality, and on the subsequent work of a focus group, the hypothesised antecedents (descriptive beliefs) of the construct and the associated question items were defined and the research model conceptualised. The model was then tested, refined, and finally validated during six Action Research cycles. 
Results show no significant difference in perceived quality or satisfaction among users between the provider-driven method and the citizen-driven method. The decision on which method to choose, it was found, should be based on contextual factors, such as objectives, resources, and the need for visibility. The constructs of the bundle quality model were examined. While the quality of bundles identified through the citizen-centric approach could be explained through the constructs ‘Navigation’, ‘Ease of Understanding’, and ‘Organisation’, bundles identified through the provider-driven approach could be explained solely through the constructs ‘Navigation’ and ‘Ease of Understanding’. An active labelling style for bundles, as part of the provider-driven Information Architecture, had a larger impact on ‘Quality’ than the topical labelling style used in the citizen-centric Information Architecture. However, ‘Organisation’, reflecting the internal, logical structure of the Information Architecture, was a significant factor impacting on ‘Quality’ only in the citizen-driven Information Architecture. Hence, it was concluded that active labelling can compensate for a lack of logical structure. Further studies are needed to test this conjecture. Such studies may involve building alternative models and conducting additional empirical research (e.g. use of an active labelling style for the citizen-driven Information Architecture). This thesis contributes to the body of knowledge in several ways. Firstly, it presents an empirically validated model of the factors explaining and predicting a citizen’s perception of service bundle quality. Secondly, it provides two alternative methods that can be used by governments to identify service bundles in structuring the content of a One-Stop Portal.
Thirdly, this thesis provides a detailed narrative to suggest how the recent paradigm shift in the public domain, towards a citizen-centric focus, can be pursued by governments; the research methodology followed by this study can serve as an exemplar for governments seeking to achieve a citizen-centric approach to service delivery.
Abstract:
Quantum-inspired models have recently attracted increasing attention in Information Retrieval. An intriguing characteristic of the mathematical framework of quantum theory is the presence of complex numbers. However, it is unclear what such numbers could or would actually represent or mean in Information Retrieval. The goal of this paper is to discuss the role of complex numbers within the context of Information Retrieval. First, we introduce how complex numbers are used in quantum probability theory. Then, we examine van Rijsbergen’s proposal of evoking complex-valued representations of information objects. We empirically show that such a representation is unlikely to be effective in practice (confuting its usefulness in Information Retrieval). We then explore alternative proposals which may be more successful at realising the power of complex numbers.
Abstract:
Moving cell fronts are an essential feature of wound healing, development and disease. The rate at which a cell front moves is driven, in part, by the cell motility, quantified in terms of the cell diffusivity $D$, and the cell proliferation rate $\lambda$. Scratch assays are a commonly-reported procedure used to investigate the motion of cell fronts where an initial cell monolayer is scratched and the motion of the front is monitored over a short period of time, often less than 24 hours. The simplest way of quantifying a scratch assay is to monitor the progression of the leading edge. Leading edge data is very convenient since, unlike other methods, it is nondestructive and does not require labeling, tracking or counting individual cells amongst the population. In this work we study short time leading edge data in a scratch assay using a discrete mathematical model and automated image analysis with the aim of investigating whether such data allows us to reliably identify $D$ and $\lambda$. Using a naïve calibration approach where we simply scan the relevant region of the $(D, \lambda)$ parameter space, we show that there are many choices of $D$ and $\lambda$ for which our model produces indistinguishable short time leading edge data. Therefore, without due care, it is impossible to estimate $D$ and $\lambda$ from this kind of data. To address this, we present a modified approach accounting for the fact that cell motility occurs over a much shorter time scale than proliferation. Using this information we divide the duration of the experiment into two periods, and we estimate $D$ using data from the first period, while we estimate $\lambda$ using data from the second period.
We confirm the accuracy of our approach using in silico data and a new set of in vitro data, which shows that our method recovers estimates of $D$ and $\lambda$ that are consistent with previously-reported values except that our approach is fast, inexpensive, nondestructive and avoids the need for cell labeling and cell counting.
Abstract:
The top-k retrieval problem aims to find the optimal set of k documents from a number of relevant documents given the user’s query. The key issue is to balance the relevance and diversity of the top-k search results. In this paper, we address this problem using Facility Location Analysis taken from Operations Research, where the locations of facilities are optimally chosen according to some criteria. We show how this analysis technique is a generalization of state-of-the-art retrieval models for diversification (such as the Modern Portfolio Theory for Information Retrieval), which treat the top-k search results like “obnoxious facilities” that should be dispersed as far as possible from each other. However, Facility Location Analysis suggests that the top-k search results could be treated like “desirable facilities” to be placed as close as possible to their customers. This leads to a new top-k retrieval model where the best representatives of the relevant documents are selected. In a series of experiments conducted on two TREC diversity collections, we show that significant improvements can be made over the current state-of-the-art through this alternative treatment of the top-k retrieval problem.
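The "desirable facilities" view in this abstract can be illustrated with a small sketch: choose k documents so that every relevant document lies close to at least one chosen representative. The greedy k-median heuristic below is an assumption made for illustration and is not claimed to be the model used in the paper; the function name `greedy_k_median` and the toy distances are hypothetical.

```python
def greedy_k_median(distance, k):
    """Greedily pick k representatives that minimise the total distance
    from every document to its nearest chosen representative."""
    n = len(distance)
    if not 0 < k <= n:
        raise ValueError("k must be between 1 and the number of documents")
    chosen = []
    nearest = [float("inf")] * n  # distance from each document to its closest pick
    for _ in range(k):
        best, best_cost = None, float("inf")
        for candidate in range(n):
            if candidate in chosen:
                continue
            # Total cost if this candidate were added to the current selection.
            cost = sum(min(nearest[i], distance[i][candidate]) for i in range(n))
            if cost < best_cost:
                best, best_cost = candidate, cost
        chosen.append(best)
        nearest = [min(nearest[i], distance[i][best]) for i in range(n)]
    return chosen


# Hypothetical documents placed on a line; two loose groups of "customers".
positions = [0, 1, 2, 3, 10, 11, 12]
distance = [[abs(p - q) for q in positions] for p in positions]
print(greedy_k_median(distance, 2))
```

The selection returns one representative from each group, which is the behaviour the abstract describes as picking the best representatives of the relevant documents rather than dispersing results as far apart as possible.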
Abstract:
Data in germplasm collections contain a mixture of data types: binary, multistate and quantitative. Given the multivariate nature of these data, the pattern analysis methods of classification and ordination have been identified as suitable techniques for statistically evaluating the available diversity. The proximity (or resemblance) measure, which is in part the basis of the complementary nature of classification and ordination techniques, is often specific to particular data types. The use of a combined resemblance matrix has an advantage over data-type-specific proximity measures. This measure accommodates the different data types without manipulating them to be of a specific type. Descriptors are partitioned into their data types and an appropriate proximity measure is used on each. The separate proximity matrices, after range standardisation, are added as a weighted average and the combined resemblance matrix is then used for classification and ordination. Germplasm evaluation data for 831 accessions of groundnut (Arachis hypogaea L.) from the Australian Tropical Field Crops Genetic Resource Centre, Biloela, Queensland were examined. Data for four binary, five ordered multistate and seven quantitative descriptors have been documented. The interpretative value of different weightings (equal and unequal weighting of data types to obtain a combined resemblance matrix) was investigated by using principal co-ordinate analysis (ordination) and hierarchical cluster analysis. Equal weighting of data types was found to be more valuable for these data as the results provided a greater insight into the patterns of variability available in the Australian groundnut germplasm collection. The complementary nature of pattern analysis techniques enables plant breeders to identify relevant accessions in relation to the descriptors which distinguish amongst them.
This additional information may provide plant breeders with a more defined entry point into the germplasm collection for identifying sources of variability for their plant improvement program, thus improving the utilisation of germplasm resources.
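The combination step described in this abstract (range-standardise each data-type-specific proximity matrix, then take a weighted average) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and the toy dissimilarity values are hypothetical, and the per-type proximity measures themselves are taken as already computed.

```python
def range_standardise(matrix):
    """Rescale all entries of a proximity matrix to the interval [0, 1]."""
    values = [v for row in matrix for v in row]
    lo, hi = min(values), max(values)
    if hi == lo:
        return [[0.0 for _ in row] for row in matrix]
    return [[(v - lo) / (hi - lo) for v in row] for row in matrix]


def combine_resemblance(matrices, weights):
    """Weighted average of range-standardised proximity matrices."""
    if not matrices or len(matrices) != len(weights):
        raise ValueError("need one weight per proximity matrix")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    scaled = [range_standardise(m) for m in matrices]
    n = len(scaled[0])
    return [
        [sum(w * m[i][j] for m, w in zip(scaled, weights)) / total for j in range(n)]
        for i in range(n)
    ]


# Hypothetical dissimilarities among three accessions, one matrix per data type.
binary_part = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
quantitative_part = [[0, 4, 10], [4, 0, 6], [10, 6, 0]]
combined = combine_resemblance([binary_part, quantitative_part], [1, 1])
for row in combined:
    print([round(v, 2) for v in row])
```

Changing the weights from `[1, 1]` to unequal values corresponds to the unequal weighting of data types that the abstract compares against equal weighting.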
Abstract:
Information on the variation available for different plant attributes has enabled germplasm collections to be effectively utilised in plant breeding. A world-sourced collection of white clover germplasm has been developed at the White Clover Resource Centre at Glen Innes, New South Wales. This collection of 439 accessions was characterised under field conditions as a preliminary study of the genotypic variation for morphological attributes (stolon density, stolon branching, number of nodes, number of rooted nodes, stolon thickness, internode length, leaf length, plant height and plant spread), together with seasonal herbage yield. Characterisation was conducted on different batches of germplasm (subsets of accessions taken from the complete collection) over a period of five years. Inclusion of two check cultivars, Haifa and Huia, in each batch enabled adjustment of the characterisation data for year effects and attribute-by-year interaction effects. The component of variance for seasonal herbage yield among batches was large relative to that for accessions. Accession-by-experiment and accession-by-season interactions for herbage yield were not detected. Accession mean repeatability for herbage yield across seasons was intermediate (0.453). The components of genotypic variance among accessions for all attributes, except plant height, were larger than their respective standard errors. The estimates of accession mean repeatability for the attributes ranged from low (0.277 for plant height) to intermediate (0.544 for internode length). Multivariate techniques of clustering and ordination were used to investigate the diversity present among the accessions in the collection. Both cluster analysis and principal component analysis suggested that seven groups of accessions existed.
It was also proposed from the pattern analysis results that accessions from a group characterised by large leaves, tall plants and thick stolons could be crossed with accessions from a group that had above-average stolon density and stolon branching. This material could produce breeding populations to be used in recurrent selection for the development of white clover cultivars for dryland summer moisture stress environments in Australia. The germplasm collection was also found to be deficient in genotypes with high stolon density, high number of branches, high number of rooted nodes and large leaves. This warrants addition of new germplasm accessions possessing these characteristics to the present germplasm collection.
Abstract:
Ecosystem-based management requires the integration of various types of assessment indicators. Understanding stakeholders' information preferences is important in selecting those indicators that best support management and policy. Both the preferences of decision-makers and the general public may matter in democratic participatory management institutions. This paper presents a multi-criteria analysis aimed at quantifying the relative importance to these groups of economic, ecological and socio-economic indicators usually considered when managing ecosystem services in a coastal development context. The Analytic Hierarchy Process (AHP) is applied within two nationwide surveys in Australia, and preferences of both the general public and decision-makers for these indicators are elicited and compared. Results show that, on average across both groups, the priority in assessing a generic coastal development project is for the ecological assessment of its impacts on marine biodiversity. Ecological assessment indicators are globally preferred to both economic and socio-economic indicators regardless of the nature of the impacts studied. These results are observed for a significantly larger proportion of decision-maker than general public respondents, questioning the extent to which the general public's preferences are well reflected in decision-making processes.
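To show how AHP turns pairwise judgments into relative importance weights, here is a minimal sketch using the row geometric mean method, one common way of deriving priorities from a comparison matrix. The abstract does not state which estimation method the study used, so this choice is an assumption; the function name `ahp_priorities` and the judgment values are hypothetical and are not survey results.

```python
import math


def ahp_priorities(pairwise):
    """Priority weights from a reciprocal pairwise comparison matrix,
    using normalised row geometric means."""
    n = len(pairwise)
    if any(len(row) != n for row in pairwise):
        raise ValueError("comparison matrix must be square")
    geometric_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geometric_means)
    return [g / total for g in geometric_means]


# Hypothetical judgments: entry [i][j] says how much more important
# indicator i is than indicator j.
indicators = ["ecological", "economic", "socio-economic"]
pairwise = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
for name, weight in zip(indicators, ahp_priorities(pairwise)):
    print(f"{name}: {weight:.3f}")
```

For a perfectly consistent matrix such as this one, the geometric mean weights coincide with the principal eigenvector weights; with inconsistent real-world judgments the two can differ slightly, and a consistency check would normally accompany the result.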
Abstract:
Timely feedback is a vital component in the learning process. It is especially important for beginner students in Information Technology since many have not yet formed an effective internal model of a computer that they can use to construct viable knowledge. Research has shown that learning efficiency is increased if immediate feedback is provided for students. Automatic analysis of student programs has the potential to provide immediate feedback for students and to assist teaching staff in the marking process. This paper describes a “fill in the gap” programming analysis framework which tests students’ solutions and gives feedback on their correctness, detects logic errors and provides hints on how to fix these errors. Currently, the framework is being used with the Environment for Learning to Program (ELP) system at Queensland University of Technology (QUT); however, the framework can be integrated into any existing online learning environment or programming Integrated Development Environment (IDE).
Abstract:
Student underachievement in the middle years (typically Years 4 to 9) is a concern in education. Incorporating Information and Communication Technologies (ICT) in assessment that is aligned to teaching and learning has the potential to engage students in higher cognitive processes that lead to increased student achievement. To examine this proposition an investigation was undertaken into teachers’ perceptions of alignment and the implications of those for student achievement in ICT-enhanced middle years assessment tasks. This investigation used a collective case study design underpinned by socio-cultural theory. Two methods were used for data collection, namely, semi-structured interviews with individual teachers and a focus group discussion with teachers and another with students. Findings revealed teachers’ perceptions that alignment: assists in mediating achievement of learning outcomes in quality middle years assessment tasks, assists in creating a challenging but supportive environment in which positive learning dispositions and success are encouraged for all students, and contributes to more rigorous use of ICT in assessment. The process of implementing alignment was found to be complex but assisted through prioritising particular practices. These findings enabled the development of eight steps which serve as a guide to the effective implementation of alignment in middle years assessment tasks.
Abstract:
Communication is one team process factor that has received considerable research attention in the team literature. This literature provides equivocal evidence regarding the role of communication in team performance and yet does not provide any evidence for when communication becomes important for team performance. This research program sought to address this evidence gap by a) testing task complexity and team member diversity (race diversity, gender diversity and work value diversity) as moderators of the team communication-performance relationship; and b) testing a team communication-performance model using established teams across two different task types. The functional perspective was used as the theoretical framework for operationalizing team communication activity. The research program utilised a quasi-experimental research design with participants from a large multi-national information technology company whose Head Office was based in Sydney, Australia. Participants voluntarily completed two team building exercises (a decision making task and a production task), and completed two online questionnaires. In total, data were collected from 1039 individuals who constituted 203 work teams. Analysis of the data revealed a small number of significant moderation effects, not all in the expected direction. However, an interesting and unexpected finding also emerged from Study One. Large and significant correlations between communication activity ratings were found across tasks, but not within tasks. This finding suggested that teams were displaying very similar profiles of communication on each task, despite the tasks having different communication requirements.
Given this finding, Study Two sought to a) determine the relative importance of task versus team effects in explaining variance in team communication measures for established teams; b) determine if established teams had reliable and discernable team communication profiles and if so, c) investigate whether team communication profiles related to task performance. Multi-level modeling and repeated measures analysis of variance (ANOVA) revealed that task type did not have an effect on team communication ratings. However, teams accounted for 24% of the total variance in communication measures. Through cluster analysis, five reliable and distinct team communication profiles were identified. Consistent with the findings of the multi-level analysis and repeated measures ANOVA, teams’ profiles were virtually identical across the decision making and production tasks. A relationship between communication profile and performance was identified for the production task, although not for the decision making task. This research responds to calls in the literature for a better understanding of when communication becomes important for team performance. The moderators tested in this research were not found to have a substantive or reliable effect on the relationship between communication and performance. However, the consistency in team communication activity suggests that established teams can be characterized by their communication profiles and further, that these communication profiles may have implications for team performance. The findings of this research provide theoretical support for the functional perspective in terms of the communication – performance relationship and further support the team development literature as an explanation for the stability in team communication profiles. This research can also assist organizations to better understand the specific types of communication activity and profiles of communication that could offer teams a performance advantage.
Abstract:
Presentation about information modelling and artificial intelligence, semantic structure, cognitive processing and quantum theory.