359 results for Clustering techniques
Abstract:
Vector field visualisation is one of the classic sub-fields of scientific data visualisation. The need for effective visualisation of flow data arises in many scientific domains, ranging from medical sciences to aerodynamics. Though there has been much research on the topic, the question of how to communicate flow information effectively in real, practical situations is still largely an unsolved problem. This is particularly true for complex 3D flows. In this presentation we give a brief introduction and background to vector field visualisation and comment on the effectiveness of the most common solutions. We then give some examples of current developments in texture-based techniques, and give practical examples of their use in CFD research and hydrodynamic applications.
Abstract:
Personalised social matching systems can be seen as recommender systems that recommend people to others in social networks. However, with the rapid growth in the number of users in social networks and in the information that a social matching system requires about those users, recommender system techniques have become insufficiently adept at matching users in social networks. This paper presents a hybrid social matching system that takes advantage of both collaborative and content-based concepts of recommendation. A clustering technique is used to reduce the number of users that the matching system needs to consider and to overcome other problems from which social matching systems suffer, such as the cold-start problem caused by the absence of implicit information about a new user. The proposed system has been evaluated on a dataset obtained from an online dating website. Empirical analysis shows that the accuracy of the matching process increases when both user information (explicit data) and user behavior (implicit data) are used.
Abstract:
This paper presents an overview of the experiments conducted using the Hybrid Clustering of XML documents using Constraints (HCXC) method for the clustering task in the INEX 2009 XML Mining track. This technique utilises frequent subtrees generated from the structure to extract the content for clustering the XML documents. The paper also presents an experimental study using several data representations for clustering: structure only, content only, and both the structure and the content of the XML documents. Unlike in previous years, this year the XML documents were marked up using Wiki tags and contain categories derived using the YAGO ontology. This paper also presents the results of studying the effect of these tags on XML clustering using the HCXC method.
Abstract:
Road surface macro-texture is an indicator used to determine the skid resistance levels in pavements. Existing methods of quantifying macro-texture include the sand patch test and the laser profilometer. These methods utilise the 3D information of the pavement surface to extract the average texture depth. Recently, interest in image processing techniques as a quantifier of macro-texture has arisen, mainly using the Fast Fourier Transform (FFT). This paper reviews the FFT method, and then proposes two new methods, one using the autocorrelation function and the other using wavelets. The methods are tested on pictures obtained from a pavement surface extending more than 2 km. About 200 images were acquired from the surface at approximately 10 m intervals from a height of 80 cm above ground. The results obtained from image analysis methods using the FFT, the autocorrelation function and wavelets are compared with sensor-measured texture depth (SMTD) data obtained from the same paved surface. The results indicate that coefficients of determination (R²) exceeding 0.8 are obtained when up to 10% of outliers are removed.
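The coefficient of determination reported in the abstract above can be illustrated with a minimal sketch. It assumes a simple least-squares line relating an image-derived texture measure to SMTD; the abstract does not state which regression model the authors fitted, and the function name `r_squared` is hypothetical.

```python
def r_squared(x, y):
    """Coefficient of determination for a least-squares line y ~ a + b*x.

    Assumption: a simple linear fit; the abstract does not specify the model.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual and total sums of squares.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot
```

Calling `r_squared(image_measure, smtd)` on two equal-length lists returns a value of at most 1, with 1 indicating a perfect linear relationship.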
Abstract:
Modern society has come to expect electrical energy on demand, while many of the facilities in power systems are aging beyond repair and maintenance. The risk of failure is increasing with the aging equipment and can pose serious consequences for continuity of electricity supply. As the equipment used in high voltage power networks is very expensive, it may not be economically feasible to purchase and store spares in a warehouse for extended periods of time. On the other hand, there is normally a significant lead time between ordering equipment and receiving it. This situation has created considerable interest in the evaluation and application of probability methods for aging plant and the provision of spares in bulk supply networks, and can be of particular importance for substations. Quantitative adequacy assessment of substation and sub-transmission power systems is generally done using a contingency enumeration approach, which includes the evaluation of contingencies and the classification of the contingencies based on selected failure criteria. The problem is very complex because of the need to include detailed modelling and operation of substation and sub-transmission equipment using network flow evaluation and to consider multiple levels of component failures. In this thesis a new model associated with aging equipment is developed that combines the standard tools for random failures with a specific model for aging failures. This technique is applied in this thesis to include and examine the impact of aging equipment on the system reliability of bulk supply loads and consumers in the distribution network for a defined range of planning years. The power system risk indices depend on many factors such as the actual physical network configuration and operation, the aging condition of the equipment, and the relevant constraints.
The impact and importance of equipment reliability on power system risk indices in a network with aging facilities provides valuable information for utilities to better understand network performance and the weak links in the system. In this thesis, algorithms are developed to measure the contribution of individual equipment to the power system risk indices, as part of the novel risk analysis tool. A new cost-worth approach is developed in this thesis that supports early decisions in planning replacement activities for non-repairable aging components, in order to maintain an economically acceptable level of system reliability. The concepts, techniques and procedures developed in this thesis are illustrated numerically using published test systems. It is believed that the methods and approaches presented substantially improve the accuracy of risk predictions by explicit consideration of the effect of equipment entering a period of increased risk of a non-repairable failure.
Abstract:
Eigen-based techniques and other monolithic approaches to face recognition have long been a cornerstone in the face recognition community due to the high dimensionality of face images. Eigen-face techniques provide minimal reconstruction error and limit high-frequency content while linear discriminant-based techniques (fisher-faces) allow the construction of subspaces which preserve discriminatory information. This paper presents a frequency decomposition approach for improved face recognition performance utilising three well-known techniques: Wavelets; Gabor / Log-Gabor; and the Discrete Cosine Transform. Experimentation illustrates that frequency domain partitioning prior to dimensionality reduction increases the information available for classification and greatly increases face recognition performance for both eigen-face and fisher-face approaches.
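As a rough illustration of frequency-domain partitioning with one of the three transforms named above, the sketch below applies an unnormalised 1D DCT-II to a single row of pixel values and splits the coefficients into contiguous bands. It is not the paper's pipeline: the 1D simplification, the equal-width banding, the sample values, and the names `dct2` and `split_bands` are all assumptions made for illustration.

```python
import math

def dct2(signal):
    """Unnormalised DCT-II of a 1D sequence."""
    n = len(signal)
    return [
        sum(x * math.cos(math.pi / n * (i + 0.5) * k) for i, x in enumerate(signal))
        for k in range(n)
    ]

def split_bands(coeffs, n_bands):
    """Partition coefficients into contiguous bands, lowest frequency first.

    Assumption: equal-width bands; the paper's partitioning is not given here.
    """
    size = math.ceil(len(coeffs) / n_bands)
    return [coeffs[i:i + size] for i in range(0, len(coeffs), size)]

# Hypothetical pixel intensities from one image row.
row = [10, 12, 11, 13, 40, 42, 41, 43]
bands = split_bands(dct2(row), n_bands=2)
```

Each band could then be passed separately to a dimensionality reduction step such as the eigen-face or fisher-face projection mentioned in the abstract.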
Abstract:
BACKGROUND: Grafting of autologous hyaline cartilage and bone for articular cartilage repair is a well-accepted technique. Although encouraging midterm clinical results have been reported, no information on the mechanical competence of the transplanted joint surface is available. HYPOTHESIS: The mechanical competence of osteochondral autografts is maintained after transplantation. STUDY DESIGN: Controlled laboratory study. METHODS: Osteochondral defects were filled with autografts (7.45 mm in diameter) in one femoral condyle in 12 mature sheep. The ipsilateral femoral condyle served as the donor site, and the resulting defect (8.3 mm in diameter) was left empty. The repair response was examined after 3 and 6 months with mechanical and histologic assessment and histomorphometric techniques. RESULTS: Good surface congruity and plug placement were achieved. The Young modulus of the grafted cartilage significantly dropped to 57.5% of healthy tissue after 3 months (P < .05) but then recovered to 82.2% after 6 months. The aggregate and dynamic moduli behaved similarly. The graft edges showed fibrillation and, in some cases (4 of 6), hypercellularity and chondrocyte clustering. Subchondral bone sclerosis was observed in 8 of 12 cases, and the amount of mineralized bone in the graft area increased from 40% to 61%. CONCLUSIONS: The mechanical quality of transplanted cartilage varies considerably over a short period of time, potentially reflecting both degenerative and regenerative processes, while histologically signs of both cartilage and bone degeneration occur. CLINICAL RELEVANCE: Both the mechanically degenerative and restorative processes illustrate the complex progression of regeneration after osteochondral transplantation. The histologic evidence raises doubts as to the long-term durability of the osteochondral repair.
Abstract:
In order to achieve meaningful reductions in individual ecological footprints, individuals must dramatically alter their day-to-day behaviours. Effective interventions will need to be evidence-based, and there is a need for the rapid transfer or communication of information from the point of research into policy and practice. A number of health disciplines, including psychology and public health, share a common mission to promote health and well-being, and it is becoming clear that the most practical pathway to achieving this mission is through interdisciplinary collaboration. This paper argues that an interdisciplinary collaborative approach will facilitate research that results in the rapid transfer of findings into policy and practice. The application of this approach is described in relation to the Green Living project, which explored the psycho-social predictors of environmentally friendly behaviour. Following a qualitative pilot study, and in consultation with an expert panel comprising academics, industry professionals and government representatives, a self-administered mail survey was distributed to a random sample of 3000 residents of Brisbane and Moreton Bay (Queensland, Australia). The Green Living survey explored specific beliefs, which included attitudes, norms, perceived control, intention and behaviour, as well as a number of other constructs such as environmental concern and altruism. This research has two beneficial outcomes. First, it will inform a practical model for predicting sustainable living behaviours, and a number of local councils have already expressed an interest in making use of the results as part of their ongoing community engagement programs. Second, it provides an example of how a collaborative interdisciplinary project can provide a more comprehensive approach to research than can be accomplished by a single disciplinary project.
Abstract:
Understanding the motion characteristics of on-site objects is desirable for the analysis of construction work zones, especially in problems related to safety and productivity studies. This article presents a methodology for rapid object identification and tracking. The proposed methodology contains algorithms for spatial modeling and image matching. A high-frame-rate range sensor was utilized for spatial data acquisition. The experimental results indicated that an occupancy grid spatial modeling algorithm could quickly build a suitable work zone model from the acquired data. The results also showed that an image matching algorithm is able to find the most similar object from a model database and from spatial models obtained from previous scans. It is then possible to use the matched information to successfully identify and track objects.
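The occupancy grid idea mentioned above can be sketched in a few lines: each 3D range point is assigned to the voxel that contains it, and the set of occupied voxels forms the spatial model. This is a generic illustration, not the article's algorithm; the function name `occupancy_grid`, the cell size, and the sample points are hypothetical.

```python
import math

def occupancy_grid(points, cell_size):
    """Return the set of voxel indices occupied by at least one 3D point.

    Assumption: a uniform axis-aligned grid with cubic cells of side cell_size.
    """
    return {
        tuple(math.floor(coord / cell_size) for coord in point)
        for point in points
    }

# Hypothetical range-sensor returns, in metres.
scan = [(0.1, 0.2, 0.3), (0.4, 0.4, 0.4), (1.2, 0.1, 0.0)]
occupied = occupancy_grid(scan, cell_size=0.5)
```

Because the result is a plain set of integer triples, grids from successive scans can be compared with ordinary set operations, which is one simple way to look for cells that changed between scans.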
Abstract:
Background: Waist circumference has been identified as a valuable predictor of cardiovascular risk in children. The development of waist circumference percentiles and cut-offs for various ethnic groups is necessary because of differences in body composition. The purpose of this study was to develop waist circumference percentiles for Chinese children and to explore optimal waist circumference cut-off values for predicting the clustering of cardiovascular risk factors in this population. Methods: Height, weight, and waist circumference were measured in 5529 children (2830 boys and 2699 girls) aged 6-12 years randomly selected from southern and northern China. Blood pressure, fasting triglycerides, low-density lipoprotein cholesterol, high-density lipoprotein cholesterol, and glucose were obtained in a subsample (n = 1845). Smoothed percentile curves were produced using the LMS method. Receiver-operating characteristic analysis was used to derive the optimal age- and gender-specific waist circumference thresholds for predicting the clustering of cardiovascular risk factors. Results: Gender-specific waist circumference percentiles were constructed. The waist circumference thresholds were at the 90th and 84th percentiles for Chinese boys and girls respectively, with sensitivity and specificity ranging from 67% to 83%. The odds ratio of a clustering of cardiovascular risk factors among boys and girls with values above the cut-off points was 10.349 (95% confidence interval 4.466 to 23.979) and 8.084 (95% confidence interval 3.147 to 20.767), respectively, compared with their counterparts. Conclusions: Percentile curves for waist circumference of Chinese children are provided. The cut-off point for waist circumference to predict the clustering of cardiovascular risk factors is at the 90th and 84th percentiles for Chinese boys and girls, respectively.
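One common way to pick an "optimal" threshold from a receiver-operating characteristic analysis is to maximise Youden's J (sensitivity + specificity - 1). The sketch below assumes that criterion; the abstract does not say which criterion the study used, and the function name `best_cutoff` and the sample data are hypothetical.

```python
def best_cutoff(values, labels):
    """Return (cutoff, sensitivity, specificity) maximising Youden's J.

    A subject is predicted positive when value >= cutoff.
    labels: 1 for subjects with the outcome, 0 otherwise.
    Assumption: Youden's J is the optimality criterion.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best = None
    for cutoff in sorted(set(values)):
        true_pos = sum(1 for v, y in zip(values, labels) if v >= cutoff and y == 1)
        true_neg = sum(1 for v, y in zip(values, labels) if v < cutoff and y == 0)
        sensitivity = true_pos / positives
        specificity = true_neg / negatives
        j = sensitivity + specificity - 1
        # Keep the lowest cutoff when several share the maximum J.
        if best is None or j > best[0]:
            best = (j, cutoff, sensitivity, specificity)
    return best[1], best[2], best[3]

# Hypothetical waist measurements (cm) and risk-clustering labels.
waist = [50, 52, 55, 58, 60, 63, 65, 70]
risk = [0, 0, 0, 1, 0, 1, 1, 1]
cutoff, sens, spec = best_cutoff(waist, risk)
```

In a study like the one described, the chosen cutoff would then be mapped onto the smoothed percentile curves to express it as a percentile for each age and gender group.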
Abstract:
The traditional Vector Space Model (VSM) is not able to represent both the structure and the content of XML documents. This paper introduces a novel method of representing XML documents in a Tensor Space Model (TSM) and then utilizing it for clustering. Empirical analysis shows that the proposed method is scalable for large-sized datasets. In addition, the factorized matrices produced by the proposed method help to improve the quality of clusters through the enriched document representation of both structure and content information.
Abstract:
A significant proportion of the cost of software development is due to software testing and maintenance. This is in part the result of the inevitable imperfections due to human error, lack of quality during the design and coding of software, and the increasing need to reduce faults to improve customer satisfaction in a competitive marketplace. Given the cost and importance of removing errors, improvements in fault detection and removal can be of significant benefit. The earlier in the development process faults can be found, the less it costs to correct them and the less likely other faults are to develop. This research aims to make the testing process more efficient and effective by identifying those software modules most likely to contain faults, allowing testing efforts to be carefully targeted. This is done with the use of machine learning algorithms which use examples of fault-prone and non-fault-prone modules to develop predictive models of quality. In order to learn the numerical mapping between module and classification, a module is represented in terms of software metrics. A difficulty in this sort of problem is sourcing software engineering data of adequate quality. In this work, data is obtained from two sources, the NASA Metrics Data Program and the open source Eclipse project. Feature selection is applied before learning, and a number of different feature selection methods are compared to find which work best. Two machine learning algorithms, Naive Bayes and the Support Vector Machine, are applied to the data, and predictive results are compared to those of previous efforts and found to be superior on selected data sets and comparable on others. In addition, a new classification method is proposed, Rank Sum, in which a ranking abstraction is laid over bin densities for each class, and a classification is determined based on the sum of ranks over features.
A novel extension of this method is also described, based on an observed polarising of points by class when rank sum is applied to training data to convert it into 2D rank sum space. SVM is applied to this transformed data to produce models whose parameters can be set according to trade-off curves to obtain a particular performance trade-off.