122 results for Supervised and Unsupervised Classification
Abstract:
Early detection, clinical management and disease recurrence monitoring are critical areas in cancer treatment, and specific biomarker panels are likely to be very important in each of them. We have previously demonstrated that levels of alpha-2-heremans-schmid-glycoprotein (AHSG), complement component C3 (C3), clusterin (CLI), haptoglobin (HP) and serum amyloid A (SAA) are significantly altered in serum from patients with squamous cell carcinoma of the lung. Here, we report the abundance levels for these proteins in serum samples from patients with advanced breast cancer, colorectal cancer (CRC) and lung cancer compared to healthy controls (age and gender matched) using commercially available enzyme-linked immunosorbent assay kits. Logistic regression (LR) models were fitted to the resulting data, and the classification ability of the proteins was evaluated using receiver-operating characteristic (ROC) curves and leave-one-out cross-validation (LOOCV). The most accurate individual candidate biomarkers were C3 for breast cancer [area under the curve (AUC) = 0.89, LOOCV = 73%], CLI for CRC (AUC = 0.98, LOOCV = 90%), HP for small cell lung carcinoma (AUC = 0.97, LOOCV = 88%), C3 for lung adenocarcinoma (AUC = 0.94, LOOCV = 89%) and HP for squamous cell carcinoma of the lung (AUC = 0.94, LOOCV = 87%). The best dual combinations of biomarkers using LR analysis were found to be AHSG + C3 (AUC = 0.91, LOOCV = 83%) for breast cancer, CLI + HP (AUC = 0.98, LOOCV = 92%) for CRC, C3 + SAA (AUC = 0.97, LOOCV = 91%) for small cell lung carcinoma and HP + SAA for both adenocarcinoma (AUC = 0.98, LOOCV = 96%) and squamous cell carcinoma of the lung (AUC = 0.98, LOOCV = 84%). The high AUC values reported here indicate that these candidate biomarkers have the potential to discriminate accurately between control and cancer groups both individually and in combination with other proteins. Copyright © 2011 UICC.
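The evaluation described in this abstract (a logistic regression model scored by ROC AUC and by leave-one-out cross-validation) can be illustrated with a minimal sketch. Everything below is an assumption for illustration only: the single-feature model, the hand-written gradient descent, the function names and the synthetic protein levels are not taken from the study, which does not state its software or fitting procedure.

```python
import math

def auc(scores, labels):
    """Area under the ROC curve as the fraction of (case, control) pairs
    in which the case scores higher; ties count as half."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((1.0 if p > n else 0.5 if p == n else 0.0)
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def fit_logistic(xs, ys, lr=0.5, steps=500):
    """Fit a one-feature logistic regression by plain gradient descent.
    The feature is centred on the training mean to speed up convergence."""
    mean = sum(xs) / len(xs)
    w = b = 0.0
    for _ in range(steps):
        gw = gb = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * (x - mean) + b)))
            gw += (p - y) * (x - mean)
            gb += p - y
        w -= lr * gw / len(xs)
        b -= lr * gb / len(xs)
    return lambda x: 1.0 / (1.0 + math.exp(-(w * (x - mean) + b)))

def loocv_accuracy(xs, ys):
    """Hold out each sample once, refit on the rest, and score the held-out one."""
    correct = 0
    for i in range(len(xs)):
        model = fit_logistic(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        predicted = 1 if model(xs[i]) >= 0.5 else 0
        correct += int(predicted == ys[i])
    return correct / len(xs)

# Synthetic serum levels (hypothetical values): 0 = control, 1 = cancer.
levels = [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1]
groups = [0, 0, 0, 0, 1, 1, 1, 1]
model = fit_logistic(levels, groups)
print("AUC:", auc([model(x) for x in levels], groups))
print("LOOCV accuracy:", loocv_accuracy(levels, groups))
```

AUC is computed on the fitted scores for all samples, while LOOCV refits the model once per held-out sample, which is why the two figures reported in the abstract can differ noticeably for the same marker.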
Abstract:
Spatially-explicit modelling of grassland classes is important to site-specific planning for improving grassland and environmental management over large areas. In this study, a climate-based grassland classification model, the Comprehensive and Sequential Classification System (CSCS) was integrated with spatially interpolated climate data to classify grassland in Gansu province, China. The study area is characterized by complex topographic features imposed by plateaus, high mountains, basins and deserts. To improve the quality of the interpolated climate data and the quality of the spatial classification over this complex topography, three linear regression methods, namely an analytic method based on multiple regression and residues (AMMRR), a modification of the AMMRR method through adding the effect of slope and aspect to the interpolation analysis (M-AMMRR) and a method which replaces the IDW approach for residue interpolation in M-AMMRR with an ordinary kriging approach (I-AMMRR), for interpolating climate variables were evaluated. The interpolation outcomes from the best interpolation method were then used in the CSCS model to classify the grassland in the study area. Climate variables interpolated included the annual cumulative temperature and annual total precipitation. The results indicated that the AMMRR and M-AMMRR methods generated acceptable climate surfaces but the best model fit and cross validation result were achieved by the I-AMMRR method. Twenty-six grassland classes were classified for the study area. The four grassland vegetation classes that covered more than half of the total study area were "cool temperate-arid temperate zonal semi-desert", "cool temperate-humid forest steppe and deciduous broad-leaved forest", "temperate-extra-arid temperate zonal desert", and "frigid per-humid rain tundra and alpine meadow". 
The vegetation classification map generated in this study provides spatial information on the locations and extents of the different grassland classes. This information can be used to facilitate government agencies' decision-making in land-use planning and environmental management, and for vegetation and biodiversity conservation. The information can also be used to assist land managers in the estimation of safe carrying capacities which will help to prevent overgrazing and land degradation.
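The regression-plus-residual idea behind the AMMRR family of methods can be sketched as follows: fit a regression trend to station data, interpolate the station residuals with inverse distance weighting (IDW), and add the two. This is a simplified illustration, not the authors' implementation. The single predictor (elevation), the station tuples and all function names are assumptions; the abstract does not list the actual regression variables.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def idw(points, values, target, power=2.0):
    """Inverse-distance-weighted average of values at a target (x, y)."""
    num = den = 0.0
    for (px, py), v in zip(points, values):
        d = math.hypot(px - target[0], py - target[1])
        if d == 0.0:
            return v  # target coincides with a station: return its value
        weight = 1.0 / d ** power
        num += weight * v
        den += weight
    return num / den

def make_surface(stations):
    """stations: list of (x, y, elevation, observed_value) tuples (hypothetical layout).
    Returns a function predicting the climate value at any (x, y, elevation)."""
    slope, intercept = fit_line([s[2] for s in stations], [s[3] for s in stations])
    coords = [(s[0], s[1]) for s in stations]
    residuals = [s[3] - (slope * s[2] + intercept) for s in stations]

    def predict(x, y, elevation):
        trend = slope * elevation + intercept
        return trend + idw(coords, residuals, (x, y))

    return predict

# Made-up stations: x, y, elevation (m), observed value.
stations = [(0, 0, 100, 19.0), (10, 0, 500, 17.5), (0, 10, 1000, 14.0), (10, 10, 1500, 11.2)]
predict = make_surface(stations)
print(predict(5, 5, 800))
```

Swapping `idw` for a kriging routine on the residuals would correspond to the I-AMMRR variant described in the abstract, and adding slope and aspect as further predictors would correspond to M-AMMRR.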
Abstract:
Objectives: To review the effects of physical activity on health and behavior outcomes and develop evidence-based recommendations for physical activity in youth. Study design: A systematic literature review identified 850 articles; additional papers were identified by the expert panelists. Articles in the identified outcome areas were reviewed, evaluated and summarized by an expert panelist. The strength of the evidence, conclusions, key issues, and gaps in the evidence were abstracted in a standardized format and presented and discussed by panelists and organizational representatives. Results: Most intervention studies used supervised programs of moderate to vigorous physical activity of 30 to 45 minutes duration 3 to 5 days per week. The panel believed that a greater amount of physical activity would be necessary to achieve similar beneficial effects on health and behavioral outcomes in ordinary daily circumstances (typically intermittent and unsupervised activity). Conclusion: School-age youth should participate daily in 60 minutes or more of moderate to vigorous physical activity that is developmentally appropriate, enjoyable, and involves a variety of activities.
Abstract:
Recent advances suggest that encoding images through Symmetric Positive Definite (SPD) matrices and then interpreting such matrices as points on Riemannian manifolds can lead to increased classification performance. Taking into account manifold geometry is typically done via (1) embedding the manifolds in tangent spaces, or (2) embedding into Reproducing Kernel Hilbert Spaces (RKHS). While embedding into tangent spaces allows the use of existing Euclidean-based learning algorithms, manifold shape is only approximated which can cause loss of discriminatory information. The RKHS approach retains more of the manifold structure, but may require non-trivial effort to kernelise Euclidean-based learning algorithms. In contrast to the above approaches, in this paper we offer a novel solution that allows SPD matrices to be used with unmodified Euclidean-based learning algorithms, with the true manifold shape well-preserved. Specifically, we propose to project SPD matrices using a set of random projection hyperplanes over RKHS into a random projection space, which leads to representing each matrix as a vector of projection coefficients. Experiments on face recognition, person re-identification and texture classification show that the proposed approach outperforms several recent methods, such as Tensor Sparse Coding, Histogram Plus Epitome, Riemannian Locality Preserving Projection and Relational Divergence Classification.
Abstract:
The overarching research work is based on two approaches:
- Conceptual Analysis, Extraction and Linking
- Experimentation with Product Libraries
Conceptual Analysis, Extraction and Linking: This aspect of the research has been achieved through the development of a conceptual framework for facilitating the understanding of the constituting components of BIM, Specifications and Cost Planning under investigation. The framework builds on theories spanning the constituent research themes and was used as a basis for justifying the elected approaches adopted throughout the research work. By means of tags and codes, a system for classifying building specification information has been developed as a differentiator between the chosen research approach and existing classification strategies in industry. Furthermore, syntactic links between extracted classes of specification information and cost planning have been established and will be adopted as a basis for authenticating the impact of specification information within BIM models.
Experimentation with Product Libraries: Following the extraction and classification of BIM, Specifications and Cost Planning information, early experimentation on linking specifications to BIM models by means of a raas-based product library has been successful. A comparative analysis between a range of existing product libraries has also been realised. The outcomes have been amply documented in papers, all of which have received positive reviews. Ongoing experiments and analysis with the product library involve integrating the cost planning component for authenticating the completeness, relevance and impact of embedded specification within BIM models.
Abstract:
Irregular atrial pressure, defective folate and cholesterol metabolism contribute to the pathogenesis of hypertension. However, little is known about the combined roles of the methylenetetrahydrofolate reductase (MTHFR), apolipoprotein-E (ApoE) and angiotensin-converting enzyme (ACE) genes, which are involved in metabolism and homeostasis. The objective of this study is to investigate the association of the MTHFR 677 C>T and 1298 A>C, ACE insertion–deletion (I/D) and ApoE genetic polymorphisms with hypertension and to further explore the epistasis interactions that are involved in these mechanisms. A total of 594 subjects, including 348 normotensive and 246 hypertensive ischemic stroke subjects, were recruited. The MTHFR 677 C>T and 1298 A>C, ACE I/D and ApoE polymorphisms were genotyped and the epistasis interactions were analyzed. The MTHFR 677 C>T and ApoE polymorphisms demonstrated significant associations with susceptibility to hypertension in multiple logistic regression models, multifactor dimensionality reduction and a classification and regression tree. In addition, the logistic regression model demonstrated significant interactions between the ApoE E3E3, E2E4, E2E2 and MTHFR 677 C>T polymorphisms. In conclusion, the results of this epistasis study indicated a significant association between the ApoE and MTHFR polymorphisms and hypertension.
Abstract:
Young novice drivers are at considerable risk of injury on the road. Their behaviour appears vulnerable to the social influence of their parents and friends. The nature and mechanisms of parent and peer influence on young novice driver (16–25 years) behaviour were explored via small group interviews (n = 21) and two surveys (n1 = 1170, n2 = 390) to inform more effective young driver countermeasures. Parental and peer influence occurred in Pre-Licence, Learner, and Provisional (intermediate) periods. Pre-Licence and unsupervised Learner drivers reported their parents were less likely to punish risky driving (e.g., speeding). These drivers were more likely to imitate their parents and reported their parents were also risky drivers. Young novice drivers who experienced or expected social punishments from peers, including ‘being told off’ for risky driving, reported less riskiness. Conversely, drivers who experienced or expected social rewards such as being ‘cheered on’ by friends – who were also more risky drivers – reported more risky driving including crashes and offences. Interventions enhancing positive influence and curtailing negative influence may improve road safety outcomes not only for young novice drivers, but for all persons who share the road with them. Parent-specific interventions warrant further development and evaluation including: modelling safe driving behaviour by parents; active monitoring of driving during novice licensure; and sharing the family vehicle during the intermediate phase. Peer-targeted interventions including modelling of safe driving behaviour and attitudes; minimisation of social reinforcement and promotion of social sanctions for risky driving also need further development and evaluation.
Abstract:
Automated remote ultrasound detectors allow large amounts of data on bat presence and activity to be collected. Processing of such data involves identifying bat species from their echolocation calls. Automated species identification has the potential to provide more consistent, predictable, and potentially higher levels of accuracy than identification by humans. In contrast, identification by humans permits flexibility and intelligence in identification, as well as the incorporation of features and patterns that may be difficult to quantify. We compared humans with artificial neural networks (ANNs) in their ability to classify short recordings of bat echolocation calls of variable signal to noise ratios; these sequences are typical of those obtained from remote automated recording systems that are often used in large-scale ecological studies. We presented 45 recordings (1–4 calls) produced by known species of bats to ANNs and to 26 human participants with 1 month to 23 years of experience in acoustic identification of bats. Humans correctly classified 86% of recordings to genus and 56% to species; ANNs correctly identified 92% and 62%, respectively. There was no significant difference between the performance of ANNs and that of humans, but ANNs performed better than about 75% of humans. There was little relationship between the experience of the human participants and their classification rate. However, humans with <1 year of experience performed worse than others. Currently, identification of bat echolocation calls by humans is suitable for ecological research, after careful consideration of biases. However, improvements to ANNs and the data that they are trained on may in future increase their performance to beyond those demonstrated by humans.
Abstract:
This paper describes our participation in the Chinese word segmentation task of CIPS-SIGHAN 2010. We implemented an n-gram mutual information (NGMI) based segmentation algorithm with the mixed-up features from unsupervised, supervised and dictionary-based segmentation methods. This algorithm is also combined with a simple strategy for out-of-vocabulary (OOV) word recognition. The evaluation for both open and closed training shows encouraging results for our system. The results for OOV word recognition in the closed training evaluation were, however, found to be unsatisfactory.
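The abstract does not define NGMI, so the sketch below substitutes a much simpler, well-known relative: pointwise mutual information (PMI) between adjacent characters, with a word boundary placed wherever the PMI falls below a threshold. It is an unsupervised illustration of the general idea only, not the authors' algorithm. The function names, the threshold and the toy corpus (Latin letters standing in for Chinese characters) are all assumptions.

```python
import math
from collections import Counter

def bigram_pmi(corpus):
    """Build a PMI scorer for adjacent character pairs from unsegmented strings."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus:
        unigrams.update(sentence)
        bigrams.update(zip(sentence, sentence[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    def pmi(a, b):
        if bigrams[(a, b)] == 0:
            return float("-inf")  # unseen pair: always treated as a boundary
        p_ab = bigrams[(a, b)] / n_bi
        return math.log(p_ab / ((unigrams[a] / n_uni) * (unigrams[b] / n_uni)))

    return pmi

def segment(sentence, pmi, threshold):
    """Join adjacent characters whose PMI reaches the threshold; split otherwise."""
    if not sentence:
        return []
    words = [sentence[0]]
    for a, b in zip(sentence, sentence[1:]):
        if pmi(a, b) >= threshold:
            words[-1] += b
        else:
            words.append(b)
    return words

# Toy corpus in which "ab", "cd" and "xy" recur as units.
corpus = ["abcd", "cdab", "abxy", "xycd"]
pmi = bigram_pmi(corpus)
print(segment("abcdxy", pmi, threshold=1.5))
```

A real system would need smoothing for unseen pairs and a tuned threshold; the combination with supervised and dictionary-based features mentioned in the abstract is outside the scope of this sketch.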
Abstract:
With a focus on optimising the life cycle performance of Australian railway bridges, new bridge classification and environmental classification systems are proposed. The new bridge classification system is mainly intended to facilitate the implementation of a novel Bridge Management System (BMS) which optimises the life cycle cost at both project level and network level, while the environmental classification is mainly intended to improve the accuracy of the Remaining Service Potential (RSP) module of the proposed BMS. In fact, the limited capacity of the existing BMS to trigger the maintenance intervention point is an indirect result of inadequacies in the existing bridge and environmental classification systems. The proposed bridge classification system makes it possible to identify the intervention points based on the percentage deterioration of individual elements and maintenance cost, while allowing a performance-based rating technique to be implemented for maintenance optimisation and prioritisation. Simultaneously, the proposed environmental classification system will enhance the accuracy of predictions of the deterioration of steel components.
Abstract:
Avian species richness surveys, which measure the total number of unique avian species, can be conducted via remote acoustic sensors. An immense quantity of data can be collected, which, although rich in useful information, places a great workload on the scientists who manually inspect the audio. To deal with this big data problem, we calculated acoustic indices from audio data at a one-minute resolution and used them to classify one-minute recordings into five classes. By filtering out the non-avian minutes, we can reduce the amount of data by about 50% and improve the efficiency of determining avian species richness. The experimental results show that, given 60 one-minute samples, our approach enables ecologists to find about 10% more avian species.
Abstract:
Objective: In the majority of exercise intervention studies, the aggregate reported weight loss is often small. The efficacy of exercise as a weight loss tool remains in question. The aim of the present study was to investigate the variability in appetite and body weight when participants engaged in a supervised and monitored exercise programme.
Design: Fifty-eight obese men and women (BMI = 31·8 ± 4·5 kg/m²) were prescribed exercise to expend approximately 2092 kJ (500 kcal) per session, five times a week at an intensity of 70 % maximum heart rate for 12 weeks under supervised conditions in the research unit. Body weight and composition, total daily energy intake and various health markers were measured at weeks 0, 4, 8 and 12.
Results: Mean reduction in body weight (3·2 ± 1·98 kg) was significant (P < 0·001); however, there was large individual variability (−14·7 to +2·7 kg). This large variability could be largely attributed to the differences in energy intake over the 12-week intervention. Those participants who failed to lose meaningful weight increased their food intake and reduced intake of fruits and vegetables.
Conclusion: These data have demonstrated that even when exercise energy expenditure is high, a healthy diet is still required for weight loss to occur in many people.
Abstract:
In this paper, we present the application of a non-linear dimensionality reduction technique for the learning and probabilistic classification of hyperspectral images. Hyperspectral image spectroscopy is an emerging technique for geological investigations from airborne or orbital sensors. It gives much greater information content per pixel than a normal colour image. This should greatly help with the autonomous identification of natural and manmade objects in unfamiliar terrains for robotic vehicles. However, the large information content of such data makes interpretation of hyperspectral images time-consuming and user-intensive. We propose the use of Isomap, a non-linear manifold learning technique, combined with Expectation Maximisation in graphical probabilistic models for learning and classification. Isomap is used to find the underlying manifold of the training data. This low-dimensional representation of the hyperspectral data facilitates the learning of a Gaussian Mixture Model representation, whose joint probability distributions can be calculated offline. The learnt model is then applied to the hyperspectral image at runtime and data classification can be performed.
Abstract:
A significant proportion of the cost of software development is due to software testing and maintenance. This is in part the result of the inevitable imperfections due to human error, lack of quality during the design and coding of software, and the increasing need to reduce faults to improve customer satisfaction in a competitive marketplace. Given the cost and importance of removing errors, improvements in fault detection and removal can be of significant benefit. The earlier in the development process faults can be found, the less it costs to correct them and the less likely other faults are to develop. This research aims to make the testing process more efficient and effective by identifying those software modules most likely to contain faults, allowing testing efforts to be carefully targeted. This is done with the use of machine learning algorithms which use examples of fault-prone and not fault-prone modules to develop predictive models of quality. In order to learn the numerical mapping between module and classification, a module is represented in terms of software metrics. A difficulty in this sort of problem is sourcing software engineering data of adequate quality. In this work, data is obtained from two sources, the NASA Metrics Data Program and the open source Eclipse project. Feature selection is applied before learning, and a number of different feature selection methods are compared to find which work best. Two machine learning algorithms are applied to the data - Naive Bayes and the Support Vector Machine - and predictive results are compared to those of previous efforts and found to be superior on selected data sets and comparable on others. In addition, a new classification method is proposed, Rank Sum, in which a ranking abstraction is laid over bin densities for each class, and a classification is determined based on the sum of ranks over features.
A novel extension of this method is also described based on an observed polarising of points by class when rank sum is applied to training data to convert it into 2D rank sum space. SVM is applied to this transformed data to produce models the parameters of which can be set according to trade-off curves to obtain a particular performance trade-off.
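One possible reading of the Rank Sum description (bin each metric, rank the classes by their density in each bin, then classify by the sum of ranks across features) is sketched below. The abstract gives no algorithmic detail, so the equal-width binning, the tie handling, the function names and the toy metric values are all assumptions rather than the author's method.

```python
def _bin_index(value, low, width, n_bins):
    """Map a value to an equal-width bin, clamping values outside the training range."""
    return min(n_bins - 1, max(0, int((value - low) / width)))

def fit_rank_sum(X, y, n_bins=3):
    """X: list of metric vectors, y: class labels. Returns a fitted model tuple."""
    classes = sorted(set(y))
    model = []  # one (low, width, ranks_per_bin) entry per feature
    for f in range(len(X[0])):
        column = [row[f] for row in X]
        low, high = min(column), max(column)
        width = (high - low) / n_bins or 1.0  # guard against constant features
        counts = {c: [0] * n_bins for c in classes}
        for value, label in zip(column, y):
            counts[label][_bin_index(value, low, width, n_bins)] += 1
        totals = {c: sum(counts[c]) for c in classes}
        ranks_per_bin = []
        for b in range(n_bins):
            density = {c: counts[c][b] / totals[c] for c in classes}
            # Rank = number of classes with strictly lower density in this bin,
            # so tied classes share the same rank.
            ranks_per_bin.append(
                {c: sum(density[o] < density[c] for o in classes) for c in classes}
            )
        model.append((low, width, ranks_per_bin))
    return classes, n_bins, model

def predict_rank_sum(fitted, sample):
    """Sum each class's rank over all features and return the top-scoring class."""
    classes, n_bins, model = fitted
    score = {c: 0 for c in classes}
    for value, (low, width, ranks_per_bin) in zip(sample, model):
        ranks = ranks_per_bin[_bin_index(value, low, width, n_bins)]
        for c in classes:
            score[c] += ranks[c]
    return max(classes, key=lambda c: score[c])

# Toy modules described by two hypothetical software metrics.
X = [[1.0, 10.0], [1.2, 11.0], [0.9, 9.5], [3.0, 20.0], [3.1, 21.0], [2.9, 19.0]]
y = ["clean", "clean", "clean", "faulty", "faulty", "faulty"]
fitted = fit_rank_sum(X, y)
print(predict_rank_sum(fitted, [1.1, 10.5]), predict_rank_sum(fitted, [3.0, 20.5]))
```

The per-class rank sums computed in `predict_rank_sum` are also the natural coordinates of the 2D rank sum space mentioned in the abstract for the two-class case, although the SVM stage is not shown here.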
Abstract:
The Fleet Store is a project that was created to research the impact of enterprise and authentic learning models in increasing the viability and improving the career potential of fashion business, design and creative industry (fashion major) students. Reflective Thinking techniques were employed to gain valuable insights into the quality of the experience, the networking and the motivational and experiential learning for all students. The lecturer acted as the Managing Director and curator of the entire event while maintaining pedagogy to support the experience. Research focussed on the ways in which student learning outcomes have been improved by creating a professional and economically viable pop-up fashion outlet in an inner city, high profile shopping precinct. The first QUT double degree fashion business students were supervised and guided to be responsible for creating and maintaining a profitable fashion outlet in collaboration with their lecturer Kay McMahon, Wintergarden Management, Brisbane Marketing, Creative Enterprise Australia and QUT Fashion. Reflective thinking and further research into career outcomes (that are acknowledged as being supported by the experience) are currently being undertaken.