137 results for feature vector
Abstract:
Retinopathy of prematurity (ROP) is a rare disease in which the retinal blood vessels of premature infants fail to develop normally, and it is one of the major causes of childhood blindness throughout the world. The Discrete Conditional Phase-type (DC-Ph) model consists of two components: the conditional component, which measures the inter-relationships between covariates, and the survival component, which models the survival distribution using a Coxian phase-type distribution. This paper expands the DC-Ph models by introducing a support vector machine (SVM) in the role of the conditional component. The SVM is capable of classifying multiple outcomes and is used to identify the infant's risk of developing ROP. Class imbalance makes predicting rare events difficult. A new class decomposition technique, which deals with the problem of multiclass imbalance, is introduced. Based on the SVM classification, the length of stay in the neonatal ward is modelled using a 5-, 8- or 9-phase Coxian distribution.
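As a rough illustration of the survival component described above, the following sketch computes the mean time to absorption (for example, mean length of stay) of a Coxian phase-type distribution. The function name `coxian_mean` and the rates in the usage lines are hypothetical and do not come from the paper; the sketch only assumes the standard Coxian structure, in which each phase either moves on to the next phase or exits.

```python
def coxian_mean(lam, mu):
    """Mean time to absorption for a Coxian phase-type distribution.

    lam[i]: rate of moving from phase i to phase i + 1 (last entry must be 0)
    mu[i]:  rate of absorption (e.g. discharge) directly from phase i
    Both lists are illustrative inputs, not values from the paper.
    """
    if len(lam) != len(mu) or not lam:
        raise ValueError("lam and mu must be non-empty and of equal length")
    if lam[-1] != 0:
        raise ValueError("the last phase cannot move to a further phase")

    reach = 1.0   # probability of ever reaching the current phase
    mean = 0.0
    for l, m in zip(lam, mu):
        total = l + m              # total exit rate from this phase
        mean += reach / total      # expected sojourn time, weighted by reach
        reach *= l / total         # probability of moving on to the next phase
    return mean


# Hypothetical two-phase example: rates are made up for illustration.
print(coxian_mean([1.0, 0.0], [1.0, 1.0]))
```

With more phases the same loop applies unchanged; only the two rate lists grow.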
Abstract:
One of the major challenges in systems biology is to understand the complex responses of a biological system to external perturbations or internal signalling depending on its biological conditions. Genome-wide transcriptomic profiling of cellular systems under various chemical perturbations allows certain features of the chemicals to manifest through their transcriptomic expression profiles. The insights obtained may help to establish the connections between human diseases, associated genes and therapeutic drugs. The main objective of this study was to systematically analyse cellular gene expression data under various drug treatments to elucidate drug-feature-specific transcriptomic signatures. We first extracted drug-related information (drug features) from the collected textual descriptions of DrugBank entries using text-mining techniques. A novel statistical method employing orthogonal least squares learning was proposed to obtain drug-feature-specific signatures by integrating gene expression with DrugBank data. To obtain robust signatures from noisy input datasets, a stringent ensemble approach was applied that combines three techniques: resampling, leave-one-out cross validation, and aggregation. The validation experiments showed that the proposed method is capable of extracting biologically meaningful drug-feature-specific gene expression signatures. Regulatory network analysis also showed that most of the signature genes are connected with common hub genes. The common hub genes were further shown to be related to general drug metabolism by Gene Ontology analysis. Each set of genes has relatively few interactions with other sets, indicating the modular nature of each signature and its drug-feature specificity. Based on Gene Ontology analysis, we also found that each set of drug feature (DF)-specific genes was indeed enriched in biological processes related to the drug feature.
The results of these experiments demonstrated the potential of the method for predicting certain features of new drugs using their transcriptomic profiles, providing a useful methodological framework and a valuable resource for drug development and characterization.
Abstract:
Power electronics plays an important role in the control and conversion of modern electric power systems. In particular, to integrate various renewable energies using DC transmissions and to provide more flexible power control in AC systems, significant efforts have been made in the modulation and control of power electronics devices. Pulse width modulation (PWM) is a well-developed technology for conversion between AC and DC power sources, especially for the purpose of harmonics reduction and energy optimization. As a fundamental decoupled control method, vector control with PI controllers has been widely used in power systems. However, significant power loss occurs during the operation of these devices, and the loss is often dissipated in the form of heat, leading to significant maintenance effort. Though much work has been done to improve power electronics design, little has focused so far on investigating controller design to reduce the controller energy consumption (which leads to power loss in power electronics) while maintaining acceptable system performance. This paper aims to bridge the gap and investigates the correlation between the two. It is shown that a more thoughtful controller design can achieve a better balance between energy consumption in power electronics control and system performance, which potentially leads to significant energy savings for the integration of renewable power sources.
Abstract:
In many applications, and especially those where batch processes are involved, a target scalar output of interest is often dependent on one or more time series of data. With the exponential growth in data logging in modern industries such time series are increasingly available for statistical modeling in soft sensing applications. In order to exploit time series data for predictive modelling, it is necessary to summarise the information they contain as a set of features to use as model regressors. Typically this is done in an unsupervised fashion using simple techniques such as computing statistical moments, principal components or wavelet decompositions, often leading to significant information loss and hence suboptimal predictive models. In this paper, a functional learning paradigm is exploited in a supervised fashion to derive continuous, smooth estimates of time series data (yielding aggregated local information), while simultaneously estimating a continuous shape function yielding optimal predictions. The proposed Supervised Aggregative Feature Extraction (SAFE) methodology can be extended to support nonlinear predictive models by embedding the functional learning framework in a Reproducing Kernel Hilbert Spaces setting. SAFE has a number of attractive features including closed form solution and the ability to explicitly incorporate first and second order derivative information. Using simulation studies and a practical semiconductor manufacturing case study we highlight the strengths of the new methodology with respect to standard unsupervised feature extraction approaches.
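For context, the unsupervised baseline that the abstract contrasts SAFE with (summarising each time series by statistical moments) can be sketched as below. This is not the SAFE method itself, and the function name `moment_features` and the sample series are hypothetical.

```python
import math


def moment_features(series):
    """Summarise one time series as [mean, variance, skewness].

    This is the simple unsupervised summary the abstract mentions as a
    baseline; it ignores the target output entirely.
    """
    n = len(series)
    if n == 0:
        raise ValueError("series must not be empty")
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series) / n   # population variance
    std = math.sqrt(var)
    if std == 0:
        skew = 0.0                                   # constant series
    else:
        skew = sum(((x - mean) / std) ** 3 for x in series) / n
    return [mean, var, skew]


# Hypothetical sensor trace from one batch run.
print(moment_features([1.0, 2.0, 3.0]))
```

Because the three numbers discard the ordering of the samples, two series with very different shapes can map to the same feature vector, which is the information loss the abstract refers to.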
Abstract:
Peptides derived from envelope proteins have been shown to inhibit the protein-protein interactions in the virus membrane fusion process and thus have great potential to be developed into effective antiviral therapies. There are three types of envelope proteins, each exhibiting a distinct structural fold. Although the exact fusion mechanism remains elusive, it has been suggested that the three classes of viral fusion proteins share a similar mechanism of membrane fusion. The common mechanism of action makes it possible to correlate the properties of self-derived peptide inhibitors with their activities. Here we developed a support vector machine model using sequence-based statistical scores of self-derived peptide inhibitors as input features to correlate with their activities. The model displayed 92% prediction accuracy with a Matthews correlation coefficient of 0.84, clearly superior to models using physicochemical properties and amino acid decomposition as input. The predictive support vector machine model for self-derived peptides of envelope proteins would be useful in the development of antiviral peptide inhibitors targeting the virus fusion process.
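The two figures quoted above, accuracy and the Matthews correlation coefficient (MCC), are both derived from a binary confusion matrix. A minimal sketch follows; the function names and the counts in the usage lines are hypothetical and are not the paper's data.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix.

    Ranges from -1 (total disagreement) to +1 (perfect prediction).
    Returns 0.0 when any marginal total is zero, a common convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts, for illustration only.
print(accuracy(tp=45, tn=47, fp=3, fn=5), mcc(tp=45, tn=47, fp=3, fn=5))
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when the active and inactive classes are unevenly sized.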
Abstract:
This paper investigated using lip movements as a behavioural biometric for person authentication. The system was trained, evaluated and tested using the XM2VTS dataset, following the Lausanne Protocol configuration II. Features were selected from the DCT coefficients of the greyscale lip image. This paper investigated the number of DCT coefficients selected, the selection process, and static and dynamic feature combinations. Using a Gaussian Mixture Model - Universal Background Model framework, an Equal Error Rate of 2.20% was achieved during evaluation, and on an unseen test set a False Acceptance Rate of 1.7% and a False Rejection Rate of 3.0% were achieved. This compares favourably with face authentication results on the same dataset whilst not being susceptible to spoofing attacks.
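The error rates reported above are all functions of a decision threshold applied to match scores. The sketch below shows one simple way to compute them, assuming that higher scores mean "more likely genuine" and that a claim is accepted when its score is at or above the threshold. The function names and scores are hypothetical, and the paper's own evaluation procedure may differ.

```python
def far_frr(genuine, impostor, threshold):
    """False Acceptance Rate and False Rejection Rate at one threshold.

    genuine:  scores of true-identity claims (should be accepted)
    impostor: scores of false-identity claims (should be rejected)
    """
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def equal_error_rate(genuine, impostor):
    """Approximate the EER by scanning the observed scores as thresholds.

    Returns (eer, threshold) at the point where FAR and FRR are closest;
    the EER is reported as the average of the two rates at that point.
    """
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = far_frr(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Hypothetical scores, for illustration only.
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.1, 0.2, 0.3, 0.6]
print(equal_error_rate(genuine_scores, impostor_scores))
```

In a protocol with separate evaluation and test sets, the threshold would typically be fixed on the evaluation set and then applied unchanged to the test set, which is why FAR and FRR on unseen data need not be equal.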
Abstract:
The main objective of the study presented in this paper was to investigate the feasibility of using support vector machines (SVM) for the prediction of the fresh properties of self-compacting concrete (SCC). The radial basis function (RBF) and polynomial kernels were used to predict these properties as a function of the content of mix components. The fresh properties were assessed with the slump flow, T50, T60, V-funnel time, Orimet time, and blocking ratio (L-box). The retention of these tests was also measured at 30 and 60 min after adding the first water. The water dosage varied from 188 to 208 L/m³, the dosage of superplasticiser (SP) from 3.8 to 5.8 kg/m³, and the volume of coarse aggregates from 220 to 360 L/m³. In total, twenty mixes with different mixture compositions were used to measure the fresh state properties. The RBF kernel was more accurate than the polynomial kernel, with a root mean square error (RMSE) of 26.9 (correlation coefficient of R² = 0.974) for slump flow prediction, an RMSE of 0.55 (R² = 0.910) for T50 (s) prediction, an RMSE of 1.71 (R² = 0.812) for T60 (s) prediction, an RMSE of 0.1517 (R² = 0.990) for V-funnel time prediction, an RMSE of 3.99 (R² = 0.976) for Orimet time prediction, and an RMSE of 0.042 (R² = 0.988) for L-box ratio prediction. A sensitivity analysis was performed to evaluate the effects of the dosage of cement and limestone powder, the water content, the volumes of coarse aggregate and sand, the dosage of SP and the testing time on the predicted test responses. The analysis indicates that the proposed SVM RBF model can achieve high precision, which provides an alternative method for predicting the fresh properties of SCC.
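To make the terms in this abstract concrete, the sketch below writes out the two kernel functions and the two error measures in plain Python. The parameter names (`gamma`, `degree`, `coef0`) follow common SVM conventions and are assumptions, as are all numeric values. `r_squared` computes the coefficient of determination, which may not be the exact definition of R² the authors used.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Radial basis function kernel: exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def polynomial_kernel(x, y, degree=2, gamma=1.0, coef0=1.0):
    """Polynomial kernel: (gamma * <x, y> + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(x, y))
    return (gamma * dot + coef0) ** degree


def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical mix vectors (water, SP, coarse aggregate), illustration only.
mix_a = [188.0, 3.8, 220.0]
mix_b = [208.0, 5.8, 360.0]
print(rbf_kernel(mix_a, mix_b, gamma=1e-4), polynomial_kernel(mix_a, mix_b))
```

Because the mix components are on very different scales, inputs would normally be standardised before either kernel is applied; otherwise the largest-valued component dominates the distance or dot product.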