114 results for ensemble empirical mode decomposition with canonical correlation analysis (EEMD-CCA)
in Queensland University of Technology - ePrints Archive
Abstract:
The function of a protein can be partially determined by the information contained in its amino acid sequence. Proteins with similar amino acid sequences can normally be assumed to have similar functions. Hence analysing the similarity of proteins has become one of the most important areas of protein study. In this work, a layered comparison method is used to analyse the similarity of proteins. It is based on the empirical mode decomposition (EMD) method, and protein sequences are characterized by the intrinsic mode functions (IMFs). The similarity of proteins is studied with a new cross-correlation formula. The results suggest that the EMD method can be used to detect the functional relationship between two proteins. This kind of similarity method complements traditional sequence similarity approaches, which focus on the alignment of amino acids.
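As an illustration of the kind of measure involved, the following Python sketch computes a normalised cross-correlation between two numeric series at a given lag and takes the maximum over a small lag window. It is not the new cross-correlation formula of the abstract, which is not given here; the function names `normalized_xcorr` and `max_xcorr`, and the idea of applying them to pairs of IMFs, are assumptions made for illustration only.

```python
import math


def normalized_xcorr(a, b, lag):
    """Pearson-style correlation between a[t] and b[t + lag] on their overlap."""
    if lag >= 0:
        pairs = list(zip(a, b[lag:]))
    else:
        pairs = list(zip(a[-lag:], b))
    if len(pairs) < 2:
        return 0.0
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        return 0.0  # a constant segment carries no correlation information
    return cov / math.sqrt(var_x * var_y)


def max_xcorr(a, b, max_lag):
    """Largest normalised cross-correlation over lags -max_lag..max_lag."""
    return max(normalized_xcorr(a, b, lag) for lag in range(-max_lag, max_lag + 1))


# Hypothetical stand-ins for two IMFs: imf_b is imf_a delayed by one sample.
imf_a = [0, 1, 0, -1, 0, 1, 0, -1, 0, 1, 0, -1]
imf_b = [-1] + imf_a[:-1]
print(max_xcorr(imf_a, imf_b, 2))
```

Taking the maximum over a lag window makes the score insensitive to small shifts between the two series, which is one plausible reason to prefer a cross-correlation over a plain pointwise correlation.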
Abstract:
Diagnostics is based on the characterization of the condition of a mechanical system and allows early detection of a possible fault. Signal processing is widely used in diagnostics, since it allows the state of the system to be characterized directly. Several types of advanced signal processing techniques have been proposed in recent decades and added to more conventional ones. These techniques are seldom able to handle non-stationary operation, and the diagnostics of roller bearings is no exception. In this paper, a new vibration signal processing tool, able to perform roller bearing diagnostics in any working condition and at any noise level, is developed on the basis of two data-adaptive techniques, Empirical Mode Decomposition (EMD) and Minimum Entropy Deconvolution (MED), coupled by means of the mathematics of the Hilbert transform. The effectiveness of the new signal processing tool is demonstrated using experimental data measured on a test rig that employs high-power, industrial-size components.
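The Hilbert transform mentioned in this abstract is commonly used to extract the envelope of a vibration signal. The following is a minimal pure-Python sketch of that one step, under the assumption that the envelope is taken as the magnitude of the analytic signal. It does not implement EMD or MED, and the helper names `dft` and `envelope` are hypothetical. The naive O(n²) transform is only suitable for short signals.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive discrete Fourier transform (or its inverse, scaled by 1/n)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out


def envelope(x):
    """Magnitude of the analytic signal of a real sequence x."""
    n = len(x)
    spectrum = dft(x)
    # Keep DC (and Nyquist when n is even), double the positive
    # frequencies, and zero the negative ones.
    weights = [0.0] * n
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        upper = n // 2
    else:
        upper = (n + 1) // 2
    for k in range(1, upper):
        weights[k] = 2.0
    analytic = dft([s * w for s, w in zip(spectrum, weights)], inverse=True)
    return [abs(z) for z in analytic]


# Toy amplitude-modulated signal: slow modulation on a faster carrier.
n = 64
modulation = [1.0 + 0.5 * math.cos(2 * math.pi * t / n) for t in range(n)]
signal = [modulation[t] * math.cos(2 * math.pi * 8 * t / n) for t in range(n)]
recovered = envelope(signal)
print(max(abs(r - m) for r, m in zip(recovered, modulation)))
```

In a bearing-diagnostics setting the spectrum of such an envelope is typically inspected for fault-related frequencies; that step is omitted here.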
Abstract:
Genomic and proteomic analyses have attracted a great deal of interest in biological research in recent years. Many methods have been applied to discover useful information contained in the enormous databases of genomic sequences and amino acid sequences. The results of these investigations in turn inspire further research in biological fields. These biological sequences, which may be considered as multiscale sequences, have specific features that require more refined methods to characterise. This project aims to study some of these biological challenges with multiscale analysis methods and a stochastic modelling approach. The first part of the thesis aims to cluster some unknown proteins, and classify their families as well as their structural classes. A development in proteomic analysis is concerned with the determination of protein functions. The first step in this development is to classify proteins and predict their families. This motivates us to study some unknown proteins from specific families, and to cluster them into families and structural classes. We select a large number of proteins from the same families or superfamilies, and link them to simulate some unknown large proteins from these families. We use multifractal analysis and the wavelet method to capture the characteristics of these linked proteins. The simulation results show that the method is valid for the classification of large proteins. The second part of the thesis aims to explore the relationship of proteins based on a layered comparison with their components. Many methods are based on homology of proteins because the resemblance at the protein sequence level normally indicates the similarity of functions and structures. However, some proteins may have similar functions with low sequence identity. We consider protein sequences at the detail level to investigate the problem of comparison of proteins.
The comparison is based on the empirical mode decomposition (EMD), and protein sequences are characterised by the intrinsic mode functions. A measure of similarity is introduced with a new cross-correlation formula. The similarity results show that the EMD is useful for detection of functional relationships of proteins. The third part of the thesis aims to investigate the transcriptional regulatory network of the yeast cell cycle via stochastic differential equations. As the investigation of genome-wide gene expression has become a focus in genomic analysis, researchers have tried to understand the mechanisms of the yeast genome for many years. How cells control gene expression still needs further investigation. We use a stochastic differential equation to model the expression profile of a target gene. We modify the model with a Gaussian membership function. For each target gene, a transcriptional rate is obtained, and the estimated transcriptional rate is also calculated with the information from five possible transcriptional regulators. Some regulators of these target genes are verified against the related references. With these results, we construct a transcriptional regulatory network for the genes from the yeast Saccharomyces cerevisiae. The construction of a transcriptional regulatory network is useful for detecting more mechanisms of the yeast cell cycle.
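To make the stochastic differential equation modelling in the third part more concrete, here is a minimal Python sketch of an Euler-Maruyama simulation of a single expression profile. The model form dx = (g - decay * x) dt + sigma dW, with transcriptional rate `g`, degradation rate `decay` and noise level `sigma`, is an assumption chosen for illustration. The abstract does not state the actual equation used in the thesis or how the Gaussian membership function enters it.

```python
import math
import random


def simulate_expression(g, decay, sigma, x0, dt, steps, seed=0):
    """Euler-Maruyama path for the assumed model dx = (g - decay*x) dt + sigma dW."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    path = [x0]
    x = x0
    for _ in range(steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over one step
        x = x + (g - decay * x) * dt + sigma * dw
        path.append(x)
    return path


# Hypothetical parameter values, chosen only to show the call.
noisy = simulate_expression(g=2.0, decay=1.0, sigma=0.3, x0=0.0, dt=0.01, steps=2000)
print(len(noisy), noisy[-1])
```

With `sigma=0.0` the path reduces to the deterministic Euler solution and settles near `g / decay`, which is a convenient sanity check before adding noise.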
Abstract:
Bioinformatics involves analyses of biological data such as DNA sequences, microarrays and protein-protein interaction (PPI) networks. Its two main objectives are the identification of genes or proteins and the prediction of their functions. Biological data often contain uncertain and imprecise information. Fuzzy theory provides useful tools to deal with this type of information, and hence has played an important role in analyses of biological data. In this thesis, we aim to develop some new fuzzy techniques and apply them to DNA microarrays and PPI networks. We will focus on three problems: (1) clustering of microarrays; (2) identification of disease-associated genes in microarrays; and (3) identification of protein complexes in PPI networks. The first part of the thesis aims to detect, by the fuzzy C-means (FCM) method, clustering structures in DNA microarrays corrupted by noise. Because of the presence of noise, some clustering structures found in random data may not have any biological significance. In this part, we propose to combine the FCM with the empirical mode decomposition (EMD) for clustering microarray data. The purpose of EMD is to reduce, preferably to remove, the effect of noise, resulting in what is known as denoised data. We call this method the fuzzy C-means method with empirical mode decomposition (FCM-EMD). We applied this method to yeast and serum microarrays, and silhouette values are used to assess the quality of clustering. The results indicate that the clustering structures of denoised data are more reasonable, implying that genes have tighter association with their clusters. Furthermore, we found that the estimation of the fuzzy parameter m, which is a difficult step, can be avoided to some extent by analysing denoised microarray data. The second part aims to identify disease-associated genes from DNA microarray data which are generated under different conditions, e.g., patients and normal people.
We developed a type-2 fuzzy membership (FM) function for identification of disease-associated genes. This approach is applied to diabetes and lung cancer data, and a comparison with the original FM test was carried out. Among the ten best-ranked genes of diabetes identified by the type-2 FM test, seven genes have been confirmed as diabetes-associated genes according to gene description information in Gene Bank and the published literature. An additional gene is further identified. Among the ten best-ranked genes identified in lung cancer data, seven are confirmed to be associated with lung cancer or its treatment. The type-2 FM-d values are significantly different, which makes the identifications more convincing than those of the original FM test. The third part of the thesis aims to identify protein complexes in large interaction networks. Identification of protein complexes is crucial to understand the principles of cellular organisation and to predict protein functions. In this part, we proposed a novel method which combines the fuzzy clustering method and interaction probability to identify the overlapping and non-overlapping community structures in PPI networks, and then to detect protein complexes in these sub-networks. Our method is based on both the fuzzy relation model and the graph model. We applied the method to several PPI networks and compared it with a popular protein complex identification method, the clique percolation method. For the same data, we detected more protein complexes. We also applied our method to two social networks. The results showed that our method works well for detecting sub-networks and gives a reasonable understanding of these communities.
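For readers unfamiliar with the fuzzy C-means step used in the first part of this thesis, the following Python sketch shows the standard alternating update of memberships and cluster centres on one-dimensional data. It is a generic illustration, not the FCM-EMD method itself (no EMD denoising is performed), and the function names and toy data are hypothetical. The parameter `m` is the fuzzy parameter referred to in the abstract.

```python
def memberships_for(x, centers, m):
    """Standard FCM membership of point x in each cluster (values sum to 1)."""
    dists = [abs(x - c) for c in centers]
    zeros = [d == 0.0 for d in dists]
    if any(zeros):
        # The point coincides with one or more centres: share membership there.
        share = 1.0 / sum(zeros)
        return [share if z else 0.0 for z in zeros]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]


def fcm_1d(data, centers, m=2.0, iterations=50):
    """Alternate membership and centre updates for a fixed number of iterations."""
    centers = list(centers)
    for _ in range(iterations):
        u = [memberships_for(x, centers, m) for x in data]
        centers = [
            sum(row[i] ** m * x for row, x in zip(u, data))
            / sum(row[i] ** m for row in u)
            for i in range(len(centers))
        ]
    return centers, [memberships_for(x, centers, m) for x in data]


# Toy data with two obvious groups; the initial centres are guesses.
data = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
centers, u = fcm_1d(data, centers=[0.0, 5.0])
print(centers)
```

Larger values of `m` make the memberships softer, which is why the choice of `m` affects the clustering structure that is reported.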
Abstract:
As a part of vital infrastructure and transportation networks, bridge structures must function safely at all times. However, due to heavier and faster moving vehicular loads and function adjustment, such as Busway accommodation, many bridges are now operating at an overload beyond their design capacity. Additionally, the huge renovation and replacement costs are difficult for infrastructure owners to bear. Structural health monitoring (SHM) is intended to assess the condition of designated bridges and foresee probable failures. The SHM systems proposed recently incorporate vibration-based damage detection (VBDD) techniques, statistical methods and signal processing techniques, and have been regarded as efficient and economical ways to solve the problem. Recent developments in damage detection and condition assessment techniques based on VBDD and statistical methods are reviewed. The VBDD methods based on changes in natural frequencies, curvature/strain modes, modal strain energy (MSE), dynamic flexibility, artificial neural networks (ANN), before and after damage, and other signal processing methods such as wavelet techniques and empirical mode decomposition (EMD) / Hilbert spectrum methods are discussed here.
Abstract:
Structural health is a vital aspect of infrastructure sustainability. As a part of a vital infrastructure and transportation network, bridge structures must function safely at all times. However, due to heavier and faster moving vehicular loads and function adjustment, such as Busway accommodation, many bridges are now operating at an overload beyond their design capacity. Additionally, the huge renovation and replacement costs are a difficult burden for infrastructure owners. The structural health monitoring (SHM) systems proposed recently are incorporated with vibration-based damage detection techniques, statistical methods and signal processing techniques and have been regarded as efficient and economical ways to assess bridge condition and foresee probable costly failures. In this chapter, the recent developments in damage detection and condition assessment techniques based on vibration-based damage detection and statistical methods are reviewed. The vibration-based damage detection methods based on changes in natural frequencies, curvature or strain modes, modal strain energy, dynamic flexibility, artificial neural networks, before and after damage, and other signal processing methods such as Wavelet techniques, empirical mode decomposition and Hilbert spectrum methods are discussed in this chapter.
Abstract:
Modelling video sequences by subspaces has recently shown promise for recognising human actions. Subspaces are able to accommodate the effects of various image variations and can capture the dynamic properties of actions. Subspaces form a non-Euclidean and curved Riemannian manifold known as a Grassmann manifold. Inference on manifold spaces usually is achieved by embedding the manifolds in higher dimensional Euclidean spaces. In this paper, we instead propose to embed the Grassmann manifolds into reproducing kernel Hilbert spaces and then tackle the problem of discriminant analysis on such manifolds. To achieve efficient machinery, we propose graph-based local discriminant analysis that utilises within-class and between-class similarity graphs to characterise intra-class compactness and inter-class separability, respectively. Experiments on KTH, UCF Sports, and Ballet datasets show that the proposed approach obtains marked improvements in discrimination accuracy in comparison to several state-of-the-art methods, such as the kernel version of affine hull image-set distance, tensor canonical correlation analysis, spatial-temporal words and hierarchy of discriminative space-time neighbourhood features.
Abstract:
Spectroscopic studies of complex clinical fluids have made a more holistic approach to their chemical analysis increasingly popular and widely employed. The efficient and effective interpretation of multidimensional spectroscopic data relies on many chemometric techniques, and one such group of tools is represented by so-called correlation analysis methods. Typical of these techniques are two-dimensional correlation analysis and statistical total correlation spectroscopy (STOCSY). Whilst the former has largely been applied to optical spectroscopic analysis, STOCSY was developed for, and has been applied almost exclusively to, NMR metabonomic studies. Using a 1H NMR study of human blood plasma from subjects recovering from exhaustive exercise trials, the basic concepts and applications of these techniques are examined. Typical information from their application to NMR-based metabonomics is presented and their value in aiding interpretation of NMR data obtained from biological systems is illustrated. Major energy metabolites are identified in the NMR spectra and the dynamics of their appearance and removal from plasma during exercise recovery are illustrated and discussed. The complementary nature of two-dimensional correlation analysis and statistical total correlation spectroscopy is highlighted.
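The core computation behind STOCSY can be sketched as the Pearson correlation, across samples, between one chosen spectral variable (the driver) and every other variable. The Python sketch below assumes spectra stored as rows of equal length. The names `pearson`, `stocsy` and `driver`, and the toy intensities, are hypothetical and not taken from the study.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences with non-zero variance."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def stocsy(spectra, driver):
    """Correlate the driver column with every column across all spectra."""
    columns = list(zip(*spectra))  # one tuple per spectral variable
    return [pearson(columns[driver], col) for col in columns]


# Toy data: 4 samples (rows) by 3 spectral variables (columns).
spectra = [
    [1.0, 2.0, 5.0],
    [2.0, 4.0, 3.0],
    [3.0, 6.0, 4.0],
    [4.0, 8.0, 1.0],
]
correlations = stocsy(spectra, driver=0)
print(correlations)
```

Variables that rise and fall together with the driver across samples score close to 1, which is how peaks belonging to the same molecule can be grouped.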
Abstract:
The diagnostics of mechanical components operating in transient conditions is still an open issue in both research and industry. Indeed, the signal processing techniques developed to analyse stationary data are either not applicable or lose effectiveness when applied to signals acquired in transient conditions. In this paper, a suitable and original signal processing tool (named EEMED), which can be used for mechanical component diagnostics in any operating condition and at any noise level, is developed by exploiting data-adaptive techniques such as Empirical Mode Decomposition (EMD) and Minimum Entropy Deconvolution (MED), together with the analytical approach of the Hilbert transform. The proposed tool is able to supply diagnostic information on the basis of experimental vibrations measured in transient conditions. The tool was originally developed to detect localized faults on bearings installed in high-speed train traction equipment, and it is more effective at detecting a fault in non-stationary conditions than signal processing tools based on spectral kurtosis or envelope analysis, which have until now been the benchmark for bearing diagnostics.