813 resultados para graph analysis
Resumo:
The world is rich with information such as signage and maps to assist humans to navigate. We present a method to extract topological spatial information from a generic bitmap floor plan and build a topometric graph that can be used by a mobile robot for tasks such as path planning and guided exploration. The algorithm first detects and extracts text in an image of the floor plan. Using the locations of the extracted text, flood fill is used to find the rooms and hallways. Doors are found by matching SURF features and these form the connections between rooms, which are the edges of the topological graph. Our system is able to automatically detect doors and differentiate between hallways and rooms, which is important for effective navigation. We show that our method can extract a topometric graph from a floor plan and is robust against ambiguous cases most commonly seen in floor plans including elevators and stairwells.
Resumo:
Bioelectrical impedance analysis, (BIA), is a method of body composition analysis first investigated in 1962 which has recently received much attention by a number of research groups. The reasons for this recent interest are its advantages, (viz: inexpensive, non-invasive and portable) and also the increasing interest in the diagnostic value of body composition analysis. The concept utilised by BIA to predict body water volumes is the proportional relationship for a simple cylindrical conductor, (volume oc length2/resistance), which allows the volume to be predicted from the measured resistance and length. Most of the research to date has measured the body's resistance to the passage of a 50· kHz AC current to predict total body water, (TBW). Several research groups have investigated the application of AC currents at lower frequencies, (eg 5 kHz), to predict extracellular water, (ECW). However all research to date using BIA to predict body water volumes has used the impedance measured at a discrete frequency or frequencies. This thesis investigates the variation of impedance and phase of biological systems over a range of frequencies and describes the development of a swept frequency bioimpedance meter which measures impedance and phase at 496 frequencies ranging from 4 kHz to 1 MHz. The impedance of any biological system varies with the frequency of the applied current. The graph of reactance vs resistance yields a circular arc with the resistance decreasing with increasing frequency and reactance increasing from zero to a maximum then decreasing to zero. Computer programs were written to analyse the measured impedance spectrum and determine the impedance, Zc, at the characteristic frequency, (the frequency at which the reactance is a maximum). The fitted locus of the measured data was extrapolated to determine the resistance, Ro, at zero frequency; a value that cannot be measured directly using surface electrodes. The explanation of the theoretical basis for selecting these impedance values (Zc and Ro), to predict TBW and ECW is presented. Studies were conducted on a group of normal healthy animals, (n=42), in which TBW and ECW were determined by the gold standard of isotope dilution. The prediction quotients L2/Zc and L2/Ro, (L=length), yielded standard errors of 4.2% and 3.2% respectively, and were found to be significantly better than previously reported, empirically determined prediction quotients derived from measurements at a single frequency. The prediction equations established in this group of normal healthy animals were applied to a group of animals with abnormally low fluid levels, (n=20), and also to a group with an abnormal balance of extra-cellular to intracellular fluids, (n=20). In both cases the equations using L2/Zc and L2/Ro accurately and precisely predicted TBW and ECW. This demonstrated that the technique developed using multiple frequency bioelectrical impedance analysis, (MFBIA), can accurately predict both TBW and ECW in both normal and abnormal animals, (with standard errors of the estimate of 6% and 3% for TBW and ECW respectively). Isotope dilution techniques were used to determine TBW and ECW in a group of 60 healthy human subjects, (male. and female, aged between 18 and 45). Whole body impedance measurements were recorded on each subject using the MFBIA technique and the correlations between body water volumes, (TBW and ECW), and heighe/impedance, (for all measured frequencies), were compared. The prediction quotients H2/Zc and H2/Ro, (H=height), again yielded the highest correlation with TBW and ECW respectively with corresponding standard errors of 5.2% and 10%. The values of the correlation coefficients obtained in this study were very similar to those recently reported by others. It was also observed that in healthy human subjects the impedance measured at virtually any frequency yielded correlations not significantly different from those obtained from the MFBIA quotients. This phenomenon has been reported by other research groups and emphasises the need to validate the technique by investigating its application in one or more groups with abnormalities in fluid levels. The clinical application of MFBIA was trialled and its capability of detecting lymphoedema, (an excess of extracellular fluid), was investigated. The MFBIA technique was demonstrated to be significantly more sensitive, (P<.05), in detecting lymphoedema than the current technique of circumferential measurements. MFBIA was also shown to provide valuable information describing the changes in the quantity of muscle mass of the patient during the course of the treatment. The determination of body composition, (viz TBW and ECW), by MFBIA has been shown to be a significant improvement on previous bioelectrical impedance techniques. The merit of the MFBIA technique is evidenced in its accurate, precise and valid application in animal groups with a wide variation in body fluid volumes and balances. The multiple frequency bioelectrical impedance analysis technique developed in this study provides accurate and precise estimates of body composition, (viz TBW and ECW), regardless of the individual's state of health.
Resumo:
Bioinformatics involves analyses of biological data such as DNA sequences, microarrays and protein-protein interaction (PPI) networks. Its two main objectives are the identification of genes or proteins and the prediction of their functions. Biological data often contain uncertain and imprecise information. Fuzzy theory provides useful tools to deal with this type of information, hence has played an important role in analyses of biological data. In this thesis, we aim to develop some new fuzzy techniques and apply them on DNA microarrays and PPI networks. We will focus on three problems: (1) clustering of microarrays; (2) identification of disease-associated genes in microarrays; and (3) identification of protein complexes in PPI networks. The first part of the thesis aims to detect, by the fuzzy C-means (FCM) method, clustering structures in DNA microarrays corrupted by noise. Because of the presence of noise, some clustering structures found in random data may not have any biological significance. In this part, we propose to combine the FCM with the empirical mode decomposition (EMD) for clustering microarray data. The purpose of EMD is to reduce, preferably to remove, the effect of noise, resulting in what is known as denoised data. We call this method the fuzzy C-means method with empirical mode decomposition (FCM-EMD). We applied this method on yeast and serum microarrays, and the silhouette values are used for assessment of the quality of clustering. The results indicate that the clustering structures of denoised data are more reasonable, implying that genes have tighter association with their clusters. Furthermore we found that the estimation of the fuzzy parameter m, which is a difficult step, can be avoided to some extent by analysing denoised microarray data. The second part aims to identify disease-associated genes from DNA microarray data which are generated under different conditions, e.g., patients and normal people. We developed a type-2 fuzzy membership (FM) function for identification of diseaseassociated genes. This approach is applied to diabetes and lung cancer data, and a comparison with the original FM test was carried out. Among the ten best-ranked genes of diabetes identified by the type-2 FM test, seven genes have been confirmed as diabetes-associated genes according to gene description information in Gene Bank and the published literature. An additional gene is further identified. Among the ten best-ranked genes identified in lung cancer data, seven are confirmed that they are associated with lung cancer or its treatment. The type-2 FM-d values are significantly different, which makes the identifications more convincing than the original FM test. The third part of the thesis aims to identify protein complexes in large interaction networks. Identification of protein complexes is crucial to understand the principles of cellular organisation and to predict protein functions. In this part, we proposed a novel method which combines the fuzzy clustering method and interaction probability to identify the overlapping and non-overlapping community structures in PPI networks, then to detect protein complexes in these sub-networks. Our method is based on both the fuzzy relation model and the graph model. We applied the method on several PPI networks and compared with a popular protein complex identification method, the clique percolation method. For the same data, we detected more protein complexes. We also applied our method on two social networks. The results showed our method works well for detecting sub-networks and give a reasonable understanding of these communities.
Resumo:
Complex networks have been studied extensively due to their relevance to many real-world systems such as the world-wide web, the internet, biological and social systems. During the past two decades, studies of such networks in different fields have produced many significant results concerning their structures, topological properties, and dynamics. Three well-known properties of complex networks are scale-free degree distribution, small-world effect and self-similarity. The search for additional meaningful properties and the relationships among these properties is an active area of current research. This thesis investigates a newer aspect of complex networks, namely their multifractality, which is an extension of the concept of selfsimilarity. The first part of the thesis aims to confirm that the study of properties of complex networks can be expanded to a wider field including more complex weighted networks. Those real networks that have been shown to possess the self-similarity property in the existing literature are all unweighted networks. We use the proteinprotein interaction (PPI) networks as a key example to show that their weighted networks inherit the self-similarity from the original unweighted networks. Firstly, we confirm that the random sequential box-covering algorithm is an effective tool to compute the fractal dimension of complex networks. This is demonstrated on the Homo sapiens and E. coli PPI networks as well as their skeletons. Our results verify that the fractal dimension of the skeleton is smaller than that of the original network due to the shortest distance between nodes is larger in the skeleton, hence for a fixed box-size more boxes will be needed to cover the skeleton. Then we adopt the iterative scoring method to generate weighted PPI networks of five species, namely Homo sapiens, E. coli, yeast, C. elegans and Arabidopsis Thaliana. By using the random sequential box-covering algorithm, we calculate the fractal dimensions for both the original unweighted PPI networks and the generated weighted networks. The results show that self-similarity is still present in generated weighted PPI networks. This implication will be useful for our treatment of the networks in the third part of the thesis. The second part of the thesis aims to explore the multifractal behavior of different complex networks. Fractals such as the Cantor set, the Koch curve and the Sierspinski gasket are homogeneous since these fractals consist of a geometrical figure which repeats on an ever-reduced scale. Fractal analysis is a useful method for their study. However, real-world fractals are not homogeneous; there is rarely an identical motif repeated on all scales. Their singularity may vary on different subsets; implying that these objects are multifractal. Multifractal analysis is a useful way to systematically characterize the spatial heterogeneity of both theoretical and experimental fractal patterns. However, the tools for multifractal analysis of objects in Euclidean space are not suitable for complex networks. In this thesis, we propose a new box covering algorithm for multifractal analysis of complex networks. This algorithm is demonstrated in the computation of the generalized fractal dimensions of some theoretical networks, namely scale-free networks, small-world networks, random networks, and a kind of real networks, namely PPI networks of different species. Our main finding is the existence of multifractality in scale-free networks and PPI networks, while the multifractal behaviour is not confirmed for small-world networks and random networks. As another application, we generate gene interactions networks for patients and healthy people using the correlation coefficients between microarrays of different genes. Our results confirm the existence of multifractality in gene interactions networks. This multifractal analysis then provides a potentially useful tool for gene clustering and identification. The third part of the thesis aims to investigate the topological properties of networks constructed from time series. Characterizing complicated dynamics from time series is a fundamental problem of continuing interest in a wide variety of fields. Recent works indicate that complex network theory can be a powerful tool to analyse time series. Many existing methods for transforming time series into complex networks share a common feature: they define the connectivity of a complex network by the mutual proximity of different parts (e.g., individual states, state vectors, or cycles) of a single trajectory. In this thesis, we propose a new method to construct networks of time series: we define nodes by vectors of a certain length in the time series, and weight of edges between any two nodes by the Euclidean distance between the corresponding two vectors. We apply this method to build networks for fractional Brownian motions, whose long-range dependence is characterised by their Hurst exponent. We verify the validity of this method by showing that time series with stronger correlation, hence larger Hurst exponent, tend to have smaller fractal dimension, hence smoother sample paths. We then construct networks via the technique of horizontal visibility graph (HVG), which has been widely used recently. We confirm a known linear relationship between the Hurst exponent of fractional Brownian motion and the fractal dimension of the corresponding HVG network. In the first application, we apply our newly developed box-covering algorithm to calculate the generalized fractal dimensions of the HVG networks of fractional Brownian motions as well as those for binomial cascades and five bacterial genomes. The results confirm the monoscaling of fractional Brownian motion and the multifractality of the rest. As an additional application, we discuss the resilience of networks constructed from time series via two different approaches: visibility graph and horizontal visibility graph. Our finding is that the degree distribution of VG networks of fractional Brownian motions is scale-free (i.e., having a power law) meaning that one needs to destroy a large percentage of nodes before the network collapses into isolated parts; while for HVG networks of fractional Brownian motions, the degree distribution has exponential tails, implying that HVG networks would not survive the same kind of attack.
Resumo:
Modelling video sequences by subspaces has recently shown promise for recognising human actions. Subspaces are able to accommodate the effects of various image variations and can capture the dynamic properties of actions. Subspaces form a non-Euclidean and curved Riemannian manifold known as a Grassmann manifold. Inference on manifold spaces usually is achieved by embedding the manifolds in higher dimensional Euclidean spaces. In this paper, we instead propose to embed the Grassmann manifolds into reproducing kernel Hilbert spaces and then tackle the problem of discriminant analysis on such manifolds. To achieve efficient machinery, we propose graph-based local discriminant analysis that utilises within-class and between-class similarity graphs to characterise intra-class compactness and inter-class separability, respectively. Experiments on KTH, UCF Sports, and Ballet datasets show that the proposed approach obtains marked improvements in discrimination accuracy in comparison to several state-of-the-art methods, such as the kernel version of affine hull image-set distance, tensor canonical correlation analysis, spatial-temporal words and hierarchy of discriminative space-time neighbourhood features.
Resumo:
Online social networks can be modelled as graphs; in this paper, we analyze the use of graph metrics for identifying users with anomalous relationships to other users. A framework is proposed for analyzing the effectiveness of various graph theoretic properties such as the number of neighbouring nodes and edges, betweenness centrality, and community cohesiveness in detecting anomalous users. Experimental results on real-world data collected from online social networks show that the majority of users typically have friends who are friends themselves, whereas anomalous users’ graphs typically do not follow this common rule. Empirical analysis also shows that the relationship between average betweenness centrality and edges identifies anomalies more accurately than other approaches.
Resumo:
Secure communications in distributed Wireless Sensor Networks (WSN) operating under adversarial conditions necessitate efficient key management schemes. In the absence of a priori knowledge of post-deployment network configuration and due to limited resources at sensor nodes, key management schemes cannot be based on post-deployment computations. Instead, a list of keys, called a key-chain, is distributed to each sensor node before the deployment. For secure communication, either two nodes should have a key in common in their key-chains, or they should establish a key through a secure-path on which every link is secured with a key. We first provide a comparative survey of well known key management solutions for WSN. Probabilistic, deterministic and hybrid key management solutions are presented, and they are compared based on their security properties and re-source usage. We provide a taxonomy of solutions, and identify trade-offs in them to conclude that there is no one size-fits-all solution. Second, we design and analyze deterministic and hybrid techniques to distribute pair-wise keys to sensor nodes before the deployment. We present novel deterministic and hybrid approaches based on combinatorial design theory and graph theory for deciding how many and which keys to assign to each key-chain before the sensor network deployment. Performance and security of the proposed schemes are studied both analytically and computationally. Third, we address the key establishment problem in WSN which requires key agreement algorithms without authentication are executed over a secure-path. The length of the secure-path impacts the power consumption and the initialization delay for a WSN before it becomes operational. We formulate the key establishment problem as a constrained bi-objective optimization problem, break it into two sub-problems, and show that they are both NP-Hard and MAX-SNP-Hard. Having established inapproximability results, we focus on addressing the authentication problem that prevents key agreement algorithms to be used directly over a wireless link. We present a fully distributed algorithm where each pair of nodes can establish a key with authentication by using their neighbors as the witnesses.
Resumo:
In this paper, we propose a semi-supervised approach of anomaly detection in Online Social Networks. The social network is modeled as a graph and its features are extracted to detect anomaly. A clustering algorithm is then used to group users based on these features and fuzzy logic is applied to assign degree of anomalous behavior to the users of these clusters. Empirical analysis shows effectiveness of this method.
Resumo:
A people-to-people matching system (or a match-making system) refers to a system in which users join with the objective of meeting other users with the common need. Some real-world examples of these systems are employer-employee (in job search networks), mentor-student (in university social networks), consume-to-consumer (in marketplaces) and male-female (in an online dating network). The network underlying in these systems consists of two groups of users, and the relationships between users need to be captured for developing an efficient match-making system. Most of the existing studies utilize information either about each of the users in isolation or their interaction separately, and develop recommender systems using the one form of information only. It is imperative to understand the linkages among the users in the network and use them in developing a match-making system. This study utilizes several social network analysis methods such as graph theory, small world phenomenon, centrality analysis, density analysis to gain insight into the entities and their relationships present in this network. This paper also proposes a new type of graph called “attributed bipartite graph”. By using these analyses and the proposed type of graph, an efficient hybrid recommender system is developed which generates recommendation for new users as well as shows improvement in accuracy over the baseline methods.
Resumo:
Many studies have shown that we can gain additional information on time series by investigating their accompanying complex networks. In this work, we investigate the fundamental topological and fractal properties of recurrence networks constructed from fractional Brownian motions (FBMs). First, our results indicate that the constructed recurrence networks have exponential degree distributions; the average degree exponent 〈λ〉 increases first and then decreases with the increase of Hurst index H of the associated FBMs; the relationship between H and 〈λ〉 can be represented by a cubic polynomial function. We next focus on the motif rank distribution of recurrence networks, so that we can better understand networks at the local structure level. We find the interesting superfamily phenomenon, i.e., the recurrence networks with the same motif rank pattern being grouped into two superfamilies. Last, we numerically analyze the fractal and multifractal properties of recurrence networks. We find that the average fractal dimension 〈dB〉 of recurrence networks decreases with the Hurst index H of the associated FBMs, and their dependence approximately satisfies the linear formula 〈dB〉≈2-H, which means that the fractal dimension of the associated recurrence network is close to that of the graph of the FBM. Moreover, our numerical results of multifractal analysis show that the multifractality exists in these recurrence networks, and the multifractality of these networks becomes stronger at first and then weaker when the Hurst index of the associated time series becomes larger from 0.4 to 0.95. In particular, the recurrence network with the Hurst index H=0.5 possesses the strongest multifractality. In addition, the dependence relationships of the average information dimension 〈D(1)〉 and the average correlation dimension 〈D(2)〉 on the Hurst index H can also be fitted well with linear functions. Our results strongly suggest that the recurrence network inherits the basic characteristic and the fractal nature of the associated FBM series.
Resumo:
Raman spectroscopy of formamide-intercalated kaolinites treated using controlled-rate thermal analysis technology (CRTA), allowing the separation of adsorbed formamide from intercalated formamide in formamide-intercalated kaolinites, is reported. The Raman spectra of the CRTA-treated formamide-intercalated kaolinites are significantly different from those of the intercalated kaolinites, which display a combination of both intercalated and adsorbed formamide. An intense band is observed at 3629 cm-1, attributed to the inner surface hydroxyls hydrogen bonded to the formamide. Broad bands are observed at 3600 and 3639 cm-1, assigned to the inner surface hydroxyls, which are hydrogen bonded to the adsorbed water molecules. The hydroxyl-stretching band of the inner hydroxyl is observed at 3621 cm-1 in the Raman spectra of the CRTA-treated formamide-intercalated kaolinites. The results of thermal analysis show that the amount of intercalated formamide between the kaolinite layers is independent of the presence of water. Significant differences are observed in the CO stretching region between the adsorbed and intercalated formamide.
Resumo:
Diffusion equations that use time fractional derivatives are attractive because they describe a wealth of problems involving non-Markovian Random walks. The time fractional diffusion equation (TFDE) is obtained from the standard diffusion equation by replacing the first-order time derivative with a fractional derivative of order α ∈ (0, 1). Developing numerical methods for solving fractional partial differential equations is a new research field and the theoretical analysis of the numerical methods associated with them is not fully developed. In this paper an explicit conservative difference approximation (ECDA) for TFDE is proposed. We give a detailed analysis for this ECDA and generate discrete models of random walk suitable for simulating random variables whose spatial probability density evolves in time according to this fractional diffusion equation. The stability and convergence of the ECDA for TFDE in a bounded domain are discussed. Finally, some numerical examples are presented to show the application of the present technique.