166 resultados para Networks analysis
em Queensland University of Technology - ePrints Archive
Resumo:
A new relationship type of social networks - online dating - are gaining popularity. With a large member base, users of a dating network are overloaded with choices about their ideal partners. Recommendation methods can be utilized to overcome this problem. However, traditional recommendation methods do not work effectively for online dating networks where the dataset is sparse and large, and a two-way matching is required. This paper applies social networking concepts to solve the problem of developing a recommendation method for online dating networks. We propose a method by using clustering, SimRank and adapted SimRank algorithms to recommend matching candidates. Empirical results show that the proposed method can achieve nearly double the performance of the traditional collaborative filtering and common neighbor methods of recommendation.
Resumo:
The rapid growth in the number of users using social networks and the information that a social network requires about their users make the traditional matching systems insufficiently adept at matching users within social networks. This paper introduces the use of clustering to form communities of users and, then, uses these communities to generate matches. Forming communities within a social network helps to reduce the number of users that the matching system needs to consider, and helps to overcome other problems from which social networks suffer, such as the absence of user activities' information about a new user. The proposed system has been evaluated on a dataset obtained from an online dating website. Empirical analysis shows that accuracy of the matching process is increased using the community information.
Resumo:
This article presents and evaluates a model to automatically derive word association networks from text corpora. Two aspects were evaluated: To what degree can corpus-based word association networks (CANs) approximate human word association networks with respect to (1) their ability to quantitatively predict word associations and (2) their structural network characteristics. Word association networks are the basis of the human mental lexicon. However, extracting such networks from human subjects is laborious, time consuming and thus necessarily limited in relation to the breadth of human vocabulary. Automatic derivation of word associations from text corpora would address these limitations. In both evaluations corpus-based processing provided vector representations for words. These representations were then employed to derive CANs using two measures: (1) the well known cosine metric, which is a symmetric measure, and (2) a new asymmetric measure computed from orthogonal vector projections. For both evaluations, the full set of 4068 free association networks (FANs) from the University of South Florida word association norms were used as baseline human data. Two corpus based models were benchmarked for comparison: a latent topic model and latent semantic analysis (LSA). We observed that CANs constructed using the asymmetric measure were slightly less effective than the topic model in quantitatively predicting free associates, and slightly better than LSA. The structural networks analysis revealed that CANs do approximate the FANs to an encouraging degree.
Resumo:
This paper studies the problem of selecting users in an online social network for targeted advertising so as to maximize the adoption of a given product. In previous work, two families of models have been considered to address this problem: direct targeting and network-based targeting. The former approach targets users with the highest propensity to adopt the product, while the latter approach targets users with the highest influence potential – that is users whose adoption is most likely to be followed by subsequent adoptions by peers. This paper proposes a hybrid approach that combines a notion of propensity and a notion of influence into a single utility function. We show that targeting a fixed number of high-utility users results in more adoptions than targeting either highly influential users or users with high propensity.
Resumo:
We describe the design and evaluation of a platform for networks of cameras in low-bandwidth, low-power sensor networks. In our work to date we have investigated two different DSP hardware/software platforms for undertaking the tasks of compression and object detection and tracking. We compare the relative merits of each of the hardware and software platforms in terms of both performance and energy consumption. Finally we discuss what we believe are the ongoing research questions for image processing in WSNs.
Resumo:
A comprehensive voltage imbalance sensitivity analysis and stochastic evaluation based on the rating and location of single-phase grid-connected rooftop photovoltaic cells (PVs) in a residential low voltage distribution network are presented. The voltage imbalance at different locations along a feeder is investigated. In addition, the sensitivity analysis is performed for voltage imbalance in one feeder when PVs are installed in other feeders of the network. A stochastic evaluation based on Monte Carlo method is carried out to investigate the risk index of the non-standard voltage imbalance in the network in the presence of PVs. The network voltage imbalance characteristic based on different criteria of PV rating and location and network conditions is generalized. Improvement methods are proposed for voltage imbalance reduction and their efficacy is verified by comparing their risk index using Monte Carlo simulations.
Analytical modeling and sensitivity analysis for travel time estimation on signalized urban networks
Resumo:
This paper presents a model for estimation of average travel time and its variability on signalized urban networks using cumulative plots. The plots are generated based on the availability of data: a) case-D, for detector data only; b) case-DS, for detector data and signal timings; and c) case-DSS, for detector data, signal timings and saturation flow rate. The performance of the model for different degrees of saturation and different detector detection intervals is consistent for case-DSS and case-DS whereas, for case-D the performance is inconsistent. The sensitivity analysis of the model for case-D indicates that it is sensitive to detection interval and signal timings within the interval. When detection interval is integral multiple of signal cycle then it has low accuracy and low reliability. Whereas, for detection interval around 1.5 times signal cycle both accuracy and reliability are high.
Resumo:
Wireless network technologies, such as IEEE 802.11 based wireless local area networks (WLANs), have been adopted in wireless networked control systems (WNCS) for real-time applications. Distributed real-time control requires satisfaction of (soft) real-time performance from the underlying networks for delivery of real-time traffic. However, IEEE 802.11 networks are not designed for WNCS applications. They neither inherently provide quality-of-service (QoS) support, nor explicitly consider the characteristics of the real-time traffic on networked control systems (NCS), i.e., periodic round-trip traffic. Therefore, the adoption of 802.11 networks in real-time WNCSs causes challenging problems for network design and performance analysis. Theoretical methodologies are yet to be developed for computing the best achievable WNCS network performance under the constraints of real-time control requirements. Focusing on IEEE 802.11 distributed coordination function (DCF) based WNCSs, this paper analyses several important NCS network performance indices, such as throughput capacity, round trip time and packet loss ratio under the periodic round trip traffic pattern, a unique feature of typical NCSs. Considering periodic round trip traffic, an analytical model based on Markov chain theory is developed for deriving these performance indices under a critical real-time traffic condition, at which the real-time performance constraints are marginally satisfied. Case studies are also carried out to validate the theoretical development.
Resumo:
Genomic and proteomic analyses have attracted a great deal of interests in biological research in recent years. Many methods have been applied to discover useful information contained in the enormous databases of genomic sequences and amino acid sequences. The results of these investigations inspire further research in biological fields in return. These biological sequences, which may be considered as multiscale sequences, have some specific features which need further efforts to characterise using more refined methods. This project aims to study some of these biological challenges with multiscale analysis methods and stochastic modelling approach. The first part of the thesis aims to cluster some unknown proteins, and classify their families as well as their structural classes. A development in proteomic analysis is concerned with the determination of protein functions. The first step in this development is to classify proteins and predict their families. This motives us to study some unknown proteins from specific families, and to cluster them into families and structural classes. We select a large number of proteins from the same families or superfamilies, and link them to simulate some unknown large proteins from these families. We use multifractal analysis and the wavelet method to capture the characteristics of these linked proteins. The simulation results show that the method is valid for the classification of large proteins. The second part of the thesis aims to explore the relationship of proteins based on a layered comparison with their components. Many methods are based on homology of proteins because the resemblance at the protein sequence level normally indicates the similarity of functions and structures. However, some proteins may have similar functions with low sequential identity. We consider protein sequences at detail level to investigate the problem of comparison of proteins. The comparison is based on the empirical mode decomposition (EMD), and protein sequences are detected with the intrinsic mode functions. A measure of similarity is introduced with a new cross-correlation formula. The similarity results show that the EMD is useful for detection of functional relationships of proteins. The third part of the thesis aims to investigate the transcriptional regulatory network of yeast cell cycle via stochastic differential equations. As the investigation of genome-wide gene expressions has become a focus in genomic analysis, researchers have tried to understand the mechanisms of the yeast genome for many years. How cells control gene expressions still needs further investigation. We use a stochastic differential equation to model the expression profile of a target gene. We modify the model with a Gaussian membership function. For each target gene, a transcriptional rate is obtained, and the estimated transcriptional rate is also calculated with the information from five possible transcriptional regulators. Some regulators of these target genes are verified with the related references. With these results, we construct a transcriptional regulatory network for the genes from the yeast Saccharomyces cerevisiae. The construction of transcriptional regulatory network is useful for detecting more mechanisms of the yeast cell cycle.
Resumo:
Trees, shrubs and other vegetation are of continued importance to the environment and our daily life. They provide shade around our roads and houses, offer a habitat for birds and wildlife, and absorb air pollutants. However, vegetation touching power lines is a risk to public safety and the environment, and one of the main causes of power supply problems. Vegetation management, which includes tree trimming and vegetation control, is a significant cost component of the maintenance of electrical infrastructure. For example, Ergon Energy, the Australia’s largest geographic footprint energy distributor, currently spends over $80 million a year inspecting and managing vegetation that encroach on power line assets. Currently, most vegetation management programs for distribution systems are calendar-based ground patrol. However, calendar-based inspection by linesman is labour-intensive, time consuming and expensive. It also results in some zones being trimmed more frequently than needed and others not cut often enough. Moreover, it’s seldom practicable to measure all the plants around power line corridors by field methods. Remote sensing data captured from airborne sensors has great potential in assisting vegetation management in power line corridors. This thesis presented a comprehensive study on using spiking neural networks in a specific image analysis application: power line corridor monitoring. Theoretically, the thesis focuses on a biologically inspired spiking cortical model: pulse coupled neural network (PCNN). The original PCNN model was simplified in order to better analyze the pulse dynamics and control the performance. Some new and effective algorithms were developed based on the proposed spiking cortical model for object detection, image segmentation and invariant feature extraction. The developed algorithms were evaluated in a number of experiments using real image data collected from our flight trails. The experimental results demonstrated the effectiveness and advantages of spiking neural networks in image processing tasks. Operationally, the knowledge gained from this research project offers a good reference to our industry partner (i.e. Ergon Energy) and other energy utilities who wants to improve their vegetation management activities. The novel approaches described in this thesis showed the potential of using the cutting edge sensor technologies and intelligent computing techniques in improve power line corridor monitoring. The lessons learnt from this project are also expected to increase the confidence of energy companies to move from traditional vegetation management strategy to a more automated, accurate and cost-effective solution using aerial remote sensing techniques.
Resumo:
Diversity techniques have long been used to combat the channel fading in wireless communications systems. Recently cooperative communications has attracted lot of attention due to many benefits it offers. Thus cooperative routing protocols with diversity transmission can be developed to exploit the random nature of the wireless channels to improve the network efficiency by selecting multiple cooperative nodes to forward data. In this paper we analyze and evaluate the performance of a novel routing protocol with multiple cooperative nodes which share multiple channels. Multiple shared channels cooperative (MSCC) routing protocol achieves diversity advantage by using cooperative transmission. It unites clustering hierarchy with a bandwidth reuse scheme to mitigate the co-channel interference. Theoretical analysis of average packet reception rate and network throughput of the MSCC protocol are presented and compared with simulated results.
Resumo:
Bioinformatics involves analyses of biological data such as DNA sequences, microarrays and protein-protein interaction (PPI) networks. Its two main objectives are the identification of genes or proteins and the prediction of their functions. Biological data often contain uncertain and imprecise information. Fuzzy theory provides useful tools to deal with this type of information, hence has played an important role in analyses of biological data. In this thesis, we aim to develop some new fuzzy techniques and apply them on DNA microarrays and PPI networks. We will focus on three problems: (1) clustering of microarrays; (2) identification of disease-associated genes in microarrays; and (3) identification of protein complexes in PPI networks. The first part of the thesis aims to detect, by the fuzzy C-means (FCM) method, clustering structures in DNA microarrays corrupted by noise. Because of the presence of noise, some clustering structures found in random data may not have any biological significance. In this part, we propose to combine the FCM with the empirical mode decomposition (EMD) for clustering microarray data. The purpose of EMD is to reduce, preferably to remove, the effect of noise, resulting in what is known as denoised data. We call this method the fuzzy C-means method with empirical mode decomposition (FCM-EMD). We applied this method on yeast and serum microarrays, and the silhouette values are used for assessment of the quality of clustering. The results indicate that the clustering structures of denoised data are more reasonable, implying that genes have tighter association with their clusters. Furthermore we found that the estimation of the fuzzy parameter m, which is a difficult step, can be avoided to some extent by analysing denoised microarray data. The second part aims to identify disease-associated genes from DNA microarray data which are generated under different conditions, e.g., patients and normal people. We developed a type-2 fuzzy membership (FM) function for identification of diseaseassociated genes. This approach is applied to diabetes and lung cancer data, and a comparison with the original FM test was carried out. Among the ten best-ranked genes of diabetes identified by the type-2 FM test, seven genes have been confirmed as diabetes-associated genes according to gene description information in Gene Bank and the published literature. An additional gene is further identified. Among the ten best-ranked genes identified in lung cancer data, seven are confirmed that they are associated with lung cancer or its treatment. The type-2 FM-d values are significantly different, which makes the identifications more convincing than the original FM test. The third part of the thesis aims to identify protein complexes in large interaction networks. Identification of protein complexes is crucial to understand the principles of cellular organisation and to predict protein functions. In this part, we proposed a novel method which combines the fuzzy clustering method and interaction probability to identify the overlapping and non-overlapping community structures in PPI networks, then to detect protein complexes in these sub-networks. Our method is based on both the fuzzy relation model and the graph model. We applied the method on several PPI networks and compared with a popular protein complex identification method, the clique percolation method. For the same data, we detected more protein complexes. We also applied our method on two social networks. The results showed our method works well for detecting sub-networks and give a reasonable understanding of these communities.
Resumo:
Complex networks have been studied extensively due to their relevance to many real-world systems such as the world-wide web, the internet, biological and social systems. During the past two decades, studies of such networks in different fields have produced many significant results concerning their structures, topological properties, and dynamics. Three well-known properties of complex networks are scale-free degree distribution, small-world effect and self-similarity. The search for additional meaningful properties and the relationships among these properties is an active area of current research. This thesis investigates a newer aspect of complex networks, namely their multifractality, which is an extension of the concept of selfsimilarity. The first part of the thesis aims to confirm that the study of properties of complex networks can be expanded to a wider field including more complex weighted networks. Those real networks that have been shown to possess the self-similarity property in the existing literature are all unweighted networks. We use the proteinprotein interaction (PPI) networks as a key example to show that their weighted networks inherit the self-similarity from the original unweighted networks. Firstly, we confirm that the random sequential box-covering algorithm is an effective tool to compute the fractal dimension of complex networks. This is demonstrated on the Homo sapiens and E. coli PPI networks as well as their skeletons. Our results verify that the fractal dimension of the skeleton is smaller than that of the original network due to the shortest distance between nodes is larger in the skeleton, hence for a fixed box-size more boxes will be needed to cover the skeleton. Then we adopt the iterative scoring method to generate weighted PPI networks of five species, namely Homo sapiens, E. coli, yeast, C. elegans and Arabidopsis Thaliana. By using the random sequential box-covering algorithm, we calculate the fractal dimensions for both the original unweighted PPI networks and the generated weighted networks. The results show that self-similarity is still present in generated weighted PPI networks. This implication will be useful for our treatment of the networks in the third part of the thesis. The second part of the thesis aims to explore the multifractal behavior of different complex networks. Fractals such as the Cantor set, the Koch curve and the Sierspinski gasket are homogeneous since these fractals consist of a geometrical figure which repeats on an ever-reduced scale. Fractal analysis is a useful method for their study. However, real-world fractals are not homogeneous; there is rarely an identical motif repeated on all scales. Their singularity may vary on different subsets; implying that these objects are multifractal. Multifractal analysis is a useful way to systematically characterize the spatial heterogeneity of both theoretical and experimental fractal patterns. However, the tools for multifractal analysis of objects in Euclidean space are not suitable for complex networks. In this thesis, we propose a new box covering algorithm for multifractal analysis of complex networks. This algorithm is demonstrated in the computation of the generalized fractal dimensions of some theoretical networks, namely scale-free networks, small-world networks, random networks, and a kind of real networks, namely PPI networks of different species. Our main finding is the existence of multifractality in scale-free networks and PPI networks, while the multifractal behaviour is not confirmed for small-world networks and random networks. As another application, we generate gene interactions networks for patients and healthy people using the correlation coefficients between microarrays of different genes. Our results confirm the existence of multifractality in gene interactions networks. This multifractal analysis then provides a potentially useful tool for gene clustering and identification. The third part of the thesis aims to investigate the topological properties of networks constructed from time series. Characterizing complicated dynamics from time series is a fundamental problem of continuing interest in a wide variety of fields. Recent works indicate that complex network theory can be a powerful tool to analyse time series. Many existing methods for transforming time series into complex networks share a common feature: they define the connectivity of a complex network by the mutual proximity of different parts (e.g., individual states, state vectors, or cycles) of a single trajectory. In this thesis, we propose a new method to construct networks of time series: we define nodes by vectors of a certain length in the time series, and weight of edges between any two nodes by the Euclidean distance between the corresponding two vectors. We apply this method to build networks for fractional Brownian motions, whose long-range dependence is characterised by their Hurst exponent. We verify the validity of this method by showing that time series with stronger correlation, hence larger Hurst exponent, tend to have smaller fractal dimension, hence smoother sample paths. We then construct networks via the technique of horizontal visibility graph (HVG), which has been widely used recently. We confirm a known linear relationship between the Hurst exponent of fractional Brownian motion and the fractal dimension of the corresponding HVG network. In the first application, we apply our newly developed box-covering algorithm to calculate the generalized fractal dimensions of the HVG networks of fractional Brownian motions as well as those for binomial cascades and five bacterial genomes. The results confirm the monoscaling of fractional Brownian motion and the multifractality of the rest. As an additional application, we discuss the resilience of networks constructed from time series via two different approaches: visibility graph and horizontal visibility graph. Our finding is that the degree distribution of VG networks of fractional Brownian motions is scale-free (i.e., having a power law) meaning that one needs to destroy a large percentage of nodes before the network collapses into isolated parts; while for HVG networks of fractional Brownian motions, the degree distribution has exponential tails, implying that HVG networks would not survive the same kind of attack.