16 results for networks text analysis text network graph Gephi network measures shuffed text Zipf Heap Python
in Digital Commons - Michigan Tech
Abstract:
The developmental processes and functions of an organism are controlled by the genes and the proteins that are derived from these genes. The identification of key genes and the reconstruction of gene networks can provide a model to help us understand the regulatory mechanisms for the initiation and progression of biological processes or functional abnormalities (e.g. diseases) in living organisms. In this dissertation, I have developed statistical methods to identify the genes and transcription factors (TFs) involved in biological processes, constructed their regulatory networks, and also evaluated some existing association methods to find robust methods for coexpression analyses. Two kinds of data sets were used for this work: genotype data and gene expression microarray data. On the basis of these data sets, this dissertation has two major parts, together forming six chapters. The first part deals with developing association methods for rare variants using genotype data (chapter 4 and 5). The second part deals with developing and/or evaluating statistical methods to identify genes and TFs involved in biological processes, and construction of their regulatory networks using gene expression data (chapter 2, 3, and 6). For the first part, I have developed two methods to find the groupwise association of rare variants with given diseases or traits. The first method is based on kernel machine learning and can be applied to both quantitative as well as qualitative traits. Simulation results showed that the proposed method has improved power over the existing weighted sum method (WS) in most settings. The second method uses multiple phenotypes to select a few top significant genes. It then finds the association of each gene with each phenotype while controlling the population stratification by adjusting the data for ancestry using principal components. This method was applied to GAW 17 data and was able to find several disease risk genes. 
For the second part, I worked on three problems. The first problem involved the evaluation of eight gene association methods. A comprehensive comparison of these methods, with further analysis, demonstrates their distinct and common performance characteristics. For the second problem, an algorithm named the bottom-up graphical Gaussian model was developed to identify the TFs that regulate pathway genes and to reconstruct their hierarchical regulatory networks. This algorithm produced significant results and is the first reported to produce such hierarchical networks for these pathways. The third problem dealt with developing another algorithm, the top-down graphical Gaussian model, which identifies the network governed by a specific TF. The network produced by this algorithm was found to be highly accurate.
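Graphical Gaussian models, as named in the abstract above, are generally built on partial correlations: two genes are linked only if they remain correlated after the influence of other genes is removed. The following is a minimal sketch of that idea for three variables, not the dissertation's bottom-up or top-down algorithm. The function names and the expression profiles are hypothetical.

```python
import math

def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

# Hypothetical expression profiles: one TF drives two target genes,
# each with its own small, unrelated noise pattern.
tf = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
gene_a = [t + e for t, e in zip(tf, [0.1, -0.1, 0.1, -0.1, 0.1, -0.1])]
gene_b = [t + e for t, e in zip(tf, [0.1, 0.1, -0.1, -0.1, 0.1, 0.1])]

marginal = pearson(gene_a, gene_b)                # high: both follow the TF
conditional = partial_corr(gene_a, gene_b, tf)    # near zero once the TF is removed
```

In this toy data the two genes look strongly coexpressed, but the partial correlation given the TF vanishes, so a graphical Gaussian model would draw edges from the TF to each gene and none between the genes.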
Abstract:
Dynamic spectrum access (DSA) aims at utilizing spectral opportunities in both the time and frequency domains at any given location, which arise due to variations in spectrum usage. Recently, cognitive radios (CRs) have been proposed as a means of implementing DSA. In this work we focus on resource management in overlaid cognitive radio networks (CRNs). We formulate resource allocation strategies for CRNs as mathematical optimization problems. Specifically, we focus on two key problems in resource management: Sum Rate Maximization and Maximization of Number of Admitted Users. Since both problems are NP-hard due to the presence of binary assignment variables, we propose novel graph-based algorithms to solve them optimally. Further, we analyze the impact of location awareness on the network performance of CRNs by considering three cases: Full Location Aware, Partial Location Aware and Non Location Aware. Our results clearly show that location awareness has a significant impact on the performance of overlaid CRNs and leads to an increase in spectrum utilization efficiency.
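To make the role of the binary assignment variables concrete, the sketch below solves a tiny sum-rate maximization by exhaustive search, assuming each user receives exactly one channel and each channel serves one user. This brute-force search is a stand-in for illustration only; it is not the graph-based algorithm of the work above. The SNR table and function names are hypothetical.

```python
import math
from itertools import permutations

# Hypothetical linear SNRs: snr[u][c] is the SNR of user u on channel c.
snr = [
    [15.0, 3.0, 1.0],
    [7.0, 7.0, 1.0],
    [3.0, 1.0, 0.0],
]

def rate(s):
    """Shannon rate in bit/s/Hz for a linear SNR value."""
    return math.log2(1.0 + s)

def best_assignment(snr):
    """Try every one-to-one user/channel assignment and keep the best sum rate."""
    n = len(snr)
    best_total, best_perm = -1.0, None
    for perm in permutations(range(n)):  # perm[u] = channel given to user u
        total = sum(rate(snr[u][perm[u]]) for u in range(n))
        if total > best_total:
            best_total, best_perm = total, perm
    return best_total, best_perm

total, perm = best_assignment(snr)
```

The search visits n! assignments, which is why exhaustive enumeration stops being practical beyond a handful of users and motivates structured algorithms.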
Abstract:
Fuzzy community detection aims to identify fuzzy communities in a network: groups of vertices such that the membership of a vertex in each community lies in [0,1] and the memberships of a vertex across all communities sum to 1. Fuzzy communities are pervasive in social networks, but only a few works have addressed fuzzy community detection. Recently, a one-step extension of Newman’s Modularity, the most popular quality function for disjoint community detection, resulted in the Generalized Modularity (GM), which demonstrates good performance in finding well-known fuzzy communities. Thus, GM is chosen as the quality function in our research. We first propose a generalized fuzzy t-norm modularity to investigate the effect of different fuzzy intersection operators on fuzzy community detection, since the introduction of a fuzzy intersection operation is made feasible by GM. The experimental results show that the Yager operator with a proper parameter value performs better than the product operator in revealing community structure. Then, we focus on how to find optimal fuzzy communities in a network by directly maximizing GM, which we call the Fuzzy Modularity Maximization (FMM) problem. The effort on the FMM problem results in the major contribution of this thesis: an efficient and effective GM-based fuzzy community detection method that automatically discovers a fuzzy partition of a network when appropriate, which is much better than the fuzzy partitions found by existing fuzzy community detection methods, and a crisp partition of a network when appropriate, which is competitive with the partitions produced by the best disjoint community detection methods to date. We address the FMM problem by iteratively solving a sub-problem called One-Step Modularity Maximization (OSMM).
We present two approaches for solving this iterative procedure: a tree-based global optimizer called Find Best Leaf Node (FBLN) and a heuristic-based local optimizer. The OSMM problem is based on a simplified quadratic knapsack problem that can be solved in linear time; thus, a solution of OSMM can be found in linear time. Since the OSMM algorithm is called recursively within FBLN and the structure of the search tree is non-deterministic, the FMM/FBLN algorithm has a time complexity of at least O(n^2). We therefore also propose several highly efficient and effective heuristic algorithms, namely the FMM/H algorithms. We compared our proposed FMM/H algorithms with two state-of-the-art community detection methods, modified MULTICUT Spectral Fuzzy c-Means (MSFCM) and Genetic Algorithm with a Local Search strategy (GALS), on 10 real-world data sets. The experimental results suggest that the H2 variant of FMM/H is the best performing version. The H2 algorithm is very competitive with GALS in producing maximum modularity partitions and performs much better than MSFCM. On all 10 data sets, H2 is also 2-3 orders of magnitude faster than GALS. Furthermore, by adopting a slightly modified version of the H2 algorithm as a mutation operator, we designed a genetic algorithm for fuzzy community detection, namely GAFCD, in which elite selection and early termination are applied. The crossover operator is designed to make GAFCD converge fast and to enhance GAFCD’s ability to escape local optima. Experimental results on all the data sets show that GAFCD uncovers better community structure than GALS.
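As an illustration of the quality function discussed above, the sketch below evaluates one commonly used fuzzy generalization of Newman's modularity with the product operator: Q = (1/2m) * sum over communities c and vertex pairs (i, j) of (A_ij - k_i*k_j/2m) * u_ic * u_jc. Whether this matches the thesis's GM in every detail is an assumption, and the function name and example graph are hypothetical.

```python
def fuzzy_modularity(adj, u):
    """Product-operator fuzzy modularity.

    adj: symmetric 0/1 adjacency matrix (list of lists).
    u:   membership matrix, u[i][c] in [0, 1], each row summing to 1.
    """
    n = len(adj)
    deg = [sum(row) for row in adj]
    two_m = sum(deg)  # twice the number of edges
    q = 0.0
    for c in range(len(u[0])):
        for i in range(n):
            for j in range(n):
                q += (adj[i][j] - deg[i] * deg[j] / two_m) * u[i][c] * u[j][c]
    return q / two_m

# Example graph: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = [[0] * 6 for _ in range(6)]
for a, b in edges:
    adj[a][b] = adj[b][a] = 1

crisp = [[1.0, 0.0]] * 3 + [[0.0, 1.0]] * 3   # each triangle is its own community
uniform = [[0.5, 0.5]] * 6                     # every vertex split evenly

q_crisp = fuzzy_modularity(adj, crisp)      # equals Newman's modularity, 5/14
q_uniform = fuzzy_modularity(adj, uniform)  # carries no structure, so 0
```

With crisp 0/1 memberships the expression reduces to ordinary modularity, which is why a GM-based method can return either a fuzzy or a crisp partition from the same objective.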
Abstract:
Wireless sensor networks are an emerging research topic due to their vast and ever-growing applications. They are made up of small nodes whose main goal is to monitor, compute and transmit data. Each node consists of a low-power microcontroller, a wireless transceiver chip, sensors to monitor its environment and a power source. The applications of wireless sensor networks range from basic household applications, such as health monitoring, appliance control and security, to military applications, such as intruder detection. The widespread application of wireless sensor networks has brought to light many research issues such as battery efficiency, unreliable routing protocols due to node failures, localization issues and security vulnerabilities. This report describes the hardware development of a fault-tolerant routing protocol for a railroad pedestrian warning system. The protocol implemented is a peer-to-peer, multi-hop, TDMA-based protocol for nodes arranged in a linear zigzag chain. The basic working of the protocol was derived from Wireless Architecture for Hard Real-Time Embedded Networks (WAHREN).
Abstract:
Mobile sensor networks have unique advantages compared with static wireless sensor networks. Mobility enables mobile sensors to flexibly reconfigure themselves to meet sensing requirements. In this dissertation, an adaptive sampling method for mobile sensor networks is presented. Taking into account sensing resource constraints, computing abilities, and onboard energy limitations, the adaptive sampling method follows a down-sampling scheme, which reduces the total number of measurements and lowers the sampling cost. Compressive sensing is a recently developed down-sampling method that uses a small number of randomly distributed measurements for signal reconstruction. However, original signals cannot be reconstructed directly from condensed measurements, as implied by the Shannon sampling theorem. Measurements have to be processed in a sparse domain, and convex optimization methods should be applied to reconstruct the original signals. The restricted isometry property guarantees that signals can be recovered with little information loss. While compressive sensing can effectively lower the sampling cost, signal reconstruction is still a great research challenge. Compressive sensing always collects random measurements, whose information content cannot be determined a priori. If each measurement is optimized to be the most informative one, the reconstruction performance can be much better. Based on this consideration, this dissertation focuses on an adaptive sampling approach that finds the most informative measurements in unknown environments and reconstructs the original signals. With mobile sensors, measurements are collected sequentially, giving the chance to optimize each of them individually. When mobile sensors are about to collect a new measurement from the surrounding environment, existing information is shared among networked sensors so that each sensor has a global view of the entire environment.
Shared information is analyzed in the Haar wavelet domain, in which most natural signals appear sparse, to infer a model of the environment. The most informative measurements can be determined by optimizing model parameters. As a result, all the measurements collected by the mobile sensor network are the most informative measurements given existing information, and a near-perfect reconstruction would be expected. To present the adaptive sampling method, a series of research issues will be addressed, including measurement evaluation and collection, mobile network establishment, data fusion, sensor motion, and signal reconstruction. A two-dimensional scalar field will be reconstructed using the proposed method. Both single mobile sensors and mobile sensor networks will be deployed in the environment, and the reconstruction performance of both will be compared. In addition, a particular mobile sensor, a quadrotor UAV, is developed so that the adaptive sampling method can be used in three-dimensional scenarios.
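The sparsity claim above can be illustrated with a small one-dimensional example. The sketch below applies a full orthonormal Haar transform to a piecewise-constant signal and counts the nonzero coefficients. It assumes a 1-D signal whose length is a power of two, whereas the dissertation works with two-dimensional fields; the function name and the sample signal are hypothetical.

```python
import math

def haar(signal):
    """Full orthonormal 1-D Haar transform; len(signal) must be a power of 2."""
    out = list(signal)
    n = len(out)
    root2 = math.sqrt(2.0)
    while n > 1:
        half = n // 2
        # Pairwise scaled averages (coarse part) and differences (detail part).
        avg = [(out[2 * i] + out[2 * i + 1]) / root2 for i in range(half)]
        dif = [(out[2 * i] - out[2 * i + 1]) / root2 for i in range(half)]
        out[:n] = avg + dif
        n = half  # recurse on the coarse part only
    return out

# Hypothetical field sample: two flat regions with a single step between them.
field = [4.0, 4.0, 4.0, 4.0, 1.0, 1.0, 1.0, 1.0]
coeffs = haar(field)
nonzero = sum(1 for c in coeffs if abs(c) > 1e-9)
```

Eight samples collapse to two significant coefficients here, which is the kind of compressibility that lets a signal be recovered from far fewer measurements than samples.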
Abstract:
A building energy meter network, based on a per-appliance monitoring system, will be an important part of the Advanced Metering Infrastructure. Two key issues exist in designing such networks. One is the network structure to be used. The other is the implementation of the network structure on a large number of small low-power devices, and the maintenance of high-quality communication when the devices are electrically connected to a high-voltage AC line. Recent advances in low-power wireless communication make it the right candidate for house and building energy networks. Among the available wireless solutions, the low-speed but highly reliable 802.15.4 radio has been chosen in this design. While many network-layer solutions have been provided on top of 802.15.4, an IPv6-based method is used in this design. 6LoWPAN is the particular protocol that adapts IP to low-power personal area network radio. In order to extend the network into the building area without, a specific network-layer routing mechanism, RPL, is included in this design. The fundamental unit of the building energy monitoring system is a smart wall plug. It consists of an electricity energy meter, an RF communication module and a low-power CPU. The real challenge in designing such a device is its network firmware. In this design, IPv6 is implemented through the Contiki operating system. A customized hardware driver and a meter application program have been developed on top of Contiki OS. Experiments have been carried out to demonstrate the networking capability of this system.
Abstract:
Target localization has a wide range of military and civilian applications in wireless mobile networks. Examples include battle-field surveillance, emergency 911 (E911), traffic alert, habitat monitoring, resource allocation, routing, and disaster mitigation. Basic localization techniques include time-of-arrival (TOA), direction-of-arrival (DOA) and received-signal strength (RSS) estimation. Techniques that are proposed based on TOA and DOA are very sensitive to the availability of line-of-sight (LOS), which is the direct path between the transmitter and the receiver. If LOS is not available, TOA and DOA estimation errors create a large localization error. In order to reduce non-line-of-sight (NLOS) localization error, NLOS identification, mitigation, and localization techniques have been proposed. This research investigates NLOS identification for multiple-antenna radio systems. The techniques proposed in the literature mainly use one antenna element to enable NLOS identification. When a single antenna is utilized, limited features of the wireless channel can be exploited to identify NLOS situations. However, in DOA-based wireless localization systems, multiple antenna elements are available. In addition, multiple-antenna technology has been adopted in many widely used wireless systems, such as wireless LAN 802.11n and WiMAX 802.16e, which are good candidates for localization-based services. In this work, the potential of spatial channel information for high-performance NLOS identification is investigated. Considering narrowband multiple-antenna wireless systems, two NLOS identification techniques are proposed. Here, the use of the spatial correlation of channel coefficients across antenna elements as a metric for NLOS identification is proposed. In order to obtain the spatial correlation, a new multi-input multi-output (MIMO) channel model based on rough surface theory is proposed.
This model can be used to compute the spatial correlation between an antenna pair separated by any distance. In addition, a new NLOS identification technique that exploits the statistics of the phase difference across two antenna elements is proposed. This technique assumes the phases received across two antenna elements are uncorrelated. This assumption is validated based on the well-known circular and elliptic scattering models. Next, it is proved that the channel Rician K-factor is a function of the phase difference variance. Exploiting the Rician K-factor, techniques to identify NLOS scenarios are proposed. Considering wideband multiple-antenna wireless systems that use MIMO-orthogonal frequency division multiplexing (OFDM) signaling, space-time-frequency channel correlation is exploited to attain NLOS identification in time-varying, frequency-selective and space-selective radio channels. Novel NLOS identification measures based on space, time and frequency channel correlation are proposed and their performance is evaluated. These measures achieve better NLOS identification performance than those that use only space, time or frequency.
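The spatial-correlation metric described above can be sketched as a normalized correlation magnitude between the complex channel coefficients observed at two antenna elements. This is a generic estimator written for illustration, not the rough-surface MIMO model of the dissertation. The function name and sample coefficients are hypothetical, and any decision threshold applied to the result would be an additional assumption.

```python
import cmath
import math

def spatial_correlation(h1, h2):
    """Magnitude of the normalized correlation between two coefficient sequences.

    h1, h2: equal-length lists of complex channel coefficients, one per snapshot.
    Returns a value in [0, 1].
    """
    num = sum(a * b.conjugate() for a, b in zip(h1, h2))
    den = math.sqrt(sum(abs(a) ** 2 for a in h1) * sum(abs(b) ** 2 for b in h2))
    return abs(num) / den

# Hypothetical snapshots at antenna 1.
h_ant1 = [1 + 0j, 1j, -1 + 0j, -1j]
# Antenna 2 sees the same channel with a fixed phase offset: fully correlated.
h_same = [a * cmath.exp(1j * 0.3) for a in h_ant1]
# Antenna 2 sees an unrelated sequence: orthogonal to antenna 1 in this toy case.
h_other = [1 + 0j, 1 + 0j, 1 + 0j, 1 + 0j]

rho_same = spatial_correlation(h_ant1, h_same)
rho_other = spatial_correlation(h_ant1, h_other)
```

A constant phase offset between elements leaves the metric at 1, while unrelated coefficients drive it toward 0, which is what makes it usable as a feature for separating propagation conditions.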
Abstract:
With wireless vehicular communications, Vehicular Ad Hoc Networks (VANETs) enable numerous applications to enhance traffic safety, traffic efficiency, and driving experience. However, VANETs also impose severe security and privacy challenges which need to be thoroughly investigated. In this dissertation, we enhance the security, privacy, and applications of VANETs, by 1) designing application-driven security and privacy solutions for VANETs, and 2) designing appealing VANET applications with proper security and privacy assurance. First, the security and privacy challenges of VANETs with most application significance are identified and thoroughly investigated. With both theoretical novelty and realistic considerations, these security and privacy schemes are especially appealing to VANETs. Specifically, multi-hop communications in VANETs suffer from packet dropping, packet tampering, and communication failures which have not been satisfyingly tackled in literature. Thus, a lightweight reliable and faithful data packet relaying framework (LEAPER) is proposed to ensure reliable and trustworthy multi-hop communications by enhancing the cooperation of neighboring nodes. Message verification, including both content and signature verification, generally is computation-extensive and incurs severe scalability issues to each node. The resource-aware message verification (RAMV) scheme is proposed to ensure resource-aware, secure, and application-friendly message verification in VANETs. On the other hand, to make VANETs acceptable to the privacy-sensitive users, the identity and location privacy of each node should be properly protected. To this end, a joint privacy and reputation assurance (JPRA) scheme is proposed to synergistically support privacy protection and reputation management by reconciling their inherent conflicting requirements. 
Besides, the privacy implications of short-time certificates are thoroughly investigated in a short-time certificates-based privacy protection (STCP2) scheme, to make privacy protection in VANETs feasible with short-time certificates. Secondly, three novel solutions, namely VANET-based ambient ad dissemination (VAAD), general-purpose automatic survey (GPAS), and VehicleView, are proposed to support the appealing value-added applications based on VANETs. These solutions all follow practical application models, and an incentive-centered architecture is proposed for each solution to balance the conflicting requirements of the involved entities. Besides, the critical security and privacy challenges of these applications are investigated and addressed with novel solutions. Thus, with proper security and privacy assurance, these solutions show great application significance and economic potentials to VANETs. Thus, by enhancing the security, privacy, and applications of VANETs, this dissertation fills the gap between the existing theoretic research and the realistic implementation of VANETs, facilitating the realistic deployment of VANETs.
Abstract:
By providing vehicle-to-vehicle and vehicle-to-infrastructure wireless communications, vehicular ad hoc networks (VANETs), also known as the “networks on wheels”, can greatly enhance traffic safety, traffic efficiency and driving experience for intelligent transportation system (ITS). However, the unique features of VANETs, such as high mobility and uneven distribution of vehicular nodes, impose critical challenges of high efficiency and reliability for the implementation of VANETs. This dissertation is motivated by the great application potentials of VANETs in the design of efficient in-network data processing and dissemination. Considering the significance of message aggregation, data dissemination and data collection, this dissertation research targets at enhancing the traffic safety and traffic efficiency, as well as developing novel commercial applications, based on VANETs, following four aspects: 1) accurate and efficient message aggregation to detect on-road safety relevant events, 2) reliable data dissemination to reliably notify remote vehicles, 3) efficient and reliable spatial data collection from vehicular sensors, and 4) novel promising applications to exploit the commercial potentials of VANETs. Specifically, to enable cooperative detection of safety relevant events on the roads, the structure-less message aggregation (SLMA) scheme is proposed to improve communication efficiency and message accuracy. The scheme of relative position based message dissemination (RPB-MD) is proposed to reliably and efficiently disseminate messages to all intended vehicles in the zone-of-relevance in varying traffic density. Due to numerous vehicular sensor data available based on VANETs, the scheme of compressive sampling based data collection (CS-DC) is proposed to efficiently collect the spatial relevance data in a large scale, especially in the dense traffic. 
In addition, with novel and efficient solutions proposed for the application specific issues of data dissemination and data collection, several appealing value-added applications for VANETs are developed to exploit the commercial potentials of VANETs, namely general purpose automatic survey (GPAS), VANET-based ambient ad dissemination (VAAD) and VANET based vehicle performance monitoring and analysis (VehicleView). Thus, by improving the efficiency and reliability in in-network data processing and dissemination, including message aggregation, data dissemination and data collection, together with the development of novel promising applications, this dissertation will help push VANETs further to the stage of massive deployment.
Abstract:
Sporulation is a process in which some bacteria divide asymmetrically to form tough protective endospores, which help them survive in a hazardous environment for a long time. The factors that can trigger this process are diverse. Heat, radiation, chemicals and lack of nutrients can all lead to the formation of endospores. This phenomenon leads to low productivity during industrial production. However, the sporulation mechanism in a spore-forming bacterium, Clostridium thermocellum, is still unclear. If a regulatory network of sporulation can be built, we may therefore find ways to inhibit this process. In this study, a computational method is applied to predict the sporulation network in Clostridium thermocellum. A working sporulation network model with 40 newly predicted genes and 4 function groups is built using a network construction program, CINPER. Five sets of microarray expression data from Clostridium thermocellum under different conditions have been collected. The analysis shows that the predicted result is reasonable.
Abstract:
Tracking, or target localization, is used in a wide range of important tasks, from knowing when your flight will arrive to ensuring your mail is received on time. Tracking provides the location of resources, enabling solutions to complex logistical problems. Wireless Sensor Networks (WSNs) create new opportunities when applied to tracking, such as more flexible deployment and real-time information. When radar is used as the sensing element in a tracking WSN, better results can be obtained, because radar has a comparatively larger range in both distance and angle than other sensors commonly used in WSNs. This allows fewer nodes to cover larger areas, saving money. In this report I implement a tracking WSN platform similar to the one developed by Lim, Wang, and Terzis. It consists of several sensor nodes, each with a radar, a sink node connected to a host PC, and a Matlab program to fuse sensor data. I have re-implemented their experiment with my WSN platform for tracking a non-cooperative target to verify their results, and have also run simulations for comparison. The results of these tests are discussed and some future improvements are proposed.
Abstract:
In 1906, two American industrialists, John Munroe Longyear and Frederick Ayer, formed the Arctic Coal Company to make the first large-scale attempt at mining in the high-Arctic location of Spitsbergen, north of the Norwegian mainland. In doing so, they encountered numerous obstacles and built an organization that attempted to overcome them. The Americans sold out in 1916 but others followed, eventually culminating in the transformation of a largely underdeveloped landscape into a mining region. This work uses John Law’s network approach within the Actor Network Theory (ANT) framework to explain how the Arctic Coal Company built a mining network in this environmentally difficult region and why it made the choices it did. It does so by identifying and analyzing the problems the company encountered and the strategies it used to overcome them, focusing on three major components of the operations: the company’s four land claims, its technical system and its main settlement, Longyear City. Extensive comparison of aspects of Longyear City and the company’s choices of technology with other American examples places the analysis of the company in a wider context and helps isolate unique aspects of mining in the high Arctic. American examples dominate the comparative sections because Americans dominated the ownership and upper management of the company.
Abstract:
The objective of this report is to study the distributed (decentralized) three-phase optimal power flow (OPF) problem in unbalanced power distribution networks. A full three-phase representation of the distribution network is considered to account for its highly unbalanced state. All of the distribution network’s series and shunt components, and its load types and combinations, were modeled in the commercial version of the General Algebraic Modeling System (GAMS), a high-level modeling system for mathematical programming and optimization. The OPF problem has been successfully implemented and solved with both a centralized and a distributed approach, where the objective is to minimize the active power losses in the entire system. The study was implemented on the IEEE 37 Node Test Feeder. A detailed discussion of all aspects of the problem, starting from the basics, is provided in this study. Full simulation results are provided at the end of the report.
Abstract:
Important food crops like rice are constantly exposed to various stresses that can have devastating effect on their survival and productivity. Being sessile, these highly evolved organisms have developed elaborate molecular machineries to sense a mixture of stress signals and elicit a precise response to minimize the damage. However, recent discoveries revealed that the interplay of these stress regulatory and signaling molecules is highly complex and remains largely unknown. In this work, we conducted large scale analysis of differential gene expression using advanced computational methods to dissect regulation of stress response which is at the heart of all molecular changes leading to the observed phenotypic susceptibility. One of the most important stress conditions in terms of loss of productivity is drought. We performed genomic and proteomic analysis of epigenetic and miRNA mechanisms in regulation of drought responsive genes in rice and found subsets of genes with striking properties. Overexpressed genesets included higher number of epigenetic marks, miRNA targets and transcription factors which regulate drought tolerance. On the other hand, underexpressed genesets were poor in above features but were rich in number of metabolic genes with multiple co-expression partners contributing majorly towards drought resistance. Identification and characterization of the patterns exhibited by differentially expressed genes hold key to uncover the synergistic and antagonistic components of the cross talk between stress response mechanisms. We performed meta-analysis on drought and bacterial stresses in rice and Arabidopsis, and identified hundreds of shared genes. We found high level of conservation of gene expression between these stresses. Weighted co-expression network analysis detected two tight clusters of genes made up of master transcription factors and signaling genes showing strikingly opposite expression status. 
To comprehensively identify the stress-responsive genes shared between multiple abiotic and biotic stresses in rice, we performed separate meta-analyses of microarray studies from seven abiotic and six biotic stresses and found more than thirteen hundred shared stress-responsive genes. Various machine learning techniques using these genes classified the stresses with high accuracy, both into two major classes, namely abiotic and biotic stresses, and into multiple classes of individual stresses, and identified the top genes showing distinct patterns of expression. Functional enrichment and co-expression network analysis revealed the different roles of plant hormones and transcription factors in conserved and non-conserved genesets in the regulation of stress response.
Abstract:
In recent years, security of industrial control systems has been the main research focus due to the potential cyber-attacks that can impact the physical operations. As a result of these risks, there has been an urgent need to establish a stronger security protection against these threats. Conventional firewalls with stateful rules can be implemented in the critical cyberinfrastructure environment which might require constant updates. Despite the ongoing effort to maintain the rules, the protection mechanism does not restrict malicious data flows and it poses the greater risk of potential intrusion occurrence. The contributions of this thesis are motivated by the aforementioned issues which include a systematic investigation of attack-related scenarios within a substation network in a reliable sense. The proposed work is two-fold: (i) system architecture evaluation and (ii) construction of attack tree for a substation network. Cyber-system reliability remains one of the important factors in determining the system bottleneck for investment planning and maintenance. It determines the longevity of the system operational period with or without any disruption. First, a complete enumeration of existing implementation is exhaustively identified with existing communication architectures (bidirectional) and new ones with strictly unidirectional. A detailed modeling of the extended 10 system architectures has been evaluated. Next, attack tree modeling for potential substation threats is formulated. This quantifies the potential risks for possible attack scenarios within a network or from the external networks. The analytical models proposed in this thesis can serve as a fundamental development that can be further researched.
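One common way to quantify an attack tree of the kind mentioned above is to assign a success probability to each leaf and propagate it through AND and OR gates. The sketch below does this under the assumption that leaf events are independent. It is not necessarily the quantification used in the thesis, and the tree layout, node names and probabilities are hypothetical.

```python
def success_probability(node):
    """Probability that the attack goal at this node is reached.

    A leaf is {"name": ..., "p": float}.
    A gate is {"name": ..., "gate": "AND" or "OR", "children": [...]}.
    Leaf events are assumed to be independent.
    """
    if "p" in node:
        return node["p"]
    probs = [success_probability(child) for child in node["children"]]
    if node["gate"] == "AND":
        # Every child step must succeed.
        result = 1.0
        for p in probs:
            result *= p
        return result
    # OR gate: the goal fails only if every child fails.
    miss = 1.0
    for p in probs:
        miss *= 1.0 - p
    return 1.0 - miss

# Hypothetical substation attack tree.
tree = {
    "name": "disrupt substation operation",
    "gate": "OR",
    "children": [
        {
            "name": "remote intrusion",
            "gate": "AND",
            "children": [
                {"name": "bypass firewall", "p": 0.5},
                {"name": "compromise controller", "p": 0.4},
            ],
        },
        {"name": "insider misuse", "p": 0.1},
    ],
}

risk = success_probability(tree)
```

Changing a leaf value, for example lowering the firewall bypass probability to model a unidirectional link, shows directly how an architectural choice moves the root risk.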