Digital Library

90 results for Large-scale Distribution

Visual analytics for large-scale bioinformatic data sets

Relevance:

100.00%

Publisher:

Abstract:

Rapid advances in sequencing technologies (Next Generation Sequencing or NGS) have led to a vast increase in the quantity of bioinformatics data available, with this increasing scale presenting enormous challenges to researchers seeking to identify complex interactions. This paper is concerned with the domain of transcriptional regulation, and the use of visualisation to identify relationships between specific regulatory proteins (the transcription factors or TFs) and their associated target genes (TGs). We present preliminary work from an ongoing study which aims to determine the effectiveness of different visual representations and large scale displays in supporting discovery. Following an iterative process of implementation and evaluation, representations were tested by potential users in the bioinformatics domain to determine their efficacy, and to understand better the range of ad hoc practices among bioinformatics literate users. Results from two rounds of small scale user studies are considered with initial findings suggesting that bioinformaticians require richly detailed views of TF data, features to compare TF layouts between organisms quickly, and ways to keep track of interesting data points.

A large-scale analysis of genetic variants within putative miRNA binding sites in prostate cancer

Relevance:

100.00%

Publisher:

Abstract:

Prostate cancer is the second most common malignancy among men worldwide. Genome-wide association studies (GWAS) have identified 100 risk variants for prostate cancer, which can explain approximately 33% of the familial risk of the disease. We hypothesized that a comprehensive analysis of genetic variations found within the 3' untranslated region of genes predicted to affect miRNA binding (miRSNP) can identify additional prostate cancer risk variants. We investigated the association between 2,169 miRSNPs and prostate cancer risk in a large-scale analysis of 22,301 cases and 22,320 controls of European ancestry from 23 participating studies. Twenty-two miRSNPs were associated (P < 2.3 × 10^-5) with risk of prostate cancer, 10 of which were within 7 genes not previously mapped by GWAS. Further, using miRNA mimics and reporter gene assays, we showed that miR-3162-5p has specific affinity for the KLK3 rs1058205 miRSNP T-allele, whereas miR-370 has greater affinity for the VAMP8 rs1010 miRSNP A-allele, validating their functional role. Significance: Findings from this large association study suggest that a focus on miRSNPs, including functional evaluation, can identify candidate risk loci below currently accepted statistical levels of genome-wide significance. Studies of miRNAs and their interactions with SNPs could provide further insights into the mechanisms of prostate cancer risk.

R1SVM: A randomised nonlinear approach to large-scale anomaly detection

Relevance:

100.00%

Publisher:

Abstract:

The problem of unsupervised anomaly detection arises in a wide variety of practical applications. While one-class support vector machines have demonstrated their effectiveness as an anomaly detection technique, their ability to model large datasets is limited by the memory and time complexity of training. To address this issue for supervised learning of kernel machines, there has been growing interest in random projection methods as an alternative to the computationally expensive problems of kernel matrix construction and support vector optimisation. In this paper we leverage the theory of nonlinear random projections and propose the Randomised One-class SVM (R1SVM), an efficient and scalable anomaly detection technique that can be trained on large-scale datasets. Our empirical analysis on several real-life and synthetic datasets shows that our randomised 1SVM algorithm achieves accuracy comparable to or better than deep autoencoder and traditional kernelised approaches for anomaly detection, while being approximately 100 times faster in training and testing.
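
The random projection idea mentioned in this abstract can be illustrated with random Fourier features, one common form of nonlinear random projection. Whether R1SVM uses exactly this map is an assumption; the function name `make_feature_map` and all parameter values below are hypothetical, and the sketch covers only the projection step, not the training of a one-class SVM on the projected data.

```python
import math
import random


def make_feature_map(dim, n_features, gamma, seed=0):
    """Return z(x) such that z(x) . z(y) approximates exp(-gamma * ||x - y||^2)."""
    rng = random.Random(seed)
    # For the RBF kernel exp(-gamma * ||d||^2), frequencies are drawn from N(0, 2 * gamma).
    sigma = math.sqrt(2.0 * gamma)
    weights = [[rng.gauss(0.0, sigma) for _ in range(dim)] for _ in range(n_features)]
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def feature_map(x):
        return [
            scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
            for w, b in zip(weights, offsets)
        ]

    return feature_map


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def rbf_kernel(x, y, gamma):
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


if __name__ == "__main__":
    z = make_feature_map(dim=3, n_features=2000, gamma=0.5)
    x, y = [0.2, -0.1, 0.4], [0.0, 0.3, 0.1]
    print("exact kernel :", rbf_kernel(x, y, 0.5))
    print("approximation:", dot(z(x), z(y)))
```

Because the inner product of the projected vectors approximates the kernel, a linear model trained on `z(x)` can stand in for a kernel machine without building the full kernel matrix, which is the source of the scalability the abstract describes.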

Towards a typology of hashtag publics: A large-scale comparative study of user engagement across trending topics

Relevance:

100.00%

Publisher:

Abstract:

Twitter’s hashtag functionality is now used for a very wide variety of purposes, from covering crises and other breaking news events through gathering an instant community around shared media texts (such as sporting events and TV broadcasts) to signalling emotive states from amusement to despair. These divergent uses of the hashtag are increasingly recognised in the literature, with attention paid especially to the ability of hashtags to facilitate the creation of ad hoc or hashtag publics. A more comprehensive understanding of these different uses of hashtags has yet to be developed, however. Previous research has explored the potential for a systematic analysis of the quantitative metrics that could be generated from processing a series of hashtag datasets. Such research found, for example, that crisis-related hashtags exhibited a significantly larger incidence of retweets and tweets containing URLs than hashtags relating to televised events, and on this basis hypothesised that the information-seeking and -sharing behaviours of Twitter users in such different contexts were substantially divergent. This article updates such studies and their methodology by examining the communicative metrics of a considerably larger and more diverse number of hashtag datasets, compiled over the past five years. This provides an opportunity both to confirm earlier findings and to explore whether hashtag use practices may have shifted subsequently as Twitter’s userbase has developed further; it also enables the identification of further hashtag types beyond the “crisis” and “mainstream media event” types outlined to date. The article also explores the presence of such patterns beyond recognised hashtags, by incorporating an analysis of a number of keyword-based datasets.
This large-scale, comparative approach contributes towards the establishment of a more comprehensive typology of hashtags and their publics, and the metrics it describes will also be able to be used to classify new hashtags emerging in the future. In turn, this may enable researchers to develop systems for automatically distinguishing newly trending topics into a number of event types, which may be useful for example for the automatic detection of acute crises and other breaking news events.

Combinatorial design of key distribution mechanisms for wireless sensor networks

Relevance:

100.00%

Publisher:

Abstract:

Secure communications in wireless sensor networks operating under adversarial conditions require providing pairwise (symmetric) keys to sensor nodes. In large scale deployment scenarios, there is no prior knowledge of post deployment network configuration since nodes may be randomly scattered over a hostile territory. Thus, shared keys must be distributed before deployment to provide each node a key-chain. For large sensor networks it is infeasible to store a unique key for all other nodes in the key-chain of a sensor node. Consequently, for secure communication either two nodes have a key in common in their key-chains and they have a wireless link between them, or there is a path, called key-path, among these two nodes where each pair of neighboring nodes on this path have a key in common. Length of the key-path is the key factor for efficiency of the design. This paper presents novel deterministic and hybrid approaches based on Combinatorial Design for deciding how many and which keys to assign to each key-chain before the sensor network deployment. In particular, Balanced Incomplete Block Designs (BIBD) and Generalized Quadrangles (GQ) are mapped to obtain efficient key distribution schemes. Performance and security properties of the proposed schemes are studied both analytically and computationally. Comparison to related work shows that the combinatorial approach produces better connectivity with smaller key-chain sizes.
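
As a rough illustration of how a Balanced Incomplete Block Design can yield key-chains, the sketch below builds the projective plane of prime order q, a standard symmetric BIBD with parameters (q² + q + 1, q + 1, 1). Treating points as keys and blocks (lines) as key-chains is an assumption about the mapping, and the function name `projective_plane_keychains` is hypothetical; the construction shown works only for prime q.

```python
from itertools import combinations


def projective_plane_keychains(q):
    """Build q*q + q + 1 key-chains of q + 1 keys each from the projective plane of prime order q."""
    # Points in normalised homogeneous coordinates: the first non-zero entry is 1.
    points = [(1, a, b) for a in range(q) for b in range(q)]
    points += [(0, 1, a) for a in range(q)]
    points.append((0, 0, 1))
    key_id = {p: i for i, p in enumerate(points)}

    chains = []
    # By duality, lines use the same normalised coordinates as points.
    for line in points:
        # A point lies on a line when their dot product is 0 modulo q.
        chain = {
            key_id[p]
            for p in points
            if sum(l * x for l, x in zip(line, p)) % q == 0
        }
        chains.append(chain)
    return chains


if __name__ == "__main__":
    chains = projective_plane_keychains(3)
    print(len(chains), "key-chains of size", len(chains[0]))
    # Any two lines of a projective plane meet in exactly one point,
    # so any two key-chains share exactly one key.
    print(all(len(a & b) == 1 for a, b in combinations(chains, 2)))
```

In this design every pair of nodes holds exactly one common key, so any two nodes within radio range can communicate directly; the trade-off is that the number of supported nodes is tied to the key-chain size through q.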

Key distribution mechanisms for wireless sensor networks : a survey

Relevance:

100.00%

Publisher:

Abstract:

Advances in technology introduce new application areas for sensor networks. The foreseeable wide deployment of mission-critical sensor networks raises security concerns. Securing large-scale, densely deployed, infrastructureless wireless networks of resource-limited sensor nodes requires efficient key distribution and management mechanisms. We consider distributed and hierarchical wireless sensor networks where unicast, multicast and broadcast communications can take place. We evaluate deterministic, probabilistic and hybrid key pre-distribution and dynamic key generation algorithms for distributing pair-wise, group-wise and network-wise keys.

Plasma-controlled adatom delivery and (re)distribution: Enabling uninterrupted, low-temperature growth of ultralong vertically aligned single walled carbon nanotubes

Relevance:

100.00%

Publisher:

Abstract:

Large-scale (∼10^9 atoms) numerical simulations reveal that plasma-controlled dynamic delivery and redistribution of carbon atoms between the substrate and nanotube surfaces enable the growth of ultralong single-walled carbon nanotubes (SWCNTs) and explain the common experimental observation of slower growth at advanced stages. It is shown that the plasma-based processes feature up to two orders of magnitude higher growth rates than equivalent neutral-gas systems and are better suited for SWCNT synthesis at low, nanodevice-friendly temperatures. © 2008 American Institute of Physics.

Application of simulated annealing to data distribution for all-to-all comparison problems in homogeneous systems

Relevance:

100.00%

Publisher:

Abstract:

Distributed systems are widely used for solving large-scale and data-intensive computing problems, including all-to-all comparison (ATAC) problems. However, when used for ATAC problems, existing computational frameworks such as Hadoop focus on load balancing for allocating comparison tasks, without careful consideration of data distribution and storage usage. While Hadoop-based solutions provide users with simplicity of implementation, their inherent MapReduce computing pattern does not match the ATAC pattern. This leads to load imbalances and poor data locality when Hadoop's data distribution strategy is used for ATAC problems. Here we present a data distribution strategy which considers data locality, load balancing and storage savings for ATAC computing problems in homogeneous distributed systems. A simulated annealing algorithm is developed for data distribution and task scheduling. Experimental results show a significant performance improvement for our approach over Hadoop-based solutions.
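
The general shape of a simulated annealing search for this kind of problem can be sketched as follows. This is not the authors' algorithm: the cost function (total file copies stored plus a load-imbalance penalty), the cooling schedule, and the names `cost` and `anneal` are all assumptions chosen for illustration.

```python
import math
import random
from itertools import combinations


def cost(assign, pairs, n_nodes):
    """Hypothetical cost: total file copies stored plus the load imbalance."""
    files = [set() for _ in range(n_nodes)]
    load = [0] * n_nodes
    for pair, node in zip(pairs, assign):
        files[node].update(pair)  # a node must store both files of each pair it compares
        load[node] += 1
    return sum(len(f) for f in files) + (max(load) - min(load))


def anneal(n_files, n_nodes, steps=5000, t0=2.0, alpha=0.999, seed=0):
    """Assign every comparison pair (i, j) to a node by simulated annealing."""
    rng = random.Random(seed)
    pairs = list(combinations(range(n_files), 2))
    current = [rng.randrange(n_nodes) for _ in pairs]
    current_cost = cost(current, pairs, n_nodes)
    initial_cost = current_cost
    best, best_cost = current[:], current_cost
    t = t0
    for _ in range(steps):
        # Propose moving one comparison task to a random node.
        k = rng.randrange(len(pairs))
        old_node = current[k]
        current[k] = rng.randrange(n_nodes)
        new_cost = cost(current, pairs, n_nodes)
        delta = new_cost - current_cost
        # Always accept improvements; accept worse moves with probability exp(-delta / t).
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current_cost = new_cost
            if current_cost < best_cost:
                best, best_cost = current[:], current_cost
        else:
            current[k] = old_node  # reject the move
        t *= alpha  # geometric cooling
    return pairs, best, initial_cost, best_cost


if __name__ == "__main__":
    pairs, best, initial_cost, best_cost = anneal(n_files=6, n_nodes=3)
    print("initial cost:", initial_cost, "best cost:", best_cost)
```

Combining storage and load terms in one objective lets a single search trade data locality against balance; a real system would weight the two terms and model node capacities explicitly.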

Compositional agent-based models for electricity distribution networks

Relevance:

100.00%

Publisher:

Abstract:

This thesis presents a novel approach to building large-scale agent-based models of networked physical systems using a compositional approach to provide extensibility and flexibility in building the models and simulations. A software framework (MODAM - MODular Agent-based Model) was implemented for this purpose, and validated through simulations. These simulations allow assessment of the impact of technological change on the electricity distribution network looking at the trajectories of electricity consumption at key locations over many years.

A Framework of Issues in Large Process Modelling Projects

Relevance:

90.00%

Publisher:

Abstract:

As process management projects have increased in size due to globalised and company-wide initiatives, a corresponding growth in the size of process modeling projects can be observed. Despite advances in languages, tools and methodologies, several aspects of these projects have been largely ignored by the academic community. This paper makes a first contribution to a potential research agenda in this field by defining the characteristics of large-scale process modeling projects and proposing a framework of related issues. These issues are derived from a semi-structured interview and six focus groups conducted in Australia, Germany and the USA with enterprise and modeling software vendors and customers. The focus groups confirm the existence of unresolved problems in business process modeling projects. The outcomes provide a research agenda which directs researchers into further studies in global process management, process model decomposition and the overall governance of process modeling projects. It is expected that this research agenda will provide guidance to researchers and practitioners by focusing on areas of high theoretical and practical relevance.

Concurrent multi-scale modeling of civil infrastructures for analyses on structural deterioration—Part I : Modeling methodology and strategy

Relevance:

90.00%

Publisher:

Abstract:

This paper aims to develop the methodology and strategy for concurrent finite element modeling of civil infrastructures at different scale levels for the analysis of structural deterioration. The modeling strategy and method were investigated to develop the concurrent multi-scale model of structural behavior (CMSM-of-SB), in which the global structural behavior and the nonlinear damage features of local details in a large, complicated structure can be analyzed concurrently to meet the needs of structural-state evaluation and structural deterioration analysis. In the proposed method, “large-scale” modeling is adopted for the global structure with linear responses between stress and strain, and “small-scale” modeling is used for nonlinear damage analyses of the local welded details. A longitudinal truss in steel bridge decks was selected as a case to study how a CMSM-of-SB was developed. A reduced-scale specimen of the longitudinal truss was studied in the laboratory to measure its dynamic and static behavior in the global truss and local welded details, while multi-scale models using constraint equations and substructuring were developed for numerical simulation. Comparison of the dynamic and static responses calculated by the different models indicated that the proposed multi-scale model was the most efficient and accurate. Verification of the model against results from the tested truss under the specific loading showed that responses at the material scale in the vicinity of local details, as well as global structural behavior, could be obtained and agreed well with the measured results. The proposed concurrent multi-scale modeling strategy and implementation procedures were applied to the Runyang cable-stayed bridge (RYCB), and the CMSM-of-SB of the bridge deck system was accordingly constructed as a practical application.

The influence of habitat heterogeneity on patterns of connectivity among rabbit populations in southern Queensland

Relevance:

90.00%

Publisher:

Abstract:

Patterns of connectivity among local populations influence the dynamics of regional systems, but most ecological models have concentrated on explaining the effect of connectivity on local population structure using dynamic processes covering short spatial and temporal scales. In this study, a model was developed in an extended spatial system to examine the hypothesis that long term connectivity levels among local populations are influenced by the spatial distribution of resources and other habitat factors. The habitat heterogeneity model was applied to local wild rabbit populations in the semi-arid Mitchell region of southern central Queensland (the Eastern system). Species' specific population parameters which were appropriate for the rabbit in this region were used. The model predicted a wide range of long term connectivity levels among sites, ranging from the extreme isolation of some sites to relatively high interaction probabilities for others. The validity of model assumptions was assessed by regressing model output against independent population genetic data, and explained over 80% of the variation in the highly structured genetic data set. Furthermore, the model was robust, explaining a significant proportion of the variation in the genetic data over a wide range of parameters. The performance of the habitat heterogeneity model was further assessed by simulating the widely reported recent range expansion of the wild rabbit into the Mitchell region from the adjacent, panmictic Western rabbit population system. The model explained well the independently determined genetic characteristics of the Eastern system at different hierarchic levels, from site specific differences (for example, fixation of a single allele in the population at one site), to differences between population systems (absence of an allele in the Eastern system which is present in all Western system sites). 
The model therefore explained the past and long term processes which have led to the formation and maintenance of the highly structured Eastern rabbit population system. Most animals exhibit sex biased dispersal which may influence long term connectivity levels among local populations, and thus the dynamics of regional systems. When appropriate sex specific dispersal characteristics were used, the habitat heterogeneity model predicted substantially different interaction patterns between female-only and combined male and female dispersal scenarios. In the latter case, model output was validated using data from a bi-parentally inherited genetic marker. Again, the model explained over 80% of the variation in the genetic data. The fact that such a large proportion of variability is explained in two genetic data sets provides very good evidence that habitat heterogeneity influences long term connectivity levels among local rabbit populations in the Mitchell region for both males and females. The habitat heterogeneity model thus provides a powerful approach for understanding the large scale processes that shape regional population systems in general. Therefore the model has the potential to be useful as a tool to aid in the management of those systems, whether it be for pest management or conservation purposes.

Conifer defence against insects: microarray gene expression profiling of Sitka spruce (Picea sitchensis) induced by mechanical wounding or feeding by spruce budworms (Choristoneura occidentalis) or white pine weevils (Pissodes strobi) reveals large

Relevance:

90.00%

Publisher:

Abstract:

Conifers are resistant to attack from a large number of potential herbivores or pathogens. Previous molecular and biochemical characterization of selected conifer defence systems support a model of multigenic, constitutive and induced defences that act on invading insects via physical, chemical, biochemical or ecological (multitrophic) mechanisms. However, the genomic foundation of the complex defence and resistance mechanisms of conifers is largely unknown. As part of a genomics strategy to characterize inducible defences and possible resistance mechanisms of conifers against insect herbivory, we developed a cDNA microarray building upon a new spruce (Picea spp.) expressed sequence tag resource. This first-generation spruce cDNA microarray contains 9720 cDNA elements representing c. 5500 unique genes. We used this array to monitor gene expression in Sitka spruce (Picea sitchensis) bark in response to herbivory by white pine weevils (Pissodes strobi, Curculionidae) or wounding, and in young shoot tips in response to western spruce budworm (Choristoneura occidentalis, Lepidopterae) feeding. Weevils are stem-boring insects that feed on phloem, while budworms are foliage feeding larvae that consume needles and young shoot tips. Both insect species and wounding treatment caused substantial changes of the host plant transcriptome detected in each case by differential gene expression of several thousand array elements at 1 or 2 d after the onset of treatment. Overall, there was considerable overlap among differentially expressed gene sets from these three stress treatments. Functional classification of the induced transcripts revealed genes with roles in general plant defence, octadecanoid and ethylene signalling, transport, secondary metabolism, and transcriptional regulation. 
Several genes involved in primary metabolic processes such as photosynthesis were down-regulated upon insect feeding or wounding, fitting with the concept of dynamic resource allocation in plant defence. Refined expression analysis using gene-specific primers and real-time PCR for selected transcripts was in agreement with microarray results for most genes tested. This study provides the first large-scale survey of insect-induced defence transcripts in a gymnosperm and provides a platform for functional investigation of plant-insect interactions in spruce. Induction of spruce genes of octadecanoid and ethylene signalling, terpenoid biosynthesis, and phenolic secondary metabolism are discussed in more detail.

Component reliability estimations without field data

Relevance:

90.00%

Publisher:

Abstract:

Maintenance activities in a large-scale engineering system are usually scheduled according to the lifetimes of various components in order to ensure the overall reliability of the system. Component lifetimes can be deduced from the corresponding probability distributions, with parameters estimated from past failure data. When failure data for the components is not readily available, engineers have to rely on basic information from the manufacturers, such as the mean and standard deviation of lifetime, to plan maintenance activities. In this paper, the moment-based piecewise polynomial model (MPPM) is proposed to estimate the parameters of the reliability probability distribution of a product when only the mean and standard deviation of the product lifetime are known. This method employs a group of polynomial functions to estimate the two parameters of the Weibull distribution according to the mathematical relationship between the shape parameter of the two-parameter Weibull distribution and the ratio of mean and standard deviation. Tests are carried out to evaluate the validity and accuracy of the proposed method, with discussion of its suitability for applications. The proposed method is particularly useful for reliability-critical systems, such as railway and power systems, in which maintenance activities are scheduled according to the expected lifetimes of the system components.
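
The relationship this abstract relies on can be made concrete. For a two-parameter Weibull distribution with shape k and scale λ, the mean is λΓ(1 + 1/k) and the ratio of standard deviation to mean depends on k alone. The sketch below inverts that ratio by plain bisection rather than by the piecewise polynomials of MPPM, so it shows the underlying relationship, not the proposed method; the function names and the search bounds are assumptions.

```python
import math


def weibull_cv(shape):
    """Coefficient of variation (std / mean) of a Weibull distribution; depends only on the shape."""
    g1 = math.gamma(1.0 + 1.0 / shape)
    g2 = math.gamma(1.0 + 2.0 / shape)
    return math.sqrt(g2 / (g1 * g1) - 1.0)


def weibull_from_moments(mean, std, lo=0.1, hi=50.0, tol=1e-10):
    """Recover (shape, scale) from the mean and standard deviation by bisection on the shape."""
    target = std / mean
    if not (weibull_cv(hi) <= target <= weibull_cv(lo)):
        raise ValueError("std/mean ratio is outside the searched shape range")
    # The coefficient of variation decreases as the shape grows,
    # so a value that is too large means the shape must increase.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if weibull_cv(mid) > target:
            lo = mid
        else:
            hi = mid
    shape = 0.5 * (lo + hi)
    scale = mean / math.gamma(1.0 + 1.0 / shape)
    return shape, scale


if __name__ == "__main__":
    # When mean equals std the Weibull reduces to the exponential case:
    # the result is approximately shape 1.0 and scale 100.0.
    print(weibull_from_moments(mean=100.0, std=100.0))
```

Bisection needs repeated gamma-function evaluations; replacing it with a fitted polynomial, as the abstract describes, would trade a small approximation error for a closed-form estimate.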

Rich probabilistic representations for bearing only decentralised data fusion

Relevance:

90.00%

Publisher:

Abstract:

The aim of this paper is to demonstrate the validity of using Gaussian mixture models (GMM) for representing probabilistic distributions in a decentralised data fusion (DDF) framework. GMMs are a powerful and compact stochastic representation allowing efficient communication of feature properties in large scale decentralised sensor networks. It will be shown that GMMs provide a basis for analytical solutions to the update and prediction operations for general Bayesian filtering. Furthermore, a variant on the Covariance Intersect algorithm for Gaussian mixtures will be presented ensuring a conservative update for the fusion of correlated information between two nodes in the network. In addition, purely visual sensory data will be used to show that decentralised data fusion and tracking of non-Gaussian states observed by multiple autonomous vehicles is feasible.
