866 resultados para non-trivial data structures
Resumo:
An important goal in computational neuroanatomy is the complete and accurate simulation of neuronal morphology. We are developing computational tools to model three-dimensional dendritic structures based on sets of stochastic rules. This paper reports an extensive, quantitative anatomical characterization of simulated motoneurons and Purkinje cells. We used several local and global algorithms implemented in the L-Neuron and ArborVitae programs to generate sets of virtual neurons. Parameters statistics for all algorithms were measured from experimental data, thus providing a compact and consistent description of these morphological classes. We compared the emergent anatomical features of each group of virtual neurons with those of the experimental database in order to gain insights on the plausibility of the model assumptions, potential improvements to the algorithms, and non-trivial relations among morphological parameters. Algorithms mainly based on local constraints (e.g., branch diameter) were successful in reproducing many morphological properties of both motoneurons and Purkinje cells (e.g. total length, asymmetry, number of bifurcations). The addition of global constraints (e.g., trophic factors) improved the angle-dependent emergent characteristics (average Euclidean distance from the soma to the dendritic terminations, dendritic spread). Virtual neurons systematically displayed greater anatomical variability than real cells, suggesting the need for additional constraints in the models. For several emergent anatomical properties, a specific algorithm reproduced the experimental statistics better than the others did. However, relative performances were often reversed for different anatomical properties and/or morphological classes. Thus, combining the strengths of alternative generative models could lead to comprehensive algorithms for the complete and accurate simulation of dendritic morphology.
Resumo:
CO, O3, and H2O data in the upper troposphere/lower stratosphere (UTLS) measured by the Atmospheric Chemistry Experiment Fourier Transform Spectrometer(ACE-FTS) on Canada’s SCISAT-1 satellite are validated using aircraft and ozonesonde measurements. In the UTLS, validation of chemical trace gas measurements is a challenging task due to small-scale variability in the tracer fields, strong gradients of the tracers across the tropopause, and scarcity of measurements suitable for validation purposes. Validation based on coincidences therefore suffers from geophysical noise. Two alternative methods for the validation of satellite data are introduced, which avoid the usual need for coincident measurements: tracer-tracer correlations, and vertical tracer profiles relative to tropopause height. Both are increasingly being used for model validation as they strongly suppress geophysical variability and thereby provide an “instantaneous climatology”. This allows comparison of measurements between non-coincident data sets which yields information about the precision and a statistically meaningful error-assessment of the ACE-FTS satellite data in the UTLS. By defining a trade-off factor, we show that the measurement errors can be reduced by including more measurements obtained over a wider longitude range into the comparison, despite the increased geophysical variability. Applying the methods then yields the following upper bounds to the relative differences in the mean found between the ACE-FTS and SPURT aircraft measurements in the upper troposphere (UT) and lower stratosphere (LS), respectively: for CO ±9% and ±12%, for H2O ±30% and ±18%, and for O3 ±25% and ±19%. The relative differences for O3 can be narrowed down by using a larger dataset obtained from ozonesondes, yielding a high bias in the ACEFTS measurements of 18% in the UT and relative differences of ±8% for measurements in the LS. When taking into account the smearing effect of the vertically limited spacing between measurements of the ACE-FTS instrument, the relative differences decrease by 5–15% around the tropopause, suggesting a vertical resolution of the ACE-FTS in the UTLS of around 1 km. The ACE-FTS hence offers unprecedented precision and vertical resolution for a satellite instrument, which will allow a new global perspective on UTLS tracer distributions.
Resumo:
Data assimilation methods which avoid the assumption of Gaussian error statistics are being developed for geoscience applications. We investigate how the relaxation of the Gaussian assumption affects the impact observations have within the assimilation process. The effect of non-Gaussian observation error (described by the likelihood) is compared to previously published work studying the effect of a non-Gaussian prior. The observation impact is measured in three ways: the sensitivity of the analysis to the observations, the mutual information, and the relative entropy. These three measures have all been studied in the case of Gaussian data assimilation and, in this case, have a known analytical form. It is shown that the analysis sensitivity can also be derived analytically when at least one of the prior or likelihood is Gaussian. This derivation shows an interesting asymmetry in the relationship between analysis sensitivity and analysis error covariance when the two different sources of non-Gaussian structure are considered (likelihood vs. prior). This is illustrated for a simple scalar case and used to infer the effect of the non-Gaussian structure on mutual information and relative entropy, which are more natural choices of metric in non-Gaussian data assimilation. It is concluded that approximating non-Gaussian error distributions as Gaussian can give significantly erroneous estimates of observation impact. The degree of the error depends not only on the nature of the non-Gaussian structure, but also on the metric used to measure the observation impact and the source of the non-Gaussian structure.
Resumo:
Recent advances in the field of chaotic advection provide the impetus to revisit the dynamics of particles transported by blood flow in the presence of vessel wall irregularities. The irregularity, being either a narrowing or expansion of the vessel, mimicking stenoses or aneurysms, generates abnormal flow patterns that lead to a peculiar filamentary distribution of advected particles, which, in the blood, would include platelets. Using a simple model, we show how the filamentary distribution depends on the size of the vessel wall irregularity, and how it varies under resting or exercise conditions. The particles transported by blood flow that spend a long time around a disturbance either stick to the vessel wall or reside on fractal filaments. We show that the faster flow associated with exercise creates widespread filaments where particles can get trapped for a longer time, thus allowing for the possible activation of such particles. We argue, based on previous results in the field of active processes in flows, that the non-trivial long-time distribution of transported particles has the potential to have major effects on biochemical processes occurring in blood flow, including the activation and deposition of platelets. One aspect of the generality of our approach is that it also applies to other relevant biological processes, an example being the coexistence of plankton species investigated previously.
Resumo:
Several activities in service oriented computing, such as automatic composition, monitoring, and adaptation, can benefit from knowing properties of a given service composition before executing them. Among these properties we will focus on those related to execution cost and resource usage, in a wide sense, as they can be linked to QoS characteristics. In order to attain more accuracy, we formulate execution costs / resource usage as functions on input data (or appropriate abstractions thereof) and show how these functions can be used to make better, more informed decisions when performing composition, adaptation, and proactive monitoring. We present an approach to, on one hand, synthesizing these functions in an automatic fashion from the definition of the different orchestrations taking part in a system and, on the other hand, to effectively using them to reduce the overall costs of non-trivial service-based systems featuring sensitivity to data and possibility of failure. We validate our approach by means of simulations of scenarios needing runtime selection of services and adaptation due to service failure. A number of rebinding strategies, including the use of cost functions, are compared.
Resumo:
We propose a novel measure to assess the presence of meso-scale structures in complex networks. This measure is based on the identi?cation of regular patterns in the adjacency matrix of the network, and on the calculation of the quantity of information lost when pairs of nodes are iteratively merged. We show how this measure is able to quantify several meso-scale structures, like the presence of modularity, bipartite and core-periphery con?gurations, or motifs. Results corresponding to a large set of real networks are used to validate its ability to detect non-trivial topological patterns.
Resumo:
Comunicación presentada en las XVI Jornadas de Ingeniería del Software y Bases de Datos, JISBD 2011, A Coruña, 5-7 septiembre 2011.
Resumo:
Although managers consider accurate, timely, and relevant information as critical to the quality of their decisions, evidence of large variations in data quality abounds. Over a period of twelve months, the action research project reported herein attempted to investigate and track data quality initiatives undertaken by the participating organisation. The investigation focused on two types of errors: transaction input errors and processing errors. Whenever the action research initiative identified non-trivial errors, the participating organisation introduced actions to correct the errors and prevent similar errors in the future. Data quality metrics were taken quarterly to measure improvements resulting from the activities undertaken during the action research project. The action research project results indicated that for a mission-critical database to ensure and maintain data quality, commitment to continuous data quality improvement is necessary. Also, communication among all stakeholders is required to ensure common understanding of data quality improvement goals. The action research project found that to further substantially improve data quality, structural changes within the organisation and to the information systems are sometimes necessary. The major goal of the action research study is to increase the level of data quality awareness within all organisations and to motivate them to examine the importance of achieving and maintaining high-quality data.
Resumo:
In this paper we present algorithms which work on pairs of 0,1- matrices which multiply again a matrix of zero and one entries. When applied over a pair, the algorithms change the number of non-zero entries present in the matrices, meanwhile their product remains unchanged. We establish the conditions under which the number of 1s decreases. We recursively define as well pairs of matrices which product is a specific matrix and such that by applying on them these algorithms, we minimize the total number of non-zero entries present in both matrices. These matrices may be interpreted as solutions for a well known information retrieval problem, and in this case the number of 1 entries represent the complexity of the retrieve and information update operations.
Resumo:
Physical systems with co-existence and interplay of processes featuring distinct spatio-temporal scales are found in various research areas ranging from studies of brain activity to astrophysics. The complexity of such systems makes their theoretical and experimental analysis technically and conceptually challenging. Here, we discovered that while radiation of partially mode-locked fibre lasers is stochastic and intermittent on a short time scale, it exhibits non-trivial periodicity and long-scale correlations over slow evolution from one round-trip to another. A new technique for evolution mapping of intensity autocorrelation function has enabled us to reveal a variety of localized spatio-temporal structures and to experimentally study their symbiotic co-existence with stochastic radiation. Real-time characterization of dynamical spatio-temporal regimes of laser operation is set to bring new insights into rich underlying nonlinear physics of practical active- and passive-cavity photonic systems.
Resumo:
The ability to forecast machinery failure is vital to reducing maintenance costs, operation downtime and safety hazards. Recent advances in condition monitoring technologies have given rise to a number of prognostic models for forecasting machinery health based on condition data. Although these models have aided the advancement of the discipline, they have made only a limited contribution to developing an effective machinery health prognostic system. The literature review indicates that there is not yet a prognostic model that directly models and fully utilises suspended condition histories (which are very common in practice since organisations rarely allow their assets to run to failure); that effectively integrates population characteristics into prognostics for longer-range prediction in a probabilistic sense; which deduces the non-linear relationship between measured condition data and actual asset health; and which involves minimal assumptions and requirements. This work presents a novel approach to addressing the above-mentioned challenges. The proposed model consists of a feed-forward neural network, the training targets of which are asset survival probabilities estimated using a variation of the Kaplan-Meier estimator and a degradation-based failure probability density estimator. The adapted Kaplan-Meier estimator is able to model the actual survival status of individual failed units and estimate the survival probability of individual suspended units. The degradation-based failure probability density estimator, on the other hand, extracts population characteristics and computes conditional reliability from available condition histories instead of from reliability data. The estimated survival probability and the relevant condition histories are respectively presented as “training target” and “training input” to the neural network. The trained network is capable of estimating the future survival curve of a unit when a series of condition indices are inputted. Although the concept proposed may be applied to the prognosis of various machine components, rolling element bearings were chosen as the research object because rolling element bearing failure is one of the foremost causes of machinery breakdowns. Computer simulated and industry case study data were used to compare the prognostic performance of the proposed model and four control models, namely: two feed-forward neural networks with the same training function and structure as the proposed model, but neglected suspended histories; a time series prediction recurrent neural network; and a traditional Weibull distribution model. The results support the assertion that the proposed model performs better than the other four models and that it produces adaptive prediction outputs with useful representation of survival probabilities. This work presents a compelling concept for non-parametric data-driven prognosis, and for utilising available asset condition information more fully and accurately. It demonstrates that machinery health can indeed be forecasted. The proposed prognostic technique, together with ongoing advances in sensors and data-fusion techniques, and increasingly comprehensive databases of asset condition data, holds the promise for increased asset availability, maintenance cost effectiveness, operational safety and – ultimately – organisation competitiveness.
Resumo:
Software transactional memory has the potential to greatly simplify development of concurrent software, by supporting safe composition of concurrent shared-state abstractions. However, STM semantics are defined in terms of low-level reads and writes on individual memory locations, so implementations are unable to take advantage of the properties of user-defined abstractions. Consequently, the performance of transactions over some structures can be disappointing. ----- ----- We present Modular Transactional Memory, our framework which allows programmers to extend STM with concurrency control algorithms tailored to the data structures they use in concurrent programs. We describe our implementation in Concurrent Haskell, and two example structures: a finite map which allows concurrent transactions to operate on disjoint sets of keys, and a non-deterministic channel which supports concurrent sources and sinks. ----- ----- Our approach is based on previous work by others on boosted and open-nested transactions, with one significant development: transactions are given types which denote the concurrency control algorithms they employ. Typed transactions offer a higher level of assurance for programmers reusing transactional code, and allow more flexible abstract concurrency control.
Resumo:
Bioinformatics involves analyses of biological data such as DNA sequences, microarrays and protein-protein interaction (PPI) networks. Its two main objectives are the identification of genes or proteins and the prediction of their functions. Biological data often contain uncertain and imprecise information. Fuzzy theory provides useful tools to deal with this type of information, hence has played an important role in analyses of biological data. In this thesis, we aim to develop some new fuzzy techniques and apply them on DNA microarrays and PPI networks. We will focus on three problems: (1) clustering of microarrays; (2) identification of disease-associated genes in microarrays; and (3) identification of protein complexes in PPI networks. The first part of the thesis aims to detect, by the fuzzy C-means (FCM) method, clustering structures in DNA microarrays corrupted by noise. Because of the presence of noise, some clustering structures found in random data may not have any biological significance. In this part, we propose to combine the FCM with the empirical mode decomposition (EMD) for clustering microarray data. The purpose of EMD is to reduce, preferably to remove, the effect of noise, resulting in what is known as denoised data. We call this method the fuzzy C-means method with empirical mode decomposition (FCM-EMD). We applied this method on yeast and serum microarrays, and the silhouette values are used for assessment of the quality of clustering. The results indicate that the clustering structures of denoised data are more reasonable, implying that genes have tighter association with their clusters. Furthermore we found that the estimation of the fuzzy parameter m, which is a difficult step, can be avoided to some extent by analysing denoised microarray data. The second part aims to identify disease-associated genes from DNA microarray data which are generated under different conditions, e.g., patients and normal people. We developed a type-2 fuzzy membership (FM) function for identification of diseaseassociated genes. This approach is applied to diabetes and lung cancer data, and a comparison with the original FM test was carried out. Among the ten best-ranked genes of diabetes identified by the type-2 FM test, seven genes have been confirmed as diabetes-associated genes according to gene description information in Gene Bank and the published literature. An additional gene is further identified. Among the ten best-ranked genes identified in lung cancer data, seven are confirmed that they are associated with lung cancer or its treatment. The type-2 FM-d values are significantly different, which makes the identifications more convincing than the original FM test. The third part of the thesis aims to identify protein complexes in large interaction networks. Identification of protein complexes is crucial to understand the principles of cellular organisation and to predict protein functions. In this part, we proposed a novel method which combines the fuzzy clustering method and interaction probability to identify the overlapping and non-overlapping community structures in PPI networks, then to detect protein complexes in these sub-networks. Our method is based on both the fuzzy relation model and the graph model. We applied the method on several PPI networks and compared with a popular protein complex identification method, the clique percolation method. For the same data, we detected more protein complexes. We also applied our method on two social networks. The results showed our method works well for detecting sub-networks and give a reasonable understanding of these communities.
Resumo:
Availability has become a primary goal of information security and is as significant as other goals, in particular, confidentiality and integrity. Maintaining availability of essential services on the public Internet is an increasingly difficult task in the presence of sophisticated attackers. Attackers may abuse limited computational resources of a service provider and thus managing computational costs is a key strategy for achieving the goal of availability. In this thesis we focus on cryptographic approaches for managing computational costs, in particular computational effort. We focus on two cryptographic techniques: computational puzzles in cryptographic protocols and secure outsourcing of cryptographic computations. This thesis contributes to the area of cryptographic protocols in the following ways. First we propose the most efficient puzzle scheme based on modular exponentiations which, unlike previous schemes of the same type, involves only a few modular multiplications for solution verification; our scheme is provably secure. We then introduce a new efficient gradual authentication protocol by integrating a puzzle into a specific signature scheme. Our software implementation results for the new authentication protocol show that our approach is more efficient and effective than the traditional RSA signature-based one and improves the DoSresilience of Secure Socket Layer (SSL) protocol, the most widely used security protocol on the Internet. Our next contributions are related to capturing a specific property that enables secure outsourcing of cryptographic tasks in partial-decryption. We formally define the property of (non-trivial) public verifiability for general encryption schemes, key encapsulation mechanisms (KEMs), and hybrid encryption schemes, encompassing public-key, identity-based, and tag-based encryption avors. We show that some generic transformations and concrete constructions enjoy this property and then present a new public-key encryption (PKE) scheme having this property and proof of security under the standard assumptions. Finally, we combine puzzles with PKE schemes for enabling delayed decryption in applications such as e-auctions and e-voting. For this we first introduce the notion of effort-release PKE (ER-PKE), encompassing the well-known timedrelease encryption and encapsulated key escrow techniques. We then present a security model for ER-PKE and a generic construction of ER-PKE complying with our security notion.