974 results for Orion DBMS, Database, Uncertainty, Uncertain values, Benchmark
Abstract:
BACKGROUND: Short (~5 nucleotides) interspersed repeats regulate several aspects of post-transcriptional gene expression. Previously we developed an algorithm (REPFIND) that assigns P-values to all repeated motifs in a given nucleic acid sequence and reliably identifies clusters of short CAC-containing motifs required for mRNA localization in Xenopus oocytes. DESCRIPTION: In order to facilitate the identification of genes possessing clusters of repeats that regulate post-transcriptional aspects of gene expression in mammalian genes, we used REPFIND to create a database of all repeated motifs in the 3' untranslated regions (UTR) of genes from the Mammalian Gene Collection (MGC). The MGC database includes seven vertebrate species: human, cow, rat, mouse and three non-mammalian vertebrate species. A web-based application was developed to search this database of repeated motifs to generate species-specific lists of genes containing specific classes of repeats in their 3'-UTRs. This computational tool is called 3'-UTR SIRF (Short Interspersed Repeat Finder), and it reveals that hundreds of human genes contain an abundance of short CAC-rich and CAG-rich repeats in their 3'-UTRs that are similar to those found in mRNAs localized to the neurites of neurons. We tested four candidate mRNAs for localization in rat hippocampal neurons by in situ hybridization. Our results show that two candidate CAC-rich (Syntaxin 1B and Tubulin beta4) and two candidate CAG-rich (Sec61alpha and Syntaxin 1A) mRNAs are localized to distal neurites, whereas two control mRNAs lacking repeated motifs in their 3'-UTR remain primarily in the cell body. CONCLUSION: Computational data generated with 3'-UTR SIRF indicate that hundreds of mammalian genes have an abundance of short CA-containing motifs that may direct mRNA localization in neurons. In situ hybridization shows that four candidate mRNAs are localized to distal neurites of cultured hippocampal neurons. These data suggest that short CA-containing motifs may be part of a widely utilized genetic code that regulates mRNA localization in vertebrate cells. The use of 3'-UTR SIRF to search for new classes of motifs that regulate other aspects of gene expression should yield important information in future studies addressing cis-regulatory information located in 3'-UTRs.
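As a rough illustration of the kind of repeat counting such a tool performs (not the REPFIND algorithm itself, which also clusters motifs and assigns P-values), here is a minimal Python sketch with a made-up sequence and a hypothetical count_cac_motifs helper:

```python
def count_cac_motifs(utr_seq, motif_len=5):
    """Count short CAC-containing windows of a fixed length in a 3'-UTR.

    Illustrative sketch only, not REPFIND: it simply tallies overlapping
    windows that contain the trinucleotide 'CAC', without the clustering
    or P-value statistics described in the abstract above.
    """
    utr_seq = utr_seq.upper()
    counts = {}
    for i in range(len(utr_seq) - motif_len + 1):
        window = utr_seq[i:i + motif_len]
        if "CAC" in window:
            counts[window] = counts.get(window, 0) + 1
    return counts

# Toy example with a made-up CAC-rich fragment.
print(count_cac_motifs("TTCACCACCACTTGCACAA"))
```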
Abstract:
A new neural network architecture is introduced for incremental supervised learning of recognition categories and multidimensional maps in response to arbitrary sequences of analog or binary input vectors. The architecture, called Fuzzy ARTMAP, achieves a synthesis of fuzzy logic and Adaptive Resonance Theory (ART) neural networks by exploiting a close formal similarity between the computations of fuzzy subsethood and ART category choice, resonance, and learning. Fuzzy ARTMAP also realizes a new Minimax Learning Rule that conjointly minimizes predictive error and maximizes code compression, or generalization. This is achieved by a match tracking process that increases the ART vigilance parameter by the minimum amount needed to correct a predictive error. As a result, the system automatically learns a minimal number of recognition categories, or "hidden units", to meet accuracy criteria. Category proliferation is prevented by normalizing input vectors at a preprocessing stage. A normalization procedure called complement coding leads to a symmetric theory in which the MIN operator (∧) and the MAX operator (∨) of fuzzy logic play complementary roles. Complement coding uses on-cells and off-cells to represent the input pattern, and preserves individual feature amplitudes while normalizing the total on-cell/off-cell vector. Learning is stable because all adaptive weights can only decrease in time. Decreasing weights correspond to increasing sizes of category "boxes". Smaller vigilance values lead to larger category boxes. Improved prediction is achieved by training the system several times using different orderings of the input set. This voting strategy can also be used to assign probability estimates to competing predictions given small, noisy, or incomplete training sets. Four classes of simulations illustrate Fuzzy ARTMAP performance as compared to benchmark back propagation and genetic algorithm systems. These simulations include (i) finding points inside vs. outside a circle; (ii) learning to tell two spirals apart; (iii) incremental approximation of a piecewise continuous function; and (iv) a letter recognition database. The Fuzzy ARTMAP system is also compared to Salzberg's NGE system and to Simpson's FMMC system.
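A minimal sketch of two of the ingredients described above, complement coding and the fuzzy ART category-choice function; the match-tracking rule and the ARTMAP map field are omitted, and the weight vectors shown are hypothetical:

```python
import numpy as np

def complement_code(a):
    """Complement coding: represent input a in [0,1]^M as [a, 1 - a].

    The coded vector has constant L1 norm M, which is the normalization
    the abstract credits with preventing category proliferation.
    """
    a = np.asarray(a, dtype=float)
    return np.concatenate([a, 1.0 - a])

def category_choice(I, weights, alpha=0.001):
    """Fuzzy ART choice function T_j = |I ^ w_j| / (alpha + |w_j|),
    where ^ is the component-wise fuzzy MIN and |.| the L1 norm.

    Sketch of the category-choice step only.
    """
    T = [np.minimum(I, w).sum() / (alpha + w.sum()) for w in weights]
    return int(np.argmax(T))

I = complement_code([0.2, 0.7])
weights = [np.ones(4), complement_code([0.25, 0.65])]  # hypothetical uncommitted/committed nodes
print(category_choice(I, weights))
```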
Abstract:
The last 30 years have seen Fuzzy Logic (FL) emerge as a method either complementing or challenging stochastic methods as the traditional approach to modelling uncertainty. But the circumstances under which FL or stochastic methods should be used remain disputed: the areas of application of statistical and FL methods overlap, and opinions differ as to when each method should be used. Practically relevant case studies comparing the two methods are lacking. This work compares stochastic and FL methods for the assessment of spare capacity, using pharmaceutical high purity water (HPW) utility systems as the example. The goal of this study was to find the most appropriate method for modelling uncertainty in industrial scale HPW systems. The results provide evidence suggesting that stochastic methods are superior to FL methods in simulating uncertainty in chemical plant utilities, including HPW systems, in typical cases where extreme events (for example peaks in demand) or day-to-day variation, rather than average values, are of interest. The average production output or other statistical measures may, for instance, be of interest in the assessment of workshops. Furthermore, the results indicate that the stochastic model should be used only if found necessary by a deterministic simulation. Consequently, this thesis concludes that either deterministic or stochastic methods should be used to simulate uncertainty in chemical plant utility systems, and by extension some process systems, because extreme events and the modelling of day-to-day variation are important in capacity extension projects. Other reasons supporting the suggestion that stochastic HPW models are preferable to FL HPW models include: 1. The computer code for stochastic models is typically less complex than that of FL models, reducing code maintenance and validation issues. 2. In many respects FL models are similar to deterministic models, so the need for a FL model over a deterministic model is questionable in the case of industrial scale HPW systems as presented here (as well as other similar systems), since the deterministic model is the simpler of the two. 3. A FL model may be difficult to "sell" to an end-user, as its results represent "approximate reasoning", a definition of which is, however, lacking. 4. Stochastic models may be applied with relatively minor modifications to other systems, whereas FL models may not. For instance, the stochastic HPW model could be used to model municipal drinking water systems, whereas the FL HPW model could not, because the FL and stochastic model philosophies of a HPW system are fundamentally different. The stochastic model treats schedule and volume uncertainties as random phenomena described by statistical distributions based on either estimated or historical data. The FL model, on the other hand, simulates schedule uncertainties based on estimated operator behaviour, e.g. tiredness of the operators and their working schedule; but in a municipal drinking water distribution system the notion of "operator" breaks down. 5. Stochastic methods can account for uncertainties that are difficult to model with FL. The FL HPW system model does not account for dispensed volume uncertainty, as there appears to be no reasonable method of accounting for it with FL, whereas the stochastic model includes volume uncertainty.
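A hedged sketch of the point about extreme events: a Monte Carlo model of daily dispensing with made-up rates and volumes, contrasted with a deterministic average. This is illustrative only and not the thesis's HPW model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical illustration: daily dispensing events on an HPW loop, with
# made-up rates and volumes, contrasting a deterministic average with a
# Monte Carlo estimate of peak (extreme-event) demand.
n_days = 10_000
events_per_day = rng.poisson(lam=12, size=n_days)                  # schedule uncertainty
daily_demand = np.array([
    rng.normal(loc=150.0, scale=20.0, size=n).clip(min=0).sum()    # volume uncertainty (litres)
    for n in events_per_day
])

deterministic_estimate = 12 * 150.0            # average-based sizing
p99_peak = np.percentile(daily_demand, 99)     # extreme-event sizing

print(f"deterministic daily demand: {deterministic_estimate:.0f} L")
print(f"99th-percentile simulated demand: {p99_peak:.0f} L")
```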
Abstract:
In many real world situations, we make decisions in the presence of multiple, often conflicting and non-commensurate objectives. The process of optimizing systematically and simultaneously over a set of objective functions is known as multi-objective optimization. In multi-objective optimization, we have a (possibly exponentially large) set of decisions and each decision has a set of alternatives. Each alternative depends on the state of the world, and is evaluated with respect to a number of criteria. In this thesis, we consider decision-making problems in two scenarios. In the first scenario, the current state of the world, under which the decisions are to be made, is known in advance. In the second scenario, the current state of the world is unknown at the time of making decisions. For decision making under certainty, we consider the framework of multi-objective constraint optimization and focus on extending the algorithms that solve these models to the case where there are additional trade-offs. We focus especially on branch-and-bound algorithms that use a mini-buckets algorithm for generating the upper bound at each node of the search tree (in the context of maximizing values of objectives). Since the size of the guiding upper bound sets can become very large during the search, we introduce efficient methods for reducing these sets while still maintaining the upper bound property. We define a formalism for imprecise trade-offs, which allows the decision maker, during the elicitation stage, to specify a preference for one multi-objective utility vector over another, and use such preferences to infer other preferences. The induced preference relation is then used to eliminate the dominated utility vectors during the computation. For testing dominance between multi-objective utility vectors, we present three different approaches. The first is based on a linear programming approach; the second uses a distance-based algorithm (which uses a measure of the distance between a point and a convex cone); the third makes use of a matrix multiplication, which results in much faster dominance checks with respect to the preference relation induced by the trade-offs. Furthermore, we show that our trade-offs approach, which is based on a preference inference technique, can also be given an alternative semantics based on the well-known Multi-Attribute Utility Theory. Our comprehensive experimental results on common multi-objective constraint optimization benchmarks demonstrate that the proposed enhancements allow the algorithms to scale up to much larger problems than before. For decision making problems under uncertainty, we describe multi-objective influence diagrams, based on a set of p objectives, where utility values are vectors in R^p and are typically only partially ordered. These can be solved by a variable elimination algorithm, leading to a set of maximal values of expected utility. If the Pareto ordering is used, this set can often be prohibitively large. We consider approximate representations of the Pareto set based on ϵ-coverings, allowing much larger problems to be solved. In addition, we define a method for incorporating user trade-offs, which also greatly improves the efficiency.
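A minimal sketch of Pareto dominance filtering over utility vectors under the maximization convention used above; the thesis's trade-off-induced dominance and matrix-multiplication speed-ups are not reproduced:

```python
import numpy as np

def pareto_dominates(u, v):
    """True if utility vector u weakly dominates v and is strictly better
    in at least one objective (maximization convention)."""
    u, v = np.asarray(u), np.asarray(v)
    return bool(np.all(u >= v) and np.any(u > v))

def prune_dominated(vectors):
    """Keep only the maximal (non-dominated) utility vectors.

    Illustrative O(n^2) filter only.
    """
    keep = []
    for i, u in enumerate(vectors):
        if not any(pareto_dominates(v, u) for j, v in enumerate(vectors) if j != i):
            keep.append(u)
    return keep

frontier = prune_dominated([(3, 5), (4, 4), (2, 6), (3, 4), (1, 1)])
print(frontier)  # (3, 4) and (1, 1) are dominated and removed
```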
Abstract:
Evaluating environmental policies, such as the mitigation of greenhouse gases, frequently requires balancing near-term mitigation costs against long-term environmental benefits. Conventional approaches to valuing such investments hold interest rates constant, but the authors contend that there is a real degree of uncertainty in future interest rates. This leads to a higher valuation of future benefits relative to conventional methods that ignore interest rate uncertainty.
Abstract:
We demonstrate that when the future path of the discount rate is uncertain and highly correlated, the distant future should be discounted at significantly lower rates than suggested by the current rate. We then use two centuries of US interest rate data to quantify this effect. Using both random walk and mean-reverting models, we compute the "certainty-equivalent rate" that summarizes the effect of uncertainty and measures the appropriate forward rate of discount in the future. Under the random walk model we find that the certainty-equivalent rate falls continuously from 4% to 2% after 100 years, 1% after 200 years, and 0.5% after 300 years. At horizons of 400 years, the discounted value increases by a factor of over 40,000 relative to conventional discounting. Applied to climate change mitigation, we find that incorporating discount rate uncertainty almost doubles the expected present value of mitigation benefits.
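A small Monte Carlo sketch of a certainty-equivalent rate under a random-walk discount rate; the parameters are illustrative and not the paper's calibration to two centuries of US data, and the flat-rate measure below is a simple proxy for the paper's certainty-equivalent forward rate:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative parameters: start at 4% and let the rate follow a random walk.
r0, sigma, horizon, n_paths = 0.04, 0.005, 400, 20_000

shocks = rng.normal(0.0, sigma, size=(n_paths, horizon))
rates = np.clip(r0 + np.cumsum(shocks, axis=1), 0.0, None)      # keep rates non-negative
discount_factors = np.exp(-np.cumsum(rates, axis=1)).mean(axis=0)

# Flat rate R(t) reproducing the expected discount factor to horizon t:
# R(t) = -ln E[exp(-sum r)] / t. Averaging over paths is what makes this
# fall below the current 4% rate at long horizons.
years = np.arange(1, horizon + 1)
ce_rate = -np.log(discount_factors) / years

for t in (100, 200, 300, 400):
    print(f"year {t}: certainty-equivalent rate ~ {ce_rate[t - 1]:.3%}")
```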
Abstract:
Most air quality modelling work has so far been oriented towards deterministic simulations of ambient pollutant concentrations. This traditional approach, which is based on the use of one selected model and one data set of discrete input values, does not reflect the uncertainties due to errors in model formulation and input data. Given the complexities of urban environments and the inherent limitations of mathematical modelling, it is unlikely that a single model based on routinely available meteorological and emission data will give satisfactory short-term predictions. In this study, different methods involving the use of more than one dispersion model, in association with different emission simulation methodologies and meteorological data sets, were explored for predicting best CO and benzene estimates and related confidence bounds. The different approaches were tested using experimental data obtained during intensive monitoring campaigns in busy street canyons in Paris, France. Three relatively simple dispersion models (STREET, OSPM and AEOLIUS) that are likely to be used for regulatory purposes were selected for this application. A sensitivity analysis was conducted in order to identify internal model parameters that might significantly affect results. Finally, a probabilistic methodology for assessing urban air quality was proposed.
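A minimal sketch of combining several model/input combinations into a best estimate with confidence bounds; the member values and percentile choices are hypothetical, standing in for STREET, OSPM and AEOLIUS runs under different emission and meteorological inputs:

```python
import numpy as np

def ensemble_bounds(predictions, lower=5, upper=95):
    """Combine concentration estimates from several model/input combinations
    into a best estimate (ensemble mean) and percentile confidence bounds.

    'predictions' has shape (n_members, n_hours); the numbers below are
    made up for illustration.
    """
    predictions = np.asarray(predictions, dtype=float)
    best = predictions.mean(axis=0)
    lo = np.percentile(predictions, lower, axis=0)
    hi = np.percentile(predictions, upper, axis=0)
    return best, lo, hi

hourly_co = [[2.1, 3.4, 4.0], [1.8, 3.9, 4.6], [2.5, 3.1, 3.7]]  # mg/m3, hypothetical
best, lo, hi = ensemble_bounds(hourly_co)
print(best, lo, hi)
```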
Abstract:
In this paper we propose a method for interpolation over a set of retrieved cases in the adaptation phase of the case-based reasoning cycle. The method has two advantages over traditional systems: the first is that it can predict “new” instances, not yet present in the case base; the second is that it can predict solutions not present in the retrieval set. The method is a generalisation of Shepard’s Interpolation method, formulated as the minimisation of an error function defined in terms of distance metrics in the solution and problem spaces. We term the retrieval algorithm the Generalised Shepard Nearest Neighbour (GSNN) method. A novel aspect of GSNN is that it provides a general method for interpolation over nominal solution domains. The method is illustrated in the paper with reference to the Irises classification problem. It is evaluated with reference to a simulated nominal value test problem, and to a benchmark case base from the travel domain. The algorithm is shown to out-perform conventional nearest neighbour methods on these problems. Finally, GSNN is shown to improve in efficiency when used in conjunction with a diverse retrieval algorithm.
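A hedged sketch in the spirit of GSNN, not the authors' exact formulation: among candidate nominal solutions (which may include values absent from the retrieval set), choose the one minimizing a distance-weighted error over the retrieved cases. The toy data, distance metrics and the gsnn_predict name are hypothetical:

```python
def gsnn_predict(query, cases, candidates, d_problem, d_solution, p=2):
    """Nominal-valued interpolation sketch: pick the candidate solution
    minimizing a Shepard-style distance-weighted error over retrieved cases.

    cases      : list of (problem, solution) pairs from the case base
    candidates : nominal solutions to consider
    d_problem  : distance metric in the problem space
    d_solution : distance metric in the solution space
    """
    weights = [1.0 / (d_problem(query, x) ** p + 1e-12) for x, _ in cases]

    def error(candidate):
        return sum(w * d_solution(candidate, y) ** 2
                   for w, (_, y) in zip(weights, cases))

    return min(candidates, key=error)

# Toy usage with a hypothetical ordinal solution distance.
cases = [((5.1, 3.5), "setosa"), ((6.7, 3.1), "versicolor"), ((6.3, 3.3), "virginica")]
order = {"setosa": 0, "versicolor": 1, "virginica": 2}
d_prob = lambda a, b: sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5
d_sol = lambda a, b: abs(order[a] - order[b])
print(gsnn_predict((6.0, 3.0), cases, list(order), d_prob, d_sol))
```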
Abstract:
The Continuous Plankton Recorder (CPR) survey provides a unique multi-decadal dataset on the abundance of plankton in the North Sea and North Atlantic and is one of only a few monitoring programmes operating at a large spatio-temporal scale. The results of all samples analysed from the survey since 1946 are stored on an Access Database at the Sir Alister Hardy Foundation for Ocean Science (SAHFOS) in Plymouth. The database is large, containing more than two million records (~80 million data points, if zero results are added) for more than 450 taxonomic entities. An open data policy is operated by SAHFOS. However, the data are not on-line and so access by scientists and others wishing to use the results is not interactive. Requests for data are dealt with by the Database Manager. To facilitate access to the data from the North Sea, which is an area of high research interest, a selected set of data for key phytoplankton and zooplankton species has been processed in a form that makes them readily available on CD for research and other applications. A set of MATLAB tools has been developed to provide an interpolated spatio-temporal description of plankton sampled by the CPR in the North Sea, as well as easy and fast access to users in the form of a browser. Using geostatistical techniques, plankton abundance values have been interpolated on a regular grid covering the North Sea. The grid is established on centres of 1 degree longitude x 0.5 degree latitude (~32 x 30 nautical miles). Based on a monthly temporal resolution over a fifty-year period (1948-1997), 600 distribution maps have been produced for 54 zooplankton species, and 480 distribution maps for 57 phytoplankton species over the shorter period 1958-1997. The gridded database has been developed in a user-friendly form and incorporates, as a package on a CD, a set of options for visualisation and interpretation, including the facility to plot maps for selected species by month, year, groups of months or years, long-term means or as time series and contour plots. This study constitutes the first application of an easily accessed and interactive gridded database of plankton abundance in the North Sea. As a further development the MATLAB browser is being converted to a user-friendly Windows-compatible format (WinCPR) for release on CD and via the Web in 2003.
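A minimal gridding sketch using inverse-distance weighting as a simple stand-in for the geostatistical (kriging-type) interpolation described above; the coordinates and abundance values are made up, and only the 1 degree x 0.5 degree cell spacing mirrors the abstract:

```python
import numpy as np

def grid_idw(sample_lon, sample_lat, sample_value, grid_lon, grid_lat, power=2):
    """Interpolate scattered abundance samples onto a regular grid with
    inverse-distance weighting (a stand-in for the geostatistical method)."""
    lon2, lat2 = np.meshgrid(grid_lon, grid_lat)
    grid = np.zeros_like(lon2, dtype=float)
    for i in range(lon2.shape[0]):
        for j in range(lon2.shape[1]):
            d2 = (sample_lon - lon2[i, j]) ** 2 + (sample_lat - lat2[i, j]) ** 2
            w = 1.0 / (d2 ** (power / 2) + 1e-9)
            grid[i, j] = np.sum(w * sample_value) / np.sum(w)
    return grid

# Made-up abundance samples in a small North Sea box, 1 deg x 0.5 deg cells.
lon = np.array([0.5, 1.5, 2.5]); lat = np.array([54.2, 54.8, 55.3]); val = np.array([10.0, 40.0, 25.0])
print(grid_idw(lon, lat, val, np.arange(0.0, 3.0, 1.0), np.arange(54.0, 55.5, 0.5)))
```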
Abstract:
The Continuous Plankton Recorder has been deployed on a seasonal basis in the north Pacific since 2000, accumulating a database of abundance measurements for over 290 planktonic taxa in over 3,500 processed samples. There is an additional archive of over 10,000 samples available for further analyses. Exxon Valdez Oil Spill Trustee Council financial support has contributed to about half of this tally, through four projects funded since 2002. Time series of zooplankton variables for sub-regions of the survey area are presented together with abstracts of eight papers published using data from these projects. The time series covers a period when the dominant climate signal in the north Pacific, the Pacific Decadal Oscillation (PDO), switched with unusual frequency between warm/positive states (pre-1999 and 2003-2006) and cool/negative states (1999-2002 and 2007). The CPR data suggest that cool negative years show higher biomass on the shelf and lower biomass in the open ocean, while the reverse is true in warm (PDO positive) years, with lower shelf biomass (except 2005) and higher oceanic biomass. In addition, there was a delay in plankton increase on the Alaskan shelf in the colder spring of 2007, compared to the warmer springs of the preceding years. In warm years, smaller species of copepods which lack lipid reserves are also more common. Availability of the zooplankton prey to higher trophic levels (including those that society values highly) is therefore dependent on the timing of increase and peak abundance, ease of capture and nutritional value. Previously published studies using these data highlight the wide-ranging applicability of CPR data and include collaborative studies on: phenology in the key copepod species Neocalanus plumchrus, descriptions of distributions of decapod larvae and euphausiid species, the effects of hydrographic features such as mesoscale eddies and the North Pacific Current on plankton populations, and a molecular-based investigation of macro-scale population structure in N. cristatus. The future funding situation is uncertain but the value of the data and studies so far accumulated is considerable and sets a strong foundation for further studies on plankton dynamics and interactions with higher trophic levels in the northern Gulf of Alaska.
Abstract:
In this paper, a parallel-matching processor architecture with early jump-out (EJO) control is proposed to carry out high-speed biometric fingerprint database retrieval. The processor performs the fingerprint retrieval by using minutia point matching. An EJO method is applied to the proposed architecture to speed up the large database retrieval. The processor is implemented on a Xilinx Virtex-E, and occupies 6,825 slices and runs at up to 65 MHz. The software/hardware co-simulation benchmark with a database of 10,000 fingerprints verifies that the matching speed can achieve the rate of up to 1.22 million fingerprints per second. EJO results in about a 22% gain in computing efficiency.
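A software analogue of the early jump-out idea, assuming a simple nearest-neighbour minutia count; the paper's hardware pipeline and its exact matching rules are not reproduced, and the coordinates and thresholds are hypothetical:

```python
def match_score_with_ejo(probe, candidate, tol=10.0, threshold=12):
    """Count minutiae of 'probe' that have a near neighbour in 'candidate',
    jumping out early once the threshold can no longer be reached.

    Illustrative early jump-out (EJO) sketch only.
    """
    matched = 0
    for i, (px, py) in enumerate(probe):
        remaining = len(probe) - i
        if matched + remaining < threshold:
            return None                      # early jump-out: threshold unreachable
        if any((px - cx) ** 2 + (py - cy) ** 2 <= tol ** 2 for cx, cy in candidate):
            matched += 1
    return matched if matched >= threshold else None

# Hypothetical minutia coordinates.
probe = [(10 * i, 5 * i) for i in range(20)]
candidate = [(10 * i + 2, 5 * i - 1) for i in range(20)]
print(match_score_with_ejo(probe, candidate))
```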
Abstract:
This paper studies a problem of dynamic pricing faced by a retailer with limited inventory, uncertain about the demand rate model, aiming to maximize expected discounted revenue over an infinite time horizon. The retailer doubts his demand model which is generated by historical data and views it as an approximation. Uncertainty in the demand rate model is represented by a notion of generalized relative entropy process, and the robust pricing problem is formulated as a two-player zero-sum stochastic differential game. The pricing policy is obtained through the Hamilton-Jacobi-Isaacs (HJI) equation. The existence and uniqueness of the solution of the HJI equation is shown and a verification theorem is proved to show that the solution of the HJI equation is indeed the value function of the pricing problem. The results are illustrated by an example with exponential nominal demand rate.
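A nominal, non-robust sketch only: Monte Carlo revenue under a fixed price with an exponential demand rate of the kind mentioned above. The robust HJI-based policy is not attempted here, and all parameter values and the function name are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(2)

def expected_discounted_revenue(price, inventory=20, a=10.0, b=0.5, rho=0.1, n_paths=5_000):
    """Monte Carlo estimate of expected discounted revenue under a fixed
    price, limited inventory and exponential nominal demand rate
    lambda(p) = a * exp(-b * p). Nominal sketch; no robustness, no HJI."""
    lam = a * np.exp(-b * price)
    total = 0.0
    for _ in range(n_paths):
        t, left, revenue = 0.0, inventory, 0.0
        while left > 0:
            t += rng.exponential(1.0 / lam)      # time of next sale
            revenue += price * np.exp(-rho * t)  # discounted sale revenue
            left -= 1
        total += revenue
    return total / n_paths

for p in (2.0, 4.0, 6.0):
    print(f"price {p}: ~{expected_discounted_revenue(p):.1f}")
```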
Abstract:
Flutter prediction as currently practiced is usually deterministic, with a single structural model used to represent an aircraft. By using interval analysis to take into account structural variability, recent work has demonstrated that small changes in the structure can lead to very large changes in the altitude at which flutter occurs (Marques, Badcock, et al., J. Aircraft, 2010). In this follow-up work we examine the same phenomenon using probabilistic collocation (PC), an uncertainty quantification technique which can efficiently propagate multivariate stochastic input through a simulation code, in this case an eigenvalue-based fluid-structure stability code. The resulting analysis predicts the consequences of an uncertain structure on the incidence of flutter in probabilistic terms, information that could be useful in planning flight-tests and assessing the risk of structural failure. The uncertainty in flutter altitude is confirmed to be substantial. Assuming that the structural uncertainty represents an epistemic uncertainty regarding the structure, it may be reduced with the availability of additional information, for example aeroelastic response data from a flight-test. Such data are used to update the structural uncertainty using Bayes' theorem. The consequent flutter uncertainty is significantly reduced across the entire Mach number range.
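A grid-based Bayes-update sketch showing how a flight-test datum can shrink an epistemic structural uncertainty; the linear response model, the stiffness parameter and all numbers are hypothetical stand-ins for the aeroelastic analysis:

```python
import numpy as np

# Hypothetical prior over an uncertain stiffness scaling factor.
stiffness = np.linspace(0.9, 1.1, 401)
prior = np.exp(-0.5 * ((stiffness - 1.0) / 0.05) ** 2)   # epistemic prior
prior /= prior.sum()

def predicted_response(k):
    return 5.0 * k                                       # hypothetical response model (e.g. a modal frequency, Hz)

measured, noise_std = 5.08, 0.05                         # hypothetical flight-test datum
likelihood = np.exp(-0.5 * ((measured - predicted_response(stiffness)) / noise_std) ** 2)

# Bayes' theorem on the grid: posterior proportional to prior * likelihood.
posterior = prior * likelihood
posterior /= posterior.sum()

def spread(p):
    mean = np.sum(stiffness * p)
    return np.sqrt(np.sum(p * (stiffness - mean) ** 2))

print(f"prior std: {spread(prior):.4f}, posterior std: {spread(posterior):.4f}")
```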
Abstract:
Individuals subtly reminded of death, coalitional challenges, or feelings of uncertainty display exaggerated preferences for affirmations and against criticisms of their cultural in-groups. Terror management, coalitional psychology, and uncertainty management theories postulate this “worldview defense” effect as the output of mechanisms evolved either to allay the fear of death, foster social support, or reduce anxiety by increasing adherence to cultural values. In 4 studies, we report evidence for an alternative perspective. We argue that worldview defense owes to unconscious vigilance, a state of accentuated reactivity to affective targets (which need not relate to cultural worldviews) that follows detection of subtle alarm cues (which need not pertain to death, coalitional challenges, or uncertainty). In Studies 1 and 2, death-primed participants produced exaggerated ratings of worldview-neutral affective targets. In Studies 3 and 4, subliminal threat manipulations unrelated to death, coalitional challenges, or uncertainty evoked worldview defense. These results are discussed as they inform evolutionary interpretations of worldview defense and future investigations of the influence of unconscious alarm on judgment.
Abstract:
The nuclear accident in Chernobyl in 1986 is a dramatic example of the type of incidents that are characteristic of a risk society. The consequences of the incident are indeterminate, the causes complex and future developments unpredictable. Nothing can compensate for its effects and it affects a broad population indiscriminately. This paper examines the lived experience of those who experienced biographical disruption as residents of the region on the basis of qualitative case studies carried out in 2003 in the Chernobyl regions of Russia, Ukraine and Belarus. Our analysis indicates that informants tend to view their future as highly uncertain and unpredictable; they experience uncertainty about whether they are already contaminated, and they have to take hazardous decisions about where to go and what to eat. Fear, rumours and experts compete in supplying information to residents about the actual and potential consequences of the disaster, but there is little trust in, and only limited awareness of, the information that is provided. Most informants continue with their lives and do what they must or even what they like, even where the risks are known. They often describe their behaviour as being due to economic circumstances; where there is extreme poverty, even hazardous food sources are better than none. Unlike previous studies, we identify a pronounced tendency among informants not to separate the problems associated with the disaster from the hardships that have resulted from the break-up of the USSR, with both events creating a deep-seated sense of resignation and fatalism. Although most informants hold their governments to blame for lack of information, support and preventive measures, there is little or no collective action to have these put in place. This contrasts with previous research which has suggested that populations affected by disasters attribute crucial significance to that incident and, as a consequence, become increasingly politicized with regard to related policy agendas.