922 results for Elements, Electrophysiology, Real-Time Acquisition, Real-Time Analysis, High Throughput Data
Abstract:
Web services from different partners can be combined into applications that realize a more complex business goal. Such applications, built as Web service compositions, define how interactions between Web services take place in order to implement the business logic. Web service compositions not only have to provide the desired functionality but also have to comply with certain Quality of Service (QoS) levels. Maximizing the users' satisfaction, also reflected as Quality of Experience (QoE), is a primary goal to be achieved in a Service-Oriented Architecture (SOA). Unfortunately, in a dynamic environment such as SOA, unforeseen situations may arise, for example services becoming unavailable or failing to respond within the desired time frame. In such situations, appropriate actions need to be triggered in order to avoid the violation of QoS and QoE constraints. In this thesis, solutions are developed to manage Web services and Web service compositions with regard to QoS and QoE requirements. The Business Process Rules Language (BPRules) was developed to manage Web service compositions when undesired QoS or QoE values are detected. BPRules provides a rich set of management actions that may be triggered for controlling the service composition and for improving its quality behavior. Regarding the quality properties, BPRules distinguishes between the QoS values as promised by the service providers, the QoE values assigned by end-users, the monitored QoS as measured by our BPR framework, and the predicted QoS and QoE values. BPRules facilitates the specification of user groups characterized by different context properties and allows a personalized, context-aware service selection tailored to the specified user groups to be triggered. In a service market where a multitude of services with the same functionality but different quality values are available, the right services need to be selected for realizing the service composition. We developed new and efficient heuristic algorithms that choose high-quality services for the composition. BPRules offers the possibility to integrate multiple service selection algorithms. The selection algorithms are also applicable to non-linear objective functions and constraints. The BPR framework includes new approaches for context-aware service selection and quality property prediction. We consider the location of users and services as a context dimension for the prediction of response time and throughput. The BPR framework combines all new features and contributions into a comprehensive management solution. Furthermore, it facilitates flexible monitoring of QoS properties without having to modify the description of the service composition. We show how the different modules of the BPR framework work together in order to execute the management rules. We evaluate our selection algorithms and show that they outperform a genetic algorithm from related research. The evaluation also reveals how context data can be used for a personalized prediction of response time and throughput.
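As an illustration of QoS-constrained service selection (not the thesis's actual heuristics, whose details the abstract does not give), the following Python sketch greedily picks, for each task in a sequential composition, the cheapest candidate service that still fits an overall response-time budget:

```python
# Illustrative sketch (hypothetical data and heuristic): greedy QoS-aware
# service selection for a sequential composition. Each task has candidate
# services with a cost and a response time 'rt'; per task we pick the
# cheapest candidate that keeps the accumulated response time within budget.

def greedy_select(tasks, response_time_budget):
    """tasks: list of candidate lists; each candidate is {'name', 'cost', 'rt'}."""
    remaining = response_time_budget
    plan = []
    for candidates in tasks:
        # Keep only candidates that still fit the remaining time budget.
        feasible = [c for c in candidates if c["rt"] <= remaining]
        if not feasible:
            return None  # no feasible composition under this constraint
        best = min(feasible, key=lambda c: c["cost"])
        plan.append(best["name"])
        remaining -= best["rt"]
    return plan

tasks = [
    [{"name": "A1", "cost": 5, "rt": 200}, {"name": "A2", "cost": 2, "rt": 400}],
    [{"name": "B1", "cost": 1, "rt": 300}, {"name": "B2", "cost": 4, "rt": 100}],
]
print(greedy_select(tasks, response_time_budget=600))  # ['A2', 'B2']
```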
Abstract:
A new information-theoretic approach is presented for finding the pose of an object in an image. The technique does not require information about the surface properties of the object, beyond its shape, and is robust with respect to variations in illumination. In our derivation, few assumptions are made about the nature of the imaging process. As a result, the algorithms are quite general and can foreseeably be used in a wide variety of imaging situations. Experiments are presented that demonstrate the approach registering magnetic resonance (MR) images with computed tomography (CT) images, aligning a complex 3D object model to real scenes including clutter and occlusion, tracking a human head in a video sequence, and aligning a view-based 2D object model to real images. The method is based on a formulation of the mutual information between the model and the image called EMMA. As applied here, the technique is intensity-based rather than feature-based. It works well in domains where edge or gradient-magnitude based methods have difficulty, yet it is more robust than traditional correlation. Additionally, it has an efficient implementation that is based on stochastic approximation. Finally, we will describe a number of additional real-world applications that can be solved efficiently and reliably using EMMA. EMMA can be used in machine learning to find maximally informative projections of high-dimensional data. EMMA can also be used to detect and correct corruption in magnetic resonance images (MRI).
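For readers unfamiliar with intensity-based alignment measures, the sketch below estimates the mutual information between two images from a joint intensity histogram. It shows only the basic measure, not the EMMA estimator or the stochastic-approximation optimizer described above:

```python
# Minimal sketch of intensity-based mutual information between two aligned
# images, estimated from a joint histogram with NumPy.
import numpy as np

def mutual_information(img_a, img_b, bins=32):
    joint, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(), bins=bins)
    p_ab = joint / joint.sum()                 # joint intensity distribution
    p_a = p_ab.sum(axis=1, keepdims=True)      # marginal of image A
    p_b = p_ab.sum(axis=0, keepdims=True)      # marginal of image B
    nonzero = p_ab > 0
    return float(np.sum(p_ab[nonzero] * np.log(p_ab[nonzero] / (p_a @ p_b)[nonzero])))

rng = np.random.default_rng(0)
a = rng.random((64, 64))
b = a + 0.05 * rng.standard_normal((64, 64))   # a noisy copy: MI should be high
print(mutual_information(a, b))
```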
Abstract:
Real-world learning tasks often involve high-dimensional data sets with complex patterns of missing features. In this paper we review the problem of learning from incomplete data from two statistical perspectives---the likelihood-based and the Bayesian. The goal is two-fold: to place current neural network approaches to missing data within a statistical framework, and to describe a set of algorithms, derived from the likelihood-based framework, that handle clustering, classification, and function approximation from incomplete data in a principled and efficient manner. These algorithms are based on mixture modeling and make two distinct appeals to the Expectation-Maximization (EM) principle (Dempster, Laird, and Rubin 1977)---both for the estimation of mixture components and for coping with the missing data.
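A minimal sketch of the missing-data use of EM, assuming a single multivariate Gaussian rather than the mixture models treated in the paper:

```python
# Hedged sketch: EM for the mean and covariance of one multivariate Gaussian
# with values missing at random (missing entries encoded as NaN). The paper's
# method additionally fits a mixture of such components, which adds a second
# expectation over component responsibilities.
import numpy as np

def em_gaussian_missing(X, n_iter=50):
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    miss = np.isnan(X)
    mu = np.nanmean(X, axis=0)       # initialise from observed entries
    sigma = np.eye(d)
    for _ in range(n_iter):
        X_filled = X.copy()
        corr = np.zeros((d, d))      # accumulated conditional covariance of imputations
        for i in range(n):
            m, o = miss[i], ~miss[i]
            if not m.any():
                continue             # fully observed row
            if not o.any():
                X_filled[i, :] = mu  # fully missing row
                corr += sigma
                continue
            # E-step: conditional expectation of missing given observed entries.
            s_oo_inv = np.linalg.inv(sigma[np.ix_(o, o)])
            reg = sigma[np.ix_(m, o)] @ s_oo_inv
            X_filled[i, m] = mu[m] + reg @ (X[i, o] - mu[o])
            corr[np.ix_(m, m)] += sigma[np.ix_(m, m)] - reg @ sigma[np.ix_(o, m)]
        # M-step: update parameters from the completed data.
        mu = X_filled.mean(axis=0)
        diff = X_filled - mu
        sigma = (diff.T @ diff + corr) / n
    return mu, sigma

rng = np.random.default_rng(0)
data = rng.multivariate_normal([0.0, 2.0, -1.0], 0.5 * np.eye(3), size=300)
data[rng.random(data.shape) < 0.2] = np.nan   # knock out ~20% of the entries
mu_hat, sigma_hat = em_gaussian_missing(data)
print(np.round(mu_hat, 2))
```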
Abstract:
The Support Vector Machine (SVM) is a new and very promising classification technique developed by Vapnik and his group at AT&T Bell Labs. This new learning algorithm can be seen as an alternative training technique for Polynomial, Radial Basis Function and Multi-Layer Perceptron classifiers. An interesting property of this approach is that it is an approximate implementation of the Structural Risk Minimization (SRM) induction principle. The derivation of Support Vector Machines, their relationship with SRM, and their geometrical insight are discussed in this paper. Training an SVM is equivalent to solving a quadratic programming problem with linear and box constraints in a number of variables equal to the number of data points. When the number of data points exceeds a few thousand, the problem becomes very challenging because the quadratic form is completely dense, so the memory needed to store the problem grows with the square of the number of data points. Therefore, training problems arising in some real applications with large data sets are impossible to load into memory and cannot be solved using standard non-linear constrained optimization algorithms. We present a decomposition algorithm that can be used to train SVMs on large data sets. The main idea behind the decomposition is the iterative solution of sub-problems and the evaluation of optimality conditions, which are used both to generate improved iterates and to establish the stopping criteria for the algorithm. We present previous approaches, as well as results and important details of our implementation of the algorithm, which uses a second-order variant of the Reduced Gradient Method as the solver of the sub-problems. As an application of SVMs, we present preliminary results obtained by applying SVMs to the problem of detecting frontal human faces in real images.
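To make the size of the underlying optimization concrete, the sketch below writes the SVM dual as a dense quadratic program with box and equality constraints and solves it directly with SciPy for a toy data set. A decomposition method such as the one described above replaces this single large solve with a sequence of small sub-problems:

```python
# Hedged sketch: the SVM dual as a box- and equality-constrained QP, solved
# directly for a tiny linearly separable data set. The dense Gram matrix is
# exactly what makes this approach infeasible for large training sets.
import numpy as np
from scipy.optimize import minimize

X = np.array([[1.0, 1.0], [2.0, 2.5], [-1.0, -1.2], [-2.0, -1.8]])
y = np.array([1.0, 1.0, -1.0, -1.0])
C = 10.0
K = X @ X.T                                # linear kernel Gram matrix
Q = (y[:, None] * y[None, :]) * K          # dense quadratic form of the dual

dual = lambda a: 0.5 * a @ Q @ a - a.sum() # minimise the negated dual objective
res = minimize(
    dual,
    x0=np.zeros(len(y)),
    bounds=[(0.0, C)] * len(y),                           # box constraints
    constraints=[{"type": "eq", "fun": lambda a: a @ y}], # sum_i alpha_i y_i = 0
    method="SLSQP",
)
alpha = res.x
w = (alpha * y) @ X
print("multipliers:", np.round(alpha, 3), "weight vector:", np.round(w, 3))
```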
Abstract:
Robust responses and links between the tropical energy and water cycles are investigated using multiple datasets and climate models over the period 1979-2006. Atmospheric moisture and net radiative cooling provide powerful constraints upon future changes in precipitation. While moisture amount is robustly linked with surface temperature, the response of atmospheric net radiative cooling, derived from satellite data, is less coherent. Precipitation trends and relationships with surface temperature are highly sensitive to the data product and the time-period considered. Data from the Special Sensor Microwave Imager (SSM/I) produces the strongest trends in precipitation and response to warming of all the datasets considered. The tendency for moist regions to become wetter while dry regions become drier in response to warming is captured by both observations and models. Citation: John, V. O., R. P. Allan, and B. J. Soden (2009), How robust are observed and simulated precipitation responses to tropical ocean warming?
Abstract:
Visual exploration of scientific data in the life sciences is a growing research field due to the large amount of available data. Kohonen's Self-Organizing Map (SOM) is a widely used tool for the visualization of multidimensional data. In this paper we present a fast learning algorithm for SOMs that uses a simulated annealing method to adapt the learning parameters. The algorithm has been adopted in a data analysis framework for the generation of similarity maps. Such maps provide an effective tool for the visual exploration of large and multi-dimensional input spaces. The approach has been applied to data generated during the High Throughput Screening of molecular compounds; the generated maps allow a visual exploration of molecules with similar topological properties. The experimental analysis on real-world data from the National Cancer Institute shows the speed-up of the proposed SOM training process in comparison to a traditional approach. The resulting visual landscape groups molecules with similar chemical properties into densely connected regions.
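A minimal sketch of SOM training with an exponentially annealed learning rate and neighbourhood radius, standing in for (but not reproducing) the simulated-annealing parameter adaptation described in the paper:

```python
# Hedged sketch: online SOM training where the learning rate and neighbourhood
# radius decay exponentially over the training steps. The paper's actual
# simulated-annealing schedule may differ.
import numpy as np

def train_som(data, grid=(10, 10), epochs=20, lr0=0.5, sigma0=3.0, seed=0):
    rng = np.random.default_rng(seed)
    h, w = grid
    weights = rng.random((h, w, data.shape[1]))
    coords = np.stack(np.meshgrid(np.arange(h), np.arange(w), indexing="ij"), axis=-1)
    n_steps, step = epochs * len(data), 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            t = step / n_steps
            lr = lr0 * np.exp(-3.0 * t)        # annealed learning rate
            sigma = sigma0 * np.exp(-3.0 * t)  # annealed neighbourhood radius
            # Best-matching unit and Gaussian neighbourhood update.
            bmu = np.unravel_index(np.argmin(((weights - x) ** 2).sum(-1)), (h, w))
            dist2 = ((coords - np.array(bmu)) ** 2).sum(-1)
            influence = np.exp(-dist2 / (2 * sigma ** 2))[..., None]
            weights += lr * influence * (x - weights)
            step += 1
    return weights

som = train_som(np.random.default_rng(1).random((200, 4)))
print(som.shape)  # (10, 10, 4)
```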
Abstract:
For the very large nonlinear dynamical systems that arise in a wide range of physical, biological and environmental problems, the data needed to initialize a numerical forecasting model are seldom available. To generate accurate estimates of the expected states of the system, both current and future, the technique of ‘data assimilation’ is used to combine the numerical model predictions with observations of the system measured over time. Assimilation of data is an inverse problem that for very large-scale systems is generally ill-posed. In four-dimensional variational assimilation schemes, the dynamical model equations provide constraints that act to spread information into data sparse regions, enabling the state of the system to be reconstructed accurately. The mechanism for this is not well understood. Singular value decomposition techniques are applied here to the observability matrix of the system in order to analyse the critical features in this process. Simplified models are used to demonstrate how information is propagated from observed regions into unobserved areas. The impact of the size of the observational noise and the temporal position of the observations is examined. The best signal-to-noise ratio needed to extract the most information from the observations is estimated using Tikhonov regularization theory. Copyright © 2005 John Wiley & Sons, Ltd.
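As a toy linear analogue of the analysis above, the following sketch reconstructs a state from noisy, incomplete observations using the SVD of the observation matrix and Tikhonov filter factors; the regularization parameter plays the role of the signal-to-noise trade-off discussed in the abstract. The operator and data are made up for illustration, and this is not the paper's four-dimensional variational scheme:

```python
# Hedged sketch: Tikhonov-regularised recovery of a state from noisy,
# under-determined observations, written via the SVD of the observation matrix.
import numpy as np

rng = np.random.default_rng(0)
n, m = 40, 15                                   # state size, number of observations
x_true = np.sin(np.linspace(0, 3 * np.pi, n))
H = rng.standard_normal((m, n))                 # toy observation operator
y = H @ x_true + 0.1 * rng.standard_normal(m)   # noisy observations

U, s, Vt = np.linalg.svd(H, full_matrices=False)
alpha = 0.1                                     # regularisation parameter
filt = s / (s**2 + alpha**2)                    # Tikhonov filter factors
x_hat = Vt.T @ (filt * (U.T @ y))               # regularised reconstruction

print("relative error:", np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```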
Abstract:
The authors present a systolic design for a simple GA mechanism which provides high throughput and unidirectional pipelining by exploiting the inherent parallelism in the genetic operators. The design computes in O(N+G) time steps using O(N²) cells, where N is the population size and G is the chromosome length. The area of the device is independent of the chromosome length and so can be easily scaled by replicating the arrays or by employing fine-grain migration. The array is generic in the sense that it does not rely on the fitness function and can be used as an accelerator for any GA application using uniform crossover between pairs of chromosomes. The design can also be used in hybrid systems as an add-on to complement existing designs and methods for fitness function acceleration and island-style population management.
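For reference, the genetic operator the array accelerates is plain uniform crossover; a software sketch (not the systolic hardware itself) is:

```python
# Hedged sketch of uniform crossover between two parent chromosomes: each gene
# position is swapped with probability 0.5, independently of the others.
import random

def uniform_crossover(parent_a, parent_b, swap_prob=0.5, rng=random):
    child_a, child_b = list(parent_a), list(parent_b)
    for i in range(len(child_a)):
        if rng.random() < swap_prob:
            child_a[i], child_b[i] = child_b[i], child_a[i]
    return child_a, child_b

a, b = [0, 0, 0, 0, 0, 0], [1, 1, 1, 1, 1, 1]
print(uniform_crossover(a, b))
```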
Abstract:
Uncertainties associated with the representation of various physical processes in global climate models (GCMs) mean that, when projections from GCMs are used in climate change impact studies, the uncertainty propagates through to the impact estimates. A complete treatment of this ‘climate model structural uncertainty’ is necessary so that decision-makers are presented with an uncertainty range around the impact estimates. This uncertainty is often underexplored owing to the human and computer processing time required to perform the numerous simulations. Here, we present a 189-member ensemble of global river runoff and water resource stress simulations that adequately address this uncertainty. Following several adaptations and modifications, the ensemble creation time has been reduced from 750 h on a typical single-processor personal computer to 9 h of high-throughput computing on the University of Reading Campus Grid. Here, we outline the changes that had to be made to the hydrological impacts model and to the Campus Grid, and present the main results. We show that, although there is considerable uncertainty in both the magnitude and the sign of regional runoff changes across different GCMs with climate change, there is much less uncertainty in runoff changes for regions that experience large runoff increases (e.g. the high northern latitudes and Central Asia) and large runoff decreases (e.g. the Mediterranean). Furthermore, there is consensus that the percentage of the global population at risk to water resource stress will increase with climate change.
Abstract:
BACKGROUND: In order to maintain the most comprehensive structural annotation databases we must carry out regular updates for each proteome using the latest profile-profile fold recognition methods. The ability to carry out these updates on demand is necessary to keep pace with the regular updates of sequence and structure databases. Providing the highest quality structural models requires the most intensive profile-profile fold recognition methods running with the very latest available sequence databases and fold libraries. However, running these methods on such a regular basis for every sequenced proteome requires large amounts of processing power. In this paper we describe and benchmark the JYDE (Job Yield Distribution Environment) system, which is a meta-scheduler designed to work above cluster schedulers, such as Sun Grid Engine (SGE) or Condor. We demonstrate the ability of JYDE to distribute the load of genomic-scale fold recognition across multiple independent Grid domains. We use the most recent profile-profile version of our mGenTHREADER software in order to annotate the latest version of the Human proteome against the latest sequence and structure databases in as short a time as possible. RESULTS: We show that our JYDE system is able to scale to large numbers of intensive fold recognition jobs running across several independent computer clusters. Using our JYDE system we have been able to annotate 99.9% of the protein sequences within the Human proteome in less than 24 hours, by harnessing over 500 CPUs from 3 independent Grid domains. CONCLUSION: This study clearly demonstrates the feasibility of carrying out on demand high quality structural annotations for the proteomes of major eukaryotic organisms. Specifically, we have shown that it is now possible to provide complete regular updates of profile-profile based fold recognition models for entire eukaryotic proteomes, through the use of Grid middleware such as JYDE.
Abstract:
The expression of proteins using recombinant baculoviruses is a mature and widely used technology. However, some aspects of the technology continue to detract from high throughput use and the basis of the final observed expression level is poorly understood. Here, we describe the design and use of a set of vectors developed around a unified cloning strategy that allow parallel expression of target proteins in the baculovirus system as N-terminal or C-terminal fusions. Using several protein kinases as tests we found that amino-terminal fusion to maltose binding protein rescued expression of the poorly expressed human kinase Cot but had only a marginal effect on expression of a well-expressed kinase IKK-2. In addition, MBP fusion proteins were found to be secreted from the expressing cell. Use of a carboxyl-terminal GFP tagging vector showed that fluorescence measurement paralleled expression level and was a convenient readout in the context of insect cell expression, an observation that was further supported with additional non-kinase targets. The expression of the target proteins using the same vectors in vitro showed that differences in expression level were wholly dependent on the environment of the expressing cell and an investigation of the time course of expression showed it could affect substantially the observed expression level for poorly but not well-expressed proteins. Our vector suite approach shows that rapid expression survey can be achieved within the baculovirus system and in addition, goes some way to identifying the underlying basis of the expression level obtained. (c) 2006 Elsevier Inc. All rights reserved.
Abstract:
It has become evident that the mystery of life will not be deciphered just by decoding its blueprint, the genetic code. In the life and biomedical sciences, research efforts are now shifting from pure gene analysis to the analysis of all biomolecules involved in the machinery of life. One area of these postgenomic research fields is proteomics. Although proteomics, which basically encompasses the analysis of proteins, is not a new concept, it is far from being a research field that can rely on routine and large-scale analyses. At the time the term proteomics was coined, a gold-rush mentality was created, promising vast and quick riches (i.e., solutions to the immensely complex questions of life and disease). Predictably, the reality has been quite different. The complexity of proteomes and the wide variations in the abundances and chemical properties of their constituents have rendered the use of systematic analytical approaches only partially successful, and biologically meaningful results have been slow to arrive. However, to learn more about how cells and, hence, life works, it is essential to understand the proteins and their complex interactions in their native environment. This is why proteomics will be an important part of the biomedical sciences for the foreseeable future. Therefore, any advances in providing the tools that make protein analysis a more routine and large-scale business, ideally using automated and rapid analytical procedures, are highly sought after. This review will provide some basics, thoughts and ideas on the exploitation of matrix-assisted laser desorption/ionization in biological mass spectrometry - one of the most commonly used analytical tools in proteomics - for high-throughput analyses.
Abstract:
Recombinant baculoviruses have established themselves as a favoured technology for the high-level expression of recombinant proteins. The construction of recombinant viruses, however, is a time-consuming step that restricts consideration of the technology for high-throughput developments. Here we use a targeted gene knockout technology to inactivate an essential viral gene that lies adjacent to the locus used for recombination. Viral DNA prepared from the knockout fails to initiate an infection unless rescued by recombination with a baculovirus transfer vector. Modified viral DNA allows 100% recombinant virus formation, obviates the need for further virus purification and offers an efficient means of mass parallel recombinant formation.
Abstract:
A novel and generic miniaturization methodology for the determination of partition coefficient values of organic compounds in n-octanol/water by using magnetic nanoparticles is, for the first time, described. We have successfully designed, synthesised and characterised new colloidally stable, porous silica-encapsulated magnetic nanoparticles of controlled dimensions. These nanoparticles, which absorb a tiny amount of n-octanol in their porous silica over-layer, are homogeneously dispersed into a bulk aqueous phase (pH 7.40) containing an organic compound prior to magnetic separation. The small size of the particles and the efficient mixing allow a rapid establishment of the partition equilibrium of the organic compound between the solid-supported n-octanol nano-droplets and the bulk aqueous phase. UV-vis spectrophotometry is then applied as a quantitative method to determine the concentration of the organic compound in the aqueous phase both before and after partitioning (after magnetic separation). The log D values of organic compounds of pharmaceutical interest (0.65-3.50) determined by this novel methodology were found to be in excellent agreement with the values measured by the shake-flask method in two independent laboratories, which are also consistent with the literature data. It was also found that this new technique offers a number of advantages, such as an accurate measurement of the log D value, a much shorter experimental time and a smaller required sample size. With this approach, the formation of a problematic emulsion, commonly encountered in shake-flask experiments, is eliminated. It is envisaged that this method could be applicable to the high-throughput log D screening of drug candidates. (c) 2005 Elsevier B.V. All rights reserved.