937 results for Distributed data
Abstract:
Usually, ancillary services are provided by large conventional generators; however, with the growing interest in distributed generation to satisfy energy and environmental requirements, it seems reasonable to assume that these services could also be provided by distributed generators in an economical and efficient way. This paper presents a proposal for enhancing the active power reserve capacity for frequency control using distributed generators. The goal is to minimize the payments made by the transmission system operator to conventional and distributed generators for this ancillary service and for the energy needed to supply loads and system losses, subject to a set of constraints. For the analysis, the proposal was implemented using data from the IEEE 30-bus transmission test system, and comparisons were performed between the system with conventional generators only and the system with distributed generators also installed.
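As a rough illustration of the payment-minimization formulation described above, the sketch below sets up a toy linear program with scipy: two conventional and two distributed generators, hypothetical prices, a single demand-balance constraint and a reserve requirement. All numbers, unit names, and the exact constraint set are invented for illustration; the paper's actual model (IEEE 30-bus data, losses, etc.) is richer.

```python
# Illustrative sketch only: a simplified linear program for the
# payment-minimization idea. All values are hypothetical.
import numpy as np
from scipy.optimize import linprog

# Two conventional and two distributed generators.
# Decision variables: [e1, e2, e3, e4, r1, r2, r3, r4]
# e = energy dispatched (MW), r = reserve capacity committed (MW).
energy_price = np.array([30.0, 35.0, 40.0, 38.0])   # $/MWh, hypothetical
reserve_price = np.array([5.0, 6.0, 4.0, 4.5])      # $/MW,  hypothetical
c = np.concatenate([energy_price, reserve_price])   # objective: total payments

demand = 250.0        # MW, load plus estimated losses
reserve_req = 30.0    # MW, required active power reserve
cap = np.array([120.0, 120.0, 40.0, 40.0])          # unit capacities, MW

# Equality: energy meets demand. Inequalities: total reserve >= requirement,
# and per-unit coupling e_i + r_i <= cap_i.
A_eq = np.array([[1, 1, 1, 1, 0, 0, 0, 0]])
b_eq = np.array([demand])
A_ub = [np.array([0, 0, 0, 0, -1, -1, -1, -1])]     # -sum(r) <= -reserve_req
b_ub = [-reserve_req]
for i in range(4):
    row = np.zeros(8); row[i] = 1; row[4 + i] = 1
    A_ub.append(row); b_ub.append(cap[i])

res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
              A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * 8)
print("total payment: $%.2f" % res.fun)
print("energy:", res.x[:4], "reserve:", res.x[4:])
```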
Abstract:
Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)
Abstract:
The use of markers distributed all along the genome may increase the accuracy of the predicted additive genetic value of young animals that are candidates for selection as reproducers. In commercial herds, due to the cost of genotyping, only some animals are genotyped, and two- or three-step procedures are used to include these genomic data in the genetic evaluation. However, genomic evaluation may also be calculated in one unified step that combines phenotypic data, pedigree and genomics. The aim of this study was to compare a multiple-trait model using only pedigree information with another using both pedigree and genomic data. In this study, 9,318 lactations from 3,061 buffaloes were used; 384 buffaloes were genotyped using an Illumina bovine chip (Illumina Infinium® BovineHD BeadChip). Seven traits were analyzed: milk yield (MY), fat yield (FY), protein yield (PY), lactose yield (LY), fat percentage (F%), protein percentage (P%) and somatic cell score (SCSt). Two analyses were carried out: one using phenotypic and pedigree information (matrix A) and the other using a matrix based on pedigree and genomic information (single step, matrix H). The (co)variance components were estimated by multiple-trait Bayesian inference, applying an animal model through Gibbs sampling. The model included the fixed effects of contemporary group (herd-year-calving season) and number of milkings (2 levels), and the age of the buffalo at calving as a covariable (linear and quadratic effects). The additive genetic, permanent environmental, and residual effects were included as random effects. The heritability estimates using matrix A were 0.25, 0.22, 0.26, 0.17, 0.37, 0.42 and 0.26, and using matrix H were 0.25, 0.24, 0.26, 0.18, 0.38, 0.46 and 0.26, for MY, FY, PY, LY, F%, P% and SCSt, respectively. The estimates of the additive genetic effect for the traits were similar in both analyses, but the accuracies were higher using matrix H (more than 15% higher for the traits studied). The heritability estimates were moderate, indicating genetic gain under selection. The use of genomic information in the analyses increases accuracy and permits a better estimation of the additive genetic value of the animals.
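For readers unfamiliar with the "matrix H" blending, the numpy sketch below applies the standard single-step GBLUP identity, H⁻¹ = A⁻¹ + [[0, 0], [0, G⁻¹ − A₂₂⁻¹]], where block 2 indexes the genotyped animals. The pedigree and genomic matrices here are tiny invented placeholders, not the buffalo data.

```python
# Toy single-step H matrix: blend pedigree relationships (A) with genomic
# relationships (G) for the genotyped subset.
import numpy as np

n, geno = 5, [3, 4]                 # 5 animals, last two genotyped (toy setup)
A = np.eye(n) + 0.1                 # toy pedigree relationship matrix
G = np.array([[1.05, 0.20],         # toy genomic relationship matrix
              [0.20, 0.98]])

A_inv = np.linalg.inv(A)
A22_inv = np.linalg.inv(A[np.ix_(geno, geno)])

# H^-1 = A^-1 with the genotyped block adjusted by (G^-1 - A22^-1).
H_inv = A_inv.copy()
H_inv[np.ix_(geno, geno)] += np.linalg.inv(G) - A22_inv
H = np.linalg.inv(H_inv)
print(np.round(H, 3))
```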
Abstract:
This paper reports on an unmodeled, all-sky search for gravitational waves from merging intermediate mass black hole binaries (IMBHB). The search was performed on data from the second joint science run of the LIGO and Virgo detectors (July 2009-October 2010) and was sensitive to IMBHBs with a range up to ~200 Mpc, averaged over the possible sky positions and inclinations of the binaries with respect to the line of sight. No significant candidate was found. Upper limits on the coalescence-rate density of nonspinning IMBHBs with total masses between 100 and 450 M⊙ and mass ratios between 0.25 and 1 were placed by combining this analysis with an analogous search performed on data from the first LIGO-Virgo joint science run (November 2005-October 2007). The most stringent limit was set for systems consisting of two 88 M⊙ black holes and is equal to 0.12 Mpc⁻³ Myr⁻¹ at the 90% confidence level. This paper also presents the first estimate, for the case of an unmodeled analysis, of the impact of IMBHB spin configurations on the search range: the visible volume for IMBHBs with nonspinning components is roughly doubled for a population of IMBHBs with spins aligned with the binary's orbital angular momentum and uniformly distributed in the dimensionless spin parameter up to 0.8, whereas an analogous population with antialigned spins decreases the visible volume by ~20%.
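As a sanity check on the quoted number: for a search that observes zero significant events, the 90%-confidence Poisson upper limit on the expected event count is ln 10 ≈ 2.303, so the rate-density limit is roughly 2.303 divided by the surveyed space-time volume ⟨VT⟩. The snippet below inverts the quoted limit to infer the implied ⟨VT⟩; this is a back-of-the-envelope reconstruction, not a figure from the paper.

```python
# Infer the space-time volume implied by the quoted 90% rate upper limit,
# assuming the zero-detection Poisson relation R90 = ln(10) / <VT>.
import math

rate_ul = 0.12                      # Mpc^-3 Myr^-1, quoted above
VT = math.log(10) / rate_ul         # implied <VT>, Mpc^3 Myr
print("implied <VT> ~ %.1f Mpc^3 Myr" % VT)   # ~19.2
```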
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
We present results of a search for continuously emitted gravitational radiation, directed at the brightest low-mass x-ray binary, Scorpius X-1. Our semicoherent analysis covers 10 days of LIGO S5 data in the 50-550 Hz range, and performs an incoherent sum of coherent F-statistic power distributed amongst frequency-modulated orbital sidebands. All candidates not removed at the veto stage were found to be consistent with noise at a 1% false alarm rate. We present Bayesian 95% confidence upper limits on gravitational-wave strain amplitude using two different prior distributions: a standard one, with no a priori assumptions about the orientation of Scorpius X-1; and an angle-restricted one, using a prior derived from electromagnetic observations. Median strain upper limits of 1.3 × 10⁻²⁴ and 8 × 10⁻²⁵ are reported at 150 Hz for the standard and angle-restricted searches, respectively. This proof-of-principle analysis was limited to a short observation time by unknown effects of accretion on the intrinsic spin frequency of the neutron star, but improves upon previous upper limits by factors of ~1.4 for the standard search and 2.3 for the angle-restricted search in the sensitive region of the detector.
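The sideband-summing idea lends itself to a compact sketch: a signal at template frequency f0, phase-modulated by the binary orbit, is spread into sidebands spaced 1/P_orb apart, so a semicoherent statistic can be built by summing F-statistic power across those sidebands. The toy below runs on random noise, and the sideband count and frequency slice are illustrative choices, not the search's actual parameters.

```python
# Toy sideband statistic: incoherent sum of F-statistic power over
# orbital sidebands around each template frequency.
import numpy as np

df = 1.0 / (10 * 86400.0)             # frequency resolution for 10 days, Hz
freqs = np.arange(50.0, 50.01, df)    # a small slice of the 50-550 Hz band
F = np.random.chisquare(4, freqs.size) / 2.0   # F-statistic under pure noise

P_orb = 68023.7                       # Scorpius X-1 orbital period, s (approx.)
n_sidebands = 50                      # illustrative; depends on projected radius

def sideband_statistic(i0):
    """Sum F over sidebands spaced 1/P_orb around template bin i0."""
    step = int(round((1.0 / P_orb) / df))          # sideband spacing in bins
    idx = i0 + step * np.arange(-n_sidebands, n_sidebands + 1)
    idx = idx[(idx >= 0) & (idx < F.size)]
    return F[idx].sum()

C = np.array([sideband_statistic(i) for i in range(F.size)])
print("loudest sideband statistic:", C.max())
```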
Abstract:
The number of electronic devices connected to agricultural machinery is increasing to support new Precision Agriculture (PA) tasks such as spatial variability mapping and Variable Rate Technology (VRT). The Distributed Control System (DCS) is a suitable solution for decentralizing the data acquisition system, and the Controller Area Network (CAN) is the major trend among embedded communication protocols for agricultural machinery and vehicles. The application of soil correctives is a typical problem in Brazil. The efficiency of this correction process depends heavily on how the inputs are applied to the soil, and application errors directly affect agricultural yield. To address this problem, this paper presents the development of a CAN-based distributed control system for VRT application of soil correctives by agricultural machinery. The VRT system is composed of a tractor-implement set that applies the desired rate of inputs according to a georeferenced prescription map of the farm field. The performance of the CAN-based VRT system was evaluated through experimental tests and by analyzing the CAN messages transmitted during operation of the entire system. The control errors, judged against the requirements of the agricultural application, allow us to conclude that the developed VRT system is suitable for agricultural production, reaching acceptable response times and application errors. The CAN-based DCS solution applied to the VRT system reduced the complexity of the control system, easing installation and maintenance. The use of the VRT system made it possible to apply only the required inputs, increasing operational efficiency and minimizing environmental impact.
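A minimal sketch of the VRT control idea, using the python-can library with a virtual bus: look up the target rate for the current position in a toy prescription map and broadcast it as a CAN setpoint. The CAN identifier, payload encoding, grid lookup and map are all hypothetical; a production implementation would follow ISO 11783 (ISOBUS).

```python
# Hypothetical VRT setpoint broadcast over CAN (python-can, virtual bus).
import can

# Toy prescription map: (lat, lon) grid cell -> lime rate in kg/ha.
prescription = {(0, 0): 1500, (0, 1): 1800, (1, 0): 2000, (1, 1): 1200}

def cell_for(lat, lon):
    """Map a GPS fix to a grid cell; real maps are georeferenced polygons."""
    return (int(lat) % 2, int(lon) % 2)

def rate_message(rate_kg_ha):
    """Encode the setpoint as a 16-bit value in a CAN frame (made-up ID)."""
    return can.Message(arbitration_id=0x18FF1000,
                       data=list(int(rate_kg_ha).to_bytes(2, "big")),
                       is_extended_id=True)

bus = can.interface.Bus(bustype="virtual", channel="vrt_demo")  # test bus
rate = prescription[cell_for(0.0, 1.0)]
bus.send(rate_message(rate))
print("sent setpoint:", rate, "kg/ha")
bus.shutdown()
```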
Abstract:
Centralized and distributed methods are the two connection management schemes in wavelength-convertible optical networks. In earlier work, the centralized scheme is reported to have a lower network blocking probability than the distributed one, so much of the previous work on connection management has compared different algorithms within only the distributed scheme or only the centralized scheme. However, we believe that the network blocking probability of these two connection management schemes depends, to a great extent, on the network traffic patterns and reservation times. Our simulation results reveal that the performance improvement (in terms of blocking probability) of the centralized method over the distributed method is inversely proportional to the ratio of average connection interarrival time to reservation time. Once that ratio increases beyond a threshold, the two connection management schemes yield almost the same blocking probability under the same network load. In this paper, we review the working procedures of the distributed and centralized schemes, discuss the tradeoff between them, compare the two methods under different network traffic patterns via simulation, and give our conclusions based on the simulation data.
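The convergence effect can be reproduced with a toy Monte Carlo model, sketched below under strong simplifying assumptions: the "distributed" scheme pays an extra reservation (signaling) holding time per connection while the "centralized" scheme does not, and as the interarrival-to-reservation ratio grows the two blocking probabilities approach each other. All parameters are illustrative, not taken from the paper.

```python
# Toy loss-system simulation: blocking probability with and without an
# extra per-connection reservation holding time.
import random

def blocking_prob(wavelengths, mean_interarrival, hold, reserve, n=100_000):
    busy_until, blocked, t = [0.0] * wavelengths, 0, 0.0
    for _ in range(n):
        t += random.expovariate(1.0 / mean_interarrival)   # next arrival
        free = [i for i, b in enumerate(busy_until) if b <= t]
        if not free:
            blocked += 1
            continue
        # Occupy a wavelength for the reservation time plus the call itself.
        busy_until[free[0]] = t + reserve + random.expovariate(1.0 / hold)
    return blocked / n

random.seed(1)
for ratio in (0.5, 1.0, 4.0):        # interarrival time / reservation time
    ia = ratio * 1.0                 # reservation time fixed at 1.0
    central = blocking_prob(16, ia, hold=10.0, reserve=0.0)
    distrib = blocking_prob(16, ia, hold=10.0, reserve=1.0)
    print(f"ratio={ratio:4.1f}  centralized={central:.3f}  distributed={distrib:.3f}")
```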
Abstract:
In this paper, we propose a Loss Tolerant Reliable (LTR) data transport mechanism for dynamic Event Sensing (LTRES) in wireless sensor networks (WSNs). In LTRES, a reliable event-sensing requirement at the transport layer is dynamically determined by the sink. A distributed source rate adaptation mechanism is designed, incorporating a lightweight, loss-rate-based congestion control mechanism, to regulate the data traffic injected into the network so that the reliability requirement is satisfied. An equation-based fair rate control algorithm is used to improve fairness among LTRES flows sharing a congested path. The performance evaluations show that LTRES can provide LTR data transport service for multiple events with short convergence time, low loss rate and high overall bandwidth utilization.
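As a hedged sketch of what a loss-rate-driven source rate adaptation might look like (the constants and update rule below are illustrative, not LTRES's actual equations): back off multiplicatively when the loss rate signals congestion, otherwise increase additively while the observed sensing fidelity is below the sink's requirement.

```python
# Illustrative AIMD-style source rate adaptation driven by loss rate and
# event-sensing fidelity; all constants are made up for the sketch.
def adapt_rate(rate, loss_rate, fidelity, target_fidelity,
               loss_threshold=0.05, alpha=1.0, beta=0.5):
    """Return the next source rate (packets/s)."""
    if loss_rate > loss_threshold:       # congestion: multiplicative decrease
        return max(1.0, rate * beta)
    if fidelity < target_fidelity:       # under-sensing: additive increase
        return rate + alpha
    return rate                          # requirement met: hold steady

rate = 10.0
for loss, fid in [(0.0, 0.6), (0.0, 0.8), (0.12, 0.95), (0.02, 0.9), (0.0, 0.97)]:
    rate = adapt_rate(rate, loss, fid, target_fidelity=0.95)
    print(f"loss={loss:.2f} fidelity={fid:.2f} -> rate={rate:.1f}")
```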
Abstract:
Hundreds of terabytes of CMS (Compact Muon Solenoid) data are accumulated for storage day by day at the University of Nebraska-Lincoln, one of the eight US CMS Tier-2 sites. Managing this data includes retaining useful CMS data sets and clearing storage space for newly arriving data by deleting less useful data sets. This important task is currently done manually and requires a large amount of time. The overall objective of this study was to develop a methodology to help identify the data sets to be deleted when storage space is required. CMS data is stored using HDFS (Hadoop Distributed File System), whose logs record file access operations. Hadoop MapReduce was used to feed the information in these logs to Support Vector Machines (SVMs), a machine learning algorithm applicable to classification and regression, which is used in this thesis to develop a classifier. The time taken to classify data sets with this method depends on the size of the input HDFS log file, since the algorithmic complexities of the Hadoop MapReduce algorithms used here are O(n). The SVM methodology produces a list of data sets for deletion along with their respective sizes. It was also compared with a heuristic called Retention Cost, calculated from the size of a data set and the time since its last access, which helps decide how useful a data set is. The accuracies of both were compared by calculating the percentage of data sets predicted for deletion that were accessed at a later point in time. Our SVM methodology proved to be more accurate than the Retention Cost heuristic, and could be used to solve similar problems involving other large data sets.
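The two approaches being compared can be sketched in a few lines of scikit-learn, shown below on synthetic stand-ins for the log-derived features (size, staleness and recent access count are assumed feature choices, not necessarily the thesis's): an SVM classifier versus the Retention Cost heuristic, size multiplied by time since last access.

```python
# Synthetic comparison: SVM classifier vs. Retention Cost heuristic for
# flagging data sets as "safe to delete".
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 200
# Features per data set: [size_TB, days_since_last_access, accesses_last_30d]
X = np.column_stack([rng.uniform(0.1, 20, n),
                     rng.uniform(0, 365, n),
                     rng.poisson(3, n)])
# Synthetic ground truth: deletable if stale and rarely accessed.
y = ((X[:, 1] > 120) & (X[:, 2] <= 1)).astype(int)

clf = SVC(kernel="rbf", gamma="scale").fit(X[:150], y[:150])
svm_delete = clf.predict(X[150:])

retention_cost = X[150:, 0] * X[150:, 1]       # size * time since last access
heuristic_delete = (retention_cost > np.median(retention_cost)).astype(int)

print("SVM accuracy:       %.2f" % (svm_delete == y[150:]).mean())
print("heuristic accuracy: %.2f" % (heuristic_delete == y[150:]).mean())
```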
Abstract:
Background With the development of DNA hybridization microarray technologies, it is now possible to simultaneously assess the expression levels of thousands to tens of thousands of genes. Quantitative comparison of microarrays uncovers distinct patterns of gene expression, which define different cellular phenotypes or cellular responses to drugs. Due to technical biases, normalization of the intensity levels is a prerequisite to performing further statistical analyses; choosing a suitable normalization approach can therefore be critical and deserves judicious consideration. Results Here, we considered three commonly used normalization approaches, namely Loess, Splines and Wavelets, and two non-parametric regression methods that had not yet been used for normalization, namely Kernel smoothing and Support Vector Regression. The results were compared using artificial microarray data and benchmark studies. They indicate that Support Vector Regression is the most robust to outliers and that Kernel smoothing is the worst normalization technique, while no practical differences were observed between Loess, Splines and Wavelets. Conclusion In light of our results, Support Vector Regression is favored for microarray normalization because of its robustness in estimating the normalization curve.
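For concreteness, here is a compact sketch of intensity-dependent normalization with Support Vector Regression on a simulated two-channel array: fit the M-versus-A trend with SVR and subtract it, the role Loess classically plays in MA-plot normalization. The simulated dye bias and outliers below are synthetic, not benchmark data from the paper.

```python
# SVR-based normalization of a simulated MA-plot: estimate the
# intensity-dependent trend and subtract it from the log-ratios.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(42)
n = 2000
A = rng.uniform(6, 14, n)                      # mean log-intensity per spot
bias = 0.4 * np.sin(A / 2)                     # intensity-dependent dye bias
M = bias + rng.normal(0, 0.2, n)               # log-ratios, mostly unexpressed
M[rng.choice(n, 40, replace=False)] += 3.0     # a few strong outliers

curve = SVR(kernel="rbf", C=1.0, epsilon=0.1).fit(A.reshape(-1, 1), M)
M_norm = M - curve.predict(A.reshape(-1, 1))   # normalized log-ratios
print("median |M| before: %.3f  after: %.3f"
      % (np.median(np.abs(M)), np.median(np.abs(M_norm))))
```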
Abstract:
Background The search for enriched (also known as over-represented or enhanced) ontology terms in a list of genes obtained from microarray experiments is becoming a standard procedure for system-level analysis. This procedure tries to summarize the information by focusing on classification schemes such as Gene Ontology and KEGG pathways, instead of on individual genes. Although it is well known in statistics that association and significance are distinct concepts, only significance has been used to deal with the ontology term enrichment problem. Results BayGO implements a Bayesian approach to search for enriched terms from microarray data. The R source code is freely available at http://blasto.iq.usp.br/~tkoide/BayGO in three versions: Linux, which can be easily incorporated into pre-existing pipelines; Windows, to be controlled interactively; and a web tool. The software was validated using a bacterial heat shock response dataset, since this stress triggers known system-level responses. Conclusion The Bayesian model accounts for the fact that not all the genes from a given category are observable in microarray data, due to low intensity signal, quality filters, genes that were not spotted, and so on. Moreover, BayGO allows one to measure the statistical association between generic ontology terms and differential expression, instead of working only with the common significance analysis.
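To illustrate measuring association rather than mere significance, the sketch below uses a simple Beta-Binomial posterior; this is an illustrative stand-in, not BayGO's actual model. With flat priors on the proportion of differentially expressed genes inside and outside an ontology term, the posterior probability that the within-term proportion is larger quantifies enrichment. The counts are hypothetical.

```python
# Illustrative Bayesian enrichment: posterior probability that the
# proportion of DE genes inside a term exceeds the proportion outside.
import numpy as np

rng = np.random.default_rng(7)
k_in, n_in = 12, 40        # DE genes / genes annotated with the term
k_out, n_out = 90, 1960    # DE genes / remaining genes on the array

# Beta(1,1) prior -> posterior Beta(1 + k, 1 + n - k); compare by sampling.
p_in = rng.beta(1 + k_in, 1 + n_in - k_in, 100_000)
p_out = rng.beta(1 + k_out, 1 + n_out - k_out, 100_000)
print("P(term is enriched | data) ~ %.3f" % (p_in > p_out).mean())
```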
Abstract:
Background One goal of gene expression profiling is to identify signature genes that robustly distinguish different types or grades of tumors. Several tumor classifiers based on expression profiling have been proposed using the microarray technique. Because of important differences between the probabilistic models of the microarray and SAGE technologies, it is important to develop techniques suited to selecting specific genes from SAGE measurements. Results A new framework is proposed for selecting specific genes that distinguish different biological states based on the analysis of SAGE data. The framework applies the bolstered error to identify strong genes that separate the biological states in a feature space defined by the gene expression of a training set. Credibility intervals defined from a probabilistic model of SAGE measurements are used to identify, among all gene groups selected by the strong-genes method, those that distinguish the different states most reliably. A score taking into account both the credibility and the bolstered error values is proposed to rank the groups of considered genes. Results obtained using SAGE data from gliomas are presented, corroborating the introduced methodology. Conclusion A model representing counting data, such as SAGE, provides additional statistical information that allows a more robust analysis, and this information is incorporated in the methodology described in the paper. The introduced method is suitable for identifying signature genes that lead to a good separation of the biological states using SAGE, and may be adapted for other counting methods such as Massively Parallel Signature Sequencing (MPSS) or the recent Sequencing-By-Synthesis (SBS) technique. Some of the genes identified by the proposed method may be useful for building classifiers.
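The bolstered error at the heart of the ranking can be approximated by Monte Carlo, as sketched below on synthetic two-gene data with an LDA classifier standing in for the actual discriminant: spread a Gaussian "bolstering" kernel around each training sample and average the classifier's error over that mass, which smooths the optimistic plain resubstitution estimate.

```python
# Monte Carlo bolstered resubstitution error on synthetic two-gene data.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(3)
X = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(2, 1, (20, 2))])
y = np.array([0] * 20 + [1] * 20)
clf = LinearDiscriminantAnalysis().fit(X, y)

def bolstered_error(clf, X, y, sigma=0.5, mc=500):
    """Average error over Gaussian clouds centered on the training points."""
    err = 0.0
    for xi, yi in zip(X, y):
        cloud = xi + rng.normal(0, sigma, (mc, 2))   # bolstering kernel samples
        err += (clf.predict(cloud) != yi).mean()
    return err / len(X)

print("resubstitution error: %.3f" % (clf.predict(X) != y).mean())
print("bolstered error:      %.3f" % bolstered_error(clf, X, y))
```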
Abstract:
Background Transcript enumeration methods such as SAGE, MPSS, and sequencing-by-synthesis EST "digital northern" are important high-throughput techniques for digital gene expression measurement. Like other counting or voting processes, these measurements constitute compositional data, exhibiting properties particular to the simplex space, where the sum of the components is constrained. These properties are not present in regular Euclidean spaces, in which hybridization-based microarray data are often modeled. Therefore, pattern recognition methods commonly used for microarray data analysis may be non-informative for the data generated by transcript enumeration techniques, since they ignore certain fundamental properties of this space. Results Here we present a software tool, Simcluster, designed to perform clustering analysis for data on the simplex space. We present Simcluster as a stand-alone command-line C package and as a user-friendly on-line tool. Both versions are available at: http://xerad.systemsbiology.net/simcluster. Conclusion Simcluster is designed in accordance with a well-established mathematical framework for compositional data analysis, which provides principled procedures for dealing with the simplex space, and is thus applicable in a number of contexts, including enumeration-based gene expression data.
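The compositional-data workflow Simcluster embodies can be outlined briefly; the sketch below follows Aitchison geometry generally, not Simcluster's specific implementation: close the counts to proportions, apply the centered log-ratio (clr) transform, and cluster in the resulting Euclidean space. The counts and cluster number are illustrative.

```python
# Clustering compositional count data: closure -> clr transform -> k-means.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(11)
counts = rng.poisson(50, size=(30, 8)) + 1         # 30 libraries x 8 tags, no zeros

comp = counts / counts.sum(axis=1, keepdims=True)  # closure: rows on the simplex
clr = np.log(comp) - np.log(comp).mean(axis=1, keepdims=True)  # clr transform

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(clr)
print(labels)
```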
Abstract:
Background Several mathematical and statistical methods have been proposed in the last few years to analyze microarray data, and most involve complicated formulas and software implementations that require advanced computer programming skills. Researchers from other areas may experience difficulties when attempting to use those methods in their research. Here we present a user-friendly toolbox that allows large-scale gene expression analysis to be carried out by biomedical researchers with limited programming skills. Results We introduce a user-friendly toolbox called GEDI (Gene Expression Data Interpreter), an extensible, open-source, and freely available tool that we believe will be useful to a wide range of laboratories and to researchers with no background in mathematics or computer science, allowing them to analyze their own data by applying both classical and advanced approaches developed and recently published by Fujita et al. Conclusion GEDI is an integrated user-friendly viewer that combines the state-of-the-art SVR, DVAR and SVAR algorithms previously developed by us. It facilitates the application of SVR, DVAR and SVAR beyond the mathematical formulas presented in the corresponding publications, and allows one to better understand the results by means of the available visualizations. Both running the statistical methods and visualizing the results are carried out within the graphical user interface, rendering these algorithms accessible to the broad community of researchers in molecular biology.