16 results for Data-Intensive Science
in the Biblioteca Digital da Produção Intelectual da Universidade de São Paulo (Digital Library of the Intellectual Production of the University of São Paulo)
Abstract:
Current scientific applications produce large amounts of data, and the processing, handling and analysis of such data require large-scale computing infrastructures such as clusters and grids. Studies in this area aim at improving the performance of data-intensive applications by optimizing data accesses. To achieve this goal, distributed storage systems have adopted techniques of data replication, migration, distribution, and access parallelism. However, the main drawback of those studies is that they do not take application behavior into account when performing data access optimization. This limitation motivated this paper, which applies strategies to support the online prediction of application behavior in order to optimize data access operations on distributed systems, without requiring any information on past executions. To accomplish this goal, the approach organizes application behaviors as time series and then analyzes and classifies those series according to their properties. Based on these properties, the approach selects modeling techniques to represent the series and perform predictions, which are later used to optimize data access operations. The new approach was implemented and evaluated using the OptorSim simulator, sponsored by the LHC-CERN project and widely employed by the scientific community. Experiments confirm that the approach reduces application execution time by about 50 percent, especially when handling large amounts of data.
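The classify-then-predict step described above can be sketched as follows. This is an illustrative assumption, not the paper's actual classifier or model set: a simple least-squares slope heuristic (with an arbitrary 0.1 threshold) labels a series, and the label selects between linear extrapolation and a stationary mean.

```python
# Hypothetical sketch of online behavior prediction: classify the series by a
# slope heuristic, then pick a predictor. Threshold 0.1 is an assumption.
def classify(series):
    """Label a series 'trending' or 'stationary' from its least-squares slope."""
    n = len(series)
    mean_x = (n - 1) / 2
    mean_y = sum(series) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(series))
    var = sum((x - mean_x) ** 2 for x in range(n))
    slope = cov / var
    return ("trending", slope) if abs(slope) > 0.1 else ("stationary", slope)

def predict_next(series):
    """Select a model according to the series class and predict the next value."""
    label, slope = classify(series)
    if label == "trending":
        return series[-1] + slope          # linear extrapolation
    return sum(series) / len(series)       # mean of a stationary series
```

A monotone series such as `[1, 2, 3, 4, 5]` is labeled trending and extrapolated to 6.0; a flat series falls back to its mean.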
Abstract:
Background: The hypothalamus plays a pivotal role in numerous mechanisms highly relevant to the maintenance of body homeostasis, such as the control of food intake and energy expenditure. Impairment of these mechanisms has been associated with the metabolic disturbances involved in the pathogenesis of obesity. Since rodent species constitute important models for metabolism studies and the rat hypothalamus is poorly characterized by proteomic strategies, we performed experiments aimed at constructing a two-dimensional gel electrophoresis (2-DE) profile of rat hypothalamus proteins. Results: As a first step, we established the best conditions for tissue collection and protein extraction, quantification and separation. The extraction buffer composition selected for proteome characterization of rat hypothalamus was urea 7 M, thiourea 2 M, CHAPS 4%, Triton X-100 0.5%, followed by a precipitation step with chloroform/methanol. Two-dimensional (2-D) gels of hypothalamic extracts from four-month-old rats were analyzed; the protein spots were digested and identified using tandem mass spectrometry and database queries with the MASCOT protein search engine. Eighty-six hypothalamic proteins were identified, the majority of which were classified as participating in metabolic processes, consistent with the finding of a large number of proteins with catalytic activity. Genes encoding proteins identified in this study have been related to obesity development. Conclusion: The present results indicate that the 2-DE technique will be useful for nutritional studies focusing on hypothalamic proteins. The data presented herein will serve as a reference database for studies testing the effects of dietary manipulations on the hypothalamic proteome. We trust that these experiments will lead to important knowledge on protein targets of nutritional variables potentially able to affect the complex central nervous system control of energy homeostasis.
Abstract:
Last Glacial Maximum simulated sea surface temperatures from the Paleo-Climate version of the National Center for Atmospheric Research Coupled Climate Model (NCAR-CCSM) are compared with available reconstructions and data-based products in the tropical and South Atlantic region. Model results are compared to data proxies from the Multiproxy Approach for the Reconstruction of the Glacial Ocean surface (MARGO) product. Results show that the model sea surface temperature is not consistent with the proxy data throughout the region of interest. Discrepancies are found in the eastern, equatorial and high-latitude South Atlantic. The model overestimates the cooling in the southern South Atlantic (near 50°S) shown by the proxy data. Near the equator, model and proxies are in better agreement. In the eastern part of the equatorial basin the model underestimates the cooling shown by all proxies. A northward shift in the position of the subtropical convergence zone in the simulation suggests a compression and/or an equatorward shift of the subtropical gyre at the surface, consistent with what is observed in the proxy reconstruction.
Abstract:
The impact of petroleum contamination on macrobenthic communities in the northeast portion of Todos os Santos Bay was assessed by combining, in multivariate analyses, chemical parameters such as aliphatic and polycyclic aromatic hydrocarbon (PAH) indices and concentration ratios with benthic ecological parameters. Sediment samples were taken in August 2000 with a 0.05 m² van Veen grab at 28 sampling locations. The predominance of n-alkanes with more than 24 carbons, CPI values close to one, and the fact that most of the stations showed ratios of unresolved complex mixture to resolved aliphatic hydrocarbons (UCM:R) higher than two indicated a high degree of anthropogenic contribution, the presence of terrestrial plant detritus and petroleum products, and evidence of chronic oil pollution. The indices used to determine the origin of PAH indicated a petrogenic contribution. A pyrolytic contribution constituted mainly by fossil fuel combustion-derived PAH was also observed. The results of the stepwise multiple regression analysis performed with chemical data and benthic ecological descriptors demonstrated that not only total PAH concentrations but also specific concentration ratios or indices, such as ≥C24:<C24, An/178 and Fl/(Fl + Py), determine the structure of benthic communities within the study area. According to the BIO-ENV results, petroleum-related variables seemed to have the main influence on macrofauna community structure. The PCA ordination performed with the chemical data resulted in three groups of stations. The decrease in macrofauna density, number of species and diversity from group III to group I seemed to be related to high aliphatic hydrocarbon and PAH concentrations associated with fine sediments. Our results showed that macrobenthic communities in the northeast portion of Todos os Santos Bay are subjected to chronic oil pollution, as reflected by the reduction in the number of species and diversity.
These results emphasise the importance of combining, in multivariate approaches, not only total hydrocarbon concentrations but also indices, isomer pair ratios and specific compound concentrations with biological data to improve the assessment of anthropogenic impact on marine ecosystems.
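Two of the diagnostic indices cited above can be illustrated in a few lines. These are simplified, assumed formulas (the Carbon Preference Index in particular has several variants in the literature), not necessarily the exact expressions used in the study:

```python
# Illustrative versions of two hydrocarbon source indices; exact formulas
# vary by author, so these are assumptions for demonstration only.
def cpi(alkanes):
    """Carbon Preference Index as a simple odd/even n-alkane abundance ratio.
    alkanes: dict mapping carbon number -> concentration."""
    odd = sum(c for n, c in alkanes.items() if n % 2 == 1)
    even = sum(c for n, c in alkanes.items() if n % 2 == 0)
    return odd / even

def ucm_ratio(ucm, resolved):
    """UCM:R - unresolved complex mixture over resolved aliphatics;
    per the abstract, values above ~2 suggest chronic oil pollution."""
    return ucm / resolved
```

A CPI close to one (no odd-carbon preference) plus a UCM:R above two is the combination the abstract reads as a degraded-petroleum signature.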
Abstract:
Octopus vulgaris is a cephalopod species found in several oceans and commonly caught by artisanal and industrial fisheries. In Brazil, O. vulgaris populations are mainly distributed along the southern coast and have been subjected to intensive fishing in recent years. Despite the importance of this marine resource, no genetic study had been carried out to examine genetic differences among populations along the coast of Brazil. In this study, 343 individuals collected by commercial vessels were genotyped at six microsatellite loci to investigate the genetic differences in O. vulgaris populations along the southern coast of Brazil. Genetic structure and levels of differentiation among sampling sites were estimated via a genotype assignment test and F-statistics. Our results indicate that the O. vulgaris stock consists of four genetic populations, with an overall significant F_ST-analogous value (Φ_CT = 0.10710, P < 0.05). Genetic diversity was high, with an observed heterozygosity of H_o = 0.987. The negative F_IS values found for most of the loci examined suggest a possible bottleneck process. These findings are important for further steps toward more sustainable octopus fisheries, so that this marine resource can be preserved for long-term utilization.
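The heterozygosity and F_IS summaries reported above can be computed from genotype data along these textbook lines (a sketch of the standard definitions, not the study's actual analysis pipeline):

```python
# Standard population-genetic summaries; genotypes are (allele, allele) pairs.
def heterozygosities(genotypes):
    """Return (observed, expected) heterozygosity for one locus."""
    n = len(genotypes)
    ho = sum(1 for a, b in genotypes if a != b) / n   # observed: heterozygote fraction
    counts = {}
    for a, b in genotypes:
        counts[a] = counts.get(a, 0) + 1
        counts[b] = counts.get(b, 0) + 1
    total = 2 * n
    he = 1 - sum((c / total) ** 2 for c in counts.values())  # expected (Hardy-Weinberg)
    return ho, he

def f_is(ho, he):
    """Inbreeding coefficient F_IS = 1 - Ho/He; negative values mean
    an excess of heterozygotes, as reported for most loci above."""
    return 1 - ho / he
```

For example, a sample in which every individual is heterozygous at a two-allele locus gives Ho = 1.0, He = 0.5 and F_IS = -1.0.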
Abstract:
We report new archeointensity data obtained from the analyses of baked clay elements (architectural and kiln brick fragments) sampled in Southeast Brazil and historically and/or archeologically dated between the end of the XVIth century and the beginning of the XXth century AD. The results were determined using the classical Thellier and Thellier protocol as modified by Coe, including partial thermoremanent magnetization (pTRM) and pTRM-tail checks, and the Triaxe protocol, which involves continuous high-temperature magnetization measurements. In both protocols, TRM anisotropy and cooling-rate TRM dependence effects were taken into account for the intensity determinations, which were successfully performed for 150 specimens from 43 fragments, with good agreement between the intensity results obtained from the two procedures. Nine site-mean intensity values were derived from three to eight fragments each and defined with standard deviations of less than 8%. The site-mean values vary from ~25 μT to ~42 μT and describe in Southeast Brazil a continuous decreasing trend of ~5 μT per century between ~1600 AD and ~1900 AD. Their comparison with recent archeointensity results obtained from Northeast Brazil and reduced to a common latitude shows that: (1) the geocentric axial dipole approximation is not valid between these southeastern and northeastern regions of Brazil, whose latitudes differ by ~10°, and (2) the available global geomagnetic field models (the gufm1 models, their recalibrated versions and the CALS3K models) are not sufficiently precise to reliably reproduce the non-dipole field effects which prevailed in Brazil for at least the 1600-1750 period. The large non-dipole contribution thus highlighted is most probably linked to the evolution of the South Atlantic Magnetic Anomaly (SAMA) during that period.
Furthermore, although our dataset is limited, the Brazilian archeointensity data appear to support the view of a rather oscillatory behavior of the axial dipole moment during the past three centuries, marked in particular by a moderate increase between the end of the XVIIIth century and the middle of the XIXth century, followed by the well-known decrease from 1840 AD attested by direct measurements.
Abstract:
The Amazonian lowlands include large patches of open vegetation that contrast sharply with the rainforest, and the origin of these patches has been debated. This study focuses on a large area of open vegetation in northern Brazil, where δ13C and, in some instances, C/N analyses of the organic matter preserved in late Quaternary sediments were used to achieve floristic reconstructions over time. The main goal was to determine when the modern open vegetation started to develop in this area. The δ13C data derived from nine cores range from -32.2 to -19.6‰, with nearly 60% of the data above -26.5‰. The most enriched values were detected only in ecotone and open vegetation areas. The development of open vegetation communities was asynchronous, varying between estimated ages of 6400 and 3000 cal a BP. This suggests that the origin of the studied patches of open vegetation might be linked to the sedimentary dynamics of a late Quaternary megafan system. As sedimentation ended, this vegetation type became established over the megafan surface. In addition, the data presented here show that the presence of C4 plants must be used carefully as a proxy for dry paleoclimatic episodes in Amazonian areas.
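The way δ13C values are read as vegetation signals can be sketched with simple cutoffs. The -26.5‰ figure appears in the abstract; the -20‰ boundary for a C4-dominated signal is an illustrative assumption, since published thresholds vary:

```python
# Rough interpretive sketch only: cutoff values for C3 vs C4 signals differ
# across the literature; -20.0 below is an assumed, illustrative boundary.
def vegetation_signal(d13c):
    """Classify a sedimentary organic-matter d13C value (per mil, VPDB)."""
    if d13c >= -20.0:
        return "C4-dominated (open vegetation)"
    if d13c <= -26.5:
        return "C3-dominated (forest)"
    return "mixed C3/C4"
```

Under these assumed cutoffs, the most enriched value in the cores (-19.6‰) reads as a C4 open-vegetation signal and the most depleted (-32.2‰) as forest.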
Abstract:
In this work, different methods to estimate thin film residual stresses from instrumented indentation data were analyzed. This study considered procedures proposed in the literature, as well as a modification of one of these methods and a new approach based on the effect of residual stress on the hardness value calculated via the Oliver and Pharr method. The analysis of these methods was centered on an axisymmetric two-dimensional finite element model, developed to simulate instrumented indentation testing of thin ceramic films deposited onto hard steel substrates. Simulations were conducted varying the level of film residual stress, film strain hardening exponent, film yield strength, and film Poisson's ratio. Different ratios of maximum penetration depth h_max over film thickness t were also considered, including h/t = 0.04, for which the contribution of the substrate to the mechanical response of the system is not significant. Residual stresses were then calculated following the procedures mentioned above and compared with the values used as input in the numerical simulations. In general, results indicate that the difference each method shows with respect to the input values depends on the conditions studied. The method by Suresh and Giannakopoulos consistently overestimated the values when stresses were compressive. The method proposed by Wang et al. showed less dependence on h/t than the others.
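The Oliver and Pharr hardness referenced above follows from the standard contact-depth relation. The sketch below assumes an ideal Berkovich tip (area function A = 24.5 h_c², ε = 0.75); real analyses use a calibrated area function:

```python
# Oliver-Pharr hardness for an ideal Berkovich indenter (assumed geometry):
# contact depth hc = h_max - epsilon * P_max / S, area A = 24.5 * hc^2.
def oliver_pharr_hardness(p_max, h_max, stiffness, epsilon=0.75):
    """p_max: peak load (mN), h_max: peak depth (nm),
    stiffness: unloading slope S (mN/nm). Returns hardness in mN/nm^2."""
    hc = h_max - epsilon * p_max / stiffness   # contact depth
    area = 24.5 * hc ** 2                      # projected contact area
    return p_max / area
```

Because a residual stress shifts the contact depth h_c, it perturbs the computed hardness, which is the effect the new approach above exploits.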
Abstract:
In [1], the authors proposed a framework for automated clustering and visualization of biological data sets named AUTO-HDS. This letter complements that framework by showing that it is possible to eliminate a user-defined parameter, in such a way that the clustering stage can be implemented more accurately and with reduced computational complexity.
Abstract:
The design of a network is a solution to several engineering and science problems. Several network design problems are known to be NP-hard, and population-based metaheuristics such as evolutionary algorithms (EAs) have been widely investigated for such problems. These optimization methods simultaneously generate a large number of potential solutions to explore the search space in breadth and, consequently, to avoid local optima. Obtaining a potential solution usually involves the construction and maintenance of several spanning trees or, more generally, spanning forests. To efficiently explore the search space, special data structures have been developed to provide operations that manipulate a set of spanning trees (a population). For a tree with n nodes, the most efficient data structures available in the literature require O(n) time to generate a new spanning tree that modifies an existing one and to store the new solution. We propose a new data structure, called the node-depth-degree representation (NDDR), and we demonstrate that, using this encoding, generating a new spanning forest requires O(√n) time on average. Experiments with an EA based on the NDDR applied to large-scale instances of the degree-constrained minimum spanning tree problem have shown that the implementation adds only small constants and lower-order terms to the theoretical bound.
Abstract:
Each plasma physics laboratory has its own proprietary control and data acquisition system, usually different from one laboratory to another: each laboratory has its own way of controlling the experiment and retrieving data from the database. Fusion research relies to a great extent on international collaboration, and these private systems make it difficult to follow the work remotely. The TCABR data analysis and acquisition system has been upgraded to support a joint research programme using remote participation technologies. The choice of MDSplus (Model Driven System plus) is justified by the fact that it is widely used, so scientists from different institutions may use the same system in different experiments on different tokamaks without needing to know how each system handles its data acquisition and analysis. Another important point is that MDSplus has a library system that allows communication between different languages (Java, Fortran, C, C++, Python) and programs such as MATLAB, IDL and OCTAVE. In the case of the TCABR tokamak, interfaces (the subject of this paper) were developed between the system already in use and MDSplus, instead of using MDSplus at all stages, from control and data acquisition to data analysis. This was done to preserve a complex system already in operation, which would otherwise take a long time to migrate. This implementation also allows new components to be added that use MDSplus fully at all stages.
Abstract:
Objectives. The null hypothesis was that mechanical testing systems used to determine polymerization stress (σ_pol) would rank a series of composites similarly. Methods. Two series of composites were tested in the following systems: a universal testing machine (UTM) using glass rods as the bonding substrate, a UTM using acrylic rods, a "low compliance device", and a single cantilever device ("Bioman"). One series comprised five experimental composites containing BisGMA:TEGDMA in equimolar concentrations and 60, 65, 70, 75 or 80 wt% of filler. The other series comprised five commercial composites: Filtek Z250 (3M ESPE), Filtek A110 (3M ESPE), Tetric Ceram (Ivoclar), Heliomolar (Ivoclar) and Point 4 (Kerr). Specimen geometry, dimensions and curing conditions were similar in all systems. σ_pol was monitored for 10 min. Volumetric shrinkage (VS) was measured in a mercury dilatometer, and elastic modulus (E) was determined by three-point bending. Shrinkage rate was used as a measure of reaction kinetics. ANOVA/Tukey tests were performed for each variable, separately for each series. Results. For the experimental composites, σ_pol decreased with filler content in all systems, following the variation in VS. For the commercial materials, σ_pol did not vary in the UTM/acrylic system and showed very few similarities in rankings across the other test systems. Also, no clear relationships were observed between σ_pol and VS or E. Significance. The testing systems showed good agreement for the experimental composites, but very few similarities for the commercial composites. Therefore, comparison of polymerization stress results from different devices must be done carefully.
Abstract:
Data visualization techniques are powerful tools for the handling and analysis of multivariate systems. One such technique, known as parallel coordinates, was used to support the diagnosis of an event, detected by a neural-network-based monitoring system, in a boiler at a Brazilian kraft pulp mill. Its attractiveness lies in the possibility of visualizing several variables simultaneously. The diagnostic procedure was carried out step by step, going through exploratory, explanatory, confirmatory, and communicative goals. This tool allowed the boiler dynamics to be visualized more easily than with the commonly used univariate trend plots. In addition, it facilitated the analysis of other aspects, namely relationships among process variables, distinct modes of operation, and discrepant data. The whole analysis revealed, firstly, that the period involving the detected event was associated with a transition between two distinct normal modes of operation and, secondly, the presence of unusual changes in process variables at that time.
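What a parallel-coordinates view computes can be sketched in a few lines: each variable becomes a vertical axis, min-max scaled so that every observation is a polyline across the axes. This is an illustrative sketch of the technique, not the mill's monitoring code:

```python
# Minimal parallel-coordinates preprocessing: min-max scale each variable
# (axis) so that one observation maps to a polyline of heights in [0, 1].
def parallel_coords(rows):
    """rows: list of equal-length numeric records -> normalized polylines."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    span = [max(c) - mn or 1.0 for c, mn in zip(cols, lo)]  # guard zero spread
    return [[(v - mn) / s for v, mn, s in zip(row, lo, span)] for row in rows]
```

Plotting these polylines (one per process sample, one axis per variable) is what makes distinct operating modes appear as separate bundles of lines.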
Abstract:
This work proposes a method for data clustering based on complex network theory. A data set is represented as a network by considering different metrics to establish the connection between each pair of objects. The clusters are obtained by taking into account five community detection algorithms. The network-based clustering approach is applied to two real-world databases and two sets of artificially generated data. The results suggest that the exponential of the Minkowski distance is the most suitable metric to quantify the similarity between pairs of objects. In addition, the community identification method based on greedy optimization provides the best cluster solution. We compare the network-based clustering approach with some traditional clustering algorithms and verify that it provides the lowest classification error rate.
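The similarity measure favored above, the exponential of the Minkowski distance, can be sketched as follows; the order p and the edge threshold below are illustrative assumptions, and community detection on the resulting network is left to a separate step:

```python
import math

def minkowski(u, v, p):
    """Minkowski distance of order p between two numeric vectors."""
    return sum(abs(a - b) ** p for a, b in zip(u, v)) ** (1.0 / p)

def similarity(u, v, p=2):
    """exp(-d): distance 0 maps to similarity 1, large distances to ~0."""
    return math.exp(-minkowski(u, v, p))

def build_edges(points, p=2, threshold=0.5):
    """Network construction: connect object pairs whose similarity exceeds
    the (assumed) threshold; community detection then runs on these edges."""
    n = len(points)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if similarity(points[i], points[j], p) > threshold]
```

With this construction, nearby objects (high similarity) become linked and distant ones do not, so clusters surface as network communities.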
Abstract:
Content-based image retrieval (CBIR) is still a challenging issue due to the inherent complexity of images and the choice of the most discriminant descriptors. Recent developments in the field have introduced multidimensional projections to boost accuracy in the retrieval process, but many issues, such as the introduction of pattern recognition tasks and deeper user intervention to assist the process of choosing the most discriminant features, remain unaddressed. In this paper, we present a novel framework for CBIR that combines pattern recognition tasks, class-specific metrics, and multidimensional projection to devise an effective and interactive image retrieval system. User interaction plays an essential role in the computation of the final multidimensional projection from which image retrieval is attained. Results have shown that the proposed approach outperforms existing methods, making it a very attractive alternative for managing image data sets.