12 results for data-types
in the Biblioteca Digital da Produção Intelectual da Universidade de São Paulo
Abstract:
Background: A current challenge in gene annotation is to define gene function in the context of a network of relationships rather than for single genes. The inference of gene networks (GNs) has emerged as an approach to better understand the biology of the system and to study how the components of the network interact with each other and keep their functions stable. However, in general there are not enough data to accurately recover GNs from expression levels alone, leading to the curse of dimensionality, in which the number of variables is higher than the number of samples. One way to mitigate this problem is to integrate biological data instead of using only the expression profiles in the inference process. The use of several sources of biological information in inference methods has increased significantly in recent years, in order to better recover the connections between genes and to reduce false positives. What makes this strategy so interesting is the possibility of confirming known connections through the included biological data, and of discovering new relationships between genes when observing the expression data. Although several works on data integration have increased the performance of network inference methods, the real contribution of each type of biological information to the obtained improvement is not clear. Methods: We propose a methodology to include biological information in an inference algorithm in order to assess the prediction gain obtained by using biological information and expression profiles together. We also evaluated and compared the gain from adding four types of biological information: (a) protein-protein interaction, (b) Rosetta stone fusion proteins, (c) KEGG and (d) KEGG+GO. Results and conclusions: This work presents a first comparison of the gain from using prior biological information in the inference of GNs for a eukaryotic organism (P. falciparum). Our results indicate that information based on direct interaction can produce a higher gain than data about less specific relationships, such as GO or KEGG. Also, as expected, the results show that the use of biological information is a very important approach for improving the inference. We also compared the gain in the inference of the global network and of only the hubs. The results indicate that the use of biological information can improve the identification of the most connected proteins.
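A minimal sketch of the general idea described above, not the authors' specific algorithm: an expression-derived dependency measure (here, mutual information between discretized profiles) is combined with a binary prior-knowledge matrix such as protein-protein interactions. The weighting scheme, data and parameter names are hypothetical placeholders.

```python
# Illustrative sketch only: combines an expression-derived dependency score
# with a binary prior-knowledge matrix (e.g., protein-protein interactions).
# Weighting scheme and data are hypothetical, not the paper's algorithm.
import numpy as np
from sklearn.metrics import mutual_info_score

def infer_network(expr, prior, alpha=0.3, top_k=5):
    """expr: genes x samples matrix (discretized); prior: genes x genes 0/1 matrix."""
    n_genes = expr.shape[0]
    score = np.zeros((n_genes, n_genes))
    for i in range(n_genes):
        for j in range(n_genes):
            if i == j:
                continue
            mi = mutual_info_score(expr[i], expr[j])              # expression evidence
            score[i, j] = (1 - alpha) * mi + alpha * prior[i, j]  # add prior evidence
    # Keep the top_k highest-scoring regulators per target gene.
    network = np.zeros_like(score, dtype=bool)
    for j in range(n_genes):
        regulators = np.argsort(score[:, j])[::-1][:top_k]
        network[regulators, j] = True
    return network

# Example with random data: 20 genes, 15 samples (curse of dimensionality),
# expression discretized into 3 levels, and a sparse random prior.
rng = np.random.default_rng(0)
expr = rng.integers(0, 3, size=(20, 15))
prior = (rng.random((20, 20)) < 0.05).astype(float)
net = infer_network(expr, prior)
print(net.sum(), "predicted edges")
```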
Abstract:
Each plasma physics laboratory has a proprietary control and data acquisition scheme. Usually it differs from one laboratory to another, which means that each laboratory has its own way of controlling the experiment and retrieving data from the database. Fusion research relies to a great extent on international collaboration, and these proprietary systems make it difficult to follow the work remotely. The TCABR data analysis and acquisition system has been upgraded to support a joint research programme using remote participation technologies. The choice of MDSplus (Model Data System plus) is supported by the fact that it is widely used, so scientists from different institutions may use the same system in different experiments and different tokamaks without needing to know how each system handles its data acquisition and analysis. Another important point is that MDSplus has a library system that allows communication between different programming languages (Java, Fortran, C, C++, Python) and programs such as MATLAB, IDL and OCTAVE. In the case of the TCABR tokamak, interfaces (the object of this paper) between the system already in use and MDSplus were developed, instead of using MDSplus at all stages, from control and data acquisition to data analysis. This was done to preserve a complex system already in operation, which would otherwise take a long time to migrate. This implementation also allows new components to be added using MDSplus fully at all stages. (c) 2012 Elsevier B.V. All rights reserved.
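A minimal sketch of how remote participation with MDSplus typically works, using the MDSplus thin-client Python interface. The server address, tree name, shot number and node path below are hypothetical placeholders, not the actual TCABR configuration.

```python
# Illustrative sketch of remote data access through the MDSplus thin-client
# Python interface. Server, tree name, shot number and node path are
# hypothetical placeholders; the real TCABR setup is not described here.
from MDSplus import Connection

conn = Connection('mdsplus.example-lab.org')   # hypothetical remote data server
conn.openTree('tcabr', 12345)                  # tree name and shot number are examples

# Retrieve a signal and its time base using TDI expressions.
density = conn.get(r'\TOP.DIAGNOSTICS:DENSITY').data()
time = conn.get(r'dim_of(\TOP.DIAGNOSTICS:DENSITY)').data()

print(len(time), "samples retrieved")
```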
Abstract:
Categorical data cannot be interpolated directly because they are outcomes of discrete random variables. Thus, types of categorical variables are transformed into indicator functions that can be handled by interpolation methods. Interpolated indicator values are then backtransformed to the original types of categorical variables. However, aspects such as variability and uncertainty of interpolated values of categorical data have never been considered. In this paper we show that the interpolation variance can be used to map an uncertainty zone around boundaries between types of categorical variables. Moreover, it is shown that the interpolation variance is a component of the total variance of the categorical variables, as measured by the coefficient of unalikeability. (C) 2011 Elsevier Ltd. All rights reserved.
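A minimal sketch of the indicator approach described above, using inverse-distance weighting as a stand-in interpolator (the abstract does not name the paper's interpolation method): it shows the indicator transform, the back-transform to the most probable category, and an interpolation variance per category. All data values are placeholders.

```python
# Illustrative sketch: indicator transform of categorical data, interpolation of
# the indicators, back-transform to the most probable category, and an
# interpolation variance. Inverse-distance weighting is a stand-in interpolator,
# not necessarily the method used in the paper.
import numpy as np

def interpolate_categorical(coords, categories, target, power=2.0):
    cats = np.unique(categories)
    # Indicator transform: one 0/1 column per category.
    indicators = (categories[:, None] == cats[None, :]).astype(float)
    # Inverse-distance weights from the sample points to the target location.
    d = np.linalg.norm(coords - target, axis=1)
    w = 1.0 / np.maximum(d, 1e-12) ** power
    w /= w.sum()
    # Interpolated indicator values behave like probabilities of each category.
    p = w @ indicators
    # Interpolation variance per category: weighted squared deviation of the
    # data indicators from the interpolated value.
    var = w @ (indicators - p) ** 2
    best = cats[np.argmax(p)]          # back-transform: most probable category
    return best, p, var

coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
categories = np.array(['sand', 'sand', 'clay', 'clay'])
label, probs, variances = interpolate_categorical(coords, categories, np.array([0.5, 0.5]))
print(label, probs, variances)
```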
Abstract:
An annotated list of the type specimens of Lygistorrhinidae and Mycetophilidae (Diptera: Bibionomorpha) at the KwaZulu-Natal Museum, Pietermaritzburg, South Africa is provided. Information on 54 type specimens, three lygistorrhinids and 51 mycetophilids, with details of their labels and current state of preservation, is furnished. Locality data are georeferenced and habitus images of type specimens are provided.
Abstract:
Objective: To evaluate the Vickers hardness of different acrylic resins for denture bases with and without the addition of glass fibres. Background: It has been suggested that different polymerisation methods, as well as the addition of glass fibre, might improve the hardness of acrylic resin. Materials and methods: Five types of acrylic resin were tested: Vipi Wave (VW), microwave polymerisation; Vipi Flash (VF), auto-polymerisation; Lucitone (LT), QC20 (QC) and Vipi Cril (VC), conventional heat polymerisation; all with or without glass fibre reinforcement (GFR), distributed into 10 groups (n = 12). Specimens were then submitted to Vickers hardness testing with a 25-g load for 30 s. All data were submitted to ANOVA and Tukey's HSD test. Results: A statistically significant difference was observed with regard to the polymerisation method and the GFR (p < 0.05). Without GFR, the acrylic resin VC presented the highest hardness values, and VF and LT presented the lowest. With GFR, VC resin still presented the highest Vickers hardness values, and VF and QC presented the lowest. Conclusions: The acrylic resins VC and VW presented higher hardness values than the VF and QC resins. Moreover, GFR increased the Vickers hardness of resins VW, VC and LT.
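A minimal sketch of the statistical analysis reported above (one-way ANOVA followed by Tukey's HSD), using scipy and statsmodels. The group names reuse the resin abbreviations, but all hardness values are hypothetical placeholders, not the study's measurements.

```python
# Illustrative sketch of the reported analysis: one-way ANOVA followed by
# Tukey's HSD on Vickers hardness values. The numbers below are hypothetical
# placeholders, not the study's data.
import numpy as np
from scipy import stats
from statsmodels.stats.multicomp import pairwise_tukeyhsd

rng = np.random.default_rng(1)
groups = {
    'VC': rng.normal(20.0, 1.0, 12),   # conventional heat polymerisation
    'VW': rng.normal(18.5, 1.0, 12),   # microwave polymerisation
    'VF': rng.normal(16.0, 1.0, 12),   # auto-polymerisation
}

# One-way ANOVA across the groups.
f_stat, p_value = stats.f_oneway(*groups.values())
print(f"ANOVA: F = {f_stat:.2f}, p = {p_value:.4f}")

# Tukey's HSD post-hoc test to locate which pairs of groups differ.
values = np.concatenate(list(groups.values()))
labels = np.repeat(list(groups.keys()), [len(v) for v in groups.values()])
print(pairwise_tukeyhsd(values, labels, alpha=0.05))
```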
Abstract:
Aerobic exercise training (ET) has been established as an important non-pharmacological treatment for hypertension, since it decreases blood pressure. Studies show that the skeletal muscle abnormalities in hypertension are directly associated with capillary rarefaction, a higher percentage of fast-twitch (type II) fibers with a predominance of glycolytic metabolism, and increased muscular fatigue. However, little is known about the effect of ET on these parameters in hypertension. We hypothesized that ET corrects capillary rarefaction, potentially contributing to the restoration of the proportion of muscle fiber types and of metabolic properties. Twelve-week-old spontaneously hypertensive rats (SHR, n=14) and Wistar Kyoto rats (WKY, n=14) were randomly assigned to 4 groups: SHR, trained SHR (SHR-T), WKY and trained WKY (WKY-T). As expected, ten weeks of ET were effective in reducing blood pressure in the SHR-T group. In addition, we analyzed the main markers of ET. Resting bradycardia and increases in exercise tolerance, peak oxygen uptake and citrate synthase enzyme activity in the trained groups (WKY-T and SHR-T) showed that aerobic conditioning was achieved. ET also corrected the skeletal muscle capillary rarefaction in SHR-T. In parallel, we observed a reduction in the percentage of type IIA and IIX fibers and a simultaneous increase in the percentage of type I fibers induced by ET in hypertension. These data suggest that ET prevented changes in soleus fiber type composition in SHR, since increased angiogenesis and oxidative enzyme activity are important adaptations to ET, acting in the maintenance of muscle oxidative metabolism and fiber profile.
Abstract:
Traditional supervised data classification considers only physical features (e.g., distance or similarity) of the input data. Here, this type of learning is called low level classification. On the other hand, the human (animal) brain performs both low and high orders of learning and is adept at identifying patterns according to the semantic meaning of the input data. Data classification that considers not only physical attributes but also the pattern formation is, here, referred to as high level classification. In this paper, we propose a hybrid classification technique that combines both types of learning. The low level term can be implemented by any classification technique, while the high level term is realized by the extraction of features of the underlying network constructed from the input data. Thus, the former classifies the test instances by their physical features or class topologies, while the latter measures the compliance of the test instances with the pattern formation of the data. Our study shows that the proposed technique not only can realize classification according to the pattern formation, but is also able to improve the performance of traditional classification techniques. Furthermore, as the complexity of the class configuration increases, for example through the mixture among different classes, a larger portion of the high level term is required to obtain correct classification. This feature confirms that high level classification has a special importance in complex classification situations. Finally, we show how the proposed technique can be employed in a real-world application, where it is capable of identifying variations and distortions of handwritten digit images. As a result, it supplies an improvement in the overall pattern recognition rate.
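A minimal sketch of the hybrid idea under simplifying assumptions: the low level term is an ordinary kNN class probability, and the high level term is approximated here by how little a test point changes the average degree of an epsilon-radius graph built for each class. Both the network measure and the mixing rule are illustrative choices, not the authors' exact formulation, and all data are synthetic.

```python
# Illustrative sketch of a hybrid low/high level classifier. Low level: kNN
# class probability. High level: the class whose epsilon-radius graph changes
# least (in average degree) when the test point is inserted is favoured.
# Graph measure, mixing rule and data are illustrative, not the paper's method.
import numpy as np

def average_degree(points, eps):
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    adj = (d < eps) & ~np.eye(len(points), dtype=bool)
    return adj.sum(axis=1).mean()

def hybrid_predict(X, y, x_new, k=3, eps=1.5, lam=0.5):
    classes = np.unique(y)
    # Low level term: fraction of the k nearest neighbours in each class.
    nn = np.argsort(np.linalg.norm(X - x_new, axis=1))[:k]
    low = np.array([(y[nn] == c).mean() for c in classes])
    # High level term: pattern compliance, approximated by the (inverse of the)
    # change in average degree when x_new is inserted into each class graph.
    high = []
    for c in classes:
        pts = X[y == c]
        change = abs(average_degree(np.vstack([pts, x_new]), eps) - average_degree(pts, eps))
        high.append(1.0 / (1.0 + change))
    high = np.array(high)
    high /= high.sum()
    scores = (1 - lam) * low + lam * high   # lam controls the high level portion
    return classes[np.argmax(scores)]

rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0, 1, (30, 2)), rng.normal(4, 1, (30, 2))])
y = np.array([0] * 30 + [1] * 30)
print(hybrid_predict(X, y, np.array([3.5, 3.5])))
```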
Abstract:
In this paper we discuss the detection of glucose and triglycerides using information visualization methods to process impedance spectroscopy data. The sensing units contained either lipase or glucose oxidase immobilized in layer-by-layer (LbL) films deposited onto interdigitated electrodes. The optimization consisted in identifying which part of the electrical response and which combination of sensing units yielded the best distinguishing ability. It is shown that complete separation can be obtained for a range of concentrations of glucose and triglycerides when the interactive document map (IDMAP) technique is used to project the data into a two-dimensional plot. Most importantly, the optimization procedure can be extended to other types of biosensors, thus increasing the versatility of analysis provided by tailored molecular architectures exploited with various detection principles. (C) 2012 Elsevier B.V. All rights reserved.
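A minimal sketch of the visualization step: each sensing unit's impedance spectrum is treated as a high-dimensional feature vector and projected to two dimensions so that concentrations can be visually separated. IDMAP itself is not available in common libraries, so classical multidimensional scaling from scikit-learn is used here as a stand-in projection; the spectra and concentration values are synthetic placeholders.

```python
# Illustrative sketch of projecting impedance spectra to 2-D for visual
# separation of analyte concentrations. IDMAP is replaced by multidimensional
# scaling (MDS) as a stand-in; spectra and concentrations are synthetic.
import numpy as np
from sklearn.manifold import MDS

rng = np.random.default_rng(3)
freqs = np.logspace(1, 6, 50)                        # 10 Hz to 1 MHz sweep

def fake_spectrum(concentration):
    # Toy capacitance-like response that shifts with analyte concentration.
    return 1.0 / (1.0 + (freqs * 1e-5 * (1 + concentration)) ** 2) + rng.normal(0, 0.01, freqs.size)

concentrations = np.repeat([0.0, 0.5, 1.0, 2.0], 5)  # placeholder values
spectra = np.array([fake_spectrum(c) for c in concentrations])

coords = MDS(n_components=2, random_state=0).fit_transform(spectra)
for c in np.unique(concentrations):
    pts = coords[concentrations == c]
    print(f"concentration {c}: centroid at {pts.mean(axis=0)}")
```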
Abstract:
A procedure has been proposed by Ciotti and Bricaud (2006) to retrieve spectral absorption coefficients of phytoplankton and colored detrital matter (CDM) from satellite radiance measurements. This was also the first procedure to estimate a size factor for phytoplankton, based on the shape of the retrieved algal absorption spectrum, and the spectral slope of CDM absorption. Applying this method to the global ocean color data set acquired by SeaWiFS over twelve years (1998-2009) allowed for a comparison of the spatial variations of chlorophyll concentration ([Chl]), algal size factor (S-f), CDM absorption coefficient (a(cdm)) at 443 nm, and spectral slope of CDM absorption (S-cdm). As expected, correlations between the derived parameters were characterized by a large scatter at the global scale. We compared the temporal variability of the spatially averaged parameters over the twelve-year period for three oceanic areas of biogeochemical importance: the Eastern Equatorial Pacific, the North Atlantic and the Mediterranean Sea. In all areas, both S-f and a(cdm)(443) showed large seasonal and interannual variations, generally correlated with those of algal biomass. The CDM maxima appeared on some occasions to last longer than those of [Chl]. The spectral slope of CDM absorption showed very large seasonal cycles consistent with photobleaching, challenging the assumption of a constant slope commonly used in bio-optical models. In the Equatorial Pacific, the seasonal cycles of [Chl], S-f, a(cdm)(443) and S-cdm, as well as the relationships between these parameters, were strongly affected by the 1997-98 El Niño/La Niña event.
Abstract:
Background One goal of gene expression profiling is to identify signature genes that robustly distinguish different types or grades of tumors. Several tumor classifiers based on expression profiling have been proposed using the microarray technique. Due to important differences in the probabilistic models of microarray and SAGE technologies, it is important to develop suitable techniques to select specific genes from SAGE measurements. Results A new framework to select specific genes that distinguish different biological states based on the analysis of SAGE data is proposed. The new framework applies the bolstered error to identify strong genes that separate the biological states in a feature space defined by the gene expression of a training set. Credibility intervals defined from a probabilistic model of SAGE measurements are used to identify the genes that distinguish the different states with more reliability among all gene groups selected by the strong genes method. A score that takes into account the credibility and the bolstered error values is proposed in order to rank the considered groups of genes. Results obtained using SAGE data from gliomas are presented, corroborating the introduced methodology. Conclusion The model representing counting data, such as SAGE, provides additional statistical information that allows a more robust analysis. This additional statistical information provided by the probabilistic model is incorporated in the methodology described in the paper. The introduced method is suitable for identifying signature genes that lead to a good separation of the biological states using SAGE and may be adapted for other counting methods such as Massively Parallel Signature Sequencing (MPSS) or the recent Sequencing-By-Synthesis (SBS) technique. Some of the genes identified by the proposed method may be useful for generating classifiers.
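A minimal sketch of the bolstered resubstitution error idea used to rank gene pairs: each training sample is replaced by a Gaussian "bolstering" kernel, and the error is the kernel mass falling on the wrong side of a simple linear classifier, estimated here by Monte Carlo. The classifier, kernel width and data are simplifying placeholders, not the paper's exact procedure (which also incorporates credibility intervals from the SAGE counting model).

```python
# Illustrative sketch of bolstered resubstitution error for ranking a gene pair.
# Each training point is replaced by a Gaussian kernel and the error is the
# kernel mass misclassified by a linear (LDA) classifier, estimated by Monte
# Carlo. Kernel width, classifier and data are simplifying placeholders.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def bolstered_error(X, y, sigma=0.5, n_mc=200, seed=0):
    rng = np.random.default_rng(seed)
    clf = LinearDiscriminantAnalysis().fit(X, y)
    errors = []
    for xi, yi in zip(X, y):
        samples = rng.normal(xi, sigma, size=(n_mc, X.shape[1]))  # bolstering kernel
        errors.append((clf.predict(samples) != yi).mean())
    return float(np.mean(errors))

# Two hypothetical genes measured (as normalized counts) in two biological states.
rng = np.random.default_rng(4)
X = np.vstack([rng.normal([2, 5], 1.0, (15, 2)), rng.normal([5, 2], 1.0, (15, 2))])
y = np.array([0] * 15 + [1] * 15)
print("bolstered resubstitution error:", bolstered_error(X, y))
```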
Abstract:
The objective of this study was to review mortality from external causes (accidental injury) in children and adolescents in systematically selected journals. This was a systematic review of the literature on mortality from accidental injury in children and adolescents. We searched the PubMed, Latin American and Caribbean Health Sciences and Excerpta Medica databases for articles published between July of 2001 and June of 2011. National data from official agencies, retrieved by manual searches, were also reviewed. We reviewed 15 journal articles, the 2011 edition of a National Safety Council publication and 2010 statistical data from the Brazilian National Ministry of Health Mortality Database. Most published data were related to high-income countries. Mortality from accidental injury was highest among children less than 1 year of age. Accidental threats to breathing (non-drowning threats) constituted the leading cause of death in this age group in the published articles. Across the pediatric age group in the surveyed studies, traffic accidents were the leading cause of death, followed by accidental drowning and submersion. Traffic accidents constitute the leading external cause of accidental death among children in the countries under study. However, infants were vulnerable to external causes, particularly to accidental non-drowning threats to breathing, and this age group had the highest mortality rates from external causes. Actions to reduce such events are suggested. Further studies investigating the occurrence of accidental deaths in low-income countries are needed to improve the understanding of these preventable events.
Abstract:
Background The study and analysis of gene expression measurements is the primary focus of functional genomics. Once expression data are available, biologists are faced with the task of extracting (new) knowledge associated with the underlying biological phenomenon. Most often, in order to perform this task, biologists execute a number of analysis activities on the available gene expression dataset rather than a single analysis activity. The integration of heterogeneous tools and data sources to create an integrated analysis environment represents a challenging and error-prone task. Semantic integration enables the assignment of unambiguous meanings to data shared among different applications in an integrated environment, allowing the exchange of data in a semantically consistent and meaningful way. This work aims at developing an ontology-based methodology for the semantic integration of gene expression analysis tools and data sources. The proposed methodology relies on software connectors to support not only the access to heterogeneous data sources but also the definition of transformation rules on exchanged data. Results We have studied the different challenges involved in the integration of computer systems and the role software connectors play in this task. We have also studied a number of gene expression technologies, analysis tools and related ontologies in order to devise basic integration scenarios and propose a reference ontology for the gene expression domain. We have then defined a number of activities and associated guidelines to prescribe how the development of connectors should be carried out. Finally, we have applied the proposed methodology to the construction of three different integration scenarios involving the use of different tools for the analysis of different types of gene expression data. Conclusions The proposed methodology facilitates the development of connectors capable of semantically integrating different gene expression analysis tools and data sources. The methodology can be used in the development of connectors supporting both simple and nontrivial processing requirements, thus ensuring accurate data exchange and correct interpretation of the exchanged data.
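A minimal sketch of the connector idea under stated assumptions: a connector receives records from a (hypothetical) source tool, maps source-specific field names and values onto shared ontology terms via declarative transformation rules, and hands the annotated result on to a consumer. Every name below is illustrative; this is not the paper's reference ontology, tools or connector design.

```python
# Illustrative sketch of an ontology-based software connector: it maps
# source-specific field names and values onto shared ontology terms using
# declarative transformation rules before passing the data on. All names are
# hypothetical; this is not the paper's reference ontology or tools.
from dataclasses import dataclass
from typing import Callable

@dataclass
class TransformationRule:
    source_field: str                 # field name used by the data source
    ontology_term: str                # shared term the integrated environment expects
    convert: Callable = lambda v: v   # optional value conversion

class Connector:
    def __init__(self, rules):
        self.rules = rules

    def translate(self, record):
        # Build a semantically annotated record keyed by ontology terms.
        return {r.ontology_term: r.convert(record[r.source_field])
                for r in self.rules if r.source_field in record}

# Hypothetical rules: a microarray analysis tool exports "gene_symbol" and
# "log_ratio", while the integrated environment expects ontology terms.
rules = [
    TransformationRule("gene_symbol", "GeneProduct:symbol"),
    TransformationRule("log_ratio", "Expression:fold_change", convert=lambda v: 2 ** v),
]
connector = Connector(rules)
print(connector.translate({"gene_symbol": "TP53", "log_ratio": 1.0}))
```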