897 results for structured dependency
Abstract:
Biomedical research is currently facing a new type of challenge: an excess of information, both in terms of raw data from experiments and in the number of scientific publications describing their results. Mirroring the focus on data mining techniques to address the issues of structured data, there has recently been great interest in the development and application of text mining techniques to make more effective use of the knowledge contained in biomedical scientific publications, accessible only in the form of natural human language. This thesis describes research done in the broader scope of projects aiming to develop methods, tools and techniques for text mining tasks in general and for the biomedical domain in particular. The work described here involves more specifically the goal of extracting information from statements concerning relations of biomedical entities, such as protein-protein interactions. The approach taken is one using full parsing (syntactic analysis of the entire structure of sentences) and machine learning, aiming to develop reliable methods that can further be generalized to apply also to other domains. The five papers at the core of this thesis describe research on a number of distinct but related topics in text mining. In the first of these studies, we assessed the applicability of two popular general English parsers to biomedical text mining and, finding their performance limited, identified several specific challenges to accurate parsing of domain text. In a follow-up study focusing on parsing issues related to specialized domain terminology, we evaluated three lexical adaptation methods. We found that the accurate resolution of unknown words can considerably improve parsing performance and introduced a domain-adapted parser that reduced the error rate of the original by 10% while also roughly halving parsing time.
To establish the relative merits of parsers that differ in the applied formalisms and the representation given to their syntactic analyses, we have also developed evaluation methodology, considering different approaches to establishing comparable dependency-based evaluation results. We introduced a methodology for creating highly accurate conversions between different parse representations, demonstrating the feasibility of unification of diverse syntactic schemes under a shared, application-oriented representation. In addition to allowing formalism-neutral evaluation, we argue that such unification can also increase the value of parsers for domain text mining. As a further step in this direction, we analysed the characteristics of publicly available biomedical corpora annotated for protein-protein interactions and created tools for converting them into a shared form, thus contributing also to the unification of text mining resources. The introduced unified corpora allowed us to perform a task-oriented comparative evaluation of biomedical text mining corpora. This evaluation established clear limits on the comparability of results for text mining methods evaluated on different resources, prompting further efforts toward standardization. To support this and other research, we have also designed and annotated BioInfer, the first domain corpus of its size combining annotation of syntax and biomedical entities with a detailed annotation of their relationships. The corpus represents a major design and development effort of the research group, with manual annotation that identifies over 6000 entities, 2500 relationships and 28,000 syntactic dependencies in 1100 sentences. In addition to combining these key annotations for a single set of sentences, BioInfer was also the first domain resource to introduce a representation of entity relations that is supported by ontologies and able to capture complex, structured relationships.
Part I of this thesis presents a summary of this research in the broader context of a text mining system, and Part II contains reprints of the five included publications.
Abstract:
Bayesian methods offer a flexible and convenient probabilistic learning framework to extract interpretable knowledge from complex and structured data. Such methods can characterize dependencies among multiple levels of hidden variables and share statistical strength across heterogeneous sources. In the first part of this dissertation, we develop two dependent variational inference methods for full posterior approximation in non-conjugate Bayesian models through hierarchical mixture- and copula-based variational proposals, respectively. The proposed methods move beyond the widely used factorized approximation to the posterior and provide generic applicability to a broad class of probabilistic models with minimal model-specific derivations. In the second part of this dissertation, we design probabilistic graphical models to accommodate multimodal data, describe dynamical behaviors and account for task heterogeneity. In particular, the sparse latent factor model is able to reveal common low-dimensional structures from high-dimensional data. We demonstrate the effectiveness of the proposed statistical learning methods on both synthetic and real-world data.
Abstract:
Universidade Estadual de Campinas. Faculdade de Educação Física
Abstract:
OBJECTIVE: The aim of this study was to translate the Structured Clinical Interview for Mood Spectrum into Brazilian Portuguese, measuring its reliability and validity and defining scores for bipolar disorders. METHOD: The questionnaire was translated into Brazilian Portuguese and back-translated into English. The sample consisted of 47 subjects with bipolar disorder, 47 with major depressive disorder (MDD), 18 with schizophrenia and 22 controls. Inter-rater reliability was tested in 20 subjects with bipolar disorder and MDD. Internal consistency was measured using the Kuder-Richardson formula. Forward stepwise discriminant analysis was performed. Scores were compared between groups; manic (M), depressive (D) and total (T) threshold scores were calculated through receiver operating characteristic (ROC) curves. RESULTS: Kuder-Richardson coefficients were between 0.86 and 0.94. The intraclass correlation coefficient was 0.96 (95% CI 0.93-0.97). Subjects with bipolar disorder had higher M and T scores, and similar D scores, when compared to those with major depressive disorder (ANOVA, p < 0.001). The sub-domains that best discriminated unipolar and bipolar subjects were manic energy and manic mood. M had the best area under the curve (0.909), and values of M equal to or greater than 30 yielded 91.5% sensitivity and 74.5% specificity. CONCLUSION: The Structured Clinical Interview for Mood Spectrum has good reliability and validity. A cutoff score of 30 or higher in the mania sub-domain is appropriate to help distinguish subjects with bipolar disorder from those with unipolar depression.
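As a rough illustration of how the reported sensitivity and specificity relate to the group sizes, the sketch below computes both from counts. The counts themselves (43 of 47 bipolar subjects at or above the cutoff, 35 of 47 MDD subjects below it) are not stated in the abstract; they are assumed here only because they are consistent with the reported 91.5% and 74.5%.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of true cases (bipolar) scoring at or above the cutoff."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of non-cases (MDD) scoring below the cutoff."""
    return true_neg / (true_neg + false_pos)


# Assumed counts (hypothetical, chosen to match the reported percentages).
bipolar_at_or_above = 43   # bipolar subjects with M >= 30
bipolar_below = 4          # bipolar subjects with M < 30
mdd_below = 35             # MDD subjects with M < 30
mdd_at_or_above = 12       # MDD subjects with M >= 30

sens = sensitivity(bipolar_at_or_above, bipolar_below)
spec = specificity(mdd_below, mdd_at_or_above)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

Raising the cutoff would typically trade sensitivity for specificity, which is why the threshold is chosen from the ROC curve rather than fixed in advance.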
Abstract:
Background: Microarray techniques have become an important tool for the investigation of genetic relationships and the assignment of different phenotypes. Since microarrays are still very expensive, most experiments are performed with small samples. This paper introduces a method to quantify dependency between data series composed of few sample points. The method is used to construct gene co-expression subnetworks of highly significant edges. Results: The results shown here are for an adapted subset of a Saccharomyces cerevisiae gene expression data set with low temporal resolution and poor statistics. The method reveals common transcription factors with a high confidence level and allows the construction of subnetworks with high biological relevance that reveal characteristic features of the processes driving the organism's adaptations to specific environmental conditions. Conclusion: Our method allows a reliable and sophisticated analysis of microarray data even under severe constraints. The utilization of systems biology improves the biologist's ability to elucidate the mechanisms underlying cellular processes and to formulate new hypotheses.
Abstract:
An analytical procedure based on microwave-assisted digestion with diluted acid and a double cloud point extraction is proposed for nickel determination in plant materials by flame atomic absorption spectrometry. Extraction in micellar medium was successfully applied for sample clean-up, aiming to remove organic species containing phosphorus that caused spectral interferences by structured background attributed to the formation of PO species in the flame. Cloud point extraction of nickel complexes formed with 1,2-thiazolylazo-2-naphthol was explored for pre-concentration, with an enrichment factor estimated as 30, a detection limit of 5 µg L⁻¹ (99.7% confidence level) and a linear response up to 80 µg L⁻¹. The accuracy of the procedure was evaluated by nickel determinations in reference materials, and the results agreed with the certified values at the 95% confidence level.
Abstract:
Solid-liquid phase equilibrium modeling of triacylglycerol mixtures is essential for lipid design. Considering the alpha polymorph and the liquid phase as ideal, the Margules 2-suffix excess Gibbs energy model with predictive binary parameter correlations describes the non-ideal beta and beta' solid polymorphs. Solving by direct optimization of the Gibbs free energy enables one to predict, from a bulk mixture composition, the phase compositions at a given temperature and thus the solid fat content (SFC) curve, the melting profile and the Differential Scanning Calorimetry (DSC) curve that are related to end-user lipid properties. Phase diagram, SFC and DSC curve experimental data are qualitatively and quantitatively well predicted for the binary mixture 1,3-dipalmitoyl-2-oleoyl-sn-glycerol (POP) and 1,2,3-tripalmitoyl-sn-glycerol (PPP), the ternary mixture 1,3-dimyristoyl-2-palmitoyl-sn-glycerol (MPM), 1,2-distearoyl-3-oleoyl-sn-glycerol (SSO) and 1,2,3-trioleoyl-sn-glycerol (OOO), and for palm oil and cocoa butter. Then, the addition to palm oil of Medium-Long-Medium type structured lipids is evaluated, using caprylic acid as the medium chain and long chain fatty acids (EPA-eicosapentaenoic acid, DHA-docosahexaenoic acid, gamma-linolenic-octadecatrienoic acid and AA-arachidonic acid) as sn-2 substitutes. EPA, DHA and AA increase the melting range on both the fusion and crystallization side. Gamma-linolenic acid shifts the melting range upwards. This predictive tool is useful for the pre-screening of lipids matching desired properties set a priori.
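For readers unfamiliar with the Margules 2-suffix model named above, the following is a minimal sketch of its textbook binary form, g^E/RT = A·x1·x2, with activity coefficients ln γ1 = A·x2² and ln γ2 = A·x1². The function name and the value of the interaction parameter are illustrative assumptions; the abstract does not give the authors' parameter correlations or their multicomponent formulation.

```python
import math


def margules_two_suffix(x1: float, a12: float) -> tuple:
    """Return (gE/RT, gamma1, gamma2) for a binary mixture.

    x1  : mole fraction of component 1 (component 2 is 1 - x1)
    a12 : dimensionless interaction parameter A (hypothetical value
          here, not taken from the abstract)
    """
    x2 = 1.0 - x1
    ge_rt = a12 * x1 * x2                # excess Gibbs energy over RT
    gamma1 = math.exp(a12 * x2 ** 2)     # ln(gamma1) = A * x2^2
    gamma2 = math.exp(a12 * x1 ** 2)     # ln(gamma2) = A * x1^2
    return ge_rt, gamma1, gamma2


# Illustrative call with an assumed parameter value.
ge_rt, gamma1, gamma2 = margules_two_suffix(x1=0.3, a12=1.5)
print(ge_rt, gamma1, gamma2)
```

With A = 0 both activity coefficients equal 1, which recovers the ideal behaviour the abstract assumes for the alpha polymorph and the liquid phase.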
Abstract:
Knowledge of the relationship between the spatial variability of the surface soil water content (theta) and its mean across a spatial domain (theta(m)) is crucial for hydrological modeling and for understanding soil water dynamics at different scales. With the aim of comparing soil moisture dynamics and variability between two land uses and exploring the relationship between the spatial variability of theta and theta(m), this study analyzed sets of surface theta measurements performed with an impedance soil moisture probe, collected 136 times during a period of one year in two transects covering different land uses, i.e., korshinsk peashrub transect (KPT) and bunge needlegrass transect (BNT), in a watershed of the Loess Plateau, China. Results showed that the temporal pattern of theta behaved similarly for the two land uses, with relatively wetter soils during the wet period and relatively drier soils during the dry period recognized in BNT. Soil moisture tended to be temporally stable among different dates, and more stable patterns could be observed for dates with more similar soil water conditions. The magnitude of the spatial variation of theta in KPT was greater than that in BNT. For both land uses, the standard deviation (SD) of theta in general increased as theta(m) increased, a behavior that could be well described with a natural logarithmic function. A convex relationship between the coefficient of variation (CV) and theta(m), and the maximum CV for both land uses (43.5% in KPT and 41.0% in BNT), can therefore be ascertained. Geostatistical analysis showed that the range in KPT (9.1 m) was shorter than that in BNT (15.1 m). The nugget effect and the structured variability, and hence the total variability, increased as theta(m) increased. For both land uses, the spatial dependency in general increased with increasing theta(m). 2011 Elsevier B.V. All rights reserved.
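The step from a logarithmic SD relation to a maximum in the coefficient of variation can be sketched as follows, assuming SD = a·ln(theta(m)) + b and CV = SD / theta(m). The coefficients a and b below are hypothetical placeholders, since the fitted values are not given in the abstract.

```python
import math


def sd_log_model(theta_m: float, a: float, b: float) -> float:
    """Assumed SD model: SD = a * ln(theta_m) + b."""
    return a * math.log(theta_m) + b


def cv_percent(theta_m: float, a: float, b: float) -> float:
    """Coefficient of variation in percent: 100 * SD / mean."""
    return 100.0 * sd_log_model(theta_m, a, b) / theta_m


def theta_at_max_cv(a: float, b: float) -> float:
    """Mean water content at which CV peaks (requires a > 0).

    d/dtheta [(a ln(theta) + b) / theta] = (a - a ln(theta) - b) / theta^2,
    which is zero when ln(theta) = 1 - b / a.
    """
    return math.exp(1.0 - b / a)


# Hypothetical coefficients, for illustration only.
a, b = 5.0, -4.0
theta_peak = theta_at_max_cv(a, b)
print(theta_peak, cv_percent(theta_peak, a, b))
```

Under this assumed form, CV rises with theta(m) up to the peak and falls beyond it, which matches the single-maximum shape the abstract describes.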
Abstract:
Interesterification of palm stearin (PS) with liquid vegetable oils could yield a good solid fat stock that may impart desirable physical properties, because PS is a useful source of vegetable hard fat, providing beta' stable solid fats. Dietary ingestion of olive oil (OO) has been reported to have physiological benefits such as lowering serum cholesterol levels. Fat blends, formulated by binary blends of palm stearin and olive oil in different ratios, were subjected to chemical interesterification with sodium methoxide. The original and interesterified blends were examined for fatty acid and triacylglycerol composition, melting point, solid fat content (SFC) and consistency. Interesterification caused rearrangement of triacylglycerol species, reduction of trisaturated and triunsaturated triacylglycerol content and an increase in diunsaturated-monosaturated triacylglycerols in all blends, resulting in lowering of melting point and solid fat content. The incorporation of OO into PS reduced consistency, producing more plastic blends. The mixture and chemical interesterification allowed obtaining fats with various degrees of plasticity, increasing the possibilities for the commercial use of palm stearin and olive oil. (C) 2009 Elsevier Ltd. All rights reserved.
Abstract:
Many harvested marine and terrestrial populations have segments of their range protected in areas free from exploitation. Reasons for areas being protected from harvesting include conservation, tourism, research, protection of breeding grounds, stock recovery, harvest regulation, or habitat that is uneconomical to exploit. In this paper we consider the problem of optimally exploiting a single-species local population that is connected by dispersing larvae to an unharvested local population. We define a spatially explicit population dynamics model and apply dynamic optimization techniques to determine policies for harvesting the exploited patch. We then consider how reservation affects yield and spawning stock abundance when compared to policies that have not recognised the spatial structure of the metapopulation. Comparisons of harvest strategies between an exploited metapopulation with and without a harvest refuge are also made. Results show that in a metapopulation of two local populations with unidirectional larval transfer, the optimal exploitation of the harvested population should be conducted as if it were independent of the reserved population. Numerical examples suggest that relative source populations should be exploited if the objective is to maximise spawning stock abundance within a harvested metapopulation that includes a protected local population. However, this strategy can markedly reduce yield compared with a sink-harvested reserve system and may require strict regulation for conservation goals to be realised. If exchange rates are high, results indicate that spawning stock abundance can be lower in a reserve system than in a fully exploited metapopulation. In order to maximise economic gain in the reserve system, results indicate that relative sink populations should be harvested. Depending on transfer levels, loss in harvest through reservation can be minimal, and is likely to be compensated by the potential environmental and economic benefits of the reserve.
Abstract:
The infection of insect cells with baculovirus was described in a mathematical model as part of a structured dynamic model describing whole animal cell metabolism. The model presented here is capable of simulating cell population dynamics, the concentrations of extracellular and intracellular viral components, and the heterologous product titers. The model describes the whole process of viral infection and the effect of the infection on the host cell metabolism. Dynamic simulation of the model in batch and fed-batch mode gave good agreement between model predictions and experimental data. Optimum conditions for insect cell culture and viral infection in batch and fed-batch culture were studied using the model.
Abstract:
The movement of chemicals through the soil to groundwater, or their discharge to surface waters, represents a degradation of these resources. In many cases, serious human and stock health implications are associated with this form of pollution. The chemicals of interest include nutrients, pesticides, salts, and industrial wastes. Recent studies have shown that current models and methods do not adequately describe the leaching of nutrients through soil, often underestimating the risk of groundwater contamination by surface-applied chemicals and overestimating the concentration of resident solutes. This inaccuracy results primarily from ignoring soil structure and nonequilibrium between soil constituents, water, and solutes. A multiple sample percolation system (MSPS), consisting of 25 individual collection wells, was constructed to study the effects of localized soil heterogeneities on the transport of nutrients (NO₃⁻, Cl⁻, PO₄³⁻) in the vadose zone of an agricultural soil dominated by clay. Very significant variations in drainage patterns across a small spatial scale were observed (one-way ANOVA, p < 0.001), indicating considerable heterogeneity in water flow patterns and nutrient leaching. Using data collected from the multiple sample percolation experiments, this paper compares the performance of two mathematical models for predicting solute transport: the advective-dispersion model with a reaction term (ADR) and a two-region preferential flow model (TRM) suitable for modelling nonequilibrium transport. These results have implications for modelling solute transport and predicting nutrient loading on a larger scale. (C) 2001 Elsevier Science Ltd. All rights reserved.
Abstract:
This article recalls a classic scheme for categorizing attitude measures. One particular group of measures, those that rely on respondents' interpretations of partially structured stimuli, has virtually disappeared from attitude research. An attitude measure based on respondents' interpretation of partially structured stimuli is considered. Four studies employing such a measure demonstrate that it predicts unique variance in self-reported and actual behavior, beyond that predicted by explicit and contemporary implicit measures and regardless of whether the attitude object under consideration is fraught with social desirability concerns. Implications for conceptualizing attitude measurement and attitude-behavior relations are discussed.