945 results for biological data
Abstract:
Constant advances in technology have caused a data explosion in recent years. Accordingly, modern statistical and machine learning methods must be adapted to deal with complex and heterogeneous data types. This is particularly true for the analysis of biological data. For example, DNA sequence data can be viewed as categorical variables, with each nucleotide taking one of four categories. Gene expression data, depending on the quantification technology, can be continuous numbers or counts. With the advancement of high-throughput technology, such data have become unprecedentedly abundant. Efficient statistical approaches are therefore crucial in this big data era.
Previous statistical methods for big data often aim to find low-dimensional structures in the observed data. For example, a factor analysis model assumes a latent multivariate Gaussian vector; under this assumption, the model produces a low-rank estimate of the covariance of the observed variables. Another example is the latent Dirichlet allocation model for documents, which assumes that the mixture proportions of topics are represented by a Dirichlet-distributed variable. This dissertation proposes several novel extensions to these statistical methods, developed to address challenges in big data. The new methods are applied in multiple real-world applications, including construction of condition-specific gene co-expression networks, estimation of shared topics among newsgroups, analysis of promoter sequences, analysis of political-economic risk data, and estimation of population structure from genotype data.
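The low-rank covariance structure implied by a factor model can be written out in a short worked equation. This is the generic textbook formulation, not necessarily the exact model used in the dissertation; the symbols x (observed vector of dimension p), z (latent factors of dimension k), Λ (loading matrix) and Ψ (diagonal noise covariance) are notation assumed here for illustration.

```latex
\begin{align*}
x &= \Lambda z + \varepsilon, \qquad
z \sim \mathcal{N}(0, I_k), \quad
\varepsilon \sim \mathcal{N}(0, \Psi), \quad
\Lambda \in \mathbb{R}^{p \times k},\; k \ll p \\
\operatorname{Cov}(x) &= \Lambda \operatorname{Cov}(z) \Lambda^{\top} + \operatorname{Cov}(\varepsilon)
 = \Lambda \Lambda^{\top} + \Psi
\end{align*}
```

Under these assumptions the term ΛΛᵀ has rank at most k, so the covariance of the p observed variables is described by a low-rank component plus a diagonal noise term.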
Abstract:
Cancer and cardiovascular diseases are the leading causes of death worldwide. Caused by systemic genetic and molecular disruptions in cells, these disorders are the manifestation of a profound disturbance of normal cellular homeostasis. People suffering from or at high risk for these disorders need early diagnosis and personalized therapeutic intervention. Successful implementation of such clinical measures can significantly improve global health. However, the development of effective therapies is hindered by the challenge of identifying the genetic and molecular determinants of disease onset; where therapies already exist, the main challenge is to identify the molecular determinants that drive resistance to them. Owing to progress in sequencing technologies, access to large genome-wide biological datasets now extends far beyond a few experimental labs to the global research community. The unprecedented availability of these data has revolutionized the capabilities of computational researchers, enabling them to collaboratively address long-standing problems from many different perspectives. Likewise, this thesis tackles these two main public-health challenges using data-driven approaches. Numerous association studies have been proposed to identify genomic variants that determine disease. However, their clinical utility remains limited because they cannot distinguish causal variants from merely associated variants. In this thesis, we first propose a simple scheme that improves association studies in a supervised fashion and demonstrate its applicability in identifying genomic regulatory variants associated with hypertension. Next, we propose a coupled Bayesian regression approach, eQTeL, which leverages epigenetic data to estimate regulatory and gene interaction potential and identifies combinations of regulatory genomic variants that explain gene expression variance. 
On human heart data, eQTeL not only explains a significantly greater proportion of expression variance in samples, but also predicts gene expression more accurately than other methods. We demonstrate by simulation that eQTeL accurately detects causal regulatory SNPs, particularly those with small effect sizes. Using various functional data, we show that SNPs detected by eQTeL are enriched for allele-specific protein binding and histone modifications, potentially disrupt binding of core cardiac transcription factors, and are spatially proximal to their targets. eQTeL SNPs capture a substantial proportion of the genetic determinants of expression variance, and we estimate that 58% of these SNPs are putatively causal. Until now, the challenge of identifying molecular determinants of cancer resistance could only be addressed with labor-intensive and costly experimental studies, and for experimental drugs such studies are infeasible. Here we take a fundamentally different, data-driven approach to understanding the evolving landscape of emerging resistance. We introduce a novel class of genetic interactions in cancer, termed synthetic rescues (SR), which denote a functional interaction between two genes in which a change in the activity of one vulnerable gene (which may be the target of a cancer drug) is lethal, but subsequently altered activity of its partner rescuer gene restores cell viability. Next we describe a comprehensive computational framework, termed INCISOR, for identifying SRs underlying cancer resistance. Applying INCISOR to mine The Cancer Genome Atlas (TCGA), a large collection of cancer patient data, we identified the first pan-cancer SR networks, composed of interactions common to many cancer types. We experimentally test and validate a subset of these interactions involving the master regulator gene mTOR. We find that rescuer genes become increasingly activated as breast cancer progresses, testifying to pervasive ongoing rescue processes. 
We show that SRs can be used to successfully predict patients' survival and response to the majority of current cancer drugs and, importantly, to predict the emergence of drug resistance from the initial tumor biopsy. Our analysis suggests a potential new strategy for enhancing the effectiveness of existing cancer therapies by targeting their rescuer genes to counteract resistance. The thesis provides statistical frameworks that can harness ever-increasing high-throughput genomic data to address challenges in determining the molecular underpinnings of hypertension, cardiovascular disease and cancer resistance. We discover novel mechanistic molecular insights that will advance progress in early disease prevention and personalized therapeutics. Our analyses shed light on the fundamental biology of gene regulation and interaction, and open up exciting avenues for translational applications in risk prediction and therapeutics.
Abstract:
Since insect species are poikilothermic organisms, they generally exhibit different growth patterns depending on the temperature at which they develop. This factor is important in forensic entomology, especially for estimating the postmortem interval (PMI) when it is based on the developmental time of insects reared on decomposing bodies. This study aimed to estimate the rates of development, viability, and survival of immatures of Sarcophaga (Liopygia) ruficornis (Fabricius 1794) and Microcerella halli (Engel 1931) (Diptera: Sarcophagidae) reared at different temperatures: 10, 15, 20, 25, 30, and 35 ± 1 °C. Raw ground beef was offered as food for all experimental groups, each consisting of four replicates, in the proportion of 2 g/larva. To measure growth over time, ten specimens of each group were randomly chosen and weighed every 12 h, from initial feeding larva to pupa, and then discarded. Considering the records of weight gain, survival rates, and stability of growth rates, the optimum temperature range for the development of S. (L.) ruficornis is between 20 and 35 °C, and that of M. halli is between 20 and 25 °C. For both species, the longest development times were at the lowest temperatures. The survival rate at extreme temperatures (10 and 35 °C) was lower in both species. Biological data such as those obtained in this study are of great importance for achieving a more accurate estimate of the PMI.
Abstract:
In the first paper of this series (Albuquerque & Brandão, 2004) we revised the Vezenyii species group of the exclusively Neotropical solenopsidine (Myrmicinae) ant genus Oxyepoecus. In this closing paper we update distribution information on the Vezenyii group species and revise the other Oxyepoecus species-group (Rastratus). We describe two species (Oxyepoecus myops n. sp. and O. rosai n. sp.) and redescribe previously known species of the group [O. daguerrei (Santschi, 1933), O. mandibularis (Emery, 1913), O. plaumanni Kempf, 1974, O. rastratus Mayr, 1887, and O. reticulatus Kempf, 1974], adding locality records and comments on the meagre biological data of these species. We also present an identification key to Oxyepoecus species based on workers.
Abstract:
The larva and pupa of Tapuruia felisbertoi Lane, 1973, collected in Hevea brasiliensis (Euphorbiaceae) in Mato Grosso, Brazil, are described and illustrated. Biological data and a comparison with the larvae of other Hexoplonini species are also presented.
Abstract:
Detailed inventories and faunal studies of vertebrates are among the most relevant sources of data for interpreting fine-scale patterns of biological diversity. Basic, good-quality faunal data are even more urgently needed in poorly studied regions under intense anthropogenic threat, such as the Cerrado region, one of the 34 global hotspots for biodiversity conservation. Here we present a synthesis of the results of vertebrate inventories at the Estação Ecológica Serra Geral do Tocantins (EESGT; ~716,000 ha), the second largest conservation unit in the entire Cerrado. A total of 450 vertebrate species were recorded in the EESGT and its immediate surroundings, including 17 threatened species, 50 species endemic to the Cerrado and 11 species with potentially restricted distributions. Of the species sampled, 180 are new records for the Jalapão region. At least 12 of the species sampled were considered potential new species, four of which were recently described from material obtained in the inventory. The results show that the EESGT is one of the most important protected areas in central Brazil, contributing to the persistence of threatened species that depend on the last large continuous blocks of native Cerrado vegetation. Our results further indicate that conserving the EESGT and its main subunits is crucial for the representativeness of the Cerrado protected-area system, protecting potential restricted endemics that combine high intrinsic vulnerability with value as indicators of the biogeographic patterns and processes that shaped the rich and increasingly threatened Neotropical fauna.
Abstract:
Natural products have widespread biological activities, including inhibition of mitochondrial enzyme systems. Some of these activities, for example cytotoxicity, may result from alteration of cellular bioenergetics. Based on previous computer-aided drug design (CADD) studies and on reported structure-activity relationship (SAR) data, one hypothesis regarding the mechanism of action of natural products against parasitic infections involves NADH-oxidase inhibition. In this study, chemometric tools such as principal component analysis (PCA), consensus PCA (CPCA), and partial least squares regression (PLS) were applied to a set of forty natural compounds acting as NADH-oxidase inhibitors. The calculations were performed using the VolSurf+ program. The formalisms employed generated good exploratory and predictive results. The independent variables, or descriptors, with a hydrophobic profile were strongly correlated with the biological data.
Abstract:
This work focused on a multi-purpose estuarine environment (river Sado estuary, SW Portugal) around which a number of activities (e.g., fishing, farming, heavy industry, tourism and recreational activities) coexist with urban centres totalling about 200 000 inhabitants. Based on previous knowledge of the hazardous chemicals within the ecosystem and their potential toxicity to benthic species, this project aimed to evaluate the impact of estuarine contaminants on human and ecosystem health. An integrative methodology was used, based on epidemiological, analytical and biological data and comprising several lines of evidence, namely human contamination pathways, human health effects, consumption of local produce, contamination of estuarine sediments, wells and soils, effects on commercial benthic organisms, and genotoxic potential of sediments. The epidemiological survey confirmed the occurrence of direct and indirect (through the food chain) exposure of the local population to estuarine contaminants. Furthermore, the complex mixture of contaminants (e.g., metals, pesticides, polycyclic aromatic hydrocarbons) trapped in the estuary sediments was toxic to human liver cells exposed in vitro, causing cell death, oxidative stress and genotoxic effects that might constitute a risk factor for the development of chronic-degenerative diseases in the long term. Finally, the integration of data from several endpoints indicated that the estuary is moderately impacted by toxicants that also affect the aquatic biota. Nevertheless, the human health risk can only be correctly assessed through a biomonitoring study including the quantification of contaminants (or metabolites) in biological fluids as well as biomarkers of early biological effects (e.g., biochemical, genetic and omics-based endpoints) and genetic susceptibility in the target population. 
These data should be supported by a detailed survey assessing the impact of consuming contaminated seafood and local farm products on human health and, in particular, on the development of metabolic diseases or cancer.
Abstract:
We present a novel data analysis strategy that, combined with subcellular fractionation and liquid chromatography-mass spectrometry (LC-MS) based proteomics, provides a simple and effective workflow for global drug profiling. Five subcellular fractions were obtained by differential centrifugation, followed by high-resolution LC-MS and complete functional regulation analysis. The methodology combines functional regulation and enrichment analysis into a single visual summary. The workflow enables improved insight into perturbations caused by drugs. We provide a statistical argument to demonstrate that even crude subcellular fractions lead to improved functional characterization. We demonstrate this data analysis strategy on data obtained in an MS-based global drug profiling study. However, the strategy can also be applied to other types of large-scale biological data.
Abstract:
In Brazilian Amazonia, Cholini (Coleoptera, Curculionidae, Molytinae) is represented by 53 species distributed in seven genera: Ameris Dejean, 1821; Cholus Germar, 1824; Homalinotus Sahlberg, 1823; Lobaspis Chevrolat, 1881; Odontoderes Sahlberg, 1823; Ozopherus Pascoe, 1872 and Rhinastus Schoenherr, 1825. This work documents the species of Cholini housed in the Invertebrate Collection of the Instituto Nacional de Pesquisas da Amazônia, Manaus, Brazil and gives the geographical and biological data associated with them. A total of 186 Cholini specimens were identified as belonging to 14 species (13 from Brazilian Amazonia) and five genera (Cholus, Homalinotus, Odontoderes, Ozopherus and Rhinastus). Only 24% of the Cholini species reported from Brazilian Amazonia are actually represented in the INPA collection, underscoring the need for more systematic collecting based on available biological information. The known geographical distribution was expanded for the following species: Cholus granifer (Chevrolat, 1881) for Brazil; C. pantherinus (Olivier, 1790) for Manaus (Amazonas); Cholus parallelogrammus (Germar, 1824) for Piraquara (Paraná); Homalinotus depressus (Linnaeus, 1758) for lago Janauacá (Amazonas) and rio Tocantins (Pará); H. humeralis (Gyllenhal, 1836) for Novo Airão, Coari (Amazonas) and Porto Velho (Rondônia); H. nodipennis (Chevrolat, 1878) for Carauari, Lábrea (Amazonas) and Ariquemes (Rondônia); H. validus (Olivier, 1790) for rio Araguaia (Brazil), Manaus (Amazonas), rio Tocantins (Pará), Porto Velho and BR 364, Km 130 (Rondônia); Odontoderes carinatus (Guérin-Méneville, 1844) for Manaus (Amazonas); O. spinicollis (Boheman, 1836) for rio Uraricoera (Roraima); and Ozopherus muricatus Pascoe, 1872 for lago Janauacá (Amazonas). Homalinotus humeralis is reported for the first time from "urucuri" palm, Attalea phalerata Mart. ex Spreng.
Abstract:
Doctoral thesis (Doctoral Programme in Biomedical Engineering)
Abstract:
In Ireland, although flatfish form a valuable fishery, little is known about the smallest, the dab Limanda limanda. In this study, a variety of parameters of reproductive development, including ovarian phase description, gonadosomatic index (GSI), hepatosomatic index (HSI), relative condition (Kn) and oocyte size, were analysed to provide information on the dab's reproductive cycle and spawning periods. Samples were collected monthly over an 18-month period using bottom trawls off the Irish coastline. A six-phase macroscopic guide was developed for both sexes of dab and verified using histology. In comparisons of macroscopic and microscopic phases, agreement was high for the proposed female guide (86%) and demonstrably lower for males (62%). No significant bias was observed between the two reproductive methods. When the male macroscopic guide was examined, misclassification was high in phase 5 and phase 5 (41%), with 96% of misclassification occurring in adjacent phases. The sampled population was primarily composed of females, with a female-to-male ratio of 1:0.6, although the predominance of females was less noticeable during the reproductive season. Oocyte growth in dab follows asynchronous development, and spawning occurs over a protracted period, indicating a batch spawning strategy. Spawning occurred mainly in early spring, with total regeneration of gonads by May. The length at which 50% of the population was reproductively mature was identified as 14 cm and 17 cm for male and female dab, respectively. Precision and bias in age determinations using whole otoliths were investigated using six age readers from various institutions. Low levels of precision were obtained (CV: 10-23%), implying the need for an alternative methodology. Precision and bias were influenced by the level of experience of the reader, with ageing error attributed to interpretative differences and difficulty in edge determination. 
Sectioned otolith age determinations were subsequently compared with whole otolith age determinations using two age readers experienced in dab ageing. Although increased precision was observed in whole otoliths relative to previous estimates (CV=0%, 0% APE), sectioned otoliths were used for growth models. This was based on multinomial logistic regression on age-length keys developed using both ageing methods. Biological data (length and age) for both sexes were applied to four growth models, where the Akaike criterion and multi-model inference indicated the logistic model as having the best fit to the collected data. In general, female dab attained a greater length than males, with growth rates significantly different between the two sexes. Length-weight relationships were also significantly different between the two sexes.
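The model-selection step mentioned above (Akaike criterion plus multi-model inference) can be sketched as follows. This is a minimal illustration, not the study's actual analysis: the residual sums of squares, sample size and parameter counts are made-up placeholder values, and the three candidate models other than the logistic are assumed here, since the abstract does not name them.

```python
import math

def aic_least_squares(rss, n, k):
    """AIC for a least-squares fit with Gaussian errors: n*ln(RSS/n) + 2k.

    k is the number of estimated parameters (here counted including
    the error variance).
    """
    return n * math.log(rss / n) + 2 * k

def akaike_weights(aic_values):
    """Convert a dict of AIC scores into Akaike weights that sum to 1."""
    best = min(aic_values.values())
    # Relative likelihood of each model given its AIC difference to the best.
    rel = {name: math.exp(-0.5 * (aic - best)) for name, aic in aic_values.items()}
    total = sum(rel.values())
    return {name: r / total for name, r in rel.items()}

# Hypothetical fit summaries: (residual sum of squares, parameter count).
# These numbers are invented for illustration only.
n_fish = 200
fits = {
    "von_bertalanffy": (410.0, 4),
    "gompertz": (402.0, 4),
    "logistic": (395.0, 4),
    "richards": (394.5, 5),
}

aic = {name: aic_least_squares(rss, n_fish, k) for name, (rss, k) in fits.items()}
weights = akaike_weights(aic)
best_model = min(aic, key=aic.get)

for name in sorted(aic, key=aic.get):
    print(f"{name:16s} AIC={aic[name]:8.2f} weight={weights[name]:.3f}")
print("lowest AIC:", best_model)
```

The Akaike weights give the relative support for each candidate, so a model with a slightly better fit but an extra parameter (the hypothetical "richards" entry here) can still rank below a simpler one.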
Abstract:
The morphological characteristics of the egg and five immature stages of Acrosternum obstinatum (Stål, 1860), fed on passion fruit, are described and illustrated. Biological data are also provided.
Abstract:
The smallnose fanskate, Sympterygia bonapartii Müller & Henle, 1841, is one of the most frequently landed species in commercial harbors in Argentina. In this work, the microscopic architecture of mature male gonads and the dynamics of cyst development are analyzed as a contribution to knowledge of the reproductive biology of the species. Some biological data related to reproduction are given as well. Two seasons were sampled (fall and spring), and frequency distributions of length classes and maturity stages are given. Size at first sexual maturity for males was estimated at 57 cm total length. Testes are symmetric, paired and lobed, with several germinal zones. Inside the gonads there are many spermatocysts, each containing reproductive cells at the same developmental stage. On the basis of their cytological and microanatomical features, several maturation stages of the spermatogenic series were differentiated. Few Leydig cells were recognized in the interstitial tissue among cysts. The microscopic and semiquantitative analysis performed in this work provides morphological information about male gametogenesis and some biological data for the North Patagonian population of this economically and ecologically important species.
Abstract:
The authors give biological data and a histological study of the infestation of Musca domestica Linnaeus, 1758 by a phycomycete of the genus Empusa.