9 results for biological data

in Doria (National Library of Finland DSpace Services) - National Library of Finland, Finland


Relevance:

70.00%

Publisher:

Abstract:

The recent rapid development of biotechnological approaches has enabled the production of large whole-genome-level biological data sets. In order to handle these data sets, reliable and efficient automated tools and methods for data processing and result interpretation are required. Bioinformatics, as the field of studying and processing biological data, tries to answer this need by combining methods and approaches across computer science, statistics, mathematics and engineering. The need is also increasing for tools that can be used by the biological researchers themselves, who may not have a strong statistical or computational background, which requires creating tools and pipelines with intuitive user interfaces, robust analysis workflows and strong emphasis on result reporting and visualization. Within this thesis, several data analysis tools and methods have been developed for analyzing high-throughput biological data sets. These approaches, covering several aspects of high-throughput data analysis, are specifically aimed at gene expression and genotyping data, although in principle they are suitable for analyzing other data types as well. Coherent handling of the data across the various data analysis steps is highly important in order to ensure robust and reliable results. Thus, robust data analysis workflows are also described, putting the developed tools and methods into a wider context. The choice of the correct analysis method may also depend on the properties of the specific data set, and therefore guidelines for choosing an optimal method are given. The data analysis tools, methods and workflows developed within this thesis have been applied to several research studies, of which two representative examples are included in the thesis. The first study focuses on spermatogenesis in murine testis and the second one examines cell lineage specification in mouse embryonic stem cells.

Relevance:

60.00%

Publisher:

Abstract:

Rapid development in recent years has accelerated the process of developing new drugs. Combinatorial chemistry has made it possible to synthesize large collections of structurally diverse molecules, so-called combinatorial libraries, for biological screening. In screening, the structure-related activity of the molecules is examined with a range of biological assays to find possible "hits", some of which may later be developed into new drugs. For the results of the biological studies to be reliable, the synthesized compounds must be as pure as possible. HTP purification is therefore needed to guarantee high-quality compounds and reliable biological data. Continually growing production demands have led to the automation and parallelization of these purification techniques. Preparative LC/MS is well suited to the rapid and efficient purification of combinatorial libraries. Many factors, such as the properties of the separation column and the flow gradient, affect the efficiency of the preparative LC/MS purification process. These parameters must be optimized to obtain the best result. In this work, basic compounds were studied under various flow conditions. A method for determining the purity level of combinatorial libraries after LC/MS purification was optimized, and the purity of some compounds from different libraries was determined before purification.

Relevance:

60.00%

Publisher:

Abstract:

Prostate cancers form a heterogeneous group of diseases and there is a need for novel biomarkers, and for more efficient and targeted methods of treatment. In this thesis, microarray data, RNA interference (RNAi) and compound screens were used to identify novel biomarkers, drug targets and drugs for future personalized prostate cancer therapeutics. First, a bioinformatic mRNA expression analysis covering 9873 human tissue and cell samples, including 349 prostate cancer and 147 normal prostate samples, was used to distinguish in silico prevalidated putative prostate cancer biomarkers and drug targets. Second, RNAi-based high-throughput (HT) functional profiling of 295 prostate and prostate cancer tissue-specific genes was performed in cultured prostate cancer cells. Third, an HT compound screen using a library of 4910 drugs and drug-like molecules was exploited to identify potential drugs inhibiting prostate cancer cell growth. Nine candidate drug targets, with biomarker potential, and one cancer-selective compound were validated in vitro and in vivo. In addition to androgen receptor (AR) signaling, endoplasmic reticulum (ER) function, the arachidonic acid (AA) pathway, redox homeostasis and mitosis were identified as vital processes in prostate cancer cells. ERG oncogene positive cancer cells exhibited sensitivity to induction of oxidative and ER stress, whereas advanced and castrate-resistant prostate cancer (CRPC) could potentially be targeted through AR signaling and mitosis. In conclusion, this thesis illustrates the power of systems biological data analysis in the discovery of potential vulnerabilities present in prostate cancer cells, as well as novel options for personalized cancer management.

Relevance:

60.00%

Publisher:

Abstract:

The amount of biological data has grown exponentially in recent decades. Modern biotechnologies, such as microarrays and next-generation sequencing, are capable of producing massive amounts of biomedical data in a single experiment. As the amount of data is growing rapidly, there is an urgent need for reliable computational methods for analyzing and visualizing it. This thesis addresses this need by studying how to efficiently and reliably analyze and visualize high-dimensional data, especially data obtained from gene expression microarray experiments. First, we studied ways to improve the quality of microarray data by replacing (imputing) the missing data entries with estimated values for these entries. Missing value imputation is a commonly used method for making the original incomplete data complete, which makes the data easier to analyze with statistical and computational methods. Our novel approach was to use curated external biological information as a guide for the missing value imputation. Second, we studied the effect of missing value imputation on downstream data analysis methods such as clustering. We compared multiple recent imputation algorithms on 8 publicly available microarray data sets. It was observed that missing value imputation is indeed a rational way to improve the quality of biological data. The research revealed differences between the clustering results obtained with different imputation methods. On most data sets, the simple and fast k-NN imputation was good enough, but some required more advanced imputation methods, such as Bayesian Principal Component Algorithm (BPCA). Finally, we studied the visualization of biological network data. Biological interaction networks are one example of the outcome of multiple biological experiments, such as those using gene microarray techniques.
Such networks are typically very large and highly connected, so there is a need for fast algorithms that produce visually pleasant layouts. A computationally efficient way to produce layouts of large biological interaction networks was developed. The algorithm uses multilevel optimization within the regular force-directed graph layout algorithm.
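
The k-NN imputation mentioned in the abstract above can be illustrated with a minimal sketch. This is not the thesis's implementation: the function name `knn_impute`, the rows-as-genes layout, the root-mean-square distance over shared columns and the plain (unweighted) averaging are all assumptions made for illustration.

```python
import math


def knn_impute(matrix, k=2):
    """Fill None entries using the k nearest rows (hypothetical helper).

    Assumed layout: rows are genes, columns are samples. The distance between
    two rows is the root mean squared difference over columns observed in both.
    The input matrix is left unchanged; a filled copy is returned.
    """
    imputed = [row[:] for row in matrix]
    for i, row in enumerate(matrix):
        missing = [j for j, v in enumerate(row) if v is None]
        if not missing:
            continue
        # Rank every other row by its distance over the shared observed columns.
        candidates = []
        for m, other in enumerate(matrix):
            if m == i:
                continue
            shared = [j for j in range(len(row))
                      if row[j] is not None and other[j] is not None]
            if not shared:
                continue
            dist = math.sqrt(
                sum((row[j] - other[j]) ** 2 for j in shared) / len(shared)
            )
            candidates.append((dist, m))
        candidates.sort()
        for j in missing:
            # Average the k nearest rows that have an observed value in column j.
            donors = [matrix[m][j] for _, m in candidates
                      if matrix[m][j] is not None][:k]
            if donors:
                imputed[i][j] = sum(donors) / len(donors)
    return imputed


expression = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, None],   # missing value to be estimated
    [5.0, 6.0, 7.0],
    [1.2, 1.9, 3.2],
]
filled = knn_impute(expression, k=2)
print(filled[1])
```

In this toy matrix the second row is closest to the first and fourth rows, so its missing entry is estimated from their values rather than from the distant third row. Entries with no usable donor row stay as None.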

Relevance:

60.00%

Publisher:

Abstract:

With the growth of new technologies, using online tools has become part of everyday life. This has a particularly strong impact on researchers, as the data obtained from various experiments needs to be analyzed, and knowledge of programming has become mandatory even for pure biologists. Hence, VTT came up with a new tool, R Executables (REX), a web application designed to provide a graphical interface for biological data functions such as image analysis, gene expression data analysis, plotting, and disease and control studies, which employs R functions to provide results. REX provides an interactive application in which biologists can directly enter values and run the required analysis with a single click. The program processes the given data in the background and prints results rapidly. Due to the growth of data and the load on the server, the interface has developed problems concerning time consumption, a poor GUI, data storage issues, security, a minimally interactive user experience and crashes with large amounts of data. This thesis describes the methods by which these problems were resolved to make REX a better application for the future. The old REX was developed using Python Django; the new version has been implemented with Vaadin. Vaadin is a Java framework for developing web applications, programmed in Java with rich new components. Vaadin provides better security, better speed and a good, interactive interface. In this thesis, a subset of REX functionality, including IST bulk plotting and image segmentation, was selected and implemented using Vaadin. I wrote 662 lines of code, with Vaadin as the front-end handler and the R language used for back-end data retrieval, computing and plotting. The application is optimized to allow further functionality to be migrated with ease from the old REX.
Future development is focused on including high-throughput screening functions along with gene expression database handling.

Relevance:

30.00%

Publisher:

Abstract:

Rapid changes in biodiversity are occurring globally, as a consequence of anthropogenic disturbance. This has raised concerns, since biodiversity is known to significantly contribute to ecosystem functions and services. Marine benthic communities participate in numerous functions provided by soft-sedimentary ecosystems. Eutrophication-induced oxygen deficiency is a growing threat to infaunal communities, both in open sea areas and in coastal zones. There is thus a need to understand how such disturbance affects benthic communities, and what is lost in terms of ecosystem functioning if benthic communities are harmed. In this thesis, the status of benthic biodiversity was assessed for the open Baltic Sea, a system severely affected by broad-scale hypoxia. Long-term monitoring data made it possible to establish quantitative biodiversity baselines against which change could be compared. The findings show that benthic biodiversity is currently severely impaired in large areas of the open Baltic Sea, from the Bornholm Basin to the Gulf of Finland. The observed reduction in biodiversity indicates that benthic communities are structurally and functionally impoverished in several of the sub-basins due to the hypoxic stress. A more detailed examination of disturbance impacts (through field studies and experiments) on benthic communities in coastal areas showed that changes in benthic community structure and function took place well before species were lost from the system. The degradation of benthic community structure and function was directed by the type of disturbance, and its specific temporal and spatial characteristics. The observed shifts in benthic trait composition were primarily the result of reductions in species’ abundances, or of changes in demographic characteristics, such as the loss of large, adult bivalves. Reduction in community functions was expressed as declines in the benthic bioturbation potential and in secondary biomass production.
The benthic communities and their degradation accounted for a substantial proportion of the changes observed in ecosystem multifunctionality. Individual ecosystem functions (i.e. measures of sediment ecosystem metabolism, elemental cycling, biomass production, organic matter transformation and physical structuring) were observed to differ in their response to increasing hypoxic disturbance. Interestingly, the results suggested that an impairment of ecosystem functioning could be detected at an earlier stage if multiple functions were considered. Importantly, the findings indicate that even small-scale hypoxic disturbance can reduce the buffering capacity of sedimentary ecosystems, and increase the susceptibility of the system to further stress. Although the results of the individual papers are context-dependent, their combined outcome implies that healthy benthic communities are important for sustaining overall ecosystem functioning as well as ecosystem resilience in the Baltic Sea.

Relevance:

30.00%

Publisher:

Abstract:

Coastal areas harbour high biodiversity, but are simultaneously affected by rapid degradation of species and habitats due to human interactions. Such alterations also affect the functioning of the ecosystem, which is primarily governed by the characteristics or traits expressed by the organisms present. Marine benthic fauna is involved in numerous functions such as organic matter transformation and transport, secondary production, oxygen transport as well as nutrient cycling. Approaches utilising the variety of faunal traits to assess benthic community functioning have rapidly increased and shown the need for further development of the concept. In this thesis, I applied biological trait analysis that allows for assessments of a multitude of categorical traits and thus evaluation of multiple functional aspects simultaneously. I determined the functional trait structure, diversity and variability of coastal zoobenthic communities in the Baltic Sea. The measures were related to recruitment processes, habitat heterogeneity, large-scale environmental and taxonomic gradients as well as anthropogenic impacts. The studies comprised spatial scales from metres to thousands of kilometres, and temporal scales spanning one season as well as a decade. The benthic functional structure was found to vary within and between seagrass landscape microhabitats and four different habitats within a coastal bay, in papers I and II respectively. Expressions of trait categories varied within habitats, while the density of individuals was found to drive the functional differences between habitats. The findings in paper III unveiled high trait richness of Finnish coastal benthos (25 traits and 102 categories) although this differed between areas high and low in salinity and human pressure. In paper IV, the natural reduction in taxonomic richness across the Baltic Sea led to an overall reduction in function.
However, functional richness in terms of number of trait categories remained comparatively high at low taxon richness. Changes in number of taxa within trait categories were also subtle and some individual categories were maintained or even increased. The temporal analysis in papers I and III highlighted generalities in trait expressions and dominant trait categories in a seagrass landscape as well as a “type organism” for the northern Baltic Sea. Some initial findings were made in all four papers on the role of common and rare species and traits for benthic community functioning. The findings show that common and rare species may not always express the same trait categories in relation to each other. Rare species in general did not express unique functional properties. In order to advance the understanding of the approach, I also assessed some issues concerning the limitations of the concept. This was conducted by evaluating the link between trait category and taxonomic richness using especially univariate measures. My results also show the need to collaborate nationally and internationally on safeguarding the utility of taxonomic and trait data. The findings also highlight the importance of including functional trait information into current efforts in marine spatial planning and biomonitoring.

Relevance:

30.00%

Publisher:

Abstract:

Presentation at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014

Relevance:

30.00%

Publisher:

Abstract:

Mass spectrometry (MS)-based proteomics has seen significant technical advances during the past two decades and mass spectrometry has become a central tool in many biosciences. Despite the popularity of MS-based methods, the handling of the systematic non-biological variation in the data remains a common problem. This biasing variation can result from several sources ranging from sample handling to differences caused by the instrumentation. Normalization is the procedure which aims to account for this biasing variation and make samples comparable. Many normalization methods commonly used in proteomics have been adapted from the DNA-microarray world. Studies comparing normalization methods with proteomics data sets using some variability measures exist. However, a more thorough comparison looking at the quantitative and qualitative differences in the performance of the different normalization methods, and at their ability to preserve the true differential expression signal of proteins, is lacking. In this thesis, several popular and widely used normalization methods (linear regression normalization, local regression normalization, variance stabilizing normalization, quantile normalization, median central tendency normalization and variants of some of the aforementioned methods), representing different normalization strategies, are compared and evaluated with a benchmark spike-in proteomics data set. The normalization methods are evaluated in several ways. Their performance is evaluated qualitatively and quantitatively on a global scale and in pairwise comparisons of sample groups. In addition, it is investigated whether performing the normalization globally on the whole data or pairwise for the comparison pairs examined affects the performance of the normalization method in normalizing the data and preserving the true differential expression signal.
In this thesis, both major and minor differences in the performance of the different normalization methods were found. Also, the way in which the normalization was performed (global normalization of the whole data or pairwise normalization of the comparison pair) affected the performance of some of the methods in pairwise comparisons. Differences among variants of the same methods were also observed.
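
Two of the normalization strategies named in the abstract above can be sketched briefly. These are common textbook formulations, not the variants evaluated in the thesis. The function names, the list-of-samples layout and the choice of the mean of sample medians as the common target are assumptions for illustration.

```python
from statistics import median


def median_normalize(samples):
    """Shift each sample so that all samples share one median.

    Assumed input: a list of samples, each a list of log-intensities.
    The common target is the mean of the per-sample medians (an assumption).
    """
    medians = [median(s) for s in samples]
    target = sum(medians) / len(medians)
    return [[v - m + target for v in s] for s, m in zip(samples, medians)]


def quantile_normalize(samples):
    """Give every sample the same distribution of values.

    Assumes equal-length samples without missing values. Tied values are
    ranked by position rather than averaged, which is a simplification.
    """
    n = len(samples[0])
    sorted_samples = [sorted(s) for s in samples]
    # Reference distribution: mean across samples at each rank.
    reference = [
        sum(s[rank] for s in sorted_samples) / len(samples) for rank in range(n)
    ]
    result = []
    for s in samples:
        order = sorted(range(n), key=lambda idx: s[idx])
        normalized = [0.0] * n
        for rank, idx in enumerate(order):
            # The value at this rank is replaced by the reference value.
            normalized[idx] = reference[rank]
        result.append(normalized)
    return result


runs = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
print(median_normalize(runs))
print(quantile_normalize(runs))
```

Median normalization only shifts each sample, so within-sample differences are preserved. Quantile normalization forces identical distributions, which is a stronger adjustment and could in principle affect a true differential expression signal, the kind of trade-off the thesis sets out to evaluate.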