19 results for Automatic tools

in Helda - Digital Repository of University of Helsinki


Relevance:

30.00%

Abstract:

This thesis studies the human gene expression space using high-throughput gene expression data from DNA microarrays. In molecular biology, high-throughput techniques allow numerical measurement of the expression of tens of thousands of genes simultaneously. In a single study, such data are traditionally obtained from a limited number of sample types with a small number of replicates. For organism-wide analysis, this data has been largely unavailable, and the global structure of the human transcriptome has remained unknown. This thesis introduces a human transcriptome map of different biological entities and an analysis of its general structure. The map is constructed from gene expression data from the two largest public microarray data repositories, GEO and ArrayExpress. The creation of this map contributed to the development of ArrayExpress by identifying and retrofitting previously unusable and missing data and by improving access to its data. It also contributed to the creation of several new tools for microarray data manipulation and to the establishment of data exchange between GEO and ArrayExpress. The data integration for the global map required the creation of a large new ontology of human cell types, disease states, organism parts and cell lines. The ontology was used in a new text-mining and decision-tree based method for the automatic conversion of human-readable free-text microarray data annotations into a categorised format. Data comparability and the minimisation of the systematic measurement errors characteristic of each laboratory in this large cross-laboratory integrated dataset were ensured by computing a range of microarray data quality metrics and excluding incomparable data. The structure of the global map of human gene expression was then explored by principal component analysis and hierarchical clustering, using heuristics and help from another purpose-built sample ontology. A preface and motivation for the construction and analysis of a global map of human gene expression are given by an analysis of two microarray datasets of human malignant melanoma. The analysis of these sets incorporates an indirect comparison of statistical methods for finding differentially expressed genes and points to the need to study gene expression on a global level.
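
To make the final analysis step concrete, the following is a minimal sketch of principal component analysis followed by hierarchical clustering of a samples-by-genes expression matrix. The random matrix, component count and cluster count are illustrative stand-ins, not the actual pipeline of the thesis.

    # Sketch: explore the structure of an expression matrix with PCA
    # and hierarchical clustering. Assumes a preprocessed, log-scaled
    # samples x genes matrix; all values here are synthetic.
    import numpy as np
    from sklearn.decomposition import PCA
    from scipy.cluster.hierarchy import linkage, fcluster

    rng = np.random.default_rng(0)
    expression = rng.normal(size=(200, 5000))  # stand-in for real microarray data

    # Project samples onto the first principal components.
    pca = PCA(n_components=10)
    components = pca.fit_transform(expression)

    # Agglomerative clustering on the reduced representation.
    tree = linkage(components, method="average", metric="euclidean")
    clusters = fcluster(tree, t=5, criterion="maxclust")
    print(pca.explained_variance_ratio_[:3], np.bincount(clusters)[1:])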

Relevance:

20.00%

Abstract:

The aim of this thesis is to develop a fully automatic lameness detection system that operates in a milking robot. The instrumentation, measurement software, algorithms for data analysis and a neural network model for lameness detection were developed. Automatic milking has become common practice in dairy husbandry; in 2006, about 4,000 farms worldwide used over 6,000 milking robots. There is a worldwide movement towards fully automating every process from feeding to milking. The increase in automation is a consequence of increasing farm sizes, the demand for more efficient production and the growth of labour costs. As the level of automation increases, the time the cattle keeper spends monitoring animals often decreases. This has created a need for systems that automatically monitor the health of farm animals. The popularity of milking robots also offers a new and unique possibility to monitor animals in a single confined space up to four times daily. Lameness is a crucial welfare issue in the modern dairy industry. Limb disorders cause serious welfare, health and economic problems, especially in loose housing of cattle. Lameness causes losses in milk production and leads to early culling of animals. These costs could be reduced with early identification and treatment. At present, only a few methods for automatically detecting lameness have been developed, and the most common methods used for lameness detection and assessment are various visual locomotion scoring systems. The problem with locomotion scoring is that it requires experience to be conducted properly, it is labour-intensive as an on-farm method, and the results are subjective. A four-balance system for measuring the leg load distribution of dairy cows during milking in order to detect lameness was developed and set up at the University of Helsinki research farm Suitia. The leg weights of 73 cows were successfully recorded during almost 10,000 robotic milkings over a period of 5 months. The cows were locomotion scored weekly, and the lame cows were inspected clinically for hoof lesions. Unsuccessful measurements, caused by cows standing outside the balances, were removed from the data with a special algorithm, and the mean leg loads and the number of kicks during milking were calculated. In order to develop an expert system to automatically detect lameness cases, a model was needed. A probabilistic neural network (PNN) classifier was chosen for the task. The data was divided into two parts: 5,074 measurements from 37 cows were used to train the model, which was then evaluated for its ability to detect lameness in a validation dataset of 4,868 measurements from 36 cows. The model classified 96% of the measurements correctly as sound or lame, and 100% of the lameness cases in the validation data were identified. The proportion of measurements causing false alarms was 1.1%. The developed model has the potential to be used for on-farm decision support and in a real-time lameness monitoring system.
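
The classifier family named in the abstract, the probabilistic neural network, is essentially a Parzen-window density classifier. The sketch below shows the idea; the feature layout (four mean leg loads plus a kick count), the smoothing parameter and the toy data are assumptions for illustration, not the thesis's actual model. In practice the features would first be standardized.

    # Sketch of a probabilistic neural network (Parzen-window) classifier:
    # each training measurement contributes a Gaussian kernel, and a new
    # measurement is assigned to the class with the larger mean response.
    import numpy as np

    def pnn_predict(X_train, y_train, x, sigma=10.0):
        """Assign x to the class with the largest kernel-density estimate."""
        scores = {}
        for cls in np.unique(y_train):
            diffs = X_train[y_train == cls] - x
            d2 = np.sum(diffs ** 2, axis=1)
            scores[cls] = np.mean(np.exp(-d2 / (2 * sigma ** 2)))
        return max(scores, key=scores.get)

    # Hypothetical features: mean load on each of four legs plus a kick count.
    X = np.array([[120, 118, 95, 122, 0],
                  [125, 119, 118, 121, 1],
                  [130, 60, 125, 128, 6]], dtype=float)
    y = np.array(["sound", "sound", "lame"])
    print(pnn_predict(X, y, np.array([128, 70, 124, 126, 5.0])))  # -> lame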

Relevance:

20.00%

Abstract:

Ubiquitous computing is about making computers and computerized artefacts a pervasive part of our everyday lives, bringing more and more activities into the realm of information. The computationalization and informationalization of everyday activities increase not only our reach, efficiency and capabilities but also the amount and kinds of data gathered about us and our activities. In this thesis, I explore how information systems can be constructed so that they handle this personal data in a reasonable manner. The thesis provides two kinds of results: on the one hand, tools and methods for both the construction and the evaluation of ubiquitous and mobile systems; on the other, an evaluation of the privacy aspects of a ubiquitous social awareness system. The work emphasises real-world experiments as the most important way to study privacy. Additionally, the state of current information systems as regards data protection is studied. The tools and methods in this thesis consist of three distinct contributions. An algorithm for positioning in cellular networks is proposed that does not require the location information to be revealed beyond the user's terminal. A prototyping platform for the creation of context-aware ubiquitous applications, called ContextPhone, is described and released as open source. Finally, a set of methodological findings on the use of smartphones in social-scientific field research is reported. A central contribution of this thesis is the set of pragmatic tools that allows other researchers to carry out experiments. The evaluation of the ubiquitous social awareness application ContextContacts covers both the usage of the system in general and an analysis of its privacy implications. Based on several long-term field studies, the usage of the system is analyzed in light of how users make inferences about others from real-time contextual cues mediated by the system. The analysis of privacy implications draws together the social-psychological theory of self-presentation and research on privacy in ubiquitous computing, deriving a set of design guidelines for such systems. The main findings from these studies can be summarized as follows. The fact that ubiquitous computing systems gather more data about users can be used not only to study the use of such systems in order to improve them, but also to study previously unstudied phenomena, such as the dynamic change of social networks. Systems that let people create new ways of presenting themselves to others can be fun for the users, but self-presentation requires several thoughtful design decisions that allow the manipulation of the image mediated by the system. Finally, the growing amount of computational resources available to users can be used to let them use the data themselves, rather than remaining passive subjects of data gathering.
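
The positioning idea can be illustrated with a generic sketch in which the terminal itself learns a private mapping from observed cell IDs to coordinates and computes its position locally, so no location data leaves the device. This shows only the general principle; the class, data layout and cell identifiers below are invented, not the algorithm from the thesis.

    # Sketch of on-terminal cell-ID positioning: GPS fixes observed while
    # attached to a cell are stored locally, and later position estimates
    # are computed on the device from those private records.
    from collections import defaultdict

    class LocalCellLocator:
        def __init__(self):
            self._fixes = defaultdict(list)  # cell_id -> [(lat, lon), ...]

        def record_fix(self, cell_id, lat, lon):
            """Store a GPS fix seen while attached to cell_id (stays on device)."""
            self._fixes[cell_id].append((lat, lon))

        def estimate(self, visible_cells):
            """Average the stored fixes of currently visible cells."""
            points = [p for c in visible_cells for p in self._fixes.get(c, [])]
            if not points:
                return None
            lat = sum(p[0] for p in points) / len(points)
            lon = sum(p[1] for p in points) / len(points)
            return lat, lon

    loc = LocalCellLocator()
    loc.record_fix("244-91-10021", 60.205, 24.962)
    loc.record_fix("244-91-10022", 60.204, 24.960)
    print(loc.estimate(["244-91-10021", "244-91-10022"]))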

Relevance:

20.00%

Abstract:

Free and Open Source Software (FOSS) has attracted increasing interest in the software industry, but assessing its quality remains a challenge. FOSS development is frequently carried out by globally distributed development teams, and all stages of development are publicly visible. Several product- and process-level quality factors can be measured using this public data. This thesis presents a theoretical background for software quality and metrics and their application in a FOSS environment. The information available from FOSS projects in three information spaces is presented, and a quality model suitable for use in a FOSS context is constructed. The model includes both process and product quality metrics, and takes into account the tools and working methods commonly used in FOSS projects. A subset of the constructed quality model is applied to three FOSS projects, highlighting both theoretical and practical concerns in implementing automatic metric collection and analysis. The experiment shows that useful quality information can be extracted from the vast amount of data available; in particular, the projects vary in their growth rate, complexity, modularity and team structure.
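
As an example of the kind of automatic metric collection discussed, the following sketch derives a simple team-structure indicator from a project's public version history. The use of git and the specific metric are assumptions for illustration; the thesis's own metrics and tooling may differ.

    # Sketch: count commits per author from a repository's git history
    # and report the share of commits held by the most active authors.
    # Assumes the path points at an existing git checkout.
    import subprocess
    from collections import Counter

    def commit_authors(repo_path):
        """Return a Counter of commits per author e-mail from `git log`."""
        out = subprocess.run(
            ["git", "-C", repo_path, "log", "--pretty=format:%ae"],
            capture_output=True, text=True, check=True,
        ).stdout
        return Counter(out.splitlines())

    authors = commit_authors(".")
    total = sum(authors.values())
    # A simple team-structure indicator: commit share of the top authors.
    print(total, [(a, n / total) for a, n in authors.most_common(5)])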

Relevance:

20.00%

Abstract:

Bioremediation, the exploitation of the intrinsic ability of environmental microbes to degrade and remove harmful compounds from nature, is considered an environmentally sustainable and cost-effective means of environmental clean-up. However, a comprehensive understanding of the biodegradation potential of microbial communities and their response to decontamination measures is required for the effective management of bioremediation processes. In this thesis, the potential of hydrocarbon-degradative genes as indicators of aerobic hydrocarbon biodegradation was investigated. Small-scale functional gene macro- and microarrays targeting aliphatic, monoaromatic and low-molecular-weight polyaromatic hydrocarbon biodegradation were developed in order to simultaneously monitor the biodegradation of mixtures of hydrocarbons. The validity of the array analysis in monitoring hydrocarbon biodegradation was evaluated in microcosm studies and field-scale bioremediation processes by comparing the hybridization signal intensities to hydrocarbon mineralization, real-time polymerase chain reaction (PCR), dot blot hybridization and both chemical and microbiological monitoring data. The results obtained by real-time PCR, dot blot hybridization and gene array analysis were in good agreement with hydrocarbon biodegradation in laboratory-scale microcosms. The mineralization of several hydrocarbons could be monitored simultaneously using gene array analysis. In the field-scale bioremediation processes, the detection and enumeration of hydrocarbon-degradative genes provided important additional information for process optimization and design. In creosote-contaminated groundwater, gene array analysis demonstrated that the aerobic biodegradation potential present at the site, though restrained under oxygen-limited conditions, could be successfully stimulated with aeration and nutrient infiltration. During ex situ bioremediation of diesel oil- and lubrication oil-contaminated soil, the functional gene array analysis revealed inefficient hydrocarbon biodegradation caused by poor aeration during composting. The functional gene array specifically detected the upper and lower biodegradation pathways required for complete mineralization of hydrocarbons. Bacteria representing 1% of the microbial community could be detected without prior PCR amplification. Molecular biological monitoring methods based on functional genes provide powerful tools for the development of more efficient remediation processes. The parallel detection of several functional genes using functional gene array analysis is an especially promising tool for monitoring the biodegradation of mixtures of hydrocarbons.
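
As a hedged illustration of how array hybridization signals can be turned into detection calls, the sketch below marks a probe as positive when its signal exceeds the mean of the negative-control spots by a chosen number of standard deviations. The gene names, numbers and thresholding rule are illustrative, not the scheme used in the thesis.

    # Sketch: simple detection call for functional gene array probes
    # against a background threshold derived from negative controls.
    import statistics

    def detected(probe_signals, negative_controls, k=2.0):
        """Mark a probe positive if its signal exceeds mean + k*sd of controls."""
        mu = statistics.mean(negative_controls)
        sd = statistics.stdev(negative_controls)
        cutoff = mu + k * sd
        return {gene: signal > cutoff for gene, signal in probe_signals.items()}

    # Hypothetical probe intensities for three degradative genes.
    signals = {"alkB": 1450.0, "xylE": 310.0, "nahAc": 980.0}
    controls = [290.0, 305.0, 318.0, 297.0, 300.0]
    print(detected(signals, controls))  # alkB and nahAc called positive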

Relevance:

20.00%

Abstract:

Mutation and recombination are the fundamental processes leading to genetic variation in natural populations. This variation forms the raw material for evolution through natural selection and drift. Therefore, studying mutation rates may reveal information about evolutionary histories as well as the phylogenetic interrelationships of organisms. In this thesis, two molecular tools, DNA barcoding and the molecular clock, were examined. In the first part, the efficiency of mutations in delineating closely related species was tested and the implications for conservation practices were assessed. The second part investigated the proposition that a constant mutation rate exists within invertebrates, in the form of a metabolic-rate-dependent molecular clock, which can be applied to accurately date speciation events. DNA barcoding aspires to be an efficient technique that not only distinguishes between species but also reveals population-level variation, relying solely on mutations found in a short stretch of a single gene. In this thesis, barcoding was applied to discriminate between Hylochares populations from Russian Karelia and new Hylochares findings from the greater Helsinki region in Finland. Although barcoding failed to delineate the two reproductively isolated groups, their distinct morphological features and differing life-history traits led to their classification as two closely related, though separate, species. The lack of genetic differentiation appears to be due to a recent divergence event not yet reflected in the beetles' molecular make-up. Thus, the Russian Hylochares was described as a new species. The Finnish species, previously considered locally extinct, was recognized as endangered. Even if, due to their identical genetic make-up, the populations had been regarded as conspecific, conservation strategies based on prior knowledge from Russia would not have guaranteed the survival of the Finnish beetle. Therefore, new conservation actions, based on detailed studies of the biology and life history of the Finnish Hylochares, were conducted to protect this endemic rarity in Finland. The idea behind the strict molecular clock is that mutation rates are constant over evolutionary time and may thus be used to infer species divergence dates. However, one of the most recent theories argues that a strict clock does not tick per unit of time but has a constant substitution rate per unit of mass-specific metabolic energy. According to this hypothesis, molecular clocks have to be recalibrated taking body size and temperature into account. This thesis tested the effect of temperature on mutation rates in equally sized invertebrates. For the first dataset (family Eucnemidae, Coleoptera), the phylogenetic interrelationships and evolutionary history of the genus Arrhipis had to be inferred before the influence of temperature on substitution rates could be studied. Further, a second, larger invertebrate dataset (family Syrphidae, Diptera) was employed. Several methodological approaches, a number of genes and multiple molecular clock models revealed no consistent relationship between temperature and mutation rate for the taxa under study. Thus, the body-size effect, observed in vertebrates but controversial for invertebrates, rather than temperature, may be the driving force underlying the metabolic-rate-dependent molecular clock. The metabolic-rate-dependent molecular clock therefore does not hold for the invertebrate groups studied here.
This thesis emphasizes that molecular techniques relying on mutation rates have to be applied with caution. Whereas they may work satisfactorily under certain conditions for specific taxa, they may fail for others. Both the molecular clock and DNA barcoding should incorporate all the information and data available to obtain comprehensive estimates of existing biodiversity and its evolutionary history.
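
For readers unfamiliar with the distance computation underlying barcoding, the following sketch implements the Kimura two-parameter (K2P) distance conventionally used for COI barcode comparisons. The sequences are invented; the thesis may have used different models or software.

    # Sketch: Kimura two-parameter distance between two aligned sequences,
    # d = -0.5 * ln((1 - 2P - Q) * sqrt(1 - 2Q)), where P and Q are the
    # proportions of transitions and transversions. Gaps/ambiguities skipped.
    import math

    def k2p_distance(seq1, seq2):
        purines, pyrimidines = {"A", "G"}, {"C", "T"}
        n = transitions = transversions = 0
        for a, b in zip(seq1, seq2):
            if a not in "ACGT" or b not in "ACGT":
                continue
            n += 1
            if a == b:
                continue
            if {a, b} <= purines or {a, b} <= pyrimidines:
                transitions += 1
            else:
                transversions += 1
        p, q = transitions / n, transversions / n
        return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))

    print(k2p_distance("ACGTACGTACGTACGTACGT", "ACGTACGAACGTACGCACGT"))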

Relevance:

20.00%

Abstract:

Although immensely complex, speech is also a very efficient means of communication between humans. Understanding how we acquire the skills necessary for perceiving and producing speech remains an intriguing goal for research. However, while learning is likely to begin as soon as we start hearing speech, the tools for studying language acquisition strategies in the earliest stages of development remain scarce. One prospective strategy is statistical learning. In order to investigate its role in language development, we designed a new research method. The method was tested in adults using magnetoencephalography (MEG) as a measure of cortical activity. Neonatal brain activity was measured with electroencephalography (EEG). Additionally, we developed a method for assessing the integration of seen and heard syllables in the developing brain, as well as a method for assessing the role of visual speech in learning phoneme categories. The MEG study showed that adults learn statistical properties of speech during passive listening to syllables. The amplitude of the N400m component of the event-related magnetic fields (ERFs) reflected the location of syllables within pseudowords. The amplitude was also enhanced for syllables in a statistically unexpected position. The results suggest a role for the N400m component in statistical learning studies in adults. Using the same research design with sleeping newborn infants, the auditory event-related potentials (ERPs) measured with EEG reflected the location of syllables within pseudowords. The results were successfully replicated in another group of infants, showing that even newborn infants have a powerful mechanism for the automatic extraction of statistical characteristics from speech. We also found that 5-month-old infants integrate some auditory and visual syllables into a fused percept, whereas other syllable combinations are not fully integrated. Auditory syllables were paired with visual syllables possessing a different phonetic identity, and the ERPs for these artificial syllable combinations were compared with the ERPs for normal syllables. For congruent auditory-visual syllable combinations, the ERPs did not differ from those for normal syllables; for incongruent combinations, however, we observed a mismatch response in the ERPs. The results show an early ability to perceive speech cross-modally. Finally, we exposed two groups of 6-month-old infants to artificially created auditory syllables located between two stereotypical English syllables in formant space. The auditory syllables followed, equally for both groups, a unimodal statistical distribution, suggestive of a single phoneme category. The visual syllables combined with the auditory syllables, however, differed between the two groups: one group received visual stimuli suggestive of two separate phoneme categories, the other visual stimuli suggestive of only one. After a short exposure, we observed different learning outcomes for the two groups of infants, showing that visual speech can influence the learning of phoneme categories. Altogether, the results demonstrate that complex language learning skills exist from birth, and they suggest a role for the visual component of speech in the learning of phoneme categories.
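
The statistical regularity exploited in such studies is typically the transitional probability between adjacent syllables, which is high within pseudowords and low across their boundaries. The sketch below computes these probabilities for an invented syllable stream; the stimuli and designs of the thesis experiments were of course more elaborate.

    # Sketch: transitional probabilities P(next | current) between
    # adjacent syllables in a continuous stream. Within-word transitions
    # come out deterministic; between-word transitions do not.
    from collections import Counter, defaultdict

    def transitional_probabilities(stream):
        pair_counts = Counter(zip(stream, stream[1:]))
        first_counts = Counter(stream[:-1])
        tp = defaultdict(dict)
        for (a, b), n in pair_counts.items():
            tp[a][b] = n / first_counts[a]
        return tp

    # Two invented pseudowords ("tu pi ro" and "go la bu") in a stream.
    stream = "tu pi ro go la bu tu pi ro go la bu go la bu tu pi ro".split()
    tp = transitional_probabilities(stream)
    print(tp["tu"]["pi"], tp["ro"].get("go"), dict(tp["bu"]))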

Relevance:

20.00%

Abstract:

The analysis of lipid compositions from biological samples has become increasingly important. Lipids have a role in cardiovascular disease, metabolic syndrome and diabetes, and they participate in cellular processes such as signalling, inflammatory response, aging and apoptosis. Moreover, the mechanisms regulating cell membrane lipid compositions are poorly understood, partly because of a lack of good analytical methods. Mass spectrometry has opened up new possibilities for lipid analysis due to its high resolving power, its sensitivity and the possibility of structural identification by fragment analysis. The introduction of electrospray ionization (ESI) and advances in instrumentation revolutionized the analysis of lipid compositions. ESI is a soft ionization method, i.e. it avoids unwanted fragmentation of the lipids. Mass spectrometric analysis of lipid compositions is complicated by the incomplete separation of signals, differences in the instrument response of different lipids and the large amount of data generated by the measurements. These factors necessitate the use of computer software for the analysis of the data. The topic of this thesis is the development of methods for the mass spectrometric analysis of lipids. The work includes both computational and experimental aspects of lipid analysis. The first article explores the practical aspects of quantitative mass spectrometric analysis of complex lipid samples and describes how the properties of phospholipids and their concentration affect the response of the mass spectrometer. The second article describes a new algorithm for computing the theoretical mass spectrometric peak distribution, given the elemental isotope composition and the molecular formula of a compound. The third article introduces programs aimed specifically at the analysis of complex lipid samples and discusses different computational methods for separating the overlapping mass spectrometric peaks of closely related lipids. The fourth article applies the developed methods: the progress curves of enzymatic hydrolysis are measured simultaneously for a large number of phospholipids and used to determine the substrate specificity of various A-type phospholipases. The data provide evidence that substrate efflux from the bilayer is the key factor determining the rate of hydrolysis.
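
The textbook approach behind theoretical isotope patterns, convolving per-element isotope distributions one atom at a time and pruning negligible probabilities, can be sketched as follows. This is a generic illustration, not the specific algorithm of the second article; the isotope table is abbreviated to C, H and O.

    # Sketch: theoretical isotope pattern of a molecular formula by
    # repeated convolution of elemental isotope distributions.
    ISOTOPES = {
        "C": [(12.0000, 0.9893), (13.0034, 0.0107)],
        "H": [(1.0078, 0.99988), (2.0141, 0.00012)],
        "O": [(15.9949, 0.99757), (16.9991, 0.00038), (17.9992, 0.00205)],
    }

    def convolve(dist, element, threshold=1e-8):
        """Add one atom of `element` to the mass/probability distribution."""
        out = {}
        for mass, p in dist.items():
            for m_iso, p_iso in ISOTOPES[element]:
                key = round(mass + m_iso, 4)
                out[key] = out.get(key, 0.0) + p * p_iso
        return {m: p for m, p in out.items() if p > threshold}

    def isotope_pattern(formula):
        """formula: dict such as {'C': 6, 'H': 12, 'O': 6} (glucose)."""
        dist = {0.0: 1.0}
        for element, count in formula.items():
            for _ in range(count):
                dist = convolve(dist, element)
        return sorted(dist.items())

    for mass, p in isotope_pattern({"C": 6, "H": 12, "O": 6})[:4]:
        print(f"{mass:.4f}  {p:.5f}")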