15 results for Analysis Tools
in Helda - Digital Repository of University of Helsinki
Abstract:
This work belongs to the field of computational high-energy physics (HEP). The key methods used in this thesis to meet the challenges raised by the Large Hadron Collider (LHC) era experiments are object-orientation with software engineering, Monte Carlo simulation, the computer technology of clusters, and artificial neural networks. The first aspect discussed is the development of hadronic cascade models, used for the accurate simulation of medium-energy hadron-nucleus reactions, up to 10 GeV. These models are typically needed in hadronic calorimeter studies and in the estimation of radiation backgrounds. Various applications outside HEP include the medical field (such as hadron treatment simulations), space science (satellite shielding), and nuclear physics (spallation studies). Validation results are presented for several significant improvements released in the Geant4 simulation tool, and the significance of the new models for computing in the Large Hadron Collider era is estimated. In particular, we estimate the ability of the Bertini cascade to simulate the Compact Muon Solenoid (CMS) hadron calorimeter (HCAL). LHC test beam activity has a tightly coupled cycle of simulation-to-data analysis. Typically, a Geant4 computer experiment is used to understand test beam measurements. Thus another aspect of this thesis is a description of studies related to developing new CMS H2 test beam data analysis tools and performing data analysis on the basis of CMS Monte Carlo events. These events have been simulated in detail using Geant4 physics models, full CMS detector description, and event reconstruction. Using the ROOT data analysis framework we have developed an offline ANN-based approach to tag b-jets associated with heavy neutral Higgs particles, and we show that this kind of NN methodology can be successfully used to separate the Higgs signal from the background in the CMS experiment.
Abstract:
Gene mapping is a systematic search for genes that affect observable characteristics of an organism. In this thesis we offer computational tools to improve the efficiency of (disease) gene-mapping efforts. In the first part of the thesis we propose an efficient simulation procedure for generating realistic genetic data from isolated populations. Simulated data is useful for evaluating hypothesised gene-mapping study designs and computational analysis tools. As an example of such evaluation, we demonstrate how a population-based study design can be a powerful alternative to traditional family-based designs in association-based gene-mapping projects. In the second part of the thesis we consider the prioritisation of a (typically large) set of putative disease-associated genes acquired from an initial gene-mapping analysis. Prioritisation is necessary to be able to focus on the most promising candidates. We show how to harness the current biomedical knowledge for the prioritisation task by integrating various publicly available biological databases into a weighted biological graph. We then demonstrate how to find and evaluate connections between entities, such as genes and diseases, from this unified schema by graph mining techniques. Finally, in the last part of the thesis, we define the concept of a reliable subgraph and the corresponding subgraph extraction problem. Reliable subgraphs concisely describe strong and independent connections between two given vertices in a random graph, and hence they are especially useful for visualising such connections. We propose novel algorithms for extracting reliable subgraphs from large random graphs. The efficiency and scalability of the proposed graph mining methods are backed by extensive experiments on real data. While our application focus is on genetics, the concepts and algorithms can be applied to other domains as well.
We demonstrate this generality by considering coauthor graphs in addition to biological graphs in the experiments.
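To make the notion of connection strength in a random graph concrete, the sketch below estimates the probability that two vertices are connected when each edge exists independently with a given probability. It is a plain Monte Carlo estimate written for illustration, not the extraction algorithms proposed in the thesis; the toy graph, its vertex names, and its edge probabilities are hypothetical.

```python
import random


def connected(edges, source, target):
    """Return True if target is reachable from source over undirected edges."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    seen, stack = {source}, [source]
    while stack:
        node = stack.pop()
        if node == target:
            return True
        for neighbour in adjacency.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return False


def estimate_reliability(prob_edges, source, target, samples=20000, seed=0):
    """Monte Carlo estimate of the probability that source and target are connected."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        # Draw one realisation of the random graph: keep each edge with its probability.
        realised = [(u, v) for (u, v, p) in prob_edges if rng.random() < p]
        if connected(realised, source, target):
            hits += 1
    return hits / samples


# Hypothetical toy graph: one two-step path and one weak direct edge.
toy_graph = [
    ("gene", "protein", 0.9),
    ("protein", "disease", 0.8),
    ("gene", "disease", 0.3),
]
reliability = estimate_reliability(toy_graph, "gene", "disease")
```

For this toy graph the exact value is 1 - (1 - 0.9 * 0.8) * (1 - 0.3) = 0.804, so the estimate should land close to that figure. A subgraph that keeps most of this probability with few edges is the kind of object the abstract describes as reliable.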
Abstract:
This thesis presents a highly sensitive genome-wide search method for recessive mutations. The method is suitable for distantly related samples that are divided into phenotype positives and negatives. High-throughput genotype arrays are used to identify and compare homozygous regions between the cohorts. The method is demonstrated by comparing colorectal cancer patients against unaffected references. The objective is to find homozygous regions and alleles that are more common in cancer patients. We have designed and implemented software tools to automate the data analysis from genotypes to lists of candidate genes and to their properties. The programs have been designed around a pipeline architecture that allows their integration with other programs such as biological databases and copy number analysis tools. The integration of the tools is crucial as the genome-wide analysis of the cohort differences produces many candidate regions not related to the studied phenotype. CohortComparator is a genotype comparison tool that detects homozygous regions and compares their loci and allele constitutions between two sets of samples. The data is visualised in chromosome-specific graphs illustrating the homozygous regions and alleles of each sample. The genomic regions that may harbour recessive mutations are emphasised with different colours and a scoring scheme is given for these regions. The detection of homozygous regions, cohort comparisons and result annotations are all subject to assumptions, many of which have been parameterized in our programs. The effect of these parameters and the suitable scope of the methods have been evaluated. Samples with different resolutions can be balanced with the genotype estimates of their haplotypes and they can be used within the same study.
Abstract:
This thesis studies human gene expression space using high-throughput gene expression data from DNA microarrays. In molecular biology, high-throughput techniques allow numerical measurements of expression of tens of thousands of genes simultaneously. In a single study, this data is traditionally obtained from a limited number of sample types with a small number of replicates. For organism-wide analysis, this data has been largely unavailable and the global structure of the human transcriptome has remained unknown. This thesis introduces a human transcriptome map of different biological entities and analysis of its general structure. The map is constructed from gene expression data from the two largest public microarray data repositories, GEO and ArrayExpress. The creation of this map contributed to the development of ArrayExpress by identifying and retrofitting the previously unusable and missing data and by improving the access to its data. It also contributed to the creation of several new tools for microarray data manipulation and the establishment of data exchange between GEO and ArrayExpress. The data integration for the global map required creation of a new large ontology of human cell types, disease states, organism parts and cell lines. The ontology was used in a new text-mining and decision-tree-based method for automatic conversion of human-readable free-text microarray data annotations into a categorised format. The data comparability and minimisation of the systematic measurement errors that are characteristic of each laboratory in this large cross-laboratory integrated dataset were ensured by computation of a range of microarray data quality metrics and exclusion of incomparable data. The structure of a global map of human gene expression was then explored by principal component analysis and hierarchical clustering using heuristics and help from another purpose-built sample ontology.
A preface and motivation for the construction and analysis of a global map of human gene expression is given by analysis of two microarray datasets of human malignant melanoma. The analysis of these sets incorporates indirect comparison of statistical methods for finding differentially expressed genes and points to the need to study gene expression on a global level.
Abstract:
Bioremediation, which is the exploitation of the intrinsic ability of environmental microbes to degrade and remove harmful compounds from nature, is considered to be an environmentally sustainable and cost-effective means for environmental clean-up. However, a comprehensive understanding of the biodegradation potential of microbial communities and their response to decontamination measures is required for the effective management of bioremediation processes. In this thesis, the potential to use hydrocarbon-degradative genes as indicators of aerobic hydrocarbon biodegradation was investigated. Small-scale functional gene macro- and microarrays targeting aliphatic, monoaromatic and low molecular weight polyaromatic hydrocarbon biodegradation were developed in order to simultaneously monitor the biodegradation of mixtures of hydrocarbons. The validity of the array analysis in monitoring hydrocarbon biodegradation was evaluated in microcosm studies and field-scale bioremediation processes by comparing the hybridization signal intensities to hydrocarbon mineralization, real-time polymerase chain reaction (PCR), dot blot hybridization and both chemical and microbiological monitoring data. The results obtained by real-time PCR, dot blot hybridization and gene array analysis were in good agreement with hydrocarbon biodegradation in laboratory-scale microcosms. Mineralization of several hydrocarbons could be monitored simultaneously using gene array analysis. In the field-scale bioremediation processes, the detection and enumeration of hydrocarbon-degradative genes provided important additional information for process optimization and design. In creosote-contaminated groundwater, gene array analysis demonstrated that the aerobic biodegradation potential that was present at the site, but restrained under the oxygen-limited conditions, could be successfully stimulated with aeration and nutrient infiltration. 
During ex situ bioremediation of diesel oil- and lubrication oil-contaminated soil, the functional gene array analysis revealed inefficient hydrocarbon biodegradation, caused by poor aeration during composting. The functional gene array specifically detected upper and lower biodegradation pathways required for complete mineralization of hydrocarbons. Bacteria representing 1 % of the microbial community could be detected without prior PCR amplification. Molecular biological monitoring methods based on functional genes provide powerful tools for the development of more efficient remediation processes. The parallel detection of several functional genes using functional gene array analysis is an especially promising tool for monitoring the biodegradation of mixtures of hydrocarbons.
Abstract:
The analysis of lipid compositions from biological samples has become increasingly important. Lipids have a role in cardiovascular disease, metabolic syndrome and diabetes. They also participate in cellular processes such as signalling, inflammatory response, aging and apoptosis. Also, the mechanisms of regulation of cell membrane lipid compositions are poorly understood, partly because of a lack of good analytical methods. Mass spectrometry has opened up new possibilities for lipid analysis due to its high resolving power, sensitivity and the possibility to do structural identification by fragment analysis. The introduction of electrospray ionization (ESI) and the advances in instrumentation revolutionized the analysis of lipid compositions. ESI is a soft ionization method, i.e. it avoids unwanted fragmentation of the lipids. Mass spectrometric analysis of lipid compositions is complicated by incomplete separation of the signals, the differences in the instrument response of different lipids and the large amount of data generated by the measurements. These factors necessitate the use of computer software for the analysis of the data. The topic of the thesis is the development of methods for mass spectrometric analysis of lipids. The work includes both computational and experimental aspects of lipid analysis. The first article explores the practical aspects of quantitative mass spectrometric analysis of complex lipid samples and describes how the properties of phospholipids and their concentration affect the response of the mass spectrometer. The second article describes a new algorithm for computing the theoretical mass spectrometric peak distribution, given the elemental isotope composition and the molecular formula of a compound. The third article introduces programs aimed specifically at the analysis of complex lipid samples and discusses different computational methods for separating the overlapping mass spectrometric peaks of closely related lipids.
The fourth article applies the methods developed by simultaneously measuring the progress curve of enzymatic hydrolysis for a large number of phospholipids, which are used to determine the substrate specificity of various A-type phospholipases. The data provides evidence that the substrate efflux from the bilayer is the key determining factor for the rate of hydrolysis.
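As an illustration of what a theoretical peak distribution computed from a molecular formula and elemental isotope composition looks like, the sketch below convolves per-element isotope abundances atom by atom. This is the textbook polynomial-convolution approach at nominal mass resolution, not the new algorithm described in the second article; the abundance values are approximate natural abundances entered here as assumptions, and the function names are hypothetical.

```python
# Approximate natural isotope abundances, indexed by nominal mass offset
# from the lightest isotope (assumed illustrative values).
ISOTOPES = {
    "C": [0.9893, 0.0107],            # 12C, 13C
    "H": [0.999885, 0.000115],        # 1H, 2H
    "O": [0.99757, 0.00038, 0.00205], # 16O, 17O, 18O
}


def convolve(a, b):
    """Multiply two abundance polynomials (discrete convolution)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def isotope_pattern(formula, max_peaks=5):
    """Relative abundances of the M, M+1, M+2, ... peaks for a formula.

    formula maps element symbols to atom counts, e.g. {"C": 6, "H": 12, "O": 6}.
    Truncating after each step is exact for the peaks that are kept, because
    a peak at offset k only depends on contributions at offsets <= k.
    """
    pattern = [1.0]
    for element, count in formula.items():
        for _ in range(count):
            pattern = convolve(pattern, ISOTOPES[element])[:max_peaks]
    return pattern


glucose = isotope_pattern({"C": 6, "H": 12, "O": 6})
```

The first entry of `glucose` is the monoisotopic fraction and the second is the M+1 peak, which for carbon-rich lipids grows quickly with chain length. That growth is one reason the peaks of closely related lipids overlap.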
Abstract:
XVIII IUFRO World Congress, Ljubljana 1986.
Abstract:
The feasibility of different modern analytical techniques for the mass spectrometric detection of anabolic androgenic steroids (AAS) in human urine was examined in order to enhance the prevalent analytics and to find reasonable strategies for effective sports drug testing. A comparative study of the sensitivity and specificity between gas chromatography (GC) combined with low (LRMS) and high resolution mass spectrometry (HRMS) in screening of AAS was carried out with four metabolites of methandienone. Measurements were done in selected ion monitoring mode with HRMS using a mass resolution of 5000. With HRMS the detection limits were considerably lower than with LRMS, enabling detection of steroids at low 0.2-0.5 ng/ml levels. However, also with HRMS, the biological background hampered the detection of some steroids. The applicability of liquid-phase microextraction (LPME) was studied with metabolites of fluoxymesterone, 4-chlorodehydromethyltestosterone, stanozolol and danazol. Factors affecting the extraction process were studied and a novel LPME method with in-fiber silylation was developed and validated for GC/MS analysis of the danazol metabolite. The method allowed precise, selective and sensitive analysis of the metabolite and enabled simultaneous filtration, extraction, enrichment and derivatization of the analyte from urine without any other steps in sample preparation. Liquid chromatographic/tandem mass spectrometric (LC/MS/MS) methods utilizing electrospray ionization (ESI), atmospheric pressure chemical ionization (APCI) and atmospheric pressure photoionization (APPI) were developed and applied for detection of oxandrolone and metabolites of stanozolol and 4-chlorodehydromethyltestosterone in urine. All methods exhibited high sensitivity and specificity. 
ESI showed, however, the best applicability, and an LC/ESI-MS/MS method for routine screening of nine 17-alkyl-substituted AAS was thus developed, enabling fast and precise measurement of all analytes with detection limits below 2 ng/ml. The potential of chemometrics to resolve complex GC/MS data was demonstrated with samples prepared for AAS screening. Acquired full scan spectral data (m/z 40-700) were processed by the OSCAR algorithm (Optimization by Stepwise Constraints of Alternating Regression). The deconvolution process was able to extract from a GC/MS run more than double the number of components compared with the number of visible chromatographic peaks. Severely overlapping components, as well as components hidden in the chromatographic background, could be isolated successfully. All studied techniques proved to be useful analytical tools to improve detection of AAS in urine. Superiority of different procedures is, however, compound-dependent and different techniques complement each other.
Abstract:
The issue of the usefulness of different prosopis species versus their status as weeds is a matter of hot debate around the world. The tree Prosopis juliflora had until 2000 been proclaimed weedy in its native range in South America and elsewhere in the dry tropics. P. juliflora or mesquite has a 90-year history in Sudan. During the early 1990s a popular opinion in central Sudan and the Sudanese Government had begun to consider prosopis a noxious weed and a problematic tree species due to its aggressive ability to invade farmlands and pastures, especially in and around irrigated agricultural lands. As a consequence prosopis was officially declared an invasive alien species also in Sudan, and in 1995 a presidential decree for its eradication was issued. Using a total economic valuation (TEV) approach, this study analysed the impacts of prosopis on the local livelihoods in two contrasting irrigated agricultural schemes. Primarily a problem-based approach was used in which the derivation of non-market values was captured using ecological economic tools. In the New Halfa Irrigation Scheme in Kassala State, four separate household surveys were conducted due to diversity between the respective population groups. The main aim was here to study the magnitude of environmental economic benefits and costs derived from the invasion of prosopis in a large agricultural irrigation scheme on clay soil. Another study site, the Gandato Irrigation Scheme in River Nile State represented impacts from prosopis that an irrigation scheme was confronted with on sandy soil in the arid and semi-arid ecozones along the main River Nile. The two cases showed distinctly different effects of prosopis but both indicated the benefits to exceed the costs. The valuation on clay soil in New Halfa identified a benefit/cost ratio of 2.1, while this indicator equalled 46 on the sandy soils of Gandato. The valuation results were site-specific and based on local market prices. 
The most important beneficial impacts of prosopis on local livelihoods were derived from free-grazing forage for livestock, environmental conservation of the native vegetation, wood and non-wood forest products, as well as shelterbelt effects. The main social costs from prosopis were derived from weeding and clearing it from farm lands and from canalsides, from thorn injuries to humans and livestock, as well as from repair expenses for vehicle tyre punctures. Of the population groups, the tenants faced most of the detrimental impacts, while the landless population groups (originating from western and eastern Sudan) as well as the nomads were highly dependent on this tree resource. For the Gandato site the monetized benefit-cost ratio of 46 still excluded several additional beneficial impacts of prosopis in the area that were difficult to quantify and monetize credibly. In River Nile State the beneficial impact could thus be seen as completely outweighing the costs of prosopis. The results can contribute to the formulation of national and local forest and agricultural policies related to prosopis in Sudan and can also be used in other countries faced with similar impacts caused by this tree.
Abstract:
Acacia senegal, the gum arabic producing tree, is the most important component in traditional dryland agroforestry systems in the Blue Nile region, Sudan. The aim of the present study was to provide new knowledge on the potential use of A. senegal in dryland agroforestry systems on clay soils, as well as information on tree/crop interaction, and on silvicultural and management tools, with consideration on system productivity, nutrient cycling and sustainability. Moreover, the aim was also to clarify the intra-specific variation in the performance of A. senegal and, specifically, the adaptation of trees of different origin to the clay soils of the Blue Nile region. In agroforestry systems established at the beginning of the study, tree and crop growth, water use, gum and crop yields, nutrient cycling and system performance were investigated for a period of four years (1999 to 2002). Trees were grown at 5 x 5 m and 10 x 10 m spacing alone or in mixture with sorghum or sesame; crops were also grown in sole culture. The symbiotic biological N2 fixation by A. senegal was estimated using the 15N natural abundance (δ15N) procedure in eight provenances collected from different environments and soil types of the gum arabic belt and grown in clay soil in the Blue Nile region. Balanites aegyptiaca (a non-legume) was used as a non-N-fixing reference tree species, so as to allow 15N-based estimates of the proportion of the nitrogen in trees derived from the atmosphere. In the planted acacia trees, measurements were made on shoot growth, water-use efficiency (as assessed by the δ13C method) and (starting from the third year) gum production. Carbon isotope ratios were obtained from the leaves and branch wood samples. The agroforestry system design caused no statistically significant variation in water use, but the variation was highly significant between years, and the highest water use occurred in the years with high rainfall. 
No statistically significant differences were found in sorghum or sesame yields when intercropping and sole crop systems were compared (yield averages were 1.54 and 1.54 t ha-1 for sorghum and 0.36 and 0.42 t ha-1 for sesame in the intercropped and mono-crop plots, respectively). Thus, at an early stage of agroforestry system management, A. senegal had no detrimental effect on crop yield, but the pattern of resource capture by trees and crops may change as the system matures. Intercropping resulted in taller trees and larger basal and crown diameters as compared to the development of sole trees. It also resulted in a higher land equivalent ratio. When gum yields were analysed it was found that a significant positive relationship existed between the second gum picking and the total gum yield. The second gum picking seems to be a decisive factor in gum production and could be used as an indicator for the total gum yield in a particular year. In trees, the concentrations of N and P were higher in leaves and roots, whereas the levels of K were higher in stems, branches and roots. Soil organic matter, N, P and K contents were highest in the upper soil stratum. There was some indication that the P content slightly increased in the topsoil as the agroforestry plantations aged. At a stocking of 400 trees ha-1 (5 x 5 m spacing), A. senegal accumulated in the biomass a total of 18, 1.21, 7.8 and 972 kg ha-1 of N, P, K and OC, respectively. Trees contributed ca. 217 and 1500 kg ha-1 of K and OC, respectively, to the top 25 cm of soil over the first four years of intercropping. Acacia provenances of clay plain origin showed considerable variation in seed weight. They also had the lowest average seed weight as compared to the sandy soil (western) provenances.
At the experimental site in the clay soil region, the clay provenances were distinctly superior to the sand provenances in all traits studied but especially in basal diameter and crown width, thus reflecting their adaptation to the environment. Values of δ13C, indicating water use efficiency, were higher in the sand soil group as compared to the clay one, both in leaves and in branch wood. This suggests that the sand provenances (with an average value of -28.07 ‰) displayed conservative water use and high drought tolerance. Of the clay provenances, the local one (Bout) displayed a highly negative (-29.31 ‰) value, which indicates less conservative water use that resulted in high productivity at this particular clay-soil site. Water use thus appeared to correspond to the environmental conditions prevailing at the original locations for these provenances. Results suggest that A. senegal provenances from the clay part of the gum belt are adapted for a faster growth rate and higher biomass and gum productivity as compared to provenances from sand regions. A strong negative relationship was found between the per-tree gum yield and water use efficiency, as indicated by δ13C. The differences in water use and gum production were greater among provenance groups than within them, suggesting that selection among rather than within provenances would result in distinct genetic gain in gum yield. The relative δ15N values (‰) were higher in B. aegyptiaca than in the N2-fixing acacia provenances. The amount of Ndfa increased significantly with age in all provenances, indicating that A. senegal is a potentially efficient nitrogen fixer and has an important role in agroforestry development. The total above-ground contribution of fixed N to foliage growth in 4-year-old A. senegal trees was highest in the Rahad sand-soil provenance (46.7 kg N ha-1) and lowest in the Mazmoom clay-soil provenance (28.7 kg N ha-1).
This study represents the first use of the δ15N method for estimating the N input by A. senegal in the gum belt of Sudan. Key words: Acacia senegal, agroforestry, clay plain, δ13C, δ15N, gum arabic, nutrient cycling, Ndfa, Sorghum bicolor, Sesamum indicum
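The 15N natural abundance estimate of nitrogen derived from the atmosphere (Ndfa) mentioned above is commonly computed from the δ15N of the fixing tree, the δ15N of a non-fixing reference, and a B value (the δ15N of the fixing species when it relies entirely on atmospheric N2). The sketch below shows that standard calculation; the abstract does not state which exact formula or B value the thesis used, so the function and all numbers in the example are hypothetical.

```python
def ndfa_percent(delta_reference, delta_fixing, b_value):
    """Percentage of plant N derived from the atmosphere (natural abundance method).

    delta_reference: d15N (per mil) of the non-N-fixing reference species
    delta_fixing:    d15N (per mil) of the N-fixing species
    b_value:         d15N (per mil) of the fixing species grown on N2 alone
    """
    denominator = delta_reference - b_value
    if denominator == 0:
        raise ValueError("reference d15N must differ from the B value")
    return 100.0 * (delta_reference - delta_fixing) / denominator


# Hypothetical values for illustration only, not data from the thesis.
example_ndfa = ndfa_percent(delta_reference=6.0, delta_fixing=3.0, b_value=-1.0)
```

With these made-up inputs the fixing tree sits 3 ‰ below the reference out of a possible 7 ‰, giving roughly 43 % Ndfa. A lower δ15N in the acacia than in the reference, as the abstract reports, is what makes the estimate positive.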
Abstract:
Elucidating the mechanisms responsible for the patterns of species abundance, diversity, and distribution within and across ecological systems is a fundamental research focus in ecology. Species abundance patterns are shaped in a convoluted way by interplays between inter-/intra-specific interactions, environmental forcing, demographic stochasticity, and dispersal. Comprehensive models and suitable inferential and computational tools for teasing out these different factors are quite limited, even though such tools are critically needed to guide the implementation of management and conservation strategies, the efficacy of which rests on a realistic evaluation of the underlying mechanisms. This is even more so in the prevailing context of concerns over climate change progress and its potential impacts on ecosystems. This thesis utilized the flexible hierarchical Bayesian modelling framework in combination with the computer intensive methods known as Markov chain Monte Carlo, to develop methodologies for identifying and evaluating the factors that control the structure and dynamics of ecological communities. These methodologies were used to analyze data from a range of taxa: macro-moths (Lepidoptera), fish, crustaceans, birds, and rodents. Environmental stochasticity emerged as the most important driver of community dynamics, followed by density dependent regulation; the influence of inter-specific interactions on community-level variances was broadly minor. This thesis contributes to the understanding of the mechanisms underlying the structure and dynamics of ecological communities, by showing directly that environmental fluctuations rather than inter-specific competition dominate the dynamics of several systems. This finding emphasizes the need to better understand how species are affected by the environment and acknowledge species differences in their responses to environmental heterogeneity, if we are to effectively model and predict their dynamics (e.g. 
for management and conservation purposes). The thesis also proposes a model-based approach to integrating the niche and neutral perspectives on community structure and dynamics, making it possible for the relative importance of each category of factors to be evaluated in light of field data.
Abstract:
Telecommunications network management is based on huge amounts of data that are continuously collected from elements and devices from all around the network. The data is monitored and analysed to provide information for decision making in all operation functions. Knowledge discovery and data mining methods can support fast-paced decision making in network operations. In this thesis, I analyse decision making on different levels of network operations. I identify the requirements that decision making sets for knowledge discovery and data mining tools and methods, and I study resources that are available to them. I then propose two methods for augmenting and applying frequent sets to support everyday decision making. The proposed methods are Comprehensive Log Compression for log data summarisation and Queryable Log Compression for semantic compression of log data. Finally I suggest a model for a continuous knowledge discovery process and outline how it can be implemented and integrated into the existing network operations infrastructure.
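To illustrate the building block that both proposed methods rely on, the sketch below counts frequent sets of field values in a small log. It is a brute-force enumeration written for illustration and does not reproduce Comprehensive Log Compression or Queryable Log Compression; the log entries, field names, and the `frequent_sets` helper are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_sets(entries, min_support, max_size=3):
    """Return every set of field values occurring in at least min_support entries."""
    counts = Counter()
    for entry in entries:
        items = sorted(set(entry))
        # Enumerate all subsets of the entry up to max_size items.
        for size in range(1, min(max_size, len(items)) + 1):
            for combo in combinations(items, size):
                counts[combo] += 1
    return {combo: n for combo, n in counts.items() if n >= min_support}


# Hypothetical firewall-style log entries.
log = [
    ("src=fw1", "action=deny", "port=23"),
    ("src=fw1", "action=deny", "port=23"),
    ("src=fw1", "action=deny", "port=80"),
    ("src=fw2", "action=allow", "port=443"),
]
frequent = frequent_sets(log, min_support=3)
```

Here the pair `("action=deny", "src=fw1")` covers three of the four entries, so a summary could report that pattern once with its count and list only the remaining entry in full. Enumerating all subsets is exponential in entry length, which is why practical frequent-set miners prune candidates level by level.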
Abstract:
The metabolism of an organism consists of a network of biochemical reactions that transform small molecules, or metabolites, into others in order to produce energy and building blocks for essential macromolecules. The goal of metabolic flux analysis is to uncover the rates, or the fluxes, of those biochemical reactions. In a steady state, the sum of the fluxes that produce an internal metabolite is equal to the sum of the fluxes that consume the same molecule. Thus the steady state imposes linear balance constraints on the fluxes. In general, the balance constraints imposed by the steady state are not sufficient to uncover all the fluxes of a metabolic network. The fluxes through cycles and alternative pathways between the same source and target metabolites remain unknown. More information about the fluxes can be obtained from isotopic labelling experiments, where a cell population is fed with labelled nutrients, such as glucose that contains 13C atoms. Labels are then transferred by biochemical reactions to other metabolites. The relative abundances of different labelling patterns in internal metabolites depend on the fluxes of pathways producing them. Thus, the relative abundances of different labelling patterns contain information about the fluxes that cannot be uncovered from the balance constraints derived from the steady state. The field of research that estimates the fluxes utilizing the measured constraints on the relative abundances of different labelling patterns induced by 13C labelled nutrients is called 13C metabolic flux analysis. There are two approaches to 13C metabolic flux analysis. In the optimization approach, a non-linear optimization task, where candidate fluxes are iteratively generated until they fit the measured abundances of different labelling patterns, is constructed.
In the direct approach, linear balance constraints given by the steady state are augmented with linear constraints derived from the abundances of different labelling patterns of metabolites. Thus, mathematically involved non-linear optimization methods that can get stuck in local optima can be avoided. On the other hand, the direct approach may require more measurement data than the optimization approach to obtain the same flux information. Furthermore, the optimization framework can easily be applied regardless of the labelling measurement technology and with all network topologies. In this thesis we present a formal computational framework for direct 13C metabolic flux analysis. The aim of our study is to construct as many linear constraints on the fluxes as possible from the 13C labelling measurements, using only computational methods that avoid non-linear techniques and are independent of the type of measurement data, the labelling of external nutrients and the topology of the metabolic network. The presented framework is the first representative of the direct approach for 13C metabolic flux analysis that is free from restricting assumptions made about these parameters. In our framework, measurement data is first propagated from the measured metabolites to other metabolites. The propagation is facilitated by the flow analysis of metabolite fragments in the network. Then new linear constraints on the fluxes are derived from the propagated data by applying the techniques of linear algebra. Based on the results of the fragment flow analysis, we also present an experiment planning method that selects sets of metabolites whose relative abundances of different labelling patterns are most useful for 13C metabolic flux analysis. Furthermore, we give computational tools to process raw 13C labelling data produced by tandem mass spectrometry into a form suitable for 13C metabolic flux analysis.
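The argument that steady-state balances alone leave alternative pathways undetermined, and that an extra linear constraint from labelling data can resolve them, can be checked on a toy network. The sketch below counts the remaining degrees of freedom as the number of fluxes minus the rank of the constraint matrix. The four-flux network, the measured uptake, and the labelling-derived split of 3/4 are all hypothetical, and the code is not the framework presented in the thesis.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix by exact Gauss-Jordan elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    found = 0
    for col in range(n_cols):
        pivot = next((r for r in range(found, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for r in range(len(rows)):
            if r != found and rows[r][col] != 0:
                factor = rows[r][col] / rows[found][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[found])]
        found += 1
    return found


def free_fluxes(constraints, n_fluxes):
    """Number of fluxes left undetermined by a set of linear constraints."""
    return n_fluxes - rank(constraints)


# Toy network: v1 feeds A; v2 and v3 are alternative routes from A to B; v4 drains B.
balances = [
    [1, -1, -1, 0],   # metabolite A: v1 - v2 - v3 = 0
    [0, 1, 1, -1],    # metabolite B: v2 + v3 - v4 = 0
]
uptake = [1, 0, 0, 0]  # coefficient row for a measured uptake flux v1
# Hypothetical labelling result: 3/4 of B is produced via v2, so
# v2 - 3/4 (v2 + v3) = 0, which is linear in the fluxes.
labelling = [0, Fraction(1, 4), Fraction(-3, 4), 0]

dof_balance_only = free_fluxes(balances, 4)
dof_with_uptake = free_fluxes(balances + [uptake], 4)
dof_with_labelling = free_fluxes(balances + [uptake, labelling], 4)
```

The three counts come out as 2, 1 and 0: the balances plus the uptake measurement still leave the split between v2 and v3 open, and the single labelling-derived row closes it without any non-linear fitting.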
Abstract:
Ubiquitous computing is about making computers and computerized artefacts a pervasive part of our everyday lives, bringing more and more activities into the realm of information. The computationalization and informationalization of everyday activities increases not only our reach, efficiency and capabilities but also the amount and kinds of data gathered about us and our activities. In this thesis, I explore how information systems can be constructed so that they handle this personal data in a reasonable manner. The thesis provides two kinds of results: on the one hand, tools and methods for both the construction and the evaluation of ubiquitous and mobile systems; on the other hand, an evaluation of the privacy aspects of a ubiquitous social awareness system. The work emphasises real-world experiments as the most important way to study privacy. Additionally, the state of current information systems as regards data protection is studied. The tools and methods in this thesis consist of three distinct contributions. An algorithm for positioning in cellular networks is proposed that does not require the location information to be revealed beyond the user's terminal. A prototyping platform for the creation of context-aware ubiquitous applications, called ContextPhone, is described and released as open source. Finally, a set of methodological findings for the use of smartphones in social scientific field research is reported. A central contribution of this thesis is the set of pragmatic tools that allow other researchers to carry out experiments. The evaluation of the ubiquitous social awareness application ContextContacts covers both the usage of the system in general and an analysis of privacy implications. Drawing on several long-term field studies, the usage of the system is analyzed in the light of how users make inferences about others based on real-time contextual cues mediated by the system.
The analysis of privacy implications draws together the social psychological theory of self-presentation and research in privacy for ubiquitous computing, deriving a set of design guidelines for such systems. The main findings from these studies can be summarized as follows. The fact that ubiquitous computing systems gather more data about users can be used not only to study the use of such systems in an effort to create better systems, but also, more generally, to study previously unstudied phenomena, such as the dynamic change of social networks. Systems that let people create new ways of presenting themselves to others can be fun for the users, but the self-presentation requires several thoughtful design decisions that allow the manipulation of the image mediated by the system. Finally, the growing amount of computational resources available to the users can be used to allow them to use the data themselves, rather than just being passive subjects of data gathering.
Abstract:
Aims: Develop and validate tools to estimate residual noise covariance in Planck frequency maps. Quantify signal error effects and compare different techniques to produce low-resolution maps. Methods: We derive analytical estimates of the covariance of the residual noise contained in low-resolution maps produced using a number of map-making approaches. We test these analytical predictions using Monte Carlo simulations and assess their impact on angular power spectrum estimation. We use simulations to quantify the level of signal errors incurred in the different resolution downgrading schemes considered in this work. Results: We find excellent agreement between the optimal residual noise covariance matrices and Monte Carlo noise maps. For destriping map-makers, the extent of agreement is dictated by the knee frequency of the correlated noise component and the chosen baseline offset length. Signal striping is shown to be insignificant when properly dealt with. In map resolution downgrading, we find that a carefully selected window function is required to reduce aliasing to the sub-percent level at multipoles ell > 2 Nside, where Nside is the HEALPix resolution parameter. We show that sufficient characterization of the residual noise is essential if one is to draw reliable constraints on large-scale anisotropy. Conclusions: We have described how to compute low-resolution maps with a controlled sky signal level and a reliable estimate of the covariance of the residual noise. We have also presented a method to smooth the residual noise covariance matrices to describe the noise correlations in smoothed, bandwidth-limited maps.
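The general pattern of checking an analytical covariance estimate against Monte Carlo noise maps can be sketched in a few lines. This is a deliberately simplified toy, not the paper's pipeline: it assumes uncorrelated white noise with variance `sigma2` in a one-dimensional "map", and downgrades by plain averaging of `k` neighbouring pixels rather than on a HEALPix grid. All function names are hypothetical. For a linear downgrading operator D and high-resolution covariance N, the low-resolution covariance is D N D^T, which the simulation should reproduce.

```python
import random

def downgrade_matrix(n_high, k):
    """Averaging operator D mapping n_high pixels to n_high // k pixels."""
    n_low = n_high // k
    return [[1.0 / k if j // k == i else 0.0 for j in range(n_high)]
            for i in range(n_low)]

def analytic_covariance(D, sigma2):
    """N_low = D N D^T for white noise N = sigma2 * I (toy assumption)."""
    return [[sigma2 * sum(a * b for a, b in zip(ri, rj)) for rj in D]
            for ri in D]

def monte_carlo_covariance(D, sigma2, n_sims, seed=0):
    """Sample covariance of downgraded zero-mean noise realizations."""
    rng = random.Random(seed)
    n_low, n_high = len(D), len(D[0])
    acc = [[0.0] * n_low for _ in range(n_low)]
    sigma = sigma2 ** 0.5
    for _ in range(n_sims):
        noise = [rng.gauss(0.0, sigma) for _ in range(n_high)]
        low = [sum(d * x for d, x in zip(row, noise)) for row in D]
        for i in range(n_low):
            for j in range(n_low):
                acc[i][j] += low[i] * low[j]
    # The noise mean is zero by construction, so no mean subtraction.
    return [[v / n_sims for v in row] for row in acc]

D = downgrade_matrix(n_high=8, k=4)
N_analytic = analytic_covariance(D, sigma2=1.0)
N_mc = monte_carlo_covariance(D, sigma2=1.0, n_sims=20000)
```

In this toy case the analytical matrix is diagonal with entries sigma2 / k, and the Monte Carlo estimate should approach it as the number of simulations grows. Correlated noise, destriping residuals and window functions, which are the substance of the work summarized above, are outside the scope of this sketch.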