16 results for Kahler metrics
in Helda - Digital Repository of University of Helsinki
Abstract:
This thesis studies the human gene expression space using high-throughput gene expression data from DNA microarrays. In molecular biology, high-throughput techniques allow numerical measurements of the expression of tens of thousands of genes simultaneously. In a single study, this data is traditionally obtained from a limited number of sample types with a small number of replicates. For organism-wide analysis, this data has been largely unavailable and the global structure of the human transcriptome has remained unknown. This thesis introduces a human transcriptome map of different biological entities and an analysis of its general structure. The map is constructed from gene expression data from the two largest public microarray data repositories, GEO and ArrayExpress. The creation of this map contributed to the development of ArrayExpress by identifying and retrofitting previously unusable and missing data and by improving access to its data. It also contributed to the creation of several new tools for microarray data manipulation and to the establishment of data exchange between GEO and ArrayExpress. The data integration for the global map required the creation of a new, large ontology of human cell types, disease states, organism parts and cell lines. The ontology was used in a new text mining and decision tree based method for the automatic conversion of human-readable free-text microarray data annotations into a categorised format. Data comparability and the minimisation of the systematic measurement errors characteristic of each laboratory in this large cross-laboratory integrated dataset were ensured by computing a range of microarray data quality metrics and excluding incomparable data. The structure of the global map of human gene expression was then explored by principal component analysis and hierarchical clustering, using heuristics and the help of another purpose-built sample ontology. A preface and motivation for the construction and analysis of a global map of human gene expression is given by the analysis of two microarray datasets of human malignant melanoma. The analysis of these sets incorporates an indirect comparison of statistical methods for finding differentially expressed genes and points to the need to study gene expression on a global level.
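To make the exploratory step concrete, the following minimal Python sketch applies principal component analysis and hierarchical clustering to a samples-by-genes matrix. It is an illustrative sketch on random placeholder data, not the thesis pipeline or its actual heuristics.

```python
# Minimal sketch of exploring a gene expression matrix with PCA and
# hierarchical clustering (illustrative only; not the thesis pipeline).
# `expr` stands in for a samples-by-genes matrix of normalised expression.
import numpy as np
from sklearn.decomposition import PCA
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
expr = rng.normal(size=(60, 500))          # placeholder data: 60 samples, 500 genes

# Project samples onto the first few principal components.
pcs = PCA(n_components=3).fit_transform(expr)

# Cluster samples hierarchically in PCA space (average linkage, Euclidean).
tree = linkage(pcs, method="average", metric="euclidean")
labels = fcluster(tree, t=4, criterion="maxclust")
print(labels)
```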
Abstract:
Wireless access is expected to play a crucial role in the future of the Internet. The demands of the wireless environment are not always compatible with the assumptions that were made in the era of wired links. At the same time, new services that take advantage of advances in many areas of technology are being invented. These services include the delivery of mass media such as television and radio, Internet phone calls, and video conferencing. The network must be able to deliver these services with acceptable performance and quality to the end user. This thesis presents an experimental study measuring the performance of bulk data TCP transfers, streaming audio flows, and HTTP transfers that compete for the limited bandwidth of a GPRS/UMTS-like wireless link. The wireless link characteristics are modeled with a wireless network emulator. We analyze how different competing workload types behave with regular TCP and how active queue management, Differentiated Services (DiffServ), and a combination of TCP enhancements affect the performance and the quality of service. We test four link types, including an error-free link and links with different Automatic Repeat reQuest (ARQ) persistency. The analysis consists of comparing the resulting performance in different configurations based on defined metrics. We observed that DiffServ and Random Early Detection (RED) with Explicit Congestion Notification (ECN) are useful, and in some conditions necessary, for quality of service and fairness, because long queuing delays and congestion-related packet losses cause problems without DiffServ and RED. However, we observed situations where there is still room for significant improvement if the link level is aware of the quality of service. Only a very error-prone link diminishes the benefits to nil. The combination of TCP enhancements, comprising an initial window of four segments, Control Block Interdependence (CBI), and Forward RTO recovery (F-RTO), improves performance. The initial window of four helps a later-starting TCP flow to start faster but generates congestion under some conditions. CBI prevents slow-start overshoot and balances slow start in the presence of error drops, and F-RTO successfully reduces unnecessary retransmissions.
Abstract:
Free and Open Source Software (FOSS) has gained increased interest in the computer software industry, but assessing its quality remains a challenge. FOSS development is frequently carried out by globally distributed development teams, and all stages of development are publicly visible. Several product- and process-level quality factors can be measured using the public data. This thesis presents a theoretical background for software quality and metrics and their application in a FOSS environment. The information available from FOSS projects in three information spaces is presented, and a quality model suitable for use in a FOSS context is constructed. The model includes both process and product quality metrics, and takes into account the tools and working methods commonly used in FOSS projects. A subset of the constructed quality model is applied to three FOSS projects, highlighting both theoretical and practical concerns in implementing automatic metric collection and analysis. The experiment shows that useful quality information can be extracted from the vast amount of data available. In particular, the projects vary in their growth rate, complexity, modularity and team structure.
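As a hedged illustration of the kind of process-level metric such a model can automate, the sketch below derives a monthly code growth rate from version-control history. It assumes it is run inside a checked-out git repository; the specific metric and commands are an example, not the exact model constructed in the thesis.

```python
# Illustrative process-level metric: monthly code growth (lines added minus
# removed) derived from git history. Assumes execution inside a git repository.
import subprocess
from collections import defaultdict

log = subprocess.run(
    ["git", "log", "--numstat", "--date=format:%Y-%m", "--pretty=%ad"],
    capture_output=True, text=True, check=True).stdout

growth = defaultdict(int)
month = None
for line in log.splitlines():
    line = line.strip()
    if not line:
        continue
    if "\t" in line:
        added, removed, _path = line.split("\t", 2)
        if added.isdigit() and removed.isdigit():   # skip binary files ("-")
            growth[month] += int(added) - int(removed)
    else:
        month = line                                # a commit date line, YYYY-MM

for month in sorted(growth):
    print(month, growth[month])
```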
Abstract:
Place identification is the methodology of automatically detecting spatial regions or places that are meaningful to a user by analysing her location traces. Following this approach, several algorithms have been proposed in the literature. Most of the algorithms perform well on a particular data set with a suitable choice of parameter values. However, tuneable parameters make it difficult for an algorithm to generalise to data sets collected from different geographical locations, different periods of time or containing different activities. This thesis compares the generalisation performance of our proposed DPCluster algorithm and six state-of-the-art place identification algorithms on twelve location data sets collected using the Global Positioning System (GPS). The spatial and temporal variation present in the data helps us to identify strengths and weaknesses of the place identification algorithms under study. We begin by discussing the notion of a place and its importance in location-aware computing. Next, we discuss the different phases of the place identification process found in the literature, followed by a thorough description of seven algorithms. After that, we define evaluation metrics, compare the generalisation performance of the individual place identification algorithms and report the results. The results indicate that the DPCluster algorithm outperforms all other algorithms in terms of generalisation performance.
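For readers unfamiliar with place identification, the sketch below shows the general idea of extracting places by density-based clustering of GPS fixes. DBSCAN is used purely as a stand-in; it is not the DPCluster algorithm, whose details are not given here, and the coordinates and parameter values are invented.

```python
# Generic illustration of density-based place extraction from GPS traces.
# DBSCAN is a stand-in algorithm; data and parameters are toy values.
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(1)
# Toy trace: (latitude, longitude) fixes around two visited places plus noise.
home = rng.normal([60.170, 24.940], 0.0005, size=(50, 2))
work = rng.normal([60.205, 24.960], 0.0005, size=(50, 2))
noise = rng.uniform([60.15, 24.90], [60.25, 25.00], size=(10, 2))
fixes = np.vstack([home, work, noise])

# eps of ~0.002 degrees is roughly 100-200 m at this latitude; min_samples
# demands repeated visits before a region counts as a "place".
labels = DBSCAN(eps=0.002, min_samples=10).fit_predict(fixes)
print("places found:", len(set(labels) - {-1}))
```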
Abstract:
Free and open source software development is an alternative to traditional software engineering as an approach to the development of complex software systems. It is a way of developing software based on geographically distributed teams of volunteers without an apparent central plan or traditional mechanisms of coordination. The purpose of this thesis is to summarize the current knowledge about free and open source software development and explore the ways in which further understanding of it could be gained. The results of research in the field as well as the research methods are introduced and discussed. The adaptation of software process metrics to the context of free and open source software development is also illustrated, and the possibilities of utilizing them as tools to validate other research are discussed.
Abstract:
The tackling of coastal eutrophication requires water protection measures based on status assessments of water quality. The main purpose of this thesis was to evaluate whether it is possible, both scientifically and within the terms of the European Union Water Framework Directive (WFD), to assess the status of coastal marine waters reliably by using phytoplankton biomass (ww) and chlorophyll a (Chl) as indicators of eutrophication in Finnish coastal waters. Empirical approaches were used to study whether the criteria established for determining an indicator are fulfilled. The first criterion (i) was that an indicator should respond to anthropogenic stresses in a predictable manner and have low variability in its response. Summertime Chl could be predicted accurately by nutrient concentrations, but not from the external annual loads alone, because of the rapid effect of primary production and sedimentation close to the loading sources in summer. The most accurate predictions were achieved in the Archipelago Sea, where total phosphorus (TP) and total nitrogen (TN) alone accounted for 87% and 78% of the variation in Chl, respectively. In river estuaries, the TP mass-balance regression model predicted Chl most accurately when nutrients originated from point sources, whereas land-use regression models were most accurate when nutrients originated mainly from diffuse sources. The inclusion of morphometry (e.g. mean depth) in the nutrient models improved the accuracy of the predictions. The second criterion (ii) was associated with the WFD. It requires that an indicator should have type-specific reference conditions, which are defined as "conditions where the values of the biological quality elements are at high ecological status". In establishing reference conditions, the empirical approach could only be used in the outer coastal water types, where historical observations of Secchi depth from the early 1900s are available. The most accurate prediction was achieved in the Quark. In the inner coastal water types, reference Chl values, estimated from present monitoring data, are imprecise - not only because of the less accurate estimation method but also because the intrinsic characteristics, described for instance by morphometry, vary considerably within these extensive inner coastal types. As for phytoplankton biomass, the reference values were less accurate than in the case of Chl, because reference conditions for biomass could be estimated only by using the reconstructed Chl values, not the historical Secchi observations. A paleoecological approach was also applied to estimate annual average reference conditions for Chl. In Laajalahti, an urban embayment off Helsinki that was strongly loaded by municipal waste waters in the 1960s and 1970s, reference conditions prevailed in the mid- and late 1800s. The recovery of the bay from pollution has been delayed as a consequence of the benthic release of nutrients. Laajalahti will probably not achieve the good quality objectives of the WFD on time. The third criterion (iii) was associated with coastal management, including the resources it has available. Analyses of Chl are cheap and fast to carry out compared with analyses of phytoplankton biomass and species composition, a fact which affects the number of samples that can be taken and thereby the reliability of the assessments. However, analyses of phytoplankton biomass and species composition provide more metrics for ecological classification, metrics which reveal various aspects of eutrophication that Chl alone does not.
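The sketch below illustrates the kind of nutrient-to-Chl regression referred to under criterion (i): a log-log linear model of summertime Chl against TP, with the coefficient of determination reported. The data and the model form are illustrative placeholders, not the Finnish monitoring data or the exact models of the thesis.

```python
# Illustrative nutrient-Chl regression: predict summertime chlorophyll a from
# total phosphorus with a log-log linear fit. All numbers are invented.
import numpy as np

tp  = np.array([15, 20, 25, 30, 40, 55, 70, 90], dtype=float)   # TP, ug/l
chl = np.array([ 2,  3,  4,  5,  8, 12, 18, 25], dtype=float)   # Chl a, ug/l

slope, intercept = np.polyfit(np.log10(tp), np.log10(chl), deg=1)
pred = slope * np.log10(tp) + intercept
ss_res = np.sum((np.log10(chl) - pred) ** 2)
ss_tot = np.sum((np.log10(chl) - np.log10(chl).mean()) ** 2)
print(f"log10(Chl) = {slope:.2f}*log10(TP) + {intercept:.2f}, R^2 = {1 - ss_res/ss_tot:.2f}")
```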
Abstract:
While environmental variation is a ubiquitous phenomenon in the natural world that has long been appreciated by the scientific community, recent changes in global climatic conditions have begun to raise consciousness about the economic, political and sociological ramifications of global climate change. Climate warming has already resulted in documented changes in ecosystem functioning, with direct repercussions on ecosystem services. While predicting the influence of ecosystem changes on vital ecosystem services can be extremely difficult, knowledge of the organisation of ecological interactions within natural communities can help us better understand climate-driven changes in ecosystems. The role of environmental variation as an agent mediating population extinctions is likely to become increasingly important in the future. In previous studies, population extinction risk in stochastic environmental conditions has been tied to an interaction between population density dependence and the temporal autocorrelation of environmental fluctuations. When populations interact with each other, forming ecological communities, the response of such species assemblages to environmental stochasticity can depend, e.g., on the trophic structure of the food web and the similarity in species-specific responses to environmental conditions. The results presented in this thesis indicate that variation in the correlation structure between species-specific environmental responses (environmental correlation) can have important qualitative and quantitative effects on community persistence and biomass stability in autocorrelated (coloured) environments. In addition, reddened environmental stochasticity and ecological drift processes (such as demographic stochasticity and dispersal limitation) have important implications for patterns in species' relative abundances and community dynamics over time and space. Our understanding of patterns in biodiversity at local and global scales can be enhanced by considering the relevance of different drift processes for community organisation and dynamics. Although the results laid out in this thesis are based on mathematical simulation models, they can be valuable in planning effective empirical studies as well as in interpreting existing empirical results. Most of the metrics considered here are directly applicable to empirical data.
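The two modelling ingredients emphasised above, reddened environmental noise and a tunable environmental correlation between species, can be sketched as follows; the autoregressive formulation and the parameter values are illustrative assumptions rather than the exact simulation models of the thesis.

```python
# Sketch: reddened (autocorrelated) environmental noise with a tunable
# correlation between species-specific environmental responses.
import numpy as np

rng = np.random.default_rng(2)
n_steps, n_species = 1000, 2
kappa = 0.7      # autocorrelation ("redness") of the environment
rho = 0.5        # environmental correlation between the two species

# Correlated white-noise innovations for the species.
cov = np.array([[1.0, rho], [rho, 1.0]])
white = rng.multivariate_normal(np.zeros(n_species), cov, size=n_steps)

# First-order autoregressive filter turns white noise into reddened noise.
env = np.zeros((n_steps, n_species))
for t in range(1, n_steps):
    env[t] = kappa * env[t - 1] + np.sqrt(1 - kappa**2) * white[t]

print("realised cross-correlation:", np.corrcoef(env.T)[0, 1].round(2))
```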
Abstract:
Phytoplankton ecology and productivity is one of the main branches of contemporary oceanographic research. Research groups in this branch have increasingly started to utilise bio-optical applications. My main research objective was to critically investigate the advantages and deficiencies of fast repetition rate (FRR) fluorometry for studies of phytoplankton productivity and of the responses of phytoplankton to varying environmental stress. Second, I aimed to clarify the applicability of the FRR system to the optical environment of the Baltic Sea. The FRR system offers a highly dynamic tool for studies of phytoplankton photophysiology and productivity both in the field and in a controlled environment. FRR measurements provide high-frequency in situ determinations of the light-acclimative and photosynthetic parameters of intact phytoplankton communities. The measurement protocol is relatively easy to use, with no phases requiring analytical determinations. The most notable application of the FRR system lies in its potential for making primary productivity (PP) estimations. However, the realisation of this scheme is not straightforward. FRR-based PP estimates, derived from the photosynthetic electron flow (PEF) rate, are linearly related to photosynthetic gas exchange (14C fixation) PP only in environments where photosynthesis is light-limited. If light limitation is not present, as is usually the case in the near-surface layers of the water column, the two PP approaches will deviate. The prompt response of the PEF rate to short-term variability in the natural light field makes field comparisons between PEF-PP and 14C-PP difficult to interpret, because this variability is averaged out in the 14C incubations. Furthermore, FRR-based PP models are tuned to closely follow the vertical pattern of the underwater irradiance. Owing to the photoacclimational plasticity of phytoplankton, this easily leads to overestimates of water column PP if precautionary measures are not taken. Natural phytoplankton is subject to broad-waveband light. Active non-spectral bio-optical instruments, like the FRR fluorometer, emit light in a relatively narrow waveband, which by its nature does not represent the in situ light field. Thus, the spectrally dependent parameters provided by the FRR system need to be spectrally scaled to the natural light field of the Baltic Sea. In general, the requirement of spectral scaling in water bodies under terrestrial influence concerns all light-adaptive parameters provided by any active non-spectral bio-optical technique. The FRR system can be applied to studies of all phytoplankton that possess efficient light harvesting in the waveband matching the bluish FRR excitation. Although these taxa cover the bulk of all phytoplankton taxa, one exception with pronounced ecological significance is found in the Baltic Sea: the FRR system cannot be used to monitor the photophysiology of cyanobacterial taxa harvesting light in the yellow-red waveband. These include the ecologically significant bloom-forming cyanobacterial taxa of the Baltic Sea.
Abstract:
Background: The aging population is placing increasing demands on surgical services, simultaneously with a decreasing supply of professional labor and a worsening economic situation. Under growing financial constraints, successful operating room management will be one of the key issues in the struggle for technical efficiency. This study focused on several issues affecting operating room efficiency. Materials and methods: The current formal operating room management in Finland and the performance metrics and information systems used to support this management were explored using a postal survey. We also studied the feasibility of a wireless patient tracking system as a tool for managing the process. The reliability of the system as well as the accuracy and precision of its automatically recorded time stamps were analyzed. The benefits of a separate anesthesia induction room in a prospective setting were compared with the traditional way of working, where anesthesia is induced in the operating room. Using computer simulation, several models of parallel processing for the operating room were compared with the traditional model with respect to cost-efficiency. Moreover, international differences in operating room times for two common procedures, laparoscopic cholecystectomy and open lung lobectomy, were investigated. Results: The managerial structure of Finnish operating units was not clearly defined. Operating room management information systems were found to be out-of-date, offering little support for online evaluation of the care process. Only about half of the information systems provided information in real time. Operating room performance was most often measured by the number of procedures in a time unit, operating room utilization, and turnover time. The wireless patient tracking system was found to be feasible for hospital use. The system's automatic documentation facilitated patient flow management by increasing process transparency via more available and accurate data, while lessening work for staff. Any parallel work flow model was more cost-efficient than the traditional way of performing anesthesia induction in the operating room. Mean operating times for two common procedures differed by 50% among eight hospitals in different countries. Conclusions: The structure of daily operative management of an operating room warrants redefinition. Performance measures as well as information systems require updating. Parallel work flows are more cost-efficient than the traditional induction-in-room model.
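A toy Monte Carlo sketch of the parallel-induction comparison is given below: if anesthesia induction overlaps with the turnover of the previous case, the operating-room-blocking time per case shrinks. All durations are invented and the model is far simpler than the simulations used in the study.

```python
# Toy Monte Carlo comparison of in-room induction versus parallel induction.
# Durations (minutes) are invented placeholders, not study data.
import random

random.seed(3)
N = 10_000  # simulated cases

def case_minutes(parallel_induction: bool) -> float:
    induction = random.gauss(20, 5)
    surgery   = random.gauss(90, 20)
    turnover  = random.gauss(15, 4)
    if parallel_induction:
        # Induction overlaps with turnover of the previous case.
        return surgery + max(turnover, induction)
    return induction + surgery + turnover

trad = sum(case_minutes(False) for _ in range(N)) / N
par  = sum(case_minutes(True)  for _ in range(N)) / N
print(f"mean OR-blocking time per case: traditional {trad:.0f} min, parallel {par:.0f} min")
```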
Abstract:
A key trait of Free and Open Source Software (FOSS) development is its distributed nature. Nevertheless, two project-level operations, the fork and the merge of program code, are among the least well understood events in the lifespan of a FOSS project. Some projects have explicitly adopted these operations as the primary means of concurrent development. In this study, we examine the effect of highly distributed software development, as found in the Linux kernel project, on the collection and modelling of software development data. We find that distributed development calls for sophisticated temporal modelling techniques where several versions of the source code tree can exist at once. Attention must be turned towards the methods of quality assurance and peer review that projects employ to manage these parallel source trees. Our analysis indicates that two new metrics, fork rate and merge rate, could be useful for determining the role of distributed version control systems in FOSS projects. The study presents a preliminary data set consisting of version control and mailing list data.
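As a hedged illustration of one of the proposed metrics, the sketch below computes a merge rate as merge commits per month from version-control history; a fork rate would additionally require data on published clones, which plain history does not expose. It assumes it is run inside a git repository that uses a distributed workflow.

```python
# Illustrative merge rate: merge commits per month from git history.
# Assumes execution inside a git repository with a distributed workflow.
import subprocess
from collections import Counter

log = subprocess.run(
    ["git", "log", "--merges", "--date=format:%Y-%m", "--pretty=%ad"],
    capture_output=True, text=True, check=True).stdout

merges_per_month = Counter(log.split())
for month in sorted(merges_per_month):
    print(month, merges_per_month[month])
```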
Abstract:
Herbivorous insects, their host plants and natural enemies form the largest and most species-rich communities on Earth. But what forces structure such communities? Do they represent random collections of species, or are they assembled according to given rules? To address these questions, food webs offer excellent tools. As a result of their versatile information content, such webs have become the focus of intensive research over the last few decades. In this thesis, I study herbivore-parasitoid food webs from a new perspective: I construct multiple, quantitative food webs in a spatially explicit setting, at two different scales. Focusing on food webs consisting of specialist herbivores and their natural enemies on the pedunculate oak, Quercus robur, I examine consistency in food web structure across space and time, and how landscape context affects this structure. As an important methodological development, I use DNA barcoding to resolve potential cryptic species in the food webs and to examine their effect on food web structure. I find that DNA barcoding changes our perception of species identity for as many as a third of the individuals, by reducing misidentifications and by resolving several cryptic species. In terms of the variation detected in food web structure, I find surprising consistency in both space and time. From a spatial perspective, landscape context leaves no detectable imprint on food web structure, while species richness declines significantly with decreasing connectivity. From a temporal perspective, food web structure remains predictable from year to year, despite considerable species turnover in local communities. The rate of such turnover varies between guilds and between species within guilds. The factors best explaining these observations are the abundant and common species, which have a quantitatively dominant imprint on overall structure and suffer the lowest turnover. By contrast, rare species with little impact on food web structure exhibit the highest turnover rates. These patterns reveal important limitations of modern metrics of quantitative food web structure. While they accurately describe the overall topology of the web and its most significant interactions, they are disproportionately affected by species with given traits and insensitive to the specific identity of species. As rare species have been shown to be important for food web stability, metrics depicting quantitative food web structure should therefore not be used as the sole descriptors of communities in a changing world. To detect and resolve the versatile imprint of global environmental change, one should rather use these metrics as one tool among several.
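To illustrate the sensitivity issue noted above, the sketch below computes a Shannon-based interaction diversity from a small quantitative host-parasitoid matrix and shows how little it changes when a rare link is removed. Both the matrix and the choice of metric are illustrative; they are not the data or the full metric set of the thesis.

```python
# Shannon-based interaction diversity of a quantitative host-parasitoid web,
# with and without a rare link. Matrix values are invented.
import numpy as np

def interaction_diversity(web: np.ndarray) -> float:
    p = web[web > 0] / web.sum()
    return float(-(p * np.log(p)).sum())

web = np.array([[120.0, 30.0, 0.0],
                [ 10.0, 60.0, 2.0],      # the 2.0 is a rare link
                [  0.0,  5.0, 40.0]])

without_rare = web.copy()
without_rare[1, 2] = 0.0
print(round(interaction_diversity(web), 3), round(interaction_diversity(without_rare), 3))
```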
Abstract:
The increasing focus of relationship marketing and customer relationship management (CRM) studies on issues of customer profitability has led to the emergence of an area of research on profitable customer management. Nevertheless, there is a notable lack of empirical research examining the current practices of firms specifically with regard to the profitable management of customer relationships according to the approaches suggested in theory. This thesis fills this research gap by exploring profitable customer management in the retail banking sector. Several topics are covered, including marketing metrics and accountability; challenges in the implementation of profitable customer management approaches in practice; analytic versus heuristic (‘rule of thumb’) decision making; and the modification of costly customer behavior in order to increase customer profitability, customer lifetime value (CLV), and customer equity, i.e. the financial value of the customer base. The thesis critically reviews the concept of customer equity and proposes a Customer Equity Scorecard, providing a starting point for a constructive dialog between marketing and finance concerning the development of appropriate metrics to measure marketing outcomes. Since customer management and measurement issues go hand in hand, profitable customer management is contingent on both marketing management skills and financial measurement skills. A clear gap between marketing theory and practice regarding profitable customer management is also identified. The findings show that key customer management aspects that have been proposed within the literature on profitable customer management for many years are not being actively applied by the banks included in the research. Instead, several areas of customer management decision making are found to be influenced by heuristics. This dilemma for marketing accountability is addressed by emphasizing that CLV and customer equity, which are aggregate metrics, only provide certain indications regarding the relative value of customers and the approximate value of the customer base (or groups of customers), respectively. The value created by marketing manifests itself in the effect of marketing actions on customer perceptions, behavior, and ultimately the components of CLV, namely revenues, costs, risk, and retention, as well as additional components of customer equity, such as customer acquisition. The thesis also points out that although costs are a crucial component of CLV, they have largely been neglected in prior CRM research. Cost-cutting has often been viewed negatively in customer-focused marketing literature on service quality and customer profitability, but the case studies in this thesis demonstrate that reduced costs do not necessarily have to lead to lower service quality, customer retention, and customer-related revenues. Consequently, this thesis provides an expanded foundation upon which marketers can stake their claim for accountability. By focusing on the range of drivers and all of the components of CLV and customer equity, marketing has the potential to provide specific evidence concerning how various activities have affected the drivers and components of CLV within different groups of customers, and the implications for customer equity on a customer base level.
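A minimal sketch of how the named CLV components combine is given below, using a standard textbook-style formulation with retention, margin and a discount rate standing in for risk; the numbers are invented and this is not the measurement model of the thesis.

```python
# Textbook-style CLV: discounted, retention-weighted annual margins.
# Numbers are illustrative only.
def customer_lifetime_value(margin_per_year: float, retention: float,
                            discount_rate: float, years: int) -> float:
    return sum(
        margin_per_year * retention ** t / (1 + discount_rate) ** t
        for t in range(1, years + 1)
    )

# Example: 200 EUR annual margin, 85% retention, 10% discount rate, 10 years.
print(round(customer_lifetime_value(200, 0.85, 0.10, 10), 2))
```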
Abstract:
The understorey that develops beneath even-aged forest stands matters for timber harvesting, forest regeneration, visibility and landscape analyses, and for assessing biodiversity and the carbon balance. Airborne laser scanning has proven to be an efficient remote sensing method for measuring mature tree stands. The adoption of laser scanning in operational forest planning makes it possible to produce more accurate information on the understorey than before, provided that understorey characteristics can be interpreted from the laser data. This study used precisely measured field plots and discrete return LiDAR data from several years (flying altitude 1–2 km, 0.9–9.7 pulses m-2). The laser scanning data were acquired with Optech ALTM3100 and Leica ALS50-II sensors. The field plots represent Finnish even-aged Scots pine stands at different stages of development. The research questions were: 1) What is the laser signal received from the understorey like at the level of a single pulse, and which factors affect this signal? 2) How well do the area-based laser features used in practical applications explain the characteristics of the understorey trees? A particular aim was to determine how the energy losses of a laser pulse to the upper canopy layers affect the received signal, and whether the intensity of laser echoes can be corrected for these energy losses. Differences in echo intensity between tree species were small and varied from one scan to another, so the possibilities of using intensity for interpreting understorey tree species are very limited. Energy losses to the upper canopy layers introduced noise into the laser signal received from the understorey. An energy loss correction was applied to the second and third echoes of pulses received from the understorey. The correction reduced the within-target variation in intensity and improved the classification accuracy of targets in the understorey layer. Using second echoes, the overall accuracy of classification between ground and the most common tree species was 49.2–54.9% before the correction and 57.3–62.0% after it; the corresponding kappa values were 0.03–0.13 and 0.10–0.22. The most important factor explaining the energy losses was the intensity of the earlier echoes of the pulse, but the intersection geometry of the pulse with the trees of the upper canopy layer also had some influence. Classification accuracy improved for third echoes as well. Tree species were found to differ in how readily they produce an echo when a laser pulse hits a tree: Norway spruce produced an echo with a higher probability than deciduous trees, and this difference was especially clear for pulses with energy losses. The height distribution features of laser echoes may therefore depend on tree species. Clear differences in intensity distributions were observed between the sensors, which complicates combining data acquired with different sensors. Echo probabilities also differed somewhat between the sensors, causing small differences in the height distributions of the echoes. Among the area-based laser features, features were found that explained the stem number and mean height of the understorey well when the analysis was restricted to trees taller than 1 m. The explanatory power of the features was better for stem number than for mean height, and it did not decrease significantly with decreasing pulse density, which is advantageous for practical applications. The proportion of deciduous trees could not be explained. Based on the results, discrete return laser scanning may be useful, for example, in assessing the need for pre-harvest clearance of undergrowth, whereas more detailed classification of the understorey (e.g. tree species interpretation) may be difficult. The smallest understorey trees cannot be detected. Further studies are needed to generalise the results to different kinds of forest stands.
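As a heavily hedged sketch of the idea behind such an energy-loss correction, the snippet below scales a later echo's intensity by the pulse energy estimated to remain after earlier echoes, proxied by their summed intensities relative to a calibration reference. This formulation and its numbers are assumptions for illustration, not the correction model developed in the thesis.

```python
# Simplified illustration of correcting a later echo's intensity for energy
# lost to earlier echoes. The model and all numbers are illustrative assumptions.
def corrected_intensity(echo_intensity: float, earlier_intensities: list[float],
                        full_pulse_reference: float) -> float:
    remaining_fraction = max(1e-6, 1.0 - sum(earlier_intensities) / full_pulse_reference)
    return echo_intensity / remaining_fraction

# Second echo of 40 units after a first echo of 120 units, with a reference
# single-echo intensity of 200 units assumed for this sensor and range.
print(round(corrected_intensity(40.0, [120.0], 200.0), 1))
```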