This thesis is concerned with the role played by software tools in the analysis and dissemination of linguistic corpora and their contribution to a more widespread adoption of corpora in different fields. Chapter 1 contains an overview of some of the most relevant corpus analysis tools available today, presenting their most interesting features and some of their drawbacks. Chapter 2 begins by explaining why none of the available tools appears to satisfy the requirements of the user community and then continues with a technical overview of the current status of the new system developed as part of this work. This presentation is followed by highlights of features that make the system appealing to users and corpus builders (i.e. scholars willing to make their corpora available to the public). The chapter concludes with an indication of future directions for the project and information on the current availability of the software. Chapter 3 describes the design of an experiment devised to evaluate the usability of the new system in comparison to another corpus tool. Usage of the tool was tested in the context of a documentation task performed on a real assignment during a translation class in a master's degree course. In chapter 4 the findings of the experiment are presented on two levels of analysis: first, a discussion of how participants interacted with and evaluated the two corpus tools in terms of interface and interaction design, usability and perceived ease of use; second, an analysis of how users interacted with corpora to complete the task and what kind of queries they submitted. Finally, some general conclusions are drawn and areas for future work are outlined.


This thesis describes a client-server system that collects positions from smartphones through a crowdsourcing and crowdsensing model; these positions are then georeferenced on a map with the aim of improving urban accessibility.


The aim of this thesis is to describe and compare three different languages, and therefore three approaches, to server-side and back-end programming: PHP, Python, and JavaScript, the latter used for server-side programming and therefore paired with the NodeJS framework. The comparison aims to highlight the distinguishing characteristics of each language and the purposes each is best suited to, and to provide a guide for understanding which of the three most widely used back-end languages best fits a given goal.


Because mobile devices have taken on an increasingly decisive role in everyday life over the last decade, apps have been researched and developed to simplify a wide variety of daily tasks. Given the size of the smartphone market, several operating systems have been developed over time to run these platforms. For a company, however, bearing the cost of implementing the same app in different environments is more expensive than bearing the cost of a single app able to run on the different operating systems. This latter kind of app is commonly called a cross-platform app. One way to implement such applications relies on Visual Studio, a well-known IDE. Specifically, Visual Studio has integrated the Apache Cordova project for creating cross-platform applications. In this thesis, two different apps were developed with the two tools just introduced, in order to evaluate their performance in terms of time. The first app solves a well-known combinatorial problem known as Knapsack, i.e. the knapsack problem. The second attempts to digitize a simple mathematical expression contained in an image and then return its result. The data obtained allow comparisons that assess the validity of the development tool, and also highlight possible evolutions of the two apps.
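As an illustration of the problem the first app addresses, the following is a minimal sketch of the 0/1 knapsack problem solved by dynamic programming. It is written in Python for brevity and is not the thesis's Cordova implementation; the function name and the sample values are assumptions chosen for the example.

```python
def knapsack(values, weights, capacity):
    """Return the best total value achievable within the weight capacity.

    Hypothetical helper for illustration: 0/1 knapsack, where each item
    is either taken once or left out.
    """
    # best[c] holds the best value achievable with total weight <= c
    best = [0] * (capacity + 1)
    for value, weight in zip(values, weights):
        # Walk capacities downward so each item is counted at most once
        for c in range(capacity, weight - 1, -1):
            best[c] = max(best[c], best[c - weight] + value)
    return best[capacity]


# Three items and a knapsack of capacity 50 (illustrative numbers)
result = knapsack(values=[60, 100, 120], weights=[10, 20, 30], capacity=50)
```

The running time grows with the number of items times the capacity, which makes this kind of computation a convenient workload for timing comparisons.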


The past decade has seen the energy consumption in servers and Internet Data Centers (IDCs) skyrocket. A recent survey estimated that the worldwide spending on servers and cooling has risen to above $30 billion and is likely to exceed spending on new server hardware. The rapid rise in energy consumption has posed a serious threat to both energy resources and the environment, which makes green computing not only worthwhile but also necessary. This dissertation intends to tackle the challenges of both reducing the energy consumption of server systems and reducing the cost for Online Service Providers (OSPs). Two distinct subsystems account for most of an IDC's power: the server system, which accounts for 56% of the total power consumption of an IDC, and the cooling and humidification systems, which account for about 30% of the total power consumption. The server system dominates the energy consumption of an IDC, and its power draw can vary drastically with data center utilization. In this dissertation, we propose three models to achieve energy efficiency in web server clusters: an energy proportional model, an optimal server allocation and frequency adjustment strategy, and a constrained Markov model. The proposed models combine Dynamic Voltage/Frequency Scaling (DV/FS) and Vary-On, Vary-Off (VOVF) mechanisms that work together for more energy savings. Corresponding strategies are also proposed to deal with the transition overheads. We further extend server energy management to the IDC's cost management, helping OSPs to conserve energy, manage their own electricity costs, and lower carbon emissions. We have developed an optimal energy-aware load dispatching strategy that periodically maps more requests to the locations with lower electricity prices. A carbon emission limit is imposed, and the volatility of the carbon offset market is also considered. Two energy-efficient strategies are applied to the server system and the cooling system respectively.
With the rapid development of cloud services, we also carry out research to reduce the server energy in cloud computing environments. In this work, we propose a new live virtual machine (VM) placement scheme that can effectively map VMs to Physical Machines (PMs) with substantial energy savings in a heterogeneous server cluster. A VM/PM mapping probability matrix is constructed, in which each VM request is assigned a probability of running on each PM. The VM/PM mapping probability matrix takes into account resource limitations, VM operation overheads, server reliability, and energy efficiency. The evolution of Internet Data Centers and the increasing demands of web services raise great challenges for improving the energy efficiency of IDCs. We also outline several potential areas for future research in each chapter.
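The idea of mapping more requests to locations with lower electricity prices can be sketched as follows. This is a deliberately simplified greedy illustration, not the dissertation's optimal strategy: it assumes a fixed price per request and a fixed capacity at each site, and it ignores the carbon emission limit and the carbon offset market. The function name, site names, and numbers are hypothetical.

```python
def dispatch(total_requests, sites):
    """Assign requests to sites, filling the cheapest sites first.

    sites: list of (name, price_per_request, capacity) tuples.
    All names and figures here are illustrative assumptions.
    """
    plan = {}
    remaining = total_requests
    # Visit sites in order of increasing electricity price
    for name, price, capacity in sorted(sites, key=lambda s: s[1]):
        share = min(remaining, capacity)
        plan[name] = share
        remaining -= share
    if remaining > 0:
        raise ValueError("demand exceeds total capacity")
    return plan


# One dispatching period with three hypothetical data center locations
plan = dispatch(100, [("A", 0.12, 60), ("B", 0.08, 50), ("C", 0.10, 40)])
```

Calling the function again whenever prices change corresponds to the periodic remapping described above.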


A growing number of public administration employees work directly or indirectly with geodata. Not all of them are specialists in GIS software. With its extensive framework, ArcGIS Server makes it possible to develop customized GIS applications, to reduce the feature set to the functions actually needed, and to optimize complex workflows. Since 2006, the Geoinformation and Surveying department of the Canton of Lucerne has been developing ArcGIS Server-based web applications for various departments of the cantonal administration in collaboration with the University of Bern. Several applications have been created during this time, including a web application for the efficient recording, assessment, and management of forest intervention areas (Waldportal), one for the dynamic delineation and analysis of catchment areas, and one for viewing recorded videos of cantonal road sections. The presentation introduces these applications and discusses the background of their development and their architecture.


Leakage power consumption is a component of the total power consumption in data centers that is not traditionally considered in the setpoint temperature of the room. However, the effect of this power component, which increases with temperature, can determine the savings associated with the careful management of the cooling system, as well as the reliability of the system. The work presented in this paper identifies the need to address leakage power in order to achieve substantial savings in the energy consumption of servers. In particular, our work shows that, by careful detection and management of two working regions (low and high impact of thermal-dependent leakage), the energy consumption of the data center can be optimized by a reduction of the cooling budget.


Reducing the energy consumption for computation and cooling in servers is a major challenge considering the data center energy costs today. To ensure energy-efficient operation of servers in data centers, the relationship among computational power, temperature, leakage, and cooling power needs to be analyzed. By means of an innovative setup that enables monitoring and controlling the computing and cooling power consumption separately on a commercial enterprise server, this paper studies temperature-leakage-energy tradeoffs, obtaining an empirical model for the leakage component. Using this model, we design a controller that continuously seeks and settles at the optimal fan speed to minimize the energy consumption for a given workload. We run a customized dynamic load-synthesis tool to stress the system. Our proposed cooling controller achieves up to 9% energy savings and 30 W reduction in peak power in comparison to the default cooling control scheme.
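The tradeoff behind an optimal fan speed can be sketched numerically. The example below is only an illustration under stated assumptions, not the paper's empirical model or controller: fan power is taken to grow with the cube of fan speed, chip temperature to fall inversely with fan speed, and leakage power to grow exponentially with temperature. Every coefficient is a made-up placeholder.

```python
import math


def total_power(speed, k_fan=2e-10, t_amb=25.0, r_th=90000.0, a=5.0, b=0.03):
    """Fan power plus leakage power (watts) at a given fan speed (RPM).

    All coefficients are hypothetical placeholders, not measured values.
    """
    fan = k_fan * speed ** 3          # assumed cubic fan power law
    temp = t_amb + r_th / speed       # assumed: faster fan, cooler chip
    leak = a * math.exp(b * temp)     # assumed exponential leakage model
    return fan + leak


def best_fan_speed(candidates):
    """Return the candidate speed with the lowest combined power."""
    return min(candidates, key=total_power)


# Search a coarse grid of fan speeds for a fixed workload
speeds = list(range(1800, 4201, 300))
best = best_fan_speed(speeds)
```

Under these assumptions the minimum lies strictly inside the speed range: below it leakage dominates, above it the fan itself costs more than the leakage it saves.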


As advanced Cloud services are becoming mainstream, the contribution of data centers to the overall power consumption of modern cities is growing dramatically. The average consumption of a single data center is equivalent to the energy consumption of 25,000 households. Modeling the power consumption for these infrastructures is crucial to anticipate the effects of aggressive optimization policies, but accurate and fast power modeling is a complex challenge for high-end servers not yet satisfied by analytical approaches. This work proposes an automatic method, based on Multi-Objective Particle Swarm Optimization, for the identification of power models of enterprise servers in Cloud data centers. Our approach, as opposed to previous procedures, does not only consider the workload consolidation for deriving the power model, but also incorporates other non-traditional factors like the static power consumption and its dependence on temperature. Our experimental results show that we reach slightly better models than classical approaches while simultaneously simplifying the power model structure and thus the number of sensors needed, which is very promising for short-term energy prediction. This work, validated with real Cloud applications, broadens the possibilities to derive efficient energy saving techniques for Cloud facilities.


The computational and cooling power demands of enterprise servers are increasing at an unsustainable rate. Understanding the relationship between computational power, temperature, leakage, and cooling power is crucial to enable energy-efficient operation at the server and data center levels. This paper develops empirical models to estimate the contributions of static and dynamic power consumption in enterprise servers for a wide range of workloads, and analyzes the interactions between temperature, leakage, and cooling power for various workload allocation policies. We propose a cooling management policy that minimizes the server energy consumption by setting the optimum fan speed during runtime. Our experimental results on a presently shipping enterprise server demonstrate that including leakage awareness in workload and cooling management provides additional energy savings without any impact on performance.


Expressed sequence tags (ESTs) are randomly sequenced cDNA clones. Currently, nearly 3 million human and 2 million mouse ESTs provide valuable resources that enable researchers to investigate the products of gene expression. The EST databases have proven to be useful tools for detecting homologous genes, for exon mapping, revealing differential splicing, etc. With the increasing availability of large amounts of poorly characterised eukaryotic (notably human) genomic sequence, ESTs have now become a vital tool for gene identification, sometimes yielding the only unambiguous evidence for the existence of a gene expression product. However, BLAST-based Web servers available to the general user have not kept pace with these developments and do not provide appropriate tools for querying EST databases with large highly spliced genes, often spanning 50 000–100 000 bases or more. Here we describe Gene2EST (http://woody.embl-heidelberg.de/gene2est/), a server that brings together a set of tools enabling efficient retrieval of ESTs matching large DNA queries and their subsequent analysis. RepeatMasker is used to mask dispersed repetitive sequences (such as Alu elements) in the query, BLAST2 for searching EST databases and Artemis for graphical display of the findings. Gene2EST combines these components into a Web resource targeted at the researcher who wishes to study one or a few genes to a high level of detail.


MetaFam is a comprehensive relational database of protein family information. This web-accessible resource integrates data from several primary sequence and secondary protein family databases. By pooling together the information from these disparate sources, MetaFam is able to provide the most complete protein family sets available. Users are able to explore the interrelationships among these primary and secondary databases using a powerful graphical visualization tool, MetaFamView. Additionally, users can identify corresponding sequence entries among the sequence databases, obtain a quick summary of corresponding families (and their sequence members) among the family databases, and even attempt to classify their own unassigned sequences. Hypertext links to the appropriate source databases are provided at every level of navigation. Global family database statistics and information are also provided. Public access to the data is available at http://metafam.ahc.umn.edu/.