184 results for APACHE


Relevance:

10.00%

Publisher:

Abstract:

Cloud computing enables independent end users and applications to share data and pooled resources, possibly located in geographically distributed data centers, in a fully transparent way. This need is particularly felt by scientific applications, which must exploit distributed resources in an efficient and scalable way to process large amounts of data. This paper proposes an open solution to deploy a Platform as a Service (PaaS) over a set of multi-site data centers by applying open source virtualization tools to facilitate operation among virtual machines while optimizing the usage of distributed resources. An experimental testbed is set up in an OpenStack environment, and evaluations with different types of TCP sample connections demonstrate the functionality of the proposed solution and provide throughput measurements in relation to relevant design parameters.
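
The throughput evaluations come from TCP sample connections between virtual machines; as a rough sketch of how such a measurement can be scripted, here is a minimal Python sender (the peer address, port and payload size are hypothetical, and a tool such as iperf would be the more usual choice):

    import socket, time

    HOST, PORT = "10.0.0.5", 5001  # hypothetical address of the receiving VM
    PAYLOAD = b"x" * 65536         # 64 KiB per send
    DURATION = 10                  # seconds to transmit

    # Push data over a TCP connection for DURATION seconds, then report
    # the achieved application-level throughput.
    with socket.create_connection((HOST, PORT)) as s:
        sent = 0
        start = time.time()
        while time.time() - start < DURATION:
            s.sendall(PAYLOAD)
            sent += len(PAYLOAD)
        elapsed = time.time() - start
    print(f"throughput: {sent * 8 / elapsed / 1e6:.1f} Mbit/s")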

Relevance:

10.00%

Publisher:

Abstract:

Software bug analysis is one of the most important activities in software quality. Fast and correct implementation of the necessary repair affects both developers, who must deliver fully functioning software, and users, who need to perform their daily tasks. In this context, incorrect classification of bugs may lead to unwanted situations. One of the main attributes assigned to a bug in its initial report is severity, which reflects the urgency of correcting the problem. In this scenario, we identified, in datasets extracted from five open source systems (Apache, Eclipse, Kernel, Mozilla and Open Office), an irregular distribution of bugs with respect to the existing severities, which is an early sign of misclassification. In the datasets analyzed, about 85% of bugs are ranked with normal severity. This classification rate can have a negative influence on the software development context, where a misclassified bug may be allocated to a developer with too little experience to solve it, so that its correction takes longer or even results in an incorrect implementation. Several studies in the literature have disregarded normal bugs, working only with the portion of bugs initially considered severe or non-severe. This work investigated this portion of the data, with the purpose of identifying whether the normal severity reflects the real impact and urgency, investigating whether there are bugs (initially classified as normal) that could be classified with another severity, and assessing whether there are impacts for developers in this context. For this, an automatic classifier based on three algorithms (Naïve Bayes, MaxEnt and Winnow) was developed to assess whether the normal severity is correct for the bugs initially categorized with it. The algorithms achieved an accuracy of about 80% and showed that between 21% and 36% of the bugs should have been classified differently (depending on the algorithm), which represents somewhere between 70,000 and 130,000 bugs of the dataset.
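
The abstract names Naïve Bayes among the classifiers; below is a minimal sketch of a bag-of-words Naïve Bayes severity classifier in Python. The sample reports and labels are invented for illustration, and scikit-learn stands in for whatever toolkit the work actually used:

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline

    # Hypothetical bug-report summaries and their severities.
    reports = [
        "crash on startup when config file missing",
        "typo in tooltip text of the save button",
        "data loss after unexpected shutdown",
        "minor misalignment of icons in toolbar",
    ]
    labels = ["severe", "non-severe", "severe", "non-severe"]

    # Bag-of-words features + multinomial Naive Bayes, the classic text setup.
    clf = make_pipeline(CountVectorizer(), MultinomialNB())
    clf.fit(reports, labels)

    # Ask the model whether a report filed as "normal" looks severe instead.
    print(clf.predict(["application crashes and corrupts data"]))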

Relevance:

10.00%

Publisher:

Abstract:

This Master's thesis develops a real-time credit card fraud detection system using distributed processing technologies. Specifically, two technologies are considered: TIBCO, a set of commercial tools designed for complex event processing, and Apache Spark, an open system for real-time data processing. Besides implementing the system with the two proposed technologies, another objective of this Master's thesis is to analyze and compare the two resulting real-time processing systems. For the detection of credit card payment fraud, machine learning techniques are applied, specifically from the field of anomaly/outlier detection. As data sources feeding the systems, we use message queue technologies such as TIBCO EMS and Kafka. The generated data are sent to these queues so that the respective systems can process them and apply the machine learning algorithm, determining whether a new instance is fraudulent or not. Both systems use a MongoDB database to store the data generated pseudo-randomly by the message generators, corresponding to credit card movements. These movements are later used as the training set for the machine learning algorithm.
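
As a rough sketch of the Spark side of such a pipeline, the following Python snippet reads credit card events from Kafka with Spark Structured Streaming and hands each micro-batch to a scoring function. The topic name, event schema and score_batch placeholder are hypothetical, and the spark-sql-kafka connector package must be on the classpath; the thesis implementation may differ:

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import from_json, col
    from pyspark.sql.types import StructType, StringType, DoubleType

    spark = SparkSession.builder.appName("fraud-detection").getOrCreate()

    # Hypothetical schema of a credit card movement.
    schema = (StructType()
              .add("card_id", StringType())
              .add("amount", DoubleType())
              .add("merchant", StringType()))

    # Consume the Kafka topic that the message generators feed.
    events = (spark.readStream.format("kafka")
              .option("kafka.bootstrap.servers", "localhost:9092")
              .option("subscribe", "card-movements")  # hypothetical topic
              .load()
              .select(from_json(col("value").cast("string"), schema).alias("e"))
              .select("e.*"))

    def score_batch(df, epoch_id):
        # Placeholder: apply the trained anomaly-detection model here.
        df.show(truncate=False)

    events.writeStream.foreachBatch(score_batch).start().awaitTermination()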

Relevance:

10.00%

Publisher:

Abstract:

Copyright © 2015 Royal College of Surgeons of Edinburgh (Scottish charity number SC005317) and Royal College of Surgeons in Ireland. Published by Elsevier Ltd. All rights reserved. Acknowledgements We would like to thank the Scottish Intensive Care Society Audit Group (SICSAG) for providing the data for this study. Mr Jan Jansen is in receipt of an NHS Research Scotland fellowship which includes salary funding.

Relevance:

10.00%

Publisher:

Abstract:

Distributed computing frameworks belong to a class of programming models that allow developers to launch workloads on large clusters of machines. Due to the dramatic increase in the volume of data gathered by ubiquitous computing devices, data analytic workloads have become a common case among distributed computing applications, making data science an entire field of computer science. We argue that a data scientist's concern lies in three main components: a dataset, a sequence of operations they wish to apply to this dataset, and some constraints they may have related to their work (performance, QoS, budget, etc.). However, it is actually extremely difficult, without domain expertise, to perform data science. One needs to select the right amount and type of resources, pick a framework, and configure it. Also, users often run their applications in shared environments, ruled by schedulers that expect them to specify their resource needs precisely. Inherent to the distributed and concurrent nature of the cited frameworks, monitoring and profiling are hard, high-dimensional problems that prevent users from making the right configuration choices and determining the right amount of resources they need. Paradoxically, the system gathers a large amount of monitoring data at runtime, which remains unused.

In the ideal abstraction we envision for data scientists, the system is adaptive, able to exploit monitoring data to learn about workloads, and to turn user requests into a tailored execution context. In this work, we study different techniques that have been used to make steps toward such system awareness, and explore a new way to do so by implementing machine learning techniques to recommend a specific subset of system configurations for Apache Spark applications. Furthermore, we present an in-depth study of Apache Spark executor configuration, which highlights the complexity of choosing the best one for a given workload.
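
The executor configuration the abstract refers to is expressed through a handful of standard Spark properties; a minimal PySpark sketch with an explicit executor layout (the numbers are illustrative only, not recommendations from the thesis):

    from pyspark.sql import SparkSession

    # Illustrative executor layout: 4 executors x 2 cores x 4 GiB each.
    spark = (SparkSession.builder
             .appName("configured-workload")
             .config("spark.executor.instances", "4")
             .config("spark.executor.cores", "2")
             .config("spark.executor.memory", "4g")
             .config("spark.executor.memoryOverhead", "512m")
             .getOrCreate())

    # The same workload can behave very differently under another layout,
    # e.g. 2 executors x 4 cores x 8 GiB, which is what makes the choice hard.
    print(spark.sparkContext.getConf().getAll())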

Relevance:

10.00%

Publisher:

Abstract:

Over the past few years, logging has evolved from simple printf statements to more complex and widely used logging libraries. Today, logging information is used to support various development activities such as fixing bugs, analyzing the results of load tests, monitoring performance, and transferring knowledge. Recent research has examined how to improve logging practices by informing developers what to log and where to log. Furthermore, the strong dependence on logging has led to the development of logging libraries that have reduced the intricacies of logging, which has resulted in an abundance of log information. Two recent challenges have emerged as modern software systems start to treat logging as a core aspect of their software. In particular, 1) infrastructural challenges have emerged due to the plethora of logging libraries available today, and 2) processing challenges have emerged due to the large number of log processing tools that ingest logs and produce useful information from them. In this thesis, we explore these two challenges. We first explore the infrastructural challenges that arise due to the plethora of logging libraries available today. As systems evolve, their logging infrastructure has to evolve (commonly by migrating to new logging libraries). We explore logging library migrations within Apache Software Foundation (ASF) projects. We find that close to 14% of the projects within the ASF migrate their logging libraries at least once. For processing challenges, we explore the different factors that can affect the likelihood of a logging statement changing in the future in four open source systems, namely ActiveMQ, Camel, Cloudstack and Liferay. Such changes are likely to negatively impact the log processing tools that must be updated to accommodate them. We find that 20%-45% of the logging statements within the four systems are changed at least once. We construct random forest classifiers and Cox models to determine the likelihood of both just-introduced and long-lived logging statements changing in the future. We find that file ownership, developer experience, log density and SLOC are important factors in determining the stability of logging statements.
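
A minimal sketch of the random-forest part of such an analysis in Python, using the four factors the abstract names as features; the training values below are fabricated purely for illustration, and scikit-learn stands in for whatever toolkit the thesis used:

    from sklearn.ensemble import RandomForestClassifier

    # Features per logging statement: [file ownership, developer experience,
    # log density, SLOC] -- fabricated example values.
    X = [
        [0.9, 120, 0.02,  350],
        [0.2,   5, 0.10, 1200],
        [0.7,  60, 0.05,  800],
        [0.1,   2, 0.12, 1500],
    ]
    # 1 = the logging statement was later changed, 0 = it stayed stable.
    y = [0, 1, 0, 1]

    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(X, y)

    # Feature importances hint at which factors drive instability.
    print(dict(zip(["ownership", "experience", "log_density", "sloc"],
                   clf.feature_importances_)))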

Relevance:

10.00%

Publisher:

Abstract:

Thesis (Ph.D.)--University of Washington, 2016-08

Relevance:

10.00%

Publisher:

Abstract:

The growing complexity of stored objects and the large volume of data demand increasingly sophisticated retrieval and recommendation models. The goal of this work is to propose a content recommendation model based on subtitle files of films and series. Using the Apache Lucene tool for information retrieval and the OGMA tool for text analysis, it was possible to propose three distinct stages for the model: a keyword search, the classification of films and series by genre, and the identification of similar titles. An adaptation of the model to identify a sentiment in each title, called sentiment analysis, is also presented. As a result, we highlight that the keyword search produced surprising recommendations, since it gives the user freedom to search within a specific content. Genre classification reached a 73% hit rate compared with the genres listed on the IMDb site, easing content recommendation. Sentiment analysis produced cohesive recommendations, determining appropriate titles for each sentiment. Finally, the identification of similar titles produced only preliminary results, returning films and series with the same theme but without any result in common with the IMDb site. We conclude that, despite the enormous difficulty of being assertive in information retrieval, there are advantages in using subtitle files to help compose recommendation systems.
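
Lucene itself is a Java library; as a language-neutral illustration of the underlying idea only (ranking subtitle files against a keyword query by TF-IDF similarity, which is close in spirit to Lucene's default scoring), here is a minimal Python sketch with invented subtitle snippets:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    # Invented subtitle excerpts standing in for full subtitle files.
    subtitles = {
        "Title A": "the detective examined the crime scene at midnight",
        "Title B": "the crew launched the rocket toward the distant planet",
        "Title C": "the suspect confessed the crime to the detective",
    }

    vec = TfidfVectorizer()
    doc_matrix = vec.fit_transform(subtitles.values())

    # Rank titles against a keyword query, most similar first.
    query = vec.transform(["detective crime"])
    scores = cosine_similarity(query, doc_matrix).ravel()
    for title, score in sorted(zip(subtitles, scores), key=lambda t: -t[1]):
        print(f"{title}: {score:.3f}")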

Relevance:

10.00%

Publisher:

Abstract:

Background. The value of respiratory variables as weaning predictors in the intensive care unit (ICU) is controversial. We evaluated the ability of tidal volume (Vtexp), respiratory rate (f), minute volume (MVexp), rapid shallow breathing index (f/Vt), inspired-expired oxygen concentration difference [(I-E)O2], and end-tidal carbon dioxide concentration (PE′CO2) at the end of a weaning trial to predict early weaning outcomes. Methods. Seventy-three patients who required >24 h of mechanical ventilation were studied. A controlled pressure support weaning trial was undertaken until 5 cm H2O continuous positive airway pressure or predefined criteria were reached. The ability of data from the last 5 min of the trial to predict a predefined endpoint indicating discontinuation of ventilator support within the next 24 h was evaluated. Results. Pre-test probability for achieving the outcome was 44% in the cohort (n=32). Non-achievers were older, had higher APACHE II and organ failure scores before the trial, and higher baseline arterial H+ concentrations. The Vt, MV, f, and f/Vt had no predictive power using a range of cut-off values or from receiver operating characteristic (ROC) analysis. The [I-E]O2 and PE′CO2 had weak discriminatory power [area under the ROC curve: [I-E]O2 0.64 (P=0.03); PE′CO2 0.63 (P=0.05)]. Using best cut-off values for [I-E]O2 of 5.6% and PE′CO2 of 5.1 kPa, positive and negative likelihood ratios were 2 and 0.5, respectively, which only changed the pre- to post-test probability by about 20%. Conclusions. In unselected ICU patients, respiratory variables predict early weaning from mechanical ventilation poorly.
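
The "about 20%" figure follows from the odds form of Bayes' theorem; a short worked check using the reported pre-test probability of 44% and likelihood ratios of 2 and 0.5:

    \text{pre-test odds} = \frac{0.44}{1 - 0.44} \approx 0.79
    \text{post-test odds}^{+} = 0.79 \times 2 \approx 1.57 \;\Rightarrow\; p^{+} = \frac{1.57}{2.57} \approx 0.61
    \text{post-test odds}^{-} = 0.79 \times 0.5 \approx 0.39 \;\Rightarrow\; p^{-} = \frac{0.39}{1.39} \approx 0.28

A positive result thus moves the probability from 44% to about 61%, and a negative result to about 28%: shifts of roughly 17 percentage points, consistent with the authors' "about 20%".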

Relevance:

10.00%

Publisher:

Abstract:

Today, biodiversity is endangered by the intensive farming methods currently imposed on food producers by intermediate actors (e.g., retailers). The lack of a direct communication technology between the food producer and the consumer creates dependency on the intermediate actors for both producers and consumers. A tool allowing producers to directly and efficiently market produce that meets customer demands could greatly reduce the dependency enforced by intermediate actors. To this end, in this thesis, we propose, develop, implement and validate a Real Time Context Sharing (RCOS) system. RCOS takes advantage of the widely used publish/subscribe paradigm to exchange messages between producers and consumers directly, according to their interest and context. Current systems follow a topic-based model or a content-based model. With RCOS, we introduce context awareness into the matching process of the publish/subscribe paradigm. Finally, as a proof of concept, we extend the Apache ActiveMQ Artemis software and create a client prototype. We evaluate our proof of concept for larger scale deployment.
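
The thesis extends ActiveMQ Artemis itself, which is a Java broker; purely to illustrate what context-aware matching adds over plain topic matching, here is a small hypothetical Python sketch in which a subscription carries both a topic and a context predicate (all names and rules invented):

    # A subscription pairs a topic with a context predicate; a publication
    # carries a topic plus the publisher's current context.
    subscriptions = [
        ("apples", lambda ctx: ctx["distance_km"] <= 25),  # nearby produce only
        ("apples", lambda ctx: ctx["organic"]),            # organic only
    ]

    def match(topic, context):
        """Return the subscriptions whose topic AND context predicate match."""
        return [s for s in subscriptions
                if s[0] == topic and s[1](context)]

    # A producer publishes apples with its current context attached.
    publication_ctx = {"distance_km": 12, "organic": False}
    print(len(match("apples", publication_ctx)))  # -> 1 (only the distance rule)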

Relevance:

10.00%

Publisher:

Abstract:

This Bachelor's thesis focuses on developing an effective system to manage the resources that make up a Mountain Rescue Unit (URM). These resources comprise both human and material means: the former are the rescuers, the latter the specialized equipment used in rescue maneuvers. Keeping up-to-date information on the state of the equipment is essential to guarantee the safety of this type of unit's interventions. Besides knowing the quantity, type and location of this equipment, documenting its use facilitates the traceability required by the current regulations governing work at height. The URM of the Provincial Fire Brigade Consortium of Málaga (Consorcio Provincial de Bomberos de Málaga, CPB Málaga) is taken as the working model. The developed system is a web application in a Java Enterprise Edition (J2EE) environment that runs on an Apache Tomcat server and uses a MySQL database to store the information. This eases both distributed management access for the different users and system maintenance, since everything is centralized, in addition to the advantages of a system developed in Java, which can be deployed regardless of the operating system used.

Relevance:

10.00%

Publisher:

Abstract:

Objective: to determine the epidemiological characteristics of patients admitted with severe sepsis (SS) and septic shock (ShS); to assess the implementation of the Surviving Sepsis Campaign (SSC) recommendations; and to determine variables associated with poor vital prognosis. Design: prospective, observational, single-cohort, multicenter study over one year (September 2011 - August 2012). Setting: five centers in Montevideo, from the public and private subsectors, covering 800,000 inhabitants. Patients and methods: 153 consecutive patients admitted to intensive care units (ICU) with a diagnosis of SS or ShS. Main variables of interest: those related to patient characteristics and the sepsis episode, diagnostic and therapeutic measures according to the SSC within the first 48 hours, and prognosis in the ICU, in hospital and at six months. Results: 153 patients were included; the median age was 68 years and the median Acute Physiology and Chronic Health Evaluation (APACHE II) score was 24; 73.9% received mechanical ventilation, with a median duration of 8 days. Median ICU stay was 12 days and median hospital stay 19 days. Of the SS/ShS episodes, 69.3% were community-acquired, 77.8% presented shock, and 37.9% involved immunosuppression or immunocompromise. Sepsis of respiratory origin predominated (30.1%); microorganisms were isolated in 64.1% of cases, 95.9% of them bacterial. ICU mortality was 49.7%, hospital mortality 54.9%, and six-month mortality 58.8%. Age, APACHE II score, immunosuppression or immunocompromise, delays in ICU admission and in starting antimicrobials, and positive fluid balance were associated with higher hospital mortality. Conclusions: patients are admitted to the ICU with severe forms of sepsis or a compromised biological state. There are delays and limitations in the initial diagnosis and therapy, situations associated with higher hospital mortality.

Relevance:

10.00%

Publisher:

Abstract:

Shows a battle between two warring Plains Indian tribes, the Comanche and the Apache.

Relevance:

10.00%

Publisher:

Abstract:

Technologies for Big Data and Data Science are receiving increasing research interest nowadays. This paper introduces the prototype architecture of a tool aimed at solving Big Data optimization problems. Our tool combines the jMetal framework for multi-objective optimization with Apache Spark, a technology that is gaining momentum. In particular, we make use of the streaming facilities of Spark to feed an optimization problem with data from different sources. We demonstrate the use of our tool by solving a dynamic bi-objective instance of the Traveling Salesman Problem (TSP) based on near real-time traffic data from New York City, which is updated several times per minute. Our experiment shows that jMetal and Spark can be integrated, providing a software platform to deal with dynamic multi-objective optimization problems.
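
jMetal is a Java framework, so the actual integration is Java-side; purely to illustrate the data-feeding idea (a Spark stream periodically refreshing the travel-time matrix that a dynamic TSP instance reads), here is a hypothetical Python sketch. The input path, the CSV layout and the refresh_cost_matrix placeholder are invented:

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import split, col

    spark = SparkSession.builder.appName("dynamic-tsp-feed").getOrCreate()

    # Hypothetical CSV lines "from_node,to_node,travel_time" dropped into a
    # directory by the traffic-data collector several times per minute.
    updates = (spark.readStream.text("/data/nyc-traffic")  # hypothetical path
               .select(split(col("value"), ",").alias("f"))
               .selectExpr("f[0] as src", "f[1] as dst",
                           "cast(f[2] as double) as travel_time"))

    def refresh_cost_matrix(df, epoch_id):
        # Placeholder: push the latest travel times to the running TSP
        # solver, which re-evaluates its population against the new matrix.
        for row in df.collect():
            print(row.src, row.dst, row.travel_time)

    (updates.writeStream
            .foreachBatch(refresh_cost_matrix)
            .start()
            .awaitTermination())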