909 results for Data-Driven Behavior Modeling


Relevance: 100.00%

Abstract:

Workflows have been successfully applied to express the decomposition of complex scientific applications. This has motivated many initiatives to develop scientific workflow tools. However, the existing tools still lack adequate support for important aspects, namely decoupling the enactment engine from the workflow task specification, decentralizing the control of workflow activities, and allowing tasks to run autonomously on distributed infrastructures, for instance on Clouds. Furthermore, many workflow tools only support the execution of Directed Acyclic Graphs (DAGs), without the concept of iteration, in which activities are executed over millions of iterations during long periods of time, and without support for dynamic workflow reconfiguration after a given iteration. We present the AWARD (Autonomic Workflow Activities Reconfigurable and Dynamic) model of computation, based on the Process Networks model, where the workflow activities (AWAs) are autonomic processes with independent control that can run in parallel on distributed infrastructures, e.g. on Clouds. Each AWA executes a Task developed as a Java class that implements a generic interface, allowing end-users to code their applications without concern for low-level details. The data-driven coordination of AWA interactions is based on a shared tuple space that also supports dynamic workflow reconfiguration and monitoring of workflow executions. We describe how AWARD supports dynamic reconfiguration and discuss typical workflow reconfiguration scenarios. For evaluation, we describe experimental results of AWARD workflow executions in several application scenarios, mapped to a small dedicated cluster and the Amazon Elastic Compute Cloud (EC2).
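
To make the coordination style concrete, here is a minimal sketch of data-driven, tuple-space coordination between autonomous activities. It is a toy Python illustration of the idea only: the AWARD tasks are Java classes, and the TupleSpace and AWA names here are hypothetical.

```python
import queue
from collections import defaultdict

class TupleSpace:
    """Toy shared tuple space: activities coordinate only through tuples."""
    def __init__(self):
        self._channels = defaultdict(queue.Queue)

    def put(self, tag, value):
        self._channels[tag].put(value)       # publish a tuple under a tag

    def take(self, tag):
        return self._channels[tag].get()     # block until a matching tuple exists

class AWA:
    """Autonomic Workflow Activity: runs independently, driven by input tuples."""
    def __init__(self, space, in_tag, out_tag, task):
        self.space, self.in_tag, self.out_tag, self.task = space, in_tag, out_tag, task

    def step(self):
        data = self.space.take(self.in_tag)  # data-driven: wait for an input tuple
        self.space.put(self.out_tag, self.task(data))

space = TupleSpace()
activity = AWA(space, "raw", "squared", task=lambda x: x * x)
space.put("raw", 7)
activity.step()
print(space.take("squared"))  # -> 49
```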

Relevance: 100.00%

Abstract:

Workflows have been successfully applied to express the decomposition of complex scientific applications. However, the existing tools still lack adequate support for important aspects, namely decoupling the enactment engine from the task specification, decentralizing the control of workflow activities so that their tasks can run on distributed infrastructures, and supporting dynamic workflow reconfiguration. We present the AWARD (Autonomic Workflow Activities Reconfigurable and Dynamic) model of computation, based on Process Networks, where the workflow activities (AWAs) are autonomic processes with independent control that can run in parallel on distributed infrastructures. Each AWA executes a task developed as a Java class with a generic interface, allowing end-users to code their applications without low-level details. The data-driven coordination of AWA interactions is based on a shared tuple space that also enables dynamic workflow reconfiguration. For evaluation, we describe experimental results of AWARD workflow executions in several application scenarios, mapped to the Amazon Elastic Compute Cloud (EC2).

Relevance: 100.00%

Abstract:

In this paper, a rule-based automatic syllabifier for Danish based on the Maximal Onset Principle is described. The success of earlier rule-based syllabification modules for Portuguese and Catalan was the basis for this work. The system was implemented and tested using a very small set of rules. Contrary to our initial expectations, the results reached word accuracy rates of 96.9% and 98.7%, despite Danish being a language with a complex syllabic structure and thus presumably difficult to syllabify by rule. A comparison with a data-driven syllabification system based on artificial neural networks showed a higher accuracy rate for the rule-based system.
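
As an illustration of the Maximal Onset Principle the syllabifier is built on, here is a toy Python sketch. The vowel and legal-onset inventories are invented placeholders, not the paper's Danish rule set.

```python
# Toy Maximal Onset Principle syllabifier: between two vowels, assign the
# longest consonant cluster that is a legal syllable onset to the following
# syllable. The inventories below are illustrative, not the paper's rules.
VOWELS = set("aeiouyæøå")
LEGAL_ONSETS = {"", "b", "d", "f", "g", "h", "k", "l", "m", "n", "p", "r",
                "s", "t", "v", "bl", "dr", "kr", "pr", "sk", "sp", "st",
                "tr", "str"}

def syllabify(word):
    nuclei = [i for i, ch in enumerate(word) if ch in VOWELS]
    if len(nuclei) < 2:
        return [word]
    syllables, start = [], 0
    for left, right in zip(nuclei, nuclei[1:]):
        cluster = word[left + 1:right]        # consonants between two nuclei
        # Maximal onset: give the next syllable the longest legal onset.
        split = next(len(cluster) - n for n in range(len(cluster), -1, -1)
                     if cluster[len(cluster) - n:] in LEGAL_ONSETS)
        boundary = left + 1 + split
        syllables.append(word[start:boundary])
        start = boundary
    syllables.append(word[start:])
    return syllables

print(syllabify("hundrede"))  # -> ['hun', 'dre', 'de']
```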

Relevance: 100.00%

Abstract:

Nowadays, a marked process of technological evolution can be observed all over the globe. Companies, whether small, medium-sized, or large, are increasingly dependent on computerized systems to carry out their business processes, and consequently on the business information these systems generate, where the data often have no relationship to each other. Most conventional computer systems are not designed to manage and store strategic information, which prevents it from serving as a strategic resource. Decisions are therefore made based on the administrators' experience, when they could be based on historical facts stored by the various systems. In general, organizations hold a great deal of data but in most cases extract little information from it, which is a problem in competitive markets. As organizations seek to evolve and outperform the competition in decision-making, the term Business Intelligence (BI) arises in this context. GisGeo Information Systems is a company that develops GIS-based (geographic information systems) software following an open-source tool philosophy. Its main product is based on the geographic localization of various types of vehicles, on data collection, and consequently on its analysis (kilometers traveled, duration of a trip between two defined points, fuel consumption, etc.). This is the context of this project, whose objective is to give a different perspective on the existing data, crossing BI concepts with the system implemented in the company in accordance with its philosophy. This project addresses some of the most important concepts underlying BI, such as the dimensional model, the data warehouse, the ETL process, and OLAP, following Ralph Kimball's methodology. Some of the main open-source tools on the market are also studied, along with their advantages and disadvantages relative to one another. In conclusion, the solution developed according to the criteria set by the company is presented as a proof of concept of the applicability of Business Intelligence to the field of Geographic Information Systems (GIS), using an open-source tool that supports data visualization through dashboards.
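
As a concrete illustration of the Kimball-style concepts the project applies (dimensional model, ETL, OLAP rollup), here is a minimal sketch assuming a hypothetical star schema for vehicle trips; the table and column names are invented, not GisGeo's actual model.

```python
import sqlite3

# Minimal star-schema sketch in the Kimball style described above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_vehicle (vehicle_key INTEGER PRIMARY KEY, plate TEXT, type TEXT);
CREATE TABLE dim_date    (date_key INTEGER PRIMARY KEY, day TEXT);
CREATE TABLE fact_trip (                 -- one row per completed trip
    vehicle_key INTEGER REFERENCES dim_vehicle,
    date_key    INTEGER REFERENCES dim_date,
    km_driven   REAL, duration_min REAL, fuel_l REAL);
""")

# Tiny ETL step: load a raw GPS-derived record into the warehouse.
conn.execute("INSERT INTO dim_vehicle VALUES (1, 'AA-01-BB', 'van')")
conn.execute("INSERT INTO dim_date VALUES (20140305, '2014-03-05')")
conn.execute("INSERT INTO fact_trip VALUES (1, 20140305, 42.7, 55.0, 4.1)")

# OLAP-style rollup: fuel consumption per vehicle type, ready for a dashboard.
for row in conn.execute("""
    SELECT v.type, SUM(f.fuel_l) / SUM(f.km_driven) * 100 AS l_per_100km
    FROM fact_trip f JOIN dim_vehicle v USING (vehicle_key)
    GROUP BY v.type"""):
    print(row)   # -> ('van', 9.60...)
```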

Relevance: 100.00%

Abstract:

This dissertation investigates and documents what is currently being done in data-driven journalism in Portugal. Since this is a new field within journalism, interviews are used to understand how the editors of Portuguese newspapers define, characterize, use, and perceive the potential of this new category of digital journalism. Examples of reporting with data-journalism characteristics mentioned by the interviewees are also analyzed. Another objective of the research was to contextualize the evolution and the importance of technology for the emergence of data journalism. The aim is thus to present the state of the art of data journalism in Portuguese daily general-interest newspapers, identifying current trends in the area and leaving a record for future work on the subject.

Relevance: 100.00%

Abstract:

In the recent past, hardly anyone could have predicted this course of GIS development. GIS is moving from the desktop to the cloud. Web 2.0 enabled people to input data into the web, and these data are becoming increasingly geolocated. Large amounts of data formed what is called "Big Data", and scientists still do not fully know how to deal with it. Different data mining tools are used to try to extract useful information from this Big Data. In our study, we deal with one part of these data: User Generated Geographic Content (UGGC). The Panoramio initiative allows people to upload photos and describe them with tags. These photos are geolocated, which means that they have an exact location on the Earth's surface according to a certain spatial reference system. Using data mining tools, we try to answer whether it is possible to extract land use information from Panoramio photo tags, and to what extent this information can be accurate. Finally, we compared different data mining methods in order to determine which one is best suited to this kind of data, namely text. Our answers are quite encouraging: with more than 70% accuracy, we showed that extracting land use information is possible to some extent. We also found the Memory Based Reasoning (MBR) method to be the most suitable method for this kind of data in all cases.
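
Memory Based Reasoning is essentially nearest-neighbour classification, so a minimal sketch of the tag-to-land-use task could look like the following; the tags, labels, and the use of scikit-learn's KNeighborsClassifier are illustrative assumptions, not the study's actual pipeline.

```python
# Classify photos into land-use classes from their tags, treating tags as
# short text documents. The example tags and labels are invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

tags = ["beach sand sea", "office tower downtown", "wheat field tractor",
        "harbour boats sea", "skyscraper street traffic", "cows pasture farm"]
land_use = ["coastal", "urban", "agricultural",
            "coastal", "urban", "agricultural"]

# TF-IDF turns each tag string into a vector; the nearest labelled photo
# ("memory") then determines the predicted land-use class.
model = make_pipeline(TfidfVectorizer(), KNeighborsClassifier(n_neighbors=1))
model.fit(tags, land_use)

print(model.predict(["sea boats sand"]))   # -> ['coastal']
```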

Relevance: 100.00%

Abstract:

Customer lifetime value (LTV) enables using client characteristics, such as recency, frequency and monetary (RFM) value, to describe the value of a client through time in terms of profitability. We present the concept of LTV applied to telemarketing for improving the return on investment, using a recent (2008 to 2013) and real case study of bank campaigns to sell long-term deposits. The goal was to benefit from the past contact history to extract additional knowledge. A total of twelve LTV input variables were tested, under a forward selection method and using a realistic rolling-windows scheme, highlighting the validity of five new LTV features. The results achieved by our data-driven LTV approach using neural networks allowed an improvement of up to 4 percentage points in the cumulative Lift curve for targeting deposit subscribers when compared with a baseline model (with no history data). Explanatory knowledge was also extracted from the proposed model, revealing two highly relevant LTV features: the last result of the previous campaign to sell the same product, and the frequency of past client successes. The obtained results are particularly valuable for contact center companies, which can improve predictive performance without even having to ask for more information from the companies they serve.
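
A minimal sketch of RFM-style LTV features computed from a contact history, in the spirit of the abstract; the record layout and this particular feature set are illustrative assumptions, not the paper's twelve tested variables.

```python
from datetime import date

contacts = [  # (client_id, contact_date, outcome, amount) -- invented history
    (1, date(2012, 5, 2), "failure", 0.0),
    (1, date(2013, 1, 15), "success", 500.0),
    (1, date(2013, 6, 3), "success", 250.0),
]

def ltv_features(history, today=date(2013, 7, 1)):
    dates = [d for _, d, _, _ in history]
    outcomes = [o for _, _, o, _ in history]
    return {
        "recency_days": (today - max(dates)).days,          # R
        "frequency": len(history),                          # F
        "monetary": sum(a for _, _, _, a in history),       # M
        "freq_past_successes": outcomes.count("success"),   # history feature
        "last_outcome_success": outcomes[-1] == "success",  # history feature
    }

print(ltv_features(contacts))
# -> {'recency_days': 28, 'frequency': 3, 'monetary': 750.0, ...}
```

These feature vectors would then feed the predictive model (neural networks in the paper) under a rolling-windows evaluation, so each prediction only uses contacts that precede it in time.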

Relevance: 100.00%

Abstract:

Developing and implementing data-oriented workflows for data migration processes are complex tasks involving several problems related to the integration of data coming from different schemas. They usually involve very specific requirements: almost every process is unique. Having a way to abstract their representation helps us to better understand and validate them with business users, which is a crucial step for requirements validation. In this demo we present an approach that incrementally enriches conceptual models in order to support the automatic production of their corresponding physical implementation. We show how the B2K (Business to Kettle) system transforms BPMN 2.0 conceptual models into executable Kettle data-integration processes, covering the most relevant aspects of model design and enrichment, model-to-system transformation, and system execution.
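
A toy sketch of the model-to-system idea: read BPMN 2.0 task elements and emit a skeleton of a Kettle transformation. The mapping is a deliberately simplified assumption for illustration, not B2K's actual transformation rules.

```python
# Walk the tasks of a BPMN 2.0 process and emit one placeholder Kettle step
# per task. The BPMN snippet and the emitted step type are illustrative.
import xml.etree.ElementTree as ET

BPMN = """<definitions xmlns:bpmn="http://www.omg.org/spec/BPMN/20100524/MODEL">
  <bpmn:process id="migrate_clients">
    <bpmn:task id="t1" name="Read legacy clients"/>
    <bpmn:task id="t2" name="Normalize addresses"/>
    <bpmn:task id="t3" name="Load target schema"/>
  </bpmn:process>
</definitions>"""

NS = "http://www.omg.org/spec/BPMN/20100524/MODEL"
ktr = ET.Element("transformation")          # root of a Kettle-style document
for task in ET.fromstring(BPMN).iter("{%s}task" % NS):
    step = ET.SubElement(ktr, "step")
    ET.SubElement(step, "name").text = task.get("name")  # BPMN task -> step
    ET.SubElement(step, "type").text = "Placeholder"     # assumed step type

print(ET.tostring(ktr, encoding="unicode"))
```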

Relevance: 100.00%

Abstract:

Results of a search for decays of massive particles to fully hadronic final states are presented. This search uses 20.3 fb⁻¹ of data collected by the ATLAS detector in √s = 8 TeV proton-proton collisions at the LHC. Signatures based on high jet multiplicities without requirements on the missing transverse momentum are used to search for R-parity-violating supersymmetric gluino pair production with subsequent decays to quarks. The analysis is performed using a requirement on the number of jets, in combination with separate requirements on the number of b-tagged jets, as well as a topological observable formed from the scalar sum of the mass values of large-radius jets in the event. Results are interpreted in the context of all possible branching ratios of direct gluino decays to various quark flavors. No significant deviation is observed from the expected Standard Model backgrounds, estimated using jet counting as well as data-driven templates of the total-jet-mass spectra. Gluino pair decays to ten or more quarks via intermediate neutralinos are excluded for a gluino mass m(g̃) < 1 TeV at a neutralino mass m(χ̃₁⁰) = 500 GeV. Direct gluino decays to six quarks are excluded for m(g̃) < 917 GeV for light-flavor final states, and results for various flavor hypotheses are presented.
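
The topological observable can be stated compactly in code: the scalar sum of large-radius jet masses. A minimal sketch with invented four-vectors (a real analysis would take them from reconstruction):

```python
# Total-jet-mass observable: M_J = sum over large-radius jets of m(jet),
# with m^2 = E^2 - |p|^2 for each jet. The four-vectors are made up.
import math

def jet_mass(E, px, py, pz):
    m2 = E**2 - (px**2 + py**2 + pz**2)   # invariant mass squared
    return math.sqrt(max(m2, 0.0))        # clamp tiny negatives from rounding

large_radius_jets = [  # (E, px, py, pz) in GeV
    (512.0, 300.0, 250.0, 280.0),
    (410.0, -310.0, 150.0, -200.0),
    (350.0, 120.0, -280.0, 160.0),
]

total_jet_mass = sum(jet_mass(*j) for j in large_radius_jets)
print(f"M_J = {total_jet_mass:.1f} GeV")  # -> M_J = 338.3 GeV
```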

Relevance: 100.00%

Abstract:

Integrated master's dissertation in Information Systems Engineering and Management

Relevance: 100.00%

Abstract:

Doctoral thesis in Environmental and Molecular Biology

Relevance: 100.00%

Abstract:

The environmental and socio-economic importance of coastal areas is widely recognized, but at present these areas face severe weaknesses and high-risk situations. The increased demand and growing human occupation of coastal zones have greatly contributed to exacerbating such weaknesses. Today, throughout the world, in all countries with coastal regions, episodes of wave overtopping and coastal flooding are frequent. These episodes are usually responsible for property losses and often put human lives at risk. The floods are caused by coastal storms, primarily through the action of very strong winds. The propagation of these storms towards the coast induces high water levels. Climate change phenomena are expected to contribute to the intensification of coastal storms. In this context, an estimation of coastal flooding hazards is of paramount importance for the planning and management of coastal zones. Consequently, running a series of storm scenarios and analyzing their impacts through numerical modeling is of prime interest to coastal decision-makers. Firstly, in this work, historical storm tracks and intensities are characterized for the northeastern United States coast in terms of probability of occurrence. Secondly, several storm events with a high potential of occurrence are generated using a specific tool of the DelftDashboard interface for the Delft3D software. Hydrodynamic models are then used to generate ensemble simulations to assess the storms' effects on coastal water levels. For the United States' northeastern coast, a highly refined regional domain is considered around the area of The Battery, New York, situated in New York Harbor. Based on statistical data from the numerical modeling results, a review of the impact of coastal storms on different locations within the study area is performed.
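
A minimal sketch of the ensemble post-processing step implied here: given simulated peak water levels at a station (e.g. The Battery) across many storm scenarios, estimate the probability of exceeding a flooding threshold. The levels and the 2.5 m threshold are made-up illustration values, not model output.

```python
# Empirical exceedance probability over an ensemble of storm simulations.
peak_levels_m = [1.8, 2.1, 2.7, 3.0, 1.6, 2.4, 2.9, 2.2, 3.3, 1.9]

def exceedance_probability(levels, threshold):
    # Fraction of ensemble members whose peak level tops the threshold.
    return sum(lv > threshold for lv in levels) / len(levels)

threshold = 2.5
p = exceedance_probability(peak_levels_m, threshold)
print(f"P(peak level > {threshold} m) = {p:.0%}")  # -> 40%
```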

Relevance: 100.00%

Abstract:

To examine the effects of a mother's anxiety and depression, and associated risk factors, during early pregnancy on fetal growth and activity. Repeated measures of the mother's anxiety (State Anxiety Inventory, STAI-S) and depression (Edinburgh Postnatal Depression Scale, EPDS) and of related sociodemographic factors and substance consumption were obtained in the 1st and 2nd pregnancy trimesters, and the fetuses' (N = 147) biometric data and behavior were recorded during ultrasound examination at 20-22 weeks of gestation. Higher anxiety symptoms were associated with both lower fetal growth and higher fetal activity. While lower education, primiparity, adolescent motherhood, and tobacco consumption predicted lower fetal growth, coffee intake predicted lower fetal activity. These results suggest the vulnerability of fetal development to the mother's psychological symptoms, as well as to other sociodemographic and substance consumption risk factors, during early and mid-pregnancy.

Relevance: 100.00%

Abstract:

The algorithmic approach to data modelling has developed rapidly in recent years; in particular, methods based on data mining and machine learning have been used in a growing number of applications. These methods follow a data-driven methodology, aiming at providing the best possible generalization and predictive ability instead of concentrating on the properties of the data model. One of the most successful groups of such methods is known as Support Vector algorithms. Following the fruitful developments in applying Support Vector algorithms to spatial data, this paper introduces a new extension of the traditional support vector regression (SVR) algorithm. This extension allows for the simultaneous modelling of environmental data at several spatial scales. The joint influence of environmental processes presenting different patterns at different scales is learned automatically from the data, providing the optimum mixture of short- and large-scale models. The method is adaptive to the spatial scale of the data. With this advantage, it can provide efficient means to model local anomalies that may typically arise at an early phase of an environmental emergency. However, the proposed approach still requires some prior knowledge of the possible existence of such short-scale patterns. This is a possible limitation of the method for its implementation in early warning systems. The purpose of this paper is to present the multi-scale SVR model and to illustrate its use with an application to the mapping of Cs137 activity, given the measurements taken in the region of Briansk following the Chernobyl accident.
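
A minimal sketch of the multi-scale idea under generic assumptions: an SVR whose kernel mixes two RBF kernels with short and large length scales, so the fit can capture both a local anomaly and the regional trend. This is an illustration of the approach, not the paper's exact formulation; the data are synthetic.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.metrics.pairwise import rbf_kernel

def multi_scale_kernel(X, Y, gamma_short=50.0, gamma_large=0.5, w=0.4):
    # A weighted sum of valid kernels is itself a valid kernel, so the SVR
    # optimization machinery applies unchanged to the mixture.
    return (w * rbf_kernel(X, Y, gamma=gamma_short)
            + (1 - w) * rbf_kernel(X, Y, gamma=gamma_large))

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 2))            # 2-D "coordinates"
y = np.sin(X[:, 0]) + 2.0 * np.exp(-8 * ((X[:, 0] - 5) ** 2
                                         + (X[:, 1] - 5) ** 2))  # trend + hotspot

model = SVR(kernel=multi_scale_kernel, C=10.0).fit(X, y)
print("prediction near the hotspot:", model.predict([[5.0, 5.0]]))
```

In the mixture, the weight and the two kernel widths play the role of the short/large-scale trade-off the paper learns from data; here they are fixed by hand for brevity.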

Relevance: 100.00%

Abstract:

OBJECTIVES: To develop data-driven criteria for clinically inactive disease on and off therapy for juvenile dermatomyositis (JDM). METHODS: The Paediatric Rheumatology International Trials Organisation (PRINTO) database contains 275 patients with active JDM evaluated prospectively for up to 24 months. Thirty-eight patients off therapy at 24 months were defined as clinically inactive and included in the reference group. These were compared with a random sample of 76 patients who had active disease at study baseline. Individual measures of muscle strength/endurance, muscle enzymes, physician's and parent's global disease activity/damage evaluations, inactive disease criteria derived from the literature and other ad hoc criteria were evaluated for sensitivity, specificity and Cohen's κ agreement. RESULTS: The individual measures that best characterised inactive disease (sensitivity and specificity >0.8 and Cohen's κ >0.8) were manual muscle testing (MMT) ≥78, physician global assessment of muscle activity = 0, physician global assessment of overall disease activity (PhyGloVAS) ≤0.2, Childhood Myositis Assessment Scale (CMAS) ≥48, Disease Activity Score ≤3 and Myositis Disease Activity Assessment Visual Analogue Scale ≤0.2. The best combination of variables to classify a patient as being in a state of inactive disease on or off therapy is at least three of the following four criteria: creatine kinase ≤150, CMAS ≥48, MMT ≥78 and PhyGloVAS ≤0.2. After 24 months, 30/31 patients (96.8%) were inactive off therapy and 69/145 (47.6%) were inactive on therapy. CONCLUSION: PRINTO established data-driven criteria with clearly evidence-based cut-off values to identify JDM patients with clinically inactive disease. These criteria can be used in clinical trials, in research and in clinical practice.
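
The combined criterion translates directly into code. A minimal sketch encoding the 3-of-4 rule with the cut-offs quoted above (the function and variable names are illustrative):

```python
# A patient is classified as clinically inactive if at least three of the
# four PRINTO criteria hold; cut-off values are those quoted in the abstract.
def clinically_inactive(ck, cmas, mmt, phy_glo_vas):
    criteria = [
        ck <= 150,           # creatine kinase
        cmas >= 48,          # Childhood Myositis Assessment Scale
        mmt >= 78,           # manual muscle testing
        phy_glo_vas <= 0.2,  # physician global assessment of overall activity
    ]
    return sum(criteria) >= 3

print(clinically_inactive(ck=120, cmas=50, mmt=80, phy_glo_vas=0.5))  # True
print(clinically_inactive(ck=300, cmas=40, mmt=80, phy_glo_vas=0.1))  # False
```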