933 results for Data Extraction


Relevance:

100.00%

Abstract:

The study of electricity market operation has gained increasing importance in recent years as a result of the new challenges produced by the restructuring process. A great deal of information about electricity markets is now available, as market operators publish, after a confidentiality period, data on market proposals and transactions. These data can be used as a source of knowledge to define realistic scenarios, which are essential for understanding and forecasting electricity market behaviour. The development of tools able to extract, transform, store and dynamically update data is therefore of great importance for advancing the understanding of electricity markets and of the behaviour of the entities involved. This paper presents an adaptable tool capable of downloading, parsing and storing data from market operators' websites, ensuring that the stored data are constantly updated and reliable.
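The parse-and-store part of such a tool can be sketched as follows. This is a minimal illustration and not the tool described in the abstract: the CSV layout, the column names, the table name `market_results` and the sample values are all invented, and the download step is left out so the sketch runs offline.

```python
import csv
import io
import sqlite3

# Hypothetical sample of a market operator's published file; the column
# names and values are invented for illustration only.
SAMPLE_CSV = """date,hour,price_eur_mwh,energy_mwh
2012-01-01,1,45.10,21050.5
2012-01-01,2,43.75,20310.0
"""


def parse_market_csv(text):
    """Parse raw CSV text into typed tuples ready for storage."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        rows.append((rec["date"], int(rec["hour"]),
                     float(rec["price_eur_mwh"]), float(rec["energy_mwh"])))
    return rows


def store(conn, rows):
    """Insert or update rows so that repeating an update stays idempotent."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS market_results ("
        "date TEXT, hour INTEGER, price REAL, energy REAL, "
        "PRIMARY KEY (date, hour))"
    )
    # The (date, hour) primary key makes a re-downloaded period overwrite
    # the stored rows instead of duplicating them.
    conn.executemany(
        "INSERT OR REPLACE INTO market_results VALUES (?, ?, ?, ?)", rows
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
store(conn, parse_market_csv(SAMPLE_CSV))
store(conn, parse_market_csv(SAMPLE_CSV))  # a repeated update adds no duplicates
row_count = conn.execute("SELECT COUNT(*) FROM market_results").fetchone()[0]
```

Keying the table on the market period is one simple way to keep repeated updates from corrupting the stored data; a real tool would also need per-operator parsers and a download scheduler.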

Relevance:

100.00%

Abstract:

Electricity markets worldwide have undergone profound transformations. The privatization of previously nationally owned systems, the deregulation of privately owned systems that were regulated, and the strong interconnection of national systems are some examples of such transformations [1, 2]. In general, competitive environments such as electricity markets require good decision-support tools to assist players in their decisions. Relevant research is being undertaken in this field, namely on player modeling and simulation, strategic bidding and decision support.

Relevance:

100.00%

Abstract:

Thesis submitted to Faculdade de Ciências e Tecnologia of the Universidade Nova de Lisboa, in partial fulfilment of the requirements for the degree of Master in Computer Science

Relevance:

100.00%

Abstract:

Carte du Ciel (French for "map of the sky") is part of an extensive 19th-century international astronomical project whose goal was to map the entire visible sky. The results of this vast effort were collected in the form of astrographic plates and their paper representations, called astrographic maps, which are widely distributed among observatories and astronomical institutes around the world. Our goal is to design methods and algorithms to automatically extract data from digitized Carte du Ciel astrographic maps. This paper examines the image processing and pattern recognition techniques that can be adopted for the automatic extraction of astronomical data from the triple exposures of stars, which can aid the detection of variable stars in Carte du Ciel maps.

Relevance:

100.00%

Abstract:

We present the extraction and processing of the IUE Low Dispersion spectra within the framework of the ESA "IUE Newly Extracted Spectra" (INES) System. Weak points of SWET, the optimal extraction implementation used to produce the NEWSIPS output products (extracted spectra), are discussed, and the procedures implemented in INES to solve these problems are outlined. The most relevant modifications are: 1) the use of a new noise model, 2) a more accurate representation of the spatial profile of the spectrum and 3) a more reliable determination of the background. The INES extraction also includes a correction for contamination by solar light in long-wavelength spectra. Examples showing the improvements obtained in INES with respect to SWET are described. Finally, the linearity and repeatability characteristics of INES data are evaluated and the validity of the errors provided by the extraction is discussed.
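To show why the noise model, the spatial profile and the background all matter to an extraction of this kind, here is a minimal sketch of a generic profile-weighted flux estimate for a single wavelength bin. It is a textbook-style estimator and not the INES or SWET algorithm; the function name and the sample numbers are hypothetical.

```python
def profile_weighted_flux(counts, background, profile, variance):
    """Profile-weighted flux estimate for one wavelength bin.

    counts, background, variance: per-pixel values along the spatial direction.
    profile: spatial profile of the spectrum, normalised to sum to 1.
    Returns (flux, flux_variance).
    """
    # Each pixel's background-subtracted signal is weighted by profile/variance,
    # so noisy pixels far from the spectrum centre contribute little.
    num = sum(p * (c - b) / v
              for c, b, p, v in zip(counts, background, profile, variance))
    den = sum(p * p / v for p, v in zip(profile, variance))
    return num / den, 1.0 / den


# Noise-free toy data: a flux of 100 spread over three pixels on a flat background.
profile = [0.25, 0.5, 0.25]
background = [10.0, 10.0, 10.0]
counts = [b + 100.0 * p for b, p in zip(background, profile)]
flux, flux_var = profile_weighted_flux(counts, background, profile, [1.0, 1.0, 1.0])
```

An error in any of the three inputs (variance, profile or background) propagates directly into the estimate, which fits the abstract's focus on those three ingredients.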

Relevance:

70.00%

Abstract:

This dissertation proposes a system capable of bridging the gap between legislative documents in PDF format and legislative documents in an open format. The main goal is to map the knowledge present in these documents so that the collection can be represented as linked information. The system comprises several components responsible for carrying out three proposed phases: data extraction, knowledge organization, and information access. The first phase proposes an approach to extracting structure, text and entities from PDF documents in order to obtain the desired information, according to user-defined parameters. This approach uses two different extraction methods, matching the two stages of document processing: document analysis and document understanding. The criterion used to group text objects is the font used in the text objects, as defined in the PDF's source code (Content Stream). The approach is divided into three parts: document analysis, document understanding and conjunction. The first part deals with extracting text segments, adopting a geometric approach; the result is a list of the lines of text in the document. The second part groups the text objects according to the stipulated criterion, producing an XML document with the result of that extraction. The third and final part merges the results of the two previous parts and applies structural and logical rules to obtain the final XML document. The second phase proposes an ontology for the legal domain capable of organizing the information extracted in the first phase. It is also responsible for indexing the text of the documents. The proposed ontology has three characteristics: it is small, interoperable and shareable. The first characteristic relates to the fact that the ontology does not focus on a detailed description of the concepts involved, proposing instead a more abstract description of the entities; the second is included because of the need for interoperability with other legal-domain ontologies as well as with commonly used standard ontologies; the third is defined so that knowledge expressed according to the proposed ontology is independent of factors such as country, language or jurisdiction. The third phase addresses the question of access to and reuse of the knowledge by users external to the system, through the development of a Web Service. This component provides access to the information by exposing a set of resources to external actors who wish to access it. The Web Service follows the REST architecture. An Android mobile application was also developed to provide visualizations of the information requests. The end result is a system capable of transforming collections of PDF documents into collections in an open format, enabling access and reuse by other users. The system responds directly to the concerns of the open data community and of governments, which hold many collections of this kind but lack the capacity to reason over the information they contain and to transform it into data that citizens and professionals can view and use.
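The font-based grouping criterion can be sketched as follows. This is an illustration under stated assumptions and not the dissertation's implementation: the `(font, size, text)` tuples stand in for text objects read from a PDF content stream, the sample content is invented, and merging only consecutive objects is a simplification chosen for the example.

```python
from itertools import groupby

# Hypothetical text objects as they might come out of a PDF content stream:
# (font name, font size, text). A real extractor would need a PDF library.
text_objects = [
    ("Times-Bold", 14.0, "Article 1"),
    ("Times-Roman", 10.0, "This law establishes"),
    ("Times-Roman", 10.0, "the general regime."),
    ("Times-Bold", 14.0, "Article 2"),
    ("Times-Roman", 10.0, "It enters into force today."),
]


def group_by_font(objects):
    """Merge consecutive text objects that share the same font and size."""
    groups = []
    for (font, size), run in groupby(objects, key=lambda o: (o[0], o[1])):
        groups.append({
            "font": font,
            "size": size,
            "text": " ".join(text for _, _, text in run),
        })
    return groups


groups = group_by_font(text_objects)
```

In this toy input the bold runs line up with article headings and the roman runs with body text, which is the kind of structural cue that later rules could turn into XML elements.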

Relevance:

70.00%

Abstract:

This dissertation addresses the construction of a data warehouse for AdClick, a company operating in digital marketing. Digital marketing is a type of marketing that uses digital media for the same purpose as traditional marketing: promoting goods, businesses and services and acquiring new customers. Several digital marketing strategies exist to achieve these goals, most notably organic traffic and paid traffic. Organic traffic is characterized by marketing actions that involve no costs for promotion and/or the acquisition of potential customers. Paid traffic, in turn, requires investment in campaigns capable of driving and attracting new customers. The dissertation first reviews the state of the art in business intelligence and data warehousing and presents their main advantages for companies. Business intelligence systems are necessary because companies today hold large volumes of information-rich data that can only be properly exploited by using the capabilities of such systems. Accordingly, the first step in developing a business intelligence system is to concentrate all the data in a single integrated system capable of supporting decision making. This is where the data warehouse comes in, as the single system best suited to this type of requirement. In this dissertation, the data sources that will feed the data warehouse were surveyed and the company's existing business processes were put into context. The data warehouse was then built: dimensions and fact tables were created, the processes for extracting and loading data into the data warehouse were defined, and the various views were created. As for impact, the partner company gains several business advantages from the implementation of the data warehouse and of the ETL processes that load all the information sources. These advantages include centralized information, more flexibility for managers in how they access information, and data processed in a way that makes it possible to extract information from it.
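The dimensions, fact tables, load step and views mentioned above can be sketched as a tiny star schema. All table, column and view names and all sample rows are invented for illustration; nothing here reflects AdClick's actual schema or data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables describe the context of each measurement.
CREATE TABLE dim_campaign (
    campaign_id INTEGER PRIMARY KEY,
    name TEXT,
    traffic_type TEXT          -- 'organic' or 'paid'
);
CREATE TABLE dim_date (
    date_id INTEGER PRIMARY KEY,
    iso_date TEXT
);
-- The fact table holds the numeric measures, keyed by the dimensions.
CREATE TABLE fact_leads (
    campaign_id INTEGER REFERENCES dim_campaign(campaign_id),
    date_id INTEGER REFERENCES dim_date(date_id),
    leads INTEGER,
    cost_eur REAL
);
-- A view gives managers a ready-made aggregate.
CREATE VIEW v_cost_per_lead AS
SELECT c.traffic_type, SUM(f.cost_eur) / SUM(f.leads) AS cost_per_lead
FROM fact_leads f JOIN dim_campaign c USING (campaign_id)
GROUP BY c.traffic_type;
""")

# Load step of a toy ETL run, using invented rows.
conn.executemany("INSERT INTO dim_campaign VALUES (?, ?, ?)",
                 [(1, "newsletter", "organic"), (2, "search ads", "paid")])
conn.execute("INSERT INTO dim_date VALUES (1, '2015-01-01')")
conn.executemany("INSERT INTO fact_leads VALUES (?, ?, ?, ?)",
                 [(1, 1, 40, 0.0), (2, 1, 100, 250.0)])

result = dict(conn.execute(
    "SELECT traffic_type, cost_per_lead FROM v_cost_per_lead").fetchall())
```

Keeping measures in the fact table and descriptive attributes in the dimensions is what lets a single view compare organic and paid traffic without touching the source systems.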

Relevance:

70.00%

Abstract:

In recent years a set of production paradigms has been proposed to enable manufacturers to meet new market requirements, such as the shift in demand towards highly customized products with shorter life cycles, rather than traditional mass-produced standardized consumables. These new paradigms advocate solutions capable of facing these requirements, giving manufacturing systems a high capacity to adapt, along with the flexibility and robustness needed to deal with disturbances such as unexpected orders or malfunctions. Evolvable Production Systems propose a solution based on modularity and self-organization at a fine level of granularity, supporting pluggability and thus allowing companies to add and/or remove components during execution without any extra re-programming effort. However, current monitoring software was not designed to fully support these characteristics: it is commonly based on centralized SCADA systems, which cannot re-adapt during execution to the unexpected plugging or unplugging of devices or to changes in the topology of the entire system. Considering these aspects, the work developed for this thesis encompasses a fully distributed agent-based architecture capable of performing knowledge extraction at different levels of abstraction without sacrificing the capacity to add and/or remove, at runtime, the monitoring entities responsible for data extraction and analysis.

Relevance:

70.00%

Abstract:

Dissertation for the integrated master's degree in Engenharia e Gestão de Sistemas de Informação (Information Systems Engineering and Management)

Relevance:

70.00%

Abstract:

Context: There are no evidence syntheses available to guide clinicians on when to titrate antihypertensive medication after initiation. Objective: To model the blood pressure (BP) response after initiating antihypertensive medication. Data sources: Electronic databases including Medline, Embase, Cochrane Register and reference lists up to December 2009. Study selection: Trials that initiated antihypertensive medication as single therapy in hypertensive patients who were either drug naive or had a placebo washout from previous drugs. Data extraction: Office BP measurements at a minimum of two weekly intervals for a minimum of 4 weeks. An asymptotic approach model of BP response was assumed and non-linear mixed effects modelling used to calculate model parameters. Results and conclusions: Eighteen trials that recruited 4168 patients met inclusion criteria. The time to reach 50% of the maximum estimated BP lowering effect was 1 week (systolic 0.91 weeks, 95% CI 0.74 to 1.10; diastolic 0.95, 0.75 to 1.15). Models incorporating drug class as a source of variability did not improve fit of the data. Incorporating the presence of a titration schedule improved model fit for both systolic and diastolic pressure. Titration increased both the predicted maximum effect and the time taken to reach 50% of the maximum (systolic 1.2 vs 0.7 weeks; diastolic 1.4 vs 0.7 weeks). Conclusions: Estimates of the maximum efficacy of antihypertensive agents can be made early after starting therapy. This knowledge will guide clinicians in deciding when a newly started antihypertensive agent is likely to be effective or not at controlling BP.
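To make the "time to reach 50% of the maximum effect" concrete, the sketch below assumes the asymptotic approach takes a simple exponential form, effect(t) = E_max × (1 − exp(−k·t)), for which the half-time is ln(2)/k. The abstract does not state the exact functional form, so this parameterisation and the function name are assumptions; only the 0.91-week systolic half-time comes from the abstract.

```python
import math


def fraction_of_max_effect(t_weeks, t50_weeks):
    """Fraction of the maximum BP reduction reached after t_weeks.

    Assumes effect(t) = E_max * (1 - exp(-k * t)), so that k = ln(2) / t50.
    """
    k = math.log(2) / t50_weeks
    return 1.0 - math.exp(-k * t_weeks)


# Systolic half-time reported in the abstract: 0.91 weeks.
for weeks in (1, 2, 4):
    print(weeks, round(fraction_of_max_effect(weeks, 0.91), 2))
```

Under this assumed form, most of the eventual effect is visible within a few half-times, which fits the abstract's conclusion that maximum efficacy can be estimated early after starting therapy.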

Relevance:

70.00%

Abstract:

CONTEXT: Subclinical hypothyroidism has been associated with increased risk of coronary heart disease (CHD), particularly with thyrotropin levels of 10.0 mIU/L or greater. The measurement of thyroid antibodies helps predict the progression to overt hypothyroidism, but it is unclear whether thyroid autoimmunity independently affects CHD risk. OBJECTIVE: The objective of the study was to compare the CHD risk of subclinical hypothyroidism with and without thyroid peroxidase antibodies (TPOAbs). DATA SOURCES AND STUDY SELECTION: A MEDLINE and EMBASE search from 1950 to 2011 was conducted for prospective cohorts, reporting baseline thyroid function, antibodies, and CHD outcomes. DATA EXTRACTION: Individual data of 38 274 participants from six cohorts for CHD mortality followed up for 460 333 person-years and 33 394 participants from four cohorts for CHD events. DATA SYNTHESIS: Among 38 274 adults (median age 55 y, 63% women), 1691 (4.4%) had subclinical hypothyroidism, of whom 775 (45.8%) had positive TPOAbs. During follow-up, 1436 participants died of CHD and 3285 had CHD events. Compared with euthyroid individuals, age- and gender-adjusted risks of CHD mortality in subclinical hypothyroidism were similar among individuals with and without TPOAbs [hazard ratio (HR) 1.15, 95% confidence interval (CI) 0.87-1.53 vs HR 1.26, CI 1.01-1.58, P for interaction = .62], as were risks of CHD events (HR 1.16, CI 0.87-1.56 vs HR 1.26, CI 1.02-1.56, P for interaction = .65). Risks of CHD mortality and events increased with higher thyrotropin, but within each stratum, risks did not differ by TPOAb status. CONCLUSIONS: CHD risk associated with subclinical hypothyroidism did not differ by TPOAb status, suggesting that biomarkers of thyroid autoimmunity do not add independent prognostic information for CHD outcomes.
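As a rough illustration of how two subgroup hazard ratios can be compared, the sketch below recovers each log hazard ratio and its standard error from the reported 95% CI and applies a normal test to their difference. This is a back-of-the-envelope approximation that treats the two estimates as independent; the study itself may have tested interaction within its regression models, and the function names are hypothetical.

```python
import math


def log_hr_and_se(hr, ci_low, ci_high, z_crit=1.96):
    """Recover log(HR) and its standard error from a 95% confidence interval."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    return math.log(hr), se


def interaction_p_value(est_a, est_b):
    """Two-sided p-value for the difference between two independent log HRs."""
    (log_a, se_a), (log_b, se_b) = est_a, est_b
    z = (log_a - log_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for a standard normal.
    return math.erfc(abs(z) / math.sqrt(2))


# CHD mortality hazard ratios from the abstract.
with_tpoab = log_hr_and_se(1.15, 0.87, 1.53)
without_tpoab = log_hr_and_se(1.26, 1.01, 1.58)
p_mortality = interaction_p_value(with_tpoab, without_tpoab)
```

With the mortality figures above, this approximation gives a p-value close to the reported P for interaction of .62, which shows why overlapping intervals of this width give no evidence of a difference by TPOAb status.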

Relevance:

70.00%

Abstract:

OBJECTIVE: The objective was to determine the risk of stroke associated with subclinical hypothyroidism. DATA SOURCES AND STUDY SELECTION: Published prospective cohort studies were identified through a systematic search through November 2013 without restrictions in several databases. Unpublished studies were identified through the Thyroid Studies Collaboration. We collected individual participant data on thyroid function and stroke outcome. Euthyroidism was defined as TSH levels of 0.45-4.49 mIU/L, and subclinical hypothyroidism was defined as TSH levels of 4.5-19.9 mIU/L with normal T4 levels. DATA EXTRACTION AND SYNTHESIS: We collected individual participant data on 47 573 adults (3451 subclinical hypothyroidism) from 17 cohorts and followed up from 1972-2014 (489 192 person-years). Age- and sex-adjusted pooled hazard ratios (HRs) for participants with subclinical hypothyroidism compared to euthyroidism were 1.05 (95% confidence interval [CI], 0.91-1.21) for stroke events (combined fatal and nonfatal stroke) and 1.07 (95% CI, 0.80-1.42) for fatal stroke. Stratified by age, the HR for stroke events was 3.32 (95% CI, 1.25-8.80) for individuals aged 18-49 years. There was an increased risk of fatal stroke in the age groups 18-49 and 50-64 years, with a HR of 4.22 (95% CI, 1.08-16.55) and 2.86 (95% CI, 1.31-6.26), respectively (p trend 0.04). We found no increased risk for those 65-79 years old (HR, 1.00; 95% CI, 0.86-1.18) or ≥ 80 years old (HR, 1.31; 95% CI, 0.79-2.18). There was a pattern of increased risk of fatal stroke with higher TSH concentrations. CONCLUSIONS: Although no overall effect of subclinical hypothyroidism on stroke could be demonstrated, an increased risk in subjects younger than 65 years and those with higher TSH concentrations was observed.

Relevance:

70.00%

Abstract:

Background: Expression microarrays are increasingly used to obtain large scale transcriptomic information on a wide range of biological samples. Nevertheless, there is still much debate on the best ways to process data, to design experiments and to analyse the output. Furthermore, many of the more sophisticated mathematical approaches to data analysis in the literature remain inaccessible to much of the biological research community. In this study we examine ways of extracting and analysing a large data set obtained using the Agilent long oligonucleotide transcriptomics platform, applied to a set of human macrophage and dendritic cell samples. Results: We describe and validate a series of data extraction, transformation and normalisation steps which are implemented via a new R function. Analysis of replicate normalised reference data demonstrates that intraarray variability is small (only around 2% of the mean log signal), while interarray variability from replicate array measurements has a standard deviation (SD) of around 0.5 log(2) units (6% of mean). The common practice of working with ratios of Cy5/Cy3 signal offers little further improvement in terms of reducing error. Comparison to expression data obtained using Arabidopsis samples demonstrates that the large number of genes in each sample showing a low level of transcription reflects the real complexity of the cellular transcriptome. Multidimensional scaling is used to show that the processed data identify an underlying structure which reflects some of the key biological variables that define the data set. This structure is robust, allowing reliable comparison of samples collected over a number of years and by a variety of operators. Conclusions: This study outlines a robust and easily implemented pipeline for extracting, transforming, normalising and visualising transcriptomic array data from the Agilent expression platform. The analysis is used to obtain quantitative estimates of the SD arising from experimental (non-biological) intra- and interarray variability, and a lower threshold for determining whether an individual gene is expressed. The study provides a reliable basis for further, more extensive studies of the systems biology of eukaryotic cells.
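The transformation, normalisation and variability steps can be illustrated with a small sketch. The study implemented its pipeline as an R function whose details are not given here, so the median-centring choice, the function names and the toy intensities below are assumptions made for illustration only.

```python
import math
import statistics


def log2_median_normalise(signals):
    """Log2-transform raw intensities and centre the array on its median."""
    logged = [math.log2(s) for s in signals]
    med = statistics.median(logged)
    return [x - med for x in logged]


def interarray_sd(arrays):
    """Per-gene SD across replicate arrays, averaged over all genes."""
    per_gene = zip(*arrays)  # regroup values gene by gene
    return statistics.mean(statistics.stdev(values) for values in per_gene)


# Two toy replicate arrays; the second was scanned twice as brightly.
raw_a = [64.0, 128.0, 256.0]
raw_b = [128.0, 256.0, 512.0]
norm_a = log2_median_normalise(raw_a)
norm_b = log2_median_normalise(raw_b)
sd_after = interarray_sd([norm_a, norm_b])
sd_before = interarray_sd([[math.log2(s) for s in raw_a],
                           [math.log2(s) for s in raw_b]])
```

In this toy case the global brightness difference between the replicates accounts for all of the interarray SD, so centring removes it entirely; real replicates would retain a residual SD of the kind the abstract quantifies.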

