922 results for Data replication processes
Abstract:
The discovery of the Cosmic Microwave Background (CMB) radiation in 1965 is one of the fundamental milestones supporting the Big Bang theory. The CMB is one of the most important sources of information in cosmology. The excellent accuracy of the recent CMB data from the WMAP and Planck satellites confirmed the validity of the standard cosmological model and set a new challenge for the data analysis processes and their interpretation. In this thesis we deal with several aspects and useful tools of the data analysis, focusing on their optimization in order to fully exploit the Planck data and contribute to the final published results. The issues investigated are: the change of coordinates of CMB maps using the HEALPix package, the problem of the aliasing effect in the generation of low resolution maps, and the comparison of the Angular Power Spectrum (APS) extraction performance of the optimal QML method, implemented in the code called BolPol, with that of the pseudo-Cl method, implemented in Cromaster. The QML method has then been applied to the Planck data at large angular scales to extract the CMB APS. The same method has also been applied to analyze the TT parity and Low Variance anomalies in the Planck maps, showing a consistent deviation from the standard cosmological model; the possible origins of these results are discussed. The Cromaster code instead has been applied to the 408 MHz and 1.42 GHz surveys, focusing on the analysis of the APS of selected regions of the synchrotron emission. The new generation of CMB experiments will be dedicated to polarization measurements, which require high accuracy devices for separating the polarizations. Here a new technology, called Photonic Crystals, is exploited to develop a new polarization splitter device, and its performance is compared to that of the devices used today.
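As a hedged illustration of two of the map-level operations mentioned above (the change of coordinates of a HEALPix map and a simple pseudo-Cl spectrum estimate), the sketch below uses the Python healpy bindings to HEALPix; the input file name is a placeholder, and BolPol and Cromaster themselves are not reproduced here.

```python
# Minimal sketch, assuming the healpy bindings to HEALPix; "cmb_map.fits"
# is a hypothetical input map given in Galactic coordinates.
import healpy as hp
import numpy as np

m_gal = hp.read_map("cmb_map.fits")
nside = hp.get_nside(m_gal)

# Change of coordinates: Galactic -> Ecliptic, performed in pixel space.
rot = hp.Rotator(coord=["G", "E"])
m_ecl = rot.rotate_map_pixel(m_gal)

# Generating a low resolution map: smooth first to limit the aliasing of
# small-scale power, then degrade the resolution.
m_smooth = hp.smoothing(m_ecl, fwhm=np.radians(3.0))
m_low = hp.ud_grade(m_smooth, nside_out=16)

# Simple pseudo-Cl estimate of the angular power spectrum
# (full sky, no mask deconvolution or noise debiasing).
cl = hp.anafast(m_ecl, lmax=3 * nside - 1)
ell = np.arange(cl.size)
dl = ell * (ell + 1) * cl / (2 * np.pi)
```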
Abstract:
Temperature-sensitive (ts) mutant viruses have helped elucidate replication processes in many viral systems. Several panels of replication-defective ts mutants, in which viral RNA synthesis is abolished at the nonpermissive temperature (RNA⁻), have been isolated for Mouse Hepatitis Virus, MHV (Robb et al., 1979; Koolen et al., 1983; Martin et al., 1988; Schaad et al., 1990). However, no one had investigated the genetic or phenotypic relationships between these different mutant panels. In order to determine how the panel of MHV-JHM RNA⁻ ts mutants (Robb et al., 1979) was genetically related to other described MHV RNA⁻ ts mutants, the MHV-JHM mutants were tested for complementation with representatives from two different sets of MHV-A59 ts mutants (Koolen et al., 1983; Schaad et al., 1990). The three ts mutant panels together were found to comprise eight genetically distinct complementation groups. Of these eight complementation groups, three complementation classes are unique to their particular mutant panel; genetically equivalent mutants were not observed within the other two mutant panels. Two complementation groups were common to all three mutant panels. The three remaining complementation groups overlapped two of the three mutant sets. Mutants MHV-JHM tsA204 and MHV-A59 ts261 were shown to be within one of these overlapping complementation groups. The phenotype of the MHV-JHM mutants within this complementation class has been previously characterized (Leibowitz et al., 1982; Leibowitz et al., 1990). When these mutants were grown at the permissive temperature, then shifted up to the nonpermissive temperature at the start of RNA synthesis, genome-length RNA and leader RNA fragments accumulated, but no subgenomic mRNA was synthesized. MHV-A59 ts261 produced leader RNA fragments identical to those observed with MHV-JHM tsA204. Thus, these two MHV RNA⁻ ts mutants that were genetically equivalent by complementation testing were phenotypically similar as well. Recombination frequencies obtained from crosses of MHV-A59 ts261 with several of the gene 1 MHV-A59 mutants indicated that the causal mutation(s) of MHV-A59 ts261 was located near the overlapping junction of ORF1a and ORF1b, in the 3′ end of ORF1a or the 5′ end of ORF1b. Sequence analysis of this junction and of 1400 nucleotides into the 5′ end of ORF1b of MHV-A59 ts261 revealed one nucleotide change from wildtype MHV-A59. This substitution at nucleotide 13,598 (A to G) was a silent mutation in the ORF1a reading frame, but resulted in an amino acid change in the ORF1b gene product (I to V). This amino acid change would be expressed only in the readthrough translation product produced upon successful ribosome frameshifting. A revertant of MHV-A59 ts261 (R2) also retained this guanine residue, but had a second substitution at nucleotide 14,475 in ORF1b. This mutation results in the substitution of a valine for an isoleucine. The data presented here suggest that the mutation in MHV-A59 ts261 (nucleotide 13,598) is responsible for the MHV-JHM complementation group A phenotype. A second-site reversion at nucleotide 14,475 may correct this defect in the revertant. Sequencing of gene 1 immediately upstream of nucleotide 13,296 and downstream of nucleotide 15,010 must be conducted to test this hypothesis.
Abstract:
This dissertation contains four essays that all share a common purpose: developing new methodologies to exploit the potential of high-frequency data for the measurement, modeling and forecasting of financial asset volatility and correlations. The first two chapters provide useful tools for univariate applications while the last two chapters develop multivariate methodologies. In chapter 1, we introduce a new class of univariate volatility models named FloGARCH models. FloGARCH models provide a parsimonious joint model for low frequency returns and realized measures, and are sufficiently flexible to capture long memory as well as asymmetries related to leverage effects. We analyze the performance of the models in a realistic numerical study and on the basis of a data set composed of 65 equities. Using more than 10 years of high-frequency transactions, we document significant statistical gains related to the FloGARCH models in terms of in-sample fit, out-of-sample fit and forecasting accuracy compared to classical and Realized GARCH models. In chapter 2, using 12 years of high-frequency transactions for 55 U.S. stocks, we argue that combining low-frequency exogenous economic indicators with high-frequency financial data improves the ability of conditionally heteroskedastic models to forecast the volatility of returns, their full multi-step ahead conditional distribution and the multi-period Value-at-Risk. Using a refined version of the Realized LGARCH model allowing for a time-varying intercept and implemented with realized kernels, we document that nominal corporate profits and term spreads have strong long-run predictive ability and generate accurate risk measure forecasts over long horizons. The results are based on several loss functions and tests, including the Model Confidence Set. Chapter 3 is a joint work with David Veredas. We study the class of disentangled realized estimators for the integrated covariance matrix of Brownian semimartingales with finite activity jumps. These estimators separate correlations and volatilities. We analyze different combinations of quantile- and median-based realized volatilities, and four estimators of realized correlations with three synchronization schemes. Their finite sample properties are studied under four data generating processes, with and without microstructure noise, and under synchronous and asynchronous trading. The main finding is that the pre-averaged version of the disentangled estimators based on Gaussian ranks (for the correlations) and median deviations (for the volatilities) provides a precise, computationally efficient, and easy alternative for measuring integrated covariances on the basis of noisy and asynchronous prices. Along these lines, a minimum variance portfolio application shows the superiority of this disentangled realized estimator in terms of numerous performance metrics. Chapter 4 is co-authored with Niels S. Hansen, Asger Lunde and Kasper V. Olesen, all affiliated with CREATES at Aarhus University. We propose to use the Realized Beta GARCH model to exploit the potential of high-frequency data in commodity markets. The model produces high quality forecasts of pairwise correlations between commodities which can be used to construct a composite covariance matrix. We evaluate the quality of this matrix in a portfolio context and compare it to models used in the industry. We demonstrate significant economic gains in a realistic setting including short selling constraints and transaction costs.
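As a hedged illustration of two building blocks that appear in chapters 3 and 4 (not the FloGARCH or Realized Beta GARCH estimators themselves), the sketch below computes a realized covariance matrix from synchronized intraday returns and the minimum variance portfolio implied by it; the simulated returns are purely hypothetical stand-ins for high-frequency data.

```python
# Minimal sketch: realized covariance from synchronized intraday returns and
# the minimum variance portfolio built on top of it. Simulated data only.
import numpy as np

rng = np.random.default_rng(0)
n_intraday, n_assets = 390, 3                # e.g. 1-minute returns, 3 assets
r = rng.normal(scale=1e-3, size=(n_intraday, n_assets))

# Realized covariance: sum over the day of outer products of return vectors.
rcov = r.T @ r

# Minimum variance weights: w = Sigma^{-1} 1 / (1' Sigma^{-1} 1).
ones = np.ones(n_assets)
w = np.linalg.solve(rcov, ones)
w /= w.sum()

print("realized covariance:\n", rcov)
print("minimum variance weights:", w)
```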
Abstract:
The decision-making process in university libraries is of great importance; however, it faces complications such as the large number of data sources and the large volumes of data to be analyzed. University libraries are accustomed to producing and collecting a great amount of information about their data and services. Common data sources are the output of internal systems, online portals and catalogs, quality assessments, and surveys. Unfortunately, these data sources are only partially used for decision making, due to the wide variety of formats and standards as well as the lack of efficient methods and integration tools. This thesis project presents the analysis, design and implementation of a Data Warehouse, an integrated decision-making system for the Centro de Documentación Juan Bautista Vázquez. First, the requirements and the data analysis are presented on the basis of a methodology; this methodology incorporates key elements that influence a library decision, including process analysis, estimated quality, relevant information, and user interaction. Next, the architecture and design of the Data Warehouse and its implementation are proposed, supporting data integration, processing and storage. Finally, the stored data are analyzed through analytical processing tools and the application of Bibliomining techniques, helping the managers of the documentation center to make optimal decisions about their resources and services.
Abstract:
This thesis presents a study of the Grid data access patterns in distributed analysis in the CMS experiment at the LHC accelerator. The study ranges from a deep analysis of the historical patterns of access to the most relevant data types in CMS, to the exploitation of a supervised Machine Learning classification system to set up machinery able to predict future data access patterns - i.e. the so-called dataset “popularity” of the CMS datasets on the Grid - with a focus on specific data types. All the CMS workflows run on the Worldwide LHC Computing Grid (WLCG) computing centers (Tiers), and in particular the distributed analysis system sustains hundreds of users and applications submitted every day. These applications (or “jobs”) access different data types hosted on disk storage systems at a large set of WLCG Tiers. The detailed study of how this data is accessed, in terms of data types, hosting Tiers, and different time periods, provides precious insight on storage occupancy over time and on different access patterns, and ultimately allows suggested actions to be derived from this information (e.g. targeted disk clean-up and/or data replication). In this sense, the application of Machine Learning techniques makes it possible to learn from past data and to gain predictive power over future CMS data access patterns. Chapter 1 provides an introduction to High Energy Physics at the LHC. Chapter 2 describes the CMS Computing Model, with special focus on the data management sector, also discussing the concept of dataset popularity. Chapter 3 describes the study of CMS data access patterns at different levels of depth. Chapter 4 offers a brief introduction to basic machine learning concepts, introduces their application in CMS, and discusses the results obtained by using this approach in the context of this thesis.
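A minimal sketch of the kind of supervised classification described above, assuming scikit-learn and pandas; the feature names, the target column and the CSV file are hypothetical placeholders, not the thesis's actual popularity pipeline.

```python
# Sketch: predict whether a dataset will be "popular" (frequently accessed)
# in the next period from its past access history. Placeholder data layout.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

df = pd.read_csv("dataset_access_history.csv")      # hypothetical input
features = ["n_accesses_last_week", "n_users", "n_sites", "size_tb", "age_days"]
X, y = df[features], df["popular_next_week"]         # binary target (0/1)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=42)
clf = RandomForestClassifier(n_estimators=200, random_state=42)
clf.fit(X_tr, y_tr)
print(classification_report(y_te, clf.predict(X_te)))
```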
Abstract:
This project aims to provide a service platform for the management and accounting of remunerated time, through the recording of working hours, vacations and absences (with or without justification). It is intended to provide reports based on this information and the possibility of automatic data analysis, for example detecting excessive absences or overlapping vacations of workers. The emphasis of the project is on providing an architecture that facilitates the inclusion of these features. The project is implemented on the Google App Engine platform (GAE), in order to deliver a solution under the Software as a Service paradigm, with guaranteed availability and data replication. The platform was chosen after an analysis of the main existing cloud platforms: Google App Engine, Windows Azure and Amazon Web Services. The characteristics of each platform were analyzed, namely the programming models, the data models provided, the existing services and their respective costs. The choice of platform was made based on its characteristics at the start date of this project. The solution is structured in layers, with the following components: platform interface, business logic and data access logic. The provided interface is designed following REST architectural principles, supporting data in the JSON and XML formats. An authorization component, supported by Spring-Security, was added to this base architecture, with authentication delegated to the Google Accounts services. In order to allow decoupling between the various layers, the Dependency Injection pattern was used. The use of this pattern reduces the dependency on the technologies used in the various layers. A prototype was implemented to demonstrate the work carried out, allowing interaction with the implemented service features via AJAX requests. This prototype took advantage of several JavaScript libraries and patterns that simplified its development, such as model-view-viewmodel through data binding. To support the development of the project, an agile development approach based on Scrum was adopted in order to implement the system requirements, expressed as user stories. In order to ensure the quality of the service implementation, unit tests were carried out; the functionality was analyzed beforehand and documentation was subsequently produced using UML diagrams.
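A minimal sketch of the Dependency Injection idea described above, written in Python rather than the project's Java/GAE stack; the class and method names are illustrative, not the project's actual API.

```python
# Sketch: the business-logic layer receives its data-access dependency through
# the constructor, so it does not depend on a concrete datastore (the GAE
# Datastore in the project, an in-memory stub here). Illustrative names only.
from typing import Protocol


class TimeEntryRepository(Protocol):
    def save(self, worker_id: str, hours: float) -> None: ...
    def total_hours(self, worker_id: str) -> float: ...


class InMemoryTimeEntryRepository:
    def __init__(self) -> None:
        self._hours: dict[str, float] = {}

    def save(self, worker_id: str, hours: float) -> None:
        self._hours[worker_id] = self._hours.get(worker_id, 0.0) + hours

    def total_hours(self, worker_id: str) -> float:
        return self._hours.get(worker_id, 0.0)


class TimeTrackingService:
    def __init__(self, repository: TimeEntryRepository) -> None:
        self._repository = repository                  # injected dependency

    def record_work(self, worker_id: str, hours: float) -> None:
        if hours <= 0:
            raise ValueError("hours must be positive")
        self._repository.save(worker_id, hours)


# Usage: swap the repository implementation without touching the service.
service = TimeTrackingService(InMemoryTimeEntryRepository())
service.record_work("worker-1", 8.0)
```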
Abstract:
Work presented in the context of the Master's in Computer Engineering, as a partial requirement for obtaining the degree of Master in Computer Engineering.
Abstract:
The last three decades have seen quite dramatic changes in the way we model time-dependent data. Linear processes have taken center stage in modeling time series. As far as second-order properties are concerned, the theory and the methodology are very adequate. However, there is more and more evidence that linear models are not sufficiently flexible and rich enough for modeling purposes, and that failure to account for non-linearities can be very misleading and have undesired consequences.
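A small illustration of the point above: a linear AR(1) process next to a two-regime threshold AR(1), whose regime-dependent dynamics cannot be captured by a single linear model; the parameters are arbitrary and purely illustrative.

```python
# Sketch: simulate a linear AR(1) and a two-regime threshold AR(1).
# Second-order summaries describe the first model completely, but say
# little about the regime-switching behaviour of the second.
import numpy as np

rng = np.random.default_rng(1)
n = 1000
eps = rng.normal(size=n)

x_lin = np.zeros(n)   # linear AR(1): x_t = 0.6 x_{t-1} + eps_t
x_thr = np.zeros(n)   # threshold AR(1): coefficient depends on sign of x_{t-1}
for t in range(1, n):
    x_lin[t] = 0.6 * x_lin[t - 1] + eps[t]
    phi = 0.9 if x_thr[t - 1] < 0 else -0.3
    x_thr[t] = phi * x_thr[t - 1] + eps[t]


def acf1(x: np.ndarray) -> float:
    """Sample lag-1 autocorrelation."""
    return float(np.corrcoef(x[:-1], x[1:])[0, 1])


print("lag-1 autocorrelation, linear AR(1):   ", round(acf1(x_lin), 3))
print("lag-1 autocorrelation, threshold AR(1):", round(acf1(x_thr), 3))
```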
Abstract:
Developing and implementing data-oriented workflows for data migration processes are complex tasks involving several problems related to the integration of data coming from different schemas. Usually, they involve very specific requirements - every process is almost unique. Having a way to abstract their representation helps us to better understand and validate them with business users, which is a crucial step for requirements validation. In this demo we present an approach that provides a way to incrementally enrich conceptual models in order to support the automatic production of their corresponding physical implementation. We show how the B2K (Business to Kettle) system works, transforming BPMN 2.0 conceptual models into Kettle data-integration executable processes, and address the most relevant aspects related to model design and enrichment, model-to-system transformation, and system execution.
Abstract:
Today it is easy to find a lot of tools to define data migration schemas among different types of information systems. Data migration processes are usually implemented on a very diverse range of applications, ranging from conventional operational systems to data warehousing platforms. The implementation of a data migration process often involves serious planning, including the development of conceptual migration schemas at early stages. Such schemas help architects and engineers to plan and discuss the most adequate way to migrate data between two different systems. In this paper we present and discuss a way of enriching data migration conceptual schemas in BPMN using a domain-specific language, demonstrating how to convert such enriched schemas into a first corresponding physical representation (a skeleton) in a conventional ETL implementation tool like Kettle.
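As a very rough sketch of the idea in both papers above, the snippet below walks an enriched conceptual migration schema (a plain dictionary standing in for annotated BPMN elements) and emits an ordered skeleton of ETL steps; the schema, step types and output format are all hypothetical, and a real B2K run would produce a Kettle transformation rather than this simplified listing.

```python
# Sketch: map an annotated conceptual migration schema to a physical skeleton.
# Everything here (element names, annotation keys, output) is illustrative.
conceptual_schema = {
    "name": "customers_migration",
    "tasks": [
        {"element": "Extract customers", "annotation": {"type": "table_input", "source": "legacy.customers"}},
        {"element": "Normalize names", "annotation": {"type": "string_cleanup"}},
        {"element": "Load customers", "annotation": {"type": "table_output", "target": "dw.dim_customer"}},
    ],
}


def to_skeleton(schema: dict) -> list[str]:
    """Turn each annotated conceptual task into a physical-step placeholder."""
    steps = []
    for i, task in enumerate(schema["tasks"], start=1):
        ann = task["annotation"]
        target = ann.get("source") or ann.get("target") or "-"
        steps.append(f"{i:02d}. [{ann['type']}] {task['element']} ({target})")
    return steps


for line in to_skeleton(conceptual_schema):
    print(line)
```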
Abstract:
BACKGROUND AND PURPOSE: Beyond the Framingham Stroke Risk Score, prediction of future stroke may improve with a genetic risk score (GRS) based on single-nucleotide polymorphisms associated with stroke and its risk factors. METHODS: The study includes 4 population-based cohorts with 2047 first incident strokes among 22,720 initially stroke-free participants of European origin aged ≥55 years, who were followed for up to 20 years. GRSs were constructed with 324 single-nucleotide polymorphisms implicated in stroke and 9 risk factors. The association of the GRS with first incident stroke was tested using Cox regression; the predictive properties of the GRS were assessed with area under the curve statistics comparing the GRS with age and sex and Framingham Stroke Risk Score models, and with reclassification statistics. These analyses were performed per cohort and in a meta-analysis of pooled data. Replication was sought in a case-control study of ischemic stroke. RESULTS: In the meta-analysis, adding the GRS to the Framingham Stroke Risk Score, age, and sex model resulted in a significant improvement in discrimination (all stroke: Δjoint area under the curve=0.016, P=2.3×10(-6); ischemic stroke: Δjoint area under the curve=0.021, P=3.7×10(-7)), although the overall area under the curve remained low. In all the studies, there was a highly significant improvement in the net reclassification index (P<10(-4)). CONCLUSIONS: The single-nucleotide polymorphisms associated with stroke and its risk factors result in only a small improvement in prediction of future stroke compared with the classical epidemiological risk factors for stroke.
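A hedged sketch of the modelling step described above: a weighted genetic risk score (risk-allele dosages times published effect sizes) added to a Cox model with age and sex, using the lifelines package. Column names, the weight file and the input data are hypothetical, and the concordance index is used here as a simple discrimination summary in place of the paper's area under the curve and reclassification statistics.

```python
# Sketch: weighted GRS from SNP dosages, then Cox models with and without it.
# "cohort.csv" and "snp_weights.csv" are placeholder files, not study data.
import pandas as pd
from lifelines import CoxPHFitter

df = pd.read_csv("cohort.csv")                                  # one row per participant
snp_cols = [c for c in df.columns if c.startswith("rs")]        # dosages coded 0/1/2
weights = pd.read_csv("snp_weights.csv", index_col="snp")["beta"]  # per-SNP effect sizes

# GRS: weighted sum of risk-allele dosages.
df["grs"] = (df[snp_cols] * weights.reindex(snp_cols).values).sum(axis=1)

base = CoxPHFitter().fit(df[["age", "sex", "time", "stroke"]],
                         duration_col="time", event_col="stroke")
full = CoxPHFitter().fit(df[["age", "sex", "grs", "time", "stroke"]],
                         duration_col="time", event_col="stroke")

print("C-index, age + sex:      ", round(base.concordance_index_, 3))
print("C-index, age + sex + GRS:", round(full.concordance_index_, 3))
```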
Abstract:
The objective of this Master's thesis was to create the preconditions and supporting tools for carrying out Six Sigma improvement projects in the stamping process of Perlos Oyj Connectors. The theoretical part of the thesis covers the basic theory, organization and application of Six Sigma, as well as the creation of performance measurement systems and data collection methods. After the theory provided by the literature, the thesis examines the stamping process and the way Perlos applies Six Sigma to process improvement. On this basis, data collection, documentation and other supporting tools were developed to support the improvement projects. Finally, the created tools were trialled in connection with a partial Six Sigma project carried out in the stamping process. Obtaining reliable data about the processes is a basic prerequisite for carrying out Six Sigma projects. Training employees in the method is also important. Various documentation and calculation forms support the execution and follow-up of the projects. The data collection, training, documentation and other tools created in this thesis can be used to support future Six Sigma projects at Perlos. The measurement system and data collection should be developed further so that process performance can be analyzed effectively. The thesis closes with thoughts on the use of Six Sigma and on the further development of data collection systems.
Abstract:
This study addresses the implementation of a strategy change in the product development, engineering and product delivery processes of a business unit operating in a project business environment. The goal of the change is the management of product structures and delivery processes instead of the management of resources alone. The objective of the thesis is to describe what changes the achievement of the strategic goals requires in the business unit's product processes, product architecture and information systems. The theoretical part of the study covers technology strategy, product data management, product architectures, and the processes and success factors of product development. In the empirical part, the current state and the strategic goals of the business unit are compared against the presented theoretical framework, and an analysis of the preconditions for the intended change is carried out. The result of the thesis consists of a series of actions and a concept map formed from them, which describes the change process as a whole: product structures, product data management, processes and organizations.