47 results for Panel data analysis


Relevance:

90.00%

Publisher:

Abstract:

Event-related functional magnetic resonance imaging (efMRI) has emerged as a powerful technique for detecting the brain's responses to presented stimuli. A primary goal in efMRI data analysis is to estimate the Hemodynamic Response Function (HRF) and to locate activated regions of the human brain when specific tasks are performed. This paper develops new methodologies that are important improvements not only to parametric but also to nonparametric estimation and hypothesis testing of the HRF. First, an effective and computationally fast scheme for estimating the error covariance matrix for efMRI is proposed. Second, methodologies for estimation and hypothesis testing of the HRF are developed. Simulations support the effectiveness of our proposed methods. When applied to an efMRI dataset from an emotional control study, our method reveals more meaningful findings than the popular methods offered by AFNI and FSL.
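To make the kind of nonparametric HRF estimation described above concrete, here is a minimal sketch (not the paper's method, and with made-up signal parameters): the HRF is recovered by ordinary least squares on a finite-impulse-response (FIR) design matrix built from simulated stimulus onsets.

```python
import numpy as np

rng = np.random.default_rng(0)
n_scans, tr, hrf_len = 200, 2.0, 16              # assumed scan count, TR (s), HRF length (scans)

# simulated event indicator: 1 at scans where a stimulus was presented
onsets = np.zeros(n_scans)
onsets[rng.choice(n_scans - hrf_len, size=15, replace=False)] = 1.0

t = np.arange(hrf_len) * tr
true_hrf = t**5 * np.exp(-t) / 120.0             # toy gamma-shaped HRF used to generate the data

# FIR design matrix: column k is the onset vector delayed by k scans
X = np.column_stack([np.roll(onsets, k) for k in range(hrf_len)])
y = X @ true_hrf + 0.05 * rng.standard_normal(n_scans)   # simulated BOLD signal plus noise

hrf_hat, *_ = np.linalg.lstsq(X, y, rcond=None)  # OLS estimate of the HRF at each lag
print(np.round(hrf_hat, 3))
```

A real analysis would also whiten the data with an estimate of the error covariance matrix before the least-squares step, which is exactly the part addressed by the paper's fast covariance estimation scheme.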

Relevance:

90.00%

Publisher:

Abstract:

The ability to display and inspect powder diffraction data quickly and efficiently is a central part of the data analysis process. Whilst many computer programs are capable of displaying powder data, their focus is typically on advanced operations such as structure solution or Rietveld refinement. This article describes a lightweight software package, Jpowder, whose focus is fast and convenient visualization and comparison of powder data sets in a variety of formats from computers with network access. Jpowder is written in Java and uses its associated Web Start technology to allow ‘single-click deployment’ from a web page, http://www.jpowder.org. Jpowder is open source, free and available for use by anyone.
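Jpowder itself is a Java Web Start application; purely to illustrate the underlying task it streamlines, the short Python/matplotlib sketch below reads two-column (2θ, intensity) powder patterns and overlays them for comparison. The filenames are hypothetical.

```python
import numpy as np
import matplotlib.pyplot as plt

for fname in ["sample_A.xy", "sample_B.xy"]:        # hypothetical two-column data files
    two_theta, intensity = np.loadtxt(fname, unpack=True)
    plt.plot(two_theta, intensity, label=fname)

plt.xlabel(r"$2\theta$ (degrees)")
plt.ylabel("Intensity (counts)")
plt.legend()
plt.show()
```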

Relevance:

90.00%

Publisher:

Abstract:

This paper redefines technical efficiency by incorporating provision of environmental goods as one of the outputs of the farm. The proportion of permanent and rough grassland to total agricultural land area is used as a proxy for the provision of environmental goods. Stochastic frontier analysis was conducted using a Bayesian procedure. The methodology is applied to panel data on 215 dairy farms in England and Wales. Results show that farm efficiency rankings change when provision of environmental outputs by farms is incorporated in the efficiency analysis, which may have important political implications.
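The paper estimates the frontier with a Bayesian stochastic frontier procedure; as a much simpler stand-in, the sketch below uses corrected OLS (COLS) on a log-linear production frontier to show how per-farm efficiency scores and rankings are obtained. The data file and column names are hypothetical, and the grassland share is simply added as a regressor rather than modelled as a second output as in the paper.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("dairy_panel.csv")               # hypothetical panel of farm-year observations

y = np.log(df["milk_output"])
X = sm.add_constant(np.log(df[["cows", "labour", "feed", "grassland_share"]]))

ols = sm.OLS(y, X).fit()
resid = ols.resid

# COLS: shift the fitted line up by the largest residual so it bounds the data,
# then the efficiency of each observation is exp(residual - max residual) in (0, 1]
df["efficiency"] = np.exp(resid - resid.max()).values

print(df.groupby("farm_id")["efficiency"].mean().sort_values(ascending=False).head())
```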

Relevance:

90.00%

Publisher:

Abstract:

The organization of non-crystalline polymeric materials at a local level, namely on a spatial scale between a few and 100 Å, is still unclear in many respects. The determination of the local structure in terms of the configuration and conformation of the polymer chain and of the packing characteristics of the chain in the bulk material represents a challenging problem. Data from wide-angle diffraction experiments are very difficult to interpret due to the very large amount of information that they carry, that is, the large number of correlations present in the diffraction patterns. We describe new approaches that permit a detailed analysis of the complex neutron diffraction patterns characterizing polymer melts and glasses. The coupling of different computer modelling strategies with neutron scattering data over a wide Q range allows the extraction of detailed quantitative information on the structural arrangements of the materials of interest. Proceeding from modelling routes as diverse as force field calculations, single-chain modelling and reverse Monte Carlo, we show the successes and pitfalls of each approach in describing model systems, which illustrate the need to attack the data analysis problem simultaneously from several fronts.
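To make the reverse Monte Carlo route mentioned above concrete, here is a toy, self-contained sketch (in no way the authors' code): particle positions in a one-dimensional box are perturbed at random, and a move is kept whenever it reduces the chi-squared misfit between the model's pair-distance histogram and a target histogram standing in for measured scattering data. A full RMC implementation would also accept some worsening moves with a Boltzmann-like probability; this greedy version is kept deliberately short.

```python
import numpy as np

rng = np.random.default_rng(1)
L, n, bins = 50.0, 100, 25                        # box length, particle count, histogram bins

def pair_hist(x):
    """Histogram of all pairwise distances for one configuration."""
    d = np.abs(x[:, None] - x[None, :])[np.triu_indices(n, 1)]
    return np.histogram(d, bins=bins, range=(0, L))[0].astype(float)

target = pair_hist(np.sort(rng.uniform(0, L, n)))  # pretend this came from experiment
x = rng.uniform(0, L, n)                           # random starting configuration
chi2 = np.sum((pair_hist(x) - target) ** 2)

for _ in range(5000):
    i = rng.integers(n)
    trial = x.copy()
    trial[i] = (trial[i] + rng.normal(0, 1.0)) % L  # displace one particle, stay in the box
    new_chi2 = np.sum((pair_hist(trial) - target) ** 2)
    if new_chi2 <= chi2:                            # greedy acceptance for brevity
        x, chi2 = trial, new_chi2

print(f"final chi-squared misfit: {chi2:.1f}")
```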

Relevance:

90.00%

Publisher:

Abstract:

Background: Microarray-based comparative genomic hybridisation (CGH) experiments have been used to study numerous biological problems, including understanding genome plasticity in pathogenic bacteria. Typically such experiments produce large data sets that are difficult for biologists to handle. Although there are some programmes available for interpretation of bacterial transcriptomics data and of CGH microarray data for looking at genetic stability in oncogenes, there are none designed specifically for understanding the mosaic nature of bacterial genomes. Consequently, a bottleneck still persists in accurate processing and mathematical analysis of these data. To address this shortfall we have produced a simple and robust CGH microarray data analysis process, which may be automated in the future, to understand bacterial genomic diversity. Results: The process involves five steps: cleaning, normalisation, estimating gene presence and absence or divergence, validation, and analysis of data from test strains against three reference strains simultaneously. Each stage of the process is described, and we have compared a number of methods available for characterising bacterial genomic diversity and for calculating the cut-off between gene presence and absence or divergence, showing that a simple dynamic approach using a kernel density estimator performed better than both the established methods and a more sophisticated mixture modelling technique. We have also shown that current methods commonly used for CGH microarray analysis in tumour and cancer cell lines are not appropriate for analysing our data. Conclusion: After carrying out the analysis and validation for three sequenced Escherichia coli strains, CGH microarray data from 19 E. coli O157 pathogenic test strains were used to demonstrate the benefits of applying this simple and robust process to CGH microarray studies using bacterial genomes.
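As an illustration of the kernel-density cut-off idea, the sketch below fits a KDE to simulated log2 signal ratios for one test/reference comparison and takes the density minimum between the two modes as the gene presence/absence threshold. It is a generic sketch of the approach described, not the authors' pipeline; real input would be the normalised array data.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
# simulated log2 ratios: a "present" mode near 0 and an "absent/divergent" mode near -2
log_ratios = np.concatenate([rng.normal(0.0, 0.3, 3500), rng.normal(-2.0, 0.4, 500)])

kde = gaussian_kde(log_ratios)
grid = np.linspace(log_ratios.min(), log_ratios.max(), 1000)
density = kde(grid)

# search for the density minimum in the valley between the two modes
mask = (grid > -2.0) & (grid < 0.0)
cutoff = grid[mask][np.argmin(density[mask])]

present = log_ratios > cutoff
print(f"cut-off at log2 ratio {cutoff:.2f}; {present.sum()} genes called present")
```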

Relevance:

90.00%

Publisher:

Abstract:

An unlisted property fund is a private investment vehicle which aims to provide direct property total returns and may also employ financial leverage, which will accentuate performance. Such funds have become a far more prevalent institutional property investment conduit since the early 2000s. Investors have been attracted to them primarily by the ease of gaining property exposure, both domestically and internationally, and by their diversification benefits, given the capital-intensive nature of constructing a well diversified commercial property investment portfolio. However, despite their greater prominence, there has been little academic research on the performance and risks of unlisted property fund investments. This can be attributed to a paucity of available data and limited time series where data do exist. In this study we have made use of a unique dataset of institutional UK unlisted property funds over the period 2003Q4 to 2011Q4, using a panel modelling framework to determine the key factors which impact on fund performance. The sample provided a rich set of unlisted property fund factors including market exposures, direct property characteristics and the level of financial leverage employed. The findings from the panel regression analysis show that a small number of variables are able to account for the performance of unlisted property funds. These variables should be considered by investors when assessing the risk and return of these vehicles. The impact of financial leverage on the performance of these vehicles through the recent global financial crisis and subsequent UK commercial property market downturn was also studied. The findings indicate a significant asymmetric effect of employing debt finance within unlisted property funds.
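The fund-level data underlying the study are proprietary, but a fixed-effects panel regression of the kind described can be sketched as follows. Column names are hypothetical, fund dummies supply the fixed effects, and a leverage-downturn interaction is included to probe the asymmetric leverage effect mentioned above.

```python
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("unlisted_fund_panel.csv")       # hypothetical: one row per fund-quarter

model = smf.ols(
    "total_return ~ market_return + leverage + leverage:downturn "
    "+ lot_size + sector_weight_office + C(fund_id)",
    data=df,
).fit(cov_type="cluster", cov_kwds={"groups": df["fund_id"]})  # cluster SEs by fund

print(model.summary().tables[1])
```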

Relevance:

90.00%

Publisher:

Abstract:

In recent years, the area of data mining has been experiencing considerable demand for technologies that extract knowledge from large and complex data sources. There has been substantial commercial interest, as well as active research, aimed at developing new and improved approaches for extracting information, relationships, and patterns from large datasets. Artificial neural networks (NNs) are popular biologically inspired intelligent methodologies whose classification, prediction, and pattern recognition capabilities have been utilized successfully in many areas, including science, engineering, medicine, business, banking, telecommunication, and many other fields. This paper highlights, from a data mining perspective, the implementation of NNs using supervised and unsupervised learning for pattern recognition, classification, prediction, and cluster analysis, and focuses the discussion on their usage in bioinformatics and financial data analysis tasks.
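A small, generic illustration of the supervised/unsupervised pairing surveyed in the paper: a multilayer perceptron for classification, and plain k-means standing in for an NN-style clustering method (such as a self-organising map) for cluster analysis, both run on a bundled toy dataset rather than real bioinformatics or financial data.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPClassifier
from sklearn.cluster import KMeans

X, y = load_breast_cancer(return_X_y=True)
X = StandardScaler().fit_transform(X)               # NNs train better on scaled inputs
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0)
clf.fit(X_train, y_train)                            # supervised learning
print("NN classification accuracy:", round(clf.score(X_test, y_test), 3))

clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)   # unsupervised
print("cluster sizes:", [int((clusters == k).sum()) for k in (0, 1)])
```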

Relevance:

90.00%

Publisher:

Abstract:

Advances in hardware and software technology enable us to collect, store and distribute very large quantities of data. Automatically discovering and extracting hidden knowledge in the form of patterns from these large data volumes is known as data mining. Data mining technology is not only a part of business intelligence, but is also used in many other application areas such as research, marketing and financial analytics. For example, medical scientists can use patterns extracted from historic patient data to determine whether a new patient is likely to respond positively to a particular treatment; marketing analysts can use extracted patterns from customer data for future advertisement campaigns; and finance experts have an interest in patterns that forecast the development of certain stock market shares for investment recommendations. However, extracting knowledge in the form of patterns from massive data volumes imposes a number of computational challenges in terms of processing time, memory, bandwidth and power consumption. These challenges have led to the development of parallel and distributed data analysis approaches and the utilisation of Grid and Cloud computing. This chapter gives an overview of parallel and distributed computing approaches and how they can be used to scale up data mining to large datasets.
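The chapter's data-parallel idea can be sketched in a few lines: partition the data across worker processes, mine simple patterns (here, item-pair co-occurrence counts) in each partition independently, and merge the partial results. This is a minimal single-machine illustration, not a Grid or Cloud deployment.

```python
from collections import Counter
from itertools import combinations
from multiprocessing import Pool

def mine_chunk(transactions):
    """Count co-occurring item pairs in one partition of the data."""
    counts = Counter()
    for items in transactions:
        counts.update(combinations(sorted(items), 2))
    return counts

if __name__ == "__main__":
    # toy transaction data standing in for a large dataset
    data = [["bread", "milk"], ["bread", "butter"], ["milk", "butter", "bread"]] * 10000
    n_workers = 4
    chunks = [data[i::n_workers] for i in range(n_workers)]   # partition the data

    with Pool(n_workers) as pool:
        partials = pool.map(mine_chunk, chunks)               # mine each partition in parallel

    total = sum(partials, Counter())                          # merge the partial counts
    print(total.most_common(3))
```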

Relevance:

90.00%

Publisher:

Abstract:

The purpose of this lecture is to review recent developments in data analysis, initialization and data assimilation. The development of 3-dimensional multivariate schemes has been very timely because of their suitability for handling the many different types of observations available during FGGE. Great progress has taken place in the initialization of global models with the aid of the non-linear normal mode technique. However, in spite of this progress, several fundamental problems remain unsatisfactorily solved. Of particular importance are the initialization of the divergent wind fields in the Tropics and the search for proper ways to initialize weather systems driven by non-adiabatic processes. The unsatisfactory ways in which such processes are currently initialized are leading to excessively long spin-up times.

Relevance:

90.00%

Publisher:

Abstract:

This chapter introduces the latest practices and technologies in the interactive interpretation of environmental data. With environmental data becoming ever larger, more diverse and more complex, there is a need for a new generation of tools that provides new capabilities over and above those of the standard workhorses of science. These new tools aid the scientist in discovering interesting new features (and also problems) in large datasets by allowing the data to be explored interactively using simple, intuitive graphical tools. In this way, new discoveries are made that are commonly missed by automated batch data processing. This chapter discusses the characteristics of environmental science data, common current practice in data analysis and the supporting tools and infrastructure. New approaches are introduced and illustrated from the points of view of both the end user and the underlying technology. We conclude by speculating as to future developments in the field and what must be achieved to fulfil this vision.

Relevance:

90.00%

Publisher:

Abstract:

Background: Expression microarrays are increasingly used to obtain large-scale transcriptomic information on a wide range of biological samples. Nevertheless, there is still much debate on the best ways to process data, to design experiments and to analyse the output. Furthermore, many of the more sophisticated mathematical approaches to data analysis in the literature remain inaccessible to much of the biological research community. In this study we examine ways of extracting and analysing a large data set obtained using the Agilent long oligonucleotide transcriptomics platform, applied to a set of human macrophage and dendritic cell samples. Results: We describe and validate a series of data extraction, transformation and normalisation steps which are implemented via a new R function. Analysis of replicate normalised reference data demonstrates that intra-array variability is small (only around 2% of the mean log signal), while inter-array variability from replicate array measurements has a standard deviation (SD) of around 0.5 log(2) units (6% of the mean). The common practice of working with ratios of Cy5/Cy3 signal offers little further improvement in terms of reducing error. Comparison to expression data obtained using Arabidopsis samples demonstrates that the large number of genes in each sample showing a low level of transcription reflects the real complexity of the cellular transcriptome. Multidimensional scaling is used to show that the processed data identify an underlying structure which reflects some of the key biological variables that define the data set. This structure is robust, allowing reliable comparison of samples collected over a number of years and by a variety of operators. Conclusions: This study outlines a robust and easily implemented pipeline for extracting, transforming, normalising and visualising transcriptomic array data from the Agilent expression platform. The analysis is used to obtain quantitative estimates of the SD arising from experimental (non-biological) intra- and inter-array variability, and a lower threshold for determining whether an individual gene is expressed. The study provides a reliable basis for further, more extensive studies of the systems biology of eukaryotic cells.
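The pipeline in the paper is implemented as an R function; the following is only a generic Python sketch of the same kind of steps on a simulated genes-by-arrays matrix: log2 transformation, quantile normalisation across arrays, and multidimensional scaling of between-sample distances to look for structure among the samples.

```python
import numpy as np
from sklearn.manifold import MDS

rng = np.random.default_rng(0)
signal = rng.lognormal(mean=6, sigma=1, size=(2000, 12))   # simulated genes x arrays matrix

log_signal = np.log2(signal)

# quantile normalisation: give every array the same ordered distribution of values
ranks = np.argsort(np.argsort(log_signal, axis=0), axis=0)
mean_quantiles = np.sort(log_signal, axis=0).mean(axis=1)
normalised = mean_quantiles[ranks]

# multidimensional scaling on the matrix of between-sample (column) distances
dists = np.linalg.norm(normalised[:, :, None] - normalised[:, None, :], axis=0)
coords = MDS(n_components=2, dissimilarity="precomputed", random_state=0).fit_transform(dists)
print(coords.round(2))
```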

Relevance:

90.00%

Publisher:

Abstract:

JASMIN is a super-data-cluster designed to provide a high-performance, high-volume data analysis environment for the UK environmental science community. Thus far JASMIN has been used primarily by the atmospheric science and earth observation communities, both to support their direct scientific workflow and to support the curation of data products in the STFC Centre for Environmental Data Archival (CEDA). The initial JASMIN configuration and first experiences are reported here, and useful improvements in scientific workflow are presented. It is clear from the explosive growth in stored data and use that there was a pent-up demand for a suitable big-data analysis environment. This demand is not yet satisfied, in part because JASMIN does not yet have enough compute capacity, the storage is fully allocated, and not all software needs are met. Plans to address these constraints are introduced.

Relevance:

90.00%

Publisher:

Abstract:

Owing to continuous advances in the computational power of handheld devices like smartphones and tablet computers, it has become possible to perform Big Data operations, including modern data mining processes, on board these small devices. A decade of research has proved the feasibility of what has been termed Mobile Data Mining, with a focus on a single mobile device running data mining processes. However, it was not until 2010 that the authors of this book initiated the Pocket Data Mining (PDM) project, exploiting the seamless communication among handheld devices to perform data analysis tasks that were infeasible until recently. PDM is the process of collaboratively extracting knowledge from distributed data streams in a mobile computing environment. This book provides the reader with an in-depth treatment of this emerging area of research. Details of the techniques used and thorough experimental studies are given. More importantly, and exclusive to this book, the authors provide a detailed practical guide to the deployment of PDM in the mobile environment. An important extension of the basic implementation of PDM dealing with concept drift is also reported. In the era of Big Data, potential applications of paramount importance offered by PDM in a variety of domains, including security, business and telemedicine, are discussed.

Relevance:

90.00%

Publisher:

Abstract:

Smart healthcare is a complex domain for systems integration, owing to the human and technical factors and heterogeneous data sources involved. As part of a smart city, it is an area in which clinical functions require intelligent multi-system collaboration for effective communication among departments, and radiology is one of the areas that relies most heavily on intelligent information integration and communication. It therefore faces many challenges regarding integration and interoperability, such as information collision, heterogeneous data sources, policy obstacles and procedure mismanagement. The purpose of this study is to conduct an analysis of data, semantic and pragmatic interoperability of systems integration in a radiology department, and to develop a pragmatic interoperability framework for guiding the integration. We selected an ongoing project at a local hospital for our case study. The project aims to achieve data sharing and interoperability among Radiology Information Systems (RIS), Electronic Patient Record (EPR) and Picture Archiving and Communication Systems (PACS). Qualitative data collection and analysis methods are used. The data sources consisted of documentation, including publications and internal working papers, one year of non-participant observations, and 37 interviews with radiologists, clinicians, directors of IT services, referring clinicians, radiographers, receptionists and a secretary. We identified four primary phases of the data analysis process for the case study: requirements and barriers identification, integration approach, interoperability measurements, and knowledge foundations. Each phase is discussed and supported by qualitative data. Through the analysis we also develop a pragmatic interoperability framework that summarises the empirical findings and proposes recommendations for guiding integration in the radiology context.

Relevance:

90.00%

Publisher:

Abstract:

This thesis is an empirical-based study of the European Union’s Emissions Trading Scheme (EU ETS) and its implications in terms of corporate environmental and financial performance. The novelty of this study includes the extended scope of the data coverage, as most previous studies have examined only the power sector. The use of verified emissions data of ETS-regulated firms as the environmental compliance measure and as the potential differentiating criteria that concern the valuation of EU ETS-exposed firms in the stock market is also an original aspect of this study. The study begins in Chapter 2 by introducing the background information on the emission trading system (ETS), which focuses on (i) the adoption of ETS as an environmental management instrument and (ii) the adoption of ETS by the European Union as one of its central climate policies. Chapter 3 surveys four databases that provide carbon emissions data in order to determine the most suitable source of the data to be used in the later empirical chapters. The first empirical chapter, which is also Chapter 4 of this thesis, investigates the determinants of the emissions compliance performance of the EU ETS-exposed firms through constructing the best possible performance ratio from verified emissions data and self-configuring models for a panel regression analysis. Chapter 5 examines the impacts on the EU ETS-exposed firms in terms of their equity valuation with customised portfolios and multi-factor market models. The research design takes into account the emissions allowance (EUA) price as an additional factor, as it has the most direct association with the EU ETS to control for the exposure. The final empirical Chapter 6 takes the investigation one step further, by specifically testing the degree of ETS exposure facing different sectors with sector-based portfolios and an extended multi-factor market model. The findings from the emissions performance ratio analysis show that the business model of firms significantly influences emissions compliance, as the capital intensity has a positive association with the increasing emissions-to-emissions cap ratio. Furthermore, different sectors show different degrees of sensitivity towards the determining factors. The production factor influences the performance ratio of the Utilities sector, but not the Energy or Materials sectors. The results show that the capital intensity has a more profound influence on the utilities sector than on the materials sector. With regard to the financial performance impact, ETS-exposed firms as aggregate portfolios experienced a substantial underperformance during the 2001–2004 period, but not in the operating period of 2005–2011. The results of the sector-based portfolios show again the differentiating effect of the EU ETS on sectors, as one sector is priced indifferently against its benchmark, three sectors see a constant underperformance, and three sectors have altered outcomes.