5 results for functional data analysis
in CORA - Cork Open Research Archive - University College Cork - Ireland
                                
Abstract:
A substantial amount of information on the Internet is present in the form of text. The value of this semi-structured and unstructured data has been widely acknowledged, with consequent scientific and commercial exploitation. The ever-increasing data production, however, pushes data analytic platforms to their limit. This thesis proposes techniques for more efficient textual big data analysis suitable for the Hadoop analytic platform. This research explores the direct processing of compressed textual data. The focus is on developing novel compression methods with a number of desirable properties to support text-based big data analysis in distributed environments. The novel contributions of this work include the following. Firstly, a Content-aware Partial Compression (CaPC) scheme is developed. CaPC distinguishes between informational and functional content and compresses only the informational content. Thus, the compressed data is made transparent to existing software libraries, which often rely on functional content to work. Secondly, a context-free bit-oriented compression scheme (Approximated Huffman Compression) based on the Huffman algorithm is developed. This uses a hybrid data structure that allows pattern searching in compressed data in linear time. Thirdly, several modern compression schemes have been extended so that the compressed data can be safely split with respect to logical data records in distributed file systems. Furthermore, an innovative two-layer compression architecture is used, in which each compression layer is appropriate for the corresponding stage of data processing. Peripheral libraries are developed that seamlessly link the proposed compression schemes to existing analytic platforms and computational frameworks, and also make the use of the compressed data transparent to developers. The compression schemes have been evaluated for a number of standard MapReduce analysis tasks using a collection of real-world datasets. In comparison with existing solutions, they have shown substantial improvement in performance and significant reduction in system resource requirements.
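
The idea behind CaPC, compressing informational content while leaving functional content untouched, can be illustrated with a minimal Python sketch. It is not the scheme developed in the thesis: the dictionary substitution, the `~` escape marker, the two-digit hex codes and the function names `build_codebook`, `compress` and `decompress` are all assumptions made for illustration. Alphabetic words stand in for informational content here, while delimiters such as commas and newlines are treated as functional content and pass through unchanged.

```python
import re
from collections import Counter

ESCAPE = "~"  # assumed marker; the input must not already contain it
WORD = re.compile(r"[A-Za-z]+")
CODE = re.compile(re.escape(ESCAPE) + r"[0-9a-f]{2}")


def build_codebook(text, max_codes=256, min_len=4):
    """Map the most frequent words to fixed-width codes such as '~0a'."""
    if ESCAPE in text:
        raise ValueError("input already contains the escape marker")
    if not 0 < max_codes <= 256:
        raise ValueError("two hex digits allow at most 256 codes")
    # Only words longer than a 3-character code are worth replacing.
    counts = Counter(w for w in WORD.findall(text) if len(w) >= min_len)
    ranked = counts.most_common(max_codes)
    return {word: f"{ESCAPE}{i:02x}" for i, (word, _) in enumerate(ranked)}


def compress(text, codebook):
    """Replace known words with codes; delimiters and digits are untouched."""
    return WORD.sub(lambda m: codebook.get(m.group(0), m.group(0)), text)


def decompress(packed, codebook):
    """Invert compress() by expanding every fixed-width code."""
    inverse = {code: word for word, code in codebook.items()}
    return CODE.sub(lambda m: inverse[m.group(0)], packed)


log = "2020,connection refused\n2021,connection refused\n"
book = build_codebook(log)
packed = compress(log, book)
assert decompress(packed, book) == log      # lossless round trip
assert packed.count("\n") == log.count("\n")  # record boundaries preserved
```

Because the newline and comma delimiters survive compression, a tool that splits records on those characters can still operate on the packed text, which is the property the abstract attributes to CaPC.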
                                
Abstract:
Alzheimer’s Disease and other dementias are among the most challenging illnesses confronting countries with ageing populations. Treatment options for dementia are limited, and the costs are significant. There is a growing need to develop new treatments for dementia, especially for the elderly. There is also growing evidence that centrally acting angiotensin converting enzyme (ACE) inhibitors, which cross the blood-brain barrier, are associated with a reduced rate of cognitive and functional decline in dementia, especially in Alzheimer’s disease (AD). The aim of this research is to investigate the effects of centrally acting ACE inhibitors (CACE-Is) on the rate of cognitive and functional decline in dementia, using a three-phase Knowledge Discovery in Databases (KDD) process. KDD, as a scientific way to process and analyse clinical data, is used to find useful insights from a variety of clinical databases. The data used are from three clinical databases: the Geriatric Assessment Tool (GAT), the Doxycycline and Rifampin for Alzheimer’s Disease (DARAD), and the Qmci validation databases, which were derived from several different geriatric clinics in Canada. This research involves only patients diagnosed with AD, vascular dementia or mixed dementia. Patients were included if baseline and end-point (at least six months apart) Standardised Mini-Mental State Examination (SMMSE), Quick Mild Cognitive Impairment (Qmci) or Activities of Daily Living (ADL) scores were available. The rates of change are compared between patients taking CACE-Is and those not currently treated with them (NoCACE-I). The results suggest that there is a statistically significant difference in the rate of decline in cognitive and functional scores between CACE-I and NoCACE-I patients. This research also validates that the Qmci, a new short assessment test, has the potential to replace the currently popular screening tests for cognition in the clinic and in clinical trials.
                                
Abstract:
Systematic, high-quality observations of the atmosphere, oceans and terrestrial environments are required to improve understanding of climate characteristics and the consequences of climate change. The overall aim of this report is to carry out a comparative assessment of the approaches taken by some leading actors in the field to addressing the state of European observation systems and related data analysis. This research reports on approaches to climate observations and analyses in Ireland, Switzerland, Germany, The Netherlands and Austria and explores options for a more coordinated approach to national responses to climate observations in Europe. The key aspects addressed are: an assessment of approaches to developing the Global Climate Observing System (GCOS) and providing analysis of GCOS data; an evaluation of how these countries are reporting development of GCOS; highlighting best practice in advancing GCOS implementation, including analysis of Essential Climate Variables (ECVs); a comparative summary of the differences and synergies in terms of the reporting of climate observations; and an overview of relevant European initiatives and recommendations on how identified gaps might be addressed in the short to medium term.
                                
Abstract:
Energy efficiency and user comfort have recently become priorities in the Facility Management (FM) sector. This has resulted in the use of innovative building components, such as thermal solar panels and heat pumps, as they have the potential to provide better performance, energy savings and increased user comfort. However, as the complexity of components increases, the requirement for maintenance management also increases. The standard routine for building maintenance is inspection, which results in repair or replacement when a fault is found. This routine leads to unnecessary inspections, which have a cost with respect to component downtime and work hours. This research proposes an alternative routine: performing building maintenance at the point in time when the component is degrading and requires maintenance, thus reducing the frequency of unnecessary inspections. This thesis demonstrates that statistical techniques can be used as part of a maintenance management methodology to invoke maintenance before failure occurs. The proposed FM process is presented through a scenario utilising current Building Information Modelling (BIM) technology and innovative contractual and organisational models. This FM scenario supports a Degradation-based Maintenance (DbM) scheduling methodology, implemented using two statistical techniques, Particle Filters (PFs) and Gaussian Processes (GPs). DbM consists of extracting and tracking a degradation metric for a component. Limits for the degradation metric are identified based on one of a number of proposed processes. These processes determine the limits based on the maturity of the historical information available. DbM is implemented for three case study components: a heat exchanger, a heat pump, and a set of bearings. The degradation points identified for each case study by a PF, a GP and a hybrid (PF and GP combined) DbM implementation are assessed against known degradation points. The GP implementations are successful for all components. For the PF implementations, the results presented in this thesis show that the extracted metrics and limits identify degradation occurrences accurately for components which are in continuous operation. For components which have seasonal operational periods, the PF may wrongly identify degradation. The GP performs more robustly than the PF, but the PF, on average, results in fewer false positives. The hybrid implementations, which combine GP and PF results, are successful for two of the three case studies and are not affected by seasonal data. Overall, DbM is effectively applied for the three case study components. The accuracy of the implementations is dependent on the relationships modelled by the PF and GP, and on the type and quantity of data available. This novel maintenance process can improve equipment performance and reduce energy wastage from the operation of BSCs.
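
The core DbM loop described above, tracking a degradation metric against limits derived from historical data, can be sketched in a few lines of Python. This is a deliberately simplified stand-in and not the thesis's method: it uses a static mean plus or minus k standard deviations band in place of a Particle Filter or Gaussian Process, and the function names, the `k` multiplier and the `persistence` parameter are all assumptions introduced for illustration.

```python
from statistics import mean, stdev


def limits_from_history(history, k=3.0):
    """Derive (lower, upper) limits from metric values of a healthy period.

    Assumption: healthy behaviour is summarised by mean +/- k * std.
    """
    mu = mean(history)
    sigma = stdev(history)
    return mu - k * sigma, mu + k * sigma


def first_degradation_index(metric, lower, upper, persistence=3):
    """Return the index where the metric first leaves the limits and stays
    outside for `persistence` consecutive samples, or None if it never does."""
    run = 0
    for i, value in enumerate(metric):
        if value < lower or value > upper:
            run += 1
            if run >= persistence:
                return i - persistence + 1  # start of the violating run
        else:
            run = 0  # isolated excursions are ignored
    return None


# Hypothetical readings of a degradation metric, for illustration only.
healthy = [1.0, 1.1, 0.9, 1.0, 1.05, 0.95]
lower, upper = limits_from_history(healthy)
observed = [1.0, 1.3, 1.0, 1.25, 1.3, 1.4, 1.5]
onset = first_degradation_index(observed, lower, upper)
```

Requiring several consecutive violations is one simple way to reduce false positives from isolated spikes. It does not address the seasonal-operation problem noted for the PF, which would need limits that vary with the operating period.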
                                
Abstract:
An overview is given of a user interaction monitoring and analysis framework called BaranC. Monitoring and analysing human-digital interaction is an essential part of developing a user model as the basis for investigating user experience. The primary human-digital interaction, such as on a laptop or smartphone, is best understood and modelled in the wider context of the user and their environment. The BaranC framework provides monitoring and analysis capabilities that not only record all user interaction with a digital device (e.g. a smartphone), but also collect all available context data (such as from sensors in the digital device itself, a fitness band or a smart appliance). The data collected by BaranC is recorded as a User Digital Imprint (UDI) which is, in effect, the user model and provides the basis for data analysis. BaranC provides functionality that is useful for user experience studies, user interface design evaluation, and providing user assistance services. An important concern for personal data is privacy, and the framework gives the user full control over the monitoring, storing and sharing of their data.
 
                    