26 resultados para data warehouse tuning aggregato business intelligence performance
em CentAUR: Central Archive University of Reading - UK
Resumo:
This paper discusses the problems inherent within traditional supply chain management's forecast and inventory management processes arising when tackling demand driven supply chain. A demand driven supply chain management architecture developed by Orchestr8 Ltd., U.K. is described to demonstrate its advantages over traditional supply chain management. Within this architecture, a metrics reporting system is designed by adopting business intelligence technology that supports users for decision making and planning supply activities over supply chain health.
Resumo:
Performance modelling is a useful tool in the lifeycle of high performance scientific software, such as weather and climate models, especially as a means of ensuring efficient use of available computing resources. In particular, sufficiently accurate performance prediction could reduce the effort and experimental computer time required when porting and optimising a climate model to a new machine. In this paper, traditional techniques are used to predict the computation time of a simple shallow water model which is illustrative of the computation (and communication) involved in climate models. These models are compared with real execution data gathered on AMD Opteron-based systems, including several phases of the U.K. academic community HPC resource, HECToR. Some success is had in relating source code to achieved performance for the K10 series of Opterons, but the method is found to be inadequate for the next-generation Interlagos processor. The experience leads to the investigation of a data-driven application benchmarking approach to performance modelling. Results for an early version of the approach are presented using the shallow model as an example.
Resumo:
Advances in hardware and software technology enable us to collect, store and distribute large quantities of data on a very large scale. Automatically discovering and extracting hidden knowledge in the form of patterns from these large data volumes is known as data mining. Data mining technology is not only a part of business intelligence, but is also used in many other application areas such as research, marketing and financial analytics. For example medical scientists can use patterns extracted from historic patient data in order to determine if a new patient is likely to respond positively to a particular treatment or not; marketing analysts can use extracted patterns from customer data for future advertisement campaigns; finance experts have an interest in patterns that forecast the development of certain stock market shares for investment recommendations. However, extracting knowledge in the form of patterns from massive data volumes imposes a number of computational challenges in terms of processing time, memory, bandwidth and power consumption. These challenges have led to the development of parallel and distributed data analysis approaches and the utilisation of Grid and Cloud computing. This chapter gives an overview of parallel and distributed computing approaches and how they can be used to scale up data mining to large datasets.
Resumo:
Multiple versions of information and associated problems are well documented in both academic research and industry best practices. Many solutions have proposed a single version of the truth, with Business intelligence being adopted by many organizations. Business Intelligence (BI), however, is largely based on the collection of data, processing and presentation of information to meet different stakeholders’ requirement. This paper reviews the promise of Enterprise Intelligence, which promises to support decision-making based on a defined strategic understanding of the organizations goals and a unified version of the truth.
Resumo:
With the increasing awareness of protein folding disorders, the explosion of genomic information, and the need for efficient ways to predict protein structure, protein folding and unfolding has become a central issue in molecular sciences research. Molecular dynamics computer simulations are increasingly employed to understand the folding and unfolding of proteins. Running protein unfolding simulations is computationally expensive and finding ways to enhance performance is a grid issue on its own. However, more and more groups run such simulations and generate a myriad of data, which raises new challenges in managing and analyzing these data. Because the vast range of proteins researchers want to study and simulate, the computational effort needed to generate data, the large data volumes involved, and the different types of analyses scientists need to perform, it is desirable to provide a public repository allowing researchers to pool and share protein unfolding data. This paper describes efforts to provide a grid-enabled data warehouse for protein unfolding data. We outline the challenge and present first results in the design and implementation of the data warehouse.
Resumo:
This paper reviews the literature concerning the practice of using Online Analytical Processing (OLAP) systems to recall information stored by Online Transactional Processing (OLTP) systems. Such a review provides a basis for discussion on the need for the information that are recalled through OLAP systems to maintain the contexts of transactions with the data captured by the respective OLTP system. The paper observes an industry trend involving the use of OLTP systems to process information into data, which are then stored in databases without the business rules that were used to process information and data stored in OLTP databases without associated business rules. This includes the necessitation of a practice, whereby, sets of business rules are used to extract, cleanse, transform and load data from disparate OLTP systems into OLAP databases to support the requirements for complex reporting and analytics. These sets of business rules are usually not the same as business rules used to capture data in particular OLTP systems. The paper argues that, differences between the business rules used to interpret these same data sets, risk gaps in semantics between information captured by OLTP systems and information recalled through OLAP systems. Literature concerning the modeling of business transaction information as facts with context as part of the modelling of information systems were reviewed to identify design trends that are contributing to the design quality of OLTP and OLAP systems. The paper then argues that; the quality of OLTP and OLAP systems design has a critical dependency on the capture of facts with associated context, encoding facts with contexts into data with business rules, storage and sourcing of data with business rules, decoding data with business rules into the facts with the context and recall of facts with associated contexts. The paper proposes UBIRQ, a design model to aid the co-design of data with business rules storage for OLTP and OLAP purposes. The proposed design model provides the opportunity for the implementation and use of multi-purpose databases, and business rules stores for OLTP and OLAP systems. Such implementations would enable the use of OLTP systems to record and store data with executions of business rules, which will allow for the use of OLTP and OLAP systems to query data with business rules used to capture the data. Thereby ensuring information recalled via OLAP systems preserves the contexts of transactions as per the data captured by the respective OLTP system.
Resumo:
The disuse hypothesis of cognitive aging attributes decrements in fluid intelligence in older adults to reduced cognitively stimulating activity. This study experimentally tested the hypothesis that a period of increased mentally stimulating activities thus would enhance older adults' fluid intelligence performance. Participants (N = 44, mean age 67.82) were administered pre- and post-test measures, including the fluid intelligence measure, Cattell's Culture Fair (CCF) test. Experimental participants engaged in diverse, novel, mentally stimulating activities for 10-12 weeks and were compared to a control condition. Results supported the hypothesis; the experimental group showed greater pre- to post-CCF gain than did controls (effect size d = 0.56), with a similar gain on a spatial-perceptual task (WAIS-R Blocks). Even brief periods of increased cognitive stimulation can improve older adults' problem solving and flexible thinking.
Resumo:
In this paper, we address issues in segmentation Of remotely sensed LIDAR (LIght Detection And Ranging) data. The LIDAR data, which were captured by airborne laser scanner, contain 2.5 dimensional (2.5D) terrain surface height information, e.g. houses, vegetation, flat field, river, basin, etc. Our aim in this paper is to segment ground (flat field)from non-ground (houses and high vegetation) in hilly urban areas. By projecting the 2.5D data onto a surface, we obtain a texture map as a grey-level image. Based on the image, Gabor wavelet filters are applied to generate Gabor wavelet features. These features are then grouped into various windows. Among these windows, a combination of their first and second order of statistics is used as a measure to determine the surface properties. The test results have shown that ground areas can successfully be segmented from LIDAR data. Most buildings and high vegetation can be detected. In addition, Gabor wavelet transform can partially remove hill or slope effects in the original data by tuning Gabor parameters.
Resumo:
Decision theory is the study of models of judgement involved in, and leading to, deliberate and (usually) rational choice. In real estate investment there are normative models for the allocation of assets. These asset allocation models suggest an optimum allocation between the respective asset classes based on the investors’ judgements of performance and risk. Real estate is selected, as other assets, on the basis of some criteria, e.g. commonly its marginal contribution to the production of a mean variance efficient multi asset portfolio, subject to the investor’s objectives and capital rationing constraints. However, decisions are made relative to current expectations and current business constraints. Whilst a decision maker may believe in the required optimum exposure levels as dictated by an asset allocation model, the final decision may/will be influenced by factors outside the parameters of the mathematical model. This paper discusses investors' perceptions and attitudes toward real estate and highlights the important difference between theoretical exposure levels and pragmatic business considerations. It develops a model to identify “soft” parameters in decision making which will influence the optimal allocation for that asset class. This “soft” information may relate to behavioural issues such as the tendency to mirror competitors; a desire to meet weight of money objectives; a desire to retain the status quo and many other non-financial considerations. The paper aims to establish the place of property in multi asset portfolios in the UK and examine the asset allocation process in practice, with a view to understanding the decision making process and to look at investors’ perceptions based on an historic analysis of market expectation; a comparison with historic data and an analysis of actual performance.
Resumo:
OBJECTIVES: The prediction of protein structure and the precise understanding of protein folding and unfolding processes remains one of the greatest challenges in structural biology and bioinformatics. Computer simulations based on molecular dynamics (MD) are at the forefront of the effort to gain a deeper understanding of these complex processes. Currently, these MD simulations are usually on the order of tens of nanoseconds, generate a large amount of conformational data and are computationally expensive. More and more groups run such simulations and generate a myriad of data, which raises new challenges in managing and analyzing these data. Because the vast range of proteins researchers want to study and simulate, the computational effort needed to generate data, the large data volumes involved, and the different types of analyses scientists need to perform, it is desirable to provide a public repository allowing researchers to pool and share protein unfolding data. METHODS: To adequately organize, manage, and analyze the data generated by unfolding simulation studies, we designed a data warehouse system that is embedded in a grid environment to facilitate the seamless sharing of available computer resources and thus enable many groups to share complex molecular dynamics simulations on a more regular basis. RESULTS: To gain insight into the conformational fluctuations and stability of the monomeric forms of the amyloidogenic protein transthyretin (TTR), molecular dynamics unfolding simulations of the monomer of human TTR have been conducted. Trajectory data and meta-data of the wild-type (WT) protein and the highly amyloidogenic variant L55P-TTR represent the test case for the data warehouse. CONCLUSIONS: Web and grid services, especially pre-defined data mining services that can run on or 'near' the data repository of the data warehouse, are likely to play a pivotal role in the analysis of molecular dynamics unfolding data.
Resumo:
The P-found protein folding and unfolding simulation repository is designed to allow scientists to perform data mining and other analyses across large, distributed simulation data sets. There are two storage components in P-found: a primary repository of simulation data that is used to populate the second component, and a data warehouse that contains important molecular properties. These properties may be used for data mining studies. Here we demonstrate how grid technologies can support multiple, distributed P-found installations. In particular, we look at two aspects: firstly, how grid data management technologies can be used to access the distributed data warehouses; and secondly, how the grid can be used to transfer analysis programs to the primary repositories — this is an important and challenging aspect of P-found, due to the large data volumes involved and the desire of scientists to maintain control of their own data. The grid technologies we are developing with the P-found system will allow new large data sets of protein folding simulations to be accessed and analysed in novel ways, with significant potential for enabling scientific discovery.
Resumo:
The concepts of on-line transactional processing (OLTP) and on-line analytical processing (OLAP) are often confused with the technologies or models that are used to design transactional and analytics based information systems. This in some way has contributed to existence of gaps between the semantics in information captured during transactional processing and information stored for analytical use. In this paper, we propose the use of a unified semantics design model, as a solution to help bridge the semantic gaps between data captured by OLTP systems and the information provided by OLAP systems. The central focus of this design approach is on enabling business intelligence using not just data, but data with context.
Resumo:
We pursue the first large-scale investigation of a strongly growing mutual fund type: Islamic funds. Based on an unexplored, survivorship bias-adjusted data set, we analyse the financial performance and investment style of 265 Islamic equity funds from 20 countries. As Islamic funds often have diverse investment regions, we develop a (conditional) three-level Carhart model to simultaneously control for exposure to different national, regional and global equity markets and investment styles. Consistent with recent evidence for conventional funds, we find Islamic funds to display superior learning in more developed Islamic financial markets. While Islamic funds from these markets are competitive to international equity benchmarks, funds from especially Western nations with less Islamic assets tend to significantly underperform. Islamic funds’ investment style is somewhat tilted towards growth stocks. Funds from predominantly Muslim economies also show a clear small cap preference. These results are consistent over time and robust to time varying market exposures and capital market restrictions.
Resumo:
This study investigates the differential impact that various dimensions of corporate social performance have on the pricing of corporate debt as well as the assessment of the credit quality of specific bond issues. The empirical analysis, based on an extensive longitudinal data set, suggests that overall, good performance is rewarded and corporate social transgressions are penalized through lower and higher corporate bond yield spreads, respectively. Similar conclusions can be drawn when focusing on either the bond rating assigned to a specific debt issue or the probability of it being considered to be an asset of speculative grade.