952 resultados para financial data processing


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Commercially available software packages for IBM PC-compatibles are evaluated to use for data acquisition and processing work. Moss Landing Marine Laboratories (MLML) acquired computers since 1978 to use on shipboard data acquisition (Le. CTD, radiometric, etc.) and data processing. First Hewlett-Packard desktops were used then a transition to the DEC VAXstations, with software developed mostly by the author and others at MLML (Broenkow and Reaves, 1993; Feinholz and Broenkow, 1993; Broenkow et al, 1993). IBM PC were at first very slow and limited in available software, so they were not used in the early days. Improved technology such as higher speed microprocessors and a wide range of commercially available software made use of PC more reasonable today. MLML is making a transition towards using the PC for data acquisition and processing. Advantages are portability and available outside support.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The goal of this study is to analyze the dynamical properties of financial data series from nineteen worldwide stock market indices (SMI) during the period 1995–2009. SMI reveal a complex behavior that can be explored since it is available a considerable volume of data. In this paper is applied the window Fourier transform and methods of fractional calculus. The results reveal classification patterns typical of fractional order systems.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The high-throughput experimental data from the new gene microarray technology has spurred numerous efforts to find effective ways of processing microarray data for revealing real biological relationships among genes. This work proposes an innovative data pre-processing approach to identify noise data in the data sets and eliminate or reduce the impact of the noise data on gene clustering, With the proposed algorithm, the pre-processed data sets make the clustering results stable across clustering algorithms with different similarity metrics, the important information of genes and features is kept, and the clustering quality is improved. The primary evaluation on real microarray data sets has shown the effectiveness of the proposed algorithm.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

With the explosion of big data, processing large numbers of continuous data streams, i.e., big data stream processing (BDSP), has become a crucial requirement for many scientific and industrial applications in recent years. By offering a pool of computation, communication and storage resources, public clouds, like Amazon's EC2, are undoubtedly the most efficient platforms to meet the ever-growing needs of BDSP. Public cloud service providers usually operate a number of geo-distributed datacenters across the globe. Different datacenter pairs are with different inter-datacenter network costs charged by Internet Service Providers (ISPs). While, inter-datacenter traffic in BDSP constitutes a large portion of a cloud provider's traffic demand over the Internet and incurs substantial communication cost, which may even become the dominant operational expenditure factor. As the datacenter resources are provided in a virtualized way, the virtual machines (VMs) for stream processing tasks can be freely deployed onto any datacenters, provided that the Service Level Agreement (SLA, e.g., quality-of-information) is obeyed. This raises the opportunity, but also a challenge, to explore the inter-datacenter network cost diversities to optimize both VM placement and load balancing towards network cost minimization with guaranteed SLA. In this paper, we first propose a general modeling framework that describes all representative inter-task relationship semantics in BDSP. Based on our novel framework, we then formulate the communication cost minimization problem for BDSP into a mixed-integer linear programming (MILP) problem and prove it to be NP-hard. We then propose a computation-efficient solution based on MILP. The high efficiency of our proposal is validated by extensive simulation based studies.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Until mid 2006, SCIAMACHY data processors for the operational retrieval of nitrogen dioxide (NO2) column data were based on the historical version 2 of the GOME Data Processor (GDP). On top of known problems inherent to GDP 2, ground-based validations of SCIAMACHY NO2 data revealed issues specific to SCIAMACHY, like a large cloud-dependent offset occurring at Northern latitudes. In 2006, the GDOAS prototype algorithm of the improved GDP version 4 was transferred to the off-line SCIAMACHY Ground Processor (SGP) version 3.0. In parallel, the calibration of SCIAMACHY radiometric data was upgraded. Before operational switch-on of SGP 3.0 and public release of upgraded SCIAMACHY NO2 data, we have investigated the accuracy of the algorithm transfer: (a) by checking the consistency of SGP 3.0 with prototype algorithms; and (b) by comparing SGP 3.0 NO2 data with ground-based observations reported by the WMO/GAW NDACC network of UV-visible DOAS/SAOZ spectrometers. This delta-validation study concludes that SGP 3.0 is a significant improvement with respect to the previous processor IPF 5.04. For three particular SCIAMACHY states, the study reveals unexplained features in the slant columns and air mass factors, although the quantitative impact on SGP 3.0 vertical columns is not significant.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Personal data is a key asset for many companies, since this is the essence in providing personalized services. Not all companies, and specifically new entrants to the markets, have the opportunity to access the data they need to run their business. In this paper, we describe a comprehensive personal data framework that allows service providers to share and exchange personal data and knowledge about users, while facilitating users to decide who can access which data and why. We analyze the challenges related to personal data collection, integration, retrieval, and identity and privacy management, and present the framework architecture that addresses them. We also include the validation of the framework in a banking scenario, where social and financial data is collected and properly combined to generate new socio-economic knowledge about users that is then used by a personal lending service.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This thesis describes the development of a complete data visualisation system for large tabular databases, such as those commonly found in a business environment. A state-of-the-art 'cyberspace cell' data visualisation technique was investigated and a powerful visualisation system using it was implemented. Although allowing databases to be explored and conclusions drawn, it had several drawbacks, the majority of which were due to the three-dimensional nature of the visualisation. A novel two-dimensional generic visualisation system, known as MADEN, was then developed and implemented, based upon a 2-D matrix of 'density plots'. MADEN allows an entire high-dimensional database to be visualised in one window, while permitting close analysis in 'enlargement' windows. Selections of records can be made and examined, and dependencies between fields can be investigated in detail. MADEN was used as a tool for investigating and assessing many data processing algorithms, firstly data-reducing (clustering) methods, then dimensionality-reducing techniques. These included a new 'directed' form of principal components analysis, several novel applications of artificial neural networks, and discriminant analysis techniques which illustrated how groups within a database can be separated. To illustrate the power of the system, MADEN was used to explore customer databases from two financial institutions, resulting in a number of discoveries which would be of interest to a marketing manager. Finally, the database of results from the 1992 UK Research Assessment Exercise was analysed. Using MADEN allowed both universities and disciplines to be graphically compared, and supplied some startling revelations, including empirical evidence of the 'Oxbridge factor'.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This dissertation develops a new mathematical approach that overcomes the effect of a data processing phenomenon known as “histogram binning” inherent to flow cytometry data. A real-time procedure is introduced to prove the effectiveness and fast implementation of such an approach on real-world data. The histogram binning effect is a dilemma posed by two seemingly antagonistic developments: (1) flow cytometry data in its histogram form is extended in its dynamic range to improve its analysis and interpretation, and (2) the inevitable dynamic range extension introduces an unwelcome side effect, the binning effect, which skews the statistics of the data, undermining as a consequence the accuracy of the analysis and the eventual interpretation of the data. ^ Researchers in the field contended with such a dilemma for many years, resorting either to hardware approaches that are rather costly with inherent calibration and noise effects; or have developed software techniques based on filtering the binning effect but without successfully preserving the statistical content of the original data. ^ The mathematical approach introduced in this dissertation is so appealing that a patent application has been filed. The contribution of this dissertation is an incremental scientific innovation based on a mathematical framework that will allow researchers in the field of flow cytometry to improve the interpretation of data knowing that its statistical meaning has been faithfully preserved for its optimized analysis. Furthermore, with the same mathematical foundation, proof of the origin of such an inherent artifact is provided. ^ These results are unique in that new mathematical derivations are established to define and solve the critical problem of the binning effect faced at the experimental assessment level, providing a data platform that preserves its statistical content. ^ In addition, a novel method for accumulating the log-transformed data was developed. This new method uses the properties of the transformation of statistical distributions to accumulate the output histogram in a non-integer and multi-channel fashion. Although the mathematics of this new mapping technique seem intricate, the concise nature of the derivations allow for an implementation procedure that lends itself to a real-time implementation using lookup tables, a task that is also introduced in this dissertation. ^

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This dissertation develops a new mathematical approach that overcomes the effect of a data processing phenomenon known as "histogram binning" inherent to flow cytometry data. A real-time procedure is introduced to prove the effectiveness and fast implementation of such an approach on real-world data. The histogram binning effect is a dilemma posed by two seemingly antagonistic developments: (1) flow cytometry data in its histogram form is extended in its dynamic range to improve its analysis and interpretation, and (2) the inevitable dynamic range extension introduces an unwelcome side effect, the binning effect, which skews the statistics of the data, undermining as a consequence the accuracy of the analysis and the eventual interpretation of the data. Researchers in the field contended with such a dilemma for many years, resorting either to hardware approaches that are rather costly with inherent calibration and noise effects; or have developed software techniques based on filtering the binning effect but without successfully preserving the statistical content of the original data. The mathematical approach introduced in this dissertation is so appealing that a patent application has been filed. The contribution of this dissertation is an incremental scientific innovation based on a mathematical framework that will allow researchers in the field of flow cytometry to improve the interpretation of data knowing that its statistical meaning has been faithfully preserved for its optimized analysis. Furthermore, with the same mathematical foundation, proof of the origin of such an inherent artifact is provided. These results are unique in that new mathematical derivations are established to define and solve the critical problem of the binning effect faced at the experimental assessment level, providing a data platform that preserves its statistical content. In addition, a novel method for accumulating the log-transformed data was developed. This new method uses the properties of the transformation of statistical distributions to accumulate the output histogram in a non-integer and multi-channel fashion. Although the mathematics of this new mapping technique seem intricate, the concise nature of the derivations allow for an implementation procedure that lends itself to a real-time implementation using lookup tables, a task that is also introduced in this dissertation.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The advancement of GPS technology has made it possible to use GPS devices as orientation and navigation tools, but also as tools to track spatiotemporal information. GPS tracking data can be broadly applied in location-based services, such as spatial distribution of the economy, transportation routing and planning, traffic management and environmental control. Therefore, knowledge of how to process the data from a standard GPS device is crucial for further use. Previous studies have considered various issues of the data processing at the time. This paper, however, aims to outline a general procedure for processing GPS tracking data. The procedure is illustrated step-by-step by the processing of real-world GPS data of car movements in Borlänge in the centre of Sweden.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The health system is one sector dealing with a deluge of complex data. Many healthcare organisations struggle to utilise these volumes of health data effectively and efficiently. Also, there are many healthcare organisations, which still have stand-alone systems, not integrated for management of information and decision-making. This shows, there is a need for an effective system to capture, collate and distribute this health data. Therefore, implementing the data warehouse concept in healthcare is potentially one of the solutions to integrate health data. Data warehousing has been used to support business intelligence and decision-making in many other sectors such as the engineering, defence and retail sectors. The research problem that is going to be addressed is, "how can data warehousing assist the decision-making process in healthcare". To address this problem the researcher has narrowed an investigation focusing on a cardiac surgery unit. This research used the cardiac surgery unit at the Prince Charles Hospital (TPCH) as the case study. The cardiac surgery unit at TPCH uses a stand-alone database of patient clinical data, which supports clinical audit, service management and research functions. However, much of the time, the interaction between the cardiac surgery unit information system with other units is minimal. There is a limited and basic two-way interaction with other clinical and administrative databases at TPCH which support decision-making processes. The aims of this research are to investigate what decision-making issues are faced by the healthcare professionals with the current information systems and how decision-making might be improved within this healthcare setting by implementing an aligned data warehouse model or models. As a part of the research the researcher will propose and develop a suitable data warehouse prototype based on the cardiac surgery unit needs and integrating the Intensive Care Unit database, Clinical Costing unit database (Transition II) and Quality and Safety unit database [electronic discharge summary (e-DS)]. The goal is to improve the current decision-making processes. The main objectives of this research are to improve access to integrated clinical and financial data, providing potentially better information for decision-making for both improved from the questionnaire and by referring to the literature, the results indicate a centralised data warehouse model for the cardiac surgery unit at this stage. A centralised data warehouse model addresses current needs and can also be upgraded to an enterprise wide warehouse model or federated data warehouse model as discussed in the many consulted publications. The data warehouse prototype was able to be developed using SAS enterprise data integration studio 4.2 and the data was analysed using SAS enterprise edition 4.3. In the final stage, the data warehouse prototype was evaluated by collecting feedback from the end users. This was achieved by using output created from the data warehouse prototype as examples of the data desired and possible in a data warehouse environment. According to the feedback collected from the end users, implementation of a data warehouse was seen to be a useful tool to inform management options, provide a more complete representation of factors related to a decision scenario and potentially reduce information product development time. However, there are many constraints exist in this research. For example the technical issues such as data incompatibilities, integration of the cardiac surgery database and e-DS database servers and also, Queensland Health information restrictions (Queensland Health information related policies, patient data confidentiality and ethics requirements), limited availability of support from IT technical staff and time restrictions. These factors have influenced the process for the warehouse model development, necessitating an incremental approach. This highlights the presence of many practical barriers to data warehousing and integration at the clinical service level. Limitations included the use of a small convenience sample of survey respondents, and a single site case report study design. As mentioned previously, the proposed data warehouse is a prototype and was developed using only four database repositories. Despite this constraint, the research demonstrates that by implementing a data warehouse at the service level, decision-making is supported and data quality issues related to access and availability can be reduced, providing many benefits. Output reports produced from the data warehouse prototype demonstrated usefulness for the improvement of decision-making in the management of clinical services, and quality and safety monitoring for better clinical care. However, in the future, the centralised model selected can be upgraded to an enterprise wide architecture by integrating with additional hospital units’ databases.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Serving as a powerful tool for extracting localized variations in non-stationary signals, applications of wavelet transforms (WTs) in traffic engineering have been introduced; however, lacking in some important theoretical fundamentals. In particular, there is little guidance provided on selecting an appropriate WT across potential transport applications. This research described in this paper contributes uniquely to the literature by first describing a numerical experiment to demonstrate the shortcomings of commonly-used data processing techniques in traffic engineering (i.e., averaging, moving averaging, second-order difference, oblique cumulative curve, and short-time Fourier transform). It then mathematically describes WT’s ability to detect singularities in traffic data. Next, selecting a suitable WT for a particular research topic in traffic engineering is discussed in detail by objectively and quantitatively comparing candidate wavelets’ performances using a numerical experiment. Finally, based on several case studies using both loop detector data and vehicle trajectories, it is shown that selecting a suitable wavelet largely depends on the specific research topic, and that the Mexican hat wavelet generally gives a satisfactory performance in detecting singularities in traffic and vehicular data.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Not-for-profit (NFP) financial ratio research has focused primarily on organisational efficiency measurements for external stakeholders. Ratios that also capture information about stability, capacity (liquidity), gearing and sustainability, enable an assessment of financial resilience. They are thus valuable tools that can provide a framework of internal accountability between boards and management. The establishment of an Australian NFP regulator highlights the importance of NFP sustainability, and affirms the timeliness of this paper. We propose a suite of key financial ratios for use by NFP boards and management, and demonstrate its practical usefulness by applying the ratios to financial data from the 2009 reports of ACFID (Australian Council for International Development)-affiliated international aid organisations.