937 results for Data structures (Computer science)
Abstract:
The majority of sugar mill locomotives are equipped with GPS devices, from which locomotive position data are collected and stored. Locomotive run information (e.g. start times, run destinations and activities) is electronically stored in software called TOTools. The latest software development allows TOTools to interpret historical GPS information by combining this data with run information recorded in TOTools and geographic information from a GIS application called MapInfo. As a result, TOTools is capable of summarising run activity details such as run start and finish times and shunt activities with great accuracy. This paper presents 15 reports developed to summarise run activities and speed information. The reports will be of use pre-season to assist in developing the next year's schedule and for determining priorities for investment in the track infrastructure. They will also be of benefit during the season to closely monitor locomotive run performance against the existing schedule.
Abstract:
Available industrial energy meters offer high accuracy and reliability, but are typically expensive and low-bandwidth, making them poorly suited to multi-sensor data acquisition schemes and power quality analysis. An alternative measurement system is proposed in this paper that is highly modular, extensible and compact. To minimise cost, the device makes use of planar coreless PCB transformers to provide galvanic isolation for both power and data. Samples from multiple acquisition devices may be concentrated by a central processor before integration with existing host control systems. This paper focuses on the practical design and implementation of planar coreless PCB transformers to facilitate the module's isolated power, clock and data signal transfer. The calculations necessary to design coreless PCB transformers, and the circuits designed for the transformer's practical application in the measurement module, are presented. The designed transformer and each application circuit have been experimentally verified, with test data and conclusions applicable to coreless PCB transformers in general.
Abstract:
Most real-life data analysis problems are difficult to solve using exact methods, due to the size of the datasets and the nature of the underlying mechanisms of the system under investigation. As datasets grow even larger, finding the balance between the quality of the approximation and the computing time of the heuristic becomes non-trivial. One solution is to consider parallel methods, and to use the increased computational power to perform a deeper exploration of the solution space in a similar time. It is, however, difficult to estimate a priori whether parallelisation will provide the expected improvement. In this paper we consider a well-known method, genetic algorithms, and evaluate the behaviour of the classic and parallel implementations on two distinct problem types.
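The classic-versus-parallel comparison described in this abstract can be illustrated with a minimal sketch (not the paper's implementation): a toy OneMax problem in which only the fitness-evaluation step differs between the serial and parallel variants. All names and parameters here are illustrative, and a thread pool stands in for the process-level parallelism a real deployment would use for expensive fitness functions.

```python
# Illustrative sketch only: a toy genetic algorithm where fitness evaluation
# can run serially or in parallel. Problem, operators and constants are
# assumptions, not taken from the paper.
import random
from concurrent.futures import ThreadPoolExecutor

GENOME_LEN, POP_SIZE, GENERATIONS, SEED = 64, 40, 30, 1

def fitness(genome):
    # OneMax: count of 1-bits; stands in for an expensive evaluation.
    return sum(genome)

def step(pop, scores, rng):
    # Tournament selection + one-point crossover + bit-flip mutation.
    def pick():
        a, b = rng.sample(range(len(pop)), 2)
        return pop[a] if scores[a] >= scores[b] else pop[b]
    nxt = []
    while len(nxt) < len(pop):
        p1, p2 = pick(), pick()
        cut = rng.randrange(1, GENOME_LEN)
        nxt.append([bit ^ (rng.random() < 0.02) for bit in p1[:cut] + p2[cut:]])
    return nxt

def run(parallel):
    rng = random.Random(SEED)
    pop = [[rng.randint(0, 1) for _ in range(GENOME_LEN)] for _ in range(POP_SIZE)]
    for _ in range(GENERATIONS):
        if parallel:
            # Only the evaluation step is parallelised; selection stays serial.
            with ThreadPoolExecutor(max_workers=4) as ex:
                scores = list(ex.map(fitness, pop))
        else:
            scores = [fitness(g) for g in pop]
        pop = step(pop, scores, rng)
    return max(fitness(g) for g in pop)

print(run(parallel=False), run(parallel=True))  # identical with the same seed
```

Because the two variants consume the random stream identically, they return the same best fitness; any runtime difference comes purely from the evaluation step, which is the trade-off the abstract discusses.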
Abstract:
This paper addresses the development of trust in the use of Open Data through the incorporation of appropriate authentication and integrity parameters, for use by end-user Open Data application developers, in an architecture for trustworthy Open Data Services. The advantage of this architecture is that it is far more scalable than another certificate-based hierarchy, which would suffer the usual problems of certificate revocation management. With the use of a Public File, if a key is compromised it is a simple matter for the single responsible entity to replace the key pair with a new one and re-perform the data file signing process. Under the proposed architecture, the Open Data environment does not interfere with the internal security schemes that might be employed by the entity. However, the architecture incorporates, when needed, parameters from the entity, e.g. the person who authorized publishing as Open Data, at the time that datasets are created or added.
Abstract:
Relative abundance data is common in the life sciences, but appreciation that it needs special analysis and interpretation is scarce. Correlation is popular as a statistical measure of pairwise association but should not be used on data that carry only relative information. Using timecourse yeast gene expression data, we show how correlation of relative abundances can lead to conclusions opposite to those drawn from absolute abundances, and that its value changes when different components are included in the analysis. Once all absolute information has been removed, only a subset of those associations will reliably endure in the remaining relative data, specifically, associations where pairs of values behave proportionally across observations. We propose a new statistic φ to describe the strength of proportionality between two variables and demonstrate how it can be straightforwardly used instead of correlation as the basis of familiar analyses and visualization methods.
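The abstract above does not give the formula for φ; a commonly cited formulation from the proportionality literature is φ(x, y) = var(log(y/x)) / var(log(x)), with values near zero indicating that the pair behaves proportionally across observations. A minimal sketch, assuming that formulation (function and variable names are illustrative):

```python
# Hedged sketch of a proportionality statistic; assumes the formulation
# phi(x, y) = var(log(y/x)) / var(log(x)). Small phi => near-proportional.
import math
from statistics import variance

def phi(x, y):
    # Work on log-transformed values: proportionality y ≈ k·x becomes a
    # constant difference log(y) - log(x) ≈ log(k), whose variance is ~0.
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    return variance([b - a for a, b in zip(lx, ly)]) / variance(lx)

x = [1.0, 2.0, 4.0, 8.0]
proportional = [3.0 * v for v in x]   # y = 3x exactly
unrelated = [5.0, 1.0, 7.0, 2.0]
print(phi(x, proportional))           # ≈ 0: log-ratio is (near-)constant
print(phi(x, unrelated))              # substantially larger
```

Unlike correlation, this quantity depends only on ratios within each variable, so it is unaffected by the unknown scaling that makes relative abundance data treacherous.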
Abstract:
Large volumes of heterogeneous health data silos pose a big challenge when exploring for information to allow for evidence-based decision making and ensuring quality outcomes. In this paper, we present a proof of concept for adopting data warehousing technology to aggregate and analyse disparate health data in order to understand the impact of various lifestyle factors on obesity. We present a practical model for data warehousing, with a detailed explanation, which can be adopted similarly for studying various other health issues.
Abstract:
Decision-making is such an integral aspect of healthcare routine that the ability to make the right decisions at crucial moments can lead to patient health improvements. Evidence-based practice (EBP), the paradigm used to make those informed decisions, relies on the use of the current best evidence from systematic research such as randomized controlled trials (RCTs). Limitations of the outcomes from RCTs, such as the “quantity” and “quality” of the evidence generated, have lowered healthcare professionals’ confidence in using EBP. An alternative paradigm of Practice-Based Evidence has evolved, with the key being evidence drawn from practice settings. Through the use of health information technology, electronic health records (EHRs) capture relevant clinical practice “evidence”. A data-driven approach is proposed to capitalize on the benefits of EHRs. The issues of data privacy, security and integrity are diminished by an information accountability concept. A data warehouse architecture completes the data-driven approach by integrating health data from multi-source systems, unique within the healthcare environment.
Abstract:
The workshop is an activity of the IMIA Working Group ‘Security in Health Information Systems’ (SiHIS). It is focused on a growing global problem: how to protect personal health data in today’s global eHealth and digital health environment. It will review available trust-building mechanisms, security measures and privacy policies. Technology alone does not solve this complex problem, and current protection policies and legislation are considered woefully inadequate. Among other trust-building tools, certification and accreditation mechanisms are discussed in detail, and the workshop will determine their acceptance and quality. The need for further research and international collective action is also discussed. This workshop provides an opportunity to address a critical and growing problem and to make pragmatic proposals for sustainable and effective solutions for global eHealth and digital health.
Abstract:
Clinical Data Warehousing: A Business Analytic approach for managing health data
Abstract:
This paper analyzes the limitations on the amount of in-domain (NIST SREs) data required for training a probabilistic linear discriminant analysis (PLDA) speaker verification system based on out-domain (Switchboard) total variability subspaces. By limiting the number of speakers, the number of sessions per speaker and the length of active speech per session available in the target domain for PLDA training, we investigated the relative effect of these three parameters on PLDA speaker verification performance in the NIST 2008 and NIST 2010 speaker recognition evaluation datasets. Experimental results indicate that while these parameters depend highly on each other, to beat out-domain PLDA training, more than 10 seconds of active speech should be available for at least 4 sessions/speaker for a minimum of 800 speakers. If further data is available, considerable improvement can be made over solely out-domain PLDA training.
Abstract:
This paper proposes the addition of a weighted median Fisher discriminator (WMFD) projection prior to length-normalised Gaussian probabilistic linear discriminant analysis (GPLDA) modelling in order to compensate for additional session variation. In limited microphone data conditions, a linear-weighted approach is introduced to increase the influence of the microphone speech dataset. The linear-weighted WMFD-projected GPLDA system shows improvements in EER and DCF values over the pooled LDA- and WMFD-projected GPLDA systems in the interview-interview condition, as WMFD projection extracts more speaker-discriminant information with a limited number of sessions per speaker, and the linear-weighted GPLDA approach estimates reliable model parameters with limited microphone data.
Abstract:
Fusing data from multiple sensing modalities, e.g. laser and radar, is a promising approach to achieve resilient perception in challenging environmental conditions. However, this may lead to “catastrophic fusion” in the presence of inconsistent data, i.e. when the sensors do not detect the same target due to distinct attenuation properties. It is often difficult to discriminate consistent from inconsistent data across sensing modalities using local spatial information alone. In this paper we present a novel consistency test based on the log marginal likelihood of a Gaussian process model that evaluates data from range sensors in a relative manner. A new data point is deemed to be consistent if the model statistically improves as a result of its fusion. This approach avoids the need for absolute spatial distance threshold parameters as required by previous work. We report results from object reconstruction with both synthetic and experimental data that demonstrate an improvement in reconstruction quality, particularly in cases where data points are inconsistent yet spatially proximal.
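The consistency test sketched below is an illustrative approximation of the idea in this abstract, not the authors' exact statistic: it compares the per-point average log marginal likelihood of a one-dimensional RBF Gaussian process with and without the candidate observation, treating a non-degrading average as "the model improved". The kernel, noise level and all names are assumptions.

```python
# Hedged sketch: Gaussian-process log marginal likelihood as a relative
# consistency test for a candidate sensor reading. 1-D RBF kernel, fixed
# hyperparameters; all settings are illustrative assumptions.
import math

def rbf(a, b, lengthscale=1.0, signal_var=1.0):
    # Squared-exponential covariance between two scalar inputs.
    return signal_var * math.exp(-0.5 * ((a - b) / lengthscale) ** 2)

def cholesky(K):
    # Lower-triangular Cholesky factor of a small positive-definite matrix.
    n = len(K)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(K[i][i] - s) if i == j else (K[i][j] - s) / L[j][j]
    return L

def forward_solve(L, b):
    # Solve L x = b for lower-triangular L.
    x = []
    for i in range(len(b)):
        x.append((b[i] - sum(L[i][k] * x[k] for k in range(i))) / L[i][i])
    return x

def log_marginal_likelihood(xs, ys, noise_var=1e-2):
    n = len(xs)
    K = [[rbf(xi, xj) + (noise_var if i == j else 0.0)
          for j, xj in enumerate(xs)] for i, xi in enumerate(xs)]
    L = cholesky(K)
    z = forward_solve(L, ys)                 # yᵀK⁻¹y = zᵀz
    log_det = 2.0 * sum(math.log(L[i][i]) for i in range(n))
    return -0.5 * sum(v * v for v in z) - 0.5 * log_det - 0.5 * n * math.log(2 * math.pi)

def fusion_gain(xs, ys, x_new, y_new):
    # Change in per-point average log marginal likelihood after fusing the
    # candidate; >= 0 is taken here as "the model statistically improved".
    base = log_marginal_likelihood(xs, ys) / len(xs)
    fused = log_marginal_likelihood(xs + [x_new], ys + [y_new]) / (len(xs) + 1)
    return fused - base

xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [math.sin(v) for v in xs]                     # smooth target surface
good = fusion_gain(xs, ys, 0.75, math.sin(0.75))   # consistent reading
bad = fusion_gain(xs, ys, 0.75, 5.0)               # inconsistent reading, same location
print(good, bad)                                   # bad is far more negative
```

Note that the two candidates sit at the same input location, so a spatial distance threshold could not separate them; the likelihood-based score can, which is the point the abstract makes about relative rather than absolute tests.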
Abstract:
Acoustic recordings of the environment provide an effective means to monitor bird species diversity. To facilitate exploration of acoustic recordings, we describe a content-based birdcall retrieval algorithm. A query birdcall is a region of spectrogram bounded by frequency and time. Retrieval depends on a similarity measure derived from the orientation and distribution of spectral ridges. The spectral ridge detection method caters for a broad range of birdcall structures. In this paper, we extend previous work by incorporating a spectrogram scaling step in order to improve the detection of spectral ridges. Compared to an existing approach based on MFCC features, our feature representation achieves better retrieval performance for multiple bird species in noisy recordings.
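The retrieval step in the abstract above can be sketched in miniature. The paper's similarity measure is derived from the orientation and distribution of spectral ridges; here each call is assumed to be pre-summarised as a normalised orientation histogram (a hypothetical stand-in representation), and candidates are ranked by histogram intersection. All names and numbers are illustrative.

```python
# Hedged sketch of content-based retrieval by feature similarity; the
# histogram features and database entries are invented for illustration.
def histogram_intersection(h1, h2):
    # Overlap of two normalised histograms; 1.0 means identical.
    return sum(min(a, b) for a, b in zip(h1, h2))

def retrieve(query_hist, database, top_k=3):
    # database: list of (name, histogram); rank by similarity to the query.
    ranked = sorted(database,
                    key=lambda item: histogram_intersection(query_hist, item[1]),
                    reverse=True)
    return [name for name, _ in ranked[:top_k]]

query = [0.1, 0.6, 0.2, 0.1]                # query call's orientation histogram
db = [
    ("whipbird_001", [0.1, 0.55, 0.25, 0.1]),
    ("lorikeet_017", [0.7, 0.1, 0.1, 0.1]),
    ("whipbird_044", [0.15, 0.5, 0.2, 0.15]),
]
print(retrieve(query, db, top_k=2))         # both whipbird calls rank first
```

The paper's spectrogram-scaling contribution would sit upstream of this step, improving the ridge features before any such similarity ranking is applied.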
Abstract:
Big data analysis in the healthcare sector is still in its early stages compared with other business sectors, for numerous reasons: accommodating the volume, velocity and variety of healthcare data, and identifying platforms that can examine data from multiple sources, such as clinical records, genomic data, financial systems, and administrative systems. The Electronic Health Record (EHR) is a key information resource for big data analysis and is also composed of varied co-created values. Successful integration and crossing of different subfields of healthcare data, such as biomedical informatics and health informatics, could lead to huge improvements for the end users of the health care system, i.e. the patients.
Abstract:
Huge amounts of data are generated from a variety of information sources in healthcare, and these sources originate from a variety of clinical information systems and corporate data warehouses. The data derived from these sources are used for analysis and trending purposes, thus playing an influential role as a real-time decision-making tool. The unstructured, narrative data provided by these sources qualify as healthcare big data, and researchers argue that the application of big data in healthcare might improve accountability and efficiency.