32 results for Diagnostic imaging - Data processing


Relevance:

100.00%

Publisher:

Abstract:

Current physiological sensors are passive and transmit sensed data to a monitoring centre (MC) through a wireless body area network (WBAN) without processing the data intelligently. We propose a solution that discerns data requestors in order to prioritise and infer data, reducing transactions and conserving battery power, which are important requirements of mobile health (mHealth). However, alarms cannot be determined reliably without knowing the activity of the user. For example, a heart rate of 170 beats per minute can be normal during exercise, whereas the same figure sensed during sleep should raise an alarm. To solve this problem, we suggest utilising existing activity recognition (AR) applications, since most health-related wearable devices include accelerometers along with physiological sensors. This paper presents a novel approach that combines physiological data with AR to provide not only improved and efficient services, such as alarm determination, but also richer health information that may open new markets and enable additional application services, such as converged mobile health with aged-care services. The approach has been verified by experimental tests using vital signs such as heart rate, respiration rate and body temperature, with AR accelerometer sensing integrated into an Android app.
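
The abstract does not give an implementation, but the activity-aware alarm idea can be sketched as follows: a heart-rate range is chosen per recognised activity, and an alarm is raised only when the reading falls outside the range for the current activity. The activity labels, threshold values and function name below are illustrative assumptions, not the authors' code.

```python
# Hypothetical sketch of activity-aware alarm determination.
# Activity labels and heart-rate bounds are illustrative assumptions.

ALARM_THRESHOLDS_BPM = {
    "sleeping":   {"low": 40, "high": 90},
    "resting":    {"low": 45, "high": 110},
    "walking":    {"low": 50, "high": 140},
    "exercising": {"low": 60, "high": 190},
}

def should_raise_alarm(heart_rate_bpm: int, activity: str) -> bool:
    """Return True if the heart rate is abnormal for the recognised activity."""
    bounds = ALARM_THRESHOLDS_BPM.get(activity, ALARM_THRESHOLDS_BPM["resting"])
    return heart_rate_bpm < bounds["low"] or heart_rate_bpm > bounds["high"]

# 170 bpm is acceptable while exercising but alarming during sleep.
assert not should_raise_alarm(170, "exercising")
assert should_raise_alarm(170, "sleeping")
```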

Relevance:

100.00%

Publisher:

Abstract:

When wearable and personal health devices and sensors capture data such as heart rate and body temperature for fitness tracking and health services, they simply transfer the data without filtering or optimising. This can overload the sensors and rapidly drain their batteries when they interact with Internet of Things (IoT) networks, which are expected to grow and demand more health data from device wearers. To solve the problem, this paper proposes to infer sensed data to reduce the data volume, which in turn reduces the bandwidth and battery power consumption that are essential constraints on sensor devices. This is achieved by applying beacon data points after inferencing over the data stream using variance rates, which compare each sensed value with the adjacent values before and after it. Experiments verify that this novel approach can reduce data volume by up to 99.5% while maintaining 98.62% accuracy. Whereas most existing work focuses on sensor-network improvements such as routing, operation and data-reading algorithms, we reduce data volume, and hence bandwidth and battery power consumption, while maintaining accuracy by implementing intelligence and optimisation in the sensor devices themselves.
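
As a rough illustration of the variance-rate idea (the paper's exact formula and beacon policy are not given here, so the rate definition, threshold and beacon interval below are assumptions), a sensor could forward a reading only when it differs sufficiently from its neighbours, emitting periodic beacon points so the receiver can re-anchor reconstruction of the stream.

```python
# Hypothetical sketch of variance-rate filtering with periodic beacons.
# The variance-rate definition, threshold and beacon interval are assumptions.

def filter_stream(readings, threshold=0.05, beacon_every=20):
    """Keep (index, value) pairs whose relative change from the previously kept
    value exceeds `threshold`, plus a beacon point every `beacon_every` samples
    so the receiver can re-anchor interpolation."""
    kept = []
    last_value = None
    for i, value in enumerate(readings):
        is_beacon = (i % beacon_every == 0)
        if last_value is None or is_beacon:
            kept.append((i, value))
            last_value = value
            continue
        variance_rate = abs(value - last_value) / max(abs(last_value), 1e-9)
        if variance_rate > threshold:
            kept.append((i, value))
            last_value = value
    return kept

heart_rate = [72, 72, 73, 72, 95, 96, 96, 74, 73, 73]
print(filter_stream(heart_rate, threshold=0.05, beacon_every=8))
```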

Relevance:

100.00%

Publisher:

Abstract:

The high-throughput experimental data produced by gene microarray technology has spurred numerous efforts to find effective ways of processing microarray data to reveal real biological relationships among genes. This work proposes an innovative data pre-processing approach that identifies noisy data in the data sets and eliminates or reduces its impact on gene clustering. With the proposed algorithm, the pre-processed data sets yield clustering results that are stable across clustering algorithms with different similarity metrics, the important gene and feature information is retained, and the clustering quality is improved. A preliminary evaluation on real microarray data sets has shown the effectiveness of the proposed algorithm.
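
The abstract does not spell out the noise criterion, so the sketch below substitutes a common proxy: genes whose expression profiles vary less than a chosen variance cut-off are treated as noise and dropped before clustering. The cut-off, the synthetic data and the use of scikit-learn clustering are illustrative assumptions, not the paper's algorithm.

```python
# Hypothetical sketch: variance-based noise filtering before gene clustering.
# The noise criterion (variance cut-off) stands in for the paper's algorithm.
import numpy as np
from sklearn.cluster import KMeans

def drop_noisy_genes(expression, min_variance=0.1):
    """expression: (genes x samples) matrix; keep genes with enough variance."""
    variances = expression.var(axis=1)
    return expression[variances >= min_variance]

rng = np.random.default_rng(0)
signal = rng.normal(size=(20, 10))            # genes with varying expression profiles
noise = 0.01 * rng.normal(size=(30, 10))      # near-constant, noise-level genes
expression = np.vstack([signal, noise])

filtered = drop_noisy_genes(expression, min_variance=0.1)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(filtered)
print(filtered.shape, labels[:5])
```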

Relevance:

100.00%

Publisher:

Abstract:

With the explosion of big data, processing large numbers of continuous data streams, i.e., big data stream processing (BDSP), has become a crucial requirement for many scientific and industrial applications in recent years. By offering a pool of computation, communication and storage resources, public clouds, such as Amazon's EC2, are undoubtedly the most efficient platforms to meet the ever-growing needs of BDSP. Public cloud service providers usually operate a number of geo-distributed datacenters across the globe, and different datacenter pairs incur different inter-datacenter network costs charged by Internet Service Providers (ISPs). Meanwhile, inter-datacenter traffic in BDSP constitutes a large portion of a cloud provider's traffic demand over the Internet and incurs substantial communication cost, which may even become the dominant operational expenditure. Because datacenter resources are provided in a virtualized way, the virtual machines (VMs) for stream processing tasks can be freely deployed onto any datacenter, provided that the Service Level Agreement (SLA, e.g., quality-of-information) is obeyed. This raises the opportunity, but also the challenge, of exploiting inter-datacenter network cost diversity to optimize both VM placement and load balancing towards network cost minimization with a guaranteed SLA. In this paper, we first propose a general modeling framework that describes all representative inter-task relationship semantics in BDSP. Based on this framework, we formulate the communication cost minimization problem for BDSP as a mixed-integer linear programming (MILP) problem and prove it to be NP-hard. We then propose a computation-efficient solution based on the MILP formulation. The high efficiency of our proposal is validated by extensive simulation-based studies.
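
The paper's full MILP is not reproduced here; as a toy illustration of the cost-aware placement idea, the sketch below assigns stream-processing tasks to datacenters so as to minimize inter-datacenter traffic cost, using the PuLP solver. The task graph, traffic volumes, per-pair prices and capacity limit are made-up inputs, not the authors' model.

```python
# Hypothetical toy MILP for cost-aware task placement (not the paper's model).
# Task graph, traffic volumes, prices and the capacity limit are made-up inputs.
from pulp import LpProblem, LpMinimize, LpVariable, lpSum, LpBinary, value

datacenters = ["dc1", "dc2", "dc3"]
tasks = ["ingest", "filter", "aggregate"]
edges = {("ingest", "filter"): 10.0, ("filter", "aggregate"): 4.0}   # GB/hour between tasks
price = {(a, b): (0.0 if a == b else 0.02) for a in datacenters for b in datacenters}  # $/GB

prob = LpProblem("bdsp_placement", LpMinimize)
x = LpVariable.dicts("place", (tasks, datacenters), cat=LpBinary)    # x[t][d]: task t on datacenter d
y = LpVariable.dicts("pair", (list(edges), datacenters, datacenters), cat=LpBinary)

# Objective: total inter-datacenter traffic cost.
prob += lpSum(edges[e] * price[(a, b)] * y[e][a][b]
              for e in edges for a in datacenters for b in datacenters)

for t in tasks:                                   # each task placed exactly once
    prob += lpSum(x[t][d] for d in datacenters) == 1
for d in datacenters:                             # toy capacity: at most two tasks per datacenter
    prob += lpSum(x[t][d] for t in tasks) <= 2
for (u, v) in edges:                              # linearized AND: y = 1 when u is on a and v is on b
    for a in datacenters:
        for b in datacenters:
            prob += y[(u, v)][a][b] >= x[u][a] + x[v][b] - 1

prob.solve()
for t in tasks:
    print(t, "->", next(d for d in datacenters if value(x[t][d]) > 0.5))
```

With these inputs the solver co-locates the heavily communicating pair and places the third task elsewhere, which is exactly the cost-diversity trade-off the paper formalizes.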

Relevance:

100.00%

Publisher:

Abstract:

Many organizations struggle with the massive amount of data they collect. Today, data do more than serve as the ingredients for churning out statistical reports: they help support efficient operations in many organizations and, to some extent, provide the competitive intelligence organizations need to survive in today's economy. Data mining can't always deliver timely and relevant results because data are constantly changing. However, stream-data processing might be more effective, judging by the Matrix project.

Relevance:

100.00%

Publisher:

Abstract:

One of the fundamental machine learning tasks is predictive classification. Given that organisations collect an ever-increasing amount of data, predictive classification methods must be able to handle large amounts of data effectively and efficiently. However, present requirements push existing algorithms to, and sometimes beyond, their limits, since many classification prediction algorithms were designed when currently common data set sizes were beyond imagination. This has led to a significant amount of research into ways of making classification learning algorithms more effective and efficient. Although substantial progress has been made, a number of key questions remain unanswered. This dissertation investigates two of them.

The first is whether large data sets require different types of algorithms to those currently employed. This is answered by analysing how the bias-plus-variance decomposition of predictive classification error changes as training set size increases. Experiments find that larger training sets do require different types of algorithms to those currently used. Some insight into the characteristics of suitable algorithms is provided, which may offer direction for the development of future classification prediction algorithms designed specifically for large data sets.

The second question is the role of sampling in machine learning with large data sets. Sampling has long been used to avoid scaling up algorithms to suit the size of the data set by scaling down the data set to suit the algorithm. However, the costs of performing sampling have not been widely explored. Two popular sampling methods are compared with learning from all available data in terms of predictive accuracy, model complexity and execution time. The comparison shows that sub-sampling generally produces models with accuracy close to, and sometimes greater than, that obtainable from learning with all available data. This result suggests that it may be possible to develop algorithms that exploit sub-sampling to reduce the time required to infer a model while sacrificing little if any accuracy. Methods of improving effective and efficient learning via sampling are also investigated, and new sampling methodologies are proposed. These include using a varying proportion of instances to determine the next inference step, and using a statistical calculation at each inference step to determine a sufficient sample size. Experiments show that using a statistical calculation of sample size can substantially reduce execution time with only a small loss, and occasionally a gain, in accuracy.

One common use of sampling is the construction of learning curves, which are often used to determine the training set size that maximally reduces execution time without being detrimental to accuracy. An analysis of methods for detecting convergence of learning curves is performed, focusing on methods that calculate the gradient of the tangent to the curve. Since such methods can be susceptible to local accuracy plateaus, the frequency of local plateaus is also investigated. It is shown that local accuracy plateaus are a common occurrence, and that ensuring a small loss of accuracy often incurs greater computational cost than learning from all available data. These results cast doubt on the applicability of gradient-of-tangent methods for detecting convergence, and on the viability of learning curves for reducing execution time in general.
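
The dissertation's exact statistical stopping rule is not given in the abstract; the sketch below illustrates the general idea with an assumed rule, growing the training sample until the validation accuracy of successive models stops improving by more than a confidence-interval-sized margin.

```python
# Hypothetical sketch of progressive sampling with a statistical stopping rule.
# The specific rule (compare successive accuracies against a binomial CI) is an assumption.
import math
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=20000, n_features=20, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.25, random_state=0)

def progressive_fit(X_train, y_train, X_val, y_val, start=500, growth=2.0):
    prev_acc, n = None, start
    while True:
        model = DecisionTreeClassifier(random_state=0).fit(X_train[:n], y_train[:n])
        acc = model.score(X_val, y_val)
        # Half-width of a ~95% binomial confidence interval on validation accuracy.
        margin = 1.96 * math.sqrt(acc * (1 - acc) / len(y_val))
        if prev_acc is not None and acc - prev_acc <= margin:
            return model, n, acc            # accuracy has plateaued: stop sampling
        if n >= len(y_train):
            return model, n, acc            # all available data used
        prev_acc, n = acc, min(int(n * growth), len(y_train))

model, used, acc = progressive_fit(X_train, y_train, X_val, y_val)
print(f"stopped after {used} training instances, validation accuracy {acc:.3f}")
```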

Relevance:

100.00%

Publisher:

Abstract:

The overarching goal of this dissertation was to evaluate the contextual components of instructional strategies for the acquisition of complex programming concepts. A meta-knowledge processing model is proposed on the basis of the research findings, thereby facilitating the selection of media treatment for electronic courseware. When implemented, this model extends the work of Smith (1998), as a front-end methodology, for his glass-box interpreter called Bradman, for teaching novice programmers. Technology now provides the means to produce individualized instructional packages with relative ease. Multimedia and Web courseware development accentuate a highly graphical (or visual) approach to instructional formats. Typically, little consideration is given to the effectiveness of screen-based visual stimuli, and curiously, students are expected to be visually literate despite the complexity of human-computer interaction. Visual literacy is much harder for some people to acquire than for others (see Chapter Four: Conditions-of-the-Learner).

An innovative research programme was devised to investigate the interactive effect of instructional strategies, enhanced with text-plus-textual metaphors or text-plus-graphical metaphors, and cognitive style on the acquisition of a special category of abstract (process) programming concept. This type of concept was chosen to focus on the role of analogic knowledge involved in computer programming. The results are discussed within the context of the internal/external exchange process, drawing on Ritchey's (1980) concepts of within-item and between-item encoding elaborations. The methodology developed for the doctoral project integrates earlier research knowledge in a novel, interdisciplinary, conceptual framework: from instructional science in the USA, the concept learning models; from British cognitive psychology and human memory research, the definition of the cognitive style construct; and from Australian educational research, the measurement tools for instructional outcomes. The experimental design consisted of a screening test to determine cognitive style, a pretest to determine prior domain knowledge of abstract programming knowledge elements, the instruction period, and a post-test to measure improved performance. This research design provides a three-level discovery process to articulate: 1) the fusion of strategic knowledge required by the novice learner for dealing with contexts within instructional strategies; 2) acquisition of knowledge using measurable instructional outcomes and learner characteristics; and 3) knowledge of the innate environmental factors which influence the instructional outcomes.

This research has successfully identified the interactive effect of instructional strategy, within an individual's cognitive style construct, on the acquisition of complex programming concepts. However, the significance of the three-level discovery process lies in the scope of the methodology to inform the design of a meta-knowledge processing model for instructional science. Firstly, the British cognitive style testing procedure is a low-cost, user-friendly computer application that effectively measures an individual's position on the two cognitive style continua (Riding & Cheema, 1991). Secondly, the QUEST Interactive Test Analysis System (Izard, 1995) allows for a probabilistic determination of an individual's knowledge level, relative to other participants and relative to test-item difficulties. Test-items can be related to skill levels and can consequently be used by instructional scientists to measure knowledge acquisition. Finally, an Effect Size Analysis (Cohen, 1977) allows for a direct comparison between treatment groups, giving a statistical measurement of how large an effect the independent variables have on the dependent outcomes. Combined with QUEST's hierarchical positioning of participants, this tool can assist in identifying preferred learning conditions for the evaluation of treatment groups.

By combining these three assessment analysis tools into instructional research, a computerized learning shell customised for individuals' cognitive constructs can be created (McKay & Garner, 1999). While this approach has widespread application, individual researchers/trainers would nonetheless need to validate the interactive effects within their specific learning domain with an extensive pilot study programme (McKay, 1999a; McKay, 1999b). Furthermore, the instructional material need not be limited to a textual/graphical comparison; it could be applied to any two or more instructional treatments of any kind, for instance a structured versus an exploratory strategy. The possibilities and combinations are believed to be endless, provided the focus is maintained on linking the front-end identification of cognitive style with an improved performance outcome. My in-depth analysis provides a better understanding of the interactive effects of the cognitive style construct and instructional format on the acquisition of abstract concepts involving spatial relations and logical reasoning. In providing the basis for a meta-knowledge processing model, this research is expected to be of interest to educators, cognitive psychologists, communications engineers and computer scientists specialising in computer-human interaction.

Relevance:

100.00%

Publisher:

Abstract:

This action research project set out to develop the competence of senior personnel from a private vocational college in Thailand in the use of administrative computer systems. The findings demonstrate the critical significance of progressive, incremental learning that is tailored to the professional and personal needs of learners. Learner competence was found to depend upon the creation of an environment promoting learner confidence.

Relevance:

100.00%

Publisher:

Abstract:

RFID is gaining significant traction as the preferred choice for automatic identification and data collection. However, various data processing and management problems, such as missed readings and duplicate readings, hinder wide-scale adoption of RFID systems. To this end, we propose an approach that filters the captured data, performing both noise removal and duplicate elimination. Experimental results demonstrate that the proposed approach improves the missed-data restoration process compared with the existing method.
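
The abstract does not detail the filtering algorithm; a common baseline for the problems it names can be sketched as a sliding-window filter that restores missed readings (a tag seen both before and after a cycle is assumed present) and drops duplicates within a cycle. The window size and data layout are assumptions, not the paper's method.

```python
# Hypothetical sliding-window cleaning of RFID read cycles (not the paper's exact method).
# Each read cycle is the list of tag IDs reported by a reader; window size is an assumption.

def clean_rfid_stream(cycles, window=3):
    """Return per-cycle tag sets with duplicates removed and missed reads restored:
    a tag absent from a cycle is restored if it appears both earlier and later
    within the surrounding window (a simple smoothing heuristic)."""
    cleaned = []
    for i, cycle in enumerate(cycles):
        tags = set(cycle)                                        # duplicate elimination
        before = set().union(*cycles[max(0, i - window):i]) if i > 0 else set()
        after = set().union(*cycles[i + 1:i + 1 + window]) if i + 1 < len(cycles) else set()
        tags |= before & after                                   # restore likely missed reads
        cleaned.append(tags)
    return cleaned

cycles = [["tagA", "tagA", "tagB"], ["tagB"], ["tagA", "tagB"]]
print(clean_rfid_stream(cycles))   # tagA is restored in the middle cycle
```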

Relevance:

100.00%

Publisher:

Abstract:

This paper addresses a major challenge in data mining applications where full information about the underlying processes, such as sensor networks or large online databases, cannot practically be obtained due to physical limitations such as low bandwidth, memory, storage or computing power. Motivated by the recent theory of direct information sampling called compressed sensing (CS), we propose a framework for detecting anomalies in such large-scale data mining applications. Exploiting the fact that the intrinsic dimension of the data in these applications is typically small relative to the raw dimension, and that compressed sensing is capable of capturing most information with few measurements, our work shows that spectral methods used for volume anomaly detection can be applied directly to the CS data with performance guarantees. Our theoretical contributions are supported by extensive experimental results on large datasets, which show satisfactory performance.
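
As a rough illustration (the paper's specific spectral method and guarantees are not reproduced here), the sketch below compresses high-dimensional, low-intrinsic-dimension vectors with a random Gaussian projection and then applies a PCA-style residual test to the compressed measurements; the projection size, retained rank and synthetic data are assumptions.

```python
# Hypothetical sketch: spectral (PCA-residual) anomaly detection on compressed measurements.
# Projection size, retained rank and the synthetic data are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n, d, m, rank = 500, 200, 40, 5                        # samples, raw dim, compressed dim, rank

latent = rng.normal(size=(n, rank)) @ rng.normal(size=(rank, d))
data = latent + 0.01 * rng.normal(size=(n, d))         # low intrinsic dimension + noise
data[-5:] += 3.0 * rng.normal(size=(5, d))             # inject a few volume anomalies

Phi = rng.normal(size=(d, m)) / np.sqrt(m)             # CS-style random measurement matrix
Y = data @ Phi                                         # compressed measurements

Yc = Y - Y.mean(axis=0)
_, _, Vt = np.linalg.svd(Yc, full_matrices=False)
normal_subspace = Vt[:rank]                            # principal subspace of compressed data
residual = Yc - (Yc @ normal_subspace.T) @ normal_subspace
scores = np.linalg.norm(residual, axis=1)              # residual energy = anomaly score

print("highest anomaly scores at indices:", np.argsort(scores)[-5:])
```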

Relevance:

100.00%

Publisher:

Abstract:

Protein mass spectrometry (MS) pattern recognition has recently emerged as a new method for cancer diagnosis. Unfortunately, classification performance may degrade owing to the enormously high dimensionality of the data. This paper investigates the use of Random Projection for dimensionality reduction of protein MS data. The effectiveness of Random Projection (RP) is analyzed and compared against Principal Component Analysis (PCA) using three classification algorithms, namely Support Vector Machine, Feed-forward Neural Networks and K-Nearest Neighbour. Three real-world cancer data sets are employed to evaluate the performance of RP and PCA. In these investigations, RP demonstrated classification performance better than, or at least comparable to, PCA when the dimensionality of the projection matrix was sufficiently large. This paper also explores the use of RP as a pre-processing step prior to PCA. The results show that performing RP prior to PCA significantly reduces computation time without sacrificing classification accuracy.
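
For readers unfamiliar with the comparison, a minimal sketch of such an experimental setup (with synthetic data standing in for the cancer MS data sets, arbitrary target dimensions, and two of the three classifiers) might look like this:

```python
# Hypothetical sketch comparing Random Projection and PCA as dimensionality reducers
# before classification; the synthetic data and target dimensions are assumptions.
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.random_projection import GaussianRandomProjection
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=5000, n_informative=30, random_state=0)

reducers = {
    "RP": GaussianRandomProjection(n_components=200, random_state=0),
    "PCA": PCA(n_components=50, random_state=0),
    "RP->PCA": make_pipeline(GaussianRandomProjection(n_components=200, random_state=0),
                             PCA(n_components=50, random_state=0)),
}
classifiers = {"SVM": SVC(), "kNN": KNeighborsClassifier(n_neighbors=5)}

for r_name, reducer in reducers.items():
    for c_name, clf in classifiers.items():
        pipe = make_pipeline(reducer, clf)
        acc = cross_val_score(pipe, X, y, cv=5).mean()
        print(f"{r_name:7s} + {c_name}: {acc:.3f}")
```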

Relevance:

100.00%

Publisher:

Abstract:

As a leading framework for processing and analyzing big data, MapReduce is leveraged by many enterprises to parallelize their data processing on distributed computing systems. Unfortunately, the all-to-all data forwarding from map tasks to reduce tasks in the traditional MapReduce framework generates a large amount of network traffic. The fact that, in many applications, the intermediate data generated by map tasks can be combined with a significant reduction in traffic motivates us to propose a data aggregation scheme for MapReduce jobs in the cloud. Specifically, we design an aggregation architecture under the existing MapReduce framework, with the objective of minimizing data traffic during the shuffle phase, in which aggregators can reside anywhere in the cloud. Experimental results show that our proposal significantly outperforms existing work in reducing network traffic.
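
The aggregation architecture itself is not described in the abstract; the underlying intuition is the same as a combiner that merges map output before the shuffle. A minimal in-process illustration (word count standing in for a real job, not the authors' cloud architecture) follows:

```python
# Hypothetical illustration of combiner-style aggregation before the shuffle phase.
# A word-count job stands in for a real MapReduce application.
from collections import Counter
from itertools import chain

def map_phase(chunk):
    """Emit (word, 1) pairs for one input split."""
    return [(word, 1) for word in chunk.split()]

def combine(pairs):
    """Aggregate map output locally so fewer records cross the network."""
    return list(Counter(word for word, _ in pairs).items())

splits = ["big data big traffic", "data traffic traffic"]
raw = [map_phase(s) for s in splits]
combined = [combine(p) for p in raw]

shuffled_without = list(chain.from_iterable(raw))        # 7 records shuffled
shuffled_with = list(chain.from_iterable(combined))      # 5 records shuffled
print(len(shuffled_without), "vs", len(shuffled_with), "records sent to reducers")
```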

Relevance:

100.00%

Publisher:

Abstract:

Big data refers to data sets so large or complex that they exceed the processing capacity of conventional data processing systems. This book provides a big picture of this broad research area, covering all the phases of the big data value chain. The authors have attempted to survey most of the relevant technologies in each phase. The book is recommended for readers interested in advanced research in big data, as well as for industry practitioners interested in building big data applications. Readers without the necessary technical background may need complementary readings.