156 resultados para Machine Typed Document


Relevância:

20.00% 20.00%

Publicador:

Resumo:

Age-related Macular Degeneration (AMD) is one of the major causes of vision loss and blindness in ageing population. Currently, there is no cure for AMD, however early detection and subsequent treatment may prevent the severe vision loss or slow the progression of the disease. AMD can be classified into two types: dry and wet AMDs. The people with macular degeneration are mostly affected by dry AMD. Early symptoms of AMD are formation of drusen and yellow pigmentation. These lesions are identified by manual inspection of fundus images by the ophthalmologists. It is a time consuming, tiresome process, and hence an automated diagnosis of AMD screening tool can aid clinicians in their diagnosis significantly. This study proposes an automated dry AMD detection system using various entropies (Shannon, Kapur, Renyi and Yager), Higher Order Spectra (HOS) bispectra features, Fractional Dimension (FD), and Gabor wavelet features extracted from greyscale fundus images. The features are ranked using t-test, Kullback–Lieber Divergence (KLD), Chernoff Bound and Bhattacharyya Distance (CBBD), Receiver Operating Characteristics (ROC) curve-based and Wilcoxon ranking methods in order to select optimum features and classified into normal and AMD classes using Naive Bayes (NB), k-Nearest Neighbour (k-NN), Probabilistic Neural Network (PNN), Decision Tree (DT) and Support Vector Machine (SVM) classifiers. The performance of the proposed system is evaluated using private (Kasturba Medical Hospital, Manipal, India), Automated Retinal Image Analysis (ARIA) and STructured Analysis of the Retina (STARE) datasets. The proposed system yielded the highest average classification accuracies of 90.19%, 95.07% and 95% with 42, 54 and 38 optimal ranked features using SVM classifier for private, ARIA and STARE datasets respectively. This automated AMD detection system can be used for mass fundus image screening and aid clinicians by making better use of their expertise on selected images that require further examination.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Portable, water filled road safety barriers are used to provide protection and reduce the potential hazard due to errant vehicles in areas where the road conditions change frequently (e.g. near road work sites). As part of an effort to reduce excessive working widths typical of these systems, a study was conducted to assess the effectiveness of introducing polymeric foam filled panels into the design. Surrogate impact tests of a design typical of such as barrier system were conducted utilising a pneumatically powered horizontal impact testing machine up to impact energies of 7.40 kJ. Results of these tests are utilised to examine the barrier behaviour, in addition to being used to validate a couple FE/SPH model of the barrier system. Once validated, the FE/SPH model it utilised as the basis for a parametric study into the efficacy and effects of the inclusion of polymeric foam filled panels on the performance of portable water filled road safety barriers. It was found that extruded polystyrene foam functioned well, with a greater thickness of the foam panel significantly reducing the impacting body velocity as the barrier began to translate.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Description of a patient's injuries is recorded in narrative text form by hospital emergency departments. For statistical reporting, this text data needs to be mapped to pre-defined codes. Existing research in this field uses the Naïve Bayes probabilistic method to build classifiers for mapping. In this paper, we focus on providing guidance on the selection of a classification method. We build a number of classifiers belonging to different classification families such as decision tree, probabilistic, neural networks, and instance-based, ensemble-based and kernel-based linear classifiers. An extensive pre-processing is carried out to ensure the quality of data and, in hence, the quality classification outcome. The records with a null entry in injury description are removed. The misspelling correction process is carried out by finding and replacing the misspelt word with a soundlike word. Meaningful phrases have been identified and kept, instead of removing the part of phrase as a stop word. The abbreviations appearing in many forms of entry are manually identified and only one form of abbreviations is used. Clustering is utilised to discriminate between non-frequent and frequent terms. This process reduced the number of text features dramatically from about 28,000 to 5000. The medical narrative text injury dataset, under consideration, is composed of many short documents. The data can be characterized as high-dimensional and sparse, i.e., few features are irrelevant but features are correlated with one another. Therefore, Matrix factorization techniques such as Singular Value Decomposition (SVD) and Non Negative Matrix Factorization (NNMF) have been used to map the processed feature space to a lower-dimensional feature space. Classifiers with these reduced feature space have been built. In experiments, a set of tests are conducted to reflect which classification method is best for the medical text classification. The Non Negative Matrix Factorization with Support Vector Machine method can achieve 93% precision which is higher than all the tested traditional classifiers. We also found that TF/IDF weighting which works well for long text classification is inferior to binary weighting in short document classification. Another finding is that the Top-n terms should be removed in consultation with medical experts, as it affects the classification performance.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Heart rate variability (HRV) refers to the regulation of the sinoatrial node, the natural pacemaker of the heart by the sympathetic and parasympathetic branches of the autonomic nervous system. HRV analysis is an important tool to observe the heart’s ability to respond to normal regulatory impulses that affect its rhythm. Like many bio-signals, HRV signals are non-linear in nature. Higher order spectral analysis (HOS) is known to be a good tool for the analysis of non-linear systems and provides good noise immunity. A computer-based arrhythmia detection system of cardiac states is very useful in diagnostics and disease management. In this work, we studied the identification of the HRV signals using features derived from HOS. These features were fed to the support vector machine (SVM) for classification. Our proposed system can classify the normal and other four classes of arrhythmia with an average accuracy of more than 85%.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Problem addressed Wrist-worn accelerometers are associated with greater compliance. However, validated algorithms for predicting activity type from wrist-worn accelerometer data are lacking. This study compared the activity recognition rates of an activity classifier trained on acceleration signal collected on the wrist and hip. Methodology 52 children and adolescents (mean age 13.7 +/- 3.1 year) completed 12 activity trials that were categorized into 7 activity classes: lying down, sitting, standing, walking, running, basketball, and dancing. During each trial, participants wore an ActiGraph GT3X+ tri-axial accelerometer on the right hip and the non-dominant wrist. Features were extracted from 10-s windows and inputted into a regularized logistic regression model using R (Glmnet + L1). Results Classification accuracy for the hip and wrist was 91.0% +/- 3.1% and 88.4% +/- 3.0%, respectively. The hip model exhibited excellent classification accuracy for sitting (91.3%), standing (95.8%), walking (95.8%), and running (96.8%); acceptable classification accuracy for lying down (88.3%) and basketball (81.9%); and modest accuracy for dance (64.1%). The wrist model exhibited excellent classification accuracy for sitting (93.0%), standing (91.7%), and walking (95.8%); acceptable classification accuracy for basketball (86.0%); and modest accuracy for running (78.8%), lying down (74.6%) and dance (69.4%). Potential Impact Both the hip and wrist algorithms achieved acceptable classification accuracy, allowing researchers to use either placement for activity recognition.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The use of ‘topic’ concepts has shown improved search performance, given a query, by bringing together relevant documents which use different terms to describe a higher level concept. In this paper, we propose a method for discovering and utilizing concepts in indexing and search for a domain specific document collection being utilized in industry. This approach differs from others in that we only collect focused concepts to build the concept space and that instead of turning a user’s query into a concept based query, we experiment with different techniques of combining the original query with a concept query. We apply the proposed approach to a real-world document collection and the results show that in this scenario the use of concept knowledge at index and search can improve the relevancy of results.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Automated remote ultrasound detectors allow large amounts of data on bat presence and activity to be collected. Processing of such data involves identifying bat species from their echolocation calls. Automated species identification has the potential to provide more consistent, predictable, and potentially higher levels of accuracy than identification by humans. In contrast, identification by humans permits flexibility and intelligence in identification, as well as the incorporation of features and patterns that may be difficult to quantify. We compared humans with artificial neural networks (ANNs) in their ability to classify short recordings of bat echolocation calls of variable signal to noise ratios; these sequences are typical of those obtained from remote automated recording systems that are often used in large-scale ecological studies. We presented 45 recordings (1–4 calls) produced by known species of bats to ANNs and to 26 human participants with 1 month to 23 years of experience in acoustic identification of bats. Humans correctly classified 86% of recordings to genus and 56% to species; ANNs correctly identified 92% and 62%, respectively. There was no significant difference between the performance of ANNs and that of humans, but ANNs performed better than about 75% of humans. There was little relationship between the experience of the human participants and their classification rate. However, humans with <1 year of experience performed worse than others. Currently, identification of bat echolocation calls by humans is suitable for ecological research, after careful consideration of biases. However, improvements to ANNs and the data that they are trained on may in future increase their performance to beyond those demonstrated by humans.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Background The use of dual growing rods is a fusionless surgical approach to the treatment of early onset scoliosis (EOS), which aims of harness potential growth in order to correct spinal deformity. The purpose of this study was to compare the in-vitro biomechanical response of two different dual rod designs under axial rotation loading. Methods Six porcine spines were dissected into seven level thoracolumbar multi-segmental units. Each specimen was mounted and tested in a biaxial Instron machine, undergoing nondestructive left/right axial rotation to peak moments of 4Nm at a constant rotation rate of 8deg.s-1. A motion tracking system (Optotrak) measured 3D displacements of individual vertebrae. Each spine was tested in an un-instrumented state first and then with appropriately sized semi-constrained growing rods and ‘rigid’ rods in alternating sequence. Range of motion, neutral zone size and stiffness were calculated from the moment-rotation curves and intervertebral ranges of motion were calculated from Optotrak data. Findings Irrespective of test sequence, rigid rods showed significantly reduction of total rotation across all instrumented levels (with increased stiffness) whilst semi-constrained rods exhibited similar rotation behavior to the un-instrumented (P<0.05). An 11% and 8% increase in stiffness for left and right axial rotation respectively and 15% reduction in total range of motion was recorded with dual rigid rods compared with semi-constrained rods. Interpretation Based on these findings, the semi-constrained growing rods do not increase axial rotation stiffness compared with un-instrumented spines. This is thought to provide a more physiological environment for the growing spine compared to dual rigid rod constructs.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Clustering is an important technique in organising and categorising web scale documents. The main challenges faced in clustering the billions of documents available on the web are the processing power required and the sheer size of the datasets available. More importantly, it is nigh impossible to generate the labels for a general web document collection containing billions of documents and a vast taxonomy of topics. However, document clusters are most commonly evaluated by comparison to a ground truth set of labels for documents. This paper presents a clustering and labeling solution where the Wikipedia is clustered and hundreds of millions of web documents in ClueWeb12 are mapped on to those clusters. This solution is based on the assumption that the Wikipedia contains such a wide range of diverse topics that it represents a small scale web. We found that it was possible to perform the web scale document clustering and labeling process on one desktop computer under a couple of days for the Wikipedia clustering solution containing about 1000 clusters. It takes longer to execute a solution with finer granularity clusters such as 10,000 or 50,000. These results were evaluated using a set of external data.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The commercialization of aerial image processing is highly dependent on the platforms such as UAVs (Unmanned Aerial Vehicles). However, the lack of an automated UAV forced landing site detection system has been identified as one of the main impediments to allow UAV flight over populated areas in civilian airspace. This article proposes a UAV forced landing site detection system that is based on machine learning approaches including the Gaussian Mixture Model and the Support Vector Machine. A range of learning parameters are analysed including the number of Guassian mixtures, support vector kernels including linear, radial basis function Kernel (RBF) and polynormial kernel (poly), and the order of RBF kernel and polynormial kernel. Moreover, a modified footprint operator is employed during feature extraction to better describe the geometric characteristics of the local area surrounding a pixel. The performance of the presented system is compared to a baseline UAV forced landing site detection system which uses edge features and an Artificial Neural Network (ANN) region type classifier. Experiments conducted on aerial image datasets captured over typical urban environments reveal improved landing site detection can be achieved with an SVM classifier with an RBF kernel using a combination of colour and texture features. Compared to the baseline system, the proposed system provides significant improvement in term of the chance to detect a safe landing area, and the performance is more stable than the baseline in the presence of changes to the UAV altitude.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

For traditional information filtering (IF) models, it is often assumed that the documents in one collection are only related to one topic. However, in reality users’ interests can be diverse and the documents in the collection often involve multiple topics. Topic modelling was proposed to generate statistical models to represent multiple topics in a collection of documents, but in a topic model, topics are represented by distributions over words which are limited to distinctively represent the semantics of topics. Patterns are always thought to be more discriminative than single terms and are able to reveal the inner relations between words. This paper proposes a novel information filtering model, Significant matched Pattern-based Topic Model (SPBTM). The SPBTM represents user information needs in terms of multiple topics and each topic is represented by patterns. More importantly, the patterns are organized into groups based on their statistical and taxonomic features, from which the more representative patterns, called Significant Matched Patterns, can be identified and used to estimate the document relevance. Experiments on benchmark data sets demonstrate that the SPBTM significantly outperforms the state-of-the-art models.