992 resultados para Machine Typed Document


Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this thesis we investigate the use of quantum probability theory for ranking documents. Quantum probability theory is used to estimate the probability of relevance of a document given a user's query. We posit that quantum probability theory can lead to a better estimation of the probability of a document being relevant to a user's query than the common approach, i. e. the Probability Ranking Principle (PRP), which is based upon Kolmogorovian probability theory. Following our hypothesis, we formulate an analogy between the document retrieval scenario and a physical scenario, that of the double slit experiment. Through the analogy, we propose a novel ranking approach, the quantum probability ranking principle (qPRP). Key to our proposal is the presence of quantum interference. Mathematically, this is the statistical deviation between empirical observations and expected values predicted by the Kolmogorovian rule of additivity of probabilities of disjoint events in configurations such that of the double slit experiment. We propose an interpretation of quantum interference in the document ranking scenario, and examine how quantum interference can be effectively estimated for document retrieval. To validate our proposal and to gain more insights about approaches for document ranking, we (1) analyse PRP, qPRP and other ranking approaches, exposing the assumptions underlying their ranking criteria and formulating the conditions for the optimality of the two ranking principles, (2) empirically compare three ranking principles (i. e. PRP, interactive PRP, and qPRP) and two state-of-the-art ranking strategies in two retrieval scenarios, those of ad-hoc retrieval and diversity retrieval, (3) analytically contrast the ranking criteria of the examined approaches, exposing similarities and differences, (4) study the ranking behaviours of approaches alternative to PRP in terms of the kinematics they impose on relevant documents, i. e. by considering the extent and direction of the movements of relevant documents across the ranking recorded when comparing PRP against its alternatives. Our findings show that the effectiveness of the examined ranking approaches strongly depends upon the evaluation context. In the traditional evaluation context of ad-hoc retrieval, PRP is empirically shown to be better or comparable to alternative ranking approaches. However, when we turn to examine evaluation contexts that account for interdependent document relevance (i. e. when the relevance of a document is assessed also with respect to other retrieved documents, as it is the case in the diversity retrieval scenario) then the use of quantum probability theory and thus of qPRP is shown to improve retrieval and ranking effectiveness over the traditional PRP and alternative ranking strategies, such as Maximal Marginal Relevance, Portfolio theory, and Interactive PRP. This work represents a significant step forward regarding the use of quantum theory in information retrieval. It demonstrates in fact that the application of quantum theory to problems within information retrieval can lead to improvements both in modelling power and retrieval effectiveness, allowing the constructions of models that capture the complexity of information retrieval situations. Furthermore, the thesis opens up a number of lines for future research. These include: (1) investigating estimations and approximations of quantum interference in qPRP; (2) exploiting complex numbers for the representation of documents and queries, and; (3) applying the concepts underlying qPRP to tasks other than document ranking.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Objective To develop and evaluate machine learning techniques that identify limb fractures and other abnormalities (e.g. dislocations) from radiology reports. Materials and Methods 99 free-text reports of limb radiology examinations were acquired from an Australian public hospital. Two clinicians were employed to identify fractures and abnormalities from the reports; a third senior clinician resolved disagreements. These assessors found that, of the 99 reports, 48 referred to fractures or abnormalities of limb structures. Automated methods were then used to extract features from these reports that could be useful for their automatic classification. The Naive Bayes classification algorithm and two implementations of the support vector machine algorithm were formally evaluated using cross-fold validation over the 99 reports. Result Results show that the Naive Bayes classifier accurately identifies fractures and other abnormalities from the radiology reports. These results were achieved when extracting stemmed token bigram and negation features, as well as using these features in combination with SNOMED CT concepts related to abnormalities and disorders. The latter feature has not been used in previous works that attempted classifying free-text radiology reports. Discussion Automated classification methods have proven effective at identifying fractures and other abnormalities from radiology reports (F-Measure up to 92.31%). Key to the success of these techniques are features such as stemmed token bigrams, negations, and SNOMED CT concepts associated with morphologic abnormalities and disorders. Conclusion This investigation shows early promising results and future work will further validate and strengthen the proposed approaches.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Background Cancer monitoring and prevention relies on the critical aspect of timely notification of cancer cases. However, the abstraction and classification of cancer from the free-text of pathology reports and other relevant documents, such as death certificates, exist as complex and time-consuming activities. Aims In this paper, approaches for the automatic detection of notifiable cancer cases as the cause of death from free-text death certificates supplied to Cancer Registries are investigated. Method A number of machine learning classifiers were studied. Features were extracted using natural language techniques and the Medtex toolkit. The numerous features encompassed stemmed words, bi-grams, and concepts from the SNOMED CT medical terminology. The baseline consisted of a keyword spotter using keywords extracted from the long description of ICD-10 cancer related codes. Results Death certificates with notifiable cancer listed as the cause of death can be effectively identified with the methods studied in this paper. A Support Vector Machine (SVM) classifier achieved best performance with an overall F-measure of 0.9866 when evaluated on a set of 5,000 free-text death certificates using the token stem feature set. The SNOMED CT concept plus token stem feature set reached the lowest variance (0.0032) and false negative rate (0.0297) while achieving an F-measure of 0.9864. The SVM classifier accounts for the first 18 of the top 40 evaluated runs, and entails the most robust classifier with a variance of 0.001141, half the variance of the other classifiers. Conclusion The selection of features significantly produced the most influences on the performance of the classifiers, although the type of classifier employed also affects performance. In contrast, the feature weighting schema created a negligible effect on performance. Specifically, it is found that stemmed tokens with or without SNOMED CT concepts create the most effective feature when combined with an SVM classifier.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this study, a machine learning technique called anomaly detection is employed for wind turbine bearing fault detection. Basically, the anomaly detection algorithm is used to recognize the presence of unusual and potentially faulty data in a dataset, which contains two phases: a training phase and a testing phase. Two bearing datasets were used to validate the proposed technique, fault-seeded bearing from a test rig located at Case Western Reserve University to validate the accuracy of the anomaly detection method, and a test to failure data of bearings from the NSF I/UCR Center for Intelligent Maintenance Systems (IMS). The latter data set was used to compare anomaly detection with SVM, a previously well-known applied method, in rapidly finding the incipient faults.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This study presents an acoustic emission (AE) based fault diagnosis for low speed bearing using multi-class relevance vector machine (RVM). A low speed test rig was developed to simulate the various defects with shaft speeds as low as 10 rpm under several loading conditions. The data was acquired using anAEsensor with the test bearing operating at a constant loading (5 kN) andwith a speed range from20 to 80 rpm. This study is aimed at finding a reliable method/tool for low speed machines fault diagnosis based on AE signal. In the present study, component analysis was performed to extract the bearing feature and to reduce the dimensionality of original data feature. The result shows that multi-class RVM offers a promising approach for fault diagnosis of low speed machines.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Newsletter ACM SIGIR Forum: The Seventeenth Australian Document Computing Symposium was held in Dunedin, New Zealand on the 5th and 6th of December 2012. In total twenty four papers were submitted. From those eleven were accepted for full presentation and 8 for short presentation. A poster session was held jointly with the Australasian Language Technology Workshop.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

With the growing size and variety of social media files on the web, it’s becoming critical to efficiently organize them into clusters for further processing. This paper presents a novel scalable constrained document clustering method that harnesses the power of search engines capable of dealing with large text data. Instead of calculating distance between the documents and all of the clusters’ centroids, a neighborhood of best cluster candidates is chosen using a document ranking scheme. To make the method faster and less memory dependable, the in-memory and in-database processing are combined in a semi-incremental manner. This method has been extensively tested in the social event detection application. Empirical analysis shows that the proposed method is efficient both in computation and memory usage while producing notable accuracy.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Multi-touch interfaces across a wide range of hardware platforms are becoming pervasive. This is due to the adoption of smart phones and tablets in both the consumer and corporate market place. This paper proposes a human-machine interface to interact with unmanned aerial systems based on the philosophy of multi-touch hardware-independent high-level interaction with multiple systems simultaneously. Our approach incorporates emerging development methods for multi-touch interfaces on mobile platforms. A framework is defined for supporting multiple protocols. An open source solution is presented that demonstrates: architecture supporting different communications hardware; an extensible approach for supporting multiple protocols; and the ability to monitor and interact with multiple UAVs from multiple clients simultaneously. Validation tests were conducted to assess the performance, scalability and impact on packet latency under different client configurations.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper presents the modeling and motion-sensorless direct torque and flux control of a novel dual-airgap axial-flux permanent-magnet machine optimized for use in flywheel energy storage system (FESS) applications. Independent closed-loop torque and stator flux regulation are performed in the stator flux ( x-y) reference frame via two PI controllers. This facilitates fast torque dynamics, which is critical as far as energy charging/discharging in the FESS is concerned. As FESS applications demand high-speed operation, a new field-weakening algorithm is proposed in this paper. Flux weakening is achieved autonomously once the y-axis voltage exceeds the available inverter voltage. An inherently speed sensorless stator flux observer immune to stator resistance variations and dc-offset effects is also proposed for accurate flux and speed estimation. The proposed observer eliminates the rotary encoder, which in turn reduces the overall weight and cost of the system while improving its reliability. The effectiveness of the proposed control scheme has been verified by simulations and experiments on a machine prototype.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper presents the modeling and position-sensorless vector control of a dual-airgap axial flux permanent magnet (AFPM) machine optimized for use in flywheel energy storage system (FESS) applications. The proposed AFPM machine has two sets of three-phase stator windings but requires only a single power converter to control both the electromagnetic torque and the axial levitation force. The proper controllability of the latter is crucial as it can be utilized to minimize the vertical bearing stress to improve the efficiency of the FESS. The method for controlling both the speed and axial displacement of the machine is discussed. An inherent speed sensorless observer is also proposed for speed estimation. The proposed observer eliminates the rotary encoder, which in turn reduces the overall weight and cost of the system while improving its reliability. The effectiveness of the proposed control scheme has been verified by simulations and experiments on a prototype machine.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Brain decoding of functional Magnetic Resonance Imaging data is a pattern analysis task that links brain activity patterns to the experimental conditions. Classifiers predict the neural states from the spatial and temporal pattern of brain activity extracted from multiple voxels in the functional images in a certain period of time. The prediction results offer insight into the nature of neural representations and cognitive mechanisms and the classification accuracy determines our confidence in understanding the relationship between brain activity and stimuli. In this paper, we compared the efficacy of three machine learning algorithms: neural network, support vector machines, and conditional random field to decode the visual stimuli or neural cognitive states from functional Magnetic Resonance data. Leave-one-out cross validation was performed to quantify the generalization accuracy of each algorithm on unseen data. The results indicated support vector machine and conditional random field have comparable performance and the potential of the latter is worthy of further investigation.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This article presents a study of how humans perceive and judge the relevance of documents. Humans are adept at making reasonably robust and quick decisions about what information is relevant to them, despite the ever increasing complexity and volume of their surrounding information environment. The literature on document relevance has identified various dimensions of relevance (e.g., topicality, novelty, etc.), however little is understood about how these dimensions may interact. We performed a crowdsourced study of how human subjects judge two relevance dimensions in relation to document snippets retrieved from an internet search engine. The order of the judgment was controlled. For those judgments exhibiting an order effect, a q–test was performed to determine whether the order effects can be explained by a quantum decision model based on incompatible decision perspectives. Some evidence of incompatibility was found which suggests incompatible decision perspectives is appropriate for explaining interacting dimensions of relevance in such instances.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Age-related Macular Degeneration (AMD) is one of the major causes of vision loss and blindness in ageing population. Currently, there is no cure for AMD, however early detection and subsequent treatment may prevent the severe vision loss or slow the progression of the disease. AMD can be classified into two types: dry and wet AMDs. The people with macular degeneration are mostly affected by dry AMD. Early symptoms of AMD are formation of drusen and yellow pigmentation. These lesions are identified by manual inspection of fundus images by the ophthalmologists. It is a time consuming, tiresome process, and hence an automated diagnosis of AMD screening tool can aid clinicians in their diagnosis significantly. This study proposes an automated dry AMD detection system using various entropies (Shannon, Kapur, Renyi and Yager), Higher Order Spectra (HOS) bispectra features, Fractional Dimension (FD), and Gabor wavelet features extracted from greyscale fundus images. The features are ranked using t-test, Kullback–Lieber Divergence (KLD), Chernoff Bound and Bhattacharyya Distance (CBBD), Receiver Operating Characteristics (ROC) curve-based and Wilcoxon ranking methods in order to select optimum features and classified into normal and AMD classes using Naive Bayes (NB), k-Nearest Neighbour (k-NN), Probabilistic Neural Network (PNN), Decision Tree (DT) and Support Vector Machine (SVM) classifiers. The performance of the proposed system is evaluated using private (Kasturba Medical Hospital, Manipal, India), Automated Retinal Image Analysis (ARIA) and STructured Analysis of the Retina (STARE) datasets. The proposed system yielded the highest average classification accuracies of 90.19%, 95.07% and 95% with 42, 54 and 38 optimal ranked features using SVM classifier for private, ARIA and STARE datasets respectively. This automated AMD detection system can be used for mass fundus image screening and aid clinicians by making better use of their expertise on selected images that require further examination.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Portable, water filled road safety barriers are used to provide protection and reduce the potential hazard due to errant vehicles in areas where the road conditions change frequently (e.g. near road work sites). As part of an effort to reduce excessive working widths typical of these systems, a study was conducted to assess the effectiveness of introducing polymeric foam filled panels into the design. Surrogate impact tests of a design typical of such as barrier system were conducted utilising a pneumatically powered horizontal impact testing machine up to impact energies of 7.40 kJ. Results of these tests are utilised to examine the barrier behaviour, in addition to being used to validate a couple FE/SPH model of the barrier system. Once validated, the FE/SPH model it utilised as the basis for a parametric study into the efficacy and effects of the inclusion of polymeric foam filled panels on the performance of portable water filled road safety barriers. It was found that extruded polystyrene foam functioned well, with a greater thickness of the foam panel significantly reducing the impacting body velocity as the barrier began to translate.