42 results for Information Filtering, Pattern Mining, Relevance Feature Discovery, Text Mining

in CentAUR: Central Archive University of Reading - UK


Relevance:

100.00%

Abstract:

We present a method to enhance fault localization for software systems based on a frequent pattern mining algorithm. Our method is based on a large set of test cases for a given set of programs in which faults can be detected. The test executions are recorded as function call trees. Based on test oracles, the tests can be classified into successful and failing tests. A frequent pattern mining algorithm is used to identify frequent subtrees in successful and failing test executions. This information is used to rank functions according to their likelihood of containing a fault. The ranking suggests an order in which to examine the functions during fault analysis. We validate our approach experimentally using a subset of the Siemens benchmark programs.
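The ranking idea in this abstract can be illustrated with a rough sketch. It is not the authors' algorithm: instead of mining frequent subtrees from call trees, it scores each function by the difference between its support in failing and in passing traces. The name `rank_functions` and the sample traces are hypothetical.

```python
from collections import Counter


def rank_functions(passing_traces, failing_traces):
    """Rank functions, most suspicious first.

    Simplifying assumption: each trace is a flat list of called function
    names, not a call tree, and the score is
    (support in failing runs) - (support in passing runs).
    """
    # Count each function at most once per trace (support, not call count).
    pass_counts = Counter(f for trace in passing_traces for f in set(trace))
    fail_counts = Counter(f for trace in failing_traces for f in set(trace))
    n_pass = max(len(passing_traces), 1)
    n_fail = max(len(failing_traces), 1)

    scores = {}
    for func in set(pass_counts) | set(fail_counts):
        scores[func] = fail_counts[func] / n_fail - pass_counts[func] / n_pass

    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda f: (-scores[f], f))


# Hypothetical traces: "optimize" appears only in failing runs.
passing = [["main", "parse", "emit"], ["main", "parse"]]
failing = [["main", "parse", "optimize"], ["main", "optimize", "emit"]]
ranking = rank_functions(passing, failing)
```

The resulting list gives an order in which to inspect functions; a subtree-based variant would score call patterns rather than single functions.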

Relevance:

100.00%

Abstract:

Information systems for business often rely heavily on software. Two important feedback-related effects of embedding software in a business process are identified. First, the system dynamics of the software maintenance process can become complex, particularly in the number and scope of the feedback loops. Second, responsiveness to feedback can have a large effect on the evolvability of the information system. Ways have been explored to provide an effective mechanism for improving the quality of feedback between stakeholders during software maintenance. Understanding can be improved by using representations of information systems that are both service-based and architectural in scope. The conflicting forces that encourage change or stability can be resolved using patterns and pattern languages. A morphology of information systems pattern languages has been described to facilitate the identification and reuse of patterns and pattern languages. The kind of planning process needed to achieve consensus on a system's evolution is also considered.

Relevance:

100.00%

Abstract:

We show how multivariate GARCH models can be used to generate a time-varying “information share” (Hasbrouck, 1995) to represent the changing patterns of price discovery in closely related securities. We find that time-varying information shares can improve credit spread predictions.

Relevance:

100.00%

Abstract:

Aims and objectives: To assess the level of confidence that rheumatology patients would have in nurse prescribing, the effects on likely adherence and particular concerns that these patients have. In addition, given that information provision has been cited as a potential benefit of nurse prescribing, the present study assessed the extent to which these patients would want an explanation for the selected medicine, as well as which types of information should be included in such an explanation. Background: Nurse prescribing has been successfully implemented in the UK in several healthcare settings. Existing research has not addressed the effects on patients' confidence and likely adherence, nor have patients' information needs been established. However, we know that inadequate medicines information provision by health professionals is one of the largest causes of patient dissatisfaction. Methods: Fifty-four patients taking disease-modifying drugs for inflammatory joint disease attending a specialist rheumatology clinic self-completed a written questionnaire. Results: Patients indicated a relatively high level of confidence in nurse prescribing and stated that they would be very likely to take the selected medication. The level of concern was relatively low and the majority of concerns raised did not relate to the nurse's status. Strong support was expressed for the nurse providing an explanation for medicine choice. Conclusion: This research provides support for the prescription of medicines by nurses working in the area of rheumatology, the importance of nurses providing a full explanation about the selected medicines they prescribe for these patients and some indication as to which categories of information should be included. Relevance to clinical practice: Rheumatology patients who have not yet experienced nurse prescribing are, in general, positive about nurses adopting this role. It is important that nurses provide appropriate information about the prescribed medicines, in a form that can be understood.

Relevance:

100.00%

Abstract:

Imagery registration is a fundamental step that greatly affects later processes in image mosaicking, multi-spectral image fusion, digital surface modelling, etc., where the final solution needs to blend pixel information from more than one image. It is highly desirable to find a way to identify registration regions among input stereo image pairs with high accuracy, particularly in remote sensing applications in which ground control points (GCPs) are not always available, such as when selecting a landing zone on another planet. In this paper, a framework for localization in image registration is developed. It strengthens local registration accuracy in two respects: lower reprojection error and better feature point distribution. Affine scale-invariant feature transform (ASIFT) was used for acquiring feature points and correspondences on the input images. Then, a homography matrix was estimated as the transformation model by an improved random sample consensus (IM-RANSAC) algorithm. In order to identify a registration region with a better spatial distribution of feature points, the Euclidean distance between the feature points is applied (named the S criterion). Finally, the parameters of the homography matrix were optimized by the Levenberg–Marquardt (LM) algorithm with selected feature points from the chosen registration region. In the experiment section, Chang’E-2 satellite remote sensing imagery was used for evaluating the performance of the proposed method. The experimental results demonstrate that the proposed method can automatically locate a specific region with high registration accuracy between input images by achieving lower root mean square error (RMSE) and better distribution of feature points.
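As a minimal sketch of the RMSE measure mentioned in this abstract, the reprojection error of a homography over matched feature points could be computed as below. The function names, the example matrix and the point pairs are hypothetical; ASIFT matching, IM-RANSAC and the LM refinement are not reproduced here.

```python
import math


def apply_homography(H, point):
    """Map a 2-D point through a 3x3 homography given as nested lists."""
    x, y = point
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (
        (H[0][0] * x + H[0][1] * y + H[0][2]) / w,
        (H[1][0] * x + H[1][1] * y + H[1][2]) / w,
    )


def reprojection_rmse(H, src_points, dst_points):
    """Root mean square distance between projected and matched points."""
    squared_errors = []
    for src, dst in zip(src_points, dst_points):
        px, py = apply_homography(H, src)
        squared_errors.append((px - dst[0]) ** 2 + (py - dst[1]) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical example: a pure translation by (2, 3).
H_shift = [[1.0, 0.0, 2.0], [0.0, 1.0, 3.0], [0.0, 0.0, 1.0]]
src = [(0.0, 0.0), (1.0, 1.0)]
exact_rmse = reprojection_rmse(H_shift, src, [(2.0, 3.0), (3.0, 4.0)])
noisy_rmse = reprojection_rmse(H_shift, src, [(2.0, 3.0), (3.0, 5.0)])
```

A lower value indicates that the estimated transformation explains the correspondences in the chosen region more closely.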

Relevance:

100.00%

Abstract:

In this paper, we introduce a novel high-level visual content descriptor which is devised for performing semantic-based image classification and retrieval. The work can be treated as an attempt to bridge the so-called “semantic gap”. The proposed image feature vector model is fundamentally underpinned by the image labelling framework, called Collaterally Confirmed Labelling (CCL), which incorporates the collateral knowledge extracted from the collateral texts of the images with the state-of-the-art low-level image processing and visual feature extraction techniques for automatically assigning linguistic keywords to image regions. Two different high-level image feature vector models are developed based on the CCL labelling results for the purposes of image data clustering and retrieval respectively. A subset of the Corel image collection has been used for evaluating our proposed method. The experimental results to date already indicate that our proposed semantic-based visual content descriptors outperform both traditional visual and textual image feature models.

Relevance:

100.00%

Abstract:

This paper presents a new image data fusion scheme that combines median filtering with self-organizing feature map (SOFM) neural networks. The scheme consists of three steps: (1) pre-processing of the images, where weighted median filtering removes part of the noise components corrupting the image, (2) pixel clustering for each image using self-organizing feature map neural networks, and (3) fusion of the images obtained in Step (2), which suppresses the residual noise components and thus further improves the image quality. The three-step combination proves effective and improves performance, as confirmed by simulations involving three image sensors (each of which has a different noise structure).
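Steps (1) and (3) of this scheme can be sketched on one-dimensional signals under strong simplifications. The SOFM clustering of step (2) is omitted entirely, and the fusion rule used here (a pixel-wise median across sensors) is an assumption, not the rule from the paper. All names, weights and sample values are hypothetical.

```python
from statistics import median


def weighted_median_filter(signal, weights):
    """1-D weighted median filter.

    `weights` is an odd-length list of integer repeat counts whose sum is
    odd; each neighbour is repeated that many times before the median is
    taken. Borders are handled by clamping the index.
    """
    half = len(weights) // 2
    filtered = []
    for i in range(len(signal)):
        window = []
        for k, weight in enumerate(weights):
            j = min(max(i + k - half, 0), len(signal) - 1)
            window.extend([signal[j]] * weight)
        filtered.append(median(window))
    return filtered


def fuse(signals):
    """Fuse equally sized signals by taking the pixel-wise median."""
    return [median(pixels) for pixels in zip(*signals)]


# Three hypothetical sensors observing a flat scene of value 10,
# each corrupted by a different impulse.
sensor_a = [10, 10, 200, 10, 10]
sensor_b = [10, 0, 10, 10, 10]
sensor_c = [10, 10, 10, 10, 255]

weights = [2, 3, 2]
filtered = [weighted_median_filter(s, weights) for s in (sensor_a, sensor_b, sensor_c)]
fused = fuse(filtered)
```

In this toy case the border impulse in `sensor_c` survives filtering because of index clamping, and the fusion step removes it, which mirrors the role of residual noise suppression described in step (3).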

Relevance:

100.00%

Abstract:

The objective of this article is to study the problem of pedestrian classification across different light spectrum domains (visible and far-infrared (FIR)) and modalities (intensity, depth and motion). In recent years, there have been a number of approaches for classifying and detecting pedestrians in both FIR and visible images, but the methods are difficult to compare, because either the datasets are not publicly available or they do not offer a comparison between the two domains. Our two primary contributions are the following: (1) we propose a public dataset, named RIFIR, containing both FIR and visible images collected in an urban environment from a moving vehicle during daytime; and (2) we compare the state-of-the-art features in a multi-modality setup: intensity, depth and flow, in the far-infrared versus visible domains. The experiments show that the feature families, intensity self-similarity (ISS), local binary patterns (LBP), local gradient patterns (LGP) and histogram of oriented gradients (HOG), computed from FIR and visible domains are highly complementary, but their relative performance varies across different modalities. In our experiments, the FIR domain has proven superior to the visible one for the task of pedestrian classification, but the overall best results are obtained by a multi-domain multi-modality multi-feature fusion.

Relevance:

100.00%

Abstract:

Policy makers have identified the relationship between entrepreneurship and economic development. Yet, little is known about how this relationship varies over time in cities with different market sizes. This study examines the link between entrepreneurship and economic development using a panel of 127 European cities between 1994 and 2009. We found that the immediate economic development impact of new firm start-ups is positive for both small-/medium-size cities and large cities. The relationship is U-shaped for large cities, with the indirect effect taking 7 years, but the indirect effect does not occur in small-/medium-size cities. We offer useful information for policy makers, practitioners, and scholars.

Relevance:

100.00%

Abstract:

Why some organisms become invasive when introduced into novel regions while others fail to even establish is a fundamental question in ecology. Barriers to success are expected to filter species at each stage along the invasion pathway. No study to date, however, has investigated how species traits associate with success from introduction to spread at a large spatial scale in any group. Using the largest data set of mammalian introductions at the global scale and recently developed phylogenetic comparative methods, we show that human-mediated introductions considerably bias which species have the opportunity to become invasive, as highly productive mammals with longer reproductive lifespans are far more likely to be introduced. Subsequently, greater reproductive output and higher introduction effort are associated with success at both the establishment and spread stages. High productivity thus supports population growth and invasion success, with barriers at each invasion stage filtering species with progressively greater fecundity.

Relevance:

50.00%

Abstract:

Frequent pattern discovery in structured data is receiving increasing attention in many scientific application areas. However, the computational complexity and the large amount of data to be explored often make sequential algorithms unsuitable. In this context, high-performance distributed computing becomes a very interesting and promising approach. In this paper we present a parallel formulation of the frequent subgraph mining problem to discover interesting patterns in molecular compounds. The application is characterized by a highly irregular tree-structured computation. No estimation is available for task workloads, which show a power-law distribution over a wide range. The proposed approach allows dynamic resource aggregation and provides fault and latency tolerance. These features make the distributed application suitable for multi-domain heterogeneous environments, such as computational Grids. The distributed application has been evaluated on the well-known National Cancer Institute’s HIV-screening dataset.

Relevance:

50.00%

Abstract:

The study of motor unit action potential (MUAP) activity from electromyographic signals is an important stage in neurological investigations that aim to understand the state of the neuromuscular system. In this context, the identification and clustering of MUAPs that exhibit common characteristics, and the assessment of which data features are most relevant for the definition of such cluster structure, are central issues. In this paper, we propose the application of an unsupervised Feature Relevance Determination (FRD) method to the analysis of experimental MUAPs obtained from healthy human subjects. In contrast to approaches that require a priori information about the data, this FRD method is embedded in a constrained mixture model, known as Generative Topographic Mapping, which simultaneously performs clustering and visualization of MUAPs. The experimental results of the analysis of a data set consisting of MUAPs measured from the surface of the First Dorsal Interosseous, a hand muscle, indicate that the MUAP features corresponding to the hyperpolarization period in the physiological process of generation of muscle fibre action potentials are consistently estimated as the most relevant and, therefore, as those that should be given preferential attention in the interpretation of the MUAP groupings.

Relevance:

50.00%

Abstract:

This paper describes a proposed new approach to knowledge processing in the application domain of Computer Network Security Intrusion Detection Systems (NIDS). The approach focuses on a topic map technology-enabled representation of the features of the threat pattern space, as well as knowledge of the situated efficacy of alternative candidate algorithms for pattern recognition within the NIDS domain. An integrative knowledge representation framework for virtualisation, data intelligence and learning loop architecting in the NIDS domain is thus described, together with specific aspects of its deployment.

Relevance:

50.00%

Abstract:

Aircraft Maintenance, Repair and Overhaul (MRO) feedback commonly includes an engineer’s complex text-based inspection report. Capturing and normalizing the content of these textual descriptions is vital to cost and quality benchmarking, and provides information to facilitate continuous improvement of MRO processes and analytics. As data analysis and mining tools require highly normalized data, raw textual data is inadequate. This paper offers a text-mining solution to efficiently analyse bulk textual feedback data. Despite replacement of the same parts and/or sub-parts, the actual service cost for the same repair is often distinctly different from that of similar previous jobs. Regular expression algorithms were combined with an aircraft MRO glossary dictionary in order to help provide additional information concerning the reason for cost variation. Professional terms and conventions were included within the dictionary to avoid ambiguity and improve the results. Testing results show that most descriptive inspection reports can be appropriately interpreted, allowing extraction of highly normalized data. This additional normalized data strongly supports data analysis and data mining, whilst also increasing the accuracy of future quotation costing. This solution has been effectively used by a large aircraft MRO agency with positive results.
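The combination of regular expressions with a glossary dictionary might look roughly like the sketch below. The glossary entries, the sample report and the function name `normalize_report` are invented for illustration; the actual dictionary and rules used in the paper are not given in the abstract.

```python
import re

# Hypothetical glossary: shorthand or variant spelling -> normalized term.
GLOSSARY = {
    "corr": "corrosion",
    "corrosion": "corrosion",
    "crk": "crack",
    "crack": "crack",
    "cracked": "crack",
    "worn": "wear",
    "wear": "wear",
}

# One alternation over all glossary keys, longest first, matched as whole
# words and without regard to case.
_KEYS = sorted(GLOSSARY, key=len, reverse=True)
PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(key) for key in _KEYS) + r")\b",
    re.IGNORECASE,
)


def normalize_report(text):
    """Return normalized defect terms in order of first appearance."""
    found = []
    for match in PATTERN.finditer(text):
        term = GLOSSARY[match.group(1).lower()]
        if term not in found:
            found.append(term)
    return found


# Hypothetical free-text inspection note.
report = "Blade tip CRK found; heavy corr. on housing, seal worn."
terms = normalize_report(report)
```

Mapping variants to a single term is what makes the extracted data uniform enough for later cost comparison across similar jobs.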