9 resultados para Data classification
em Universidad de Alicante
Resumo:
In this paper, we propose a novel filter for feature selection. Such filter relies on the estimation of the mutual information between features and classes. We bypass the estimation of the probability density function with the aid of the entropic-graphs approximation of Rényi entropy, and the subsequent approximation of the Shannon one. The complexity of such bypassing process does not depend on the number of dimensions but on the number of patterns/samples, and thus the curse of dimensionality is circumvented. We show that it is then possible to outperform a greedy algorithm based on the maximal relevance and minimal redundancy criterion. We successfully test our method both in the contexts of image classification and microarray data classification.
Resumo:
The construction industry is characterised by fragmentation and suffers from lack of collaboration, often adopting adversarial working practices to achieve deliverables. For the UK Government and construction industry, BIM is a game changer aiming to rectify this fragmentation and promote collaboration. However it has become clear that there is an essential need to have better controls and definitions of both data deliverables and data classification. Traditional methods and techniques for collating and inputting data have shown to be time consuming and provide little to improve or add value to the overall task of improving deliverables. Hence arose the need in the industry to develop a Digital Plan of Work (DPoW) toolkit that would aid the decision making process, providing the required control over the project workflows and data deliverables, and enabling better collaboration through transparency of need and delivery. The specification for the existing Digital Plan of Work (DPoW) was to be, an industry standard method of describing geometric, requirements and data deliveries at key stages of the project cycle, with the addition of a structured and standardised information classification system. However surveys and interviews conducted within this research indicate that the current DPoW resembles a digitised version of the pre-existing plans of work and does not push towards the data enriched decision-making abilities that advancements in technology now offer. A Digital Framework is not simply the digitisation of current or historic standard methods and procedures, it is a new intelligent driven digital system that uses new tools, processes, procedures and work flows to eradicate waste and increase efficiency. In addition to reporting on conducted surveys above, this research paper will present a theoretical investigation into usage of Intelligent Decision Support Systems within a digital plan of work framework. Furthermore this paper will present findings on the suitability to utilise advancements in intelligent decision-making system frameworks and Artificial Intelligence for a UK BIM Framework. This should form the foundations of decision-making for projects implemented at BIM level 2. The gap identified in this paper is that the current digital toolkit does not incorporate the intelligent characteristics available in other industries through advancements in technology and collation of vast amounts of data that a digital plan of work framework could have access to and begin to develop, learn and adapt for decision-making through the live interaction of project stakeholders.
Resumo:
In the current Information Age, data production and processing demands are ever increasing. This has motivated the appearance of large-scale distributed information. This phenomenon also applies to Pattern Recognition so that classic and common algorithms, such as the k-Nearest Neighbour, are unable to be used. To improve the efficiency of this classifier, Prototype Selection (PS) strategies can be used. Nevertheless, current PS algorithms were not designed to deal with distributed data, and their performance is therefore unknown under these conditions. This work is devoted to carrying out an experimental study on a simulated framework in which PS strategies can be compared under classical conditions as well as those expected in distributed scenarios. Our results report a general behaviour that is degraded as conditions approach to more realistic scenarios. However, our experiments also show that some methods are able to achieve a fairly similar performance to that of the non-distributed scenario. Thus, although there is a clear need for developing specific PS methodologies and algorithms for tackling these situations, those that reported a higher robustness against such conditions may be good candidates from which to start.
Resumo:
Background: The harmonization of European health systems brings with it a need for tools to allow the standardized collection of information about medical care. A common coding system and standards for the description of services are needed to allow local data to be incorporated into evidence-informed policy, and to permit equity and mobility to be assessed. The aim of this project has been to design such a classification and a related tool for the coding of services for Long Term Care (DESDE-LTC), based on the European Service Mapping Schedule (ESMS). Methods: The development of DESDE-LTC followed an iterative process using nominal groups in 6 European countries. 54 researchers and stakeholders in health and social services contributed to this process. In order to classify services, we use the minimal organization unit or “Basic Stable Input of Care” (BSIC), coded by its principal function or “Main Type of Care” (MTC). The evaluation of the tool included an analysis of feasibility, consistency, ontology, inter-rater reliability, Boolean Factor Analysis, and a preliminary impact analysis (screening, scoping and appraisal). Results: DESDE-LTC includes an alpha-numerical coding system, a glossary and an assessment instrument for mapping and counting LTC. It shows high feasibility, consistency, inter-rater reliability and face, content and construct validity. DESDE-LTC is ontologically consistent. It is regarded by experts as useful and relevant for evidence-informed decision making. Conclusion: DESDE-LTC contributes to establishing a common terminology, taxonomy and coding of LTC services in a European context, and a standard procedure for data collection and international comparison.
Resumo:
3D sensors provides valuable information for mobile robotic tasks like scene classification or object recognition, but these sensors often produce noisy data that makes impossible applying classical keypoint detection and feature extraction techniques. Therefore, noise removal and downsampling have become essential steps in 3D data processing. In this work, we propose the use of a 3D filtering and down-sampling technique based on a Growing Neural Gas (GNG) network. GNG method is able to deal with outliers presents in the input data. These features allows to represent 3D spaces, obtaining an induced Delaunay Triangulation of the input space. Experiments show how the state-of-the-art keypoint detectors improve their performance using GNG output representation as input data. Descriptors extracted on improved keypoints perform better matching in robotics applications as 3D scene registration.
Resumo:
A new classification of microtidal sand and gravel beaches with very different morphologies is presented below. In 557 studied transects, 14 variables were used. Among the variables to be emphasized is the depth of the Posidonia oceanica. The classification was performed for 9 types of beaches: Type 1: Sand and gravel beaches, Type 2: Sand and gravel separated beaches, Type 3: Gravel and sand beaches, Type 4: Gravel and sand separated beaches, Type 5: Pure gravel beaches, Type 6: Open sand beaches, Type 7: Supported sand beaches, Type 8: Bisupported sand beaches and Type 9: Enclosed beaches. For the classification, several tools were used: discriminant analysis, neural networks and Support Vector Machines (SVM), the results were then compared. As there is no theory for deciding which is the most convenient neural network architecture to deal with a particular data set, an experimental study was performed with different numbers of neuron in the hidden layer. Finally, an architecture with 30 neurons was chosen. Different kernels were employed for SVM (Linear, Polynomial, Radial basis function and Sigmoid). The results obtained for the discriminant analysis were not as good as those obtained for the other two methods (ANN and SVM) which showed similar success.
Resumo:
In this work we present a semantic framework suitable of being used as support tool for recommender systems. Our purpose is to use the semantic information provided by a set of integrated resources to enrich texts by conducting different NLP tasks: WSD, domain classification, semantic similarities and sentiment analysis. After obtaining the textual semantic enrichment we would be able to recommend similar content or even to rate texts according to different dimensions. First of all, we describe the main characteristics of the semantic integrated resources with an exhaustive evaluation. Next, we demonstrate the usefulness of our resource in different NLP tasks and campaigns. Moreover, we present a combination of different NLP approaches that provide enough knowledge for being used as support tool for recommender systems. Finally, we illustrate a case of study with information related to movies and TV series to demonstrate that our framework works properly.
Resumo:
Due to confidentiality considerations, the microdata available from the 2011 Spanish Census have been codified at a provincial (NUTS 3) level except when the municipal (LAU 2) population exceeds 20,000 inhabitants (a requirement that is met by less than 5% of all municipalities). For the remainder of the municipalities within a given province, information is only provided for their classification in wide population intervals. These limitations, hampering territorially-focused socio-economic analyses, and more specifically, those related to the labour market, are observed in many other countries. This article proposes and demonstrates an automatic procedure aimed at delineating a set of areas that meet such population requirements and that may be used to re-codify the geographic reference in these cases, thereby increasing the territorial detail at which individual information is available. The method aggregates municipalities into clusters based on the optimisation of a relevant objective function subject to a number of statistical constraints, and is implemented using evolutionary computation techniques. Clusters are defined to fit outer boundaries at the level of labour market areas.
Resumo:
This article describes the Robot Vision challenge, a competition that evaluates solutions for the visual place classification problem. Since its origin, this challenge has been proposed as a common benchmark where worldwide proposals are measured using a common overall score. Each new edition of the competition introduced novelties, both for the type of input data and subobjectives of the challenge. All the techniques used by the participants have been gathered up and published to make it accessible for future developments. The legacy of the Robot Vision challenge includes data sets, benchmarking techniques, and a wide experience in the place classification research that is reflected in this article.