874 results for correlation-based feature selection


Relevance:

100.00% 100.00%

Publisher:

Abstract:

In this thesis we present an overview of sparse approximations of grey level images. The sparse representations are realized by classic, Matching Pursuit (MP) based, greedy selection strategies. One such technique, termed Orthogonal Matching Pursuit (OMP), is shown to be suitable for producing sparse approximations of images if they are processed in small blocks. When the blocks are enlarged, the proposed Self Projected Matching Pursuit (SPMP) algorithm successfully renders results equivalent to OMP. A simple coding algorithm is then proposed to store these sparse approximations. This is shown, under certain conditions, to be competitive with the JPEG2000 image compression standard. An application termed image folding, which partially secures the approximated images, is then proposed. This is extended to produce a self-contained folded image, containing all the information required to perform image recovery. Finally, a modified OMP selection technique is applied to produce sparse approximations of Red Green Blue (RGB) images. These RGB approximations are then folded with the self-contained approach.
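The greedy selection that OMP performs can be sketched as follows. This is a minimal illustration of the textbook algorithm, assuming NumPy and a dictionary whose columns are unit-norm atoms; the function name `omp` and its parameters are hypothetical, and the thesis's block processing, SPMP variant, and coding scheme are not reproduced here.

```python
import numpy as np

def omp(dictionary, signal, n_atoms, tol=1e-10):
    """Textbook Orthogonal Matching Pursuit (illustrative sketch).

    dictionary: 2-D array whose columns are assumed to be unit-norm atoms.
    signal:     1-D array to approximate.
    n_atoms:    maximum number of atoms to select.
    """
    signal = np.asarray(signal, dtype=float)
    residual = signal.copy()
    support = []          # indices of the atoms chosen so far
    coeffs = np.zeros(0)  # coefficients for the chosen atoms
    for _ in range(n_atoms):
        if np.linalg.norm(residual) <= tol:
            break
        # Greedy step: pick the atom most correlated with the residual.
        correlations = np.abs(dictionary.T @ residual)
        if support:
            correlations[support] = 0.0  # never pick the same atom twice
        support.append(int(np.argmax(correlations)))
        # Orthogonal step: refit all chosen atoms by least squares.
        coeffs, *_ = np.linalg.lstsq(dictionary[:, support], signal, rcond=None)
        residual = signal - dictionary[:, support] @ coeffs
    return support, coeffs
```

Calling `omp(D, x, k)` returns the selected atom indices and their coefficients, so the sparse approximation is `D[:, support] @ coeffs`. The least-squares refit at every iteration is what distinguishes OMP from plain Matching Pursuit.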

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The evaluation of geospatial data quality and trustworthiness presents a major challenge to geospatial data users when making a dataset selection decision. The research presented here therefore focused on defining and developing a GEO label – a decision support mechanism to assist data users in efficient and effective geospatial dataset selection on the basis of quality, trustworthiness and fitness for use. This thesis thus presents six phases of research and development conducted to: (1) identify the informational aspects upon which users rely when assessing geospatial dataset quality and trustworthiness; (2) elicit initial user views on the GEO label role in supporting dataset comparison and selection; (3) evaluate prototype label visualisations; (4) develop a Web service to support GEO label generation; (5) develop a prototype GEO label-based dataset discovery and intercomparison decision support tool; and (6) evaluate the prototype tool in a controlled human-subject study. The results of the studies revealed, and subsequently confirmed, eight geospatial data informational aspects that were considered important by users when evaluating geospatial dataset quality and trustworthiness, namely: producer information, producer comments, lineage information, compliance with standards, quantitative quality information, user feedback, expert reviews, and citations information. Following an iterative user-centred design (UCD) approach, it was established that the GEO label should visually summarise availability and allow interrogation of these key informational aspects. A Web service was developed to support generation of dynamic GEO label representations and integrated into a number of real-world GIS applications. The service was also utilised in the development of the GEO LINC tool – a GEO label-based dataset discovery and intercomparison decision support tool. The results of the final evaluation study indicated that (a) the GEO label effectively communicates the availability of dataset quality and trustworthiness information and (b) GEO LINC successfully facilitates ‘at a glance’ dataset intercomparison and fitness for purpose-based dataset selection.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Feature selection is important in the medical field for many reasons. However, selecting important variables is a difficult task in the presence of censoring, which is a unique feature of survival data analysis. This paper proposes an approach to deal with the censoring problem in endovascular aortic repair survival data through Bayesian networks. It was merged and embedded with a hybrid feature selection process that combines Cox's univariate analysis with machine learning approaches such as ensemble artificial neural networks to select the most relevant predictive variables. The proposed algorithm was compared with common survival variable selection approaches such as the least absolute shrinkage and selection operator (LASSO) and Akaike information criterion (AIC) methods. The results showed that it was capable of dealing with high censoring in the datasets. Moreover, ensemble classifiers increased the area under the ROC curves of the two datasets, collected separately from two centers located in the United Kingdom. Furthermore, ensembles constructed with center 1 enhanced the concordance index of center 2 prediction compared to the model built with a single network. Although the size of the final reduced model using the neural networks and their ensembles is greater than that of other methods, the model outperformed the others in both concordance index and sensitivity for center 2 prediction. This indicates the reduced model is more powerful for cross-center prediction.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

A number of studies in the areas of Biomedical Engineering and Health Sciences have employed machine learning tools to develop methods capable of identifying patterns in different sets of data. Despite its elimination in many countries of the developed world, Hansen’s disease still affects a large part of the population in countries such as India and Brazil. In this context, this research proposes to develop a method that makes it possible to understand in the future how Hansen’s disease affects facial muscles. Using surface electromyography, a system was adapted to capture the signals from the largest possible number of facial muscles. We first reviewed the literature to learn how researchers around the globe have been working with diseases that affect the peripheral nervous system and how electromyography has contributed to the understanding of these diseases. From these data, a protocol was proposed to collect facial surface electromyographic (sEMG) signals with a high signal-to-noise ratio. After collecting the signals, we looked for a method that would enable the visualization of this information, so as to verify that the method used presented satisfactory results. After identifying the method's efficiency, we tried to understand which information could be extracted from the electromyographic signal representing the collected data. Since studies demonstrating which information could contribute to a better understanding of this pathology were not found in the literature, parameters of amplitude, frequency and entropy were extracted from the signal, and feature selection was performed to look for the features that best distinguish a healthy individual from a pathological one. Afterwards, we tried to identify the classifier that best discriminates individuals from different groups, and also the set of parameters of this classifier that would bring the best outcome. The protocol proposed in this study and the adaptation with commercially available disposable electrodes proved effective and capable of being used in other studies that aim to collect data from facial electromyography. The feature selection algorithm also showed that not all of the features extracted from the signal are significant for data classification, with some more relevant than others. The Support Vector Machine (SVM) classifier proved efficient when an adequate kernel function was used for the muscle from which information was to be extracted. Each investigated muscle presented different results when the classifier used linear, radial and polynomial kernel functions. Even though we have focused on Hansen’s disease, the method applied here can be used to study facial electromyography in other pathologies.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Rhythm analysis of written texts focuses on literary analysis and mainly considers poetry. In this paper we investigate the relevance of rhythmic features for categorizing texts in prosaic form pertaining to different genres. Our contribution is threefold. First, we define a set of rhythmic features for written texts. Second, we extract these features from three corpora, of speeches, essays, and newspaper articles. Third, we perform feature selection by means of statistical analyses, and determine a subset of features which efficiently discriminates between the three genres. We find that, using as few as eight rhythmic features, documents can be adequately assigned to a given genre with an accuracy of around 80 %, significantly higher than the 33 % baseline which results from random assignment.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Recent paradigms in wireless communication architectures describe environments where nodes present a highly dynamic behavior (e.g., User Centric Networks). In such environments, routing is still performed based on the regular packet-switched behavior of store-and-forward. Albeit sufficient to compute at least an adequate path between a source and a destination, such routing behavior cannot adequately sustain the highly nomadic lifestyle that Internet users are experiencing today. This thesis aims to analyse the impact of node mobility on routing scenarios. It also aims at the development of forwarding concepts that help in message forwarding across graphs where nodes exhibit human mobility patterns, as is the case in most user-centric wireless networks today. The first part of the work involved the analysis of the mobility impact on routing, and we found that node mobility can significantly affect routing performance, with the effect depending on the link length, distance, and mobility patterns of nodes. The study of current mobility parameters showed that they capture mobility only partially. The robustness of a routing protocol to node mobility depends on the sensitivity of its routing metric to node mobility. As such, mobility-aware routing metrics were devised to increase routing robustness to node mobility. The proposed routing metrics fall into two categories: time-based and spatial correlation-based. For the validation of the metrics, several mobility models were used, including ones that mimic human mobility patterns. The metrics were implemented in the Network Simulator tool with two widely used multi-hop routing protocols, Optimized Link State Routing (OLSR) and Ad hoc On Demand Distance Vector (AODV). Using the proposed metrics, we reduced the path re-computation frequency compared to the benchmark metric. This means that more stable nodes were used to route data. The time-based routing metrics generally performed well across the different node mobility scenarios used. We also noted a variation in the performance of the metrics, including the benchmark metric, under different mobility models, due to the differences in the rules governing node mobility in the models.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This dissertation presents the design of three high-performance successive-approximation-register (SAR) analog-to-digital converters (ADCs) using distinct digital background calibration techniques under the framework of a generalized code-domain linear equalizer. These digital calibration techniques effectively and efficiently remove the static mismatch errors in the analog-to-digital (A/D) conversion. They enable aggressive scaling of the capacitive digital-to-analog converter (DAC), which also serves as the sampling capacitor, to the kT/C limit. As a result, outstanding conversion linearity, high signal-to-noise ratio (SNR), high conversion speed, robustness, superb energy efficiency, and minimal chip area are accomplished simultaneously. The first design is a 12-bit 22.5/45-MS/s SAR ADC in a 0.13-μm CMOS process. It employs a perturbation-based calibration based on the superposition property of linear systems to digitally correct the capacitor mismatch error in the weighted DAC. With 3.0-mW power dissipation at a 1.2-V power supply and a 22.5-MS/s sample rate, it achieves a 71.1-dB signal-to-noise-plus-distortion ratio (SNDR) and a 94.6-dB spurious-free dynamic range (SFDR). At Nyquist frequency, the conversion figure of merit (FoM) is 50.8 fJ/conversion step, the best FoM to date (2010) for 12-bit ADCs. The SAR ADC core occupies 0.06 mm², while the estimated area of the calibration circuits is 0.03 mm². The second proposed digital calibration technique is a bit-wise-correlation-based digital calibration. It utilizes the statistical independence of an injected pseudo-random signal and the input signal to correct the DAC mismatch in SAR ADCs. This idea is experimentally verified in a 12-bit 37-MS/s SAR ADC fabricated in 65-nm CMOS, implemented by Pingli Huang. This prototype chip achieves a 70.23-dB peak SNDR and an 81.02-dB peak SFDR, while occupying 0.12-mm² silicon area and dissipating 9.14 mW from a 1.2-V supply with the synthesized digital calibration circuits included. The third work is an 8-bit, 600-MS/s, 10-way time-interleaved SAR ADC array fabricated in a 0.13-μm CMOS process. This work employs an adaptive digital equalization approach to calibrate both intra-channel nonlinearities and inter-channel mismatch errors. The prototype chip achieves 47.4-dB SNDR, 63.6-dB SFDR, less than 0.30-LSB differential nonlinearity (DNL), and less than 0.23-LSB integral nonlinearity (INL). The ADC array occupies an active area of 1.35 mm² and dissipates 30.3 mW, including synthesized digital calibration circuits and an on-chip dual-loop delay-locked loop (DLL) for clock generation and synchronization.
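A figure of merit in fJ/conversion step is commonly computed with the Walden formula, FoM = P / (2^ENOB · f_s), where ENOB = (SNDR − 1.76) / 6.02. The abstract does not name the formula it uses, so the sketch below assumes the Walden form; the function names are hypothetical. It plugs in the first design's stated power, sample rate, and 71.1-dB SNDR.

```python
def enob_from_sndr(sndr_db):
    """Effective number of bits from SNDR in dB (standard sine-wave relation)."""
    return (sndr_db - 1.76) / 6.02

def walden_fom(power_w, sndr_db, sample_rate_hz):
    """Walden figure of merit in joules per conversion step.

    Assumption: this is the FoM definition behind the abstract's figure;
    the abstract itself does not say which definition was used.
    """
    return power_w / (2 ** enob_from_sndr(sndr_db) * sample_rate_hz)

# First design: 3.0 mW, 71.1 dB SNDR, 22.5 MS/s (values from the abstract).
fom = walden_fom(3.0e-3, 71.1, 22.5e6)
print(f"{fom * 1e15:.1f} fJ/conversion-step")
```

With the 71.1-dB SNDR this comes out somewhat below the 50.8 fJ/conversion step quoted at Nyquist. That would be consistent with a lower SNDR at the Nyquist input frequency, but the abstract does not give that value, so the comparison is only indicative.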

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Malnutrition, as a global problem, is mainly caused by low levels of mineral elements in staple food (deficient soil). Biofortification is based on the selection of genotypes with an enhanced concentration of mineral elements in grain, a decreased concentration of substances which interfere with the bioavailability of mineral elements in the gut (like phytic acid), and an increased content of substances that increase availability (such as β-carotene). An experiment with 51 maize (Zea mays L.) inbred lines with different heterotic backgrounds was set up in order to evaluate the chemical composition of grain and to determine the relations between phytic acid (PA), β-carotene, and the mineral elements Mg, Fe, Mn, and Zn. The highest average phytate, β-carotene, Fe, and Mn content was found in grain of inbreds from the Lancaster heterotic group. The highest content of Mg was in grain of the Independent source and of Zn in grain of the BSSS group. The increased level of Fe and Mn in Lancaster lines could be partially affected by higher PA content in grain, while increased β-carotene content could improve Mn and Zn availability from grain of BSSS genotypes and Mg availability from Lancaster inbreds. It is important to underline that PA reduction is accompanied by an increase in Zn content in grain of the Lancaster heterotic group, and that variations in Mg, Fe, and Mn contents are independent of PA status in inbreds from the Independent source, indicating that the genotypes with higher Mg, Fe and Mn status from this group could serve as a favorable source for improved Mg, Fe, and Mn absorption.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Mechanistic models used for prediction should be parsimonious, as models which are over-parameterised may have poor predictive performance. Determining whether a model is parsimonious requires comparisons with alternative model formulations with differing levels of complexity. However, creating alternative formulations for large mechanistic models is often problematic, and usually time-consuming. Consequently, few are ever investigated. In this paper, we present an approach which rapidly generates reduced model formulations by replacing a model’s variables with constants. These reduced alternatives can be compared to the original model, using data-based model selection criteria, to assist in the identification of potentially unnecessary model complexity, and thereby inform reformulation of the model. To illustrate the approach, we present its application to a published radiocaesium plant-uptake model, which predicts uptake on the basis of soil characteristics (e.g. pH, organic matter content, clay content). A total of 1024 reduced model formulations were generated, and ranked according to five model selection criteria: Residual Sum of Squares (RSS), AICc, BIC, MDL and ICOMP. The lowest scores for RSS and AICc occurred for the same reduced model, in which pH-dependent model components were replaced. The lowest scores for BIC, MDL and ICOMP occurred for a further reduced model, in which model components related to the distinction between adsorption on clay and organic surfaces were replaced. Both these reduced models had a lower RSS for the parameterisation dataset than the original model. As a test of their predictive performance, the original model and the two reduced models outlined above were used to predict an independent dataset. The reduced models have lower prediction sums of squares than the original model, suggesting that the latter may be overfitted. The approach presented has the potential to inform model development by rapidly creating a class of alternative model formulations, which can be compared.
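The idea of generating reduced models by fixing variables at constants and ranking them with selection criteria can be sketched as below. Everything here is an illustrative assumption: `full_model` is a toy stand-in rather than the radiocaesium model, variables are replaced by their sample means, the complexity count `k` is a simple proxy (retained variables plus one), and only RSS, AICc and BIC are computed under a Gaussian-error assumption. Note that 1024 equals 2^10, which would fit ten replaceable variables, though the abstract does not say so.

```python
from itertools import combinations
from math import log

def full_model(ph, clay):
    # Hypothetical stand-in model, NOT the published plant-uptake model.
    return 2.0 * ph + 0.5 * clay

def rank_reduced_models(rows, observed, names):
    """Enumerate every subset of variables to fix at its mean and rank by BIC.

    Requires len(observed) > k + 1 for every k so that AICc is defined.
    """
    n = len(observed)
    means = {name: sum(row[name] for row in rows) / n for name in names}
    ranked = []
    for size in range(len(names) + 1):
        for fixed in combinations(names, size):
            # Replace each fixed variable by a constant (its sample mean).
            preds = [
                full_model(**{nm: means[nm] if nm in fixed else row[nm]
                              for nm in names})
                for row in rows
            ]
            rss = sum((o - p) ** 2 for o, p in zip(observed, preds))
            k = len(names) - len(fixed) + 1  # assumed complexity proxy
            log_term = n * log(max(rss, 1e-300) / n)
            aicc = log_term + 2 * k + 2 * k * (k + 1) / (n - k - 1)
            bic = log_term + k * log(n)
            ranked.append({"fixed": fixed, "rss": rss, "aicc": aicc, "bic": bic})
    return sorted(ranked, key=lambda r: r["bic"])

# Toy data in which clay content never varies, so fixing it loses nothing.
rows = [{"ph": float(p), "clay": 2.0} for p in range(1, 7)]
observed = [3.1, 4.9, 7.1, 8.9, 11.1, 12.9]
ranked = rank_reduced_models(rows, observed, ["ph", "clay"])
```

With two candidate variables the sketch produces 2² = 4 formulations, including the original model (nothing fixed). In the toy data the formulation that fixes `clay` fits as well as the full model while counting as simpler, which is the kind of unnecessary complexity the approach is meant to expose.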

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Perspective taking is a crucial ability that guides our social interactions. In this study, we show how the specific patterns of errors of brain-damaged patients in perspective taking tasks can help us further understand the factors contributing to perspective taking abilities. Previous work (e.g., Samson, Apperly, Chiavarino, & Humphreys, 2004; Samson, Apperly, Kathirgamanathan, & Humphreys, 2005) distinguished two components of perspective taking: the ability to inhibit our own perspective and the ability to infer someone else’s perspective. We assessed these components using a new nonverbal false belief task which provided different response options to detect three types of response strategies that participants might be using: a complete and spared belief reasoning strategy, a reality-based response selection strategy in which participants respond from their own perspective, and a simplified mentalising strategy in which participants avoid responding from their own perspective but rely on inaccurate cues to infer the other person’s belief. One patient, with a self-perspective inhibition deficit, almost always used the reality-based response strategy; in contrast, the other patient, with a deficit in taking other perspectives, tended to use the simplified mentalising strategy without necessarily transposing her own perspective. We discuss the extent to which the pattern of performance of both patients could relate to their executive function deficit and how it can inform us on the cognitive and neural components involved in belief reasoning.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In this work we focus on pattern recognition methods related to EMG upper-limb prosthetic control. After giving a detailed review of the most widely used classification methods, we propose a new classification approach. It arises from a comparison, in the Fourier analysis, between able-bodied and trans-radial amputee subjects. We thus suggest a different classification method which considers each surface electrode's contribution separately, together with five time-domain features, obtaining an average classification accuracy equal to 75% on a sample of trans-radial amputees. We propose an automatic feature selection procedure, formulated as a minimization problem, in order to improve the method and its robustness.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This research investigates the use of Artificial Intelligence (AI) systems for profiling and decision-making, and the consequences that it poses for the rights and freedoms of individuals. In particular, the research considers that automated decision-making systems (ADMs) are opaque, can be biased, and have a correlation-based logic. For these reasons, ADMs do not take decisions as human beings do. Against this background, the risks for the rights of individuals combined with the demand for transparency of algorithms have created a debate on the need for a new 'right to explanation'. Assuming that, except in cases provided for by law, a decision made by a human does not give rise to a right to explanation, the question has been raised as to whether – if the decision is made by an algorithm – it is necessary to configure a right to explanation for the decision-subject. Therefore, the research addresses a right to explanation of automated decision-making, examining the relation between today’s technology and legal concepts of explanation, reasoning, and transparency. In particular, it focuses on the existence and scope of the right to explanation, considering legal and technical issues surrounding the use of ADMs. The research analyses the use of AI and the problems arising from it from a legal perspective, studying the EU legal framework – especially in the data protection field. In this context, a part of the research is focused on transparency requirements under the GDPR (namely, Articles 13–15, 22, as well as Recital 71). The research aims to outline an interpretative framework of such a right and make recommendations about its development, aiming to provide guidelines for an adequate explanation of automated decisions. Hence, the thesis analyses what an explanation might consist of, and the benefits of explainable AI – examined from legal and technical perspectives.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

This paper studies feature subset selection in classification using a multiobjective estimation of distribution algorithm. We consider six functions, namely area under ROC curve, sensitivity, specificity, precision, F1 measure and Brier score, for evaluation of feature subsets and as the objectives of the problem. One of the characteristics of these objective functions is the existence of noise in their values that should be appropriately handled during optimization. Our proposed algorithm consists of two major techniques which are specially designed for the feature subset selection problem. The first one is a solution ranking method based on interval values to handle the noise in the objectives of this problem. The second one is a model estimation method for learning a joint probabilistic model of objectives and variables which is used to generate new solutions and advance through the search space. To simplify model estimation, l1 regularized regression is used to select a subset of problem variables before model learning. The proposed algorithm is compared with a well-known ranking method for interval-valued objectives and a standard multiobjective genetic algorithm. Particularly, the effects of the two new techniques are experimentally investigated. The experimental results show that the proposed algorithm is able to obtain comparable or better performance on the tested datasets.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Aircraft manufacturing industries are looking for solutions to increase their productivity. One solution is to apply metrology systems during the production and assembly processes. The Metrology Process Model (MPM) (Maropoulos et al., 2007) has been introduced, emphasising metrology applications in assembly planning, manufacturing processes and product design. Measurability analysis is part of the MPM, and its aim is to check the feasibility of measuring the designed large-scale components. Measurability analysis has been integrated in order to provide an efficient matching system. The metrology database is structured by developing the Metrology Classification Model. Furthermore, the feature-based selection model is also explained. By combining the two classification models, a novel approach and selection processes for an integrated measurability analysis system (MAS) are introduced; such an integrated MAS could provide much more meaningful matching results for operators. © Springer-Verlag Berlin Heidelberg 2010.