30 results for experimental knowledge extraction
in CentAUR: Central Archive University of Reading - UK
Abstract:
This paper introduces a new neurofuzzy model construction and parameter estimation algorithm from observed finite data sets, based on a Takagi and Sugeno (T-S) inference mechanism and a new extended Gram-Schmidt orthogonal decomposition algorithm, for the modeling of a priori unknown dynamical systems in the form of a set of fuzzy rules. The first contribution of the paper is the introduction of a one-to-one mapping between a fuzzy rule-base and a model matrix feature subspace using the T-S inference mechanism. This link enables the numerical properties associated with a rule-based matrix subspace, the relationships amongst these matrix subspaces, and the correlation between the output vector and a rule-base matrix subspace to be investigated and extracted as rule-based knowledge to enhance model transparency. The matrix subspace spanned by a fuzzy rule is initially derived as the input regression matrix multiplied by a weighting matrix that consists of the corresponding fuzzy membership functions over the training data set. Model transparency is explored by the derivation of an equivalence between an A-optimality experimental design criterion of the weighting matrix and the average model output sensitivity to the fuzzy rule, so that rule-bases can be effectively measured by their identifiability via the A-optimality experimental design criterion. The A-optimality experimental design criterion of the weighting matrices of fuzzy rules is used to construct an initial model rule-base. An extended Gram-Schmidt algorithm is then developed to estimate the parameter vector for each rule. This new algorithm decomposes the model rule-bases via an orthogonal subspace decomposition approach, so as to enhance model transparency with the capability of interpreting the derived rule-base energy level.
This new approach is computationally simpler than the conventional Gram-Schmidt algorithm for resolving high-dimensional regression problems, in which it is computationally desirable to decompose complex models into a few submodels rather than a single model with a large number of input variables and the associated curse-of-dimensionality problem. Numerical examples are included to demonstrate the effectiveness of the proposed new algorithm.
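The A-optimality idea described above can be illustrated with a minimal sketch: each candidate fuzzy rule contributes a weighted regression matrix, and rules whose information matrix has a small trace-of-inverse (the A-optimality measure) are more identifiable. The input data, Gaussian membership functions and rule centres below are invented for illustration and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))  # illustrative training inputs

def gaussian_membership(x, centre, width=0.5):
    # membership of each sample in a rule centred at `centre` (assumed form)
    return np.exp(-np.sum((x - centre) ** 2, axis=1) / (2 * width ** 2))

def a_optimality_score(X, mu):
    # P = diag(mu) @ X is the rule's weighted regression matrix;
    # smaller trace((P^T P)^-1) means a better-identified rule
    P = mu[:, None] * X
    info = P.T @ P
    return float(np.trace(np.linalg.inv(info)))

centres = [np.array([-0.5, -0.5]), np.array([0.5, 0.5])]
scores = [a_optimality_score(X, gaussian_membership(X, c)) for c in centres]
print(scores)
```

Ranking candidate rules by this score is one way the initial rule-base selection step could be realised; the paper's actual construction procedure involves more machinery than this sketch.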
Abstract:
A new robust neurofuzzy model construction algorithm has been introduced for the modeling of a priori unknown dynamical systems from observed finite data sets in the form of a set of fuzzy rules. Based on a Takagi-Sugeno (T-S) inference mechanism, a one-to-one mapping between a fuzzy rule base and a model matrix feature subspace is established. This link enables rule-based knowledge to be extracted from the matrix subspace to enhance model transparency. In order to achieve maximized model robustness and sparsity, a new robust extended Gram-Schmidt (G-S) method has been introduced via two effective and complementary approaches of regularization and D-optimality experimental design. Model rule bases are decomposed into orthogonal subspaces, so as to enhance model transparency with the capability of interpreting the derived rule base energy level. A locally regularized orthogonal least squares algorithm, combined with a D-optimality criterion for subspace-based rule selection, has been extended for fuzzy rule regularization and subspace-based information extraction. By using a weighting for the D-optimality cost function, the entire model construction procedure becomes automatic. Numerical examples are included to demonstrate the effectiveness of the proposed new algorithm.
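The D-optimality criterion mentioned above can be sketched in a few lines: among candidate rule subspaces, prefer those whose information matrix P^T P has a larger log-determinant, optionally traded off against the least-squares fit by a weight. The candidate matrices, the toy target and the weighting below are assumptions for illustration, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(1)

def d_optimality(P):
    # log det(P^T P); -inf flags a rank-deficient (uninformative) subspace
    sign, logdet = np.linalg.slogdet(P.T @ P)
    return logdet if sign > 0 else float("-inf")

def weighted_cost(P, y, beta=1.0):
    # residual sum of squares minus a weighted D-optimality term, so
    # better-conditioned subspaces are favoured during selection
    theta, *_ = np.linalg.lstsq(P, y, rcond=None)
    resid = y - P @ theta
    return float(resid @ resid) - beta * d_optimality(P)

well_spread = rng.uniform(-1, 1, (100, 3))                      # informative design
collapsed = np.outer(rng.uniform(-1, 1, 100), [1.0, 1.0, 1.0])  # rank-1 design
y = well_spread @ np.array([0.5, -1.0, 2.0]) + rng.normal(0, 0.01, 100)

print(d_optimality(well_spread), d_optimality(collapsed))
print(weighted_cost(well_spread, y))
```

The weighting `beta` plays the role of the cost-function weight the abstract refers to; in the paper it is what makes the construction procedure automatic, whereas here it is simply a fixed illustrative parameter.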
Abstract:
Automatic indexing and retrieval of digital data poses major challenges. The main problem arises from the ever-increasing mass of digital media and the lack of efficient methods for indexing and retrieval of such data based on the semantic content rather than keywords. To enable intelligent web interactions, or even web filtering, we need to be capable of interpreting the information base in an intelligent manner. For a number of years research has been ongoing in the field of ontological engineering with the aim of using ontologies to add such (meta) knowledge to information. In this paper, we describe the architecture of a system, Dynamic REtrieval Analysis and semantic metadata Management (DREAM), designed to automatically and intelligently index huge repositories of special effects video clips, based on their semantic content, using a network of scalable ontologies to enable intelligent retrieval. The DREAM Demonstrator has been evaluated as deployed in the film post-production phase to support the process of storage, indexing and retrieval of large data sets of special effects video clips as an exemplar application domain. This paper provides its performance and usability results and highlights the scope for future enhancements of the DREAM architecture, which has proven successful in its first and possibly most challenging proving ground, namely film production, where it is already in routine use within our test bed Partners' creative processes. (C) 2009 Published by Elsevier B.V.
Abstract:
The conceptual and parameter uncertainty of the semi-distributed INCA-N (Integrated Nutrients in Catchments-Nitrogen) model was studied using the GLUE (Generalized Likelihood Uncertainty Estimation) methodology combined with quantitative experimental knowledge, a concept known as 'soft data'. Cumulative inorganic N leaching, annual plant N uptake and annual mineralization proved to be useful soft data to constrain the parameter space. The INCA-N model was able to simulate the seasonal and inter-annual variations in the stream-water nitrate concentrations, although the lowest concentrations during the growing season were not reproduced. This suggested that there were some retention processes or losses either in peatland/wetland areas or in the river which were not included in the INCA-N model. The results of the study suggested that soft data offer a way to reduce parameter equifinality, and that the calibration and testing of distributed hydrological and nutrient leaching models should be based both on runoff and/or nutrient concentration data and on the qualitative knowledge of experimentalists. (c) 2006 Elsevier B.V. All rights reserved.
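The GLUE procedure the abstract relies on can be sketched in miniature: Monte Carlo sample the parameter space, score each run against observations with an informal likelihood, keep the "behavioural" runs, and then apply a soft-data constraint to shrink the equifinal set further. The toy model, parameter range and soft-data interval below are assumptions standing in for INCA-N, not values from the study.

```python
import numpy as np

rng = np.random.default_rng(42)
t = np.arange(20)

def toy_model(param):
    # stand-in for the catchment model: exponential leaching response
    return param * np.exp(-0.1 * t)

observed = toy_model(2.0) + rng.normal(0, 0.05, t.size)  # synthetic data

# GLUE step 1: Monte Carlo sample the parameter space
samples = rng.uniform(0.5, 4.0, 1000)

# GLUE step 2: informal likelihood (inverse sum of squared errors)
likelihood = np.array([1.0 / np.sum((toy_model(p) - observed) ** 2)
                       for p in samples])

# GLUE step 3: behavioural set = runs in the top 10% of the likelihood
behavioural = samples[likelihood > np.quantile(likelihood, 0.9)]

# soft-data constraint: retain only parameters inside an experimentally
# plausible interval (assumed here), reducing equifinality
constrained = behavioural[(behavioural > 1.5) & (behavioural < 2.5)]
print(behavioural.size, constrained.size)
```

In the study the soft data were quantities such as cumulative inorganic N leaching and annual mineralization rather than a simple parameter interval, but the filtering role is the same.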
Abstract:
Within generative L2 acquisition research there is a longstanding debate as to what underlies observable differences in L1/L2 knowledge/performance. On the one hand, Full Accessibility approaches maintain that target L2 syntactic representations (new functional categories and features) are acquirable (e.g., Schwartz & Sprouse, 1996). Conversely, Partial Accessibility approaches claim that L2 variability and/or optionality, even at advanced levels, obtains as a result of inevitable deficits in L2 narrow syntax and is conditioned upon a maturational failure in adulthood to acquire (some) new functional features (e.g., Beck, 1998; Hawkins & Chan, 1997; Hawkins & Hattori, 2006; Tsimpli & Dimitrakopoulou, 2007). The present study tests the predictions of these two sets of approaches with advanced English learners of L2 Brazilian Portuguese (n = 21) in the domain of inflected infinitives. These advanced L2 learners reliably differentiate syntactically between finite verbs, uninflected and inflected infinitives, which, as argued, only supports Full Accessibility approaches. Moreover, we will discuss how testing the domain of inflected infinitives is especially interesting in light of recent proposals that Brazilian Portuguese colloquial dialects no longer actively instantiate them (Lightfoot, 1991; Pires, 2002, 2006; Pires & Rothman, 2009; Rothman, 2007).
Abstract:
In this paper, we introduce a novel high-level visual content descriptor which is devised for performing semantic-based image classification and retrieval. The work can be treated as an attempt to bridge the so-called "semantic gap". The proposed image feature vector model is fundamentally underpinned by the image labelling framework, called Collaterally Confirmed Labelling (CCL), which incorporates the collateral knowledge extracted from the collateral texts of the images with the state-of-the-art low-level image processing and visual feature extraction techniques for automatically assigning linguistic keywords to image regions. Two different high-level image feature vector models are developed based on the CCL labelling results for the purposes of image data clustering and retrieval, respectively. A subset of the Corel image collection has been used for evaluating our proposed method. The experimental results to date already indicate that our proposed semantic-based visual content descriptors outperform both traditional visual and textual image feature models.
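Once image regions carry linguistic keywords, one simple way a high-level feature vector of the kind described could be used is to summarise each image as a normalised keyword-frequency vector and rank images by cosine similarity to a query. This is a generic sketch under that assumption; the vocabulary and labellings are invented and the paper's actual CCL-based descriptors are more elaborate.

```python
import numpy as np

vocab = ["sky", "water", "sand", "tree"]  # illustrative keyword vocabulary

def keyword_vector(region_labels):
    # unit-normalised keyword-frequency vector over the fixed vocabulary
    v = np.array([region_labels.count(w) for w in vocab], dtype=float)
    n = np.linalg.norm(v)
    return v / n if n else v

def cosine(a, b):
    # vectors are already unit-normalised, so the dot product suffices
    return float(a @ b)

beach = keyword_vector(["sky", "sand", "water", "sand"])   # invented labelling
forest = keyword_vector(["tree", "tree", "sky"])           # invented labelling
query = keyword_vector(["water", "sand"])

print(cosine(query, beach), cosine(query, forest))
```

For this query the beach image ranks above the forest image, which is the basic retrieval behaviour a semantic keyword descriptor is meant to deliver.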
Abstract:
The transmissible spongiform encephalopathies (TSEs) are caused by infectious agents whose structures have not been fully characterized but include abnormal forms of the host protein PrP, designated PrPSc, which are deposited in infected tissues. The transmission routes of scrapie and chronic wasting disease (CWD) seem to include environmental spread in their epidemiology, yet the fate of TSE agents in the environment is poorly understood. There are concerns that, for example, buried carcasses may remain a potential reservoir of infectivity for many years. Experimental determination of the environmental fate requires methods for assessing binding/elution of TSE infectivity, or its surrogate marker PrPSc, to and from materials with which it might interact. We report a method using Sarkosyl for the extraction of murine PrPSc, and its application to soils containing recombinant ovine PrP (recPrP). Elution properties suggest that PrP binds strongly to one or more soil components. Elution from a clay soil also required proteinase K digestion, suggesting that in the clay soil binding occurs via the N-terminus of PrP to a component that is absent from the sandy soils tested.
Abstract:
Negative correlations between task performance in dynamic control tasks and verbalizable knowledge, as assessed by a post-task questionnaire, have been interpreted as dissociations that indicate two antagonistic modes of learning, one being “explicit”, the other “implicit”. This paper views the control tasks as finite-state automata and offers an alternative interpretation of these negative correlations. It is argued that “good controllers” observe fewer different state transitions and, consequently, can answer fewer post-task questions about system transitions than can “bad controllers”. Two experiments demonstrate the validity of the argument by showing the predicted negative relationship between control performance and the number of explored state transitions, and the predicted positive relationship between the number of explored state transitions and questionnaire scores. However, the experiments also elucidate important boundary conditions for the critical effects. We discuss the implications of these findings, and of other problems arising from the process control paradigm, for conclusions about implicit versus explicit learning processes.
Abstract:
Two experiments examined the claim for distinct implicit and explicit learning modes in the artificial grammar-learning task (Reber, 1967, 1989). Subjects initially attempted to memorize strings of letters generated by a finite-state grammar and then classified new grammatical and nongrammatical strings. Experiment 1 showed that subjects' assessment of isolated parts of strings was sufficient to account for their classification performance but that the rules elicited in free report were not sufficient. Experiment 2 showed that performing a concurrent random number generation task under different priorities interfered with free report and classification performance equally. Furthermore, giving different groups of subjects incidental or intentional learning instructions did not affect classification or free report.
Extraction of tidal channel networks from aerial photographs alone and combined with laser altimetry
Abstract:
Tidal channel networks play an important role in the intertidal zone, exerting substantial control over the hydrodynamics and sediment transport of the region and hence over the evolution of the salt marshes and tidal flats. The study of the morphodynamics of tidal channels is currently an active area of research, and a number of theories have been proposed which require for their validation measurement of channels over extensive areas. Remotely sensed data provide a suitable means for such channel mapping. The paper describes a technique that may be adapted to extract tidal channels from either aerial photographs or LiDAR data separately, or from both types of data used together in a fusion approach. Application of the technique to channel extraction from LiDAR data has been described previously. However, aerial photographs of intertidal zones are much more commonly available than LiDAR data, and most LiDAR flights now involve acquisition of multispectral images to complement the LiDAR data. In view of this, the paper investigates the use of multispectral data for semiautomatic identification of tidal channels, firstly from only aerial photographs or linescanner data, and secondly from fused linescanner and LiDAR data sets. A multi-level, knowledge-based approach is employed. The algorithm based on aerial photography can achieve a useful channel extraction, though may fail to detect some of the smaller channels, partly because the spectral response of parts of the non-channel areas may be similar to that of the channels. The algorithm for channel extraction from fused LiDAR and spectral data gives an increased accuracy, though only slightly higher than that obtained using LiDAR data alone. The results illustrate the difficulty of developing a fully automated method, and justify the semi-automatic approach adopted.
Abstract:
The study of the morphodynamics of tidal channel networks is important because of their role in tidal propagation and the evolution of salt-marshes and tidal flats. Channel dimensions range from tens of metres wide and metres deep near the low water mark to only 20-30cm wide and 20cm deep for the smallest channels on the marshes. The conventional method of measuring the networks is cumbersome, involving manual digitising of aerial photographs. This paper describes a semi-automatic knowledge-based network extraction method that is being implemented to work using airborne scanning laser altimetry (and later aerial photography). The channels exhibit a width variation of several orders of magnitude, making an approach based on multi-scale line detection difficult. The processing therefore uses multi-scale edge detection to detect channel edges, then associates adjacent anti-parallel edges together to form channels using a distance-with-destination transform. Breaks in the networks are repaired by extending channel ends along their local direction to join with nearby channels, using domain knowledge that flow paths should proceed downhill and that any network fragment should be joined to a nearby fragment so as to connect eventually to the open sea.
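The multi-scale edge detection step above can be illustrated on a 1-D elevation profile: channels of very different widths show up as edge pairs at different smoothing scales, with a coarse scale suppressing the narrowest channels that a fine scale still resolves. The synthetic profile, Gaussian smoothing and threshold below are purely illustrative, not the paper's implementation.

```python
import numpy as np

def smooth(signal, sigma):
    # Gaussian smoothing by direct convolution with a truncated kernel
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x ** 2 / (2 * sigma ** 2))
    kernel /= kernel.sum()
    return np.convolve(signal, kernel, mode="same")

def edges(signal, sigma, thresh=0.05):
    # edge positions = samples where the smoothed gradient is large
    grad = np.gradient(smooth(signal, sigma))
    return np.where(np.abs(grad) > thresh)[0]

# synthetic elevation profile: one wide channel (dip) and one narrow one
profile = np.zeros(200)
profile[40:80] -= 1.0     # wide channel, ~40 samples across
profile[150:154] -= 0.5   # narrow channel, ~4 samples across

wide_edges = edges(profile, sigma=5.0)    # coarse scale: wide channel only
narrow_edges = edges(profile, sigma=1.0)  # fine scale: catches both
```

In the 2-D case the detected anti-parallel edge pairs would then be associated into channels; that grouping step, and the downhill-flow repair of broken fragments, are beyond this sketch.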
Abstract:
This article is a commentary on several research studies conducted on the prospects for aerobic rice production systems that aim at reducing the demand for irrigation water, which in certain major rice producing areas of the world is becoming increasingly scarce. The research studies considered, as reported in published articles mainly under the aegis of the International Rice Research Institute (IRRI), have a narrow scope in that they test only 3 or 4 rice varieties under different soil moisture treatments obtained with controlled irrigation, but with other agronomic factors of production held constant. Consequently, these studies do not permit an assessment of the interactions among agronomic factors that will be of critical significance to the performance of any production system. Varying the production factor of "water" will also seriously affect the levels of the other factors required to optimise the performance of a production system. The major weakness in the studies analysed in this article originates from not taking account of the interactions between experimental and non-experimental factors involved in the comparisons between different production systems. This applies to the experimental field design used for the research studies as well as to the subsequent statistical analyses of the results. The existence of such interactions is a serious complicating element that makes meaningful comparisons between different crop production systems difficult. Consequently, the data and conclusions drawn from such research readily become biased towards proposing standardised solutions for possible introduction to farmers through a linear technology transfer process. Yet, the variability and diversity encountered in the real-world farming environment demand more flexible solutions and approaches in the dissemination of knowledge-intensive production practices through "experiential learning" types of processes, such as those employed by farmer field schools.
This article illustrates, drawing on expertise with the 'system of rice intensification' (SRI), that several cost-effective and environment-friendly agronomic solutions to reduce the demand for irrigation water, other than the asserted need for the introduction of new cultivars, are feasible. Further, these agronomic solutions can offer immediate benefits of reduced water requirements and increased net returns that would be readily accessible to a wide range of rice producers, particularly the resource-poor smallholders. (C) 2009 Elsevier B.V. All rights reserved.