921 results for Artificial Information Models
Abstract:
This thesis presents a thorough and principled investigation into the application of artificial neural networks to the biological monitoring of freshwater. It contains original ideas on the classification and interpretation of benthic macroinvertebrates, and aims to demonstrate their superiority over the biotic systems currently used in the UK to report river water quality. The conceptual basis of a new biological classification system is described, and a full review and analysis of a number of river data sets is presented. The biological classification is compared to the common biotic systems using data from the Upper Trent catchment. These data comprised 292 expertly classified invertebrate samples identified to mixed taxonomic levels. The neural network experimental work concentrates on the classification of the invertebrate samples into biological class, where only a subset of the sample is used to form the classification. Further experiments address the identification of novel input samples, the classification of samples from different biotopes, and the use of prior information in the neural network models. The biological classification is shown to provide an intuitive interpretation of a graphical representation of the Upper Trent data, generated without reference to the class labels. The selection of key indicator taxa is considered using three different approaches: one novel, one from information theory and one from classical statistics. Good indicators of quality class based on these analyses are found to be in good agreement with those chosen by a domain expert. The change in information associated with different levels of identification and enumeration of taxa is quantified. The feasibility of using neural network classifiers and predictors to develop numeric criteria for the biological assessment of sediment contamination in the Great Lakes is also investigated.
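As an illustration of the information-theoretic strand of the indicator-taxa analysis only (the thesis's own three methods are not detailed in this abstract), the sketch below ranks taxa by the mutual information between their abundance and the expert-assigned quality class; the data shapes, counts and labels are placeholder assumptions.

```python
# Hypothetical sketch: rank candidate indicator taxa by the mutual information
# between their abundance and the expert-assigned quality class.
import numpy as np
from sklearn.feature_selection import mutual_info_classif

rng = np.random.default_rng(0)
n_samples, n_taxa = 292, 20                                # 292 samples, as in the abstract
abundances = rng.poisson(3.0, size=(n_samples, n_taxa))    # stand-in taxon counts
quality_class = rng.integers(0, 5, size=n_samples)         # stand-in quality classes

# Mutual information between each taxon's abundance and the class label.
mi = mutual_info_classif(abundances, quality_class, discrete_features=True,
                         random_state=0)
ranking = np.argsort(mi)[::-1]
print("Taxa ranked by mutual information with quality class:", ranking[:5])
```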
Abstract:
The research described here concerns the development of metrics and models to support the development of hybrid (conventional/knowledge based) integrated systems. The thesis argues that, although it is well known that estimating the cost, duration and quality of information systems is a difficult task, it is far from clear what sorts of tools and techniques would adequately support a project manager in the estimation of these properties. A literature review shows that metrics (measurements) and estimating tools have been developed for conventional systems since the 1960s, while there has been very little research on metrics for knowledge based systems (KBSs). Furthermore, although there are a number of theoretical problems with many of the 'classic' metrics developed for conventional systems, it also appears that the tools which such metrics can be used to develop are not widely used by project managers. A survey was carried out of large UK companies which confirmed this continuing state of affairs. Before any useful tools could be developed, therefore, it was important to find out why project managers were not using these tools already. By characterising those companies that use software cost estimating (SCE) tools against those which could but do not, it was possible to recognise the involvement of the client/customer in the process of estimation. Pursuing this point, a model of the early estimating and planning stages (the EEPS model) was developed to test exactly where estimating takes place. The EEPS model suggests that estimating could take place either before a fully-developed plan has been produced, or while this plan is being produced. If it were the former, then SCE tools would be particularly useful, since there is very little other data available from which to produce an estimate. A second survey, however, indicated that project managers see estimating as being essentially the latter, at which point project management tools are available to support the process. It would seem, therefore, that SCE tools are not being used because project management tools are being used instead. The issue here is not with the method of developing an estimating model or tool, but with the way in which "an estimate" is intimately tied to an understanding of what tasks are being planned. Current SCE tools are perceived by project managers as targeting the wrong point of estimation. A model (called TABATHA) is then presented which describes how an estimating tool based on an analysis of tasks would fit into the planning stage. The issue of whether metrics can be usefully developed for hybrid systems (which also contain KBS components) is tested by extending a number of "classic" program size and structure metrics to a KBS language, Prolog. Measurements of lines of code, Halstead's operators/operands, McCabe's cyclomatic complexity, Henry & Kafura's data flow fan-in/out and post-release reported errors were taken for a set of 80 commercially-developed LPA Prolog programs. By re-defining the metric counts for Prolog it was found that estimates of program size and error-proneness comparable to the best conventional studies are possible. This suggests that metrics can be usefully applied to KBS languages such as Prolog, and thus that the development of metrics and models to support the development of hybrid information systems is both feasible and useful.
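For readers unfamiliar with the "classic" metrics named above, the sketch below computes the standard Halstead size measures and a McCabe-style cyclomatic complexity from pre-extracted counts. How operators, operands and decision points are counted for Prolog is exactly the re-definition the thesis performs, so the counting step is left here as an assumed input with made-up numbers.

```python
# Minimal sketch of Halstead size metrics and McCabe's cyclomatic complexity,
# computed from counts assumed to have been extracted from a program.
import math

def halstead(n1, n2, N1, N2):
    """n1/n2: distinct operators/operands; N1/N2: total operators/operands."""
    vocabulary = n1 + n2
    length = N1 + N2
    volume = length * math.log2(vocabulary) if vocabulary > 1 else 0.0
    return {"vocabulary": vocabulary, "length": length, "volume": volume}

def mccabe(decision_points):
    """Cyclomatic complexity of a single-entry, single-exit procedure."""
    return decision_points + 1

# Example with hypothetical counts for one Prolog predicate.
print(halstead(n1=12, n2=20, N1=45, N2=60))
print("cyclomatic complexity:", mccabe(decision_points=4))
```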
Abstract:
Adaptive information filtering is a challenging research problem. It requires the adaptation of a representation of a user's multiple interests to various changes in them. We investigate the application of an immune-inspired approach to this problem. Nootropia is a user profiling model that has many properties in common with computational models of the immune system that have been based on Francisco Varela's work. In this paper we concentrate on Nootropia's evaluation. We define an evaluation methodology that uses virtual users to simulate various interest changes. The results show that Nootropia exhibits the desirable adaptive behaviour.
Abstract:
The experience accumulated over many years in the polyparametric cognitive modeling of different physiological processes (electrocardiogram, electroencephalogram, electroreovasogram and others), and the diagnostic methods developed on this basis, give grounds for formulating a new methodology of system analysis in biology. The gist of the methodology consists of the parametrization of fractals of electrophysiological processes, a matrix description of the functional state of an object with a unified set of parameters, and the construction of a polyparametric cognitive geometric model with artificial intelligence algorithms. The geometric model displays the parameter relationships in a form adequate to the requirements of the system approach. The objective character of the model elements and the high degree of formalization, which facilitates the use of mathematical methods, are advantages of these models. At the same time, the geometric images are easily interpreted in physiological and clinical terms. Polyparametric modeling is thus an object-oriented tool with advanced functional facilities and several distinctive features.
Abstract:
Annual average daily traffic (AADT) is important information for many transportation planning, design, operation, and maintenance activities, as well as for the allocation of highway funds. Many studies have attempted AADT estimation using the factor approach, regression analysis, time series, and artificial neural networks. However, these methods are unable to account for the spatially variable influence of independent variables on the dependent variable, even though it is well known that spatial context is important to many transportation problems, including AADT estimation. In this study, applications of geographically weighted regression (GWR) methods to estimating AADT were investigated. The GWR-based methods considered the influence of correlations among the variables over space and the spatial non-stationarity of the variables. A GWR model allows different relationships between the dependent and independent variables to exist at different points in space. In other words, model parameters vary from location to location, and the locally linear regression parameters at a point are affected more by observations near that point than by observations further away. The study area was Broward County, Florida, which lies on the Atlantic coast between Palm Beach and Miami-Dade counties. A total of 67 variables were considered as potential AADT predictors, and six variables (lanes, speed, regional accessibility, direct access, density of roadway length, and density of seasonal households) were selected to develop the models. To investigate the predictive powers of the various AADT predictors over space, statistics including local r-square, local parameter estimates, and local errors were examined and mapped. The local variations in relationships among parameters were investigated, measured, and mapped to assess the usefulness of GWR methods. The results indicated that the GWR models were able to better explain the variation in the data and to predict AADT with smaller errors than ordinary linear regression models for the same dataset. Additionally, GWR was able to model the spatial non-stationarity in the data, i.e., the spatially varying relationship between AADT and its predictors, which cannot be modeled in ordinary linear regression.
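A minimal sketch of the locally weighted least-squares fit that underlies GWR is given below. The Gaussian kernel and fixed bandwidth are common choices but are assumptions here, not necessarily those used in the study, and the toy data stand in for the AADT predictors.

```python
# Sketch of a geographically weighted regression fit at one location:
# observations are weighted by a Gaussian kernel of distance to that location,
# then a weighted least-squares estimate of the local coefficients is computed.
import numpy as np

def gwr_local_fit(X, y, coords, point, bandwidth):
    d = np.linalg.norm(coords - point, axis=1)      # distances to the target point
    w = np.exp(-0.5 * (d / bandwidth) ** 2)         # Gaussian kernel weights
    Xd = np.column_stack([np.ones(len(X)), X])      # add intercept column
    W = np.diag(w)
    # beta_hat = (X' W X)^-1 X' W y : local coefficients at `point`
    return np.linalg.solve(Xd.T @ W @ Xd, Xd.T @ W @ y)

# Toy example: an AADT-like response with two predictors at random locations.
rng = np.random.default_rng(1)
coords = rng.uniform(0, 10, size=(50, 2))
X = rng.normal(size=(50, 2))
y = 100 + 5 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=1.0, size=50)
print(gwr_local_fit(X, y, coords, point=np.array([5.0, 5.0]), bandwidth=3.0))
```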
Abstract:
The Bahamas is a small island nation that is dealing with the problem of freshwater shortage. All of the country’s freshwater is contained in shallow lens aquifers that are recharged solely by rainfall. The country has been struggling to meet the water demands by employing a combination of over-pumping of aquifers, transport of water by barge between islands, and desalination of sea water. In recent decades, new development on New Providence, where the capital city of Nassau is located, has created a large area of impervious surfaces and thereby a substantial amount of runoff with the result that several of the aquifers are not being recharged. A geodatabase was assembled to assess and estimate the quantity of runoff from these impervious surfaces and potential recharge locations were identified using a combination of Geographic Information Systems (GIS) and remote sensing. This study showed that runoff from impervious surfaces in New Providence represents a large freshwater resource that could potentially be used to recharge the lens aquifers on New Providence.
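As a rough illustration of the kind of estimate such a geodatabase supports (the study's own calculation is not detailed in this abstract), one common simplification multiplies rainfall depth by impervious area and a runoff coefficient; all numbers below are hypothetical placeholders.

```python
# Hypothetical back-of-the-envelope runoff volume from impervious surfaces:
# volume = rainfall depth x impervious area x runoff coefficient.
rainfall_m = 1.4          # assumed annual rainfall depth (m)
impervious_km2 = 30.0     # assumed impervious area mapped from imagery (km^2)
runoff_coefficient = 0.9  # assumed fraction of rainfall that runs off pavement/roofs

volume_m3 = rainfall_m * impervious_km2 * 1e6 * runoff_coefficient
print(f"Estimated annual runoff: {volume_m3:.3e} m^3")
```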
Abstract:
Information processing in the human brain has always been a source of inspiration in Artificial Intelligence; in particular, it has led researchers to develop different tools such as artificial neural networks. Recent findings in neurophysiology provide evidence that not only neurons but also isolated astrocytes and networks of astrocytes are responsible for processing information in the human brain. Artificial neural networks (ANNs) model neuron-neuron communications. Artificial neuron-glia networks (ANGNs), in addition to neuron-neuron communications, model neuron-astrocyte connections. In continuation of the research on ANGNs, we first propose and evaluate a model of adaptive neuro-fuzzy inference systems augmented with artificial astrocytes. Then, we propose a model of ANGNs that captures the communications of astrocytes in the brain; in this model, a network of artificial astrocytes is implemented on top of a typical neural network. The results of the implementation of both networks show that, for certain combinations of the parameter values specifying astrocytes and their connections, the new networks outperform typical neural networks. This research opens a range of possibilities for future work on designing more powerful architectures of artificial neural networks that are based on more realistic models of the human brain.
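The abstract does not specify how the artificial astrocytes act on the network, so the toy sketch below shows one commonly cited style of neuron-astrocyte coupling purely for illustration: an "astrocyte" tracks sustained activity of a hidden neuron and transiently scales its output. The rule, thresholds and architecture are assumptions, not the thesis's actual model.

```python
# Illustrative only (not the thesis's model): an "astrocyte" watches a hidden
# neuron's recent activity and boosts its contribution when activity persists.
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 6))      # input -> hidden weights
W2 = rng.normal(size=(6, 1))      # hidden -> output weights
activity_count = np.zeros(6)      # per-hidden-neuron activity counters

def step(x, window=3, boost=1.25, threshold=0.5):
    global activity_count
    h = np.tanh(x @ W1)                                            # hidden activations
    activity_count = np.where(h > threshold, activity_count + 1, 0)
    astro_gain = np.where(activity_count >= window, boost, 1.0)    # astrocyte modulation
    return (h * astro_gain) @ W2

for _ in range(5):
    print(step(rng.normal(size=4)).ravel())
```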
Abstract:
Traditionally, many small-sized copepod species are considered to be widespread, bipolar or cosmopolitan. However, these large-scale distribution patterns need to be re-examined in view of increasing evidence of cryptic and pseudo-cryptic speciation in pelagic copepods. Here, we present a phylogeographic study of Oithona similis s.l. populations from the Arctic Ocean, the Southern Ocean and its northern boundaries, the North Atlantic and the Mediterranean Sea. O. similis s.l. is considered one of the most abundant species in temperate to polar oceans and acts as an important link in the trophic network between the microbial loop and higher trophic levels such as fish larvae. Two gene fragments were analysed: the mitochondrial cytochrome c oxidase subunit I (COI) and the nuclear ribosomal 28S genetic marker. Seven distinct, geographically delimited mitochondrial lineages could be identified, with divergences among the lineages ranging from 8 to 24%, most likely representing cryptic or pseudo-cryptic species within O. similis s.l. Four lineages were identified within or close to the borders of the Southern Ocean, one lineage in the Arctic Ocean and two lineages in the temperate Northern hemisphere. Surprisingly, the Arctic lineage was more closely related to the lineages from the Southern hemisphere than to the other lineages from the Northern hemisphere, suggesting that geographic proximity is a poor predictor of how closely related the clades are at the genetic level. Application of a molecular clock revealed that the evolutionary history of O. similis s.l. is probably closely associated with the reorganization of ocean circulation in the mid-Miocene and may be an example of allopatric speciation in the pelagic zone.
Abstract:
This paper presents a flow regime identification methodology for multiphase systems in annular, stratified and homogeneous oil-water-gas regimes. The principle is based on recognition of gamma-ray pulse height distributions (PHDs) with supervised artificial neural network (ANN) systems. The simulated detection geometry comprises two NaI(Tl) detectors and a dual-energy gamma-ray source. The measurement of scattered radiation enables the dual modality densitometry (DMD) measurement principle to be explored; its basic principle is to combine the measurement of scattered and transmitted radiation in order to acquire information about the different flow regimes. The PHDs obtained by the detectors were used as inputs to the ANN. The data sets required for training and testing the ANN were generated by the MCNP-X code from static and ideal theoretical models of multiphase systems. The ANN correctly identified the three different flow regimes for all data sets evaluated. The results show that PHDs examined by an ANN may be successfully applied to flow regime identification.
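A hedged sketch of the classification step follows: the simulated pulse height distributions form the input feature vectors and the flow regime is the target label. A scikit-learn multilayer perceptron stands in for the ANN used in the paper; the number of channels, network size and synthetic data are assumptions.

```python
# Sketch: classify flow regime (annular / stratified / homogeneous) from
# simulated pulse height distributions used as input features to an ANN.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
n_channels = 128                                   # assumed PHD channels per detector
X = rng.random((300, 2 * n_channels))              # two detectors' PHDs, concatenated
y = rng.integers(0, 3, size=300)                   # 0=annular, 1=stratified, 2=homogeneous

clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=0)
clf.fit(X[:240], y[:240])                          # train on MCNP-X-style synthetic data
print("held-out accuracy:", clf.score(X[240:], y[240:]))
```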
Abstract:
In the past decade, systems that extract information from millions of Internet documents have become commonplace. Knowledge graphs -- structured knowledge bases that describe entities, their attributes and the relationships between them -- are a powerful tool for understanding and organizing this vast amount of information. However, a significant obstacle to knowledge graph construction is the unreliability of the extracted information, due to noise and ambiguity in the underlying data or errors made by the extraction system, together with the complexity of reasoning about the dependencies between these noisy extractions. My dissertation addresses these challenges by exploiting the interdependencies between facts to improve the quality of the knowledge graph in a scalable framework. I introduce a new approach called knowledge graph identification (KGI), which resolves the entities, attributes and relationships in the knowledge graph by incorporating uncertain extractions from multiple sources, entity co-references, and ontological constraints. I define a probability distribution over possible knowledge graphs and infer the most probable knowledge graph using a combination of probabilistic and logical reasoning. Such probabilistic models are frequently dismissed due to scalability concerns, but my implementation of KGI maintains tractable performance on large problems through the use of hinge-loss Markov random fields, which have a convex inference objective. This allows the inference of large knowledge graphs with 4M facts and 20M ground constraints in 2 hours. To further scale the solution, I develop a distributed approach to the KGI problem which runs in parallel across multiple machines, reducing inference time by 90%. Finally, I extend my model to the streaming setting, where a knowledge graph is continuously updated by incorporating newly extracted facts. I devise a general approach for approximately updating inference in convex probabilistic models, and quantify the approximation error by defining and bounding inference regret for online models. Together, my work retains the attractive features of probabilistic models while providing the scalability necessary for large-scale knowledge graph construction. These models have been applied to a number of real-world knowledge graph projects, including the NELL project at Carnegie Mellon and the Google Knowledge Graph.
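For readers unfamiliar with hinge-loss Markov random fields, the density they define over continuous truth values has, up to details given in the cited literature, the form below: each potential is a (possibly squared) hinge of a linear function produced by grounding a weighted rule, which is why the MAP inference objective is convex.

```latex
P(\mathbf{y} \mid \mathbf{x}) \;\propto\;
  \exp\!\Big(-\sum_{r} \lambda_r \,\phi_r(\mathbf{y}, \mathbf{x})\Big),
\qquad
\phi_r(\mathbf{y}, \mathbf{x}) = \big(\max\{0,\ \ell_r(\mathbf{y}, \mathbf{x})\}\big)^{p_r},
\quad p_r \in \{1, 2\}.
```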
Abstract:
This study aims to model and forecast the tourism demand for Mozambique for the period from January 2004 to December 2013 using artificial neural network models. The number of overnight stays in hotels was used as representative of the tourism demand. A set of independent variables was tested as model inputs, namely the Consumer Price Index, Gross Domestic Product and Exchange Rates of the outbound tourism markets: South Africa, United States of America, Mozambique, Portugal and the United Kingdom. The best model achieved a Mean Absolute Percentage Error of 6.5% and a Pearson correlation coefficient of 0.696. A model with such forecasting accuracy is important for economic agents to anticipate the future growth of this activity sector, for stakeholders to provide products, services and infrastructure, and for hotel establishments to adjust their capacity to the tourism demand.
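The two figures of merit reported above are standard and defined as in the sketch below; the series here are made-up placeholders, not the study's data or model output.

```python
# Sketch of the reported evaluation measures: mean absolute percentage error
# (MAPE) and Pearson's correlation between observed and forecast overnight stays.
import numpy as np

observed = np.array([120_000, 135_000, 150_000, 142_000, 160_000], dtype=float)
forecast = np.array([118_500, 140_200, 147_000, 150_300, 155_800], dtype=float)

mape = np.mean(np.abs((observed - forecast) / observed)) * 100
pearson_r = np.corrcoef(observed, forecast)[0, 1]
print(f"MAPE = {mape:.1f}%, Pearson r = {pearson_r:.3f}")
```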
Abstract:
Sea-level variations have a significant impact on coastal areas, and the prediction of expected sea-level variations is among the most critical information needs associated with the marine environment; various methods exist for this purpose. In this study, for the northern coast of the Persian Gulf, the relation of sea level to local parameters such as pressure, temperature and wind speed, together with global parameters such as the North Atlantic Oscillation (NAO) index, was studied, and statistical models for the prediction of sea level are presented. In a next step, an artificial neural network was used to predict sea level for the first time in this region, and the results of the models were compared. Prediction using the statistical models gave a correlation coefficient of R = 0.84 and a root mean square error (RMS) of 21.9 cm for the Bushehr station, and R = 0.85 with an RMS error of 48.4 cm for the Rajai station. The best neural network had 4 layers with six neurons in each middle layer and produced more reliable results, with R = 0.90126 and an RMS error of 13.7 cm for the Bushehr station, and R = 0.93916 and an RMS error of 22.6 cm for the Rajai station. Therefore, the proposed methodology could be successfully used in the study area.
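A minimal sketch of the reported network shape (input layer, two hidden layers of six neurons each, output layer) is given below using scikit-learn; the predictors, synthetic data and scaling choices are placeholders, not the study's dataset.

```python
# Sketch of a 4-layer network (input, two hidden layers of 6 neurons, output)
# predicting sea level from pressure, temperature, wind speed and the NAO index.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))                      # [pressure, temperature, wind, NAO]
sea_level = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.5 * X[:, 3] + rng.normal(scale=0.5, size=500)

scaler = StandardScaler().fit(X[:400])             # fit scaling on the training split only
X_scaled = scaler.transform(X)
model = MLPRegressor(hidden_layer_sizes=(6, 6), max_iter=5000, random_state=0)
model.fit(X_scaled[:400], sea_level[:400])
print("held-out R^2:", model.score(X_scaled[400:], sea_level[400:]))
```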
Abstract:
This study focuses on multiple linear regression models relating six climate indices (temperature-humidity index THI, environmental stress index ESI, equivalent temperature index ETI, heat load index HLI, modified HLI (HLI new), and respiratory rate predictor RRP) to three main components of cow's milk (yield, fat, and protein) for cows in Iran. The least absolute shrinkage and selection operator (LASSO) and the Akaike information criterion (AIC) are applied to select the best model for the milk predictands with the smallest number of climate predictors. Uncertainty is estimated by bootstrap resampling, and cross-validation is used to avoid over-fitting. Climatic parameters are calculated from the NASA-MERRA global atmospheric reanalysis. Milk data for the months from April to September, 2002 to 2010, are used. The best linear regression models are found in spring between milk yield as the predictand and THI, ESI, ETI, HLI, and RRP as predictors, with p-value < 0.001 and R2 of 0.50 and 0.49, respectively. In summer, milk yield with the independent variables THI, ETI, and ESI shows the strongest relationship (p-value < 0.001), with R2 of 0.69. For fat and protein the results are only marginal. This method is suggested for impact studies of climate variability/change in agriculture and food science when short time series or data with large uncertainty are available.
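A hedged sketch of the predictor-selection step follows: LASSO with a cross-validated penalty picks a small subset of the climate indices as predictors of milk yield (an AIC-based alternative could use sklearn.linear_model.LassoLarsIC). The data below are synthetic placeholders, not the Iranian milk records.

```python
# Sketch of LASSO-based selection of climate predictors for milk yield,
# with cross-validation used to choose the regularization strength.
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(0)
indices = ["THI", "ESI", "ETI", "HLI", "HLI_new", "RRP"]
X = rng.normal(size=(200, len(indices)))                  # standardized climate indices
milk_yield = 0.8 * X[:, 0] + 0.6 * X[:, 1] + rng.normal(scale=0.5, size=200)

lasso = LassoCV(cv=5, random_state=0).fit(X, milk_yield)
selected = [name for name, coef in zip(indices, lasso.coef_) if abs(coef) > 1e-6]
print("selected predictors:", selected, "| R^2:", round(lasso.score(X, milk_yield), 2))
```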