970 results for model complexity


Relevance:

60.00%

Abstract:

In a mesh-based computational mechanics simulation process, the stages between model geometry creation and model analysis, such as mesh generation, are the most manpower-intensive phase, whereas the model analysis itself is the most computing-intensive one. Advanced computational hardware and software have significantly reduced the computing time, and, more importantly, the trend is downward. For the kind of models envisaged, which are larger, more complex in geometry and modelling, and multiphysics, there is no clear indication that the manpower-intensive phase will decrease significantly in time; in the present way of operation it is more likely to increase with model complexity. In this paper we address this dilemma through collaborating components for models in electronic packaging applications.

Relevance:

60.00%

Abstract:

In the identification of complex dynamic systems using fuzzy neural networks, one of the main issues is the curse of dimensionality, which makes it difficult to retain a large number of system inputs or to consider a large number of fuzzy sets. Moreover, because of correlations, not all possible network inputs or regression vectors are necessary; adding them simply increases the model complexity and degrades the network's generalisation performance. In this paper, the problem is solved by first proposing a fast algorithm for the selection of network terms, and then introducing a refinement procedure to tackle the correlation issue. Simulation results show the efficacy of the method.
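A minimal sketch of this style of term selection, assuming a greedy forward pass that adds the candidate regressor most correlated with the current residual, followed by a correlation-based refinement that prunes redundant terms (the paper's actual algorithm and selection criteria may differ):

```python
import numpy as np

def select_terms(X, y, max_terms=5, corr_threshold=0.95):
    """Greedy forward selection of regression terms, then pruning of
    terms nearly collinear with an earlier selection (refinement)."""
    n, p = X.shape
    selected, residual = [], y.astype(float).copy()
    for _ in range(max_terms):
        # Score each unused candidate by |correlation| with the residual.
        scores = [-1.0 if j in selected else
                  abs(np.corrcoef(X[:, j], residual)[0, 1]) for j in range(p)]
        best = int(np.argmax(scores))
        if scores[best] <= 0.0:
            break
        selected.append(best)
        # Refit on the selected terms and update the residual.
        coef, *_ = np.linalg.lstsq(X[:, selected], y, rcond=None)
        residual = y - X[:, selected] @ coef
    kept = []
    for j in selected:  # refinement: drop terms correlated with kept ones
        if all(abs(np.corrcoef(X[:, j], X[:, k])[0, 1]) < corr_threshold
               for k in kept):
            kept.append(j)
    return kept
```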

Relevance:

60.00%

Abstract:

Classification methods with embedded feature selection capability are very appealing for the analysis of complex processes, since they allow the analysis of root causes even when the number of input variables is high. In this work, we investigate the performance of three classification techniques within a Monte Carlo strategy aimed at root cause analysis. We consider the naive Bayes classifier and the logistic regression model with two different implementations for controlling model complexity, namely a LASSO-like implementation with an L1-norm regularization and a fully Bayesian implementation of the logistic model, the so-called relevance vector machine. Several challenges can arise when estimating such models, mainly linked to the characteristics of the data: a large number of input variables, high correlation among subsets of variables, more variables than available data points, and unbalanced datasets. Using an ecological and a semiconductor manufacturing dataset, we show the advantages and drawbacks of each method, highlighting the superior classification accuracy of the relevance vector machine with respect to the other classifiers. Moreover, we show how the combination of the proposed techniques and the Monte Carlo approach can be used to obtain more robust insights into the problem under analysis when faced with challenging modelling conditions.
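A minimal sketch of such a Monte Carlo comparison, assuming scikit-learn's GaussianNB and an L1-penalised LogisticRegression for the first two classifiers (the relevance vector machine is not part of scikit-learn and is omitted here); repeated resplitting yields accuracy distributions and per-variable selection frequencies:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

def monte_carlo_compare(X, y, n_runs=100):
    """Repeatedly resplit the data, compare classifiers, and record how
    often the L1-penalised model selects each input variable."""
    accs = {"naive_bayes": [], "l1_logistic": []}
    selected = np.zeros(X.shape[1])
    for run in range(n_runs):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=0.3, stratify=y, random_state=run)
        accs["naive_bayes"].append(
            GaussianNB().fit(X_tr, y_tr).score(X_te, y_te))
        # The L1 penalty drives irrelevant coefficients to exactly zero,
        # giving the LASSO-like embedded feature selection described above.
        l1 = LogisticRegression(penalty="l1", solver="liblinear", C=1.0)
        l1.fit(X_tr, y_tr)
        accs["l1_logistic"].append(l1.score(X_te, y_te))
        selected += (l1.coef_.ravel() != 0)
    summary = {k: (float(np.mean(v)), float(np.std(v)))
               for k, v in accs.items()}
    return summary, selected / n_runs  # per-variable selection frequency
```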

Relevance:

60.00%

Abstract:

All systems found in nature exhibit nonlinear behavior to some degree. To emulate this behavior, classical system identification techniques typically use linear models, for mathematical simplicity. Models inspired by biological principles (artificial neural networks) and linguistically motivated models (fuzzy systems), thanks to their universal approximation property, are becoming alternatives to classical mathematical models. In systems identification, the design of this type of model is an iterative process that requires, among other steps, identification of the model structure and estimation of the model parameters. This thesis addresses the applicability of gradient-based algorithms for the parameter estimation phase, and the use of evolutionary algorithms for model structure selection, in the design of neuro-fuzzy systems, i.e., models that offer the transparency found in fuzzy systems but are designed with algorithms introduced in the context of neural networks. A new methodology, based on the minimization of the integral of the error and exploiting the parameter separability typically found in neuro-fuzzy systems, is proposed for parameter estimation. A recent evolutionary technique (bacterial algorithms), based on the natural phenomenon of microbial evolution, is combined with genetic programming, and the resulting algorithm, bacterial programming, is advocated for structure determination. Different versions of this evolutionary technique are combined with gradient-based algorithms, solving problems found in fuzzy and neuro-fuzzy design, namely the incorporation of a priori knowledge, the initialization of gradient algorithms, and model complexity reduction.
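To make the parameter separability concrete, here is a minimal sketch for a zero-order Takagi-Sugeno model with Gaussian memberships: with the nonlinear antecedent parameters (centres, widths) held fixed, the optimal linear consequent parameters follow from a single least-squares solve, so gradient steps are needed only for the nonlinear part. The model form and quadratic criterion below are illustrative; the thesis minimises the integral of the error and its models may differ:

```python
import numpy as np

def firing_strengths(X, centers, widths):
    """Normalised Gaussian firing strengths per rule; X is (n, d),
    centers and widths are (n_rules, d)."""
    d = (X[:, None, :] - centers[None, :, :]) / widths[None, :, :]
    w = np.exp(-0.5 * np.sum(d ** 2, axis=2))       # (n, n_rules)
    return w / np.sum(w, axis=1, keepdims=True)

def separable_fit(X, y, centers, widths):
    """With the nonlinear antecedent parameters fixed, the optimal
    linear consequent parameters are an ordinary least-squares solve."""
    phi = firing_strengths(X, centers, widths)      # regressor matrix
    theta, *_ = np.linalg.lstsq(phi, y, rcond=None) # linear sub-problem
    residual = y - phi @ theta
    return theta, 0.5 * float(np.sum(residual ** 2))
```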

Relevance:

60.00%

Abstract:

This thesis explores how multinational corporations of different sizes create barriers to imitation, and thereby sustain competitive advantage, in rural and informal Base of the Pyramid economies. These markets require close cooperation with local partners in a dynamic environment that lacks enforceable property rights and follows a different rationale than developed markets. In order to explore how different-sized multinational corporations sustain competitive advantage at the Base of the Pyramid, the natural-resource-based view and the dynamic capabilities perspective are integrated. Based on this integration, the natural-resource-based view is extended by identifying critical dynamic capabilities that are assumed to be sources of competitive advantage at the Base of the Pyramid. Further, a contrasting case study explores how the identified dynamic capabilities are protected, and their competitive advantage sustained, by isolating mechanisms that create barriers to imitation for a small to medium-sized and a large multinational corporation. The case study results give grounds to assume that most resource-based isolating mechanisms create barriers to imitation that are fairly high for large, established multinational corporations that operate at the rural Base of the Pyramid with high product and business model complexity. By contrast, barriers to imitation were found to be lower for young, small to medium-sized multinational corporations with low product and business model complexity, which according to some authors represent the majority of rural Base of the Pyramid companies. Particularly for small to medium-sized multinational corporations, the case study finds a relationship- and transaction-based unwillingness of local partners to act opportunistically rather than a resource-based inability to imitate. By offering an explanation of sustained competitive advantage for small to medium-sized multinational corporations at the rural Base of the Pyramid, this thesis closes an important research gap and recommends including institutional and transaction-based research perspectives.

Relevance:

60.00%

Abstract:

The development of eutrophication in river systems is poorly understood, given the complex relationships between fixed plants, algae, hydrodynamics, water chemistry and solar radiation. However, there is a pressing need to understand the relationship between the ecological status of rivers and the controlling environmental factors, to support the reasoned implementation of the Water Framework Directive and Catchment Sensitive Farming in the UK. This research aims to create a dynamic, process-based, mathematical in-stream model to simulate the growth and competition of different vegetation types (macrophytes, phytoplankton and benthic algae) in rivers. The model, applied to the River Frome (Dorset, UK), captured well the seasonality of the simulated vegetation types (suspended algae, macrophytes, epiphytes, sediment biofilm). Macrophyte results showed that local knowledge is important for explaining unusual changes in biomass. Fixed algae simulations indicated the need for a more detailed representation of the various herbivorous grazer groups; however, this would increase the model complexity, the number of model parameters and the observation data required to constrain the model. The model results also highlighted that simulating only phytoplankton is insufficient in river systems, because in short-retention-time rivers the majority of the suspended algae are of benthic origin. Therefore, there is a need for modelling tools that link the benthic and free-floating habitats.

Relevance:

60.00%

Abstract:

A large number of urban surface energy balance models now exist, with different assumptions about the important features of the surface and the exchange processes that need to be incorporated. To date, no comparison of these models has been conducted; in contrast, models for natural surfaces have been compared extensively as part of the Project for Intercomparison of Land-surface Parameterization Schemes. Here, the methods and first results from an extensive international comparison of 33 models are presented. The overall aim of the comparison is to understand the complexity required to model energy and water exchanges in urban areas. The degree of complexity included in the models is outlined and its impact on model performance is discussed. During the comparison there have been significant developments in the models, with resulting improvements in performance (root-mean-square error falling by up to two-thirds). Evaluation is based on a dataset containing net all-wave radiation, sensible heat, and latent heat flux observations for an industrial area in Vancouver, British Columbia, Canada. The aim of the comparison is twofold: to identify those modeling approaches that minimize the errors in the simulated fluxes of the urban energy balance, and to determine the degree of model complexity required for accurate simulations. There is evidence that some classes of models perform better for individual fluxes, but no model performs best or worst for all fluxes. In general, the simpler models perform as well as the more complex models on all statistical measures. Overall, the schemes are most capable of modelling net all-wave radiation and least capable of modelling latent heat flux.
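As a minimal sketch of the headline evaluation statistic, here is RMSE computed per flux and used to rank models; the data layout and function names are hypothetical, not the intercomparison's actual evaluation code:

```python
import numpy as np

def rmse(sim, obs):
    """Root-mean-square error, ignoring missing observations."""
    sim, obs = np.asarray(sim, float), np.asarray(obs, float)
    ok = ~np.isnan(obs)
    return float(np.sqrt(np.mean((sim[ok] - obs[ok]) ** 2)))

def rank_models(simulations, observations):
    """Rank models per flux (Q* = net all-wave radiation, QH = sensible
    heat, QE = latent heat); `simulations` maps model -> flux -> series."""
    table = {}
    for flux in ("Q*", "QH", "QE"):
        scores = {m: rmse(s[flux], observations[flux])
                  for m, s in simulations.items()}
        table[flux] = sorted(scores, key=scores.get)  # best model first
    return table
```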

Relevance:

60.00%

Abstract:

Purpose – The purpose of this paper is to investigate the effect of choices of model structure and scale in development viability appraisal. The paper addresses two questions concerning the application of development appraisal techniques to viability modelling within the UK planning system. The first relates to the extent to which, given intrinsic input uncertainty, the choice of model structure significantly affects model outputs. The second concerns the extent to which, given intrinsic input uncertainty, the level of model complexity significantly affects model outputs. Design/methodology/approach – Monte Carlo simulation procedures are applied to a hypothetical development scheme in order to measure the effects of model aggregation and structure on model output variance. Findings – It is concluded that, given the particular scheme modelled and unavoidably subjective assumptions of input variance, simple and simplistic models may produce similar outputs to more robust and disaggregated models. Evidence is found of equifinality in the outputs of a simple, aggregated model of development viability relative to more complex, disaggregated models. Originality/value – Development viability appraisal has become increasingly important in the planning system. Consequently, the theory, application and outputs of development appraisal are under intense scrutiny from a wide range of users. However, there has been very little published evaluation of viability models. This paper contributes to the limited literature in this area.
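A minimal sketch of this kind of experiment, with a toy residual valuation and illustrative input distributions (every figure and the model structure are hypothetical): splitting a revenue or cost line into independent components lowers output variance relative to a single aggregated line with the same mean, which is one way model structure alone can change appraisal outputs:

```python
import numpy as np

rng = np.random.default_rng(42)
N = 100_000  # Monte Carlo draws

def residual_value(gdv, build_cost, fees_rate, profit_rate):
    """Toy residual valuation: land value = gross development value
    minus costs, fees and developer's profit (illustrative only)."""
    return gdv - build_cost * (1 + fees_rate) - profit_rate * gdv

# Aggregated model: one uncertain revenue line and one uncertain cost line.
gdv = rng.normal(10_000_000, 1_000_000, N)
cost = rng.normal(6_000_000, 600_000, N)
rv_simple = residual_value(gdv, cost, 0.10, 0.15)

# Disaggregated model: the same totals split into independent components,
# so the aggregate variance falls relative to the single-line model.
gdv_d = sum(rng.normal(2_500_000, 250_000, N) for _ in range(4))
cost_d = sum(rng.normal(1_500_000, 150_000, N) for _ in range(4))
rv_detail = residual_value(gdv_d, cost_d, 0.10, 0.15)

for name, rv in (("aggregated", rv_simple), ("disaggregated", rv_detail)):
    print(f"{name}: mean {rv.mean():,.0f}, std {rv.std():,.0f}")
```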

Relevance:

60.00%

Abstract:

This paper investigates the effect of choices of model structure and scale in development viability appraisal. The paper addresses two questions concerning the application of development appraisal techniques to viability modelling within the UK planning system. The first relates to the extent to which, given intrinsic input uncertainty, the choice of model structure significantly affects model outputs. The second concerns the extent to which, given intrinsic input uncertainty, the level of model complexity significantly affects model outputs. Monte Carlo simulation procedures are applied to a hypothetical development scheme in order to measure the effects of model aggregation and structure on model output variance. It is concluded that, given the particular scheme modelled and unavoidably subjective assumptions of input variance, simple and simplistic models may produce similar outputs to more robust and disaggregated models.

Relevance:

60.00%

Abstract:

As part of the Atmospheric Model Intercomparison Project (AMIP), the behaviour of 15 general circulation models has been analysed in order to diagnose and compare the ability of the different models to simulate Northern Hemisphere midlatitude atmospheric blocking. In accordance with the established AMIP procedure, the 10-year model integrations were performed using prescribed, time-evolving monthly mean observed SSTs spanning the period January 1979–December 1988. Atmospheric observational data (ECMWF analyses) over the same period have also been used to verify the model results. The models involved in this comparison represent a wide spectrum of model complexity, with different horizontal and vertical resolutions, numerical techniques and physical parametrizations, and they exhibit large differences in blocking behaviour. Nevertheless, a few common features can be found, such as the general tendency to underestimate both the blocking frequency and the average duration of blocks. The possible relationship between model blocking and model systematic errors has also been assessed, although without ad hoc numerical experimentation it is impossible to relate with certainty particular model deficiencies in representing blocking to specific parts of the model formulation.
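The abstract does not say which blocking diagnostic was applied; a common choice in studies of this kind is the Tibaldi-Molteni (1990) index based on meridional gradients of 500 hPa geopotential height. A minimal sketch of that index (an illustration, not this paper's code; the -10 m per degree threshold follows the published definition):

```python
import numpy as np

def blocked_longitudes(z500, lats, lat0=60.0, delta=20.0):
    """Tibaldi-Molteni style blocking test per longitude.
    z500: (lat, lon) geopotential height field (m); lats in deg north.
    A longitude counts as blocked when the southern gradient GHGS
    reverses (> 0) and the northern gradient GHGN is strongly negative."""
    iN = int(np.argmin(np.abs(lats - (lat0 + delta))))
    i0 = int(np.argmin(np.abs(lats - lat0)))
    iS = int(np.argmin(np.abs(lats - (lat0 - delta))))
    ghgs = (z500[i0] - z500[iS]) / delta   # m per degree latitude
    ghgn = (z500[iN] - z500[i0]) / delta
    return (ghgs > 0.0) & (ghgn < -10.0)   # boolean mask over longitudes
```

Counting blocked longitudes persisting for several consecutive days then gives the blocking frequency and duration statistics compared across models.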

Relevance:

60.00%

Abstract:

Snow provides large seasonal storage of freshwater, and information about the distribution of snow mass as Snow Water Equivalent (SWE) is important for hydrological planning and for detecting climate change impacts. Large regional disagreements remain between estimates from reanalyses, remote sensing and modelling. Assimilating passive microwave information improves SWE estimates in many regions, but the assimilation must account for how microwave scattering depends on snow stratigraphy. Physical snow models can estimate snow stratigraphy, but users must weigh the computational expense of model complexity against acceptable errors. Using data from the National Aeronautics and Space Administration Cold Land Processes Experiment (NASA CLPX) and the Helsinki University of Technology (HUT) microwave emission model of layered snowpacks, it is shown that simulations of the brightness temperature difference between 19 GHz and 37 GHz vertically polarised microwaves are consistent with Advanced Microwave Scanning Radiometer-Earth Observing System (AMSR-E) and Special Sensor Microwave Imager (SSM/I) retrievals once known stratigraphic information is used. Simulated brightness temperature differences for an individual snow profile depend on the stratigraphic detail provided. Relative to a profile defined at the 10 cm resolution of the density and temperature measurements, the error introduced by simplification to a single layer of average properties increases approximately linearly with snow mass. If this brightness temperature error is converted into SWE using a traditional retrieval method, it is equivalent to ±13 mm SWE (7% of total) at a depth of 100 cm. This error is reduced to ±5.6 mm SWE (3% of total) for a two-layer model.
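The "traditional retrieval method" is, in its usual form, a linear function of the 19 GHz minus 37 GHz brightness temperature difference (a Chang-type retrieval). A minimal sketch, with an illustrative coefficient that in practice is calibrated regionally:

```python
def swe_from_dtb(tb19v, tb37v, coeff_mm_per_k=4.8):
    """Traditional linear SWE retrieval from the 19 GHz - 37 GHz
    vertically polarised brightness temperature difference (Chang-type).
    The coefficient is illustrative; operational values are calibrated."""
    dtb = tb19v - tb37v  # deeper/denser snow scatters more at 37 GHz
    return max(0.0, coeff_mm_per_k * dtb)  # SWE in mm

# With a linear retrieval, a brightness-temperature error dTb maps
# directly to an SWE error of coeff * dTb, which is how the layer-
# simplification error in the emission model becomes the quoted ±mm SWE.
```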

Relevance:

60.00%

Abstract:

This paper evaluates the current status of global modeling of organic aerosol (OA) in the troposphere and analyzes the differences between models as well as between models and observations. Thirty-one global chemistry transport models (CTMs) and general circulation models (GCMs) have participated in this intercomparison, in the framework of AeroCom phase II. The simulation of OA varies greatly between models in terms of the magnitude of primary emissions, secondary OA (SOA) formation, the number of OA species used (2 to 62), the complexity of the OA parameterizations (gas-particle partitioning, chemical aging, multiphase chemistry, aerosol microphysics), and the OA physical, chemical and optical properties. The diversity of the global OA simulation results has increased since earlier AeroCom experiments, mainly due to the increasing complexity of the SOA parameterizations in models and the implementation of new, highly uncertain, OA sources. Diversity of over one order of magnitude exists in the modeled vertical distribution of OA concentrations, which deserves a dedicated future study. Furthermore, although the OA/OC ratio depends on OA sources and atmospheric processing, and is important for model evaluation against OA and OC observations, it is resolved by only a few global models. The median global primary OA (POA) source strength is 56 Tg a⁻¹ (range 34–144 Tg a⁻¹) and the median SOA source strength (natural and anthropogenic) is 19 Tg a⁻¹ (range 13–121 Tg a⁻¹). Among the models that account for the semi-volatile nature of SOA, the median source is calculated to be 51 Tg a⁻¹ (range 16–121 Tg a⁻¹), much larger than the median value of the models that calculate SOA more simplistically (19 Tg a⁻¹; range 13–20 Tg a⁻¹, with one model at 37 Tg a⁻¹). The median atmospheric burden of OA is 1.4 Tg (24 models in the range 0.6–2.0 Tg and 4 between 2.0 and 3.8 Tg), with a median OA lifetime of 5.4 days (range 3.8–9.6 days). Among the models that reported both OA and sulfate burdens, the median OA/sulfate burden ratio is 0.77; 13 models calculate a ratio lower than 1, and 9 models higher than 1. For the 26 models that reported OA deposition fluxes, the median wet removal is 70 Tg a⁻¹ (range 28–209 Tg a⁻¹), which is on average 85% of the total OA deposition. Fine-aerosol organic carbon (OC) and OA observations from continuous monitoring networks and individual field campaigns have been used for model evaluation. At urban locations, the model–observation comparison indicates missing knowledge of anthropogenic OA sources, in both strength and seasonality. The combined model–measurement analysis suggests the existence of increased OA levels during summer due to biogenic SOA formation over large areas of the USA, which can be of the same order of magnitude as the POA even at urban locations and contribute to the measured urban seasonal pattern. Global models are able to simulate the high secondary character of OA observed in the atmosphere as a result of SOA formation and POA aging, although the amount of OA present in the atmosphere remains largely underestimated, with a mean normalized bias (MNB) of −0.62 (−0.51) against the OC (OA) urban surface data of all models, −0.15 (+0.51) against remote measurements, and −0.30 for marine locations with OC data.
The mean temporal correlations across all stations are low when compared with OC (OA) measurements: 0.47 (0.52) for urban stations, 0.39 (0.37) for remote stations, and 0.25 for marine stations with OC data. The combination of a high (negative) MNB and higher correlation at urban stations, compared with the low MNB and lower correlation at remote sites, suggests that knowledge about the processes that govern aerosol processing, transport and removal, in addition to the sources, is important at the remote stations. There is no clear change in model skill with increasing model complexity with regard to OC or OA mass concentration. However, the complexity is needed in models in order to distinguish between anthropogenic and natural OA, as required for climate mitigation, and to calculate the impact of OA on climate accurately.
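The mean normalized bias used above is, in its usual definition, the average of (model − obs)/obs over valid data points; a minimal sketch of this metric and of the station-level temporal correlation (function names are illustrative):

```python
import numpy as np

def mean_normalized_bias(model, obs):
    """MNB = mean of (model - obs) / obs over valid points; an MNB of
    -0.62 means the model underestimates observations by 62% on average."""
    model, obs = np.asarray(model, float), np.asarray(obs, float)
    ok = np.isfinite(model) & np.isfinite(obs) & (obs > 0)
    return float(np.mean((model[ok] - obs[ok]) / obs[ok]))

def temporal_correlation(model, obs):
    """Pearson correlation of the two time series at one station."""
    ok = np.isfinite(model) & np.isfinite(obs)
    return float(np.corrcoef(model[ok], obs[ok])[0, 1])
```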

Relevance:

60.00%

Abstract:

A critical question in data mining is whether we can always trust what a data mining system discovers, unconditionally. The answer is obviously no. If not, when can we trust the discovery? What factors affect the reliability of the discovery, and how do they affect it? These are some of the interesting questions to be investigated.

In this paper we first provide a definition and measurements of reliability, and analyse the factors that affect it. We then examine the impact of model complexity, weak links, varying sample sizes and the ability of different learners on the reliability of graphical model discovery. The experimental results reveal that (1) the larger the sample size used for the discovery, the higher the reliability obtained; (2) the stronger a graph link is, the easier the discovery and thus the higher the achievable reliability; and (3) the complexity of a graph also plays an important role: the higher the complexity, the more difficult the graph is to induce and the lower the resulting reliability.
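A minimal synthetic illustration of findings (1) and (2), assuming a single true link whose rediscovery is tested by thresholding the sample correlation; the paper's learners and graph structures are richer than this hypothetical setup:

```python
import numpy as np

def recovery_rate(link_strength, n_samples, n_trials=500,
                  threshold=0.1, seed=0):
    """Fraction of trials in which a true link X -> Y is rediscovered,
    i.e. the |sample correlation| exceeds a fixed detection threshold."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_trials):
        x = rng.normal(size=n_samples)
        y = link_strength * x + rng.normal(size=n_samples)  # true link
        hits += abs(np.corrcoef(x, y)[0, 1]) > threshold
    return hits / n_trials

# Reliability rises with both sample size and link strength:
for s in (0.1, 0.5):
    for n in (20, 200, 2000):
        print(f"strength={s}, n={n}: {recovery_rate(s, n):.2f}")
```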

Relevance:

60.00%

Abstract:

One of the fundamental machine learning tasks is predictive classification. Given that organisations collect an ever increasing amount of data, predictive classification methods must be able to handle large amounts of data effectively and efficiently. However, present requirements push existing algorithms to, and sometimes beyond, their limits, since many classification prediction algorithms were designed when currently common data set sizes were beyond imagination. This has led to a significant amount of research into ways of making classification learning algorithms more effective and efficient. Although substantial progress has been made, a number of key questions have not been answered. This dissertation investigates two of them.

The first is whether large data sets require different types of algorithms from those currently employed. This is answered by analysing how the bias-plus-variance decomposition of predictive classification error changes as training set size increases. Experiments find that larger training sets do require different types of algorithms from those currently used. Some insight into the characteristics of suitable algorithms is provided, which may give direction to the development of future classification prediction algorithms designed specifically for large data sets.

The second question investigated is the role of sampling in machine learning with large data sets. Sampling has long been used to avoid scaling algorithms up to the size of the data set by instead scaling the data set down to suit the algorithm. However, the costs of sampling have not been widely explored. Two popular sampling methods are compared with learning from all available data in terms of predictive accuracy, model complexity, and execution time. The comparison shows that sub-sampling generally produces models with accuracy close to, and sometimes greater than, that obtainable by learning from all available data. This result suggests that it may be possible to develop algorithms that exploit sub-sampling to reduce the time required to infer a model while sacrificing little if any accuracy. Methods of improving effective and efficient learning via sampling are also investigated, and new sampling methodologies are proposed. These include using a varying proportion of instances to determine the next inference step, and using a statistical calculation at each inference step to determine a sufficient sample size. Experiments show that a statistical calculation of sample size can substantially reduce execution time with only a small loss, and occasional gain, in accuracy.

One common use of sampling is in the construction of learning curves, which are often used to determine the training set size that maximally reduces execution time without being detrimental to accuracy. The performance of methods for detecting convergence of learning curves is analysed, focusing on methods that calculate the gradient of the tangent to the curve. Given that such methods can be susceptible to local accuracy plateaus, the frequency of local plateaus is also investigated. It is shown that local accuracy plateaus are a common occurrence, and that ensuring a small loss of accuracy often incurs greater computational cost than learning from all available data. These results cast doubt on the applicability of gradient-of-tangent methods for detecting convergence, and on the viability of learning curves for reducing execution time in general.
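A minimal sketch of progressive sampling with a gradient-of-tangent stopping rule, assuming a scikit-learn classifier (the classifier choice, schedule of sizes and tolerance are illustrative; as noted above, a local accuracy plateau can trigger a premature stop):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

def learning_curve_stop(X, y, sizes, grad_tol=1e-5, seed=0):
    """Progressive sampling: grow the training set and stop once the
    gradient of the tangent to the learning curve falls below grad_tol.
    Caveat: a local accuracy plateau can trigger a premature stop."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, random_state=seed)
    used, accs = [], []
    for n in sizes:
        n = min(n, len(X_tr))
        if used and n <= used[-1]:
            break  # training pool exhausted
        clf = DecisionTreeClassifier(random_state=seed)
        clf.fit(X_tr[:n], y_tr[:n])
        used.append(n)
        accs.append(clf.score(X_te, y_te))
        if len(accs) >= 2:
            grad = (accs[-1] - accs[-2]) / (used[-1] - used[-2])
            if abs(grad) < grad_tol:  # tangent has flattened
                break
    return used[-1], accs[-1]  # chosen sample size and its accuracy
```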