906 results for General Linear Methods
Abstract:
The quality of species distribution models (SDMs) relies to a large degree on the quality of the input data, from bioclimatic indices to environmental and habitat descriptors (Austin, 2002). Recent reviews of SDM techniques have sought to optimize predictive performance (e.g. Elith et al., 2006). In general, SDMs employ one of three approaches to variable selection. The simplest approach relies on the expert to select the variables, as in environmental niche models (Nix, 1986) or a generalized linear model without variable selection (Miller and Franklin, 2002). A second approach explicitly incorporates variable selection into model fitting, which allows examination of particular combinations of variables; examples include generalized linear or additive models with variable selection (Hastie et al., 2002) and classification trees with complexity-based or model-based pruning (Breiman et al., 1984; Zeileis, 2008). A third approach uses model averaging to summarize the overall contribution of a variable without considering particular combinations; examples include neural networks, boosted or bagged regression trees, and Maximum Entropy, as compared in Elith et al. (2006). Typically, users of SDMs will either consider a small number of variable sets via the first approach, or else supply all of the candidate variables (often numbering more than a hundred) to the second or third approach. Bayesian SDMs exist, with several methods for eliciting and encoding priors on model parameters (see the review in Low Choy et al., 2010); however, few methods have been published for informative variable selection, one example being Bayesian trees (O'Leary, 2008). Here we report an elicitation protocol that helps make explicit a priori expert judgements on the quality of candidate variables. This protocol can be flexibly applied to any of the three approaches to variable selection described above, Bayesian or otherwise. We demonstrate how this information can be obtained and then used to guide variable selection in classical or machine learning SDMs, or to define priors within Bayesian SDMs.
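To make the second approach concrete, here is a minimal sketch of forward variable selection wrapped around a logistic GLM, assuming scikit-learn and synthetic presence/absence data; it is an illustration, not the paper's protocol:

```python
# Hypothetical sketch: forward variable selection for a presence/absence model.
# Assumes a feature matrix X (sites x candidate variables) and binary labels y.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.feature_selection import SequentialFeatureSelector

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 12))          # 12 candidate bioclimatic/habitat variables
y = (X[:, 0] - 0.5 * X[:, 3] + rng.normal(size=200) > 0).astype(int)

glm = LogisticRegression(max_iter=1000)  # logistic GLM for presence/absence
selector = SequentialFeatureSelector(glm, n_features_to_select=4, direction="forward")
selector.fit(X, y)
print("selected variables:", np.flatnonzero(selector.get_support()))
```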
Abstract:
The application of Gaussian Quadrature (GQ) procedures to the evaluation of i-E curves in linear sweep voltammetry is advocated. It is shown that a high degree of precision is achieved with these methods, and the values obtained through GQ are in good agreement with (and even better than) the values reported in the literature, for example by Nicholson and Shain. A further welcome feature of GQ is that it can also be interpreted as an elegant and efficient analytic approximation scheme. The values obtained by this approach are compared with those from a recent series-approximation scheme proposed by Oldham, and excellent agreement is shown to exist.
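As a minimal illustration of the underlying numerical machinery (a generic sketch with a stand-in integrand, not the paper's voltammetric integral), Gauss-Legendre quadrature approximates an integral over [a, b] from a handful of nodes and weights:

```python
# Minimal Gauss-Legendre quadrature sketch (illustrative integrand, not the
# actual linear-sweep voltammetry integral from the paper).
import numpy as np

def gauss_legendre(f, a, b, n=8):
    """Approximate the integral of f over [a, b] with n-point Gauss-Legendre."""
    x, w = np.polynomial.legendre.leggauss(n)   # nodes/weights on [-1, 1]
    t = 0.5 * (b - a) * x + 0.5 * (b + a)       # map nodes to [a, b]
    return 0.5 * (b - a) * np.sum(w * f(t))

# Example: integral of exp(-t^2) on [0, 1]; a few points already give high accuracy.
print(gauss_legendre(lambda t: np.exp(-t**2), 0.0, 1.0))
```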
Abstract:
Purpose: To develop three-surface paraxial schematic eyes of different ages and sexes based on data for 7- and 14-year-old Chinese children from the Anyang Childhood Eye Study. Methods: Six sets of paraxial schematic eyes were developed: 7-year-old eyes, 7-year-old male eyes, 7-year-old female eyes, 14-year-old eyes, 14-year-old male eyes, and 14-year-old female eyes. Both refraction-dependent and emmetropic eye models were developed, the former using a linear dependence of ocular parameters on refraction. Results: A total of 2059 grade 1 children (58% boys) and 1536 grade 8 children (49% boys) were included, with mean ages of 7.1 ± 0.4 and 13.7 ± 0.5 years, respectively. Changes in these schematic eyes with aging are increased anterior chamber depth, decreased lens thickness, increased vitreous chamber depth, increased axial length, and decreased lens equivalent power. Male schematic eyes have greater anterior chamber depth, longer vitreous chamber depth, longer axial length, and lower lens equivalent power than female schematic eyes. Changes in the schematic eyes with increasingly positive refraction are decreased anterior chamber depth, increased lens thickness, decreased vitreous chamber depth, decreased axial length, increased corneal radius of curvature, and increased lens power. In general, the emmetropic schematic eyes have biometric parameters similar to those arising from the regression fits for the refraction-dependent schematic eyes. Conclusions: These paraxial schematic eyes of Chinese children may be useful for myopia research and for facilitating comparisons with children of the same or different racial backgrounds living in other places.
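The refraction-dependent models rest on linear fits of each ocular parameter against refraction. A minimal sketch of such a fit, using invented illustrative numbers rather than the study's measurements, might look like:

```python
# Hypothetical sketch: linear dependence of an ocular parameter on refraction.
# The data below are invented for illustration; the study fits its own measurements.
import numpy as np

refraction = np.array([-4.0, -2.5, -1.0, 0.0, 0.5, 1.0])       # spherical equivalent, D
axial_length = np.array([25.1, 24.6, 24.0, 23.6, 23.4, 23.2])  # mm

slope, intercept = np.polyfit(refraction, axial_length, 1)
print(f"axial length ~ {intercept:.2f} + {slope:.2f} * refraction")
# A refraction-dependent schematic eye then evaluates this line at a chosen refraction.
```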
Abstract:
The third-kind linear integral equation
$$g(t)\,\varphi(t) = f(t) + \lambda \int_a^b K(t, t')\,\varphi(t')\,dt',$$
where g(t) vanishes at a finite number of points in (a, b), is considered. In general, the Fredholm Alternative theory [5] does not hold for this type of integral equation. However, imposing certain conditions on g(t) and K(t, t′), the above integral equation was shown [1, pp. 49–57] to obey a Fredholm-type theory, except for a certain class of kernels for which the question was left open. In this note a theory is presented for the equation under consideration, with some additional assumptions on such kernels.
Abstract:
This paper considers the on-line identification of a non-linear system in terms of a Hammerstein model, with a zero-memory non-linear gain followed by a linear system. The linear part is represented by a Laguerre expansion of its impulse response and the non-linear part by a polynomial. The identification procedure involves determining the coefficients of the Laguerre expansion of correlation functions and iteratively adjusting the parameters of the non-linear gain by gradient methods. The method is applicable to a wide class of input signals. Even in the presence of additive correlated noise, satisfactory performance is achieved, with the variance of the error converging to a value close to the variance of the noise. Digital computer simulations establish the practicability of the scheme in a variety of situations.
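A minimal sketch of the model structure is given below; the discrete Laguerre construction and the overparameterized least-squares fit are illustrative assumptions, not the paper's correlation-based iterative scheme:

```python
# Hypothetical sketch of Hammerstein identification: a static polynomial
# nonlinearity followed by a linear filter expanded in a discrete Laguerre basis.
import numpy as np
from scipy.signal import lfilter

def laguerre_basis(n_taps, order, a=0.6):
    """Impulse responses of discrete Laguerre filters with pole a."""
    delta = np.zeros(n_taps); delta[0] = 1.0
    scale = np.sqrt(1 - a**2)
    h = lfilter([scale], [1.0, -a], delta)       # first-order low-pass stage
    basis = [h]
    for _ in range(order - 1):
        h = lfilter([-a, 1.0], [1.0, -a], h)     # all-pass section between stages
        basis.append(h)
    return np.array(basis)

rng = np.random.default_rng(1)
u = rng.normal(size=2000)
y = lfilter([0.2, 0.3], [1.0, -0.5], u + 0.4 * u**2)  # toy Hammerstein system

# Regressors: each polynomial power of u filtered through each Laguerre function.
B = laguerre_basis(50, 4)
powers = np.vstack([u, u**2, u**3])
Phi = np.vstack([np.convolve(p, h)[: len(u)] for p in powers for h in B]).T
theta, *_ = np.linalg.lstsq(Phi, y, rcond=None)
print("fit residual RMS:", np.sqrt(np.mean((Phi @ theta - y) ** 2)))
```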
Abstract:
This paper deals with two approximate methods of finding the period of oscillation of non-linear conservative systems excited by step functions. The first method is an extension of the analysis presented by Jonckheere [4], and the second is based on a weighted bilinear approximation of the non-linear characteristic. An example is presented and the approximate results are compared with the exact results.
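For context, the exact period against which such approximations are compared follows from energy conservation (a standard result, not quoted from the paper):

```latex
% Exact period of a conservative oscillator \ddot{x} + f(x) = 0 with potential
% V(x) = \int_0^x f(u)\,du, oscillating at energy E between turning points x_1, x_2:
\[
  T = 2 \int_{x_1}^{x_2} \frac{dx}{\sqrt{2\,(E - V(x))}},
  \qquad V(x_1) = V(x_2) = E .
\]
```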
Abstract:
Cooked prawn colour is known to be a driver of market price and a visual indicator of product quality for the consumer. Although there is a general understanding that colour variation exists in farmed prawns, there has been no attempt to quantify this variation or identify where it is most prevalent. The objectives of this study were threefold: firstly, to compare three different quantitative methods of measuring prawn colour or pigmentation (two different colorimeters and colour quantification from digital images); secondly, to quantify the amount of pigmentation variation that exists in farmed prawns within ponds, across ponds, and across farms; and lastly, to assess the effects of ice storage or freeze-thawing of raw product prior to cooking. Each method was able to detect quantitative differences in prawn colour, although conversion of image-based quantification of prawn colour from RGB to Lab was unreliable. Considerable colour variation was observed between prawns from different ponds and different farms, and this variation potentially affects product value. Post-harvest handling prior to cooking was also shown to have a profound effect on prawn colour: both long periods of ice storage and freeze-thawing of raw product were detrimental, whereas ice storage immediately after cooking was beneficial. Results demonstrated that darker prawn colour was preserved by holding harvested prawns alive in chilled seawater, limiting the time between harvesting and cooking, and avoiding long periods of ice storage or freeze-thawing of uncooked product.
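The RGB-to-Lab step mentioned above can be sketched as follows; this is a generic illustration using scikit-image, not the study's pipeline, and its reliability in practice depends on camera calibration and illumination:

```python
# Hypothetical sketch: quantifying colour from a digital image by converting
# an sRGB region of interest to CIE Lab. Not the study's pipeline.
import numpy as np
from skimage.color import rgb2lab

rng = np.random.default_rng(2)
roi = rng.uniform(0.3, 0.6, size=(64, 64, 3))  # stand-in for an image patch (sRGB in [0, 1])
lab = rgb2lab(roi)
L, a, b = lab[..., 0].mean(), lab[..., 1].mean(), lab[..., 2].mean()
print(f"mean L*={L:.1f}, a*={a:.1f}, b*={b:.1f}")  # a* (red-green) tracks cooked prawn redness
```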
Abstract:
In this paper, results on primal methods for the Bottleneck Linear Programming (BLP) problem are briefly surveyed, a primal method is presented, and the degenerate case related to the Bottleneck Transportation Problem (BTP) is explicitly considered. The algorithm is based on the idea of using auxiliary coefficients, as is done by Garfinkel and Rao [6]. The modification presented for the BTP rectifies the defect in Hammer's method in the case of degenerate basic feasible solutions. Illustrative numerical examples are also given.
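For orientation, a textbook threshold scheme for the bottleneck objective (not the paper's primal method) binary-searches over cost thresholds and tests feasibility of the restricted problem with max-flow:

```python
# Hypothetical sketch: threshold method for the Bottleneck Transportation Problem.
# Minimize the largest cost among used cells by binary-searching the threshold
# and testing feasibility with max-flow. Not the paper's primal algorithm.
import networkx as nx
import numpy as np

def feasible(cost, supply, demand, t):
    """Can all demand be met using only cells with cost <= t?"""
    G = nx.DiGraph()
    m, n = cost.shape
    for i in range(m):
        G.add_edge("s", ("u", i), capacity=supply[i])
    for j in range(n):
        G.add_edge(("v", j), "t", capacity=demand[j])
    for i in range(m):
        for j in range(n):
            if cost[i, j] <= t:          # only cells at or below the threshold
                G.add_edge(("u", i), ("v", j), capacity=min(supply[i], demand[j]))
    value, _ = nx.maximum_flow(G, "s", "t")
    return value == sum(demand)

def bottleneck_transport(cost, supply, demand):
    thresholds = np.unique(cost)
    lo, hi = 0, len(thresholds) - 1
    while lo < hi:                       # smallest threshold with a feasible flow
        mid = (lo + hi) // 2
        if feasible(cost, supply, demand, thresholds[mid]):
            hi = mid
        else:
            lo = mid + 1
    return thresholds[lo]

cost = np.array([[4, 7, 2], [6, 3, 5]])
print(bottleneck_transport(cost, supply=[5, 5], demand=[3, 4, 3]))
```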
Abstract:
This thesis consists of an introduction, four research articles, and an appendix. The thesis studies relations between two different approaches to the continuum limit of models of two-dimensional statistical mechanics at criticality. The approach of conformal field theory (CFT) can be thought of as the algebraic classification of some basic objects in these models; it has been successfully used by physicists since the 1980s. The other approach, Schramm-Loewner evolutions (SLEs), is a recently introduced set of mathematical methods for studying random curves or interfaces occurring in the continuum limit of the models. The first and second included articles argue, on the basis of statistical mechanics, what would be a plausible relation between SLEs and conformal field theory. The first article studies multiple SLEs, that is, several random curves simultaneously in a domain. The proposed definition is compatible with a natural commutation requirement suggested by Dubédat. The curves of multiple SLE may form different topological configurations, "pure geometries". We conjecture a relation between the topological configurations and the CFT concepts of conformal blocks and operator product expansions. Example applications of multiple SLEs include crossing probabilities for percolation and the Ising model. The second article studies SLE variants that represent models with boundary conditions implemented by primary fields. The best known of these, SLE(kappa, rho), is shown to take a simple form in terms of the Coulomb gas formalism of CFT. In the third article the space of local martingales for variants of SLE is shown to carry a representation of the Virasoro algebra. Finding this structure is guided by the relation of SLEs and CFTs in general, but the result is established in a straightforward fashion. This article, too, emphasizes multiple SLEs and proposes a possible way of treating pure geometries in terms of the Coulomb gas. The fourth article states results of applications of the Virasoro structure to the open questions of SLE reversibility and duality. Proofs of the stated results are provided in the appendix. The objective is an indirect computation of certain polynomial expected values. Provided that these expected values exist, in generic cases they are shown to possess the desired properties, thus giving support to both reversibility and duality.
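For reference, chordal SLE(kappa) is defined through the Loewner differential equation driven by scaled Brownian motion (standard definitions, not material specific to the thesis):

```latex
% Chordal Loewner evolution in the upper half-plane, driven by W_t = \sqrt{\kappa}\,B_t
% with B_t a standard Brownian motion:
\[
  \partial_t g_t(z) = \frac{2}{g_t(z) - W_t}, \qquad g_0(z) = z,
  \qquad W_t = \sqrt{\kappa}\, B_t ,
\]
% the SLE curve is traced by the points where the maps g_t cease to be defined.
```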
Abstract:
We explore here the acceleration of convergence of iterative methods for the solution of a class of quasilinear and linear algebraic equations. The specific systems are the finite difference form of the Navier-Stokes equations and the energy equation for recirculating flows. The acceleration procedures considered are the successive over-relaxation scheme, several implicit methods, and a second-order procedure. A new implicit method, the alternating direction line iterative method, is proposed in this paper. The method combines the advantages of the line successive over-relaxation and alternating direction implicit methods. The various methods are tested for their computational economy and accuracy on a typical recirculating flow situation. The numerical experiments show that the alternating direction line iterative method is the most economical method of solving the Navier-Stokes equations for all Reynolds numbers in the laminar regime. The usual ADI method is shown to be less attractive for large Reynolds numbers because of the loss of diagonal dominance. Diagonal dominance can, however, be restored by a suitable choice of the relaxation parameter, but at the cost of accuracy. The accuracy of the new procedure is comparable to that of the well-tested successive over-relaxation method and to the available results in the literature. The second-order procedure turns out to be the most efficient method for the solution of the linear energy equation.
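As a point of reference for the baseline scheme, a generic textbook sketch of successive over-relaxation for a linear system Ax = b (not the paper's alternating direction line iterative method) is:

```python
# Generic successive over-relaxation (SOR) sketch for Ax = b; a textbook
# baseline, not the paper's alternating direction line iterative method.
import numpy as np

def sor(A, b, omega=1.5, tol=1e-10, max_iter=10_000):
    n = len(b)
    x = np.zeros(n)
    for _ in range(max_iter):
        x_old = x.copy()
        for i in range(n):
            # Gauss-Seidel sweep blended with the previous iterate via omega.
            sigma = A[i, :i] @ x[:i] + A[i, i + 1:] @ x_old[i + 1:]
            x[i] = (1 - omega) * x_old[i] + omega * (b[i] - sigma) / A[i, i]
        if np.linalg.norm(x - x_old, np.inf) < tol:
            break
    return x

# Diagonally dominant test system (SOR needs dominance/SPD-type conditions to converge).
A = np.array([[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]])
b = np.array([2.0, 4.0, 10.0])
print(sor(A, b), np.linalg.solve(A, b))
```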
Abstract:
The Minimum Description Length (MDL) principle is a general, well-founded theoretical formalization of statistical modeling. The most important notion of MDL is the stochastic complexity, which can be interpreted as the shortest description length of a given sample of data relative to a model class. The exact definition of the stochastic complexity has gone through several evolutionary steps. The latest instantiation is based on the so-called Normalized Maximum Likelihood (NML) distribution, which has been shown to possess several important theoretical properties. However, applications of this modern version of MDL have been quite rare because of computational complexity: for discrete data, the definition of NML involves an exponential sum, and in the case of continuous data, a multi-dimensional integral that is usually infeasible to evaluate or even to approximate accurately. In this doctoral dissertation, we present mathematical techniques for computing NML efficiently for some model families involving discrete data. We also show how these techniques can be used to apply MDL in two practical applications: histogram density estimation and clustering of multi-dimensional data.
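To make the exponential sum concrete, consider the standard Bernoulli example (an illustration, not code from the dissertation): the sum over all 2^n binary sequences collapses to a sum over the sufficient statistic k, the number of ones:

```python
# Illustrative NML computation for the Bernoulli model class (a textbook
# example, not the dissertation's algorithms).
from math import comb, exp, log

def bernoulli_nml_log_normalizer(n):
    """log C_n, with C_n = sum_{k=0}^{n} C(n,k) (k/n)^k ((n-k)/n)^(n-k)."""
    return log(sum(comb(n, k) * (k / n) ** k * ((n - k) / n) ** (n - k)
                   for k in range(n + 1)))

def nml_probability(k, n):
    """NML probability of one particular sequence containing k ones."""
    max_lik = (k / n) ** k * ((n - k) / n) ** (n - k)  # P(x | theta_hat(x))
    return max_lik / exp(bernoulli_nml_log_normalizer(n))

print(bernoulli_nml_log_normalizer(100))  # parametric complexity of the class
print(nml_probability(60, 100))
```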
Abstract:
Matrix decompositions, where a given matrix is represented as a product of two other matrices, are regularly used in data mining. Most matrix decompositions have their roots in linear algebra, but the needs of data mining are not always those of linear algebra. In data mining one needs results that are interpretable, and what is considered interpretable in data mining can be very different from what is considered interpretable in linear algebra.

The purpose of this thesis is to study matrix decompositions that directly address the issue of interpretability. An example is a decomposition of binary matrices where the factor matrices are assumed to be binary and the matrix multiplication is Boolean. The restriction to binary factor matrices increases interpretability, since the factor matrices are of the same type as the original matrix, and allows the use of Boolean matrix multiplication, which is often more intuitive than normal matrix multiplication with binary matrices. Several other decomposition methods are also described, and the computational complexity of computing them is studied together with the hardness of approximating the related optimization problems. Based on these studies, algorithms for constructing the decompositions are proposed. Constructing the decompositions turns out to be computationally hard, and the proposed algorithms are mostly based on various heuristics. Nevertheless, the algorithms are shown to be capable of finding good results in empirical experiments conducted with both synthetic and real-world data.
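A minimal sketch of the Boolean setting (a generic illustration, not the thesis algorithms): under Boolean multiplication an entry of the product is 1 exactly when some factor pair overlaps, and reconstruction error counts disagreeing entries:

```python
# Generic sketch of Boolean matrix factorization arithmetic (not the thesis
# algorithms): the Boolean product of binary factors and the reconstruction error.
import numpy as np

def boolean_product(B, C):
    """(B . C)[i, j] = OR_k (B[i, k] AND C[k, j]); ordinary product thresholded at 1."""
    return ((B @ C) > 0).astype(int)

A = np.array([[1, 1, 0],
              [1, 1, 1],
              [0, 1, 1]])
B = np.array([[1, 0],
              [1, 1],
              [0, 1]])        # rows described by 2 binary factors
C = np.array([[1, 1, 0],
              [0, 1, 1]])     # factor-to-column memberships
A_hat = boolean_product(B, C)
print(A_hat)
print("reconstruction error:", np.sum(A != A_hat))  # number of disagreeing entries
```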
Abstract:
The metabolism of an organism consists of a network of biochemical reactions that transform small molecules, or metabolites, into others in order to produce energy and building blocks for essential macromolecules. The goal of metabolic flux analysis is to uncover the rates, or fluxes, of those biochemical reactions. In a steady state, the sum of the fluxes that produce an internal metabolite is equal to the sum of the fluxes that consume the same molecule. Thus the steady state imposes linear balance constraints on the fluxes. In general, the balance constraints imposed by the steady state are not sufficient to uncover all the fluxes of a metabolic network: the fluxes through cycles and alternative pathways between the same source and target metabolites remain unknown. More information about the fluxes can be obtained from isotopic labelling experiments, where a cell population is fed with labelled nutrients, such as glucose containing 13C atoms. Labels are then transferred by biochemical reactions to other metabolites. The relative abundances of different labelling patterns in internal metabolites depend on the fluxes of the pathways producing them; thus, these relative abundances contain information about the fluxes that cannot be uncovered from the balance constraints derived from the steady state. The field of research that estimates the fluxes utilizing the measured constraints on the relative abundances of different labelling patterns induced by 13C-labelled nutrients is called 13C metabolic flux analysis.

There exist two approaches to 13C metabolic flux analysis. In the optimization approach, a non-linear optimization task is constructed in which candidate fluxes are iteratively generated until they fit the measured abundances of different labelling patterns. In the direct approach, the linear balance constraints given by the steady state are augmented with linear constraints derived from the abundances of different labelling patterns of metabolites. Thus, mathematically involved non-linear optimization methods that can get stuck in local optima can be avoided. On the other hand, the direct approach may require more measurement data than the optimization approach to obtain the same flux information. Furthermore, the optimization framework can easily be applied regardless of the labelling measurement technology and with all network topologies.

In this thesis we present a formal computational framework for direct 13C metabolic flux analysis. The aim of our study is to construct as many linear constraints on the fluxes from the 13C labelling measurements as possible, using only computational methods that avoid non-linear techniques and are independent of the type of measurement data, the labelling of external nutrients, and the topology of the metabolic network. The presented framework is the first representative of the direct approach to 13C metabolic flux analysis that is free from restricting assumptions about these parameters. In our framework, measurement data is first propagated from the measured metabolites to other metabolites; the propagation is facilitated by a flow analysis of metabolite fragments in the network. New linear constraints on the fluxes are then derived from the propagated data by applying techniques of linear algebra. Based on the results of the fragment flow analysis, we also present an experiment planning method that selects sets of metabolites whose relative abundances of different labelling patterns are most useful for 13C metabolic flux analysis. Furthermore, we give computational tools for processing raw 13C labelling data produced by tandem mass spectrometry into a form suitable for 13C metabolic flux analysis.
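The steady-state balance constraints say that the stoichiometric matrix annihilates the flux vector, and the unresolved flux directions span its null space. A toy sketch (an invented network, not the thesis framework):

```python
# Toy illustration of steady-state flux balance (not the thesis framework):
# S v = 0, where S is the stoichiometric matrix (metabolites x reactions).
# The null space of S spans the flux directions the balances alone cannot
# resolve, which is why extra constraints from 13C labelling data are needed.
import numpy as np
from scipy.linalg import null_space

# Toy network, reactions v0..v5:
# in -> A (v0), A -> B (v1), B -> C (v2), B -> D (v3), C -> D (v4), D -> out (v5)
S = np.array([
    [ 1, -1,  0,  0,  0,  0],   # A
    [ 0,  1, -1, -1,  0,  0],   # B
    [ 0,  0,  1,  0, -1,  0],   # C
    [ 0,  0,  0,  1,  1, -1],   # D
])
N = null_space(S)
print("degrees of freedom:", N.shape[1])  # throughput and the B->C->D vs B->D split
print(N.round(3))
```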
Abstract:
Ubiquitous computing is about making computers and computerized artefacts a pervasive part of our everyday lives, bringing more and more activities into the realm of information. This computationalization and informationalization of everyday activities increases not only our reach, efficiency, and capabilities but also the amount and kinds of data gathered about us and our activities. In this thesis, I explore how information systems can be constructed so that they handle this personal data in a reasonable manner. The thesis provides two kinds of results: on the one hand, tools and methods for both the construction and the evaluation of ubiquitous and mobile systems; on the other hand, an evaluation of the privacy aspects of a ubiquitous social awareness system. The work emphasises real-world experiments as the most important way to study privacy. Additionally, the state of current information systems as regards data protection is studied.

The tools and methods in this thesis consist of three distinct contributions. An algorithm for locationing in cellular networks is proposed that does not require the location information to be revealed beyond the user's terminal. A prototyping platform for the creation of context-aware ubiquitous applications, called ContextPhone, is described and released as open source. Finally, a set of methodological findings on the use of smartphones in social scientific field research is reported. A central contribution of this thesis is the set of pragmatic tools that allow other researchers to carry out experiments.

The evaluation of the ubiquitous social awareness application ContextContacts covers both the usage of the system in general and an analysis of its privacy implications. The usage of the system is analyzed, on the basis of several long-term field studies, in the light of how users make inferences about others from real-time contextual cues mediated by the system. The analysis of privacy implications draws together the social psychological theory of self-presentation and research on privacy for ubiquitous computing, deriving a set of design guidelines for such systems.

The main findings from these studies can be summarized as follows. The fact that ubiquitous computing systems gather more data about users can be used not only to study the use of such systems in an effort to create better systems, but also to study previously unstudied phenomena, such as the dynamic change of social networks. Systems that let people create new ways of presenting themselves to others can be fun for the users, but such self-presentation requires several thoughtful design decisions that allow the manipulation of the image mediated by the system. Finally, the growing amount of computational resources available to users can be used to allow them to use the data themselves, rather than merely being passive subjects of data gathering.
Abstract:
This paper is concerned with the analysis of the absolute stability of a non-linear autonomous system consisting of a single non-linearity, belonging to a particular class, in an otherwise linear feedback loop. It is motivated by the earlier Popov-like frequency-domain criteria using the 'multiplier' concept and involves the construction of 'stability multipliers' with prescribed phase characteristics. A few computer-based methods by which this problem can be solved are indicated, and it is shown that this constitutes a step-by-step procedure for testing the stability properties of a given system.
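For context, the classical Popov criterion (the standard statement under the usual assumptions on the linear part G, not a result of this paper) asks for a multiplier q ≥ 0 such that:

```latex
% Classical Popov criterion: absolute stability of the feedback loop with a
% memoryless sector nonlinearity \phi \in [0, k] holds if there exists q \ge 0 with
\[
  \operatorname{Re}\!\big[(1 + j\omega q)\, G(j\omega)\big] + \frac{1}{k} > 0
  \qquad \text{for all } \omega \ge 0 .
\]
% Multiplier methods generalize the factor (1 + j\omega q) to broader multiplier
% classes with prescribed phase characteristics, as discussed in the paper.
```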