53 results for Sums of squares
Abstract:
The aim of this paper is to examine the short-term dynamics of foreign exchange rate spreads. Using a vector autoregressive (VAR) model, we show that most of the variation in the spread comes from the long-run dependencies between past and future spreads rather than being caused by changes in inventory, adverse selection, cost of carry or order processing costs. We apply the Iterated Cumulative Sums of Squares (ICSS) algorithm of Inclan and Tiao (1994) to discover how often spread volatility changes. We find that spread volatility shifts are relatively uncommon and that shifts in one currency spread tend not to spill over to other currency spreads. © 2013.
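The core statistic behind the cumulative-sum-of-squares approach can be sketched in a few lines. The block below is a minimal illustration, not the paper's implementation: it computes the centered statistic D_k = C_k / C_T - k / T and tests for a single variance change, whereas the full ICSS algorithm applies this step iteratively to sub-segments. The function names are hypothetical, and the default critical value of 1.358 is assumed here to be the asymptotic 5% value usually quoted for this test.

```python
import numpy as np

def centered_cusum_of_squares(a):
    """Return D_k = C_k / C_T - k / T for k = 1..T, with C_k the cumulative sum of squares."""
    a = np.asarray(a, dtype=float)
    T = a.size
    C = np.cumsum(a ** 2)
    k = np.arange(1, T + 1)
    return C / C[-1] - k / T

def single_variance_break(a, critical_value=1.358):
    """Test for ONE variance change point (a single step, not the full iterative ICSS).

    Returns (k_star, statistic, is_significant), where k_star is the 1-based
    position that maximises |D_k| and statistic = sqrt(T / 2) * |D_k_star|.
    The critical value is an assumption (asymptotic 5% level).
    """
    D = centered_cusum_of_squares(a)
    T = D.size
    idx = int(np.argmax(np.abs(D)))
    statistic = float(np.sqrt(T / 2.0) * abs(D[idx]))
    return idx + 1, statistic, statistic > critical_value

# Synthetic series whose standard deviation jumps from 1 to 3 halfway through.
rng = np.random.default_rng(0)
series = np.concatenate([rng.normal(0, 1, 500), rng.normal(0, 3, 500)])
k_star, stat, significant = single_variance_break(series)
```

For a series with constant variance, D_k fluctuates around zero; a sustained drift away from zero indicates a variance shift near the extremum.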
Abstract:
Levels of lignin and hydroxycinnamic acid wall components in three genera of forage grasses (Lolium, Festuca and Dactylis) have been accurately predicted by Fourier-transform infrared spectroscopy using partial least squares models correlated to analytical measurements. Different models were derived that predicted the concentrations of acid detergent lignin, total hydroxycinnamic acids, total ferulate monomers plus dimers, p-coumarate and ferulate dimers in independent spectral test data from methanol-extracted samples of perennial forage grass with accuracies of 92.8%, 86.5%, 86.1%, 59.7% and 84.7% respectively, and analysis of model projection scores showed that the models relied generally on spectral features that are known absorptions of these compounds. Acid detergent lignin was predicted in samples of two species of energy grass (Phalaris arundinacea and Panicum virgatum) with an accuracy of 84.5%.
Abstract:
The accurate identification of T-cell epitopes remains a principal goal of bioinformatics within immunology. As the immunogenicity of peptide epitopes is dependent on their binding to major histocompatibility complex (MHC) molecules, the prediction of binding affinity is a prerequisite to the reliable prediction of epitopes. The iterative self-consistent (ISC) partial-least-squares (PLS)-based additive method is a recently developed bioinformatic approach for predicting class II peptide−MHC binding affinity. The ISC−PLS method overcomes many of the conceptual difficulties inherent in the prediction of class II peptide−MHC affinity, such as the binding of a mixed population of peptide lengths due to the open-ended class II binding site. The method has applications in both the accurate prediction of class II epitopes and the manipulation of affinity for heteroclitic and competitor peptides. The method is applied here to six class II mouse alleles (I-Ab, I-Ad, I-Ak, I-As, I-Ed, and I-Ek) and included peptides up to 25 amino acids in length. A series of regression equations highlighting the quantitative contributions of individual amino acids at each peptide position was established. The initial model for each allele exhibited only moderate predictivity. Once the set of selected peptide subsequences had converged, the final models exhibited a satisfactory predictive power. Convergence was reached between the 4th and 17th iterations, and the leave-one-out cross-validation statistical terms - q2, SEP, and NC - ranged between 0.732 and 0.925, 0.418 and 0.816, and 1 and 6, respectively. The non-cross-validated statistical terms r2 and SEE ranged between 0.98 and 0.995 and 0.089 and 0.180, respectively. The peptides used in this study are available from the AntiJen database (http://www.jenner.ac.uk/AntiJen). The PLS method is available commercially in the SYBYL molecular modeling software package. 
The resulting models, which can be used for accurate T-cell epitope prediction, will be made freely available online (http://www.jenner.ac.uk/MHCPred).
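The statistics quoted in the abstract above (r2 and the leave-one-out cross-validated q2) are both ratios of sums of squares. The following sketch shows one common way to compute them. It is an illustration under stated assumptions: ordinary least squares stands in for the PLS additive model, q2 is taken as 1 - PRESS / SS_total with SS_total computed around the overall mean, and all function names are hypothetical.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients (intercept first). OLS is a stand-in for PLS here."""
    X1 = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef

def predict(coef, X):
    """Predictions from coefficients returned by fit_ols."""
    return np.column_stack([np.ones(len(X)), X]) @ coef

def r2_and_loo_q2(X, y):
    """Return (r2, q2): fitted and leave-one-out cross-validated explained variance."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    ss_total = np.sum((y - y.mean()) ** 2)
    # Non-cross-validated residual sum of squares.
    ss_residual = np.sum((y - predict(fit_ols(X, y), X)) ** 2)
    # PRESS: each observation is predicted by a model fitted without it.
    press = 0.0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        coef = fit_ols(X[keep], y[keep])
        press += (y[i] - predict(coef, X[i:i + 1])[0]) ** 2
    return 1.0 - ss_residual / ss_total, 1.0 - press / ss_total

# Synthetic data: 30 samples, 3 descriptors, small noise (purely illustrative).
rng = np.random.default_rng(1)
X_demo = rng.standard_normal((30, 3))
y_demo = X_demo @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.standard_normal(30)
r2, q2 = r2_and_loo_q2(X_demo, y_demo)
```

Because every left-out point is predicted by a model that never saw it, q2 cannot exceed r2 for a least-squares fit, which is why a large gap between the two is usually read as a sign of overfitting.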
Abstract:
Motivation: The immunogenicity of peptides depends on their ability to bind to MHC molecules. MHC binding affinity prediction methods can save significant amounts of experimental work. The class II MHC binding site is open at both ends, making epitope prediction difficult because of the multiple binding ability of long peptides. Results: An iterative self-consistent partial least squares (PLS)-based additive method was applied to a set of 66 peptides no longer than 16 amino acids, binding to DRB1*0401. A regression equation containing the quantitative contributions of the amino acids at each of the nine positions was generated. Its predictability was tested using two external test sets which gave r_pred = 0.593 and r_pred = 0.655, respectively. Furthermore, it was benchmarked using 25 known T-cell epitopes restricted by DRB1*0401 and we compared our results with four other online predictive methods. The additive method showed the best result, finding 24 of the 25 T-cell epitopes. Availability: Peptides used in the study are available from http://www.jenner.ac.uk/JenPep. The PLS method is available commercially in the SYBYL molecular modelling software package. The final model for affinity prediction of peptides binding to the DRB1*0401 molecule is available at http://www.jenner.ac.uk/MHCPred. Models developed for DRB1*0101 and DRB1*0701 are also available in MHCPred.
Abstract:
Neural networks can be regarded as statistical models, and can be analysed in a Bayesian framework. Generalisation is measured by the performance on independent test data drawn from the same distribution as the training data. Such performance can be quantified by the posterior average of the information divergence between the true and the model distributions. Averaging over the Bayesian posterior guarantees internal coherence; using information divergence guarantees invariance with respect to representation. The theory generalises the least mean squares theory for linear Gaussian models to general problems of statistical estimation. The main results are: (1) the ideal optimal estimate is always given by average over the posterior; (2) the optimal estimate within a computational model is given by the projection of the ideal estimate to the model. This incidentally shows some currently popular methods dealing with hyperpriors are in general unnecessary and misleading. The extension of information divergence to positive normalisable measures reveals a remarkable relation between the dlt dual affine geometry of statistical manifolds and the geometry of the dual pair of Banach spaces Ld and Ldd. It therefore offers conceptual simplification to information geometry. The general conclusion on the issue of evaluating neural network learning rules and other statistical inference methods is that such evaluations are only meaningful under three assumptions: the prior P(p), describing the environment of all the problems; the divergence Dd, specifying the requirement of the task; and the model Q, specifying available computing resources.
Abstract:
Neural networks have often been motivated by superficial analogy with biological nervous systems. Recently, however, it has become widely recognised that the effective application of neural networks requires instead a deeper understanding of the theoretical foundations of these models. Insight into neural networks comes from a number of fields including statistical pattern recognition, computational learning theory, statistics, information geometry and statistical mechanics. As an illustration of the importance of understanding the theoretical basis for neural network models, we consider their application to the solution of multi-valued inverse problems. We show how a naive application of the standard least-squares approach can lead to very poor results, and how an appreciation of the underlying statistical goals of the modelling process allows the development of a more general and more powerful formalism which can tackle the problem of multi-modality.
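The failure of naive least squares on a multi-valued inverse problem can be seen in a toy case. The sketch below is an illustrative assumption, not the example used in the paper: the forward map is x = t^2, so each x > 0 has two valid inverses, +sqrt(x) and -sqrt(x). A least-squares fit of t on x approximates the conditional mean of t given x, which is close to zero and matches neither branch.

```python
import numpy as np

def least_squares_inverse(n_samples=2000, degree=3, seed=0):
    """Fit t as a polynomial in x by least squares, where x = t**2 (toy forward model)."""
    rng = np.random.default_rng(seed)
    t = rng.uniform(-1.0, 1.0, n_samples)  # hidden cause
    x = t ** 2                             # observed effect; the inverse is two-valued
    return np.polyfit(x, t, deg=degree)

coeffs = least_squares_inverse()
x_query = 0.64                             # the true inverses are t = +0.8 and t = -0.8
t_hat = float(np.polyval(coeffs, x_query))
distance_to_nearest_branch = min(abs(t_hat - 0.8), abs(t_hat + 0.8))
```

The prediction lands between the two branches, far from both, which is the kind of poor result the abstract attributes to averaging over a multi-modal conditional distribution.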
Abstract:
We consider the problem of illusory or artefactual structure from the visualisation of high-dimensional structureless data. In particular we examine the role of the distance metric in the use of topographic mappings based on the statistical field of multidimensional scaling. We show that the use of a squared Euclidean metric (i.e. the SSTRESS measure) gives rise to an annular structure when the input data is drawn from a high-dimensional isotropic distribution, and we provide a theoretical justification for this observation.
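A well-known property of high-dimensional isotropic data may help build intuition here, although the sketch below does not reproduce the paper's analysis of SSTRESS. It only checks numerically that the Euclidean norms of isotropic Gaussian samples concentrate around sqrt(d) as the dimension d grows, so the points lie in a thin shell. The function name and sample sizes are illustrative assumptions.

```python
import numpy as np

def norm_spread(dim, n_points=2000, seed=0):
    """Mean and standard deviation of the Euclidean norms of isotropic Gaussian samples."""
    rng = np.random.default_rng(seed)
    norms = np.linalg.norm(rng.standard_normal((n_points, dim)), axis=1)
    return float(norms.mean()), float(norms.std())

mean_low, std_low = norm_spread(2)        # norms are widely spread in 2 dimensions
mean_high, std_high = norm_spread(1000)   # norms cluster tightly near sqrt(1000)
relative_spread_low = std_low / mean_low
relative_spread_high = std_high / mean_high
```

The relative spread of the norms shrinks as the dimension increases, so structureless high-dimensional data already looks like a hollow shell before any mapping is applied.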
Abstract:
Correlation and regression are two of the statistical procedures most widely used by optometrists. However, these tests are often misused or interpreted incorrectly, leading to erroneous conclusions from clinical experiments. This review examines the major statistical tests concerned with correlation and regression that are most likely to arise in clinical investigations in optometry. First, the use, interpretation and limitations of Pearson's product moment correlation coefficient are described. Second, the least squares method of fitting a linear regression to data and methods for testing how well a regression line fits the data are described. Third, the problems of using linear regression methods in observational studies, when there are errors in measuring the independent variable, and of predicting a new value of Y for a given X are discussed. Finally, methods for testing whether a non-linear relationship provides a better fit to the data and for comparing two or more regression lines are considered.
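Both procedures named in this abstract reduce to a few sums of squares and cross-products. The sketch below uses the textbook formulas; the function names and the small data set are illustrative assumptions and do not come from the review.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson's product moment correlation: Sxy / sqrt(Sxx * Syy)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx, dy = x - x.mean(), y - y.mean()
    return float(np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2)))

def least_squares_line(x, y):
    """Slope and intercept minimising the sum of squared vertical residuals."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    slope = float(np.sum(dx * (y - y.mean())) / np.sum(dx ** 2))
    intercept = float(y.mean() - slope * x.mean())
    return slope, intercept

def r_squared(x, y):
    """Proportion of the variance in y explained by the fitted line."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = least_squares_line(x, y)
    residual_ss = np.sum((y - (intercept + slope * x)) ** 2)
    total_ss = np.sum((y - y.mean()) ** 2)
    return float(1.0 - residual_ss / total_ss)

# Hypothetical measurements, for illustration only.
x_demo = [1.0, 2.0, 3.0, 4.0, 5.0]
y_demo = [2.1, 3.9, 6.2, 7.8, 10.1]
r = pearson_r(x_demo, y_demo)
slope, intercept = least_squares_line(x_demo, y_demo)
```

For simple linear regression the coefficient of determination equals the square of Pearson's r, which is a quick consistency check on any implementation.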
Abstract:
When applying multivariate analysis techniques in information systems and social science disciplines, such as management information systems (MIS) and marketing, the assumption that the empirical data originate from a single homogeneous population is often unrealistic. When applying a causal modeling approach, such as partial least squares (PLS) path modeling, segmentation is a key issue in coping with the problem of heterogeneity in estimated cause-and-effect relationships. This chapter presents a new PLS path modeling approach which classifies units on the basis of the heterogeneity of the estimates in the inner model. If unobserved heterogeneity significantly affects the estimated path model relationships on the aggregate data level, the methodology will allow homogeneous groups of observations to be created that exhibit distinctive path model estimates. The approach will, thus, provide differentiated analytical outcomes that permit more precise interpretations of each segment formed. An application on a large data set in an example of the American customer satisfaction index (ACSI) substantiates the methodology’s effectiveness in evaluating PLS path modeling results.
Abstract:
Dispersal of a Hypogymnia physodes (L.) Nyl. population was studied on an isolated Prunus blireiana L. tree at a site in North Seattle, U.S.A. Lichen propagules were trapped on adhesive strips pinned to four sites on the tree for 7 successive days. Soredia of H. physodes were frequently deposited on the strips but thallus fragments were rare. More soredia were deposited on the upper and lower branches than on the trunk; few soredia were deposited on the underside of the branches. The total daily deposition of soredia on the tree was positively correlated with average daily wind speed. Dispersal downwind from the tree was studied with squares of adhesive contact paper pinned to boards and located at intervals up to 25 m from the tree. Soredia and a few thallus fragments were recorded 25 m and 10 m, respectively, downwind on a day when average wind speed was 10.3 m/sec. The dispersal of soredia by wind from four individual thalli was studied over 10 successive days. Soredia were deposited from each thallus on each day mostly within 2 cm of the source. Higher wind speeds were necessary to disperse soredia on days when the relative humidity was high. Soredia and thallus fragments were also dispersed by splash dispersal. More soredia were splashed furthest at a splash height of 90 cm. These results suggest that initial colonization of the tree by H. physodes may have occurred by wind-dispersed soredia. Subsequent spread probably occurred from established thalli mainly by the dispersal of soredia by wind and rain splash.
Abstract:
Studies of spatial summation often use sinusoidal gratings with blurred edges. When the envelope is elongated (i) along the grating stripes and (ii) at right angles to the grating stripes, we refer to the stimuli as skunk-tails and tiger-tails respectively. Previous work [Polat & Tyler, 1999; Vision Research, 39, 887-895.] has found that sensitivity to skunk-tails is greater than for tiger-tails, but there have been several failures to replicate this result within a subset of the conditions. To address this we measured detection thresholds for skunk-tails, tiger-tails and squares of grating with sides matched to the lengths of the tails. For foveal viewing, we found a contrast sensitivity advantage in the order of 2 dB for skunk-tails over tiger-tails, but only for horizontal gratings. For vertical gratings, sensitivity was very similar for both tail-types. When the stimuli were presented parafoveally (upper right visual field), a small advantage was found for skunk-tails over tiger-tails at both orientations, and spatial summation slopes were close to that of the ideal observer. We did not replicate the findings of Polat & Tyler, but our results are consistent with (i) those of Foley et al. [Foley, J. M., Varadharajan, S., Koh, C. C., & Farias, C. Q. (2007) Vision Research, 47, 85-107.] who used only vertical gratings and (ii) those from modelfest, where only horizontal gratings were used. The small effect of tail-type here suggests an anisotropy in the underlying physiology. © 2007 Elsevier Ltd. All rights reserved.
Abstract:
The Gestalt theorists of the early twentieth century proposed a psychological primacy for circles, squares and triangles over other shapes. They described them as 'good' shapes and the Gestalt premise has been widely accepted. Rosch (1973), for example, suggested that shape categories formed around these 'natural' prototypes irrespective of the paucity of shape terms in a language. Rosch found that speakers of a language lacking terms for any geometric shape nevertheless learnt paired-associates to these 'good' shapes more easily than to asymmetric variants. We question these empirical data in the light of the accumulation of recent evidence in other perceptual domains that language affects categorization. A cross-cultural investigation sought to replicate Rosch's findings with the Himba of Northern Namibia who also have no terms in their language for the supposedly basic shapes of circle, square and triangle. A replication of Rosch (1973) found no advantage for these 'good' shapes in the organization of categories. It was concluded that there is no necessary salience for circles, squares and triangles. Indeed, we argue for the opposite because these shapes are rare in nature. The general absence of straight lines and symmetry in the perceptual environment should rather make circles, squares and triangles unusual and, therefore, less likely to be used as prototypes in categorization tasks. We place shape as one of the types of perceptual input (in philosophical terms, 'vague') that is readily susceptible to effects of language variation.
Abstract:
Methods of dynamic modelling and analysis of structures, for example the finite element method, are well developed. However, it is generally agreed that accurate modelling of complex structures is difficult and for critical applications it is necessary to validate or update the theoretical models using data measured from actual structures. The techniques of identifying the parameters of linear dynamic models using vibration test data have attracted considerable interest recently. However, no method has received general acceptance due to a number of difficulties. These difficulties are mainly due to (i) the incomplete number of vibration modes that can be excited and measured, (ii) the incomplete number of coordinates that can be measured, (iii) inaccuracy in the experimental data, and (iv) inaccuracy in the model structure. This thesis reports on a new approach to update the parameters of a finite element model as well as a lumped parameter model with a diagonal mass matrix. The structure and its theoretical model are equally perturbed by adding mass or stiffness and the incomplete number of eigen-data is measured. The parameters are then identified by an iterative updating of the initial estimates, by sensitivity analysis, using eigenvalues or both eigenvalues and eigenvectors of the structure before and after perturbation. It is shown that with a suitable choice of the perturbing coordinates exact parameters can be identified if the data and the model structure are exact. The theoretical basis of the technique is presented. To cope with measurement errors and possible inaccuracies in the model structure, a well known Bayesian approach is used to minimize the least squares difference between the updated and the initial parameters. The eigen-data of the structure with added mass or stiffness is also determined using the frequency response data of the unmodified structure by a structural modification technique. Thus, mass or stiffness do not have to be added physically. The mass-stiffness addition technique is demonstrated by simulation examples and laboratory experiments on beams and an H-frame.
Abstract:
Distributed Brillouin sensing of strain and temperature works by making spatially resolved measurements of the position of the measurand-dependent extremum of the resonance curve associated with the scattering process in the weakly nonlinear regime. Typically, measurements of backscattered Stokes intensity (the dependent variable) are made at a number of predetermined fixed frequencies covering the design measurand range of the apparatus and combined to yield an estimate of the position of the extremum. The measurand can then be found because its relationship to the position of the extremum is assumed known. We present analytical expressions relating the relative error in the extremum position to experimental errors in the dependent variable. This is done for two cases: (i) a simple non-parametric estimate of the mean based on moments and (ii) the case in which a least squares technique is used to fit a Lorentzian to the data. The question of statistical bias in the estimates is discussed and in the second case we go further and present for the first time a general method by which the probability density function (PDF) of errors in the fitted parameters can be obtained in closed form in terms of the PDFs of the errors in the noisy data.
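The two kinds of estimator contrasted in this abstract can be sketched on synthetic data. The block below is a simplified illustration, not the paper's method: the moment-based estimate is taken to be the intensity-weighted centroid of the sampled frequencies, and the Lorentzian fit is replaced by a linearised variant that exploits the fact that the reciprocal of a Lorentzian is a quadratic in frequency, so an ordinary polynomial least-squares fit recovers the peak position. All names, frequencies and widths are hypothetical.

```python
import numpy as np

def lorentzian(f, f0, half_width, amplitude=1.0):
    """Lorentzian line shape centred at f0 with the given half width at half maximum."""
    return amplitude / (1.0 + ((f - f0) / half_width) ** 2)

def centroid_estimate(f, y):
    """Moment-based estimate: intensity-weighted mean of the sampled frequencies."""
    f = np.asarray(f, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.sum(f * y) / np.sum(y))

def inverse_quadratic_estimate(f, y):
    """Linearised least-squares fit (a swapped-in simplification, not a full Lorentzian fit).

    1 / y is quadratic in f for a Lorentzian; the vertex of the fitted parabola
    gives the peak position. Frequencies are centred first for conditioning.
    """
    f = np.asarray(f, dtype=float)
    y = np.asarray(y, dtype=float)
    shift = f.mean()
    a, b, _ = np.polyfit(f - shift, 1.0 / y, deg=2)
    return float(shift - b / (2.0 * a))

# 21 fixed frequency offsets (arbitrary units) and a noise-free peak at 7.0.
freqs = np.linspace(-50.0, 50.0, 21)
signal = lorentzian(freqs, f0=7.0, half_width=15.0)
peak_centroid = centroid_estimate(freqs, signal)
peak_fit = inverse_quadratic_estimate(freqs, signal)
```

Even without noise, the centroid is pulled toward the middle of the sampled window because the Lorentzian tails are truncated asymmetrically, which is one simple way the question of bias raised in the abstract shows up. With noisy data the reciprocal transform reweights the errors, so this linearised fit is only a starting point for a proper nonlinear least-squares fit.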
Abstract:
Substantial altimetry datasets collected by different satellites have only become available during the past five years, but the future will bring a variety of new altimetry missions, both parallel and consecutive in time. The characteristics of each produced dataset vary with the different orbital heights and inclinations of the spacecraft, as well as with the technical properties of the radar instrument. An integral analysis of datasets with different properties offers advantages both in terms of data quantity and data quality. This thesis is concerned with the development of the means for such integral analysis, in particular for dynamic solutions in which precise orbits for the satellites are computed simultaneously. The first half of the thesis discusses the theory and numerical implementation of dynamic multi-satellite altimetry analysis. The most important aspect of this analysis is the application of dual satellite altimetry crossover points as a bi-directional tracking data type in simultaneous orbit solutions. The central problem is that the spatial and temporal distributions of the crossovers are in conflict with the time-organised nature of traditional solution methods. Their application to the adjustment of the orbits of both satellites involved in a dual crossover therefore requires several fundamental changes of the classical least-squares prediction/correction methods. The second part of the thesis applies the developed numerical techniques to the problems of precise orbit computation and gravity field adjustment, using the altimetry datasets of ERS-1 and TOPEX/Poseidon. Although the two datasets can be considered less compatible than those of planned future satellite missions, the obtained results adequately illustrate the merits of a simultaneous solution technique. In particular, the geographically correlated orbit error is partially observable from a dataset consisting of crossover differences between two sufficiently different altimetry datasets, while being unobservable from the analysis of altimetry data of both satellites individually. This error signal, which has a substantial gravity-induced component, can be employed advantageously in simultaneous solutions for the two satellites in which the harmonic coefficients of the gravity field model are also estimated.