874 results for Time-varying variable selection
Abstract:
The study of variable stars is an important topic in modern astrophysics. With the advent of powerful telescopes and high-resolution CCDs, variable star data are accumulating on the order of petabytes. Handling this huge amount of data requires automated methods as well as human experts. This thesis is devoted to the analysis of astronomical time series data of variable stars, and hence belongs to the interdisciplinary field of Astrostatistics. For an observer on Earth, stars whose apparent brightness changes over time are called variable stars. The variation in brightness may be regular (periodic), quasi-periodic (semi-periodic) or irregular (aperiodic), and has various causes. In some cases the variation is due to internal thermonuclear processes; these stars are known as intrinsic variables. In other cases it is due to external processes, such as eclipses or rotation; these are known as extrinsic variables. Intrinsic variables can be further grouped into pulsating variables, eruptive variables and flare stars. Extrinsic variables are grouped into eclipsing binary stars and chromospheric stars. Pulsating variables can again be classified into Cepheid, RR Lyrae, RV Tauri, Delta Scuti, Mira, etc. The eruptive or cataclysmic variables are novae, supernovae, etc., which occur rarely and are not periodic phenomena. Most of the other variations are periodic in nature.

Variable stars can be observed in many ways, such as photometry, spectrophotometry and spectroscopy. A sequence of photometric observations of a variable star produces time series data containing time, magnitude and error. The plot of a variable star's apparent magnitude against time is known as a light curve. If the time series is folded on a period, the plot of apparent magnitude against phase is known as a phased light curve. The unique shape of the phased light curve is a characteristic of each type of variable star. One way to identify and classify the type of a variable star is visual inspection of the phased light curve by an expert. For the last several years, automated algorithms have been used to classify groups of variable stars with the help of computers.

Research on variable stars can be divided into stages such as observation, data reduction, data analysis, modeling and classification. Modeling of variable stars helps to determine short-term and long-term behaviour, to construct theoretical models (e.g., the Wilson-Devinney model for eclipsing binaries) and to derive stellar properties such as mass, radius, luminosity, temperature, internal and external structure, chemical composition and evolution. Classification requires the determination of basic parameters such as period, amplitude and phase, as well as some derived parameters. Of these, period is the most important, since wrong periods lead to sparse light curves and misleading information. Time series analysis applies mathematical and statistical tests to data in order to quantify the variation, understand the nature of the time-varying phenomena, gain physical understanding of the system and predict its future behavior. Astronomical time series usually suffer from unevenly spaced time instants, varying error conditions and possibly large gaps. For ground-based observations this is due to the day-night cycle and varying weather conditions, while observations from space may suffer from the impact of cosmic ray particles.
Many large-scale astronomical surveys such as MACHO, OGLE, EROS, ROTSE, PLANET, Hipparcos, MISAO, NSVS, ASAS, Pan-STARRS, Kepler, ESA, Gaia, LSST and CRTS provide variable star time series data, even though their primary intention is not variable star observation. The Center for Astrostatistics at Pennsylvania State University was established to help the astronomical community with statistical tools for harvesting and analysing archival data. Most of these surveys release their data to the public for further analysis.

There exist many period search algorithms for astronomical time series analysis, which can be classified into parametric methods (which assume some underlying distribution for the data) and non-parametric methods (which do not assume any statistical model, such as a Gaussian). Many of the parametric methods are based on variations of the discrete Fourier transform, such as the Generalised Lomb-Scargle periodogram (GLSP) by Zechmeister (2009) and Significant Spectrum (SigSpec) by Reegen (2007). Non-parametric methods include Phase Dispersion Minimisation (PDM) by Stellingwerf (1978) and the cubic spline method by Akerlof (1994). Even though most of these methods can be automated, none of them fully recovers the true periods. Wrong period detection can have several causes, such as power leakage to other frequencies, which is due to the finite total interval, finite sampling interval and finite amount of data. Another problem is aliasing, caused by regular sampling. Spurious periods also appear due to long gaps, and power flow into harmonic frequencies is an inherent problem of Fourier methods. Hence obtaining the exact period of a variable star from its time series data is still a difficult problem for huge databases subjected to automation. As Matthew Templeton (AAVSO) states, "Variable star data analysis is not always straightforward; large-scale, automated analysis design is non-trivial". Derekas et al. (2007) and Deb et al. (2010) state, "The processing of huge amounts of data in these databases is quite challenging, even when looking at seemingly small issues such as period determination and classification".

It would be beneficial for the variable star astronomical community if basic parameters such as period, amplitude and phase could be obtained more accurately when huge time series databases are subjected to automation. In the present thesis, the theories of four popular period search methods are studied, the strengths and weaknesses of these methods are evaluated by applying them to two survey databases, and finally a modified form of the cubic spline method is introduced to confirm the exact period of a variable star. For the classification of newly discovered variable stars and their entry into the "General Catalogue of Variable Stars" or other databases such as the "Variable Star Index", the characteristics of the variability have to be quantified in terms of variable star parameters.
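As a rough illustration of the non-parametric approach discussed above, the following is a minimal Python sketch (not the thesis code; bin count, noise model and trial grid are invented) of phase-folding a light curve and scoring trial periods with a PDM-style statistic:

import numpy as np

def pdm_statistic(t, mag, period, n_bins=10):
    # PDM-style score: mean within-bin variance of the phase-folded
    # light curve over the overall variance; low values = good period.
    phase = (t / period) % 1.0                    # fold times onto [0, 1)
    bins = np.floor(phase * n_bins).astype(int)
    overall_var = np.var(mag, ddof=1)
    num, den = 0.0, 0
    for b in range(n_bins):
        m = mag[bins == b]
        if len(m) > 1:
            num += (len(m) - 1) * np.var(m, ddof=1)
            den += len(m) - 1
    return (num / den) / overall_var

# Usage: scan a grid of trial periods and keep the minimiser.
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0, 100, 500))             # unevenly spaced times
mag = 12.0 + 0.3 * np.sin(2 * np.pi * t / 2.5) + rng.normal(0, 0.05, t.size)
trial_periods = np.linspace(0.5, 10.0, 2000)
scores = [pdm_statistic(t, mag, p) for p in trial_periods]
best_period = trial_periods[int(np.argmin(scores))]  # should be near 2.5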
Abstract:
An adaptive tuned vibration absorber (ATVA) with a smart variable stiffness element is capable of retuning itself in response to a time-varying excitation frequency, enabling effective vibration control over a range of frequencies. This paper discusses novel methods of achieving variable stiffness in an ATVA by changing shape, as inspired by biological paradigms. It is shown that considerable variation in the tuned frequency can be achieved by actuating a shape change, provided that this is within the limits of the actuator. A feasible design for such an ATVA is one in which the device offers low resistance to the required shape change actuation while not being restricted to low values of the effective stiffness of the vibration absorber. Three such original designs are identified: (i) a pinned-pinned arch beam with a fixed profile of slight curvature and variable preload through an adjustable natural curvature; (ii) a vibration absorber with a stiffness element formed from parallel curved beams of adjustable curvature vibrating longitudinally; (iii) a vibration absorber with a variable geometry linkage as stiffness element. Experimental results from demonstrators based on two of these designs show good correlation with the theory.
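The underlying relation is the classical tuned-absorber formula f = (1/2π)√(k/m); a toy calculation (mass and stiffness values invented for illustration, not taken from the paper) shows how much retuning a given stiffness change buys:

import numpy as np

# Classical absorber tuning: f = sqrt(k/m) / (2*pi).
# Values are illustrative only; in the paper the effective stiffness k
# is varied geometrically (arch preload, beam curvature, linkage geometry).
m = 0.1                                  # absorber mass [kg], assumed
k = np.linspace(1.0e3, 4.0e3, 4)         # effective stiffness sweep [N/m], assumed
f = np.sqrt(k / m) / (2.0 * np.pi)
print(f)   # a 4x stiffness range yields a 2x tuned-frequency range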
Abstract:
This paper presents two Variable Structure Controllers (VSC) for continuous-time switched plants, under the assumption that the state vector is available for feedback. The proposed control system provides a switching rule together with the variable structure control input. The design is based on Lyapunov-Metzler (LM) inequalities and on stability results for Strictly Positive Real (SPR) systems. The definition of Lyapunov-Metzler-SPR (LMS) systems and its direct application in the design of VSC for switched systems are introduced in this paper. Two examples illustrate the design of the proposed VSC, considering a plant given by a switched system with a switched-state control law and two linear time-invariant systems that are not controllable and cannot be stabilized with state feedback.
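For context, the Lyapunov-Metzler inequalities underpinning the design, in the standard form of Geromel and Colaneri (known background, not this paper's contribution), ask for matrices \(P_i \succ 0\) and a Metzler matrix \(\Pi = [\pi_{ji}]\) (nonnegative off-diagonal entries, zero column sums) such that

\[
A_i^{\top} P_i + P_i A_i + \sum_{j=1}^{N} \pi_{ji} P_j \prec 0, \qquad i = 1, \dots, N,
\]

in which case the state-dependent switching rule \(\sigma(x) = \arg\min_i x^{\top} P_i x\) stabilizes the switched system \(\dot{x} = A_{\sigma} x\).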
Abstract:
If change over time is compared across several groups, it is important to take baseline values into account so that the comparison is carried out under the same preconditions. As the observed baseline measurements are distorted by measurement error, it may not be sufficient to include them as a covariate. A solution to this problem has recently been provided: by fitting a longitudinal mixed-effects model to all data, including the baseline observations, and subsequently calculating the expected change conditional on the underlying baseline value, groups with the same baseline characteristics can be compared. In this article, we present an extended approach in which a broader set of models can be used. Specifically, it is possible to include any desired set of interactions between the time variable and the other covariates, and time-dependent covariates can also be included. Additionally, we extend the method to adjust for baseline measurement error in other time-varying covariates. We apply the methodology to data from the Swiss HIV Cohort Study to address the question of whether joint infection with HIV-1 and hepatitis C virus leads to a slower increase of CD4 lymphocyte counts over time after the start of antiretroviral therapy.
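A minimal sketch of the modelling idea (not the authors' code; the data layout, variable names and random-effects structure are assumptions) using Python's statsmodels: baseline measurements enter as outcomes at time 0 rather than as a covariate, so their measurement error is modelled instead of being ignored.

import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical long-format data: one row per visit, with columns
# id, time (0 at baseline), group, and outcome y.
df = pd.read_csv("cohort.csv")

model = smf.mixedlm("y ~ time * group", data=df,
                    groups=df["id"], re_formula="~time")  # random intercept and slope
fit = model.fit()

# The expected change conditional on the underlying baseline value is then
# derived from the fitted fixed and random effects, rather than from the
# error-prone observed baseline measurement.
print(fit.summary())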
Abstract:
Fitting statistical models is computationally challenging when the sample size or the dimension of the dataset is huge. An attractive approach to down-scaling the problem size is to first partition the dataset into subsets and then fit using distributed algorithms. The dataset can be partitioned either horizontally (in the sample space) or vertically (in the feature space), and the challenge arises in defining an algorithm with low communication, theoretical guarantees and excellent practical performance in general settings. For sample space partitioning, I propose a MEdian Selection Subset AGgregation Estimator (message) algorithm for solving these issues. The algorithm applies feature selection in parallel to each subset using regularized regression or a Bayesian variable selection method, calculates the 'median' feature inclusion index, estimates coefficients for the selected features in parallel for each subset, and then averages these estimates. The algorithm is simple, involves very minimal communication, scales efficiently in sample size, and has theoretical guarantees. I provide extensive experiments showing excellent performance in feature selection, estimation, prediction, and computation time relative to the usual competitors.
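A minimal sketch of the message recipe under stated assumptions (the lasso as the per-subset selector and plain least-squares refits; an illustration, not the thesis implementation):

import numpy as np
from sklearn.linear_model import LassoCV, LinearRegression

def message(X, y, n_subsets=4, seed=0):
    # 1) partition the samples into subsets
    idx = np.random.default_rng(seed).permutation(len(y))
    folds = np.array_split(idx, n_subsets)
    # 2) run feature selection in parallel on each subset (lasso here)
    inclusion = [LassoCV(cv=5).fit(X[f], y[f]).coef_ != 0 for f in folds]
    # 3) keep features whose inclusion index has median one across subsets
    keep = np.median(np.array(inclusion), axis=0) >= 0.5
    # 4) refit on the selected features per subset and average the estimates
    coefs = [LinearRegression().fit(X[f][:, keep], y[f]).coef_ for f in folds]
    return keep, np.mean(coefs, axis=0)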
While sample space partitioning is useful in handling datasets with a large sample size, feature space partitioning is more effective when the data dimension is high. Existing methods for partitioning features, however, are either vulnerable to high correlations or inefficient in reducing the model dimension. In the thesis, I propose a new embarrassingly parallel framework named DECO for distributed variable selection and parameter estimation. In DECO, variables are first partitioned and allocated to m distributed workers. The decorrelated subset data within each worker are then fitted via any algorithm designed for high-dimensional problems. We show that by incorporating the decorrelation step, DECO can achieve consistent variable selection and parameter estimation on each subset with (almost) no assumptions. In addition, the convergence rate is nearly minimax optimal for both sparse and weakly sparse models and does NOT depend on the partition number m. Extensive numerical experiments are provided to illustrate the performance of the new framework.
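A rough sketch of the decorrelation idea (following the published DECO description; the ridge term, scaling and per-block lasso are my assumptions for illustration, not the thesis code):

import numpy as np
from sklearn.linear_model import LassoCV

def deco(X, y, n_workers=4, ridge=1.0):
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    # Whitening matrix F ~ (X X^T / p + r I)^{-1/2}: the rows of F @ Xc are
    # (approximately) decorrelated, so feature blocks can be fitted apart.
    G = Xc @ Xc.T / p + ridge * np.eye(n)
    w, V = np.linalg.eigh(G)
    F = V @ np.diag(w ** -0.5) @ V.T
    Xt, yt = F @ Xc, F @ y
    # Partition the features across workers and fit each block independently.
    beta = np.zeros(p)
    for block in np.array_split(np.arange(p), n_workers):
        beta[block] = LassoCV(cv=5).fit(Xt[:, block], yt).coef_
    return beta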
For datasets with both large sample sizes and high dimensionality, I propose a new "divide-and-conquer" framework, DEME (DECO-message), leveraging both the DECO and the message algorithms. The new framework first partitions the dataset in the sample space into row cubes using message and then partitions the feature space of the cubes using DECO. This procedure is equivalent to partitioning the original data matrix into multiple small blocks, each with a feasible size that can be stored and fitted in a computer in parallel. The results are then synthesized via the DECO and message algorithms in reverse order to produce the final output. The whole framework is extremely scalable.
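Schematically, the nested partition can be sketched as follows (hypothetical helper fit_block, with ordinary least squares standing in for the per-block DECO fit; the actual synthesis follows the message and DECO recipes above):

import numpy as np

def fit_block(Xb, yb):
    # placeholder per-block estimator (ordinary least squares);
    # the thesis would run DECO's decorrelated selection/fit here
    return np.linalg.lstsq(Xb, yb, rcond=None)[0]

def deme(X, y, n_row_cubes=2, n_col_blocks=2):
    # DEME sketch: nested row ("message") and column ("DECO") partition
    p = X.shape[1]
    estimates = []
    for rows in np.array_split(np.arange(len(y)), n_row_cubes):
        beta = np.zeros(p)
        for cols in np.array_split(np.arange(p), n_col_blocks):
            beta[cols] = fit_block(X[np.ix_(rows, cols)], y[rows])
        estimates.append(beta)
    # the reverse-order synthesis collapses here to simple averaging
    return np.mean(estimates, axis=0)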
Abstract:
When a task must be executed in a remote or dangerous environment, teleoperation systems may be employed to extend the influence of the human operator. In the case of manipulation tasks, haptic feedback of the forces experienced by the remote (slave) system is often highly useful in improving an operator's ability to perform effectively. In many of these cases (especially teleoperation over the internet and ground-to-space teleoperation), substantial communication latency exists in the control loop and has the strong tendency to cause instability of the system. The first viable solution to this problem in the literature was based on a scattering/wave transformation from transmission line theory. This wave transformation requires the designer to select a wave impedance parameter appropriate to the teleoperation system. It is widely recognized that a small value of wave impedance is well suited to free motion and a large value is preferable for contact tasks. Beyond this basic observation, however, very little guidance exists in the literature regarding the selection of an appropriate value. Moreover, prior research on impedance selection generally fails to account for the fact that in any realistic contact task there will simultaneously exist contact considerations (perpendicular to the surface of contact) and quasi-free-motion considerations (parallel to the surface of contact). The primary contribution of the present work is to introduce an approximate linearized optimum for the choice of wave impedance and to apply this quasi-optimal choice to the Cartesian reality of such a contact task, in which it cannot be expected that a given joint will be either perfectly normal to or perfectly parallel to the motion constraint. The proposed scheme selects a wave impedance matrix that is appropriate to the conditions encountered by the manipulator. This choice may be implemented as a static wave impedance value or as a time-varying choice updated according to the instantaneous conditions encountered. A Lyapunov-like analysis is presented demonstrating that time variation in wave impedance will not violate the passivity of the system. Experimental trials, both in simulation and on a haptic feedback device, are presented validating the technique. Consideration is also given to the case of an uncertain environment, in which an a priori impedance choice may not be possible.
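For reference, the scattering/wave transformation referred to above, in its standard form (textbook background rather than this work's contribution), maps velocity \(\dot{x}\) and force \(F\) into wave variables via the wave impedance \(b\):

\[
u = \frac{b\dot{x} + F}{\sqrt{2b}}, \qquad v = \frac{b\dot{x} - F}{\sqrt{2b}},
\]

so that the transmitted power satisfies \(F\dot{x} = \tfrac{1}{2}(u^2 - v^2)\) and the delayed channel remains passive for any constant delay; the matrix-valued, possibly time-varying choice of \(b\) is what the present work optimizes.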
Abstract:
The structural engineering community in Brazil faces new challenges with the recent occurrence of high-intensity tornados. Satellite surveillance data show that the area covering the south-east of Brazil, Uruguay and part of Argentina is one of the world's most tornado-prone areas, second only to the infamous tornado alley in the central United States. The design of structures subject to tornado winds is a typical example of decision making in the presence of uncertainty. Structural design involves finding a good balance between the competing goals of safety and economy, and this paper presents a methodology to find the optimum balance between these goals in the presence of uncertainty. Reliability-based risk optimization is used to find the optimal safety coefficient that minimizes the total expected cost of a steel frame communications tower subject to extreme storm and tornado wind loads. The technique is not new, but it is applied to a practical problem of increasing interest to Brazilian structural engineers. The problem is formulated in the partial safety factor format used in current design codes, with an additional partial factor introduced to serve as the optimization variable. The expected cost of failure (or risk) is defined as the product of a limit state exceedance probability and a limit state exceedance cost. These costs include the costs of repairing, rebuilding, and paying compensation for injury and loss of life. The total expected failure cost is the sum of the individual expected costs over all failure modes. The steel frame communications tower studied here has become very common in Brazil due to increasing mobile phone coverage. The study shows that the optimum reliability is strongly dependent on the cost (or consequences) of failure. Since failure consequences depend on the actual tower location, it turns out that different optimum designs should be used in different locations. Failure consequences are also different for the different parties involved in the design, construction and operation of the tower. Hence, it is important that risk is well understood by the parties involved, so that proper contracts can be made. The investigation shows that when non-structural terms dominate design costs (e.g., in residential or office buildings) it is not too costly to over-design; this observation is in agreement with the observed practice for non-optimized structural systems. In this situation, it is much easier to lose money by under-design. When structural material cost is a significant part of the design cost (e.g., a concrete dam or bridge), one is likely to lose significant money by over-design. In this situation, a cost-risk-benefit optimization analysis is highly recommended. Finally, the study also shows that under time-varying loads like tornados, the optimum reliability is strongly dependent on the selected design life.
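In symbols (notation introduced here for illustration, not taken from the paper), the design problem described above is

\[
\min_{\lambda} \; C_{\text{total}}(\lambda) = C_{\text{construction}}(\lambda) + \sum_{i} P_{f,i}(\lambda)\, C_{f,i},
\]

where \(\lambda\) is the additional partial safety factor used as the optimization variable, \(P_{f,i}\) is the exceedance probability of the \(i\)-th limit state, and \(C_{f,i}\) is its exceedance cost (repair, rebuilding and compensation); the optimum safety coefficient is the minimizer of the total expected cost.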
Abstract:
An order of magnitude gain in sensitivity is described for using quasar spectra to investigate possible time or space variation in the fine structure constant alpha. Applying the method to a sample of 30 absorption systems spanning redshifts 0.5 < z < 1.6, we derive limits on variations in alpha over a wide range of epochs. For the whole sample, Delta alpha/alpha = (-1.1 +/- 0.4) x 10^-5. This deviation is dominated by measurements at z > 1, where Delta alpha/alpha = (-1.9 +/- 0.5) x 10^-5. For z < 1, Delta alpha/alpha = (-0.2 +/- 0.4) x 10^-5. While this is consistent with a time-varying alpha, further work is required to explore possible systematic errors in the data, although careful searches have so far revealed none.
Abstract:
Tuberculosis (TB) is a worldwide infectious disease that has shown extremely high mortality levels over time. The urgent need to develop new antitubercular drugs is due to the increasing rate of appearance of strains that are multi-drug resistant to the commonly used drugs, and to the longer durations of therapy and recovery, particularly in immuno-compromised patients. The major goal of the present study is the exploration of data from different families of compounds through a variety of machine learning techniques, so that robust QSAR-based models can be developed to further guide the quest for new potent anti-TB compounds. Eight QSAR models were built using various types of descriptors (from the ADRIANA.Code and Dragon software) with two publicly available, structurally diverse data sets, including recent data deposited in PubChem. The QSAR methodologies used Random Forests and Associative Neural Networks. Predictions for the external evaluation sets achieved accuracies in the range 0.76-0.88 (for active/inactive classification) and Q^2 = 0.66-0.89 for regressions. The models developed in this study can be used to estimate the anti-TB activity of drug candidates at early stages of drug development.
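As a schematic of the classification pipeline (scikit-learn's random forest standing in for the descriptor tools and ASNN models actually used; the file name, label column and split are invented):

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Hypothetical table: one row per compound, descriptor columns plus an
# "active" label (1 = active against M. tuberculosis, 0 = inactive).
data = pd.read_csv("antitb_descriptors.csv")
X, y = data.drop(columns="active"), data["active"]

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
model = RandomForestClassifier(n_estimators=500, random_state=0).fit(X_tr, y_tr)

# External-set style evaluation; the paper reports accuracies of 0.76-0.88.
print(accuracy_score(y_te, model.predict(X_te)))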
Abstract:
The momentum anomaly has been widely documented in the literature. However, there are still many issues on which there is no consensus, and puzzles left unexplained. One is that strategies based on momentum present a level of risk that is inconsistent with the diversification they offer. Moreover, recent studies indicate that this risk is time-varying and mostly strategy-specific. This work project hypothesises and proves that this evidence is explained by the portfolio constitution of the momentum strategy over time, namely the covariance and correlation between companies within the top and bottom deciles and across them.
Abstract:
Understanding the genetic underpinnings of adaptive change is a fundamental but largely unresolved problem in evolutionary biology. Drosophila melanogaster, an ancestrally tropical insect that has spread to temperate regions and become cosmopolitan, offers a powerful opportunity for identifying the molecular polymorphisms underlying clinal adaptation. Here, we use genome-wide next-generation sequencing of DNA pools ('pool-seq') from three populations collected along the North American east coast to examine patterns of latitudinal differentiation. Comparing the genomes of these populations is particularly interesting since they exhibit clinal variation in a number of important life history traits. We find extensive latitudinal differentiation, with many of the most strongly differentiated genes involved in major functional pathways such as the insulin/TOR, ecdysone, torso, EGFR, TGFβ/BMP, JAK/STAT, immunity and circadian rhythm pathways. We observe particularly strong differentiation on chromosome 3R, especially within the cosmopolitan inversion In(3R)Payne, which contains a large number of clinally varying genes. While much of the differentiation might be driven by clinal differences in the frequency of In(3R)P, we also identify genes that are likely independent of this inversion. Our results provide genome-wide evidence consistent with pervasive spatially variable selection acting on numerous loci and pathways along the well-known North American cline, with many candidates implicated in life history regulation and exhibiting parallel differentiation along the previously investigated Australian cline.
Abstract:
VAR methods have been used to model the inter-relationships between inflows into and outflows out of unemployment and vacancies, using tools such as impulse response analysis. In order to investigate whether such impulse responses change over the course of the business cycle or over time, this paper uses TVP-VARs for US and Canadian data. For the US, we find interesting differences between the most recent recession and earlier recessions and expansions. In particular, we find the immediate effect of a negative shock on both inflow and outflow hazards to be larger in 2008 than in earlier times. Furthermore, the effect of this shock takes longer to decay. For Canada, we find less evidence of time variation in impulse responses.
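For readers unfamiliar with the acronym, a TVP-VAR lets the VAR coefficients drift over time, typically as random walks; a toy simulation (dimensions and variances invented) shows the structure that makes time-varying impulse responses possible:

import numpy as np

rng = np.random.default_rng(1)
T, n = 200, 2                      # periods and variables (e.g. inflow/outflow hazards)
A = np.array([[0.5, 0.1],
              [0.0, 0.6]])         # initial VAR(1) coefficient matrix
y = np.zeros((T, n))
for t in range(1, T):
    A = A + rng.normal(0.0, 0.01, size=(n, n))   # random-walk parameter drift
    y[t] = A @ y[t - 1] + rng.normal(0.0, 0.1, size=n)
# Because A differs across t, impulse responses computed at different dates
# differ too, which is what the paper exploits to compare recessions.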
Abstract:
This paper discusses the challenges faced by the empirical macroeconomist and methods for surmounting them. These challenges arise due to the fact that macroeconometric models potentially include a large number of variables and allow for time variation in parameters. These considerations lead to models which have a large number of parameters to estimate relative to the number of observations. A wide range of approaches are surveyed which aim to overcome the resulting problems. We stress the related themes of prior shrinkage, model averaging and model selection. Subsequently, we consider a particular modelling approach in detail. This involves the use of dynamic model selection methods with large TVP-VARs. A forecasting exercise involving a large US macroeconomic data set illustrates the practicality and empirical success of our approach.
Abstract:
Vector Autoregressive Moving Average (VARMA) models have many theoretical properties which should make them popular among empirical macroeconomists. However, they are rarely used in practice due to over-parameterization concerns, difficulties in ensuring identification and computational challenges. With the growing interest in multivariate time series models of high dimension, these problems with VARMAs become even more acute, accounting for the dominance of VARs in this field. In this paper, we develop a Bayesian approach for inference in VARMAs which surmounts these problems. It jointly ensures identification and parsimony in the context of an efficient Markov chain Monte Carlo (MCMC) algorithm. We use this approach in a macroeconomic application involving up to twelve dependent variables. We find our algorithm to work successfully and provide insights beyond those provided by VARs.