952 resultados para Linear models (Statistics)
Resumo:
As modern molecular biology moves towards the analysis of biological systems as opposed to their individual components, the need for appropriate mathematical and computational techniques for understanding the dynamics and structure of such systems is becoming more pressing. For example, the modeling of biochemical systems using ordinary differential equations (ODEs) based on high-throughput, time-dense profiles is becoming more common-place, which is necessitating the development of improved techniques to estimate model parameters from such data. Due to the high dimensionality of this estimation problem, straight-forward optimization strategies rarely produce correct parameter values, and hence current methods tend to utilize genetic/evolutionary algorithms to perform non-linear parameter fitting. Here, we describe a completely deterministic approach, which is based on interval analysis. This allows us to examine entire sets of parameters, and thus to exhaust the global search within a finite number of steps. In particular, we show how our method may be applied to a generic class of ODEs used for modeling biochemical systems called Generalized Mass Action Models (GMAs). In addition, we show that for GMAs our method is amenable to the technique in interval arithmetic called constraint propagation, which allows great improvement of its efficiency. To illustrate the applicability of our method we apply it to some networks of biochemical reactions appearing in the literature, showing in particular that, in addition to estimating system parameters in the absence of noise, our method may also be used to recover the topology of these networks.
Resumo:
The Polochic-Motagua fault systems (PMFS) are part of the sinistral transform boundary between the North American and Caribbean plates. To the west, these systems interact with the subduction zone of the Cocos plate, forming a subduction-subduction-transform triple junction. The North American plate moves westward relative to the Caribbean plate. This movement does not affect the geometry of the subducted Cocos plate, which implies that deformation is accommodated entirely in the two overriding plates. Structural data, fault kinematic analysis, and geomorphic observations provide new elements that help to understand the late Cenozoic evolution of this triple junction. In the Miocene, extension and shortening occurred south and north of the Motagua fault, respectively. This strain regime migrated northward to the Polochic fault after the late Miocene. This shift is interpreted as a ``pull-up'' of North American blocks into the Caribbean realm. To the west, the PMFS interact with a trench-parallel fault zone that links the Tonala fault to the Jalpatagua fault. These faults bound a fore-arc sliver that is shared by the two overriding plates. We propose that the dextral Jalpatagua fault merges with the sinistral PMFS, leaving behind a suturing structure, the Tonala fault. This tectonic ``zipper'' allows the migration of the triple junction. As a result, the fore-arc sliver comes into contact with the North American plate and helps to maintain a linear subduction zone along the trailing edge of the Caribbean plate. All these processes currently make the triple junction increasingly diffuse as it propagates eastward and inland within both overriding plates.
Resumo:
This paper suggests a method for obtaining efficiency bounds in models containing either only infinite-dimensional parameters or both finite- and infinite-dimensional parameters (semiparametric models). The method is based on a theory of random linear functionals applied to the gradient of the log-likelihood functional and is illustrated by computing the lower bound for Cox's regression model
Resumo:
In this correspondence, we propose applying the hiddenMarkov models (HMM) theory to the problem of blind channel estimationand data detection. The Baum–Welch (BW) algorithm, which is able toestimate all the parameters of the model, is enriched by introducingsome linear constraints emerging from a linear FIR hypothesis on thechannel. Additionally, a version of the algorithm that is suitable for timevaryingchannels is also presented. Performance is analyzed in a GSMenvironment using standard test channels and is found to be close to thatobtained with a nonblind receiver.
Resumo:
In this paper we develop a new linear approach to identify the parameters of a moving average (MA) model from the statistics of the output. First, we show that, under some constraints, the impulse response of the system can be expressed as a linear combination of cumulant slices. Then, thisresult is used to obtain a new well-conditioned linear methodto estimate the MA parameters of a non-Gaussian process. Theproposed method presents several important differences withexisting linear approaches. The linear combination of slices usedto compute the MA parameters can be constructed from dif-ferent sets of cumulants of different orders, providing a generalframework where all the statistics can be combined. Further-more, it is not necessary to use second-order statistics (the autocorrelation slice), and therefore the proposed algorithm stillprovides consistent estimates in the presence of colored Gaussian noise. Another advantage of the method is that while mostlinear methods developed so far give totally erroneous estimates if the order is overestimated, the proposed approach doesnot require a previous estimation of the filter order. The simulation results confirm the good numerical conditioning of thealgorithm and the improvement in performance with respect to existing methods.
Resumo:
The objective of this work was to compare random regression models for the estimation of genetic parameters for Guzerat milk production, using orthogonal Legendre polynomials. Records (20,524) of test-day milk yield (TDMY) from 2,816 first-lactation Guzerat cows were used. TDMY grouped into 10-monthly classes were analyzed for additive genetic effect and for environmental and residual permanent effects (random effects), whereas the contemporary group, calving age (linear and quadratic effects) and mean lactation curve were analized as fixed effects. Trajectories for the additive genetic and permanent environmental effects were modeled by means of a covariance function employing orthogonal Legendre polynomials ranging from the second to the fifth order. Residual variances were considered in one, four, six, or ten variance classes. The best model had six residual variance classes. The heritability estimates for the TDMY records varied from 0.19 to 0.32. The random regression model that used a second-order Legendre polynomial for the additive genetic effect, and a fifth-order polynomial for the permanent environmental effect is adequate for comparison by the main employed criteria. The model with a second-order Legendre polynomial for the additive genetic effect, and that with a fourth-order for the permanent environmental effect could also be employed in these analyses.
Resumo:
A maximum entropy statistical treatment of an inverse problem concerning frame theory is presented. The problem arises from the fact that a frame is an overcomplete set of vectors that defines a mapping with no unique inverse. Although any vector in the concomitant space can be expressed as a linear combination of frame elements, the coefficients of the expansion are not unique. Frame theory guarantees the existence of a set of coefficients which is “optimal” in a minimum norm sense. We show here that these coefficients are also “optimal” from a maximum entropy viewpoint.
Resumo:
Web-portaalien aiheenmukaista luokittelua voidaan hyödyntää tunnistamaan käyttäjän kiinnostuksen kohteet keräämällä tilastotietoa hänen selaustottumuksistaan eri kategorioissa. Tämä diplomityö käsittelee web-sovelluksien osa-alueita, joissa kerättyä tilastotietoa voidaan hyödyntää personalisoinnissa. Yleisperiaatteet sisällön personalisoinnista, Internet-mainostamisesta ja tiedonhausta selitetään matemaattisia malleja käyttäen. Lisäksi työssä kuvaillaan yleisluontoiset ominaisuudet web-portaaleista sekä tilastotiedon keräämiseen liittyvät seikat.
Resumo:
It is generally accepted that between 70 and 80% of manufacturing costs can be attributed to design. Nevertheless, it is difficult for the designer to estimate manufacturing costs accurately, especially when alternative constructions are compared at the conceptual design phase, because of the lack of cost information and appropriate tools. In general, previous reports concerning optimisation of a welded structure have used the mass of the product as the basis for the cost comparison. However, it can easily be shown using a simple example that the use of product mass as the sole manufacturing cost estimator is unsatisfactory. This study describes a method of formulating welding time models for cost calculation, and presents the results of the models for particular sections, based on typical costs in Finland. This was achieved by collecting information concerning welded products from different companies. The data included 71 different welded assemblies taken from the mechanical engineering and construction industries. The welded assemblies contained in total 1 589 welded parts, 4 257 separate welds, and a total welded length of 3 188 metres. The data were modelled for statistical calculations, and models of welding time were derived by using linear regression analysis. Themodels were tested by using appropriate statistical methods, and were found to be accurate. General welding time models have been developed, valid for welding in Finland, as well as specific, more accurate models for particular companies. The models are presented in such a form that they can be used easily by a designer, enabling the cost calculation to be automated.
Resumo:
Decision situations are often characterized by uncertainty: we do not know the values of the different options on all attributes and have to rely on information stored in our memory to decide. Several strategies have been proposed to describe how people make inferences based on knowledge used as cues. The present research shows how declarative memory of ACT-R models could be populated based on internet statistics. This will allow to simulate the performance of decision strategies operating on declarative knowledge based on occurrences and co-occurrences of objects and cues in the environment.
Resumo:
Geophysical tomography captures the spatial distribution of the underlying geophysical property at a relatively high resolution, but the tomographic images tend to be blurred representations of reality and generally fail to reproduce sharp interfaces. Such models may cause significant bias when taken as a basis for predictive flow and transport modeling and are unsuitable for uncertainty assessment. We present a methodology in which tomograms are used to condition multiple-point statistics (MPS) simulations. A large set of geologically reasonable facies realizations and their corresponding synthetically calculated cross-hole radar tomograms are used as a training image. The training image is scanned with a direct sampling algorithm for patterns in the conditioning tomogram, while accounting for the spatially varying resolution of the tomograms. In a post-processing step, only those conditional simulations that predicted the radar traveltimes within the expected data error levels are accepted. The methodology is demonstrated on a two-facies example featuring channels and an aquifer analog of alluvial sedimentary structures with five facies. For both cases, MPS simulations exhibit the sharp interfaces and the geological patterns found in the training image. Compared to unconditioned MPS simulations, the uncertainty in transport predictions is markedly decreased for simulations conditioned to tomograms. As an improvement to other approaches relying on classical smoothness-constrained geophysical tomography, the proposed method allows for: (1) reproduction of sharp interfaces, (2) incorporation of realistic geological constraints and (3) generation of multiple realizations that enables uncertainty assessment.
Resumo:
In this paper we study the existence of a unique solution for linear stochastic differential equations driven by a Lévy process, where the initial condition and the coefficients are random and not necessarily adapted to the underlying filtration. Towards this end, we extend the method based on Girsanov transformations on Wiener space and developped by Buckdahn [7] to the canonical Lévy space, which is introduced in [25].
Resumo:
This work presents new, efficient Markov chain Monte Carlo (MCMC) simulation methods for statistical analysis in various modelling applications. When using MCMC methods, the model is simulated repeatedly to explore the probability distribution describing the uncertainties in model parameters and predictions. In adaptive MCMC methods based on the Metropolis-Hastings algorithm, the proposal distribution needed by the algorithm learns from the target distribution as the simulation proceeds. Adaptive MCMC methods have been subject of intensive research lately, as they open a way for essentially easier use of the methodology. The lack of user-friendly computer programs has been a main obstacle for wider acceptance of the methods. This work provides two new adaptive MCMC methods: DRAM and AARJ. The DRAM method has been built especially to work in high dimensional and non-linear problems. The AARJ method is an extension to DRAM for model selection problems, where the mathematical formulation of the model is uncertain and we want simultaneously to fit several different models to the same observations. The methods were developed while keeping in mind the needs of modelling applications typical in environmental sciences. The development work has been pursued while working with several application projects. The applications presented in this work are: a winter time oxygen concentration model for Lake Tuusulanjärvi and adaptive control of the aerator; a nutrition model for Lake Pyhäjärvi and lake management planning; validation of the algorithms of the GOMOS ozone remote sensing instrument on board the Envisat satellite of European Space Agency and the study of the effects of aerosol model selection on the GOMOS algorithm.
Resumo:
Genome-wide association studies (GWASs) have identified many genetic variants underlying complex traits. Many detected genetic loci harbor variants that associate with multiple-even distinct-traits. Most current analysis approaches focus on single traits, even though the final results from multiple traits are evaluated together. Such approaches miss the opportunity to systemically integrate the phenome-wide data available for genetic association analysis. In this study, we propose a general approach that can integrate association evidence from summary statistics of multiple traits, either correlated, independent, continuous, or binary traits, which might come from the same or different studies. We allow for trait heterogeneity effects. Population structure and cryptic relatedness can also be controlled. Our simulations suggest that the proposed method has improved statistical power over single-trait analysis in most of the cases we studied. We applied our method to the Continental Origins and Genetic Epidemiology Network (COGENT) African ancestry samples for three blood pressure traits and identified four loci (CHIC2, HOXA-EVX1, IGFBP1/IGFBP3, and CDH17; p < 5.0 × 10(-8)) associated with hypertension-related traits that were missed by a single-trait analysis in the original report. Six additional loci with suggestive association evidence (p < 5.0 × 10(-7)) were also observed, including CACNA1D and WNT3. Our study strongly suggests that analyzing multiple phenotypes can improve statistical power and that such analysis can be executed with the summary statistics from GWASs. Our method also provides a way to study a cross phenotype (CP) association by using summary statistics from GWASs of multiple phenotypes.
Resumo:
QUESTIONS UNDER STUDY: Since tumour burden consumes substantial healthcare resources, precise cancer incidence estimations are pivotal to define future needs of national healthcare. This study aimed to estimate incidence and mortality rates of oesophageal, gastric, pancreatic, hepatic and colorectal cancers up to 2030 in Switzerland. METHODS: Swiss Statistics provides national incidences and mortality rates of various cancers, and models of future developments of the Swiss population. Cancer incidences and mortality rates from 1985 to 2009 were analysed to estimate trends and to predict incidence and mortality rates up to 2029. Linear regressions and Joinpoint analyses were performed to estimate the future trends of incidences and mortality rates. RESULTS: Crude incidences of oesophageal, pancreas, liver and colorectal cancers have steadily increased since 1985, and will continue to increase. Gastric cancer incidence and mortality rates reveal an ongoing decrease. Pancreatic and liver cancer crude mortality rates will keep increasing, whereas colorectal cancer mortality on the contrary will fall. Mortality from oesophageal cancer will plateau or minimally increase. If we consider European population-standardised incidence rates, oesophageal, pancreatic and colorectal cancer incidences are steady. Gastric cancers are diminishing and liver cancers will follow an increasing trend. Standardised mortality rates show a diminution for all but liver cancer. CONCLUSIONS: The oncological burden of gastrointestinal cancer will significantly increase in Switzerland during the next two decades. The crude mortality rates globally show an ongoing increase except for gastric and colorectal cancers. Enlarged healthcare resources to take care of these complex patient groups properly will be needed.