844 resultados para Failure time data analysis
Resumo:
Linear mixed effects models are frequently used to analyse longitudinal data, due to their flexibility in modelling the covariance structure between and within observations. Further, it is easy to deal with unbalanced data, either with respect to the number of observations per subject or per time period, and with varying time intervals between observations. In most applications of mixed models to biological sciences, a normal distribution is assumed both for the random effects and for the residuals. This, however, makes inferences vulnerable to the presence of outliers. Here, linear mixed models employing thick-tailed distributions for robust inferences in longitudinal data analysis are described. Specific distributions discussed include the Student-t, the slash and the contaminated normal. A Bayesian framework is adopted, and the Gibbs sampler and the Metropolis-Hastings algorithms are used to carry out the posterior analyses. An example with data on orthodontic distance growth in children is discussed to illustrate the methodology. Analyses based on either the Student-t distribution or on the usual Gaussian assumption are contrasted. The thick-tailed distributions provide an appealing robust alternative to the Gaussian process for modelling distributions of the random effects and of residuals in linear mixed models, and the MCMC implementation allows the computations to be performed in a flexible manner.
Resumo:
In order to contribute to the genetic breeding programs of buffaloes, this study aimed to determine the influence of environmental effects on the stayability (ST) of dairy female Murrah buffalo in the herd. Data from 1016 buffaloes were used. ST was defined as the ability of the female to remain in the herd for 1, 2, 3, 4, 5 or 6 years after the first calving. Environmental effects were studied by survival analysis, adjusted to the fixed effects of farm, year and season of birth, class of first-lactation milk yield and age at first calving. The data were analyzed using the LIFEREG procedure of the SAS program that fits parametric models to failure time data (culling or ST = 0), and estimates parameters by maximum likelihood estimation. Breeding farm, year of birth and first-lactation milk yield significantly influenced (P < 0.0001) the ST to the specific ages (1 to 6 years after the first calving). Buffaloes that were older at first calving presented higher probabilities of being culled 1 year after the first calving, without any effect on culling at older ages. Buffaloes with a higher milk yield at first calving presented a lower culling probability and remained for a longer period of time in the herd. The effects of breeding farm, year of birth and first-lactation milk yield should be included in models used for the analysis of ST in buffaloes. Copyright © The Animal Consortium 2010.
Resumo:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Resumo:
Masticatory muscle contraction causes both jaw movement and tissue deformation during function. Natural chewing data from 25 adult miniature pigs were studied by means of time series analysis. The data set included simultaneous recordings of electromyography (EMG) from bilateral masseter (MA), zygomaticomandibularis (ZM) and lateral pterygoid muscles, bone surface strains from the left squamosal bone (SQ), condylar neck (CD) and mandibular corpus (MD), and linear deformation of the capsule of the jaw joint measured bilaterally using differential variable reluctance transducers. Pairwise comparisons were examined by calculating the cross-correlation functions. Jaw-adductor muscle activity of MA and ZM was found to be highly cross-correlated with CD and SQ strains and weakly with MD strain. No muscle’s activity was strongly linked to capsular deformation of the jaw joint, nor were bone strains and capsular deformation tightly linked. Homologous muscle pairs showed the greatest synchronization of signals, but the signals themselves were not significantly more correlated than those of non-homologous muscle pairs. These results suggested that bone strains and capsular deformation are driven by different mechanical regimes. Muscle contraction and ensuing reaction forces are probably responsible for bone strains, whereas capsular deformation is more likely a product of movement.
Resumo:
Each plasma physics laboratory has a proprietary scheme to control and data acquisition system. Usually, it is different from one laboratory to another. It means that each laboratory has its own way to control the experiment and retrieving data from the database. Fusion research relies to a great extent on international collaboration and this private system makes it difficult to follow the work remotely. The TCABR data analysis and acquisition system has been upgraded to support a joint research programme using remote participation technologies. The choice of MDSplus (Model Driven System plus) is proved by the fact that it is widely utilized, and the scientists from different institutions may use the same system in different experiments in different tokamaks without the need to know how each system treats its acquisition system and data analysis. Another important point is the fact that the MDSplus has a library system that allows communication between different types of language (JAVA, Fortran, C, C++, Python) and programs such as MATLAB, IDL, OCTAVE. In the case of tokamak TCABR interfaces (object of this paper) between the system already in use and MDSplus were developed, instead of using the MDSplus at all stages, from the control, and data acquisition to the data analysis. This was done in the way to preserve a complex system already in operation and otherwise it would take a long time to migrate. This implementation also allows add new components using the MDSplus fully at all stages. (c) 2012 Elsevier B.V. All rights reserved.
Resumo:
Data visualization techniques are powerful in the handling and analysis of multivariate systems. One such technique known as parallel coordinates was used to support the diagnosis of an event, detected by a neural network-based monitoring system, in a boiler at a Brazilian Kraft pulp mill. Its attractiveness is the possibility of the visualization of several variables simultaneously. The diagnostic procedure was carried out step-by-step going through exploratory, explanatory, confirmatory, and communicative goals. This tool allowed the visualization of the boiler dynamics in an easier way, compared to commonly used univariate trend plots. In addition it facilitated analysis of other aspects, namely relationships among process variables, distinct modes of operation and discrepant data. The whole analysis revealed firstly that the period involving the detected event was associated with a transition between two distinct normal modes of operation, and secondly the presence of unusual changes in process variables at this time.
Resumo:
The Primary Care Information System (SIAB) concentrates basic healthcare information from all different regions of Brazil. The information is collected by primary care teams on a paper-based procedure that degrades the quality of information provided to the healthcare authorities and slows down the process of decision making. To overcome these problems we propose a new data gathering application that uses a mobile device connected to a 3G network and a GPS to be used by the primary care teams for collecting the families' data. A prototype was developed in which a digital version of one SIAB form is made available at the mobile device. The prototype was tested in a basic healthcare unit located in a suburb of Sao Paulo. The results obtained so far have shown that the proposed process is a better alternative for data collecting at primary care, both in terms of data quality and lower deployment time to health care authorities.
Resumo:
The autoregressive (AR) estimator, a non-parametric method, is used to analyze functional magnetic resonance imaging (fMRI) data. The same method has been used, with success, in several other time series data analysis. It uses exclusively the available experimental data points to estimate the most plausible power spectra compatible with the experimental data and there is no need to make any assumption about non-measured points. The time series, obtained from fMRI block paradigm data, is analyzed by the AR method to determine the brain active regions involved in the processing of a given stimulus. This method is considerably more reliable than the fast Fourier transform or the parametric methods. The time series corresponding to each image pixel is analyzed using the AR estimator and the corresponding poles are obtained. The pole distribution gives the shape of power spectra, and the pixels with poles at the stimulation frequency are considered as the active regions. The method was applied in simulated and real data, its superiority is shown by the receiver operating characteristic curves which were obtained using the simulated data.
Resumo:
This work provides a forward step in the study and comprehension of the relationships between stochastic processes and a certain class of integral-partial differential equation, which can be used in order to model anomalous diffusion and transport in statistical physics. In the first part, we brought the reader through the fundamental notions of probability and stochastic processes, stochastic integration and stochastic differential equations as well. In particular, within the study of H-sssi processes, we focused on fractional Brownian motion (fBm) and its discrete-time increment process, the fractional Gaussian noise (fGn), which provide examples of non-Markovian Gaussian processes. The fGn, together with stationary FARIMA processes, is widely used in the modeling and estimation of long-memory, or long-range dependence (LRD). Time series manifesting long-range dependence, are often observed in nature especially in physics, meteorology, climatology, but also in hydrology, geophysics, economy and many others. We deepely studied LRD, giving many real data examples, providing statistical analysis and introducing parametric methods of estimation. Then, we introduced the theory of fractional integrals and derivatives, which indeed turns out to be very appropriate for studying and modeling systems with long-memory properties. After having introduced the basics concepts, we provided many examples and applications. For instance, we investigated the relaxation equation with distributed order time-fractional derivatives, which describes models characterized by a strong memory component and can be used to model relaxation in complex systems, which deviates from the classical exponential Debye pattern. Then, we focused in the study of generalizations of the standard diffusion equation, by passing through the preliminary study of the fractional forward drift equation. Such generalizations have been obtained by using fractional integrals and derivatives of distributed orders. In order to find a connection between the anomalous diffusion described by these equations and the long-range dependence, we introduced and studied the generalized grey Brownian motion (ggBm), which is actually a parametric class of H-sssi processes, which have indeed marginal probability density function evolving in time according to a partial integro-differential equation of fractional type. The ggBm is of course Non-Markovian. All around the work, we have remarked many times that, starting from a master equation of a probability density function f(x,t), it is always possible to define an equivalence class of stochastic processes with the same marginal density function f(x,t). All these processes provide suitable stochastic models for the starting equation. Studying the ggBm, we just focused on a subclass made up of processes with stationary increments. The ggBm has been defined canonically in the so called grey noise space. However, we have been able to provide a characterization notwithstanding the underline probability space. We also pointed out that that the generalized grey Brownian motion is a direct generalization of a Gaussian process and in particular it generalizes Brownain motion and fractional Brownain motion as well. Finally, we introduced and analyzed a more general class of diffusion type equations related to certain non-Markovian stochastic processes. We started from the forward drift equation, which have been made non-local in time by the introduction of a suitable chosen memory kernel K(t). The resulting non-Markovian equation has been interpreted in a natural way as the evolution equation of the marginal density function of a random time process l(t). We then consider the subordinated process Y(t)=X(l(t)) where X(t) is a Markovian diffusion. The corresponding time-evolution of the marginal density function of Y(t) is governed by a non-Markovian Fokker-Planck equation which involves the same memory kernel K(t). We developed several applications and derived the exact solutions. Moreover, we considered different stochastic models for the given equations, providing path simulations.
Resumo:
The Gaia space mission is a major project for the European astronomical community. As challenging as it is, the processing and analysis of the huge data-flow incoming from Gaia is the subject of thorough study and preparatory work by the DPAC (Data Processing and Analysis Consortium), in charge of all aspects of the Gaia data reduction. This PhD Thesis was carried out in the framework of the DPAC, within the team based in Bologna. The task of the Bologna team is to define the calibration model and to build a grid of spectro-photometric standard stars (SPSS) suitable for the absolute flux calibration of the Gaia G-band photometry and the BP/RP spectrophotometry. Such a flux calibration can be performed by repeatedly observing each SPSS during the life-time of the Gaia mission and by comparing the observed Gaia spectra to the spectra obtained by our ground-based observations. Due to both the different observing sites involved and the huge amount of frames expected (≃100000), it is essential to maintain the maximum homogeneity in data quality, acquisition and treatment, and a particular care has to be used to test the capabilities of each telescope/instrument combination (through the “instrument familiarization plan”), to devise methods to keep under control, and eventually to correct for, the typical instrumental effects that can affect the high precision required for the Gaia SPSS grid (a few % with respect to Vega). I contributed to the ground-based survey of Gaia SPSS in many respects: with the observations, the instrument familiarization plan, the data reduction and analysis activities (both photometry and spectroscopy), and to the maintenance of the data archives. However, the field I was personally responsible for was photometry and in particular relative photometry for the production of short-term light curves. In this context I defined and tested a semi-automated pipeline which allows for the pre-reduction of imaging SPSS data and the production of aperture photometry catalogues ready to be used for further analysis. A series of semi-automated quality control criteria are included in the pipeline at various levels, from pre-reduction, to aperture photometry, to light curves production and analysis.
Resumo:
The objective of this dissertation is to study the structure and behavior of the Atmospheric Boundary Layer (ABL) in stable conditions. This type of boundary layer is not completely well understood yet, although it is very important for many practical uses, from forecast modeling to atmospheric dispersion of pollutants. We analyzed data from the SABLES98 experiment (Stable Atmospheric Boundary Layer Experiment in Spain, 1998), and compared the behaviour of this data using Monin-Obukhov's similarity functions for wind speed and potential temperature. Analyzing the vertical profiles of various variables, in particular the thermal and momentum fluxes, we identified two main contrasting structures describing two different states of the SBL, a traditional and an upside-down boundary layer. We were able to determine the main features of these two states of the boundary layer in terms of vertical profiles of potential temperature and wind speed, turbulent kinetic energy and fluxes, studying the time series and vertical structure of the atmosphere for two separate nights in the dataset, taken as case studies. We also developed an original classification of the SBL, in order to separate the influence of mesoscale phenomena from turbulent behavior, using as parameters the wind speed and the gradient Richardson number. We then compared these two formulations, using the SABLES98 dataset, verifying their validity for different variables (wind speed and potential temperature, and their difference, at different heights) and with different stability parameters (zita or Rg). Despite these two classifications having completely different physical origins, we were able to find some common behavior, in particular under weak stability conditions.
Resumo:
The aging process is characterized by the progressive fitness decline experienced at all the levels of physiological organization, from single molecules up to the whole organism. Studies confirmed inflammaging, a chronic low-level inflammation, as a deeply intertwined partner of the aging process, which may provide the “common soil” upon which age-related diseases develop and flourish. Thus, albeit inflammation per se represents a physiological process, it can rapidly become detrimental if it goes out of control causing an excess of local and systemic inflammatory response, a striking risk factor for the elderly population. Developing interventions to counteract the establishment of this state is thus a top priority. Diet, among other factors, represents a good candidate to regulate inflammation. Building on top of this consideration, the EU project NU-AGE is now trying to assess if a Mediterranean diet, fortified for the elderly population needs, may help in modulating inflammaging. To do so, NU-AGE enrolled a total of 1250 subjects, half of which followed a 1-year long diet, and characterized them by mean of the most advanced –omics and non –omics analyses. The aim of this thesis was the development of a solid data management pipeline able to efficiently cope with the results of these assays, which are now flowing inside a centralized database, ready to be used to test the most disparate scientific hypotheses. At the same time, the work hereby described encompasses the data analysis of the GEHA project, which was focused on identifying the genetic determinants of longevity, with a particular focus on developing and applying a method for detecting epistatic interactions in human mtDNA. Eventually, in an effort to propel the adoption of NGS technologies in everyday pipeline, we developed a NGS variant calling pipeline devoted to solve all the sequencing-related issues of the mtDNA.
Resumo:
Over the time, Twitter has become a fundamental source of information for news. As a one step forward, researchers have tried to analyse if the tweets contain predictive power. In the past, in financial field, a lot of research has been done to propose a function which takes as input all the tweets for a particular stock or index s, analyse them and predict the stock or index price of s. In this work, we take an alternative approach: using the stock price and tweet information, we investigate following questions. 1. Is there any relation between the amount of tweets being generated and the stocks being exchanged? 2. Is there any relation between the sentiment of the tweets and stock prices? 3. What is the structure of the graph that describes the relationships between users?
Does published orthodontic research account for clustering effects during statistical data analysis?
Resumo:
In orthodontics, multiple site observations within patients or multiple observations collected at consecutive time points are often encountered. Clustered designs require larger sample sizes compared to individual randomized trials and special statistical analyses that account for the fact that observations within clusters are correlated. It is the purpose of this study to assess to what degree clustering effects are considered during design and data analysis in the three major orthodontic journals. The contents of the most recent 24 issues of the American Journal of Orthodontics and Dentofacial Orthopedics (AJODO), Angle Orthodontist (AO), and European Journal of Orthodontics (EJO) from December 2010 backwards were hand searched. Articles with clustering effects and whether the authors accounted for clustering effects were identified. Additionally, information was collected on: involvement of a statistician, single or multicenter study, number of authors in the publication, geographical area, and statistical significance. From the 1584 articles, after exclusions, 1062 were assessed for clustering effects from which 250 (23.5 per cent) were considered to have clustering effects in the design (kappa = 0.92, 95 per cent CI: 0.67-0.99 for inter rater agreement). From the studies with clustering effects only, 63 (25.20 per cent) had indicated accounting for clustering effects. There was evidence that the studies published in the AO have higher odds of accounting for clustering effects [AO versus AJODO: odds ratio (OR) = 2.17, 95 per cent confidence interval (CI): 1.06-4.43, P = 0.03; EJO versus AJODO: OR = 1.90, 95 per cent CI: 0.84-4.24, non-significant; and EJO versus AO: OR = 1.15, 95 per cent CI: 0.57-2.33, non-significant). The results of this study indicate that only about a quarter of the studies with clustering effects account for this in statistical data analysis.
Resumo:
The positive and negative predictive value are standard measures used to quantify the predictive accuracy of binary biomarkers when the outcome being predicted is also binary. When the biomarkers are instead being used to predict a failure time outcome, there is no standard way of quantifying predictive accuracy. We propose a natural extension of the traditional predictive values to accommodate censored survival data. We discuss not only quantifying predictive accuracy using these extended predictive values, but also rigorously comparing the accuracy of two biomarkers in terms of their predictive values. Using a marginal regression framework, we describe how to estimate differences in predictive accuracy and how to test whether the observed difference is statistically significant.