980 resultados para multilevel statistical modeling
Resumo:
The tongue is the most important and dynamic articulator for speech formation, because of its anatomic aspects (particularly, the large volume of this muscular organ comparatively to the surrounding organs of the vocal tract) and also due to the wide range of movements and flexibility that are involved. In speech communication research, a variety of techniques have been used for measuring the three-dimensional vocal tract shapes. More recently, magnetic resonance imaging (MRI) becomes common; mainly, because this technique allows the collection of a set of static and dynamic images that can represent the entire vocal tract along any orientation. Over the years, different anatomical organs of the vocal tract have been modelled; namely, 2D and 3D tongue models, using parametric or statistical modelling procedures. Our aims are to present and describe some 3D reconstructed models from MRI data, for one subject uttering sustained articulations of some typical Portuguese sounds. Thus, we present a 3D database of the tongue obtained by stack combinations with the subject articulating Portuguese vowels. This 3D knowledge of the speech organs could be very important; especially, for clinical purposes (for example, for the assessment of articulatory impairments followed by tongue surgery in speech rehabilitation), and also for a better understanding of acoustic theory in speech formation.
Resumo:
OBJECTIVE To analyze the association between concentrations of air pollutants and admissions for respiratory causes in children. METHODS Ecological time series study. Daily figures for hospital admissions of children aged < 6, and daily concentrations of air pollutants (PM10, SO2, NO2, O3 and CO) were analyzed in the Região da Grande Vitória, ES, Southeastern Brazil, from January 2005 to December 2010. For statistical analysis, two techniques were combined: Poisson regression with generalized additive models and principal model component analysis. Those analysis techniques complemented each other and provided more significant estimates in the estimation of relative risk. The models were adjusted for temporal trend, seasonality, day of the week, meteorological factors and autocorrelation. In the final adjustment of the model, it was necessary to include models of the Autoregressive Moving Average Models (p, q) type in the residuals in order to eliminate the autocorrelation structures present in the components. RESULTS For every 10:49 μg/m3 increase (interquartile range) in levels of the pollutant PM10 there was a 3.0% increase in the relative risk estimated using the generalized additive model analysis of main components-seasonal autoregressive – while in the usual generalized additive model, the estimate was 2.0%. CONCLUSIONS Compared to the usual generalized additive model, in general, the proposed aspect of generalized additive model − principal component analysis, showed better results in estimating relative risk and quality of fit.
Resumo:
Dissertation submitted in partial fulfillment of the requirements for the Degree of Master of Science in Geospatial Technologies
Resumo:
The rapid growth of big cities has been noticed since 1950s when the majority of world population turned to live in urban areas rather than villages, seeking better job opportunities and higher quality of services and lifestyle circumstances. This demographic transition from rural to urban is expected to have a continuous increase. Governments, especially in less developed countries, are going to face more challenges in different sectors, raising the essence of understanding the spatial pattern of the growth for an effective urban planning. The study aimed to detect, analyse and model the urban growth in Greater Cairo Region (GCR) as one of the fast growing mega cities in the world using remote sensing data. Knowing the current and estimated urbanization situation in GCR will help decision makers in Egypt to adjust their plans and develop new ones. These plans should focus on resources reallocation to overcome the problems arising in the future and to achieve a sustainable development of urban areas, especially after the high percentage of illegal settlements which took place in the last decades. The study focused on a period of 30 years; from 1984 to 2014, and the major transitions to urban were modelled to predict the future scenarios in 2025. Three satellite images of different time stamps (1984, 2003 and 2014) were classified using Support Vector Machines (SVM) classifier, then the land cover changes were detected by applying a high level mapping technique. Later the results were analyzed for higher accurate estimations of the urban growth in the future in 2025 using Land Change Modeler (LCM) embedded in IDRISI software. Moreover, the spatial and temporal urban growth patterns were analyzed using statistical metrics developed in FRAGSTATS software. The study resulted in an overall classification accuracy of 96%, 97.3% and 96.3% for 1984, 2003 and 2014’s map, respectively. Between 1984 and 2003, 19 179 hectares of vegetation and 21 417 hectares of desert changed to urban, while from 2003 to 2014, the transitions to urban from both land cover classes were found to be 16 486 and 31 045 hectares, respectively. The model results indicated that 14% of the vegetation and 4% of the desert in 2014 will turn into urban in 2025, representing 16 512 and 24 687 hectares, respectively.
Resumo:
A potentially renewable and sustainable source of energy is the chemical energy associated with solvation of salts. Mixing of two aqueous streams with different saline concentrations is spontaneous and releases energy. The global theoretically obtainable power from salinity gradient energy due to World’s rivers discharge into the oceans has been estimated to be within the range of 1.4-2.6 TW. Reverse electrodialysis (RED) is one of the emerging, membrane-based, technologies for harvesting the salinity gradient energy. A common RED stack is composed by alternately-arranged cation- and anion-exchange membranes, stacked between two electrodes. The compartments between the membranes are alternately fed with concentrated (e.g., sea water) and dilute (e.g., river water) saline solutions. Migration of the respective counter-ions through the membranes leads to ionic current between the electrodes, where an appropriate redox pair converts the chemical salinity gradient energy into electrical energy. Given the importance of the need for new sources of energy for power generation, the present study aims at better understanding and solving current challenges, associated with the RED stack design, fluid dynamics, ionic mass transfer and long-term RED stack performance with natural saline solutions as feedwaters. Chronopotentiometry was used to determinate diffusion boundary layer (DBL) thickness from diffusion relaxation data and the flow entrance effects on mass transfer were found to avail a power generation increase in RED stacks. Increasing the linear flow velocity also leads to a decrease of DBL thickness but on the cost of a higher pressure drop. Pressure drop inside RED stacks was successfully simulated by the developed mathematical model, in which contribution of several pressure drops, that until now have not been considered, was included. The effect of each pressure drop on the RED stack performance was identified and rationalized and guidelines for planning and/or optimization of RED stacks were derived. The design of new profiled membranes, with a chevron corrugation structure, was proposed using computational fluid dynamics (CFD) modeling. The performance of the suggested corrugation geometry was compared with the already existing ones, as well as with the use of conductive and non-conductive spacers. According to the estimations, use of chevron structures grants the highest net power density values, at the best compromise between the mass transfer coefficient and the pressure drop values. Finally, long-term experiments with natural waters were performed, during which fouling was experienced. For the first time, 2D fluorescence spectroscopy was used to monitor RED stack performance, with a dedicated focus on following fouling on ion-exchange membrane surfaces. To extract relevant information from fluorescence spectra, parallel factor analysis (PARAFAC) was performed. Moreover, the information obtained was then used to predict net power density, stack electric resistance and pressure drop by multivariate statistical models based on projection to latent structures (PLS) modeling. The use in such models of 2D fluorescence data, containing hidden, but extractable by PARAFAC, information about fouling on membrane surfaces, considerably improved the models fitting to the experimental data.
Resumo:
We explore the determinants of usage of six different types of health care services, using the Medical Expenditure Panel Survey data, years 1996-2000. We apply a number of models for univariate count data, including semiparametric, semi-nonparametric and finite mixture models. We find that the complexity of the model that is required to fit the data well depends upon the way in which the data is pooled across sexes and over time, and upon the characteristics of the usage measure. Pooling across time and sexes is almost always favored, but when more heterogeneous data is pooled it is often the case that a more complex statistical model is required.
Resumo:
Among the largest resources for biological sequence data is the large amount of expressed sequence tags (ESTs) available in public and proprietary databases. ESTs provide information on transcripts but for technical reasons they often contain sequencing errors. Therefore, when analyzing EST sequences computationally, such errors must be taken into account. Earlier attempts to model error prone coding regions have shown good performance in detecting and predicting these while correcting sequencing errors using codon usage frequencies. In the research presented here, we improve the detection of translation start and stop sites by integrating a more complex mRNA model with codon usage bias based error correction into one hidden Markov model (HMM), thus generalizing this error correction approach to more complex HMMs. We show that our method maintains the performance in detecting coding sequences.
Resumo:
In multilevel modelling, interest in modeling the nested structure of hierarchical data has been accompanied by increasing attention to different forms of spatial interactions across different levels of the hierarchy. Neglecting such interactions is likely to create problems of inference, which typically assumes independence. In this paper we review approaches to multilevel modelling with spatial effects, and attempt to connect the two literatures, discussing the advantages and limitations of various approaches.
Resumo:
We investigate the dynamic and asymmetric dependence structure between equity portfolios from the US and UK. We demonstrate the statistical significance of dynamic asymmetric copula models in modelling and forecasting market risk. First, we construct “high-minus-low" equity portfolios sorted on beta, coskewness, and cokurtosis. We find substantial evidence of dynamic and asymmetric dependence between characteristic-sorted portfolios. Second, we consider a dynamic asymmetric copula model by combining the generalized hyperbolic skewed t copula with the generalized autoregressive score (GAS) model to capture both the multivariate non-normality and the dynamic and asymmetric dependence between equity portfolios. We demonstrate its usefulness by evaluating the forecasting performance of Value-at-Risk and Expected Shortfall for the high-minus-low portfolios. From back-testing, e find consistent and robust evidence that our dynamic asymmetric copula model provides the most accurate forecasts, indicating the importance of incorporating the dynamic and asymmetric dependence structure in risk management.
Resumo:
1. Species distribution modelling is used increasingly in both applied and theoretical research to predict how species are distributed and to understand attributes of species' environmental requirements. In species distribution modelling, various statistical methods are used that combine species occurrence data with environmental spatial data layers to predict the suitability of any site for that species. While the number of data sharing initiatives involving species' occurrences in the scientific community has increased dramatically over the past few years, various data quality and methodological concerns related to using these data for species distribution modelling have not been addressed adequately. 2. We evaluated how uncertainty in georeferences and associated locational error in occurrences influence species distribution modelling using two treatments: (1) a control treatment where models were calibrated with original, accurate data and (2) an error treatment where data were first degraded spatially to simulate locational error. To incorporate error into the coordinates, we moved each coordinate with a random number drawn from the normal distribution with a mean of zero and a standard deviation of 5 km. We evaluated the influence of error on the performance of 10 commonly used distributional modelling techniques applied to 40 species in four distinct geographical regions. 3. Locational error in occurrences reduced model performance in three of these regions; relatively accurate predictions of species distributions were possible for most species, even with degraded occurrences. Two species distribution modelling techniques, boosted regression trees and maximum entropy, were the best performing models in the face of locational errors. The results obtained with boosted regression trees were only slightly degraded by errors in location, and the results obtained with the maximum entropy approach were not affected by such errors. 4. Synthesis and applications. To use the vast array of occurrence data that exists currently for research and management relating to the geographical ranges of species, modellers need to know the influence of locational error on model quality and whether some modelling techniques are particularly robust to error. We show that certain modelling techniques are particularly robust to a moderate level of locational error and that useful predictions of species distributions can be made even when occurrence data include some error.
Resumo:
The dynamical analysis of large biological regulatory networks requires the development of scalable methods for mathematical modeling. Following the approach initially introduced by Thomas, we formalize the interactions between the components of a network in terms of discrete variables, functions, and parameters. Model simulations result in directed graphs, called state transition graphs. We are particularly interested in reachability properties and asymptotic behaviors, which correspond to terminal strongly connected components (or "attractors") in the state transition graph. A well-known problem is the exponential increase of the size of state transition graphs with the number of network components, in particular when using the biologically realistic asynchronous updating assumption. To address this problem, we have developed several complementary methods enabling the analysis of the behavior of large and complex logical models: (i) the definition of transition priority classes to simplify the dynamics; (ii) a model reduction method preserving essential dynamical properties, (iii) a novel algorithm to compact state transition graphs and directly generate compressed representations, emphasizing relevant transient and asymptotic dynamical properties. The power of an approach combining these different methods is demonstrated by applying them to a recent multilevel logical model for the network controlling CD4+ T helper cell response to antigen presentation and to a dozen cytokines. This model accounts for the differentiation of canonical Th1 and Th2 lymphocytes, as well as of inflammatory Th17 and regulatory T cells, along with many hybrid subtypes. All these methods have been implemented into the software GINsim, which enables the definition, the analysis, and the simulation of logical regulatory graphs.
Multimodel inference and multimodel averaging in empirical modeling of occupational exposure levels.
Resumo:
Empirical modeling of exposure levels has been popular for identifying exposure determinants in occupational hygiene. Traditional data-driven methods used to choose a model on which to base inferences have typically not accounted for the uncertainty linked to the process of selecting the final model. Several new approaches propose making statistical inferences from a set of plausible models rather than from a single model regarded as 'best'. This paper introduces the multimodel averaging approach described in the monograph by Burnham and Anderson. In their approach, a set of plausible models are defined a priori by taking into account the sample size and previous knowledge of variables influent on exposure levels. The Akaike information criterion is then calculated to evaluate the relative support of the data for each model, expressed as Akaike weight, to be interpreted as the probability of the model being the best approximating model given the model set. The model weights can then be used to rank models, quantify the evidence favoring one over another, perform multimodel prediction, estimate the relative influence of the potential predictors and estimate multimodel-averaged effects of determinants. The whole approach is illustrated with the analysis of a data set of 1500 volatile organic compound exposure levels collected by the Institute for work and health (Lausanne, Switzerland) over 20 years, each concentration having been divided by the relevant Swiss occupational exposure limit and log-transformed before analysis. Multimodel inference represents a promising procedure for modeling exposure levels that incorporates the notion that several models can be supported by the data and permits to evaluate to a certain extent model selection uncertainty, which is seldom mentioned in current practice.
Resumo:
Hidden Markov models (HMMs) are probabilistic models that are well adapted to many tasks in bioinformatics, for example, for predicting the occurrence of specific motifs in biological sequences. MAMOT is a command-line program for Unix-like operating systems, including MacOS X, that we developed to allow scientists to apply HMMs more easily in their research. One can define the architecture and initial parameters of the model in a text file and then use MAMOT for parameter optimization on example data, decoding (like predicting motif occurrence in sequences) and the production of stochastic sequences generated according to the probabilistic model. Two examples for which models are provided are coiled-coil domains in protein sequences and protein binding sites in DNA. A wealth of useful features include the use of pseudocounts, state tying and fixing of selected parameters in learning, and the inclusion of prior probabilities in decoding. AVAILABILITY: MAMOT is implemented in C++, and is distributed under the GNU General Public Licence (GPL). The software, documentation, and example model files can be found at http://bcf.isb-sib.ch/mamot
Resumo:
The pathogenesis of Schistosoma mansoni infection is largely determined by host T-cell mediated immune responses such as the granulomatous response to tissue deposited eggs and subsequent fibrosis. The major egg antigens have a valuable role in desensitizing the CD4+ Th cells that mediate granuloma formation, which may prevent or ameliorate clinical signs of schistosomiasis.S. mansoni major egg antigen Smp40 was expressed and completely purified. It was found that the expressed Smp40 reacts specifically with anti-Smp40 monoclonal antibody in Western blotting. Three-dimensional structure was elucidated based on the similarity of Smp40 with the small heat shock protein coded in the protein database as 1SHS as a template in the molecular modeling. It was figured out that the C-terminal of the Smp40 protein (residues 130 onward) contains two alpha crystallin domains. The fold consists of eight beta strands sandwiched in two sheets forming Greek key. The purified Smp40 was used for in vitro stimulation of peripheral blood mononuclear cells from patients infected with S. mansoni using phytohemagglutinin mitogen as a positive control. The obtained results showed that there is no statistical difference in interferon-g, interleukin (IL)-4 and IL-13 levels obtained with Smp40 stimulation compared with the control group (P > 0.05 for each). On the other hand, there were significant differences after Smp40 stimulation in IL-5 (P = 0.006) and IL-10 levels (P < 0.001) compared with the control group. Gaining the knowledge by reviewing the literature, it was found that the overall pattern of cytokine profile obtained with Smp40 stimulation is reported to be associated with reduced collagen deposition, decreased fibrosis, and granuloma formation inhibition. This may reflect its future prospect as a leading anti-pathology schistosomal vaccine candidate.
Resumo:
Metabolic problems lead to numerous failures during clinical trials, and much effort is now devoted to developing in silico models predicting metabolic stability and metabolites. Such models are well known for cytochromes P450 and some transferases, whereas less has been done to predict the activity of human hydrolases. The present study was undertaken to develop a computational approach able to predict the hydrolysis of novel esters by human carboxylesterase hCES2. The study involved first a homology modeling of the hCES2 protein based on the model of hCES1 since the two proteins share a high degree of homology (congruent with 73%). A set of 40 known substrates of hCES2 was taken from the literature; the ligands were docked in both their neutral and ionized forms using GriDock, a parallel tool based on the AutoDock4.0 engine which can perform efficient and easy virtual screening analyses of large molecular databases exploiting multi-core architectures. Useful statistical models (e.g., r (2) = 0.91 for substrates in their unprotonated state) were calculated by correlating experimental pK(m) values with distance between the carbon atom of the substrate's ester group and the hydroxy function of Ser228. Additional parameters in the equations accounted for hydrophobic and electrostatic interactions between substrates and contributing residues. The negatively charged residues in the hCES2 cavity explained the preference of the enzyme for neutral substrates and, more generally, suggested that ligands which interact too strongly by ionic bonds (e.g., ACE inhibitors) cannot be good CES2 substrates because they are trapped in the cavity in unproductive modes and behave as inhibitors. The effects of protonation on substrate recognition and the contrasting behavior of substrates and products were finally investigated by MD simulations of some CES2 complexes.