148 results for Vector Auto Regression
in Queensland University of Technology - ePrints Archive
Abstract:
BACKGROUND Pandemic influenza A (H1N1) has a significant public health impact. This study aimed to examine the effect of socio-ecological factors on the transmission of H1N1 in Brisbane, Australia. METHODOLOGY We obtained data from Queensland Health on daily numbers of laboratory-confirmed H1N1 cases in Brisbane by statistical local area (SLA) in 2009. Data on weather and the socio-economic index were obtained from the Australian Bureau of Meteorology and the Australian Bureau of Statistics, respectively. A Bayesian spatial conditional autoregressive (CAR) model was used to quantify the relationship between variation in H1N1 and independent factors and to determine its spatiotemporal patterns. RESULTS The average increases in weekly H1N1 cases were 45.04% (95% credible interval (CrI): 42.63-47.43%) and 23.20% (95% CrI: 16.10-32.67%) for a 1 °C decrease in average weekly maximum temperature at a lag of one week and a 10 mm decrease in average weekly rainfall at a lag of one week, respectively. An interactive effect between temperature and rainfall on H1N1 incidence was found (change: 0.71%; 95% CrI: 0.48-0.98%). The auto-regression term was significantly associated with H1N1 transmission (change: 2.5%; 95% CrI: 1.39-3.72%). No significant association between the socio-economic indexes for areas (SEIFA) and H1N1 was observed at the SLA level. CONCLUSIONS Our results demonstrate that average weekly temperature at a lag of one week and rainfall at a lag of one week were substantially associated with H1N1 incidence at the SLA level. Ecological factors seem to have played an important role in H1N1 transmission cycles in Brisbane, Australia.
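The percentage changes reported above can be related to regression coefficients with a short sketch. It assumes a log link for the count model, which the abstract does not state; the function names are hypothetical and the only data used is the 45.04% figure quoted in the abstract.

```python
import math

def percent_change(beta: float, delta: float = 1.0) -> float:
    """Percent change in the expected count when a covariate changes by
    `delta`, assuming a log link: E[y] = exp(... + beta * x)."""
    return (math.exp(beta * delta) - 1.0) * 100.0

def coefficient_from_percent(pct: float, delta: float = 1.0) -> float:
    """Invert percent_change: the coefficient per unit of the covariate
    that yields a `pct` percent change for a covariate change of `delta`."""
    return math.log(1.0 + pct / 100.0) / delta

# A 45.04% increase for a 1 °C *decrease* (delta = -1) implies a
# negative coefficient per +1 °C under the assumed log link.
beta_temp = coefficient_from_percent(45.04, delta=-1.0)
print(round(beta_temp, 4))
print(round(percent_change(beta_temp, delta=-1.0), 2))
```

Under this assumption the effect is multiplicative, so a 2 °C decrease would correspond to `percent_change(beta_temp, -2.0)` rather than twice 45.04%.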
Abstract:
Background The residue-wise contact order (RWCO) describes the sequence separations between a residue of interest and its contacting residues in a protein sequence. It is a new kind of one-dimensional protein structure that represents the extent of long-range contacts and is considered a generalization of contact order. Together with secondary structure, accessible surface area, the B factor, and contact number, RWCO provides comprehensive and indispensable information for reconstructing the three-dimensional structure of a protein from a set of one-dimensional structural properties. Accurately predicting RWCO values could have many important applications in protein three-dimensional structure prediction and protein folding rate prediction, and could give deep insights into protein sequence-structure relationships. Results We developed a novel approach to predict residue-wise contact order values in proteins based on support vector regression (SVR), starting from primary amino acid sequences. We explored seven different sequence encoding schemes to examine their effects on prediction performance: local sequence in the form of PSI-BLAST profiles; local sequence plus amino acid composition; local sequence plus molecular weight; local sequence plus secondary structure predicted by PSIPRED; local sequence plus molecular weight and amino acid composition; local sequence plus molecular weight and predicted secondary structure; and local sequence plus molecular weight, amino acid composition and predicted secondary structure. When using local sequences with multiple sequence alignments in the form of PSI-BLAST profiles, we could predict the RWCO distribution with a Pearson correlation coefficient (CC) between the predicted and observed RWCO values of 0.55 and a root mean square error (RMSE) of 0.82, based on a well-defined dataset of 680 protein sequences.
Moreover, by incorporating global features such as molecular weight and amino acid composition, we could further improve the prediction performance to a CC of 0.57 and an RMSE of 0.79. In addition, incorporating the secondary structure predicted by PSIPRED was found to significantly improve the prediction performance and yielded the best prediction accuracy, with a CC of 0.60 and an RMSE of 0.78, which is at least comparable with the performance of other existing methods. Conclusion The SVR method shows a prediction performance competitive with, or at least comparable to, the previously developed linear regression-based methods for predicting RWCO values. In contrast to support vector classification (SVC), SVR is very good at estimating the raw value profiles of the samples. The successful application of the SVR approach in this study indicates that support vector regression is a powerful tool for extracting the protein sequence-structure relationship and for estimating protein structural profiles from amino acid sequences.
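The two evaluation measures quoted above, the Pearson correlation coefficient (CC) and the root mean square error (RMSE), can be computed as in the following minimal sketch. The function names and the toy numbers are illustrative only; the abstract does not say whether RWCO values were normalized before scoring, so no attempt is made to reproduce the reported figures.

```python
import math

def pearson_cc(predicted, observed):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    dev_p = [p - mean_p for p in predicted]
    dev_o = [o - mean_o for o in observed]
    covariance = sum(a * b for a, b in zip(dev_p, dev_o))
    norm = math.sqrt(sum(a * a for a in dev_p) * sum(b * b for b in dev_o))
    return covariance / norm

def rmse(predicted, observed):
    """Root mean square error between predictions and observations."""
    n = len(predicted)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)

# Toy per-residue values (made up for illustration, not from the paper).
observed = [0.2, 1.1, 0.7, 1.9, 0.4]
predicted = [0.3, 0.9, 0.8, 1.5, 0.6]
print(pearson_cc(predicted, observed), rmse(predicted, observed))
```

CC measures how well the predicted profile tracks the shape of the observed one, while RMSE measures the size of the errors, which is why the abstract reports both.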
Abstract:
This paper presents an approach to predicting the operating conditions of machines based on classification and regression trees (CART) and an adaptive neuro-fuzzy inference system (ANFIS), combined with a direct prediction strategy for multi-step-ahead time series prediction. In this study, the number of available observations and the number of predicted steps are first determined using the false nearest neighbor method and the auto mutual information technique, respectively. These values are then used as inputs to the prediction models to forecast the future values of the machines' operating conditions. The performance of the proposed approach is evaluated using real trending data from a low methane compressor. A comparative study of the predictions obtained from the CART and ANFIS models is also carried out to appraise the prediction capability of these models. The results show that the ANFIS prediction model can track changes in machine condition and has the potential to be used as a tool for machine fault prognosis.
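The direct prediction strategy mentioned above trains one model per forecast horizon, each mapping the last d observations straight to the value h steps ahead. The sketch below illustrates only that strategy: a 1-nearest-neighbour lookup stands in for the CART and ANFIS models, and the function names, the window length `d` and the toy series are assumptions made for the example.

```python
def make_direct_dataset(series, d, h):
    """Build (window, target) pairs: each window holds d consecutive
    observations and the target is the value h steps after the window."""
    windows, targets = [], []
    for t in range(d, len(series) - h + 1):
        windows.append(series[t - d:t])
        targets.append(series[t + h - 1])
    return windows, targets

def nearest_neighbour_predict(windows, targets, query):
    """Stand-in regressor (NOT CART or ANFIS): return the target of the
    training window closest to `query` in squared Euclidean distance."""
    best = min(
        range(len(windows)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(windows[i], query)),
    )
    return targets[best]

def direct_forecast(series, d, horizon):
    """Direct strategy: a separate model for every step h = 1..horizon,
    all fed with the same last d observations (no recursive feedback)."""
    query = series[-d:]
    forecasts = []
    for h in range(1, horizon + 1):
        windows, targets = make_direct_dataset(series, d, h)
        forecasts.append(nearest_neighbour_predict(windows, targets, query))
    return forecasts

# Toy periodic series; the real study used compressor trending data.
series = [0, 1, 2, 3] * 5
print(direct_forecast(series, d=2, horizon=3))
```

Because each horizon has its own model, forecast errors are not fed back into later steps, at the cost of training as many models as there are predicted steps.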
Abstract:
A satellite-based observation system can continuously or repeatedly generate a user state vector time series that may contain useful information. One typical example is the collection of International GNSS Service (IGS) station daily and weekly combined solutions. Another example is the epoch-by-epoch kinematic position time series of a receiver derived by a GPS real-time kinematic (RTK) technique. Although some multivariate analysis techniques have been adopted to assess the noise characteristics of multivariate state time series, statistical testing has been limited to univariate time series. After reviewing frequently used hypothesis test statistics in the univariate analysis of GNSS state time series, the paper presents a number of T-squared multivariate statistics for use in the analysis of multivariate GNSS state time series. These T-squared test statistics take into account the correlation between coordinate components, which is neglected in univariate analysis. Numerical analysis was conducted with the multi-year time series of an IGS station to schematically demonstrate the results of multivariate hypothesis testing in comparison with univariate hypothesis testing. The results demonstrate that, in general, testing for multivariate mean shifts and outliers tends to reject fewer data samples than testing for univariate mean shifts and outliers at the same confidence level. It is noted that neither univariate nor multivariate data analysis methods are intended to replace physical analysis. Instead, they should be treated as complementary statistical methods for a priori or a posteriori investigations. Physical analysis is subsequently necessary to refine and interpret the results.
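As an illustration of how a T-squared statistic accounts for correlation between coordinate components, the sketch below computes the textbook one-sample Hotelling T-squared statistic, T² = n (x̄ − μ₀)ᵀ S⁻¹ (x̄ − μ₀). This is an assumption about the general form only; the abstract does not spell out the paper's statistics, and the function names and toy data are hypothetical.

```python
def mean_vector(rows):
    """Component-wise mean of a list of equal-length coordinate vectors."""
    n = len(rows)
    return [sum(column) / n for column in zip(*rows)]

def covariance_matrix(rows):
    """Unbiased sample covariance matrix (divisor n - 1)."""
    n = len(rows)
    m = mean_vector(rows)
    p = len(m)
    return [
        [sum((r[i] - m[i]) * (r[j] - m[j]) for r in rows) / (n - 1)
         for j in range(p)]
        for i in range(p)
    ]

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with
    partial pivoting (a must be square and non-singular)."""
    n = len(b)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def hotelling_t2(rows, mu0):
    """One-sample Hotelling T^2 for H0: mean vector equals mu0."""
    n = len(rows)
    diff = [m - m0 for m, m0 in zip(mean_vector(rows), mu0)]
    weighted = solve(covariance_matrix(rows), diff)  # S^-1 (mean - mu0)
    return n * sum(d * w for d, w in zip(diff, weighted))

# Toy two-component residuals (illustrative, not IGS data).
rows = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
print(hotelling_t2(rows, [1.0, 0.0]))
```

Under a multivariate normal assumption, (n − p) / (p (n − 1)) · T² follows an F distribution with p and n − p degrees of freedom, which gives the rejection threshold at a chosen confidence level.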
Abstract:
Expert elicitation is the process of retrieving and quantifying expert knowledge in a particular domain. Such information is of particular value when empirical data are expensive, limited, or unreliable. This paper describes a new software tool, called Elicitator, which assists in quantifying expert knowledge in a form suitable for use as a prior model in Bayesian regression. Potential environmental domains for applying this elicitation tool include habitat modeling, assessing detectability or eradication, ecological condition assessments, risk analysis, and quantifying inputs to complex models of ecological processes. The tool has been developed to be user-friendly and extensible, and to facilitate consistent and repeatable elicitation of expert knowledge across these various domains. We demonstrate its application to elicitation for logistic regression in a geographically based ecological context. The underlying statistical methodology is also novel, utilizing an indirect elicitation approach to target expert knowledge on a case-by-case basis. For several elicitation sites (or cases), experts are asked simply to quantify their estimated ecological response (e.g. probability of presence), and its range of plausible values, after inspecting (habitat) covariates via GIS.
Abstract:
The roles of weather variability and sunspots in the occurrence of cyanobacteria blooms were investigated using cyanobacteria cell data collected from the Fred Haigh Dam, Queensland, Australia. A time series generalized linear model and a classification and regression tree (CART) model were used in the analysis. Data on notified cell numbers of cyanobacteria and on weather variables for the period 2001 to 2005 were provided by the Australian Department of Natural Resources and Water and the Australian Bureau of Meteorology, respectively. The results indicate that monthly minimum temperature (relative risk [RR]: 1.13; 95% confidence interval [CI]: 1.02-1.25) and rainfall (RR: 1.11; 95% CI: 1.03-1.20) were positively associated, and relative humidity (RR: 0.94; 95% CI: 0.91-0.98) and wind speed (RR: 0.90; 95% CI: 0.82-0.98) were negatively associated, with cyanobacterial numbers after adjustment for seasonality and auto-correlation. The CART model showed that cyanobacteria numbers were best described by an interaction between minimum temperature, relative humidity, and sunspot numbers. When minimum temperature exceeded 18 °C and relative humidity was under 66%, the number of cyanobacterial cells rose 2.15-fold. We conclude that weather variability and sunspot activity may affect cyanobacterial blooms in dams.
Abstract:
Over the past decade, plants have been used as expression hosts for the production of pharmaceutically important and commercially valuable proteins. Plants offer many advantages over other expression systems such as lower production costs, rapid scale up of production, similar post-translational modification as animals and the low likelihood of contamination with animal pathogens, microbial toxins or oncogenic sequences. However, improving recombinant protein yield remains one of the greatest challenges to molecular farming. In-Plant Activation (InPAct) is a newly developed technology that offers activatable and high-level expression of heterologous proteins in plants. InPAct vectors contain the geminivirus cis elements essential for rolling circle replication (RCR) and are arranged such that the gene of interest is only expressed in the presence of the cognate viral replication-associated protein (Rep). The expression of Rep in planta may be controlled by a tissue-specific, developmentally regulated or chemically inducible promoter such that heterologous protein accumulation can be spatially and temporally controlled. One of the challenges for the successful exploitation of InPAct technology is the control of Rep expression as even very low levels of this protein can reduce transformation efficiency, cause abnormal phenotypes and premature activation of the InPAct vector in regenerated plants. Tight regulation over transgene expression is also essential if expressing cytotoxic products. Unfortunately, many tissue-specific and inducible promoters are unsuitable for controlling expression of Rep due to low basal activity in the absence of inducer or in tissues other than the target tissue. This PhD aimed to control Rep activity through the production of single chain variable fragments (scFvs) specific to the motif III of Tobacco yellow dwarf virus (TbYDV) Rep. 
Due to the important role played by the conserved motif III in the RCR, it was postulated that such scFvs can be used to neutralise the activity of the low amount of Rep expressed from a “leaky” inducible promoter, thus preventing activation of the TbYDV-based InPAct vector until intentional induction. Such scFvs could also offer the potential to confer partial or complete resistance to TbYDV, and possibly heterologous viruses as motif III is conserved between geminiviruses. Studies were first undertaken to determine the levels of TbYDV Rep and TbYDV replication-associated protein A (RepA) required for optimal transgene expression from a TbYDV-based InPAct vector. Transient assays in a non-regenerable Nicotiana tabacum (NT-1) cell line were undertaken using a TbYDV-based InPAct vector containing the uidA reporter gene (encoding GUS) in combination with TbYDV Rep and RepA under the control of promoters with high (CaMV 35S) or low (Banana bunchy top virus DNA-R, BT1) activity. The replication enhancer protein of Tomato leaf curl begomovirus (ToLCV), REn, was also used in some co-bombardment experiments to examine whether RepA could be substituted by a replication enhancer from another geminivirus genus. GUS expression was observed both quantitatively and qualitatively by fluorometric and histochemical assays, respectively. GUS expression from the TbYDV-based InPAct vector was found to be greater when Rep was expected to be expressed at low levels (BT1 promoter) rather than high levels (35S promoter). GUS expression was further enhanced when Rep and RepA were co-bombarded with a low ratio of Rep to RepA. Substituting TbYDV RepA with ToLCV REn also enhanced GUS expression but more importantly highest GUS expression was observed when cells were co-transformed with expression vectors directing low levels of Rep and high levels of RepA irrespective of the level of REn. In this case, GUS expression was approximately 74-fold higher than that from a non-replicating vector. 
The use of different terminators, namely CaMV 35S and Nos terminators, in InPAct vectors was found to influence GUS expression. In the presence of Rep, GUS expression was greater using pInPActGUS-Nos rather than pInPActGUS-35S. The only instance of GUS expression being greater from vectors containing the 35S terminator was when comparing expression from cells transformed with Rep, RepA and REn-expressing vectors and either non-replicating vectors, p35SGS-Nos or p35SGS-35S. This difference was most likely caused by an interaction of viral replication proteins with each other and the terminators. These results indicated that (i) the level of replication-associated proteins is critical to high transgene expression, (ii) the choice of terminator within the InPAct vector may affect expression levels and (iii) very low levels of Rep can activate InPAct vectors, hence controlling its activity is critical. Prior to generating recombinant scFvs, a recombinant TbYDV Rep was produced in E. coli to act as a control to enable the screening for Rep-specific antibodies. A bacterial expression vector was constructed to express recombinant TbYDV Rep with an N-terminal His-tag (N-His-Rep). Despite investigating several purification techniques including Ni-NTA, anion exchange, hydrophobic interaction and size exclusion chromatography, N-His-Rep could only be partially purified using a Ni-NTA column under native conditions. Although it was not certain that this recombinant N-His-Rep had the same conformation as the native TbYDV Rep and was functional, results from an electromobility shift assay (EMSA) showed that N-His-Rep was able to interact with the TbYDV LIR and was, therefore, possibly functional. Two hybridoma cell lines from mice, immunised with a synthetic peptide containing the TbYDV Rep motif III amino acid sequence, were generated by GenScript (USA).
Monoclonal antibodies secreted by the two hybridoma cell lines were first screened against denatured N-His-Rep in Western analysis. After demonstrating their ability to bind N-His-Rep, two scFvs (scFv1 and scFv2) were generated using a PCR-based approach. Whereas the variable heavy chain (VH) from both cell lines could be amplified, only the variable light chain (VL) from cell line 2 was amplified. As a result, scFv1 contained VH and VL from cell line 1, whereas scFv2 contained VH from cell line 2 and VL from cell line 1. Both scFvs were first expressed in E. coli in order to evaluate their affinity to the recombinant TbYDV N-His-Rep. The preliminary results demonstrated that both scFvs were able to bind to the denatured N-His-Rep. However, EMSAs revealed that only scFv2 was able to bind to native N-His-Rep and prevent it from interacting with the TbYDV LIR. Each scFv was cloned into plant expression vectors and co-bombarded into NT-1 cells with the TbYDV-based InPAct GUS expression vector and pBT1-Rep to examine whether the scFvs could prevent Rep from mediating RCR. Although it was expected that the addition of the scFvs would result in decreased GUS expression, GUS expression was found to slightly increase. This increase was even more pronounced when the scFvs were targeted to the cell nucleus by the inclusion of the Simian virus 40 large T antigen (SV40) nuclear localisation signal (NLS). It was postulated that the scFvs were binding to a proportion of Rep, leaving a small amount available to mediate RCR. The outcomes of this project provide evidence that very high levels of recombinant protein can theoretically be expressed using InPAct vectors with judicious selection and control of viral replication proteins. However, the question of whether the scFvs generated in this project have sufficient affinity for TbYDV Rep to prevent its activity in a stably transformed plant remains unknown. 
Other scFvs with different combinations of VH and VL may have greater affinity for TbYDV Rep. Such scFvs, when expressed at high levels in planta, might also confer resistance to TbYDV and possibly heterologous geminiviruses.