937 results for likelihood-based inference
Abstract:
Research on the cognitive and decision-making processes of individuals who choose to engage in ideologically based violence is vital. Our research examines how abstract and concrete construal mindsets affect the likelihood of engaging in ideologically based violence. Construal Level Theory (CLT) states that an abstract mindset (high-level construal), as opposed to a concrete mindset (low-level construal), is associated with a greater likelihood of engaging in goal-oriented, value-motivated behaviors. Assuming that ideologically based violence is goal-oriented, we hypothesized that high-level construal should result in an increased likelihood of engaging in ideologically based violence. In the pilot study, we developed and tested 24 vignettes covering controversial topics and assessed them on features such as relatability, emotional impact, and capacity to elicit a violent reaction. The ten most impactful vignettes were selected for use in the primary investigations. The two primary investigations examined the effect of high- and low-level construal manipulations on self-reported likelihood of engaging in ideologically based violence. Self-reported willingness was measured through an ideological violence assessment. Data trends implied that participants were engaged in the study, as they reported a higher willingness to engage in ideologically based violence when they had a higher passion for the vignette's social issue topic. Our results did not indicate a significant relationship between the construal manipulations and level of passion for a topic.
Abstract:
Effective school discipline practices are essential to keeping schools safe and creating an optimal learning environment. However, overreliance on exclusionary discipline often removes students from the school setting and deprives them of the opportunity to learn. Previous research has suggested that students are being introduced to the juvenile justice system through the use of school-based juvenile court referrals. In 2011, approximately 1.2 million delinquency cases were referred to the juvenile courts in the United States. Preliminary evidence suggests that an increasing number of these referrals have originated in the schools. This study investigated school-based referrals to the juvenile courts as an element of the School-to-Prison Pipeline (StPP). The likelihood of school-based juvenile court referrals and the rate of dismissal of these referrals were examined in several states using data from the National Juvenile Court Data Archives. In addition, the study examined race and special education status as predictors of school-based juvenile court referrals. Descriptive statistics, logistic regression, and odds ratios were used to analyze the data, draw conclusions from the findings, and recommend appropriate school discipline practices.
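As a hedged illustration of the kind of analysis described above (not the study's actual data or model), the sketch below fits a logistic regression on simulated referral data and exponentiates the coefficients to obtain odds ratios; the variable names (minority, special_ed, school_based) are hypothetical stand-ins for the study's predictors.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Hypothetical data: whether a juvenile-court referral was school-based (1/0),
# with race (minority = 1) and special-education status (1/0) as predictors.
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "minority": rng.integers(0, 2, n),
    "special_ed": rng.integers(0, 2, n),
})
logit_p = -1.0 + 0.6 * df["minority"] + 0.4 * df["special_ed"]
p = 1 / (1 + np.exp(-logit_p))
df["school_based"] = rng.binomial(1, p.to_numpy())

# Logistic regression: log-odds of a school-based referral.
X = sm.add_constant(df[["minority", "special_ed"]])
model = sm.Logit(df["school_based"], X).fit(disp=False)

# Exponentiated coefficients are the odds ratios for each predictor.
print(np.exp(model.params))
```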
Abstract:
By proposing a numerical method based on PCA-ANFIS (Adaptive Neuro-Fuzzy Inference System), this paper focuses on the problem of the uncertain effective cycle of water injection in oilfields. After the dimension of the original data is reduced by PCA, ANFIS is applied to train and test the reduced data. The PCA-ANFIS models are verified against injection statistics collected from 116 wells in an oilfield; the average absolute testing error is 1.80 months. In comparison, non-PCA models have an average error of 4.33 months, which shows that the testing accuracy is greatly enhanced by our approach. Based on these results, the PCA-ANFIS method is robust in predicting the effective cycle of water injection, helping oilfield developers design water injection schemes.
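ANFIS implementations are not part of the common Python scientific stack, so the sketch below only illustrates the overall pipeline the abstract describes: PCA reduces the dimension of the well data, and a stand-in regressor (an MLP here, not ANFIS) is trained on the reduced features. All data, feature counts, and component numbers are hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

# Hypothetical well features (e.g. permeability, porosity, injection rate, ...)
# and target: effective water-injection cycle in months.
rng = np.random.default_rng(42)
X = rng.normal(size=(116, 12))
y = 20 + 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=1.5, size=116)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Step 1: reduce the dimension of the raw well data with PCA.
pca = PCA(n_components=4).fit(X_tr)
Z_tr, Z_te = pca.transform(X_tr), pca.transform(X_te)

# Step 2: train a model on the reduced data (an MLP stands in for ANFIS here).
model = MLPRegressor(hidden_layer_sizes=(16,), max_iter=5000, random_state=0)
model.fit(Z_tr, y_tr)

# Report the average absolute error in months, mirroring the paper's evaluation.
print("MAE (months):", mean_absolute_error(y_te, model.predict(Z_te)))
```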
Abstract:
In physics, one attempts to infer the rules governing a system given only the results of imperfect measurements. Hence, microscopic theories may be effectively indistinguishable experimentally. We develop an operationally motivated procedure to identify the corresponding equivalence classes of states, and argue that the renormalization group (RG) arises from the inherent ambiguities associated with the classes: one encounters flow parameters as, e.g., a regulator, a scale, or a measure of precision, which specify representatives in a given equivalence class. This provides a unifying framework and reveals the role played by information in renormalization. We validate this idea by showing that it justifies the use of low-momentum n-point functions as statistically relevant observables around a Gaussian hypothesis. These results enable the calculation of distinguishability in quantum field theory. Our methods also provide a way to extend renormalization techniques to effective models which are not based on the usual quantum-field formalism, and elucidate the relationships between various types of RG.
Abstract:
The protein lysate array is an emerging technology for quantifying protein concentration ratios in multiple biological samples. It is gaining popularity and has the potential to answer questions about post-translational modifications and protein pathway relationships. Statistical inference for a parametric quantification procedure has been inadequately addressed in the literature, mainly due to two challenges: the increasing dimension of the parameter space and the need to account for dependence in the data. Each chapter of this thesis addresses one of these issues. In Chapter 1, an introduction to protein lysate array quantification is presented, followed by the motivations and goals for this thesis work. In Chapter 2, we develop a multi-step procedure for the Sigmoidal models, ensuring consistent estimation of the concentration level with full asymptotic efficiency. The results obtained in this chapter justify inferential procedures based on large-sample approximations. Simulation studies and real data analysis are used to illustrate the performance of the proposed method in finite samples. The multi-step procedure is simpler in both theory and computation than the single-step least squares method that has been used in current practice. In Chapter 3, we introduce a new model that accounts for the dependence structure of the errors through a nonlinear mixed effects model. We consider a method to approximate the maximum likelihood estimator of all the parameters. Using simulation studies on various error structures, we show that for data with non-i.i.d. errors the proposed method leads to more accurate estimates and better confidence intervals than the existing single-step least squares method.
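The thesis's multi-step estimator is not spelled out in the abstract; as a minimal sketch of the underlying quantification task, the code below fits a four-parameter sigmoidal response curve to a hypothetical dilution series by least squares. The model form, parameter values, and data are illustrative assumptions, not the thesis's procedure.

```python
import numpy as np
from scipy.optimize import curve_fit

# Sigmoidal (logistic) response model often used for dilution-series data:
# signal = a + b / (1 + exp(-(x - c) / d)), where x is the dilution step.
def sigmoid(x, a, b, c, d):
    return a + b / (1.0 + np.exp(-(x - c) / d))

# Hypothetical dilution series for one sample on a lysate array.
rng = np.random.default_rng(1)
x = np.arange(8, dtype=float)               # dilution steps
y = sigmoid(x, 0.2, 3.0, 4.0, 1.2) + rng.normal(scale=0.05, size=x.size)

# Least-squares fit of the four curve parameters from a rough starting guess.
params, cov = curve_fit(sigmoid, x, y, p0=[0.0, 3.0, 3.5, 1.0])
print("fitted parameters:", np.round(params, 3))
```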
Abstract:
Background: Calluna vulgaris is one of the most important landscaping plants produced in Germany. Its enormous economic success is due to the prolonged flower attractiveness of mutants in flower morphology, the so-called bud-bloomers. In this study, we present the first genetic linkage map of C. vulgaris, in which we mapped a locus of the economically highly desired trait "flower type". Results: The map was constructed in JoinMap 4.1 using 535 AFLP markers from a single mapping population. A large fraction (40%) of markers showed distorted segregation. To test the effect of segregation distortion on linkage estimation, these markers were sorted by their segregation ratio and added in groups to the data set. The plausibility of group formation was evaluated by comparing the "two-way pseudo-testcross" and the "integrated" mapping approaches. Furthermore, regression mapping was compared to the multipoint-likelihood algorithm. The majority of maps constructed by different combinations of these methods consisted of eight linkage groups, corresponding to the chromosome number of C. vulgaris. Conclusions: All maps confirmed the independent inheritance of the most important horticultural traits "flower type", "flower colour", and "leaf colour". An AFLP marker for the most important breeding target "flower type" was identified. The presented genetic map of C. vulgaris can now serve as a basis for further molecular marker selection and map-based cloning of the candidate gene encoding the unique flower architecture of C. vulgaris bud-bloomers. © 2013 Behrend et al.; licensee BioMed Central Ltd.
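The abstract does not state how segregation distortion was assessed; a common check, shown here purely as an assumed illustration with made-up counts, is a chi-square goodness-of-fit test of each dominant AFLP marker against the 1:1 presence/absence ratio expected in a two-way pseudo-testcross.

```python
from scipy.stats import chisquare

# Hypothetical presence/absence counts for one AFLP marker in the mapping
# population, expected to segregate 1:1 in a two-way pseudo-testcross design.
present, absent = 78, 42
total = present + absent

# Chi-square goodness-of-fit test against the expected 1:1 ratio.
stat, p = chisquare([present, absent], f_exp=[total / 2, total / 2])
print(f"chi2 = {stat:.2f}, p = {p:.4f}")  # a small p-value suggests distorted segregation
```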
Abstract:
Various environmental management systems, standards and tools are being created to assist companies in becoming more environmentally friendly. However, not all enterprises have adopted environmental policies at the same scale and range. Additionally, there is no existing guide to help them determine their level of environmental responsibility and, subsequently, to provide support to enable them to move forward towards environmental responsibility excellence. This research proposes the use of a Belief Rule-Based approach to assess an enterprise’s level of commitment to environmental issues. The Environmental Responsibility BRB assessment system has been developed for this research. Participating companies complete a structured questionnaire, and an automated analysis of their responses (using the Belief Rule-Based approach) determines their environmental responsibility level. This is followed by a recommendation on how to progress to the next level. The recommended best practices will help promote understanding, increase awareness, and make the organization greener. BRB systems consist of two parts: a Knowledge Base and an Inference Engine. The knowledge base in this research is constructed from an in-depth literature review and critical analyses of existing environmental performance assessment models, and is primarily guided by the EU Draft Background Report on "Best Environmental Management Practice in the Telecommunications and ICT Services Sector". The reasoning algorithm of the selected Drools JBoss BRB inference engine is forward chaining, in which inference iteratively searches for pattern matches between the input facts and the if-then clauses. However, the forward chaining mechanism is not equipped with uncertainty handling. Therefore, evidential reasoning is combined with forward chaining in a hybrid knowledge-representation inference scheme to accommodate imprecise, ambiguous, and fuzzy types of uncertainty. It is believed that such a system generates well-balanced, sensible results adapted to Green ICT readiness, helping enterprises focus on making their business operations more sustainable.
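The system itself is built on the Drools JBoss engine combined with evidential reasoning; purely as an illustration of the forward-chaining idea described above (not the project's rule base), here is a minimal Python sketch in which hypothetical rules fire whenever all their conditions are present in working memory.

```python
# Minimal forward-chaining sketch (not Drools): a rule fires when all of its
# conditions are present in working memory, asserting a new fact, and the
# loop repeats until nothing new can be inferred. Facts and rules here are
# hypothetical examples of environmental-responsibility criteria.
rules = [
    ({"recycles_equipment", "uses_renewable_energy"}, "intermediate_level"),
    ({"intermediate_level", "publishes_env_report"}, "advanced_level"),
]

def forward_chain(facts, rules):
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)   # rule fires: assert the conclusion
                changed = True
    return facts

answers = {"recycles_equipment", "uses_renewable_energy", "publishes_env_report"}
print(forward_chain(answers, rules))
```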
Abstract:
This is the author’s version of a work that was accepted for publication in AIDS Research and Human Retroviruses.
Abstract:
Efficient unsupervised training and inference in deep generative models remains a challenging problem. One fairly simple approach, the Helmholtz machine, consists of training a top-down directed generative model that is later used for approximate inference. Recent results suggest that better generative models can be obtained through better approximate inference procedures. Instead of improving the inference procedure, we propose here a new model, the bidirectional Helmholtz machine, which guarantees that the top-down and bottom-up distributions can be computed efficiently. We achieve this by interpreting the top-down and bottom-up models as approximate inference distributions, and then defining the model distribution as the geometric mean of these two distributions. We derive a lower bound on the likelihood of this model and show that optimizing this bound acts as a regularizer, ensuring that the Bhattacharyya distance between the top-down and bottom-up approximate distributions is minimized. This approach yields state-of-the-art results for generative models and favours significantly deeper networks. It also improves approximate inference by several orders of magnitude. In addition, we introduce a deep generative model based on BiHM models for semi-supervised training.
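As a small numerical illustration of two quantities mentioned above (not the BiHM training procedure itself), the sketch below forms the geometric-mean distribution of two hypothetical discrete distributions and computes the Bhattacharyya distance between them.

```python
import numpy as np

# Two hypothetical discrete distributions standing in for the top-down (p)
# and bottom-up (q) approximate inference distributions over the same states.
p = np.array([0.5, 0.3, 0.2])
q = np.array([0.4, 0.4, 0.2])

# Geometric-mean distribution: proportional to sqrt(p * q), renormalized.
gm = np.sqrt(p * q)
Z = gm.sum()                       # normalizing constant
gm_dist = gm / Z

# Bhattacharyya distance: -log of the Bhattacharyya coefficient sum(sqrt(p*q)).
bc = np.sum(np.sqrt(p * q))        # equals Z above
bhattacharyya_distance = -np.log(bc)

print("geometric-mean distribution:", gm_dist)
print("Bhattacharyya distance:", bhattacharyya_distance)
```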
Abstract:
This report discusses the calculation of analytic second-order bias corrections for the maximum likelihood estimates (MLEs, for short) of the unknown parameters of distributions used in quality and reliability analysis. It is well known that MLEs are widely used to estimate the unknown parameters of probability distributions due to their various desirable properties; for example, the MLEs are asymptotically unbiased, consistent, and asymptotically normal. However, many of these properties depend on extremely large sample sizes. Properties such as unbiasedness may not hold for small or even moderate sample sizes, which are more common in real data applications. Therefore, bias-corrected techniques for the MLEs are desired in practice, especially when the sample size is small. Two commonly used techniques to reduce the bias of the MLEs are the ‘preventive’ and ‘corrective’ approaches. Both can reduce the bias of the MLEs to order O(n⁻²), but the ‘preventive’ approach does not have an explicit closed-form expression. Consequently, we mainly focus on the ‘corrective’ approach in this report. To illustrate the importance of bias correction in practice, we apply the bias-corrected method to two popular lifetime distributions: the inverse Lindley distribution and the weighted Lindley distribution. Numerical studies based on the two distributions show that the considered bias-corrected technique is highly recommended over the commonly used estimators without bias correction. Therefore, special attention should be paid when we estimate the unknown parameters of probability distributions in scenarios where the sample size is small or moderate.
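The report's corrective formulas for the inverse and weighted Lindley distributions are not reproduced here; the sketch below illustrates the idea on the simpler exponential distribution, whose first-order MLE bias for the rate parameter is λ/n, and compares the bias of the raw and corrected estimators at a small sample size by simulation.

```python
import numpy as np

# 'Corrective' bias reduction illustrated on the exponential distribution:
# the first-order bias of the MLE of the rate is lambda/n, so the corrected
# estimator is lambda_hat * (1 - 1/n). (The report's actual targets are the
# inverse and weighted Lindley distributions; this simpler case just shows
# the effect at a small sample size.)
rng = np.random.default_rng(0)
true_rate, n, reps = 2.0, 10, 100_000

mle = np.empty(reps)
for r in range(reps):
    x = rng.exponential(scale=1.0 / true_rate, size=n)
    mle[r] = n / x.sum()               # MLE of the rate parameter

corrected = mle * (1.0 - 1.0 / n)      # subtract the estimated first-order bias

print("bias of MLE:      ", mle.mean() - true_rate)
print("bias of corrected:", corrected.mean() - true_rate)
```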
Abstract:
Detection canines represent the fastest and most versatile means of illicit material detection. At its simplest, this research endeavor is the improvement of detection canines through training, training aids, and calibration. This study focuses on developing a universal calibration compound with which all detection canines, regardless of detection substance, can be tested daily to ensure that they are working within acceptable parameters. Surrogate continuation aids (SCAs) were developed for peroxide-based explosives, along with the validation of the SCAs already developed within the International Forensic Research Institute (IFRI) prototype surrogate explosives kit. Storage parameters of the SCAs were evaluated to give recommendations to the detection canine community on the best possible training aid storage solution that minimizes the likelihood of contamination. Two commonly used and accepted detection canine imprinting methods were also evaluated for the speed with which the canine is trained and for their reliability. As a result of this study, SCAs have been developed for explosive detection canine use covering peroxide-based explosives, TNT-based explosives, nitroglycerin-based explosives, tagged explosives, plasticized explosives, and smokeless powders. Through the use of these surrogate continuation aids, a more uniform and reliable system of training can be implemented in the field than is currently used. By examining the storage parameters of the SCAs, an ideal storage system has been developed using three levels of containment to reduce possible contamination. The developed calibration compound will ease the growing concerns over the legality and reliability of detection canine use by detailing the daily working parameters of the canine, allowing the Daubert rules of evidence admissibility to be applied. Through canine field testing, it has been shown that the IFRI SCAs outperform other commercially available training aids on the market. Additionally, of the imprinting methods tested, no difference was found in the speed with which the canines are trained or in their reliability to detect illicit materials. Therefore, if the recommendations from this study are followed, the detection canine community will greatly benefit from scientifically validated training techniques and training aids.
Abstract:
The current approach to data analysis for the Laser Interferometer Space Antenna (LISA) depends on the time delay interferometry (TDI) observables, which have to be generated before any weak-signal detection can be performed. These are linear combinations of the raw data with appropriate time shifts that lead to the cancellation of the laser frequency noises. This is possible because of the multiple occurrences of the same noises in the different raw data. Originally, these observables were manually generated starting with LISA as a simple stationary array and then adjusted to incorporate the antenna's motions. However, none of the observables survived the flexing of the arms, in that they did not lead to cancellation with the same structure. The principal component approach, presented by Romano and Woan, is another way of handling these noises that simplified the data analysis by removing the need to create the observables before the analysis. This method also depends on the multiple occurrences of the same noises but, instead of using them for cancellation, it takes advantage of the correlations that they produce between the different readings. These correlations can be expressed in a noise (data) covariance matrix, which occurs in the Bayesian likelihood function when the noises are assumed to be Gaussian. Romano and Woan showed that performing an eigendecomposition of this matrix produced two distinct sets of eigenvalues that can be distinguished by the absence of laser frequency noise from one set. The transformation of the raw data using the corresponding eigenvectors also produced data that were free from the laser frequency noises. This result led to the idea that the principal components may actually be time delay interferometry observables, since they produced the same outcome, that is, data that are free from laser frequency noise. The aims here were (i) to investigate the connection between the principal components and these observables, (ii) to prove that the data analysis using them is equivalent to that using the traditional observables, and (iii) to determine how this method adapts to the real LISA, especially the flexing of the antenna. For testing the connection between the principal components and the TDI observables, a 10×10 covariance matrix containing integer values was used in order to obtain an algebraic solution for the eigendecomposition. The matrix was generated using fixed unequal arm lengths and stationary noises with equal variances for each noise type. Results confirm that all four Sagnac observables can be generated from the eigenvectors of the principal components. The observables obtained from this method, however, are tied to the length of the data and are not general expressions like the traditional observables; for example, the Sagnac observables for two different time stamps were generated from different sets of eigenvectors. It was also possible to generate the frequency-domain optimal AET observables from the principal components obtained from the power spectral density matrix. These results indicate that this method is another way of producing the observables, so analysis using principal components should give the same results as that using the traditional observables. This was proven by the fact that the same relative likelihoods (within 0.3%) were obtained from the Bayesian estimates of the signal amplitude of a simple sinusoidal gravitational wave using the principal components and the optimal AET observables.
This method fails if the eigenvalues that are free from laser frequency noises are not generated. These are obtained from the covariance matrix, and the properties of LISA required for its computation are the phase-locking, arm lengths, and noise variances. Preliminary results on the effects of these properties on the principal components indicate that only the absence of phase-locking prevented their production. The flexing of the antenna results in time-varying arm lengths, which will appear in the covariance matrix and, from our toy-model investigations, this did not prevent the occurrence of the principal components. The difficulty with flexing, and also with non-stationary noises, is that the Toeplitz structure of the matrix will be destroyed, which will affect any computation methods that take advantage of this structure. In terms of separating the two sets of data for the analysis, this was not necessary because the laser frequency noises are very large compared to the photodetector noises, which resulted in a significant reduction in the data containing them after the matrix inversion. In the frequency domain, the power spectral density matrices were block diagonal, which simplified the computation of the eigenvalues by allowing them to be done separately for each block. The results in general showed a lack of principal components in the absence of phase-locking, except for the zero bin. The major difference with the power spectral density matrix is that the time-varying arm lengths and non-stationarity do not show up because of the summation in the Fourier transform.
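As a toy illustration of the principal-component idea described above (not the actual LISA covariance model), the sketch below builds channels that share one very large common noise, eigendecomposes their covariance, and projects onto the small-eigenvalue subspace in which the common noise cancels.

```python
import numpy as np

# Toy version of the principal-component idea: several raw readings share one
# very large common noise (standing in for laser frequency noise) plus small
# independent noises. Eigendecomposition of the covariance separates a
# large-eigenvalue subspace (common noise present) from a small-eigenvalue
# subspace (common noise cancelled), analogous to TDI-like combinations.
rng = np.random.default_rng(3)
n_samples, n_channels = 5000, 4

laser = rng.normal(scale=100.0, size=(n_samples, 1))        # large common noise
small = rng.normal(scale=1.0, size=(n_samples, n_channels)) # independent noises
data = laser + small                                        # each channel sees both

cov = np.cov(data, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)       # eigenvalues in ascending order

print("eigenvalues:", np.round(eigvals, 1))  # one huge, the rest small

# Project the data onto the small-eigenvalue eigenvectors: the common noise
# cancels in these combinations.
clean = data @ eigvecs[:, :-1]
print("residual std of projected data:", clean.std(axis=0).round(2))
```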
Abstract:
Understanding how virus strains offer protection against closely related emerging strains is vital for creating effective vaccines. For many viruses, including Foot-and-Mouth Disease Virus (FMDV) and the Influenza virus, where multiple serotypes often co-circulate, in vitro testing of large numbers of vaccines can be infeasible. Therefore, the development of an in silico predictor of cross-protection between strains is important to help optimise vaccine choice. Vaccines will offer cross-protection against closely related strains, but not against those that are antigenically distinct. To be able to predict cross-protection, we must understand the antigenic variability within a virus serotype and its distinct lineages, and identify the antigenic residues and evolutionary changes that cause the variability. In this thesis we present a family of sparse hierarchical Bayesian models for detecting relevant antigenic sites in virus evolution (SABRE), as well as an extended version of the method, the extended SABRE (eSABRE) method, which better takes into account the data collection process. The SABRE methods are a family of sparse Bayesian hierarchical models that use spike and slab priors to identify sites in the viral protein which are important for the neutralisation of the virus. In this thesis we demonstrate how the SABRE methods can be used to identify antigenic residues within different serotypes, and show how the SABRE method outperforms established methods, mixed-effects models based on forward variable selection or l1 regularisation, on both synthetic and viral datasets. In addition, we test a number of different versions of the SABRE method, comparing conjugate and semi-conjugate prior specifications as well as an alternative to the spike and slab prior, the binary mask model. We also propose novel proposal mechanisms for the Markov chain Monte Carlo (MCMC) simulations, which improve mixing and convergence over that of the established component-wise Gibbs sampler. The SABRE method is then applied to datasets from FMDV and the Influenza virus in order to identify a number of known antigenic residues and to provide hypotheses of other potentially antigenic residues. We also demonstrate how the SABRE methods can be used to create accurate predictions of the important evolutionary changes of the FMDV serotypes. In this thesis we provide an extended version of the SABRE method, the eSABRE method, based on a latent variable model. The eSABRE method takes the structure of the FMDV and Influenza datasets further into account through the latent variable model and gives an improvement in the modelling of the error. We show how the eSABRE method outperforms the SABRE methods in simulation studies and propose a new information criterion for selecting the random effects factors that should be included in the eSABRE method: the block integrated Widely Applicable Information Criterion (biWAIC). We demonstrate how biWAIC performs comparably to two other methods for selecting the random effects factors and combine it with the eSABRE method to apply it to two large Influenza datasets. Inference in these large datasets is computationally infeasible with the SABRE methods but, as a result of the improved structure of the likelihood, we are able to show how the eSABRE method offers a computational improvement, leading it to be used on these datasets.
The results of the eSABRE method show that we can use the method in a fully automatic manner to identify a large number of antigenic residues on a variety of the antigenic sites of two Influenza serotypes, as well as making predictions of a number of nearby sites that may also be antigenic and are worthy of further experimental investigation.
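The SABRE models themselves are hierarchical mixed-effects models with spike and slab priors and bespoke MCMC; the sketch below only illustrates the spike-and-slab construction, sampling binary inclusion indicators and coefficients for a hypothetical sparse regression. It is not the authors' model or sampler, and all names and values are assumed for illustration.

```python
import numpy as np

# Minimal spike-and-slab illustration: each regression coefficient beta_j is
# drawn from a narrow "spike" (effectively zero) or a wide "slab" depending on
# a binary inclusion indicator gamma_j. In SABRE-style models the indicators
# flag antigenically relevant residues; here everything is hypothetical.
rng = np.random.default_rng(7)
p = 10                      # number of candidate residues / predictors
pi_incl = 0.2               # prior inclusion probability
spike_sd, slab_sd = 0.01, 2.0

gamma = rng.binomial(1, pi_incl, size=p)               # inclusion indicators
beta = np.where(gamma == 1,
                rng.normal(0.0, slab_sd, size=p),      # slab: real effect
                rng.normal(0.0, spike_sd, size=p))     # spike: effectively zero

# Simulate responses from the resulting sparse linear model y = X beta + noise.
n = 50
X = rng.normal(size=(n, p))
y = X @ beta + rng.normal(scale=0.5, size=n)

print("included predictors:", np.flatnonzero(gamma))
print("their coefficients: ", np.round(beta[gamma == 1], 2))
```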
Abstract:
Our aim was to determine the normative reference values of cardiorespiratory fitness (CRF) and to establish the proportion of subjects with low CRF suggestive of future cardio-metabolic risk.