963 results for STATISTICAL MODELS
Abstract:
Student dropout affects both private and public universities in Brazil, causing financial losses proportional to its incidence: 12% and 26%, respectively, at the national level, and 23% at the Universidade de São Paulo (USP), which is why the variables governing this behavior must be understood. In this context, the research presents the losses caused by dropout and the importance of studying it at the Escola Politécnica da USP (EPUSP) (Section 1), develops a literature review on the causes of dropout (Section 2), and proposes methods for obtaining dropout rates from the Federal Government and USP databases (Section 3). The results are presented in Section 4. To draw inferences about the causes of dropout at EPUSP, databases were analyzed that, described and processed in Section 5.1, contain information (e.g., type of admission and exit, length of enrollment, and academic transcript) on 16,664 students admitted between 1970 and 2000; statistical models were proposed, and the concepts of the chi-square (χ²) and Student's t hypothesis tests used in the research were detailed (Section 5.2). The descriptive statistics show that EPUSP has a dropout rate of 15% (with the highest incidence in the 2nd year: 24.65%), that dropouts remain enrolled for 3.8 years, that the probability of dropping out increases after the 6th year, and that the algebra and calculus courses are the main failing subjects in the 1st year (Section 5.3). The inferential statistics demonstrated relationships between dropout and mode of admission to EPUSP and between dropout and failure in 1st-year EPUSP courses; combined with the descriptive statistics, these results point to vocational deficit, lack of persistence, lack of acclimatization to EPUSP, and deficiencies in prior schooling as the variables responsible for dropout (Section 5.4).
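For readers unfamiliar with the tests named in Section 5.2, the sketch below illustrates, with entirely hypothetical counts and grades rather than the EPUSP data, how a chi-square test of independence between admission mode and dropout status, and a Student's t test on a first-year grade, can be run in Python with scipy:

```python
# Hypothetical example of the Section 5.2 tests; the counts and grades below
# are invented for illustration and are not the EPUSP data.
import numpy as np
from scipy import stats

# Contingency table: rows are admission modes (hypothetical), columns are
# [dropped out, graduated].
table = np.array([
    [120, 880],
    [200, 800],
])
chi2, p, dof, expected = stats.chi2_contingency(table)
print(f"chi-square={chi2:.2f}, dof={dof}, p={p:.4f}")

# Student's t test comparing a hypothetical 1st-year grade between students
# who later dropped out and those who stayed.
rng = np.random.default_rng(0)
grades_dropout = rng.normal(4.5, 1.5, 300)
grades_stayers = rng.normal(5.5, 1.5, 1200)
t, p_t = stats.ttest_ind(grades_dropout, grades_stayers)
print(f"t={t:.2f}, p={p_t:.4g}")
```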
Abstract:
Ecological niche models make it possible to study the effect of the environment on species distributions by relating distribution data to environmental information. The objective of this study was to estimate the ecological niche and describe the variability in the spatial distribution of the anchoveta (Engraulis ringens) using statistical ecological niche models. Two analytical approaches were used: by stock (north, centre and south) in the Southeast Pacific (SEP) and by developmental stage (pre-recruits, recruits and adults) along the Peruvian coast. The ecological niche model used generalized additive models, georeferenced presence and absence records of anchoveta, and information on four environmental variables (sea surface temperature, sea surface salinity, surface chlorophyll-a concentration and oxycline depth) between 1985 and 2008. No differences were found among the ecological niches of the three anchoveta stocks, and the models that used anchoveta information from the entire SEP were the ones that modelled the niche correctly. Regarding the analysis by developmental stage, each stage showed different tolerances to the environmental variables considered in this work, with the niches of the less developed stages being contained within those of the more developed stages. Separate studies for each developmental stage are recommended, which would allow a better understanding of the ecological relationships found in the niche results. It is also recommended to run simulations with niche models that include more environmental variables, which could improve the spatial distribution maps of the anchoveta for both analytical approaches.
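As an illustration of the modelling approach described (generalized additive models for presence/absence), here is a minimal sketch using the pygam library and synthetic data in place of the four environmental covariates; the library choice and all values are assumptions, not the study's implementation:

```python
# Illustrative presence/absence GAM with synthetic environmental covariates
# (SST, SSS, chlorophyll-a, oxycline depth); not the authors' code or data.
import numpy as np
from pygam import LogisticGAM, s

rng = np.random.default_rng(42)
n = 2000
X = np.column_stack([
    rng.uniform(14, 26, n),    # sea surface temperature (degC)
    rng.uniform(33, 36, n),    # sea surface salinity
    rng.uniform(0.1, 10, n),   # surface chlorophyll-a (mg m-3)
    rng.uniform(5, 80, n),     # oxycline depth (m)
])
# Synthetic presence/absence with a unimodal response to temperature
logit = -0.5 * (X[:, 0] - 19) ** 2 + 0.3 * np.log(X[:, 2]) + 1.0
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))

# One smooth term per environmental variable
gam = LogisticGAM(s(0) + s(1) + s(2) + s(3)).fit(X, y)
gam.summary()
```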
Abstract:
Includes bibliographical references and index.
Abstract:
Thesis (Ph.D.)--University of Washington, 2016-06
Abstract:
Recently, methods for computing D-optimal designs for population pharmacokinetic studies have become available. However, there are few publications that have prospectively evaluated the benefits of D-optimality in population or single-subject settings. This study compared a population optimal design with an empirical design for estimating the base pharmacokinetic model for enoxaparin in a stratified randomized setting. The population pharmacokinetic D-optimal design for enoxaparin was estimated using the PFIM function (MATLAB version 6.0.0.88). The optimal design was based on a one-compartment model with lognormal between-subject variability and proportional residual variability, and consisted of a single design with three sampling windows (0-30 min, 1.5-5 hr and 11-12 hr post-dose) for all patients. The empirical design consisted of three sample time windows per patient from a total of nine windows that collectively represented the entire dose interval. Each patient was assigned to have one blood sample taken from each of three different windows. Windows for blood sampling times were also provided for the optimal design. Ninety-six patients who were currently receiving enoxaparin therapy were recruited into the study. Patients were randomly assigned to either the optimal or the empirical sampling design, stratified by body mass index. The exact times of blood samples and doses were recorded. Analysis was undertaken using NONMEM (version 5). The empirical design supported a one-compartment linear model with additive residual error, while the optimal design supported a two-compartment linear model with additive residual error, as did the model derived from the full data set. A posterior predictive check was performed in which the models arising from the empirical and optimal designs were used to predict into the full data set. This revealed that the model derived from the optimal design was superior to the empirical design model in terms of precision and was similar to the model developed from the full dataset. This study suggests that optimal design techniques may be useful, even when the optimized design was based on a model that was misspecified in terms of the structural and statistical models and when the implementation of the optimally designed study deviated from the nominal design.
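To make the model structure concrete, the following sketch simulates concentrations from a generic one-compartment model with first-order absorption, lognormal between-subject variability and proportional residual error, sampled once per patient in each of the three optimal windows; all parameter values are invented for illustration and are not the enoxaparin estimates:

```python
# Hypothetical simulation of a one-compartment PK model with first-order
# absorption, lognormal between-subject variability (BSV) and proportional
# residual error, sampled once per patient in each optimal window.
import numpy as np

rng = np.random.default_rng(7)
n_patients = 96
dose = 100.0                            # arbitrary dose units
cl_pop, v_pop, ka_pop = 1.0, 5.0, 1.5   # invented population parameters
omega = 0.3                             # BSV (SD of log-parameters)
sigma_prop = 0.15                       # proportional residual error

# One sample per patient drawn uniformly from each optimal window (hours)
windows = [(0.0, 0.5), (1.5, 5.0), (11.0, 12.0)]
times = np.column_stack([rng.uniform(lo, hi, n_patients) for lo, hi in windows])

# Individual parameters: lognormal BSV around the population values
cl = cl_pop * np.exp(rng.normal(0, omega, n_patients))
v = v_pop * np.exp(rng.normal(0, omega, n_patients))
ka = ka_pop * np.exp(rng.normal(0, omega, n_patients))
ke = cl / v

def conc(t, ka, ke, v):
    """One-compartment, first-order absorption concentration at time t."""
    return dose * ka / (v * (ka - ke)) * (np.exp(-ke * t) - np.exp(-ka * t))

c_true = conc(times, ka[:, None], ke[:, None], v[:, None])
c_obs = c_true * (1 + rng.normal(0, sigma_prop, c_true.shape))
print(c_obs[:3].round(2))   # observed concentrations for the first 3 patients
```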
Abstract:
Background: Hospital performance reports based on administrative data should distinguish differences in quality of care between hospitals from case-mix related variation and random error effects. A study was undertaken to determine which of 12 diagnosis-outcome indicators measured across all hospitals in one state had significant risk-adjusted systematic (or special cause) variation (SV) suggesting differences in quality of care. For those that did, we determined whether SV persists within hospital peer groups, whether indicator results correlate at the individual hospital level, and how many adverse outcomes would be avoided if all hospitals achieved indicator values equal to the best performing 20% of hospitals. Methods: All patients admitted during a 12 month period to 180 acute care hospitals in Queensland, Australia with heart failure (n=5745), acute myocardial infarction (AMI) (n=3427), or stroke (n=2955) were entered into the study. Outcomes comprised in-hospital deaths, long hospital stays, and 30-day readmissions. Regression models produced standardised, risk-adjusted, diagnosis-specific outcome event ratios for each hospital. Systematic and random variation in ratio distributions for each indicator were then apportioned using hierarchical statistical models. Results: Only five of 12 (42%) diagnosis-outcome indicators showed significant SV across all hospitals (long stays and same-diagnosis readmissions for heart failure; in-hospital deaths and same-diagnosis readmissions for AMI; and in-hospital deaths for stroke). Significant SV was only seen for two indicators within hospital peer groups (same-diagnosis readmissions for heart failure in tertiary hospitals and in-hospital mortality for AMI in community hospitals). Only two pairs of indicators showed significant correlation. If all hospitals emulated the best performers, at least 20% of AMI and stroke deaths, heart failure long stays, and heart failure and AMI readmissions could be avoided. Conclusions: Diagnosis-outcome indicators based on administrative data require validation as markers of significant risk-adjusted SV. Validated indicators allow quantification of realisable outcome benefits if all hospitals achieved best-performer levels. The overall level of quality of care within single institutions cannot be inferred from the results of one or a few indicators.
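A highly simplified stand-in for the hierarchical apportionment of variation described above: with simulated hospital counts (not the Queensland data), the between-hospital ("systematic") variance of log observed/expected ratios can be estimated by subtracting the approximate Poisson sampling variance from the total variance:

```python
# Simplified stand-in for the hierarchical variance decomposition:
# estimate between-hospital ("systematic") variance of log O/E ratios
# after subtracting the within-hospital sampling variance.
import numpy as np

rng = np.random.default_rng(3)
n_hosp = 180
expected = rng.uniform(5, 60, n_hosp)          # risk-adjusted expected events
true_log_ratio = rng.normal(0, 0.2, n_hosp)    # hypothetical systematic variation
observed = rng.poisson(expected * np.exp(true_log_ratio))

obs_adj = np.maximum(observed, 0.5)            # avoid log(0)
log_ratio = np.log(obs_adj / expected)
sampling_var = 1.0 / obs_adj                   # approx. variance of log(Poisson count)

# Method-of-moments: total variance minus mean sampling variance
systematic_var = max(log_ratio.var(ddof=1) - sampling_var.mean(), 0.0)
print(f"estimated systematic SD of log O/E ratios: {systematic_var ** 0.5:.3f}")
```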
Abstract:
Many studies on birds focus on the collection of data through an experimental design suitable for investigation in a classical analysis of variance (ANOVA) framework. Although many findings are confirmed by one or more experts, expert information is rarely used in conjunction with the survey data to enhance the explanatory and predictive power of the model. We explore this neglected aspect of ecological modelling through a study on Australian woodland birds, focusing on the potential impact of different intensities of commercial cattle grazing on bird density in woodland habitat. We examine a number of Bayesian hierarchical random effects models, which cater for overdispersion and a high frequency of zeros in the data, using WinBUGS, and explore the variation between and within different grazing regimes and species. The impact and value of expert information is investigated through the inclusion of priors that reflect the experience of 20 experts in the field of bird responses to disturbance. Results indicate that expert information moderates the survey data, especially in situations where there are little or no data. When experts agreed, credible intervals for predictions were tightened considerably. When experts failed to agree, results were similar to those evaluated in the absence of expert information. Overall, we found that without expert opinion our knowledge was quite weak. The fact that the survey data are, in general, quite consistent with expert opinion shows that we do know something about birds and grazing, and that we could learn a lot faster if we used this approach more in ecology, where data are scarce. Copyright (c) 2005 John Wiley & Sons, Ltd.
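The key finding that expert priors tighten credible intervals when data are scarce can be illustrated with a toy conjugate normal-normal update; the numbers are invented and this is not the WinBUGS model used in the study:

```python
# Toy conjugate (normal-normal) update showing how an elicited expert prior
# narrows the credible interval for a grazing effect when survey data are
# scarce; all numbers are invented for illustration.
import numpy as np
from scipy import stats

def posterior(prior_mean, prior_sd, data, obs_sd):
    """Posterior for a normal mean with known observation SD."""
    n = len(data)
    prior_prec, data_prec = 1 / prior_sd**2, n / obs_sd**2
    post_var = 1 / (prior_prec + data_prec)
    post_mean = post_var * (prior_prec * prior_mean + data_prec * np.mean(data))
    return post_mean, post_var**0.5

data = np.array([-0.4, -0.1, -0.6])              # only three survey estimates
vague = posterior(0.0, 10.0, data, obs_sd=0.5)   # near-flat prior
expert = posterior(-0.5, 0.2, data, obs_sd=0.5)  # experts agree grazing reduces density

for label, (m, s) in [("vague prior", vague), ("expert prior", expert)]:
    lo, hi = stats.norm.interval(0.95, loc=m, scale=s)
    print(f"{label}: mean={m:.2f}, 95% CrI=({lo:.2f}, {hi:.2f})")
```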
Abstract:
Objective: To examine the short-term health effects of air pollution on daily mortality in four Australian cities (Brisbane, Melbourne, Perth and Sydney), where more than 50% of Australians reside. Methods: The study used a protocol similar to the APHEA2 (Air Pollution and Health: A European Approach) study and derived single-city and pooled estimates. Results: The results derived from the different approaches for the 1996-99 period were consistent across the different statistical models used. There were significant effects on total mortality (RR=1.0284 per 1 unit increase in nephelometry [10⁻⁴ m⁻¹], RR=1.0011 per 1 ppb increase in NO2) and on respiratory mortality (RR=1.0022 per 1 ppb increase in O3). No significant differences between cities were found, but the NO2 and particle effects may refer to the same impacts. Meta-analyses carried out for three cities yielded estimates for the increase in the daily total number of deaths of 0.2% (-0.8% to 1.2%) for a 10 µg/m³ increase in PM10 concentration, and 0.9% (-0.7% to 2.5%) for a 10 µg/m³ increase in PM2.5 concentration. Conclusions: Air pollutants in Australian cities have significant effects on mortality.
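A heavily simplified, single-city sketch of the kind of Poisson time-series regression used in APHEA2-style analyses, run on synthetic daily data; a real analysis would also include smooth terms for season, influenza epidemics and other confounders:

```python
# Illustrative single-city Poisson regression of daily deaths on a pollutant
# (synthetic data; a heavily simplified version of an APHEA2-style model).
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(11)
n_days = 1461                                   # ~4 years of daily data
df = pd.DataFrame({
    "temp": 18 + 7 * np.sin(2 * np.pi * np.arange(n_days) / 365)
            + rng.normal(0, 2, n_days),
    "no2": rng.gamma(4, 3, n_days),             # hypothetical NO2 in ppb
})
log_mu = np.log(30) + 0.001 * df["no2"] - 0.005 * (df["temp"] - 18)
df["deaths"] = rng.poisson(np.exp(log_mu))

fit = smf.glm("deaths ~ no2 + temp", data=df, family=sm.families.Poisson()).fit()
rr_per_ppb = np.exp(fit.params["no2"])          # relative risk per 1 ppb NO2
print(f"RR per 1 ppb NO2: {rr_per_ppb:.4f}")
```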
Abstract:
Introduction. Potentially modifiable physiological variables may influence stroke prognosis, but their independence from unmodifiable factors remains unclear. Methods. Admission physiological measures (blood pressure, heart rate, temperature and blood glucose) and other unmodifiable factors were recorded from patients presenting within 48 hours of stroke. These variables were compared with the outcomes of death and death or dependency at 30 days in multivariate statistical models. Results. In the 186 patients included in the study, age, atrial fibrillation and the National Institutes of Health Stroke Score were identified as unmodifiable factors independently associated with death and death or dependency. After adjusting for these factors, none of the physiological variables were independently associated with death, while only diastolic blood pressure (DBP) ≥ 90 mmHg was associated with death or dependency at 30 days (p = 0.02). Conclusions. Except for elevated DBP, we found no independent associations between admission physiology and outcome at 30 days in an unselected stroke cohort. Future studies should look for associations in subgroups, or by analysing serial changes in physiology during the early post-stroke period.
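The multivariate modelling step can be sketched as a logistic regression on synthetic data with invented effect sizes, adjusting for the unmodifiable factors before testing a physiological variable:

```python
# Sketch (synthetic data, invented effect sizes) of the kind of multivariate
# model described: logistic regression for 30-day death or dependency,
# adjusting for unmodifiable factors before testing a physiological variable.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(5)
n = 186
df = pd.DataFrame({
    "age": rng.normal(72, 10, n),
    "af": rng.binomial(1, 0.25, n),              # atrial fibrillation
    "nihss": rng.gamma(2, 4, n),                 # stroke severity score
    "dbp_ge_90": rng.binomial(1, 0.3, n),        # admission DBP >= 90 mmHg
})
logit = (-9 + 0.06 * df["age"] + 0.5 * df["af"]
         + 0.12 * df["nihss"] + 0.7 * df["dbp_ge_90"])
df["death_or_dependency"] = rng.binomial(1, 1 / (1 + np.exp(-logit)))

fit = smf.logit("death_or_dependency ~ age + af + nihss + dbp_ge_90",
                data=df).fit(disp=0)
print(np.exp(fit.params).round(2))               # adjusted odds ratios
```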
Abstract:
Neural networks can be regarded as statistical models, and can be analysed in a Bayesian framework. Generalisation is measured by the performance on independent test data drawn from the same distribution as the training data. Such performance can be quantified by the posterior average of the information divergence between the true and the model distributions. Averaging over the Bayesian posterior guarantees internal coherence; using information divergence guarantees invariance with respect to representation. The theory generalises the least mean squares theory for linear Gaussian models to general problems of statistical estimation. The main results are: (1) the ideal optimal estimate is always given by the average over the posterior; (2) the optimal estimate within a computational model is given by the projection of the ideal estimate onto the model. This incidentally shows that some currently popular methods dealing with hyperpriors are in general unnecessary and misleading. The extension of information divergence to positive normalisable measures reveals a remarkable relation between the dlt dual affine geometry of statistical manifolds and the geometry of the dual pair of Banach spaces Ld and Ldd. It therefore offers conceptual simplification to information geometry. The general conclusion on the issue of evaluating neural network learning rules and other statistical inference methods is that such evaluations are only meaningful under three assumptions: the prior P(p), describing the environment of all the problems; the divergence Dd, specifying the requirement of the task; and the model Q, specifying available computing resources.
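Result (1), that the ideal estimate is the posterior average, can be checked numerically in a toy discrete setting (a Dirichlet "posterior" assumed purely for illustration): the posterior-averaged distribution attains the smallest posterior-expected KL divergence among candidate estimates:

```python
# Numerical check that the posterior-averaged distribution minimises the
# posterior-expected KL divergence D(p || q) over candidate estimates q.
import numpy as np

rng = np.random.default_rng(0)

# Pretend posterior: 500 sampled categorical distributions p over 4 outcomes
posterior_samples = rng.dirichlet([2.0, 1.0, 1.0, 0.5], size=500)

def kl(p, q):
    """KL divergence D(p || q) along the last axis."""
    return np.sum(p * np.log(p / q), axis=-1)

ideal = posterior_samples.mean(axis=0)            # posterior average

# Compare against random candidate estimates
candidates = rng.dirichlet([1.0] * 4, size=1000)
risk_ideal = kl(posterior_samples, ideal).mean()
risk_candidates = np.array([kl(posterior_samples, q).mean() for q in candidates])

print(f"risk of posterior average: {risk_ideal:.4f}")
print(f"best random candidate:     {risk_candidates.min():.4f}")  # never lower
```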
Abstract:
Neural networks are statistical models and learning rules are estimators. In this paper a theory for measuring generalisation is developed by combining Bayesian decision theory with information geometry. The performance of an estimator is measured by the information divergence between the true distribution and the estimate, averaged over the Bayesian posterior. This unifies the majority of error measures currently in use. The optimal estimators also reveal some intricate interrelationships among information geometry, Banach spaces and sufficient statistics.
Abstract:
The problem of evaluating different learning rules and other statistical estimators is analysed. A new general theory of statistical inference is developed by combining Bayesian decision theory with information geometry. It is coherent and invariant. For each sample a unique ideal estimate exists and is given by an average over the posterior. An optimal estimate within a model is given by a projection of the ideal estimate. The ideal estimate is a sufficient statistic of the posterior, so practical learning rules are functions of the ideal estimator. If the sole purpose of learning is to extract information from the data, the learning rule must also approximate the ideal estimator. This framework is applicable to both Bayesian and non-Bayesian methods, with arbitrary statistical models, and to supervised, unsupervised and reinforcement learning schemes.
Abstract:
Neural networks are usually curved statistical models. They do not have finite dimensional sufficient statistics, so on-line learning on the model itself inevitably loses information. In this paper we propose a new scheme for training curved models, inspired by the ideas of ancillary statistics and adaptive critics. At each point estimate, an auxiliary flat model (exponential family) is built to locally accommodate both the usual statistic (tangent to the model) and an ancillary statistic (normal to the model). The auxiliary model plays a role in determining credit assignment analogous to that played by an adaptive critic in solving temporal problems. The method is illustrated with the Cauchy model and the algorithm is proved to be asymptotically efficient.
Abstract:
A visualization plot of a molecular data set is a useful tool for gaining insight into a set of molecules. In chemoinformatics, most visualization plots are of molecular descriptors, and the statistical model most often used to produce a visualization is principal component analysis (PCA). This paper takes PCA, together with four other statistical models (NeuroScale, GTM, LTM, and LTM-LIN), and evaluates their ability to produce clustering in visualizations not of molecular descriptors but of molecular fingerprints. Two different tasks are addressed: understanding structural information (particularly combinatorial libraries) and relating structure to activity. The quality of the visualizations is compared both subjectively (by visual inspection) and objectively (with global distance comparisons and local k-nearest-neighbor predictors). On the data sets used to evaluate clustering by structure, LTM is found to perform significantly better than the other models. In particular, the clusters in LTM visualization space are consistent with the relationships between the core scaffolds that define the combinatorial sublibraries. On the data sets used to evaluate clustering by activity, LTM again gives the best performance but by a smaller margin. The results of this paper demonstrate the value of using both a nonlinear projection map and a Bernoulli noise model for modeling binary data.
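For reference, the PCA baseline mentioned above can be reproduced on stand-in binary fingerprints with scikit-learn; NeuroScale, GTM and LTM have no scikit-learn implementation and are not shown:

```python
# Minimal sketch: 2-D PCA projection of random bit vectors standing in for
# molecular fingerprints (illustration only, not the paper's data sets).
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
# 300 "molecules" from two hypothetical scaffolds with different bit biases
fp_a = rng.binomial(1, 0.15, size=(150, 1024))
fp_b = rng.binomial(1, 0.25, size=(150, 1024))
fingerprints = np.vstack([fp_a, fp_b])

coords = PCA(n_components=2).fit_transform(fingerprints)
print(coords.shape)          # (300, 2): x/y positions for a scatter plot
```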
Abstract:
Enterprise Risk Management (ERM) and Knowledge Management (KM) both encompass top-down and bottom-up approaches to developing and embedding risk knowledge concepts and processes in strategy, policies, risk appetite definition, the decision-making process and business processes. The capacity to transfer risk knowledge affects all stakeholders, and understanding of the risk knowledge about the enterprise's value is a key requirement in order to identify protection strategies for business sustainability. There are various factors that affect this capacity for transferring and understanding. Previous work has established that there is a difference between the influence of KM variables on Risk Control and on the perceived value of ERM. Communication among groups appears as a significant variable in improving Risk Control but only as a weak factor in improving the perceived value of ERM. However, the ERM mandate requires for its implementation a clear understanding of risk management (RM) policies, actions and results, and the use of the integral view of RM as a governance and compliance program to support the value-driven management of the organization. Furthermore, ERM implementation demands better capabilities for unification of the criteria of risk analysis, and alignment of policies and protection guidelines across the organization. These capabilities can be affected by risk knowledge sharing between the RM group and the Board of Directors and other executives in the organization. This research presents an exploratory analysis of risk knowledge transfer variables used in risk management practice. A survey of risk management executives from 65 firms in various industries was undertaken and 108 answers were analyzed. Potential relationships among the variables are investigated using descriptive statistics and multivariate statistical models. The level of understanding of risk management policies and reports by the board is related to the quality of the flow of communication in the firm and the perceived level of integration of the risk policy in the business processes.
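One of the descriptive checks mentioned, the association between the board's understanding of RM reports and the quality of communication flow, could look like the following sketch on fabricated Likert-style responses; it is not the survey data or the authors' analysis:

```python
# Exploratory sketch on fabricated Likert-style responses (1-5 scales):
# rank correlation between communication quality and board understanding.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n = 108
communication_quality = rng.integers(1, 6, n)              # 1-5 scale
noise = rng.integers(-1, 2, n)
board_understanding = np.clip(communication_quality + noise, 1, 5)

rho, p = stats.spearmanr(communication_quality, board_understanding)
print(f"Spearman rho={rho:.2f}, p={p:.3g}")
```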