977 results for Instrumental-variable Methods
Abstract:
Recent discussion regarding whether the noise that limits 2AFC discrimination performance is fixed or variable has focused either on describing experimental methods that presumably dissociate the effects of response mean and variance or on reanalyzing a published data set with the aim of determining how to solve the question through goodness-of-fit statistics. This paper illustrates that the question cannot be solved by fitting models to data and assessing goodness-of-fit because data on detection and discrimination performance can be indistinguishably fitted by models that assume either type of noise when each is coupled with a convenient form for the transducer function. Thus, success or failure at fitting a transducer model merely illustrates the capability (or lack thereof) of some particular combination of transducer function and variance function to account for the data, but it cannot disclose the nature of the noise. We also comment on some of the issues that have been raised in recent exchange on the topic, namely, the existence of additional constraints for the models, the presence of asymmetric asymptotes, the likelihood of history-dependent noise, and the potential of certain experimental methods to dissociate the effects of response mean and variance.
Abstract:
In this work, we introduce the periodic nonlinear Fourier transform (PNFT) method as an alternative and efficacious tool for compensating nonlinear transmission effects in optical fiber links. In Part I, we introduce the algorithmic platform of the technique, describing in detail the direct and inverse PNFT operations, also known as the inverse scattering transform for the periodic (in the time variable) nonlinear Schrödinger equation (NLSE). We pay special attention to explaining the potential advantages of PNFT-based processing over the previously studied methods based on the nonlinear Fourier transform (NFT). Further, we address the numerical computation of the PNFT: we compare the performance of four known numerical methods applicable to the calculation of nonlinear spectral data (the direct PNFT), taking the main spectrum (used in Part II for modulation and transmission) associated with some simple example waveforms as the quality indicator for each method. We show that the Ablowitz-Ladik discretization approach for the direct PNFT provides the best performance in terms of accuracy and computational time.
Abstract:
Constant technological advances have caused a data explosion in recent years. Accordingly, modern statistical and machine learning methods must be adapted to deal with complex and heterogeneous data types. This is particularly true for the analysis of biological data. For example, DNA sequence data can be viewed as categorical variables, with each nucleotide taking one of four categories. Gene expression data, depending on the quantification technology, can be continuous numbers or counts. With the advancement of high-throughput technology, the abundance of such data has become unprecedentedly rich. Efficient statistical approaches are therefore crucial in this big data era.
Previous statistical methods for big data often aim to find low-dimensional structures in the observed data. For example, a factor analysis model assumes a latent multivariate vector with a Gaussian distribution. With this assumption, a factor model produces a low-rank estimate of the covariance of the observed variables. Another example is the latent Dirichlet allocation model for documents, which assumes that the mixture proportions of topics are represented by a Dirichlet-distributed variable. This dissertation proposes several novel extensions to these statistical methods, developed to address challenges in big data. The novel methods are applied in multiple real-world applications, including the construction of condition-specific gene co-expression networks, the estimation of shared topics among newsgroups, the analysis of promoter sequences, the analysis of political-economic risk data, and the estimation of population structure from genotype data.
Abstract:
Continuous variables are one of the major data types collected by survey organizations. They can be incomplete, so that data collectors need to fill in the missing values, or they can contain sensitive information that needs protection from re-identification. One approach to protecting continuous microdata is to sum them over cells defined by different features. In this thesis, I present novel methods of multiple imputation (MI) that can be applied to impute missing values and to synthesize confidential values for continuous and magnitude data.
The first method is for limiting the disclosure risk of the continuous microdata whose marginal sums are fixed. The motivation for developing such a method comes from the magnitude tables of non-negative integer values in economic surveys. I present approaches based on a mixture of Poisson distributions to describe the multivariate distribution so that the marginals of the synthetic data are guaranteed to sum to the original totals. At the same time, I present methods for assessing disclosure risks in releasing such synthetic magnitude microdata. The illustration on a survey of manufacturing establishments shows that the disclosure risks are low while the information loss is acceptable.
The second method is for releasing synthetic continuous microdata by a nonstandard MI method. Traditionally, MI fits a model to the confidential values and then generates multiple synthetic datasets from this model. Its disclosure risk tends to be high, especially when the original data contain extreme values. I present a nonstandard MI approach conditioned on protective intervals. Its basic idea is to estimate the model parameters from these intervals rather than from the confidential values. The encouraging results of simple simulation studies suggest the potential of this new approach for limiting the posterior disclosure risk.
The third method is for imputing missing values in continuous and categorical variables. It extends a hierarchically coupled mixture model with local dependence. However, the new method separates the variables into non-focused (e.g., almost fully observed) and focused (e.g., heavily missing) ones. The sub-model structure of the focused variables is more complex than that of the non-focused ones. At the same time, their cluster indicators are linked together by tensor factorization, and the focused continuous variables depend locally on non-focused values. The model properties suggest that moving strongly associated non-focused variables to the focused side can help to improve estimation accuracy, which is examined in several simulation studies. The method is applied to data from the American Community Survey.
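The fixed-total property described for the first method can be illustrated with a well-known fact: independent Poisson counts, conditioned on their sum, follow a multinomial distribution with probabilities proportional to the Poisson rates. The sketch below is only an illustration under that simplification; it uses a single Poisson component rather than the mixture the thesis describes, and the function name `synthesize_fixed_total` and the example rates are hypothetical.

```python
import random


def synthesize_fixed_total(rates, total, seed=0):
    """Draw synthetic non-negative integer cell values that sum to `total`.

    Assumption: each cell is Poisson with the given rate. Conditional on the
    sum, the cells are multinomial with probabilities rate_i / sum(rates),
    so each of the `total` units is assigned to a cell with that probability.
    """
    rng = random.Random(seed)
    cells = [0] * len(rates)
    for idx in rng.choices(range(len(rates)), weights=rates, k=total):
        cells[idx] += 1
    return cells


if __name__ == "__main__":
    # Hypothetical rates for three establishments and a published total of 90.
    synthetic = synthesize_fixed_total([5.0, 1.0, 3.0], total=90)
    print(synthetic, sum(synthetic))
```

Whatever values are drawn, the synthetic cells add up to the original total by construction, which is the guarantee the abstract refers to.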
Abstract:
The aim of this thesis is to identify the relationship between subjective well-being and economic insecurity for public and private sector workers in Ireland using the European Social Survey 2010-2012. Life satisfaction and job satisfaction are the indicators used to measure subjective well-being. Economic insecurity is approximated by regional unemployment rates and self-perceived job insecurity. Potential sample selection bias and endogeneity bias are accounted for. It is traditionally believed that public sector workers are relatively more protected against insecurity due to the very institution of public sector employment. The institution of public sector employment is made up of stricter dismissal practices (Luechinger et al., 2010a) and less volatile employment (Freeman, 1987), so that workers are less likely to be affected by business cycle downturns (Clark and Postel-Vinay, 2009). The literature finds that economic insecurity depresses the well-being of public sector workers to a lesser degree than that of private sector workers (Luechinger et al., 2010a; Artz and Kaya, 2014). These studies provide the rationale for testing for similar relationships in an Irish context in this thesis. Sample selection bias arises when selection into a particular category is not random (Heckman, 1979). An example of this is non-random selection into public sector employment based on personal characteristics (Heckman, 1979; Luechinger et al., 2010b). If selection into public sector employment is not corrected for, this can lead to biased and inconsistent estimators (Gujarati, 2009). Selection bias in public sector employment is corrected for by using a standard Two-Step Heckman Probit OLS estimation method. Following Luechinger et al. (2010b), the propensity for individuals to select into public sector employment is estimated by a binomial probit model with the inclusion of the additional regressor Irish citizenship.
Job satisfaction is then estimated by Ordinary Least Squares (OLS) with the inclusion of a sample correction term similar to that used in Clark (1997). Endogeneity arises when an independent variable included in the model is determined within the context of the model (Chenhall and Moers, 2007). The econometric definition states that an endogenous independent variable is one that is correlated with the error term (Wooldridge, 2010). Endogeneity is expected to be present due to a simultaneous relationship between job insecurity and job satisfaction whereby both variables are jointly determined (Theodossiou and Vasileiou, 2007). Simultaneity, as a source of endogeneity, is corrected for using Instrumental Variables (IV) techniques. Limited Information Methods and Full Information Methods of estimation of simultaneous equations models are assessed and compared. The general results show that job insecurity depresses the subjective well-being of all workers in both the public and private sectors in Ireland. The magnitude of this effect differs between the sectors. The subjective well-being of private sector workers is more adversely affected by job insecurity than the subjective well-being of public sector workers. This is observed in basic ordered probit estimations of both a life satisfaction equation and a job satisfaction equation. The marginal effects from the ordered probit estimation of a basic job satisfaction equation show that as job insecurity increases the probability of reporting a 9 on a 10-point job satisfaction scale significantly decreases by 3.4% for the whole sample of workers, 2.8% for public sector workers and 4.0% for private sector workers.
Artz and Kaya (2014) explain that, as a result of many austerity policies implemented to reduce government expenditure during the economic recession, workers in the public sector may for the first time face worsening perceptions of job security, which can have significant implications for their well-being. This can be observed in the marginal effects, where job insecurity negatively impacts the well-being of public sector workers in Ireland. However, in accordance with Luechinger et al. (2010a), the results show that private sector workers are more adversely impacted by economic insecurity than public sector workers. This suggests that in a time of high economic volatility, the institution of public sector employment held and was able to protect workers against some of the well-being consequences of rising insecurity. In estimating the relationship between subjective well-being and economic insecurity, advanced econometric issues arise. The results show that when selection bias is corrected for, any statistically significant relationship between job insecurity and job satisfaction disappears for public sector workers. Additionally, in order to correct for endogeneity bias, the simultaneous equations model for job satisfaction and job insecurity is estimated by Limited Information and Full Information Methods. The results from two different estimators classified as Limited Information Methods support the general findings of this research. Moreover, the magnitude of the endogeneity-corrected estimates is twice as large as that of the estimates not corrected for endogeneity bias, which is similarly found in Geishecker (2010, 2012). As part of the analysis of the effect of economic insecurity on subjective well-being, the effects of other socioeconomic variables and work-related variables are examined for public and private sector workers in Ireland.
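The endogeneity correction described in this abstract can be illustrated with the simplest instrumental-variables estimator, for one endogenous regressor and one instrument. The sketch below is not the thesis's model: the data are simulated, the variable roles (insecurity, satisfaction, instrument) are illustrative labels only, and the coefficients 0.8 and -0.5 are arbitrary assumptions. It shows how a shared unobserved shock biases OLS, while the IV ratio cov(z, y) / cov(z, x) recovers the assumed effect.

```python
import random
import statistics


def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / (len(a) - 1)


def ols_slope(x, y):
    """OLS slope of y on x (with intercept)."""
    return cov(x, y) / cov(x, x)


def iv_slope(z, x, y):
    """Simple IV (Wald) estimator with a single instrument z for x."""
    return cov(z, y) / cov(z, x)


def simulate(n=20000, beta=-0.5, seed=0):
    """Simulated data in which x is correlated with the error term of y."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)                 # instrument: affects x only
        shock = rng.gauss(0, 1)              # unobserved, enters both x and y
        xi = 0.8 * zi + shock + rng.gauss(0, 1)      # endogenous regressor
        yi = beta * xi + shock + rng.gauss(0, 1)     # outcome
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return z, x, y


if __name__ == "__main__":
    z, x, y = simulate()
    print("OLS slope:", round(ols_slope(x, y), 3))    # biased toward zero here
    print("IV slope: ", round(iv_slope(z, x, y), 3))  # close to the assumed -0.5
```

In this particular simulation the OLS estimate is smaller in magnitude than the IV estimate, which mirrors the direction of the difference reported in the abstract, although that direction depends entirely on the assumed sign of the shared shock.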
Abstract:
In certain European countries and the United States of America, canines have been successfully used in human scent identification. There is, however, limited scientific knowledge on the composition of human scent and the detection mechanism that produces an alert from canines. This lack of information has resulted in successful legal challenges to human scent evidence in courts of law. The main objective of this research was to use science to validate the current practices of using human scent evidence in criminal cases. The goals of this study were to use Headspace Solid Phase Micro Extraction Gas Chromatography Mass Spectrometry (HS-SPME-GC/MS) to determine the optimum collection and storage conditions for human scent samples, to investigate whether the amount of DNA deposited upon contact with an object affects the alerts produced by human scent identification canines, and to create a prototype pseudo human scent that could be used for training purposes. Hand odor samples that were collected on different sorbent materials and exposed to various environmental conditions showed that human scent samples should be stored without prolonged exposure to UVA/UVB light to allow minimal changes to the overall scent profile. Various methods of collecting human scent from objects were also investigated, and it was determined that passive collection methods yield ten times more VOCs by mass than active collection methods. Through the use of polymerase chain reaction (PCR), no correlation was found between the amount of DNA deposited upon contact with an object and the alerts produced by human scent identification canines. Preliminary studies conducted to create a prototype pseudo human scent showed that it is possible to produce fractions of a human scent sample that can be presented to the canines to determine whether specific fractions or the entire sample is needed to produce alerts by the human scent identification canines.
Abstract:
Thesis (Master's)--University of Washington, 2016-08
Abstract:
My thesis examines health policies designed to encourage the supply of health services. Access to health services is a major problem that undermines the health systems of most industrialized countries. In Quebec, the median waiting time between a referral from a general practitioner and an appointment with a specialist was 7.3 weeks in 2012, compared with 2.9 weeks in 1993, despite the increase in the number of physicians over the same period. For policymakers observing rising waiting times for health care, it is important to understand the structure of physicians' labour supply and how it affects the supply of health services. In this context, I consider two main policies. First, I estimate how physicians respond to monetary incentives and use the estimated parameters to examine how compensation policies can be used to determine the short-run supply of health services. Second, I examine how physicians' productivity is affected by their experience, through the mechanism of learning-by-doing, and use the estimated parameters to find the number of inexperienced physicians that must be recruited to replace an experienced physician who retires, in order to keep the supply of health services constant. My thesis develops and applies economic and statistical methods to measure physicians' response to monetary incentives and to estimate their productivity profile (by measuring the variation in physicians' productivity over the course of their careers), using panel data on Quebec physicians drawn from both surveys and administrative records. The data contain information on each physician's labour supply, the different types of services offered, and their prices.
These data cover a period during which the Quebec government changed the relative prices of health services. I used a model-based approach to develop and estimate a structural labour supply model that allows physicians to multitask. In my model, physicians choose the number of hours worked and the allocation of those hours across the different services offered, while the prices of services are imposed on them by the government. The model generates an income equation that depends on hours worked and on a price index representing the marginal return to hours worked when these are allocated optimally across the different services. The price index depends on the prices of the services offered and on the parameters of the service production technology, which determine how physicians respond to changes in relative prices. I applied the model to panel data on physician remuneration in Quebec merged with data on the time use of the same physicians. I use the model to examine two dimensions of the supply of health services. First, I analyse the use of monetary incentives to induce physicians to change their production of the different services. Although previous studies have often sought to compare physician behaviour across compensation systems, there is relatively little information on how physicians respond to changes in the prices of health services. Current debates in Canadian health policy circles have focused on the importance of income effects in determining physicians' response to increases in the prices of health services.
My work contributes to this debate by identifying and estimating the substitution and income effects resulting from changes in the relative prices of health services. Second, I analyse how experience affects physicians' productivity. This has important implications for recruiting physicians to meet the growing demand from an ageing population, particularly when the most experienced (most productive) physicians retire. In the first essay, I estimated the income function conditional on hours worked, using the instrumental-variables method to control for possible endogeneity of hours worked. As instruments I used indicator variables for physicians' ages, the marginal tax rate, the stock market return, and the square and cube of that return. I show that this yields a lower bound on the own-price elasticity, making it possible to test whether physicians respond to monetary incentives. The results show that the lower bounds of the price elasticities of service supply are significantly positive, suggesting that physicians respond to incentives. A change in relative prices leads physicians to allocate more working hours to the service whose price has risen. In the second essay, I estimate the full model, unconditional on hours worked, by analysing variations in physicians' hours worked, the volume of services provided, and physicians' income. To do so, I used the simulated method of moments estimator. The results show that the own-price substitution elasticities are large and significantly positive, reflecting a tendency for physicians to increase the volume of the service whose price has risen the most.
The cross-price substitution elasticities are also large but negative. In addition, there is an income effect associated with the fee increases. I used the estimated parameters of the structural model to simulate a general increase of 32% in service prices. The results show that physicians would reduce their total hours worked (average elasticity of -0.02) as well as their clinical hours worked (average elasticity of -0.07). They would also reduce the volume of services provided (average elasticity of -0.05). Third, I exploited the natural link between the income of a fee-for-service physician and his or her productivity to establish physicians' productivity profile. To do so, I modified the model specification to account for the relationship between a physician's productivity and experience. I estimate the income equation using unbalanced panel data, correcting for the non-random nature of missing observations with a selection model. The results suggest that the productivity profile is an increasing and concave function of experience. Moreover, this profile is robust to using effective experience (the quantity of services produced) as a control variable and to dropping parametric assumptions. In addition, when a physician's experience increases by one year, the physician's production of services increases by CAN$1,003. I used the estimated parameters of the model to compute the replacement ratio: the number of inexperienced physicians needed to replace one experienced physician. This replacement ratio is 1.2.
Abstract:
One of the important problems in machine learning is determining the complexity of the model to be learned. Too much complexity leads to overfitting, which amounts to finding structures that do not really exist in the data, whereas too little complexity leads to underfitting, meaning that the model is not expressive enough to capture all the structures present in the data. For some probabilistic models, model complexity takes the form of one or more hidden variables whose role is to explain the generative process of the data. Various approaches exist for identifying the appropriate number of hidden variables in a model. This thesis focuses on Bayesian nonparametric methods for determining the number of hidden variables to use as well as their dimensionality. The popularization of Bayesian nonparametric statistics within the machine learning community is fairly recent. Their main appeal is that they offer highly flexible models whose complexity scales with the amount of available data. In recent years, research on Bayesian nonparametric learning methods has focused on three main aspects: the construction of new models, the development of inference algorithms, and applications. This thesis presents our contributions to these three research topics in the context of learning hidden-variable models. First, we introduce the Pitman-Yor process mixture of Gaussians, a model for learning infinite mixtures of Gaussians. We also present an inference algorithm for discovering the hidden components of the model, which we evaluate on two concrete robotics applications.
Our results show that the proposed approach outperforms classical learning approaches in both performance and flexibility. Second, we propose the extended cascading Indian buffet process, a model that serves as a prior probability distribution over the space of directed acyclic graphs. In the context of Bayesian networks, this prior makes it possible to identify both the presence of hidden variables and the network structure among them. A Markov chain Monte Carlo inference algorithm is used for evaluation on structure identification and density estimation problems. Finally, we propose the Indian chefs process, a model more general than the extended cascading Indian buffet process, for learning graphs and orders. The advantage of the new model is that it allows connections between observable variables and takes the order of the variables into account. We present a reversible-jump Markov chain Monte Carlo inference algorithm for jointly learning graphs and orders. The evaluation is carried out on density estimation and independence testing problems. This model is the first Bayesian nonparametric model capable of learning Bayesian networks with a completely arbitrary structure.
Abstract:
Abstract not available
Abstract:
Thesis (Ph.D.)--University of Washington, 2016-08
Abstract:
Habitat fragmentation and the consequent loss of connectivity between populations can reduce the interchange of individuals and gene flow, increasing the chances of inbreeding and the risk of local extinction. Landscape genetics is providing more and better tools to identify genetic barriers. To our knowledge, no comparison of methods in terms of consistency has been made with observed data and species with low dispersal ability. The aim of this study is to examine the consistency of the results of five methods for detecting barriers to gene flow in a population of the Mediterranean pine vole (Microtus duodecimcostatus): F-statistics estimation, non-Bayesian clustering, Bayesian clustering, boundary detection and simple/partial Mantel tests. All methods were consistent in finding that the stream is not a genetic barrier. However, the methods gave inconsistent results regarding the role of the highway as a genetic barrier. Fst, the Bayesian clustering assignment test and the partial Mantel test identified the highway as a filter to the interchange of individuals. The Mantel tests were the most sensitive method. The boundary detection method (Monmonier’s algorithm) and the non-Bayesian approaches did not detect any genetic differentiation of the pine vole due to the highway. Based on our findings, we recommend that genetic barrier detection in populations with low dispersal ability be analyzed with multiple methods: Mantel tests and Bayesian clustering approaches, because they show more sensitivity in those scenarios, and boundary detection methods, because they aim to detect drastic changes in a variable of interest between the closest individuals. Although simulation studies highlight the weaknesses and strengths of each method and the factors that promote some results, tests with real data are needed to increase the effectiveness of genetic barrier detection.
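As a rough illustration of one of the methods compared in this abstract, the sketch below implements a simple (not partial) Mantel test: the Pearson correlation between the off-diagonal entries of two distance matrices, with a one-sided p-value obtained by permuting the individuals of one matrix. The data are simulated, and the helper names (`mantel_test`, `demo_matrices`) and all numeric settings are assumptions made for illustration; nothing here reproduces the study's data or software.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    num = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    den = math.sqrt(sum((u - ma) ** 2 for u in a) * sum((v - mb) ** 2 for v in b))
    return num / den


def upper_triangle(matrix, order):
    """Off-diagonal entries above the diagonal, with individuals relabelled by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel_test(genetic, geographic, permutations=999, seed=0):
    """Return (observed r, one-sided permutation p-value) for two distance matrices."""
    rng = random.Random(seed)
    order = list(range(len(genetic)))
    geo_vec = upper_triangle(geographic, order)
    r_obs = pearson(upper_triangle(genetic, order), geo_vec)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        rng.shuffle(order)  # permute rows and columns of the genetic matrix together
        if pearson(upper_triangle(genetic, order), geo_vec) >= r_obs:
            hits += 1
    return r_obs, hits / (permutations + 1)


def demo_matrices(n=12, seed=1):
    """Hypothetical individuals on a line; genetic distance grows with geographic distance."""
    rng = random.Random(seed)
    pos = [rng.uniform(0, 100) for _ in range(n)]
    geo = [[abs(pos[i] - pos[j]) for j in range(n)] for i in range(n)]
    gen = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            gen[i][j] = gen[j][i] = 0.01 * geo[i][j] + rng.uniform(0, 0.2)
    return gen, geo


if __name__ == "__main__":
    gen, geo = demo_matrices()
    r, p = mantel_test(gen, geo)
    print("Mantel r:", round(r, 3), "p-value:", p)
```

Rows and columns are permuted together because the entries of a distance matrix are not independent; shuffling individual entries would break the matrix structure and invalidate the test.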
Abstract:
Although the value of primary forests for biodiversity conservation is well known, the potential biodiversity and conservation value of regenerating forests remains controversial. Many factors likely contribute to this, including: 1. the variable ages of regenerating forests being studied (often dominated by relatively young regenerating forests); 2. the potential for confounding on-going human disturbance (such as logging and hunting); 3. the relatively low number of multi-taxa studies; 4. the lack of studies that directly compare different historic disturbances within the same location; 5. contrasting patterns from different survey methodologies and the paucity of knowledge on the impacts across different vertical levels of rainforest biodiversity (often due to a lack of suitable methodologies available to assess them). We also know relatively little as to how biodiversity is affected by major current impacts, such as unmarked rainforest roads, which contribute to this degradation of habitat and fragmentation. This thesis explores the potential biodiversity value of regenerating rainforests under the best of scenarios and seeks to understand more about the impact of current human disturbance to biodiversity; data comes from case studies from the Manu and Sumaco Biosphere Reserves in the Western Amazon. Specifically, I compare overall biodiversity and conservation value of a best case regenerating rainforest site with a selection of well-studied primary forest sites and with predicted species lists for the region; including a focus on species of key conservation concern. I then investigate the biodiversity of the same study site in reference to different types of historic anthropogenic disturbance. Following this I investigate the impacts to biodiversity from an unmarked rainforest road. 
In order to understand more about the differential effects of habitat disturbance on arboreal diversity I directly assess how patterns of butterfly biodiversity vary between three vertical strata. Although assessments within the canopy have been made for birds, invertebrates and bats, very few studies have successfully targeted arboreal mammals. I therefore investigate the potential of camera traps for inventorying arboreal mammal species in comparison with traditional methodologies. Finally, in order to investigate the possibility that different survey methodologies might identify different biodiversity patterns in habitat disturbance assessments, I investigate whether two different but commonly used survey methodologies used to assess amphibians, indicate the same or different responses of amphibian biodiversity to historic habitat change by people. The regenerating rainforest study site contained high levels of species richness; both in terms of alpha diversity found in nearby primary forest areas (87% ±3.5) and in terms of predicted primary forest diversity from the region (83% ±6.7). This included 89% (39 out of 44) of the species of high conservation concern predicted for the Manu region. Faunal species richness in once completely cleared regenerating forest was on average 13% (±9.8) lower than historically selectively logged forest. The presence of the small unmarked road significantly altered levels of faunal biodiversity for three taxa, up to and potentially beyond 350m into the forest interior. Most notably, the impact on biodiversity extended to at least 32% of the whole reserve area. The assessment of butterflies across strata showed that different vertical zones within the same rainforest responded differently in areas with different historic human disturbance. 
A comparison between forest regenerating after selective logging and forest regenerating after complete clearance showed that there was a 17% greater reduction in canopy species richness in the historically cleared forest compared with the terrestrial community. Comparing arboreal camera traps with traditional ground-based techniques suggests that camera traps are an effective tool for inventorying secretive arboreal rainforest mammal communities and detect a higher number of cryptic species. Finally, the two survey methodologies used to assess amphibian communities identified contrasting biodiversity patterns in a human-modified rainforest; one indicated biodiversity differences between forests with different human disturbance histories, whereas the other suggested no differences between forest disturbance types. Overall, in this thesis I find that regenerating and human-disturbed tropical forest can potentially contribute to rainforest biodiversity conservation, particularly in the best of circumstances. I also highlight the importance of using appropriate study methodologies to investigate these three-dimensional habitats, and contribute to the development of methodologies to do so. However, care should be taken when using different survey methodologies, which can provide contrasting biodiversity patterns in response to human disturbance.
Resumo:
Abstract: Quantitative Methods (QM) is a compulsory course in the Social Science program in CEGEP. Many QM instructors assign a number of homework exercises to give students the opportunity to practice the statistical methods, which enhances their learning. However, traditional written exercises have two significant disadvantages. The first is that the feedback process is often very slow. The second disadvantage is that written exercises can generate a large amount of correcting for the instructor. WeBWorK is an open-source system that allows instructors to write exercises which students answer online. Although originally designed to write exercises for math and science students, WeBWorK programming allows for the creation of a variety of questions which can be used in the Quantitative Methods course. Because many statistical exercises generate objective and quantitative answers, the system is able to instantly assess students’ responses and tell them whether they are right or wrong. This immediate feedback has been shown to be theoretically conducive to positive learning outcomes. In addition, the system can be set up to allow students to re-try the problem if they got it wrong. This has benefits both in terms of student motivation and reinforcing learning. Through the use of a quasi-experiment, this research project measured and analysed the effects of using WeBWorK exercises in the Quantitative Methods course at Vanier College. Three specific research questions were addressed. First, we looked at whether students who did the WeBWorK exercises got better grades than students who did written exercises. Second, we looked at whether students who completed more of the WeBWorK exercises got better grades than students who completed fewer of the WeBWorK exercises. Finally, we used a self-report survey to find out what students’ perceptions and opinions were of the WeBWorK and the written exercises. 
For the first research question, a crossover design was used in order to compare whether the group that did WeBWorK problems during one unit would score significantly higher on that unit test than the other group that did the written problems. We found no significant difference in grades between students who did the WeBWorK exercises and students who did the written exercises. The second research question looked at whether students who completed more of the WeBWorK exercises would get significantly higher grades than students who completed fewer of the WeBWorK exercises. The straight-line relationship between number of WeBWorK exercises completed and grades was positive in both groups. However, the correlation coefficients for these two variables showed no real pattern. Our third research question was investigated by using a survey to elicit students’ perceptions and opinions regarding the WeBWorK and written exercises. Students reported no difference in the amount of effort put into completing each type of exercise. Students were also asked to rate each type of exercise along six dimensions and a composite score was calculated. Overall, students gave a significantly higher score to the written exercises, and reported that they found the written exercises were better for understanding the basic statistical concepts and for learning the basic statistical methods. However, when presented with the choice of having only written or only WeBWorK exercises, slightly more students preferred or strongly preferred having only WeBWorK exercises. The results of this research suggest that the advantages of using WeBWorK to teach Quantitative Methods are variable. The WeBWorK system offers immediate feedback, which often seems to motivate students to try again if they do not have the correct answer. However, this does not necessarily translate into better performance on the written tests and on the final exam. 
What has been learned is that the WeBWorK system can be used by interested instructors to enhance student learning in the Quantitative Methods course. Further research may examine more specifically how this system can be used more effectively.
Quantification of sugars with an electronic tongue: multivariate calibration with sensor selection
Resumo:
This work addresses the analysis of the major sugars in foods (glucose, fructose and sucrose) with a potentiometric electronic tongue, using multivariate calibration with sensor selection. Analysing these compounds contributes to assessing the impact of sugars on health and their physiological effect, and also makes it possible to relate sensory attributes and to support food quality and authenticity control. Although several analytical methodologies are routinely used to identify and quantify sugars in foods, these methods generally have several disadvantages, such as slow analyses, high consumption of chemical reagents and the need for destructive sample pretreatment. It was therefore decided to apply a potentiometric electronic tongue, built with polymeric sensors selected on the basis of the sensitivities to sugars obtained in previous work, to the analysis of sugars in foods, with the aim of establishing an analytical methodology and mathematical procedures for quantifying these compounds. For this purpose, analyses were carried out on standard solutions of ternary mixtures of the sugars at different concentration levels and on solutions of dissolved honey samples, which had previously been analysed by HPLC to determine the reference sugar concentrations. An exploratory analysis of the data was then performed using principal component analysis, with the aim of removing discordant sensors or observations. Next, multiple linear regression models were built with variable selection using the stepwise algorithm; although a good relationship could be established between the sensor responses and the sugar concentrations, the models did not show satisfactory predictive performance on test-set data. 
To overcome this problem, new approaches were tested by building a genetic algorithm for variable selection, and optimising its parameters, so that it could be applied to various regression tools, among them partial least squares regression. Good prediction results were obtained for the models built with partial least squares combined with the genetic algorithm, both for the standard solutions and for the honey solutions, with adjusted R² above 0.99 and RMSE below 0.5, obtained from the linear relationship between predicted and experimental values using test-set data. The multi-sensor system that was built proved to be a suitable tool for analysing the sugars when they are present as major constituents, and an alternative to reference instrumental methods such as HPLC, because it reduces analysis time and cost, requires minimal sample preparation and eliminates polluting end products.
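The two figures of merit quoted in this abstract (adjusted R² and RMSE from the linear relationship between predicted and experimental values) can be illustrated with a minimal Python sketch. It is not the authors' code: the function names are hypothetical, the sample values are invented purely for illustration, and it assumes a single-predictor straight-line fit of predicted against experimental values, so that R² equals the squared Pearson correlation.

```python
import math


def rmse(experimental, predicted):
    """Root mean squared error between reference and predicted values."""
    n = len(experimental)
    squared_errors = ((e - p) ** 2 for e, p in zip(experimental, predicted))
    return math.sqrt(sum(squared_errors) / n)


def r_squared(experimental, predicted):
    """R² of a straight-line fit of predicted vs. experimental values.

    For a one-predictor linear fit this is the squared Pearson correlation.
    """
    n = len(experimental)
    mean_e = sum(experimental) / n
    mean_p = sum(predicted) / n
    s_ep = sum((e - mean_e) * (p - mean_p) for e, p in zip(experimental, predicted))
    s_ee = sum((e - mean_e) ** 2 for e in experimental)
    s_pp = sum((p - mean_p) ** 2 for p in predicted)
    return (s_ep * s_ep) / (s_ee * s_pp)


def adjusted_r_squared(experimental, predicted, n_predictors=1):
    """Adjusted R²: penalises R² for the number of predictors in the fit."""
    n = len(experimental)
    r2 = r_squared(experimental, predicted)
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)


if __name__ == "__main__":
    # Invented test-set values, for illustration only (not data from the study).
    experimental = [1.0, 2.0, 3.0, 4.0, 5.0]
    predicted = [1.1, 1.9, 3.2, 3.9, 5.0]
    print("RMSE:", rmse(experimental, predicted))
    print("Adjusted R²:", adjusted_r_squared(experimental, predicted))
```

Computing both metrics on held-out test data, as the abstract describes, is what separates genuine predictive performance from a good fit to the calibration set alone.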