930 results for Data-driven
Abstract:
Video games have become one of the largest entertainment industries, and their power to capture the attention of players worldwide soon prompted the idea of using games to improve education. However, these educational games, commonly referred to as serious games, face different challenges when brought into the classroom, ranging from pragmatic issues (e.g. a high development cost) to deeper educational issues, including a lack of understanding of how the students interact with the games and how the learning process actually occurs. This chapter explores the potential of data-driven approaches to improve the practical applicability of serious games. Existing work done by the entertainment and learning industries helps to build a conceptual model of the tasks required to analyze player interactions in serious games (gaming learning analytics or GLA). The chapter also describes the main ongoing initiatives to create reference GLA infrastructures and their connection to new emerging specifications from the educational technology field. Finally, it explores how this data-driven GLA will help in the development of a new generation of more effective educational games and new business models that will support their expansion. This results in additional ethical implications, which are discussed at the end of the chapter.
Abstract:
Microsecond-long molecular dynamics (MD) trajectories of biomolecular processes are now possible due to advances in computer technology. Soon, trajectories long enough to probe dynamics over many milliseconds will become available. Since these timescales match the physiological timescales over which many small proteins fold, all-atom MD simulations of protein folding are now becoming popular. To distill features of such large folding trajectories, we must develop methods that both compress trajectory data to enable visualization and lend themselves to further analysis, such as the finding of collective coordinates and reduction of the dynamics. Conventionally, clustering has been the most popular MD trajectory analysis technique, followed by principal component analysis (PCA). Simple clustering used in MD trajectory analysis suffers from several serious drawbacks: (i) it is not data-driven, (ii) it is unstable to noise and to changes in cutoff parameters, and (iii) since it does not take into account interrelationships amongst data points, the separation of data into clusters can often be artificial. Usually, partitions generated by clustering techniques are validated visually, but such validation is not possible for MD trajectories of protein folding, as the underlying structural transitions are not well understood. Rigorous cluster validation techniques may be adapted, but it is more crucial to reduce the dimensions in which MD trajectories reside while still preserving their salient features. PCA has often been used for dimension reduction; while it is computationally inexpensive, it is a linear method and does not achieve good data compression. In this thesis, I propose a different method, a nonmetric multidimensional scaling (nMDS) technique, which achieves superior data compression by virtue of being nonlinear and also provides clear insight into the structural processes underlying MD trajectories.
I illustrate the capabilities of nMDS by analyzing three complete villin headpiece folding trajectories and six norleucine mutant (NLE) folding trajectories simulated by Freddolino and Schulten [1]. Using these trajectories, I compare nMDS, PCA and clustering to demonstrate the superiority of nMDS. The three villin headpiece trajectories showed great structural heterogeneity. Apart from a few trivial features, such as early formation of secondary structure, no commonalities between trajectories were found. No units of residues or atoms were found moving in concert across the trajectories. A flipping transition, corresponding to the flipping of helix 1 relative to the plane formed by helices 2 and 3, was observed towards the end of the folding process in all trajectories, when nearly all native contacts had been formed. However, the transition occurred through a different series of steps in each trajectory, indicating that it may not be a common transition in villin folding. All trajectories showed competition between local structure formation/hydrophobic collapse and global structure formation. Our analysis of the NLE trajectories confirms the notion that a tight hydrophobic core inhibits correct 3-D rearrangement. Only one of the six NLE trajectories folded, and it showed no flipping transition. All the other trajectories became trapped in hydrophobically collapsed states. The NLE residues were found to be buried deep in the core, compared to the corresponding lysines in the villin headpiece, thereby making the core tighter and harder to undo for 3-D rearrangement. Our results suggest that the NLE mutant may not be the fast folder that experiments suggest. The tightness of the hydrophobic core may be a very important factor in the folding of larger proteins. It is likely that chaperones like GroEL act to undo the tight hydrophobic core of proteins, after most secondary structure elements have been formed, so that global rearrangement is easier.
I conclude by presenting facts about chaperone-protein complexes and propose further directions for the study of protein folding.
Abstract:
This study positioned the federal No Child Left Behind (NCLB) Act of 2002 as a reified colonizing entity, inscribing its hegemonic authority upon the professional identity and work of school principals within their school communities of practice. Pressure on educators and students intensifies each year as the benchmark for Adequate Yearly Progress under the NCLB policy is raised, resulting in standards-based reform, scripted curriculum and pedagogy, absence of elective subjects, and a general lack of autonomy critical to the work of teachers as they approach each unique class and student (Crocco & Costigan, 2007; Mabry & Margolis, 2006). Emphasis on high-stakes standardized testing as the indicator for student achievement (Popham, 2005) affects educators’ professional identity through dramatic pedagogical and structural changes in schools (Day, Flores, & Viana, 2007). These dramatic changes to the ways our nation conducts schooling must be understood and thought about critically from school leaders’ perspectives as their professional identity is influenced by large-scale NCLB school reform. The author explored the impact No Child Left Behind reform had on the professional identity of fourteen veteran Illinois principals leading in urban, small urban, suburban, and rural middle and elementary schools. Qualitative data were collected during semi-structured interviews and focus groups and analyzed using a dual theoretical framework of postcolonial and identity theories. Postcolonial theory provided a lens through which the author applied a metaphor of colonization to principals’ experiences as colonized-colonizers in a time of school reform. Principal interview data illustrated many examples of NCLB as a colonizing authority having a significant impact on the professional identity of school leaders.
This framework was used to interpret data in a unique and alternative way and contributed to the need to better understand the ways school leaders respond to district-level, state-level, and national-level accountability policies (Sloan, 2000). Identity theory situated principals as professionals shaped by the communities of practice in which they lead. Principals’ professional identity has become more data-driven as a result of NCLB, and their role as instructional leaders has intensified. The data showed that NCLB has changed the work and professional identity of principals in terms of use of data, classroom instruction, Response to Intervention, and staffing changes. Although NCLB defines success in terms of meeting or exceeding the benchmark for Adequate Yearly Progress (AYP), principals view AYP as only one measurement of their success. The need to meet the benchmark for AYP is a present reality that necessitates school-wide attention to reading and math achievement. At this time, principals leading in affluent, somewhat homogeneous schools typically experience less pressure and more power under NCLB, and their schools are more often labeled “successful” school communities. In contrast, principals leading in schools with more heterogeneity experience more pressure and a lack of power under NCLB, and their schools are more often labeled “failing” school communities. Implications from this study for practitioners and policymakers include a need to reexamine the intents and outcomes of the policy for all school communities, especially in terms of power and voice. Recommendations for policy reform include moving to a growth model with multi-year assessments that make sense for individual students rather than one standardized test score as the measure for achievement. Overall, the study reveals enhancements and constraints NCLB policy has caused in a variety of school contexts, which have affected the professional identity of school leaders.
Abstract:
With the development of variable-data-driven digital presses - where each document printed is potentially unique - there is a need for pre-press optimization to identify material that is invariant from document to document. In this way rasterisation can be confined solely to those areas which change between successive documents thereby alleviating a potential performance bottleneck. Given a template document specified in terms of layout functions, where actual data is bound at the last possible moment before printing, we look at deriving and exploiting the invariant properties of layout functions from their formal specifications. We propose future work on generic extraction of invariance from such properties for certain classes of layout functions.
Abstract:
This work is aimed at understanding and unifying information on epidemiological modelling methods and how those methods relate to public policy addressing human health, specifically in the context of infectious disease prevention, pandemic planning, and health behaviour change. This thesis employs multiple qualitative and quantitative methods, and is presented as a manuscript of several individual, data-driven projects that are combined in a narrative arc. The first chapter introduces the scope and complexity of this interdisciplinary undertaking, describing several topical intersections of importance. The second chapter begins the presentation of original data, and describes in detail two exercises in computational epidemiological modelling pertinent to pandemic influenza planning and policy. The next chapter presents additional original data on how the public's confidence in modelling methodology may affect their planned health behaviour change as recommended in public health policy. The final data-driven chapter describes how health policymakers use modelling methods and scientific evidence to inform and construct health policies for the prevention of infectious diseases. The thesis concludes with a narrative chapter that evaluates the breadth of these data and recommends strategies for the optimal use of modelling methodologies when informing public health policy in applied public health scenarios.
Abstract:
Human relationships have long been studied by scientists from domains such as sociology, psychology, and literature to understand people's desires, goals, actions and expected behaviors. In this dissertation we study inter-personal relationships as expressed in natural language text. Modeling inter-personal relationships from text finds application in general natural language understanding, as well as in real-world domains such as social networks, discussion forums, and intelligent virtual agents. We propose that the study of relationships should incorporate not only linguistic cues in text, but also the contexts in which these cues appear. Our investigations, backed by empirical evaluation, support this thesis, and demonstrate that the task benefits from using structured models that incorporate both types of information. We present such structured models to address the task of modeling the nature of relationships between any two given characters from a narrative. To begin with, we assume that relationships are of two types: cooperative and non-cooperative. We first describe an approach to jointly infer relationships between all characters in the narrative, and demonstrate how the task of characterizing the relationship between two characters can benefit from including information about their relationships with other characters in the narrative. We next formulate the relationship-modeling problem as a sequence prediction task to acknowledge the evolving nature of human relationships, and demonstrate the need to model the history of a relationship in predicting its evolution. Thereafter, we present a data-driven method to automatically discover various types of relationships, such as familial, romantic, and hostile. As before, we address the task of modeling evolving relationships, but we do not restrict ourselves to two types of relationships. We also demonstrate the need to incorporate not only local historical context but also global context when solving this problem.
Lastly, we demonstrate a practical application of modeling inter-personal relationships in the domain of online educational discussion forums. Such forums offer opportunities for their users to interact and form deeper relationships. With this view, we address the task of identifying the initiation of such deeper relationships between a student and the instructor. Specifically, we analyze the contents of the forums to automatically suggest to instructors the threads that require their intervention. By highlighting scenarios that need direct instructor-student interactions, we alleviate the need for the instructor to manually peruse all threads of the forum and also assist students who have limited avenues for communicating with instructors. We do this by incorporating the discourse structure of the thread through latent variables that abstractly represent the contents of individual posts and model the flow of information in the thread. Such latent structured models, which incorporate linguistic cues without losing their context, can be helpful in other related natural language understanding tasks as well. We demonstrate this by using the model for a very different task: identifying whether a stated desire has been fulfilled by the end of a story.
Abstract:
Experiments were conducted at the GALCIT supersonic shear-layer facility to investigate aspects of reacting transverse jets in supersonic crossflow using chemiluminescence and schlieren image-correlation velocimetry. In particular, experiments were designed to examine mixing-delay length dependencies on jet-fluid molar mass, jet diameter, and jet inclination.
The experimental results show that mixing-delay length depends on jet Reynolds number, when appropriately normalized, up to a jet Reynolds number of 500,000. Jet inclination increases the mixing-delay length, but causes less disturbance to the crossflow when compared to normal jet injection. This can be explained, in part, in terms of a control-volume analysis that relates jet inclination to flow conditions downstream of injection.
In the second part of this thesis, a combustion-modeling framework is proposed and developed that is tailored to large-eddy simulations of turbulent combustion in high-speed flows. Scaling arguments place supersonic hydrocarbon combustion in a regime of autoignition-dominated distributed reaction zones (DRZ). The proposed evolution-variable manifold (EVM) framework incorporates an ignition-delay data-driven induction model with a post-ignition manifold that uses a Lagrangian convected 'balloon' reactor model for chemistry tabulation. A large-eddy simulation incorporating the EVM framework captures several important reacting-flow features of a transverse hydrogen jet in heated-air crossflow experiment.
Abstract:
This research explores how the infrastructure and uses of eBird, one of the largest citizen science projects in the world, develop and evolve over time and space. We focus on eBird's work with two of its Latin American partners, Mexico and Peru, each with a web portal managed by local organizations. eBird, now a large global network of partnerships, gives citizens around the world the opportunity to contribute to science and bird conservation through their observations uploaded online. These observations are managed and stored in a unified, global database that is accessible to anyone interested in birds and their conservation. Users likewise take advantage of the platform's features to organize and visualize their own data and that of others. The study is based on a qualitative methodology combining observation of the web platforms with semi-structured interviews with members of the Cornell Lab of Ornithology, the eBird team, and members of the local partner organizations responsible for eBird Peru and eBird Mexico. We analyze eBird as an infrastructure, considering its social and technical aspects together as a whole. We also explore the variety of ways in which the platform and its data are used by its various users. Three major themes emerge: the importance of collaboration as a philosophy underlying eBird's development, the expansion of eBird's relationships and connections through its partnerships, and the growth in participation and in the volume of data. Finally, over time there has been an evolution in the data and their various uses, and in what eBird represents as an infrastructure.
Abstract:
During the past decade, there has been a dramatic increase by postsecondary institutions in providing academic programs and course offerings in a multitude of formats and venues (Biemiller, 2009; Kucsera & Zimmaro, 2010; Lang, 2009; Mangan, 2008). Strategies pertaining to reapportionment of course-delivery seat time have been a major facet of these institutional initiatives; most notably, within many open-door 2-year colleges. Often, these enrollment-management decisions are driven by the desire to increase market-share, optimize the usage of finite facility capacity, and contain costs, especially during these economically turbulent times. So, while enrollments have surged to the point where nearly one in three 18-to-24 year-old U.S. undergraduates are community college students (Pew Research Center, 2009), graduation rates, on average, still remain distressingly low (Complete College America, 2011). Among the learning-theory constructs related to seat-time reapportionment efforts is the cognitive phenomenon commonly referred to as the spacing effect, the degree to which learning is enhanced by a series of shorter, separated sessions as opposed to fewer, more massed episodes. This ex post facto study explored whether seat time in a postsecondary developmental-level algebra course is significantly related to: course success; course-enrollment persistence; and, longitudinally, the time to successfully complete a general-education-level mathematics course. Hierarchical logistic regression and discrete-time survival analysis were used to perform a multi-level, multivariable analysis of a student cohort (N = 3,284) enrolled at a large, multi-campus, urban community college. The subjects were retrospectively tracked over a 2-year longitudinal period. The study found that students in long seat-time classes tended to withdraw earlier and more often than did their peers in short seat-time classes (p < .05). 
Additionally, a model comprised of nine statistically significant covariates (all with p-values less than .01) was constructed. However, no longitudinal seat-time group differences were detected nor was there sufficient statistical evidence to conclude that seat time was predictive of developmental-level course success. A principal aim of this study was to demonstrate—to educational leaders, researchers, and institutional-research/business-intelligence professionals—the advantages and computational practicability of survival analysis, an underused but more powerful way to investigate changes in students over time.
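As a rough illustration of the discrete-time survival idea mentioned above, the following sketch computes a simple life-table hazard and survival curve for course withdrawal. It is not the study's hierarchical model: the function name `discrete_time_hazard`, the record layout, and the toy cohort are all hypothetical, and students censored in a period are assumed to remain at risk throughout that period.

```python
from collections import Counter


def discrete_time_hazard(records, n_periods):
    """Life-table estimate of hazard and survival per discrete period.

    records: list of (last_period, withdrew) pairs (hypothetical layout).
      last_period is the 1-based period in which a student was last observed;
      withdrew is True if the student withdrew in that period and False if
      the observation was censored at the end of that period.
    """
    events = Counter(p for p, withdrew in records if withdrew)
    exits = Counter(p for p, _ in records)
    at_risk = len(records)
    surviving = 1.0
    hazard, survival = [], []
    for period in range(1, n_periods + 1):
        # Hazard: share of students still at risk who withdraw in this period.
        h = events[period] / at_risk if at_risk else 0.0
        surviving *= 1.0 - h
        hazard.append(h)
        survival.append(surviving)
        # Everyone last seen in this period leaves the risk set.
        at_risk -= exits[period]
    return hazard, survival


# Toy cohort of six students (made-up values, not data from the study).
records = [(1, True), (2, True), (2, False), (3, True), (3, False), (3, False)]
hazard, survival = discrete_time_hazard(records, 3)
```

Comparing such curves between long and short seat-time groups is one simple way to see whether one group tends to withdraw earlier; the regression-based analysis described in the abstract additionally adjusts for covariates.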
Abstract:
This dissertation introduces a new approach for assessing the effects of pediatric epilepsy on the language connectome. Two novel data-driven network construction approaches are presented. These methods rely on connecting different brain regions using either the extent or the intensity of language-related activations as identified by independent component analysis of fMRI data. An auditory description decision task (ADDT) paradigm was used to activate the language network for 29 patients and 30 controls recruited from three major pediatric hospitals. Empirical evaluations illustrated that pediatric epilepsy can cause, or is associated with, a reduction in network efficiency. Patients showed a propensity to inefficiently employ the whole brain network to perform the ADDT language task; in contrast, controls seemed to efficiently use smaller segregated network components to achieve the same task. To explain the causes of the decreased efficiency, graph theoretical analysis was carried out. The analysis revealed no substantial global network feature differences between the patient and control groups. It also showed that for both subject groups the language network exhibited small-world characteristics; however, the patients’ extent-of-activation network showed a tendency towards more random networks. It was also shown that the intensity-of-activation network displayed ipsilateral hub reorganization on the local level. The left hemispheric hubs displayed greater centrality values for patients, whereas the right hemispheric hubs displayed greater centrality values for controls. This hub hemispheric disparity was not correlated with a right atypical language laterality found in six patients. Finally, it was shown that a multi-level unsupervised clustering scheme based on self-organizing maps, a type of artificial neural network, and k-means was able to fairly and blindly separate the subjects into their respective patient or control groups.
The clustering was initiated using the local nodal centrality measurements only. Compared to the extent of activation network, the intensity of activation network clustering demonstrated better precision. This outcome supports the assertion that the local centrality differences presented by the intensity of activation network can be associated with focal epilepsy.
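To make the graph-theoretic vocabulary above concrete, here is a minimal sketch of two common measures on an unweighted, undirected graph: global efficiency (the mean inverse shortest-path length over node pairs) and degree centrality. The abstract does not say which efficiency or centrality definitions were used, so these choices, the function names, and the four-node toy graph are assumptions for illustration only.

```python
from collections import deque


def shortest_path_lengths(adj, source):
    """Breadth-first search distances (in hops) from source to reachable nodes."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adj[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def global_efficiency(adj):
    """Mean of 1/d(u, v) over ordered node pairs; unreachable pairs count as 0."""
    n = len(adj)
    if n < 2:
        return 0.0
    total = 0.0
    for u in adj:
        dist = shortest_path_lengths(adj, u)
        total += sum(1.0 / d for v, d in dist.items() if v != u)
    return total / (n * (n - 1))


def degree_centrality(adj):
    """Fraction of the other nodes to which each node is directly connected."""
    n = len(adj)
    if n < 2:
        return {u: 0.0 for u in adj}
    return {u: len(neighbours) / (n - 1) for u, neighbours in adj.items()}


# Hypothetical toy networks: a chain of four regions and a fully connected one.
chain = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
full = {u: {v for v in "ABCD" if v != u} for u in "ABCD"}
chain_eff = global_efficiency(chain)
full_eff = global_efficiency(full)
centrality = degree_centrality(chain)
```

In this toy case the fully connected graph is more efficient than the chain, and the two middle nodes of the chain have the highest centrality, which is the kind of nodal feature a clustering step could take as input.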
Abstract:
Model predictive control (MPC) has often been referred to in literature as a potential method for more efficient control of building heating systems. Though a significant performance improvement can be achieved with an MPC strategy, the complexity introduced to the commissioning of the system is often prohibitive. Models are required which can capture the thermodynamic properties of the building with sufficient accuracy for meaningful predictions to be made. Furthermore, a large number of tuning weights may need to be determined to achieve a desired performance. For MPC to become a practicable alternative, these issues must be addressed. Acknowledging the impact of the external environment as well as the interaction of occupants on the thermal behaviour of the building, in this work, techniques have been developed for deriving building models from data in which large, unmeasured disturbances are present. A spatio-temporal filtering process was introduced to determine estimates of the disturbances from measured data, which were then incorporated with metaheuristic search techniques to derive high-order simulation models, capable of replicating the thermal dynamics of a building. While a high-order simulation model allowed for control strategies to be analysed and compared, low-order models were required for use within the MPC strategy itself. The disturbance estimation techniques were adapted for use with system-identification methods to derive such models. MPC formulations were then derived to enable a more straightforward commissioning process and implemented in a validated simulation platform. A prioritised-objective strategy was developed which allowed for the tuning parameters typically associated with an MPC cost function to be omitted from the formulation by separation of the conflicting requirements of comfort satisfaction and energy reduction within a lexicographic framework. 
The improved ability of the formulation to be set up and reconfigured in faulted conditions was shown.
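The prioritised-objective idea can be sketched without any tuning weights: first find the lowest achievable discomfort, then, among the plans that achieve it, pick the one using the least energy. The example below does this by brute force on a made-up first-order thermal model. The model coefficients, heater levels, comfort bound, and function names are all hypothetical and are not taken from the thesis, whose formulation is not given in the abstract.

```python
from itertools import product


def simulate(temp, controls, a=0.9, b=2.0, outdoor=5.0):
    """Toy first-order room model: T[k+1] = a*T[k] + (1-a)*T_out + b*u[k]."""
    temps = []
    for u in controls:
        temp = a * temp + (1.0 - a) * outdoor + b * u
        temps.append(temp)
    return temps


def lexicographic_plan(temp0, horizon=4, levels=(0.0, 0.5, 1.0),
                       comfort_min=20.0, tol=1e-9):
    """Return (discomfort, energy, controls) chosen in strict priority order."""
    candidates = []
    for controls in product(levels, repeat=horizon):
        temps = simulate(temp0, controls)
        # Priority 1: total shortfall below the comfort bound.
        discomfort = sum(max(0.0, comfort_min - t) for t in temps)
        # Priority 2: total heater effort over the horizon.
        energy = sum(controls)
        candidates.append((discomfort, energy, controls))
    best_discomfort = min(c[0] for c in candidates)
    # Keep only plans that are optimal for comfort, then minimise energy.
    comfortable = [c for c in candidates if c[0] <= best_discomfort + tol]
    return min(comfortable, key=lambda c: c[1])


warm_start = lexicographic_plan(30.0)   # room starts well above the bound
cool_start = lexicographic_plan(21.0)   # room would drift below the bound
```

Because comfort is settled before energy is considered, no weight trading one against the other has to be tuned. A practical controller would solve the two stages as optimisation problems over continuous inputs instead of enumerating a small grid.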
Abstract:
Objective: To identify the barriers to unifying the Electronic Health Record (Historia Clínica Electrónica, HCE) in Colombia. Materials and Methods: A qualitative study was conducted. Semi-structured interviews were held with professionals and experts from 22 health-sector institutions in Bogotá and the departments of Cundinamarca, Santander, Antioquia, Caldas, Huila, and Valle del Cauca. Results: Colombia is in the process of structuring the implementation of the Unified Electronic Health Record (Historia Clínica Electrónica Unificada, HCEU). Unification is currently under way in 42 public IPSs in the department of Cundinamarca; development of the HCEU in the country is private and done in-house, owing to the particular needs of each IPS. Conclusions: Human, financial, legal, organizational, technical, and professional barriers were identified in the departments where interviews were conducted. Unification of the HCE was found to depend on an agreement among the public- and private-sector IPSs, the EPSs, and the National Government.
Abstract:
We propose a method, denoted synthetic portfolio, for event studies in market microstructure that is particularly suited to high-frequency data and thinly traded markets. The method is based on the Synthetic Control Method (SCM) and provides a robust, data-driven way to build a counterfactual for evaluating the effects of volatility call auctions. We find that SCM can be used if the loss function is defined as the difference between the returns of the asset and the returns of a synthetic portfolio. We apply SCM to test the performance of the volatility call auction as a circuit breaker in the context of an event study. We find that, for Colombian Stock Market securities, the asynchronicity of intraday data reduces the analysis to a selected group of stocks; however, it is still possible to build a tracking portfolio. The realized volatility increases after the auction, indicating that the mechanism is not enhancing the price discovery process.
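A minimal sketch of how synthetic-portfolio weights might be fitted follows. It assumes a squared-difference loss between the asset's returns and the portfolio's returns, with the usual synthetic-control restriction that weights are non-negative and sum to one. The abstract does not spell out the exact loss or constraints, so these, the function names, and the toy returns are assumptions. The fit uses projected gradient descent in plain Python.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    ordered = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for j, value in enumerate(ordered, start=1):
        cumulative += value
        candidate = (cumulative - 1.0) / j
        if value - candidate > 0:
            theta = candidate  # last index passing the test gives the shift
    return [max(x - theta, 0.0) for x in v]


def synthetic_weights(target, donors, iterations=500):
    """Fit weights so that the donor portfolio tracks the target returns.

    target: list of T pre-event returns of the treated asset.
    donors: list of T rows, each holding K donor-asset returns.
    """
    n_obs, n_donors = len(donors), len(donors[0])
    w = [1.0 / n_donors] * n_donors
    # Step size from a crude bound on the gradient's Lipschitz constant.
    lipschitz = 2.0 / n_obs * sum(x * x for row in donors for x in row)
    if lipschitz == 0.0:
        return w
    step = 1.0 / lipschitz
    for _ in range(iterations):
        residuals = [r - sum(wk * xk for wk, xk in zip(w, row))
                     for r, row in zip(target, donors)]
        grad = [-2.0 / n_obs * sum(res * row[k]
                                   for res, row in zip(residuals, donors))
                for k in range(n_donors)]
        w = project_to_simplex([wk - step * gk for wk, gk in zip(w, grad)])
    return w


# Made-up pre-event returns for three donor stocks over four intervals.
donors = [[0.01, 0.01, 0.01],
          [-0.01, 0.01, -0.01],
          [0.01, -0.01, -0.01],
          [-0.01, -0.01, 0.01]]
# Toy target built as a 70/30 mix of the first two donors.
target = [0.7 * row[0] + 0.3 * row[1] for row in donors]
weights = synthetic_weights(target, donors)
```

Once weights are fitted on pre-event data, the post-event returns of the weighted donor portfolio serve as the counterfactual against which the asset's realized behaviour after the auction can be compared.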