903 results for Data-driven knowledge acquisition


Relevance: 100.00%

Abstract:

Raising academic standards is a driving force behind public school initiatives. True educational reform requires a data-driven approach to choosing valid options for student improvement. We discuss current and continuing research providing evidence that class size reduction, together with related variables, is a significant option for improving student learning.

Relevance: 100.00%

Abstract:

This dissertation introduces a new approach for assessing the effects of pediatric epilepsy on the language connectome. Two novel data-driven network construction approaches are presented. These methods rely on connecting different brain regions using either the extent or the intensity of language-related activations, as identified by independent component analysis of fMRI data. An auditory description decision task (ADDT) paradigm was used to activate the language network for 29 patients and 30 controls recruited from three major pediatric hospitals. Empirical evaluations illustrated that pediatric epilepsy can cause, or is associated with, a reduction in network efficiency. Patients showed a propensity to inefficiently employ the whole brain network to perform the ADDT language task; by contrast, controls seemed to efficiently use smaller segregated network components to achieve the same task. To explain the causes of the decreased efficiency, graph-theoretical analysis was carried out. The analysis revealed no substantial global network feature differences between the patient and control groups. It also showed that for both subject groups the language network exhibited small-world characteristics; however, the patients' extent-of-activation network showed a tendency towards more random networks. It was also shown that the intensity-of-activation network displayed ipsilateral hub reorganization at the local level: the left hemispheric hubs displayed greater centrality values for patients, whereas the right hemispheric hubs displayed greater centrality values for controls. This hub hemispheric disparity was not correlated with the right atypical language laterality found in six patients. Finally, it was shown that a multi-level unsupervised clustering scheme based on self-organizing maps (a type of artificial neural network) and k-means was able to blindly separate the subjects into their respective patient or control groups with fair accuracy. The clustering was initiated using the local nodal centrality measurements only. Compared to the extent-of-activation network, the intensity-of-activation network clustering demonstrated better precision. This outcome supports the assertion that the local centrality differences presented by the intensity-of-activation network can be associated with focal epilepsy.
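
As a hedged sketch of the final clustering step described above (not the dissertation's code): each subject is reduced to a vector of local nodal centralities from their activation network, and an unsupervised clusterer attempts to recover the patient/control split. Plain k-means stands in for the SOM-plus-k-means scheme, and the networks below are random placeholders.

```python
import numpy as np
import networkx as nx
from sklearn.cluster import KMeans

def nodal_centrality_vector(adj):
    """Degree centrality of every region in one subject's activation network."""
    G = nx.from_numpy_array(adj)
    cent = nx.degree_centrality(G)
    return np.array([cent[n] for n in sorted(cent)])

# Placeholder data: 59 subjects (29 patients + 30 controls), 90 regions each.
rng = np.random.default_rng(0)
def random_network():
    m = rng.random((90, 90))
    adj = ((m + m.T) / 2 > 0.9).astype(float)  # sparse symmetric adjacency
    np.fill_diagonal(adj, 0.0)                 # no self-loops
    return adj

features = np.vstack([nodal_centrality_vector(random_network())
                      for _ in range(59)])

# Two clusters, intended to align with the patient and control groups.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features)
print(labels)
```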

Relevance: 100.00%

Abstract:

This dissertation establishes a novel data-driven method to identify language network activation patterns in pediatric epilepsy through the use of Principal Component Analysis (PCA) on functional magnetic resonance imaging (fMRI). A total of 122 subjects' data sets from five different hospitals were included in the study through a web-based repository site designed at FIU. Research was conducted to evaluate different classification and clustering techniques for identifying hidden activation patterns and their associations with meaningful clinical variables. The results were assessed through agreement analysis with the conventional methods of lateralization index (LI) and visual rating. What is unique in this approach is the new mechanism designed for projecting language network patterns in the PCA-based decisional space. Synthetic activation maps were randomly generated from real data sets to establish nonlinear decision functions (NDF), which are then used to classify any new fMRI activation map as typical or atypical. The best nonlinear classifier was obtained on a 4D space with a complexity (nonlinearity) degree of 7. Based on the significant association of language dominance and intensities with the top eigenvectors of the PCA decisional space, a new algorithm was deployed to delineate primary cluster members without intensity normalization. Three distinct activation patterns (groups) were identified (average kappa of 0.65 with visual rating and 0.76 with LI), characterized by the regions of: 1) the left inferior frontal gyrus (IFG) and left superior temporal gyrus (STG), considered typical for the language task; 2) the IFG, left mesial frontal lobe, and right cerebellum, representing a variant left-dominant pattern with higher activation; and 3) the right homologues of the first pattern in Broca's and Wernicke's language areas. Interestingly, group 2 was found to reflect a language compensation mechanism distinct from reorganization: its high-intensity activation suggests a possible remote effect on the right hemisphere's involvement in traditionally left-lateralized functions. In retrospect, this data-driven method provides new insights into mechanisms of brain compensation/reorganization and neural plasticity in pediatric epilepsy.
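
A hedged sketch of the PCA decisional-space classification described above: activation maps are projected onto the top principal components and a nonlinear decision function separates typical from atypical maps. The degree-7 nonlinearity on a 4-D space mirrors the reported best configuration, but the data, labels, and the choice of a polynomial SVM as the decision function are placeholders and assumptions, not the dissertation's implementation.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(1)
maps = rng.standard_normal((122, 5000))   # 122 flattened activation maps (fake)
labels = rng.integers(0, 2, size=122)     # 0 = typical, 1 = atypical (fake)

# Project onto a 4-D PCA space, then fit a degree-7 polynomial classifier.
clf = make_pipeline(PCA(n_components=4),
                    SVC(kernel="poly", degree=7, coef0=1.0))
clf.fit(maps, labels)

new_map = rng.standard_normal((1, 5000))
print("atypical" if clf.predict(new_map)[0] else "typical")
```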

Relevance: 100.00%

Abstract:

Today, databases have become an integral part of information systems. Over the past two decades, different database systems have been developed independently and used in different application domains. Today's interconnected networks and advanced applications, such as data warehousing, data mining and knowledge discovery, and intelligent data access to information on the Web, have created a need for integrated access to such heterogeneous, autonomous, distributed database systems. Heterogeneous/multidatabase research has focused on this issue, resulting in many different approaches. However, no single, generally accepted methodology has emerged in academia or industry for providing ubiquitous intelligent data access to heterogeneous, autonomous, distributed information sources. This thesis describes a heterogeneous database system being developed at the High-Performance Database Research Center (HPDRC). A major impediment to ubiquitous deployment of multidatabase technology is the difficulty of resolving semantic heterogeneity, that is, identifying related information sources for integration and querying purposes. Our approach considers the semantics of the meta-data constructs in resolving this issue. The major contributions of the thesis work include: (i) a scalable, easy-to-implement architecture for developing a heterogeneous multidatabase system, utilizing the Semantic Binary Object-oriented Data Model (Sem-ODM) and the Semantic SQL query language to capture the semantics of the data sources being integrated and to provide an easy-to-use query facility; (ii) a methodology for semantic heterogeneity resolution that investigates the extents of the meta-data constructs of component schemas and is shown to be correct, complete, and unambiguous; (iii) a semi-automated technique for identifying semantic relations, which is the basis of semantic knowledge for integration and querying, using shared ontologies for context mediation; (iv) resolutions for schematic conflicts and a language for defining global views from a set of component Sem-ODM schemas; (v) the design of a knowledge base for storing and manipulating meta-data and knowledge acquired during the integration process, which acts as the interface between the integration and query-processing modules; (vi) techniques for Semantic SQL query processing and optimization based on semantic knowledge in a heterogeneous database environment; and (vii) a framework for intelligent computing and communication on the Internet applying the concepts of our work.
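
A toy illustration (not HPDRC's actual system) of the context-mediation idea in contribution (iii): attributes from component schemas are mapped to shared ontology concepts so that semantic relations can be identified and a global query rewritten per source. All schema and attribute names below are hypothetical.

```python
# Shared ontology: local attribute -> global concept (hypothetical names).
SHARED_ONTOLOGY = {
    "db1.emp.salary":     "Compensation",
    "db2.staff.pay":      "Compensation",
    "db1.emp.name":       "PersonName",
    "db2.staff.fullname": "PersonName",
}

def semantically_related(attr_a, attr_b):
    """Two attributes are related if they map to the same ontology concept."""
    return (attr_a in SHARED_ONTOLOGY
            and SHARED_ONTOLOGY.get(attr_a) == SHARED_ONTOLOGY.get(attr_b))

def rewrite_global_query(concept):
    """Find, per source, the local attributes that realise a global concept."""
    return [attr for attr, c in SHARED_ONTOLOGY.items() if c == concept]

print(semantically_related("db1.emp.salary", "db2.staff.pay"))  # True
print(rewrite_global_query("PersonName"))
```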

Relevance: 100.00%

Abstract:

The Highway Safety Manual (HSM) estimates roadway safety performance based on predictive models that were calibrated using national data. Calibration factors are then used to adjust these predictive models to local conditions for local applications. The HSM recommends that local calibration factors be estimated using 30 to 50 randomly selected sites that together experienced at least 100 crashes per year. It also recommends that the factors be updated every two to three years, preferably annually. However, these recommendations are based primarily on expert opinion rather than data-driven research findings. Furthermore, most agencies do not have data for many of the input variables recommended in the HSM. This dissertation is aimed at determining the best way to meet three major data needs affecting the estimation of calibration factors: (1) the required minimum sample sizes for different roadway facilities, (2) the required frequency of calibration factor updates, and (3) the influential variables affecting calibration factors. In this dissertation, statewide segment and intersection data were first collected for most of the HSM-recommended calibration variables using a Google Maps application. In addition, eight years (2005-2012) of traffic and crash data were retrieved from existing Florida Department of Transportation databases. With these data, the effect of the sample-size criterion on calibration factor estimates was first studied using a sensitivity analysis. The results showed that the minimum sample sizes not only vary across roadway facilities but are also significantly higher than those recommended in the HSM. In addition, results from paired-sample t-tests showed that calibration factors in Florida need to be updated annually. To identify influential variables affecting the calibration factors for roadway segments, the variables were prioritized by combining the results from three different methods: negative binomial regression, random forests, and boosted regression trees. Only a few variables were found to explain most of the variation in the crash data. Traffic volume was consistently found to be the most influential. In addition, roadside object density, major and minor commercial driveway densities, and minor residential driveway density were also identified as influential variables.
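
For reference, the HSM computes a local calibration factor as the ratio of total observed crashes to total SPF-predicted crashes at the sampled sites. The sketch below illustrates that computation and a naive sensitivity check over sample sizes, using simulated placeholder data rather than the Florida datasets.

```python
import numpy as np

rng = np.random.default_rng(2)
predicted = rng.gamma(shape=2.0, scale=3.0, size=500)  # SPF-predicted crashes/site
observed = rng.poisson(lam=1.1 * predicted)            # simulated true factor ~1.1

def calibration_factor(obs, pred):
    """HSM-style local calibration: sum(observed) / sum(predicted)."""
    return obs.sum() / pred.sum()

# How stable is the estimate as the number of sampled sites grows?
for n in (30, 50, 100, 300):
    idx = rng.choice(len(observed), size=n, replace=False)
    print(n, round(calibration_factor(observed[idx], predicted[idx]), 3))
```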

Relevance: 100.00%

Abstract:

This paper explores a data-driven approach called Sales Resource Management (SRM) that can provide real insight into sales management. The DSMT (Diagnosis, Strategy, Metrics and Tools) framework can be used to solve field sales management challenges. The paper focuses on the 6P's strategy of SRM and illustrates how to use it to solve the CAPS (Concentration, Attrition, Performance and Spend) challenges. © 2010 IEEE.

Relevance: 100.00%

Abstract:

Artisanal mining is a global phenomenon that poses threats to environmental health and safety. Ambiguities in the manner in which ore-processing facilities operate hinder the mining capacity of artisanal miners in Ghana. These problems are reviewed with respect to current socio-economic conditions, health and safety, environmental impacts, and the use of rudimentary technologies that limit fair-trade deals for miners. This research used an established data-driven, geographic information system (GIS)-based spatial analysis approach to locate a centralized processing facility within the Wassa Amenfi-Prestea Mining Area (WAPMA) in the Western Region of Ghana. A spatial analysis technique that utilizes ModelBuilder within the ArcGIS geoprocessing environment systematically and simultaneously analyzed a geographical dataset of selected criteria through suitability modeling. The spatial overlay analysis methodology and the multi-criteria decision analysis approach were selected to identify the most preferred locations for siting a processing facility. For an optimal site selection, seven major criteria were considered: proximity to settlements, water resources, artisanal mining sites, roads, railways, tectonic zones, and slopes. Site characterization and environmental considerations were incorporated as constraints, including proximity to large-scale mines, forest reserves, and state lands. The analysis was limited to criteria that were selected as relevant to the area under investigation. Saaty's analytical hierarchy process was utilized to derive relative importance weights for the criteria, and a weighted linear combination technique was then applied to combine the factors and determine the degree of potential site suitability. The final map output indicates the potential sites identified for the establishment of a facility centre. The results obtained provide intuitive areas suitable for consideration.
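
The two analytical steps named above can be sketched compactly: Saaty's analytical hierarchy process derives criterion weights from a pairwise-comparison matrix (via its principal eigenvector), and a weighted linear combination merges the criterion layers. The matrix values and miniature "rasters" below are illustrative placeholders, not the study's data.

```python
import numpy as np

# Pairwise comparisons for three hypothetical criteria (roads, water, slope).
A = np.array([[1.0, 3.0, 5.0],
              [1/3, 1.0, 3.0],
              [1/5, 1/3, 1.0]])
eigvals, eigvecs = np.linalg.eig(A)
w = np.real(eigvecs[:, eigvals.real.argmax()])  # principal eigenvector
weights = w / w.sum()                           # AHP priority weights

# Suitability scores (0-1) per criterion on a tiny 4-cell "raster".
rasters = np.array([[0.9, 0.2, 0.6, 0.4],   # proximity to roads
                    [0.7, 0.8, 0.1, 0.5],   # proximity to water
                    [0.3, 0.9, 0.8, 0.2]])  # slope
suitability = weights @ rasters             # weighted linear combination
print(weights.round(3), suitability.round(3))
```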

Relevance: 100.00%

Abstract:

An investigation into karst hazard in southern Ontario has been undertaken with the intention of leading to the development of predictive karst models for this region. These are not currently feasible because of a lack of sufficient karst data, though this is not entirely due to a lack of karst features. Geophysical data were collected at Lake on the Mountain, Ontario, as part of this karst investigation, in order to test the long-standing hypothesis that Lake on the Mountain was formed by a sinkhole collapse. Sub-bottom acoustic profiling data were collected to image the lake-bottom sediments and bedrock. Vertical bedrock features, interpreted as solutionally enlarged fractures, were taken as evidence for karst processes on the lake bottom. Additionally, the bedrock topography shows a narrower and more elongated basin than was previously identified, lying parallel to a mapped fault system in the area. This suggests that Lake on the Mountain formed over a fault zone, which also supports the sinkhole hypothesis, as a fault zone would provide groundwater pathways for karst dissolution. Previous sediment cores suggest that Lake on the Mountain formed at some point during the Wisconsinan glaciation, with glacial meltwater and glacial loading as potential contributing factors to sinkhole development. A probabilistic karst model for the state of Kentucky, USA, was generated using the Weights of Evidence method. This model is presented as an example of the predictive capabilities of this kind of data-driven modelling technique and of how such models could be applied to karst in Ontario. The model classified 70% of the validation dataset correctly while minimizing false positive identifications; this is moderately successful and could stand to be improved. Finally, improvements to the current karst model of southern Ontario are suggested, with the goal of increasing investigation into karst in Ontario and streamlining the reporting system for sinkholes, caves, and other karst features so as to improve the current Ontario karst database.
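
As a hedged illustration of the Weights of Evidence method mentioned above: the positive and negative weights for one evidence layer compare the odds of karst features occurring inside versus outside that layer. All unit-cell counts below are invented.

```python
import math

# Hypothetical unit-cell counts for one evidence layer vs. known sinkholes.
n_feature_and_evidence = 80    # sinkhole cells lying on the evidence layer
n_feature = 120                # all sinkhole cells
n_evidence = 3000              # all cells on the evidence layer
n_total = 50000                # all study-area cells

p_e_given_d = n_feature_and_evidence / n_feature
p_e_given_nd = (n_evidence - n_feature_and_evidence) / (n_total - n_feature)
w_plus = math.log(p_e_given_d / p_e_given_nd)    # W+: evidence present

p_ne_given_d = 1 - p_e_given_d
p_ne_given_nd = 1 - p_e_given_nd
w_minus = math.log(p_ne_given_d / p_ne_given_nd) # W-: evidence absent

print(round(w_plus, 3), round(w_minus, 3))       # contrast = W+ - W-
```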

Relevance: 100.00%

Abstract:

Quantitative methods can help us understand how underlying attributes contribute to movement patterns. Applying principal components analysis (PCA) to whole-body motion data may provide an objective data-driven method to identify unique and statistically important movement patterns. Therefore, the primary purpose of this study was to determine if athletes’ movement patterns can be differentiated based on skill level or sport played using PCA. Motion capture data from 542 athletes performing three sport-screening movements (i.e. bird-dog, drop jump, T-balance) were analyzed. A PCA-based pattern recognition technique was used to analyze the data. Prior to analyzing the effects of skill level or sport on movement patterns, methodological considerations related to motion analysis reference coordinate system were assessed. All analyses were addressed as case-studies. For the first case study, referencing motion data to a global (lab-based) coordinate system compared to a local (segment-based) coordinate system affected the ability to interpret important movement features. Furthermore, for the second case study, where the interpretability of PCs was assessed when data were referenced to a stationary versus a moving segment-based coordinate system, PCs were more interpretable when data were referenced to a stationary coordinate system for both the bird-dog and T-balance task. As a result of the findings from case study 1 and 2, only stationary segment-based coordinate systems were used in cases 3 and 4. During the bird-dog task, elite athletes had significantly lower scores compared to recreational athletes for principal component (PC) 1. For the T-balance movement, elite athletes had significantly lower scores compared to recreational athletes for PC 2. In both analyses the lower scores in elite athletes represented a greater range of motion. Finally, case study 4 reported differences in athletes’ movement patterns who competed in different sports, and significant differences in technique were detected during the bird-dog task. Through these case studies, this thesis highlights the feasibility of applying PCA as a movement pattern recognition technique in athletes. Future research can build on this proof-of-principle work to develop robust quantitative methods to help us better understand how underlying attributes (e.g. height, sex, ability, injury history, training type) contribute to performance.
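
A minimal sketch of a PCA-based movement pattern comparison of this kind, assuming each athlete's motion-capture trial is flattened into one feature row; the array shapes, group labels, and data below are placeholders, not the study's.

```python
import numpy as np
from sklearn.decomposition import PCA
from scipy.stats import ttest_ind

rng = np.random.default_rng(3)
# 542 athletes; e.g. 40 markers x 3 axes x 101 time-normalized frames each.
X = rng.standard_normal((542, 40 * 3 * 101))
elite = rng.integers(0, 2, size=542).astype(bool)  # fake skill-level labels

# Extract principal movement patterns and compare PC scores between groups.
scores = PCA(n_components=5).fit_transform(X)
t, p = ttest_ind(scores[elite, 0], scores[~elite, 0])
print(f"PC1, elite vs recreational: t={t:.2f}, p={p:.3f}")
```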

Relevance: 100.00%

Abstract:

Video games have become one of the largest entertainment industries, and their power to capture the attention of players worldwide soon prompted the idea of using games to improve education. However, these educational games, commonly referred to as serious games, face different challenges when brought into the classroom, ranging from pragmatic issues (e.g. a high development cost) to deeper educational issues, including a lack of understanding of how the students interact with the games and how the learning process actually occurs. This chapter explores the potential of data-driven approaches to improve the practical applicability of serious games. Existing work done by the entertainment and learning industries helps to build a conceptual model of the tasks required to analyze player interactions in serious games (gaming learning analytics or GLA). The chapter also describes the main ongoing initiatives to create reference GLA infrastructures and their connection to new emerging specifications from the educational technology field. Finally, it explores how this data-driven GLA will help in the development of a new generation of more effective educational games and new business models that will support their expansion. This results in additional ethical implications, which are discussed at the end of the chapter.

Relevance: 100.00%

Abstract:

Microsecond-long Molecular Dynamics (MD) trajectories of biomolecular processes are now possible due to advances in computer technology. Soon, trajectories long enough to probe dynamics over many milliseconds will become available. Since these timescales match the physiological timescales over which many small proteins fold, all-atom MD simulations of protein folding are now becoming popular. To distill features of such large folding trajectories, we must develop methods that can compress trajectory data to enable visualization and that lend themselves to further analysis, such as finding collective coordinates and reducing the dynamics. Conventionally, clustering has been the most popular MD trajectory analysis technique, followed by principal component analysis (PCA). Simple clustering as used in MD trajectory analysis suffers from serious drawbacks: (i) it is not data-driven, (ii) it is unstable to noise and to changes in cutoff parameters, and (iii) because it does not take into account interrelationships among data points, the separation of data into clusters can often be artificial. Usually, partitions generated by clustering techniques are validated visually, but such validation is not possible for MD trajectories of protein folding, as the underlying structural transitions are not well understood. Rigorous cluster validation techniques may be adapted, but it is more crucial to reduce the dimensions in which MD trajectories reside while still preserving their salient features. PCA has often been used for dimension reduction, and while it is computationally inexpensive, being a linear method it does not achieve good data compression. In this thesis, I propose a different method, a nonmetric multidimensional scaling (nMDS) technique, which achieves superior data compression by virtue of being nonlinear and also provides clear insight into the structural processes underlying MD trajectories. I illustrate the capabilities of nMDS by analyzing three complete villin headpiece folding trajectories and six norleucine mutant (NLE) folding trajectories simulated by Freddolino and Schulten [1]. Using these trajectories, I compare nMDS with PCA and clustering to demonstrate the superiority of nMDS. The three villin headpiece trajectories showed great structural heterogeneity. Apart from a few trivial features such as the early formation of secondary structure, no commonalities between trajectories were found. No units of residues or atoms were found to move in concert across the trajectories. A flipping transition, corresponding to the flipping of helix 1 relative to the plane formed by helices 2 and 3, was observed towards the end of the folding process in all trajectories, when nearly all native contacts had been formed. However, the transition occurred through a different series of steps in each trajectory, indicating that it may not be a common transition in villin folding. The trajectories showed competition between local structure formation/hydrophobic collapse and global structure formation. Our analysis of the NLE trajectories confirms the notion that a tight hydrophobic core inhibits correct 3-D rearrangement. Only one of the six NLE trajectories folded, and it showed no flipping transition; all the others became trapped in hydrophobically collapsed states. The NLE residues were found to be buried more deeply in the core than the corresponding lysines in the villin headpiece, making the core tighter and harder to undo for 3-D rearrangement. Our results suggest that NLE may not be as fast a folder as experiments suggest. The tightness of the hydrophobic core may be a very important factor in the folding of larger proteins. It is likely that chaperones like GroEL act to undo the tight hydrophobic core of proteins after most secondary structure elements have been formed, so that global rearrangement is easier. I conclude by presenting facts about chaperone-protein complexes and propose further directions for the study of protein folding.
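
A minimal sketch of the nMDS idea, assuming pairwise structural distances (e.g. frame-to-frame RMSDs) are available as a precomputed matrix; scikit-learn's nonmetric MDS stands in here for the thesis's implementation, and the data are random placeholders.

```python
import numpy as np
from sklearn.manifold import MDS

rng = np.random.default_rng(4)
frames = rng.standard_normal((200, 30))  # placeholder structure descriptors

# Pairwise "RMSD"-like distance matrix between all frames.
d = np.linalg.norm(frames[:, None] - frames[None], axis=-1)

# Nonmetric MDS preserves only the rank order of the distances, which is
# what lets it achieve nonlinear compression into a low-dimensional map.
nmds = MDS(n_components=2, metric=False, dissimilarity="precomputed",
           random_state=0)
embedding = nmds.fit_transform(d)
print(embedding.shape, round(float(nmds.stress_), 4))
```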

Relevance: 100.00%

Abstract:

This study positioned the federal No Child Left Behind (NCLB) Act of 2002 as a reified colonizing entity, inscribing its hegemonic authority upon the professional identity and work of school principals within their school communities of practice. Pressure on educators and students intensifies each year as the benchmark for Adequate Yearly Progress under the NCLB policy is raised, resulting in standards-based reform, scripted curriculum and pedagogy, the absence of elective subjects, and a general lack of the autonomy critical to the work of teachers as they approach each unique class and student (Crocco & Costigan, 2007; Mabry & Margolis, 2006). Emphasis on high-stakes standardized testing as the indicator of student achievement (Popham, 2005) affects educators' professional identity through dramatic pedagogical and structural changes in schools (Day, Flores, & Viana, 2007). These dramatic changes to the ways our nation conducts schooling must be understood and thought about critically from school leaders' perspectives, as their professional identity is influenced by large-scale NCLB school reform. The author explored the impact No Child Left Behind reform had on the professional identity of fourteen veteran Illinois principals leading urban, small urban, suburban, and rural middle and elementary schools. Qualitative data were collected during semi-structured interviews and focus groups and analyzed using a dual theoretical framework of postcolonial and identity theories. Postcolonial theory provided a lens through which the author applied a metaphor of colonization to principals' experiences as colonized-colonizers in a time of school reform. Principal interview data illustrated many examples of NCLB as a colonizing authority having a significant impact on the professional identity of school leaders. This framework was used to interpret data in a unique and alternative way and contributed to the need to better understand the ways school leaders respond to district-level, state-level, and national-level accountability policies (Sloan, 2000). Identity theory situated principals as professionals shaped by the communities of practice in which they lead. Principals' professional identity has become more data-driven as a result of NCLB, and their role as instructional leaders has intensified. The data showed that NCLB has changed the work and professional identity of principals in terms of data use, classroom instruction, Response to Intervention, and staffing. Although NCLB defines success in terms of meeting or exceeding the benchmark for Adequate Yearly Progress, principals view AYP as only one measure of their success. The need to meet the benchmark for AYP is a present reality that necessitates school-wide attention to reading and math achievement. At this time, principals leading affluent, somewhat homogeneous schools typically experience less pressure and more power under NCLB and are more often labeled "successful" school communities. In contrast, principals leading schools with more heterogeneity experience more pressure and a lack of power under NCLB and are more often labeled "failing" school communities. Implications from this study for practitioners and policymakers include a need to reexamine the intents and outcomes of the policy for all school communities, especially in terms of power and voice. Recommendations for policy reform include moving to a growth model with multi-year assessments that make sense for individual students, rather than a single standardized test score, as the measure of achievement. Overall, the study reveals enhancements and constraints NCLB policy has caused in a variety of school contexts, which have affected the professional identity of school leaders.

Relevance: 100.00%

Abstract:

Background: A high level of adherence is required to achieve the desired outcomes of antiretroviral therapy. There is a paucity of information about adherence to combined antiretroviral therapy in Bayelsa State in southern Nigeria. Objectives: The objectives of the study were to determine the level of adherence to combined antiretroviral therapy among patients, evaluate the improvement in their immune status, and identify reasons for sub-optimal adherence to therapy. Methods: The cross-sectional study involved administration of an adapted and pretested questionnaire to 601 consenting patients attending the two tertiary health institutions in Bayelsa State, Nigeria: the Federal Medical Centre, Yenagoa, and the Niger Delta University Teaching Hospital, Okolobiri. The tool was divided into sections covering socio-demographic data, HIV knowledge, and adherence to combined antiretroviral therapy. Information on each patient's CD4+ T-cell count was retrieved from medical records. Adherence was assessed by asking patients to recall their intake of prescribed doses over the previous fourteen days; subjects who had taken 95-100% of the prescribed antiretroviral drugs were considered adherent. Results: Three hundred and forty-eight (57.9%) of the subjects were female and 253 (42.1%) were male. The majority, 557 (92.7%), had good knowledge of HIV and combined antiretroviral therapy, with a score of 70.0% or above. A larger proportion of the respondents, 441 (73.4%), had 95% or higher adherence. The most important reasons given for missing doses included "simply forgot" (147; 24.5%) and "wanted to avoid the side-effects of drugs" (33; 5.5%). There were remarkable improvements in the immune status of the subjects, with the proportion of subjects with CD4+ T-cell counts greater than 350 cells/mm³ increasing from 33 (5.5%) at therapy initiation to 338 (56.3%) during the study period (p<0.0001). Conclusion: The adherence level of 73.4% was low, which calls for intervention and improvement. Combined antiretroviral therapy has significantly improved the immune status of the majority of patients, which must be sustained. "Simply forgot" was the most important reason for missing doses.
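
As a worked example of the adherence criterion described above (a hypothetical helper, not the study's instrument): a patient recalling at least 95% of prescribed doses over the 14-day window is classed as adherent.

```python
def is_adherent(doses_taken, doses_prescribed):
    """95-100% of prescribed doses taken over the recall window = adherent."""
    return doses_taken / doses_prescribed >= 0.95

# e.g. a once-daily regimen over the 14-day recall window:
print(is_adherent(14, 14))  # True
print(is_adherent(13, 14))  # False: 13/14 is about 92.9%
```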

Relevance: 100.00%

Abstract:

With the development of variable-data-driven digital presses, where each document printed is potentially unique, there is a need for pre-press optimization to identify material that is invariant from document to document. In this way, rasterisation can be confined solely to those areas that change between successive documents, thereby alleviating a potential performance bottleneck. Given a template document specified in terms of layout functions, where actual data is bound at the last possible moment before printing, we look at deriving and exploiting the invariant properties of layout functions from their formal specifications. We propose future work on the generic extraction of invariance from such properties for certain classes of layout functions.
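
As a toy sketch of the optimisation described above: given several instances of a templated document, regions whose rendered content never changes can be rasterised once and reused, leaving only the variable regions to re-rasterise per document. The region names and the dict-based document model are invented for illustration.

```python
def invariant_regions(instances):
    """Return the names of regions whose content is identical in every instance."""
    first, *rest = instances
    return {region for region, content in first.items()
            if all(doc.get(region) == content for doc in rest)}

# Three instances of one template; only the body varies.
docs = [
    {"header": "ACME Corp", "body": "Dear Ms Lee...",  "footer": "Page 1"},
    {"header": "ACME Corp", "body": "Dear Mr Ade...",  "footer": "Page 1"},
    {"header": "ACME Corp", "body": "Dear Dr Chen...", "footer": "Page 1"},
]
print(invariant_regions(docs))  # {'header', 'footer'}: rasterise once, reuse
```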

Relevance: 100.00%

Abstract:

This work is aimed at understanding and unifying information on epidemiological modelling methods and how those methods relate to public policy addressing human health, specifically in the context of infectious disease prevention, pandemic planning, and health behaviour change. The thesis employs multiple qualitative and quantitative methods and is presented as a manuscript of several individual, data-driven projects combined in a narrative arc. The first chapter introduces the scope and complexity of this interdisciplinary undertaking, describing several topical intersections of importance. The second chapter begins the presentation of original data, describing in detail two exercises in computational epidemiological modelling pertinent to pandemic influenza planning and policy. The next chapter presents additional original data on how public confidence in modelling methodology may affect planned health behaviour change as recommended in public health policy. The final data-driven chapter describes how health policymakers use modelling methods and scientific evidence to inform and construct health policies for the prevention of infectious diseases. The thesis concludes with a narrative chapter that evaluates the breadth of these data and recommends strategies for the optimal use of modelling methodologies when informing public health policy in applied public health scenarios.