901 results for Data dissemination and sharing
Abstract:
The integration of mathematics and science in secondary schools in the 21st century continues to be an important topic of practice and research. The purpose of my research study, which builds on studies by Frykholm and Glasson (2005) and Berlin and White (2010), is to explore the potential constraints and benefits of integrating mathematics and science in Ontario secondary schools based on the perspectives of in-service and pre-service teachers with various math and/or science backgrounds. A qualitative and quantitative research design with an exploratory approach was used. The qualitative data were collected from a sample of 12 in-service teachers with various math and/or science backgrounds recruited from two school boards in Eastern Ontario. The quantitative and some qualitative data were collected from a sample of 81 pre-service teachers from the Queen’s University Bachelor of Education (B.Ed) program. Semi-structured interviews were conducted with the in-service teachers, while a survey and a focus group were conducted with the pre-service teachers. Once the data were collected, the qualitative data were abductively analyzed. For the quantitative data, descriptive and inferential statistics (one-way ANOVAs and Pearson chi-square analyses) were calculated to examine the perspectives of teachers regardless of teaching background and to compare groups of teachers based on teaching background. The findings of this study suggest that in-service and pre-service teachers have a positive attitude towards the integration of math and science and view it as valuable to student learning and success. The pre-service teachers viewed the integration as easy and did not express concerns about it. The in-service teachers, on the other hand, highlighted concerns and challenges such as resources, scheduling, and time constraints. My results illustrate when teachers perceive it is valuable to integrate math and science and which aspects of the classroom benefit most from the integration. Furthermore, the results highlight barriers and possible solutions to improve the integration of math and science. In addition to the benefits and constraints of integration, my results illustrate why some teachers may opt out of integrating math and science and the different strategies teachers have incorporated to integrate math and science in their classrooms.
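For readers less familiar with the inferential tests named above, the following minimal sketch (using scipy and entirely hypothetical placeholder data, not the study's responses) shows how a one-way ANOVA and a Pearson chi-square test of group differences can be run:

```python
# Illustrative sketch only: hypothetical Likert-style ratings and yes/no counts
# for three teacher-background groups; values are placeholders, not study data.
from scipy import stats

math_bg = [5, 4, 4, 5, 3, 4]
science_bg = [4, 4, 5, 5, 4, 3]
both_bg = [5, 5, 4, 4, 5, 4]

# One-way ANOVA: do mean ratings differ across the three backgrounds?
f_stat, p_anova = stats.f_oneway(math_bg, science_bg, both_bg)

# Pearson chi-square: is a yes/no response independent of background?
# Rows = backgrounds, columns = counts of "yes" / "no" (hypothetical counts).
contingency = [[10, 2],
               [8, 4],
               [12, 1]]
chi2, p_chi2, dof, expected = stats.chi2_contingency(contingency)

print(f"ANOVA: F={f_stat:.2f}, p={p_anova:.3f}")
print(f"Chi-square: chi2={chi2:.2f}, p={p_chi2:.3f}, dof={dof}")
```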
Abstract:
In the last several years there has been an increase in the amount of qualitative research using in-depth interviews and comprehensive content analyses in sport psychology. However, no explicit method has been provided to deal with the large amount of unstructured data. This article provides common guidelines for organizing and interpreting unstructured data. Two main operations are suggested and discussed: first, coding meaningful text segments, or creating tags, and second, regrouping similar text segments, or creating categories. Furthermore, software programs for the microcomputer are presented as a way to facilitate the organization and interpretation of qualitative data.
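A minimal, hypothetical sketch of the two operations described above (coding text segments with tags, then regrouping tags into categories); the segments, tags and category names are invented for illustration only:

```python
# Step 1: coded segments -> tags (in practice assigned by the researcher).
# Step 2: regroup tags into broader categories and collect their segments.
from collections import defaultdict

coded_segments = [
    ("I felt confident before the race", "confidence"),
    ("My coach helped me stay calm", "coach support"),
    ("Teammates kept me focused", "teammate support"),
]

tag_to_category = {
    "confidence": "individual factors",
    "coach support": "social factors",
    "teammate support": "social factors",
}

categories = defaultdict(list)
for segment, tag in coded_segments:
    categories[tag_to_category[tag]].append((tag, segment))

for category, items in categories.items():
    print(category, "->", items)
```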
Abstract:
The astonishing development of diverse hardware platforms is twofold: on one side, the challenge of exascale performance for big data processing and management; on the other, mobile and embedded devices for data collection and human-machine interaction. This has driven a highly hierarchical evolution of programming models. GVirtuS is a general virtualization system developed in 2009 and first introduced in 2010, enabling a completely transparent layer between GPUs and VMs. This paper shows the latest achievements and developments of GVirtuS, now supporting CUDA 6.5, memory management and scheduling. Thanks to the new and improved remoting capabilities, GVirtuS now enables GPU sharing among physical and virtual machines based on x86 and ARM CPUs on local workstations, computing clusters and distributed cloud appliances.
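GVirtuS itself is a C/C++ split-driver (front-end/back-end) system; the deliberately simplified, hypothetical sketch below only illustrates the general remoting idea behind such a transparent layer, namely serializing a GPU-related call in the guest and executing it where the physical GPU resides. It is not the GVirtuS API, and the message format and helper names are assumptions.

```python
# Conceptual remoting sketch (hypothetical): conn is an already-established
# socket-like connection between the guest (front end) and the GPU host (back end).
import json

def _recv_exact(conn, n):
    """Read exactly n bytes from the connection."""
    buf = b""
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf += chunk
    return buf

def frontend_call(conn, routine, args):
    """Guest-side stub: forward a GPU routine invocation to the back end."""
    request = json.dumps({"routine": routine, "args": args}).encode()
    conn.sendall(len(request).to_bytes(4, "big") + request)
    size = int.from_bytes(_recv_exact(conn, 4), "big")
    return json.loads(_recv_exact(conn, size).decode())["result"]

def backend_serve_one(conn, handlers):
    """Host-side dispatcher: execute the call next to the physical GPU and reply."""
    size = int.from_bytes(_recv_exact(conn, 4), "big")
    request = json.loads(_recv_exact(conn, size).decode())
    result = handlers[request["routine"]](*request["args"])
    response = json.dumps({"result": result}).encode()
    conn.sendall(len(response).to_bytes(4, "big") + response)
```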
Abstract:
The generation of heterogeneous big data sources with ever-increasing volumes, velocities and veracities over the last few years has inspired the data science and research community to address the challenge of extracting knowledge from big data. Such a wealth of generated data across the board can be intelligently exploited to advance our knowledge about our environment, public health, critical infrastructure and security. In recent years we have developed generic approaches to process such big data at multiple levels for advancing decision support. This specifically concerns data processing with semantic harmonisation, low-level fusion, analytics, knowledge modelling with high-level fusion, and reasoning. Such approaches will be introduced and presented in the context of the TRIDEC project results on critical oil and gas industry drilling operations, and also of the large, ongoing eVacuate project on critical crowd behaviour detection in confined spaces.
Abstract:
Thesis (Ph.D.)--University of Washington, 2016-08
Abstract:
The last decades have been characterized by a continuous adoption of IT solutions in the healthcare sector, which has resulted in the proliferation of tremendous amounts of data over heterogeneous systems. Distinct data types are currently generated, manipulated, and stored in the several institutions where patients are treated. Data sharing and integrated access to this information will allow the extraction of relevant knowledge that can lead to better diagnostics and treatments. This thesis proposes new integration models for gathering information and extracting knowledge from multiple and heterogeneous biomedical sources. The complexity of the scenario led us to split the integration problem according to the data type and to the usage specificity. The first contribution is a cloud-based architecture for exchanging medical imaging services. It offers a simplified registration mechanism for providers and services, promotes remote data access, and facilitates the integration of distributed data sources. Moreover, it is compliant with international standards, ensuring the platform's interoperability with current medical imaging devices. The second proposal is a sensor-based architecture for the integration of electronic health records. It follows a federated integration model and aims to provide a scalable solution to search and retrieve data from multiple information systems. The last contribution is an open architecture for gathering patient-level data from dispersed and heterogeneous databases. All the proposed solutions were deployed and validated in real-world use cases.
Abstract:
This dissertation contains four essays that all share a common purpose: developing new methodologies to exploit the potential of high-frequency data for the measurement, modeling and forecasting of financial asset volatility and correlations. The first two chapters provide useful tools for univariate applications while the last two chapters develop multivariate methodologies. In chapter 1, we introduce a new class of univariate volatility models named FloGARCH models. FloGARCH models provide a parsimonious joint model for low-frequency returns and realized measures, and are sufficiently flexible to capture long memory as well as asymmetries related to leverage effects. We analyze the performance of the models in a realistic numerical study and on the basis of a data set composed of 65 equities. Using more than 10 years of high-frequency transactions, we document significant statistical gains related to the FloGARCH models in terms of in-sample fit, out-of-sample fit and forecasting accuracy compared to classical and Realized GARCH models. In chapter 2, using 12 years of high-frequency transactions for 55 U.S. stocks, we argue that combining low-frequency exogenous economic indicators with high-frequency financial data improves the ability of conditionally heteroskedastic models to forecast the volatility of returns, their full multi-step-ahead conditional distribution and the multi-period Value-at-Risk. Using a refined version of the Realized LGARCH model allowing for a time-varying intercept and implemented with realized kernels, we document that nominal corporate profits and term spreads have strong long-run predictive ability and generate accurate risk-measure forecasts over long horizons. The results are based on several loss functions and tests, including the Model Confidence Set. Chapter 3 is a joint work with David Veredas. We study the class of disentangled realized estimators for the integrated covariance matrix of Brownian semimartingales with finite-activity jumps. These estimators separate correlations and volatilities. We analyze different combinations of quantile- and median-based realized volatilities, and four estimators of realized correlations with three synchronization schemes. Their finite-sample properties are studied under four data generating processes, in the presence or absence of microstructure noise, and under synchronous and asynchronous trading. The main finding is that the pre-averaged version of disentangled estimators based on Gaussian ranks (for the correlations) and median deviations (for the volatilities) provides a precise, computationally efficient, and easy alternative to measure integrated covariances on the basis of noisy and asynchronous prices. Along these lines, a minimum variance portfolio application shows the superiority of this disentangled realized estimator in terms of numerous performance metrics. Chapter 4 is co-authored with Niels S. Hansen, Asger Lunde and Kasper V. Olesen, all affiliated with CREATES at Aarhus University. We propose to use the Realized Beta GARCH model to exploit the potential of high-frequency data in commodity markets. The model produces high-quality forecasts of pairwise correlations between commodities which can be used to construct a composite covariance matrix. We evaluate the quality of this matrix in a portfolio context and compare it to models used in the industry. We demonstrate significant economic gains in a realistic setting including short-selling constraints and transaction costs.
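For orientation, the baseline log-linear Realized GARCH structure (Hansen, Huang and Shek, 2012) on which this family of joint return/realized-measure models builds can be written as below; FloGARCH and the time-varying-intercept Realized LGARCH of chapters 1 and 2 are refinements of this type of specification, and their exact equations are given in the dissertation itself.

```latex
\begin{align}
  r_t &= \sqrt{h_t}\, z_t, \qquad z_t \sim \text{i.i.d.}(0,1), \\
  \log h_t &= \omega + \beta \log h_{t-1} + \gamma \log x_{t-1}, \\
  \log x_t &= \xi + \varphi \log h_t + \tau(z_t) + u_t, \qquad
  \tau(z) = \tau_1 z + \tau_2\,(z^2 - 1),
\end{align}
```

where $h_t$ is the conditional variance, $r_t$ the low-frequency return and $x_t$ a realized measure built from high-frequency data.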
Abstract:
In 2005, the University of Maryland acquired over 70 digital videos spanning 35 years of Jim Henson’s groundbreaking work in television and film. To support in-house discovery and use, the collection was cataloged in detail using AACR2 and MARC21, and a web-based finding aid was also created. In the past year, I created an "r-ball" (a linked data set described using RDA) of these same resources. The presentation will compare and contrast these three ways of accessing the Jim Henson Works collection, with insights gleaned from providing resource discovery using RIMMF (RDA in Many Metadata Formats).
Abstract:
In today’s big data world, data is being produced in massive volumes, at great velocity and from a variety of different sources such as mobile devices, sensors, a plethora of small devices hooked to the internet (Internet of Things), social networks, communication networks and many others. Interactive querying and large-scale analytics are being increasingly used to derive value out of this big data. A large portion of this data is being stored and processed in the Cloud due to the several advantages provided by the Cloud such as scalability, elasticity, availability, low cost of ownership and the overall economies of scale. There is thus a growing need for large-scale cloud-based data management systems that can support real-time ingest, storage and processing of large volumes of heterogeneous data. However, in the pay-as-you-go Cloud environment, the cost of analytics can grow linearly with the time and resources required. Reducing the cost of data analytics in the Cloud thus remains a primary challenge. In my dissertation research, I have focused on building efficient and cost-effective cloud-based data management systems for different application domains that are predominant in cloud computing environments. In the first part of my dissertation, I address the problem of reducing the cost of transactional workloads on relational databases to support database-as-a-service in the Cloud. The primary challenges in supporting such workloads include choosing how to partition the data across a large number of machines, minimizing the number of distributed transactions, providing high data availability, and tolerating failures gracefully. I have designed, built and evaluated SWORD, an end-to-end scalable online transaction processing system that utilizes workload-aware data placement and replication to minimize the number of distributed transactions, and that incorporates a suite of novel techniques to significantly reduce the overheads incurred both during the initial placement of data and during query execution at runtime. In the second part of my dissertation, I focus on sampling-based progressive analytics as a means to reduce the cost of data analytics in the relational domain. Sampling has been traditionally used by data scientists to get progressive answers to complex analytical tasks over large volumes of data. Typically, this involves manually extracting samples of increasing data size (progressive samples) for exploratory querying. This provides the data scientists with user control, repeatable semantics, and result provenance. However, such solutions result in tedious workflows that preclude the reuse of work across samples. On the other hand, existing approximate query processing systems report early results, but do not offer the above benefits for complex ad-hoc queries. I propose a new progressive data-parallel computation framework, NOW!, which provides support for progressive analytics over big data. In particular, NOW! enables progressive relational (SQL) query support in the Cloud using unique progress semantics that allow efficient and deterministic query processing over samples, providing meaningful early results and provenance to data scientists. NOW! enables the provision of early results using significantly fewer resources, thereby enabling a substantial reduction in the cost incurred during such analytics. Finally, I propose NSCALE, a system for efficient and cost-effective complex analytics on large-scale graph-structured data in the Cloud.
The system is based on the key observation that a wide range of complex analysis tasks over graph data require processing and reasoning about a large number of multi-hop neighborhoods or subgraphs in the graph; examples include ego network analysis, motif counting in biological networks, finding social circles in social networks, personalized recommendations, link prediction, etc. These tasks are not well served by existing vertex-centric graph processing frameworks, whose computation and execution models limit the user program to directly accessing the state of a single vertex, resulting in high execution overheads. Further, the lack of support for extracting the relevant portions of the graph that are of interest to an analysis task and loading them into distributed memory leads to poor scalability. NSCALE allows users to write programs at the level of neighborhoods or subgraphs rather than at the level of vertices, and to declaratively specify the subgraphs of interest. It enables the efficient distributed execution of these neighborhood-centric complex analysis tasks over large-scale graphs, while minimizing resource consumption and communication cost, thereby substantially reducing the overall cost of graph data analytics in the Cloud. The results of our extensive experimental evaluation of these prototypes with several real-world data sets and applications validate the effectiveness of our techniques, which provide orders-of-magnitude reductions in the overheads of distributed data querying and analysis in the Cloud.
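As a toy illustration of the neighborhood-centric view (not the NSCALE API), the sketch below materializes multi-hop ego networks with networkx and runs a trivial per-subgraph computation; the example graph and the 2-hop radius are arbitrary choices:

```python
# Illustrative sketch: extract the multi-hop neighborhoods that neighborhood-
# centric tasks such as ego network analysis operate on, using networkx.
import networkx as nx

G = nx.karate_club_graph()  # small example graph

# Materialize the 2-hop neighborhood (ego network) around every vertex;
# a neighborhood-centric framework lets the user program run on each subgraph.
ego_subgraphs = {v: nx.ego_graph(G, v, radius=2) for v in G.nodes}

# Example "user program" at the subgraph level: simple per-neighborhood stats.
for v, sub in list(ego_subgraphs.items())[:3]:
    print(v, sub.number_of_nodes(), sub.number_of_edges())
```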
Abstract:
The business environment context points to the necessity of new forms of management for the sustainable competitiveness of organizations through time. Coopetition is characterized as an alternative in the interaction of different actors, which compete and cooperate simultaneously in the pursuit of common goals. This dual relation, within a gain-increasing perspective, converts competitors into partners and fosters competitiveness, especially that of organizations within a specific sector. The field of competitive intelligence has, in its turn, assisted organizations, individually, in the systematization of information valuable to decision-making processes, which benefits competitiveness. It follows that it is possible to combine coopetition and competitive intelligence in a systematized process of sectorial intelligence for coopetitive relations. The general aim of this study is, therefore, to put forth a model of sectorial coopetitive intelligence. The methodological outline of the study is characterized as a mixed approach (quantitative and qualitative methods), of an applied nature, with exploratory and descriptive aims. The Coordination of the Strategic Roadmapping Project for the Future of Paraná's Industry is the selected object of investigation. Protocols were designed to collect primary and secondary data. In the collection of the primary data, online questionnaires were sent to the sectors selected for examination. A total of 149 answers to the online questionnaires were obtained, and interviews were performed with all members of the technical team of the Coordination, for a total of five interviewees. After the collection, all the data were tabulated, analyzed and validated by means of focus groups with the same five members of the Coordination technical team, and interviews were performed with a representative of each of the four sectors selected, for a total of nine participants in the validation. The results allowed the systematization of a sectorial coopetitive intelligence model called ICoops. This model is characterized by five stages, namely planning, collection, analysis, project development, dissemination and evaluation. Each stage is detailed in inputs, activities and outputs. The results suggest that sectorial coopetition is motivated mainly by knowledge sharing, technological development, investment in R&D, innovation, chain integration and resource complementation. The importance of a neutral institution has been recognized as a facilitator and incentive to the approximation of organizations. Among the main difficulties are the financing of the projects, the adhesion of new members, the lack of tools for the analysis of information and the dissemination of the actions.
Abstract:
With the prevalence of smartphones, new ways of engaging citizens and stakeholders in urban planning and governance are emerging. The technologies in smartphones allow citizens to act as sensors of their environment, producing and sharing rich spatial data useful for new types of collaborative governance set-ups. Data derived from Volunteered Geographic Information (VGI) can support accessible, transparent, democratic, inclusive, and locally-based governance situations of interest to planners, citizens, politicians, and scientists. However, there are still uncertainties about how to actually conduct this in practice. This study explores how social media VGI can be used to document spatial tendencies regarding citizens’ uses and perceptions of urban nature with relevance for urban green space governance. Via the hashtag #sharingcph, created by the City of Copenhagen in 2014, VGI data consisting of geo-referenced images were collected from Instagram, categorised according to their content and analysed according to their spatial distribution patterns. The results show specific spatial distributions of the images and main hotspots. There are many possibilities and much potential in using VGI for generating, sharing, visualising and communicating knowledge about citizens’ spatial uses and preferences, but as a tool to support scientific and democratic interaction, VGI data is challenged by practical, technical and ethical concerns. More research is needed in order to better understand the usefulness and application of this rich data source to governance.
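A hypothetical sketch of the kind of spatial aggregation described above: binning geo-referenced image points into grid cells and ranking cells to surface hotspots. The input file, column names and cell size are assumptions for illustration, not the study's actual data or workflow.

```python
# Each row of the (hypothetical) CSV: one geotagged #sharingcph image,
# with columns lat, lon, category.
import pandas as pd

df = pd.read_csv("sharingcph_images.csv")

cell = 0.005  # grid cell size in degrees (roughly 500 m at this latitude)
df["cell_lat"] = (df["lat"] // cell) * cell
df["cell_lon"] = (df["lon"] // cell) * cell

# Count images per grid cell and content category; the densest cells are hotspots.
hotspots = (df.groupby(["cell_lat", "cell_lon", "category"])
              .size()
              .reset_index(name="n_images")
              .sort_values("n_images", ascending=False))
print(hotspots.head(10))
```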
Abstract:
We present new methodologies to generate rational function approximations of broadband electromagnetic responses of linear and passive networks of high-speed interconnects, and to construct SPICE-compatible, equivalent circuit representations of the generated rational functions. These new methodologies are driven by the desire to improve the computational efficiency of the rational function fitting process, and to ensure enhanced accuracy of the generated rational function interpolation and its equivalent circuit representation. Toward this goal, we propose two new methodologies for rational function approximation of high-speed interconnect network responses. The first one relies on the use of both time-domain and frequency-domain data, obtained either through measurement or numerical simulation, to generate a rational function representation that extrapolates the input, early-time transient response data to late-time response while at the same time providing a means to both interpolate and extrapolate the used frequency-domain data. This hybrid methodology can be considered a generalization of frequency-domain rational function fitting, which utilizes frequency-domain response data only, and of time-domain rational function fitting, which utilizes transient response data only. In this context, a guideline is proposed for estimating the order of the rational function approximation from transient data. The availability of such an estimate expedites the time-domain rational function fitting process. The second approach relies on the extraction of the delay associated with causal electromagnetic responses of interconnect systems to provide a more stable rational function fitting process utilizing a lower-order rational function interpolation. A distinctive feature of the proposed methodology is its utilization of scattering parameters. For both methodologies, the approach of fitting the electromagnetic network matrix one element at a time is applied. It is shown that, with regard to the computational cost of the rational function fitting process, such an element-by-element rational function fitting is more advantageous than full-matrix fitting for systems with a large number of ports. Despite the disadvantage that different sets of poles are used in the rational functions of different elements in the network matrix, such an approach provides improved accuracy in the fitting of network matrices of systems characterized by both strongly coupled and weakly coupled ports. Finally, in order to provide a means for enforcing passivity in the adopted element-by-element rational function fitting approach, the methodology for passivity enforcement via quadratic programming is modified appropriately for this purpose and demonstrated in the context of element-by-element rational function fitting of the admittance matrix of an electromagnetic multiport.
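For context, rational function fitting of a broadband network response $H(s)$ typically targets the generic pole-residue form below; the orders, delay terms and passivity constraints are problem-specific and handled as described above.

```latex
\begin{equation}
  H(s) \;\approx\; \sum_{k=1}^{N} \frac{r_k}{s - p_k} \;+\; d \;+\; s\,e,
  \qquad \operatorname{Re}(p_k) < 0,
\end{equation}
```

with poles $p_k$ appearing in complex-conjugate pairs so that the impulse response is real-valued; each pole-residue term can then be synthesized as a small equivalent-circuit branch in the SPICE-compatible representation.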
Abstract:
In order to optimize frontal detection in sea surface temperature fields at 4 km resolution, a combined statistical and expert-based approach is applied to test different spatial smoothing of the data prior to the detection process. Fronts are usually detected at 1 km resolution using the histogram-based, single image edge detection (SIED) algorithm developed by Cayula and Cornillon in 1992, with a standard preliminary smoothing using a median filter and a 3 × 3 pixel kernel. Here, detections are performed in three study regions (off Morocco, the Mozambique Channel, and north-western Australia) and across the Indian Ocean basin using the combination of multiple windows (CMW) method developed by Nieto, Demarcq and McClatchie in 2012, which improves on the original Cayula and Cornillon algorithm. Detections at 4 km and 1 km resolution are compared. Fronts are divided into two intensity classes (“weak” and “strong”) according to their thermal gradient. A preliminary smoothing is applied prior to the detection using different convolutions: three types of filters (median, average and Gaussian) combined with four kernel sizes (3 × 3, 5 × 5, 7 × 7, and 9 × 9 pixels) and three detection window sizes (16 × 16, 24 × 24 and 32 × 32 pixels) to test the effect of these smoothing combinations on reducing the background noise of the data and therefore on improving the frontal detection. The performance of the combinations on 4 km data is evaluated using two criteria: detection efficiency and front length. We find that the optimal combination of preliminary smoothing parameters in enhancing detection efficiency and preserving front length includes a median filter, a 16 × 16 pixel window size, and a 5 × 5 pixel kernel for strong fronts and a 7 × 7 pixel kernel for weak fronts. Results show an improvement in detection performance (from largest to smallest window size) of 71% for strong fronts and 120% for weak fronts. Despite the small window used (16 × 16 pixels), the length of the fronts has been preserved relative to that found with 1 km data. This optimal preliminary smoothing and the CMW detection algorithm on 4 km sea surface temperature data are then used to describe the spatial distribution of the monthly frequencies of occurrence for both strong and weak fronts across the Indian Ocean basin. In general, strong fronts are observed in coastal areas whereas weak fronts, with some seasonal exceptions, are mainly located in the open ocean. This study shows that adequate noise reduction by a preliminary smoothing of the data considerably improves the frontal detection efficiency as well as the global quality of the results. Consequently, the use of 4 km data enables frontal detections similar to 1 km data (using a standard median 3 × 3 convolution) in terms of detectability, length and location. This method, using 4 km data, is easily applicable to large regions or at the global scale with far fewer constraints of data manipulation and processing time relative to 1 km data.
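As a simplified, hypothetical illustration of the preliminary smoothing step discussed above (not an implementation of the SIED or CMW algorithms), the sketch below median-filters a placeholder SST field with a 5 × 5 pixel kernel and flags strong thermal gradients as a crude front proxy; the field and thresholds are arbitrary:

```python
# Placeholder 4 km SST field; a real application would load satellite data.
import numpy as np
from scipy.ndimage import median_filter

sst = np.random.normal(20.0, 1.0, size=(512, 512))

# Preliminary smoothing: median filter with a 5 x 5 pixel kernel.
sst_smooth = median_filter(sst, size=5)

# Thermal gradient magnitude per pixel (degrees C per pixel).
gy, gx = np.gradient(sst_smooth)
grad = np.hypot(gx, gy)

# Crude split into "weak" and "strong" gradient pixels (illustrative thresholds).
weak_fronts = (grad > 0.2) & (grad <= 0.5)
strong_fronts = grad > 0.5
print(strong_fronts.sum(), weak_fronts.sum())
```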