921 results for Data Storage Solutions
Abstract:
High-density spatial and temporal sampling of EEG data enhances the quality of results of electrophysiological experiments. Because EEG sources typically produce widespread electric fields (see Chapter 3) and operate at frequencies well below the sampling rate, increasing the number of electrodes and time samples will not necessarily increase the number of observed processes, but mainly increases the accuracy with which these processes are represented. This is notably the case when inverse solutions are computed. As a consequence, increasing the sampling in space and time increases the redundancy of the data (in space, because electrodes are correlated due to volume conduction, and in time, because neighboring time points are correlated), while the degrees of freedom of the data change only slightly. This has to be taken into account when statistical inferences are to be made from the data. However, in many ERP studies, the intrinsic correlation structure of the data has been disregarded. Often, some electrodes or groups of electrodes are selected a priori as the analysis entity and treated as repeated (within-subject) measures that are analyzed using standard univariate statistics. The increased spatial resolution obtained with more electrodes is thus poorly represented by the resulting statistics. In addition, the assumptions made (e.g., in terms of what constitutes a repeated measure) are not supported by what we know about the properties of EEG data. From the point of view of physics (see Chapter 3), the natural "atomic" analysis entity of EEG and ERP data is the scalp electric field.
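As a rough, hypothetical illustration of the redundancy argument in the abstract above (not code from the chapter), the following Python sketch simulates channel data generated by a handful of underlying sources and estimates the effective degrees of freedom from the eigenvalues of the channel covariance matrix; the simulation parameters and the participation-ratio estimator are assumptions made for illustration only.

```python
# Hypothetical sketch: denser spatial sampling of a few underlying sources
# increases redundancy, while the effective degrees of freedom barely change.
import numpy as np

rng = np.random.default_rng(0)

def effective_dof(data):
    """Participation-ratio estimate of the effective degrees of freedom,
    computed from the eigenvalues of the channel covariance matrix."""
    eig = np.clip(np.linalg.eigvalsh(np.cov(data)), 0.0, None)
    return eig.sum() ** 2 / (eig ** 2).sum()

def simulate(n_channels, n_sources=5, n_samples=2000):
    """EEG-like toy data: every channel is a mixture of the same few sources
    (the mixing matrix stands in for volume conduction) plus a little noise."""
    sources = rng.standard_normal((n_sources, n_samples))
    mixing = rng.standard_normal((n_channels, n_sources))
    noise = 0.1 * rng.standard_normal((n_channels, n_samples))
    return mixing @ sources + noise

for n_channels in (16, 64, 256):
    print(n_channels, round(effective_dof(simulate(n_channels)), 1))
# The estimate stays near the number of sources instead of growing with the channel count.
```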
Abstract:
A combinatorial protocol (CP) is introduced here and interfaced with multiple linear regression (MLR) for variable selection. The efficiency of CP-MLR rests primarily on restricting the entry of correlated variables into the model development stage. It has been applied to the Selwood et al. data set [16], and the resulting models are compared with those reported by the GFA [8] and MUSEUM [9] approaches. For this data set, CP-MLR identified three highly independent models (27, 28 and 31) with Q2 values in the range of 0.632-0.518. These models are divergent and unique. Although the present study does not share any models with the GFA [8] and MUSEUM [9] results, several descriptors are common to all of these studies, including the present one. A simulation was also carried out on the same data set to explain model formation in CP-MLR. The results demonstrate that the proposed method should be able to offer solutions for data sets with 50 to 60 descriptors in a reasonable time frame. By carefully selecting the inter-parameter correlation cutoff values in CP-MLR, one can identify divergent models and handle data sets larger than the present one without excessive computer time.
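The following Python sketch is a minimal, hypothetical rendering of the core CP-MLR idea described above, namely restricting the entry of correlated descriptors into the regression step; the greedy search strategy, function names, cutoff value, and model size are illustrative assumptions and not the authors' implementation.

```python
# Hypothetical sketch: candidate descriptors are only admitted to the MLR step
# if their pairwise correlation with already-selected descriptors stays below
# an inter-parameter correlation cutoff.
import numpy as np

def select_descriptors(X, y, r_cutoff=0.5, max_terms=4):
    """Greedy forward selection with an inter-parameter correlation filter.

    X : (n_samples, n_descriptors) descriptor matrix
    y : (n_samples,) activity values
    """
    n_desc = X.shape[1]
    corr = np.corrcoef(X, rowvar=False)                       # descriptor-descriptor correlations
    target_corr = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(n_desc)])

    selected = []
    for j in np.argsort(-target_corr):                        # best-correlated descriptors first
        if len(selected) == max_terms:
            break
        if all(abs(corr[j, k]) < r_cutoff for k in selected): # correlation filter
            selected.append(j)

    A = np.column_stack([X[:, selected], np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)              # ordinary multiple linear regression
    return selected, coef
```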
Abstract:
This article provides a holistic legal analysis of the use of cookies in Online Behavioural Advertising. The current EU legislative framework is outlined in detail, and the legal obligations are examined. Consent and the debates surrounding its implementation form a large portion of the analysis. The article outlines the current difficulties associated with relying on this requirement as a condition for the placing and accessing of cookies. Alternatives to this approach are explored, and the implementation of solutions based on the Privacy by Design and Privacy by Default concepts is presented. This discussion involves an analysis of the use of code, and therefore product architecture, to ensure adequate protections.
Abstract:
Earth observations (EO) represent a growing and valuable resource for many scientific, research and practical applications carried out by users around the world. Access to EO data for some applications or activities, like climate change research or emergency response activities, is indispensable for their success. However, EO data, or products made from them, are often (or are claimed to be) subject to intellectual property law protection and are licensed under specific conditions regarding access and use. Restrictive conditions on data use can be prohibitive for further work with the data. The Global Earth Observation System of Systems (GEOSS) is an initiative led by the Group on Earth Observations (GEO) with the aim of providing coordinated, comprehensive, and sustained EO and information for making informed decisions in various areas beneficial to societies, their functioning and development. It seeks to share data with users worldwide with the fewest possible restrictions on their use by implementing the GEOSS Data Sharing Principles adopted by GEO. The Principles proclaim full and open exchange of data shared within GEOSS, while recognising relevant international instruments and national policies and legislation through which restrictions on the use of data may be imposed. The paper focuses on the issue of the legal interoperability of data that are shared with varying restrictions on use, with the aim of exploring the options for making data interoperable. The main question it addresses is whether the public domain or its equivalents represent the best mechanism to ensure legal interoperability of data. To this end, the paper analyses legal protection regimes and their norms applicable to EO data. Based on the findings, it highlights the existing public law statutory, regulatory, and policy approaches, as well as private law instruments, such as waivers, licenses and contracts, that may be used to place datasets in the public domain, or otherwise make them publicly available for use and re-use without restrictions. It uses GEOSS and its particular characteristics as a system to identify ways to reconcile the vast possibilities it provides through the sharing of data from various sources and jurisdictions on the one hand, and the restrictions on the use of the shared resources on the other. On a more general level, the paper seeks to draw attention to the obstacles and potential regulatory solutions for sharing factual or research data for purposes that go beyond research and education.
Abstract:
In this paper, we investigate content-centric data transmission in the context of short opportunistic contacts and base our work on an existing content-centric networking architecture. In the case of short interconnection times, file transfers may not be completed and the received information is discarded. Caches in content-centric networks are used for short-term storage and do not guarantee persistence. We implemented a mechanism that extends caching to persistent storage, enabling the completion of disrupted content transfers. The mechanism has been implemented in the CCNx framework and evaluated on wireless mesh nodes. Our evaluations using multicast and unicast communication show that the implementation can support content transfers in opportunistic environments without significant processing and storage overhead.
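The CCNx-based mechanism itself is not reproduced here; the Python sketch below only illustrates the general idea of spilling received content segments to persistent storage so that a transfer interrupted by a short contact can be resumed at the next opportunity. All class, method, and file names are hypothetical.

```python
# Hypothetical sketch: segments received during a short contact are written to
# disk, the missing ones are requested at the next contact, and the content is
# reassembled once complete. Not the CCNx implementation.
from pathlib import Path

class PersistentSegmentStore:
    def __init__(self, root="segment_cache"):
        self.root = Path(root)
        self.root.mkdir(exist_ok=True)

    def _dir(self, name):
        return self.root / name.replace("/", "_")

    def store_segment(self, name, index, payload):
        """Persist one received segment of a named content object."""
        seg_dir = self._dir(name)
        seg_dir.mkdir(exist_ok=True)
        (seg_dir / f"{index:08d}.seg").write_bytes(payload)

    def missing_segments(self, name, total_segments):
        """Segment indices still to request when the next contact occurs."""
        seg_dir = self._dir(name)
        have = {int(p.stem) for p in seg_dir.glob("*.seg")} if seg_dir.exists() else set()
        return [i for i in range(total_segments) if i not in have]

    def assemble(self, name, total_segments):
        """Return the complete content once all segments are on disk, else None."""
        if self.missing_segments(name, total_segments):
            return None
        seg_dir = self._dir(name)
        return b"".join((seg_dir / f"{i:08d}.seg").read_bytes()
                        for i in range(total_segments))
```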
Abstract:
The Gravity field and steady-state Ocean Circulation Explorer (GOCE) has now been in orbit for more than four years. This is longer than the originally planned lifetime of the satellite, and after three years at the same altitude the satellite has been lowered to 235 km in several steps. Within the framework of the GOCE High-level Processing Facility (HPF), the Astronomical Institute of the University of Bern (AIUB) is responsible for the determination of the official Precise Science Orbit (PSO) product. Kinematic GOCE orbits are part of this product and are used by several institutions inside and outside the HPF for determining the low degrees of the Earth's gravity field. AIUB GOCE GPS-only gravity field solutions using the Celestial Mechanics Approach, covering the Release 4 period as well as a more recent time interval at the lower orbit altitude, are shown and discussed. Special attention is paid to the impact of systematic deficiencies in the kinematic orbits on the resulting gravity fields, e.g., related to the geomagnetic equator, and to possibilities for mitigating them.
Abstract:
Time series of geocenter coordinates were determined with data of two global navigation satellite systems (GNSSs), namely the U.S. GPS (Global Positioning System) and the Russian GLONASS (Global’naya Nawigatsionnaya Sputnikowaya Sistema). The data were recorded in the years 2008–2011 by a global network of 92 permanently observing GPS/GLONASS receivers. Two types of daily solutions were generated independently for each GNSS, one including the estimation of geocenter coordinates and one without these parameters. A fair agreement between GPS and GLONASS was found in the geocenter x- and y-coordinate series. Our tests, however, clearly reveal artifacts in the z-component determined with the GLONASS data. Large periodic excursions in the GLONASS geocenter z-coordinates of about 40 cm peak-to-peak are related to the maximum elevation angles of the Sun above/below the orbital planes of the satellite system and thus have a period of about 4 months (a third of a year). A detailed analysis revealed that the artifacts are almost uniquely governed by the differences of the estimates of direct solar radiation pressure (SRP) in the two solution series (with and without geocenter estimation). A simple formula is derived, describing the relation between the geocenter z-coordinate and the corresponding parameter of the SRP. The effect can be explained by first-order perturbation theory of celestial mechanics. The theory also predicts a heavy impact on the GNSS-derived geocenter if once-per-revolution SRP parameters are estimated in the direction of the satellite’s solar panel axis. Specific experiments using GPS observations revealed that this is indeed the case. Although the main focus of this article is on GNSS, the theory developed is applicable to all satellite observing techniques. We applied the theory to satellite laser ranging (SLR) solutions using LAGEOS. It turns out that the correlation between geocenter and SRP parameters is not a critical issue for the SLR solutions. The reasons are threefold: the direct SRP is about a factor of 30–40 smaller for typical geodetic SLR satellites than for GNSS satellites, so that in most cases SRP parameters need not be estimated (which removes the correlation between these parameters and the geocenter coordinates); the orbital arc length of 7 days (typically used in SLR analysis) contains more than 50 revolutions of the LAGEOS satellites, compared to about two revolutions of GNSS satellites for the daily arcs used in GNSS analysis; and the orbit geometry is not as critical for LAGEOS as for GNSS satellites, because the elevation angle of the Sun w.r.t. the orbital plane usually changes significantly over 7 days.
Abstract:
LARES is a new spherical geodetic satellite designed for SLR observations. It is made of a solid tungsten alloy covered with 92 corner cubes. Due to its very small area-to-mass ratio, the sensitivity of LARES orbits to non-gravitational forces is greatly reduced. We processed 82 weeks (February 2012-August 2013) of LARES observations from a global SLR network and analyzed the contribution of LARES data to the current SLR products (e.g., global scale and geocenter coordinates). The quality of the combined LARES+LAGEOS-1/2 solutions is also addressed in the paper.
Abstract:
The current state of health and biomedicine includes an enormous number of heterogeneous data 'silos', collected for different purposes and represented differently, that are presently impossible to share or analyze in toto. The greatest challenge for large-scale and meaningful analyses of health-related data is to achieve a uniform data representation for data extracted from heterogeneous source representations. Based upon an analysis and categorization of heterogeneities, a process for achieving comparable data content through a uniform terminological representation is developed. This process addresses the types of representational heterogeneities that commonly arise in healthcare data integration problems. Specifically, it uses a reference terminology and associated "maps" to transform heterogeneous data into a standard representation for comparability and secondary use. Capturing the quality and precision of the "maps" between local terms and reference terminology concepts enhances the meaning of the aggregated data, empowering end users with better-informed queries for subsequent analyses. A data integration case study in the domain of pediatric asthma illustrates the development and use of a reference terminology for creating comparable data from heterogeneous source representations. The contribution of this research is a generalized process for the integration of data from heterogeneous source representations, which can be applied and extended to other problems where heterogeneous data need to be merged.
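As a hypothetical sketch of the mapping step described above (not the authors' system), the Python below transforms local terms from two made-up sources into a shared reference-terminology concept while carrying the quality of each map along with the result; the source names, concept identifier, and precision labels are invented for illustration.

```python
# Hypothetical sketch: local terms are mapped to reference-terminology concepts,
# and the precision of each map travels with the aggregated record.
from dataclasses import dataclass

@dataclass
class ConceptMap:
    local_term: str      # term as used in the source system
    concept_id: str      # reference terminology concept (invented identifier)
    precision: str       # e.g. "exact", "broader", "narrower"

# One map table per heterogeneous source; contents are made up for illustration.
MAPS = {
    "clinic_a": {"wheezing episode": ConceptMap("wheezing episode", "REF:0001", "exact")},
    "clinic_b": {"wheeze": ConceptMap("wheeze", "REF:0001", "broader")},
}

def to_reference(source, terms):
    """Replace local terms with reference concepts, keeping map quality."""
    out = []
    for term in terms:
        m = MAPS[source].get(term.lower())
        out.append({"term": term,
                    "concept": m.concept_id if m else None,
                    "precision": m.precision if m else "unmapped"})
    return out

# Records from different sources become comparable once expressed as concepts:
print(to_reference("clinic_a", ["Wheezing episode"]))
print(to_reference("clinic_b", ["Wheeze"]))
```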
Abstract:
OBJECTIVE: To determine whether algorithms developed for the World Wide Web can be applied to the biomedical literature in order to identify articles that are important as well as relevant. DESIGN AND MEASUREMENTS: A direct comparison of eight algorithms: simple PubMed queries, clinical queries (sensitive and specific versions), vector cosine comparison, citation count, journal impact factor, PageRank, and machine learning based on polynomial support vector machines. The objective was to prioritize important articles, defined as those included in a pre-existing bibliography of important literature in surgical oncology. RESULTS: Citation-based algorithms were more effective than noncitation-based algorithms at identifying important articles. The most effective strategies were simple citation count and PageRank, which on average identified over six important articles in the first 100 results, compared to 0.85 for the best noncitation-based algorithm (p < 0.001). The authors saw similar differences between citation-based and noncitation-based algorithms at 10, 20, 50, 200, 500, and 1,000 results (p < 0.001). Citation lag affects the performance of PageRank more than that of simple citation count. However, in spite of citation lag, citation-based algorithms remain more effective than noncitation-based algorithms. CONCLUSION: Algorithms that have proved successful on the World Wide Web can be applied to biomedical information retrieval. Citation-based algorithms can help identify important articles within large sets of relevant results. Further studies are needed to determine whether citation-based algorithms can effectively meet actual user information needs.
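A minimal sketch of the kind of comparison reported above: for a given ranking, count how many articles from the reference bibliography of important literature appear within the first N results. The identifiers and numbers below are made up and serve only to show the shape of the evaluation.

```python
# Hypothetical evaluation helper: how many reference-bibliography articles
# appear among the first n results of a ranking?
def important_in_top(ranked_ids, important_ids, n=100):
    return sum(1 for article_id in ranked_ids[:n] if article_id in important_ids)

bibliography = {"pmid:101", "pmid:205", "pmid:333"}          # made-up identifiers
ranking = ["pmid:205", "pmid:999", "pmid:101", "pmid:404"]   # output of some ranking algorithm
print(important_in_top(ranking, bibliography, n=100))        # -> 2
```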
Abstract:
Information overload is a significant problem for modern medicine. Searching MEDLINE for common topics often retrieves more relevant documents than users can review. Therefore, we must identify documents that are not only relevant, but also important. Our system ranks articles using citation counts and the PageRank algorithm, incorporating data from the Science Citation Index. However, citation data is usually incomplete. Therefore, we explore the relationship between the quantity of citation information available to the system and the quality of the result ranking. Specifically, we test the ability of citation count and PageRank to identify "important articles" as defined by experts from large result sets with decreasing citation information. We found that PageRank performs better than simple citation counts, but both algorithms are surprisingly robust to information loss. We conclude that even an incomplete citation database is likely to be effective for importance ranking.
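The following Python sketch shows a plain PageRank iteration over a toy citation graph, to illustrate the kind of ranking used above; the graph, damping factor, iteration count, and identifiers are illustrative assumptions and unrelated to the authors' data or system.

```python
# Hypothetical sketch: PageRank over a citation graph, where an article's rank
# is fed by the articles that cite it.
def pagerank(citations, damping=0.85, iterations=50):
    """citations maps each article id to the list of article ids it cites."""
    articles = list(citations)
    n = len(articles)
    rank = {a: 1.0 / n for a in articles}
    for _ in range(iterations):
        new_rank = {a: (1.0 - damping) / n for a in articles}
        for a, cited in citations.items():
            targets = cited or articles      # dangling nodes spread rank evenly
            share = damping * rank[a] / len(targets)
            for b in targets:
                new_rank[b] += share
        rank = new_rank
    return rank

# Toy citation graph with made-up identifiers (a1 cites a3, a4 cites a3 and a1, ...):
graph = {"a1": ["a3"], "a2": ["a3"], "a3": [], "a4": ["a3", "a1"]}
print(sorted(pagerank(graph).items(), key=lambda kv: -kv[1]))
```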
Abstract:
People often use tools to search for information. In order to improve the quality of an information search, it is important to understand how internal information, which is stored in the user's mind, and external information, represented by the interface of tools, interact with each other. How information is distributed between internal and external representations significantly affects information search performance. However, few studies have examined the relationship between types of interface and types of search task in the context of information search. For a distributed information search task, how data are distributed, represented, and formatted significantly affects user search performance in terms of response time and accuracy. Guided by UFuRT (User, Function, Representation, Task), a human-centered process, I propose a search model and a task taxonomy. The model defines its relationship with other existing information models. The taxonomy clarifies the legitimate operations for each type of search task on relational data. Based on the model and taxonomy, I have also developed interface prototypes for the search tasks of relational data. These prototypes were used for experiments. The experiments described in this study follow a within-subject design with a sample of 24 participants recruited from the graduate schools located in the Texas Medical Center. Participants performed one-dimensional nominal search tasks over nominal, ordinal, and ratio displays, and performed one-dimensional nominal, ordinal, interval, and ratio search tasks over table and graph displays. Participants also performed the same task and display combinations for two-dimensional searches. Distributed cognition theory has been adopted as a theoretical framework for analyzing and predicting the search performance on relational data. It has been shown that the representation dimensions and data scales, as well as the search task types, are the main factors in determining search efficiency and effectiveness. In particular, the more external representations are used, the better the search task performance; the results suggest that the ideal search performance occurs when the question type and the corresponding data scale representation match. The implications of the study lie in contributing to the effective design of search interfaces for relational data, especially laboratory results, which are often used in healthcare activities.
Abstract:
The implications of the new research presented in Volume 2, Issue 1 (Human Trafficking) of the Journal of Applied Research on Children are explored, calling attention to the need for increased awareness, greater availability of data, and proactive policy solutions to combat child trafficking.
Abstract:
The global ocean is a significant sink for anthropogenic carbon (C_ant), absorbing roughly a third of human CO2 emitted over the industrial period. Robust estimates of the magnitude and variability of the storage and distribution of C_ant in the ocean are therefore important for understanding the human impact on climate. In this synthesis we review observational and model-based estimates of the storage and transport of C_ant in the ocean. We pay particular attention to the uncertainties and potential biases inherent in different inference schemes. On a global scale, three data-based estimates of the distribution and inventory of C_ant are now available. While the inventories are found to agree within their uncertainty, there are considerable differences in the spatial distribution. We also present a review of the progress made in the application of inverse and data assimilation techniques which combine ocean interior estimates of C_ant with numerical ocean circulation models. Such methods are especially useful for estimating the air–sea flux and interior transport of C_ant, quantities that are otherwise difficult to observe directly. However, the results are found to be highly dependent on modeled circulation, with the spread due to different ocean models at least as large as that from the different observational methods used to estimate C_ant. Our review also highlights the importance of repeat measurements of hydrographic and biogeochemical parameters to estimate the storage of C_ant on decadal timescales in the presence of the variability in circulation that is neglected by other approaches. Data-based C_ant estimates provide important constraints on forward ocean models, which exhibit both broad similarities and regional errors relative to the observational fields. A compilation of inventories of C_ant gives us a "best" estimate of the global ocean inventory of anthropogenic carbon in 2010 of 155 ± 31 PgC (±20% uncertainty). This estimate includes a broad range of values, suggesting that a combination of approaches is necessary in order to achieve a robust quantification of the ocean sink of anthropogenic CO2.