9 results for Data access


Relevance: 60.00%

Abstract:

Background: Two distinct trends are emerging with respect to how data is shared, collected, and analyzed within the bioinformatics community. First, Linked Data, exposed as SPARQL endpoints, promises to make data easier to collect and integrate by moving towards the harmonization of data syntax, descriptive vocabularies, and identifiers, as well as providing a standardized mechanism for data access. Second, Web Services, often linked together into workflows, normalize data access and create transparent, reproducible scientific methodologies that can, in principle, be re-used and customized to suit new scientific questions. Constructing queries that traverse semantically rich Linked Data requires substantial expertise, yet traditional RESTful or SOAP Web Services cannot adequately describe the content of a SPARQL endpoint. We propose that content-driven Semantic Web Services can enable facile discovery of Linked Data, independent of their location. Results: We use a well-curated Linked Dataset, OpenLifeData, and utilize its descriptive metadata to automatically configure a series of more than 22,000 Semantic Web Services that expose all of its content via the SADI set of design principles. The OpenLifeData SADI services are discoverable via queries to the SHARE registry and are easy to integrate into new or existing bioinformatics workflows and analytical pipelines. We demonstrate the utility of this system through a comparison of Web Service-mediated data access with traditional SPARQL, and note that this approach not only simplifies data retrieval but simultaneously provides protection against resource-intensive queries. Conclusions: We show, through a variety of different clients and examples of varying complexity, that data from the myriad OpenLifeData endpoints can be recovered without any need for prior knowledge of the content or structure of the SPARQL endpoints. We also demonstrate that, via clients such as SHARE, the complexity of federated SPARQL queries is dramatically reduced.
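To make the contrast concrete, the sketch below shows the kind of raw SPARQL request that service-mediated access abstracts away. It is a minimal illustration only: the endpoint URL and the query pattern are assumptions for demonstration, not details taken from the paper.

```python
# Minimal sketch of direct SPARQL access to a Linked Data endpoint.
# The endpoint URL and triple pattern below are hypothetical.
from SPARQLWrapper import SPARQLWrapper, JSON

ENDPOINT = "http://sparql.openlifedata.org/"  # hypothetical endpoint URL

sparql = SPARQLWrapper(ENDPOINT)
sparql.setQuery("""
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT ?record ?label WHERE {
        ?record rdfs:label ?label .
    } LIMIT 10
""")
sparql.setReturnFormat(JSON)

results = sparql.query().convert()
for binding in results["results"]["bindings"]:
    print(binding["record"]["value"], binding["label"]["value"])
```

Writing such queries requires knowing the endpoint's vocabularies and graph structure in advance, which is exactly the prior knowledge the SADI services are designed to make unnecessary.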

Relevance: 30.00%

Abstract:

Hyper-spectral data allow the construction of more robust statistical models of material properties than the standard tri-chromatic color representation. However, because of the large dimensionality and complexity of hyper-spectral data, the extraction of robust features (image descriptors) is not a trivial issue. Thus, to facilitate efficient feature extraction, decorrelation techniques are commonly applied to reduce the dimensionality of the hyper-spectral data, with the aim of generating compact and highly discriminative image descriptors. Current methodologies for data decorrelation, such as principal component analysis (PCA), linear discriminant analysis (LDA), wavelet decomposition (WD), or band selection methods, require complex and subjective training procedures, and in addition the compressed spectral information is not directly related to the physical (spectral) characteristics of the analyzed materials. The major objective of this article is to introduce and evaluate a new data decorrelation methodology using an approach that closely emulates human vision. The proposed data decorrelation scheme has been employed to optimally minimize the amount of redundant information contained in the highly correlated hyper-spectral bands and has been comprehensively evaluated in the context of non-ferrous material classification.
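For orientation, the sketch below shows the PCA baseline this article compares against: decorrelating the hyper-spectral bands into a compact per-pixel descriptor. The cube shape and component count are illustrative assumptions, not values from the study.

```python
# Minimal sketch of PCA-based band decorrelation on a hyper-spectral cube.
# Cube dimensions and n_components are illustrative.
import numpy as np
from sklearn.decomposition import PCA

cube = np.random.rand(128, 128, 64)         # hypothetical H x W x bands cube
pixels = cube.reshape(-1, cube.shape[-1])   # one spectrum per pixel

pca = PCA(n_components=8)                   # keep 8 decorrelated channels
compact = pca.fit_transform(pixels)         # (H*W, 8) descriptor matrix

print(pca.explained_variance_ratio_.sum())  # fraction of variance retained
```

As the abstract notes, the principal components are abstract linear mixtures of bands, so they lose the direct link to the physical spectral characteristics of the material, which is the limitation the proposed vision-inspired scheme targets.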

Relevance: 30.00%

Abstract:

Background: Recently, with the advent of low-toxicity biological and targeted therapies, evidence of the existence of a long-term survival subpopulation of cancer patients is appearing. We have studied an unselected population with advanced lung cancer to look for evidence of multimodality in the survival distribution, and to estimate the proportion of long-term survivors. Methods: We used survival data of 4944 patients with non-small-cell lung cancer (NSCLC) stages IIIb-IV at diagnosis, registered in the National Cancer Registry of Cuba (NCRC) between January 1998 and December 2006. We fitted a one-component survival model and two-component mixture models to identify short- and long-term survivors. The Bayesian information criterion was used for model selection. Results: For all of the selected parametric distributions, the two-component model presented the best fit. The short-term survival population (median survival of almost 4 months) represented 64% of patients. The long-term survival population included 35% of patients and showed a median survival of around 12 months. None of the short-term survivors was still alive at month 24, while 10% of the long-term survivors died after that point. Conclusions: There is a subgroup showing long-term evolution among patients with advanced lung cancer. As survival rates continue to improve with the new generation of therapies, prognostic models considering short- and long-term survival subpopulations should be considered in clinical research.
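The sketch below illustrates the general idea of fitting a two-component parametric mixture to survival times and scoring it with the BIC. It is a toy version under strong assumptions: exponential components, simulated uncensored data, and illustrative starting values, whereas real registry data are censored and the study compared several parametric families.

```python
# Toy two-component exponential mixture fit to survival times, scored by BIC.
# Simulated, uncensored data; real registry data require a censored likelihood.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import expon

rng = np.random.default_rng(0)
t = np.concatenate([rng.exponential(4, 640),    # short-term component
                    rng.exponential(12, 360)])  # long-term component

def neg_loglik(params):
    w = 1 / (1 + np.exp(-params[0]))               # mixing weight in (0, 1)
    s1, s2 = np.exp(params[1]), np.exp(params[2])  # component scales > 0
    lik = w * expon.pdf(t, scale=s1) + (1 - w) * expon.pdf(t, scale=s2)
    return -np.log(lik).sum()

fit = minimize(neg_loglik, x0=[0.0, np.log(3.0), np.log(10.0)])
k = 3                                              # free parameters
bic = k * np.log(len(t)) + 2 * fit.fun             # BIC = k*ln(n) - 2*ln(L)
print(fit.x, bic)
```

Model selection then amounts to comparing this BIC against that of a one-component fit; the study found the two-component model preferred for every distribution family considered.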

Relevance: 30.00%

Abstract:

A new supervised burned area mapping software tool named BAMS (Burned Area Mapping Software) is presented in this paper. The tool was built from standard ArcGIS (TM) libraries. It computes several of the spectral indexes most commonly used in burned area detection and implements a two-phase supervised strategy to map areas burned between two multitemporal Landsat images. The only input required from the user is the visual delimitation of a few burned areas, from which burned perimeters are extracted. After the discrimination of burned patches, the user can visually assess the results and iteratively select additional burned sampling areas to improve the extent of the burned patches. The final result of the BAMS program is a polygon vector layer containing three categories: (a) burned perimeters, (b) unburned areas, and (c) non-observed areas. The latter refers to clouds or sensor observation errors. Outputs of the BAMS code meet the file format and structure requirements of standard validation protocols. This paper presents the tool's structure and technical basis. The program has been tested in six areas located in the United States, covering various ecosystems and land covers, and compared against the national Monitoring Trends in Burn Severity (MTBS) Burned Area Boundaries Dataset.
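As background, the sketch below computes one spectral index widely used in burned area detection, the Normalized Burn Ratio (NBR), and its multitemporal difference (dNBR) between two Landsat dates. The reflectance arrays and the threshold value are illustrative assumptions; BAMS derives its discrimination thresholds from the user-delimited burned samples rather than from a fixed constant.

```python
# Minimal sketch: NBR and dNBR from pre- and post-fire reflectance bands.
# Arrays and threshold are illustrative, not BAMS internals.
import numpy as np

def nbr(nir, swir):
    """Normalized Burn Ratio from NIR and SWIR reflectance bands."""
    return (nir - swir) / (nir + swir + 1e-10)  # epsilon avoids divide-by-zero

# hypothetical pre- and post-fire Landsat reflectance bands (H x W)
pre_nir, pre_swir = np.random.rand(2, 256, 256)
post_nir, post_swir = np.random.rand(2, 256, 256)

dnbr = nbr(pre_nir, pre_swir) - nbr(post_nir, post_swir)
burned = dnbr > 0.27   # illustrative cutoff; BAMS fits this from user samples
print(burned.mean())   # fraction of pixels flagged as burned
```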

Relevance: 30.00%

Abstract:

This paper deals with the convergence of a remote iterative learning control system subject to data dropouts. The system is composed of a set of discrete-time multiple-input multiple-output linear models, each one with its corresponding actuator device and its sensor. Each actuator applies the input signals vector to its corresponding model at the sampling instants, and the sensor measures the output signals vector. The iterative learning law is processed in a controller located far away from the models, so the control signals vector has to be transmitted from the controller to the actuators through transmission channels. Such a law uses the measurements of each model to generate the input vector to be applied to the subsequent model, so the measurements of the models have to be transmitted from the sensors to the controller. All transmissions are subject to failures, which are described as a binary sequence taking values 1 or 0. A dropout compensation technique is used to replace the lost data in the transmission processes. Convergence to zero of the errors between the output signals vector and a reference one is achieved as the number of models tends to infinity.
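The sketch below illustrates the core mechanism in a heavily simplified form: a P-type iterative learning update in which measurements lost to Bernoulli dropouts are replaced by the last values successfully received. It collapses the paper's chain of MIMO models into successive trials of a toy scalar plant, and the gain, dropout rate, and plant are all illustrative assumptions.

```python
# Toy P-type ILC with dropout compensation: lost output samples are
# replaced by the last received ones before the learning update.
import numpy as np

rng = np.random.default_rng(1)
N = 50                                   # samples per trial
ref = np.sin(np.linspace(0, np.pi, N))   # reference output trajectory
a, b, c = 0.9, 1.0, 1.0                  # toy first-order SISO plant
L = 0.5                                  # learning gain, |1 - L*c*b| < 1

def run_plant(u):
    x, y = 0.0, np.zeros(N)
    for k in range(N):
        y[k] = c * x
        x = a * x + b * u[k]
    return y

u = np.zeros(N)
y_last = np.zeros(N)                     # buffer for dropout compensation
for trial in range(30):
    y = run_plant(u)
    received = rng.random(N) > 0.2       # True = sample arrived, False = dropout
    y_used = np.where(received, y, y_last)   # replace lost measurements
    y_last = y_used
    e = ref - y_used
    u[:-1] = u[:-1] + L * e[1:]          # one-step-shifted learning update

print(np.abs(ref - run_plant(u)).max())  # residual tracking error
```

The one-step shift in the update reflects the fact that the input at instant k first influences the output at instant k+1; the compensation buffer plays the role of the paper's lost-data replacement technique.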

Relevance: 30.00%

Abstract:

Background: Jumping to conclusions (JTC) is associated with psychotic disorder and psychotic symptoms. If JTC represents a trait, the rate should (i) be increased in people with elevated levels of psychosis proneness, such as individuals diagnosed with borderline personality disorder (BPD), and (ii) show a degree of stability over time. Methods: The JTC rate was examined in three groups: patients with first-episode psychosis (FEP), BPD patients and controls, using the Beads Task. The PANSS, SIS-R and CAPE scales were used to assess positive psychotic symptoms. Four WAIS III subtests were used to assess IQ. Results: A total of 61 FEP, 26 BPD and 150 controls were evaluated; 29 FEP were re-evaluated after one year. 44% of FEP (OR = 8.4, 95% CI: 3.9-17.9) displayed a JTC reasoning bias, versus 19% of BPD (OR = 2.5, 95% CI: 0.8-7.8) and 9% of controls. JTC was not associated with the level of psychotic symptoms or, specifically, delusionality across the different groups. Differences between FEP and controls were independent of sex, educational level, cannabis use and IQ. After one year, 47.8% of FEP with JTC at baseline again displayed JTC. Conclusions: JTC in part reflects trait vulnerability to develop disorders with expression of psychotic symptoms.

Relevance: 30.00%

Abstract:

In the problem of one-class classification (OCC), one of the classes, the target class, has to be distinguished from all other possible objects, considered as nontargets. This situation arises in many biomedical problems, for example, in diagnosis, image-based tumor recognition, or analysis of electrocardiogram data. In this paper, an approach to OCC based on a typicality test is experimentally compared with reference state-of-the-art OCC techniques (Gaussian, mixture of Gaussians, naive Parzen, Parzen, and support vector data description) using biomedical data sets. We evaluate the ability of the procedures using twelve experimental data sets with not necessarily continuous data. As there are few benchmark data sets for one-class classification, all data sets considered in the evaluation have multiple classes. Each class in turn is considered as the target class, and the units in the other classes are considered as new units to be classified. The results of the comparison show the good performance of the typicality approach, which is applicable to high-dimensional data; it is worth mentioning that it can be used for any kind of data (continuous, discrete, or nominal), whereas the application of state-of-the-art approaches is not straightforward when nominal variables are present.
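To make the idea of a typicality test concrete, the sketch below implements a simple distance-based variant: a new unit is accepted as a target if its Mahalanobis distance to the target class is not atypically large under a normality assumption. The paper's procedure differs in detail (and, unlike this sketch, handles nominal data); the threshold and data here are illustrative.

```python
# Minimal sketch of a distance-based typicality test for one-class
# classification; simplified relative to the paper's method.
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(2)
target = rng.normal(0, 1, size=(200, 5))    # training units, target class only

mu = target.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(target, rowvar=False))

def typicality(x):
    d2 = (x - mu) @ cov_inv @ (x - mu)      # squared Mahalanobis distance
    return chi2.sf(d2, df=target.shape[1])  # p-value under normality

new_unit = rng.normal(3, 1, size=5)         # drawn far from the target class
print("target" if typicality(new_unit) > 0.05 else "nontarget")
```

The evaluation protocol in the paper follows the same one-sided logic: each class of a multi-class data set serves in turn as the sole training class, and units from the remaining classes are presented as candidate nontargets.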

Relevance: 30.00%

Abstract:

Recent player tracking technology provides new information about basketball game performance. The aim of this study was to (i) compare the game performances of all-star and non-all-star basketball players from the National Basketball Association (NBA), and (ii) describe different basketball game performance profiles based on different game roles. Archival data were obtained from all 2013-2014 regular season games (n = 1230). The variables analyzed included points per game, minutes played and the game actions recorded by the player tracking system. To address the first aim, performance per minute of play was analyzed using a descriptive discriminant analysis to identify which variables best predict the all-star and non-all-star playing categories. The all-star players showed slower velocities in defense and performed better in elbow touches, defensive rebounds, close touches, close points and pull-up points, possibly due to optimized attention processes that are key for perceiving the appropriate environmental information. The second aim was addressed using a k-means cluster analysis to create maximally different performance profile groupings. Afterwards, a descriptive discriminant analysis identified which variables best predict the different playing clusters. The results identified different playing profiles of performers, particularly related to the game roles of scoring, passing, defensive and all-round game behavior. Coaching staffs may apply this information to different players, while accounting for individual differences and functional variability, to optimize practice planning and, consequently, the game performances of individuals and teams.
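The two-step analysis described above can be sketched as follows: k-means to form performance-profile clusters, then a linear discriminant analysis used descriptively to see which variables separate them. The data matrix, number of variables, and cluster count below are illustrative assumptions, not the study's actual values.

```python
# Minimal sketch: k-means profile clustering followed by a descriptive
# linear discriminant analysis. Data and cluster count are illustrative.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(3)
X = rng.normal(size=(400, 6))   # players x per-minute performance variables

clusters = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)

lda = LinearDiscriminantAnalysis().fit(X, clusters)
print(lda.scalings_[:, 0])      # variable loadings on the first discriminant
```

In the descriptive use of discriminant analysis, the loadings (rather than classification accuracy) are the object of interest: they indicate which performance variables contribute most to separating the playing-profile groups.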