965 results for data analytics
                                
                                
Abstract:
Imaging mass spectrometry (IMS) is an innovative tool in the cancer research pipeline and is increasingly used in clinical and pharmaceutical applications. The unique properties of the technique, especially the amount of data generated, make handling data from multiple IMS acquisitions challenging. This work presents a histology-driven IMS approach that aims to identify discriminant lipid signatures by simultaneously mining IMS data sets from multiple samples. The feasibility of the workflow is evaluated on a set of three human colorectal cancer liver metastasis (CRCLM) tissue sections. Lipid IMS on tissue sections was performed using MALDI-TOF/TOF MS in both negative and positive ionization modes after 1,5-diaminonaphthalene matrix deposition by sublimation. The positive and negative acquisition results were combined during data mining to simplify the process and interrogate a larger lipidome in a single analysis. To reduce the complexity of the IMS data sets, a sub data set was generated by randomly selecting a fixed number of spectra from a histologically defined region of interest, resulting in a 10-fold data reduction. Principal component analysis confirmed that the molecular selectivity of the regions of interest is maintained after data reduction. Partial least-squares and heat map analyses demonstrated a selective signature of the CRCLM, revealing lipids that are significantly up- and down-regulated in the tumor region. This comprehensive approach is thus of interest for defining disease signatures directly from IMS data sets through combined data mining, opening novel routes of investigation for addressing the demands of the clinical setting.
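The data-reduction step described in this abstract (randomly keeping a fixed number of spectra per histologically defined region) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `subsample_spectra`, the seed, and the toy spectra are all hypothetical.

```python
import random

def subsample_spectra(roi_spectra, n_keep, seed=0):
    """Randomly keep a fixed number of spectra from one region of interest."""
    if n_keep > len(roi_spectra):
        raise ValueError("n_keep exceeds the number of spectra in the region")
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    return rng.sample(roi_spectra, n_keep)

# Hypothetical region of interest: 1,000 spectra with two toy intensities each.
roi = [[float(i), 0.5 * i] for i in range(1000)]

# Keeping 100 of 1,000 spectra gives the 10-fold reduction the abstract mentions.
subset = subsample_spectra(roi, n_keep=100)
print(len(roi), "->", len(subset))
```

Sampling the same fixed number from every region keeps the regions balanced in any downstream multivariate analysis, which may be one reason to prefer a fixed count over a fixed fraction.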
                                
Abstract:
The European infrastructure ICOS (Integrated Carbon Observation System) has the mission of providing long-term measurements of greenhouse gases, which should make it possible to study the current state and future behaviour of the global carbon cycle. In this context, geomati.co has developed a data search and download portal that integrates measurements from the terrestrial, marine and atmospheric domains, disciplines that until now had managed their data separately. The portal supports searches by multiple geographic extents, by time range, by free text or by a subset of measured quantities, offers data previews, and lets users add the data sets they find interesting to a download “cart”. When a data collection is downloaded, it is assigned a universal identifier that allows it to be referenced in any subsequent publications and downloaded again in the future (so that published experiments are reproducible). The portal relies on open formats in common use in the scientific community, such as NetCDF for the data, and on the ISO profile of CSW, a cataloguing and search standard from the geospatial domain. The portal was built from existing free software components, such as Thredds Data Server, GeoNetwork Open Source and GeoExt, and its code and documentation will be published under a free license to enable reuse in other projects.
                                
Abstract:
This is one of the few studies to have explored the value of baseline symptoms and health-related quality of life (HRQOL) in predicting survival in brain cancer patients. Baseline HRQOL scores (from the EORTC QLQ-C30 and the Brain Cancer Module (BN 20)) were examined in 490 newly diagnosed glioblastoma patients for their relationship with overall survival using Cox proportional hazards regression models. Refined techniques, such as the bootstrap re-sampling procedure and the computation of C-indexes and R²-coefficients, were used to try to validate the model. Classical analysis, controlled for major clinical prognostic factors, selected cognitive functioning (P=0.0001), global health status (P=0.0055) and social functioning (P<0.0001) as statistically significant prognostic factors of survival. However, several issues call the validity of these findings into question. C-indexes and R²-coefficients, which are measures of the predictive ability of the models, did not exhibit major improvements when selected or all HRQOL scores were added to clinical factors. While classical techniques lead to positive results, more refined analyses suggest that baseline HRQOL scores add relatively little to clinical factors in predicting survival. These results may have implications for the future use of HRQOL as a prognostic factor in cancer patients.
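The C-index used here as a measure of predictive ability can be illustrated with a short sketch of Harrell's concordance index for right-censored data. The function name and the four-patient toy data set are assumptions made for illustration only; they are not taken from the study.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs ranked correctly by risk score."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i has an observed event
            # strictly before subject j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # higher risk, earlier event: correct
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # ties count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Hypothetical toy data: follow-up time, event flag (0 = censored), model risk.
times = [5, 8, 12, 20]
events = [1, 1, 0, 1]
risk = [0.9, 0.4, 0.5, 0.1]

c = concordance_index(times, events, risk)
print(f"C-index: {c:.2f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so comparing the C-index of a clinical-only model with that of a clinical-plus-HRQOL model shows how much ranking ability the extra scores add.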
                                
Abstract:
BACKGROUND: Patients with chronic obstructive pulmonary disease (COPD) have a modified clinical presentation of venous thromboembolism (VTE) and also a worse prognosis than non-COPD patients with VTE. Because this may call for changes in therapy, we evaluated the influence of the initial VTE presentation on the 3-month outcomes in COPD patients. METHODS: COPD patients included in the ongoing worldwide RIETE Registry were studied. The rates of pulmonary embolism (PE), major bleeding and death during the first 3 months in COPD patients were compared according to their initial clinical presentation (acute PE or deep vein thrombosis (DVT)). RESULTS: Of the 4036 COPD patients included, 2452 (61%; 95% CI: 59.2-62.3) initially presented with PE. PE as the first VTE recurrence occurred in 116 patients, major bleeding in 101 patients and death in 443 patients (fatal PE was the leading cause of death). Multivariate analysis confirmed that presenting with PE was associated with a higher risk of VTE recurrence as PE (OR, 2.04; 95% CI: 1.11-3.72) and a higher risk of fatal PE (OR, 7.77; 95% CI: 2.92-15.7). CONCLUSIONS: COPD patients presenting with PE have an increased risk of PE recurrence and fatal PE compared with those presenting with DVT alone. More effective therapy is needed in this subgroup of patients.
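The confidence interval reported for the share of patients presenting with PE can be checked with a short calculation. The abstract does not say which interval method the authors used; the sketch below assumes a simple normal-approximation (Wald) interval, and the function name `wald_ci` is hypothetical.

```python
import math

def wald_ci(successes, n, z=1.96):
    """Proportion with a normal-approximation (Wald) confidence interval."""
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p, p - half_width, p + half_width

# Counts taken from the abstract: 2452 of 4036 COPD patients presented with PE.
p, lo, hi = wald_ci(2452, 4036)
print(f"{100 * p:.1f}% (95% CI: {100 * lo:.1f}-{100 * hi:.1f})")
```

Under this assumption the bounds round to 59.2 and 62.3, matching the reported interval.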
                                
Abstract:
Objectives. To study the utility of the Mini-Cog test for detection of patients with cognitive impairment (CI) in primary care (PC). Methods. We pooled data from two phase III studies conducted in Spain. Patients with complaints or suspicion of CI were consecutively recruited by PC physicians. The cognitive diagnosis was performed by an expert neurologist, after formal neuropsychological evaluation. The Mini-Cog score was calculated post hoc, and its diagnostic utility was evaluated and compared with the utility of the Mini-Mental State (MMS), the Clock Drawing Test (CDT), and the sum of the MMS and the CDT (MMS + CDT) using the area under the receiver operating characteristic curve (AUC). The best cut points were obtained on the basis of diagnostic accuracy (DA) and kappa index. Results. A total sample of 307 subjects (176 CI) was analyzed. The Mini-Cog displayed an AUC (±SE) of 0.78 ± 0.02, which was significantly inferior to the AUC of the CDT (0.84 ± 0.02), the MMS (0.84 ± 0.02), and the MMS + CDT (0.86 ± 0.02). The best cut point of the Mini-Cog was 1/2 (sensitivity 0.60, specificity 0.90, DA 0.73, and kappa index 0.48 ± 0.05). Conclusions. The utility of the Mini-Cog for detection of CI in PC was very modest, clearly inferior to the MMS or the CDT. These results do not permit recommendation of the Mini-Cog in PC.
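The accuracy measures reported for the best cut point (sensitivity, specificity, diagnostic accuracy and kappa) all derive from a 2x2 table of test result against diagnosis. The abstract does not give that table, so the counts below are a hypothetical reconstruction chosen only to be consistent with the reported totals (307 subjects, 176 with CI); the function name is likewise an assumption.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, accuracy and Cohen's kappa from a 2x2 table."""
    n = tp + fn + tn + fp
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n**2
    kappa = (accuracy - expected) / (1 - expected)
    return sensitivity, specificity, accuracy, kappa

# Hypothetical counts: 176 subjects with CI (106 detected), 131 without (118 cleared).
sens, spec, acc, kappa = diagnostic_metrics(tp=106, fn=70, tn=118, fp=13)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} DA={acc:.2f} kappa={kappa:.2f}")
```

With these illustrative counts the four measures round to the values quoted in the abstract, which shows how a high specificity can coexist with a modest kappa when sensitivity is low.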
                                
Abstract:
Given the very large amount of data obtained every day through population surveys, much new research could reuse this information instead of collecting new samples. Unfortunately, relevant data are often scattered across different files obtained through different sampling designs. Data fusion is a set of methods used to combine information from different sources into a single dataset. In this article, we are interested in a specific problem: the fusion of two data files, one of which is quite small. We propose a model-based procedure combining a logistic regression with an Expectation-Maximization algorithm. Results show that despite the lack of data, this procedure can perform better than standard matching procedures.
                                
Abstract:
Early immunological data, obtained by immunodiffusion and immunoelectrophoresis, on the whole-cell antigenicity of kinetoplastid protozoa were retrieved and used to construct a dendrogram of antigenic distances. Remarkably, they supported the same taxonomic conclusions as analyses based on DNA and protein sequence data.
                                
Abstract:
Nowadays, several services and applications allow users to locate and travel to different tourist areas using a mobile device. These systems can be used either over the internet or by downloading an application at specific places such as a visitor centre. Although such applications make it easier to locate and search for points of interest, in most cases they do not meet the needs of each user. This paper aims to provide a solution by studying the main projects, services and applications, their routing algorithms and their treatment of real geographical data on Android mobile devices, focusing on data acquisition and treatment to improve routing searches in off-line environments.
                                
Abstract:
This project carries out research both on finding predictors via clustering techniques and on reviewing free data mining software. The research is based on a case study in which, in addition to the free KDD software used by the scientific community, a new free tool for pre-processing the data is presented. The predictors are intended for the e-learning domain, as the data from which they are to be inferred are student qualifications from different e-learning environments. Through our case study, not only are clustering algorithms tested, but additional goals are also proposed.
                                
Abstract:
Purpose: HIV-infected patients present an increased cardiovascular risk (CVR) of multifactorial origin, usually lower in women than in men. Information by gender about the prevalence of modifiable risk factors is scarce. Methods: Coronator is a cross-sectional survey of a representative sample of HIV-infected patients on ART in 10 hospitals across Spain in 2011. Variables include sociodemographics, CVR factors and 10-year CV disease risk estimation (Regicor: Framingham score adapted to the Spanish population). Results: We included 860 patients (76.3% male) with no history of CVD. Median age was 45.6 years; 84.1% were Spaniards; 29.9% of women were IDUs. Median time since HIV diagnosis for men and women was 10 and 13 years (p=0.001); 28% had an AIDS diagnosis. Median CD4 cell count was 596 cells/mm³; 88% had an undetectable viral load. Median time on ART was 91 and 108 months (p=0.017). There was a family history of early CVD in 113 men (17.9%) and 41 women (20.6%). Classical CVR factors are described in the table. Median (IQR) Regicor Score was 3% (2-5) for men and 2% (1-3) for women (p<0.001), and the proportion of subjects with mid-high risk (>5%) was 26.1% for men and 9.4% for women (p<0.001). Conclusions: In this population of HIV-infected patients, women have lower cardiovascular risk than men, partly due to higher levels of HDL cholesterol. Of note is the high frequency of smoking, abdominal obesity and sedentary lifestyle in our population. (Table Presented).
                                
Abstract:
Despite their limited proliferation capacity, regulatory T cells (Tregs) constitute a population maintained over the entire lifetime of a human organism. The means by which Tregs sustain a stable pool in vivo are controversial. Using a mathematical model, we address this issue by evaluating several biological scenarios of the origins and the proliferation capacity of two subsets of Tregs: precursor CD4⁺CD25⁺CD45RO⁻ and mature CD4⁺CD25⁺CD45RO⁺ cells. The lifelong dynamics of Tregs are described by a set of ordinary differential equations, driven by a stochastic process representing the major immune reactions involving these cells. The model dynamics are validated using data from human donors of different ages. Analysis of the data led to the identification of two properties of the dynamics: (1) the equilibrium in the CD4⁺CD25⁺FoxP3⁺ Treg population is maintained over the precursor and mature Treg pools together, and (2) the ratio between precursor and mature Tregs is inverted in the early years of adulthood. Then, using the model, we identified three biologically relevant scenarios that have the above properties. In scenario (1), the unique source of mature Tregs is the antigen-driven differentiation of precursors that acquire the mature profile in the periphery, and the proliferation of Tregs is essential for the development and maintenance of the pool. In the other two scenarios, there exist other sources of mature Tregs, such as (2) a homeostatic density-dependent regulation or (3) thymus- or effector-derived Tregs, and in both cases antigen-induced proliferation is not necessary for the development of a stable pool of Tregs. This is the first time that a mathematical model built to describe the in vivo dynamics of regulatory T cells has been validated using human data. The application of this model provides an invaluable tool for estimating the number of regulatory T cells as a function of time in the blood of patients who have received a solid organ transplant or are suffering from an autoimmune disease.
                                
                                
Abstract:
Background: Multiple logistic regression is precluded from many practical applications in ecology that aim to predict the geographic distributions of species because it requires absence data, which are rarely available or are unreliable. In order to use multiple logistic regression, many studies have simulated "pseudo-absences" through a number of strategies, but it is unknown how the choice of strategy influences models and their geographic predictions of species. In this paper we evaluate the effect of several prevailing pseudo-absence strategies on the predictions of the geographic distribution of a virtual species whose "true" distribution and relationship to three environmental predictors were predefined. We evaluated the effect of using (a) real absences, (b) pseudo-absences selected randomly from the background, and (c) two-step approaches: pseudo-absences selected from low-suitability areas predicted by either Ecological Niche Factor Analysis (ENFA) or BIOCLIM. We compared how the choice of pseudo-absence strategy affected model fit, predictive power, and information-theoretic model selection results. Results: Models built with true absences had the best predictive power and the best discriminatory power, and the "true" model (the one that contained the correct predictors) was supported by the data according to AIC, as expected. Models based on random pseudo-absences had among the lowest fit, but yielded the second highest AUC value (0.97), and the "true" model was also supported by the data. Models based on two-step approaches had intermediate fit and the lowest predictive power, and the "true" model was not supported by the data. Conclusion: If ecologists wish to build parsimonious GLM models that will allow them to make robust predictions, a reasonable approach is to use a large number of randomly selected pseudo-absences and perform model selection based on an information-theoretic approach. However, the resulting models can be expected to have limited fit.
                                
Abstract:
It has been demonstrated in earlier studies that patients with a cochlear implant have increased abilities for audio-visual integration because the crude information transmitted by the cochlear implant requires the persistent use of the complementary speech information from the visual channel. The brain network for these abilities needs to be clarified. We used an independent component analysis (ICA) of the activation (H₂¹⁵O) positron emission tomography data to explore occipito-temporal brain activity in post-lingually deaf patients with unilaterally implanted cochlear implants at several months post-implantation (T1), shortly after implantation (T0) and in normal hearing controls. In between-group analysis, patients at T1 had greater blood flow in the left middle temporal cortex as compared with T0 and normal hearing controls. In within-group analysis, patients at T0 had a task-related ICA component in the visual cortex, and patients at T1 had one task-related ICA component in the left middle temporal cortex and the other in the visual cortex. The time courses of temporal and visual activities during the positron emission tomography examination at T1 were highly correlated, meaning that synchronized integrative activity occurred. The greater involvement of the visual cortex and its close coupling with the temporal cortex at T1 confirm the importance of audio-visual integration in more experienced cochlear implant subjects at the cortical level.
 
                    