956 results for Data Sets
Abstract:
Adolescent health surveys, like those for other segments of the population, tend to remain in the hands of researchers, where they can have no real impact on the way critical health issues are dealt with by policy makers or other professionals directly connected to young people in their everyday work. This paper reviews important issues concerning the dissemination of survey results among professionals from various fields. The content, length and wording of the messages should be tailored to the audience one wants to reach as well as to the type of channels used for their diffusion. Survey data sets can be used to select priorities for interventions: ad hoc presentations, attractive summaries and brochures, or even films expressing young people's opinions have been used by European public health professionals to make data sets usable in various local, regional and national contexts. CONCLUSION: The impact of these diffusion strategies is, however, difficult to assess and needs to be refined. The adequate delivery of survey findings as well as advocacy and lobbying activities require specific skills which can be endorsed by specialized professionals. Ultimately, it is the researchers' responsibility to ensure that such tasks are effectively performed.
Abstract:
Aim, Location: Although the alpine mouse Apodemus alpicola has had species status since 1989, no distribution map has ever been constructed for this endemic alpine rodent in Switzerland. Based on redetermined museum material and using the Ecological-Niche Factor Analysis (ENFA), habitat-suitability maps were computed for A. alpicola, and also for the co-occurring A. flavicollis and A. sylvaticus. Methods: In the particular case of habitat-suitability models, classical approaches (GLMs, GAMs, discriminant analysis, etc.) generally require presence and absence data. The presence records provided by museums can clearly give useful information about species distribution and ecology and have already been used for knowledge-based mapping. In this paper, we apply the ENFA, which requires only presence data, to build a habitat-suitability map of three species of Apodemus on the basis of museum skull collections. Results: Interspecific niche comparisons showed that A. alpicola is very specialized concerning habitat selection, meaning that its habitat differs unequivocally from the average conditions in Switzerland, while both A. flavicollis and A. sylvaticus could be considered as 'generalists' in the study area. Main conclusions: Although an adequate sampling design is the best way to collect ecological data for predictive modelling, this is a time- and money-consuming process and there are cases where time is simply not available, as for instance with endangered species conservation. On the other hand, museums, herbaria and other similar institutions hold huge presence data sets. By applying the ENFA to such data it is possible to rapidly construct a habitat-suitability model. The ENFA method not only provides two key measurements regarding the niche of a species (i.e. marginality and specialization), but also has ecological meaning, and allows the scientist to compare directly the niches of different species.
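As a rough illustration of the two niche measurements named in this abstract, the sketch below computes a marginality and a specialization value for a single environmental variable. It assumes the single-variable definitions usually quoted for ENFA: marginality as the absolute difference between the global mean and the species mean, divided by 1.96 global standard deviations, and specialization as the ratio of the global to the species standard deviation. The full method is a multivariate factor analysis, which this sketch does not reproduce. The function names and the elevation values are hypothetical and are not taken from the paper.

```python
from statistics import mean, stdev


def marginality(global_values, species_values):
    """Distance of the species mean from the global mean, in units of
    1.96 global standard deviations (assumed univariate definition)."""
    return abs(mean(global_values) - mean(species_values)) / (
        1.96 * stdev(global_values)
    )


def specialization(global_values, species_values):
    """Ratio of global spread to species spread; values above 1 mean the
    species uses a narrower range than the study area offers."""
    return stdev(global_values) / stdev(species_values)


if __name__ == "__main__":
    # Hypothetical elevations (m): every cell of a study area versus the
    # cells where a species was recorded. Illustrative numbers only.
    area = [400, 600, 800, 1000, 1200, 1400, 1600, 1800, 2000, 2200]
    presences = [1600, 1700, 1800, 1900, 2000]
    print("marginality:", round(marginality(area, presences), 3))
    print("specialization:", round(specialization(area, presences), 3))
```

A species whose presences cluster far from the average conditions and within a narrow range, as the abstract reports for A. alpicola, would score high on both measures.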
Abstract:
The biplot has proved to be a powerful descriptive and analytical tool in many areas of application of statistics. For compositional data the necessary theoretical adaptation has been provided, with illustrative applications, by Aitchison (1990) and Aitchison and Greenacre (2002). These papers were restricted to the interpretation of simple compositional data sets. In many situations the problem has to be described in some form of conditional modelling. For example, in a clinical trial where interest is in how patients’ steroid metabolite compositions may change as a result of different treatment regimes, interest is in relating the compositions after treatment to the compositions before treatment and the nature of the treatments applied. To study this through a biplot technique requires the development of some form of conditional compositional biplot. This is the purpose of this paper. We choose as a motivating application an analysis of the 1992 US Presidential Election, where interest may be in how the three-part composition, the percentage division among the three candidates - Bush, Clinton and Perot - of the presidential vote in each state, depends on the ethnic composition and on the urban-rural composition of the state. The methodology of conditional compositional biplots is first developed and a detailed interpretation of the 1992 US Presidential Election provided. We use a second application involving the conditional variability of tektite mineral compositions with respect to major oxide compositions to demonstrate some hazards of simplistic interpretation of biplots. Finally we conjecture on further possible applications of conditional compositional biplots.
Abstract:
PURPOSE: There is growing evidence that interaction between stromal and tumor cells is pivotal in breast cancer progression and response to therapy. Based on earlier research suggesting that during breast cancer progression, striking changes occur in CD10(+) stromal cells, we aimed to better characterize this cell population and its clinical relevance. EXPERIMENTAL DESIGN: We developed a CD10(+) stroma gene expression signature (using HG U133 Plus 2.0) on the basis of the comparison of CD10 cells isolated from tumoral (n = 28) and normal (n = 3) breast tissue. We further characterized the CD10(+) cells by coculture experiments of representative breast cancer cell lines with the different CD10(+) stromal cell types (fibroblasts, myoepithelial, and mesenchymal stem cells). We then evaluated its clinical relevance in terms of in situ to invasive progression, invasive breast cancer prognosis, and prediction of efficacy of chemotherapy using publicly available data sets. RESULTS: This 12-gene CD10(+) stroma signature includes, among others, genes involved in matrix remodeling (MMP11, MMP13, and COL10A1) and genes related to osteoblast differentiation (periostin). The coculture experiments showed that all 3 CD10(+) cell types contribute to the CD10(+) stroma signature, although mesenchymal stem cells have the highest CD10(+) stroma signature score. Of interest, this signature showed an important role in differentiating in situ from invasive breast cancer, in prognosis of the HER2(+) subpopulation of breast cancer only, and potentially in nonresponse to chemotherapy for those patients. CONCLUSIONS: Our results highlight the importance of CD10(+) cells in breast cancer prognosis and efficacy of chemotherapy, particularly within the HER2(+) breast cancer disease.
Abstract:
We investigate whether dimensionality reduction using a latent generative model is beneficial for the task of weakly supervised scene classification. In detail, we are given a set of labeled images of scenes (for example, coast, forest, city, river, etc.), and our objective is to classify a new image into one of these categories. Our approach consists of first discovering latent "topics" using probabilistic Latent Semantic Analysis (pLSA), a generative model from the statistical text literature here applied to a bag of visual words representation for each image, and subsequently, training a multiway classifier on the topic distribution vector for each image. We compare this approach to that of representing each image by a bag of visual words vector directly and training a multiway classifier on these vectors. To this end, we introduce a novel vocabulary using dense color SIFT descriptors and then investigate the classification performance under changes in the size of the visual vocabulary, the number of latent topics learned, and the type of discriminative classifier used (k-nearest neighbor or SVM). We achieve superior classification performance to recent publications that have used a bag of visual words representation, in all cases, using the authors' own data sets and testing protocols. We also investigate the gain in adding spatial information. We show applications to image retrieval with relevance feedback and to scene classification in videos.
Abstract:
The aim of this study was to determine the impact of the learning curve on the diagnostic performance of CT colonography. Two blinded teams, each consisting of a radiologist and a gastroenterologist, prospectively examined 50 patients using helical CT followed by colonoscopy. Intermediate data evaluation was performed after 24 data sets (group 1) and compared with data from the 26 subsequent patients (group 2). Parameters evaluated included sensitivity, specificity, false-positive and false-negative findings, and time of data acquisition and interpretation. Using colonoscopy as the gold standard, the sensitivity of CT colonography for lesions >5 mm was 63% for both teams in group 1; in group 2, sensitivity was 45% for team 1 and 64% for team 2. Per-patient specificity in group 1 was 42% for team 1 and 58% for team 2; in group 2 it was 79% for both teams (p=0.04 for team 1; p=0.2 for team 2). Comparing group 1 with group 2, the number of false-positive findings decreased significantly (p=0.02). Furthermore, the mean time of data evaluation decreased from 45 to 17 min (p=0.002) and the mean time of data acquisition from 19 to 17 min. With increasing experience, specificity and the time required for data interpretation improved and false positives decreased. There was no significant change in sensitivity, false-negative findings or time of data acquisition. A minimum level of reader experience is required for data interpretation in CT colonography.
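For readers less familiar with the two accuracy measures reported in this abstract, the following minimal sketch shows how sensitivity and specificity are computed against a gold standard such as colonoscopy. The function names and the counts are hypothetical; the abstract reports only percentages, so none of these numbers are taken from the study.

```python
def sensitivity(true_positives, false_negatives):
    """Share of patients with a lesion (per the gold standard) that the
    test under evaluation also flags."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Share of patients without a lesion (per the gold standard) that the
    test under evaluation correctly clears."""
    return true_negatives / (true_negatives + false_positives)


if __name__ == "__main__":
    # Hypothetical reading session: 11 patients with a lesion and 19
    # without one, as judged by the gold standard.
    tp, fn = 7, 4
    tn, fp = 15, 4
    print(f"sensitivity: {sensitivity(tp, fn):.1%}")
    print(f"specificity: {specificity(tn, fp):.1%}")
```

Fewer false positives raise specificity without touching sensitivity, which is the pattern the abstract describes as reader experience grows.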
Abstract:
Background: To report a single-center experience in 19 patients (pts) with anal canal cancer treated with helical tomotherapy (HT) and concurrent chemotherapy (CT), and compare the dosimetric results with fixed-field intensity-modulated radiotherapy (IMRT) and 3D conformal radiotherapy (3D RT). Materials and Methods: Between 2007 and 2008, 19 consecutive pts were treated with HT and concurrent CT for anal canal cancer. Median age was 59 years (range, 38−83), and female/male ratio was 14/5. The majority of the pts had T2 or T3 tumours (68.4%), and 52.6% had positive lymph nodes. In all 19 pts, irradiation of the pelvic and inguinal nodes and of the tumour was given using HT up to a median dose of 36 Gy (1.8 Gy/fr), followed by a 1-week gap. A boost dose of 23.4 Gy (1.8 Gy/fr) was delivered to the tumour and involved nodes using 3D RT (n = 12), HT (n = 6), or IMRT (n = 1). Simultaneous integrated boost was used in none of the pts. All but one patient with a T1N0 tumour received concomitant mitomycin/5-fluorouracil (n = 12) or mitomycin/capecitabine (n = 7) CT. Toxicity was scored according to the Common Terminology Criteria for Adverse Events (NCI-CTCAE v3.0). HT plans and treatments were generated using Tomotherapy, Inc., software and hardware; and 3D or IMRT boost plans with the CMS treatment planning system (TPS), using 6−18 MV photons from a Siemens Primus accelerator. For dosimetric comparison, computed tomography data sets of 10 pts were imported into the TPS, and 3D and 5-field step-and-shoot IMRT plans were generated for each case. Plans were optimized with the aim of assessing organs at risk (OAR) and healthy-tissue sparing while enforcing highly conformal target coverage, and evaluated by dose-volume histograms (DVH) of planning target volumes (PTV) and OAR. Results: With a median follow-up of 13 months (range, 3−18), all pts are alive and well, except one patient who developed local recurrence at 12 months. No patient developed grade 3 or more acute toxicity.
No unplanned treatment interruption was necessary because of toxicity. With 360-degree-of-freedom beam projection, HT showed an advantage over 3D or IMRT plans in terms of dose conformity around the PTV, and dose gradients were steeper outside the PTV, resulting in reduced doses to OARs. Using HT, acute toxicity was acceptable, and seemed to be better than historical standards. Conclusion: We conclude that HT combined with concurrent chemotherapy for anal canal cancer is effective and tolerable. Compared to 3DRT or 5-field IMRT, there is better conformity around the PTV, and OAR sparing.
Abstract:
PURPOSE: To assess the diagnostic performance of respiratory self-navigation for whole-heart coronary magnetic resonance (MR) angiography in a patient cohort referred for diagnostic cardiac MR imaging. MATERIALS AND METHODS: Written informed consent was obtained from all participants for this institutional review board-approved study. Self-navigated coronary MR angiography was performed after administration of a contrast agent in 78 patients (mean age, 48.5 years ± 20.7 [standard deviation]; 53 male patients) referred for cardiac MR imaging because of coronary artery disease (n = 40), cardiomyopathy (n = 14), congenital anomaly (n = 17), or "other" (n = 7). Examination duration was recorded, and the image quality for each coronary segment was assessed with consensus reading. Vessel sharpness, length, and diameter were measured. Quantitative values in proximal, middle, and distal segments were compared by using analysis of variance and t tests. A double-blinded comparison with the results of x-ray angiography was performed when such results were available. RESULTS: When patients with different indications for cardiac MR imaging were examined with self-navigated postcontrast coronary MR angiography, whole-heart data sets with 1.15-mm isotropic spatial resolution were acquired in an average of 7.38 minutes ± 1.85. The main and proximal coronary segments could be visualized in 92.3% of cases, while the middle and distal segments could be visualized in 84.0% and 55.8% of cases, respectively. Subjective scores and vessel sharpness were significantly higher in the proximal segments than in the middle and distal segments (P < .05). Anomalies of the coronary arteries could be confirmed or excluded in all cases. Per-vessel sensitivity and specificity for stenosis detection were 64.7% and 85.0%, respectively, in the 31 patients for whom reference standard x-ray coronary angiography results were available. 
CONCLUSION: The self-navigated coronary MR angiography sequence shows promise for coronary imaging. However, technical improvements are needed to improve image quality, especially in the more distal coronary segments.
Abstract:
PURPOSE: To evaluate and validate mRNA expression markers capable of identifying patients with ErbB2-positive breast cancer associated with distant metastasis and reduced survival. PATIENTS AND METHODS: Expression of 60 genes involved in breast cancer biology was assessed by quantitative real-time PCR (qrt-PCR) in 317 primary breast cancer patients and correlated with clinical outcome data. Results were validated subsequently using two previously published and publicly available microarray data sets with different patient populations comprising 295 and 286 breast cancer samples, respectively. RESULTS: Of the 60 genes measured by qrt-PCR, urokinase-type plasminogen activator (uPA or PLAU) mRNA expression was the most significant marker associated with distant metastasis-free survival (MFS) by univariate Cox analysis in patients with ErbB2-positive tumors and an independent factor in multivariate analysis. Subsequent validation in two microarray data sets confirmed the prognostic value of uPA in ErbB2-positive tumors by both univariate and multivariate analysis. uPA mRNA expression was not significantly associated with MFS in ErbB2-negative tumors. Kaplan-Meier analysis showed in all three study populations that patients with ErbB2-positive/uPA-positive tumors exhibited significantly reduced MFS (hazard ratios [HR], 4.3; 95% CI, 1.6 to 11.8; HR, 2.7; 95% CI, 1.2 to 6.2; and, HR, 2.8; 95% CI, 1.1 to 7.1; all P < .02) as compared with the group with ErbB2-positive/uPA-negative tumors who exhibited similar outcome to those with ErbB2-negative tumors, irrespective of uPA status. CONCLUSION: After evaluation of 898 breast cancer patients, uPA mRNA expression emerged as a powerful prognostic indicator in ErbB2-positive tumors. These results were consistent among three independent study populations assayed by different techniques, including qrt-PCR and two microarray platforms.
Abstract:
The R-package “compositions” is a tool for advanced compositional analysis. Its basic functionality has seen some conceptual improvement, containing now some facilities to work with and represent ilr bases built from balances, and an elaborated subsystem for dealing with several kinds of irregular data: (rounded or structural) zeroes, incomplete observations and outliers. The general approach to these irregularities is based on subcompositions: for an irregular datum, one can distinguish a “regular” subcomposition (where all parts are actually observed and the datum behaves typically) and a “problematic” subcomposition (with those unobserved, zero or rounded parts, or else where the datum shows an erratic or atypical behaviour). Systematic classification schemes are proposed for both outliers and missing values (including zeros) focusing on the nature of irregularities in the datum subcomposition(s). To compute statistics with values missing at random and structural zeros, a projection approach is implemented: a given datum contributes to the estimation of the desired parameters only on the subcomposition where it was observed. For data sets with values below the detection limit, two different approaches are provided: the well-known imputation technique, and also the projection approach. To compute statistics in the presence of outliers, robust statistics are adapted to the characteristics of compositional data, based on the minimum covariance determinant approach. The outlier classification is based on four different models of outlier occurrence and Monte-Carlo-based tests for their characterization. Furthermore the package provides special plots helping to understand the nature of outliers in the dataset. Keywords: coda-dendrogram, lost values, MAR, missing data, MCD estimator, robustness, rounded zeros
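Why zeros need the special treatment described in this abstract can be seen from the log-ratio transforms that compositional analysis builds on. The sketch below is plain Python and does not use or imitate the API of the R package. It implements the centred log-ratio (clr) transform, a simpler relative of the ilr coordinates the abstract mentions, and shows that a zero part makes the transform undefined. The function names and the example composition are hypothetical.

```python
import math


def closure(parts):
    """Rescale positive parts so they sum to 1."""
    total = sum(parts)
    return [p / total for p in parts]


def clr(parts):
    """Centred log-ratio: log of each part minus the mean log of all parts.

    Raises ValueError if any part is zero, because log(0) is undefined.
    """
    logs = [math.log(p) for p in closure(parts)]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


if __name__ == "__main__":
    composition = [60.0, 30.0, 10.0]  # hypothetical three-part composition
    print(clr(composition))
    try:
        clr([60.0, 40.0, 0.0])  # one part is a zero
    except ValueError as error:
        print("zero part cannot be transformed:", error)
```

The clr coordinates always sum to zero and do not depend on the units of the raw parts, so only ratios between parts carry information. A datum with a zero part can be transformed only on the subcomposition of its non-zero parts, which is in the spirit of the projection approach the abstract describes.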
Abstract:
Understanding the distribution and composition of species assemblages and being able to predict them in space and time are highly important tasks to investigate the fate of biodiversity in the current global changes context. Species distribution models are tools that have proven useful to predict the potential distribution of species by relating their occurrences to environmental variables. Species assemblages can then be predicted by combining the predictions of individual species models. In the first part of my thesis, I tested the importance of new environmental predictors to improve species distribution prediction. I showed that edaphic variables, above all soil pH and nitrogen content, could be important in species distribution models. In a second chapter, I tested the influence of predictor resolution on the predictive ability of species distribution models. I showed that fine-resolution predictors could improve the models for some species by giving a better estimate of the micro-topographic conditions that species tolerate, but that fine-resolution predictors for climatic factors still need to be improved. The second goal of my thesis was to test the ability of empirical models to predict characteristics of species assemblages such as species richness or functional attributes. I showed that species richness could be modelled efficiently and that the resulting prediction gave a more realistic estimate of the number of species than that obtained by stacking the outputs of single-species distribution models. Regarding the prediction of functional characteristics (plant height, leaf surface, seed mass) of plant assemblages, mean and extreme values of functional traits were more predictable than indices reflecting the diversity of traits in the community. This approach proved interesting to understand which environmental conditions influence particular aspects of the vegetation functioning.
It could also be useful to predict climate change impacts on the vegetation. In the last part of my thesis, I studied the capacity of stacked species distribution models to predict plant assemblages. I showed that this method tended to over-predict the number of species and that the composition of the community was not predicted exactly either. Finally, I combined the results of the macroecological models obtained in the preceding chapters with stacked species distribution models and showed that this approach significantly reduced the number of species predicted and that the prediction of the composition is also improved in some cases. These results showed that this method is promising. It now needs to be tested on further data sets. - Understanding how plants are distributed in the environment and organize into communities is a key question in the current context of global change. This knowledge can help us safeguard species diversity and ecosystems. Statistical methods allow us to predict the distribution of plant species in geographic space and over time. These species distribution models relate the occurrences of a species to environmental variables in order to describe its potential distribution. This method has proven itself for predicting individual species. More recently, several attempts have been made to stack individual species models in order to predict the composition of plant communities. The first objective of my work is to improve distribution models by testing the importance of new predictor variables. Among various edaphic variables, soil pH and nitrogen content proved to be non-negligible factors for predicting plant distribution.
In a second chapter, I also show that fine-resolution environmental predictors can reflect the micro-topographic conditions experienced by plants, but that they still need to be improved before they can be used effectively in the models. The second objective of this work was to study the development of predictive models for attributes of plant communities such as, for example, the species richness found at each point. I show that it is possible in this way to predict species richness values that are more realistic than those obtained by summing the predictions previously obtained for individual species. I also predicted, in space and time, vegetation characteristics such as its mean, minimum and maximum height. This approach can be useful for understanding which environmental factors promote different vegetation types, as well as for assessing the vegetation changes to be expected in the future under different climate change regimes. In a third part of my thesis, I explored the possibility of predicting plant assemblages, first by stacking the predictions obtained from individual models for each species. This method has the drawback of predicting too many species compared with what is actually observed. Finally, I used the species richness model developed earlier to constrain the results of the plant assemblage model. This improved the models by reducing over-prediction and improving the prediction of species composition. This method appears promising, but further tests are needed to properly assess its capabilities.
Abstract:
Geographical Information Systems (GIS) facilitate access to epidemiological data through visualization and may be consulted for the development of mathematical models and analysis by spatial statistics. Variables such as land cover, land use, elevation, surface temperature and rainfall, derived from earth-observing satellites, complement GIS, as this information allows the analysis of disease distribution based on environmental characteristics. The strength of this approach stems from the specific environmental requirements of those causative infectious agents that depend on intermediate hosts for their transmission. The distribution of these diseases is restricted, both by the environmental requirements of their intermediate hosts/vectors and by the ambient temperature inside these hosts, which effectively governs the speed of maturation of the parasite. This paper discusses the current capabilities of satellite data collection in terms of the resolution (spatial, temporal and spectral) of the sensor instruments on board, drawing attention to the utility of computer-based models of the Earth for epidemiological research. Virtual globes, available from Google and other commercial firms, are superior to conventional maps as they not only show geographical and man-made features, but also allow instant import of data sets of specific interest, e.g. environmental parameters, demographic information etc., from the Internet.
Abstract:
In 1991, the World Health Organization (WHO) committed to reducing the prevalence of leprosy to below 1 in 10,000 inhabitants by 2000. Significant improvements in leprosy control have occurred, but leprosy remains a public health problem in many countries due to its high incidence and rate of transmission. This paper reviews data published by the WHO in the years 2000, 2005 and 2010. These data sets included 148 countries or territories that reported to the WHO at least once. Only four countries reported higher prevalence rates in 2010 than in 2000 and eight reported a higher case detection rate (CDR) in 2009 than in 1999. Prevalence rate reductions were greater for the first five-year period examined, while CDR reductions were greater in the second five-year period. Thirty-six countries and territories reported at least one prevalence value higher than 1 per 10,000 inhabitants and 32 reported at least one CDR value higher than 9 per 100,000 inhabitants. A total of 39 countries fit at least one of these criteria and all were located in tropical regions.
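The two cut-offs used in this abstract are expressed per different population bases, which is easy to mix up. The sketch below shows the arithmetic under stated assumptions: prevalence is taken as registered cases per 10,000 inhabitants and CDR as newly detected cases per 100,000 inhabitants, compared against the 1 and 9 thresholds quoted above. The function names and the example country are hypothetical and no WHO figures are reproduced.

```python
PREVALENCE_THRESHOLD = 1.0  # cases per 10,000 inhabitants (from the abstract)
CDR_THRESHOLD = 9.0         # new cases per 100,000 inhabitants (from the abstract)


def rate_per(cases, population, base):
    """Express a case count as a rate per `base` inhabitants."""
    return cases / population * base


def exceeds_thresholds(registered_cases, new_cases, population):
    """Return (prevalence_above_threshold, cdr_above_threshold)."""
    prevalence = rate_per(registered_cases, population, 10_000)
    cdr = rate_per(new_cases, population, 100_000)
    return prevalence > PREVALENCE_THRESHOLD, cdr > CDR_THRESHOLD


if __name__ == "__main__":
    # Hypothetical country: 2 million inhabitants, 150 registered cases,
    # 250 newly detected cases in the reporting year.
    above_prevalence, above_cdr = exceeds_thresholds(150, 250, 2_000_000)
    print("prevalence above 1 per 10,000:", above_prevalence)
    print("CDR above 9 per 100,000:", above_cdr)
```

A country can fall below the prevalence threshold while still exceeding the CDR threshold, which is why the abstract counts countries meeting at least one of the two criteria.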
Abstract:
Synchrotron radiation X-ray tomographic microscopy is a nondestructive method providing ultra-high-resolution 3D digital images of rock microstructures. We describe this method and, to demonstrate its wide applicability, we present 3D images of very different rock types: Berea sandstone, Fontainebleau sandstone, dolomite, calcitic dolomite, and three-phase magmatic glasses. For some samples, full and partial saturation scenarios are considered using oil, water, and air. The rock images precisely reveal the 3D rock microstructure, the pore space morphology, and the interfaces between fluids saturating the same pore. We provide the raw image data sets as online supplementary material, along with laboratory data describing the rock properties. By making these data sets available to other research groups, we aim to stimulate work based on digital rock images of high quality and high resolution. We also discuss and suggest possible applications and research directions that can be pursued on the basis of our data.
Abstract:
We want to shed some light on the development of person mobility by analysing the repeated cross-sectional data of the four National Travel Surveys (NTS) that have been conducted in Germany since the mid-seventies. The above-mentioned driving forces operate on different levels of the system that generates the spatial behaviour we observe: travel demand is derived from the needs and desires of individuals to participate in spatially separated activities. Individuals organise their lives in an interactive process within the context they live in, using given infrastructure. Essential determinants of their demand are the individual's socio-demographic characteristics, but the opportunities and constraints defined by the household and the environment are also relevant for the behaviour which can ultimately be realised. In order to fully capture the context which determines individual behaviour, the (nested) hierarchy of persons within households within spatial settings has to be considered. The data we will use for our analysis contain information on these three levels. With the analysis of these micro-data we attempt to improve our understanding of the aforementioned macro developments. In addition we will investigate the predictive power of a few classic socio-demographic variables for the daily travel distance of individuals in the four NTS data sets, with a focus on the evolution of this predictive power. The additional task of correctly measuring distances travelled by means of the NTS is hampered by the fact that, although these surveys measure the same variables, different sampling designs and data collection procedures were used. So the aim of the analysis is also to detect variables whose control corrects for the known measurement error, as a prerequisite for applying appropriate models in order to better understand the development of individual travel behaviour in a multilevel context.
This task is complicated by the fact that variables that inform on survey procedures and outcomes are only provided with the data set for 2002 (see Infas and DIW Berlin, 2003).