962 results for Clustering a large document collection
Abstract:
Many classification systems rely on clustering techniques in which a collection of training examples is provided as input and a number of clusters c1,...,cm modelling some concept C results as output, such that every cluster ci is labelled as positive or negative. Given a new, unlabelled instance enew, this classification is used to determine to which particular cluster ci the new instance belongs. In such a setting clusters can overlap, and a new unlabelled instance can be assigned to more than one cluster with conflicting labels. In the literature, such a case is usually resolved non-deterministically by making a random choice. This paper presents a novel, hybrid approach to this situation that combines a neural network for classification with a defeasible argumentation framework which models preference criteria for performing clustering.
Abstract:
Zonal management in vineyards requires the prior delineation of stable yield zones within the parcel. Among the different methodologies used for zone delineation, cluster analysis of yield data from several years is one of the possibilities cited in the scientific literature. However, there are reasonable doubts concerning the cluster algorithm to be used and the number of zones to be delineated within a field. In this paper two cluster algorithms (k-means and fuzzy c-means) were compared using grape yield data from three successive years (2002, 2003 and 2004) for a ‘Pinot Noir’ vineyard parcel. The final choice of the most suitable algorithm was linked to obtaining a stable pattern of spatial yield distribution and to allowing the delineation of compact, average-sized areas. The general recommendation is to use reclassified maps of two clusters or yield classes (low yield zone and high yield zone) and, consequently, site-specific vineyard management should be based on the prior delineation of just two zones or sub-parcels. Both tested algorithms are good options for this purpose. However, the fuzzy c-means algorithm allows a better zoning of the parcel, forming more compact areas with more balanced zonal differences over time.
Abstract:
This paper presents an approach based on the saddle-point approximation to study the equilibrium interactions between small molecules and macromolecules with a large number of sites. For this case, the application of the Darwin–Fowler method results in very simple expressions for the stoichiometric equilibrium constants and their corresponding free energies in terms of integrals of the binding curve plus a correction term which depends on the first derivatives of the binding curve at the points corresponding to an integer value of the mean occupation number. These expressions are simplified when the number of sites tends to infinity, providing an interpretation of the binding curve in terms of the stoichiometric stability constants. The formalism presented is applied to some simple complexation models, obtaining good values for the free energies involved. When heterogeneous complexation is assumed, simple expressions are obtained to relate the macroscopic description of the binding, given by the stoichiometric constants, with the microscopic description in terms of the intrinsic stability constants or the affinity spectrum. © 1999 American Institute of Physics.
Abstract:
Background: Facioscapulohumeral muscular dystrophy type 1 (FSHD1) is an autosomal dominant disorder associated with the contraction of D4Z4 to fewer than 11 repeat units (RUs) on chromosome 4q35. Penetrance in the range of the largest alleles is poorly known. Our objective was to study the penetrance of FSHD1 in patients carrying alleles ranging from 6 to 10 RUs and to evaluate the influence of sex, age, and several environmental factors on clinical expression of the disease. Methods: A cross-sectional multicenter study was conducted in six French and one Swiss neuromuscular centers. Sixty-five FSHD1-affected patients carrying a 4qA allele of 6–10 RUs were identified as index cases (IC), and their 119 at-risk relatives were included. The age of onset was recorded for IC only. Medical history, neurological examination and manual muscle testing were performed for each subject. Genetic testing determined the allele size (number of RUs) and the 4qA/4qB allelic variant. The clinical status of relatives was established blind to their genetic testing results. The main outcome was the penetrance, defined as the ratio between the number of clinically affected carriers and the total number of carriers. Results: Among the relatives, 59 carried the D4Z4 contraction. Of these carrier relatives, 34 were clinically affected and 25 unaffected. Therefore, the calculated penetrance was 57% in the range of 6–10 RUs. Penetrance was estimated at 62% in the range of 6–8 RUs, and at 47% in the range of 9–10 RUs. Moreover, penetrance was lower in women than in men. There was no effect of drugs, anesthesia, surgery or trauma on the penetrance. Conclusions: Penetrance of FSHD1 is low for the largest alleles, in the range of 9–10 RUs, and lower in women than in men. This is of crucial importance for genetic counseling and clinical management of patients and families.
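The penetrance definition given in this abstract is a simple ratio, and the overall figure can be checked from the two counts it reports. The sketch below is only an illustration of that arithmetic; the function name is an assumption, and the counts are the ones stated in the abstract (34 affected among 59 carrier relatives).

```python
def penetrance(affected_carriers: int, total_carriers: int) -> float:
    """Ratio of clinically affected carriers to all carriers."""
    if total_carriers <= 0:
        raise ValueError("total_carriers must be positive")
    if not 0 <= affected_carriers <= total_carriers:
        raise ValueError("affected_carriers must lie between 0 and total_carriers")
    return affected_carriers / total_carriers


# Counts reported in the abstract for carrier relatives (6-10 RUs)
overall = penetrance(34, 59)
print(f"{overall:.3f}")  # prints 0.576
```

The ratio is about 0.576, in line with the 57% given in the abstract. The per-range figures (62% and 47%) cannot be recomputed here because the abstract does not give the underlying counts.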
Abstract:
Our newly generated murine tumor dendritic cell (MuTuDC) lines, generated from tumors developing in transgenic mice expressing the simian virus 40 large T antigen (SV40LgT) and GFP under the DC specific promoter CD11c, reproduce the phenotypic and functional properties of splenic wild type CD8α(+) conventional DCs. They have an immature phenotype with low co-stimulation molecule expression (CD40, CD70, CD80, and CD86) that is upregulated after activation with toll-like receptor ligands. We observed that after transfer into syngeneic C57BL/6 mice, MuTuDC lines were quickly rejected. Tumors grew efficiently in large T transgene-tolerant mice. To investigate the immune response toward the large T antigen that leads to rejection of the MuTuDC lines, they were genetically engineered by lentiviral transduction to express luciferase and tested for the induction of DC tumors after adoptive transfer in various gene deficient recipient mice. Here, we document that the MuTuDC line was rejected in C57BL/6 mice by a CD4 T cell help-independent, perforin-mediated CD8 T cell response to the SV40LgT without pre-activation or co-injection of adjuvants. Using depleting anti-CD8β antibodies, we were able to induce efficient tumor growth in C57BL/6 mice. These results are important for researchers who want to use the MuTuDC lines for in vivo studies.
Abstract:
The Valais resort of Crans-Montana is richly represented in photography, painting, posters and architecture. This doctoral thesis sets out to assemble a large corpus of photographs and representations: paintings, posters, postcards and reproductions of emblematic buildings (see the appended illustrated and documentary corpus). Questions of the territory's identity and its image are the guiding threads of this work, which began in 2008. A first visual collection was assembled by Dr Théodore Stephani (1868-1951), a fundamental figure in the history of the resort's birth. A physician but also a photographer, he produced a collection of more than 1300 photographs, gathered in six albums, over a period of thirty-seven years (1899-1936). The photographs of this physician, a native of Geneva and founder of what is now a tourist destination, are the starting point of this research and its common thread. The research attempts to articulate representations of the evolution of the landscape and the urbanisation of the resort around illustrious figures such as the painters Ferdinand Hodler (1853-1918) and Albert Muret (1874-1955), the writer Charles-Ferdinand Ramuz (1878-1947), and the many hoteliers and physicians who marked the history of the birth of the Haut-Plateau. The representations begin in 1896 because that is when Dr Stephani settled in Montana. The best-known architects of the first period are François-Casimir Besson (1869-1944) and Markus Burgener (1878-1953), followed by the second generation around Jean-Marie Ellenberger (1913-1988), André Perraudin (1915-2014) and André Gaillard (1921-2010).
Alongside or before them came the painters already mentioned, Ferdinand Hodler and Albert Muret, followed by René Auberjonois (1872-1957), Henri-Edouard Bercher (1877-1970), Charles-Clos Olsommer (1883-1966), Oskar Kokoschka (1886-1980), Albert Chavaz (1907-1990), Paul Monnier (1907-1982) and Hans Erni (1909-2015), all of whom belong to the cultural history of the region. Among the writers who lived in the region, we cite Elizabeth von Arnim (1866-1941) and her cousin Katherine Mansfield (1888-1923), while the work of Charles-Ferdinand Ramuz is treated at length through an interpretation of his Le Règne de l'esprit malin (1917), with a nod to Igor Stravinsky (1882-1971). We also present the films of three filmmakers inspired by the works Ramuz wrote during his stay in Lens, namely Dimitri Kirsanoff (1899-1957), Claude Goretta (1929) and Francis Reusser (1942). The concept of the "village" is addressed from the Swiss National Exhibition (1896) to the Russian investors' project at Aminona. This "village" is the second megaproject in Switzerland, after that of Andermatt. If the project is realised, the image of the resort will be profoundly transformed. In 1998, the publication of Au bord de la falaise. L'histoire entre certitudes et inquiétudes brought great visibility to the proposals of Roger Chartier, who links the study of texts to material objects and the uses they generate in society. He defines cultural history as "a cultural history of the social", whereas for Pascal Ory cultural history is "a form of social history", which amounts to almost the same thing; we adopt Ory's definition for a social history of landscape and architecture.
This work thus adopts several points of view: social history, based on interviews with many protagonists of local history, and art history, which allows a selection of emblematic objects; cultural history thus offers a transversal method for reading and linking these different perspectives across landscape, the visual arts, architecture, literature and cinema.
Abstract:
During October 23rd and 24th and November 2012 we collected a sample of drosophilids at Font Groga (Barcelona). This site is located on the foothills of the Tibidabo mountain, which is located on the northwest edge of Barcelona and at approximately 400m above sea level. The vegetation is typical for the area, and it is mainly composed of a sparse pine forest (Pinus pinea) with some oaks (Quercus ilex) and Mediterranean brushwood. Flies were netted over 12 baits containing fermenting bananas. A large proportion of D. simulans males was found. The invasive species D. suzukii (Calabria et al. 2010; Cini et al. 2012) was detected in a non-negligible quantity. Taking into account the number of males and females, the estimated Ne for D. suzukii in the Font Groga sample was 33.70. A similar value was obtained for D. subobscura (34.97). Finally, in the study of species diversity the values obtained for H" (Shannon diversity index) and J (Shannon uniformity index) were 0.678 and 0.421, respectively. These estimates are very similar to those obtained in September 2009 in Montpellier by Calabria (2012), who reported H" = 0.679 and J = 0.422, but differ from those reported by the same author in a Font Groga sample of October 2007 (H" = 0.904 and J = 0.505).
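The two diversity measures quoted in this abstract can be computed from per-species counts. The sketch below assumes the usual definitions, a Shannon index of -sum(p_i ln p_i) and an evenness of that index divided by ln S for S species, with natural logarithms; the abstract does not state the logarithm base. The function names and the counts are hypothetical and are not the Font Groga data.

```python
import math


def shannon_index(counts):
    """Shannon diversity: -sum(p * ln p) over species with non-zero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one individual")
    props = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in props)


def evenness(counts):
    """Shannon uniformity (evenness): diversity divided by ln(number of species)."""
    n_species = sum(1 for c in counts if c > 0)
    if n_species < 2:
        return 0.0  # evenness is not meaningful for a single species
    return shannon_index(counts) / math.log(n_species)


# Hypothetical counts for five species dominated by one of them
counts = [400, 60, 25, 10, 5]
print(round(shannon_index(counts), 3), round(evenness(counts), 3))
```

Under these assumed definitions, the ratio of the two reported values, 0.678 / 0.421, is about 1.61, which is close to ln 5. That would be consistent with five species in the sample, but it is an inference from the reported figures, not something the abstract states.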
Abstract:
Genome-wide association studies (GWAS) have identified more than 100 genetic variants contributing to BMI, a measure of body size, or waist-to-hip ratio (adjusted for BMI, WHRadjBMI), a measure of body shape. Body size and shape change as people grow older and these changes differ substantially between men and women. To systematically screen for age- and/or sex-specific effects of genetic variants on BMI and WHRadjBMI, we performed meta-analyses of 114 studies (up to 320,485 individuals of European descent) with genome-wide chip and/or Metabochip data by the Genetic Investigation of Anthropometric Traits (GIANT) Consortium. Each study tested the association of up to ~2.8M SNPs with BMI and WHRadjBMI in four strata (men ≤50y, men >50y, women ≤50y, women >50y) and summary statistics were combined in stratum-specific meta-analyses. We then screened for variants that showed age-specific effects (G x AGE), sex-specific effects (G x SEX) or age-specific effects that differed between men and women (G x AGE x SEX). For BMI, we identified 15 loci (11 previously established for main effects, four novel) that showed significant (FDR<5%) age-specific effects, of which 11 had larger effects in younger (<50y) than in older adults (≥50y). No sex-dependent effects were identified for BMI. For WHRadjBMI, we identified 44 loci (27 previously established for main effects, 17 novel) with sex-specific effects, of which 28 showed larger effects in women than in men, five showed larger effects in men than in women, and 11 showed opposite effects between sexes. No age-dependent effects were identified for WHRadjBMI. This is the first genome-wide interaction meta-analysis to report convincing evidence of age-dependent genetic effects on BMI. In addition, we confirm the sex-specificity of genetic effects on WHRadjBMI. These results may provide further insights into the biology that underlies weight change with age or the sexual dimorphism of body shape.
Abstract:
BACKGROUND: A major threat to the validity of longitudinal cohort studies is non-response to follow-up, which can lead to erroneous conclusions. The objective of this study was to evaluate the profile of non-responders to self-reported questionnaires in the Swiss inflammatory bowel disease (IBD) Cohort. METHODS: We used data from adult patients enrolled between November 2006 and June 2011. Responders versus non-responders were compared according to socio-demographic, clinical and psychosocial characteristics. Odds ratios for non-response to initial patient questionnaire (IPQ) compared to 1-year follow-up questionnaire (FPQ) were calculated. RESULTS: A total of 1943 patients received the IPQ, of whom 331 (17%) did not respond. Factors inversely associated with non-response to IPQ were age >50 and female gender (OR = 0.37, p < 0.001 and OR = 0.63, p = 0.003, respectively) among Crohn's disease (CD) patients, and disease duration >16 years (OR = 0.48; p = 0.025) among patients with ulcerative colitis (UC). The FPQ was sent to 1586 patients who had completed the IPQ; 263 (17%) did not respond. Risk factors for non-response to FPQ were mild depression (OR = 2.17; p = 0.003) for CD, and mild anxiety (OR = 1.83; p = 0.024) for UC. Factors inversely associated with non-response to FPQ were: age >30 years, colonic-only disease location, higher education and higher IBD-related quality of life for CD, and age >50 years or having positive social support for UC. CONCLUSIONS: Characteristics of non-responders differed between UC and CD. The risk of non-response to repetitive solicitations (longitudinal versus transversal study) seemed to decrease with age. Assessing non-respondents' characteristics is important to document potential bias in longitudinal studies.
Abstract:
This thesis focuses on the social-psychological factors that help in coping with structural disadvantage, and specifically on the role of cohesive ingroups and the sense of connectedness and efficacy they entail in this process. It aims to complement existing group-based models of coping that are grounded in a categorization perspective on groups and consequently focus exclusively on the large-scale categories made salient in intergroup contexts of comparison. The dissertation accomplishes this aim through a reconsideration of between-persons relational interdependence as a sufficient and independent antecedent of a sense of groupness, and of the benefits that a sense of group connectedness in one's direct environment, regardless of the categorical or relational basis of groupness, might have in the everyday struggles of disadvantaged group members. The three empirical papers aim to validate this approach, outlined in the first theoretical introduction, by testing derived hypotheses. They are based on data collected with youth populations (15-30) from three institutions in French-speaking Switzerland within the context of a larger project on youth transitions. Methods of data collection are paper-pencil questionnaires and in-depth interviews with a selected sub-sample of participants. The key argument of the first paper is that members of socially disadvantaged categories face higher barriers to their life project and that a general sense of connectedness, either based on categorical identities or other proximal groups and relations, mitigates the feeling of powerlessness associated with this experience. The second paper develops and tests a model that defines individual needs satisfaction as antecedent of self-group bonds and the efficacy beliefs derived from these intragroup bonds as the mechanism underlying the role of ingroups in coping.
The third paper highlights the complexities that might be associated with the construction of a sense of groupness directly from intergroup comparisons and categorization-based disadvantage, and points to a more subtle understanding of the processes underlying the emergence of groupness out of the situation of structural disadvantage. Overall, the findings confirm the central role of ingroups in coping with structural disadvantage and the importance of an understanding of groupness and its role that goes beyond the dominant focus on intergroup contexts and categorization processes.
Abstract:
The main objective of this study is to assess the potential of the information technology industry in the Saint Petersburg area to become one of the new key industries in the Russian economy. To achieve this objective, the study analyzes in particular the international competitiveness of the industry and the conditions for clustering. Russia is currently heavily dependent on its natural resources, which are the main source of its recent economic growth. In order to achieve good long-term economic performance, Russia needs to diversify its well-performing industries beyond those operating in the field of natural resources. The Russian government has acknowledged this and started special initiatives to promote other industries such as information technology and nanotechnology. Information technology is an interesting industry that is less than 20 years old in Russia and growing fast. Information technology activities and markets are mainly concentrated in Russia’s two biggest cities, Moscow and Saint Petersburg, and the areas around them. The information technology industry in the Saint Petersburg area, although smaller than that of Moscow, is especially dynamic and is attracting an increasing foreign company presence. However, the industry is not yet internationally competitive, as it lacks substantial and sustainable competitive advantages. The industry is also merely a potential global information technology cluster, as it lacks the competitive edge, a wide supplier and manufacturing base, and other related parts of the whole information technology value system. Alone, the industry will not become a key industry in Russia, but it will have an important supporting role in the development of other industries.
The information technology market in the Saint Petersburg area is already large, and if it is more tightly integrated with that of Moscow, the two will together form a huge and still growing market sufficient for most companies operating in Russia now and in the future. Therefore, the potential of information technology inside Russia is immense.