877 resultados para document clustering
Resumo:
El terme paisatge i les seves aplicacions són cada dia més utilitzats per les administracions i altres entitats com a eina de gestió del territori. Aprofitant la gran quantitat de dades en bases compatibles amb SIG (Sistemes d’Informació Geogràfica) existents a Catalunya s’ha desenvolupat una síntesi cartogràfica on s’identifiquen els Paisatges Funcionals (PF) de Catalunya, concepte que fa referència al comportament fisico-ecològic del terreny a partir de variables topogràfiques i climàtiques convenientment transformades i agregades. S’ha utilitzat un mètode semiautomàtic i iteratiu de classificació no supervisada (clustering) que permet la creació d’una llegenda jeràrquica o nivells de generalització. S’ha obtingut com a resultat el Mapa de Paisatges Funcionals de Catalunya (MPFC) amb una llegenda de 26 categories de paisatges i 5 nivells de generalització amb una resolució espacial de 180 m. Paral·lelament, s’han realitzat validacions indirectes sobre el mapa obtingut a partir dels coneixements naturalistes i la cartografia existent, així com també d’un mapa d’incertesa (aplicant lògica difusa) que aporten informació de la fiabilitat de la classificació realitzada. Els Paisatges Funcionals obtinguts permeten relacionar zones de condicions topo-climàtiques homogènies i dividir el territori en zones caracteritzades ambientalment i no políticament amb la intenció que sigui d’utilitat a l’hora de millorar la gestió dels recursos naturals i la planificació d’actuacions humanes.
Resumo:
Document de síntesi d'aquest estudi que analitza -seguint una metodologia quantitativa basada en una mostra representativa de 2.093 professors i 23.864 estudiants i reforçada amb elements qualitatius- la transició que es produeix en el sistema universitari públic català cap a un model més adaptat a les noves necessitats de la societat xarxa. Per a això, es posa especial èmfasi en l'anàlisi dels usos que es fa d'Internet (l'eina clau de la societat xarxa) en el món universitari i en les transformacions que es donen o es donaran com a conseqüència d'aquests usos.
Resumo:
OBJECTIVE: This study assessed clustering of multiple risk behaviors (i.e., low leisure-time physical activity, low fruits/vegetables intake, and high alcohol consumption) with level of cigarette consumption. METHODS: Data from the 2002 Swiss Health Survey, a population-based cross-sectional telephone survey assessing health and self-reported risk behaviors, were used. 18,005 subjects (8052 men and 9953 women) aged 25 years old or more participated. RESULTS: Smokers more frequently had low leisure time physical activity, low fruits/vegetables intake, and high alcohol consumption than non- and ex-smokers. Frequency of each risk behavior increased steadily with cigarette consumption. Clustering of risk behaviors increased with cigarette consumption in both men and women. For men, the odds ratios of multiple (> or =2) risk behaviors other than smoking, adjusted for age, nationality, and educational level, were 1.14 (95% confidence interval: 0.97, 1.33) for ex-smokers, 1.24 (0.93, 1.64) for light smokers (1-9 cigarettes/day), 1.72 (1.36, 2.17) for moderate smokers (10-19 cigarettes/day), and 3.07 (2.59, 3.64) for heavy smokers (> or =20 cigarettes/day) versus non-smokers. Similar odds ratios were found for women for corresponding groups, i.e., 1.01 (0.86, 1.19), 1.26 (1.00, 1.58), 1.62 (1.33, 1.98), and 2.75 (2.30, 3.29). CONCLUSIONS: Counseling and intervention with smokers should take into account the strong clustering of risk behaviors with level of cigarette consumption.
Resumo:
Summary : 1. Measuring health literacy in Switzerland: a review of six surveys: 1.1 Comparison of questionnaires - 1.2 Measures of health literacy in Switzerland - 1.3 Discussion of Swiss data on HL - 1.4 Description of the six surveys: 1.4.1 Current health trends and health literacy in the Swiss population (gfs-UNIVOX), 1.4.2 Nutrition, physical exercise and body weight : opinions and perceptions of the Swiss population (USI), 1.4.3 Health Literacy in Switzerland (ISPMZ), 1.4.4 Swiss Health Survey (SHS), 1.4.5 Survey of Health, Ageing and Retirement in Europe (SHARE), 1.4.6 Adult literacy and life skills survey (ALL). - 2 . Economic costs of low health literacy in Switzerland: a rough calculation. Appendix: Screenshots cost model
Resumo:
Our essay aims at studying suitable statistical methods for the clustering ofcompositional data in situations where observations are constituted by trajectories ofcompositional data, that is, by sequences of composition measurements along a domain.Observed trajectories are known as “functional data” and several methods have beenproposed for their analysis.In particular, methods for clustering functional data, known as Functional ClusterAnalysis (FCA), have been applied by practitioners and scientists in many fields. To ourknowledge, FCA techniques have not been extended to cope with the problem ofclustering compositional data trajectories. In order to extend FCA techniques to theanalysis of compositional data, FCA clustering techniques have to be adapted by using asuitable compositional algebra.The present work centres on the following question: given a sample of compositionaldata trajectories, how can we formulate a segmentation procedure giving homogeneousclasses? To address this problem we follow the steps described below.First of all we adapt the well-known spline smoothing techniques in order to cope withthe smoothing of compositional data trajectories. In fact, an observed curve can bethought of as the sum of a smooth part plus some noise due to measurement errors.Spline smoothing techniques are used to isolate the smooth part of the trajectory:clustering algorithms are then applied to these smooth curves.The second step consists in building suitable metrics for measuring the dissimilaritybetween trajectories: we propose a metric that accounts for difference in both shape andlevel, and a metric accounting for differences in shape only.A simulation study is performed in order to evaluate the proposed methodologies, usingboth hierarchical and partitional clustering algorithm. The quality of the obtained resultsis assessed by means of several indices
Resumo:
Allergic conjunctivitis (AC) is an inflammatory disease of the conjunctiva caused mainly by an IgE-mediated mechanism. It is the most common type of ocular allergy. Despite being the most benign form of conjunctivitis, AC has a considerable effect on patient quality of life, reduces work productivity, and increases health care costs. No consensus has been reached on its classification, diagnosis, or treatment. Consequently, the literature provides little information on its natural history, epidemiological data are scarce, and it is often difficult to ascertain its true morbidity. The main objective of the Consensus Document on Allergic Conjunctivitis (Documento dE Consenso sobre Conjuntivitis Alérgica [DECA]), which was drafted by an expert panel from the Spanish Society of Allergology and Spanish Society of Ophthalmology, was to reach agreement on basic criteria that could prove useful for both specialists and primary care physicians and facilitate the diagnosis, classification, and treatment of AC. This document is the first of its kind to describe and analyze aspects of AC that could make it possible to control symptoms.
Resumo:
Globalization involves several facility location problems that need to be handled at large scale. Location Allocation (LA) is a combinatorial problem in which the distance among points in the data space matter. Precisely, taking advantage of the distance property of the domain we exploit the capability of clustering techniques to partition the data space in order to convert an initial large LA problem into several simpler LA problems. Particularly, our motivation problem involves a huge geographical area that can be partitioned under overall conditions. We present different types of clustering techniques and then we perform a cluster analysis over our dataset in order to partition it. After that, we solve the LA problem applying simulated annealing algorithm to the clustered and non-clustered data in order to work out how profitable is the clustering and which of the presented methods is the most suitable
Resumo:
This paper presents and discusses the use of Bayesian procedures - introduced through the use of Bayesian networks in Part I of this series of papers - for 'learning' probabilities from data. The discussion will relate to a set of real data on characteristics of black toners commonly used in printing and copying devices. Particular attention is drawn to the incorporation of the proposed procedures as an integral part in probabilistic inference schemes (notably in the form of Bayesian networks) that are intended to address uncertainties related to particular propositions of interest (e.g., whether or not a sample originates from a particular source). The conceptual tenets of the proposed methodologies are presented along with aspects of their practical implementation using currently available Bayesian network software.
Resumo:
El document és una recopilació de tota una informació prèvia per tal derealitzar un Document Tècnic de l’Edificació que s’ha servit de guia de les actuals NTE(Normes Tecnològiques de l’Edificació), però sense posar en dubte les NTE ni substituir-les.Aquest DTE consisteix en refer la NTE-QLC-1973 (NormaTecnològica de l’Edificació – Cubiertas Lucernarios Claraboyas), que passaràanomenar-se DTE-CLC-CP / CLC-CT. (Document Tècnic de l’Edificació – CobertesLluernaris Claraboies – Claraboies de Catàleg / Claraboies Tubulars).El meu interès per la llum natural va fer que em decidís per triar el tema de lesClaraboies, i el meu estudi es base en les claraboies de catàleg, de les quals ja parlade l’actual NTE i, a més a més, el DTE incorpora un nou tipus de claraboia com són lestubulars
Resumo:
Iowa has 8 commercial service airports and 105 general aviation airports, of which three serve as reliever airports. ***NOTE*** This document is for historical viewing, the internal information is no longer current or accurate! ***NOTE*** Current information can be found at http://www.iowadot.gov/aviation/aircraftregistration/registration.aspx ***NOTE***
Resumo:
BACKGROUND: It is unknown why patients with extensive ulcerative colitis (UC) have a higher risk of colorectal cancer compared with patients with left-sided UC. This study characterizes the inflammatory processes in left-sided UC, pancolitis, and UC-associated dysplasia at the transcriptional level to identify potential biomarkers and transcripts of importance for the carcinogenic behavior of chronic inflammation. METHODS: The Affymetrix GeneChip Human Genome U133 Plus 2.0 was applied on colonic biopsies from UC patients with left-sided UC, pancolitis, dysplasia, and controls. Reverse transcription polymerase chain reaction and immunohistochemistry were performed for validating selected transcripts in the initial cohort and in 2 independent cohorts of patients with UC. Microarray data were analyzed by principal component analysis, and reverse transcription polymerase chain reaction and immunohistochemistry data by the Wilcoxon's rank-sum test. RESULTS: The principal component analysis results revealed separate clusters for left-sided UC, pancolitis, dysplasia, and controls. Close clustering of dysplastic and pancolitic samples indicated similarities in gene expression. Indeed, 101 and 656 parallel upregulated and downregulated transcripts, respectively, were identified in specimens from dysplasia and pancolitis. Validation of selected transcripts hereof identified insulin receptor alpha (INSRA) and MAP kinase interacting serine/threonine kinase 2 (MKNK2) with an enhanced expression in dysplasia compared with left-sided UC and controls, whereas laminin γ2 (LAMC2) was found with a lower expression in dysplasia compared with the remaining 3 groups. CONCLUSIONS: This study demonstrates pancolitis and left-sided UC as distinct inflammatory processes at the transcriptional level, and identifies INSRA, MKNK2, and LAMC2 as potential critical transcripts in the inflammation-driven preneoplastic process of UC.
Resumo:
BACKGROUND: Jeune asphyxiating thoracic dystrophy (JATD) is a rare, often lethal, recessively inherited chondrodysplasia characterised by shortened ribs and long bones, sometimes accompanied by polydactyly, and renal, liver and retinal disease. Mutations in intraflagellar transport (IFT) genes cause JATD, including the IFT dynein-2 motor subunit gene DYNC2H1. Genetic heterogeneity and the large DYNC2H1 gene size have hindered JATD genetic diagnosis. AIMS AND METHODS: To determine the contribution to JATD we screened DYNC2H1 in 71 JATD patients JATD patients combining SNP mapping, Sanger sequencing and exome sequencing. RESULTS AND CONCLUSIONS: We detected 34 DYNC2H1 mutations in 29/71 (41%) patients from 19/57 families (33%), showing it as a major cause of JATD especially in Northern European patients. This included 13 early protein termination mutations (nonsense/frameshift, deletion, splice site) but no patients carried these in combination, suggesting the human phenotype is at least partly hypomorphic. In addition, 21 missense mutations were distributed across DYNC2H1 and these showed some clustering to functional domains, especially the ATP motor domain. DYNC2H1 patients largely lacked significant extra-skeletal involvement, demonstrating an important genotype-phenotype correlation in JATD. Significant variability exists in the course and severity of the thoracic phenotype, both between affected siblings with identical DYNC2H1 alleles and among individuals with different alleles, which suggests the DYNC2H1 phenotype might be subject to modifier alleles, non-genetic or epigenetic factors. Assessment of fibroblasts from patients showed accumulation of anterograde IFT proteins in the ciliary tips, confirming defects similar to patients with other retrograde IFT machinery mutations, which may be of undervalued potential for diagnostic purposes.
Resumo:
BACKGROUND: Solexa/Illumina short-read ultra-high throughput DNA sequencing technology produces millions of short tags (up to 36 bases) by parallel sequencing-by-synthesis of DNA colonies. The processing and statistical analysis of such high-throughput data poses new challenges; currently a fair proportion of the tags are routinely discarded due to an inability to match them to a reference sequence, thereby reducing the effective throughput of the technology. RESULTS: We propose a novel base calling algorithm using model-based clustering and probability theory to identify ambiguous bases and code them with IUPAC symbols. We also select optimal sub-tags using a score based on information content to remove uncertain bases towards the ends of the reads. CONCLUSION: We show that the method improves genome coverage and number of usable tags as compared with Solexa's data processing pipeline by an average of 15%. An R package is provided which allows fast and accurate base calling of Solexa's fluorescence intensity files and the production of informative diagnostic plots.
Resumo:
El projecte es basa en una proposta de modificació d’una Norma Tecnològica d’Edificació sobre les instal•lacions elèctriques de baixa tensió del 1973, per tal de donar-li un caire més actual