875 results for document clustering


Relevance:

20.00% 20.00%

Publisher:

Abstract:

The aim of clustering is therefore to identify meaningful structures in data, and it is precisely from this definition that this thesis work began, offering an innovative and unexplored approach to clustering: rather than searching for the relation, it reasons about what is not a relation. Looking at a set of data, what does the absence of relation represent? It is a difficult question to pose, but one that intrinsically contains its own answer: the independence of each single datum from all the others. The search for independence among the data thus led us to a statistical approach, since independence is well described and proven in statistics. For a point in a dataset to be considered "free of links/relations", it must have the same probability of being present in every spatial element of the entire dataset. Mathematically speaking, every point P in a space S has the same probability of falling in a region R, which means that the point can RANDOMLY lie within any region of the dataset. The thesis work starts from this assumption and is divided into several parts. The second chapter analyses the state of the art of clustering in light of the growing problem of data volume: with the spread of the Internet, knowledge bases have grown exponentially both in number of attributes (dimensions) and in quantity of data (Big Data). The third chapter recalls the theoretical and statistical concepts used by the statistical algorithms implemented. The fourth chapter gives the implementation details of the algorithms, describing the various phases of investigation, the reasons behind the architectural choices, and the considerations that led to the exclusion of one of the 3 implemented versions.
In the fifth chapter, algorithms 2 and 3 are compared with some algorithms from the literature to demonstrate the potential and the problems of the developed algorithm. These tests are qualitative, since the goal of the thesis is to show how a statistical approach can prove to be a winning strategy, not to provide a new algorithm usable across the various clustering problems. The sixth chapter draws conclusions on the work done and lists possible future work from which the newly started research on statistical clustering could grow.
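The "no relation" assumption above (every point equally likely to fall in any region) can be illustrated with a simple goodness-of-fit check. The sketch below is not the thesis algorithm, which the abstract does not describe; it assumes 2D points in the unit square, an equal-area grid, and a plain chi-square statistic, and the function name `grid_chi_square` is hypothetical.

```python
import random

def grid_chi_square(points, cells_per_side):
    """Chi-square statistic of grid-cell counts against a uniform expectation.

    points: iterable of (x, y) pairs assumed to lie in the unit square.
    A small value is consistent with spatial independence; a large value
    suggests that points concentrate in some regions (i.e. structure).
    """
    k = cells_per_side
    counts = [0] * (k * k)
    n = 0
    for x, y in points:
        col = min(int(x * k), k - 1)  # clamp x == 1.0 into the last column
        row = min(int(y * k), k - 1)
        counts[row * k + col] += 1
        n += 1
    expected = n / (k * k)  # equal-area cells, so equal expected counts
    return sum((c - expected) ** 2 / expected for c in counts)

def clamp01(v):
    """Keep a coordinate inside [0, 1]."""
    return min(max(v, 0.0), 1.0)

rng = random.Random(0)  # fixed seed so the comparison is repeatable
uniform_points = [(rng.random(), rng.random()) for _ in range(1000)]
clustered_points = [
    (clamp01(rng.gauss(0.5, 0.05)), clamp01(rng.gauss(0.5, 0.05)))
    for _ in range(1000)
]

stat_uniform = grid_chi_square(uniform_points, 4)
stat_clustered = grid_chi_square(clustered_points, 4)
print(stat_uniform, stat_clustered)
```

With a 4 x 4 grid the statistic has 15 degrees of freedom under the uniform hypothesis, so the clustered sample should score far higher than the uniform one.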

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This thesis aims at investigating a new approach to document analysis based on the idea of structural patterns in XML vocabularies. My work is founded on the belief that authors do naturally converge to a reasonable use of markup languages and that extreme, yet valid instances are rare and limited. Actual documents, therefore, may be used to derive classes of elements (patterns) persisting across documents and distilling the conceptualization of the documents and their components, and may give ground for automatic tools and services that rely on no background information (such as schemas) at all. The central part of my work consists in introducing from the ground up a formal theory of eight structural patterns (with three sub-patterns) that are able to express the logical organization of any XML document, and verifying their identifiability in a number of different vocabularies. This model is characterized by and validated against three main dimensions: terseness (i.e. the ability to represent the structure of a document with a small number of objects and composition rules), coverage (i.e. the ability to capture any possible situation in any document) and expressiveness (i.e. the ability to make explicit the semantics of structures, relations and dependencies). An algorithm for the automatic recognition of structural patterns is then presented, together with an evaluation of the results of a test performed on a set of more than 1100 documents from eight very different vocabularies. This language-independent analysis confirms the ability of patterns to capture and summarize the guidelines used by the authors in their everyday practice. Finally, I present some systems that work directly on the pattern-based representation of documents. The ability of these tools to cover very different situations and contexts confirms the effectiveness of the model.
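The abstract does not enumerate the eight patterns or describe the recognition algorithm, so the following is only a sketch of the kind of schema-free evidence such an analysis could start from: for each element name, which content shapes actually occur in a document. The four labels (`mixed`, `text-only`, `elements-only`, `empty`) and the function name `content_profile` are illustrative assumptions, not the thesis's pattern names.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

def content_profile(xml_text):
    """Map each element name to the set of content shapes observed for it.

    Uses no schema: only the instances present in the document are inspected.
    """
    root = ET.fromstring(xml_text)
    profile = defaultdict(set)
    for elem in root.iter():  # includes the root element itself
        children = list(elem)
        has_children = bool(children)
        # Text directly inside the element: leading text plus child tails.
        has_text = bool((elem.text or "").strip()) or any(
            (child.tail or "").strip() for child in children
        )
        if has_text and has_children:
            kind = "mixed"
        elif has_text:
            kind = "text-only"
        elif has_children:
            kind = "elements-only"
        else:
            kind = "empty"
        profile[elem.tag].add(kind)
    return {tag: sorted(kinds) for tag, kinds in profile.items()}

sample = "<doc><sec><p>Hello <b>bold</b> world</p><br/></sec></doc>"
profile = content_profile(sample)
print(profile)
```

An element name that shows a single shape across many documents is a candidate for a stable class; one that shows several shapes would need closer inspection.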

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We have investigated the use of hierarchical clustering of flow cytometry data to classify samples of conventional central chondrosarcoma, a malignant cartilage-forming tumor of uncertain cellular origin, according to similarities with surface marker profiles of several known cell types. Human primary chondrosarcoma cells, articular chondrocytes, mesenchymal stem cells, fibroblasts, and a panel of tumor cell lines of chondrocytic or epithelial origin were clustered based on the expression profile of eleven surface markers. For clustering, eight hierarchical clustering algorithms, three distance metrics, as well as several approaches for data preprocessing, including multivariate outlier detection, logarithmic transformation, and z-score normalization, were systematically evaluated. By selecting clustering approaches shown to give reproducible results for cluster recovery of known cell types, primary conventional central chondrosarcoma cells could be grouped in two main clusters with distinctive marker expression signatures: one group clustering together with mesenchymal stem cells (CD49b-high/CD10-low/CD221-high) and a second group clustering close to fibroblasts (CD49b-low/CD10-high/CD221-low). Hierarchical clustering also revealed substantial differences between primary conventional central chondrosarcoma cells and established chondrosarcoma cell lines, with the latter not only segregating apart from primary tumor cells and normal tissue cells, but clustering together with cell lines of epithelial lineage. Our study provides a foundation for the use of hierarchical clustering applied to flow cytometry data as a powerful tool to classify samples according to marker expression patterns, which could lead to the discovery of new cancer subtypes.
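The preprocessing and clustering pipeline named above (logarithmic transformation, z-score normalization, hierarchical clustering) can be sketched in a few lines. The abstract does not say which of the eight algorithms or three distance metrics were selected, so average linkage with Euclidean distance is an assumption here, and the marker intensities are made-up illustrative values, not data from the study.

```python
import math
from statistics import mean, pstdev

def zscore_columns(rows):
    """Standardize each column (marker) to zero mean and unit variance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(rows, n_clusters):
    """Naive agglomerative clustering: repeatedly merge the two clusters
    with the smallest mean pairwise distance between their members."""
    clusters = [[i] for i in range(len(rows))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = mean([euclidean(rows[a], rows[b])
                          for a in clusters[i] for b in clusters[j]])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

# Hypothetical intensities for three markers in four samples.
raw = [
    [900.0, 20.0, 800.0],   # sample 0
    [1100.0, 25.0, 950.0],  # sample 1
    [30.0, 700.0, 40.0],    # sample 2
    [25.0, 900.0, 35.0],    # sample 3
]
logged = [[math.log10(v) for v in row] for row in raw]
groups = average_linkage(zscore_columns(logged), 2)
print(groups)
```

The quadratic pair search is fine for a handful of samples; a real analysis would use an established library implementation and compare several linkage rules, as the study did.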

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In recent years, enamel matrix derivative (EMD) has garnered much interest in the dental field for its apparent bioactivity that stimulates regeneration of periodontal tissues including periodontal ligament, cementum and alveolar bone. Despite its widespread use, the underlying cellular mechanisms remain unclear, and an understanding of its biological interactions could identify new strategies for tissue engineering. Previous in vitro research has demonstrated that EMD promotes premature osteoblast clustering at early time points. The aim of the present study was to evaluate the influence of cell clustering on vital osteoblast cell-cell communication and adhesion molecules, connexin 43 (cx43) and N-cadherin (N-cad), as assessed by immunofluorescence imaging, real-time PCR and Western blot analysis. In addition, differentiation markers of osteoblasts were quantified using alkaline phosphatase, osteocalcin and von Kossa staining. EMD significantly increased the expression of connexin 43 and N-cadherin at early time points ranging from 2 to 5 days. Protein expression was localized to cell membranes when compared to control groups. Alkaline phosphatase activity was also significantly increased on EMD-coated samples at 3, 5 and 7 days post seeding. Interestingly, higher activity was localized to cell cluster regions. There was a 3-fold increase in osteocalcin and bone sialoprotein mRNA levels for osteoblasts cultured on EMD-coated culture dishes. Moreover, EMD significantly increased extracellular mineral deposition in cell clusters as assessed through von Kossa staining at 5, 7, 10 and 14 days post seeding. We conclude that EMD up-regulates the expression of vital osteoblast cell-cell communication and adhesion molecules, which enhances the differentiation and mineralization activity of osteoblasts. These findings provide further support for the clinical evidence that EMD increases the speed and quality of new bone formation in vivo.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Three-dimensional (3D) models of teeth and soft and hard tissues are tessellated surfaces used for diagnosis, treatment planning, appliance fabrication, outcome evaluation, and research. In scientific publications or communications with colleagues, these 3D data are often reduced to 2-dimensional pictures or need special software for visualization. The portable document format (PDF) offers a simple way to interactively display 3D surface data without additional software other than a recent version of Adobe Reader (Adobe, San Jose, Calif). The purposes of this article were to give an example of how 3D data and their analyses can be interactively displayed in 3 dimensions in electronic publications, and to show how they can be exported from any software for diagnostic reports and communications among colleagues.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The European Society of Cardiology heart failure guidelines firmly recommend regular physical activity and structured exercise training (ET), but this recommendation is still poorly implemented in daily clinical practice outside specialized centres and in the real world of heart failure clinics. In reality, exercise intolerance can be successfully tackled by applying ET. We need to encourage the mindset that breathlessness may be evidence of signalling between the periphery and central haemodynamic performance and regular physical activity may ultimately bring about favourable changes in myocardial function, symptoms, functional capacity, and increased hospitalization-free life span and probably survival. In this position paper, we provide practical advice for the application of exercise in heart failure and how to overcome traditional barriers, based on the current scientific and clinical knowledge supporting the beneficial effect of this intervention.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In orthodontics, multiple site observations within patients or multiple observations collected at consecutive time points are often encountered. Clustered designs require larger sample sizes compared to individual randomized trials and special statistical analyses that account for the fact that observations within clusters are correlated. It is the purpose of this study to assess to what degree clustering effects are considered during design and data analysis in the three major orthodontic journals. The contents of the most recent 24 issues of the American Journal of Orthodontics and Dentofacial Orthopedics (AJODO), Angle Orthodontist (AO), and European Journal of Orthodontics (EJO) from December 2010 backwards were hand searched. Articles with clustering effects were identified, along with whether the authors accounted for those effects. Additionally, information was collected on: involvement of a statistician, single or multicenter study, number of authors in the publication, geographical area, and statistical significance. From the 1584 articles, after exclusions, 1062 were assessed for clustering effects, of which 250 (23.5 per cent) were considered to have clustering effects in the design (kappa = 0.92, 95 per cent CI: 0.67-0.99 for inter-rater agreement). Of the studies with clustering effects, only 63 (25.20 per cent) had indicated accounting for clustering effects. There was evidence that the studies published in the AO have higher odds of accounting for clustering effects [AO versus AJODO: odds ratio (OR) = 2.17, 95 per cent confidence interval (CI): 1.06-4.43, P = 0.03; EJO versus AJODO: OR = 1.90, 95 per cent CI: 0.84-4.24, non-significant; and EJO versus AO: OR = 1.15, 95 per cent CI: 0.57-2.33, non-significant]. The results of this study indicate that only about a quarter of the studies with clustering effects account for this in statistical data analysis.
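The statement that clustered designs need larger samples is commonly quantified with the design effect, DEFF = 1 + (m - 1) * ICC, where m is the number of observations per cluster and ICC is the intracluster correlation coefficient. The abstract does not give this formula or any ICC values, so the sketch below uses the standard formula with hypothetical inputs (m = 4, ICC = 0.5); it also rechecks the two percentages reported above.

```python
import math

def design_effect(cluster_size, icc):
    """Standard design effect for equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1.0 + (cluster_size - 1) * icc

def inflated_sample_size(n_independent, cluster_size, icc):
    """Observations needed under clustering to match the precision of
    n_independent uncorrelated observations."""
    return math.ceil(n_independent * design_effect(cluster_size, icc))

# Hypothetical example: 4 teeth measured per patient, ICC of 0.5.
deff = design_effect(4, 0.5)
needed = inflated_sample_size(100, 4, 0.5)

# Recheck of the proportions reported in the abstract.
pct_clustered = round(100 * 250 / 1062, 1)
pct_accounted = round(100 * 63 / 250, 1)
print(deff, needed, pct_clustered, pct_accounted)
```

With ICC = 0 the design effect is 1 and no inflation is needed; as the ICC approaches 1, each cluster contributes little more information than a single observation.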

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Little is known about engagement in multiple health behaviours in childhood cancer survivors.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Arterio-venous malformations (AVMs) are congenital vascular malformations (CVMs) that result from birth defects involving the vessels of both arterial and venous origins, resulting in direct communications between vessels of different sizes or a meshwork of primitive reticular networks of dysplastic minute vessels which have failed to mature to become 'capillary' vessels, termed "nidus". These lesions are defined by shunting of high velocity, low resistance flow from the arterial vasculature into the venous system in a variety of fistulous conditions. A systematic classification system developed by various groups of experts (Hamburg classification, ISSVA classification, Schobinger classification, angiographic classification of AVMs) has resulted in a better understanding of the biology and natural history of these lesions and improved management of CVMs and AVMs. The Hamburg classification, based on the embryological differentiation between extra-truncular and truncular types of lesions, allows the determination of the potential of progression and recurrence of these lesions. The majority of all AVMs are extra-truncular lesions with persistent proliferative potential, whereas truncular AVM lesions are exceedingly rare. Regardless of the type, AV shunting may ultimately result in significant anatomical, pathophysiological and hemodynamic consequences. Therefore, despite their relative rarity (10-20% of all CVMs), AVMs remain the most challenging and potentially limb or life-threatening form of vascular anomalies. The initial diagnosis and assessment may be facilitated by non- to minimally invasive investigations such as duplex ultrasound, magnetic resonance imaging (MRI), MR angiography (MRA), computerized tomography (CT) and CT angiography (CTA). Arteriography remains the diagnostic gold standard, and is required for planning subsequent treatment.
A multidisciplinary team approach should be utilized to integrate surgical and non-surgical interventions for optimum care. Currently available treatments are associated with significant risk of complications and morbidity. However, an early aggressive approach to eliminate the nidus (if present) may be undertaken if the benefits exceed the risks. Trans-arterial coil embolization or ligation of feeding arteries where the nidus is left intact are incorrect approaches and may result in proliferation of the lesion. Furthermore, such procedures would prevent future endovascular access to the lesions via the arterial route. Surgically inaccessible, infiltrating, extra-truncular AVMs can be treated with endovascular therapy as an independent modality. Among various embolo-sclerotherapy agents, ethanol sclerotherapy produces the best long term outcomes with minimum recurrence. However, this procedure requires extensive training and sufficient experience to minimize complications and associated morbidity. For the surgically accessible lesions, surgical resection may be the treatment of choice with a chance of optimal control. Preoperative sclerotherapy or embolization may supplement the subsequent surgical excision by reducing the morbidity (e.g. operative bleeding) and defining the lesion borders. Such a combined approach may provide an excellent potential for a curative result. Conclusion. AVMs are high flow congenital vascular malformations that may occur in any part of the body. The clinical presentation depends on the extent and size of the lesion and can range from an asymptomatic birthmark to congestive heart failure. Detailed investigations including duplex ultrasound, MRI/MRA and CT/CTA are required to develop an appropriate treatment plan. Appropriate management is best achieved via a multi-disciplinary approach and interventions should be undertaken by appropriately trained physicians.