4 resultados para Document Segmentation
em AMS Tesi di Dottorato - Alm@DL - Università di Bologna
Resumo:
This thesis proposes a new document model, according to which any document can be segmented in some independent components and transformed in a pattern-based projection, that only uses a very small set of objects and composition rules. The point is that such a normalized document expresses the same fundamental information of the original one, in a simple, clear and unambiguous way. The central part of my work consists of discussing that model, investigating how a digital document can be segmented, and how a segmented version can be used to implement advanced tools of conversion. I present seven patterns which are versatile enough to capture the most relevant documents’ structures, and whose minimality and rigour make that implementation possible. The abstract model is then instantiated into an actual markup language, called IML. IML is a general and extensible language, which basically adopts an XHTML syntax, able to capture a posteriori the only content of a digital document. It is compared with other languages and proposals, in order to clarify its role and objectives. Finally, I present some systems built upon these ideas. These applications are evaluated in terms of users’ advantages, workflow improvements and impact over the overall quality of the output. In particular, they cover heterogeneous content management processes: from web editing to collaboration (IsaWiki and WikiFactory), from e-learning (IsaLearning) to professional printing (IsaPress).
Resumo:
In this thesis two major topics inherent with medical ultrasound images are addressed: deconvolution and segmentation. In the first case a deconvolution algorithm is described allowing statistically consistent maximum a posteriori estimates of the tissue reflectivity to be restored. These estimates are proven to provide a reliable source of information for achieving an accurate characterization of biological tissues through the ultrasound echo. The second topic involves the definition of a semi automatic algorithm for myocardium segmentation in 2D echocardiographic images. The results show that the proposed method can reduce inter- and intra observer variability in myocardial contours delineation and is feasible and accurate even on clinical data.
Resumo:
Myocardial perfusion quantification by means of Contrast-Enhanced Cardiac Magnetic Resonance images relies on time consuming frame-by-frame manual tracing of regions of interest. In this Thesis, a novel automated technique for myocardial segmentation and non-rigid registration as a basis for perfusion quantification is presented. The proposed technique is based on three steps: reference frame selection, myocardial segmentation and non-rigid registration. In the first step, the reference frame in which both endo- and epicardial segmentation will be performed is chosen. Endocardial segmentation is achieved by means of a statistical region-based level-set technique followed by a curvature-based regularization motion. Epicardial segmentation is achieved by means of an edge-based level-set technique followed again by a regularization motion. To take into account the changes in position, size and shape of myocardium throughout the sequence due to out of plane respiratory motion, a non-rigid registration algorithm is required. The proposed non-rigid registration scheme consists in a novel multiscale extension of the normalized cross-correlation algorithm in combination with level-set methods. The myocardium is then divided into standard segments. Contrast enhancement curves are computed measuring the mean pixel intensity of each segment over time, and perfusion indices are extracted from each curve. The overall approach has been tested on synthetic and real datasets. For validation purposes, the sequences have been manually traced by an experienced interpreter, and contrast enhancement curves as well as perfusion indices have been computed. Comparisons between automatically extracted and manually obtained contours and enhancement curves showed high inter-technique agreement. Comparisons of perfusion indices computed using both approaches against quantitative coronary angiography and visual interpretation demonstrated that the two technique have similar diagnostic accuracy. In conclusion, the proposed technique allows fast, automated and accurate measurement of intra-myocardial contrast dynamics, and may thus address the strong clinical need for quantitative evaluation of myocardial perfusion.
Resumo:
This thesis aims at investigating a new approach to document analysis based on the idea of structural patterns in XML vocabularies. My work is founded on the belief that authors do naturally converge to a reasonable use of markup languages and that extreme, yet valid instances are rare and limited. Actual documents, therefore, may be used to derive classes of elements (patterns) persisting across documents and distilling the conceptualization of the documents and their components, and may give ground for automatic tools and services that rely on no background information (such as schemas) at all. The central part of my work consists in introducing from the ground up a formal theory of eight structural patterns (with three sub-patterns) that are able to express the logical organization of any XML document, and verifying their identifiability in a number of different vocabularies. This model is characterized by and validated against three main dimensions: terseness (i.e. the ability to represent the structure of a document with a small number of objects and composition rules), coverage (i.e. the ability to capture any possible situation in any document) and expressiveness (i.e. the ability to make explicit the semantics of structures, relations and dependencies). An algorithm for the automatic recognition of structural patterns is then presented, together with an evaluation of the results of a test performed on a set of more than 1100 documents from eight very different vocabularies. This language-independent analysis confirms the ability of patterns to capture and summarize the guidelines used by the authors in their everyday practice. Finally, I present some systems that work directly on the pattern-based representation of documents. The ability of these tools to cover very different situations and contexts confirms the effectiveness of the model.