7 results for Complexity score

in Helda - Digital Repository of University of Helsinki


Relevance:

20.00% 20.00%

Publisher:

Abstract:

Minimum Description Length (MDL) is an information-theoretic principle that can be used for model selection and other statistical inference tasks. There are various ways to use the principle in practice. One theoretically valid way is to use the normalized maximum likelihood (NML) criterion. Due to computational difficulties, this approach has not been used very often. This thesis presents efficient floating-point algorithms that make it possible to compute the NML for multinomial, Naive Bayes and Bayesian forest models. None of the presented algorithms rely on asymptotic analysis and with the first two model classes we also discuss how to compute exact rational number solutions.
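As an illustration of what computing the NML for a multinomial model involves, here is a minimal Python sketch of the normalizing sum in floating point. It assumes the known linear-time recurrence C(K+2, n) = C(K+1, n) + (n/K) C(K, n) from the NML literature; the abstract does not say which algorithms the thesis uses, and the function name `multinomial_nml_normalizer` is hypothetical.

```python
from math import comb


def multinomial_nml_normalizer(num_values: int, n: int) -> float:
    """Return C(K, n): the sum, over all length-n sequences on K symbols,
    of the maximized multinomial likelihood of each sequence.

    Assumption: uses the recurrence C(K+2, n) = C(K+1, n) + (n / K) * C(K, n),
    which is one published way to obtain the value, not necessarily the thesis's.
    """
    if num_values < 1 or n < 0:
        raise ValueError("need num_values >= 1 and n >= 0")
    if n == 0 or num_values == 1:
        return 1.0  # only one possible data set, with likelihood 1

    # Base case K = 2: direct binomial sum (Python evaluates 0.0 ** 0 as 1.0).
    c_two = sum(
        comb(n, h) * (h / n) ** h * ((n - h) / n) ** (n - h)
        for h in range(n + 1)
    )

    previous, current = 1.0, c_two  # C(1, n), C(2, n)
    for k in range(1, num_values - 1):
        # Step from C(k, n), C(k+1, n) to C(k+2, n).
        previous, current = current, current + (n / k) * previous
    return current
```

The NML probability of a data set is then its maximized likelihood divided by this value. Replacing the floats with `fractions.Fraction` would give exact rational results, at a higher computational cost.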

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Matrix decompositions, where a given matrix is represented as a product of two other matrices, are regularly used in data mining. Most matrix decompositions have their roots in linear algebra, but the needs of data mining are not always those of linear algebra. In data mining one needs to have results that are interpretable -- and what is considered interpretable in data mining can be very different to what is considered interpretable in linear algebra. --- The purpose of this thesis is to study matrix decompositions that directly address the issue of interpretability. An example is a decomposition of binary matrices where the factor matrices are assumed to be binary and the matrix multiplication is Boolean. The restriction to binary factor matrices increases interpretability -- factor matrices are of the same type as the original matrix -- and allows the use of Boolean matrix multiplication, which is often more intuitive than normal matrix multiplication with binary matrices. Also several other decomposition methods are described, and the computational complexity of computing them is studied together with the hardness of approximating the related optimization problems. Based on these studies, algorithms for constructing the decompositions are proposed. Constructing the decompositions turns out to be computationally hard, and the proposed algorithms are mostly based on various heuristics. Nevertheless, the algorithms are shown to be capable of finding good results in empirical experiments conducted with both synthetic and real-world data.
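To make the Boolean matrix multiplication mentioned in the abstract concrete, the following Python sketch computes the Boolean product of two binary matrices, where addition is replaced by logical OR so that 1 + 1 = 1. The function name `boolean_product` and the example matrices are illustrative assumptions, not taken from the thesis.

```python
def boolean_product(left, right):
    """Boolean product of two binary matrices given as lists of lists.

    Entry (i, j) is 1 if some k has left[i][k] = 1 and right[k][j] = 1,
    and 0 otherwise.
    """
    inner = len(right)
    if any(len(row) != inner for row in left):
        raise ValueError("inner dimensions do not match")
    num_cols = len(right[0]) if right else 0
    return [
        [
            int(any(left[i][k] and right[k][j] for k in range(inner)))
            for j in range(num_cols)
        ]
        for i in range(len(left))
    ]


# Hypothetical binary factor matrices.
B = [[1, 1],
     [0, 1]]
C = [[1, 0],
     [1, 1]]

product = boolean_product(B, C)
```

Here `product` is `[[1, 1], [1, 1]]`, which is again binary, whereas the ordinary matrix product of the same factors would have a 2 in its top-left entry. That closure under the binary type is the interpretability argument the abstract makes.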

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Oral cancer ranks among the 10 most common cancers worldwide. Since it is commonly diagnosed at a locally advanced stage, curing the cancer demands extensive tissue resection. The resulting defect is generally reconstructed with a free flap transfer. Repair of the upper aerodigestive tract with maintenance of its multiform activities is challenging. The aim of the study was to extract comprehensive treatment outcomes for patients having undergone microvascular free flap transfer because of large oral cavity or pharyngeal cancer. Ninety-four patients were analyzed for postoperative survival and complications. Forty-four patients were followed up and analyzed for functional outcome, which was determined in terms of quality of life, speech, swallowing, and intraoral sensation. Quality of life was assessed using the University of Washington Head and Neck Questionnaire. Speech was analyzed for aerodynamic parameters and for nasal acoustic energy, as well as perceptually for articulatory proficiency, voice quality, and intelligibility. Videofluorography was performed to determine the swallowing ability. Intraoral sensation was measured by moving 2-point discrimination. The 3-year overall survival was over 40%. The 1-year disease-free survival was 43%. Postoperative complications arose in over half of the patients. Flap success rate was high. Perioperative mortality varied between 2% and 11%. Unemployment and heavy drinking were the strongest predictors of survival. Sociodemographic factors were found to associate with quality of life. The global quality of life score deteriorated and did not return to the preoperative level. Significant reduction was detectable in the domains measuring chewing and speech, and in appearance and shoulder function. The basic elements necessary for normal speech were maintained. Speech intelligibility was reduced, and the reduction was related to misarticulations of the /r/ and /s/ phonemes. Deviant /r/ and /s/ persisted in most patients.
Hoarseness and hypernasality occurred infrequently. One year postoperatively, 98% of the patients had achieved oral nutrition and half of them were on a regular masticated diet. Overt and silent aspiration was encountered throughout the follow-up. At the 12-month swallow test, 44% of the patients aspirated, 70% of whom silently. Of these patients, 15% presented with pulmonary changes indicative of aspiration. Intraoral sensation weakened but was unrelated to oral functions. The results provide new data for oral reconstructions and highlight the importance of the functional outcome of the treatment for an oral cancer patient. The mouth and the pharynx encompass a unit of utmost functional complexity. Surgery should continue to make progress in this area, and methods that lead to good function should be developed. Operational outcome should always be evaluated in terms of function.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We have presented an overview of the FSIG approach and related FSIG grammars to issues of very low complexity and parsing strategy. We conclude with considerable optimism that most FSIG grammars could be decomposed in a reasonable way and then processed efficiently.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In this dissertation I study language complexity from a typological perspective. Since the structuralist era, it has been assumed that local complexity differences in languages are balanced out in cross-linguistic comparisons and that complexity is not affected by the geopolitical or sociocultural aspects of the speech community. However, these assumptions have seldom been studied systematically from a typological point of view. My objective is to define complexity so that it is possible to compare it across languages and to approach its variation with the methods of quantitative typology. My main empirical research questions are: i) does language complexity vary in any systematic way in local domains, and ii) can language complexity be affected by the geographical or social environment? These questions are studied in three articles, whose findings are summarized in the introduction to the dissertation. In order to enable cross-language comparison, I measure complexity as the description length of the regularities in an entity; I separate it from difficulty, focus on local instead of global complexity, and break it up into different types. This approach helps avoid the problems that plagued earlier metrics of language complexity. My approach to grammar is functional-typological in nature, and the theoretical framework is basic linguistic theory. I delimit the empirical research functionally to the marking of core arguments (the basic participants in the sentence). I assess the distributions of complexity in this domain with multifactorial statistical methods and use different sampling strategies, implementing, for instance, the Greenbergian view of universals as diachronic laws of type preference. My data come from large and balanced samples (up to approximately 850 languages), drawn mainly from reference grammars. 
The results suggest that various significant trends occur in the marking of core arguments in regard to complexity and that complexity in this domain correlates with population size. These results provide evidence that linguistic patterns interact among themselves in terms of complexity, that language structure adapts to the social environment, and that there may be cognitive mechanisms that limit complexity locally. My approach to complexity and language universals can therefore be successfully applied to empirical data and may serve as a model for further research in these areas.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We propose an efficient and parameter-free scoring criterion, the factorized conditional log-likelihood (f̂CLL), for learning Bayesian network classifiers. The proposed score is an approximation of the conditional log-likelihood criterion. The approximation is devised in order to guarantee decomposability over the network structure, as well as efficient estimation of the optimal parameters, achieving the same time and space complexity as the traditional log-likelihood scoring criterion. The resulting criterion has an information-theoretic interpretation based on interaction information, which exhibits its discriminative nature. To evaluate the performance of the proposed criterion, we present an empirical comparison with state-of-the-art classifiers. Results on a large suite of benchmark data sets from the UCI repository show that f̂CLL-trained classifiers achieve at least as good accuracy as the best compared classifiers, using significantly fewer computational resources.