17 results for Proximal Point Algorithm
in Helda - Digital Repository of University of Helsinki
Abstract:
The purpose of this study is to analyze and develop various forms of abduction as a means of conceptualizing processes of discovery. Abduction was originally presented by Charles S. Peirce (1839-1914) as a "weak", third main mode of inference -- besides deduction and induction -- one which, he proposed, is closely related to many kinds of cognitive processes, such as instincts, perception, practices and mediated activity in general. Both abduction and discovery are controversial issues in philosophy of science. It is often claimed that discovery cannot be a proper subject area for conceptual analysis and, accordingly, abduction cannot serve as a "logic of discovery". I argue, however, that abduction gives essential means for understanding processes of discovery although it cannot give rise to a manual or algorithm for making discoveries. In the first part of the study, I briefly present how the main trend in philosophy of science has, for a long time, been critical towards a systematic account of discovery. Various models have, however, been suggested. I outline a short history of abduction; first Peirce's evolving forms of his theory, and then later developments. Although abduction has not been a major area of research until quite recently, I review some critiques of it and look at the ways it has been analyzed, developed and used in various fields of research. Peirce's own writings and later developments, I argue, leave room for various subsequent interpretations of abduction. The second part of the study consists of six research articles. First I treat "classical" arguments against abduction as a logic of discovery. I show that by developing strategic aspects of abductive inference these arguments can be countered. Nowadays the term 'abduction' is often used as a synonym for the Inference to the Best Explanation (IBE) model. 
I argue, however, that it is useful to distinguish between IBE ("Harmanian abduction") and "Hansonian abduction"; the latter concentrating on analyzing processes of discovery. The distinctions between loveliness and likeliness, and between potential and actual explanations are more fruitful within Hansonian abduction. I clarify the nature of abduction by using Peirce's distinction between three areas of "semeiotic": grammar, critic, and methodeutic. Grammar (emphasizing "Firstnesses" and iconicity) and methodeutic (i.e., a processual approach) especially, give new means for understanding abduction. Peirce himself held a controversial view that new abductive ideas are products of an instinct and an inference at the same time. I maintain that it is beneficial to make a clear distinction between abductive inference and abductive instinct, on the basis of which both can be developed further. Besides these, I analyze abduction as a part of distributed cognition which emphasizes a long-term interaction with the material, social and cultural environment as a source for abductive ideas. This approach suggests a "trialogical" model in which inquirers are fundamentally connected both to other inquirers and to the objects of inquiry. As for the classical Meno paradox about discovery, I show that abduction provides more than one answer. As my main example of abductive methodology, I analyze the process of Ignaz Semmelweis' research on childbed fever. A central basis for abduction is the claim that discovery is not a sequence of events governed only by processes of chance. Abduction treats those processes which both constrain and instigate the search for new ideas; starting from the use of clues as a starting point for discovery, but continuing in considerations like elegance and 'loveliness'. The study then continues a Peircean-Hansonian research programme by developing abduction as a way of analyzing processes of discovery.
Abstract:
Not available
Abstract:
This thesis, which consists of an introduction and four peer-reviewed original publications, studies the problems of haplotype inference (haplotyping) and local alignment significance. The problems studied here belong to the broad area of bioinformatics and computational biology. The presented solutions are computationally fast and accurate, which makes them practical in high-throughput sequence data analysis. Haplotype inference is a computational problem where the goal is to estimate haplotypes from a sample of genotypes as accurately as possible. This problem is important because the direct measurement of haplotypes is difficult, whereas genotypes are easier to quantify. Haplotypes are key players when studying, for example, the genetic causes of diseases. In this thesis, three methods are presented for the haplotype inference problem: HaploParser, HIT, and BACH. HaploParser is based on a combinatorial mosaic model and hierarchical parsing that together mimic recombinations and point mutations in a biologically plausible way. In this mosaic model, the current population is assumed to have evolved from a small founder population. Thus, the haplotypes of the current population are recombinations of the (implicit) founder haplotypes with some point mutations. HIT (Haplotype Inference Technique) uses a hidden Markov model for haplotypes, and efficient algorithms are presented to learn this model from genotype data. The model structure of HIT is analogous to the mosaic model of HaploParser with founder haplotypes. Therefore, it can be seen as a probabilistic model of recombinations and point mutations. BACH (Bayesian Context-based Haplotyping) utilizes a context tree weighting algorithm to efficiently sum over all variable-length Markov chains to evaluate the posterior probability of a haplotype configuration. Algorithms are presented that find haplotype configurations with high posterior probability.
BACH is the most accurate method presented in this thesis and has comparable performance to the best available software for haplotype inference. Local alignment significance is a computational problem where one is interested in whether the local similarities in two sequences arise because the sequences are related or just by chance. Similarity of sequences is measured by their best local alignment score, and from that a p-value is computed. This p-value is the probability of picking two sequences from the null model whose best local alignment score is at least as good. Local alignment significance is used routinely, for example, in homology searches. In this thesis, a general framework is sketched that allows one to compute a tight upper bound for the p-value of a local pairwise alignment score. Unlike the previous methods, the presented framework is not affected by so-called edge effects and can handle gaps (deletions and insertions) without troublesome sampling and curve fitting.
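To make the p-value definition above concrete, the following minimal Python sketch computes a best local alignment score and then estimates the p-value by sampling from a null model. It is not the thesis's framework, which specifically avoids sampling; it illustrates the sampling baseline that the abstract contrasts against. The scoring values, the uniform i.i.d. null model over `ACGT`, and the function names are assumptions made for illustration.

```python
import random


def local_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Best local alignment score (Smith-Waterman with a linear gap penalty).

    The scoring parameters are illustrative assumptions.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            cur.append(max(0, diag, prev[j] + gap, cur[j - 1] + gap))
            best = max(best, cur[j])
        prev = cur
    return best


def empirical_p_value(a, b, trials=200, alphabet="ACGT", seed=0):
    """Monte Carlo estimate of P(null score >= observed score).

    Null model (an assumption): both sequences are i.i.d. uniform
    over `alphabet`, with the same lengths as the inputs.
    """
    rng = random.Random(seed)
    observed = local_alignment_score(a, b)
    hits = 0
    for _ in range(trials):
        ra = "".join(rng.choice(alphabet) for _ in a)
        rb = "".join(rng.choice(alphabet) for _ in b)
        if local_alignment_score(ra, rb) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (trials + 1)
```

The estimate cannot resolve p-values much below 1/trials, which is one reason an analytical upper bound of the kind described in the abstract is attractive.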
Abstract:
The first aim of the current study was to evaluate the survival of total hip arthroplasty (THA) in patients aged 55 years and older on a nationwide level. The second aim was to evaluate, on a nationwide basis, the geographical variation of the incidence of primary THA for primary osteoarthritis (OA) and to identify those variables that are possibly associated with this variation. The third aim was to evaluate the effects of hospital volume on the length of stay, the number of re-admissions and the number of complications of THA on a population-based level in Finland. The survival of implants was analysed based on data from the Finnish Arthroplasty Register. The incidence and hospital volume data were obtained from the Hospital Discharge Register. Cementless total hip replacements had a significantly reduced risk of revision for aseptic loosening compared with cemented hip replacements. When revision for any reason was the end point in the survival analyses, no significant differences were found between the groups. Adjusted incidence ratios of THA varied from 1.9- to 3.0-fold during the study period. Neither the average income within a region nor the morbidity index was associated with the incidence of THA. For the four categories of volume of total hip replacements performed per hospital, the length of the surgical treatment period was shorter for the highest volume group than for the lowest volume group. The odds ratio for dislocations was significantly lower in the high volume group than in the low volume group. In patients who were 55 years of age or older, the survival of cementless total hip replacements was as good as that of the cemented replacements. However, multiple wear-related revisions of the cementless cups indicate that excessive polyethylene wear was a major clinical problem with modular cementless cups. The variation in the long-term rates of survival for different cemented stems was considerable.
Cementless proximal porous-coated stems were found to be a good option for elderly patients. When hip surgery was performed with a large repertoire, the indications to perform THAs due to primary OA were tight. Socio-economic status of the patient had no apparent effect on THA rate. Specialization of hip replacements in high-volume hospitals should reduce costs by significantly shortening the length of stay, and may reduce the dislocation rate.
Abstract:
Aerosols impact the planet and our daily lives through various effects, perhaps most notably those related to their climatic and health-related consequences. While there are several primary particle sources, secondary new particle formation from precursor vapors is also known to be a frequent, global phenomenon. Nevertheless, the formation mechanism of new particles and the vapors participating in the process remain a mystery. This thesis consists of studies on new particle formation specifically from the point of view of numerical modeling. A dependence of the formation rate of 3 nm particles on the sulphuric acid concentration to the power of 1-2 has been observed. This suggests that the nucleation mechanism is of first or second order with respect to the sulphuric acid concentration, in other words, a mechanism based on activation or kinetic collision of clusters. However, model studies have had difficulties in replicating the small exponents observed in nature. The work done in this thesis indicates that the exponents may be lowered by the participation of a co-condensing (and potentially nucleating) low-volatility organic vapor, or by increasing the assumed size of the critical clusters. On the other hand, the presented new and more accurate method for determining the exponent indicates high diurnal variability. Additionally, these studies included several semi-empirical nucleation rate parameterizations as well as a detailed investigation of the analysis used to determine the apparent particle formation rate. Because they cover a large proportion of the Earth's surface area, oceans could potentially prove to be climatically significant sources of secondary particles. In the absence of marine observation data, new particle formation events in a coastal region were parameterized and studied. Since the formation mechanism is believed to be similar, the new parameterization was applied in a marine scenario.
The work showed that marine production of cloud condensation nuclei (CCN) is feasible in the presence of additional vapors contributing to particle growth. Finally, a new method to estimate concentrations of condensing organics was developed. The algorithm utilizes a Markov chain Monte Carlo method to determine the required combination of vapor concentrations by comparing a measured particle size distribution with one from an aerosol dynamics process model. The evaluation indicated excellent agreement with model data, and initial results with field data appear sound as well.
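The Markov chain Monte Carlo idea described above can be sketched in a few lines of Python. This is a toy illustration only: `toy_model` is a hypothetical stand-in for the aerosol dynamics process model, a single concentration is inferred instead of a combination of vapors, and the Gaussian misfit, the flat prior on positive values and all numeric settings are assumptions.

```python
import math
import random


def toy_model(conc):
    """Hypothetical forward model: maps one vapor concentration to
    predicted values in three size bins. NOT a real aerosol model."""
    return [conc * w for w in (0.5, 1.0, 1.5)]


def log_likelihood(conc, measured, sigma=0.1):
    """Gaussian misfit between measured and modelled size distribution."""
    predicted = toy_model(conc)
    sq_err = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    return -sq_err / (2.0 * sigma ** 2)


def metropolis(measured, n_steps=5000, step=0.1, start=1.0, seed=0):
    """Random-walk Metropolis sampler with a flat prior on conc > 0."""
    rng = random.Random(seed)
    current = start
    current_ll = log_likelihood(current, measured)
    samples = []
    for _ in range(n_steps):
        proposal = current + rng.gauss(0.0, step)
        if proposal > 0:  # proposals outside the prior support are rejected
            ll = log_likelihood(proposal, measured)
            accept_prob = math.exp(min(0.0, ll - current_ll))
            if rng.random() < accept_prob:
                current, current_ll = proposal, ll
        samples.append(current)
    return samples


# Usage: recover a known concentration from synthetic "measurements".
measured = toy_model(2.0)
samples = metropolis(measured)
posterior_mean = sum(samples[1000:]) / len(samples[1000:])
```

Discarding the first samples as burn-in, as in the usage lines, is a common convention and not something the abstract specifies.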
Abstract:
The genetic diversity of Puumala hantavirus (PUUV) was studied in a local population of its natural host, the bank vole (Myodes glareolus). The trapping area (2.5 × 2.5 km) at Konnevesi, Central Finland, included 14 trapping sites, at least 500 m apart; altogether, 147 voles were captured during May and October 2005. Partial sequences of the S, M and L viral genome segments were recovered from 40 animals. Seven, 12 and 17 variants were detected for the S, M and L sequences, respectively; these represent new wild-type PUUV strains that belong to the Finnish genetic lineage. The genetic diversity of PUUV strains from Konnevesi was 0.2-4.9% for the S segment, 0.2-4.8% for the M segment and 0.2-9.7% for the L segment. Most nucleotide substitutions were synonymous and most deduced amino acid substitutions were conservative, probably due to strong stabilizing selection operating at the protein level. Based on both sequence markers and phylogenetic clustering, the S, M and L sequences could be assigned to two groups, 'A' and 'B'. Notably, not all bank voles carried S, M and L sequences belonging to the same group, i.e. S_A M_A L_A or S_B M_B L_B. A substantial proportion (8/40, 20%) of the newly characterized PUUV strains possessed reassortant genomes such as S_B M_A L_A, S_A M_B L_B or S_B M_A L_B. These results suggest that at least some of the PUUV reassortants are viable and can survive in the presence of their parental strains.
Abstract:
The aim of this study was to evaluate and test methods which could improve local estimates of a general model fitted to a large area. In the first three studies, the intention was to divide the study area into sub-areas that were as homogeneous as possible according to the residuals of the general model, and in the fourth study, the localization was based on the local neighbourhood. According to spatial autocorrelation (SA), points closer together in space are more likely to be similar than those that are farther apart. Local indicators of SA (LISAs) test the similarity of data clusters. A LISA was calculated for every observation in the dataset, and together with the spatial position and residual of the global model, the data were segmented using two different methods: classification and regression trees (CART) and the multiresolution segmentation algorithm (MS) of the eCognition software. The general model was then re-fitted (localized) to the formed sub-areas. In kriging, the SA is modelled with a variogram, and the spatial correlation is a function of the distance (and direction) between the observation and the point of calculation. A general trend is corrected with the residual information of the neighbourhood, whose size is controlled by the number of nearest neighbours. Nearness is measured as Euclidean distance. With all methods, the root mean square errors (RMSEs) were lower, but with the methods that segmented the study area, the variation in the individual localized RMSEs was wide. Therefore, an element capable of controlling the division or localization should be included in the segmentation-localization process. Kriging, on the other hand, provided stable estimates when the number of neighbours was sufficient (over 30), thus offering the best potential for further studies. Even CART could be combined with kriging or non-parametric methods, such as most similar neighbours (MSN).
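The neighbourhood-based localization can be illustrated with a deliberately simplified Python sketch: the global model's prediction at a point is corrected by the mean residual of its k nearest observations, with nearness measured as Euclidean distance. This is not kriging, because no variogram is fitted and all neighbours are weighted equally; the function name and data layout are assumptions.

```python
import math


def localized_prediction(point, global_pred, observations, k=3):
    """Correct a global-model prediction with neighbourhood residuals.

    point        -- (x, y) location of the prediction
    global_pred  -- prediction of the general (global) model at `point`
    observations -- list of ((x, y), residual) pairs, where residual is
                    observed value minus global-model value
    k            -- number of nearest neighbours used for the correction
    """
    if not observations or k <= 0:
        return global_pred  # nothing to correct with
    # Rank observations by Euclidean distance and keep the k nearest.
    nearest = sorted(observations, key=lambda o: math.dist(point, o[0]))[:k]
    correction = sum(residual for _, residual in nearest) / len(nearest)
    return global_pred + correction
```

Kriging would replace the equal-weight mean with weights derived from the variogram, so that closer and less redundant neighbours count for more.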
Abstract:
The thesis presents a state-space model for a basketball league and a Kalman filter algorithm for the estimation of the state of the league. In the state-space model, each of the basketball teams is associated with a rating that represents its strength compared to the other teams. The ratings are assumed to evolve in time following a stochastic process with independent Gaussian increments. The estimation of the team ratings is based on the observed game scores, which are assumed to depend linearly on the true strengths of the teams and independent Gaussian noise. The team ratings are estimated using a recursive Kalman filter algorithm that produces least-squares optimal estimates for the team strengths and predictions for the scores of the future games. Additionally, if the Gaussianity assumption holds, the predictions given by the Kalman filter maximize the likelihood of the observed scores. The team ratings allow probabilistic inference about the ranking of the teams and their relative strengths as well as about the teams' winning probabilities in future games. The predictions about the winners of the games are correct 65-70% of the time. The team ratings explain 16% of the random variation observed in the game scores. Furthermore, the winning probabilities given by the model are consistent with the observed scores. The state-space model includes four independent parameters that involve the variances of noise terms and the home court advantage observed in the scores. The thesis presents the estimation of these parameters using the maximum likelihood method as well as using other techniques. The thesis also gives various example analyses related to the American professional basketball league, i.e., the National Basketball Association (NBA), and the regular seasons played from 2005 through 2010. Additionally, the season 2009-2010 is discussed in full detail, including the playoffs.
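One predict-and-update step of such a filter might look like the following Python sketch. It assumes, for illustration only, that the observation is the score margin (home score minus away score), modelled as the rating difference plus a home advantage plus Gaussian noise, and that ratings follow a random walk. The parameter names `q`, `r` and `home_adv` and their default values are hypothetical and are not the thesis's estimates.

```python
def kalman_game_update(x, P, home, away, margin, q=0.1, r=100.0, home_adv=3.0):
    """One Kalman predict + update step for a single game.

    x        -- list of current rating estimates, one per team
    P        -- covariance matrix of the ratings (list of lists)
    home     -- index of the home team
    away     -- index of the away team
    margin   -- observed home score minus away score
    q        -- random-walk variance added per step (assumed value)
    r        -- observation noise variance (assumed value)
    home_adv -- home court advantage in points (assumed value)
    """
    n = len(x)
    # Predict: ratings follow a random walk, so only the covariance grows.
    P = [[P[i][j] + (q if i == j else 0.0) for j in range(n)] for i in range(n)]
    # Observation row H has +1 at `home` and -1 at `away`; hp is the row H P.
    hp = [P[home][j] - P[away][j] for j in range(n)]
    s = hp[home] - hp[away] + r            # innovation variance H P H^T + r
    innovation = margin - (x[home] - x[away] + home_adv)
    gain = [(P[i][home] - P[i][away]) / s for i in range(n)]  # K = P H^T / s
    x_new = [x[i] + gain[i] * innovation for i in range(n)]
    P_new = [[P[i][j] - gain[i] * hp[j] for j in range(n)] for i in range(n)]
    return x_new, P_new
```

Before a game is played, the predicted margin under these assumptions is `x[home] - x[away] + home_adv`.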
Abstract:
We show that the ratio of matched individuals to blocking pairs grows linearly with the number of propose–accept rounds executed by the Gale–Shapley algorithm for the stable marriage problem. Consequently, the participants can arrive at an almost stable matching even without full information about the problem instance; for each participant, knowing only its local neighbourhood is enough. In distributed-systems parlance, this means that if each person has only a constant number of acceptable partners, an almost stable matching emerges after a constant number of synchronous communication rounds. We apply our results to give a distributed (2 + ε)-approximation algorithm for maximum-weight matching in bicoloured graphs and a centralised randomised constant-time approximation scheme for estimating the size of a stable matching.
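The effect of stopping Gale-Shapley early can be explored with a small Python sketch that runs a fixed number of synchronous propose-accept rounds and then counts the blocking pairs of the resulting matching. The "men"/"women" naming, the dictionary-based data layout and the function names are assumptions for illustration; the sketch is a centralised simulation and does not reproduce the paper's analysis or its approximation algorithms.

```python
def truncated_gale_shapley(men_prefs, women_prefs, rounds):
    """Run at most `rounds` synchronous propose-accept rounds.

    men_prefs[m] / women_prefs[w] list acceptable partners, best first.
    Returns a dict mapping each matched man to his partner.
    """
    rank = {w: {m: i for i, m in enumerate(p)} for w, p in women_prefs.items()}
    next_idx = {m: 0 for m in men_prefs}
    partner = {}     # woman -> man she currently holds
    engaged = set()  # men currently held by some woman
    for _ in range(rounds):
        proposals = {}
        for m, prefs in men_prefs.items():
            if m not in engaged and next_idx[m] < len(prefs):
                w = prefs[next_idx[m]]
                next_idx[m] += 1
                proposals.setdefault(w, []).append(m)
        if not proposals:
            break  # nobody left to propose: the matching is final
        for w, suitors in proposals.items():
            candidates = [m for m in suitors if m in rank[w]]
            if w in partner:
                candidates.append(partner[w])
            if not candidates:
                continue
            best = min(candidates, key=rank[w].__getitem__)
            if w in partner and partner[w] != best:
                engaged.discard(partner[w])  # previous partner is released
            partner[w] = best
            engaged.add(best)
    return {m: w for w, m in partner.items()}


def blocking_pairs(men_prefs, women_prefs, matching):
    """List pairs (m, w) who both prefer each other to their current state."""
    husband = {w: m for m, w in matching.items()}
    pairs = []
    for m, prefs in men_prefs.items():
        for w in prefs:
            if matching.get(m) == w:
                break  # every later woman is worse than his partner
            if m not in women_prefs[w]:
                continue
            cur = husband.get(w)
            if cur is None or women_prefs[w].index(m) < women_prefs[w].index(cur):
                pairs.append((m, w))
    return pairs
```

Comparing `len(matching)` with `len(blocking_pairs(...))` for increasing values of `rounds` gives an empirical view of the ratio discussed in the abstract.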
Abstract:
We present a distributed 2-approximation algorithm for the minimum vertex cover problem. The algorithm is deterministic, and it runs in (Δ + 1)^2 synchronous communication rounds, where Δ is the maximum degree of the graph. For Δ = 3, we give a 2-approximation algorithm also for the weighted version of the problem.
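For readers unfamiliar with the 2-approximation guarantee, the classical centralised technique is sketched below in Python: greedily build a maximal matching and take both endpoints of every matched edge. This is explicitly not the distributed algorithm of the abstract, which works in synchronous rounds without a central coordinator; it only shows why a factor of 2 is attainable. The function name is an assumption.

```python
def vertex_cover_2approx(edges):
    """Centralised 2-approximation of minimum vertex cover.

    Scans the edges once; whenever an edge has no endpoint in the cover,
    both endpoints are added. The chosen edges form a maximal matching,
    and any vertex cover must contain at least one endpoint of each
    matched edge, so the result is at most twice the optimum.
    """
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover.update((u, v))
    return cover


# Usage: a path on four vertices; the optimum cover {2, 3} has size 2.
path_edges = [(1, 2), (2, 3), (3, 4)]
cover = vertex_cover_2approx(path_edges)
```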
Abstract:
We present a local algorithm (constant-time distributed algorithm) for finding a 3-approximate vertex cover in bounded-degree graphs. The algorithm is deterministic, and no auxiliary information besides port numbering is required. (c) 2009 Elsevier B.V. All rights reserved.