12 resultados para Optimization problems

em Helda - Digital Repository of University of Helsinki


Relevância:

60.00% 60.00%

Publicador:

Resumo:

Matrix decompositions, where a given matrix is represented as a product of two other matrices, are regularly used in data mining. Most matrix decompositions have their roots in linear algebra, but the needs of data mining are not always those of linear algebra. In data mining one needs to have results that are interpretable -- and what is considered interpretable in data mining can be very different to what is considered interpretable in linear algebra. --- The purpose of this thesis is to study matrix decompositions that directly address the issue of interpretability. An example is a decomposition of binary matrices where the factor matrices are assumed to be binary and the matrix multiplication is Boolean. The restriction to binary factor matrices increases interpretability -- factor matrices are of the same type as the original matrix -- and allows the use of Boolean matrix multiplication, which is often more intuitive than normal matrix multiplication with binary matrices. Also several other decomposition methods are described, and the computational complexity of computing them is studied together with the hardness of approximating the related optimization problems. Based on these studies, algorithms for constructing the decompositions are proposed. Constructing the decompositions turns out to be computationally hard, and the proposed algorithms are mostly based on various heuristics. Nevertheless, the algorithms are shown to be capable of finding good results in empirical experiments conducted with both synthetic and real-world data.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this study I offer a diachronic solution for a number of difficult inflectional endings in Old Church Slavic nominal declensions. In this context I address the perhaps most disputed and the most important question of the Slavic nominal inflectional morphology: whether there was in Proto-Slavic an Auslautgesetz (ALG), a law of final syllables, that narrowed the Proto-Indo-European vowel */o/ to */u/ in closed word-final syllables. In addition, the work contains an exhaustive morphological classification of the nouns and adjectives that occur in canonical Old Church Slavic. I argue that Proto-Indo-European */o/ became Proto-Slavic */u/ before word-final */s/ and */N/. This conclusion is based on the impossibility of finding credible analogical (as opposed to phonological) explanations for the forms supporting the ALG hypothesis, and on the survival of the neuter gender in Slavic. It is not likely that the */o/-stem nominative singular ending */-u/ was borrowed from the accusative singular, because the latter would have been the only paradigmatic form with the stem vowel */-u-/. It is equally unlikely that the ending */-u/ was borrowed from the */u/-stems, because the latter constituted a moribund class. The usually stated motivation for such an analogical borrowing, i.e. a need to prevent the merger of */o/-stem masculines with neuters of the same class, is not tenable. Extra-Slavic, as well as intra-Slavic evidence suggests that phonologically-triggered mergers between two semantically opaque genders do not tend to be prevented, but rather that such mergers lead to the loss of the gender opposition in question. On the other hand, if */-os/ had not become */-us/, most nouns and, most importantly, all adjectives and pronouns would have lost the formal distinction between masculines and neuters. This would have necessarily resulted in the loss of the neuter gender. A new explanation is given for the most apparent piece of evidence against the ALG hypothesis, the nominative-accusative singular of the */es/-stem neuters, e.g. nebo 'sky'. I argue that it arose in late Proto-Slavic dialects, replacing regular nebe, under the influence of the */o/- and */yo/-stems where a correlation had emerged between a hard root-final consonant and the termination -o, on the one hand, and a soft root-final consonant and the termination -e, on the other.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Due to the improved prognosis of many forms of cancer, an increasing number of cancer survivors are willing to return to work after their treatment. It is generally believed, however, that people with cancer are either unemployed, stay at home, or retire more often than people without cancer. This study investigated the problems that cancer survivors experience on the labour market, as well as the disease-related, sociodemographic and psychosocial factors at work that are associated with the employment and work ability of cancer survivors. The impact of cancer on employment was studied combining the data of Finnish Cancer Registry and census data of the years 1985, 1990, 1995 or 1997 of Statistics Finland. There were two data sets containing 46 312 and 12 542 people with cancer. The results showed that cancer survivors were slightly less often employed than their referents. Two to three years after the diagnosis the employment rate of the cancer survivors was 9% lower than that of their referents (64% vs. 73%), whereas the employment rate was the same before the diagnosis (78%). The employment rate varied greatly according to the cancer type and education. The probability of being employed was greater in the lower than in the higher educational groups. People with cancer were less often employed than people without cancer mainly because of their higher retirement rate (34% vs. 27%). As well as employment, retirement varied by cancer type. The risk of retirement was twofold for people having cancer of the nervous system or people with leukaemia compared to their referents, whereas people with skin cancer, for example, did not have an increased risk of retirement. The aim of the questionnaire study was to investigate whether the work ability of cancer survivors differs from that of people without cancer and whether cancer had impaired their work ability. There were 591 cancer survivors and 757 referents in the data. Even though current work ability of cancer survivors did not differ between the survivors and their referents, 26% of cancer survivors reported that their physical work ability, and 19% that their mental work ability had deteriorated due to cancer. The survivors who had other diseases or had had chemotherapy, most often reported impaired work ability, whereas survivors with a strong commitment to their work organization, or a good social climate at work, reported impairment less frequently. The aim of the other questionnaire study containing 640 people with the history of cancer was to examine extent of social support that cancer survivors needed, and had received from their work community. The cancer survivors had received most support from their co-workers, and they hoped for more support especially from the occupational health care personnel (39% of women and 29% of men). More support was especially needed by men who had lymphoma, had received chemotherapy or had a low education level. The results of this study show that the majority of the survivors are able to return to work. There is, however, a group of cancer survivors who leave work life early, have impaired work ability due to their illness, and suffer from lack of support from their work place and the occupational health services. Treatment-related, as well as sociodemographic factors play an important role in survivors' work-related problems, and presumably their possibilities to continue working.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Forest management is facing new challenges under climate change. By adjusting thinning regimes, conventional forest management can be adapted to various objectives of utilization of forest resources, such as wood quality, forest bioenergy, and carbon sequestration. This thesis aims to develop and apply a simulation-optimization system as a tool for an interdisciplinary understanding of the interactions between wood science, forest ecology, and forest economics. In this thesis, the OptiFor software was developed for forest resources management. The OptiFor simulation-optimization system integrated the process-based growth model PipeQual, wood quality models, biomass production and carbon emission models, as well as energy wood and commercial logging models into a single optimization model. Osyczka s direct and random search algorithm was employed to identify optimal values for a set of decision variables. The numerical studies in this thesis broadened our current knowledge and understanding of the relationships between wood science, forest ecology, and forest economics. The results for timber production show that optimal thinning regimes depend on site quality and initial stand characteristics. Taking wood properties into account, our results show that increasing the intensity of thinning resulted in lower wood density and shorter fibers. The addition of nutrients accelerated volume growth, but lowered wood quality for Norway spruce. Integrating energy wood harvesting into conventional forest management showed that conventional forest management without energy wood harvesting was still superior in sparse stands of Scots pine. Energy wood from pre-commercial thinning turned out to be optimal for dense stands. When carbon balance is taken into account, our results show that changing carbon assessment methods leads to very different optimal thinning regimes and average carbon stocks. Raising the carbon price resulted in longer rotations and a higher mean annual increment, as well as a significantly higher average carbon stock over the rotation.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Phosphorus is a nutrient needed in crop production. While boosting crop yields it may also accelerate eutrophication in the surface waters receiving the phosphorus runoff. The privately optimal level of phosphorus use is determined by the input and output prices, and the crop response to phosphorus. Socially optimal use also takes into account the impact of phosphorus runoff on water quality. Increased eutrophication decreases the economic value of surface waters by Deteriorating fish stocks, curtailing the potential for recreational activities and by increasing the probabilities of mass algae blooms. In this dissertation, the optimal use of phosphorus is modelled as a dynamic optimization problem. The potentially plant available phosphorus accumulated in soil is treated as a dynamic state variable, the control variable being the annual phosphorus fertilization. For crop response to phosphorus, the state variable is more important than the annual fertilization. The level of this state variable is also a key determinant of the runoff of dissolved, reactive phosphorus. Also the loss of particulate phosphorus due to erosion is considered in the thesis, as well as its mitigation by constructing vegetative buffers. The dynamic model is applied for crop production on clay soils. At the steady state, the analysis focuses on the effects of prices, damage parameterization, discount rate and soil phosphorus carryover capacity on optimal steady state phosphorus use. The economic instruments needed to sustain the social optimum are also analyzed. According to the results the economic incentives should be conditioned on soil phosphorus values directly, rather than on annual phosphorus applications. The results also emphasize the substantial effects the differences in varying discount rates of the farmer and the social planner have on optimal instruments. The thesis analyzes the optimal soil phosphorus paths from its alternative initial levels. It also examines how erosion susceptibility of a parcel affects these optimal paths. The results underline the significance of the prevailing soil phosphorus status on optimal fertilization levels. With very high initial soil phosphorus levels, both the privately and socially optimal phosphorus application levels are close to zero as the state variable is driven towards its steady state. The soil phosphorus processes are slow. Therefore, depleting high phosphorus soils may take decades. The thesis also presents a methodologically interesting phenomenon in problems of maximizing the flow of discounted payoffs. When both the benefits and damages are related to the same state variable, the steady state solution may have an interesting property, under very general conditions: The tail of the payoffs of the privately optimal path as well as the steady state may provide a higher social welfare than the respective tail of the socially optimal path. The result is formalized and an applied to the created framework of optimal phosphorus use.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Metabolism is the cellular subsystem responsible for generation of energy from nutrients and production of building blocks for larger macromolecules. Computational and statistical modeling of metabolism is vital to many disciplines including bioengineering, the study of diseases, drug target identification, and understanding the evolution of metabolism. In this thesis, we propose efficient computational methods for metabolic modeling. The techniques presented are targeted particularly at the analysis of large metabolic models encompassing the whole metabolism of one or several organisms. We concentrate on three major themes of metabolic modeling: metabolic pathway analysis, metabolic reconstruction and the study of evolution of metabolism. In the first part of this thesis, we study metabolic pathway analysis. We propose a novel modeling framework called gapless modeling to study biochemically viable metabolic networks and pathways. In addition, we investigate the utilization of atom-level information on metabolism to improve the quality of pathway analyses. We describe efficient algorithms for discovering both gapless and atom-level metabolic pathways, and conduct experiments with large-scale metabolic networks. The presented gapless approach offers a compromise in terms of complexity and feasibility between the previous graph-theoretic and stoichiometric approaches to metabolic modeling. Gapless pathway analysis shows that microbial metabolic networks are not as robust to random damage as suggested by previous studies. Furthermore the amino acid biosynthesis pathways of the fungal species Trichoderma reesei discovered from atom-level data are shown to closely correspond to those of Saccharomyces cerevisiae. In the second part, we propose computational methods for metabolic reconstruction in the gapless modeling framework. We study the task of reconstructing a metabolic network that does not suffer from connectivity problems. Such problems often limit the usability of reconstructed models, and typically require a significant amount of manual postprocessing. We formulate gapless metabolic reconstruction as an optimization problem and propose an efficient divide-and-conquer strategy to solve it with real-world instances. We also describe computational techniques for solving problems stemming from ambiguities in metabolite naming. These techniques have been implemented in a web-based sofware ReMatch intended for reconstruction of models for 13C metabolic flux analysis. In the third part, we extend our scope from single to multiple metabolic networks and propose an algorithm for inferring gapless metabolic networks of ancestral species from phylogenetic data. Experimenting with 16 fungal species, we show that the method is able to generate results that are easily interpretable and that provide hypotheses about the evolution of metabolism.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The analysis of sequential data is required in many diverse areas such as telecommunications, stock market analysis, and bioinformatics. A basic problem related to the analysis of sequential data is the sequence segmentation problem. A sequence segmentation is a partition of the sequence into a number of non-overlapping segments that cover all data points, such that each segment is as homogeneous as possible. This problem can be solved optimally using a standard dynamic programming algorithm. In the first part of the thesis, we present a new approximation algorithm for the sequence segmentation problem. This algorithm has smaller running time than the optimal dynamic programming algorithm, while it has bounded approximation ratio. The basic idea is to divide the input sequence into subsequences, solve the problem optimally in each subsequence, and then appropriately combine the solutions to the subproblems into one final solution. In the second part of the thesis, we study alternative segmentation models that are devised to better fit the data. More specifically, we focus on clustered segmentations and segmentations with rearrangements. While in the standard segmentation of a multidimensional sequence all dimensions share the same segment boundaries, in a clustered segmentation the multidimensional sequence is segmented in such a way that dimensions are allowed to form clusters. Each cluster of dimensions is then segmented separately. We formally define the problem of clustered segmentations and we experimentally show that segmenting sequences using this segmentation model, leads to solutions with smaller error for the same model cost. Segmentation with rearrangements is a novel variation to the segmentation problem: in addition to partitioning the sequence we also seek to apply a limited amount of reordering, so that the overall representation error is minimized. We formulate the problem of segmentation with rearrangements and we show that it is an NP-hard problem to solve or even to approximate. We devise effective algorithms for the proposed problem, combining ideas from dynamic programming and outlier detection algorithms in sequences. In the final part of the thesis, we discuss the problem of aggregating results of segmentation algorithms on the same set of data points. In this case, we are interested in producing a partitioning of the data that agrees as much as possible with the input partitions. We show that this problem can be solved optimally in polynomial time using dynamic programming. Furthermore, we show that not all data points are candidates for segment boundaries in the optimal solution.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this thesis we study a series of multi-user resource-sharing problems for the Internet, which involve distribution of a common resource among participants of multi-user systems (servers or networks). We study concurrently accessible resources, which for end-users may be exclusively accessible or non-exclusively. For all kinds we suggest a separate algorithm or a modification of common reputation scheme. Every algorithm or method is studied from different perspectives: optimality of protocols, selfishness of end users, fairness of the protocol for end users. On the one hand the multifaceted analysis allows us to select the most suited protocols among a set of various available ones based on trade-offs of optima criteria. On the other hand, the future Internet predictions dictate new rules for the optimality we should take into account and new properties of the networks that cannot be neglected anymore. In this thesis we have studied new protocols for such resource-sharing problems as the backoff protocol, defense mechanisms against Denial-of-Service, fairness and confidentiality for users in overlay networks. For backoff protocol we present analysis of a general backoff scheme, where an optimization is applied to a general-view backoff function. It leads to an optimality condition for backoff protocols in both slot times and continuous time models. Additionally we present an extension for the backoff scheme in order to achieve fairness for the participants in an unfair environment, such as wireless signal strengths. Finally, for the backoff algorithm we suggest a reputation scheme that deals with misbehaving nodes. For the next problem -- denial-of-service attacks, we suggest two schemes that deal with the malicious behavior for two conditions: forged identities and unspoofed identities. For the first one we suggest a novel most-knocked-first-served algorithm, while for the latter we apply a reputation mechanism in order to restrict resource access for misbehaving nodes. Finally, we study the reputation scheme for the overlays and peer-to-peer networks, where resource is not placed on a common station, but spread across the network. The theoretical analysis suggests what behavior will be selected by the end station under such a reputation mechanism.