31 resultados para structured prediction


Relevância:

20.00% 20.00%

Publicador:

Resumo:

Learning of preference relations has recently received significant attention in machine learning community. It is closely related to the classification and regression analysis and can be reduced to these tasks. However, preference learning involves prediction of ordering of the data points rather than prediction of a single numerical value as in case of regression or a class label as in case of classification. Therefore, studying preference relations within a separate framework facilitates not only better theoretical understanding of the problem, but also motivates development of the efficient algorithms for the task. Preference learning has many applications in domains such as information retrieval, bioinformatics, natural language processing, etc. For example, algorithms that learn to rank are frequently used in search engines for ordering documents retrieved by the query. Preference learning methods have been also applied to collaborative filtering problems for predicting individual customer choices from the vast amount of user generated feedback. In this thesis we propose several algorithms for learning preference relations. These algorithms stem from well founded and robust class of regularized least-squares methods and have many attractive computational properties. In order to improve the performance of our methods, we introduce several non-linear kernel functions. Thus, contribution of this thesis is twofold: kernel functions for structured data that are used to take advantage of various non-vectorial data representations and the preference learning algorithms that are suitable for different tasks, namely efficient learning of preference relations, learning with large amount of training data, and semi-supervised preference learning. Proposed kernel-based algorithms and kernels are applied to the parse ranking task in natural language processing, document ranking in information retrieval, and remote homology detection in bioinformatics domain. Training of kernel-based ranking algorithms can be infeasible when the size of the training set is large. This problem is addressed by proposing a preference learning algorithm whose computation complexity scales linearly with the number of training data points. We also introduce sparse approximation of the algorithm that can be efficiently trained with large amount of data. For situations when small amount of labeled data but a large amount of unlabeled data is available, we propose a co-regularized preference learning algorithm. To conclude, the methods presented in this thesis address not only the problem of the efficient training of the algorithms but also fast regularization parameter selection, multiple output prediction, and cross-validation. Furthermore, proposed algorithms lead to notably better performance in many preference learning tasks considered.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The present thesis in focused on the minimization of experimental efforts for the prediction of pollutant propagation in rivers by mathematical modelling and knowledge re-use. Mathematical modelling is based on the well known advection-dispersion equation, while the knowledge re-use approach employs the methods of case based reasoning, graphical analysis and text mining. The thesis contribution to the pollutant transport research field consists of: (1) analytical and numerical models for pollutant transport prediction; (2) two novel techniques which enable the use of variable parameters along rivers in analytical models; (3) models for the estimation of pollutant transport characteristic parameters (velocity, dispersion coefficient and nutrient transformation rates) as functions of water flow, channel characteristics and/or seasonality; (4) the graphical analysis method to be used for the identification of pollution sources along rivers; (5) a case based reasoning tool for the identification of crucial information related to the pollutant transport modelling; (6) and the application of a software tool for the reuse of information during pollutants transport modelling research. These support tools are applicable in the water quality research field and in practice as well, as they can be involved in multiple activities. The models are capable of predicting pollutant propagation along rivers in case of both ordinary pollution and accidents. They can also be applied for other similar rivers in modelling of pollutant transport in rivers with low availability of experimental data concerning concentration. This is because models for parameter estimation developed in the present thesis enable the calculation of transport characteristic parameters as functions of river hydraulic parameters and/or seasonality. The similarity between rivers is assessed using case based reasoning tools, and additional necessary information can be identified by using the software for the information reuse. Such systems represent support for users and open up possibilities for new modelling methods, monitoring facilities and for better river water quality management tools. They are useful also for the estimation of environmental impact of possible technological changes and can be applied in the pre-design stage or/and in the practical use of processes as well.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The size and complexity of projects in the software development are growing very fast. At the same time, the proportion of successful projects is still quite low according to the previous research. Although almost every project's team knows main areas of responsibility which would help to finish project on time and on budget, this knowledge is rarely used in practice. So it is important to evaluate the success of existing software development projects and to suggest a method for evaluating success chances which can be used in the software development projects. The main aim of this study is to evaluate the success of projects in the selected geographical region (Russia-Ukraine-Belarus). The second aim is to compare existing models of success prediction and to determine their strengths and weaknesses. Research was done as an empirical study. A survey with structured forms and theme-based interviews were used as the data collection methods. The information gathering was done in two stages. At the first stage, project manager or someone with similar responsibilities answered the questions over Internet. At the second stage, the participant was interviewed; his or her answers were discussed and refined. It made possible to get accurate information about each project and to avoid errors. It was found out that there are many problems in the software development projects. These problems are widely known and were discussed in literature many times. The research showed that most of the projects have problems with schedule, requirements, architecture, quality, and budget. Comparison of two models of success prediction presented that The Standish Group overestimates problems in project. At the same time, McConnell's model can help to identify problems in time and avoid troubles in future. A framework for evaluating success chances in distributed projects was suggested. The framework is similar to The Standish Group model but it was customized for distributed projects.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A software development process is a predetermined sequence of steps to create a piece of software. A software development process is used, so that an implementing organization could gain significant benefits. The benefits for software development companies, that can be attributed to software process improvement efforts, are improved predictability in the development effort and improved quality software products. The implementation, maintenance, and management of a software process as well as the software process improvement efforts are expensive. Especially the implementation phase is expensive with a best case scenario of a slow return on investment. Software processes are rare in very small software development companies because of the cost of implementation and an improbable return on investment. This study presents a new method to enable benefits that are usually related to software process improvement to small companies with a low cost. The study presents reasons for the development of the method, a description of the method, and an implementation process for the method, as well as a theoretical case study of a method implementation. The study's focus is on describing the method. The theoretical use case is used to illustrate the theory of the method and the implementation process of the method. The study ends with a few conclusions on the method and on the method's implementation process. The main conclusion is that the method requires further study as well as implementation experiments to asses the value of the method.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Investment decision-making on far-reaching innovation ideas is one of the key challenges practitioners and academics face in the field of innovation management. However, the management practices and theories strongly rely on evaluation systems that do not fit in well with this setting. These systems and practices normally cannot capture the value of future opportunities under high uncertainty because they ignore the firm’s potential for growth and flexibility. Real options theory and options-based methods have been offered as a solution to facilitate decision-making on highly uncertain investment objects. Much of the uncertainty inherent in these investment objects is attributable to unknown future events. In this setting, real options theory and methods have faced some challenges. First, the theory and its applications have largely been limited to market-priced real assets. Second, the options perspective has not proved as useful as anticipated because the tools it offers are perceived to be too complicated for managerial use. Third, there are challenges related to the type of uncertainty existing real options methods can handle: they are primarily limited to parametric uncertainty. Nevertheless, the theory is considered promising in the context of far-reaching and strategically important innovation ideas. The objective of this dissertation is to clarify the potential of options-based methodology in the identification of innovation opportunities. The constructive research approach gives new insights into the development potential of real options theory under non-parametric and closeto- radical uncertainty. The distinction between real options and strategic options is presented as an explanans for the discovered limitations of the theory. The findings offer managers a new means of assessing future innovation ideas based on the frameworks constructed during the course of the study.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this thesis, a classi cation problem in predicting credit worthiness of a customer is tackled. This is done by proposing a reliable classi cation procedure on a given data set. The aim of this thesis is to design a model that gives the best classi cation accuracy to e ectively predict bankruptcy. FRPCA techniques proposed by Yang and Wang have been preferred since they are tolerant to certain type of noise in the data. These include FRPCA1, FRPCA2 and FRPCA3 from which the best method is chosen. Two di erent approaches are used at the classi cation stage: Similarity classi er and FKNN classi er. Algorithms are tested with Australian credit card screening data set. Results obtained indicate a mean classi cation accuracy of 83.22% using FRPCA1 with similarity classi- er. The FKNN approach yields a mean classi cation accuracy of 85.93% when used with FRPCA2, making it a better method for the suitable choices of the number of nearest neighbors and fuzziness parameters. Details on the calibration of the fuzziness parameter and other parameters associated with the similarity classi er are discussed.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Cyanobacteria are unicellular, non-nitrogen-fixing prokaryotes, which perform photosynthesis similarly as higher plants. The cyanobacterium Synechocystis sp. strain PCC 6803 is used as a model organism in photosynthesis research. My research described herein aims at understanding the function of the photosynthetic machinery and how it responds to changes in the environment. Detailed knowledge of the regulation of photosynthesis in cyanobacteria can be utilized for biotechnological purposes, for example in the harnessing of solar energy for biofuel production. In photosynthesis, iron participates in electron transfer. Here, we focused on iron transport in Synechocystis sp. strain PCC 6803 and particularly on the environmental regulation of the genes encoding the FutA2BC ferric iron transporter, which belongs to the ABC transporter family. A homology model built for the ATP-binding subunit FutC indicates that it has a functional ATPbinding site as well as conserved interactions with the channel-forming subunit FutB in the transporter complex. Polyamines are important for the cell proliferation, differentiation and apoptosis in prokaryotic and eukaryotic cells. In plants, polyamines have special roles in stress response and in plant survival. The polyamine metabolism in cyanobacteria in response to environmental stress is of interest in research on stress tolerance of higher plants. In this thesis, the potd gene encoding an polyamine transporter subunit from Synechocystis sp. strain PCC 6803 was characterized for the first time. A homology model built for PotD protein indicated that it has capability of binding polyamines, with the preference for spermidine. Furthermore, in order to investigate the structural features of the substrate specificity, polyamines were docked into the binding site. Spermidine was positioned very similarly in Synechocystis PotD as in the template structure and had most favorable interactions of the docked polyamines. Based on the homology model, experimental work was conducted, which confirmed the binding preference. Flavodiiron proteins (Flv) are enzymes, which protect the cell against toxicity of oxygen and/or nitric oxide by reduction. In this thesis, we present a novel type of photoprotection mechanism in cyanobacteria by the heterodimer of Flv2/Flv4. The constructed homology model of Flv2/Flv4 suggests a functional heterodimer capable of rapid electron transfer. The unknown protein sll0218, encoded by the flv2-flv4 operon, is assumed to facilitate the interaction of the Flv2/Flv4 heterodimer and energy transfer between the phycobilisome and PSII. Flv2/Flv4 provides an alternative electron transfer pathway and functions as an electron sink in PSII electron transfer.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Vaikka liiketoimintatiedon hallintaa sekä johdon päätöksentekoa on tutkittu laajasti, näiden kahden käsitteen yhteisvaikutuksesta on olemassa hyvin rajallinen määrä tutkimustietoa. Tulevaisuudessa aiheen tärkeys korostuu, sillä olemassa olevan datan määrä kasvaa jatkuvasti. Yritykset tarvitsevat jatkossa yhä enemmän kyvykkyyksiä sekä resursseja, jotta sekä strukturoitua että strukturoimatonta tietoa voidaan hyödyntää lähteestä riippumatta. Nykyiset Business Intelligence -ratkaisut mahdollistavat tehokkaan liiketoimintatiedon hallinnan osana johdon päätöksentekoa. Aiemman kirjallisuuden pohjalta, tutkimuksen empiirinen osuus tunnistaa liiketoimintatiedon hyödyntämiseen liittyviä tekijöitä, jotka joko tukevat tai rajoittavat johdon päätöksentekoprosessia. Tutkimuksen teoreettinen osuus johdattaa lukijan tutkimusaiheeseen kirjallisuuskatsauksen avulla. Keskeisimmät tutkimukseen liittyvät käsitteet, kuten Business Intelligence ja johdon päätöksenteko, esitetään relevantin kirjallisuuden avulla – tämän lisäksi myös dataan liittyvät käsitteet analysoidaan tarkasti. Tutkimuksen empiirinen osuus rakentuu tutkimusteorian pohjalta. Tutkimuksen empiirisessä osuudessa paneudutaan tutkimusteemoihin käytännön esimerkein: kolmen tapaustutkimuksen avulla tutkitaan sekä kuvataan toisistaan irrallisia tapauksia. Jokainen tapaus kuvataan sekä analysoidaan teoriaan perustuvien väitteiden avulla – nämä väitteet ovat perusedellytyksiä menestyksekkäälle liiketoimintatiedon hyödyntämiseen perustuvalle päätöksenteolle. Tapaustutkimusten avulla alkuperäistä tutkimusongelmaa voidaan analysoida tarkasti huomioiden jo olemassa oleva tutkimustieto. Analyysin tulosten avulla myös yksittäisiä rajoitteita sekä mahdollistavia tekijöitä voidaan analysoida. Tulokset osoittavat, että rajoitteilla on vahvasti negatiivinen vaikutus päätöksentekoprosessin onnistumiseen. Toisaalta yritysjohto on tietoinen liiketoimintatiedon hallintaan liittyvistä positiivisista seurauksista, vaikka kaikkia mahdollisuuksia ei olisikaan hyödynnetty. Tutkimuksen merkittävin tulos esittelee viitekehyksen, jonka puitteissa johdon päätöksentekoprosesseja voidaan arvioida sekä analysoida. Despite the fact that the literature on Business Intelligence and managerial decision-making is extensive, relatively little effort has been made to research the relationship between them. This particular field of study has become important since the amount of data in the world is growing every second. Companies require capabilities and resources in order to utilize structured data and unstructured data from internal and external data sources. However, the present Business Intelligence technologies enable managers to utilize data effectively in decision-making. Based on the prior literature, the empirical part of the thesis identifies the enablers and constraints in computer-aided managerial decision-making process. In this thesis, the theoretical part provides a preliminary understanding about the research area through a literature review. The key concepts such as Business Intelligence and managerial decision-making are explored by reviewing the relevant literature. Additionally, different data sources as well as data forms are analyzed in further detail. All key concepts are taken into account when the empirical part is carried out. The empirical part obtains an understanding of the real world situation when it comes to the themes that were covered in the theoretical part. Three selected case companies are analyzed through those statements, which are considered as critical prerequisites for successful computer-aided managerial decision-making. The case study analysis, which is a part of the empirical part, enables the researcher to examine the relationship between Business Intelligence and managerial decision-making. Based on the findings of the case study analysis, the researcher identifies the enablers and constraints through the case study interviews. The findings indicate that the constraints have a highly negative influence on the decision-making process. In addition, the managers are aware of the positive implications that Business Intelligence has for decision-making, but all possibilities are not yet utilized. As a main result of this study, a data-driven framework for managerial decision-making is introduced. This framework can be used when the managerial decision-making processes are evaluated and analyzed.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A linear prediction procedure is one of the approved numerical methods of signal processing. In the field of optical spectroscopy it is used mainly for extrapolation known parts of an optical signal in order to obtain a longer one or deduce missing signal samples. The first is needed particularly when narrowing spectral lines for the purpose of spectral information extraction. In the present paper the coherent anti-Stokes Raman scattering (CARS) spectra were under investigation. The spectra were significantly distorted by the presence of nonlinear nonresonant background. In addition, line shapes were far from Gaussian/Lorentz profiles. To overcome these disadvantages the maximum entropy method (MEM) for phase spectrum retrieval was used. The obtained broad MEM spectra were further underwent the linear prediction analysis in order to be narrowed.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This thesis studies the predictability of market switching and delisting events from OMX First North Nordic multilateral stock exchange by using financial statement information and market information from 2007 to 2012. This study was conducted by using a three stage process. In first stage relevant theoretical framework and initial variable pool were constructed. Then, explanatory analysis of the initial variable pool was done in order to further limit and identify relevant variables. The explanatory analysis was conducted by using self-organizing map methodology. In the third stage, the predictive modeling was carried out with random forests and support vector machine methodologies. It was found that the explanatory analysis was able to identify relevant variables. The results indicate that the market switching and delisting events can be predicted in some extent. The empirical results also support the usability of financial statement and market information in the prediction of market switching and delisting events.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The main objective of this master’s thesis is to examine if Weibull analysis is suitable method for warranty forecasting in the Case Company. The Case Company has used Reliasoft’s Weibull++ software, which is basing on the Weibull method, but the Company has noticed that the analysis has not given right results. This study was conducted making Weibull simulations in different profit centers of the Case Company and then comparing actual cost and forecasted cost. Simula-tions were made using different time frames and two methods for determining future deliveries. The first sub objective is to examine, which parameters of simulations will give the best result to each profit center. The second sub objective of this study is to create a simple control model for following forecasted costs and actual realized costs. The third sub objective is to document all Qlikview-parameters of profit centers. This study is a constructive research, and solutions for company’s problems are figured out in this master’s thesis. In the theory parts were introduced quality issues, for example; what is quality, quality costing and cost of poor quality. Quality is one of the major aspects in the Case Company, so understand-ing the link between quality and warranty forecasting is important. Warranty management was also introduced and other different tools for warranty forecasting. The Weibull method and its mathematical properties and reliability engineering were introduced. The main results of this master’s thesis are that the Weibull analysis forecasted too high costs, when calculating provision. Although, some forecasted values of profit centers were lower than actual values, the method works better for planning purposes. One of the reasons is that quality improving or alternatively quality decreasing is not showing in the results of the analysis in the short run. The other reason for too high values is that the products of the Case Company are com-plex and analyses were made in the profit center-level. The Weibull method was developed for standard products, but products of the Case Company consists of many complex components. According to the theory, this method was developed for homogeneous-data. So the most im-portant notification is that the analysis should be made in the product level, not the profit center level, when the data is more homogeneous.