35 results for the least squares distance method
Abstract:
The research on language equations has been active during the last few decades. Compared to equations on words, equations on languages are much more difficult to solve: even very simple equations that are easy to solve for words can be very hard for languages. In this thesis we study two such equations, namely the commutation and conjugacy equations. We study these equations in some restricted special cases and compare some of the results to the solutions of the corresponding equations on words. For both equations we study the maximal solutions, the centralizer and the conjugator. We present a fixed point method that can be used to search for these maximal solutions and analyze the reasons why this method is not successful for all languages. We also give several examples to illustrate the behaviour of this method.
Abstract:
This thesis deals with distance transforms, which are a fundamental issue in image processing and computer vision. In this thesis, two new distance transforms for gray-level images are presented, and as a new application for distance transforms, they are applied to gray-level image compression. Both new distance transforms extend the well-known distance transform algorithm developed by Rosenfeld, Pfaltz and Lay. With some modification, their algorithm, which calculates a distance transform on binary images with a chosen kernel, has been made to calculate a chessboard-like distance transform with integer numbers (DTOCS) and a real-valued distance transform (EDTOCS) on gray-level images. Both distance transforms, the DTOCS and the EDTOCS, require only two passes over the gray-level image and are extremely simple to implement. Only two image buffers are needed: the original gray-level image and the binary image that defines the region(s) of calculation. No other image buffers are needed even if more than one iteration round is performed. For large neighborhoods and complicated images the two-pass distance algorithm has to be applied to the image more than once, typically 3-10 times. Different types of kernels can be adopted. It is important to notice that no other existing transform calculates the same kind of distance map as the DTOCS. All other gray-weighted distance function algorithms (GRAYMAT etc.) find the minimum path joining two points by the smallest sum of gray levels, or weight the distance values directly by the gray levels in some manner. The DTOCS does not weight them that way: it gives a weighted version of the chessboard distance map, where the weights are not constant but the gray-value differences of the original image. The difference between the DTOCS map and other distance transforms for gray-level images is shown. The difference between the DTOCS and the EDTOCS is that the EDTOCS calculates these gray-level differences in a different way.
It propagates local Euclidean distances inside a kernel. Analytical derivations of some results concerning the DTOCS and the EDTOCS are presented. Distance transforms are commonly used for feature extraction in pattern recognition and learning; their use in image compression is very rare. This thesis introduces a new application area for distance transforms. Three new image compression algorithms based on the DTOCS and one based on the EDTOCS are presented. Control points, i.e. points that are considered fundamental for the reconstruction of the image, are selected from the gray-level image using the DTOCS and the EDTOCS. The first group of methods selects the maxima of the distance image as new control points, and the second group of methods compares the DTOCS distance to the binary-image chessboard distance. The effect of applying threshold masks of different sizes along the threshold boundaries is studied. The time complexity of the compression algorithms is analyzed both analytically and experimentally. It is shown that the time complexity of the algorithms is independent of the number of control points, i.e. the compression ratio. A new morphological image decompression scheme, the 8 kernels' method, is also presented. Several decompressed images are presented. The best results are obtained using the Delaunay triangulation. The obtained image quality equals that of the DCT images with a 4 x 4
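The two-pass propagation described above can be sketched as follows. This is a minimal illustrative reading of a DTOCS-like transform, assuming a chessboard neighbourhood and a local step cost of |gray difference| + 1; the exact kernels and cost definitions are those of the thesis and are not reproduced here.

```python
import numpy as np

def dtocs(gray, region):
    """Two-pass chamfer-style sketch of a DTOCS-like transform.

    gray:   2-D array of gray levels
    region: boolean mask; True = pixels whose distance is computed,
            False = source pixels (distance 0)
    Assumed local step cost: |gray difference| + 1 (chessboard neighbours).
    """
    h, w = gray.shape
    d = np.where(region, np.inf, 0.0)   # sources start at distance 0
    g = gray.astype(float)

    fwd = [(-1, -1), (-1, 0), (-1, 1), (0, -1)]   # forward-pass neighbours
    bwd = [(1, 1), (1, 0), (1, -1), (0, 1)]       # backward-pass neighbours

    def sweep(rows, cols, nbrs):
        for y in rows:
            for x in cols:
                for dy, dx in nbrs:
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        cand = d[ny, nx] + abs(g[y, x] - g[ny, nx]) + 1
                        if cand < d[y, x]:
                            d[y, x] = cand

    # one forward and one backward pass; as noted above, complicated
    # images may require the pair of passes to be repeated
    sweep(range(h), range(w), fwd)
    sweep(range(h - 1, -1, -1), range(w - 1, -1, -1), bwd)
    return d
```

On a flat image the gray differences vanish and the result reduces to the plain chessboard distance from the source pixels, which matches the description of the DTOCS as a weighted chessboard map.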
Abstract:
Recent years have produced great advances in instrumentation technology. The amount of available data has been increasing due to the simplicity, speed and accuracy of current spectroscopic instruments. Most of these data are, however, meaningless without proper analysis, which has been one of the reasons for the ever-growing success of multivariate handling of such data. Industrial data are commonly not designed data; in other words, there is no exact experimental design, but rather the data have been collected as a routine procedure during an industrial process. This places certain demands on the multivariate modeling, as the selection of samples and variables can have an enormous effect. Common approaches in the modeling of industrial data are PCA (principal component analysis) and PLS (projection to latent structures, or partial least squares), but there are also other methods that should be considered; the more advanced methods include multi-block modeling and nonlinear modeling. In this thesis it is shown that the results of data analysis vary according to the modeling approach used, making the selection of the modeling approach dependent on the purpose of the model. If the model is intended to provide accurate predictions, the approach should be different from the case where the purpose of modeling is mostly to obtain information about the variables and the process. For industrial applicability it is essential that the methods are robust and sufficiently simple to apply. In this way the methods and the results can be compared, and an approach selected that is suitable for the intended purpose. In this thesis, differences between data analysis methods are compared using data from different fields of industry. In the first two papers, the multi-block method is considered for data originating from the oil and fertilizer industries, and the results are compared to those from PLS and priority PLS.
The third paper considers the applicability of multivariate models to process control for a reactive crystallization process. In the fourth paper, nonlinear modeling is examined with a data set from the oil industry. The response has a nonlinear relation to the descriptor matrix, and the results are compared between linear modeling, polynomial PLS and nonlinear modeling using nonlinear score vectors.
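PCA, mentioned above as one of the common approaches, can be sketched in a few lines. This is a generic illustration via the singular value decomposition of the mean-centred data matrix, not the specific models fitted in the papers:

```python
import numpy as np

def pca_scores(X, n_components=2):
    """PCA via SVD of the mean-centred data matrix (rows = samples).

    Returns the component scores, the loadings, and the fraction of
    variance explained by each retained component.
    """
    Xc = X - X.mean(axis=0)                      # centre each variable
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:n_components].T            # projections on loadings
    explained = s**2 / np.sum(s**2)              # variance share per PC
    return scores, Vt[:n_components], explained[:n_components]
```

For undesigned industrial data of the kind discussed above, inspecting the explained-variance shares is a typical first check on how many latent components the data actually support.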
Abstract:
The purpose of the thesis is to analyze whether the returns of the general stock market indices of Estonia, Latvia and Lithuania follow the random walk hypothesis (RWH) and, in addition, whether they are consistent with the weak-form efficiency criterion. The existence of the day-of-the-week anomaly is also examined in the same regional markets. The data consist of daily closing quotes of the OMX Tallinn, Riga and Vilnius total return indices for the sample period from January 3, 2000 to August 28, 2009; the full sample period is also divided into two sub-periods. The RWH is tested by applying three quantitative methods: the Augmented Dickey-Fuller unit root test, a serial correlation test and the non-parametric runs test. Ordinary Least Squares (OLS) regression with dummy variables is employed to detect day-of-the-week anomalies. The random walk hypothesis is rejected in the Estonian and Lithuanian stock markets. The Latvian stock market exhibits more efficient behaviour, although some evidence of inefficiency is also found, mostly during the first sub-period from 2000 to 2004. Day-of-the-week anomalies are detected in every stock market examined, though no longer during the later sub-period.
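As an illustration of one of the quantitative methods mentioned, the non-parametric runs test can be sketched as follows. This is the generic textbook version of the test (signs taken relative to zero, normal approximation), not the exact configuration used in the thesis:

```python
import math

def runs_test(returns):
    """Non-parametric runs test of randomness on a return series.

    Classifies each return as non-negative or negative, counts the number
    of runs of like signs, and compares it to the expectation under
    randomness. Returns the z-statistic; |z| > 1.96 rejects randomness
    at the 5% level. Assumes both signs occur in the series.
    """
    signs = [r >= 0 for r in returns]
    n1 = sum(signs)                 # non-negative returns
    n2 = len(signs) - n1            # negative returns
    n = n1 + n2
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    mean = 2 * n1 * n2 / n + 1
    var = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n**2 * (n - 1))
    return (runs - mean) / math.sqrt(var)
```

Too many runs (large positive z) indicates mean-reverting alternation, while too few runs (large negative z) indicates trending behaviour; both are departures from a random walk.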
Abstract:
Cooling crystallization is one of the most important purification and separation techniques in the chemical and pharmaceutical industries. The product of the cooling crystallization process is always a suspension that contains both the mother liquor and the product crystals, and therefore the first process step following crystallization is usually solid-liquid separation. The properties of the produced crystals, such as their size and shape, can be affected by modifying the conditions during the crystallization process. The filtration characteristics of solid-liquid suspensions, on the other hand, are strongly influenced by the particle properties as well as by the properties of the liquid phase. It is thus obvious that the effect of changes made to the crystallization parameters can also be seen in the course of the filtration process. Although the relationship between crystallization and filtration is widely recognized, the number of publications where these unit operations have been considered in the same context seems to be surprisingly small. This thesis explores the influence of different crystallization parameters in an unseeded batch cooling crystallization process on the external appearance of the product crystals and on the pressure filtration characteristics of the obtained product suspensions. Crystallization experiments are performed by crystallizing sulphathiazole (C9H9N3O2S2), a well-known antibiotic agent, from different mixtures of water and n-propanol in an unseeded batch crystallizer. The crystallization parameters studied are the composition of the solvent, the cooling rate (in experiments carried out using a constant cooling rate throughout the whole batch), the cooling profile, and the mixing intensity during the batch.
The obtained crystals are characterized by using an automated image analyzer and the crystals are separated from the solvent through constant pressure batch filtration experiments. Separation characteristics of the suspensions are described by means of average specific cake resistance and average filter cake porosity, and the compressibilities of the cakes are also determined. The results show that fairly large differences can be observed between the size and shape of the crystals, and it is also shown experimentally that the changes in the crystal size and shape have a direct impact on the pressure filtration characteristics of the crystal suspensions. The experimental results are utilized to create a procedure that can be used for estimating the filtration characteristics of solid-liquid suspensions according to the particle size and shape data obtained by image analysis. Multilinear partial least squares regression (N-PLS) models are created between the filtration parameters and the particle size and shape data, and the results presented in this thesis show that relatively obvious correlations can be detected with the obtained models.
Abstract:
This dissertation is based on five articles which deal with the reaction mechanisms of the following industrially important organic reactions: (1) dehydrocyclization of n-butylbenzene to produce naphthalene; (2) dehydrocyclization of 1-(p-tolyl)-2-methylbutane (MB) to produce 2,6-dimethylnaphthalene; (3) esterification of neopentyl glycol (NPG) with different carboxylic acids to produce monoesters; and (4) skeletal isomerization of 1-pentene to produce 2-methyl-1-butene and 2-methyl-2-butene. The results of initial- and integral-rate experiments of n-butylbenzene dehydrocyclization over a self-made chromia/alumina catalyst were applied when investigating reaction 2. Reaction 2 was performed using commercial chromia/alumina of different acidity, platinum on silica, and vanadium/calcium/alumina as catalysts. On all catalysts used for the dehydrocyclization, the major reactions were fragmentation of MB and of 1-(p-tolyl)-2-methylbutenes (MBes), dehydrogenation of MB, double bond transfer, hydrogenation and 1,6-cyclization of MBes. Minor reactions were 1,5-cyclization of MBes and methyl group fragmentation of the 1,6-cyclization products. Esterification reactions of NPG were performed using three different carboxylic acids: propionic, isobutyric and 2-ethylhexanoic acid. Commercial heterogeneous gellular (Dowex 50WX2) and macroreticular (Amberlyst 15) resins and homogeneous para-toluenesulfonic acid were used as catalysts. At first NPG reacted with the carboxylic acids to form the corresponding monoester and water; the monoester then esterified with the carboxylic acid to form the corresponding diester; and in the disproportionation reaction two monoester molecules formed NPG and the corresponding diester. All three reactions can attain equilibrium. Concerning esterification, water was removed from the reactor in order to prevent the backward reaction. Skeletal isomerization experiments of 1-pentene were performed over an HZSM-22 catalyst.
Isomerization reactions of three different kinds were detected: double bond, cis-trans and skeletal isomerization. Minor side reactions were dimerization and fragmentation. Monomolecular and bimolecular reaction mechanisms for skeletal isomerization explained the experimental results almost equally well. Pseudohomogeneous kinetic parameters of reactions 1 and 2 were estimated by ordinary least squares fitting. For reactions 3 and 4 the kinetic parameters were also estimated by the least-squares method, but in addition the possible cross-correlation and identifiability of the parameters were determined using the Markov chain Monte Carlo (MCMC) method. Finally, using the MCMC method, the estimation of model parameters and predictions were performed according to the Bayesian paradigm. According to the fitting results, the suggested reaction mechanisms explained the experimental results rather well. When the possible cross-correlation and identifiability of the parameters (reactions 3 and 4) were determined using the MCMC method, the parameters were well identified and no pathological cross-correlation could be seen between any parameter pair.
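The MCMC approach to kinetic parameter estimation can be illustrated with a minimal random-walk Metropolis sketch. The model here (a hypothetical first-order rate law with Gaussian measurement noise), the flat prior, and all parameter values are illustrative assumptions, not the actual kinetic models of the thesis:

```python
import math
import random

def metropolis_k(ts, cs, c0, sigma, n_iter=20000, step=0.05, seed=1):
    """Random-walk Metropolis sketch for a first-order rate constant k,
    assuming c(t) = c0 * exp(-k t) and i.i.d. Gaussian noise of sd sigma.

    Returns posterior samples of k after discarding the first half
    as burn-in.
    """
    rng = random.Random(seed)

    def log_post(k):
        if k <= 0:
            return -math.inf               # flat prior restricted to k > 0
        sse = sum((c - c0 * math.exp(-k * t)) ** 2
                  for t, c in zip(ts, cs))
        return -sse / (2 * sigma**2)       # Gaussian log-likelihood

    k, lp = 0.5, log_post(0.5)             # arbitrary starting point
    samples = []
    for _ in range(n_iter):
        k_new = k + rng.gauss(0.0, step)   # symmetric proposal
        lp_new = log_post(k_new)
        if math.log(rng.random()) < lp_new - lp:
            k, lp = k_new, lp_new          # accept
        samples.append(k)
    return samples[n_iter // 2:]
```

Inspecting the spread of the samples (and, with several parameters, the pairwise scatter plots) is what reveals the identifiability and cross-correlation issues mentioned above.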
Abstract:
The background and inspiration for the present study is earlier research on applications of boundary identification in the metal industry. Effective boundary identification allows smaller safety margins and longer service intervals for the equipment in industrial high-temperature processes, without an increased risk of equipment failure. Ideally, a boundary identification method would be based on monitoring some indirect variable that can be measured routinely or at low cost. For smelting furnaces, one such variable is the temperature at different positions in the wall, which can be used as an input signal to a boundary identification method for monitoring the wall thickness of the furnace. We give the background and motivation for choosing the geometrically one-dimensional dynamic model for boundary identification, discussed in the later part of the work, over a multi-dimensional geometric description. In the industrial applications in question, the dynamics and the advantages of a simple model structure are more important than an exact geometric description. Solution methods for the so-called sideways heat equation have much in common with boundary identification. We therefore study the properties of the solutions of this equation, the influence of measurement errors and what is usually called contamination by measurement noise, regularization, and more general consequences of the ill-posedness of the sideways heat equation. We study a set of three different methods for boundary identification, of which the first two were developed from a strictly mathematical starting point and the third from a more applied one. The methods have different properties, with specific advantages and disadvantages. The purely mathematically based methods are characterized by good accuracy and low numerical cost, though at the price of low flexibility in the formulation of the partial differential equation describing the model. The third, more applied, method is characterized by poorer accuracy, caused by the higher degree of ill-posedness of its more flexible model.
For this method an error estimate was also attempted, which was later observed to agree with practical computations using the method. The study can be regarded as a good starting point and a mathematical basis for the development of industrial applications of boundary identification, especially towards the handling of nonlinear and discontinuous material properties and sudden changes caused by "falling" wall material. With the methods treated, it appears possible to achieve a robust, fast and sufficiently accurate boundary identification method of limited complexity.
Abstract:
Machine learning provides tools for the automated construction of predictive models in data-intensive areas of engineering and science. The family of regularized kernel methods has in recent years become one of the mainstream approaches to machine learning, due to a number of advantages the methods share. The approach provides theoretically well-founded solutions to the problems of under- and overfitting, allows learning from structured data, and has been empirically demonstrated to yield high predictive performance on a wide range of application domains. Historically, the problems of classification and regression have gained the majority of attention in the field. In this thesis we focus on another type of learning problem: learning to rank. In learning to rank, the aim is to learn, from a set of past observations, a ranking function that can order new objects according to how well they match some underlying criterion of goodness. As an important special case of the setting, we recover the bipartite ranking problem, corresponding to maximizing the area under the ROC curve (AUC) in binary classification. Ranking applications appear in a large variety of settings; examples encountered in this thesis include document retrieval in web search, recommender systems, information extraction and automated parsing of natural language. We consider the pairwise approach to learning to rank, where ranking models are learned by minimizing the expected probability of ranking any two randomly drawn test examples incorrectly. The development of computationally efficient kernel methods based on this approach has in the past proven to be challenging. Moreover, it is not clear which techniques for estimating the predictive performance of learned models are the most reliable in the ranking setting, and how these techniques can be implemented efficiently. The contributions of this thesis are as follows.
First, we develop RankRLS, a computationally efficient kernel method for learning to rank that is based on minimizing a regularized pairwise least-squares loss. In addition to training methods, we introduce a variety of algorithms for tasks such as model selection, multi-output learning and cross-validation, based on computational shortcuts from matrix algebra. Second, we improve the fastest known training method for the linear version of the RankSVM algorithm, which is one of the best-established methods for learning to rank. Third, we study the combination of the empirical kernel map and reduced set approximation, which allows the large-scale training of kernel machines using linear solvers, and propose computationally efficient solutions to cross-validation when using this approach. Next, we explore the problem of reliable cross-validation when using AUC as a performance criterion, through an extensive simulation study. We demonstrate that the proposed leave-pair-out cross-validation approach leads to more reliable performance estimation than commonly used alternative approaches. Finally, we present a case study on applying machine learning to information extraction from biomedical literature, which combines several of the approaches considered in the thesis. The thesis is divided into two parts: Part I provides the background for the research work and summarizes the most central results, while Part II consists of the five original research articles that are the main contribution of this thesis.
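The bipartite ranking view mentioned above, with AUC as the probability of ordering a random positive-negative pair correctly, can be made concrete with a short sketch. This is the generic pairwise definition of AUC, not the RankRLS or RankSVM implementation itself:

```python
def pairwise_auc(scores, labels):
    """AUC as the fraction of positive-negative pairs that the scorer
    orders correctly (ties count one half) -- the bipartite ranking view
    of binary classification performance.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Minimizing the expected pairwise misordering probability, as in the pairwise approach described above, is exactly maximizing this quantity in the bipartite case.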
Abstract:
People are one of the most important resources of a corporation, which therefore has to continuously seek an ever more diverse and international workforce. Inpatriation is one way of utilizing foreign expertise in a corporation. An inpatriate is a person on an international assignment at the headquarters of a corporation, sent there either from a subsidiary abroad or from a third country outside the corporation. Strengthening the social network of the inpatriate and their family contributes to the adjustment process and, furthermore, to the success of the work assignment. As social networking sites are currently the fastest-developing personal networking tools in the world, it is interesting to see how they can help in inpatriate adjustment. The objective of this thesis is to explore the potential of social networking sites (SNS) in inpatriate adjustment. The main objective can be divided into three sub-objectives: 1. What is SNS used for during the inpatriate assignment? 2. What are the inpatriates' motivations to use SNS? 3. Could the three facets of adjustment (work, interaction and general) be gained through SNS? This qualitative study utilizes the theme interview data collection method and the thematic analysis approach for analysing the interview data. In the interviews with five Indian inpatriates in Finland, the most mentioned uses of SNS were related to participating (sharing opinions and recommendations, discussing things, and connecting with friends, family and colleagues) and consuming (collecting information for work and free time); the least mentioned use of SNS was producing (posting videos, photos and updates). An interesting finding was that the five interviewees did not use SNS for purely entertainment motives at all during their assignment. This thesis found that all three facets of adjustment could potentially be gained through SNS.
Abstract:
Industrial maintenance can be executed internally, acquired from the original equipment manufacturer, or outsourced to a service provider, and this results in many different kinds of business relationships. To maximize the total value in a maintenance business relationship it is important to know what the partner values. The value of maintenance services can be considered to consist of value elements, and the perceived total value for the customer and the service provider is the sum of these value elements. The specific objectives of this thesis are to identify the most important value elements for the maintenance service customer and provider, and also to recognize where the value elements differ. The study was executed as a statistical analysis using the survey method. The data were collected by an online survey sent to 345 maintenance service professionals in Finland. In the survey, four different types of value elements were considered: the customer's highly critical and less critical items, and the service provider's core and support services. The elements most valued by the respondents were reliability, safety at work, environmental safety, and operator knowledge. The least valued elements were asset management factors and access to markets. Statistically significant differences in value elements between service types were also found. As a managerial implication, a value gap profile is presented. This Master's Thesis is part of the MaiSeMa (Industrial Maintenance Services in a Renewing Business Network: Identify, Model and Manage Value) research project, in which network decision models are created to identify, model and manage the value of maintenance services.
Abstract:
The food industry in Finland has a long tradition, and new trends point the future of the Finnish food industry towards functional and healthy food for consumers in and outside Finland. Small companies operating in this industry face many difficulties in trying to compete and expand to new markets; at the same time, these companies are key drivers of innovation and have made many breakthroughs in the food industry. It is therefore important to understand the internationalization process these companies follow and the entry strategies they use, and moreover how they use their limited resources in order to be successful in international markets. Via a case study approach, this thesis deals with the internationalization of SMEs in the Finnish food industry. This study supports earlier theories of internationalization, primarily the Uppsala model, and acknowledges internationalization as an incremental process: psychic distance is indeed the major barrier to internationalization, and the acquisition of international knowledge requires a significant amount of time, which influences the level of resource commitment in foreign markets. It follows that, due to the risks involved in foreign markets, the least resource-intensive modes of market entry, such as direct and indirect exports, are generally preferred at the start of the internationalization process. As for what explains the non-conventional rapid internationalization process, we conclude that in an internationalized industry and country with established trade flows like Finland, the context in which firms operate may be less significant than the varying level of entrepreneurial skills and confidence present therein.
Abstract:
This thesis describes work related to the in-depth characterization of the phenolic compounds of silver birch (Betula pendula) inner bark. Phenolic compounds are the most ubiquitous class of plant secondary compounds. The unifying feature of this structurally diverse group is an aromatic ring containing at least one hydroxyl group. Due to this structural diversity, phenolics have various roles in plant defense against biotic and abiotic stresses; in addition, they can confer several health-promoting properties on humans. The structural diversity of this class of compounds also makes their analysis challenging. The study species of the present work, silver birch, is economically the most important hardwood species in northern Europe. Its inner bark contains a high level of phenolic compounds and has shown one of the strongest antioxidant activities among 92 Finnish plant materials. The literature review surveys the diversity and organ-specific distribution of phenolic compounds in silver birch as well as the proposed ecological functions of phenolic compounds in nature. In addition, the basis for the characterization of phenolics by mass spectrometry (MS), nuclear magnetic resonance (NMR) spectroscopy and circular dichroism (CD) spectroscopy is reviewed. The objective of the experimental work was to extract, purify, characterize and quantify the inner bark phenolic compounds. Overall, 36 compounds were characterized by MS and ultraviolet (UV) spectroscopy. Twenty-four compounds were isolated and their structures confirmed by NMR and CD spectroscopy. Five novel natural compounds were identified. Special emphasis was placed on establishing a method for the characterization of proanthocyanidins (PAs). Hydrophilic interaction liquid chromatography (HILIC) was utilized because of its high resolution power and the predictable elution order of oligomeric and polymeric PAs according to an increasing degree of polymerization.
The combination of HILIC and high-resolution MS detection allowed the identification of procyanidin (PC) polymers up to the degree of polymerization of 22. In addition, a series of oligomeric and polymeric PC monoxylosides were observed for the first time in nature. Season and genotype influenced the quantities of the main inner bark phenolics, yet qualitative differences were not observed. However, manual wounding of the inner bark induced the production of ellagitannins (ETs) in the wounded tissues, i.e. callus. Since ETs were not detected in the intact inner bark, this finding may reflect the capacity of silver birch to exploit ellagitannins in its defense.
Abstract:
Real option valuation, in particular the fuzzy pay-off method, has proven to be useful in defining risk and visualizing imprecision of investments in various industry applications. This study examines whether the evaluation of risk and profitability for public real estate investments can be improved by using real option methodology. Firstly, the context of real option valuation in the real estate industry is examined. Further, an empirical case study is performed on 30 real estate investments of a Finnish government enterprise in order to determine whether the presently used investment analysis system can be complemented by the pay-off method. Despite challenges in the application of the pay-off method to the case company’s large investment base, real option valuation is found to create additional value and facilitate more robust risk analysis in public real estate applications.
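The fuzzy pay-off method can be illustrated with a small numerical sketch, assuming a triangular pay-off distribution over the NPV (minimum, best-guess and maximum scenarios). As a simplification, the centroid of the positive side is used here in place of the possibilistic mean used in the literature, and the integrals are computed numerically rather than in closed form:

```python
def payoff_rov(a_min, a_peak, a_max, n=20000):
    """Numerical sketch of a fuzzy pay-off style real option value:
    (weight of the positive side of the triangular pay-off distribution)
    x (centroid of the positive side, standing in for the possibilistic
    mean). a_min/a_peak/a_max are the pessimistic, best-guess and
    optimistic NPV scenarios.
    """
    def mu(x):   # triangular membership function
        if a_min <= x <= a_peak:
            return (x - a_min) / (a_peak - a_min) if a_peak > a_min else 1.0
        if a_peak < x <= a_max:
            return (a_max - x) / (a_max - a_peak)
        return 0.0

    dx = (a_max - a_min) / n
    xs = [a_min + (i + 0.5) * dx for i in range(n)]   # midpoint rule
    total = sum(mu(x) for x in xs) * dx
    pos = sum(mu(x) for x in xs if x > 0) * dx
    if pos == 0:
        return 0.0            # no positive scenarios: option value zero
    mean_pos = sum(x * mu(x) for x in xs if x > 0) * dx / pos
    return (pos / total) * mean_pos
```

A fully negative pay-off distribution gives a value of zero, capturing the option-like asymmetry that distinguishes this method from a plain expected NPV.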
Abstract:
The purpose of this thesis is to investigate whether different private equity fund characteristics have any influence on fund performance. The fund characteristics include fund type (venture capital or buyout), fund size (divided into six ranges), fund investment industry, fund sequence (first fund or follow-on fund) and investment market (US or EMEA). Fund performance is measured by the internal rate of return (IRR) and tested by cross-sectional regression analysis using the method of Ordinary Least Squares. The data comprise the performance and characteristics of 997 private equity funds between 1985 and 2008. Our findings are that fund type has an effect on fund performance: the average IRR of venture capital funds is 2.7% lower than the average IRR of buyout funds. However, we did not find any relationship between fund size and performance, or between fund sequence and performance. Funds based in the US market perform better than funds based in the EMEA market. Fund performance also differs across industries: the average IRRs of the industrial/energy, consumer-related, communications and media, and medical/health industries are higher than the average IRR of other industries.
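The cross-sectional OLS setup with dummy variables described above can be sketched as follows. The data values and variable names here are purely hypothetical illustrations, not the thesis's sample of 997 funds:

```python
import numpy as np

def ols(y, X):
    """Ordinary least squares: coefficients of y = X b + e via lstsq."""
    X1 = np.column_stack([np.ones(len(y)), X])   # prepend an intercept
    b, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return b    # [intercept, slopes...]

# Hypothetical illustration: fund IRR regressed on a venture-capital dummy
irr   = np.array([0.12, 0.15, 0.10, 0.09, 0.13, 0.08])
is_vc = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])   # 1 = venture capital
b = ols(irr, is_vc.reshape(-1, 1))
# b[1] is the average IRR difference of VC funds relative to buyout funds
```

With a single dummy regressor, the intercept equals the buyout-group mean IRR and the slope equals the VC-minus-buyout difference in means, which is exactly the kind of fund-type effect reported above.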