12 results for agglomerative clustering

in Doria (National Library of Finland DSpace Services) - National Library of Finland, Finland


Relevance:

20.00% 20.00%

Publisher:

Abstract:

The main objective of this study is to assess the potential of the information technology industry in the Saint Petersburg area to become one of the new key industries in the Russian economy. To achieve this objective, the study analyzes in particular the international competitiveness of the industry and the conditions for clustering. Russia is currently heavily dependent on its natural resources, which are the main source of its recent economic growth. To achieve good long-term economic performance, Russia needs to diversify its well-performing industries beyond those operating in the field of natural resources. The Russian government has acknowledged this and started special initiatives to promote other industries such as information technology and nanotechnology. Information technology is an interesting industry that is less than 20 years old in Russia and growing fast. Information technology activities and markets are mainly concentrated in Russia’s two biggest cities, Moscow and Saint Petersburg, and the areas around them. The information technology industry in the Saint Petersburg area, although smaller than Moscow's, is especially dynamic and is attracting a growing foreign company presence. However, the industry is not yet internationally competitive, as it lacks substantial and sustainable competitive advantages. The industry is also merely a potential global information technology cluster, as it lacks the competitive edge, a wide supplier and manufacturing base, and other related parts of the whole information technology value system. Alone, the industry will not become a key industry in Russia, but it will have an important supporting role in the development of other industries.
The information technology market in the Saint Petersburg area is already large, and if it is more tightly integrated with Moscow's, the two will together form a huge and still growing market sufficient for most companies operating in Russia now and in the future. Therefore, the potential of information technology inside Russia is immense.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Speaker diarization is the process of sorting speech according to the speaker. Diarization helps to search and retrieve what a certain speaker uttered in a meeting. Applications of diarization systems extend to domains other than meetings, for example, lectures, telephone, television, and radio. In addition, diarization enhances the performance of several speech technologies such as speaker recognition, automatic transcription, and speaker tracking. Methodologies previously used in developing diarization systems are discussed. Prior results and techniques are studied and compared. Methods such as Hidden Markov Models and Gaussian Mixture Models that are used in speaker recognition and other speech technologies are also used in speaker diarization. The objective of this thesis is to develop a speaker diarization system for the meeting domain. The experimental part of this work indicates that the zero-crossing rate can be used effectively to break the audio stream down into segments, and that adaptive Gaussian Models adequately fit short audio segments. Results show that 35 Gaussian Models and an average segment length of one second are the optimum values for building a diarization system for the tested data. Segments uttered by the same speaker are united by bottom-up clustering, using a new approach of categorizing the mixture weights.
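As a rough illustration of how a zero-crossing rate could drive segmentation, the sketch below computes the rate per frame and cuts wherever it changes sharply between consecutive frames. The abstract does not describe the thesis's actual segmentation rule, so the function names, the fixed frame length, and the jump threshold are all assumptions made for this example.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def split_on_zcr_change(samples, frame_len, threshold):
    """Return segment start indices.

    Hypothetical rule: start a new segment whenever the zero-crossing
    rate differs from the previous frame's rate by more than `threshold`.
    """
    boundaries = [0]
    previous = None
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        rate = zero_crossing_rate(samples[start:start + frame_len])
        if previous is not None and abs(rate - previous) > threshold:
            boundaries.append(start)
        previous = rate
    return boundaries
```

In a real system the frames would typically overlap and the threshold would be tuned on data; this version only shows the mechanics.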

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The present study examines the repertory of liturgical chant known as St. Petersburg Court Chant which emerged within the Imperial Court of St. Petersburg, Russia, and appeared in print in a number of revisions during the course of the 19th century, eventually to spread throughout the Russian Empire and even abroad. The study seeks answers to questions on the essence and composition of Court Chant, its history and liturgical background, and most importantly, its musical relationship to other repertories of Eastern Slavic chant. The research questions emerge from previous literary accounts of Court Chant (summarized in the Introduction), which have tended to be inaccurate and generally not based on critical research. The study is divided into eight main chapters. Chapter 1 provides a survey of the history of Eastern Slavic chant and the Imperial Court Chapel of St. Petersburg until 1917, with special emphasis on the history of singing traditional chant in polyphony, the status of the Court Chapel as a government authority, and its endeavours in publishing church music. Chapter 2 deals with the liturgical background of Eastern chant, the chant genres, and main repertories of Eastern Slavic chant. Chapter 3 concentrates on chant sources: it introduces the musical notations utilised, after which a typology of chant books is presented. The discussion continues with a survey of the sources of Court Chant and their content, the specimens selected for closer analysis, the comparative materials from other repertories, and ends with a commentary on some chant sources that have been excluded. The comparative sources include a specimen from around the beginning of the 12th century, a few manuscripts from the 17th century, and printed and manuscript chant books from the early 18th to early 20th century, covering a geographical area delimited by western Ukraine, Astrakhan, Nizhny Novgorod, and the Solovetsky Monastery.
Chapter 4 presents the approach and methods used in the subsequent analytical comparisons. After a survey of the pitch organization of Eastern Slavic chant, the customary harmonization strategy of traditional chant polyphony is examined, according to which a method for meaningful analysis of the harmony is proposed. The method is based on the observation that the harmonic framework of chant polyphony derives from the standard pitch collection of monodic chant known as the Church Gamut, specific pitches of which form eight harmonic regions that behave like the usual tonalities of major and harmonic minor. Because of the considerable quantity of comparative chant forms, computer-assisted statistical methods are applied to the analysis of chant melodies. The primary chant forms and their respective comparative forms have been pre-processed into reduced chant prototypes and divided into redactions. The analyses are carried out by measuring the formal dissimilarities of the primary chant forms of the Court Chant repertory against each comparative form, and also by measuring the reciprocal dissimilarities of all chant versions in a redaction, the results of which are subjected to agglomerative hierarchical clustering in order to find out how the chant forms relate to each other. The dissimilarities are determined by applying a metric dissimilarity function that is based on the Levenshtein Distance. Chapter 5 provides the melodic and harmonic analyses of generic chants (chants used for multiple texts of different lengths), i.e., chants for stichera samoglasny and troparia, Chapter 6 of pseudo-generic chants (chants that are used for multiple texts but with certain restrictions), i.e., chants for heirmoi, prokeimena, and three other hymns, and Chapter 7 of non-generic chants, covering nine chants that in the Court repertory are not shared by multiple texts. The results are summarized and evaluated in Chapter 8. 
Accordingly, it can be established that, contrary to previous conceptions, melodically, Court Chant is in effect a full part of the wider Eastern Slavic chant tradition. Even if it is somewhat detached from the chant versions of the Synodal square-note chant books and the local tradition of Moscow, it is particularly close to chant forms of East Ukraine and some vernacular repertories from Russia. Likewise, the harmonization strategies of Court Chant do not show significant individuality in comparison with those of the available polyphonic comparative sources, the main difference being the part-writing, which generally conforms to Western common-practice standards, whereas the deviations from these tend to be more significant in other analysed repertories of polyphonic chant. Thus, insofar as the subsequent prevalence of Court Chant is not based on its forceful dissemination by authorities (as suggested in previous literature but for which little tangible evidence could be found in Chapter 1), in the present author’s interpretation, Court Chant attained its dominance principally because musically it was considered sufficiently traditional, and as a chant body supported by the government, was conveniently available in print in serviceable harmonizations.
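The analysis pipeline described in this abstract (pairwise dissimilarities based on the Levenshtein distance, followed by agglomerative hierarchical clustering) can be sketched as follows. This is a minimal illustration only: it uses the plain Levenshtein distance rather than the study's own metric dissimilarity function, it assumes average linkage (the abstract does not name a linkage criterion), and it treats each chant prototype as a string of pitch letters, which is a hypothetical encoding.

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    previous = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        current = [i]
        for j, y in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,             # delete x
                current[j - 1] + 1,          # insert y
                previous[j - 1] + (x != y),  # substitute (free if equal)
            ))
        previous = current
    return previous[-1]


def agglomerate(items, distance):
    """Average-linkage agglomerative clustering.

    Returns the merge history as (cluster_a, cluster_b, linkage_distance)
    tuples, where clusters are sorted lists of item indices.
    """
    clusters = [[i] for i in range(len(items))]
    merges = []
    while len(clusters) > 1:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                pairs = [(i, j) for i in clusters[p] for j in clusters[q]]
                d = sum(distance(items[i], items[j]) for i, j in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, p, q)
        d, p, q = best
        merges.append((sorted(clusters[p]), sorted(clusters[q]), d))
        merged = clusters[p] + clusters[q]
        clusters = [c for k, c in enumerate(clusters) if k not in (p, q)] + [merged]
    return merges


# Hypothetical reduced prototypes encoded as pitch-letter strings.
history = agglomerate(["cdefg", "cdeff", "gabcd"], levenshtein)
```

The merge history is the information a dendrogram displays: forms that merge early, at small distances, are the closely related ones.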

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The purpose of this thesis is to find out whether all peer-to-peer lenders are unworthy of credit, and whether there are single qualities or combinations of qualities that determine the probability of default of a person or group of people. Distinguishing qualities are searched for with self-organizing maps (SOM). The qualities and groups of people found by the self-organizing map are then compared to the average. The comparison is carried out by examining what proportion of the borrowers meeting the criteria are two months or more behind with their payments. The research data was collected by an Estonian peer-to-peer lending company during 2011-2014. The data consists of peer-to-peer borrowers and the information gathered from them.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This master's thesis introduces the fuzzy tolerance/equivalence relation and its application in cluster analysis. The work presents the construction of fuzzy equivalence relations using increasing generators. We investigate the role of increasing generators in the creation of intersection, union and complement operators. The objective is to develop different varieties of fuzzy tolerance/equivalence relations using different varieties of increasing generators. Finally, we perform a comparative study of the developed varieties of fuzzy tolerance/equivalence relations in their application to a clustering method.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The purpose of this master's thesis is to investigate what is required for automatically recognizing the similarity of news items. The news items are text-based and have been retrieved from different news sources. The aim is to identify, first, news items that concern the same matter and, second, news items that are not quite the same matter but are nevertheless related to each other. The thesis examines which algorithms perform this recognition most effectively in both Finnish and English text. Existing algorithms are compared. The goal is to select a combination of algorithms such that 90% of the compared news items are recognized correctly. The study uses two different clustering algorithms and three different stemming algorithms. These algorithms are compared with respect to both their recognition effectiveness and their performance. Porter's algorithm proved to be the best stemming algorithm for comparing both Finnish and English news. Of the clustering algorithms, the simpler one, based on various key figures, proved more effective.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The aim of this work is to find a method or model for sorting a large set of ideas of varying quality and for identifying within that set the relevant and best ideas for further development. The empirical material consists of ideas generated in a joint ideation session of two companies operating in different industries. The theoretical part presents the definition of innovation and different ways of classifying innovations. It also discusses the principle of open innovation and its possibly still growing importance. The work further addresses factors that affect the exploitation of ideas. The perspectives chosen are factors internal to the company and factors arising from its environment, such as the company's competence and its customers and markets. The theoretical part ends with a brief overview of clustering and its various methods, and of portfolio management. In the empirical part, a few ideas that the companies participating in the session could implement in different ways were identified from a set of more than 50 ideas.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

This work reports the development of an automatic analysis system for high-speed image sequences recorded from hybrid welding. The purpose of the system was to produce information that would help an analyst assess the quality of the recorded welding process. The research focused on measuring the regularity of the arc frequency and the flight directions of filler-material droplets. Arcs were detected in the image sequences with the fuzzy c-means clustering method, and the time interval between consecutive arcs was used as the measure of arc frequency regularity. Droplets were located with a method combining principal component analysis and a support vector classifier. A Kalman filter was used to produce estimates of the droplets' flight directions and velocities. The flight-direction determination method classified the droplets according to their estimated flight directions. The image sequences available for developing the system differed considerably from one another in image quality and droplet appearance, owing to differences in the imaging and welding processes. The analysis system was developed to work on a small subset of image sequences that shared a particular imaging and welding process and had similar image quality and droplet appearance, but it was also tested on image sequences outside the subset. The test results showed that the accuracy of flight-direction determination was reasonably high within the subset and low for the other image sequences. The determination of arc frequency regularity was accurate for a larger number of image sequences.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Surface roughness is one of the quality criteria of paper. It is measured with devices that physically probe the paper surface and with optical devices. The measurements require laboratory conditions, but the paper industry would have a need for faster measurements made directly on the production line. The surface roughness of paper can be expressed as a single roughness value for the sample. In this work the sample is divided into significant regions, and a separate roughness value is computed for each region. Several methods have been used to measure roughness. In this work a generally accepted statistical method is used in addition to the distance transform. In paper surface roughness measurement there has been a need to divide the analyzed sample into regions according to roughness. Such a division makes it possible to delimit the regions of the sample that appear clearly rougher. The distance transform produces regions, which have been analyzed. These regions were merged into coherent regions with different segmentation methods. Algorithms based on the PNN (Pairwise Nearest Neighbor) method and on merging neighboring regions were used. An approach based on splitting and merging regions was also examined. Segmented images have usually been validated by human inspection. The approach of this work is to compare the generally accepted statistical method with the segmentation results. A high correlation between these results indicates successful segmentation. The results of the different experiments were compared with one another by hypothesis testing. Two sample series were analyzed, measured with the OptiTopo and with a profilometer. The initial parameters of the distance transform that were varied during the experiments were the number and location of the starting points. The same parameter changes were made for all the algorithms used for merging regions. After the distance transform, the correlation was stronger for the samples measured with the profilometer than for those measured with the OptiTopo.
With segmentation, the correlation improved more strongly for the OptiTopo samples than for the profilometer samples. The results produced by the PNN method had the best correlation.
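For readers unfamiliar with the Pairwise Nearest Neighbor (PNN) method mentioned in this abstract, the sketch below shows its usual generic form: start with one cluster per item and repeatedly merge the pair whose merge increases total squared error the least. It is an assumption-laden illustration, not the thesis's implementation. In particular, it ignores the spatial adjacency of image regions, and the feature vectors here are hypothetical one-dimensional roughness values.

```python
def pnn(points, target_count):
    """Merge clusters pairwise until `target_count` remain.

    Each cluster is a (size, centroid) pair. The merge cost is the
    standard PNN cost: n_a * n_b / (n_a + n_b) * ||c_a - c_b||^2.
    """
    target_count = max(target_count, 1)
    clusters = [(1, tuple(float(v) for v in p)) for p in points]

    def merge_cost(a, b):
        (na, ca), (nb, cb) = a, b
        squared = sum((x - y) ** 2 for x, y in zip(ca, cb))
        return na * nb / (na + nb) * squared

    while len(clusters) > target_count:
        # Exhaustive search for the cheapest pair to merge.
        _, i, j = min(
            (merge_cost(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        (na, ca), (nb, cb) = clusters[i], clusters[j]
        centroid = tuple((na * x + nb * y) / (na + nb) for x, y in zip(ca, cb))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append((na + nb, centroid))
    return clusters


# Hypothetical per-region roughness values.
regions = pnn([(0,), (1,), (10,), (11,)], 2)
```

A region-merging variant would restrict the candidate pairs to neighboring regions, which is closer to the segmentation setting the abstract describes.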

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Technological progress has made a huge amount of data available at increasing spatial and spectral resolutions. Therefore, the compression of hyperspectral data is an area of active research. In some fields, the original quality of a hyperspectral image cannot be compromised, and in these cases lossless compression is mandatory. The main goal of this thesis is to provide improved methods for the lossless compression of hyperspectral images. Both prediction- and transform-based methods are studied. Two kinds of prediction-based methods are studied. In the first method, the spectra of a hyperspectral image are first clustered and an optimized linear predictor is calculated for each cluster. In the second prediction method, the linear prediction coefficients are not fixed but are recalculated for each pixel. A parallel implementation of the above-mentioned linear prediction method is also presented. In addition, two transform-based methods are presented. Vector Quantization (VQ) was used together with a new coding of the residual image. We have also developed a new back end for a compression method utilizing Principal Component Analysis (PCA) and Integer Wavelet Transform (IWT). The performance of the compression methods is compared to that of other compression methods. The results show that the proposed linear prediction methods outperform the previous methods. In addition, a novel fast exact nearest-neighbor search method is developed. The search method is used to speed up the Linde-Buzo-Gray (LBG) clustering method.
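The idea behind prediction-based lossless compression can be illustrated with a deliberately simplified sketch: each band is predicted from the previous band, and only the integer residual is stored, so decoding reproduces the spectrum exactly. The first-order predictor, the single shared gain, and the function names are assumptions for this example; the thesis's predictors are optimized per cluster or per pixel and are not specified in the abstract.

```python
def fit_gain(spectra):
    """Least-squares gain a minimising sum of (x[b] - a * x[b-1])^2."""
    num = sum(s[b] * s[b - 1] for s in spectra for b in range(1, len(s)))
    den = sum(s[b - 1] ** 2 for s in spectra for b in range(1, len(s)))
    return num / den if den else 0.0


def encode(spectrum, gain):
    """Keep the first band as is; store integer residuals for the rest."""
    residuals = [spectrum[0]]
    for b in range(1, len(spectrum)):
        residuals.append(spectrum[b] - round(gain * spectrum[b - 1]))
    return residuals


def decode(residuals, gain):
    """Invert encode(): rebuild each band from the previous decoded band."""
    spectrum = [residuals[0]]
    for r in residuals[1:]:
        spectrum.append(r + round(gain * spectrum[-1]))
    return spectrum


# Hypothetical integer spectra belonging to one cluster.
cluster = [[10, 20, 40, 80], [5, 10, 20, 40]]
gain = fit_gain(cluster)
```

Compression comes from the residuals being small and therefore cheap to entropy-code; the entropy coder itself is omitted here.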

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The purpose of this thesis is to present a new approach to the lossy compression of multispectral images. The proposed algorithm is based on a combination of quantization and clustering. Clustering was investigated for compression of the spatial dimension, and vector quantization was applied for compression of the spectral dimension. The presented algorithm compresses multispectral images in two stages. During the first stage, the class etalons are defined; in other words, each uniform area located inside the image is given a class number. Pixels that have not yet been assigned to any cluster are handled during the second pass, in which they are assigned to the closest etalons. Finally, the compressed image is represented by a flat index image pointing to a codebook of etalons. The decompression stage is instant as well. The proposed method has been tested on different satellite multispectral images from different sources. Numerical results and illustrative examples of the method are presented as well.
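The final representation described above, a flat index image pointing into a codebook of etalons, can be sketched as follows. This shows only the nearest-etalon assignment and the table-lookup decompression; the two-stage procedure that defines the etalons is not reproduced, and the codebook values and function names are hypothetical.

```python
def nearest(codebook, pixel):
    """Index of the codebook vector closest to `pixel` (squared Euclidean)."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(codebook[k], pixel)),
    )


def compress(image, codebook):
    """Replace every spectral vector by the index of its closest etalon."""
    return [[nearest(codebook, px) for px in row] for row in image]


def decompress(index_image, codebook):
    """Decompression is a plain table lookup, which is why it is fast."""
    return [[codebook[k] for k in row] for row in index_image]


# Hypothetical 2x2 image with three spectral bands and a two-entry codebook.
codebook = [(0, 0, 0), (100, 100, 100)]
image = [[(1, 2, 3), (98, 99, 101)], [(90, 110, 100), (5, 5, 5)]]
indices = compress(image, codebook)
```

The compression is lossy because each pixel is reconstructed as its etalon rather than as its original spectral vector.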