38 results for Images - Computational methods
Abstract:
The Minimum Description Length (MDL) principle is a general, well-founded theoretical formalization of statistical modeling. The most important notion of MDL is the stochastic complexity, which can be interpreted as the shortest description length of a given sample of data relative to a model class. The exact definition of the stochastic complexity has gone through several evolutionary steps. The latest instantiation is based on the so-called Normalized Maximum Likelihood (NML) distribution, which has been shown to possess several important theoretical properties. However, applications of this modern version of MDL have been quite rare because of computational complexity problems: for discrete data, the definition of NML involves an exponential sum, and in the case of continuous data, a multi-dimensional integral that is usually infeasible to evaluate or even to approximate accurately. In this doctoral dissertation, we present mathematical techniques for computing NML efficiently for some model families involving discrete data. We also show how these techniques can be used to apply MDL in two practical applications: histogram density estimation and clustering of multi-dimensional data.
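For context, the NML distribution for discrete data and the associated stochastic complexity take the following standard form (a textbook formulation given here for reference, not quoted from the dissertation); the normalizing sum in the denominator, taken over all possible data sets of size n, is the exponential sum referred to above:

```latex
\[
P_{\mathrm{NML}}(x^n \mid \mathcal{M})
  = \frac{P\bigl(x^n \mid \hat{\theta}(x^n), \mathcal{M}\bigr)}
         {\sum_{y^n} P\bigl(y^n \mid \hat{\theta}(y^n), \mathcal{M}\bigr)},
\qquad
\mathrm{SC}(x^n \mid \mathcal{M}) = -\log P_{\mathrm{NML}}(x^n \mid \mathcal{M}),
\]
```

where \(\hat{\theta}(x^n)\) denotes the maximum-likelihood parameters for the data \(x^n\) within the model class \(\mathcal{M}\). For continuous data the sum in the denominator becomes a multi-dimensional integral.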
Abstract:
Ubiquitous computing is about making computers and computerized artefacts a pervasive part of our everyday lives, bringing more and more activities into the realm of information. This computationalization and informationalization of everyday activities increases not only our reach, efficiency and capabilities but also the amount and kinds of data gathered about us and our activities. In this thesis, I explore how information systems can be constructed so that they handle this personal data in a reasonable manner. The thesis provides two kinds of results: on the one hand, tools and methods for both the construction and the evaluation of ubiquitous and mobile systems; on the other hand, an evaluation of the privacy aspects of a ubiquitous social awareness system. The work emphasises real-world experiments as the most important way to study privacy. Additionally, the state of current information systems with regard to data protection is studied. The tools and methods in this thesis consist of three distinct contributions. An algorithm for positioning in cellular networks is proposed that does not require location information to be revealed beyond the user's terminal (a minimal sketch of this idea is given below). A prototyping platform for the creation of context-aware ubiquitous applications, called ContextPhone, is described and released as open source. Finally, a set of methodological findings on the use of smartphones in social-scientific field research is reported. A central contribution of this thesis is the set of pragmatic tools that allow other researchers to carry out experiments. The evaluation of the ubiquitous social awareness application ContextContacts covers both the general usage of the system and an analysis of its privacy implications. Based on several long-term field studies, the usage of the system is analyzed in the light of how users make inferences about others from the real-time contextual cues mediated by the system. The analysis of privacy implications draws together the social-psychological theory of self-presentation and research on privacy in ubiquitous computing, deriving a set of design guidelines for such systems. The main findings from these studies can be summarized as follows. The fact that ubiquitous computing systems gather more data about users can be exploited not only to study the use of such systems in an effort to create better systems, but also to study previously unstudied phenomena, such as the dynamic change of social networks. Systems that let people create new ways of presenting themselves to others can be fun for their users, but such self-presentation requires several thoughtful design decisions that allow users to manipulate the image mediated by the system. Finally, the growing amount of computational resources available to users can be used to let them work with the data themselves, rather than remain passive subjects of data gathering.
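The following is a minimal illustrative sketch of terminal-side, cell-based positioning in general, not the algorithm developed in the thesis; the cell-to-place mapping and the voting rule are hypothetical placeholders. The point it demonstrates is that all location inference can stay on the device:

```python
# Illustrative sketch only (not the thesis's algorithm): positioning from
# observed GSM cell identities, computed entirely on the user's terminal so
# that no location information is revealed beyond the device.
from collections import Counter

# Hypothetical local database, learned or entered on the phone itself:
# (mobile country code, network code, cell id) -> user-named place.
cell_to_place = {
    (244, 91, 1234): "home",
    (244, 91, 5678): "office",
}

def estimate_place(observed_cells):
    """Vote over the places associated with the currently visible cells."""
    votes = Counter(cell_to_place[c] for c in observed_cells if c in cell_to_place)
    return votes.most_common(1)[0][0] if votes else None

# Example: one known and one unknown cell are currently visible.
print(estimate_place([(244, 91, 1234), (244, 91, 9999)]))  # -> "home"
```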
Abstract:
This thesis presents a highly sensitive genome-wide search method for recessive mutations. The method is suitable for distantly related samples that are divided into phenotype positives and negatives. High-throughput genotyping arrays are used to identify and compare homozygous regions between the cohorts. The method is demonstrated by comparing colorectal cancer patients against unaffected references. The objective is to find homozygous regions and alleles that are more common in cancer patients. We have designed and implemented software tools to automate the data analysis from genotypes to lists of candidate genes and their properties. The programs have been designed according to a pipeline architecture that allows their integration with other programs, such as biological databases and copy-number analysis tools. The integration of the tools is crucial, as the genome-wide analysis of cohort differences produces many candidate regions not related to the studied phenotype. CohortComparator is a genotype comparison tool that detects homozygous regions and compares their loci and allele constitutions between two sets of samples. The data are visualised in chromosome-specific graphs illustrating the homozygous regions and alleles of each sample. The genomic regions that may harbour recessive mutations are emphasised with different colours, and a scoring scheme is provided for these regions. The detection of homozygous regions, the cohort comparisons and the result annotations all rest on assumptions, many of which have been parameterized in our programs. The effect of these parameters and the suitable scope of the methods have been evaluated. Samples genotyped at different resolutions can be balanced using genotype estimates of their haplotypes, so that they can be used within the same study.
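To make the core idea concrete, here is a hedged sketch of detecting runs of homozygosity and scoring per-locus cohort differences. The function names, data layout and the minimum run length are illustrative assumptions, not the CohortComparator implementation or its scoring scheme:

```python
# Illustrative sketch (not the CohortComparator implementation): find runs of
# homozygosity per sample and score each locus by how much more often cases
# than controls are homozygous inside such a run.
def homozygous_runs(genotypes, min_len=50):
    """genotypes: list of (allele_a, allele_b) calls along one chromosome.
    Returns (start, end) index pairs of runs of >= min_len homozygous calls."""
    runs, start = [], None
    for i, (a, b) in enumerate(genotypes):
        if a == b:                                  # homozygous call
            start = i if start is None else start
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i))
            start = None
    if start is not None and len(genotypes) - start >= min_len:
        runs.append((start, len(genotypes)))
    return runs

def cohort_difference(cases, controls, n_loci, min_len=50):
    """Per-locus difference between the fraction of cases and of controls
    that lie within a run of homozygosity at that locus."""
    def coverage(cohort):
        counts = [0] * n_loci
        for sample in cohort:
            for s, e in homozygous_runs(sample, min_len):
                for i in range(s, e):
                    counts[i] += 1
        return [c / len(cohort) for c in counts]
    case_cov, ctrl_cov = coverage(cases), coverage(controls)
    return [c - k for c, k in zip(case_cov, ctrl_cov)]
```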
Abstract:
The goal of the surgical reconstruction of the middle ear and the ossicular chain, often part of middle-ear surgery, is to create conditions that allow good hearing and keep the middle ear free of infection and aerated. Traditionally, the patient's own tissues have been used as implant materials in middle-ear reconstruction, together with various non-degradable biomaterials such as titanium and silicone when needed. A problem with the use of biomaterials can be bacterial adherence, i.e. the attachment of bacteria to the surface of the foreign material, which may lead to biofilm formation. This can cause a chronic tissue infection that responds poorly to antibiotics and in practice often leads to revision surgery and removal of the implant. Biodegradable polymers based on lactic acid and glycolic acid have been in clinical use for decades. They have been used in particular as support materials in, for example, orthopaedic and maxillofacial surgery, but so far not in middle-ear surgery. Computed tomography (CT) is the primary imaging method for the ear. A drawback of CT is the relatively high radiation dose to the patient, which accumulates if the examination has to be repeated. This doctoral thesis investigates the suitability of limited cone-beam CT, previously used routinely in clinical work mainly for dental and facial imaging, for imaging the ear region. The first two studies of the thesis investigated and compared the in vitro adherence of two bacteria causing chronic and postoperative ear infections, Staphylococcus aureus and Pseudomonas aeruginosa, to the surfaces of titanium, silicone and two different biodegradable polymers (PLGA). In addition, the effect of coating the materials with albumin on adherence was studied. The third study investigated, in an animal model, the biocompatibility of PLGA in experimental middle-ear surgery. PLGA material was implanted into the middle ears of chinchillas; the animals were followed up and euthanized six months after the operation. The assessment of biocompatibility was based on clinical observations and tissue samples. The fourth study evaluated the suitability of cone-beam CT for imaging the ear region by comparing its accuracy with that of conventional spiral CT. Temporal bones were imaged with both devices to assess how accurately the clinically and surgically important structures of the ear region were visualized. In the fifth study, the imaging of operated temporal bones with cone-beam CT was also evaluated. In the bacterial studies, at most as many, and on average fewer, bacteria adhered to the surface of the PLGA material as to silicone or titanium. Albumin coating significantly reduced bacterial adherence on all materials. Based on the animal experiments, PLGA was found to be well tolerated in the middle ear: no infections, tympanic membrane perforations or extrusion of the material were observed in the ear canals or middle ears, and the tissue samples showed only a mild inflammatory reaction and fibrosis around the implant. In the temporal bone studies, limited cone-beam CT was found to be at least as accurate as spiral CT in imaging the structures of the middle and inner ear, and the radiation dose from a single examination was considerably lower than that of spiral CT. Cone-beam CT was well suited for imaging middle-ear implants and the postoperative ear. The results show that PLGA is a suitable, safe and biocompatible biomaterial for middle-ear surgery. Coating biomaterials with albumin significantly reduces bacterial adherence, which supports the use of such coating in implant surgery. Cone-beam CT is suitable for imaging the ear region: its accuracy in demonstrating clinically important structures is at least as good as, and its radiation dose to the patient lower than, that of current spiral CT of the ear. This makes the method a safer alternative for the patient than spiral CT, especially when the patient's condition requires follow-up and repeated imaging, or when limited areas need to be imaged uni- or bilaterally.
Abstract:
An efficient and statistically robust solution for the identification of asteroids among numerous sets of astrometry is presented. In particular, numerical methods have been developed for the short-term identification of asteroids at discovery, and for the long-term identification of scarcely observed asteroids over apparitions, a task that has lacked a robust method until now. The methods are based on the solid foundation of statistical orbital inversion, which properly takes the observational uncertainties into account and thereby allows the detection of practically all correct identifications. Through the use of dimensionality-reduction techniques and efficient data structures, the exact methods have a log-linear, that is, O(n log n), computational complexity, where n is the number of included observation sets. The methods developed are thus suitable for future large-scale surveys, which anticipate a substantial increase in the astrometric data rate. Due to the discontinuous nature of asteroid astrometry, separate sets of astrometry must be linked to a common asteroid from the very first discovery detections onwards. The reason for the discontinuity in the observed positions is the rotation of the observer with the Earth as well as the motion of the asteroid and the observer about the Sun. The aim of identification is therefore to find a set of orbital elements that reproduces the observed positions with residuals comparable to the inevitable observational uncertainty. Unless the astrometric observation sets are linked, the corresponding asteroid is eventually lost as the uncertainty of the predicted positions grows too large to allow successful follow-up. Whereas the presented identification theory and the numerical comparison algorithm are generally applicable, that is, also in fields other than astronomy (e.g., in the identification of space debris), the numerical methods developed for asteroid identification can immediately be applied to all objects on heliocentric orbits with negligible non-gravitational effects within the time frame of the analysis. The methods developed have been successfully applied to various identification problems. Simulations have shown that the methods are able to find virtually all correct linkages despite challenges such as numerous scarce observation sets, astrometric uncertainty, numerous objects confined to a limited region on the celestial sphere, long linking intervals, and substantial parallaxes. Tens of previously unknown main-belt asteroids have been identified with the short-term method in a preliminary study to locate asteroids among numerous unidentified sets of single-night astrometry of moving objects, and scarce astrometry obtained nearly simultaneously with Earth-based and space-based telescopes has been successfully linked despite a substantial parallax. Using the long-term method, thousands of realistic 3-linkages, typically spanning several apparitions, have so far been found among designated observation sets each spanning less than 48 hours.
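As a rough illustration of how dimensionality reduction combined with an efficient data structure yields log-linear candidate search, consider the following sketch. The reduced coordinates and the search radius are placeholders; the actual comparison in the thesis is based on statistical orbital inversion, which this sketch does not attempt to reproduce:

```python
# Illustrative sketch of log-linear candidate linkage search: each observation
# set is reduced to a low-dimensional point, and a k-d tree finds nearby sets
# in O(n log n) instead of comparing all O(n^2) pairs.
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
n = 10_000
# Placeholder reduced coordinates, standing in for quantities derived from the
# orbital-element regions compatible with each observation set.
reduced = rng.uniform(size=(n, 3))

tree = cKDTree(reduced)             # tree construction is O(n log n)
pairs = tree.query_pairs(r=0.01)    # candidate linkages to be tested rigorously
print(f"{len(pairs)} candidate pairs out of {n * (n - 1) // 2} possible")
```

Only the candidate pairs returned by the tree query would then be subjected to the expensive orbit-based comparison, which is what keeps the overall cost near log-linear.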
Abstract:
This work belongs to the field of computational high-energy physics (HEP). The key methods used in this thesis to meet the challenges raised by the Large Hadron Collider (LHC) era experiments are object-oriented software engineering, Monte Carlo simulation, cluster computing, and artificial neural networks. The first aspect discussed is the development of hadronic cascade models, used for the accurate simulation of medium-energy hadron-nucleus reactions up to 10 GeV. These models are typically needed in hadronic calorimeter studies and in the estimation of radiation backgrounds. Applications outside HEP include the medical field (such as hadron-treatment simulations), space science (satellite shielding), and nuclear physics (spallation studies). Validation results are presented for several significant improvements released in the Geant4 simulation toolkit, and the significance of the new models for computing in the Large Hadron Collider era is estimated. In particular, we estimate the ability of the Bertini cascade to simulate the Compact Muon Solenoid (CMS) hadron calorimeter (HCAL). LHC test-beam activity has a tightly coupled simulation-to-data-analysis cycle: typically, a Geant4 computer experiment is used to understand test-beam measurements. Another aspect of this thesis is therefore a description of studies related to developing new CMS H2 test-beam data-analysis tools and performing data analysis on the basis of CMS Monte Carlo events. These events have been simulated in detail using Geant4 physics models, the full CMS detector description, and event reconstruction. Using the ROOT data-analysis framework, we have developed an offline ANN-based approach to tag b-jets associated with heavy neutral Higgs particles, and we show that this kind of NN methodology can be successfully used to separate the Higgs signal from the background in the CMS experiment.
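The following is a generic sketch of ANN-based signal/background separation of the kind described above. The toy features, their distributions and the scikit-learn classifier are assumptions for illustration; they are not the CMS analysis variables or the ROOT-based network used in the thesis:

```python
# Generic sketch of neural-network signal/background separation on toy data
# (placeholders only; not the CMS b-tagging variables or the ROOT-based ANN).
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 5000
# Two toy jet features per event; "signal" and "background" populations drawn
# from overlapping Gaussian distributions.
signal = rng.normal(loc=[2.0, 1.5], scale=1.0, size=(n, 2))
background = rng.normal(loc=[0.0, 0.5], scale=1.0, size=(n, 2))
X = np.vstack([signal, background])
y = np.concatenate([np.ones(n), np.zeros(n)])

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
net = MLPClassifier(hidden_layer_sizes=(10,), max_iter=500, random_state=0)
net.fit(X_tr, y_tr)
print(f"test accuracy: {net.score(X_te, y_te):.2f}")
```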
Abstract:
Modern smartphones often come with a significant amount of computational power and an integrated digital camera, making them an ideal platform for intelligent assistants. This work is restricted to retail environments, where users could be provided with, for example, navigation instructions to desired products or information about special offers in their close proximity. Such applications usually require information about the user's current location in the domain environment, which in our case corresponds to a retail store. We propose a vision-based positioning approach that recognizes the products the user's mobile phone camera is currently pointing at. The products are associated with locations within the store, which enables us to locate the user by pointing the mobile phone's camera at a group of products. The first step of our method is to extract meaningful features from digital images. We use the Scale-Invariant Feature Transform (SIFT) algorithm, which extracts features that are highly distinctive in the sense that they can be correctly matched against a large database of features from many images. We collect a comprehensive set of images from all meaningful locations within our domain and extract the SIFT features from each of these images. As the SIFT features are of high dimensionality, and comparing individual features is thus infeasible, we apply the Bags of Keypoints method, which creates a generic representation, a visual category, from all features extracted from images taken at a specific location. The category of an unseen image can be deduced by extracting the corresponding SIFT features and choosing the category that best fits the extracted features. We have applied the proposed method in a Finnish supermarket. We consider grocery shelves as categories, which is a sufficient level of accuracy to help users navigate or to provide useful information about nearby products. We achieve 40% accuracy, which is quite low for commercial applications but still significantly outperforms the random-guess baseline. Our results suggest that the classification accuracy could be increased by a deeper analysis of the domain and by combining existing positioning methods with ours.
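A compact sketch of the SIFT plus bag-of-keypoints pipeline is given below. The vocabulary size, the choice of k-means and a linear SVM, and the file names are illustrative assumptions rather than the exact configuration used in the thesis:

```python
# Minimal bag-of-keypoints sketch (illustrative; not the thesis's exact pipeline):
# SIFT descriptors -> k-means visual vocabulary -> per-image word histogram
# -> linear classifier over location categories (e.g. grocery shelves).
import cv2
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import LinearSVC

sift = cv2.SIFT_create()

def sift_descriptors(path):
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    _, desc = sift.detectAndCompute(img, None)
    return desc if desc is not None else np.empty((0, 128), np.float32)

def histogram(desc, vocab, n_words):
    words = vocab.predict(desc) if len(desc) else np.empty(0, int)
    hist = np.bincount(words, minlength=n_words).astype(float)
    return hist / (hist.sum() or 1.0)        # normalized visual-word frequencies

def train(image_paths, labels, n_words=200):
    all_desc = [sift_descriptors(p) for p in image_paths]
    vocab = KMeans(n_clusters=n_words, n_init=4, random_state=0)
    vocab.fit(np.vstack(all_desc))           # build the visual vocabulary
    X = np.array([histogram(d, vocab, n_words) for d in all_desc])
    clf = LinearSVC().fit(X, labels)
    return vocab, clf

def predict(path, vocab, clf, n_words=200):
    return clf.predict([histogram(sift_descriptors(path), vocab, n_words)])[0]

# Hypothetical usage: vocab, clf = train(["shelf_a1.jpg", "shelf_b1.jpg"],
#                                        ["dairy", "bakery"])
```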
Abstract:
Finite-state methods have been widely adopted in computational morphology and related linguistic applications. To enable efficient development of finite-state-based linguistic descriptions, these methods should be a freely available resource for academic language research and the language technology industry. The following needs can be identified: (i) a registry that maps the existing approaches, implementations and descriptions, (ii) managing the incompatibilities of the existing tools, (iii) increasing the synergy and complementary functionality of the tools, (iv) persistent availability of the tools used to manipulate the archived descriptions, and (v) an archive for free finite-state-based tools and linguistic descriptions. Addressing these challenges contributes to building a common research infrastructure for advanced language technology.