956 results for Data compression (Computer science)


Relevance:

100.00% 100.00%

Publisher:

Abstract:

The proposal to work on this final project came after several discussions with Dr. Elzbieta Malinowski Gadja, who in 2008 published the book Advanced Data Warehouse Design: From Conventional to Spatial and Temporal Applications (Data-Centric Systems and Applications). The project was carried out under the technical supervision of Dr. Malinowski, and the direct beneficiary was the University of Costa Rica (UCR), where Dr. Malinowski is a professor at the Department of Computer Science and Informatics. The purpose of this project was twofold: first, to translate chapter III of the book in order to produce educational material for use at the UCR and, second, to venture into the field of technical translation related to data warehouses. For the first component, the goal was to produce a final product that would eventually serve as an educational tool for the post-graduate courses of the UCR. For the second component, this project allowed me to acquire new skills and put into practice techniques that have helped me not only to perform better in my current job as an Assistant Translator at the Inter-American Bank (IDB), but also to use them in similar projects. The process was lengthy and required thorough research and constant communication with the author. The investigation focused on the search for terms and definitions to prepare the glossary, which was the basis for starting the translation project. The translation itself was carried out in phases, so that comments and corrections by the author could be taken into account in subsequent stages. Later, based on the glossary and the translated text, the illustrations that had been created in Visio were translated. In addition to the technical revision by the author, Professor Carme Mangiron was in charge of revising the non-technical text.
The result was a high-quality document that is currently used as reference and study material by the Department of Computer Science and Informatics of the University of Costa Rica.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Final degree project consisting of the exploitation of a data warehouse for the analysis of information on road vehicle traffic.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper describes an audio watermarking scheme based on lossy compression. The main idea is taken from an image watermarking approach in which the JPEG compression algorithm is used to determine where and how the mark should be placed. Similarly, in the audio scheme suggested in this paper, an MPEG 1 Layer 3 algorithm is chosen for compression to determine the position of the mark bits, so the psychoacoustic masking of MPEG 1 Layer 3 compression is implicitly used. This methodology provides a high degree of robustness against compression attacks. The suggested scheme is also shown to succeed against most of the StirMark benchmark attacks for audio.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

A newspaper content management system has to deal with a very heterogeneous information space, as our experience at the Diari Segre newspaper has shown. The greatest problem is to harmonise the different ways in which the users involved (journalists, archivists, etc.) structure the newspaper information space, i.e. news, topics, headlines, etc. Our approach is based on ontologies and differentiated universes of discourse (UoD). Users interact with the system and, from this interaction, integration rules are derived. These rules are based on Description Logic ontological relations for subsumption and equivalence. They relate the different UoD and produce a shared conceptualisation of the newspaper information domain.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Design, construction, and exploitation of a data warehouse for a healthcare institution.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This final degree project aims to provide a solution to a hypothetical commission from the European Union to build a relational database for storing data on citizens' physical activity, obtained from wearable devices, together with data on health status and diagnosed diseases, collected by the information systems of the various health services. With all this collected data, our database will make it possible, through high-level applications, to extract useful information that reveals the actual health status of citizens and supports the design of interventions and campaigns to improve it.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This thesis belongs to the field of string algorithmics. A string S is a common subsequence of the strings X[1..m] and Y[1..n] if it can be formed by deleting 0..m characters from X and 0..n characters from Y at arbitrary positions. If no common subsequence of X and Y is longer than S, then S is said to be a longest common subsequence (LCS) of X and Y. This work focuses on solving the LCS of two strings, but the problem can also be generalized to more than two strings. The LCS problem has applications not only in computer science but also in areas of bioinformatics. The best known of these are text and image compression, file version control, pattern recognition, and comparative research on the structure of DNA and protein chains. Solving the problem is made difficult by the dependence of the solution algorithms on several different parameters of the input strings. In addition to the lengths of the input strings, these include the size of the input alphabet, the character distribution of the inputs, the length of the LCS relative to the length of the shorter input string, and the number of matching character pairs. It is therefore difficult to develop an algorithm that works efficiently on all instances of the problem. On the one hand, the thesis is intended to serve as a handbook that, after describing the basic concepts of the problem, presents previously developed exact LCS algorithms. They are grouped according to the algorithm's processing model: those that process one row, one contour, or one diagonal at a time, and those that process in multiple directions. In addition to exact methods, heuristic methods are presented that compute an upper or lower bound for the length of the LCS; their results can be used either as such or to guide the execution of an exact algorithm. This part is based on articles published by our research group, which deal for the first time with exact methods enhanced by heuristics. On the other hand, the work includes a fairly extensive empirical part whose goal has been to improve the running time and memory usage of existing exact algorithms.
This goal has been pursued at the implementation level by introducing data structures that support the algorithms' processing model well, and by limiting fruitless computation through improving the algorithms' ability to detect and exploit intermediate results obtained during execution. As a general conclusion, heuristic preprocessing of exact LCS algorithms almost systematically reduces their running time and, in particular, their memory requirements. In addition, the data structure used by an algorithm has a decisive effect on computational efficiency: the more local the search and update operations are, the more efficient the computation performed by the algorithm.
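
The definition of a longest common subsequence given in this abstract can be illustrated with a short sketch. What follows is the textbook row-by-row dynamic programming method, chosen here only for illustration; it is not one of the heuristic-enhanced algorithms of the thesis, and the function name is hypothetical.

```python
def longest_common_subsequence(x, y):
    """Return (length, one LCS) of strings x and y.

    Textbook row-wise dynamic programming, O(m*n) time and space.
    Illustrative only; not the thesis's own algorithms.
    """
    m, n = len(x), len(y)
    # table[i][j] holds the LCS length of the prefixes x[:i] and y[:j].
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])

    # Walk back from the bottom-right corner to recover one LCS.
    chars = []
    i, j = m, n
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:
            chars.append(x[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return table[m][n], "".join(reversed(chars))
```

For the classic pair "ABCBDAB" and "BDCABA" the function reports a length of 4. Several subsequences of that length exist, which is why the abstract speaks of "a" longest common subsequence rather than "the" one.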

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The recent rapid development of biotechnological approaches has enabled the production of large whole genome level biological data sets. In order to handle these data sets, reliable and efficient automated tools and methods for data processing and result interpretation are required. Bioinformatics, as the field of studying and processing biological data, tries to answer this need by combining methods and approaches across computer science, statistics, mathematics and engineering to study and process biological data. The need is also increasing for tools that can be used by the biological researchers themselves, who may not have a strong statistical or computational background, which requires creating tools and pipelines with intuitive user interfaces, robust analysis workflows and strong emphasis on result reporting and visualization. Within this thesis, several data analysis tools and methods have been developed for analyzing high-throughput biological data sets. These approaches, covering several aspects of high-throughput data analysis, are specifically aimed at gene expression and genotyping data, although in principle they are suitable for analyzing other data types as well. Coherent handling of the data across the various data analysis steps is highly important in order to ensure robust and reliable results. Thus, robust data analysis workflows are also described, putting the developed tools and methods into a wider context. The choice of the correct analysis method may also depend on the properties of the specific data set, and therefore guidelines for choosing an optimal method are given. The data analysis tools, methods and workflows developed within this thesis have been applied to several research studies, of which two representative examples are included in the thesis. The first study focuses on spermatogenesis in murine testis and the second one examines cell lineage specification in mouse embryonic stem cells.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Quantum information theory studies the fundamental limits that the laws of physics impose on data-processing tasks such as compression and the transmission of data over a noisy channel. This thesis presents general techniques for solving several fundamental problems of quantum information theory within a single framework. The central theorem of this thesis states the existence of a protocol for transmitting quantum data that the receiver already partially knows, using a single use of a noisy quantum channel. Several central theorems of quantum information theory follow as immediate corollaries of this theorem. The subsequent chapters use this theorem to prove the existence of new protocols for two other types of quantum channels, namely quantum broadcast channels and quantum channels with side information at the transmitter. These protocols also deal with the transmission of quantum data partially known to the receiver using a single use of the channel, and have as corollaries asymptotic versions with and without entanglement assistance. The entanglement-assisted asymptotic versions can, in both cases, be regarded as quantum versions of the best known coding theorems for the classical versions of these problems. The last chapter deals with a purely quantum phenomenon called locking: it is possible to encode a classical message in a quantum state such that, by removing from it a subsystem of logarithmic size relative to its total size, one can ensure that no measurement can have a significant correlation with the message. The message is thus "locked" by a key of logarithmic size.
This thesis presents the first locking protocol whose success criterion is that the trace distance between the joint distribution of the message and the measurement outcome and the product of their marginals be sufficiently small.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Realism of objects in computer graphics requires adequately simulating their appearance under various lighting conditions and at different scales. A solution commonly adopted by researchers is to measure, with calibrated devices, the reflectance of a sample of a real surface and then encode it as a reflectance model (BRDF) or a reflectance texture (BTF). Despite significant advances, the data thus made available to artists remain very little used. This reluctance may be explained by two main reasons: (1) the quantity and quality of available measurements and (2) the size of the data. This work proposes to tackle both problems from the angle of simulation. We conjecture that the level of realism of rendering in computer graphics already produces satisfactory results with current techniques. We therefore propose to precompute and encode in an augmented BTF the lighting effects on a geometry, which will subsequently be applied to surfaces. Since this precomputation of rendering and textures is already well adopted by artists, it should fit more easily into their work. To ensure that this model also meets the requirements of multi-scale representations, we also propose an adaptation of BTFs to a MIP map type encoding.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper presents a method based on articulated models for the registration of spine data extracted from multimodal medical images of patients with scoliosis. With the ultimate aim being the development of a complete geometrical model of the torso of a scoliotic patient, this work presents a method for the registration of vertebral column data using 3D magnetic resonance images (MRI) acquired in prone position and X-ray data acquired in standing position for five patients with scoliosis. The 3D shape of the vertebrae is estimated from both image modalities for each patient, and an articulated model is used in order to calculate intervertebral transformations required in order to align the vertebrae between both postures. Euclidean distances between anatomical landmarks are calculated in order to assess multimodal registration error. Results show a decrease in the Euclidean distance using the proposed method compared to rigid registration and more physically realistic vertebrae deformations compared to thin-plate-spline (TPS) registration thus improving alignment.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The present research problem is to study the existing encryption methods and to develop a new technique that outperforms existing techniques and, at the same time, can be readily incorporated into the communication channels of fault-tolerant hard real-time systems alongside existing error-checking and error-correcting codes, so that eavesdropping attempts can be defeated. Many encryption methods are available today, each with its own merits and demerits. Similarly, many cryptanalysis techniques that adversaries use are also available.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Modern computer systems are plagued with stability and security problems: applications lose data, web servers are hacked, and systems crash under heavy load. Many of these problems or anomalies arise from rare program behavior caused by attacks or errors. A substantial percentage of web-based attacks are due to buffer overflows. Many methods have been devised to detect and prevent anomalous situations that arise from buffer overflows. The current state of the art in anomaly detection systems is relatively primitive and depends mainly on static code checking to deal with buffer overflow attacks. For protection, Stack Guards and Heap Guards are also widely used. This dissertation proposes an anomaly detection system based on the frequencies of system calls in the system call trace. System call traces represented as frequency sequences are profiled using sequence sets. A sequence set is identified by the starting sequence and the frequencies of specific system calls. The deviation of the current input sequence from the corresponding normal profile in the frequency pattern of system calls is computed and expressed as an anomaly score. A simple Bayesian model is used for accurate detection. Experimental results are reported which show that the frequency of system calls, represented using sequence sets, captures the normal behavior of programs under normal conditions of usage. This captured behavior allows the system to detect anomalies with a low rate of false positives. Data are presented which show that a Bayesian network on frequency variations responds effectively to induced buffer overflows. It can also help administrators to detect deviations in program flow introduced by errors.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Computational biology is the research area that contributes to the analysis of biological data through the development of algorithms that address significant research problems. The data from molecular biology include DNA, RNA, protein and gene expression data. Gene expression data provide the expression levels of genes under different conditions. Gene expression is the process of transcribing the DNA sequence of a gene into mRNA sequences, which in turn are later translated into proteins. The number of copies of mRNA produced is called the expression level of a gene. Gene expression data are organized in the form of a matrix. Rows in the matrix represent genes and columns represent experimental conditions. Experimental conditions can be different tissue types or time points. Entries in the gene expression matrix are real values. Through the analysis of gene expression data it is possible to determine the behavioral patterns of genes, such as the similarity of their behavior, the nature of their interaction, their respective contributions to the same pathways and so on. Similar expression patterns are exhibited by genes participating in the same biological process. These patterns have immense relevance and application in bioinformatics and clinical research. In the medical domain they are used to aid more accurate diagnosis, prognosis, treatment planning, drug discovery and protein network analysis. To identify various patterns in gene expression data, data mining techniques are essential. Clustering is an important data mining technique for the analysis of gene expression data. To overcome the problems associated with clustering, biclustering was introduced. Biclustering refers to the simultaneous clustering of both rows and columns of a data matrix.
Clustering is a global model, whereas biclustering is a local model. Discovering local expression patterns is essential for identifying many genetic pathways that are not apparent otherwise. It is therefore necessary to move beyond the clustering paradigm towards approaches that are capable of discovering local patterns in gene expression data. A bicluster is a submatrix of the gene expression data matrix. The rows and columns in the submatrix need not be contiguous in the gene expression data matrix. Biclusters are not disjoint. Computing biclusters is costly because all combinations of columns and rows must be considered in order to find all the biclusters. The search space for the biclustering problem is 2^(m+n), where m and n are the numbers of genes and conditions respectively. Usually m+n is more than 3000. The biclustering problem is NP-hard. Biclustering is a powerful analytical tool for the biologist. The research reported in this thesis addresses the problem of biclustering. Ten algorithms are developed for the identification of coherent biclusters in gene expression data. All these algorithms use a measure called mean squared residue (MSR) to search for biclusters. The objective is to identify biclusters of maximum size with a mean squared residue lower than a given threshold.
All these algorithms begin the search from tightly coregulated submatrices called seeds. These seeds are generated by the K-Means clustering algorithm. The algorithms developed can be classified as constraint-based, greedy and metaheuristic. The constraint-based algorithms use one or more constraints, namely the MSR threshold and the MSR difference threshold. The greedy approach makes a locally optimal choice at each stage with the objective of finding the global optimum. In the metaheuristic approaches, Particle Swarm Optimization (PSO) and variants of the Greedy Randomized Adaptive Search Procedure (GRASP) are used for the identification of biclusters. These algorithms are applied to the Yeast and Lymphoma datasets. All these algorithms identify biologically relevant and statistically significant biclusters, which are validated against the Gene Ontology database. All these algorithms are compared with some other biclustering algorithms. The algorithms developed in this work overcome some of the problems associated with existing algorithms. Some of the algorithms developed in this work identify, in both the Yeast and Lymphoma data sets, biclusters with very high row variance, higher than that obtained by any other algorithm using mean squared residue. Such biclusters, which reflect significant changes in expression level, are highly relevant biologically.
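
The abstract names the mean squared residue but does not give its formula. The sketch below assumes the commonly used Cheng and Church definition, in which the residue of an entry is the entry minus its row mean, minus its column mean, plus the mean of the whole submatrix, and the score is the average of the squared residues. The function name and the list-of-lists matrix layout are hypothetical choices made for illustration, not taken from the thesis.

```python
def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue of the bicluster given by row and column indices.

    Assumes the Cheng and Church definition (an assumption; the abstract
    does not state the formula). `matrix` is a list of equal-length lists;
    `rows` and `cols` need not be contiguous.
    """
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n_rows, n_cols = len(rows), len(cols)

    row_means = [sum(r) / n_cols for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    overall_mean = sum(row_means) / n_rows

    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            residue = sub[i][j] - row_means[i] - col_means[j] + overall_mean
            total += residue * residue
    return total / (n_rows * n_cols)
```

Under this definition a submatrix whose rows differ only by a constant shift scores zero, so a low score indicates a coherent bicluster. A search procedure in the spirit of the abstract would accept a candidate only while its score stays below the chosen threshold.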