916 resultados para Data compression (Computer science)
Resumo:
Image registration is an important component of image analysis used to align two or more images. In this paper, we present a new framework for image registration based on compression. The basic idea underlying our approach is the conjecture that two images are correctly registered when we can maximally compress one image given the information in the other. The contribution of this paper is twofold. First, we show that the image registration process can be dealt with from the perspective of a compression problem. Second, we demonstrate that the similarity metric, introduced by Li et al., performs well in image registration. Two different versions of the similarity metric have been used: the Kolmogorov version, computed using standard real-world compressors, and the Shannon version, calculated from an estimation of the entropy rate of the images
Resumo:
Includes indexes.
Resumo:
Originally presented as the author's thesis (M.S.), University of Illinois at Urbana-Champaign.
Resumo:
Mode of access: Internet.
Resumo:
High-speed videokeratoscopy is an emerging technique that enables study of the corneal surface and tear-film dynamics. Unlike its static predecessor, this new technique results in a very large amount of digital data for which storage needs become significant. We aimed to design a compression technique that would use mathematical functions to parsimoniously fit corneal surface data with a minimum number of coefficients. Since the Zernike polynomial functions that have been traditionally used for modeling corneal surfaces may not necessarily correctly represent given corneal surface data in terms of its optical performance, we introduced the concept of Zernike polynomial-based rational functions. Modeling optimality criteria were employed in terms of both the rms surface error as well as the point spread function cross-correlation. The parameters of approximations were estimated using a nonlinear least-squares procedure based on the Levenberg-Marquardt algorithm. A large number of retrospective videokeratoscopic measurements were used to evaluate the performance of the proposed rational-function-based modeling approach. The results indicate that the rational functions almost always outperform the traditional Zernike polynomial approximations with the same number of coefficients.
Resumo:
Purpose The purpose of this study was to evaluate the validity of the CSA activity monitor as a measure of children's physical activity using energy expenditure (EE) as a criterion measure. Methods Thirty subjects aged 10 to 14 performed three 5-min treadmill bouts at 3, 4, and 6 mph, respectively. While on the treadmill, subjects wore CSA (WAM 7164) activity monitors on the right and left hips. (V) over dot O-2 was monitored continuously by an automated system. EE was determined by multiplying the average (V) over dot O-2 by the caloric equivalent of the mean respiratory exchange ratio. Results Repeated measures ANOVA indicated that both CSA monitors were sensitive to changes in treadmill speed. Mean activity counts from each CSA unit were not significantly different and the intraclass reliability coefficient for the two CSA units across all speeds was 0.87. Activity counts from both CSA units were strongly correlated with EE (r = 0.86 and 0.87, P < 0.001). An EE prediction equation was developed from 20 randomly selected subjects and cross-validated on the remaining 10. The equation predicted mean EE within 0.01 kcal.min(-1). The correlation between actual and predicted values was 0.93 (P < 0.01) and the SEE was 0.93 kcal.min(-1). Conclusion These data indicate that the CSA monitor is a valid and reliable tool for quantifying treadmill walking and running in children.
Resumo:
Data flow computers are high-speed machines in which an instruction is executed as soon as all its operands are available. This paper describes the EXtended MANchester (EXMAN) data flow computer which incorporates three major extensions to the basic Manchester machine. As extensions we provide a multiple matching units scheme, an efficient, implementation of array data structure, and a facility to concurrently execute reentrant routines. A simulator for the EXMAN computer has been coded in the discrete event simulation language, SIMULA 67, on the DEC 1090 system. Performance analysis studies have been conducted on the simulated EXMAN computer to study the effectiveness of the proposed extensions. The performance experiments have been carried out using three sample problems: matrix multiplication, Bresenham's line drawing algorithm, and the polygon scan-conversion algorithm.
Resumo:
The EEG time series has been subjected to various formalisms of analysis to extract meaningful information regarding the underlying neural events. In this paper the linear prediction (LP) method has been used for analysis and presentation of spectral array data for the better visualisation of background EEG activity. It has also been used for signal generation, efficient data storage and transmission of EEG. The LP method is compared with the standard Fourier method of compressed spectral array (CSA) of the multichannel EEG data. The autocorrelation autoregressive (AR) technique is used for obtaining the LP coefficients with a model order of 15. While the Fourier method reduces the data only by half, the LP method just requires the storage of signal variance and LP coefficients. The signal generated using white Gaussian noise as the input to the LP filter has a high correlation coefficient of 0.97 with that of original signal, thus making LP as a useful tool for storage and transmission of EEG. The biological significance of Fourier method and the LP method in respect to the microstructure of neuronal events in the generation of EEG is discussed.
Resumo:
In a recent paper (Changes in Web Client Access Patterns: Characteristics and Caching Implications by Barford, Bestavros, Bradley, and Crovella) we performed a variety of analyses upon user traces collected in the Boston University Computer Science department in 1995 and 1998. A sanitized version of the 1995 trace has been publicly available for some time; the 1998 trace has now been sanitized, and is available from: http://www.cs.bu.edu/techreports/1999-011-usertrace-98.gz ftp://ftp.cs.bu.edu/techreports/1999-011-usertrace-98.gz This memo discusses the format of this public version of the log, and includes additional discussion of how the data was collected, how the log was sanitized, what this log is and is not useful for, and areas of potential future research interest.
Resumo:
In recent years, it has been observed that software clones and plagiarism are becoming an increased threat for one?s creativity. Clones are the results of copying and using other?s work. According to the Merriam – Webster dictionary, “A clone is one that appears to be a copy of an original form”. It is synonym to duplicate. Clones lead to redundancy of codes, but not all redundant code is a clone.On basis of this background knowledge ,in order to safeguard one?s idea and to avoid intentional code duplication for pretending other?s work as if their owns, software clone detection should be emphasized more. The objective of this paper is to review the methods for clone detection and to apply those methods for finding the extent of plagiarism occurrence among the Swedish Universities in Master level computer science department and to analyze the results.The rest part of the paper, discuss about software plagiarism detection which employs data analysis technique and then statistical analysis of the results.Plagiarism is an act of stealing and passing off the idea?s and words of another person?s as one?s own. Using data analysis technique, samples(Master level computer Science thesis report) were taken from various Swedish universities and processed in Ephorus anti plagiarism software detection. Ephorus gives the percentage of plagiarism for each thesis document, from this results statistical analysis were carried out using Minitab Software.The results gives a very low percentage of Plagiarism extent among the Swedish universities, which concludes that Plagiarism is not a threat to Sweden?s standard of education in computer science.This paper is based on data analysis, intelligence techniques, EPHORUS software plagiarism detection tool and MINITAB statistical software analysis.
Resumo:
New information and communication technologies may be useful for providing more in-depth knowledge to students in many ways, whether through online multimedia educational material, or through online debates with colleagues, teachers and other area professionals in a synchronous or asynchronous manner. This paper focuses on participation in online discussion in e-learning courses for promoting learning. Although an important theoretical aspect, an analysis of literature reveals there are few studies evaluating the personal and social aspects of online course users in a quantitative manner. This paper aims to introduce a method for diagnosing inclusion and digital proficiency and other personal aspects of the student through a case study comparing Information System, Public Relations and Engineering students at a public university in Brazil. Statistical analysis and analysis of variances (ANOVA) were used as the methodology for data analysis in order to understand existing relations between the components of the proposed method. The survey methodology was also used, in its online format, as a research instrument. The method is based on using online questionnaires that diagnose digital proficiency and time management, level of extroversion and social skills of the students. According to the sample studied, there is no strong correlation between digital proficiency and individual characteristics tied to the use of time, level of extroversion and social skills of students. The differences in course grades for some components are partly due to subject 'Introduction to Economics' being offered to freshmen in Public Relations, whereas subject 'Economics in Engineering' is offered in the final semesters of Engineering and Information Systems courses. Therefore, the difference could be more tied to the respondent's age than to the course. Information Systems students were observed to be older, with access to computers and Internet at the workplace, compared to the other students who access the Internet more often from home. This paper presents a pilot study aimed at conducting a diagnosis that permits proposing actions for information and communication technology to contribute towards student education. Three levels of digital inclusion are described as a scale to measure whether information technology increases personal performance and professional knowledge and skills. This study may be useful for other readers interested in themes related to education in engineering. © 2013 IEEE.
Resumo:
Data deduplication describes a class of approaches that reduce the storage capacity needed to store data or the amount of data that has to be transferred over a network. These approaches detect coarse-grained redundancies within a data set, e.g. a file system, and remove them.rnrnOne of the most important applications of data deduplication are backup storage systems where these approaches are able to reduce the storage requirements to a small fraction of the logical backup data size.rnThis thesis introduces multiple new extensions of so-called fingerprinting-based data deduplication. It starts with the presentation of a novel system design, which allows using a cluster of servers to perform exact data deduplication with small chunks in a scalable way.rnrnAfterwards, a combination of compression approaches for an important, but often over- looked, data structure in data deduplication systems, so called block and file recipes, is introduced. Using these compression approaches that exploit unique properties of data deduplication systems, the size of these recipes can be reduced by more than 92% in all investigated data sets. As file recipes can occupy a significant fraction of the overall storage capacity of data deduplication systems, the compression enables significant savings.rnrnA technique to increase the write throughput of data deduplication systems, based on the aforementioned block and file recipes, is introduced next. The novel Block Locality Caching (BLC) uses properties of block and file recipes to overcome the chunk lookup disk bottleneck of data deduplication systems. This chunk lookup disk bottleneck either limits the scalability or the throughput of data deduplication systems. The presented BLC overcomes the disk bottleneck more efficiently than existing approaches. Furthermore, it is shown that it is less prone to aging effects.rnrnFinally, it is investigated if large HPC storage systems inhibit redundancies that can be found by fingerprinting-based data deduplication. Over 3 PB of HPC storage data from different data sets have been analyzed. In most data sets, between 20 and 30% of the data can be classified as redundant. According to these results, future work in HPC storage systems should further investigate how data deduplication can be integrated into future HPC storage systems.rnrnThis thesis presents important novel work in different area of data deduplication re- search.
Resumo:
In vielen Industriezweigen, zum Beispiel in der Automobilindustrie, werden Digitale Versuchsmodelle (Digital MockUps) eingesetzt, um die Konstruktion und die Funktion eines Produkts am virtuellen Prototypen zu überprüfen. Ein Anwendungsfall ist dabei die Überprüfung von Sicherheitsabständen einzelner Bauteile, die sogenannte Abstandsanalyse. Ingenieure ermitteln dabei für bestimmte Bauteile, ob diese in ihrer Ruhelage sowie während einer Bewegung einen vorgegeben Sicherheitsabstand zu den umgebenden Bauteilen einhalten. Unterschreiten Bauteile den Sicherheitsabstand, so muss deren Form oder Lage verändert werden. Dazu ist es wichtig, die Bereiche der Bauteile, welche den Sicherhabstand verletzen, genau zu kennen. rnrnIn dieser Arbeit präsentieren wir eine Lösung zur Echtzeitberechnung aller den Sicherheitsabstand unterschreitenden Bereiche zwischen zwei geometrischen Objekten. Die Objekte sind dabei jeweils als Menge von Primitiven (z.B. Dreiecken) gegeben. Für jeden Zeitpunkt, in dem eine Transformation auf eines der Objekte angewendet wird, berechnen wir die Menge aller den Sicherheitsabstand unterschreitenden Primitive und bezeichnen diese als die Menge aller toleranzverletzenden Primitive. Wir präsentieren in dieser Arbeit eine ganzheitliche Lösung, welche sich in die folgenden drei großen Themengebiete unterteilen lässt.rnrnIm ersten Teil dieser Arbeit untersuchen wir Algorithmen, die für zwei Dreiecke überprüfen, ob diese toleranzverletzend sind. Hierfür präsentieren wir verschiedene Ansätze für Dreiecks-Dreiecks Toleranztests und zeigen, dass spezielle Toleranztests deutlich performanter sind als bisher verwendete Abstandsberechnungen. Im Fokus unserer Arbeit steht dabei die Entwicklung eines neuartigen Toleranztests, welcher im Dualraum arbeitet. In all unseren Benchmarks zur Berechnung aller toleranzverletzenden Primitive beweist sich unser Ansatz im dualen Raum immer als der Performanteste.rnrnDer zweite Teil dieser Arbeit befasst sich mit Datenstrukturen und Algorithmen zur Echtzeitberechnung aller toleranzverletzenden Primitive zwischen zwei geometrischen Objekten. Wir entwickeln eine kombinierte Datenstruktur, die sich aus einer flachen hierarchischen Datenstruktur und mehreren Uniform Grids zusammensetzt. Um effiziente Laufzeiten zu gewährleisten ist es vor allem wichtig, den geforderten Sicherheitsabstand sinnvoll im Design der Datenstrukturen und der Anfragealgorithmen zu beachten. Wir präsentieren hierzu Lösungen, die die Menge der zu testenden Paare von Primitiven schnell bestimmen. Darüber hinaus entwickeln wir Strategien, wie Primitive als toleranzverletzend erkannt werden können, ohne einen aufwändigen Primitiv-Primitiv Toleranztest zu berechnen. In unseren Benchmarks zeigen wir, dass wir mit unseren Lösungen in der Lage sind, in Echtzeit alle toleranzverletzenden Primitive zwischen zwei komplexen geometrischen Objekten, bestehend aus jeweils vielen hunderttausend Primitiven, zu berechnen. rnrnIm dritten Teil präsentieren wir eine neuartige, speicheroptimierte Datenstruktur zur Verwaltung der Zellinhalte der zuvor verwendeten Uniform Grids. Wir bezeichnen diese Datenstruktur als Shrubs. Bisherige Ansätze zur Speicheroptimierung von Uniform Grids beziehen sich vor allem auf Hashing Methoden. Diese reduzieren aber nicht den Speicherverbrauch der Zellinhalte. In unserem Anwendungsfall haben benachbarte Zellen oft ähnliche Inhalte. Unser Ansatz ist in der Lage, den Speicherbedarf der Zellinhalte eines Uniform Grids, basierend auf den redundanten Zellinhalten, verlustlos auf ein fünftel der bisherigen Größe zu komprimieren und zur Laufzeit zu dekomprimieren.rnrnAbschießend zeigen wir, wie unsere Lösung zur Berechnung aller toleranzverletzenden Primitive Anwendung in der Praxis finden kann. Neben der reinen Abstandsanalyse zeigen wir Anwendungen für verschiedene Problemstellungen der Pfadplanung.