945 resultados para File processing (Computer science)
Resumo:
clRNG et clProbdist sont deux interfaces de programmation (APIs) que nous avons développées pour la génération de nombres aléatoires uniformes et non uniformes sur des dispositifs de calculs parallèles en utilisant l’environnement OpenCL. La première interface permet de créer au niveau d’un ordinateur central (hôte) des objets de type stream considérés comme des générateurs virtuels parallèles qui peuvent être utilisés aussi bien sur l’hôte que sur les dispositifs parallèles (unités de traitement graphique, CPU multinoyaux, etc.) pour la génération de séquences de nombres aléatoires. La seconde interface permet aussi de générer au niveau de ces unités des variables aléatoires selon différentes lois de probabilité continues et discrètes. Dans ce mémoire, nous allons rappeler des notions de base sur les générateurs de nombres aléatoires, décrire les systèmes hétérogènes ainsi que les techniques de génération parallèle de nombres aléatoires. Nous présenterons aussi les différents modèles composant l’architecture de l’environnement OpenCL et détaillerons les structures des APIs développées. Nous distinguons pour clRNG les fonctions qui permettent la création des streams, les fonctions qui génèrent les variables aléatoires uniformes ainsi que celles qui manipulent les états des streams. clProbDist contient les fonctions de génération de variables aléatoires non uniformes selon la technique d’inversion ainsi que les fonctions qui permettent de retourner différentes statistiques des lois de distribution implémentées. Nous évaluerons ces interfaces de programmation avec deux simulations qui implémentent un exemple simplifié d’un modèle d’inventaire et un exemple d’une option financière. Enfin, nous fournirons les résultats d’expérimentation sur les performances des générateurs implémentés.
Resumo:
Sharing of information with those in need of it has always been an idealistic goal of networked environments. With the proliferation of computer networks, information is so widely distributed among systems, that it is imperative to have well-organized schemes for retrieval and also discovery. This thesis attempts to investigate the problems associated with such schemes and suggests a software architecture, which is aimed towards achieving a meaningful discovery. Usage of information elements as a modelling base for efficient information discovery in distributed systems is demonstrated with the aid of a novel conceptual entity called infotron.The investigations are focused on distributed systems and their associated problems. The study was directed towards identifying suitable software architecture and incorporating the same in an environment where information growth is phenomenal and a proper mechanism for carrying out information discovery becomes feasible. An empirical study undertaken with the aid of an election database of constituencies distributed geographically, provided the insights required. This is manifested in the Election Counting and Reporting Software (ECRS) System. ECRS system is a software system, which is essentially distributed in nature designed to prepare reports to district administrators about the election counting process and to generate other miscellaneous statutory reports.Most of the distributed systems of the nature of ECRS normally will possess a "fragile architecture" which would make them amenable to collapse, with the occurrence of minor faults. This is resolved with the help of the penta-tier architecture proposed, that contained five different technologies at different tiers of the architecture.The results of experiment conducted and its analysis show that such an architecture would help to maintain different components of the software intact in an impermeable manner from any internal or external faults. The architecture thus evolved needed a mechanism to support information processing and discovery. This necessitated the introduction of the noveI concept of infotrons. Further, when a computing machine has to perform any meaningful extraction of information, it is guided by what is termed an infotron dictionary.The other empirical study was to find out which of the two prominent markup languages namely HTML and XML, is best suited for the incorporation of infotrons. A comparative study of 200 documents in HTML and XML was undertaken. The result was in favor ofXML.The concept of infotron and that of infotron dictionary, which were developed, was applied to implement an Information Discovery System (IDS). IDS is essentially, a system, that starts with the infotron(s) supplied as clue(s), and results in brewing the information required to satisfy the need of the information discoverer by utilizing the documents available at its disposal (as information space). The various components of the system and their interaction follows the penta-tier architectural model and therefore can be considered fault-tolerant. IDS is generic in nature and therefore the characteristics and the specifications were drawn up accordingly. Many subsystems interacted with multiple infotron dictionaries that were maintained in the system.In order to demonstrate the working of the IDS and to discover the information without modification of a typical Library Information System (LIS), an Information Discovery in Library Information System (lDLIS) application was developed. IDLIS is essentially a wrapper for the LIS, which maintains all the databases of the library. The purpose was to demonstrate that the functionality of a legacy system could be enhanced with the augmentation of IDS leading to information discovery service. IDLIS demonstrates IDS in action. IDLIS proves that any legacy system could be augmented with IDS effectively to provide the additional functionality of information discovery service.Possible applications of IDS and scope for further research in the field are covered.
Resumo:
Modern computer systems are plagued with stability and security problems: applications lose data, web servers are hacked, and systems crash under heavy load. Many of these problems or anomalies arise from rare program behavior caused by attacks or errors. A substantial percentage of the web-based attacks are due to buffer overflows. Many methods have been devised to detect and prevent anomalous situations that arise from buffer overflows. The current state-of-art of anomaly detection systems is relatively primitive and mainly depend on static code checking to take care of buffer overflow attacks. For protection, Stack Guards and I-leap Guards are also used in wide varieties.This dissertation proposes an anomaly detection system, based on frequencies of system calls in the system call trace. System call traces represented as frequency sequences are profiled using sequence sets. A sequence set is identified by the starting sequence and frequencies of specific system calls. The deviations of the current input sequence from the corresponding normal profile in the frequency pattern of system calls is computed and expressed as an anomaly score. A simple Bayesian model is used for an accurate detection.Experimental results are reported which show that frequency of system calls represented using sequence sets, captures the normal behavior of programs under normal conditions of usage. This captured behavior allows the system to detect anomalies with a low rate of false positives. Data are presented which show that Bayesian Network on frequency variations responds effectively to induced buffer overflows. It can also help administrators to detect deviations in program flow introduced due to errors.
Resumo:
This thesis Entitled Journal productivity in fishery science an informetric analysis.The analyses and formulating results of the study, the format of the thesis was determined. The thesis is divided into different chapters mentioned below. Chapter 1 gives an overview on the topic of research. Introduction gives the relevance of topic, define the problem, objectives of the study, hypothesis, methods of data collection, analysis and layout of the thesis. Chapter 2 provides a detailed account of the subject Fishery science and its development. A comprehensive outline is given along with definition, scope, classification, development and sources of information.Method of study used in this research and its literature review form the content of this chapter. Chapter 4 Details of the method adopted for collecting samples for the study, data collection and organization of the data are given. The methods are based on availability of data, period and objectives of the research undertaken.The description, analyses and the results of the study are covered in this chapter.
Resumo:
Information communication technology (IC T) has invariably brought about fundamental changes in the way in which libraries gather. preserve and disseminate information. The study was carried out with an aim to estimate and compare the information seeking behaviour (ISB) of the academics of two prominent universities of Kerala in the context of advancements achieved through ICT. The study was motivated by the fast changing scenario of libraries with the proliferation of many high tech products and services. The main purpose of the study was to identify the chief source of information of the academics, and also to examine academics preference upon the form and format of information source. The study also tries to estimate the adequacy of the resources and services currently provided by the libraries.The questionnaire was the central instrument for data collection. An almost census method was adopted for data collection engaging various methods and tools for eliciting data.The total population of the study was 957, out of which questionnaire was distributed to 859 academics. 646 academics responded to the survey, of which 564 of them were sound responses. Data was coded and analysed using Statistical Package for Social Sciences (SPSS) software and also with the help of Microsofl Excel package. Various statistical techniques were engaged to analyse data. A paradigm shift is evident by the fact that academies push themselves towards information in internet i.e. they prefer electronic source to traditional source and the very shift is coupled itself with e-seeking of information. The study reveals that ISB of the academics is influenced priman'ly by personal factors and comparative analysis shows that the ISB ofthc academics is similar in both universities. The productivity of the academics was tested to dig up any relation with respect to their ISB, and it is found that productivity of the academics is extensively related with their ISB. Study also reveals that the users ofthe library are satisfied with the services provided but not with the sources and in conjunction, study also recommends ways and means to improve the existing library system.
Resumo:
Data mining is one of the hottest research areas nowadays as it has got wide variety of applications in common man’s life to make the world a better place to live. It is all about finding interesting hidden patterns in a huge history data base. As an example, from a sales data base, one can find an interesting pattern like “people who buy magazines tend to buy news papers also” using data mining. Now in the sales point of view the advantage is that one can place these things together in the shop to increase sales. In this research work, data mining is effectively applied to a domain called placement chance prediction, since taking wise career decision is so crucial for anybody for sure. In India technical manpower analysis is carried out by an organization named National Technical Manpower Information System (NTMIS), established in 1983-84 by India's Ministry of Education & Culture. The NTMIS comprises of a lead centre in the IAMR, New Delhi, and 21 nodal centres located at different parts of the country. The Kerala State Nodal Centre is located at Cochin University of Science and Technology. In Nodal Centre, they collect placement information by sending postal questionnaire to passed out students on a regular basis. From this raw data available in the nodal centre, a history data base was prepared. Each record in this data base includes entrance rank ranges, reservation, Sector, Sex, and a particular engineering. From each such combination of attributes from the history data base of student records, corresponding placement chances is computed and stored in the history data base. From this data, various popular data mining models are built and tested. These models can be used to predict the most suitable branch for a particular new student with one of the above combination of criteria. Also a detailed performance comparison of the various data mining models is done.This research work proposes to use a combination of data mining models namely a hybrid stacking ensemble for better predictions. A strategy to predict the overall absorption rate for various branches as well as the time it takes for all the students of a particular branch to get placed etc are also proposed. Finally, this research work puts forward a new data mining algorithm namely C 4.5 * stat for numeric data sets which has been proved to have competent accuracy over standard benchmarking data sets called UCI data sets. It also proposes an optimization strategy called parameter tuning to improve the standard C 4.5 algorithm. As a summary this research work passes through all four dimensions for a typical data mining research work, namely application to a domain, development of classifier models, optimization and ensemble methods.
Resumo:
Handwriting is an acquired tool used for communication of one's observations or feelings. Factors that inuence a person's handwriting not only dependent on the individual's bio-mechanical constraints, handwriting education received, writing instrument, type of paper, background, but also factors like stress, motivation and the purpose of the handwriting. Despite the high variation in a person's handwriting, recent results from different writer identification studies have shown that it possesses sufficient individual traits to be used as an identification method. Handwriting as a behavioral biometric has had the interest of researchers for a long time. But recently it has been enjoying new interest due to an increased need and effort to deal with problems ranging from white-collar crime to terrorist threats. The identification of the writer based on a piece of handwriting is a challenging task for pattern recognition. The main objective of this thesis is to develop a text independent writer identification system for Malayalam Handwriting. The study also extends to developing a framework for online character recognition of Grantha script and Malayalam characters
Resumo:
This paper presents a performance analysis of reversible, fault tolerant VLSI implementations of carry select and hybrid decimal adders suitable for multi-digit BCD addition. The designs enable partial parallel processing of all digits that perform high-speed addition in decimal domain. When the number of digits is more than 25 the hybrid decimal adder can operate 5 times faster than conventional decimal adder using classical logic gates. The speed up factor of hybrid adder increases above 10 when the number of decimal digits is more than 25 for reversible logic implementation. Such highspeed decimal adders find applications in real time processors and internet-based applications. The implementations use only reversible conservative Fredkin gates, which make it suitable for VLSI circuits.
Resumo:
Image processing has been a challenging and multidisciplinary research area since decades with continuing improvements in its various branches especially Medical Imaging. The healthcare industry was very much benefited with the advances in Image Processing techniques for the efficient management of large volumes of clinical data. The popularity and growth of Image Processing field attracts researchers from many disciplines including Computer Science and Medical Science due to its applicability to the real world. In the meantime, Computer Science is becoming an important driving force for the further development of Medical Sciences. The objective of this study is to make use of the basic concepts in Medical Image Processing and develop methods and tools for clinicians’ assistance. This work is motivated from clinical applications of digital mammograms and placental sonograms, and uses real medical images for proposing a method intended to assist radiologists in the diagnostic process. The study consists of two domains of Pattern recognition, Classification and Content Based Retrieval. Mammogram images of breast cancer patients and placental images are used for this study. Cancer is a disaster to human race. The accuracy in characterizing images using simplified user friendly Computer Aided Diagnosis techniques helps radiologists in detecting cancers at an early stage. Breast cancer which accounts for the major cause of cancer death in women can be fully cured if detected at an early stage. Studies relating to placental characteristics and abnormalities are important in foetal monitoring. The diagnostic variability in sonographic examination of placenta can be overlooked by detailed placental texture analysis by focusing on placental grading. The work aims on early breast cancer detection and placental maturity analysis. This dissertation is a stepping stone in combing various application domains of healthcare and technology.
Resumo:
Speech processing and consequent recognition are important areas of Digital Signal Processing since speech allows people to communicate more natu-rally and efficiently. In this work, a speech recognition system is developed for re-cognizing digits in Malayalam. For recognizing speech, features are to be ex-tracted from speech and hence feature extraction method plays an important role in speech recognition. Here, front end processing for extracting the features is per-formed using two wavelet based methods namely Discrete Wavelet Transforms (DWT) and Wavelet Packet Decomposition (WPD). Naive Bayes classifier is used for classification purpose. After classification using Naive Bayes classifier, DWT produced a recognition accuracy of 83.5% and WPD produced an accuracy of 80.7%. This paper is intended to devise a new feature extraction method which produces improvements in the recognition accuracy. So, a new method called Dis-crete Wavelet Packet Decomposition (DWPD) is introduced which utilizes the hy-brid features of both DWT and WPD. The performance of this new approach is evaluated and it produced an improved recognition accuracy of 86.2% along with Naive Bayes classifier.
Resumo:
This paper describes JERIM-320, a new 320-bit hash function used for ensuring message integrity and details a comparison with popular hash functions of similar design. JERIM-320 and FORK -256 operate on four parallel lines of message processing while RIPEMD-320 operates on two parallel lines. Popular hash functions like MD5 and SHA-1 use serial successive iteration for designing compression functions and hence are less secure. The parallel branches help JERIM-320 to achieve higher level of security using multiple iterations and processing on the message blocks. The focus of this work is to prove the ability of JERIM 320 in ensuring the integrity of messages to a higher degree to suit the fast growing internet applications
Resumo:
Speech is the primary, most prominent and convenient means of communication in audible language. Through speech, people can express their thoughts, feelings or perceptions by the articulation of words. Human speech is a complex signal which is non stationary in nature. It consists of immensely rich information about the words spoken, accent, attitude of the speaker, expression, intention, sex, emotion as well as style. The main objective of Automatic Speech Recognition (ASR) is to identify whatever people speak by means of computer algorithms. This enables people to communicate with a computer in a natural spoken language. Automatic recognition of speech by machines has been one of the most exciting, significant and challenging areas of research in the field of signal processing over the past five to six decades. Despite the developments and intensive research done in this area, the performance of ASR is still lower than that of speech recognition by humans and is yet to achieve a completely reliable performance level. The main objective of this thesis is to develop an efficient speech recognition system for recognising speaker independent isolated words in Malayalam.
Resumo:
This report examines why women pursue careers in computer science and related fields far less frequently than men do. In 1990, only 13% of PhDs in computer science went to women, and only 7.8% of computer science professors were female. Causes include the different ways in which boys and girls are raised, the stereotypes of female engineers, subtle biases that females face, problems resulting from working in predominantly male environments, and sexual biases in language. A theme of the report is that women's underrepresentation is not primarily due to direct discrimination but to subconscious behavior that perpetuates the status quo.
Resumo:
In this paper, we develop a novel index structure to support efficient approximate k-nearest neighbor (KNN) query in high-dimensional databases. In high-dimensional spaces, the computational cost of the distance (e.g., Euclidean distance) between two points contributes a dominant portion of the overall query response time for memory processing. To reduce the distance computation, we first propose a structure (BID) using BIt-Difference to answer approximate KNN query. The BID employs one bit to represent each feature vector of point and the number of bit-difference is used to prune the further points. To facilitate real dataset which is typically skewed, we enhance the BID mechanism with clustering, cluster adapted bitcoder and dimensional weight, named the BID⁺. Extensive experiments are conducted to show that our proposed method yields significant performance advantages over the existing index structures on both real life and synthetic high-dimensional datasets.
Resumo:
In this paper, we present a P2P-based database sharing system that provides information sharing capabilities through keyword-based search techniques. Our system requires neither a global schema nor schema mappings between different databases, and our keyword-based search algorithms are robust in the presence of frequent changes in the content and membership of peers. To facilitate data integration, we introduce keyword join operator to combine partial answers containing different keywords into complete answers. We also present an efficient algorithm that optimize the keyword join operations for partial answer integration. Our experimental study on both real and synthetic datasets demonstrates the effectiveness of our algorithms, and the efficiency of the proposed query processing strategies.