20 resultados para Arabic language--Style

em Indian Institute of Science - Bangalore - Índia


Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper describes the design and implementation of a high-level query language called Generalized Query-By-Rule (GQBR) which supports retrieval, insertion, deletion and update operations. This language, based on the formalism of database logic, enables the users to access each database in a distributed heterogeneous environment, without having to learn all the different data manipulation languages. The compiler has been implemented on a DEC 1090 system in Pascal.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Database management systems offer a very reliable and attractive data organization for fast and economical information storage and processing for diverse applications. It is much more important that the information should be easily accessible to users with varied backgrounds, professional as well as casual, through a suitable data sublanguage. The language adopted here (APPLE) is one such language for relational database systems and is completely nonprocedural and well suited to users with minimum or no programming background. This is supported by an access path model which permits the user to formulate completely nonprocedural queries expressed solely in terms of attribute names. The data description language (DDL) and data manipulation language (DML) features of APPLE are also discussed. The underlying relational database has been implemented with the help of the DATATRIEVE-11 utility for record and domain definition which is available on the PDP-11/35. The package is coded in Pascal and MACRO-11. Further, most of the limitations of the DATATRIEVE-11 utility have been eliminated in the interface package.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

An applicative language based on the LAMBDA-Calculus is presented. The language, SLIPS (Small Language for Instruction Purposes), is described using the LAMBDA-Calculus as a metalanguage. A call-by-need mechanism of function invocation eliminates the drawbacks of both call-by-name and call-by-value. The system has been implemented in PASCAL.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Many novel computer architectures like array and multiprocessors which achieve high performance through the use of concurrency exploit variations of the von Neumann model of computation. The effective utilization of the machines makes special demands on programmers and their programming languages, such as the structuring of data into vectors or the partitioning of programs into concurrent processes. In comparison, the data flow model of computation demands only that the principle of structured programming be followed. A data flow program, often represented as a data flow graph, is a program that expresses a computation by indicating the data dependencies among operators. A data flow computer is a machine designed to take advantage of concurrency in data flow graphs by executing data independent operations in parallel. In this paper, we discuss the design of a high level language (DFL: Data Flow Language) suitable for data flow computers. Some sample procedures in DFL are presented. The implementation aspects have not been discussed in detail since there are no new problems encountered. The language DFL embodies the concepts of functional programming, but in appearance closely resembles Pascal. The language is a better vehicle than the data flow graph for expressing a parallel algorithm. The compiler has been implemented on a DEC 1090 system in Pascal.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This correspondence describes a method for automated segmentation of speech. The method proposed in this paper uses a specially designed filter-bank called Bach filter-bank which makes use of 'music' related perception criteria. The speech signal is treated as continuously time varying signal as against a short time stationary model. A comparative study has been made of the performances using Mel, Bark and Bach scale filter banks. The preliminary results show up to 80 % matches within 20 ms of the manually segmented data, without any information of the content of the text and without any language dependence. The Bach filters are seen to marginally outperform the other filters.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A new language concept for high-level distributed programming is proposed. Programs are organised as a collection of concurrently executing processes. Some of these processes, referred to as liaison processes, have a monitor-like structure and contain ports which may be invoked by other processes for the purposes of synchronisation and communication. Synchronisation is achieved by conditional activation of ports and also through port control constructs which may directly specify the execution ordering of ports. These constructs implement a path-expression-like mechanism for synchronisation and are also equipped with options to provide conditional, non-deterministic and priority ordering of ports. The usefulness and expressive power of the proposed concepts are illustrated through solutions of several representative programming problems. Some implementation issues are also considered.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper introduces CSP-like communication mechanisms into Backus’ Functional Programming (FP) systems extended by nondeterministic constructs. Several new functionals are used to describe nondeterminism and communication in programs. The functionals union and restriction are introduced into FP systems to develop a simple algebra of programs with nondeterminism. The behaviour of other functionals proposed in this paper are characterized by the properties of union and restriction. The axiomatic semantics of communication constructs are presented. Examples show that it is possible to reason about a communicating program by first transforming it into a non-communicating program by using the axioms of communication, and then reasoning about the resulting non-communicating version of the program. It is also shown that communicating programs can be developed from non-communicating programs given as specifications by using a transformational approach.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We present a improved language modeling technique for Lempel-Ziv-Welch (LZW) based LID scheme. The previous approach to LID using LZW algorithm prepares the language pattern table using LZW algorithm. Because of the sequential nature of the LZW algorithm, several language specific patterns of the language were missing in the pattern table. To overcome this, we build a universal pattern table, which contains all patterns of different length. For each language it's corresponding language specific pattern table is constructed by retaining the patterns of the universal table whose frequency of appearance in the training data is above the threshold.This approach reduces the classification score (Compression Ratio [LZW-CR] or the weighted discriminant score[LZW-WDS]) for non native languages and increases the LID performance considerably.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We present a new approach to spoken language modeling for language identification (LID) using the Lempel-Ziv-Welch (LZW) algorithm. The LZW technique is applicable to any kind of tokenization of the speech signal. Because of the efficiency of LZW algorithm to obtain variable length symbol strings in the training data, the LZW codebook captures the essentials of a language effectively. We develop two new deterministic measures for LID based on the LZW algorithm namely: (i) Compression ratio score (LZW-CR) and (ii) weighted discriminant score (LZW-WDS). To assess these measures, we consider error-free tokenization of speech as well as artificially induced noise in the tokenization. It is shown that for a 6 language LID task of OGI-TS database with clean tokenization, the new model (LZW-WDS) performs slightly better than the conventional bigram model. For noisy tokenization, which is the more realistic case, LZW-WDS significantly outperforms the bigram technique

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper describes the efforts at MILE lab, IISc, to create a 100,000-word database each in Kannada and Tamil for the design and development of Online Handwritten Recognition. It has been collected from over 600 users in order to capture the variations in writing style. We describe features of the scripts and how the number of symbols were reduced to be able to effectively train the data for recognition. The list of words include all the characters, Kannada and Indo-Arabic numerals, punctuations and other symbols. A semi-automated tool for the annotation of data from stroke to word level is used. It segments each word into stroke groups and also acts as a validation mechanism for segmentation. The tool displays the stroke, stroke groups and aksharas of a word and hence can be used to study the various styles of writing, delayed strokes and for assigning quality tags to the words. The tool is currently being used for annotating Tamil and Kannada data. The output is stored in a standard XML format.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper, we present an unrestricted Kannada online handwritten character recognizer which is viable for real time applications. It handles Kannada and Indo-Arabic numerals, punctuation marks and special symbols like $, &, # etc, apart from all the aksharas of the Kannada script. The dataset used has handwriting of 69 people from four different locations, making the recognition writer independent. It was found that for the DTW classifier, using smoothed first derivatives as features, enhanced the performance to 89% as compared to preprocessed co-ordinates which gave 85%, but was too inefficient in terms of time. To overcome this, we used Statistical Dynamic Time Warping (SDTW) and achieved 46 times faster classification with comparable accuracy i.e. 88%, making it fast enough for practical applications. The accuracies reported are raw symbol recognition results from the classifier. Thus, there is good scope of improvement in actual applications. Where domain constraints such as fixed vocabulary, language models and post processing can be employed. A working demo is also available on tablet PC for recognition of Kannada words.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Current scientific research is characterized by increasing specialization, accumulating knowledge at a high speed due to parallel advances in a multitude of sub-disciplines. Recent estimates suggest that human knowledge doubles every two to three years – and with the advances in information and communication technologies, this wide body of scientific knowledge is available to anyone, anywhere, anytime. This may also be referred to as ambient intelligence – an environment characterized by plentiful and available knowledge. The bottleneck in utilizing this knowledge for specific applications is not accessing but assimilating the information and transforming it to suit the needs for a specific application. The increasingly specialized areas of scientific research often have the common goal of converting data into insight allowing the identification of solutions to scientific problems. Due to this common goal, there are strong parallels between different areas of applications that can be exploited and used to cross-fertilize different disciplines. For example, the same fundamental statistical methods are used extensively in speech and language processing, in materials science applications, in visual processing and in biomedicine. Each sub-discipline has found its own specialized methodologies making these statistical methods successful to the given application. The unification of specialized areas is possible because many different problems can share strong analogies, making the theories developed for one problem applicable to other areas of research. It is the goal of this paper to demonstrate the utility of merging two disparate areas of applications to advance scientific research. The merging process requires cross-disciplinary collaboration to allow maximal exploitation of advances in one sub-discipline for that of another. We will demonstrate this general concept with the specific example of merging language technologies and computational biology.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Parallel sub-word recognition (PSWR) is a new model that has been proposed for language identification (LID) which does not need elaborate phonetic labeling of the speech data in a foreign language. The new approach performs a front-end tokenization in terms of sub-word units which are designed by automatic segmentation, segment clustering and segment HMM modeling. We develop PSWR based LID in a framework similar to the parallel phone recognition (PPR) approach in the literature. This includes a front-end tokenizer and a back-end language model, for each language to be identified. Considering various combinations of the statistical evaluation scores, it is found that PSWR can perform as well as PPR, even with broad acoustic sub-word tokenization, thus making it an efficient alternative to the PPR system.