26 resultados para Turkish language--Usage
Suite of tools for statistical N-gram language modeling for pattern mining in whole genome sequences
Resumo:
Genome sequences contain a number of patterns that have biomedical significance. Repetitive sequences of various kinds are a primary component of most of the genomic sequence patterns. We extended the suffix-array based Biological Language Modeling Toolkit to compute n-gram frequencies as well as n-gram language-model based perplexity in windows over the whole genome sequence to find biologically relevant patterns. We present the suite of tools and their application for analysis on whole human genome sequence.
Resumo:
A novel approach that can more effectively use the structural information provided by the traditional imaging modalities in multimodal diffuse optical tomographic imaging is introduced. This approach is based on a prior image-constrained-l(1) minimization scheme and has been motivated by the recent progress in the sparse image reconstruction techniques. It is shown that the proposed framework is more effective in terms of localizing the tumor region and recovering the optical property values both in numerical and gelatin phantom cases compared to the traditional methods that use structural information. (C) 2012 Optical Society of America
Resumo:
Monitoring of infrastructural resources in clouds plays a crucial role in providing application guarantees like performance, availability, and security. Monitoring is crucial from two perspectives - the cloud-user and the service provider. The cloud user’s interest is in doing an analysis to arrive at appropriate Service-level agreement (SLA) demands and the cloud provider’s interest is to assess if the demand can be met. To support this, a monitoring framework is necessary particularly since cloud hosts are subject to varying load conditions. To illustrate the importance of such a framework, we choose the example of performance being the Quality of Service (QoS) requirement and show how inappropriate provisioning of resources may lead to unexpected performance bottlenecks. We evaluate existing monitoring frameworks to bring out the motivation for building much more powerful monitoring frameworks. We then propose a distributed monitoring framework, which enables fine grained monitoring for applications and demonstrate with a prototype system implementation for typical use cases.
Resumo:
The advent and evolution of geohazard warning systems is a very interesting study. The two broad fields that are immediately visible are that of geohazard evaluation and subsequent warning dissemination. Evidently, the latter field lacks any systematic study or standards. Arbitrarily organized and vague data and information on warning techniques create confusion and indecision. The purpose of this review is to try and systematize the available bulk of information on warning systems so that meaningful insights can be derived through decidable flowcharts, and a developmental process can be undertaken. Hence, the methods and technologies for numerous geohazard warning systems have been assessed by putting them into suitable categories for better understanding of possible ways to analyze their efficacy as well as shortcomings. By establishing a classification scheme based on extent, control, time period, and advancements in technology, the geohazard warning systems available in any literature could be comprehensively analyzed and evaluated. Although major advancements have taken place in geohazard warning systems in recent times, they have been lacking a complete purpose. Some systems just assess the hazard and wait for other means to communicate, and some are designed only for communication and wait for the hazard information to be provided, which usually is after the mishap. Primarily, systems are left at the mercy of administrators and service providers and are not in real time. An integrated hazard evaluation and warning dissemination system could solve this problem. Warning systems have also suffered from complexity of nature, requirement of expert-level monitoring, extensive and dedicated infrastructural setups, and so on. The user community, which would greatly appreciate having a convenient, fast, and generalized warning methodology, is surveyed in this review. The review concludes with the future scope of research in the field of hazard warning systems and some suggestions for developing an efficient mechanism toward the development of an automated integrated geohazard warning system. DOI: 10.1061/(ASCE)NH.1527-6996.0000078. (C) 2012 American Society of Civil Engineers.
Resumo:
N-gram language models and lexicon-based word-recognition are popular methods in the literature to improve recognition accuracies of online and offline handwritten data. However, there are very few works that deal with application of these techniques on online Tamil handwritten data. In this paper, we explore methods of developing symbol-level language models and a lexicon from a large Tamil text corpus and their application to improving symbol and word recognition accuracies. On a test database of around 2000 words, we find that bigram language models improve symbol (3%) and word recognition (8%) accuracies and while lexicon methods offer much greater improvements (30%) in terms of word recognition, there is a large dependency on choosing the right lexicon. For comparison to lexicon and language model based methods, we have also explored re-evaluation techniques which involve the use of expert classifiers to improve symbol and word recognition accuracies.
Resumo:
Realization of cloud computing has been possible due to availability of virtualization technologies on commodity platforms. Measuring resource usage on the virtualized servers is difficult because of the fact that the performance counters used for resource accounting are not virtualized. Hence, many of the prevalent virtualization technologies like Xen, VMware, KVM etc., use host specific CPU usage monitoring, which is coarse grained. In this paper, we present a performance monitoring tool for KVM based virtualized machines, which measures the CPU overhead incurred by the hypervisor on behalf of the virtual machine along-with the CPU usage of virtual machine itself. This fine-grained resource usage information, provided by the above tool, can be used for diverse situations like resource provisioning to support performance associated QoS requirements, identification of bottlenecks during VM placements, resource profiling of applications in cloud environments, etc. We demonstrate a use case of this tool by measuring the performance of web-servers hosted on a KVM based virtualized server.
Resumo:
Primates exhibit laterality in hand usage either in terms of (a) hand with which an individual solves a task or while solving a task that requires both hands, executes the most complex action, that is, hand preference, or (b) hand with which an individual executes actions most efficiently, that is, hand performance. Observations from previous studies indicate that laterality in hand usage might reflect specialization of the two hands for accomplishing tasks that require maneuvering dexterity or physical strength. However, no existing study has investigated handedness with regard to this possibility. In this study, we examined laterality in hand usage in urban free-ranging bonnet macaques, Macaca radiata with regard to the above possibility. While solving four distinct food extraction tasks which varied in the number of steps involved in the food extraction process and the dexterity required in executing the individual steps, the macaques consistently used one hand for extracting food (i.e., task requiring maneuvering dexterity)the maneuvering hand, and the other hand for supporting the body (i.e., task requiring physical strength)the supporting hand. Analogously, the macaques used the maneuvering hand for the spontaneous routine activities that involved maneuvering in three-dimensional space, such as grooming, and hitting an opponent during an agonistic interaction, and the supporting hand for those that required physical strength, such as pulling the body up while climbing. Moreover, while solving a task that ergonomically forced the usage of a particular hand, the macaques extracted food faster with the maneuvering hand as compared to the supporting hand, demonstrating the higher maneuvering dexterity of the maneuvering hand. As opposed to the conventional ideas of handedness in non-human primates, these observations demonstrate division of labor between the two hands marked by their consistent usage across spontaneous and experimental tasks requiring maneuvering in three-dimensional space or those requiring physical strength. Am. J. Primatol. 76:576-585, 2014. (c) 2013 Wiley Periodicals, Inc.
Resumo:
Polyhedral techniques for program transformation are now used in several proprietary and open source compilers. However, most of the research on polyhedral compilation has focused on imperative languages such as C, where the computation is specified in terms of statements with zero or more nested loops and other control structures around them. Graphical dataflow languages, where there is no notion of statements or a schedule specifying their relative execution order, have so far not been studied using a powerful transformation or optimization approach. The execution semantics and referential transparency of dataflow languages impose a different set of challenges. In this paper, we attempt to bridge this gap by presenting techniques that can be used to extract polyhedral representation from dataflow programs and to synthesize them from their equivalent polyhedral representation. We then describe PolyGLoT, a framework for automatic transformation of dataflow programs which we built using our techniques and other popular research tools such as Clan and Pluto. For the purpose of experimental evaluation, we used our tools to compile LabVIEW, one of the most widely used dataflow programming languages. Results show that dataflow programs transformed using our framework are able to outperform those compiled otherwise by up to a factor of seventeen, with a mean speed-up of 2.30x while running on an 8-core Intel system.
Resumo:
Identifying translations from comparable corpora is a well-known problem with several applications, e.g. dictionary creation in resource-scarce languages. Scarcity of high quality corpora, especially in Indian languages, makes this problem hard, e.g. state-of-the-art techniques achieve a mean reciprocal rank (MRR) of 0.66 for English-Italian, and a mere 0.187 for Telugu-Kannada. There exist comparable corpora in many Indian languages with other ``auxiliary'' languages. We observe that translations have many topically related words in common in the auxiliary language. To model this, we define the notion of a translingual theme, a set of topically related words from auxiliary language corpora, and present a probabilistic framework for translation induction. Extensive experiments on 35 comparable corpora using English and French as auxiliary languages show that this approach can yield dramatic improvements in performance (e.g. MRR improves by 124% to 0.419 for Telugu-Kannada). A user study on WikiTSu, a system for cross-lingual Wikipedia title suggestion that uses our approach, shows a 20% improvement in the quality of titles suggested.
Resumo:
In this work, we describe a system, which recognises open vocabulary, isolated, online handwritten Tamil words and extend it to recognize a paragraph of writing. We explain in detail each step involved in the process: segmentation, preprocessing, feature extraction, classification and bigram-based post-processing. On our database of 45,000 handwritten words obtained through tablet PC, we have obtained symbol level accuracy of 78.5% and 85.3% without and with the usage of post-processing using symbol level language models, respectively. Word level accuracies for the same are 40.1% and 59.6%. A line and word level segmentation strategy is proposed, which gives promising results of 100% line segmentation and 98.1% word segmentation accuracies on our initial trials of 40 handwritten paragraphs. The two modules have been combined to obtain a full-fledged page recognition system for online handwritten Tamil data. To the knowledge of the authors, this is the first ever attempt on recognition of open vocabulary, online handwritten paragraphs in any Indian language.