7 results for Forensic Linguistics
at Indian Institute of Science - Bangalore - India
Abstract:
The success of automatic speaker recognition in laboratory environments suggests applications in forensic science for establishing the identity of individuals on the basis of features extracted from speech. A theoretical model for such a verification scheme for continuous, normally distributed features is developed. The three cases of using a) a single feature, b) multiple independent measurements of a single feature, and c) multiple independent features are explored. The number of independent features needed for a reliable personal identification is computed based on the theoretical model and an exploratory study of some speech features.
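To illustrate why the number of independent features matters, here is a minimal sketch. It is not the model developed in the paper. It assumes a simplified two-hypothesis setting in which every feature is normally distributed with equal variance under both hypotheses, the two means are `d_single` standard deviations apart, and the features are independent. Under those assumptions the combined separation grows as the square root of the number of features. The names `equal_error_rate`, `features_needed`, and `d_single` are hypothetical.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def equal_error_rate(d_single, n_features):
    """Error rate of a midpoint threshold test on n independent features.

    Assumption (not from the paper): each feature separates the two
    hypotheses by d_single standard deviations, so n independent features
    give a combined separation of d_single * sqrt(n).
    """
    d_combined = d_single * math.sqrt(n_features)
    return normal_cdf(-d_combined / 2.0)


def features_needed(d_single, target_error, max_features=10000):
    """Smallest number of features whose error rate is at or below target_error.

    Returns None if the target is not reached within max_features.
    """
    for n in range(1, max_features + 1):
        if equal_error_rate(d_single, n) <= target_error:
            return n
    return None
```

For example, `features_needed(1.0, 0.01)` returns the smallest feature count at which the error rate falls to 1% or below when each feature alone separates the hypotheses by one standard deviation. Correlated features would contribute less than this idealized estimate suggests.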
Abstract:
The Indian Summer Monsoon (ISM) precipitation recharges groundwater aquifers in a large portion of the Indian subcontinent. Monsoonal precipitation over the Indian region brings moisture from the Arabian Sea and the Bay of Bengal (BoB). A large difference in the salinity of these two reservoirs, owing to the large amount of freshwater discharge from the continental rivers in the case of the BoB and dominating evaporation processes over the Arabian Sea region, allows us to distinguish the isotopic signatures in water originating in these two water bodies. Most bottled water manufacturers exploit the natural resources of groundwater, replenished by the monsoonal precipitation, for bottling purposes. The work presented here relates the isotopic ratios of bottled water to latitude, moisture source and seasonality in precipitation isotope ratios. We investigated the impact of these factors on the isotopic composition of bottled water. The results show a strong relationship between latitude and the isotope ratios of both precipitation (obtained from the GNIP database) and bottled water. The approach can be used to predict the latitude at which the bottled water was manufactured. The paper provides two alternative approaches to site prediction. The limitations of this approach in identifying source locations and the uncertainty in latitude estimations are discussed. Furthermore, the method provided here can also be used as an important forensic tool for exploring the source location of bottled water from other regions. Copyright (C) 2011 John Wiley & Sons, Ltd.
Abstract:
Transductive SVM (TSVM) is a well-known semi-supervised large margin learning method for binary text classification. In this paper we extend this method to multi-class and hierarchical classification problems. We point out that determining the labels of unlabeled examples with fixed classifier weights is a linear programming problem, and we devise an efficient technique for solving it. The method is applicable to general loss functions. We demonstrate the value of the new method using large margin loss on a number of multi-class and hierarchical classification datasets. For maxent loss we show empirically that our method is better than expectation regularization/constraint and posterior regularization methods, and competitive with the version of the entropy regularization method that uses label constraints.
Abstract:
We apply the objective method of Aldous to the problem of finding the minimum-cost edge cover of the complete graph with random independent and identically distributed edge costs. The limit, as the number of vertices goes to infinity, of the expected minimum cost for this problem is known via a combinatorial approach of Hessler and Wastlund. We provide a proof of this result using the machinery of the objective method and local weak convergence, which was used to prove the ζ(2) limit of the random assignment problem. A proof via the objective method is useful because it provides us with more information on the nature of the edges incident on a typical root in the minimum-cost edge cover. We further show that a belief propagation algorithm converges asymptotically to the optimal solution. This can be applied to a computational linguistics problem of semantic projection. The belief propagation algorithm yields a near-optimal solution with lower complexity than the best known algorithms designed for optimality in worst-case settings.
Abstract:
Identifying translations from comparable corpora is a well-known problem with several applications, e.g. dictionary creation in resource-scarce languages. Scarcity of high-quality corpora, especially in Indian languages, makes this problem hard: state-of-the-art techniques achieve a mean reciprocal rank (MRR) of 0.66 for English-Italian, and a mere 0.187 for Telugu-Kannada. There exist comparable corpora in many Indian languages with other "auxiliary" languages. We observe that translations have many topically related words in common in the auxiliary language. To model this, we define the notion of a translingual theme, a set of topically related words from auxiliary language corpora, and present a probabilistic framework for translation induction. Extensive experiments on 35 comparable corpora using English and French as auxiliary languages show that this approach can yield dramatic improvements in performance (e.g. MRR improves by 124% to 0.419 for Telugu-Kannada). A user study on WikiTSu, a system for cross-lingual Wikipedia title suggestion that uses our approach, shows a 20% improvement in the quality of titles suggested.
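For readers unfamiliar with the metric, the sketch below shows how mean reciprocal rank is conventionally computed and how a relative improvement such as the 124% figure follows from the two Telugu-Kannada MRR values quoted above. The function names and the placeholder identifiers (`s1`, `t1`, and so on) are hypothetical. The paper's own evaluation code is not shown in the abstract.

```python
def mean_reciprocal_rank(ranked_candidates, gold):
    """Average of 1/rank of the correct translation over all source words.

    ranked_candidates: dict mapping a source word to its candidate
        translations, best first.
    gold: dict mapping each source word to its correct translation.
    A source word whose correct translation is missing from the
    candidate list contributes 0.
    """
    total = 0.0
    for source, candidates in ranked_candidates.items():
        target = gold[source]
        if target in candidates:
            total += 1.0 / (candidates.index(target) + 1)
    return total / len(ranked_candidates)


def relative_improvement(baseline, improved):
    """Fractional gain of improved over baseline (1.24 means 124%)."""
    return (improved - baseline) / baseline


# Placeholder data: correct answers at rank 1, rank 2, and not found.
ranked = {
    "s1": ["t1", "x", "y"],
    "s2": ["x", "t2", "y"],
    "s3": ["x", "y", "z"],
}
gold = {"s1": "t1", "s2": "t2", "s3": "t3"}

mrr = mean_reciprocal_rank(ranked, gold)  # (1 + 1/2 + 0) / 3
gain = relative_improvement(0.187, 0.419)  # values quoted in the abstract
```

With the abstract's figures, `gain` is about 1.24, which matches the reported 124% improvement from 0.187 to 0.419.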