874 results for language acquisition
Abstract:
In this dissertation I study language complexity from a typological perspective. Since the structuralist era, it has been assumed that local complexity differences in languages are balanced out in cross-linguistic comparisons and that complexity is not affected by the geopolitical or sociocultural aspects of the speech community. However, these assumptions have seldom been studied systematically from a typological point of view. My objective is to define complexity so that it is possible to compare it across languages and to approach its variation with the methods of quantitative typology. My main empirical research questions are: i) does language complexity vary in any systematic way in local domains, and ii) can language complexity be affected by the geographical or social environment? These questions are studied in three articles, whose findings are summarized in the introduction to the dissertation. In order to enable cross-language comparison, I measure complexity as the description length of the regularities in an entity; I separate it from difficulty, focus on local instead of global complexity, and break it up into different types. This approach helps avoid the problems that plagued earlier metrics of language complexity. My approach to grammar is functional-typological in nature, and the theoretical framework is basic linguistic theory. I delimit the empirical research functionally to the marking of core arguments (the basic participants in the sentence). I assess the distributions of complexity in this domain with multifactorial statistical methods and use different sampling strategies, implementing, for instance, the Greenbergian view of universals as diachronic laws of type preference. My data come from large and balanced samples (up to approximately 850 languages), drawn mainly from reference grammars. 
The results suggest that various significant trends occur in the marking of core arguments in regard to complexity and that complexity in this domain correlates with population size. These results provide evidence that linguistic patterns interact among themselves in terms of complexity, that language structure adapts to the social environment, and that there may be cognitive mechanisms that limit complexity locally. My approach to complexity and language universals can therefore be successfully applied to empirical data and may serve as a model for further research in these areas.
Abstract:
In the field of second language (L2) acquisition, the term 'foreign accent' is often used to refer to speech characteristics that differ from the pronunciation of native speakers. Foreign accent may affect the intelligibility and perceived comprehensibility of speech, and it is also sometimes associated with negative attitudes. The degree of L2 learners' foreign accent and the speech characteristics that account for it have previously been studied through speech perception experiments and acoustic measurements. Perception experiments have shown that native listeners are easily able to identify foreign accent in speech. However, to date, no studies have been done on the assessment of foreign accent in the speech of non-native speakers of Finnish. The aim of this study is to examine how native speakers of Finnish rate the degree of foreign accentedness in the speech of Russian L2 learners of Finnish. Furthermore, phonetic analysis is used to study the characteristics of speech that affect the perceived strength of foreign accent. Altogether 96 native speakers of Finnish listened to excerpts of read-aloud and spontaneous Finnish speech from ten Russian and six Finnish female speakers. The Russian speakers were intermediate and advanced learners of Finnish and had all immigrated to Finland as adults. Among the listeners was a group of teachers of Finnish as an L2, and it was presumed that these teachers had been exposed to foreign accent in Finnish and were used to hearing it. The temporal aspects and segmental properties of speech were phonetically analysed in the speech of the Russian speakers in order to measure their effect on the perceived degree of accent. Although wide differences were observed in the use of the rating scale among the listeners, they were still quite unanimous on which speakers had the strongest foreign accent and which had the mildest.
The listeners' background factors had little effect on their ratings, and the ratings of the teachers of Finnish as an L2 did not differ from those of the other listeners. However, a clear difference was noted in the ratings of the two types of stimuli used in the perception experiment: the read-aloud speech was rated as more strongly accented than the spontaneous speech. It is important to note that the assessment of foreign accent is affected by many factors and their complex interactions in the experimental setting. Further, the study found that both the temporal aspects of speech, often associated with fluency, and the number of single deviant phonetic segments contributed to the perceived degree of accentedness in the speech of the native Russian speakers.
Abstract:
The aim of this study was to examine the applicability of the Phonological Mean Length of Utterance (pMLU) method to the data of children acquiring Finnish, for both typically developing children and children with a Specific Language Impairment (SLI). Study I examined typically developing children at the end of the one-word stage (N=17, mean age 1;8), and Study II analysed children's (N=5) productions in a follow-up study with four assessment points (ages 2;0, 2;6, 3;0, 3;6). Study III was carried out in the form of a review article that examined recent research on the phonological development of children acquiring Finnish and compared the results with general trends and cross-linguistic findings in phonological development. Study IV included children with SLI (N=4, mean age 4;10) and age-matched peers. The analyses in Studies I, II and IV were made using the quantitative pMLU method. In the pMLU method, pMLU values are counted for both the words that the children targeted (so-called target words) and the words produced by the children. When the child's average pMLU value is divided by the average target word pMLU value, it is possible to examine that child's accuracy in producing the words with the Whole-Word Proximity (PWP) value. In addition, the number of entirely correctly produced words is counted to obtain the Whole-Word Correctness (PWC) value. Qualitative analyses were carried out in order to examine how the children's phoneme inventories and deficiencies in phonotactics would explain the observed pMLU, PWP and PWC values. The results showed that the pMLU values for children acquiring Finnish were relatively high already at the end of the one-word stage (Study I). The values were found to reflect the characteristics of the ambient language.
Typological features that lead to cross-linguistic differences in pMLU values were also observed in the review article (Study III), which noted that in the course of phonological acquisition there are a large number of language-specific phenomena and processes. Study II indicated that overall the children's phonological development during the follow-up period was reflected in the pMLU, PWP and PWC values, although the method showed limitations in detecting qualitative differences between the children. Correct vowels were not scored in the pMLU counts, which led to some misleadingly high pMLU and PWP results: vowel errors were only reflected in the PWC values. Typically developing children in Study II reached the highest possible pMLU results already around age 3;6. At the same time, the differences between the children with SLI and age-matched peers in the pMLU values were very prominent (Study IV). The values for the children with SLI were similar to the ones reported for two-year-old children. Qualitative analyses revealed that the phonologies of the children with SLI largely resembled the ones of younger, typically developing children. However, unusual errors were also witnessed (e.g., vowel errors, omissions of word-initial stops, consonants added to the initial position in words beginning with a vowel). This dissertation provides an application of a new tool for quantitative phonological assessment and analysis in children acquiring Finnish. The preliminary results suggest that, with some modifications, the pMLU method can be used to assess children's phonological development and that it has some advantages compared to the earlier, segment-oriented approaches. Qualitative analyses complemented the pMLU's observations on the children's phonologies. More research is needed in order to verify the levels of the pMLU, PWP and PWC values in children acquiring Finnish.
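The pMLU, PWP and PWC measures described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the scoring rule (one point per produced segment plus one extra point per correct consonant) is the commonly cited pMLU rule and is not spelled out in the abstract; letters stand in for phonemes; and correct consonants are matched as an unordered multiset rather than by alignment. The function names and example words are hypothetical.

```python
from collections import Counter

# Assumption: Finnish vowel letters stand in for vowel phonemes.
VOWELS = set("aeiouyäö")


def pmlu_points(produced: str, target: str) -> int:
    """One point per produced segment, plus one per correct consonant.

    Assumption: a consonant counts as correct if it also occurs in the
    target word (multiset match, no alignment). Vowels earn no extra point.
    """
    produced_cons = Counter(ch for ch in produced if ch not in VOWELS)
    target_cons = Counter(ch for ch in target if ch not in VOWELS)
    correct_cons = sum((produced_cons & target_cons).values())
    return len(produced) + correct_cons


def summarize(pairs):
    """Return (child pMLU, target pMLU, PWP, PWC) for (produced, target) pairs."""
    n = len(pairs)
    child_pmlu = sum(pmlu_points(p, t) for p, t in pairs) / n
    target_pmlu = sum(pmlu_points(t, t) for _, t in pairs) / n
    pwp = child_pmlu / target_pmlu            # ratio of the two averages
    pwc = sum(p == t for p, t in pairs) / n   # share of fully correct words
    return child_pmlu, target_pmlu, pwp, pwc


if __name__ == "__main__":
    # Hypothetical productions: "kia" for "kissa", "auto" for "auto".
    print(summarize([("kia", "kissa"), ("auto", "auto")]))
```

Under this sketch, a production with only a vowel error (for example "kaira" for "koira") earns the same points as its target, so it lowers PWC but not pMLU or PWP, which matches the limitation the abstract reports.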
Abstract:
Researchers and developers in academia and industry would benefit from a facility that enables them to easily locate, license and use the kind of empirical data they need for testing and refining their hypotheses, and to deposit and disseminate their data, e.g. to support replication and validation of reported scientific experiments. To answer these needs initially in Finland, there is an ongoing project at the University of Helsinki and its collaborators to create a user-friendly web service for researchers and developers in Finland and other countries. In our talk, we describe ongoing work to create a palette of extensive but easily available Finnish language resources and technologies for the research community, including lexical resources, wordnets, morphologically tagged corpora, dependency syntactic treebanks and parsebanks, open-source finite-state toolkits and libraries, and language models to support text analysis and processing at the customer site. The first publicly available results are also presented.
Abstract:
This paper introduces the META-NORD project, which develops the Nordic and Baltic part of the European open language resource infrastructure. META-NORD works on assembling, linking across languages, and making widely available the basic language resources used by developers, professionals and researchers to build specific products and applications. The goals of the project, its overall approach and its specific focus lines on wordnets, terminology resources and treebanks are described. Moreover, results achieved in the first five months of the project, i.e. language whitepapers, metadata specification and IPR, are presented.
Abstract:
In this paper we present simple methods for construction and evaluation of finite-state spell-checking tools using an existing finite-state lexical automaton, freely available finite-state tools and Internet corpora acquired from projects such as Wikipedia. As an example, we use a freely available open-source implementation of Finnish morphology, made with traditional finite-state morphology tools, and demonstrate rapid building of Northern Sámi and English spell checkers from tools and resources available from the Internet.
Abstract:
Indian logic has a long history. It roughly covers the domains of two of the six schools (darsanas) of Indian philosophy, namely, Nyaya and Vaisesika. The generally accepted definition of Indian logic over the ages is the science which ascertains valid knowledge either by means of six senses or by means of the five members of the syllogism. In other words, perception and inference constitute the subject matter of logic. The science of logic evolved in India through three ages: the ancient, the medieval and the modern, spanning almost thirty centuries. Over the past three decades, advances in Computer Science, in particular in Artificial Intelligence, have led researchers in these areas to take an interest in the basic problems of language, logic and cognition. In the 1980s, Artificial Intelligence evolved into knowledge-based and intelligent system design, and the knowledge base and inference engine have become standard subsystems of an intelligent system. One of the important issues in the design of such systems is knowledge acquisition from humans who are experts in a branch of learning (such as medicine or law) and transferring that knowledge to a computing system. The second important issue in such systems is the validation of the knowledge base of the system, i.e. ensuring that the knowledge is complete and consistent. It is in this context that a comparative study of Indian logic with recent theories of logic, language and knowledge engineering will help the computer scientist understand the deeper implications of the terms and concepts he is currently using and attempting to develop.
Abstract:
It is possible to sample signals at a sub-Nyquist rate and still be able to reconstruct them with reasonable accuracy, provided they exhibit local Fourier sparsity. Underdetermined systems of equations, which arise out of undersampling, have been solved to yield sparse solutions using compressed sensing algorithms. In this paper, we propose a framework for real-time sampling of multiple analog channels with a single A/D converter, achieving a higher effective sampling rate. Signal reconstruction from noisy measurements on two different synthetic signals is presented. A scheme for implementing the algorithm in hardware is also suggested.
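For orientation, the underdetermined system that this abstract refers to is commonly written in the standard compressed-sensing form below. This is the textbook formulation and not necessarily the exact one used in the paper; the symbols (measurements $y$, sampling matrix $\Phi$, Fourier basis $\Psi$, sparse coefficients $s$, noise $e$, tolerance $\varepsilon$) are notation introduced here for illustration.

```latex
\begin{align}
  y &= \Phi \Psi s + e, \qquad \Phi \in \mathbb{R}^{m \times n},\ m < n, \\
  \hat{s} &= \arg\min_{s} \|s\|_1
  \quad \text{subject to} \quad \|y - \Phi \Psi s\|_2 \le \varepsilon .
\end{align}
```

Because $m < n$, the first equation alone has infinitely many solutions; the $\ell_1$ objective in the second line selects a sparse one, and the signal estimate is then $\hat{x} = \Psi \hat{s}$.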
Abstract:
Tower platforms, with instrumentation at six levels above the surface to a height of 30 m, were used to record various atmospheric parameters in the surface layer. Sensors for measuring both mean and fluctuating quantities were used, with the majority of them indigenously built. Soil temperature sensors up to a depth of 30 cm from the surface were among the sensors connected to the mean data logger. A PC-based data acquisition system built at the Centre for Atmospheric Sciences, IISc, was used to acquire the data from fast response sensors. This paper reports the various components of a typical MONTBLEX tower observatory and describes the actual experiments carried out in the surface layer at four sites over the monsoon trough region as a part of the MONTBLEX programme. It also describes and discusses several checks made on randomly selected tower data-sets acquired during the experiment. Checks made include visual inspection of time traces from various sensors, comparative plots of sensors measuring the same variable, wind and temperature profile plots, calculation of roughness lengths, statistical and stability parameters, diurnal variation of stability parameters, and plots of probability density and energy spectrum for the different sensors. Results from these checks are found to be very encouraging and reveal the potential for further detailed analysis to understand more about surface layer characteristics.
Abstract:
Analytical studies are carried out to minimize acquisition time in phase-lock loop (PLL) applications using aiding functions. A second-order aided PLL is realized with the help of the quasi-stationary approach to verify the acquisition behavior in the absence of noise. Acquisition time is measured both from the study of the LPF output transient and by employing a lock detecting and indicating circuit to crosscheck experimental and analytical results. A closed-form solution is obtained for the evaluation of the acquisition time using different aiding functions. The aiding signal is simple and economical and can be used with state-of-the-art hardware.
Abstract:
We present an improved language modeling technique for a Lempel-Ziv-Welch (LZW) based language identification (LID) scheme. The previous approach to LID using the LZW algorithm prepares the language pattern table using the LZW algorithm. Because of the sequential nature of the LZW algorithm, several language-specific patterns were missing from the pattern table. To overcome this, we build a universal pattern table, which contains all patterns of different lengths. For each language, its corresponding language-specific pattern table is constructed by retaining the patterns of the universal table whose frequency of appearance in the training data is above the threshold. This approach reduces the classification score (compression ratio [LZW-CR] or weighted discriminant score [LZW-WDS]) for non-native languages and increases the LID performance considerably.
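The table construction described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes "all patterns of different length" means every contiguous token n-gram up to a chosen maximum length, that the universal table is the union over all languages, and that "above the threshold" is a strict comparison on raw counts. The names `pattern_counts`, `language_tables`, `max_len` and `threshold` are hypothetical.

```python
from collections import Counter


def pattern_counts(tokens, max_len):
    """Count every contiguous token pattern of length 1..max_len."""
    counts = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def language_tables(training, max_len, threshold):
    """Build one pattern table per language from a shared universal table.

    training maps a language name to its training token sequence.
    """
    counts = {lang: pattern_counts(toks, max_len) for lang, toks in training.items()}
    # Universal table: every pattern seen in any language (assumption).
    universal = set().union(*counts.values())
    # Keep, per language, the universal patterns frequent in that language.
    return {
        lang: {p for p in universal if counts[lang][p] > threshold}
        for lang in training
    }


if __name__ == "__main__":
    # Toy token streams standing in for tokenized speech.
    demo = {"x": list("abab"), "y": list("cdcd")}
    print(language_tables(demo, max_len=2, threshold=1))
```

Unlike a sequential LZW pass, this enumeration is independent of the order in which patterns are first encountered, which is the gap the abstract says the universal table is meant to close.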
Abstract:
We present a new approach to spoken language modeling for language identification (LID) using the Lempel-Ziv-Welch (LZW) algorithm. The LZW technique is applicable to any kind of tokenization of the speech signal. Because the LZW algorithm efficiently extracts variable-length symbol strings from the training data, the LZW codebook captures the essentials of a language effectively. We develop two new deterministic measures for LID based on the LZW algorithm, namely (i) the compression ratio score (LZW-CR) and (ii) the weighted discriminant score (LZW-WDS). To assess these measures, we consider error-free tokenization of speech as well as artificially induced noise in the tokenization. It is shown that for a six-language LID task on the OGI-TS database with clean tokenization, the new model (LZW-WDS) performs slightly better than the conventional bigram model. For noisy tokenization, which is the more realistic case, LZW-WDS significantly outperforms the bigram technique.
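A rough sketch of how an LZW codebook can be turned into a compression-ratio score is given below. It is an illustration under assumptions, not the paper's definition of LZW-CR: the codebook is built with the standard LZW dictionary update, the test sequence is parsed greedily against a frozen codebook, and the score is taken as tokens per emitted code, so a higher value suggests a better match. The function names are hypothetical.

```python
def lzw_codebook(tokens):
    """Build an LZW phrase set (tuples of tokens) from a training sequence."""
    book = {(t,) for t in tokens}      # start with all single tokens
    current = ()
    for t in tokens:
        candidate = current + (t,)
        if candidate in book:
            current = candidate        # keep extending a known phrase
        else:
            book.add(candidate)        # learn the new phrase
            current = (t,)
    return book


def compression_ratio(tokens, book):
    """Tokens per code when parsing with a frozen codebook (assumed score)."""
    n = len(tokens)
    codes = 0
    i = 0
    while i < n:
        length = 1
        # LZW codebooks are prefix-closed, so extend one token at a time.
        while i + length < n and tuple(tokens[i:i + length + 1]) in book:
            length += 1
        codes += 1
        i += length
    return n / codes if codes else 0.0


if __name__ == "__main__":
    # Toy token streams standing in for tokenized speech of two languages.
    books = {
        "A": lzw_codebook(list("abababababab")),
        "B": lzw_codebook(list("ccdccdccdccd")),
    }
    test = list("ababab")
    scores = {lang: compression_ratio(test, book) for lang, book in books.items()}
    print(max(scores, key=scores.get), scores)
```

In this toy setup the test sequence compresses well only against the codebook trained on similar material, which is the intuition behind using compression as a language score.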