21 resultados para Statistical genomics
em Cochin University of Science
Resumo:
Department of Statistics, Cochin University of Science and Technology
Resumo:
The standard models for statistical signal extraction assume that the signal and noise are generated by linear Gaussian processes. The optimum filter weights for those models are derived using the method of minimum mean square error. In the present work we study the properties of signal extraction models under the assumption that signal/noise are generated by symmetric stable processes. The optimum filter is obtained by the method of minimum dispersion. The performance of the new filter is compared with their Gaussian counterparts by simulation.
Resumo:
Learning Disability (LD) is a general term that describes specific kinds of learning problems. It is a neurological condition that affects a child's brain and impairs his ability to carry out one or many specific tasks. The learning disabled children are neither slow nor mentally retarded. This disorder can make it problematic for a child to learn as quickly or in the same way as some child who isn't affected by a learning disability. An affected child can have normal or above average intelligence. They may have difficulty paying attention, with reading or letter recognition, or with mathematics. It does not mean that children who have learning disabilities are less intelligent. In fact, many children who have learning disabilities are more intelligent than an average child. Learning disabilities vary from child to child. One child with LD may not have the same kind of learning problems as another child with LD. There is no cure for learning disabilities and they are life-long. However, children with LD can be high achievers and can be taught ways to get around the learning disability. In this research work, data mining using machine learning techniques are used to analyze the symptoms of LD, establish interrelationships between them and evaluate the relative importance of these symptoms. To increase the diagnostic accuracy of learning disability prediction, a knowledge based tool based on statistical machine learning or data mining techniques, with high accuracy,according to the knowledge obtained from the clinical information, is proposed. The basic idea of the developed knowledge based tool is to increase the accuracy of the learning disability assessment and reduce the time used for the same. Different statistical machine learning techniques in data mining are used in the study. Identifying the important parameters of LD prediction using the data mining techniques, identifying the hidden relationship between the symptoms of LD and estimating the relative significance of each symptoms of LD are also the parts of the objectives of this research work. The developed tool has many advantages compared to the traditional methods of using check lists in determination of learning disabilities. For improving the performance of various classifiers, we developed some preprocessing methods for the LD prediction system. A new system based on fuzzy and rough set models are also developed for LD prediction. Here also the importance of pre-processing is studied. A Graphical User Interface (GUI) is designed for developing an integrated knowledge based tool for prediction of LD as well as its degree. The designed tool stores the details of the children in the student database and retrieves their LD report as and when required. The present study undoubtedly proves the effectiveness of the tool developed based on various machine learning techniques. It also identifies the important parameters of LD and accurately predicts the learning disability in school age children. This thesis makes several major contributions in technical, general and social areas. The results are found very beneficial to the parents, teachers and the institutions. They are able to diagnose the child’s problem at an early stage and can go for the proper treatments/counseling at the correct time so as to avoid the academic and social losses.
Resumo:
The overall focus of the thesis involves the International trade and cochin port a historical and statistical analysis 1881-1980.Analysing the trend of exports and imports through cochin port during the course of the last hundred years .This analysis has brought to light some very pertinent facts which , in our opinion,deserve serious consideration of the policy makers,the partise involved in trade and those who are interested in the development of the cochin port.Our study is restricted to twelve commodities -ten commodities of exports and two commodities of imports.The study reveals that the commodities that were exported from cochin are subjected to fluctuations -some mild and others wild. The projections only indicate the potential and unless we are very cautious the chance will be taken away by our competitors .With reference to the development of the port in particular and the states economy in general we would like to make a suggestion .This suggestion relates to declaring cochin as a free port .This will go a long way in the develppment of the port and the state's economy.The sooner it is done the better for the port and the state.
Resumo:
The preceding discussion and review of literature show that studies on gear selectivity have received great attention, while gear efficiency studies do not seem to have received equal consideration. In temperate waters, fishing industry is well organised and relatively large and well equipped vessels and gear are used for commercial fishing and the number of species are less; whereas in tropics particularly in India, small scale fishery dominates the scene and the fishery is multispecies operated upon by nmltigear. Therefore many of the problems faced in India may not exist in developed countries. Perhaps this would be the reason for the paucity of literature on the problems in estimation of relative efficiency. Much work has been carried out in estimating relative efficiency (Pycha, 1962; Pope, 1963; Gulland, 1967; Dickson, 1971 and Collins, 1979). The main subject of interest in the present thesis is an investigation into the problems in the comparison of fishing gears. especially in using classical test procedures with special reference to the prevailing fishing practices (that is. with reference to the catch data generated by the existing system). This has been taken up with a view to standardizing an approach for comparing the efficiency of fishing gear. Besides this, the implications of the terms ‘gear efficiency‘ and ‘gear selectivity‘ have been examined and based on the commonly used selectivity model (Holt, 1963), estimation of the ratio of fishing power of two gear has been considered. An attempt to determine the size of fish for which a gear is most efficient.has also been made. The work has been presented in eight chapters
Resumo:
Some investigations on the spectral and statistical characteristics of deep water waves are available for Indian waters. But practically no systematic investigation on the shallow water wave spectral and probabilistic characteristics is made for any part of the Indian coast except for a few restricted studies. Hence a comprehensive study of the shallow water wave climate and their spectral and statistical characteristics for a location (Alleppey) along the southwest coast of India is undertaken based on recorded data. The results of the investigation are presented in this thesis.The thesis comprises of seven chapters
Resumo:
For years, choosing the right career by monitoring the trends and scope for different career paths have been a requirement for all youngsters all over the world. In this paper we provide a scientific, data mining based method for job absorption rate prediction and predicting the waiting time needed for 100% placement, for different engineering courses in India. This will help the students in India in a great deal in deciding the right discipline for them for a bright future. Information about passed out students are obtained from the NTMIS ( National technical manpower information system ) NODAL center in Kochi, India residing in Cochin University of science and technology
Resumo:
This paper compares statistical technique of paraphrase identification to semantic technique of paraphrase identification. The statistical techniques used for comparison are word set and word-order based methods where as the semantic technique used is the WordNet similarity matrix method described by Stevenson and Fernando in [3].
Resumo:
In this paper we describe the methodology and the structural design of a system that translates English into Malayalam using statistical models. A monolingual Malayalam corpus and a bilingual English/Malayalam corpus are the main resource in building this Statistical Machine Translator. Training strategy adopted has been enhanced by PoS tagging which helps to get rid of the insignificant alignments. Moreover, incorporating units like suffix separator and the stop word eliminator has proven to be effective in bringing about better training results. In the decoder, order conversion rules are applied to reduce the structural difference between the language pair. The quality of statistical outcome of the decoder is further improved by applying mending rules. Experiments conducted on a sample corpus have generated reasonably good Malayalam translations and the results are verified with F measure, BLEU and WER evaluation metrics
Resumo:
This paper underlines a methodology for translating text from English into the Dravidian language, Malayalam using statistical models. By using a monolingual Malayalam corpus and a bilingual English/Malayalam corpus in the training phase, the machine automatically generates Malayalam translations of English sentences. This paper also discusses a technique to improve the alignment model by incorporating the parts of speech information into the bilingual corpus. Removing the insignificant alignments from the sentence pairs by this approach has ensured better training results. Pre-processing techniques like suffix separation from the Malayalam corpus and stop word elimination from the bilingual corpus also proved to be effective in training. Various handcrafted rules designed for the suffix separation process which can be used as a guideline in implementing suffix separation in Malayalam language are also presented in this paper. The structural difference between the English Malayalam pair is resolved in the decoder by applying the order conversion rules. Experiments conducted on a sample corpus have generated reasonably good Malayalam translations and the results are verified with F measure, BLEU and WER evaluation metrics
Resumo:
A methodology for translating text from English into the Dravidian language, Malayalam using statistical models is discussed in this paper. The translator utilizes a monolingual Malayalam corpus and a bilingual English/Malayalam corpus in the training phase and generates automatically the Malayalam translation of an unseen English sentence. Various techniques to improve the alignment model by incorporating the morphological inputs into the bilingual corpus are discussed. Removing the insignificant alignments from the sentence pairs by this approach has ensured better training results. Pre-processing techniques like suffix separation from the Malayalam corpus and stop word elimination from the bilingual corpus also proved to be effective in producing better alignments. Difficulties in translation process that arise due to the structural difference between the English Malayalam pair is resolved in the decoding phase by applying the order conversion rules. The handcrafted rules designed for the suffix separation process which can be used as a guideline in implementing suffix separation in Malayalam language are also presented in this paper. Experiments conducted on a sample corpus have generated reasonably good Malayalam translations and the results are verified with F measure, BLEU and WER evaluation metrics