13 resultados para STATISTICAL-METHODS
em Cochin University of Science
Resumo:
The preceding discussion and review of literature show that studies on gear selectivity have received great attention, while gear efficiency studies do not seem to have received equal consideration. In temperate waters, fishing industry is well organised and relatively large and well equipped vessels and gear are used for commercial fishing and the number of species are less; whereas in tropics particularly in India, small scale fishery dominates the scene and the fishery is multispecies operated upon by nmltigear. Therefore many of the problems faced in India may not exist in developed countries. Perhaps this would be the reason for the paucity of literature on the problems in estimation of relative efficiency. Much work has been carried out in estimating relative efficiency (Pycha, 1962; Pope, 1963; Gulland, 1967; Dickson, 1971 and Collins, 1979). The main subject of interest in the present thesis is an investigation into the problems in the comparison of fishing gears. especially in using classical test procedures with special reference to the prevailing fishing practices (that is. with reference to the catch data generated by the existing system). This has been taken up with a view to standardizing an approach for comparing the efficiency of fishing gear. Besides this, the implications of the terms ‘gear efficiency‘ and ‘gear selectivity‘ have been examined and based on the commonly used selectivity model (Holt, 1963), estimation of the ratio of fishing power of two gear has been considered. An attempt to determine the size of fish for which a gear is most efficient.has also been made. The work has been presented in eight chapters
Resumo:
The characterization and grading of glioma tumors, via image derived features, for diagnosis, prognosis, and treatment response has been an active research area in medical image computing. This paper presents a novel method for automatic detection and classification of glioma from conventional T2 weighted MR images. Automatic detection of the tumor was established using newly developed method called Adaptive Gray level Algebraic set Segmentation Algorithm (AGASA).Statistical Features were extracted from the detected tumor texture using first order statistics and gray level co-occurrence matrix (GLCM) based second order statistical methods. Statistical significance of the features was determined by t-test and its corresponding p-value. A decision system was developed for the grade detection of glioma using these selected features and its p-value. The detection performance of the decision system was validated using the receiver operating characteristic (ROC) curve. The diagnosis and grading of glioma using this non-invasive method can contribute promising results in medical image computing
Resumo:
Marine fungi remain totally unexplored as a source of industrial enzyme and prospective applications. Further tannase production by a marine organism has so far not been established. The primary objective of this study included the evaluation of the potential of Aspergillus awamori isolated from sea water as part of an earlier study and available in the culture collection of the Microbial technology laboratory for tannase production through different fermentation methods, optimization of bioprocess variables by statistical methods, purification and characterization of the enzyme, genetic study, and assessment of its potential applications.
Resumo:
The present study focused on the quality of rainwater at various land use locations and its variations on interaction with various domestic rainwater harvesting systems.Sampling sites were selected based upon the land use pattern of the locations and were classified as rural, urban, industrial and sub urban. Rainwater samples were collected from the south west monsoon of May 2007 to north east monsoon of October 2008, from four sampling sites namely Kothamangalam, Emakulam, Eloor and Kalamassery, in Ernakulam district of the State of Kerala, which characterized typical rural, urban, industrial and suburban locations respectively. Rain water samples at various stages of harvesting were also collected. The samples were analyzed according to standard procedures and their physico-chemical and microbiological parameters were determined. The variations of the chemical composition of the rainwater collected were studied using statistical methods. It was observed that 17.5%, 30%, 45.8% and 12.1% of rainwater samples collected at rural, urban, industrial and suburban locations respectively had pH less than 5.6, which is considered as the pH of cloud water at equilibrium with atmospheric CO,.Nearly 46% of the rainwater samples were in acidic range in the industrial location while it was only 17% in the rural location. Multivariate statistical analysls was done using Principal Component Analysis, and the sources that inf1uence the composition of rainwater at each locations were identified .which clearly indicated that the quality of rain water is site specific and represents the atmospheric characteristics of the free fall The quality of harvested rainwater showed significant variations at different stages of harvesting due to deposition of dust from the roof catchment surface, leaching of cement constituents etc. Except the micro biological quality, the harvested rainwater satisfied the Indian Standard guide lines for drinking water. Studies conducted on the leaching of cement constituents in water concluded that tanks made with ordinary portland cement and portland pozzolana cement could be safely used for storage of rain water.
Resumo:
In the present thesis entitled” Implications of Hydrobiology and Nutrient dynamics on Trophic structure and Interactions in Cochin backwaters”, an attempt has been made to assess the influence of general hydrography, nutrients and other environmental factors on the abundance, distribution and trophic interactions in Cochin backwater system. The study was based on five seasonal sampling campaigns carried out at 15 stations spread along the Cochin backwater system. The thesis is presented in the following 5 chapters. Salient features of each chapter are summarized below: Chapter 1- General Introduction: Provides information on the topic of study, environmental factors, back ground information, the significance, review of literature, aim and scope of the present study and its objectives.Chapter 2- Materials and Methods: This chapter deals with the description of the study area and the methodology adopted for sample collection and analysis. Chapter 3- General Hydrograhy and Sediment Characteristics: Describes the environmental setting of the study area explaining seasonal variation in physicochemical parameters of water column and sediment characteristics. Data on hydrographical parameters, nitrogen fractionation, phosphorus fractionation and biochemical composition of the sediment samples were assessed to evaluate the trophic status. Chapter 4- Nutrient Dynamics on Trophic Structure and Interactions: Describes primary, secondary and tertiary production in Cochin backwater system. Primary production related to cell abundance, diversity of phytoplankton that varies seasonally, concentration of various pigments and primary productivitySecondary production refers to the seasonal abundance of zooplankton especially copepod abundance and tertiary production deals with seasonal fish landings, gut content analysis and proximate composition of dominant fish species. The spatiotemporal variation, interrelationships and trophic interactions were evaluated by statistical methods. Chapter 5- Summary: The results and findings of the study are summarized in the fifth chapter of the thesis.
Resumo:
Pollution of water with pesticides has become a threat to the man, material and environment. The pesticides released to the environment reach the water bodies through run off. Industrial wastewater from pesticide manufacturing industries contains pesticides at higher concentration and hence a major source of water pollution. Pesticides create a lot of health and environmental hazards which include diseases like cancer, liver and kidney disorders, reproductive disorders, fatal death, birth defects etc. Conventional wastewater treatment plants based on biological treatment are not efficient to remove these compounds to the desired level. Most of the pesticides are phyto-toxic i.e., they kill the microorganism responsible for the degradation and are recalcitrant in nature. Advanced oxidation process (AOP) is a class of oxidation techniques where hydroxyl radicals are employed for oxidation of pollutants. AOPs have the ability to totally mineralise the organic pollutants to CO2 and water. Different methods are employed for the generation of hydroxyl radicals in AOP systems. Acetamiprid is a neonicotinoid insecticide widely used to control sucking type insects on crops such as leafy vegetables, citrus fruits, pome fruits, grapes, cotton, ornamental flowers. It is now recommended as a substitute for organophosphorous pesticides. Since its use is increasing, its presence is increasingly found in the environment. It has high water solubility and is not easily biodegradable. It has the potential to pollute surface and ground waters. Here, the use of AOPs for the removal of acetamiprid from wastewater has been investigated. Five methods were selected for the study based on literature survey and preliminary experiments conducted. Fenton process, UV treatment, UV/ H2O2 process, photo-Fenton and photocatalysis using TiO2 were selected for study. Undoped TiO2 and TiO2 doped with Cu and Fe were prepared by sol-gel method. Characterisation of the prepared catalysts was done by X-ray diffraction, scanning electron microscope, differential thermal analysis and thermogravimetric analysis. Influence of major operating parameters on the removal of acetamiprid has been investigated. All the experiments were designed using central compoiste design (CCD) of response surface methodology (RSM). Model equations were developed for Fenton, UV/ H2O2, photo-Fenton and photocatalysis for predicting acetamiprid removal and total organic carbon (TOC) removal for different operating conditions. Quality of the models were analysed by statistical methods. Experimental validations were also done to confirm the quality of the models. Optimum conditions obtained by experiment were verified with that obtained using response optimiser. Fenton Process is the simplest and oldest AOP where hydrogen peroxide and iron are employed for the generation of hydroxyl radicals. Influence of H2O2 and Fe2+ on the acetamiprid removal and TOC removal by Fenton process were investigated and it was found that removal increases with increase in H2O2 and Fe2+ concentration. At an initial concentration of 50 mg/L acetamiprid, 200 mg/L H2O2 and 20 mg/L Fe2+ at pH 3 was found to be optimum for acetamiprid removal. For UV treatment effect of pH was studied and it was found that pH has not much effect on the removal rate. Addition of H2O2 to UV process increased the removal rate because of the hydroxyl radical formation due to photolyis of H2O2. An H2O2 concentration of 110 mg/L at pH 6 was found to be optimum for acetamiprid removal. With photo-Fenton drastic reduction in the treatment time was observed with 10 times reduction in the amount of reagents required. H2O2 concentration of 20 mg/L and Fe2+ concentration of 2 mg/L was found to be optimum at pH 3. With TiO2 photocatalysis improvement in the removal rate was noticed compared to UV treatment. Effect of Cu and Fe doping on the photocatalytic activity under UV light was studied and it was observed that Cu doping enhanced the removal rate slightly while Fe doping has decreased the removal rate. Maximum acetamiprid removal was observed for an optimum catalyst loading of 1000 mg/L and Cu concentration of 1 wt%. It was noticed that mineralisation efficiency of the processes is low compared to acetamiprid removal efficiency. This may be due to the presence of stable intermediate compounds formed during degradation Kinetic studies were conducted for all the treatment processes and it was found that all processes follow pseudo-first order kinetics. Kinetic constants were found out from the experimental data for all the processes and half lives were calculated. The rate of reaction was in the order, photo- Fenton>UV/ H2O2>Fenton> TiO2 photocatalysis>UV. Operating cost was calculated for the processes and it was found that photo-Fenton removes the acetamiprid at lowest operating cost in lesser time. A kinetic model was developed for photo-Fenton process using the elementary reaction data and mass balance equations for the species involved in the process. Variation of acetamiprid concentration with time for different H2O2 and Fe2+ concentration at pH 3 can be found out using this model. The model was validated by comparing the simulated concentration profiles with that obtained from experiments. This study established the viability of the selected AOPs for the removal of acetamiprid from wastewater. Of the studied AOPs photo- Fenton gives the highest removal efficiency with lowest operating cost within shortest time.
Resumo:
Learning Disability (LD) is a general term that describes specific kinds of learning problems. It is a neurological condition that affects a child's brain and impairs his ability to carry out one or many specific tasks. The learning disabled children are neither slow nor mentally retarded. This disorder can make it problematic for a child to learn as quickly or in the same way as some child who isn't affected by a learning disability. An affected child can have normal or above average intelligence. They may have difficulty paying attention, with reading or letter recognition, or with mathematics. It does not mean that children who have learning disabilities are less intelligent. In fact, many children who have learning disabilities are more intelligent than an average child. Learning disabilities vary from child to child. One child with LD may not have the same kind of learning problems as another child with LD. There is no cure for learning disabilities and they are life-long. However, children with LD can be high achievers and can be taught ways to get around the learning disability. In this research work, data mining using machine learning techniques are used to analyze the symptoms of LD, establish interrelationships between them and evaluate the relative importance of these symptoms. To increase the diagnostic accuracy of learning disability prediction, a knowledge based tool based on statistical machine learning or data mining techniques, with high accuracy,according to the knowledge obtained from the clinical information, is proposed. The basic idea of the developed knowledge based tool is to increase the accuracy of the learning disability assessment and reduce the time used for the same. Different statistical machine learning techniques in data mining are used in the study. Identifying the important parameters of LD prediction using the data mining techniques, identifying the hidden relationship between the symptoms of LD and estimating the relative significance of each symptoms of LD are also the parts of the objectives of this research work. The developed tool has many advantages compared to the traditional methods of using check lists in determination of learning disabilities. For improving the performance of various classifiers, we developed some preprocessing methods for the LD prediction system. A new system based on fuzzy and rough set models are also developed for LD prediction. Here also the importance of pre-processing is studied. A Graphical User Interface (GUI) is designed for developing an integrated knowledge based tool for prediction of LD as well as its degree. The designed tool stores the details of the children in the student database and retrieves their LD report as and when required. The present study undoubtedly proves the effectiveness of the tool developed based on various machine learning techniques. It also identifies the important parameters of LD and accurately predicts the learning disability in school age children. This thesis makes several major contributions in technical, general and social areas. The results are found very beneficial to the parents, teachers and the institutions. They are able to diagnose the child’s problem at an early stage and can go for the proper treatments/counseling at the correct time so as to avoid the academic and social losses.
Resumo:
Treating e-mail filtering as a binary text classification problem, researchers have applied several statistical learning algorithms to email corpora with promising results. This paper examines the performance of a Naive Bayes classifier using different approaches to feature selection and tokenization on different email corpora
Resumo:
This paper compares statistical technique of paraphrase identification to semantic technique of paraphrase identification. The statistical techniques used for comparison are word set and word-order based methods where as the semantic technique used is the WordNet similarity matrix method described by Stevenson and Fernando in [3].
Resumo:
Statistical Machine Translation (SMT) is one of the potential applications in the field of Natural Language Processing. The translation process in SMT is carried out by acquiring translation rules automatically from the parallel corpora. However, for many language pairs (e.g. Malayalam- English), they are available only in very limited quantities. Therefore, for these language pairs a huge portion of phrases encountered at run-time will be unknown. This paper focuses on methods for handling such out-of-vocabulary (OOV) words in Malayalam that cannot be translated to English using conventional phrase-based statistical machine translation systems. The OOV words in the source sentence are pre-processed to obtain the root word and its suffix. Different inflected forms of the OOV root are generated and a match is looked up for the word variants in the phrase translation table of the translation model. A Vocabulary filter is used to choose the best among the translations of these word variants by finding the unigram count. A match for the OOV suffix is also looked up in the phrase entries and the target translations are filtered out. Structuring of the filtered phrases is done and SMT translation model is extended by adding OOV with its new phrase translations. By the results of the manual evaluation done it is observed that amount of OOV words in the input has been reduced considerably
Resumo:
In Statistical Machine Translation from English to Malayalam, an unseen English sentence is translated into its equivalent Malayalam sentence using statistical models. A parallel corpus of English-Malayalam is used in the training phase. Word to word alignments has to be set among the sentence pairs of the source and target language before subjecting them for training. This paper deals with certain techniques which can be adopted for improving the alignment model of SMT. Methods to incorporate the parts of speech information into the bilingual corpus has resulted in eliminating many of the insignificant alignments. Also identifying the name entities and cognates present in the sentence pairs has proved to be advantageous while setting up the alignments. Presence of Malayalam words with predictable translations has also contributed in reducing the insignificant alignments. Moreover, reduction of the unwanted alignments has brought in better training results. Experiments conducted on a sample corpus have generated reasonably good Malayalam translations and the results are verified with F measure, BLEU and WER evaluation metrics.
Resumo:
Study on variable stars is an important topic of modern astrophysics. After the invention of powerful telescopes and high resolving powered CCD’s, the variable star data is accumulating in the order of peta-bytes. The huge amount of data need lot of automated methods as well as human experts. This thesis is devoted to the data analysis on variable star’s astronomical time series data and hence belong to the inter-disciplinary topic, Astrostatistics. For an observer on earth, stars that have a change in apparent brightness over time are called variable stars. The variation in brightness may be regular (periodic), quasi periodic (semi-periodic) or irregular manner (aperiodic) and are caused by various reasons. In some cases, the variation is due to some internal thermo-nuclear processes, which are generally known as intrinsic vari- ables and in some other cases, it is due to some external processes, like eclipse or rotation, which are known as extrinsic variables. Intrinsic variables can be further grouped into pulsating variables, eruptive variables and flare stars. Extrinsic variables are grouped into eclipsing binary stars and chromospheri- cal stars. Pulsating variables can again classified into Cepheid, RR Lyrae, RV Tauri, Delta Scuti, Mira etc. The eruptive or cataclysmic variables are novae, supernovae, etc., which rarely occurs and are not periodic phenomena. Most of the other variations are periodic in nature. Variable stars can be observed through many ways such as photometry, spectrophotometry and spectroscopy. The sequence of photometric observa- xiv tions on variable stars produces time series data, which contains time, magni- tude and error. The plot between variable star’s apparent magnitude and time are known as light curve. If the time series data is folded on a period, the plot between apparent magnitude and phase is known as phased light curve. The unique shape of phased light curve is a characteristic of each type of variable star. One way to identify the type of variable star and to classify them is by visually looking at the phased light curve by an expert. For last several years, automated algorithms are used to classify a group of variable stars, with the help of computers. Research on variable stars can be divided into different stages like observa- tion, data reduction, data analysis, modeling and classification. The modeling on variable stars helps to determine the short-term and long-term behaviour and to construct theoretical models (for eg:- Wilson-Devinney model for eclips- ing binaries) and to derive stellar properties like mass, radius, luminosity, tem- perature, internal and external structure, chemical composition and evolution. The classification requires the determination of the basic parameters like pe- riod, amplitude and phase and also some other derived parameters. Out of these, period is the most important parameter since the wrong periods can lead to sparse light curves and misleading information. Time series analysis is a method of applying mathematical and statistical tests to data, to quantify the variation, understand the nature of time-varying phenomena, to gain physical understanding of the system and to predict future behavior of the system. Astronomical time series usually suffer from unevenly spaced time instants, varying error conditions and possibility of big gaps. This is due to daily varying daylight and the weather conditions for ground based observations and observations from space may suffer from the impact of cosmic ray particles. Many large scale astronomical surveys such as MACHO, OGLE, EROS, xv ROTSE, PLANET, Hipparcos, MISAO, NSVS, ASAS, Pan-STARRS, Ke- pler,ESA, Gaia, LSST, CRTS provide variable star’s time series data, even though their primary intention is not variable star observation. Center for Astrostatistics, Pennsylvania State University is established to help the astro- nomical community with the aid of statistical tools for harvesting and analysing archival data. Most of these surveys releases the data to the public for further analysis. There exist many period search algorithms through astronomical time se- ries analysis, which can be classified into parametric (assume some underlying distribution for data) and non-parametric (do not assume any statistical model like Gaussian etc.,) methods. Many of the parametric methods are based on variations of discrete Fourier transforms like Generalised Lomb-Scargle peri- odogram (GLSP) by Zechmeister(2009), Significant Spectrum (SigSpec) by Reegen(2007) etc. Non-parametric methods include Phase Dispersion Minimi- sation (PDM) by Stellingwerf(1978) and Cubic spline method by Akerlof(1994) etc. Even though most of the methods can be brought under automation, any of the method stated above could not fully recover the true periods. The wrong detection of period can be due to several reasons such as power leakage to other frequencies which is due to finite total interval, finite sampling interval and finite amount of data. Another problem is aliasing, which is due to the influence of regular sampling. Also spurious periods appear due to long gaps and power flow to harmonic frequencies is an inherent problem of Fourier methods. Hence obtaining the exact period of variable star from it’s time series data is still a difficult problem, in case of huge databases, when subjected to automation. As Matthew Templeton, AAVSO, states “Variable star data analysis is not always straightforward; large-scale, automated analysis design is non-trivial”. Derekas et al. 2007, Deb et.al. 2010 states “The processing of xvi huge amount of data in these databases is quite challenging, even when looking at seemingly small issues such as period determination and classification”. It will be beneficial for the variable star astronomical community, if basic parameters, such as period, amplitude and phase are obtained more accurately, when huge time series databases are subjected to automation. In the present thesis work, the theories of four popular period search methods are studied, the strength and weakness of these methods are evaluated by applying it on two survey databases and finally a modified form of cubic spline method is intro- duced to confirm the exact period of variable star. For the classification of new variable stars discovered and entering them in the “General Catalogue of Vari- able Stars” or other databases like “Variable Star Index“, the characteristics of the variability has to be quantified in term of variable star parameters.