4 results for CATEGORICAL-DATA ANALYSIS

at Cochin University of Science


Relevance:

100.00%

Publisher:

Abstract:

Reliability analysis is a well-established branch of statistics that deals with the statistical study of different aspects of the lifetimes of a system of components. As pointed out earlier, the major part of the theory and applications of reliability analysis has been discussed in terms of measures based on the distribution function. In the beginning chapters of the thesis, we describe some attractive features of quantile functions and the relevance of their use in reliability analysis. Motivated by the works of Parzen (1979), Freimer et al. (1988) and Gilchrist (2000), who indicated the scope of quantile functions in reliability analysis, and as a follow-up to the systematic study in this connection by Nair and Sankaran (2009), the present work extends their ideas to develop the necessary theoretical framework for lifetime data analysis. Chapter 1 gives the relevance and scope of the study and a brief outline of the work carried out. Chapter 2 presents various concepts and brief reviews of them, which are useful for the discussions in the subsequent chapters. The introduction of Chapter 4 points out the role of ageing concepts in reliability analysis and in identifying life distributions. Chapter 6 studies the first two L-moments of residual life and their relevance in various applications of reliability analysis. We show that the first L-moment of residual life is equivalent to the vitality function, which has been widely discussed in the literature. Chapter 7 defines the percentile residual life in reversed time (RPRL) and derives its relationship with the reversed hazard rate (RHR). We discuss the characterization problem of RPRL and demonstrate with an example that the RPRL for a given percentile does not determine the distribution uniquely.
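
The equivalence between the first L-moment of residual life and the vitality function can be checked numerically in quantile form. The sketch below is illustrative only and is not taken from the thesis: it assumes an exponential lifetime with quantile function Q(u) = -ln(1 - u) / rate, writes the first L-moment of residual life as (1 / (1 - u)) times the integral of Q(p) over p from u to 1, and compares it with the known exponential vitality function Q(u) + 1 / rate. The function names and the midpoint integration rule are assumptions made for this example.

```python
import math


def exp_quantile(u, rate=1.0):
    """Quantile function of an exponential lifetime (assumed example model)."""
    return -math.log(1.0 - u) / rate


def first_residual_l_moment(quantile, u, n=100_000):
    """Approximate (1 / (1 - u)) * integral of Q(p) dp over [u, 1].

    This is E[X | X > Q(u)] written in terms of the quantile function.
    A midpoint rule is used so that Q is never evaluated at p = 1.
    """
    h = (1.0 - u) / n
    total = sum(quantile(u + (i + 0.5) * h) for i in range(n))
    return total * h / (1.0 - u)


def exp_vitality(u, rate=1.0):
    """Closed-form vitality function of the exponential: t + 1 / rate at t = Q(u)."""
    return exp_quantile(u, rate) + 1.0 / rate


if __name__ == "__main__":
    rate = 2.0
    for u in (0.0, 0.25, 0.5, 0.9):
        numeric = first_residual_l_moment(lambda p: exp_quantile(p, rate), u)
        print(u, round(numeric, 4), round(exp_vitality(u, rate), 4))
```

For other lifetime models, only the quantile function passed to `first_residual_l_moment` changes; the closed-form comparison is specific to the exponential case.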

Relevance:

100.00%

Publisher:

Abstract:

Atmospheric surface boundary layer parameters varied anomalously in response to the annular solar eclipse of 15th January 2010 over Cochin. It was the longest annular solar eclipse to occur over South India, and it had high intensity. Because it occurred during the noon hours, it is considered especially significant for its effects on all regions of the atmosphere, including the ionosphere. Since insolation is the main driving factor behind the anomalous changes in the surface layer, the coincidence of this eclipse with noon makes it important for understanding the dynamics of the atmosphere during the eclipse period. The sonic anemometer provides zonal, meridional and vertical wind as well as air temperature at a temporal resolution of 1 s. Different surface boundary layer parameters and turbulent fluxes were computed by applying the eddy correlation technique to the high-resolution station data. The surface boundary layer parameters computed from the sonic anemometer data during the period are momentum flux, sensible heat flux, turbulent kinetic energy, friction velocity (u*), variance of temperature, and variances of the u, v and w wind components. To compare the results, a control run was made using the data of the previous day and the next day. Over the period of the annular solar eclipse, all of the above surface boundary layer parameters varied anomalously when compared with the control run. The momentum flux was 0.1 N m-2 during the eclipse instead of the mean value of 0.2 N m-2. The sensible heat flux decreased anomalously to 50 W m-2 from the mean value of 200 W m-2 at the time of the eclipse. The turbulent kinetic energy decreased to 0.2 m2 s-2 from the mean value of 1 m2 s-2. The friction velocity decreased to 0.05 m s-1 from the mean value of 0.2 m s-1. The present study aimed at understanding the dynamics of the surface layer in response to an annular solar eclipse that occurred during the noon hours over a tropical coastal station.
Key words: annular solar eclipse, surface boundary layer, sonic anemometer
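
The eddy correlation quantities listed in the abstract can be sketched from their standard textbook definitions. The code below is a minimal illustration, not the authors' processing chain: it assumes a constant air density of 1.2 kg m-3 and a specific heat of 1005 J kg-1 K-1, uses simple block averaging with no detrending or coordinate rotation, and all function and variable names are hypothetical.

```python
from statistics import fmean

RHO_AIR = 1.2    # air density in kg m-3 (assumed constant, not from the source)
CP_AIR = 1005.0  # specific heat of air in J kg-1 K-1 (assumed, not from the source)


def covariance(a, b):
    """Mean product of fluctuations about the block mean (e.g. w'T')."""
    mean_a, mean_b = fmean(a), fmean(b)
    return fmean([(x - mean_a) * (y - mean_b) for x, y in zip(a, b)])


def eddy_fluxes(u, v, w, temp):
    """Surface-layer parameters from equal-length 1 s samples of u, v, w and T."""
    uw = covariance(u, w)
    vw = covariance(v, w)
    stress = (uw ** 2 + vw ** 2) ** 0.5  # kinematic momentum flux magnitude
    return {
        "friction_velocity": stress ** 0.5,                      # m s-1
        "momentum_flux": RHO_AIR * stress,                       # N m-2
        "sensible_heat_flux": RHO_AIR * CP_AIR * covariance(w, temp),  # W m-2
        "tke": 0.5 * (covariance(u, u) + covariance(v, v) + covariance(w, w)),  # m2 s-2
        "temperature_variance": covariance(temp, temp),          # K2
    }


if __name__ == "__main__":
    # Tiny synthetic record, for illustration only (not measured data).
    u = [1.0, -1.0, 1.0, -1.0]
    v = [0.0, 0.0, 0.0, 0.0]
    w = [-0.5, 0.5, -0.5, 0.5]
    temp = [300.5, 299.5, 300.5, 299.5]
    print(eddy_fluxes(u, v, w, temp))
```

Comparing an eclipse-day record with the control days would then amount to calling `eddy_fluxes` on the same time window of each day and comparing the returned values.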

Relevance:

100.00%

Publisher:

Abstract:

Knowledge discovery in databases is the non-trivial process of identifying valid, novel, potentially useful and ultimately understandable patterns in data. The term data mining refers to the process that performs exploratory analysis on the data and builds a model of it. To infer patterns from data, data mining uses approaches such as association rule mining, classification and clustering. Among the many data mining techniques, clustering plays a major role, since it groups related data for assessing properties and drawing conclusions. Most clustering algorithms act on a dataset with a uniform format, since the similarity or dissimilarity between data points is a significant factor in finding the clusters. If a dataset consists of mixed attributes, i.e. a combination of numerical and categorical variables, a preferred approach is to convert the different formats into a uniform format. The research study explores various techniques for converting mixed data sets to a numerical equivalent, so that statistical and similar algorithms can be applied. The results of clustering mixed data after conversion to a numeric data type are demonstrated using a crime data set. The thesis also proposes an extension to a well-known algorithm for handling mixed data types, so that it can deal with data sets having only categorical data. The proposed conversion has been validated on a breast cancer data set. Another issue with the clustering process is the visualization of the output. Geometric techniques such as scatter plots and projection plots are available, but none of them displays the result for the whole database; they show only attribute-pair-wise analysis.
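
The idea of converting mixed attributes to a uniform numeric format can be illustrated with a common generic scheme. The sketch below is not the conversion proposed in the thesis, which the abstract does not specify: it assumes min-max scaling for numerical columns and one-hot encoding for categorical columns, and the function name, column names and sample records are all hypothetical.

```python
def encode_mixed(records, numeric_cols, categorical_cols):
    """Convert records with mixed attributes into purely numeric vectors.

    Numeric columns are min-max scaled to [0, 1]; each categorical column
    becomes one 0/1 indicator per observed level (one-hot encoding).
    """
    ranges = {
        c: (min(r[c] for r in records), max(r[c] for r in records))
        for c in numeric_cols
    }
    levels = {c: sorted({r[c] for r in records}) for c in categorical_cols}

    encoded = []
    for r in records:
        vec = []
        for c in numeric_cols:
            lo, hi = ranges[c]
            # A constant column carries no information; map it to 0.0.
            vec.append((r[c] - lo) / (hi - lo) if hi > lo else 0.0)
        for c in categorical_cols:
            vec.extend(1.0 if r[c] == level else 0.0 for level in levels[c])
        encoded.append(vec)
    return encoded


if __name__ == "__main__":
    # Made-up records for illustration only.
    records = [
        {"age": 20, "type": "theft"},
        {"age": 40, "type": "fraud"},
        {"age": 30, "type": "theft"},
    ]
    for row in encode_mixed(records, ["age"], ["type"]):
        print(row)
```

Once every record is a numeric vector of the same length, distance-based clustering algorithms can be applied to the encoded rows directly.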

Relevance:

90.00%

Publisher:

Abstract:

Microarray data analysis is a data mining tool used to extract meaningful information hidden in biological data. One of the major focuses of microarray data analysis is the reconstruction of gene regulatory networks, which may provide a broader understanding of the functioning of complex cellular systems. Since cancer is a genetic disease arising from abnormal gene function, identifying cancerous genes and the regulatory pathways they control provides a better platform for understanding tumor formation and development. The major focus of this thesis is to understand the regulation of genes responsible for the development of cancer, particularly colorectal cancer, by analyzing microarray expression data. In this thesis, four computational algorithms, namely a fuzzy logic algorithm, a modified genetic algorithm, a dynamic neural fuzzy network and a Takagi-Sugeno-Kang-type recurrent neural fuzzy network, are used to extract a cancer-specific gene regulatory network from a plasma RNA dataset of colorectal cancer patients. Plasma RNA is highly attractive for cancer analysis since it requires collecting only a small amount of blood and can be obtained repeatedly at any time, allowing the analysis of disease progression and treatment response.