827 results for clustering accuracy
Abstract:
The use of limiting dilution assay (LDA) for assessing the frequency of responders in a cell population is a method extensively used by immunologists. A series of studies addressing the statistical method of choice in an LDA have been published. However, none of these studies has addressed the point of how many wells should be employed in a given assay. The objective of this study was to demonstrate how a researcher can predict the number of wells that should be employed in order to obtain results with a given accuracy, and, therefore, to help in choosing a better experimental design to fulfill one's expectations. We present the rationale underlying the expected relative error computation based on simple binomial distributions. A series of simulated in machina experiments were performed to test the validity of the a priori computation of expected errors, thus confirming the predictions. The step-by-step procedure of the relative error estimation is given. We also discuss the constraints under which an LDA must be performed.
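The abstract does not give the formulas, so the following is only a minimal sketch of how such an in machina check might look. It assumes the common single-hit Poisson model (a well is negative with probability exp(-f·c), where f is the responder frequency and c the number of cells per well) and draws the number of negative wells from a binomial distribution. The function name and parameters are hypothetical, not taken from the study.

```python
import math
import random


def simulate_relative_error(freq, cells_per_well, n_wells, n_trials=2000, seed=0):
    """Monte Carlo estimate of the mean relative error of an LDA frequency estimate.

    Assumption (not from the abstract): single-hit Poisson model, so a well
    is negative with probability exp(-freq * cells_per_well).
    """
    rng = random.Random(seed)
    p_negative = math.exp(-freq * cells_per_well)
    errors = []
    for _ in range(n_trials):
        # Binomial draw: count the negative wells on one simulated plate.
        negatives = sum(rng.random() < p_negative for _ in range(n_wells))
        # All-negative or all-positive plates give no finite estimate; skip them.
        if negatives == 0 or negatives == n_wells:
            continue
        estimate = -math.log(negatives / n_wells) / cells_per_well
        errors.append(abs(estimate - freq) / freq)
    return sum(errors) / len(errors)


if __name__ == "__main__":
    # Illustrative values only: frequency 1 in 10,000 and 10,000 cells per well.
    for wells in (24, 96, 384):
        err = simulate_relative_error(1e-4, 10_000, wells)
        print(f"{wells} wells: mean relative error ~ {err:.3f}")
```

Under these assumptions the relative error shrinks as wells are added, which is the kind of a priori trade-off the abstract describes.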
Abstract:
Previous genetic association studies have overlooked the potential for biased results when analyzing different population structures in ethnically diverse populations. The purpose of the present study was to quantify this bias in two-locus association studies conducted on an admixed urban population. We studied the genetic structure distribution of angiotensin-converting enzyme insertion/deletion (ACE I/D) and angiotensinogen methionine/threonine (M/T) polymorphisms in 382 subjects from three subgroups in a highly admixed urban population. Group I included 150 white subjects; group II, 142 mulatto subjects; and group III, 90 black subjects. We conducted sample size simulation studies using these data in different genetic models of gene action and interaction and used genetic distance calculation algorithms to help determine the population structure for the studied loci. Our results showed a statistically significant difference in population structure distribution for both the ACE I/D polymorphism (P = 0.02, OR = 1.56, 95% CI = 1.05-2.33 for the D allele, white versus black subgroup) and the angiotensinogen M/T polymorphism (P = 0.007, OR = 1.71, 95% CI = 1.14-2.58 for the T allele, white versus black subgroup). Sample size is predicted to be a determinant of the power to detect a given genotypic association with a particular phenotype when conducting two-locus association studies in admixed populations. In addition, the postulated genetic model is also a major determinant of the power to detect any association in a given sample size. The present simulation study helped to demonstrate the complex interrelation among ethnicity, power of the association, and the postulated genetic model of action of a particular allele in the context of clustering studies. This information is essential for the correct planning and interpretation of future association studies conducted on this population.
Abstract:
This master's thesis introduces the fuzzy tolerance/equivalence relation and its application in cluster analysis. The work presents the construction of fuzzy equivalence relations using increasing generators. We investigate the role of increasing generators in the creation of intersection, union and complement operators. The objective is to develop different varieties of fuzzy tolerance/equivalence relations using different varieties of increasing generators. Finally, we perform a comparative study of the developed varieties of fuzzy tolerance/equivalence relations in their application to a clustering method.
Abstract:
This research concerns different statistical methods that help increase the demand forecasting accuracy of company X's forecasting model. The current forecasting process was analyzed in detail, and a graphical scheme of its logical algorithm was developed as a result. Based on the analysis of the algorithm and of the forecasting errors, all potential directions for future improvement of the model's accuracy were gathered into a complete list. Three improvement directions were chosen for further practical research; on their basis, three test models were created and verified. The novelty of this work lies in the methodological approach of the original analysis of the model, which identified its critical points, as well as in the uniqueness of the developed test models. The results of the study formed the basis of a grant from the Government of St. Petersburg.
Abstract:
Verbal fluency tests are used as a measure of executive functions and language, and can also be used to evaluate semantic memory. We analyzed the influence of education, gender and age on scores in a verbal fluency test using the animal category, and on number of categories, clustering and switching. We examined 257 healthy participants (152 females and 105 males) with a mean age of 49.42 years (SD = 15.75) and a mean of 5.58 (SD = 4.25) years of education. We asked them to name as many animals as they could. Analysis of variance was performed to determine the effect of demographic variables. No significant effect of gender was observed for any of the measures. However, age seemed to influence the number of category changes, as expected for a sensitive frontal measure, after controlling for the effect of education. Educational level had a statistically significant effect on all measures, except for clustering. Subject performance (mean number of animals named) according to schooling was: illiterates, 12.1; 1 to 4 years, 12.3; 5 to 8 years, 14.0; 9 to 11 years, 16.7; and more than 11 years, 17.8. We observed a decrease in performance in these five educational groups over time (more items recalled during the first 15 s, followed by a progressive reduction until the fourth interval). We conclude that education had the greatest effect on the category fluency test in this Brazilian sample. Therefore, we must take care in evaluating the performance of subjects with lower educational levels.
Abstract:
The main objective of this thesis was to study whether quantitative sales forecasting methods enhance the accuracy of the sales forecast in comparison to a qualitative sales forecasting method. A literature review in the field of forecasting was conducted, covering the general sales forecasting process, forecasting methods and techniques, and forecasting accuracy measurement. In the empirical part of the study, the accuracy of the forecasts provided by both qualitative and quantitative methods is studied and compared for short-, medium- and long-term forecasts. The SAS® Forecast Server tool was used to create the quantitative forecasts.
Abstract:
The distribution of psychiatric disorders and of chronic medical illnesses was studied in a population-based sample to determine whether these conditions co-occur in the same individual. A representative sample (N = 1464) of adults living in households was assessed by the Composite International Diagnostic Interview, version 1.1, as part of the São Paulo Epidemiological Catchment Area Study. The association of sociodemographic variables and psychological symptoms with medical illness multimorbidity (8 lifetime somatic conditions) and psychiatric multimorbidity (15 lifetime psychiatric disorders) was determined by negative binomial regression. A total of 1785 chronic medical conditions and 1163 psychiatric conditions were detected in the population, concentrated in 34.1% and 20% of respondents, respectively. Subjects reporting more psychiatric disorders had more medical illnesses. Characteristics such as age range (35-59 years, risk ratio (RR) = 1.3, and more than 60 years, RR = 1.7), being separated (RR = 1.2), being a student (protective effect, RR = 0.7), being of low educational level (RR = 1.2) and being psychologically distressed (RR = 1.1) were determinants of medical conditions. Age (35-59 years, RR = 1.2, and more than 60 years, RR = 0.5), being retired (RR = 2.5), and being psychologically distressed (females, RR = 1.5, and males, RR = 1.4) were determinants of psychiatric disorders. In conclusion, psychological distress and some sociodemographic features such as age, marital status, occupational status, educational level, and gender are associated with psychiatric and medical multimorbidity. The distribution of both types of morbidity suggests the need to integrate mental health into general clinical settings.
Abstract:
Radiotherapy is one of the main approaches to cure prostate cancer, and its success depends on the accuracy of dose planning. A complicating factor is the presence of a metallic prosthesis in the femur and pelvis, which is becoming more common in elderly populations. The goal of this work was to perform dose measurements to check the accuracy of radiotherapy treatment planning under these complicated conditions. To accomplish this, a scale phantom of an adult pelvic region was used with alanine dosimeters inserted in the prostate region. This phantom was irradiated according to the planned treatment under the following three conditions: with two metallic prostheses in the region of the femur head, with only one prosthesis, and without any prostheses. The combined relative standard uncertainty of dose measurement by electron spin resonance (ESR)/alanine was 5.05%, whereas the combined relative standard uncertainty of the applied dose was 3.35%, resulting in a combined relative standard uncertainty of the whole process of 6.06%. The ESR dosimetry indicated that there was no difference (P>0.05, ANOVA) in dosage between the planned dose and treatments. The results are in the range of the planned dose, within the combined relative uncertainty, demonstrating that the treatment-planning system compensates for the effects caused by the presence of femur and hip metal prostheses.
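The three uncertainty figures quoted above are mutually consistent if the two components are treated as independent and combined in quadrature (root sum of squares). That combination rule is an assumption here, since the abstract does not state it; the short check below shows the arithmetic, with a hypothetical helper name.

```python
import math


def combined_relative_uncertainty(*components_percent):
    """Combine independent relative standard uncertainties (in %) in quadrature.

    Assumption: the components are uncorrelated, so their squares add.
    """
    return math.sqrt(sum(u ** 2 for u in components_percent))


# ESR/alanine measurement (5.05%) and applied dose (3.35%), as reported.
total = combined_relative_uncertainty(5.05, 3.35)
print(f"{total:.2f}%")  # prints "6.06%"
```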
Abstract:
Genetic, Prenatal and Postnatal Determinants of Weight Gain and Obesity in Young Children – The STEPS Study. University of Turku, Faculty of Medicine, Department of Paediatrics, University of Turku Doctoral Program of Clinical Investigation (CLIPD), Turku Institute for Child and Youth Research. Overweight and obesity in childhood are common health problems with long-lasting effects into adulthood. Currently 22% of Finnish boys and 12% of Finnish girls are overweight and 4% of Finnish boys and 2% of Finnish girls are obese. The foundation for later health is formed early, even before birth, and the importance of prenatal growth for later health outcomes is widely acknowledged. Maternal overweight, high gestational weight gain and disturbances in glucose metabolism during pregnancy are associated with an increased risk of obesity in children. On the other hand, breastfeeding and later introduction of complementary foods are associated with a decreased obesity risk. In addition to these, many genetic and environmental factors have an effect on obesity risk, but the clustering of these factors has not been extensively studied. The main objective of this thesis was to provide comprehensive information on prenatal and early postnatal factors associated with weight gain and obesity in infancy up to two years of age. The study was part of the STEPS Study (Steps to Healthy Development), which is a follow-up study consisting of 1797 families. This thesis focused on children up to 24 months of age. Altogether 26% of boys and 17% of girls were overweight and 5% of boys and 4% of girls were obese at 24 months of age according to the New Finnish Growth References for Children BMI-for-age criteria. Compared to children who remained normal weight, the children who became overweight or obese showed different growth trajectories already at 13 months of age.
Maternal overweight had an impact on children's birth weight and early growth from birth to 24 months of age. The mean duration of breastfeeding was almost 2 months shorter in overweight women than in normal weight women. A longer duration of breastfeeding was protective against excessive weight gain, high BMI, high body weight and high weight-for-length SDS during the first 24 months of life. Breast milk fatty acid composition differed between overweight and normal weight mothers: overweight women had more saturated fatty acids and less n-3 fatty acids in breast milk. Overweight women also introduced complementary foods to their infants earlier than normal weight mothers. A genetic risk score calculated from 83 obesogenic- and adiposity-related single nucleotide polymorphisms (SNPs) showed that infants with a high genetic risk for being overweight and obese were heavier at 13 months and 24 months of age than infants with a low genetic risk, thus possibly predisposing them to later obesity in an obesogenic environment. The Obesity Risk Score showed that children with the highest number of risk factors had an almost 6-fold risk of being overweight and obese at 24 months compared to children with the lowest number of risk factors. The accuracy of the Obesity Risk Score in predicting overweight and obesity at 24 months was 82%. This study showed that many of the obesogenic risk factors tend to cluster within children and families and that children who later became overweight or obese show different growth trajectories already at a young age. These results highlight the importance of early detection of children with higher obesity risk as well as the importance of prevention measures focused on parents. Keywords: Breastfeeding, Child, Complementary Feeding, Genes, Glucose metabolism, Growth, Infant Nutrition Physiology, Nutrition, Obesity, Overweight, Programming
Abstract:
Remote sensing techniques involving hyperspectral imagery have applications in a number of sciences that study aspects of the surface of the planet. The analysis of hyperspectral images is complex because of the large amount of information involved and the noise within that data. Investigating images to identify minerals, rocks, vegetation and other materials is an application of hyperspectral remote sensing in the earth sciences. This thesis evaluates the performance of two classification and clustering techniques on hyperspectral images for mineral identification. Support Vector Machines (SVM) and Self-Organizing Maps (SOM) are applied as classification and clustering techniques, respectively. Principal Component Analysis (PCA) is used to prepare the data to be analyzed. The purpose of using PCA is to reduce the amount of data that needs to be processed by identifying the most important components within the data. A well-studied dataset from Cuprite, Nevada and a dataset of more complex data from Baffin Island were used to assess the performance of these techniques. The main goal of this research study is to evaluate the advantage of training a classifier based on a small amount of data compared to an unsupervised method. Determining the effect of feature extraction on the accuracy of the clustering and classification method is another goal of this research. This thesis concludes that using PCA increases the learning accuracy, especially in classification. SVM classifies Cuprite data with high precision, and the SOM challenges SVM on datasets with a high level of noise (like Baffin Island).
Abstract:
The goal of most clustering algorithms is to find the optimal number of clusters (i.e. fewest number of clusters). However, analysis of molecular conformations of biological macromolecules obtained from computer simulations may benefit from a larger array of clusters. The Self-Organizing Map (SOM) clustering method has the advantage of generating large numbers of clusters, but often gives ambiguous results. In this work, SOMs have been shown to be reproducible when the same conformational dataset is independently clustered multiple times (~100), with the help of the Cramér's V-index (C_v). The ability of C_v to determine which SOMs are reproduced is generalizable across different SOM source codes. The conformational ensembles produced from MD (molecular dynamics) and REMD (replica exchange molecular dynamics) simulations of the pentapeptide Met-enkephalin (MET) and the 34-amino-acid protein human Parathyroid Hormone (hPTH) were used to evaluate SOM reproducibility. The training length for the SOM has a huge impact on the reproducibility. Analysis of MET conformational data determined that toroidal SOMs cluster data better than bordered maps because toroidal maps do not have an edge effect. For the source code from MATLAB, it was determined that the learning rate function should be LINEAR with an initial learning rate factor of 0.05 and that the SOM should be trained by a sequential algorithm. The trained SOMs can be used for supervised classification of another dataset. The toroidal 10×10 hexagonal SOMs produced from the MATLAB program for hPTH conformational data produced three sets of reproducible clusters (27%, 15%, and 13% of 100 independent runs) which find similar partitionings to those of smaller 6×6 SOMs. The χ^2 values produced as part of the C_v calculation were used to locate clusters with identical conformational memberships on independently trained SOMs, even those with different dimensions. The χ^2 values could relate the different SOM partitionings to each other.
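As an illustration of how two independent clusterings of the same conformations might be compared, the sketch below computes the standard Cramér's V from the contingency table of two label assignments. The abstract does not say exactly how C_v was computed in the thesis, so the textbook definition V = sqrt(χ² / (n · (min(r, c) − 1))) is an assumption, and the function name is hypothetical.

```python
import math
from collections import Counter


def cramers_v(labels_a, labels_b):
    """Cramér's V between two cluster assignments of the same items.

    Assumption: textbook definition based on the Pearson chi-squared
    statistic of the cluster-by-cluster contingency table.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both assignments must label the same items")
    n = len(labels_a)
    joint = Counter(zip(labels_a, labels_b))
    rows = Counter(labels_a)
    cols = Counter(labels_b)
    chi2 = 0.0
    for r, r_count in rows.items():
        for c, c_count in cols.items():
            expected = r_count * c_count / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(rows), len(cols))
    if k < 2:
        raise ValueError("each assignment needs at least two clusters")
    return math.sqrt(chi2 / (n * (k - 1)))


# Two runs that find the same partition under different cluster numbering.
run_1 = [0, 0, 1, 1, 2, 2]
run_2 = [2, 2, 0, 0, 1, 1]
print(cramers_v(run_1, run_2))
```

Because V depends only on the contingency table, it is insensitive to how each run happens to number its clusters, which is what makes it usable for comparing independently trained maps.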
Abstract:
Research report
Abstract:
Affiliation: Institut de recherche en immunologie et en cancérologie, Université de Montréal
Abstract:
UANL