920 resultados para Fourier coefficients vector
Resumo:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Resumo:
Hundreds of Terabytes of CMS (Compact Muon Solenoid) data are being accumulated for storage day by day at the University of Nebraska-Lincoln, which is one of the eight US CMS Tier-2 sites. Managing this data includes retaining useful CMS data sets and clearing storage space for newly arriving data by deleting less useful data sets. This is an important task that is currently being done manually and it requires a large amount of time. The overall objective of this study was to develop a methodology to help identify the data sets to be deleted when there is a requirement for storage space. CMS data is stored using HDFS (Hadoop Distributed File System). HDFS logs give information regarding file access operations. Hadoop MapReduce was used to feed information in these logs to Support Vector Machines (SVMs), a machine learning algorithm applicable to classification and regression which is used in this Thesis to develop a classifier. Time elapsed in data set classification by this method is dependent on the size of the input HDFS log file since the algorithmic complexities of Hadoop MapReduce algorithms here are O(n). The SVM methodology produces a list of data sets for deletion along with their respective sizes. This methodology was also compared with a heuristic called Retention Cost which was calculated using size of the data set and the time since its last access to help decide how useful a data set is. Accuracies of both were compared by calculating the percentage of data sets predicted for deletion which were accessed at a later instance of time. Our methodology using SVMs proved to be more accurate than using the Retention Cost heuristic. This methodology could be used to solve similar problems involving other large data sets.
Resumo:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Resumo:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Resumo:
Backgrounds Ea aims: The boundaries between the categories of body composition provided by vectorial analysis of bioimpedance are not well defined. In this paper, fuzzy sets theory was used for modeling such uncertainty. Methods: An Italian database with 179 cases 18-70 years was divided randomly into developing (n = 20) and testing samples (n = 159). From the 159 registries of the testing sample, 99 contributed with unequivocal diagnosis. Resistance/height and reactance/height were the input variables in the model. Output variables were the seven categories of body composition of vectorial analysis. For each case the linguistic model estimated the membership degree of each impedance category. To compare such results to the previously established diagnoses Kappa statistics was used. This demanded singling out one among the output set of seven categories of membership degrees. This procedure (defuzzification rule) established that the category with the highest membership degree should be the most likely category for the case. Results: The fuzzy model showed a good fit to the development sample. Excellent agreement was achieved between the defuzzified impedance diagnoses and the clinical diagnoses in the testing sample (Kappa = 0.85, p < 0.001). Conclusions: fuzzy linguistic model was found in good agreement with clinical diagnoses. If the whole model output is considered, information on to which extent each BIVA category is present does better advise clinical practice with an enlarged nosological framework and diverse therapeutic strategies. (C) 2012 Elsevier Ltd and European Society for Clinical Nutrition and Metabolism. All rights reserved.