3 results for limit sets

in DigitalCommons@University of Nebraska - Lincoln


Relevance:

20.00%

Abstract:

Background: Large gene expression studies, such as those conducted using DNA arrays, often provide millions of different pieces of data. To address the problem of analyzing such data, we describe a statistical method, which we have called ‘gene shaving’. The method identifies subsets of genes with coherent expression patterns and large variation across conditions. Gene shaving differs from hierarchical clustering and other widely used methods for analyzing gene expression studies in that genes may belong to more than one cluster, and the clustering may be supervised by an outcome measure. The technique can be ‘unsupervised’, that is, the genes and samples are treated as unlabeled, or partially or fully supervised by using known properties of the genes or samples to assist in finding meaningful groupings. Results: We illustrate the use of the gene shaving method to analyze gene expression measurements made on samples from patients with diffuse large B-cell lymphoma. The method identifies a small cluster of genes whose expression is highly predictive of survival. Conclusions: The gene shaving method is a potentially useful tool for exploration of gene expression data and identification of interesting clusters of genes worth further investigation.
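The shaving loop sketched in the abstract can be illustrated in a few lines of NumPy. This is a simplified, unsupervised sketch, not the authors' implementation: the shaving fraction and stopping size are hypothetical parameters, and the gap-statistic step the full method uses to pick the best cluster size is omitted.

```python
import numpy as np

def gene_shaving(X, alpha=0.2, min_genes=5):
    """Minimal unsupervised 'gene shaving' sketch.

    X: genes x samples expression matrix (rows = genes).
    Repeatedly find the leading principal component of the current
    gene set, then 'shave' off the fraction alpha of genes least
    correlated with it, yielding a nested sequence of candidate
    clusters (full method would choose among these via a gap statistic).
    """
    idx = np.arange(X.shape[0])
    clusters = [idx.copy()]
    while len(idx) > min_genes:
        # center each gene's expression profile across samples
        sub = X[idx] - X[idx].mean(axis=1, keepdims=True)
        # leading principal component across samples, via SVD
        _, _, vt = np.linalg.svd(sub, full_matrices=False)
        pc1 = vt[0]
        # absolute correlation of each gene with the first PC
        corr = np.abs(sub @ pc1) / (np.linalg.norm(sub, axis=1) + 1e-12)
        # keep the (1 - alpha) fraction most correlated with the PC
        keep = max(min_genes, int(len(idx) * (1 - alpha)))
        idx = idx[np.argsort(corr)[::-1][:keep]]
        clusters.append(idx.copy())
    return clusters
```

Because each pass only removes genes from the previous cluster, a gene can still appear in several of the method's clusters when shaving is restarted on the residual matrix, which is what distinguishes it from hard partitioning schemes such as hierarchical clustering.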

Relevance:

20.00%

Abstract:

Prairie dog (Cynomys ludovicianus) control has historically consisted of lethal methods to maintain, reduce, or eliminate populations in South Dakota and throughout the species' range. Non-lethal methods of control are desired to meet changing management objectives for the black-tailed prairie dog. The use of naturally occurring buffer strips as vegetative barriers may be effective in limiting prairie dog town expansion. The objectives of this study were: 1) to evaluate the effective width of vegetative barriers in limiting prairie dog town expansion in western South Dakota; and 2) to document the effect of native vegetation height on expansion of prairie dog towns in western South Dakota. Five study sites were established in western South Dakota on rangelands containing prairie dog towns of adequate size. Electric fences were constructed to exclude cattle and create buffer strips of native grasses and shrubs. Prairie dogs were poisoned to create a prairie dog-free buffer zone adjacent to active prairie dog towns. Grazing was allowed on both sides of the buffer strip; when grazing pressure was not sufficient, mowing was used to simulate grazing. Buffer strips were 100 meters long and 10, 25, or 40 meters wide. A zero-meter control was included at all study sites. Twenty-five quadrats were randomly distributed throughout the buffer strips. Evaluation of study sites included visual obstruction readings (VOR), vegetation cover, vegetation frequency, vegetation height, and vegetation identification. Barrier penetration was evaluated by the presence of new active burrows behind vegetative barriers. Significant relationships were documented for both VOR and vegetation height. No significant difference was found between frequency of breakthroughs and buffer widths.

Relevance:

20.00%

Abstract:

Hundreds of terabytes of CMS (Compact Muon Solenoid) data are accumulated for storage every day at the University of Nebraska-Lincoln, one of the eight US CMS Tier-2 sites. Managing this data includes retaining useful CMS data sets and clearing storage space for newly arriving data by deleting less useful data sets. This important task is currently done manually and requires a large amount of time. The overall objective of this study was to develop a methodology to help identify the data sets to be deleted when storage space is required. CMS data is stored using HDFS (Hadoop Distributed File System), whose logs record file access operations. Hadoop MapReduce was used to feed information from these logs to Support Vector Machines (SVMs), a machine learning algorithm applicable to classification and regression, which is used in this thesis to develop a classifier. Classification time depends linearly on the size of the input HDFS log file, since the MapReduce algorithms used here are O(n). The SVM methodology produces a list of data sets for deletion along with their respective sizes. This methodology was also compared with a heuristic called Retention Cost, calculated from the size of a data set and the time since its last access, which helps decide how useful the data set is. The accuracies of both were compared by calculating the percentage of data sets predicted for deletion that were accessed at a later time. Our methodology using SVMs proved more accurate than the Retention Cost heuristic. This methodology could be used to solve similar problems involving other large data sets.
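The abstract gives only the ingredients of the Retention Cost heuristic (data-set size and time since last access), not its exact formula. One plausible formulation, sketched below with hypothetical names (`retention_cost`, `deletion_candidates`), scores each data set as size × idle time and deletes from the top of the ranking until enough space is freed:

```python
import time

def retention_cost(size_bytes, last_access_ts, now=None):
    """Plausible Retention Cost score (assumed form, not the thesis's
    exact formula): large and long-idle data sets score highest,
    making them the first candidates for deletion."""
    now = time.time() if now is None else now
    return size_bytes * (now - last_access_ts)

def deletion_candidates(datasets, space_needed, now=None):
    """Rank data sets by retention cost and pick enough, highest
    score first, to free `space_needed` bytes.

    datasets: mapping of name -> (size_bytes, last_access_ts).
    Returns (names_to_delete, bytes_freed)."""
    ranked = sorted(datasets.items(),
                    key=lambda kv: retention_cost(kv[1][0], kv[1][1], now),
                    reverse=True)
    picked, freed = [], 0
    for name, (size, _) in ranked:
        if freed >= space_needed:
            break
        picked.append(name)
        freed += size
    return picked, freed
```

The thesis's SVM classifier would replace this scoring step with a learned decision over features extracted from the HDFS access logs; the heuristic above is the baseline the SVM approach was measured against.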