2 results for INSECT VECTOR

in DigitalCommons@University of Nebraska - Lincoln


Relevance:

20.00% 20.00%

Publisher:

Abstract:

Hundreds of terabytes of CMS (Compact Muon Solenoid) data are accumulated for storage every day at the University of Nebraska-Lincoln, one of the eight US CMS Tier-2 sites. Managing this data involves retaining useful CMS data sets and clearing storage space for newly arriving data by deleting less useful data sets. This important task is currently done manually and requires a large amount of time. The overall objective of this study was to develop a methodology to help identify the data sets to delete when storage space is needed. CMS data is stored using HDFS (Hadoop Distributed File System), and the HDFS logs record file access operations. Hadoop MapReduce was used to feed the information in these logs to Support Vector Machines (SVMs), a machine learning algorithm applicable to classification and regression, which is used in this thesis to develop a classifier. The time taken to classify data sets with this method depends on the size of the input HDFS log file, since the Hadoop MapReduce algorithms used here have O(n) complexity. The SVM methodology produces a list of data sets for deletion along with their respective sizes. This methodology was also compared with a heuristic called Retention Cost, which is calculated from the size of a data set and the time since its last access to help decide how useful the data set is. The accuracy of the two approaches was compared by calculating the percentage of data sets predicted for deletion that were accessed at a later time. Our methodology using SVMs proved to be more accurate than the Retention Cost heuristic. This methodology could be used to solve similar problems involving other large data sets.
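The comparison metric described in the abstract (the percentage of data sets predicted for deletion that were accessed later) can be sketched as follows. This is a minimal illustration, not the thesis code: the function name `later_access_rate` and all data set names are hypothetical, and the sample lists are made up purely to show the arithmetic. A lower percentage means fewer data sets were wrongly flagged.

```python
def later_access_rate(predicted_for_deletion, accessed_later):
    """Percentage of data sets flagged for deletion that were accessed afterwards.

    Both arguments are iterables of data set names. Returns 0.0 when nothing
    was flagged, to avoid dividing by zero.
    """
    predicted = set(predicted_for_deletion)
    if not predicted:
        return 0.0
    wrongly_flagged = predicted & set(accessed_later)
    return 100.0 * len(wrongly_flagged) / len(predicted)


# Hypothetical inputs, for illustration only (not results from the study).
svm_flagged = ["dataset_a", "dataset_b", "dataset_c", "dataset_d"]
heuristic_flagged = ["dataset_a", "dataset_e", "dataset_f", "dataset_g"]
accessed_later = ["dataset_d", "dataset_e", "dataset_f"]

svm_rate = later_access_rate(svm_flagged, accessed_later)
heuristic_rate = later_access_rate(heuristic_flagged, accessed_later)
```

With these made-up lists, one of the four data sets in `svm_flagged` is accessed later, against two of the four in `heuristic_flagged`, so the first list scores better on this metric.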

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Stage-structured models that integrate demography and dispersal can be used to identify points in the life cycle with large effects on rates of population spatial spread, information that is vital in the development of containment strategies for invasive species. Current challenges in the application of these tools include: (1) accounting for large uncertainty in model parameters, which may violate assumptions of "local" perturbation metrics such as sensitivities and elasticities, and (2) forecasting not only asymptotic rates of spatial spread, as is usually done, but also transient spatial dynamics in the early stages of invasion. We developed an invasion model for the Diaprepes root weevil (DRW; Diaprepes abbreviatus [Coleoptera: Curculionidae]), a generalist herbivore that has invaded citrus-growing regions of the United States. We synthesized data on DRW demography and dispersal and generated predictions for asymptotic and transient peak invasion speeds, accounting for parameter uncertainty. We quantified the contributions of each parameter toward invasion speed using a "global" perturbation analysis, and we contrasted parameter contributions during the transient and asymptotic phases. We found that the asymptotic invasion speed was 0.02–0.028 km/week, although the transient peak invasion speed (0.03–0.045 km/week) was significantly greater. Both asymptotic and transient invasion speeds were most responsive to weevil dispersal distances. However, demographic parameters that had large effects on asymptotic speed (e.g., survival of early-instar larvae) had little effect on transient speed. Comparison of the global analysis with lower-level elasticities indicated that local perturbation analysis would have generated unreliable predictions for the responsiveness of invasion speed to underlying parameters. Observed range expansion in southern Florida (1992–2006) was significantly lower than the invasion speed predicted by the model. Possible causes of this mismatch include overestimation of dispersal distances, demographic rates, and spatiotemporal variation in parameter values. This study demonstrates that, when parameter uncertainty is large, as is often the case, global perturbation analyses are needed to identify which points in the life cycle should be targets of management. Our results also suggest that effective strategies for reducing spread during the asymptotic phase may have little effect during the transient phase. Includes Appendix.
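For readers who think in annual terms, the weekly invasion speeds quoted in the abstract can be converted to approximate yearly figures. This is only a unit conversion sketch: the 52-weeks-per-year factor is an assumption, and the helper name `km_per_week_to_km_per_year` is hypothetical, not something from the study.

```python
WEEKS_PER_YEAR = 52  # assumption: approximate number of weeks in a year


def km_per_week_to_km_per_year(speed_km_per_week):
    """Convert a spread speed from km/week to km/year."""
    return speed_km_per_week * WEEKS_PER_YEAR


# Ranges quoted in the abstract, in km/week.
asymptotic_km_per_week = (0.02, 0.028)
transient_peak_km_per_week = (0.03, 0.045)

# Rounded to three decimals to hide floating-point noise.
asymptotic_km_per_year = tuple(
    round(km_per_week_to_km_per_year(s), 3) for s in asymptotic_km_per_week
)
transient_peak_km_per_year = tuple(
    round(km_per_week_to_km_per_year(s), 3) for s in transient_peak_km_per_week
)
```

Under that assumption, the asymptotic range corresponds to roughly 1.0 to 1.5 km per year and the transient peak to roughly 1.6 to 2.3 km per year.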