13 results for Context data
in DigitalCommons@University of Nebraska - Lincoln
Abstract:
When an appropriate fish host is selected, analysis of its parasites offers a useful, reliable, economical, telescoped indication or monitor of environmental health. The value of that information increases when corroborated by another non-parasitological technique. The analysis of parasites is not necessarily simple because not all hosts serve as good models and because the number of species, presence of specific species, intensity of infections, life histories of species, location of species in hosts, and host response for each parasitic species have to be addressed individually to ensure the usefulness of the tool. Also, different anthropogenic contaminants act in distinct ways relative to hosts, parasites, and each other, and are also influenced by natural environmental conditions. Total values for all parasitic species infecting a sample cannot necessarily be grouped together. For example, an abundance of either species or individuals can indicate either a healthy or an unhealthy environment, depending on the species of parasite. Moreover, depending on the parasitic species, its infection, and the time chosen for collection/examination, the assessment may indicate a chronic or acute state of environmental health. For most types of analyses, the host should be one that has a restricted home range, can be infected by numerous species of parasites, many of which have a variety of additional hosts in their life cycles, and can be readily sampled. Data from two separate studies of parasitic infections in the western mosquitofish (Gambusia affinis), a fish that meets these criteria, illustrate the usefulness of that host as a model to indicate both healthy and detrimentally influenced environments. In those studies, species richness, intensity of select species, host resistance, other hosts involved in life cycles, and other factors all relate to site and contaminating discharge.
Abstract:
The recent likely extinction of the baiji (Chinese river dolphin [Lipotes vexillifer]) (Turvey et al. 2007) makes the vaquita (Gulf of California porpoise [Phocoena sinus]) the most endangered cetacean. The vaquita has the smallest range of any porpoise, dolphin, or whale and, like the baiji, has long been threatened primarily by accidental deaths in fishing gear (bycatch) (Rojas-Bracho et al. 2006). Despite repeated recommendations from scientific bodies and conservation organizations, no effective actions have been taken to remove nets from the vaquita’s environment. Here, we address three questions that are important to vaquita conservation: (1) How many vaquitas remain? (2) How much time is left to find a solution to the bycatch problem? and (3) Are further abundance surveys or bycatch estimates needed to justify the immediate removal of all entangling nets from the range of the vaquita? Our answers are, in short: (1) there are about 150 vaquitas left, (2) there are at most 2 years within which to find a solution, and (3) further abundance surveys or bycatch estimates are not needed. The answers to the first two questions make clear that action is needed now, whereas the answer to the last question removes the excuse of uncertainty as a delay tactic. Herein we explain our reasoning.
Abstract:
Prior studies of phylogenetic relationships among phocoenids based on morphology and molecular sequence data conflict and yield unresolved relationships among species. This study evaluates a comprehensive set of cranial, postcranial, and soft anatomical characters to infer interrelationships among extant species and several well-known fossil phocoenids, using two different methods to analyze polymorphic data: polymorphic coding and frequency step matrix. Our phylogenetic results confirmed phocoenid monophyly. The previously proposed division of Phocoenidae into two subfamilies was rejected, as was the alliance of the two extinct genera Salumiphocaena and Piscolithax with Phocoena dioptrica and Phocoenoides dalli. Extinct phocoenids are basal to all extant species. We also examined the origin and distribution of porpoises within the context of this phylogenetic framework. Phocoenid phylogeny together with available geologic evidence suggests that the early history of phocoenids was centered in the North Pacific during the middle Miocene, with subsequent dispersal into the southern hemisphere in the middle Pliocene. A cooling period in the Pleistocene allowed dispersal of the southern ancestor of Phocoena sinus into the North Pacific (Gulf of California).
Abstract:
Dynamic conferencing refers to a scenario wherein any subset of users in a universe of users form a conference for sharing confidential information among themselves. The key distribution (KD) problem in dynamic conferencing is to compute a shared secret key for such a dynamically formed conference. In the literature, KD schemes for dynamic conferencing either are computationally unscalable or require communication among users, which is undesirable. The extended symmetric polynomial based dynamic conferencing scheme (ESPDCS) is one such KD scheme, which has a high computational complexity that depends on the universe size. In this paper we present an enhancement to the ESPDCS scheme to develop a KD scheme called universe-independent SPDCS (UI-SPDCS) such that its complexity is independent of the universe size. However, the UI-SPDCS scheme does not scale with the conference size. We therefore propose a relatively scalable KD scheme, termed DH-SPDCS, that uses the UI-SPDCS scheme and the tree-based group Diffie-Hellman (TGDH) key exchange protocol. The proposed DH-SPDCS scheme provides a configurable trade-off between the computation and communication complexity of the scheme.
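To make the symmetric-polynomial idea behind such KD schemes concrete, the minimal Python sketch below shows the classic bivariate case: a trusted server distributes univariate shares of a symmetric polynomial over a prime field, and any two users then derive the same pairwise key without further communication. This only illustrates the underlying principle; the ESPDCS, UI-SPDCS, and DH-SPDCS constructions extend it to arbitrary conference sizes and are not reproduced here, and the modulus and degree below are illustrative choices.

import random

P = 2_147_483_647  # a prime modulus (size chosen only for illustration)

def make_symmetric_poly(t, seed=0):
    # Symmetric coefficient matrix a[i][j] == a[j][i], so f(x, y) = f(y, x).
    rng = random.Random(seed)
    a = [[0] * (t + 1) for _ in range(t + 1)]
    for i in range(t + 1):
        for j in range(i, t + 1):
            a[i][j] = a[j][i] = rng.randrange(P)
    return a

def share(a, u):
    # Server side: hand user u the coefficients of the univariate polynomial f(u, y).
    t = len(a) - 1
    return [sum(a[i][j] * pow(u, i, P) for i in range(t + 1)) % P for j in range(t + 1)]

def pairwise_key(my_share, other_id):
    # User side: evaluate the stored share at the other user's public identifier.
    return sum(c * pow(other_id, j, P) for j, c in enumerate(my_share)) % P

a = make_symmetric_poly(t=3)
alice, bob = 17, 42
assert pairwise_key(share(a, alice), bob) == pairwise_key(share(a, bob), alice)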
Abstract:
In this paper, we propose a Loss Tolerant Reliable (LTR) data transport mechanism for dynamic Event Sensing (LTRES) in WSNs. In LTRES, a reliable event sensing requirement at the transport layer is dynamically determined by the sink. A distributed source rate adaptation mechanism is designed, incorporating a loss rate based lightweight congestion control mechanism, to regulate the data traffic injected into the network so that the reliability requirement can be satisfied. An equation based fair rate control algorithm is used to improve the fairness among the LTRES flows sharing a congested path. The performance evaluations show that LTRES can provide LTR data transport service for multiple events with short convergence time, low loss rate, and high overall bandwidth utilization.
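As a rough illustration of what an equation-based fair rate control can look like, the sketch below computes a TCP-friendly sending rate from the measured loss event rate and round-trip time, in the style of TFRC (RFC 5348). Whether LTRES uses exactly this throughput equation is an assumption on our part; the parameter names and example values are illustrative.

import math

def tfrc_rate(packet_size, rtt, loss_rate, t_rto=None):
    # TCP-friendly sending rate (bytes/second) for segment size packet_size (bytes),
    # round-trip time rtt (seconds), and loss event rate loss_rate in (0, 1].
    if t_rto is None:
        t_rto = 4 * rtt  # common simplification for the retransmission timeout
    p = loss_rate
    denom = (rtt * math.sqrt(2 * p / 3)
             + t_rto * 3 * math.sqrt(3 * p / 8) * p * (1 + 32 * p ** 2))
    return packet_size / denom

# Example: 1000-byte packets, 100 ms RTT, 1% loss.
print(tfrc_rate(packet_size=1000, rtt=0.1, loss_rate=0.01))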
Abstract:
Most authors struggle to pick a title that adequately conveys all of the material covered in a book. When I first saw Applied Spatial Data Analysis with R, I expected a review of spatial statistical models and their applications in packages (libraries) from the CRAN site of R. The authors’ title is not misleading, but I was very pleasantly surprised by how deep the word “applied” is here. The first half of the book essentially covers how R handles spatial data. To some statisticians this may be boring. Do you want, or need, to know the difference between S3 and S4 classes, how spatial objects in R are organized, and how various methods work on the spatial objects? A few years ago I would have said “no,” especially to the “want” part. Just let me slap my EXCEL spreadsheet into R and run some spatial functions on it. Unfortunately, the world is not so simple, and ultimately we want to minimize effort to get all of our spatial analyses accomplished. The first half of this book certainly convinced me that some extra effort in organizing my data into certain spatial class structures makes the analysis easier and less subject to mistakes. I also admit that I found it very interesting and I learned a lot.
Abstract:
We propose a general framework for the analysis of animal telemetry data through the use of weighted distributions. It is shown that several interpretations of resource selection functions arise when constructed from the ratio of a use and availability distribution. Through the proposed general framework, several popular resource selection models are shown to be special cases of the general model by making assumptions about animal movement and behavior. The weighted distribution framework is shown to be easily extended to account for telemetry data that are highly autocorrelated, as is typical of animal relocations collected with newer technology such as global positioning systems. An analysis of simulated data using several models constructed within the proposed framework is also presented to illustrate the possible gains from the flexible modeling framework. The proposed model is applied to a brown bear data set from southeast Alaska.
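One common way to write the weighted-distribution form of a resource selection model, which appears to be the kind of construction described above (the notation here is ours, not necessarily the authors'), is

f_u(\mathbf{x}) = \frac{w(\mathbf{x};\boldsymbol\beta)\, f_a(\mathbf{x})}{\int w(\mathbf{s};\boldsymbol\beta)\, f_a(\mathbf{s})\, d\mathbf{s}},

where f_a is the availability distribution of locations or resources x, w(\cdot;\boldsymbol\beta) is the resource selection function, and f_u is the resulting use distribution; different choices of w and different movement assumptions then yield the familiar special cases.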
Abstract:
We consider a fully model-based approach for the analysis of distance sampling data. Distance sampling has been widely used to estimate abundance (or density) of animals or plants in a spatially explicit study area. There is, however, no readily available method of making statistical inference on the relationships between abundance and environmental covariates. Spatial Poisson process likelihoods can be used to simultaneously estimate detection and intensity parameters by modeling distance sampling data as a thinned spatial point process. A model-based spatial approach to distance sampling data has three main benefits: it allows complex and opportunistic transect designs to be employed, it allows estimation of abundance in small subregions, and it provides a framework to assess the effects of habitat or experimental manipulation on density. We demonstrate the model-based methodology with a small simulation study and an analysis of the Dubbo weed data set, and we also propose a simple ad hoc method for handling overdispersion. The simulation study showed that the model-based approach compared favorably to conventional distance sampling methods for abundance estimation, and the overdispersion correction performed adequately when the number of transects was high. Analysis of the Dubbo data set indicated a transect effect on abundance via Akaike’s information criterion model selection. Further goodness-of-fit analysis, however, indicated some potential confounding of intensity with the detection function.
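A generic thinned spatial Poisson process likelihood of the type described above (schematic only; the notation is ours and the paper's parameterization may differ) is

L(\boldsymbol\beta, \boldsymbol\theta) \propto \Bigl[\prod_{i=1}^{n} \lambda(\mathbf{s}_i;\boldsymbol\beta)\, g(d_i;\boldsymbol\theta)\Bigr] \exp\Bigl\{-\int_{\mathcal{A}} \lambda(\mathbf{s};\boldsymbol\beta)\, g(d(\mathbf{s});\boldsymbol\theta)\, d\mathbf{s}\Bigr\},

where \lambda(\mathbf{s};\boldsymbol\beta) is the intensity of the underlying point process (which may depend on habitat covariates), g(d;\boldsymbol\theta) is the detection function evaluated at the perpendicular distance d of location \mathbf{s} from the nearest transect, and \mathcal{A} is the covered region; maximizing this likelihood estimates the detection and intensity parameters simultaneously.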
Abstract:
The 3PL model is a flexible and widely used tool in assessment. However, it suffers from limitations due to its need for large sample sizes. This study introduces and evaluates the efficacy of a new sample size augmentation technique called Duplicate, Erase, and Replace (DupER) Augmentation through a simulation study. Data are augmented using several variations of DupER Augmentation (based on different imputation methodologies, deletion rates, and duplication rates), analyzed in BILOG-MG 3, and results are compared to those obtained from analyzing the raw data. Additional manipulated variables include test length and sample size. Estimates are compared using seven different evaluative criteria. Results are mixed and inconclusive. DupER-augmented data tend to result in larger root mean squared errors (RMSEs) and lower correlations between estimates and parameters for both item and ability parameters. However, some DupER variations produce estimates that are much less biased than those obtained from the raw data alone. For one DupER variation, it was found that DupER produced better results for low-ability simulees and worse results for those with high abilities. Findings, limitations, and recommendations for future studies are discussed. Specific recommendations for future studies include applying DupER Augmentation (1) to empirical data, (2) with additional IRT models, and (3) across different item and ability parameter distributions to assess the efficacy of the procedure.
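For reference, the three-parameter logistic (3PL) model referred to above is conventionally written as

P(X_{ij} = 1 \mid \theta_i) = c_j + (1 - c_j)\,\frac{1}{1 + \exp\{-a_j(\theta_i - b_j)\}},

where \theta_i is the ability of examinee i and a_j, b_j, and c_j are the discrimination, difficulty, and pseudo-guessing parameters of item j; the pseudo-guessing parameter c_j is the one most often cited as requiring large samples to estimate stably.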
Abstract:
Hundreds of terabytes of CMS (Compact Muon Solenoid) data are being accumulated for storage day by day at the University of Nebraska-Lincoln, which is one of the eight US CMS Tier-2 sites. Managing this data includes retaining useful CMS data sets and clearing storage space for newly arriving data by deleting less useful data sets. This is an important task that is currently done manually, and it requires a large amount of time. The overall objective of this study was to develop a methodology to help identify the data sets to be deleted when there is a requirement for storage space. CMS data is stored using HDFS (Hadoop Distributed File System). HDFS logs give information regarding file access operations. Hadoop MapReduce was used to feed the information in these logs to Support Vector Machines (SVMs), a machine learning algorithm applicable to classification and regression, which is used in this thesis to develop a classifier. The time needed to classify data sets with this method depends on the size of the input HDFS log file, since the Hadoop MapReduce algorithms used here have O(n) complexity. The SVM methodology produces a list of data sets for deletion along with their respective sizes. This methodology was also compared with a heuristic called Retention Cost, which was calculated from the size of a data set and the time since its last access to help decide how useful that data set is. The accuracies of both were compared by calculating the percentage of data sets predicted for deletion that were accessed at a later time. Our methodology using SVMs proved to be more accurate than the Retention Cost heuristic. This methodology could be used to solve similar problems involving other large data sets.
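A minimal sketch of the classification step is shown below, assuming hypothetical features (data set size, days since last access, and a recent access count derived from the HDFS logs) and illustrative labels; the thesis's actual feature set, training data, and MapReduce preprocessing are not reproduced here.

# Minimal sketch using scikit-learn; features and labels are illustrative assumptions.
from sklearn import svm

# Each row: [dataset_size_gb, days_since_last_access, accesses_in_last_90_days]
X_train = [[500, 200, 0], [120, 5, 40], [900, 360, 1], [60, 2, 75]]
y_train = [1, 0, 1, 0]  # 1 = candidate for deletion, 0 = retain

clf = svm.SVC(kernel="rbf", gamma="scale")  # support vector classifier
clf.fit(X_train, y_train)

# Classify a previously unseen data set.
print(clf.predict([[700, 150, 2]]))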
Abstract:
Evaluations of measurement invariance provide essential construct validity evidence. However, the quality of such evidence is partly dependent upon the validity of the resulting statistical conclusions. The presence of Type I or Type II errors can render measurement invariance conclusions meaningless. The purpose of this study was to determine the effects of categorization and censoring on the behavior of the chi-square/likelihood ratio test statistic and two alternative fit indices (CFI and RMSEA) in the context of evaluating measurement invariance. Monte Carlo simulation was used to examine Type I error and power rates for (a) the overall test statistic/fit indices and (b) the change in the test statistic/fit indices. Data were generated according to a multiple-group single-factor CFA model across 40 conditions that varied by sample size, strength of item factor loadings, and categorization thresholds. Seven different combinations of model estimators (ML, Yuan-Bentler scaled ML, and WLSMV) and specified measurement scales (continuous, censored, and categorical) were used to analyze each of the simulation conditions. As hypothesized, non-normality increased Type I error rates for the continuous scale of measurement and did not affect error rates for the categorical scale of measurement. Maximum likelihood estimation combined with a categorical scale of measurement resulted in more correct statistical conclusions than the other analysis combinations. For the continuous and censored scales of measurement, the Yuan-Bentler scaled ML resulted in more correct conclusions than normal-theory ML. The censored measurement scale did not offer any advantages over the continuous measurement scale. Comparing across fit statistics and indices, the chi-square-based test statistics were preferred over the alternative fit indices, and ΔRMSEA was preferred over ΔCFI. Results from this study should be used to inform the modeling decisions of applied researchers. However, no single analysis combination can be recommended for all situations. Therefore, it is essential that researchers consider the context and purpose of their analyses.
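For context, the change-in-test-statistic criterion referred to above is typically the chi-square (likelihood ratio) difference test between nested invariance models,

\Delta\chi^2 = \chi^2_{\text{constrained}} - \chi^2_{\text{configural}}, \qquad \Delta df = df_{\text{constrained}} - df_{\text{configural}},

with \Delta\chi^2 referred to a chi-square distribution on \Delta df degrees of freedom (a scaled correction is required when robust estimators such as the Yuan-Bentler scaled ML are used); \Delta CFI and \Delta RMSEA are the analogous differences in the fit indices between the two models.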
Abstract:
Regression coefficients specify the partial effect of a regressor on the dependent variable. Sometimes the bivariate or limited multivariate relationship of that regressor variable with the dependent variable is known from population-level data. We show here that such population-level data can be used to reduce the variance and bias of estimates of those regression coefficients from sample survey data. The method of constrained MLE is used to achieve these improvements. Its statistical properties are first described. The method constrains the weighted sum of all the covariate-specific associations (partial effects) of the regressors on the dependent variable to equal the overall association of one or more regressors, where the latter is known exactly from the population data. We refer to those regressors whose bivariate or limited multivariate relationships with the dependent variable are constrained by population data as being "directly constrained." Our study investigates the improvements in the estimation of directly constrained variables as well as the improvements in the estimation of other regressor variables that may be correlated with the directly constrained variables, and thus "indirectly constrained" by the population data. The example application is to the marital fertility of black versus white women. The difference between white and black women’s rates of marital fertility, available from population-level data, gives the overall association of race with fertility. We show that the constrained MLE technique both provides a far more powerful statistical test of the partial effect of being black and purges the test of a bias that would otherwise distort the estimated magnitude of this effect. We find only trivial reductions, however, in the standard errors of the parameters for indirectly constrained regressors.
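Schematically (our notation, following the verbal description above rather than the authors' formulas), the estimator solves a constrained maximization of the sample log-likelihood,

\hat{\boldsymbol\beta} = \arg\max_{\boldsymbol\beta}\; \ell(\boldsymbol\beta \mid \text{sample data}) \quad \text{subject to} \quad \sum_{k} w_k\, \beta_k = B_{\text{pop}},

where the weights w_k and the known population quantity B_{\text{pop}} encode the overall (bivariate or limited multivariate) association that the population-level data supply for the directly constrained regressor(s).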
Abstract:
Drawing on longitudinal data from the Early Childhood Longitudinal Study, Kindergarten Class of 1998–1999, this study used IRT modeling to operationalize a measure of parental educational investments based on Lareau’s notion of concerted cultivation. It used multilevel piecewise growth models regressing children’s math and reading achievement from entry into kindergarten through the third grade on concerted cultivation and family context variables. The results indicate that educational investments are an important mediator of socioeconomic and racial/ethnic disparities, completely explaining the black-white reading gap at kindergarten entry and consistently explaining 20 percent to 60 percent and 30 percent to 50 percent of the black-white and Hispanic-white disparities in the growth parameters, respectively, and approximately 20 percent of the socioeconomic gradients. Notably, concerted cultivation played a more significant role in explaining racial/ethnic gaps in achievement than expected from Lareau’s discussion, which suggests that after socioeconomic background is controlled, concerted cultivation should not be implicated in racial/ethnic disparities in learning.
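A generic two-piece multilevel growth model of the kind described above (schematic only; the variable names and exact specification are ours, not the authors') has a within-child level

y_{ti} = \pi_{0i} + \pi_{1i}\, t_{1ti} + \pi_{2i}\, t_{2ti} + e_{ti},

where t_{1ti} and t_{2ti} code time within the two growth segments, and a between-child level such as

\pi_{pi} = \gamma_{p0} + \gamma_{p1}\,\text{CC}_i + \boldsymbol\gamma_{p2}'\mathbf{x}_i + u_{pi}, \qquad p = 0, 1, 2,

in which \text{CC}_i is the IRT-based concerted cultivation measure and \mathbf{x}_i collects the family context covariates; mediation is then assessed by how much the racial/ethnic and socioeconomic coefficients attenuate once \text{CC}_i enters the model.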