904 results for Context data


Relevância:

30.00%

Publicador:

Resumo:

Hundreds of terabytes of CMS (Compact Muon Solenoid) data are accumulated for storage every day at the University of Nebraska-Lincoln, one of the eight US CMS Tier-2 sites. Managing this data includes retaining useful CMS data sets and clearing storage space for newly arriving data by deleting less useful data sets. This important task is currently done manually and requires a large amount of time. The overall objective of this study was to develop a methodology to help identify the data sets to be deleted when storage space is needed. CMS data is stored using HDFS (Hadoop Distributed File System), and HDFS logs give information regarding file access operations. Hadoop MapReduce was used to feed the information in these logs to Support Vector Machines (SVMs), a machine learning algorithm for classification and regression that is used in this thesis to develop a classifier. The time taken to classify data sets with this method depends on the size of the input HDFS log file, since the Hadoop MapReduce algorithms used here have O(n) complexity. The SVM methodology produces a list of data sets for deletion along with their respective sizes. This methodology was also compared with a heuristic called Retention Cost, calculated from the size of a data set and the time since its last access, to help decide how useful a data set is. The accuracies of both were compared by calculating the percentage of data sets predicted for deletion that were accessed at a later time. Our SVM methodology proved more accurate than the Retention Cost heuristic and could be applied to similar problems involving other large data sets.
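The abstract does not give the exact form of the Retention Cost heuristic; a minimal sketch, assuming it is simply the product of data set size and time since last access (so large, rarely accessed data sets rank first for deletion), might look like this:

```python
# Hypothetical sketch of the Retention Cost heuristic described above.
# ASSUMPTION (not stated in the abstract): retention_cost is the product
# size_gb * days_since_access, so large, long-idle data sets rank first.

def retention_cost(size_gb: float, days_since_access: float) -> float:
    """Higher cost = stronger candidate for deletion (assumed form)."""
    return size_gb * days_since_access

def candidates_for_deletion(datasets, space_needed_gb):
    """Return data set names to delete, highest retention cost first,
    until the requested amount of space is freed."""
    ranked = sorted(datasets,
                    key=lambda d: retention_cost(d["size_gb"],
                                                 d["days_since_access"]),
                    reverse=True)
    chosen, freed = [], 0.0
    for d in ranked:
        if freed >= space_needed_gb:
            break
        chosen.append(d["name"])
        freed += d["size_gb"]
    return chosen, freed

datasets = [
    {"name": "ds_A", "size_gb": 500, "days_since_access": 200},
    {"name": "ds_B", "size_gb": 2000, "days_since_access": 5},
    {"name": "ds_C", "size_gb": 300, "days_since_access": 400},
]
names, freed = candidates_for_deletion(datasets, space_needed_gb=600)
print(names, freed)  # ['ds_C', 'ds_A'] 800.0
```

The SVM approach replaces this fixed ranking with a classifier learned from the HDFS access logs.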

Relevância:

30.00%

Publicador:

Resumo:

Evaluations of measurement invariance provide essential construct validity evidence. However, the quality of such evidence partly depends on the validity of the resulting statistical conclusions: Type I or Type II errors can render measurement invariance conclusions meaningless. The purpose of this study was to determine the effects of categorization and censoring on the behavior of the chi-square/likelihood ratio test statistic and two alternative fit indices (CFI and RMSEA) in the context of evaluating measurement invariance. Monte Carlo simulation was used to examine Type I error and power rates for (a) the overall test statistic/fit indices and (b) the change in test statistic/fit indices. Data were generated according to a multiple-group single-factor CFA model across 40 conditions that varied by sample size, strength of item factor loadings, and categorization thresholds. Seven combinations of model estimators (ML, Yuan-Bentler scaled ML, and WLSMV) and specified measurement scales (continuous, censored, and categorical) were used to analyze each simulation condition. As hypothesized, non-normality increased Type I error rates for the continuous scale of measurement but did not affect error rates for the categorical scale. Maximum likelihood estimation combined with a categorical scale of measurement yielded more correct statistical conclusions than the other analysis combinations. For the continuous and censored scales, Yuan-Bentler scaled ML yielded more correct conclusions than normal-theory ML. The censored measurement scale offered no advantage over the continuous scale. Comparing fit statistics and indices, the chi-square-based test statistics were preferred over the alternative fit indices, and ΔRMSEA was preferred over ΔCFI. These results should inform the modeling decisions of applied researchers; however, no single analysis combination can be recommended for all situations, so researchers must consider the context and purpose of their analyses.
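The core phenomenon, non-normality inflating the Type I error rate of a normal-theory test statistic, can be demonstrated far more simply than with the authors' CFA design. A minimal Monte Carlo sketch (not the study's actual simulation) using the normal-theory chi-square variance test:

```python
# Minimal Monte Carlo illustration (not the authors' CFA design) of how
# non-normality inflates the Type I error rate of a normal-theory test.
# We test H0: sigma^2 = 1 with the chi-square variance test, whose null
# distribution assumes normally distributed data.
import numpy as np
from scipy import stats

def type1_rate(sampler, n=50, reps=5000, alpha=0.05, seed=0):
    rng = np.random.default_rng(seed)
    crit_lo = stats.chi2.ppf(alpha / 2, df=n - 1)
    crit_hi = stats.chi2.ppf(1 - alpha / 2, df=n - 1)
    rejections = 0
    for _ in range(reps):
        x = sampler(rng, n)                 # data with true variance 1
        stat = (n - 1) * np.var(x, ddof=1)  # (n-1)s^2 / sigma0^2, sigma0^2 = 1
        rejections += stat < crit_lo or stat > crit_hi
    return rejections / reps

normal = lambda rng, n: rng.normal(0, 1, n)            # normal, variance 1
skewed = lambda rng, n: rng.exponential(1.0, n) - 1.0  # skewed, variance 1

print("normal data:", type1_rate(normal))  # near the nominal 0.05
print("skewed data:", type1_rate(skewed))  # well above 0.05
```

The same logic underlies the study's finding that normal-theory ML misbehaves with categorized/censored (hence non-normal) data, while robust or scale-appropriate estimators do not.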

Relevância:

30.00%

Publicador:

Resumo:

Regression coefficients specify the partial effect of a regressor on the dependent variable. Sometimes the bivariate or limited multivariate relationship of that regressor with the dependent variable is known from population-level data. We show here that such population-level data can be used to reduce the variance and bias of estimates of those regression coefficients from sample survey data. The method of constrained MLE is used to achieve these improvements, and its statistical properties are first described. The method constrains the weighted sum of all the covariate-specific associations (partial effects) of the regressors on the dependent variable to equal the overall association of one or more regressors, where the latter is known exactly from the population data. We refer to regressors whose bivariate or limited multivariate relationships with the dependent variable are constrained by population data as "directly constrained." Our study investigates the improvements in the estimation of directly constrained variables, as well as of other regressors that may be correlated with the directly constrained variables and thus "indirectly constrained" by the population data. The example application is the marital fertility of black versus white women. The difference between white and black women's rates of marital fertility, available from population-level data, gives the overall association of race with fertility. We show that the constrained MLE technique both provides a far more powerful statistical test of the partial effect of being black and purges the test of a bias that would otherwise distort the estimated magnitude of this effect. We find only trivial reductions, however, in the standard errors of the parameters for indirectly constrained regressors.
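For a Gaussian linear model, constrained MLE of the regression coefficients reduces to equality-constrained least squares. A minimal sketch of the general idea (a simplified constraint, not the paper's weighted-sum formulation) pins one coefficient to a value assumed known from population data, via the KKT system:

```python
# Illustrative sketch: for a Gaussian linear model, constrained MLE of the
# regression coefficients reduces to equality-constrained least squares.
# SIMPLIFICATION: one coefficient is pinned directly to a population value,
# rather than using the paper's weighted-sum constraint.
import numpy as np

def constrained_ls(X, y, c, d):
    """Minimize ||y - X b||^2 subject to c @ b = d (one linear constraint),
    by solving the KKT system for (b, lagrange multiplier)."""
    p = X.shape[1]
    K = np.zeros((p + 1, p + 1))
    K[:p, :p] = 2 * X.T @ X
    K[:p, p] = c
    K[p, :p] = c
    rhs = np.concatenate([2 * X.T @ y, [d]])
    sol = np.linalg.solve(K, rhs)
    return sol[:p]  # drop the Lagrange multiplier

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = X @ np.array([1.5, -0.7]) + rng.normal(size=200)

c = np.array([1.0, 0.0])            # constrain the first coefficient ...
b = constrained_ls(X, y, c, d=1.5)  # ... to its "population" value 1.5
print(b)  # first entry is exactly 1.5
```

The constrained fit has zero sampling variance for the directly constrained coefficient; any variance reduction for the second ("indirectly constrained") coefficient comes only through its correlation with the first, which is why the paper finds those gains to be small.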

Relevância:

30.00%

Publicador:

Resumo:

Drawing on longitudinal data from the Early Childhood Longitudinal Study, Kindergarten Class of 1998–1999, this study used IRT modeling to operationalize a measure of parental educational investments based on Lareau’s notion of concerted cultivation. It used multilevel piecewise growth models regressing children’s math and reading achievement from entry into kindergarten through the third grade on concerted cultivation and family context variables. The results indicate that educational investments are an important mediator of socioeconomic and racial/ethnic disparities, completely explaining the black-white reading gap at kindergarten entry and consistently explaining 20 percent to 60 percent and 30 percent to 50 percent of the black-white and Hispanic-white disparities in the growth parameters, respectively, and approximately 20 percent of the socioeconomic gradients. Notably, concerted cultivation played a more significant role in explaining racial/ethnic gaps in achievement than expected from Lareau’s discussion, which suggests that after socioeconomic background is controlled, concerted cultivation should not be implicated in racial/ethnic disparities in learning.

Relevância:

30.00%

Publicador:

Resumo:

Graduate Program in Information Science - FFC

Relevância:

30.00%

Publicador:

Resumo:

Stereotyped behaviors have been routinely used as characters for phylogeny inference, but the same cannot be said of the plastic aspects of performance, which are routinely taken as a result of ecological processes. In this paper we examine the evolution of one of these plastic behavioral phenotypes, thus fostering a bridge between ecological and evolutionary processes. Foraging behavior in spiders is context dependent in many aspects, since it varies with prey type and size, spider nutritional and developmental state, and previous experience, and, in web weavers, depends on the structure of the web. Reeling is a predatory tactic typical of cobweb weavers (Theridiidae), in which the spider moves the prey toward her by pulling the capture thread (gumfoot) to which it is adhered. Predatory reeling depends on the gumfoot for its expression and has not previously been reported in orbweavers. In order to investigate the evolution of this web-dependent behavior, we built artificial pseudogumfoot lines in orbwebs and recorded parameters of the predatory tactics in these modified webs. Aspects of the predatory tactics of 240 individuals (12 species in 4 families) were measured, and the resulting data were optimized on the phylogeny of Orbiculariae. All species performed predatory reeling with the pseudogumfoot lines; thus, predatory reeling is homologous for the whole Orbiculariae group. In nature, holes made by insects in ecribellate orbs produce pseudogumfoot lines (similar to our experimentally modified webs), and thus reeling occurs naturally in ecribellates. Nevertheless, outside lab conditions predatory reeling does not occur among cribellate orbweavers, so this behavior could not have been selected for in the cribellate ancestor of orbweavers. Cribellate spiders are flexible enough to produce novel, adaptive predatory responses (reeling) even when exposed for the first time to conditions outside their usual environment. Thus, the evolution of reeling suggests an alternative mechanism for the production of evolutionary novelties: the exploration of unusual ecological conditions and of the regular effects these unusual conditions have on phenotype expression.

Relevância:

30.00%

Publicador:

Resumo:

Background: A current challenge in gene annotation is to define gene function in the context of a network of relationships rather than of single genes. The inference of gene networks (GNs) has emerged as an approach to better understand the biology of the system and to study how the components of the network interact with each other and keep their functions stable. In general, however, there are not enough data to accurately recover GNs from expression levels alone, leading to the curse of dimensionality, in which the number of variables is higher than the number of samples. One way to mitigate this problem is to integrate other biological data instead of using only expression profiles in the inference process. The use of additional biological information in inference methods has increased significantly in order to better recover the connections between genes and reduce false positives. What makes this strategy interesting is the possibility of confirming known connections through the included biological data, and of discovering new relationships between genes when observing the expression data. Although several works on data integration have improved the performance of network inference methods, the real contribution of each type of biological information to the obtained improvement is not clear. Methods: We propose a methodology to include biological information in an inference algorithm in order to assess the prediction gain of using biological information and expression profiles together. We also evaluated and compared the gain from adding four types of biological information: (a) protein-protein interactions, (b) Rosetta stone fusion proteins, (c) KEGG and (d) KEGG+GO. Results and conclusions: This work presents a first comparison of the gain from using prior biological information in the inference of GNs for a eukaryote (P. falciparum). Our results indicate that information based on direct interaction can produce a greater improvement than data about less specific relationships such as GO or KEGG. Also, as expected, the results show that the use of biological information is a very important approach for improving the inference. We also compared the gain in the inference of the global network versus only the hubs; the results indicate that the use of biological information can improve the identification of the most connected proteins.
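The general idea of folding prior knowledge into expression-based inference can be sketched in a toy form (this is an illustration of the strategy, not the authors' algorithm): edge scores from expression correlation are boosted when a prior source, such as a protein-protein interaction list, supports the edge.

```python
# Toy sketch (not the authors' algorithm) of integrating prior biological
# knowledge into expression-based network inference: |correlation| edge
# scores are boosted when a prior source (e.g. a PPI list) supports them.
import numpy as np

def infer_edges(expr, prior_pairs, boost=0.2, threshold=0.6):
    """expr: genes x samples matrix; prior_pairs: set of (i, j) with i < j.
    Returns edges whose (possibly boosted) score passes the threshold."""
    corr = np.corrcoef(expr)
    n = expr.shape[0]
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            score = abs(corr[i, j])
            if (i, j) in prior_pairs:
                score = min(1.0, score + boost)  # prior evidence boosts edge
            if score >= threshold:
                edges.append((i, j, round(score, 3)))
    return edges

rng = np.random.default_rng(0)
base = rng.normal(size=20)
expr = np.vstack([
    base + rng.normal(scale=0.3, size=20),  # gene 0: co-expressed with gene 1
    base + rng.normal(scale=0.3, size=20),  # gene 1
    rng.normal(size=20),                    # gene 2: independent
])
print(infer_edges(expr, prior_pairs={(0, 1)}))
```

The boost mechanism captures the two effects the abstract describes: known connections are confirmed (pushed above threshold) by the prior, while new connections can still emerge from the expression data alone.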

Relevância:

30.00%

Publicador:

Resumo:

Objective: To conduct a cost-effectiveness analysis of a universal childhood hepatitis A vaccination program in Brazil. Methods: An age- and time-dependent dynamic model was developed to estimate the incidence of hepatitis A over 24 years. The analysis was run separately according to the pattern of regional endemicity: one for the South + Southeast (low endemicity) and one for the North + Northeast + Midwest (intermediate endemicity). The decision analysis model compared universal childhood vaccination with the current program of vaccinating high-risk individuals. Epidemiologic and cost estimates were based on data from a nationwide seroprevalence survey of viral hepatitis, primary data collection, National Health Information Systems and the literature. The analysis was conducted from both the health system and societal perspectives. Costs are expressed in 2008 Brazilian currency (Real). Results: A universal immunization program would have a significant impact on disease epidemiology in all regions, resulting, from a national perspective, in a 64% reduction in the number of cases of icteric hepatitis, a 59% reduction in deaths from the disease and a 62% decrease in life years lost. With a vaccine price of R$16.89 (US$7.23) per dose, vaccination against hepatitis A was a cost-saving strategy in the low and intermediate endemicity regions and in Brazil as a whole, from both the health system and societal perspectives. Results were most sensitive to the frequency of icteric hepatitis, ambulatory care costs and vaccine costs. Conclusions: A universal childhood vaccination program against hepatitis A could be a cost-saving strategy in all regions of Brazil. These results are useful for vaccine-related decisions by the Brazilian government and for monitoring population impact if the vaccine is included in the National Immunization Program. (C) 2012 Elsevier Ltd. All rights reserved.
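The "cost-saving" conclusion rests on a standard decision rule in cost-effectiveness analysis, illustrated below with hypothetical numbers (not the study's estimates): a strategy is cost-saving (dominant) when it both costs less and averts more burden than the comparator; otherwise an incremental cost-effectiveness ratio (ICER) is reported.

```python
# Standard cost-effectiveness decision rule, with hypothetical numbers
# (NOT the study's estimates): "cost-saving" means the new strategy costs
# less AND is more effective; otherwise report the ICER.

def compare(cost_new, effect_new, cost_old, effect_old):
    """Effects measured as life years saved (or cases averted)."""
    d_cost = cost_new - cost_old
    d_effect = effect_new - effect_old
    if d_cost < 0 and d_effect > 0:
        return "cost-saving"
    return f"ICER = {d_cost / d_effect:.2f} per unit of effect"

# Hypothetical: universal vaccination costs less overall (averted treatment
# costs exceed the added vaccine costs) and saves more life years.
print(compare(cost_new=90e6, effect_new=12000,
              cost_old=100e6, effect_old=7000))  # cost-saving
```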

Relevância:

30.00%

Publicador:

Resumo:

Background and purposes: Anti-aquaporin-4 antibodies are specific markers for Devic's disease. This study aimed to test whether this high specificity holds in the context of a large spectrum of systemic autoimmune and non-autoimmune diseases. Methods: Anti-aquaporin-4 antibodies (NMO-IgG) were determined by indirect immunofluorescence (IIF) on mouse cerebellum in 673 samples, as follows: group I (clinically defined Devic's disease, n = 47); group II [inflammatory/demyelinating central nervous system (CNS) diseases, n = 41]; group III (systemic and organ-specific autoimmune diseases, n = 250); group IV (chronic or acute viral diseases, n = 35); and group V (randomly selected samples from a general clinical laboratory, n = 300). Results: NMO-IgG was present in 40/47 patients with classic Devic's disease (85.1% sensitivity) and in 13/22 (59.1%) patients with disorders related to Devic's disease. The latter 13 positive samples had diagnoses of longitudinally extensive transverse myelitis (n = 10) and isolated idiopathic optic neuritis (n = 3). One patient with multiple sclerosis and none of the remaining 602 samples with autoimmune and miscellaneous diseases presented NMO-IgG (99.8% specificity). The autoimmune disease subset included five systemic lupus erythematosus individuals with isolated or combined optic neuritis and myelitis and four primary Sjogren's syndrome (SS) patients with cranial/peripheral neuropathy. Conclusions: The available data clearly point to the high specificity of anti-aquaporin-4 antibodies for Devic's disease and related syndromes, also in the context of miscellaneous non-neurologic autoimmune and non-autoimmune disorders.
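The reported sensitivity and specificity follow directly from the counts in the abstract: 40 of 47 classic Devic's disease patients tested positive, and only 1 of the 603 non-Devic's samples (the one multiple sclerosis patient) did.

```python
# Sensitivity and specificity recomputed from the counts in the abstract:
# 40/47 Devic's disease patients positive; 1 false positive (a multiple
# sclerosis patient) among 603 non-Devic's samples.

def sensitivity(true_pos, condition_pos):
    return true_pos / condition_pos

def specificity(true_neg, condition_neg):
    return true_neg / condition_neg

sens = sensitivity(40, 47)    # classic Devic's disease group
spec = specificity(602, 603)  # 602 true negatives of 603 controls
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
# sensitivity = 85.1%, specificity = 99.8%
```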

Relevância:

30.00%

Publicador:

Resumo:

The Hubble constant, H_0, sets the scale of the size and age of the Universe, and its determination from independent methods is still worth investigating. In this article, using the Sunyaev-Zeldovich effect and X-ray surface brightness data from 38 galaxy clusters observed by Bonamente et al. (Astrophys J 647:25, 2006), we obtain a new estimate of H_0 in the context of a flat ΛCDM model. There is a degeneracy with the mass density parameter (Ω_m), which is broken by a joint analysis involving the baryon acoustic oscillations (BAO) as measured by the Sloan Digital Sky Survey; this works because the BAO signature does not depend on H_0. Our basic finding is that the joint analysis yields H_0 = 76.5 (+3.35/-3.33) km/s/Mpc and Ω_m = 0.27 (+0.03/-0.02). Since the hypothesis of spherical geometry assumed by Bonamente et al. is questionable, we also compare these results with a recent work in which a sample of galaxy clusters described by an elliptical profile was analyzed.

Relevância:

30.00%

Publicador:

Resumo:

Aedes aegypti is the most important vector of dengue viruses in tropical and subtropical regions. Because vaccines are still under development, dengue prevention depends primarily on vector control. Population genetics is a common approach in research involving Ae. aegypti. In the context of medical entomology, wing morphometric analysis has been proposed as a strong and low-cost complementary tool for investigating population structure. Therefore, we comparatively evaluated the genetic and phenotypic variability of population samples of Ae. aegypti from four sampling sites in the metropolitan area of Sao Paulo city, Brazil. The distances between the sites ranged from 7.1 to 50 km. This area, where knowledge on the population genetics of this mosquito is incipient, was chosen due to the thousands of dengue cases registered yearly. The analysed loci were polymorphic, and they revealed population structure (global F-ST = 0.062; p < 0.05) and low levels of gene flow (Nm = 0.47) between the four locations. Principal component and discriminant analyses of wing shape variables (18 landmarks) demonstrated that wing polymorphisms were only slightly more common between populations than within populations. Whereas microsatellites allowed for geographic differentiation, wing geometry failed to distinguish the samples. These data suggest that microevolution in this species may affect genetic and morphological characters to different degrees. In this case, wing shape was not validated as a marker for assessing population structure. According to the interpretation of a previous report, the wing shape of Ae. aegypti does not vary significantly because it is stabilised by selective pressure. (C) 2011 Elsevier B.V. All rights reserved.

Relevância:

30.00%

Publicador:

Resumo:

Context. The ESO public survey VISTA Variables in the Via Lactea (VVV) started in 2010. VVV targets 562 sq. deg in the Galactic bulge and an adjacent plane region and is expected to run for about five years. Aims. We describe the progress of the survey observations in the first observing season, the observing strategy, and the quality of the data obtained. Methods. The observations are carried out on the 4-m VISTA telescope in the ZYJHK_s filters. In addition to the multi-band imaging, the variability monitoring campaign in the K_s filter has started. Data reduction is carried out using the pipeline at the Cambridge Astronomical Survey Unit. The photometric and astrometric calibration is performed via the numerous 2MASS sources observed in each pointing. Results. The first data release contains the aperture photometry and astrometric catalogues for 348 individual pointings in the ZYJHK_s filters taken in the 2010 observing season. The typical image quality is ~0.9-1.0 arcsec. The stringent photometric and image quality requirements of the survey are satisfied in 100% of the JHK_s images in the disk area and 90% of the JHK_s images in the bulge area. The completeness in the Z and Y images is 84% in the disk and 40% in the bulge. The first season catalogues contain 1.28 x 10^8 stellar sources in the bulge and 1.68 x 10^8 in the disk area detected in at least one of the photometric bands. The combined, multi-band catalogues contain more than 1.63 x 10^8 stellar sources. About 10% of these are double detections because of overlapping adjacent pointings; these multiple detections are used to characterise the quality of the data. The images in the JHK_s bands typically extend ~4 mag deeper than 2MASS. The magnitude limit and photometric quality depend strongly on crowding in the inner Galactic regions. The astrometry for K_s = 15-18 mag has an rms of ~35-175 mas. Conclusions. The VVV Survey data products offer a unique dataset to map the stellar populations in the Galactic bulge and the adjacent plane, and provide an exciting new tool for the study of the structure, content, and star-formation history of our Galaxy, as well as for investigations of the newly discovered star clusters, star-forming regions in the disk, high proper motion stars, asteroids, planetary nebulae, and other interesting objects.

Relevância:

30.00%

Publicador:

Resumo:

Determination of the utility harmonic impedance based on measurements is a significant task for utility power-quality improvement and management. Compared with well-established, accurate invasive methods, noninvasive methods are more desirable since they work with natural variations of the loads connected to the point of common coupling (PCC), so that no intentional disturbance is needed. However, the accuracy of these methods has to be improved. In this context, this paper first points out that the critical problem of the noninvasive methods is how to select the measurements that can be used with confidence for utility harmonic impedance calculation. Then, this paper presents a new measurement technique based on complex-data least-squares regression, combined with two techniques of data selection. Simulation and field test results show that the proposed noninvasive method is practical and robust, so that it can be used with confidence to determine utility harmonic impedances.
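The complex-data least-squares core of such noninvasive methods can be sketched as follows (a minimal illustration only; the paper's full technique adds the two data-selection steps, which are not shown). Assuming the harmonic voltage phasor at the PCC follows V = Z * I + V0 plus noise, the utility harmonic impedance Z is the complex regression slope of V on I:

```python
# Minimal sketch of complex-data least-squares utility harmonic impedance
# estimation (the paper's data-selection techniques are omitted).
# Assumed model at the PCC: V = Z * I + V0 + noise, with complex phasors.
import numpy as np

rng = np.random.default_rng(2)
Z_true = 0.5 + 1.2j           # assumed utility harmonic impedance (ohms)
V0 = 0.8 * np.exp(1j * 0.3)   # assumed background harmonic voltage

# Natural load variation: complex harmonic current measurements
I = rng.normal(1, 0.3, 200) * np.exp(1j * rng.uniform(0, 2 * np.pi, 200))
noise = 0.02 * (rng.normal(size=200) + 1j * rng.normal(size=200))
V = Z_true * I + V0 + noise

# Complex least squares: regress V on [I, 1] to recover slope Z and offset V0
A = np.column_stack([I, np.ones_like(I)])
coef, *_ = np.linalg.lstsq(A, V, rcond=None)
print("Z estimate:", coef[0])  # close to 0.5+1.2j
```

In practice the background voltage V0 is not constant, which is exactly why measurement selection is the critical problem the paper addresses.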

Relevância:

30.00%

Publicador:

Resumo:

Inclusive business is a term currently used to explain the organizations that aim to solve social problems with efficiency and financial sustainability by means of market mechanisms. It can be said that inclusive businesses are those targeted at generating employment and income for groups with little or no market mobility, in keeping with the standards of so-called "decent jobs" and in a self-sustaining manner, i.e., generating profit for the enterprises, and establishing relationships with typical business organizations as suppliers of products and services or in the distribution of this type of production. This article discusses the different concepts found in the scientific literature on inclusive businesses. It also analyses data from a survey conducted with the audiences of Social Corporate Responsibility seminars held by FIEMG. This analysis reveals that prospects, risks and idealizations similar to those found in inclusive business theories can also be found among individuals that run social corporate responsibility projects, even if this designation is new for them. The connection between companies and poverty, especially in relation to inclusive businesses, seems full of stumbling blocks and traps in the Brazilian context.

Relevância:

30.00%

Publicador:

Resumo:

Industrial recurrent event data, where an event of interest can be observed more than once in a single sample unit, arise in several areas, such as engineering, manufacturing and industrial reliability. Such data provide information about the number of events, the time to their occurrence and also their costs. Nelson (1995) presents a methodology to obtain asymptotic confidence intervals for the cost and the number of cumulative recurrent events. Although this is a standard procedure, it can perform poorly in some situations, in particular when the available sample size is small. In this context, computer-intensive methods such as the bootstrap can be used to construct confidence intervals. In this paper, we propose a technique based on the bootstrap method to obtain interval estimates for the cost and the number of cumulative events. One advantage of the proposed methodology is its applicability in several areas and its easy computational implementation. In addition, according to our Monte Carlo simulations, it can be a better alternative than asymptotic-based methods for calculating confidence intervals. An example from the engineering area illustrates the methodology.
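The bootstrap idea behind such interval estimates can be sketched in a few lines (a generic percentile bootstrap on hypothetical per-unit costs, not the paper's exact procedure for cumulative cost functions):

```python
# Minimal sketch of a percentile bootstrap confidence interval for a mean
# cost, as an alternative to asymptotic (normal-approximation) intervals
# when the sample is small. Data are hypothetical per-unit repair costs.
import numpy as np

def bootstrap_ci(data, stat=np.mean, n_boot=5000, alpha=0.05, seed=0):
    rng = np.random.default_rng(seed)
    n = len(data)
    boots = np.array([stat(rng.choice(data, size=n, replace=True))
                      for _ in range(n_boot)])
    return np.percentile(boots, [100 * alpha / 2, 100 * (1 - alpha / 2)])

# Hypothetical cumulative repair costs for a small sample of units
costs = np.array([120.0, 95.0, 210.0, 60.0, 330.0, 150.0, 88.0, 40.0])
lo, hi = bootstrap_ci(costs)
print(f"95% bootstrap CI for mean cost: [{lo:.1f}, {hi:.1f}]")
```

Because the interval is read off the resampled distribution itself, it needs no large-sample normality assumption, which is the advantage over the asymptotic intervals of Nelson (1995) when the sample is small.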