962 results for variable data printing


Relevance:

40.00%

Publisher:

Abstract:

Visualization has proven to be a powerful and widely applicable tool for the analysis and interpretation of data. Most visualization algorithms aim to find a projection from the data space down to a two-dimensional visualization space. However, for complex data sets living in a high-dimensional space it is unlikely that a single two-dimensional projection can reveal all of the interesting structure. We therefore introduce a hierarchical visualization algorithm which allows the complete data set to be visualized at the top level, with clusters and sub-clusters of data points visualized at deeper levels. The algorithm is based on a hierarchical mixture of latent variable models, whose parameters are estimated using the expectation-maximization algorithm. We demonstrate the principle of the approach first on a toy data set, and then apply the algorithm to the visualization of a synthetic data set in 12 dimensions obtained from a simulation of multi-phase flows in oil pipelines and to data in 36 dimensions derived from satellite images.

Relevance:

40.00%

Publisher:

Abstract:

There may be circumstances where it is necessary for microbiologists to compare variances rather than means, e.g., in analysing data from experiments to determine whether a particular treatment alters the degree of variability or testing the assumption of homogeneity of variance prior to other statistical tests. All of the tests described in this Statnote have their limitations. Bartlett’s test may be too sensitive but Levene’s and the Brown-Forsythe tests also have problems. We would recommend the use of the variance-ratio test to compare two variances and the careful application of Bartlett’s test if there are more than two groups. Considering that these tests are not particularly robust, it should be remembered that the homogeneity of variance assumption is usually the least important of those considered when carrying out an ANOVA. If there is concern about this assumption, and especially if the other assumptions of the analysis are also not likely to be met, e.g., lack of normality or non-additivity of treatment effects, then it may be better either to transform the data or to carry out a non-parametric test on the data.
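The variance-ratio test recommended in this abstract can be illustrated with a minimal sketch. The function name `variance_ratio` and the two samples are hypothetical and are not taken from the Statnote; the sketch only computes the F statistic and its degrees of freedom, which would then be compared against an F table or a statistics package to obtain a p-value.

```python
from statistics import variance


def variance_ratio(sample_a, sample_b):
    """Return (F, df_numerator, df_denominator) for two samples.

    The larger sample variance is placed in the numerator, so F >= 1.
    Assumes both samples have at least two values and non-zero variance.
    """
    var_a = variance(sample_a)  # unbiased sample variance (n - 1 divisor)
    var_b = variance(sample_b)
    if var_a >= var_b:
        return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    return var_b / var_a, len(sample_b) - 1, len(sample_a) - 1


# Hypothetical measurements, invented purely for illustration.
control = [10.0, 12.0, 11.0, 13.0, 9.0]
treated = [8.0, 15.0, 11.0, 17.0, 4.0]

f_stat, df_num, df_den = variance_ratio(control, treated)
print(f"F = {f_stat:.2f} on ({df_num}, {df_den}) degrees of freedom")
```

Putting the larger variance in the numerator is a common convention for a two-tailed comparison; the critical value must then be chosen accordingly.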

Relevance:

40.00%

Publisher:

Abstract:

In a Data Envelopment Analysis model, some of the weights used to compute the efficiency of a unit can have zero or negligible value despite the importance of the corresponding input or output. This paper offers an approach to preventing inputs and outputs from being ignored in the DEA assessment under the multiple input and output VRS environment, building on an approach introduced in Allen and Thanassoulis (2004) for single input multiple output CRS cases. The proposed method is based on the idea of introducing unobserved DMUs created by adjusting input and output levels of certain observed relatively efficient DMUs, in a manner which reflects a combination of technical information and the decision maker's value judgements. In contrast to many alternative techniques used to constrain weights and/or improve envelopment in DEA, this approach allows one to impose local information on production trade-offs, which are in line with the general VRS technology. The suggested procedure is illustrated using real data. © 2011 Elsevier B.V. All rights reserved.

Relevance:

40.00%

Publisher:

Abstract:

Groundwater systems of different densities are often mathematically modeled to understand and predict environmental behavior such as seawater intrusion or submarine groundwater discharge. Additional data collection may be justified if it will cost-effectively aid in reducing the uncertainty of a model's prediction. The collection of salinity as well as temperature data could aid in reducing predictive uncertainty in a variable-density model. However, before numerical models can be created, rigorous testing of the modeling code needs to be completed. This research documents the benchmark testing of a new modeling code, SEAWAT Version 4. The benchmark problems include various combinations of density-dependent flow resulting from variations in concentration and temperature. The verified code, SEAWAT, was then applied to two different hydrological analyses to explore the capacity of a variable-density model to guide data collection. The first analysis tested a linear method to guide data collection by quantifying the contribution of different data types and locations toward reducing predictive uncertainty in a nonlinear variable-density flow and transport model. The relative contributions of temperature and concentration measurements, at different locations within a simulated carbonate platform, for predicting movement of the saltwater interface were assessed. Results from the method showed that concentration data had greater worth than temperature data in reducing predictive uncertainty in this case. Results also indicated that a linear method could be used to quantify data worth in a nonlinear model. The second hydrological analysis utilized a model to identify the transient response of the salinity, temperature, age, and amount of submarine groundwater discharge to changes in tidal ocean stage, seasonal temperature variations, and different types of geology. The model was compared to multiple kinds of data to (1) calibrate and verify the model, and (2) explore the potential for the model to be used to guide the collection of data using techniques such as electromagnetic resistivity, thermal imagery, and seepage meters. Results indicated that the model can give insight into submarine groundwater discharge and can be used to guide data collection.

Relevance:

30.00%

Publisher:

Abstract:

Negative-ion mode electrospray ionization, ESI(-), with Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR MS) was coupled to Partial Least Squares (PLS) regression and variable selection methods to estimate the total acid number (TAN) of Brazilian crude oil samples. Generally, ESI(-)-FT-ICR mass spectra present a resolving power of ca. 500,000 and a mass accuracy of less than 1 ppm, producing a data matrix containing over 5700 variables per sample. These variables correspond to heteroatom-containing species detected as deprotonated molecules, [M - H]⁻ ions, which are identified primarily as naphthenic acids, phenols and carbazole analog species. The TAN values for all samples ranged from 0.06 to 3.61 mg of KOH g⁻¹. To facilitate the spectral interpretation, three methods of variable selection were studied: variable importance in the projection (VIP), interval partial least squares (iPLS) and elimination of uninformative variables (UVE). The UVE method seems to be more appropriate for selecting important variables, reducing the number of variables to 183 and producing a root mean square error of prediction of 0.32 mg of KOH g⁻¹. By reducing the size of the data, it was possible to relate the selected variables to their corresponding molecular formulas, thus identifying the main chemical species responsible for the TAN values.

Relevance:

30.00%

Publisher:

Abstract:

Obstructive sleep apnea syndrome has a high prevalence among adults. Cephalometric variables can be valuable for evaluating patients with this syndrome. The aim of this study was to correlate cephalometric data with the apnea-hypopnea index. We performed a retrospective, cross-sectional study that analyzed the cephalometric data of patients followed in the Sleep Disorders Outpatient Clinic of the Discipline of Otorhinolaryngology of a university hospital, from June 2007 to May 2012. Ninety-six patients were included, 45 men and 51 women, with a mean age of 50.3 years. A total of 11 patients had snoring, 20 had mild apnea, 26 had moderate apnea, and 39 had severe apnea. The distance from the hyoid bone to the mandibular plane was the only variable that showed a statistically significant correlation with the apnea-hypopnea index. Cephalometric variables are useful tools for the understanding of obstructive sleep apnea syndrome. The distance from the hyoid bone to the mandibular plane showed a statistically significant correlation with the apnea-hypopnea index.

Relevance:

30.00%

Publisher:

Abstract:

The syndrome of resistance to thyroid hormone (RTH β) is an inherited disorder characterized by variable tissue hyposensitivity to 3,5,3′-L-triiodothyronine (T3), with persistent elevation of free-circulating T3 (FT3) and free thyroxine (FT4) levels in association with nonsuppressed serum thyrotropin (TSH). Clinical presentation is variable, and the molecular analysis of the THRB gene provides a shortcut to diagnosis. Here, we describe 2 cases in which RTH β was suspected on the basis of laboratory findings. The diagnosis was confirmed by direct THRB sequencing that revealed 2 novel mutations: the heterozygous p.Ala317Ser in subject 1 and the heterozygous p.Arg438Pro in subject 2. Both mutations were shown to be deleterious by SIFT, PolyPhen, and Align GV-GD predictive methods.

Relevance:

30.00%

Publisher:

Abstract:

The objective of this study was to estimate the regression calibration for the dietary data that were measured using the quantitative food frequency questionnaire (QFFQ) in the Natural History of HPV Infection in Men: the HIM Study in Brazil. A sample of 98 individuals from the HIM study answered one QFFQ and three 24-hour recalls (24HR) at interviews. The calibration was performed using linear regression analysis in which the 24HR was the dependent variable and the QFFQ was the independent variable. Age, body mass index, physical activity, income and schooling were used as adjustment variables in the models. The geometric means between the 24HR and the calibration-corrected QFFQ were statistically equal. The dispersion graphs between the instruments demonstrate increased correlation after making the correction, although there is greater dispersion of the points with worse explanatory power of the models. Identification of the regression calibration for the dietary data of the HIM study will make it possible to estimate the effect of the diet on HPV infection, corrected for the measurement error of the QFFQ.
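The calibration step described in this abstract, regressing the 24HR value on the QFFQ value and using the fitted line to correct the QFFQ, might be sketched as follows. This is a simplified illustration, not the study's model: the adjustment variables (age, body mass index, physical activity, income, schooling) are omitted, the helper `fit_line` is a hypothetical name, and the intake values are invented.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Invented nutrient intakes (arbitrary units) for four respondents.
qffq = [100.0, 200.0, 300.0, 400.0]        # independent variable
recall_24h = [90.0, 140.0, 190.0, 240.0]   # dependent variable (mean of recalls)

intercept, slope = fit_line(qffq, recall_24h)

# Calibration-corrected QFFQ: the value predicted on the 24HR scale.
calibrated = [intercept + slope * q for q in qffq]
print(intercept, slope, calibrated)
```

In a fuller version the covariates would enter the regression as additional predictors, which requires a multiple regression routine rather than this single-predictor fit.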

Relevance:

30.00%

Publisher:

Abstract:

Aims. In this work, we describe the pipeline for the fast supervised classification of light curves observed by the CoRoT exoplanet CCDs. We present the classification results obtained for the first four measured fields, which represent a one-year in-orbit operation. Methods. The basis of the adopted supervised classification methodology has been described in detail in a previous paper, as is its application to the OGLE database. Here, we present the modifications of the algorithms and of the training set to optimize the performance when applied to the CoRoT data. Results. Classification results are presented for the observed fields IRa01, SRc01, LRc01, and LRa01 of the CoRoT mission. Statistics on the number of variables and the number of objects per class are given and typical light curves of high-probability candidates are shown. We also report on new stellar variability types discovered in the CoRoT data. The full classification results are publicly available.

Relevance:

30.00%

Publisher:

Abstract:

Perturbative Quantum Chromodynamics (pQCD) predicts that the small-x gluons in the hadron wavefunction should form a Color Glass Condensate (CGC), whose universal properties are the same for nucleons and nuclei. Making use of the results in V.P. Goncalves, M.S. Kugeratski, M.V.T. Machado, F.S. Navarra, Phys. Lett. B643, 273 (2006), we study the behavior of the anomalous dimension in the saturation models as a function of the photon virtuality and of the scaling variable rQ_s, since the main difference among the known parameterizations is characterized by this quantity.

Relevance:

30.00%

Publisher:

Abstract:

Interval-censored survival data, in which the event of interest is not observed exactly but is only known to occur within some time interval, occur very frequently. In some situations, event times might be censored into different, possibly overlapping intervals of variable widths; however, in other situations, information is available for all units at the same observed visit time. In the latter cases, interval-censored data are termed grouped survival data. Here we present alternative approaches for analyzing interval-censored data. We illustrate these techniques using a survival data set involving mango tree lifetimes. This study is an example of grouped survival data.
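For grouped survival data, where every unit is inspected at the same visit times, one standard estimator is the life-table product of interval survival probabilities. The sketch below assumes this approach for illustration only; the abstract does not say which methods the authors used, the function name `life_table_survival` is hypothetical, the counts are invented, and withdrawals between visits are ignored.

```python
def life_table_survival(n_at_start, deaths_per_interval):
    """Estimate survival at the end of each visit interval.

    For each interval the discrete hazard is deaths / number at risk,
    and survival is the running product of (1 - hazard).
    """
    at_risk = n_at_start
    survival = 1.0
    curve = []
    for deaths in deaths_per_interval:
        if at_risk == 0:
            break  # nobody left to observe
        hazard = deaths / at_risk
        survival *= 1.0 - hazard
        curve.append(survival)
        at_risk -= deaths
    return curve


# Invented example: 100 trees inspected at three common visit times.
curve = life_table_survival(100, [10, 18, 36])
print([round(s, 2) for s in curve])
```

With no withdrawals, each estimate equals the fraction of units still alive at that visit, which gives a quick sanity check on the output.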

Relevance:

30.00%

Publisher:

Abstract:

The objective was to develop and test a procedure for applying variable rates of fertilizers and to evaluate yield response in coffee (Coffea arabica L.) with regard to the application of phosphorus and potassium. The work was conducted during the 2004 season in a 6.4 ha field located in central Sao Paulo state. Two treatments were applied with alternating strips of fixed and variable rates during the whole season: one following the fertilizing procedures recommended locally, and the other based on grid soil sampling. A prototype pneumatic fertilizer applicator was used, carrying two conveyor belts, one for each row. Harvesting was done with a commercial harvester equipped with a customized volumetric yield monitor, separating the two treatments. Data were analyzed based on geostatistics, correlations and regressions. The procedure proved to be feasible and effective. The area that received fertilizer applications at a variable rate showed a 34% yield increase compared to the area that received a fixed rate. The variable rate treatment resulted in a saving of 23% in phosphate fertilizer and a 13% increase in potassium fertilizer when compared to the fixed rate treatment. Yield in 2005, the year after the variable rate treatments, still showed a residual effect from treatments carried out during the previous cycle.

Relevance:

30.00%

Publisher:

Abstract:

This paper discusses a multi-layer feedforward (MLF) neural network incident detection model that was developed and evaluated using field data. In contrast to published neural network incident detection models which relied on simulated or limited field data for model development and testing, the model described in this paper was trained and tested on a real-world data set of 100 incidents. The model uses speed, flow and occupancy data measured at dual stations, averaged across all lanes and only from time interval t. The off-line performance of the model is reported under both incident and non-incident conditions. The incident detection performance of the model is reported based on a validation-test data set of 40 incidents that were independent of the 60 incidents used for training. The false alarm rates of the model are evaluated based on non-incident data that were collected from a freeway section which was video-taped for a period of 33 days. A comparative evaluation between the neural network model and the incident detection model in operation on Melbourne's freeways is also presented. The results of the comparative performance evaluation clearly demonstrate the substantial improvement in incident detection performance obtained by the neural network model. The paper also presents additional results that demonstrate how improvements in model performance can be achieved using variable decision thresholds. Finally, the model's fault-tolerance under conditions of corrupt or missing data is investigated and the impact of loop detector failure/malfunction on the performance of the trained model is evaluated and discussed. The results presented in this paper provide a comprehensive evaluation of the developed model and confirm that neural network models can provide fast and reliable incident detection on freeways. © 1997 Elsevier Science Ltd. All rights reserved.
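The trade-off behind a variable decision threshold can be illustrated with a small sketch: raising the threshold applied to the network output lowers the false alarm rate at the cost of the detection rate. The scores below are invented and the function name `detection_and_false_alarm_rates` is hypothetical; nothing here reproduces the paper's model or data.

```python
def detection_and_false_alarm_rates(incident_scores, normal_scores, threshold):
    """Return (detection rate, false alarm rate) for a given threshold.

    A score at or above the threshold is treated as an incident alarm.
    """
    detected = sum(score >= threshold for score in incident_scores)
    false_alarms = sum(score >= threshold for score in normal_scores)
    return detected / len(incident_scores), false_alarms / len(normal_scores)


# Invented network outputs in [0, 1] for illustration only.
incident_scores = [0.9, 0.8, 0.6, 0.4]      # intervals with a real incident
normal_scores = [0.1, 0.2, 0.3, 0.5, 0.7]   # incident-free intervals

for threshold in (0.5, 0.75):
    dr, far = detection_and_false_alarm_rates(
        incident_scores, normal_scores, threshold
    )
    print(f"threshold={threshold}: detection={dr:.2f}, false alarms={far:.2f}")
```

Sweeping the threshold over a range of values in this way traces out the operating curve from which a working point can be chosen.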