137 results for Massive Data


Relevance: 20.00%

Abstract:

Objectives: This study examines human scalp electroencephalographic (EEG) data for evidence of non-linear interdependence between posterior channels. The spectral and phase properties of those epochs of EEG exhibiting non-linear interdependence are studied. Methods: Scalp EEG data were collected from 40 healthy subjects. A technique for the detection of non-linear interdependence was applied to 2.048 s segments of posterior bipolar electrode data. Amplitude-adjusted phase-randomized surrogate data were used to statistically determine which EEG epochs exhibited non-linear interdependence. Results: Statistically significant evidence of non-linear interactions was found in 2.9% (eyes open) to 4.8% (eyes closed) of the epochs. In the eyes-open recordings, these epochs exhibited a peak in the spectral and cross-spectral density functions at about 10 Hz. Two types of EEG epochs are evident in the eyes-closed recordings; one type exhibits a peak in the spectral density and cross-spectrum at 8 Hz. The other type has increased spectral and cross-spectral power across faster frequencies. Epochs identified as exhibiting non-linear interdependence display a tendency towards phase interdependencies across and between a broad range of frequencies. Conclusions: Non-linear interdependence is detectable in a small number of multichannel EEG epochs, and makes a contribution to the alpha rhythm. Non-linear interdependence produces spatially distributed activity that exhibits phase synchronization between oscillations present at different frequencies. The possible physiological significance of these findings is discussed with reference to the dynamical properties of neural systems and the role of synchronous activity in the neocortex. (C) 2002 Elsevier Science Ireland Ltd. All rights reserved.
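The surrogate-data step in the Methods can be illustrated with a minimal sketch of the standard amplitude-adjusted Fourier transform (AAFT) construction. The abstract does not give the authors' implementation, so the function names, the use of NumPy, and the 512-sample epoch length below are assumptions made only for illustration.

```python
import numpy as np

def phase_randomize(x, rng):
    """Return a series with the same amplitude spectrum as x but random phases."""
    n = len(x)
    spectrum = np.fft.rfft(x)
    phases = rng.uniform(0.0, 2.0 * np.pi, len(spectrum))
    phases[0] = 0.0              # leave the mean (DC term) untouched
    if n % 2 == 0:
        phases[-1] = 0.0         # the Nyquist term must stay real
    return np.fft.irfft(spectrum * np.exp(1j * phases), n=n)

def aaft_surrogate(x, rng):
    """Amplitude-adjusted phase-randomized surrogate of a 1-D series x."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    ranks = np.argsort(np.argsort(x))
    # 1. Gaussian series reordered to share the rank structure of x.
    gauss = np.sort(rng.standard_normal(n))[ranks]
    # 2. Destroy non-linear structure by randomizing the Fourier phases.
    shuffled = phase_randomize(gauss, rng)
    # 3. Map the original amplitudes back onto the new rank ordering.
    new_ranks = np.argsort(np.argsort(shuffled))
    return np.sort(x)[new_ranks]

# Hypothetical epoch: 512 samples of a noisy oscillation (not real EEG).
rng = np.random.default_rng(0)
epoch = np.sin(np.linspace(0.0, 20.0, 512)) + 0.1 * rng.standard_normal(512)
surrogate = aaft_surrogate(epoch, rng)
```

Each surrogate is a reordering of the original samples, so it keeps the amplitude distribution exactly and the power spectrum approximately. An epoch would then be flagged only if its interdependence measure falls outside the range obtained from many such surrogates.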

Relevance: 20.00%

Abstract:

In many occupational safety interventions, the objective is to reduce the injury incidence as well as the mean claims cost once injury has occurred. The claims cost data within a period typically contain a large proportion of zero observations (no claim). The distribution thus comprises a point mass at 0 mixed with a non-degenerate parametric component. Essentially, the likelihood function can be factorized into two orthogonal components. These two components relate respectively to the effect of covariates on the incidence of claims and the magnitude of claims, given that claims are made. Furthermore, the longitudinal nature of the intervention inherently imposes some correlation among the observations. This paper introduces a zero-augmented gamma random effects model for analysing longitudinal data with many zeros. Adopting the generalized linear mixed model (GLMM) approach reduces the original problem to the fitting of two independent GLMMs. The method is applied to evaluate the effectiveness of a workplace risk assessment teams program, trialled within the cleaning services of a Western Australian public hospital.
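The factorization of the likelihood into an incidence part and a magnitude part can be sketched as follows. This is a simplified illustration without random effects or covariates, assuming a gamma density parameterized by its mean and shape; the function and parameter names are hypothetical and are not taken from the paper.

```python
import math

def zero_augmented_gamma_loglik(claims, p_claim, mean_cost, shape):
    """Return (incidence, magnitude) log-likelihood parts for claim costs.

    claims    : list of non-negative costs, 0 meaning "no claim"
    p_claim   : probability that a claim occurs (assumed constant here)
    mean_cost : mean of the gamma component for positive claims
    shape     : gamma shape parameter
    """
    incidence = 0.0   # Bernoulli part: claim versus no claim
    magnitude = 0.0   # gamma part: size of the claim, given a claim
    for y in claims:
        if y == 0:
            incidence += math.log(1.0 - p_claim)
        else:
            incidence += math.log(p_claim)
            magnitude += (shape * math.log(shape / mean_cost)
                          + (shape - 1.0) * math.log(y)
                          - shape * y / mean_cost
                          - math.lgamma(shape))
    return incidence, magnitude

# Hypothetical data: three periods without a claim and one claim of cost 2.0.
inc, mag = zero_augmented_gamma_loglik([0, 0, 0, 2.0], 0.25, 2.0, 1.0)
total_loglik = inc + mag
```

Because the total is a plain sum of the two parts and they share no parameters, each part can be maximized separately, which is the sense in which the problem reduces to fitting two independent models.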

Relevance: 20.00%

Abstract:

Binning and truncation of data are common in data analysis and machine learning. This paper addresses the problem of fitting mixture densities to multivariate binned and truncated data. The EM approach proposed by McLachlan and Jones (Biometrics, 44: 2, 571-578, 1988) for the univariate case is generalized to multivariate measurements. The multivariate solution requires the evaluation of multidimensional integrals over each bin at each iteration of the EM procedure. Naive implementation of the procedure can lead to computationally inefficient results. To reduce the computational cost a number of straightforward numerical techniques are proposed. Results on simulated data indicate that the proposed methods can achieve significant computational gains with no loss in the accuracy of the final parameter estimates. Furthermore, experimental results suggest that with a sufficient number of bins and data points it is possible to estimate the true underlying density almost as well as if the data were not binned. The paper concludes with a brief description of an application of this approach to diagnosis of iron deficiency anemia, in the context of binned and truncated bivariate measurements of volume and hemoglobin concentration from an individual's red blood cells.
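To make the role of the per-bin integrals concrete, here is a minimal sketch of the log-likelihood for binned and truncated data in the univariate case, where each bin integral reduces to a difference of normal distribution functions. It assumes a normal mixture and a multinomial likelihood over the observed bins; the names are hypothetical, and the multivariate case discussed in the abstract would replace the distribution-function differences with numerical multidimensional integration.

```python
import math

def normal_cdf(x, mean, sd):
    """Distribution function of a normal with the given mean and sd."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def bin_probability(lo, hi, weights, means, sds):
    """Integral of the mixture density over the bin [lo, hi]."""
    return sum(w * (normal_cdf(hi, m, s) - normal_cdf(lo, m, s))
               for w, m, s in zip(weights, means, sds))

def binned_truncated_loglik(edges, counts, weights, means, sds):
    """Multinomial log-likelihood (up to a constant) of the bin counts.

    Truncation is handled by renormalizing over the observed bins only.
    """
    probs = [bin_probability(edges[j], edges[j + 1], weights, means, sds)
             for j in range(len(counts))]
    total = sum(probs)
    return sum(n * math.log(p / total)
               for n, p in zip(counts, probs) if n > 0)

# Hypothetical example: two bins on [-1, 1], data outside that range unobserved.
loglik = binned_truncated_loglik([-1.0, 0.0, 1.0], [5, 5], [1.0], [0.0], [1.0])
```

Every EM iteration needs these bin integrals for each component, which is why their cost dominates once the bins become multidimensional.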

Relevance: 20.00%

Abstract:

Motivation: This paper introduces the software EMMIX-GENE that has been developed for the specific purpose of a model-based approach to the clustering of microarray expression data, in particular, of tissue samples on a very large number of genes. The latter is a nonstandard problem in parametric cluster analysis because the dimension of the feature space (the number of genes) is typically much greater than the number of tissues. A feasible approach is provided by first selecting a subset of the genes relevant for the clustering of the tissue samples by fitting mixtures of t distributions to rank the genes in order of increasing size of the likelihood ratio statistic for the test of one versus two components in the mixture model. The imposition of a threshold on the likelihood ratio statistic used in conjunction with a threshold on the size of a cluster allows the selection of a relevant set of genes. However, even this reduced set of genes will usually be too large for a normal mixture model to be fitted directly to the tissues, and so the use of mixtures of factor analyzers is exploited to reduce effectively the dimension of the feature space of genes. Results: The usefulness of the EMMIX-GENE approach for the clustering of tissue samples is demonstrated on two well-known data sets on colon and leukaemia tissues. For both data sets, relevant subsets of the genes are able to be selected that reveal interesting clusterings of the tissues that are either consistent with the external classification of the tissues or with background and biological knowledge of these sets.
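The gene-selection step can be sketched as a simple two-threshold filter. The sketch assumes the likelihood ratio statistic and the smallest cluster size have already been computed for each gene from the one- versus two-component fits; the function name, the data, and the threshold values are hypothetical and are not those used by EMMIX-GENE.

```python
def select_genes(lrt_stats, min_cluster_sizes, lrt_threshold, size_threshold):
    """Keep genes whose statistic and smallest cluster both pass a threshold.

    lrt_stats         : dict gene -> likelihood ratio statistic (1 vs 2 components)
    min_cluster_sizes : dict gene -> size of the smaller cluster in the 2-component fit
    Returns the retained genes in order of increasing statistic.
    """
    kept = [gene for gene, stat in lrt_stats.items()
            if stat > lrt_threshold and min_cluster_sizes[gene] >= size_threshold]
    return sorted(kept, key=lambda gene: lrt_stats[gene])

# Hypothetical statistics for four genes (illustrative values only).
stats = {"g1": 2.1, "g2": 15.4, "g3": 9.8, "g4": 30.2}
sizes = {"g1": 20, "g2": 12, "g3": 3, "g4": 9}
relevant = select_genes(stats, sizes, lrt_threshold=8.0, size_threshold=8)
```

The cluster-size condition guards against genes whose large statistic is driven by a handful of outlying tissues rather than a genuine grouping.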

Relevance: 20.00%

Abstract:

Genetic research on risk of alcohol, tobacco or drug dependence must make allowance for the partial overlap of risk-factors for initiation of use, and risk-factors for dependence or other outcomes in users. Except in the extreme cases where genetic and environmental risk-factors for initiation and dependence overlap completely or are uncorrelated, there is no consensus about how best to estimate the magnitude of genetic or environmental correlations between Initiation and Dependence in twin and family data. We explore by computer simulation the biases to estimates of genetic and environmental parameters caused by model misspecification when Initiation can only be defined as a binary variable. For plausible simulated parameter values, the two-stage genetic models that we consider yield estimates of genetic and environmental variances for Dependence that, although biased, are not very discrepant from the true values. However, estimates of genetic (or environmental) correlations between Initiation and Dependence may be seriously biased, and may differ markedly under different two-stage models. Such estimates may have little credibility unless external data favor selection of one particular model. These problems can be avoided if Initiation can be assessed as a multiple-category variable (e.g. never versus early-onset versus later-onset user), with at least two categories measurable in users at risk for dependence. Under these conditions, and under certain distributional assumptions, recovery of simulated genetic and environmental correlations becomes possible. Illustrative application of the model to Australian twin data on smoking confirmed substantial heritability of smoking persistence (42%) with minimal overlap with genetic influences on initiation.

Relevance: 20.00%

Abstract:

Observations of an insect's movement lead to theory on the insect's flight behaviour and the role of movement in the species' population dynamics. This theory leads to predictions of the way the population changes in time under different conditions. If a hypothesis on movement predicts a specific change in the population, then the hypothesis can be tested against observations of population change. Routine pest monitoring of agricultural crops provides a convenient source of data for studying movement into a region and among fields within a region. Examples of the use of statistical and computational methods for testing hypotheses with such data are presented. The types of questions that can be addressed with these methods and the limitations of pest monitoring data when used for this purpose are discussed. (C) 2002 Elsevier Science B.V. All rights reserved.

Relevance: 20.00%

Abstract:

Observational data collected in the Lake Tekapo hydro catchment of the Southern Alps in New Zealand are used to analyse the wind and temperature fields in the alpine lake basin during summertime fair weather conditions. Measurements from surface stations, pilot balloon and tethersonde soundings, Doppler sodar and an instrumented light aircraft provide evidence of multi-scale interacting wind systems, ranging from microscale slope winds to mesoscale coast-to-basin flows. Thermal forcing of the winds occurred due to differential heating as a consequence of orography and heterogeneous surface features, which is quantified by heat budget and pressure field analysis. The daytime vertical temperature structure was characterised by distinct layering. Features of particular interest are the formation of thermal internal boundary layers due to the lake-land discontinuity and the development of elevated mixed layers. The latter were generated by advective heating from the basin and valley sidewalls by slope winds and by a superimposed valley wind blowing from the basin over Lake Tekapo and up the tributary Godley Valley. Daytime heating in the basin and its tributary valleys caused the development of a strong horizontal temperature gradient between the basin atmosphere and that over the surrounding landscape, and hence the development of a mesoscale heat low over the basin. After noon, air from outside the basin started flowing over mountain saddles into the basin causing cooling in the lowest layers, whereas at ridge top height the horizontal air temperature gradient between inside and outside the basin continued to increase. In the early evening, a more massive intrusion of cold air caused rapid cooling and a transition to a rather uniform slightly stable stratification up to about 2000 m agl. 
The onset time of this rapid cooling varied by about 1-2 h between observation sites and was probably triggered by the decay of up-slope winds inside the basin, which previously countered the intrusion of air over the surrounding ridges. The intrusion of air from outside the basin continued until about midnight, when a northerly mountain wind from the Godley Valley became dominant. The results illustrate the extreme complexity that can be caused by the operation of thermal forcing processes at a wide range of spatial scales.

Relevance: 20.00%

Abstract:

Phase-equilibrium data and the liquidus for the system "MnO"-CaO-(Al2O3 + SiO2) at manganese-rich alloy saturation have been determined in the temperature range from 1423 to 1723 K. The results are presented in the form of a pseudoternary section "MnO"-CaO-(Al2O3 + SiO2) with an Al2O3/SiO2 weight ratio of 0.41. The following primary phases are present in the range of conditions investigated: 3Al2O3.2SiO2; SiO2; MnO.Al2O3.2SiO2; (Mn,Ca)O.SiO2; 2(Mn,Ca)O.SiO2; MnO.Al2O3; (Mn,Ca)O; alpha-2CaO.SiO2; alpha'-2CaO.SiO2; 2CaO.Al2O3.SiO2; CaO.SiO2; and CaO.Al2O3.2SiO2. The presence of alumina in this system is shown to have a significant effect on the liquidus compared to the system "MnO"-CaO-SiO2, leading to the stabilization of the anorthite and gehlenite phases.

Relevance: 20.00%

Abstract:

We report mineral chemistry, whole-rock major element compositions, and trace element analyses on Hole 735B samples drilled and selected during Leg 176. We discuss these data, together with Leg 176 shipboard data and Leg 118 sample data from the literature, in terms of primary igneous petrogenesis. Despite mineral compositional variation in a given sample, major constituent minerals in Hole 735B gabbroic rocks display good chemical equilibrium as shown by significant correlations among Mg# (= Mg/[Mg+Fe2+]) of olivine, clinopyroxene, and orthopyroxene and An (= Ca/[Ca+Na]) of plagioclase. This indicates that the mineral assemblages olivine + plagioclase in troctolite, plagioclase + clinopyroxene in gabbro, plagioclase + clinopyroxene + olivine in olivine gabbro, and plagioclase + clinopyroxene + olivine + orthopyroxene in gabbronorite, and so on, have all coprecipitated from their respective parental melts. Fe-Ti oxides (ilmenite and titanomagnetite), which are ubiquitous in most of these rocks, are not in chemical equilibrium with olivine, clinopyroxene, and plagioclase, but precipitated later at lower temperatures. Disseminated oxides in some samples may have precipitated from trapped Fe-Ti–rich melts. Oxides that concentrate along shear bands/zones may mark zones of melt coalescence/transport expelled from the cumulate sequence as a result of compaction or filter pressing. Bulk Hole 735B is of cumulate composition. The most primitive olivine, with Fo = 0.842, in Hole 735B suggests that the most primitive melt parental to Hole 735B lithologies must have Mg# ≤ 0.637, which is significantly less than Mg# = 0.714 of bulk Hole 735B.
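The step from olivine composition to melt Mg# can be sketched with the usual olivine-melt Fe-Mg exchange relation, Kd = (Fe/Mg) in olivine divided by (Fe/Mg) in melt. The abstract does not state which Kd was used; the value 0.33 below is an assumption chosen for illustration, as are the function names.

```python
def mg_number(mg, fe2):
    """Mg# = Mg / (Mg + Fe2+), from molar (cation) proportions."""
    return mg / (mg + fe2)

def melt_mg_number(fo, kd=0.33):
    """Mg# of a melt in Fe-Mg exchange equilibrium with olivine.

    fo : forsterite fraction of the olivine, Mg / (Mg + Fe2+)
    kd : assumed exchange coefficient (Fe/Mg)_olivine / (Fe/Mg)_melt;
         0.33 is an illustrative assumption, not a value from the abstract
    """
    fe_mg_olivine = (1.0 - fo) / fo
    fe_mg_melt = fe_mg_olivine / kd
    return 1.0 / (1.0 + fe_mg_melt)

# Most primitive olivine reported for Hole 735B.
parental_mg_number = melt_mg_number(0.842)
```

With the assumed Kd of 0.33 this gives a melt Mg# of about 0.637, in line with the bound quoted above. A smaller Kd would give a lower melt Mg#, and either way the result sits well below the bulk-hole value of 0.714, which is the basis for calling the bulk composition cumulate.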