920 results for Large Data
Abstract:
The fluctuations exhibited by the cross sections generated in a compound-nucleus reaction or, more generally, in a quantum-chaotic scattering process, when varying the excitation energy or another external parameter, are characterized by the width Γ_corr of the cross-section correlation function. Brink and Stephen [Phys. Lett. 5, 77 (1963)] proposed a method for its determination by simply counting the number of maxima featured by the cross sections as a function of the parameter under consideration. They stated that the product of the average number of maxima per unit energy range and Γ_corr is constant in the Ericson region of strongly overlapping resonances. We use the analogy between the scattering formalism for compound-nucleus reactions and for microwave resonators to test this method experimentally with unprecedented accuracy using large data sets, and we propose an analytical description for the regions of isolated and overlapping resonances.
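A quick way to see the counting method at work is to simulate it. The toy model below (an illustration, not the microwave experiment of the paper) builds a cross section from many overlapping Breit-Wigner amplitudes with random couplings, counts its maxima per unit energy, and reads Γ_corr off the autocorrelation function; all resonance parameters are invented.

```python
import numpy as np

rng = np.random.default_rng(1)

def ericson_cross_section(E, n_res, gamma):
    """Toy Ericson cross section: modulus squared of a sum of many
    overlapping Breit-Wigner amplitudes with random complex couplings."""
    E_k = rng.uniform(E[0], E[-1], n_res)
    a_k = rng.normal(size=n_res) + 1j * rng.normal(size=n_res)
    amp = (a_k / (E[:, None] - E_k + 0.5j * gamma)).sum(axis=1)
    return np.abs(amp) ** 2

E = np.linspace(0.0, 50.0, 10000)
sigma = ericson_cross_section(E, n_res=250, gamma=1.0)   # Gamma/D = 5

# average number of maxima per unit energy
interior = sigma[1:-1]
n_max = ((interior > sigma[:-2]) & (interior > sigma[2:])).sum()
n_per_unit = n_max / (E[-1] - E[0])

# correlation width Gamma_corr: lag at which the (Lorentzian)
# autocorrelation falls to half of its zero-lag value
ds = sigma - sigma.mean()
spec = np.fft.rfft(ds, 2 * len(ds))
ac = np.fft.irfft(spec * spec.conj())[: len(ds)]
gamma_corr = (E[1] - E[0]) * np.argmax(ac < 0.5 * ac[0])

print(n_per_unit * gamma_corr)   # roughly constant in the Ericson regime
```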
Abstract:
During 11-12 August 2014, a Protein Bioinformatics and Community Resources Retreat was held at the Wellcome Trust Genome Campus in Hinxton, UK. This meeting brought together the principal investigators of several specialized protein resources (such as CAZy, TCDB and MEROPS) as well as those of protein databases from the large bioinformatics centres (including UniProt and RefSeq). The retreat was divided into five sessions: (1) key challenges, (2) the databases represented, (3) best practices for maintenance and curation, (4) information flow to and from large data centres and (5) communication and funding. An important outcome of this meeting was the creation of a Specialist Protein Resource Network that we believe will improve coordination of the activities of its member resources. We invite further protein database resources to join the network and continue the dialogue.
Abstract:
The connections between convexity and submodularity are explored for the purposes of minimizing and learning submodular set functions.
First, we develop a novel method for minimizing a particular class of submodular functions, which can be expressed as a sum of concave functions composed with modular functions. The basic algorithm uses an accelerated first-order method applied to a smoothed version of the function's convex extension. The smoothing algorithm is particularly novel in that it allows us to treat general concave potentials without needing to construct a piecewise linear approximation, as graph-based techniques do.
Second, we derive the general conditions under which it is possible to find a minimizer of a submodular function via a convex problem. This provides a framework for developing submodular minimization algorithms. The framework is then used to develop several algorithms that can be run in a distributed fashion. This is particularly useful for applications where the submodular objective function consists of a sum of many terms, each term dependent on a small part of a large data set.
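A much simplified illustration of both ideas is sketched below: a decomposable submodular function (a sum of concave-of-modular terms plus a modular part, all invented for the example) is minimized through its convex Lovász extension with a plain projected subgradient method. The accelerated smoothed solver of the thesis is replaced here by the simplest workable stand-in; since the extension of a sum is the sum of the per-term extensions, the per-term subgradients below are independent and could run on separate workers.

```python
import numpy as np

def lovasz_subgradient(F, x):
    """Greedy (Edmonds) subgradient of the Lovasz extension of a
    submodular set function F (with F(empty) = 0) at x in [0,1]^n."""
    order = np.argsort(-x)               # coordinates in decreasing order
    g, S, prev = np.zeros_like(x), [], 0.0
    for i in order:
        S.append(i)
        cur = F(S)
        g[i], prev = cur - prev, cur     # marginal value of element i
    return g

def minimize_sum(terms, n, steps=300):
    """Projected subgradient descent on the convex extension of
    F(S) = sum_i F_i(S); per-term subgradients are independent, so the
    inner sum is the part that could be distributed."""
    x = np.full(n, 0.5)
    for t in range(steps):
        g = sum(lovasz_subgradient(F_i, x) for F_i in terms)
        x = np.clip(x - 0.2 / np.sqrt(t + 1.0) * g, 0.0, 1.0)
    return {i for i in range(n) if x[i] > 0.5}   # round by thresholding

# toy objective: concave (sqrt) of modular weights, minus a modular term
rng = np.random.default_rng(0)
weights = rng.random((5, 10))
terms = [lambda S, w=w: np.sqrt(w[list(S)].sum()) - 0.3 * len(S)
         for w in weights]
print(minimize_sum(terms, n=10))
```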
Lastly, we approach the problem of learning set functions from an unorthodox perspective: sparse reconstruction. We demonstrate an explicit connection between the problem of learning set functions from random evaluations and that of recovering sparse signals. Based on the observation that the Fourier transform for set functions satisfies exactly the conditions needed for sparse reconstruction algorithms to work, we examine different function classes under which uniform reconstruction is possible.
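The observation translates directly into code: each random evaluation of a set function is a linear measurement of its Fourier (parity-basis) spectrum, so any standard sparse solver applies. Below, a basic orthogonal matching pursuit recovers an invented 5-sparse spectrum on a small ground set; all sizes and the target function are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8                               # ground set of 8 elements
N = 2 ** n                          # 256 Fourier (parity) coefficients

def parity(a, b):
    """Character (-1)^{|A intersect B|} for subsets encoded as bit masks."""
    return -1.0 if bin(a & b).count("1") % 2 else 1.0

# hypothetical set function with a 5-sparse Fourier spectrum
support = rng.choice(N, size=5, replace=False)
coeffs = rng.normal(size=5)
def F(S):
    return sum(c * parity(S, T) for c, T in zip(coeffs, support))

# m random evaluations of F = m linear measurements of the spectrum
m = 80
queries = rng.integers(0, N, size=m)
A = np.array([[parity(S, T) for T in range(N)] for S in queries])
y = np.array([F(S) for S in queries])

def omp(A, y, s):
    """Orthogonal matching pursuit for an s-sparse solution of A x = y."""
    idx, resid = [], y.astype(float)
    for _ in range(s):
        idx.append(int(np.argmax(np.abs(A.T @ resid))))
        sol, *_ = np.linalg.lstsq(A[:, idx], y, rcond=None)
        resid = y - A[:, idx] @ sol
    return sorted(idx)

print(omp(A, y, 5), sorted(support.tolist()))  # should typically coincide
```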
Abstract:
The anisotropy of 1.3-2.3 MeV protons in interplanetary space has been measured using the Caltech Electron/Isotope Spectrometer aboard IMP-7 for 317 six-hour periods from 72/273 to 74/2. Periods dominated by prompt solar particle events are not included. The convective and diffusive anisotropies are determined from the observed anisotropy using concurrent solar wind speed measurements and observed energy spectra. The diffusive flow of particles is found to be typically toward the sun, indicating a positive radial gradient in the particle density. This anisotropy is inconsistent with previously proposed sources of the low-energy proton increases seen at 1 AU, which involve continual solar acceleration.
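For readers who want the arithmetic behind this decomposition, a minimal sketch follows, assuming the textbook non-relativistic Compton-Getting form for the convective anisotropy, ξ_conv = (2γ + 2)(V_sw/v), for a differential intensity j ∝ E^(-γ); the diffusive anisotropy is then the vector difference ξ_diff = ξ_obs - ξ_conv. All numeric inputs in the example are hypothetical, not values from the paper.

```python
import numpy as np

def diffusive_anisotropy(xi_obs, V_sw, E_MeV, gamma=3.15):
    """Vector decomposition xi_diff = xi_obs - xi_conv in the ecliptic
    plane, with components (radial, tangential). Assumes the textbook
    non-relativistic Compton-Getting amplitude (2*gamma + 2) * V_sw / v,
    directed radially outward, for j ~ E^-gamma."""
    m_p_c2 = 938.272                              # proton rest energy, MeV
    c = 2.998e5                                   # speed of light, km/s
    v = c * np.sqrt(2.0 * E_MeV / m_p_c2)         # proton speed, km/s
    xi_conv = np.array([(2.0 * gamma + 2.0) * V_sw / v, 0.0])
    return np.asarray(xi_obs) - xi_conv

# hypothetical example: 1.8 MeV protons, 400 km/s wind, a 5% observed
# anisotropy pointing 30 degrees off the outward radial direction
xi_obs = 0.05 * np.array([np.cos(np.radians(30.0)), np.sin(np.radians(30.0))])
print(diffusive_anisotropy(xi_obs, V_sw=400.0, E_MeV=1.8))
# the radial component comes out negative, i.e. diffusive flow sunward
```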
The typical properties of this new component of low-energy cosmic rays have been determined for this period, which is near solar minimum. The particles have a median intensity of 0.06 protons/(cm²·s·sr·MeV) and a mean spectral index of -3.15. The amplitude of the diffusive anisotropy is approximately proportional to the solar wind speed. The rate at which particles are diffusing toward the sun is larger than the rate at which the solar wind is convecting the particles away from the sun. The 20 to 1 proton-to-alpha ratio typical of this new component has been reported by Mewaldt et al. (1975b).
A propagation model with κ_rr assumed independent of radius and energy is used to show that the anisotropy could be due to increases similar to those found by McDonald et al. (1975) at ~3 AU. The interplanetary Fermi-acceleration model proposed by Fisk (1976) to explain the increases seen near 3 AU is not consistent with the ~12 percent diffusive anisotropy found.
The dependence of the diffusive anisotropy on various parameters is shown. A strong dependence of the direction of the diffusive anisotropy on the concurrently measured magnetic field direction is found, indicating that κ_⊥ < κ_∥ is typical for this large data set.
Abstract:
The evoked response, a signal present in the electroencephalogram when specific sense modalities are stimulated with brief sensory inputs, has not yet revealed as much about brain function as it apparently promised when first recorded in the late 1940s. One of the problems has been recording the responses at a large number of points on the surface of the head; thus, in order to achieve greater spatial resolution than previously attained, a 50-channel recording system was designed to monitor experiments with human visually evoked responses.
Conventional voltage versus time plots of the responses were found inadequate as a means of making qualitative studies of such a large data space. This problem was solved by creating a graphical display of the responses in the form of equipotential maps of the activity at successive instants during the complete response. In order to ascertain the necessary complexity of any models of the responses, factor analytic procedures were used to show that models characterized by only five or six independent parameters could adequately represent the variability in all recording channels.
One type of equivalent source for the responses which meets these specifications is the electrostatic dipole. Two different dipole models were studied: the dipole in a homogeneous sphere, and the dipole in a sphere composed of two spherical shells (of different conductivities) concentric with and enclosing a homogeneous sphere of a third conductivity. These models were used to determine nonlinear least squares fits of dipole parameters to a given potential distribution on the surface of a spherical approximation to the head. Numerous tests of the procedures were conducted with problems having known solutions. After these theoretical studies demonstrated the applicability of the technique, the models were used to determine inverse solutions for the evoked response potentials at various times throughout the responses. It was found that reliable estimates of the location and strength of cortical activity were obtained, and that the two models differed only slightly in their inverse solutions. These techniques enabled information flow in the brain, as indicated by locations and strengths of active sites, to be followed throughout the evoked response.
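A minimal sketch of the inverse step follows, assuming the unbounded homogeneous-medium dipole potential as the forward model in place of the homogeneous-sphere and three-shell models used in the work (their closed forms are more involved); the electrode layout, conductivity and test dipole below are invented.

```python
import numpy as np
from scipy.optimize import least_squares

def dipole_potential(q, electrodes, sigma=0.33):
    """Potential of a current dipole at r0 with moment p in an unbounded
    homogeneous conductor: V = p.(r - r0) / (4 pi sigma |r - r0|^3)."""
    r0, p = q[:3], q[3:]
    d = electrodes - r0
    dist = np.linalg.norm(d, axis=1)
    return (d @ p) / (4.0 * np.pi * sigma * dist ** 3)

# synthetic problem with a known solution, mimicking the validation step
rng = np.random.default_rng(2)
electrodes = rng.normal(size=(50, 3))
electrodes /= np.linalg.norm(electrodes, axis=1, keepdims=True)  # unit sphere
electrodes[:, 2] = np.abs(electrodes[:, 2])     # upper hemisphere ("scalp")
true_q = np.array([0.0, 0.0, 0.6, 1.0, 0.0, 0.5])   # location, moment
V = dipole_potential(true_q, electrodes)

fit = least_squares(lambda q: dipole_potential(q, electrodes) - V,
                    x0=np.array([0.0, 0.0, 0.3, 0.0, 0.0, 1.0]))
print(fit.x)        # should land near true_q given a sensible start
```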
Abstract:
When carrying out a study in any area of knowledge, the more data one has available, the harder it becomes to extract useful knowledge from that database. The purpose of this work is to present some so-called intelligent tools for extracting knowledge from these large data repositories. Although the term has several connotations, in this work knowledge extraction from data repositories will be understood as the combined occurrence of certain data with a frequency and confidence considered interesting; that is, whenever a given datum or set of data appears in the repository with reasonable frequency, another datum or set of data will also appear. Running on repositories of georeferenced information about students of UERJ (Universidade do Estado do Rio de Janeiro), the results of two data extraction tools will be analysed, and possibilities for computational optimization of these tools will be presented.
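As a minimal sketch of the kind of rule just described, the brute-force snippet below computes support (frequency) and confidence for single-item rules; it is far simpler than the two tools evaluated in the work, and the toy transactions are invented.

```python
from itertools import combinations

def association_rules(transactions, min_support, min_confidence):
    """Brute-force 1 -> 1 association rules: if item x appears often
    enough (support), how often does item y appear with it (confidence)?"""
    n = len(transactions)
    items = {i for t in transactions for i in t}
    rules = []
    for a, b in combinations(sorted(items), 2):
        for x, y in ((a, b), (b, a)):
            support = sum(1 for t in transactions if x in t and y in t) / n
            base = sum(1 for t in transactions if x in t) / n
            if base and support >= min_support and support / base >= min_confidence:
                rules.append((x, y, support, support / base))
    return rules

# toy transactions (hypothetical student attributes)
print(association_rules(
    [{"campus_A", "night_shift"}, {"campus_A", "night_shift"},
     {"campus_B", "day_shift"}], min_support=0.5, min_confidence=0.8))
```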
Abstract:
Urban mobility is a problem in many urban centres and is aggravated by the growing number of automobiles and their indiscriminate use. This exploratory-descriptive study covers conceptual reviews and an extensive survey of data on the transport function; the automobile, its origins and its symbolism; the context of Brazil and of the city of Rio de Janeiro; and dependence on vehicles and the impacts of traffic on society, in order to explain the unsustainability of this mode of transport as it has been used in cities. Among the main impacts caused by automobile dependence are those relating to health, with problems ranging from respiratory and circulatory complications to compromised mental health; to quality of life and the relation between travel time and travel costs; to safety, with the whole technological apparatus of the automobile protecting its user to the detriment of the more vulnerable public, such as pedestrians and cyclists; to the morphology of the city, which ends up privileging an individual mode and creates new urban forms that demand ever more space for automobiles; to climate change, due to disproportionate pollution that disturbs the biochemical patterns of several ecosystems; and to economic losses, estimated by three different methodologies that sought to monetize the cost of congestion. The research proposes a range of measures to reverse or mitigate the excessive use of automobile transport. This contribution to transport geography studies aims to provide groundwork for advancing the debate on automobile dependence, especially in large cities.
Abstract:
The location of a flame front is often taken as the point of maximum OH gradient. Planar laser-induced fluorescence of OH can be used to obtain the flame front by extracting the points of maximum gradient. This operation is typically performed using an edge detection algorithm. The choice of operating parameters a priori poses significant problems of robustness when handling images with a range of signal-to-noise ratios. A statistical method of parameter selection originating in the image processing literature is detailed, and its merit for this application is demonstrated. A reduced search space method is proposed to decrease computational cost and render the technique viable for large data sets. This gives nearly identical output to the full method. These methods demonstrate substantial decreases in data rejection compared to the use of a priori parameters. These methods are viable for any application where maximum gradient contours must be accurately extracted from images of species or temperature, even at very low signal-to-noise ratios.
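As a rough sketch of the pipeline, the snippet below extracts edge contours with OpenCV's Canny detector, deriving the hysteresis thresholds per image from Otsu's statistic; this is a common data-driven heuristic standing in for the statistical selection method the paper actually develops.

```python
import cv2

def flame_front_contours(img_u8):
    """Maximum-OH-gradient contours from an 8-bit OH-PLIF image.
    Hysteresis thresholds are set per image from Otsu's statistic,
    a simple data-driven stand-in for the paper's selection method."""
    blurred = cv2.GaussianBlur(img_u8, (5, 5), 0)        # damp shot noise
    otsu, _ = cv2.threshold(blurred, 0, 255,
                            cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    edges = cv2.Canny(blurred, 0.5 * otsu, otsu)         # low, high thresholds
    contours, _ = cv2.findContours(edges, cv2.RETR_LIST,
                                   cv2.CHAIN_APPROX_NONE)
    return contours
```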
2D PIV measurements in the near field of grid turbulence using stitched fields from multiple cameras
Abstract:
We present measurements of grid turbulence using 2D particle image velocimetry taken immediately downstream from the grid at a Reynolds number of Re_M = 16,500, where M is the rod spacing. A long field of view of 14M × 4M in the down- and cross-stream directions was achieved by stitching multiple cameras together. Two uniform biplanar grids were selected to have the same M and pressure drop but different rod diameter D and cross-section. A large data set (10^4 vector fields) was obtained to ensure good convergence of second-order statistics. Estimates of the dissipation rate ε of turbulent kinetic energy (TKE) were found to be sensitive to the number of mean-squared velocity gradient terms included, and not to whether the turbulence was assumed to adhere to isotropy or axisymmetry. The resolution dependency of different turbulence statistics was assessed with a procedure that does not rely on the dissipation scale η. The streamwise evolution of the TKE components and ε was found to collapse across grids when the rod diameter was included in the normalisation. We argue that this should be the case between all regular grids when the other relevant dimensionless quantities are matched and the flow has become homogeneous across the stream. Two-point space correlation functions at x/M = 1 show evidence of complex wake interactions which exhibit a strong Reynolds number dependence. However, these changes in initial conditions disappear, indicating rapid cross-stream homogenisation. On the other hand, isotropy was, as expected, not found to be established by x/M = 12 for any case studied.
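For reference, the simplest of the dissipation estimates being compared, the one-term isotropic surrogate ε = 15ν⟨(∂u/∂x)²⟩, takes a few lines to compute from a PIV field; the abstract's point is that estimates using more of the measured gradient terms can differ appreciably from it.

```python
import numpy as np

def eps_isotropic(u, dx, nu):
    """One-term isotropic surrogate for the TKE dissipation rate from a
    2D PIV velocity component u(y, x) on a grid of spacing dx [m]:
    eps = 15 * nu * <(du/dx)^2>."""
    dudx = np.gradient(u, dx, axis=1)        # streamwise derivative
    return 15.0 * nu * np.mean(dudx ** 2)

# usage with hypothetical values: air at room temperature, 0.2 mm vectors
# eps = eps_isotropic(u_field, dx=2.0e-4, nu=1.5e-5)
```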
Abstract:
Understanding how and why changes propagate during engineering design is critical because most products and systems emerge from predecessors and not through clean-sheet design. This paper applies change propagation analysis methods and extends prior reasoning through examination of a large industrial data set of 41,500 change requests spanning 8 years during the design of a complex sensor system. Different methods are used to analyze the data, and the results are compared to each other and evaluated in the context of previous findings. In particular, the networks of connected parent, child and sibling changes are resolved over time and mapped to 46 subsystem areas. A normalized change propagation index (CPI) is then developed, showing the relative strength of each area on the absorber-multiplier spectrum between -1 and +1. Multipliers send out more changes than they receive and are good candidates for more focused change management. Another interesting finding is the quantitative confirmation of the "ripple" change pattern. Unlike the earlier prediction, however, it was found that the peak of cyclical change activity occurred late in the program, driven by systems integration and functional testing. Patterns emerged from the data and offer clear implications for technical change management approaches in system design.
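One plausible reading of such a normalized index (the paper's exact definition may differ) is the difference between the changes an area sends and receives, over their sum; this pins multipliers at +1 and absorbers at -1. A sketch with hypothetical subsystem areas:

```python
from collections import Counter

def change_propagation_index(links):
    """links: iterable of (parent_area, child_area) pairs, one per
    parent-child change connection. Returns a CPI per area in [-1, +1]:
    +1 = pure multiplier (only sends), -1 = pure absorber (only receives)."""
    sent, received = Counter(), Counter()
    for parent, child in links:
        sent[parent] += 1
        received[child] += 1
    areas = set(sent) | set(received)
    return {a: (sent[a] - received[a]) / (sent[a] + received[a])
            for a in areas}

# hypothetical example with three subsystem areas
links = [("optics", "housing"), ("optics", "software"),
         ("housing", "optics"), ("software", "housing")]
print(change_propagation_index(links))
```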
Abstract:
A common objective in learning a model from data is to recover its network structure, while the model parameters are of minor interest. For example, we may wish to recover regulatory networks from high-throughput data sources. In this paper we examine how Bayesian regularization using a Dirichlet prior over the model parameters affects the learned model structure in a domain with discrete variables. Surprisingly, a weak prior in the sense of smaller equivalent sample size leads to a strong regularization of the model structure (sparse graph) given a sufficiently large data set. In particular, the empty graph is obtained in the limit of a vanishing strength of prior belief. This is diametrically opposite to what one may expect in this limit, namely the complete graph from an (unregularized) maximum likelihood estimate. Since the prior affects the parameters as expected, the prior strength balances a "trade-off" between regularizing the parameters or the structure of the model. We demonstrate the benefits of optimizing this trade-off in the sense of predictive accuracy.
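The effect is easy to probe with the standard BDeu family score, whose Dirichlet hyperparameters are set by the equivalent sample size; the data set and variable sizes below are invented, and the printed gap between the one-edge and empty structures shrinks as the ESS decreases.

```python
import numpy as np
from scipy.special import gammaln

def bdeu_family_score(child, parents, data, arities, ess):
    """Log marginal likelihood of one child-given-parents family under a
    BDeu (uniform Dirichlet) prior with equivalent sample size `ess`.
    data: (N, n) integer array; arities: list of variable cardinalities."""
    r = arities[child]
    q = int(np.prod([arities[p] for p in parents])) if parents else 1
    a_j, a_jk = ess / q, ess / (q * r)
    pa_idx = np.zeros(len(data), dtype=int)       # parent configuration index
    for p in parents:
        pa_idx = pa_idx * arities[p] + data[:, p]
    score = 0.0
    for j in range(q):
        n_jk = np.bincount(data[pa_idx == j, child], minlength=r)
        score += (gammaln(a_j) - gammaln(a_j + n_jk.sum())
                  + np.sum(gammaln(a_jk + n_jk) - gammaln(a_jk)))
    return score

# y depends strongly on x; compare the scores of "x -> y" and "no edge"
rng = np.random.default_rng(0)
x = rng.integers(0, 2, 5000)
y = (x ^ (rng.random(5000) < 0.1)).astype(int)
data, arities = np.column_stack([x, y]), [2, 2]
for ess in (10.0, 1.0, 1e-4):
    gap = (bdeu_family_score(1, [0], data, arities, ess)
           - bdeu_family_score(1, [], data, arities, ess))
    print(ess, gap)   # the edge's advantage shrinks as the ESS decreases
```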
Abstract:
Barker, M. (2005) 'The Lord of the Rings and "identification": a critical encounter', European Journal of Communication, 20, 3, 353-378.
Sponsorship: This research was made possible by a grant from the Economic and Social Research Council (ESRC Grant No. 000-22-0323).
Abstract:
Cook, Anthony; Gibbens, M.J. (2006) 'Constructing Visual Taxonomies by Shape', 18th International Conference on Pattern Recognition (ICPR'06), Volume 2, pp. 732-735.