97 results for Data-Intensive Science

in Biblioteca Digital da Produção Intelectual da Universidade de São Paulo (BDPI/USP)


Relevance:

80.00%

Abstract:

A conceptual problem that appears in different contexts of clustering analysis is that of measuring the degree of compatibility between two sequences of numbers. This problem is usually addressed by means of numerical indexes referred to as sequence correlation indexes. This paper elaborates on why some specific sequence correlation indexes may not be good choices depending on the application scenario at hand. A variant of the Product-Moment correlation coefficient and a weighted formulation for the Goodman-Kruskal and Kendall's indexes are derived that may be more appropriate for some particular application scenarios. The proposed and existing indexes are analyzed from different perspectives, such as their sensitivity to the ranks and magnitudes of the sequences under evaluation, among other relevant aspects of the problem. The results help suggest scenarios within the context of clustering analysis that are possibly more appropriate for the application of each index. © 2008 Elsevier Inc. All rights reserved.
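To illustrate the distinction between sensitivity to magnitudes and sensitivity to ranks, here is a minimal sketch of two standard, unweighted indexes: the Product-Moment (Pearson) coefficient and the Goodman-Kruskal gamma. These are the textbook definitions, not the variant or weighted formulations derived in the paper, and the function names and sample sequences are illustrative assumptions.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Product-Moment correlation: sensitive to the magnitudes of the values."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def goodman_kruskal_gamma(x, y):
    """Goodman-Kruskal gamma: depends only on the relative order of pairs."""
    concordant = discordant = 0
    for (a1, b1), (a2, b2) in combinations(list(zip(x, y)), 2):
        sign = (a1 - a2) * (b1 - b2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # Tied pairs (sign == 0) are ignored by gamma.
    if concordant + discordant == 0:
        raise ValueError("gamma is undefined when every pair is tied")
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical sequences: same ordering, but one extreme magnitude in y.
x = [1, 2, 3, 4]
y = [1, 2, 3, 100]
print(goodman_kruskal_gamma(x, y))  # order is fully preserved
print(pearson(x, y))                # pulled below 1 by the outlying magnitude
```

On these sample sequences the rank-based index reports perfect agreement while the magnitude-based one does not, which is the kind of behavioral difference the abstract says should guide the choice of index.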

Relevance:

80.00%

Abstract:

Point placement strategies aim at mapping data points represented in higher dimensions to bi-dimensional spaces and are frequently used to visualize relationships amongst data instances. They have been valuable tools for analysis and exploration of data sets of various kinds. Many conventional techniques, however, do not behave well when the number of dimensions is high, such as in the case of document collections. Later approaches handle that shortcoming, but may cause too much clutter to allow flexible exploration to take place. In this work we present a novel hierarchical point placement technique that is capable of dealing with these problems. While good grouping and separation of data with high similarity is maintained without increasing computation cost, its hierarchical structure lends itself both to exploration in various levels of detail and to handling data in subsets, improving analysis capability and also allowing manipulation of larger data sets.

Relevance:

80.00%

Abstract:

We present a variable time step, fully adaptive in space, hybrid method for the accurate simulation of incompressible two-phase flows in the presence of surface tension in two dimensions. The method is based on the hybrid level set/front-tracking approach proposed in [H. D. Ceniceros and A. M. Roma, J. Comput. Phys., 205, 391-400, 2005]. Geometric, interfacial quantities are computed from front-tracking via the immersed-boundary setting while the signed distance (level set) function, which is evaluated fast and to machine precision, is used as a fluid indicator. The surface tension force is obtained by employing the mixed Eulerian/Lagrangian representation introduced in [S. Shin, S. I. Abdel-Khalik, V. Daru and D. Juric, J. Comput. Phys., 203, 493-516, 2005], whose success in greatly reducing parasitic currents has been demonstrated. The use of our accurate fluid indicator together with effective Lagrangian marker control enhances this parasitic current reduction by several orders of magnitude. To resolve sharp gradients and salient flow features accurately and efficiently, we employ dynamic, adaptive mesh refinements. This spatial adaptation is used in concert with a dynamic control of the distribution of the Lagrangian nodes along the fluid interface and a variable time step, linearly implicit time integration scheme. We present numerical examples designed to test the capabilities and performance of the proposed approach as well as three applications: the long-time evolution of a fluid interface undergoing Rayleigh-Taylor instability, an example of bubble ascending dynamics, and a drop impacting on a free interface whose dynamics we compare with both existing numerical and experimental data.

Relevance:

40.00%

Abstract:

We used environmental accounting to evaluate high-intensity clonal eucalyptus production in São Paulo, Brazil, converting inputs (environmental, material, and labor) to emergy units so ecological efficiency could be compared on a common basis. Input data were compiled under three pH management scenarios (lime, ash, and sludge). The dominant emergy input is environmental work (transpired water, ~58% of total emergy), followed by diesel (~15%); most purchased emergy is invested during harvest (41.8% of 7-year production totals). Where recycled materials are used for pH amendment (ash or sludge instead of lime), we observe marked improvements in ecological efficiency; lime (raw) yielded the highest unit emergy value (UEV = emergy per unit energy in the product = 9.6E+03 sej J⁻¹), whereas using sludge and ash (recycled) reduced the UEV to 8.9E+03 and 8.8E+03 sej J⁻¹, respectively. The emergy yield ratio was similarly affected, suggesting better ecological return on energy invested. Sensitivity of resource use to other operational modifications (e.g., decreased diesel, labor, or agrochemicals) was small (<3% change). Emergy synthesis permits comparison of sustainability among forest production systems globally. This eucalyptus scheme shows the highest ecological efficiency of analyzed pulp production operations (UEV range = 1.1 to 3.6E+04 sej J⁻¹) despite high operational intensity.

Relevance:

40.00%

Abstract:

Quantity and variety of environmental antigens, age, diet, vaccine protocols, exercising practice and mucosal cytokine microenvironment are factors that influence serum immunoglobulin (Ig) levels. IgA, IgG, IgG(T) and IgM were quantified in 60 horses, which were classified into two groups, 'intensive' or 'relaxed', according to the sanitary standards of the facilities and the physical exercise to which the animals were subjected. The 'intensive' group presented lower means for all isotypes, but only IgA presented a significant (P < 0.0064) difference when compared to the 'relaxed' group. This suggests that mucosal immunity found in the 'intensive' group is lower when compared to the 'relaxed' group. Our data suggest that athlete horses may be less poised to mount an effective mucosal immunity response to environmental challenges and should not be considered from the same perspective as a free-ranging horse.

Relevance:

30.00%

Abstract:

Diagnostic methods have been an important tool in regression analysis to detect anomalies, such as departures from error assumptions and the presence of outliers and influential observations in the fitted models. Assuming censored data, we considered a classical analysis and a Bayesian analysis assuming noninformative priors for the parameters of the model with a cure fraction. A Bayesian approach was considered by using Markov Chain Monte Carlo methods with Metropolis-Hastings algorithm steps to obtain the posterior summaries of interest. Some influence methods, such as the local influence, total local influence of an individual, local influence on predictions and generalized leverage were derived, analyzed and discussed in survival data with a cure fraction and covariates. The relevance of the approach was illustrated with a real data set, where it is shown that, by removing the most influential observations, the decision about which model best fits the data is changed.

Relevance:

30.00%

Abstract:

Background: The inherent complexity of statistical methods and clinical phenomena compels researchers with diverse domains of expertise to work in interdisciplinary teams, where none of them has complete knowledge of their counterparts' fields. As a result, knowledge exchange may often be characterized by miscommunication leading to misinterpretation, ultimately resulting in errors in research and even clinical practice. Although communication has a central role in interdisciplinary collaboration and miscommunication can have a negative impact on research processes, to the best of our knowledge no study has yet explored how data analysis specialists and clinical researchers communicate over time. Methods/Principal Findings: We conducted a qualitative analysis of encounters between clinical researchers and data analysis specialists (epidemiologist, clinical epidemiologist, and data mining specialist). These encounters were recorded and systematically analyzed using a grounded theory methodology for extraction of emerging themes, followed by data triangulation and analysis of negative cases for validation. A policy analysis was then performed using a system dynamics methodology looking for potential interventions to improve this process. Four major emerging themes were found. Definitions using lay language were frequently employed as a way to bridge the language gap between the specialties. Thought experiments presented a series of "what if" situations that helped clarify how the method or information from the other field would behave if exposed to alternative situations, ultimately aiding in explaining their main objective. Metaphors and analogies were used to translate concepts across fields, from the unfamiliar to the familiar. Prolepsis was used to anticipate study outcomes, thus helping specialists understand the current context based on an understanding of their final goal.
Conclusion/Significance: The communication between clinical researchers and data analysis specialists presents multiple challenges that can lead to errors.

Relevance:

30.00%

Abstract:

Background: High-density tiling arrays and new sequencing technologies are generating rapidly increasing volumes of transcriptome and protein-DNA interaction data. Visualization and exploration of this data is critical to understanding the regulatory logic encoded in the genome by which the cell dynamically affects its physiology and interacts with its environment. Results: The Gaggle Genome Browser is a cross-platform desktop program for interactively visualizing high-throughput data in the context of the genome. Important features include dynamic panning and zooming, keyword search and open interoperability through the Gaggle framework. Users may bookmark locations on the genome with descriptive annotations and share these bookmarks with other users. The program handles large sets of user-generated data using an in-process database and leverages the facilities of SQL and the R environment for importing and manipulating data. A key aspect of the Gaggle Genome Browser is interoperability. By connecting to the Gaggle framework, the genome browser joins a suite of interconnected bioinformatics tools for analysis and visualization with connectivity to major public repositories of sequences, interactions and pathways. To this flexible environment for exploring and combining data, the Gaggle Genome Browser adds the ability to visualize diverse types of data in relation to its coordinates on the genome. Conclusions: Genomic coordinates function as a common key by which disparate biological data types can be related to one another. In the Gaggle Genome Browser, heterogeneous data are joined by their location on the genome to create information-rich visualizations yielding insight into genome organization, transcription and its regulation and, ultimately, a better understanding of the mechanisms that enable the cell to dynamically respond to its environment.
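The idea of genomic coordinates acting as a common key can be sketched with an in-process SQL database. The example below uses Python's built-in sqlite3 module; the table names, column names, and sample rows are hypothetical and are not the Gaggle Genome Browser's actual schema or code. It only shows how two unrelated data types might be related through overlapping positions on the same sequence.

```python
import sqlite3


def mean_signal_per_gene(genes, signal):
    """Join gene annotations to measurements by overlapping genome coordinates."""
    conn = sqlite3.connect(":memory:")  # in-process database, nothing on disk
    conn.executescript(
        """
        CREATE TABLE genes  (name TEXT, seq TEXT, start_pos INTEGER, end_pos INTEGER);
        CREATE TABLE signal (seq TEXT, start_pos INTEGER, end_pos INTEGER, value REAL);
        """
    )
    conn.executemany("INSERT INTO genes VALUES (?, ?, ?, ?)", genes)
    conn.executemany("INSERT INTO signal VALUES (?, ?, ?, ?)", signal)
    rows = conn.execute(
        """
        SELECT g.name, AVG(s.value)
        FROM genes AS g
        JOIN signal AS s
          ON s.seq = g.seq
         AND s.start_pos <= g.end_pos   -- interval overlap test
         AND s.end_pos   >= g.start_pos
        GROUP BY g.name
        ORDER BY g.name
        """
    ).fetchall()
    conn.close()
    return rows


# Hypothetical annotations and tiling-array-style measurements.
genes = [("geneA", "chr", 100, 200), ("geneB", "chr", 500, 600)]
signal = [
    ("chr", 90, 110, 1.0),
    ("chr", 150, 160, 3.0),
    ("chr", 550, 560, 5.0),
    ("chr", 700, 710, 9.0),  # overlaps no gene, so it drops out of the join
]
print(mean_signal_per_gene(genes, signal))
```

The overlap condition is the only thing tying the two tables together, so any other coordinate-bearing data type could be added as a further table and joined the same way.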

Relevance:

30.00%

Abstract:

Melanoma is a highly aggressive and therapy-resistant tumor for which the identification of specific markers and therapeutic targets is highly desirable. We describe here the development and use of a bioinformatic pipeline tool, made publicly available under the name of EST2TSE, for the in silico detection of candidate genes with tissue-specific expression. Using this tool we mined the human EST (Expressed Sequence Tag) database for sequences derived exclusively from melanoma. We found 29 UniGene clusters of multiple ESTs with the potential to predict novel genes with melanoma-specific expression. Using a diverse panel of human tissues and cell lines, we validated the expression of a subset of three previously uncharacterized genes (clusters Hs.295012, Hs.518391, and Hs.559350) to be highly restricted to melanoma/melanocytes and named them RMEL1, 2 and 3, respectively. Expression analysis in nevi, primary melanomas, and metastatic melanomas revealed RMEL1 as a novel melanocytic lineage-specific gene up-regulated during melanoma development. RMEL2 expression was restricted to melanoma tissues and glioblastoma. RMEL3 showed strong up-regulation in nevi and was lost in metastatic tumors. Interestingly, we found correlations of RMEL2 and RMEL3 expression with improved patient outcome, suggesting tumor and/or metastasis suppressor functions for these genes. The three genes are composed of multiple exons and map to 2q12.2, 1q25.3, and 5q11.2, respectively. They are well conserved throughout primates, but not other genomes, and were predicted as having no coding potential, although primate-conserved and human-specific short ORFs could be found. Hairpin RNA secondary structures were also predicted. In conclusion, this work offers new melanoma-specific genes for future validation as prognostic markers or as targets for the development of therapeutic strategies to treat melanoma.

Relevance:

30.00%

Abstract:

Background: Without intensive selection, the majority of bovine oocytes submitted to in vitro embryo production (IVP) fail to develop to the blastocyst stage. This is attributed partly to their maturation status and competences. Using the Affymetrix GeneChip Bovine Genome Array, global mRNA expression analysis of immature (GV) and in vitro matured (IVM) bovine oocytes was carried out to characterize the transcriptome of bovine oocytes and then use a variety of approaches to determine whether the observed transcriptional changes during IVM were real or an artifact of the techniques used during analysis. Results: 8489 transcripts were detected across the two oocyte groups, of which ~25.0% (2117 transcripts) were differentially expressed (p < 0.001), corresponding to 589 over-expressed and 1528 under-expressed transcripts in the IVM oocytes compared to their immature counterparts. Over-expression of transcripts by IVM oocytes is particularly interesting; therefore, a variety of approaches were employed to determine whether the observed transcriptional changes during IVM were real or an artifact of the techniques used during analysis, including the analysis of transcript abundance in oocytes in vitro matured in the presence of α-amanitin. Subsets of the differentially expressed genes were also validated by quantitative real-time PCR (qPCR) and the gene expression data was classified according to gene ontology and pathway enrichment. Numerous cell cycle linked (CDC2, CDK5, CDK8, HSPA2, MAPK14, TXNL4B), molecular transport (STX5, STX17, SEC22A, SEC22B), and differentiation (NACA) related genes were found to be among the several over-expressed transcripts in GV oocytes compared to the matured counterparts, while ANXA1, PLAU, STC1 and LUM were among the over-expressed genes after oocyte maturation. Conclusion: Using sequential experiments, we have shown and confirmed transcriptional changes during oocyte maturation.
This dataset provides a unique reference resource for studies concerned with the molecular mechanisms controlling oocyte meiotic maturation in cattle, addresses the existing conflicting issue of transcription during meiotic maturation and contributes to the global goal of improving assisted reproductive technology.

Relevance:

30.00%

Abstract:

The A1763 superstructure at z = 0.23 contains the first galaxy filament to be directly detected using mid-infrared observations. Our previous work has shown that the frequency of starbursting galaxies, as characterized by 24 μm emission, is much higher within the filament than at either the center of the rich galaxy cluster or the field surrounding the system. New Very Large Array and XMM-Newton data are presented here. We use the radio and X-ray data to examine the fraction and location of active galaxies, both active galactic nuclei (AGNs) and starbursts (SBs). The radio far-infrared correlation, X-ray point source location, IRAC colors, and quasar positions are all used to gain an understanding of the presence of dominant AGNs. We find very few MIPS-selected galaxies that are clearly dominated by AGN activity. Most radio-selected members within the filament are SBs. Within the supercluster, three of eight spectroscopic members detected both in the radio and in the mid-infrared are radio-bright AGNs. They are found at or near the core of A1763. The five SBs are located further along the filament. We calculate the physical properties of the known wide angle tail (WAT) source which is the brightest cluster galaxy of A1763. A second double lobe source is found along the filament well outside of the virial radius of either cluster. The velocity offset of the WAT from the X-ray centroid and the bend of the WAT in the intracluster medium are both consistent with ram pressure stripping, indicative of streaming motions along the direction of the filament. We consider this as further evidence of the cluster-feeding nature of the galaxy filament.

Relevance:

30.00%

Abstract:

The HR Del nova remnant was observed with the IFU-GMOS at Gemini North. The spatially resolved spectral data cube was used in the kinematic, morphological, and abundance analysis of the ejecta. The line maps show a very clumpy shell with two main symmetric structures. The first one is the outer part of the shell seen in Hα, which forms two rings projected in the sky plane. These ring structures correspond to a closed hourglass shape, first proposed by Harman & O'Brien. The equatorial emission enhancement is caused by the superimposed hourglass structures in the line of sight. The second structure seen only in the [O III] and [N II] maps is located along the polar directions inside the hourglass structure. Abundance gradients between the polar caps and equatorial region were not found. However, the outer part of the shell seems to be less abundant in oxygen and nitrogen than the inner regions. Detailed 2.5-dimensional photoionization modeling of the three-dimensional shell was performed using the mass distribution inferred from the observations and the presence of mass clumps. The resulting model grids are used to constrain the physical properties of the shell as well as the central ionizing source. A sequence of three-dimensional clumpy models including a disk-shaped ionization source is able to reproduce the ionization gradients between polar and equatorial regions of the shell. Differences between shell axial ratios in different lines can also be explained by aspherical illumination. A total shell mass of 9 × 10⁻⁴ M⊙ is derived from these models. We estimate that 50%-70% of the shell mass is contained in neutral clumps with density contrast up to a factor of 30.

Relevance:

30.00%

Abstract:

We report on an intensive observational campaign carried out with HARPS at the 3.6 m telescope at La Silla on the star CoRoT-7. Additional simultaneous photometric measurements carried out with the Euler Swiss telescope have demonstrated that the observed radial velocity variations are dominated by rotational modulation from cool spots on the stellar surface. Several approaches were used to extract the radial velocity signal of the planet(s) from the stellar activity signal. First, a simple pre-whitening procedure was employed to find and subsequently remove periodic signals from the complex frequency structure of the radial velocity data. The dominant frequency in the power spectrum was found at 23 days, which corresponds to the rotation period of CoRoT-7. The 0.8535 day period of CoRoT-7b planetary candidate was detected with an amplitude of 3.3 m s⁻¹. Most other frequencies, some with amplitudes larger than the CoRoT-7b signal, are most likely associated with activity. A second approach used harmonic decomposition of the rotational period and up to the first three harmonics to filter out the activity signal from radial velocity variations caused by orbiting planets. After correcting the radial velocity data for activity, two periodic signals are detected: the CoRoT-7b transit period and a second one with a period of 3.69 days and an amplitude of 4 m s⁻¹. This second signal was also found in the pre-whitening analysis. We attribute the second signal to a second, more remote planet CoRoT-7c. The orbital solution of both planets is compatible with circular orbits. The mass of CoRoT-7b is 4.8 ± 0.8 M⊕ and that of CoRoT-7c is 8.4 ± 0.9 M⊕, assuming both planets are on coplanar orbits. We also investigated the false positive scenario of a blend by a faint stellar binary, and this may be rejected by the stability of the bisector on a nightly scale. According to their masses both planets belong to the super-Earth planet category.
The average density of CoRoT-7b is ρ = 5.6 ± 1.3 g cm⁻³, similar to the Earth. The CoRoT-7 planetary system provides us with the first insight into the physical nature of short period super-Earth planets recently detected by radial velocity surveys. These planets may be denser than Neptune and therefore likely made of rocks like the Earth, or a mix of water ice and rocks.
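As a rough consistency check on the quoted mass and density of CoRoT-7b, the implied planetary radius can be worked out under the assumption of a uniform sphere. The Earth mass and radius constants below are standard reference values supplied here, not taken from the abstract, and the function name is illustrative.

```python
from math import pi

# Reference values supplied for this sketch (not stated in the abstract).
EARTH_MASS_KG = 5.972e24
EARTH_RADIUS_M = 6.371e6


def radius_from_mass_density(mass_earth, density_g_cm3):
    """Radius (in Earth radii) of a uniform sphere with the given mass and density."""
    mass_kg = mass_earth * EARTH_MASS_KG
    density_kg_m3 = density_g_cm3 * 1000.0       # 1 g/cm^3 = 1000 kg/m^3
    volume_m3 = mass_kg / density_kg_m3
    radius_m = (3.0 * volume_m3 / (4.0 * pi)) ** (1.0 / 3.0)
    return radius_m / EARTH_RADIUS_M


# Central values quoted for CoRoT-7b: 4.8 Earth masses, 5.6 g/cm^3.
print(radius_from_mass_density(4.8, 5.6))  # roughly 1.7 Earth radii
```

The sketch uses only the central values; propagating the quoted uncertainties (± 0.8 in mass, ± 1.3 g cm⁻³ in density) would widen the result considerably.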

Relevance:

30.00%

Abstract:

Using the published KTeV samples of K_L → π±e∓ν and K_L → π±μ∓ν decays, we perform a reanalysis of the scalar and vector form factors based on the dispersive parametrization. We obtain phase-space integrals I_K^e = 0.15446 ± 0.00025 and I_K^μ = 0.10219 ± 0.00025. For the scalar form factor parametrization, the only free parameter is the normalized form factor value at the Callan-Treiman point (C); our best fit results in ln C = 0.1915 ± 0.0122. We also study the sensitivity of C to different parametrizations of the vector form factor. The results for the phase-space integrals and C are then used to make tests of the standard model. Finally, we compare our results with lattice QCD calculations of F_K/F_π and f_+(0).

Relevance:

30.00%

Abstract:

Tibolone is used for hormone replacement in postmenopausal women, and isotibolone is considered the major degradation product of tibolone. Isotibolone can also be present in tibolone API raw materials due to inadequate synthesis. Its presence must therefore be identified and quantified in the quality control of both API and drug products. In this work we present the indexing of an isotibolone X-ray diffraction pattern measured with synchrotron light (λ = 1.2407 Å) in the transmission mode. The characterization of the isotibolone sample by IR spectroscopy, elemental analysis, and thermal analysis is also presented. The isotibolone crystallographic data are a = 6.8066 Å, b = 20.7350 Å, c = 6.4489 Å, β = 76.428°, V = 884.75 Å³, space group P2₁, ρ_o = 1.187 g cm⁻³, Z = 2. © 2009 International Centre for Diffraction Data. [DOI: 10.1154/1.3257612]
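The reported cell volume can be cross-checked from the lattice parameters. Space group P2₁ is monoclinic, so the unit-cell volume follows from V = a·b·c·sin β. The short sketch below applies that standard formula to the values quoted in the abstract; the function name is illustrative.

```python
from math import radians, sin


def monoclinic_volume(a, b, c, beta_deg):
    """Unit-cell volume of a monoclinic cell: V = a * b * c * sin(beta)."""
    return a * b * c * sin(radians(beta_deg))


# Lattice parameters quoted in the abstract (lengths in angstroms, angle in degrees).
volume = monoclinic_volume(6.8066, 20.7350, 6.4489, 76.428)
print(f"{volume:.2f} cubic angstroms")  # to be compared with the reported 884.75
```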