25 results for Web Mining, Data Mining, User Topic Model, Web User Profiles
Abstract:
Background: High-density tiling arrays and new sequencing technologies are generating rapidly increasing volumes of transcriptome and protein-DNA interaction data. Visualization and exploration of this data is critical to understanding the regulatory logic encoded in the genome by which the cell dynamically affects its physiology and interacts with its environment. Results: The Gaggle Genome Browser is a cross-platform desktop program for interactively visualizing high-throughput data in the context of the genome. Important features include dynamic panning and zooming, keyword search and open interoperability through the Gaggle framework. Users may bookmark locations on the genome with descriptive annotations and share these bookmarks with other users. The program handles large sets of user-generated data using an in-process database and leverages the facilities of SQL and the R environment for importing and manipulating data. A key aspect of the Gaggle Genome Browser is interoperability. By connecting to the Gaggle framework, the genome browser joins a suite of interconnected bioinformatics tools for analysis and visualization with connectivity to major public repositories of sequences, interactions and pathways. To this flexible environment for exploring and combining data, the Gaggle Genome Browser adds the ability to visualize diverse types of data in relation to their coordinates on the genome. Conclusions: Genomic coordinates function as a common key by which disparate biological data types can be related to one another. In the Gaggle Genome Browser, heterogeneous data are joined by their location on the genome to create information-rich visualizations yielding insight into genome organization, transcription and its regulation and, ultimately, a better understanding of the mechanisms that enable the cell to dynamically respond to its environment.
A bivariate regression model for matched paired survival data: local influence and residual analysis
Abstract:
The use of bivariate distributions plays a fundamental role in survival and reliability studies. In this paper, we consider a location scale model for bivariate survival times based on the proposal of a copula to model the dependence of bivariate survival data. For the proposed model, we consider inferential procedures based on maximum likelihood. Gains in efficiency from bivariate models are also examined in the censored data setting. For different parameter settings, sample sizes and censoring percentages, various simulation studies are performed and compared to the performance of the bivariate regression model for matched paired survival data. Sensitivity analysis methods such as local and total influence are presented and derived under three perturbation schemes. The martingale marginal and the deviance marginal residual measures are used to check the adequacy of the model. Furthermore, we propose a new measure which we call modified deviance component residual. The methodology in the paper is illustrated on a lifetime data set for kidney patients.
Abstract:
Ion channels are pores formed by proteins and are responsible for carrying ion fluxes through cellular membranes. Ion channels can assume different conformational states, thereby controlling ion flow. Physically, the conformational transitions from one state to another are associated with energy barriers between them and depend on stimuli such as electrical fields, ligands, second messengers, etc. Several models have been proposed to describe the kinetics of ion channels. The classical Markovian model assumes that a future transition is independent of the time that the ion channel stayed in a previous state. Other models, such as the fractal and the chaotic models, assume that the rate of transitions between the states depends on the time that the ion channel stayed in a previous state. For the calcium-activated potassium channels of Leydig cells, R/S Hurst analysis has indicated that the channels are long-term correlated with a Hurst coefficient H of around 0.7, showing a persistent memory in their kinetics. Here, we applied R/S analysis to the opening and closing dwell time series obtained from data simulated with a chaotic model proposed by L. Liebovitch and T. Toth [J. Theor. Biol. 148, 243 (1991)], and we show that this chaotic model, or any model that treats the set of channel openings and closings as independent events, is inadequate to describe the long-term correlation (memory) already described for the experimental data. (C) 2008 American Institute of Physics.
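As an illustration of the R/S (rescaled range) analysis named in the abstract above, here is a minimal Python sketch. It is not the authors' code: the function names, the doubling window sizes, and the exponential dwell-time generator are assumptions chosen for illustration. The Hurst coefficient H is estimated as the slope of log(R/S) against log(window size); values near 0.5 suggest independent events, while values near 0.7 would indicate the persistent memory reported for the experimental data.

```python
import math
import random


def rescaled_range(window):
    """R/S of one window: range of cumulative deviations from the mean,
    divided by the (population) standard deviation. None if the window is constant."""
    n = len(window)
    mean = sum(window) / n
    cumulative = 0.0
    lowest = highest = 0.0
    for value in window:
        cumulative += value - mean
        lowest = min(lowest, cumulative)
        highest = max(highest, cumulative)
    std = math.sqrt(sum((v - mean) ** 2 for v in window) / n)
    return (highest - lowest) / std if std > 0 else None


def hurst_exponent(series, min_window=8):
    """Estimate H as the least-squares slope of log(mean R/S) versus log(window size).
    Window sizes double from min_window up to half the series length (an assumption)."""
    n = len(series)
    log_sizes, log_rs = [], []
    size = min_window
    while size <= n // 2:
        values = []
        for start in range(0, n - size + 1, size):
            rs = rescaled_range(series[start:start + size])
            if rs is not None and rs > 0:
                values.append(rs)
        if values:
            log_sizes.append(math.log(size))
            log_rs.append(math.log(sum(values) / len(values)))
        size *= 2
    mean_x = sum(log_sizes) / len(log_sizes)
    mean_y = sum(log_rs) / len(log_rs)
    sxx = sum((x - mean_x) ** 2 for x in log_sizes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_sizes, log_rs))
    return sxy / sxx


def simulate_memoryless_dwell_times(count, seed=0):
    """Hypothetical stand-in for dwell times: independent exponential durations."""
    rng = random.Random(seed)
    return [rng.expovariate(1.0) for _ in range(count)]
```

Applying `hurst_exponent` to `simulate_memoryless_dwell_times(4096)` gives a baseline for independent events against which a measured dwell-time series could be compared. Short windows are known to bias R/S estimates upward, so the baseline is typically somewhat above 0.5.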
Abstract:
The objective of this work is to develop an improved model of the human thermal system. The features included are important for solving real problems: 3D heat conduction, the use of elliptical cylinders to adequately approximate body geometry, the careful representation of tissues and important organs, and the flexibility of the computational implementation. The focus is on the passive system, which is composed of 15 cylindrical elements and includes heat transfer between large arteries and veins. The results of thermal neutrality and transient simulations are in excellent agreement with experimental data, indicating that the model adequately represents the behavior of the human thermal system. (C) 2009 Elsevier Ltd. All rights reserved.
Abstract:
This work proposes a refined technique for the extraction of the generation lifetime in single- and double-gate partially depleted SOI nMOSFETs. The model presented in this paper, based on the drain current switch-off transients, takes into account the influence of the laterally non-uniform channel doping, caused by the presence of the halo implanted region, and the amount of charge controlled by the drain and source junctions on the floating body effect when the channel length is reduced. The results obtained for single-gate (SG) devices are compared with two-dimensional numerical simulations and experimental data, extracted for devices fabricated in a 0.1 μm SOI CMOS technology, showing excellent agreement. The improved model to determine the generation lifetime in double-gate (DG) devices, beyond the considerations previously presented, also considers the influence of the silicon layer thickness on the drain current transient. The data extracted through the improved model for DG devices were compared with measurements and two-dimensional numerical simulations of the SG devices, also showing good agreement as the channel length is reduced and the same trend as the silicon layer thickness varies.
Abstract:
In this study, regression models are evaluated for grouped survival data when the effect of censoring time is considered in the model and the regression structure is modeled through four link functions. The methodology for grouped survival data is based on life tables, and the times are grouped in k intervals so that ties are eliminated. Thus, the data modeling is performed by considering the discrete models of lifetime regression. The model parameters are estimated by using the maximum likelihood and jackknife methods. To detect influential observations in the proposed models, diagnostic measures based on case deletion, termed global influence, and influence measures based on small perturbations in the data or in the model, referred to as local influence, are used. In addition to those measures, the local influence and the total influential estimate are also employed. Various simulation studies are performed to compare the performance of the four link functions of the regression models for grouped survival data for different parameter settings, sample sizes and numbers of intervals. Finally, a data set is analyzed by using the proposed regression models. (C) 2010 Elsevier B.V. All rights reserved.
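The jackknife estimation mentioned in the abstract above can be sketched generically in Python. This is a leave-one-out jackknife for an arbitrary estimator, not the grouped-survival model of the paper; the function names and the exponential-rate estimator used as a stand-in are assumptions for illustration only.

```python
import math


def jackknife(data, estimator):
    """Leave-one-out jackknife for any estimator taking a list of observations.
    Returns (full-sample estimate, bias-corrected estimate, standard error)."""
    n = len(data)
    full = estimator(data)
    # Re-estimate n times, each time deleting one observation.
    leave_one_out = [estimator(data[:i] + data[i + 1:]) for i in range(n)]
    loo_mean = sum(leave_one_out) / n
    bias_corrected = n * full - (n - 1) * loo_mean
    std_error = math.sqrt((n - 1) / n * sum((v - loo_mean) ** 2 for v in leave_one_out))
    return full, bias_corrected, std_error


def exponential_rate_mle(times):
    """Hypothetical stand-in estimator: maximum likelihood rate of an exponential lifetime."""
    return len(times) / sum(times)
```

For example, `jackknife([2.0, 0.5, 1.5, 3.0, 1.0], exponential_rate_mle)` returns the maximum likelihood rate together with its jackknife bias correction and standard error. The same case-deletion loop is also the basis of the global influence diagnostics the abstract describes.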
Abstract:
The present paper reports phase equilibrium experimental data for two systems composed of peanut oil or avocado seed oil + commercial oleic acid + ethanol + water at 298.2 K and different water contents in the solvent. The addition of water to the solvent reduces the loss of neutral oil in the alcoholic phase and improves the solvent selectivity. The experimental data were correlated by the NRTL and UNIQUAC models. The global deviations between calculated and experimental values were 0.63 % and 1.08 %, respectively, for the systems containing avocado seed oil. In the case of systems containing peanut oil, those deviations were 0.65 % and 0.98 %, respectively. Such results indicate that both models were able to correctly reproduce the experimental data, although the NRTL model performed better.
Abstract:
A time-efficient optical model is proposed for GATE simulation of a LYSO scintillation matrix coupled to a photomultiplier. The purpose is to avoid the excessively long computation time when activating the optical processes in GATE. The usefulness of the model is demonstrated by comparing the simulated and experimental energy spectra obtained with the dual planar head equipment for dosimetry with a positron emission tomograph (DoPET). The procedure to apply the model is divided into two steps. First, a simplified simulation of a single crystal element of DoPET is used to fit an analytic function that models the optical attenuation inside the crystal. In a second step, the model is employed to calculate the influence of this attenuation on the energy registered by the tomograph. The proposed optical model is around three orders of magnitude faster than a GATE simulation with optical processes enabled. Good agreement was found between the experimental and simulated data using the optical model. The results indicate that optical interactions inside the crystal elements play an important role in the energy resolution and induce a considerable degradation of the spectral information acquired by DoPET. Finally, the same approach employed by the proposed optical model could be useful to simulate a scintillation matrix coupled to a photomultiplier using a single or dual readout scheme.
Abstract:
The aim of this paper is to present an economical design of an X chart for a short-run production. The process mean starts equal to μ0 (in-control, State I) and at a random time it shifts to μ1 > μ0 (out-of-control, State II). The monitoring procedure consists of inspecting a single item out of every m items produced. If the measurement of the quality characteristic does not meet the control limits, the process is stopped, adjusted, and an additional (r - 1) items are inspected retrospectively. The probabilistic model was developed considering only shifts in the process mean. A direct search technique is applied to find the optimum parameters that minimize the expected cost function. Numerical examples illustrate the proposed procedure. (C) 2009 Elsevier B.V. All rights reserved.
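The monitoring procedure described in the abstract above can be sketched as a small Python simulation. Several details are assumptions not stated in the abstract: the shift occurs independently with a fixed probability per item, measurements are normally distributed, and the control limits are symmetric at μ0 ± k·σ. The retrospective inspection of (r - 1) items and the cost function are not modeled, and all parameter names are hypothetical.

```python
import random


def simulate_short_run(run_length, m, k, mu0, mu1, sigma, shift_prob, seed=0):
    """Simulate one short production run monitored by inspecting every m-th item.
    Returns (item index at which the chart signals, or None; whether the mean had shifted)."""
    rng = random.Random(seed)
    lower, upper = mu0 - k * sigma, mu0 + k * sigma  # assumed symmetric control limits
    shifted = False
    for item in range(1, run_length + 1):
        # Assumption: before each item, the mean may shift from mu0 to mu1.
        if not shifted and rng.random() < shift_prob:
            shifted = True
        if item % m == 0:
            measurement = rng.gauss(mu1 if shifted else mu0, sigma)
            if measurement < lower or measurement > upper:
                return item, shifted  # process is stopped for adjustment
    return None, shifted
```

Repeating such runs over a grid of (m, k) values and averaging a user-supplied cost would be one simple way to imitate a direct search, although the paper's probabilistic model presumably evaluates the expected cost analytically rather than by simulation.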
Abstract:
The object of this article is to estimate demand elasticities for a basket of staple foods important for providing the caloric needs of Brazilian households. These elasticities are useful in measuring the impact of structural reforms on poverty. A two-stage demand system was constructed, based on data from Household Expenditure Surveys (POF) produced by IBGE (The Brazilian Bureau of Statistics) in 1987/88 and 1995/96. We used panel data to estimate the model and calculated income, own-price, and cross-price elasticities for eight groups of goods and services and, in the second stage, for 11 subgroups of staple food products. We estimated those elasticities for the whole sample of consumers and for two income groups.
Abstract:
This paper addresses the investment decisions of 373 large Brazilian firms from 1997 to 2004 in the presence of financial constraints, using panel data. A Bayesian econometric model was used, with ridge regression to handle multicollinearity among the variables in the model. Prior distributions are assumed for the parameters, classifying the model into random or fixed effects. We used a Bayesian approach to estimate the parameters, considering normal and Student t distributions for the error, and assumed that the initial values for the lagged dependent variable are not fixed, but generated by a random process. The recursive predictive density criterion was used for model comparisons. Twenty models were tested and the results indicated that multicollinearity does influence the value of the estimated parameters. Controlling for capital intensity, financial constraints are found to be more important for capital-intensive firms, probably due to their lower profitability indexes, higher fixed costs and higher degree of property diversification.
Abstract:
Methods: Stepwise regression of annual data was applied to model incidence, calculated based on 91 cases, from lagged variables: antecedent precipitation, air temperature, soil water storage, absolute and relative air humidity, and Southern Oscillation Index (SOI). Results: Multiple regression analyses resulted in a model that explains 49% of the incidence variance, taking into account the absolute air humidity in the year of exposure, and the soil water storage and SOI of the previous 2 years. Conclusions: The correlations may reflect enhanced fungal growth after an increase in soil water storage in the longer term and greater spore release with an increase in absolute air humidity in the short term.
Abstract:
We present a comprehensive analysis of the spatial, kinematic and chemical properties of stars and globular clusters (GCs) in the 'ordinary' elliptical galaxy NGC 4494 using data from the Keck and Subaru telescopes. We derive galaxy surface brightness and colour profiles out to large galactocentric radii. We compare the latter to metallicities derived using the near-infrared Calcium Triplet. We obtain stellar kinematics out to ~3.5 effective radii. The latter appear flattened or elongated beyond ~1.8 effective radii, in contrast to the relatively round photometric isophotes. In fact, NGC 4494 may be a flattened galaxy, possibly even an S0, seen at an inclination of ~45 degrees. We publish a catalogue of 431 GC candidates brighter than i_0 = 24 based on the photometry, of which 109 are confirmed spectroscopically and 54 have measured spectroscopic metallicities. We also report the discovery of three spectroscopically confirmed ultra-compact dwarfs around NGC 4494 with measured metallicities of -0.4 ≲ [Fe/H] ≲ -0.3. Based on their properties, we conclude that they are simply bright GCs. The metal-poor GCs are found to be rotating with similar amplitude as the galaxy stars, while the metal-rich GCs show marginal rotation. We supplement our analysis with available literature data and results. Using model predictions of galaxy formation, and a suite of merger simulations, we find that many of the observational properties of NGC 4494 may be explained by formation in a relatively recent gas-rich major merger. Complete studies of individual galaxies incorporating a range of observational avenues and methods such as the one presented here will be an invaluable tool for constraining the fine details of galaxy formation models, especially at large galactocentric radii.
Abstract:
In arthropods, most cases of morphological dimorphism within males are the result of a conditional evolutionarily stable strategy (ESS) with status-dependent tactics. In conditionally male-dimorphic species, the status distributions of male morphs often overlap, and the environmentally cued threshold model (ET) states that the degree of overlap depends on the genetic variation in the distribution of the switchpoints that determine which morph is expressed at each value of status. Here we describe male dimorphism and alternative mating behaviors in the harvestman Serracutisoma proximum. Majors express elongated second legs and use them in territorial fights; minors possess short second legs and do not fight, but rather sneak into majors' territories and copulate with egg-guarding females. The static allometry of second legs reveals that major phenotype expression depends on body size (status), and that the switchpoint underlying the dimorphism presents a large amount of genetic variation in the population, which probably results from weak selective pressure on this trait. With a mark-recapture study, we show that major phenotype expression does not result in survival costs, which is consistent with our hypothesis that there is weak selection on the switchpoint. Finally, we demonstrate that the switchpoint is independent of the status distribution. In conclusion, our data support the ET model prediction that the genetic correlation between status and switchpoint is low, allowing the status distribution to evolve or to fluctuate seasonally, without any effect on the position of the mean switchpoint.
Abstract:
Clustering quality or validation indices allow the evaluation of the quality of clustering in order to support the selection of a specific partition or clustering structure in its natural unsupervised environment, where the real solution is unknown or not available. In this paper, we investigate the use of quality indices mostly based on the concepts of clusters' compactness and separation, for the evaluation of clustering results (partitions in particular). This work intends to offer a general perspective regarding the appropriate use of quality indices for the purpose of clustering evaluation. After presenting some commonly used indices, as well as indices recently proposed in the literature, key issues regarding the practical use of quality indices are addressed. A general methodological approach is presented which considers the identification of appropriate index thresholds. This general approach is compared with the simple use of quality indices for evaluating a clustering solution.
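As a concrete instance of an index built on compactness and separation, the following Python sketch computes the mean silhouette width of a partition. The abstract above does not say which indices the paper studies, so the choice of the silhouette index and the function name are assumptions made for illustration. For each point, a is the mean distance to the other members of its own cluster (compactness) and b is the smallest mean distance to any other cluster (separation); the score is (b - a) / max(a, b).

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette width of a partition; points are coordinate tuples.
    Values near 1 indicate compact, well-separated clusters; negative values
    indicate points that sit closer to another cluster than to their own."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")
    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # Distance to itself is 0, so dividing by len(own) - 1 averages over the others.
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)
        b = min(
            sum(math.dist(point, q) for q in other) / len(other)
            for other_label, other in clusters.items()
            if other_label != label
        )
        denominator = max(a, b)
        scores.append((b - a) / denominator if denominator > 0 else 0.0)
    return sum(scores) / len(scores)
```

Comparing the score of several candidate partitions of the same data is the "simple use" of an index that the abstract mentions; deciding whether a given value is high enough is where thresholds would come in.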