55 results for algorithm Context
in Biblioteca Digital da Produção Intelectual da Universidade de São Paulo (BDPI/USP)
Abstract:
Context tree models were introduced by Rissanen in [25] as a parsimonious generalization of Markov models. Since then, they have been widely used in applied probability and statistics. The present paper investigates non-asymptotic properties of two popular procedures of context tree estimation: Rissanen's algorithm Context and penalized maximum likelihood. First showing how they are related, we prove finite horizon bounds for the probability of over- and under-estimation. Concerning overestimation, no boundedness or loss-of-memory conditions are required: the proof relies on new deviation inequalities for empirical probabilities of independent interest. The under-estimation properties rely on classical hypotheses for processes of infinite memory. These results improve on and generalize the bounds obtained in Duarte et al. (2006) [12], Galves et al. (2008) [18], Galves and Leonardi (2008) [17], Leonardi (2010) [22], refining asymptotic results of Bühlmann and Wyner (1999) [4] and Csiszár and Talata (2006) [9]. (C) 2011 Elsevier B.V. All rights reserved.
Abstract:
We consider binary infinite order stochastic chains perturbed by a random noise. This means that at each time step, the value assumed by the chain can be randomly and independently flipped with a small fixed probability. We show that the transition probabilities of the perturbed chain are uniformly close to the corresponding transition probabilities of the original chain. As a consequence, in the case of stochastic chains with unbounded but otherwise finite variable length memory, we show that it is possible to recover the context tree of the original chain, using a suitable version of the algorithm Context, provided that the noise is small enough.
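The perturbation described in this abstract (each symbol flipped independently with a small fixed probability) can be illustrated with a minimal sketch. The function name `perturb`, the parameter name `epsilon`, and the i.i.d. placeholder sequence standing in for the original chain are assumptions made for illustration; they are not taken from the paper, and the sketch does not implement the algorithm Context itself.

```python
import random


def perturb(chain, epsilon, rng):
    """Flip each binary symbol independently with probability epsilon."""
    return [1 - x if rng.random() < epsilon else x for x in chain]


rng = random.Random(0)
# Placeholder binary sequence; a real study would sample an infinite order chain.
original = [rng.randint(0, 1) for _ in range(10_000)]
noisy = perturb(original, epsilon=0.05, rng=rng)

# Empirical fraction of flipped positions, expected to be near epsilon.
flip_rate = sum(a != b for a, b in zip(original, noisy)) / len(original)
```

With `epsilon=0.0` the sequence is returned unchanged, and with `epsilon=1.0` every symbol is flipped, which gives two quick sanity checks on the noise model.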
Abstract:
Context. B[e] supergiants are luminous, massive post-main sequence stars exhibiting non-spherical winds, forbidden lines, and hot dust in a disc-like structure. The physical properties of their rich and complex circumstellar environment (CSE) are not well understood, partly because these CSE cannot be easily resolved at the large distances found for B[e] supergiants (typically ≳ 1 kpc). Aims. From mid-IR spectro-interferometric observations obtained with VLTI/MIDI we seek to resolve and study the CSE of the Galactic B[e] supergiant CPD-57° 2874. Methods. For a physical interpretation of the observables (visibilities and spectrum) we use our ray-tracing radiative transfer code (FRACS), which is optimised for thermal spectro-interferometric observations. Results. Thanks to the short computing time required by FRACS (<10 s per monochromatic model), best-fit parameters and uncertainties for several physical quantities of CPD-57° 2874 were obtained, such as inner dust radius, relative flux contribution of the central source and of the dusty CSE, dust temperature profile, and disc inclination. Conclusions. The analysis of VLTI/MIDI data with FRACS allowed one of the first direct determinations of physical parameters of the dusty CSE of a B[e] supergiant based on interferometric data and using a full model-fitting approach. In a larger context, the study of B[e] supergiants is important for a deeper understanding of the complex structure and evolution of hot, massive stars.
Abstract:
We present a novel array RLS algorithm with forgetting factor that circumvents the problem of fading regularization, inherent to the standard exponentially-weighted RLS, by allowing for time-varying regularization matrices with generic structure. Simulations in finite precision show the algorithm's superiority as compared to alternative algorithms in the context of adaptive beamforming.
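To make the "fading regularization" of the standard exponentially-weighted RLS concrete, here is a minimal scalar sketch. It is the textbook recursion, not the array algorithm proposed in the paper; the function name `ewrls_scalar` and the parameters `lam` (forgetting factor) and `delta` (initial regularization) are assumptions chosen for illustration.

```python
def ewrls_scalar(inputs, desired, lam=0.95, delta=1.0):
    """Scalar exponentially-weighted RLS in information form.

    R accumulates lam-weighted input energy plus the regularization term,
    which after n+1 updates has shrunk to lam**(n+1) * delta.
    """
    w = 0.0
    R = delta      # starts as pure regularization
    reg = delta    # regularization contribution still present inside R
    history = []
    for u, d in zip(inputs, desired):
        R = lam * R + u * u      # old information (and regularization) decays
        reg = lam * reg
        e = d - u * w            # a priori estimation error
        w = w + (u / R) * e      # gain is u / R in the scalar case
        history.append((w, reg))
    return w, history


# Identify a constant gain of 2.0 from a constant unit input.
w_final, history = ewrls_scalar([1.0] * 200, [2.0] * 200)
```

In this sketch the `reg` entry of `history` decays geometrically, so after a few hundred samples the initial regularization no longer contributes to `R`; this is the effect the abstract says the proposed algorithm avoids by allowing time-varying regularization.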
Abstract:
This paper presents the formulation of a combinatorial optimization problem with the following characteristics: (i) the search space is the power set of a finite set structured as a Boolean lattice; (ii) the cost function forms a U-shaped curve when applied to any lattice chain. This formulation applies to feature selection in the context of pattern recognition. The known approaches to this problem are branch-and-bound algorithms and heuristics that only partially explore the search space. Branch-and-bound algorithms are equivalent to the full search, while heuristics are not. This paper presents a branch-and-bound algorithm that differs from the known ones by exploring the lattice structure and the U-shaped chain curves of the search space. The main contribution of this paper is the architecture of this algorithm, which is based on the representation and exploration of the search space by new lattice properties proven here. Several experiments, with well-known public data, indicate the superiority of the proposed method to the sequential floating forward selection (SFFS), which is a popular heuristic that gives good results in very short computational time. In all experiments, the proposed method obtained better or equal results in similar or even smaller computational time. (C) 2009 Elsevier Ltd. All rights reserved.
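The U-shaped property in (ii) suggests why parts of a chain can be skipped: once the cost starts to rise along a chain, no later element of that chain can be the minimum. The sketch below illustrates only this pruning idea on a single chain; it is not the branch-and-bound algorithm of the paper, and the names `chain_minimum` and `toy_cost`, as well as the cost values, are hypothetical.

```python
def chain_minimum(chain, cost):
    """Scan a chain of nested subsets and stop at the first cost increase.

    Assumes the cost is U-shaped along the chain (non-increasing, then
    non-decreasing), so everything after the first increase can be skipped.
    """
    best, best_cost = chain[0], cost(chain[0])
    for subset in chain[1:]:
        c = cost(subset)
        if c > best_cost:
            break                    # past the bottom of the U
        best, best_cost = subset, c
    return best, best_cost


# One chain of the Boolean lattice over the features {a, b, c, d}.
chain = [frozenset("abcd"[:k]) for k in range(5)]


def toy_cost(subset):
    """Hypothetical U-shaped cost with its minimum at two features."""
    return (len(subset) - 2) ** 2 + 1


best, best_cost = chain_minimum(chain, toy_cost)
```

Here the scan evaluates the cost at the empty set, `{a}`, `{a, b}` and `{a, b, c}`, then stops without visiting the full set.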
Abstract:
The network of HIV counseling and testing centers in São Paulo, Brazil is a major source of data used to build epidemiological profiles of the client population. We examined HIV-1 incidence from November 2000 to April 2001, comparing epidemiological and socio-behavioral data of recently-infected individuals with those with long-standing infection. A less sensitive ELISA was employed to identify recent infection. The overall incidence of HIV-1 infection was 0.53/100/year (95% CI: 0.31-0.85/100/year): 0.77/100/year for males (95% CI: 0.42-1.27/100/year) and 0.22/100/year (95% CI: 0.05-0.59/100/year) for females. Overall HIV-1 prevalence was 3.2% (95% CI: 2.8-3.7%), being 4.0% among males (95% CI: 3.3-4.7%) and 2.1% among females (95% CI: 1.6-2.8%). Recent infections accounted for 15% of the total (95% CI: 10.2-20.8%). Recent infection correlated with being younger and male (p = 0.019). Therefore, recent infection was more common among younger males and older females.
Abstract:
This work develops a method for solving ordinary differential equations, that is, initial-value problems, with solutions approximated by Legendre polynomials. An iterative procedure for the adjustment of the polynomial coefficients is developed, based on the genetic algorithm. This procedure is applied to several examples, providing comparisons between its results and the best polynomial fitting when numerical solutions by the traditional Runge-Kutta or Adams methods are available. The resulting algorithm provides reliable solutions even if the numerical solutions are not available, that is, when the mass matrix is singular or the equation produces unstable running processes.
Abstract:
The State Reform processes combined with the emergence and use of Information and Communication Technology (ICT) originated electronic government policies and initiatives in Brazil. This paper dwells on Brazilian e-government by investigating the institutional design it assumed in the state's public sphere, and how it contributed to outcomes related to e-gov possibilities. The analyses were carried out under an interpretivist perspective by making use of Institutional Theory. From the analyses of interviews with relevant actors in the public sphere, such as state secretaries and presidents of public ICT companies, conclusions point towards low institutionalization of e-gov policies. The institutional design of Brazilian e-gov limits the use of ICT to provide integrated public services, to amplify participation and transparency, and to improve public policy management.
Abstract:
Current HIV vaccine approaches are focused on immunogens encoding whole HIV antigenic proteins that mainly elicit cytotoxic CD8+ responses. Mounting evidence points toward a critical role for CD4+ T cells in the control of immunodeficiency virus replication, probably due to cognate help. Vaccine-induced CD4+ T cell responses might, therefore, have a protective effect in HIV replication. In addition, successful vaccines may have to elicit responses to multiple epitopes in a high proportion of vaccinees, to match the highly variable circulating strains of HIV. Using rational vaccine design, we developed a DNA vaccine encoding 18 algorithm-selected conserved, "promiscuous" (multiple HLA-DR-binding) B-subtype HIV CD4 epitopes - previously found to be frequently recognized by HIV-infected patients. We assessed the ability of the vaccine to induce broad T cell responses in the context of multiple HLA class II molecules using different strains of HLA class II-transgenic mice (-DR2, -DR4, -DQ6 and -DQ8). Mice displayed CD4+ and CD8+ T cell responses of significant breadth and magnitude, and 16 out of the 18 encoded epitopes were recognized. By virtue of inducing broad responses against conserved CD4+ T cell epitopes that can be recognized in the context of widely diverse, common HLA class II alleles, this vaccine concept may cope both with HIV genetic variability and increased population coverage. The vaccine may thus be a source of cognate help for HIV-specific CD8+ T cells elicited by conventional immunogens, in a wide proportion of vaccinees.
Abstract:
Background: High-density tiling arrays and new sequencing technologies are generating rapidly increasing volumes of transcriptome and protein-DNA interaction data. Visualization and exploration of this data is critical to understanding the regulatory logic encoded in the genome by which the cell dynamically affects its physiology and interacts with its environment. Results: The Gaggle Genome Browser is a cross-platform desktop program for interactively visualizing high-throughput data in the context of the genome. Important features include dynamic panning and zooming, keyword search and open interoperability through the Gaggle framework. Users may bookmark locations on the genome with descriptive annotations and share these bookmarks with other users. The program handles large sets of user-generated data using an in-process database and leverages the facilities of SQL and the R environment for importing and manipulating data. A key aspect of the Gaggle Genome Browser is interoperability. By connecting to the Gaggle framework, the genome browser joins a suite of interconnected bioinformatics tools for analysis and visualization with connectivity to major public repositories of sequences, interactions and pathways. To this flexible environment for exploring and combining data, the Gaggle Genome Browser adds the ability to visualize diverse types of data in relation to its coordinates on the genome. Conclusions: Genomic coordinates function as a common key by which disparate biological data types can be related to one another. In the Gaggle Genome Browser, heterogeneous data are joined by their location on the genome to create information-rich visualizations yielding insight into genome organization, transcription and its regulation and, ultimately, a better understanding of the mechanisms that enable the cell to dynamically respond to its environment.
Abstract:
This paper presents a new statistical algorithm to estimate rainfall over the Amazon Basin region using the Tropical Rainfall Measuring Mission (TRMM) Microwave Imager (TMI). The algorithm relies on empirical relationships derived for different raining-type systems between coincident measurements of surface rainfall rate and 85-GHz polarization-corrected brightness temperature as observed by the precipitation radar (PR) and TMI on board the TRMM satellite. The scheme includes rain/no-rain area delineation (screening) and system-type classification routines for rain retrieval. The algorithm is validated against independent measurements of the TRMM-PR and S-band dual-polarization Doppler radar (S-Pol) surface rainfall data for two different periods. Moreover, the performance of this rainfall estimation technique is evaluated against well-known methods, namely, the TRMM-2A12 [the Goddard profiling algorithm (GPROF)], the Goddard scattering algorithm (GSCAT), and the National Environmental Satellite, Data, and Information Service (NESDIS) algorithms. The proposed algorithm shows a normalized bias of approximately 23% for both PR and S-Pol ground truth datasets and a mean error of 0.244 mm h⁻¹ (PR) and -0.157 mm h⁻¹ (S-Pol). For rain volume estimates using PR as reference, a correlation coefficient of 0.939 and a normalized bias of 0.039 were found. With respect to rainfall distributions and rain area comparisons, the results showed that the formulation proposed is efficient and compatible with the physics and dynamics of the observed systems over the area of interest. The performance of the other algorithms showed that GSCAT presented low normalized bias for rain areas and rain volume [0.346 (PR) and 0.361 (S-Pol)], and GPROF showed rainfall distribution similar to that of the PR and S-Pol but with a bimodal distribution.
Last, the five algorithms were evaluated during the TRMM-Large-Scale Biosphere-Atmosphere Experiment in Amazonia (LBA) 1999 field campaign to verify the precipitation characteristics observed during the easterly and westerly Amazon wind flow regimes. The proposed algorithm presented a cumulative rainfall distribution similar to the observations during the easterly regime, but it underestimated for the westerly period for rainfall rates above 5 mm h⁻¹. NESDIS(1) overestimated for both wind regimes but presented the best westerly representation. NESDIS(2), GSCAT, and GPROF underestimated in both regimes, but GPROF was closer to the observations during the easterly flow.
Abstract:
Context. CoRoT is a pioneering space mission devoted to the analysis of stellar variability and the photometric detection of extrasolar planets. Aims. We present the list of planetary transit candidates detected in the first field observed by CoRoT, IRa01, the initial run toward the Galactic anticenter, which lasted for 60 days. Methods. We analysed 3898 sources in the coloured bands and 5974 in the monochromatic band. Instrumental noise and stellar variability were taken into account using detrending tools before applying various transit search algorithms. Results. Fifty sources were classified as planetary transit candidates and the most reliable 40 detections were declared targets for follow-up ground-based observations. Two of these targets have so far been confirmed as planets, CoRoT-1b and CoRoT-4b, for which a complete characterization and specific studies were performed.
Abstract:
Context. Cluster properties can be more distinctly studied in pairs of clusters, where we expect the effects of interactions to be strong. Aims. We here discuss the properties of the double cluster Abell 1758 at a redshift z ~ 0.279. These clusters show strong evidence for merging. Methods. We analyse the optical properties of the North and South clusters of Abell 1758 based on deep imaging obtained with the Canada-France-Hawaii Telescope (CFHT) archive Megaprime/Megacam camera in the g' and r' bands, covering a total region of about 1.05 × 1.16 deg², or 16.1 × 17.6 Mpc². Our X-ray analysis is based on archive XMM-Newton images. Numerical simulations were performed using an N-body algorithm to treat the dark-matter component, a semi-analytical galaxy-formation model for the evolution of the galaxies and a grid-based hydrodynamic code with a piecewise parabolic method (PPM) scheme for the dynamics of the intra-cluster medium. We computed galaxy luminosity functions (GLFs) and 2D temperature and metallicity maps of the X-ray gas, which we then compared to the results of our numerical simulations. Results. The GLFs of Abell 1758 North are well fit by Schechter functions in the g' and r' bands, but with a small excess of bright galaxies, particularly in the r' band; their faint-end slopes are similar in both bands. In contrast, the GLFs of Abell 1758 South are not well fit by Schechter functions: excesses of bright galaxies are seen in both bands; the faint-end of the GLF is not very well defined in g'. The GLF computed from our numerical simulations assuming a halo mass-luminosity relation agrees with those derived from the observations. From the X-ray analysis, the most striking features are structures in the metal distribution. We found two elongated regions of high metallicity in Abell 1758 North with two peaks towards the centre. In contrast, Abell 1758 South shows a deficit of metals in its central regions.
Comparing observational results to those derived from numerical simulations, we could mimic the most prominent features present in the metallicity map and propose an explanation for the dynamical history of the cluster. We found in particular that in the metal-rich elongated regions of the North cluster, winds had been more efficient than ram-pressure stripping in transporting metal-enriched gas to the outskirts. Conclusions. We confirm the merging structure of the North and South clusters, both at optical and X-ray wavelengths.
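For reference, the Schechter function used in the GLF fits above is conventionally written as follows. The symbols (normalization, characteristic luminosity or magnitude, and faint-end slope) follow the standard convention and are not taken from the paper, which may use a different parametrization.

```latex
% Schechter luminosity function in luminosity form:
% \phi^{*} = normalization, L^{*} = characteristic luminosity,
% \alpha = faint-end slope.
\phi(L)\,\mathrm{d}L = \phi^{*}
  \left(\frac{L}{L^{*}}\right)^{\alpha}
  \exp\!\left(-\frac{L}{L^{*}}\right)
  \frac{\mathrm{d}L}{L^{*}}

% Equivalent form in absolute magnitude M, with characteristic magnitude M^{*}:
\phi(M)\,\mathrm{d}M = 0.4\,\ln(10)\,\phi^{*}\,
  10^{\,0.4(\alpha+1)(M^{*}-M)}
  \exp\!\left[-10^{\,0.4(M^{*}-M)}\right]\mathrm{d}M
```

In this form, an "excess of bright galaxies" corresponds to counts above the exponential cutoff at luminosities brighter than the characteristic value, and the "faint-end slope" is the exponent of the power-law part.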
Abstract:
In Natural Language Processing (NLP) symbolic systems, several linguistic phenomena, for instance, the thematic role relationships between sentence constituents, such as AGENT, PATIENT, and LOCATION, can be accounted for by the employment of a rule-based grammar. Another approach to NLP concerns the use of the connectionist model, which has the benefits of learning, generalization and fault tolerance, among others. A third option merges the two previous approaches into a hybrid one: a symbolic thematic theory is used to supply the connectionist network with initial knowledge. Inspired by neuroscience, this work proposes a symbolic-connectionist hybrid system called BIOθPRED (BIOlogically plausible thematic (θ) symbolic-connectionist PREDictor), designed to reveal the thematic grid assigned to a sentence. Its connectionist architecture comprises, as input, a featural representation of the words (based on the verb/noun WordNet classification and on the classical semantic microfeature representation), and, as output, the thematic grid assigned to the sentence. BIOθPRED is designed to "predict" thematic (semantic) roles assigned to words in a sentence context, employing a biologically inspired training algorithm and architecture, and adopting a psycholinguistic view of thematic theory.