821 results for event sequences
Abstract:
The statistical analysis of literary style is the part of stylometry that compares measurable characteristics in a text that are rarely controlled by the author with those in other texts. When the goal is to settle authorship questions, these characteristics should relate to the author’s style and not to the genre, epoch or editor, and they should be such that their variation between authors is larger than the variation within comparable texts from the same author.
For an overview of the literature on stylometry and some of the techniques involved, see for example Mosteller and Wallace (1964, 82), Herdan (1964), Morton (1978), Holmes (1985), Oakes (1998) or Lebart, Salem and Berry (1998).
Tirant lo Blanc, a chivalry book, is the main work in Catalan literature and it was hailed as “the best book of its kind in the world” by Cervantes in Don Quixote. Considered by writers like Vargas Llosa or Damaso Alonso to be the first modern novel in Europe, it has been translated several times into Spanish, Italian and French, with modern English translations by Rosenthal (1996) and La Fontaine (1993). The main body of this book was written between 1460 and 1465, but it was not printed until 1490.
There is an intense and long-lasting debate around its authorship, stemming from its first edition, where the introduction states that the whole book is the work of Martorell (1413?-1468), while at the end it is stated that the last quarter of the book is by Galba (?-1490), after the death of Martorell. Some of the authors that support the theory of single authorship are Riquer (1990), Chiner (1993) and Badia (1993), while some of those supporting double authorship are Riquer (1947), Coromines (1956) and Ferrando (1995). For an overview of this debate, see Riquer (1990).
Neither of the two candidate authors left any text comparable to the one under study, and therefore discriminant analysis cannot be used to help classify chapters by author. By using sample texts encompassing about ten percent of the book, and looking at word length and at the use of 44 conjunctions, prepositions and articles, Ginebra and Cabos (1998) detect heterogeneities that might indicate the existence of two authors. By analyzing the diversity of the vocabulary, Riba and Ginebra (2000) estimate that stylistic boundary to be near chapter 383.
Following the lead of the extensive literature, this paper looks into word length, the use of the most frequent words and the use of vowels in each chapter of the book. Given that the features selected are categorical, this leads to three contingency tables of ordered rows and therefore to three sequences of multinomial observations.
Section 2 explores these sequences graphically, observing a clear shift in their distribution. Section 3 describes the problem of estimating a sudden change-point in those sequences. In the following sections we propose various ways to estimate change-points in multinomial sequences: the method in Section 4 involves fitting models for polytomous data; the one in Section 5 fits gamma models onto the sequence of Chi-square distances between each row profile and the average profile; the one in Section 6 fits models onto the sequence of values taken by the first component of the correspondence analysis, as well as onto sequences of other summary measures like the average word length. In Section 7 we fit models onto the marginal binomial sequences to identify the features that distinguish the chapters before and after that boundary. Most methods rely heavily on the use of generalized linear models.
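As a rough illustration of what estimating a sudden change-point in a sequence of multinomial observations can look like, the sketch below picks the split that maximises a two-segment multinomial log-likelihood. It is not one of the methods proposed in the paper (those rely on generalized linear models); the function names and the toy counts are hypothetical.

```python
import math

def multinomial_loglik(rows):
    """Log-likelihood kernel of the rows pooled under one shared probability vector."""
    totals = [sum(col) for col in zip(*rows)]
    n = sum(totals)
    return sum(c * math.log(c / n) for c in totals if c > 0)

def estimate_change_point(table):
    """Return the index k that maximises the two-segment log-likelihood.

    Rows 0..k-1 form the first segment and rows k..end the second.
    """
    best_k, best_ll = None, -math.inf
    for k in range(1, len(table)):
        ll = multinomial_loglik(table[:k]) + multinomial_loglik(table[k:])
        if ll > best_ll:
            best_k, best_ll = k, ll
    return best_k

# Hypothetical toy table: one row per chapter, one column per category
# (for example short, medium and long words). Not data from the book.
toy = [
    [30, 10, 5],
    [28, 12, 5],
    [31, 9, 5],
    [10, 25, 10],
    [12, 24, 9],
]
change_point = estimate_change_point(toy)
print(change_point)  # index of the first row of the second segment
```

On this toy table the category profile shifts between the third and fourth rows, so the estimated boundary is index 3.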
Abstract:
Under certain circumstances, it is possible to identify clonal variants of Mycobacterium tuberculosis infecting a single patient, probably as a result of subtle genetic rearrangements in part of the bacillary population. We systematically searched for these microevolution events in a different context, namely, recent transmission chains. We studied the clustered cases identified using a population-based universal molecular epidemiology strategy over a 5-year period. Clonal variants of the reference strain defining the cluster were found in 9 (12%) of the 74 clusters identified after the genotyping of 612 M. tuberculosis isolates by IS6110 restriction fragment length polymorphism analysis and mycobacterial interspersed repetitive units-variable-number tandem repeat typing. Clusters with microevolution events were epidemiologically supported and involved 4 to 9 cases diagnosed over a 1- to 5-year period. The IS6110 insertion sites from 16 representative isolates of reference and microevolved variants were mapped by ligation-mediated PCR in order to characterize the genetic background involved in microevolution. Both intragenic and intergenic IS6110 locations resulted from these microevolution events. Among those cases of IS6110 locations in intergenic regions which could have an effect on the regulation of adjacent genes, we identified the overexpression of cytochrome P450 in one microevolved variant using quantitative real-time reverse transcription-PCR. Our results help to define the frequency with which microevolution can be expected in M. tuberculosis transmission chains. They provide a snapshot of the genetic background of these subtle rearrangements and identify an event in which IS6110-mediated microevolution in an isogenic background has functional consequences.
Abstract:
BACKGROUND. Misdiagnosis of tuberculosis (TB) caused by laboratory cross-contamination when culturing Mycobacterium tuberculosis (MTB) has been widely reported and has an obvious clinical, therapeutic and social impact. The final confirmation of a cross-contamination event requires the molecular identification of the same MTB strain cultured from both the potential source of the contamination and from the false-positive candidate. The molecular tool usually applied in this context is IS6110-RFLP, which takes a long time to provide an answer, usually longer than is acceptable for microbiologists and clinicians to make decisions. Our purpose in this study is to evaluate a novel PCR-based method, MIRU-VNTR, as an alternative to ensure a rapid and optimized analysis of cross-contamination alerts. RESULTS. MIRU-VNTR was prospectively compared with IS6110-RFLP for clarifying 19 alerts of false positivity from other laboratories. MIRU-VNTR highly correlated with IS6110-RFLP, reduced the response time by 27 days and clarified six alerts unresolved by RFLP. Additionally, MIRU-VNTR revealed complex situations such as contamination events involving polyclonal isolates and a false-positive case due to simultaneous cross-contamination from two independent sources. CONCLUSION. Unlike standard RFLP-based genotyping, MIRU-VNTR i) could help reduce the impact of a false-positive diagnosis of TB, ii) increased the number of events that could be solved and iii) revealed the complexity of some cross-contamination events that could not be dissected by IS6110-RFLP.
Abstract:
Forest fire sequences can be modelled as a stochastic point process where events are characterized by their spatial locations and occurrence in time. Cluster analysis permits the detection of the space/time pattern distribution of forest fires. These analyses are useful to assist fire-managers in identifying risk areas, implementing preventive measures and conducting strategies for an efficient distribution of the firefighting resources. This paper aims to identify hot spots in forest fire sequences by means of the space-time scan statistics permutation model (STSSP) and a geographical information system (GIS) for data and results visualization. The scan statistical methodology uses a scanning window, which moves across space and time, detecting local excesses of events in specific areas over a certain period of time. Finally, the statistical significance of each cluster is evaluated through Monte Carlo hypothesis testing. The case study is the forest fires registered by the Forest Service in Canton Ticino (Switzerland) from 1969 to 2008. This dataset consists of geo-referenced single events including the location of the ignition points and additional information. The data were aggregated into three sub-periods (considering important preventive legal dispositions) and two main ignition-causes (lightning and anthropogenic causes). Results revealed that forest fire events in Ticino are mainly clustered in the southern region where most of the population is settled. Our analysis uncovered local hot spots arising from extemporaneous arson activities. Results regarding the naturally-caused fires (lightning fires) disclosed two clusters detected in the northern mountainous area.
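The scanning-window idea described above can be sketched in a few lines. This is a deliberately simplified version of a space-time permutation scan, not the study's implementation: zones are single grid cells rather than circles of varying radius, the expected count in a cylinder is (events in cell × events in window) / total, and significance comes from shuffling event days. All names and the toy events are hypothetical.

```python
import math
import random

def log_glr(c, mu, total):
    """Log Poisson likelihood ratio for a cylinder with c observed and mu expected events."""
    if c <= mu or mu <= 0:
        return 0.0  # only excesses of events count as candidate clusters
    out = c * math.log(c / mu)
    if total > c:
        out += (total - c) * math.log((total - c) / (total - mu))
    return out

def best_cluster(events, n_days, max_len):
    """events: list of (cell, day). Returns (score, cell, start_day, end_day)."""
    total = len(events)
    cell_counts = {}
    for cell, _ in events:
        cell_counts[cell] = cell_counts.get(cell, 0) + 1
    best = (0.0, None, None, None)
    for start in range(n_days):
        for end in range(start, min(start + max_len, n_days)):
            in_window = [cell for cell, day in events if start <= day <= end]
            for cell in sorted(cell_counts):
                c = in_window.count(cell)
                mu = cell_counts[cell] * len(in_window) / total
                score = log_glr(c, mu, total)
                if score > best[0]:
                    best = (score, cell, start, end)
    return best

def monte_carlo_p(events, n_days, max_len, n_sim=199, seed=0):
    """Monte Carlo p-value: rank of the observed maximum among permuted data sets."""
    rng = random.Random(seed)
    observed = best_cluster(events, n_days, max_len)
    cells = [c for c, _ in events]
    days = [d for _, d in events]
    exceed = 0
    for _ in range(n_sim):
        rng.shuffle(days)  # break any space-time interaction
        sim = best_cluster(list(zip(cells, days)), n_days, max_len)
        if sim[0] >= observed[0]:
            exceed += 1
    return observed, (exceed + 1) / (n_sim + 1)

# Hypothetical toy events: sparse background plus a burst in cell "B" on days 4-5.
background = [(cell, day) for cell in "ABC" for day in range(0, 10, 3)]
burst = [("B", 4)] * 4 + [("B", 5)] * 4
observed, p_value = monte_carlo_p(background + burst, n_days=10, max_len=3)
print(observed[1:], p_value)
```

The most likely cluster on the toy data is cell "B" over days 4 to 5; the p-value depends on the number of permutations and the seed.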
Abstract:
Anopheles triannulatus s.l. is a malaria vector with a wide geographic distribution, ranging from Argentina to Nicaragua and Trinidad. Here we analysed sequences of two genes, timeless and cpr, to assess the genetic variability and divergence among three sympatric cryptic species of this complex from Salobra, central-western Brazil. The timeless gene sequences did not conclusively differentiate Anopheles halophylus and An. triannulatus species "C". However, a partial separation was observed between these species and An. triannulatus s.s. Importantly, the analysis of the cpr gene sequences revealed fixed differences, no shared polymorphisms and considerable genetic differentiation among the three species of the An. triannulatus complex. The results confirm that An. triannulatus s.s., An. halophylus and An. triannulatus species C are distinct taxa, with the latter two likely representing a more recent speciation event.
Abstract:
PURPOSE: Studies of diffuse large B-cell lymphoma (DLBCL) are typically evaluated by using a time-to-event approach with relapse, re-treatment, and death commonly used as the events. We evaluated the timing and type of events in newly diagnosed DLBCL and compared patient outcome with reference population data. PATIENTS AND METHODS: Patients with newly diagnosed DLBCL treated with immunochemotherapy were prospectively enrolled onto the University of Iowa/Mayo Clinic Specialized Program of Research Excellence Molecular Epidemiology Resource (MER) and the North Central Cancer Treatment Group NCCTG-N0489 clinical trial from 2002 to 2009. Patient outcomes were evaluated at diagnosis and in the subsets of patients achieving event-free status at 12 months (EFS12) and 24 months (EFS24) from diagnosis. Overall survival was compared with age- and sex-matched population data. Results were replicated in an external validation cohort from the Groupe d'Etude des Lymphomes de l'Adulte (GELA) Lymphome Non Hodgkinien 2003 (LNH2003) program and a registry based in Lyon, France. RESULTS: In all, 767 patients with newly diagnosed DLBCL who had a median age of 63 years were enrolled onto the MER and NCCTG studies. At a median follow-up of 60 months (range, 8 to 116 months), 299 patients had an event and 210 patients had died. Patients achieving EFS24 had an overall survival equivalent to that of the age- and sex-matched general population (standardized mortality ratio [SMR], 1.18; P = .25). This result was confirmed in 820 patients from the GELA study and registry in Lyon (SMR, 1.09; P = .71). Simulation studies showed that EFS24 has comparable power to continuous EFS when evaluating clinical trials in DLBCL. CONCLUSION: Patients with DLBCL who achieve EFS24 have a subsequent overall survival equivalent to that of the age- and sex-matched general population. EFS24 will be useful in patient counseling and should be considered as an end point for future studies of newly diagnosed DLBCL.
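For readers unfamiliar with the standardized mortality ratio (SMR) quoted above, it is the number of observed deaths divided by the number expected if the cohort had experienced the reference population's age- and sex-specific death rates. The sketch below shows only that arithmetic; the function name and the numbers are hypothetical and are not taken from the study.

```python
def standardized_mortality_ratio(observed_deaths, strata):
    """Observed deaths divided by expected deaths.

    strata: iterable of (person_years, reference_rate_per_person_year),
    one entry per age/sex stratum of the cohort.
    """
    expected = sum(person_years * rate for person_years, rate in strata)
    if expected <= 0:
        raise ValueError("expected deaths must be positive")
    return observed_deaths / expected

# Hypothetical strata: expected deaths = 100 * 0.01 + 200 * 0.02 = 5.
toy_strata = [(100.0, 0.01), (200.0, 0.02)]
smr = standardized_mortality_ratio(6, toy_strata)
print(round(smr, 2))
```

An SMR close to 1 means the cohort's mortality is similar to that of the matched reference population.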
Abstract:
Transcripts similar to those that encode the nonstructural (NS) proteins NS3 and NS5 from flaviviruses were found in a salivary gland (SG) complementary DNA (cDNA) library from the cattle tick Rhipicephalus microplus. Tick extracts were cultured with cells to enable the isolation of viruses capable of replicating in cultured invertebrate and vertebrate cells. Deep sequencing of the viral RNA isolated from culture supernatants provided the complete coding sequences for the NS3 and NS5 proteins and their molecular characterisation confirmed similarity with the NS3 and NS5 sequences from other flaviviruses. Despite this similarity, phylogenetic analyses revealed that this potentially novel virus may be a highly divergent member of the genus Flavivirus. Interestingly, we detected the divergent NS3 and NS5 sequences in ticks collected from several dairy farms widely distributed throughout three regions of Brazil. This is the first report of flavivirus-like transcripts in R. microplus ticks. This novel virus is a potential arbovirus because it replicated in arthropod and mammalian cells; furthermore, it was detected in a cDNA library from tick SGs and therefore may be present in tick saliva. It is important to determine whether and by what means this potential virus is transmissible and to monitor the virus as a potential emerging tick-borne zoonotic pathogen.
Abstract:
Human T-cell lymphotropic virus type 1 (HTLV-1) is mainly associated with two diseases: tropical spastic paraparesis/HTLV-1-associated myelopathy (TSP/HAM) and adult T-cell leukaemia/lymphoma. This retrovirus infects 5-10 million individuals throughout the world. Previously, we developed a database that annotates sequence data from GenBank, and the present study aimed to describe the clinical, molecular and epidemiological scenarios of HTLV-1 infection through the sequences stored in this database. A total of 2,545 registered complete and partial sequences of HTLV-1 were collected and 1,967 (77.3%) of those sequences represented unique isolates. Among these isolates, 93% contained geographic origin information and only 39% were related to any clinical status. A total of 1,091 sequences contained information about the geographic origin and viral subtype and 93% of these sequences were identified as subtype “a”. Ethnicity data are very scarce. Regarding clinical status data, 29% of the sequences were generated from TSP/HAM and 67.8% from healthy carrier individuals. Although the data mining enabled some inferences about specific aspects of HTLV-1 infection to be made, the relative scarcity of data associated with the available sequences made it impossible to delineate a global scenario of HTLV-1 infection.
Abstract:
Since 1984, Anopheles (Kerteszia) lepidotus has been considered a mosquito species that is involved in the transmission of malaria in Colombia, after having been incriminated as such with epidemiological evidence from a malaria outbreak in Cunday-Villarrica, Tolima. Subsequent morphological analyses of females captured in the same place and at the time of the outbreak showed that the species responsible for the transmission was not An. lepidotus, but rather Anopheles pholidotus. However, the associated morphological stages and DNA sequences of An. pholidotus from the foci of Cunday-Villarrica had not been analysed. Using samples that were caught recently from the outbreak region, the purpose of this study was to provide updated and additional information by analysing the morphology of female mosquitoes, the genitalia of male mosquitoes and fourth instar larvae of An. pholidotus, which was confirmed with DNA sequences of cytochrome oxidase I and rDNA internal transcribed spacer. A total of 1,596 adult females were collected in addition to 37 larval collections in bromeliads. Furthermore, 141 adult females, which were captured from the same area in the years 1981-1982, were analysed morphologically. Ninety-five DNA sequences were analysed for this study. Morphological and molecular analyses showed that the species present in this region corresponds to An. pholidotus. Given the absence of An. lepidotus, even in recent years, we consider that the species of mosquitoes that was previously incriminated as the malaria vector during the outbreak was indeed An. pholidotus, thus ending the controversy.
Abstract:
In this article we first review some of the ways in which the notions of Følner sequences and quasidiagonality have been applied to spectral approximation problems. We then construct a canonical Følner sequence for the crossed product of a concrete C*-algebra and a discrete amenable group. We apply our results to the rotation algebra (which contains interesting operators like almost Mathieu operators or periodic magnetic Schrödinger operators on graphs) and the C*-algebra generated by bounded Jacobi operators.
Abstract:
This article analyzes Folner sequences of projections for bounded linear operators and their relationship to the class of finite operators introduced by Williams in the 1970s. We prove that each essentially hyponormal operator has a proper Folner sequence (i.e. a Folner sequence of projections strongly converging to 1). In particular, any quasinormal, any subnormal, any hyponormal and any essentially normal operator has a proper Folner sequence. Moreover, we show that an operator is finite if and only if it has a proper Folner sequence or if it has a non-trivial finite dimensional reducing subspace. We also analyze the structure of operators which have no Folner sequence and give examples of them. For this analysis we introduce the notion of strongly non-Folner operators, which are far from finite block reducible operators, in some uniform sense, and show that this class coincides with the class of non-finite operators.
Abstract:
The influenza A(H3N2) virus has circulated worldwide for almost five decades and is the dominant subtype in most seasonal influenza epidemics, as occurred in the 2014 season in South America. In this study we evaluate five whole genome sequences of influenza A(H3N2) viruses detected in patients with mild illness, collected from January to March 2014. To sequence the genomes, a next-generation sequencing (NGS) protocol was performed using the Ion Torrent PGM platform. In addition to analysing the commonly studied genes (haemagglutinin, neuraminidase and matrix), our work also covered the internal genes. This was the first report of a whole genome analysis with Brazilian influenza A(H3N2) samples. Considerable amino acid variability was encountered in all gene segments, demonstrating the importance of studying the internal genes. NGS of whole genomes in this study will facilitate deeper virus characterisation, contributing to the improvement of influenza strain surveillance in Brazil.
Abstract:
The Brazilian Amazon Region is a highly endemic area for hepatitis B virus (HBV). However, little is known regarding the genetic variability of the strains circulating in this geographical region. Here, we describe the first full-length genomes of HBV isolated in the Brazilian Amazon Region; these genomes are also the first complete HBV subgenotype D3 genomes reported for Brazil. The genomes of the five Brazilian isolates were all 3,182 base pairs in length and the isolates were classified as belonging to subgenotype D3, subtypes ayw2 (n = 3) and ayw3 (n = 2). Phylogenetic analysis suggested that the Brazilian sequences are not likely to be closely related to European D3 sequences. Such results will contribute to further epidemiological and evolutionary studies of HBV.
Abstract:
Three multivariate statistical tools (principal component analysis, factor analysis and discriminant analysis) have been tested to characterize and model the sags registered in distribution substations. The models use several features to represent the magnitude, duration and degree of unbalance of the sags. These features have been obtained from voltage and current waveforms. The techniques are tested and compared using 69 sag records. The advantages and drawbacks of each technique are listed.