897 results for patent sequence datasets


Relevance:

30.00%

Publisher:

Abstract:

The Schwalbenberg II loess-paleosol sequence (LPS) is a key site for Marine Isotope Stage 3 (MIS 3) in Western Europe owing to eight succeeding cambisols, which primarily constitute the Ahrgau Subformation. Therefore, this LPS qualifies as a test candidate for the potential of temporally high-resolution geochemical data obtained by X-ray fluorescence (XRF) scanning of discrete samples, which provides a fast and non-destructive tool for determining the element composition. The geochemical data are first contextualized with existing proxy data such as magnetic susceptibility (MS) and organic carbon (Corg) and then aggregated to element log ratios characteristic of weathering intensity [LOG (Ca/Sr), LOG (Rb/Sr), LOG (Ba/Sr), LOG (Rb/K)] and dust provenance [LOG (Ti/Zr), LOG (Ti/Al), LOG (Si/Al)]. Generally, the interpretation of rock magnetic particles is challenging in western Europe, where not only magnetic enhancement but also depletion plays a role. Our data indicate MS depletion induced by leaching and topsoil erosion at the Schwalbenberg II LPS. Besides weathering, LOG (Ca/Sr) is susceptible to secondary calcification. Thus, LOG (Rb/Sr) and LOG (Ba/Sr) are also shown to be influenced by calcification dynamics. Consequently, LOG (Rb/K) seems to be the most suitable weathering index, identifying the Sinzig Soils S1 and S2 as the most pronounced paleosols at this site. Sinzig Soil S3 is enclosed by gelic gleysols and, in contrast to S1 and S2, only initially weathered, pointing to colder climate conditions. The Remagen Soils are also characterized by subtle to moderate positive excursions in the weathering indices. Comparing the Schwalbenberg II LPS with the nearby Eifel Lake Sediment Archive (ELSA) and other more distant German, Austrian and Czech LPS while discussing time and climate as limiting factors for pedogenesis, we suggest that the lithologically determined paleosols are in-situ soil formations.
The provenance indices document a Zr-enrichment at the transition from the Ahrgau to the Hesbaye Subformation. This is explained by a conceptual model incorporating multiple sediment recycling and sorting effects in eolian and fluvial domains.
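
As a rough illustration of how element log ratios such as LOG (Rb/K) or LOG (Ti/Zr) can be computed from XRF element intensities, consider the sketch below. The sample values and the `log_ratio` helper are hypothetical, and the use of base-10 logarithms is an assumption, since the abstract does not state the base.

```python
import math

def log_ratio(numerator, denominator):
    """Return log10(numerator / denominator) for two positive element intensities."""
    if numerator <= 0 or denominator <= 0:
        raise ValueError("element intensities must be positive")
    return math.log10(numerator / denominator)

# Hypothetical XRF intensities (arbitrary units) for two samples; not real data.
samples = [
    {"depth_m": 1.0, "Rb": 100.0, "K": 10000.0, "Ti": 4000.0, "Zr": 400.0},
    {"depth_m": 2.0, "Rb": 200.0, "K": 10000.0, "Ti": 4000.0, "Zr": 800.0},
]

for s in samples:
    s["log_Rb_K"] = log_ratio(s["Rb"], s["K"])    # weathering index named in the abstract
    s["log_Ti_Zr"] = log_ratio(s["Ti"], s["Zr"])  # provenance index named in the abstract
    print(s["depth_m"], round(s["log_Rb_K"], 3), round(s["log_Ti_Zr"], 3))
```

In this toy input, the lower LOG (Ti/Zr) of the second sample would correspond to a relative Zr enrichment of the kind the abstract describes.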

Relevance:

30.00%

Publisher:

Abstract:

Mineralogical and geochemical analyses of alteration products from upper and lower volcanic series recovered during ODP Leg 104 reveal variations both in composition and order of crystallization of clay minerals filling vesicles and voids and replacing glass. These results provide information about successive alteration stages of rocks and interlayered volcaniclastic sediments. The first stage, related to initial basalt-seawater interaction, is characterized by development of Fe-smectites, especially Fe-rich saponite. A second stage of intermittently superimposed subaerial weathering is marked by iron-oxides-halloysite-kaolinite formation. The third episode, interpreted as hydrothermal on the basis of O-isotopic data, is defined by postburial coprecipitation of Fe-poor, Mg-rich saponite and celadonite. A distinct final and pervasive hydrothermal stage, occurring mainly in the lower series and dominated by an Al-smectites-zeolites assemblage, indicates changes toward a more reducing alteration environment.

Relevance:

30.00%

Publisher:

Abstract:

During ODP Leg 166, the recovery of cores from a transect of drill sites across the Bahamas margin from marginal to deep basin environments was an essential requirement for the study of the response of the sedimentary systems to sea-level changes. A detailed biostratigraphy based on planktonic foraminifera was performed on ODP Hole 1006A for an accurate stratigraphic control. The investigated late middle Miocene-early Pliocene sequence spans the interval from about 12.5 Ma (Biozone N12) to approximately 4.5 Ma (Biozone N19). Several bioevents calibrated with the time scale of Berggren et al. (1995a,b) were identified. The ODP Site 1006 benthic oxygen isotope stratigraphy can be correlated to the corresponding deep-water benthic oxygen isotope curve from ODP Site 846 in the Eastern Equatorial Pacific (Shackleton et al., 1995. Proc. ODP Sci. Res. 138, 337-356), which was orbitally tuned for the entire Pliocene into the latest Miocene at 6.0 Ma. The approximate stratigraphic match of the isotopic signals from both records between 4.5 and 6.0 Ma implies that the paleoceanographic signal from the Bahamas is not simply a record of regional variations but, indeed, represents glacio-eustatic fluctuations. The ODP Site 1006 oxygen and carbon isotope record, based on benthic and planktonic foraminifera, was used to define paleoceanographic changes on the margin, which could be tied to lithostratigraphic events on the Bahamas carbonate platform using seismic sequence stratigraphy. The oxygen isotope values show a general cooling trend from the middle to late Miocene, which was interrupted by a significant trend towards warmer sea-surface temperatures (SST) and associated sea-level rise with decreased ice volume during the latest Miocene. This trend reached a maximum coincident with the Miocene/Pliocene boundary. An abrupt cooling in the early Pliocene then followed the warming which continued into the earliest Pliocene. 
The late Miocene paleoceanographic evolution along the Bahamas margin can be observed in the ODP Site 1006 delta13C values, which support other evidence for the beginning of the closure of the Panama gateway at 8 Ma followed by a reduced supply of intermediate water from the Pacific into the Caribbean at about 5 Ma. A general correlation of lower sedimentation rates with the major seismic sequence boundaries (SSBs) was observed. Additionally, the SSBs are associated with transitions towards more positive oxygen isotope excursions. This observed correspondence implies that the presence of an SSB, representing a density impedance contrast in the sedimentary sequence, may reflect changes in the character of the deposited sediment during highstands versus those during lowstands. However, not all of the recorded oxygen isotope excursions correspond to SSBs. The absence of an SSB in association with an oxygen isotope excursion indicates that not all oxygen isotope sea-level events impact the carbonate margin to the same extent, or maybe even represent equivalent sea-level fluctuations. Thus, it can be tentatively concluded that SSBs produced on carbonate margins do record sea-level fluctuations but not every sea-level fluctuation is represented by an SSB in the sequence stratigraphic record.

Relevance:

30.00%

Publisher:

Abstract:

A high-resolution mixed carbonate and siliciclastic sequence from DSDP Site 594 contains a detailed record of climate change in the late Pliocene. The sequence can be accurately dated by the LAD of Nitzschia weaveri, the LAD of Thalassiosira insigna, the LAD of T. vulnifica and the LAD of T. kolbei diatom datums. Carbonate content and delta18O signatures provide added resolution and place the sequence between isotope stages 100 and 92. The sequence contains well-preserved and diverse dinoflagellate cyst floras. Use of principal component analysis (PCA) and canonical correspondence analysis (CCA) identifies changes in the assemblages that principally reflect warming and cooling trends. Species associated with warmer climates include Impagidinium patulum, I. paradoxum and I. sp. cf. paradoxum, while those from cooler climates include Invertecysta tabulata and I. velorum. CCA is shown to be a valuable method of determining the past environmental preferences of extinct species such as I. tabulata.

Relevance:

30.00%

Publisher:

Abstract:

The argillite sequence located at the base of the sedimentary cover on the continental slope of the Sea of Japan was studied by petrographic, palynological, and X-ray diffraction methods. Two spore-pollen complexes were distinguished in it: a Late Oligocene complex reflecting cooling and an Early Miocene complex corresponding to the onset of warming. The data obtained indicate that the sequence is composed of terrigenous silty-clayey sediments that accumulated in shallow coastal-marine settings. The global sea-level rise at the Early-Middle Miocene transition, combined with regional tectonic processes, caused the basin to deepen, as a result of which the argillite sequence was overlain by a thick layer of Middle Miocene diatomaceous-clayey sediments. Due to tectonic movement along existing faults in the terminal Late Miocene, the argillite sequence, which initially lay at depths of at least 400-500 m, was locally exhumed to the basin bottom.

Relevance:

30.00%

Publisher:

Abstract:

The EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl/) is maintained at the European Bioinformatics Institute (EBI) in an international collaboration with the DNA Data Bank of Japan (DDBJ) and GenBank at the NCBI (USA). Data is exchanged amongst the collaborating databases on a daily basis. The major contributors to the EMBL database are individual authors and genome project groups. Webin is the preferred web-based submission system for individual submitters, whilst automatic procedures allow incorporation of sequence data from large-scale genome sequencing centres and from the European Patent Office (EPO). Database releases are produced quarterly. Network services allow free access to the most up-to-date data collection via ftp, email and World Wide Web interfaces. EBI’s Sequence Retrieval System (SRS), a network browser for databanks in molecular biology, integrates and links the main nucleotide and protein databases plus many specialized databases. For sequence similarity searching a variety of tools (e.g. Blitz, Fasta, BLAST) are available which allow external users to compare their own sequences against the latest data in the EMBL Nucleotide Sequence Database and SWISS-PROT.

Relevance:

30.00%

Publisher:

Abstract:

With rapid advances in video processing technologies and ever-faster growth in network bandwidth, the popularity of video content publishing and sharing has made similarity search an indispensable operation for retrieving videos of interest to users. Video similarity is usually measured by the percentage of similar frames shared by two video sequences, and each frame is typically represented as a high-dimensional feature vector. Unfortunately, the high complexity of video content poses the following major challenges for fast retrieval: (a) effective and compact video representations, (b) efficient similarity measurements, and (c) efficient indexing on the compact representations. In this paper, we propose a number of methods to achieve fast similarity search for very large video databases. First, each video sequence is summarized into a small number of clusters, each of which contains similar frames and is represented by a novel compact model called Video Triplet (ViTri). ViTri models a cluster as a tightly bounded hypersphere described by its position, radius, and density. The ViTri similarity is measured by the volume of intersection between two hyperspheres multiplied by the minimal density, i.e., the estimated number of similar frames shared by two clusters. The total number of similar frames is then estimated to derive the overall similarity between two video sequences. Hence the time complexity of the video similarity measure can be reduced greatly. To further reduce the number of similarity computations on ViTris, we introduce a new one-dimensional transformation technique which rotates and shifts the original axis system using PCA in such a way that the original inter-distance between two high-dimensional vectors can be maximally retained after mapping. An efficient B+-tree is then built on the transformed one-dimensional values of ViTris' positions.
Such a transformation enables the B+-tree to achieve its optimal performance by quickly filtering out a large portion of non-similar ViTris. Our extensive experiments on real large video datasets demonstrate the effectiveness of our proposals, which significantly outperform existing methods.
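
The filtering idea behind a one-dimensional index can be sketched as follows. This is not the paper's method: the projection direction is passed in as a fixed unit vector rather than derived with PCA, a sorted list queried with `bisect` stands in for the B+-tree, and all function names and data points are hypothetical. The sketch relies only on the fact that the difference between two projections onto a unit vector never exceeds the full distance between the points, so a range query on the 1-D keys cannot miss a true neighbor.

```python
import bisect
import math

def project(vec, direction):
    """Scalar projection of vec onto a unit-length direction."""
    return sum(v * d for v, d in zip(vec, direction))

def build_index(points, direction):
    """Sort (key, point_id) pairs by 1-D key; the sorted list stands in for a B+-tree."""
    return sorted((project(p, direction), i) for i, p in enumerate(points))

def range_candidates(index, query_key, radius):
    """Ids whose 1-D key lies within radius of query_key (cheap filter step)."""
    lo = bisect.bisect_left(index, (query_key - radius, -1))
    hi = bisect.bisect_right(index, (query_key + radius, math.inf))
    return [i for _, i in index[lo:hi]]

def search(points, index, direction, query, radius):
    """Filter by 1-D key, then verify the survivors with the full distance."""
    cands = range_candidates(index, project(query, direction), radius)
    return sorted(i for i in cands if math.dist(points[i], query) <= radius)

# Hypothetical 2-D positions; real feature vectors would be high-dimensional.
points = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (0.0, 2.0)]
direction = (1 / math.sqrt(2), 1 / math.sqrt(2))  # assumed axis, not PCA-derived
index = build_index(points, direction)
print(search(points, index, direction, query=(0.5, 0.5), radius=1.0))
```

Here the filter discards the distant point without computing its full distance, and the verification step removes the remaining false positive.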

Relevance:

30.00%

Publisher:

Abstract:

Formal grammars can be used for describing complex repeatable structures such as DNA sequences. In this paper, we describe the structural composition of DNA sequences using a context-free stochastic L-grammar. L-grammars are a special class of parallel grammars that can model the growth of living organisms, e.g. plant development, and model the morphology of a variety of organisms. We believe that parallel grammars can also be used for modeling genetic mechanisms and sequences such as promoters. Promoters are short regulatory DNA sequences located upstream of a gene. Detection of promoters in DNA sequences is important for successful gene prediction. Promoters can be recognized by certain patterns that are conserved within a species, but there are many exceptions, which make promoter recognition a complex problem. We replace the problem of promoter recognition by induction of context-free stochastic L-grammar rules, which are later used for the structural analysis of promoter sequences. L-grammar rules are derived automatically from the Drosophila and vertebrate promoter datasets using a genetic programming technique, and their fitness is evaluated using a Support Vector Machine (SVM) classifier. The artificial promoter sequences generated using the derived L-grammar rules are analyzed and compared with natural promoter sequences.
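
To make the notion of a context-free stochastic L-grammar more concrete, here is a minimal sketch of parallel rewriting over the DNA alphabet. The rule set, probabilities, and function names are invented for illustration and are not the rules derived in the paper; the sketch only shows that every symbol is rewritten simultaneously in each step and that each replacement is drawn at random according to its probability.

```python
import random

# Hypothetical stochastic rules: symbol -> list of (replacement, probability).
RULES = {
    "A": [("AT", 0.5), ("A", 0.5)],
    "T": [("TA", 0.3), ("T", 0.7)],
    "C": [("CG", 0.4), ("C", 0.6)],
    "G": [("G", 1.0)],
}

def rewrite(sequence, rules, rng):
    """One parallel rewriting step: every symbol is replaced at the same time."""
    out = []
    for symbol in sequence:
        options = rules.get(symbol, [(symbol, 1.0)])  # unknown symbols are kept
        replacements = [r for r, _ in options]
        weights = [w for _, w in options]
        out.append(rng.choices(replacements, weights=weights, k=1)[0])
    return "".join(out)

def derive(axiom, rules, steps, seed=0):
    """Apply `steps` rewriting steps to the axiom with a seeded random generator."""
    rng = random.Random(seed)
    sequence = axiom
    for _ in range(steps):
        sequence = rewrite(sequence, rules, rng)
    return sequence

print(derive("ACGT", RULES, steps=3))
```

With deterministic rules (all probabilities equal to 1) the same code reduces to an ordinary L-system, which is a convenient way to check the rewriting logic.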

Relevance:

30.00%

Publisher:

Abstract:

Protein-DNA interactions are involved in many fundamental biological processes essential for cellular function. Most of the existing computational approaches employ only the sequence context of the target residue for its prediction. In the present study, for each target residue, we applied both the spatial context and the sequence context to construct the feature space. Subsequently, Latent Semantic Analysis (LSA) was applied to remove the redundancies in the feature space. Finally, a predictor (PDNAsite) was developed through the integration of a support vector machine (SVM) classifier and ensemble learning. Results on the PDNA-62 and the PDNA-224 datasets demonstrate that features extracted from spatial context provide more information than those from sequence context, and that combining them gives a further performance gain. An analysis of the number of binding sites in the spatial context of the target site indicates that the interactions between binding sites next to each other are important for protein-DNA recognition and their binding ability. The comparison between our proposed PDNAsite method and the existing methods indicates that PDNAsite outperforms most of the existing methods and is a useful tool for DNA-binding site identification. A web server for our predictor (http://hlt.hitsz.edu.cn:8080/PDNAsite/) is freely available to the biological research community.

Relevance:

30.00%

Publisher:

Abstract:

In product reviews, it is observed that the distribution of polarity ratings over reviews written by different users or evaluated based on different products is often skewed in the real world. As such, incorporating user and product information would be helpful for the task of sentiment classification of reviews. However, existing approaches ignore the temporal nature of reviews posted by the same user or evaluated on the same product. We argue that the temporal relations of reviews might be useful for learning user and product embeddings and thus propose employing a sequence model to embed these temporal relations into user and product representations so as to improve the performance of document-level sentiment analysis. Specifically, we first learn a distributed representation of each review by a one-dimensional convolutional neural network. Then, taking these representations as pretrained vectors, we use a recurrent neural network with gated recurrent units to learn distributed representations of users and products. Finally, we feed the user, product and review representations into a machine learning classifier for sentiment classification. Our approach has been evaluated on three large-scale review datasets from IMDB and Yelp. Experimental results show that: (1) sequence modeling for the purposes of distributed user and product representation learning can improve the performance of document-level sentiment classification; (2) the proposed approach achieves state-of-the-art results on these benchmark datasets.

Relevance:

30.00%

Publisher:

Abstract:

Visual recognition is a fundamental research topic in computer vision. This dissertation explores datasets, features, learning, and models used for visual recognition. In order to train visual models and evaluate different recognition algorithms, this dissertation develops an approach to collect object image datasets on web pages using an analysis of text around the image and of image appearance. This method exploits established online knowledge resources (Wikipedia pages for text; Flickr and Caltech data sets for images). The resources provide rich text and object appearance information. This dissertation describes results on two datasets. The first is Berg’s collection of 10 animal categories; on this dataset, we significantly outperform previous approaches. On an additional set of 5 categories, experimental results show the effectiveness of the method. Images are represented as features for visual recognition. This dissertation introduces a text-based image feature and demonstrates that it consistently improves performance on hard object classification problems. The feature is built using an auxiliary dataset of images annotated with tags, downloaded from the Internet. Image tags are noisy. The method obtains the text features of an unannotated image from the tags of its k-nearest neighbors in this auxiliary collection. A visual classifier presented with an object viewed under novel circumstances (say, a new viewing direction) must rely on its visual examples. This text feature may not change, because the auxiliary dataset likely contains a similar picture. While the tags associated with images are noisy, they are more stable when appearance changes. The performance of this feature is tested using PASCAL VOC 2006 and 2007 datasets. This feature performs well; it consistently improves the performance of visual object classifiers, and is particularly effective when the training dataset is small. 
With more and more collected training data, computational cost becomes a bottleneck, especially when training sophisticated classifiers such as kernelized SVM. This dissertation proposes a fast training algorithm called Stochastic Intersection Kernel Machine (SIKMA). This proposed training method will be useful for many vision problems, as it can produce a kernel classifier that is more accurate than a linear classifier, and can be trained on tens of thousands of examples in two minutes. It processes training examples one by one in a sequence, so memory cost is no longer the bottleneck to process large scale datasets. This dissertation applies this approach to train classifiers of Flickr groups with many group training examples. The resulting Flickr group prediction scores can be used to measure image similarity between two images. Experimental results on the Corel dataset and a PASCAL VOC dataset show the learned Flickr features perform better on image matching, retrieval, and classification than conventional visual features. Visual models are usually trained to best separate positive and negative training examples. However, when recognizing a large number of object categories, there may not be enough training examples for most objects, due to the intrinsic long-tailed distribution of objects in the real world. This dissertation proposes an approach to use comparative object similarity. The key insight is that, given a set of object categories which are similar and a set of categories which are dissimilar, a good object model should respond more strongly to examples from similar categories than to examples from dissimilar categories. This dissertation develops a regularized kernel machine algorithm to use this category dependent similarity regularization. Experiments on hundreds of categories show that our method can make significant improvement for categories with few or even no positive examples.
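
For readers unfamiliar with intersection kernels, the sketch below shows the histogram intersection kernel and the decision value of a kernel classifier built on it. It assumes that "intersection kernel" in the abstract refers to this standard kernel; it does not reproduce the SIKMA training procedure, and the support vectors, coefficients, and function names are hypothetical.

```python
def intersection_kernel(x, y):
    """Histogram intersection kernel: sum of element-wise minima."""
    if len(x) != len(y):
        raise ValueError("histograms must have the same length")
    return sum(min(a, b) for a, b in zip(x, y))

def decision_value(x, support_vectors, coefficients, bias=0.0):
    """Kernel classifier score: sum_i c_i * K(sv_i, x) + bias."""
    score = sum(c * intersection_kernel(sv, x)
                for sv, c in zip(support_vectors, coefficients))
    return score + bias

# Hypothetical histograms and coefficients, for illustration only.
support_vectors = [[2, 5, 3], [6, 1, 3]]
coefficients = [1.0, -1.0]
query = [1, 6, 3]

print(intersection_kernel(support_vectors[0], query))         # 1 + 5 + 3 = 9
print(decision_value(query, support_vectors, coefficients))   # 9 - 5 = 4.0
```

A positive score would assign the query to the class of the positively weighted support vector.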

Relevance:

20.00%

Publisher:

Abstract:

High-throughput screening of physical, genetic and chemical-genetic interactions brings important perspectives in the Systems Biology field, as the analysis of these interactions provides new insights into protein/gene function, cellular metabolic variations and the validation of therapeutic targets and drug design. However, such analysis depends on a pipeline connecting different tools that can automatically integrate data from diverse sources and result in a more comprehensive dataset that can be properly interpreted. We describe here the Integrated Interactome System (IIS), an integrative platform with a web-based interface for the annotation, analysis and visualization of the interaction profiles of proteins/genes, metabolites and drugs of interest. IIS works in four connected modules: (i) Submission module, which receives raw data derived from Sanger sequencing (e.g. two-hybrid system); (ii) Search module, which enables the user to search for the processed reads to be assembled into contigs/singlets, or for lists of proteins/genes, metabolites and drugs of interest, and add them to the project; (iii) Annotation module, which assigns annotations from several databases for the contigs/singlets or lists of proteins/genes, generating tables with automatic annotation that can be manually curated; and (iv) Interactome module, which maps the contigs/singlets or the uploaded lists to entries in our integrated database, building networks that gather novel identified interactions, protein and metabolite expression/concentration levels, subcellular localization and computed topological metrics, GO biological processes and KEGG pathways enrichment. This module generates an XGMML file that can be imported into Cytoscape or be visualized directly on the web. We have developed IIS by integrating diverse databases in response to the need for appropriate tools for a systematic analysis of physical, genetic and chemical-genetic interactions.
IIS was validated with yeast two-hybrid, proteomics and metabolomics datasets, but it is also extendable to other datasets. IIS is freely available online at: http://www.lge.ibi.unicamp.br/lnbio/IIS/.

Relevance:

20.00%

Publisher:

Abstract:

The actinobacterium Streptomyces wadayamensis A23 is an endophyte of Citrus reticulata that produces the antimycin and mannopeptimycin antibiotics, among others. The strain has the capability to inhibit Xylella fastidiosa growth. The draft genome of S. wadayamensis A23 has ~7.0 Mb and 6,006 protein-coding sequences, with a 73.5% G+C content.
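
The G+C content quoted for a genome is the percentage of guanine and cytosine among the called bases. A minimal sketch of that calculation is given below; the `gc_content` helper and the short example sequence are hypothetical, and ambiguous bases such as N are simply ignored here, which is one common convention rather than a stated detail of this report.

```python
def gc_content(sequence):
    """Percentage of G and C among the unambiguous bases (A, C, G, T)."""
    called = [base for base in sequence.upper() if base in "ACGT"]
    if not called:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for base in called if base in "GC")
    return 100.0 * gc / len(called)

# Hypothetical fragment: 6 of its 8 bases are G or C.
print(gc_content("GGCCGATC"))  # 75.0
```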

Relevance:

20.00%

Publisher:

Abstract:

To analyze the effects of treatment approach on the outcomes of newborns (birth weight [BW] < 1,000 g) with patent ductus arteriosus (PDA) from the Brazilian Neonatal Research Network (BNRN): death, bronchopulmonary dysplasia (BPD), severe intraventricular hemorrhage (IVH III/IV), retinopathy of prematurity requiring surgery (ROPsur), necrotizing enterocolitis requiring surgery (NECsur), and death/BPD. This was a multicenter cohort study with retrospective data collection, including newborns (BW < 1,000 g) with gestational age (GA) < 33 weeks and echocardiographic diagnosis of PDA, from 16 neonatal units of the BNRN from January 1, 2010 to December 31, 2011. Newborns who died or were transferred by the third day of life, and those with congenital malformation or infection, were excluded. Groups: G1 - conservative approach (without treatment), G2 - pharmacologic (indomethacin or ibuprofen), G3 - surgical ligation (independent of previous treatment). Factors analyzed: antenatal corticosteroid, cesarean section, BW, GA, 5 min. Apgar score < 4, male gender, Score for Neonatal Acute Physiology Perinatal Extension (SNAPPE II), respiratory distress syndrome (RDS), late sepsis (LS), mechanical ventilation (MV), surfactant (< 2 h of life), and time of MV. Outcomes: death, O2 dependence at 36 weeks (BPD36wks), IVH III/IV, ROPsur, NECsur, and death/BPD36wks. Statistics: Student's t-test, chi-squared test, or Fisher's exact test; odds ratio (95% CI); binary logistic regression and backward stepwise multiple regression. Software: MedCalc (Medical Calculator), version 12.1.4.0. p-values < 0.05 were considered statistically significant. A total of 1,097 newborns were selected and 494 were included: G1 - 187 (37.8%), G2 - 205 (41.5%), and G3 - 102 (20.6%). The highest mortality was observed in G1 (51.3%) and the lowest in G3 (14.7%). The highest frequencies of BPD36wks (70.6%) and ROPsur (23.5%) were observed in G3. The lowest occurrence of death/BPD36wks was in G2 (58.0%).
Pharmacological (OR 0.29; 95% CI: 0.14-0.62) and conservative (OR 0.34; 95% CI: 0.14-0.79) treatments were protective for the outcome death/BPD36wks. The conservative approach to PDA was associated with high mortality, the surgical approach with the occurrence of BPD36wks and ROPsur, and the pharmacological treatment was protective for the outcome death/BPD36wks.
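
As a reminder of what an odds ratio with a 95% CI expresses, the sketch below computes an unadjusted odds ratio and a Wald-type interval from a 2x2 table. It is an illustration only: the odds ratios in the abstract come from regression models, the counts used here are invented, and the `odds_ratio_ci` helper is hypothetical.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval from a 2x2 table.

    a: treated with outcome      b: treated without outcome
    c: reference with outcome    d: reference without outcome
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of log(OR)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Invented counts, not taken from the study.
estimate, lower, upper = odds_ratio_ci(10, 20, 20, 10)
print(round(estimate, 2), round(lower, 2), round(upper, 2))
```

An interval that lies entirely below 1, as reported in the abstract for the pharmacological and conservative approaches, indicates a protective association at the chosen significance level.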

Relevance:

20.00%

Publisher:

Abstract:

Bacillus safensis is a microorganism recognized for its biotechnological and industrial potential due to its interesting enzymatic portfolio. Here, as a means of gathering information about the importance of this species in oil biodegradation, we report a draft genome sequence of a strain isolated from petroleum.