169 resultados para Biological traits analysis
Resumo:
DNA exists predominantly in a duplex form that is preserved via specific base pairing. This base pairing affords a considerable degree of protection against chemical or physical damage and preserves coding potential. However, there are many situations, e.g. during DNA damage and programmed cellular processes such as DNA replication and transcription, in which the DNA duplex is separated into two singlestranded DNA (ssDNA) strands. This ssDNA is vulnerable to attack by nucleases, binding by inappropriate proteins and chemical attack. It is very important to control the generation of ssDNA and protect it when it forms, and for this reason all cellular organisms and many viruses encode a ssDNA binding protein (SSB). All known SSBs use an oligosaccharide/oligonucleotide binding (OB)-fold domain for DNA binding. SSBs have multiple roles in binding and sequestering ssDNA, detecting DNA damage, stimulating strand-exchange proteins and helicases, and mediation of protein–protein interactions. Recently two additional human SSBs have been identified that are more closely related to bacterial and archaeal SSBs. Prior to this it was believed that replication protein A, RPA, was the only human equivalent of bacterial SSB. RPA is thought to be required for most aspects of DNA metabolism including DNA replication, recombination and repair. This review will discuss in further detail the biological pathways in which human SSBs function.
Resumo:
Nuclear Factor Y (NF-Y) is a trimeric complex that binds to the CCAAT box, a ubiquitous eukaryotic promoter element. The three subunits NF-YA, NF-YB and NF-YC are represented by single genes in yeast and mammals. However, in model plant species (Arabidopsis and rice) multiple genes encode each subunit providing the impetus for the investigation of the NF-Y transcription factor family in wheat. A total of 37 NF-Y and Dr1 genes (10 NF-YA, 11 NF-YB, 14 NF-YC and 2 Dr1) in Triticum aestivum were identified in the global DNA databases by computational analysis in this study. Each of the wheat NF-Y subunit families could be further divided into 4-5 clades based on their conserved core region sequences. Several conserved motifs outside of the NF-Y core regions were also identified by comparison of NF-Y members from wheat, rice and Arabidopsis. Quantitative RT-PCR analysis revealed that some of the wheat NF-Y genes were expressed ubiquitously, while others were expressed in an organ-specific manner. In particular, each TaNF-Y subunit family had members that were expressed predominantly in the endosperm. The expression of nine NF-Y and two Dr1 genes in wheat leaves appeared to be responsive to drought stress. Three of these genes were up-regulated under drought conditions, indicating that these members of the NF-Y and Dr1 families are potentially involved in plant drought adaptation. The combined expression and phylogenetic analyses revealed that members within the same phylogenetic clade generally shared a similar expression profile. Organ-specific expression and differential response to drought indicate a plant-specific biological role for various members of this transcription factor family.
Resumo:
Genomic and proteomic analyses have attracted a great deal of interests in biological research in recent years. Many methods have been applied to discover useful information contained in the enormous databases of genomic sequences and amino acid sequences. The results of these investigations inspire further research in biological fields in return. These biological sequences, which may be considered as multiscale sequences, have some specific features which need further efforts to characterise using more refined methods. This project aims to study some of these biological challenges with multiscale analysis methods and stochastic modelling approach. The first part of the thesis aims to cluster some unknown proteins, and classify their families as well as their structural classes. A development in proteomic analysis is concerned with the determination of protein functions. The first step in this development is to classify proteins and predict their families. This motives us to study some unknown proteins from specific families, and to cluster them into families and structural classes. We select a large number of proteins from the same families or superfamilies, and link them to simulate some unknown large proteins from these families. We use multifractal analysis and the wavelet method to capture the characteristics of these linked proteins. The simulation results show that the method is valid for the classification of large proteins. The second part of the thesis aims to explore the relationship of proteins based on a layered comparison with their components. Many methods are based on homology of proteins because the resemblance at the protein sequence level normally indicates the similarity of functions and structures. However, some proteins may have similar functions with low sequential identity. We consider protein sequences at detail level to investigate the problem of comparison of proteins. The comparison is based on the empirical mode decomposition (EMD), and protein sequences are detected with the intrinsic mode functions. A measure of similarity is introduced with a new cross-correlation formula. The similarity results show that the EMD is useful for detection of functional relationships of proteins. The third part of the thesis aims to investigate the transcriptional regulatory network of yeast cell cycle via stochastic differential equations. As the investigation of genome-wide gene expressions has become a focus in genomic analysis, researchers have tried to understand the mechanisms of the yeast genome for many years. How cells control gene expressions still needs further investigation. We use a stochastic differential equation to model the expression profile of a target gene. We modify the model with a Gaussian membership function. For each target gene, a transcriptional rate is obtained, and the estimated transcriptional rate is also calculated with the information from five possible transcriptional regulators. Some regulators of these target genes are verified with the related references. With these results, we construct a transcriptional regulatory network for the genes from the yeast Saccharomyces cerevisiae. The construction of transcriptional regulatory network is useful for detecting more mechanisms of the yeast cell cycle.
Resumo:
Inverse problems based on using experimental data to estimate unknown parameters of a system often arise in biological and chaotic systems. In this paper, we consider parameter estimation in systems biology involving linear and non-linear complex dynamical models, including the Michaelis–Menten enzyme kinetic system, a dynamical model of competence induction in Bacillus subtilis bacteria and a model of feedback bypass in B. subtilis bacteria. We propose some novel techniques for inverse problems. Firstly, we establish an approximation of a non-linear differential algebraic equation that corresponds to the given biological systems. Secondly, we use the Picard contraction mapping, collage methods and numerical integration techniques to convert the parameter estimation into a minimization problem of the parameters. We propose two optimization techniques: a grid approximation method and a modified hybrid Nelder–Mead simplex search and particle swarm optimization (MH-NMSS-PSO) for non-linear parameter estimation. The two techniques are used for parameter estimation in a model of competence induction in B. subtilis bacteria with noisy data. The MH-NMSS-PSO scheme is applied to a dynamical model of competence induction in B. subtilis bacteria based on experimental data and the model for feedback bypass. Numerical results demonstrate the effectiveness of our approach.
Resumo:
Bioinformatics involves analyses of biological data such as DNA sequences, microarrays and protein-protein interaction (PPI) networks. Its two main objectives are the identification of genes or proteins and the prediction of their functions. Biological data often contain uncertain and imprecise information. Fuzzy theory provides useful tools to deal with this type of information, hence has played an important role in analyses of biological data. In this thesis, we aim to develop some new fuzzy techniques and apply them on DNA microarrays and PPI networks. We will focus on three problems: (1) clustering of microarrays; (2) identification of disease-associated genes in microarrays; and (3) identification of protein complexes in PPI networks. The first part of the thesis aims to detect, by the fuzzy C-means (FCM) method, clustering structures in DNA microarrays corrupted by noise. Because of the presence of noise, some clustering structures found in random data may not have any biological significance. In this part, we propose to combine the FCM with the empirical mode decomposition (EMD) for clustering microarray data. The purpose of EMD is to reduce, preferably to remove, the effect of noise, resulting in what is known as denoised data. We call this method the fuzzy C-means method with empirical mode decomposition (FCM-EMD). We applied this method on yeast and serum microarrays, and the silhouette values are used for assessment of the quality of clustering. The results indicate that the clustering structures of denoised data are more reasonable, implying that genes have tighter association with their clusters. Furthermore we found that the estimation of the fuzzy parameter m, which is a difficult step, can be avoided to some extent by analysing denoised microarray data. The second part aims to identify disease-associated genes from DNA microarray data which are generated under different conditions, e.g., patients and normal people. We developed a type-2 fuzzy membership (FM) function for identification of diseaseassociated genes. This approach is applied to diabetes and lung cancer data, and a comparison with the original FM test was carried out. Among the ten best-ranked genes of diabetes identified by the type-2 FM test, seven genes have been confirmed as diabetes-associated genes according to gene description information in Gene Bank and the published literature. An additional gene is further identified. Among the ten best-ranked genes identified in lung cancer data, seven are confirmed that they are associated with lung cancer or its treatment. The type-2 FM-d values are significantly different, which makes the identifications more convincing than the original FM test. The third part of the thesis aims to identify protein complexes in large interaction networks. Identification of protein complexes is crucial to understand the principles of cellular organisation and to predict protein functions. In this part, we proposed a novel method which combines the fuzzy clustering method and interaction probability to identify the overlapping and non-overlapping community structures in PPI networks, then to detect protein complexes in these sub-networks. Our method is based on both the fuzzy relation model and the graph model. We applied the method on several PPI networks and compared with a popular protein complex identification method, the clique percolation method. For the same data, we detected more protein complexes. We also applied our method on two social networks. The results showed our method works well for detecting sub-networks and give a reasonable understanding of these communities.
Resumo:
Complex networks have been studied extensively due to their relevance to many real-world systems such as the world-wide web, the internet, biological and social systems. During the past two decades, studies of such networks in different fields have produced many significant results concerning their structures, topological properties, and dynamics. Three well-known properties of complex networks are scale-free degree distribution, small-world effect and self-similarity. The search for additional meaningful properties and the relationships among these properties is an active area of current research. This thesis investigates a newer aspect of complex networks, namely their multifractality, which is an extension of the concept of selfsimilarity. The first part of the thesis aims to confirm that the study of properties of complex networks can be expanded to a wider field including more complex weighted networks. Those real networks that have been shown to possess the self-similarity property in the existing literature are all unweighted networks. We use the proteinprotein interaction (PPI) networks as a key example to show that their weighted networks inherit the self-similarity from the original unweighted networks. Firstly, we confirm that the random sequential box-covering algorithm is an effective tool to compute the fractal dimension of complex networks. This is demonstrated on the Homo sapiens and E. coli PPI networks as well as their skeletons. Our results verify that the fractal dimension of the skeleton is smaller than that of the original network due to the shortest distance between nodes is larger in the skeleton, hence for a fixed box-size more boxes will be needed to cover the skeleton. Then we adopt the iterative scoring method to generate weighted PPI networks of five species, namely Homo sapiens, E. coli, yeast, C. elegans and Arabidopsis Thaliana. By using the random sequential box-covering algorithm, we calculate the fractal dimensions for both the original unweighted PPI networks and the generated weighted networks. The results show that self-similarity is still present in generated weighted PPI networks. This implication will be useful for our treatment of the networks in the third part of the thesis. The second part of the thesis aims to explore the multifractal behavior of different complex networks. Fractals such as the Cantor set, the Koch curve and the Sierspinski gasket are homogeneous since these fractals consist of a geometrical figure which repeats on an ever-reduced scale. Fractal analysis is a useful method for their study. However, real-world fractals are not homogeneous; there is rarely an identical motif repeated on all scales. Their singularity may vary on different subsets; implying that these objects are multifractal. Multifractal analysis is a useful way to systematically characterize the spatial heterogeneity of both theoretical and experimental fractal patterns. However, the tools for multifractal analysis of objects in Euclidean space are not suitable for complex networks. In this thesis, we propose a new box covering algorithm for multifractal analysis of complex networks. This algorithm is demonstrated in the computation of the generalized fractal dimensions of some theoretical networks, namely scale-free networks, small-world networks, random networks, and a kind of real networks, namely PPI networks of different species. Our main finding is the existence of multifractality in scale-free networks and PPI networks, while the multifractal behaviour is not confirmed for small-world networks and random networks. As another application, we generate gene interactions networks for patients and healthy people using the correlation coefficients between microarrays of different genes. Our results confirm the existence of multifractality in gene interactions networks. This multifractal analysis then provides a potentially useful tool for gene clustering and identification. The third part of the thesis aims to investigate the topological properties of networks constructed from time series. Characterizing complicated dynamics from time series is a fundamental problem of continuing interest in a wide variety of fields. Recent works indicate that complex network theory can be a powerful tool to analyse time series. Many existing methods for transforming time series into complex networks share a common feature: they define the connectivity of a complex network by the mutual proximity of different parts (e.g., individual states, state vectors, or cycles) of a single trajectory. In this thesis, we propose a new method to construct networks of time series: we define nodes by vectors of a certain length in the time series, and weight of edges between any two nodes by the Euclidean distance between the corresponding two vectors. We apply this method to build networks for fractional Brownian motions, whose long-range dependence is characterised by their Hurst exponent. We verify the validity of this method by showing that time series with stronger correlation, hence larger Hurst exponent, tend to have smaller fractal dimension, hence smoother sample paths. We then construct networks via the technique of horizontal visibility graph (HVG), which has been widely used recently. We confirm a known linear relationship between the Hurst exponent of fractional Brownian motion and the fractal dimension of the corresponding HVG network. In the first application, we apply our newly developed box-covering algorithm to calculate the generalized fractal dimensions of the HVG networks of fractional Brownian motions as well as those for binomial cascades and five bacterial genomes. The results confirm the monoscaling of fractional Brownian motion and the multifractality of the rest. As an additional application, we discuss the resilience of networks constructed from time series via two different approaches: visibility graph and horizontal visibility graph. Our finding is that the degree distribution of VG networks of fractional Brownian motions is scale-free (i.e., having a power law) meaning that one needs to destroy a large percentage of nodes before the network collapses into isolated parts; while for HVG networks of fractional Brownian motions, the degree distribution has exponential tails, implying that HVG networks would not survive the same kind of attack.
Resumo:
Insect monitoring and sampling programmes are used in the stored grains industry for the detection and estimation of insect pests. At the low pest densities dictated by economic and commercial requirements, the accuracy of both detection and abundance estimates can be influenced by variations in the spatial structure of pest populations over short distances. Geostatistical analysis of Rhyzopertha dominica populations in 2 dimensions showed that, in both the horizontal and vertical directions and at all temperatures examined, insect numbers were positively correlated over short (0-5cm) distances, and negatively correlated over longer (≥10cm) distances. Analysis in 3 dimensions showed a similar pattern, with positive correlations over short distances and negative correlations at longer distances. At 35°C, insects were located significantly further from the grain surface than at 25 and 30°C. Dispersion metrics showed statistically significant aggregation in all cases. This is the first research using small sample units, high sampling intensities, and a range of temperatures, to show spatial structuring of R. dominica populations over short distances. This research will have significant implications for sampling in the stored grains industry.
Resumo:
Concerns regarding groundwater contamination with nitrate and the long-term sustainability of groundwater resources have prompted the development of a multi-layered three dimensional (3D) geological model to characterise the aquifer geometry of the Wairau Plain, Marlborough District, New Zealand. The 3D geological model which consists of eight litho-stratigraphic units has been subsequently used to synthesise hydrogeological and hydrogeochemical data for different aquifers in an approach that aims to demonstrate how integration of water chemistry data within the physical framework of a 3D geological model can help to better understand and conceptualise groundwater systems in complex geological settings. Multivariate statistical techniques(e.g. Principal Component Analysis and Hierarchical Cluster Analysis) were applied to groundwater chemistry data to identify hydrochemical facies which are characteristic of distinct evolutionary pathways and a common hydrologic history of groundwaters. Principal Component Analysis on hydrochemical data demonstrated that natural water-rock interactions, redox potential and human agricultural impact are the key controls of groundwater quality in the Wairau Plain. Hierarchical Cluster Analysis revealed distinct hydrochemical water quality groups in the Wairau Plain groundwater system. Visualisation of the results of the multivariate statistical analyses and distribution of groundwater nitrate concentrations in the context of aquifer lithology highlighted the link between groundwater chemistry and the lithology of host aquifers. The methodology followed in this study can be applied in a variety of hydrogeological settings to synthesise geological, hydrogeological and hydrochemical data and present them in a format readily understood by a wide range of stakeholders. This enables a more efficient communication of the results of scientific studies to the wider community.
Resumo:
Denaturation of tissues can provide a unique biological environment for regenerative medicine application only if minimal disruption of their microarchitecture is achieved during the decellularization process. The goal is to keep the structural integrity of such a construct as functional as the tissues from which they were derived. In this work, cartilage-on-bone laminates were decellularized through enzymatic, non-ionic and ionic protocols. This work investigated the effects of decellularization process on the microarchitecture of cartiligous extracellular matrix; determining the extent of how each process deteriorated the structural organization of the network. High resolution microscopy was used to capture cross-sectional images of samples prior to and after treatment. The variation of the microarchitecture was then analysed using a well defined fast Fourier image processing algorithm. Statistical analysis of the results revealed how significant the alternations among aforementioned protocols were (p < 0.05). Ranking the treatments by their effectiveness in disrupting the ECM integrity, they were ordered as: Trypsin> SDS> Triton X-100.
Resumo:
In herbaceous ecosystems worldwide, biodiversity has been negatively impacted by changed grazing regimes and nutrient enrichment. Altered disturbance regimes are thought to favour invasive species that have a high phenotypic plasticity, although most studies measure plasticity under controlled conditions in the greenhouse and then assume plasticity is an advantage in the field. Here, we compare trait plasticity between three co-occurring, C 4 perennial grass species, an invader Eragrostis curvula, and natives Eragrostis sororia and Aristida personata to grazing and fertilizer in a three-year field trial. We measured abundances and several leaf traits known to correlate with strategies used by plants to fix carbon and acquire resources, i.e. specific leaf area (SLA), leaf dry matter content (LDMC), leaf nutrient concentrations (N, C:N, P), assimilation rates (Amax) and photosynthetic nitrogen use efficiency (PNUE). In the control treatment (grazed only), trait values for SLA, leaf C:N ratios, Amax and PNUE differed significantly between the three grass species. When trait values were compared across treatments, E. curvula showed higher trait plasticity than the native grasses, and this correlated with an increase in abundance across all but the grazed/fertilized treatment. The native grasses showed little trait plasticity in response to the treatments. Aristida personata decreased significantly in the treatments where E. curvula increased, and E. sororia abundance increased possibly due to increased rainfall and not in response to treatments or invader abundance. Overall, we found that plasticity did not favour an increase in abundance of E. curvula under the grazed/fertilized treatment likely because leaf nutrient contents increased and subsequently its' palatability to consumers. E. curvula also displayed a higher resource use efficiency than the native grasses. These findings suggest resource conditions and disturbance regimes can be manipulated to disadvantage the success of even plastic exotic species.
Resumo:
Baseline monitoring of groundwater quality aims to characterize the ambient condition of the resource and identify spatial or temporal trends. Sites comprising any baseline monitoring network must be selected to provide a representative perspective of groundwater quality across the aquifer(s) of interest. Hierarchical cluster analysis (HCA) has been used as a means of assessing the representativeness of a groundwater quality monitoring network, using example datasets from New Zealand. HCA allows New Zealand's national and regional monitoring networks to be compared in terms of the number of water-quality categories identified in each network, the hydrochemistry at the centroids of these water-quality categories, the proportions of monitoring sites assigned to each water-quality category, and the range of concentrations for each analyte within each water-quality category. Through the HCA approach, the National Groundwater Monitoring Programme (117 sites) is shown to provide a highly representative perspective of groundwater quality across New Zealand, relative to the amalgamated regional monitoring networks operated by 15 different regional authorities (680 sites have sufficient data for inclusion in HCA). This methodology can be applied to evaluate the representativeness of any subset of monitoring sites taken from a larger network.
Resumo:
The finite element (FE) analysis is an effective method to study the strength and predict the fracture risk of endodontically-treated teeth. This paper presents a rapid method developed to generate a comprehensive tooth FE model using data retrieved from micro-computed tomography (μCT). With this method, the inhomogeneity of material properties of teeth was included into the model without dividing the tooth model into different regions. The material properties of the tooth were assumed to be related to the mineral density. The fracture risk at different tooth portions was assessed for root canal treatments. The micro-CT images of a tooth were processed by a Matlab software programme and the CT numbers were retrieved. The tooth contours were obtained with thresholding segmentation using Amira. The inner and outer surfaces of the tooth were imported into Solidworks and a three-dimensional (3D) tooth model was constructed. An assembly of the tooth model with the periodontal ligament (PDL) layer and surrounding bone was imported into ABAQUS. The material properties of the tooth were calculated from the retrieved CT numbers via ABAQUS user's subroutines. Three root canal geometries (original and two enlargements) were investigated. The proposed method in this study can generate detailed 3D finite element models of a tooth with different root canal enlargements and filling materials, and would be very useful for the assessment of the fracture risk at different tooth portions after root canal treatments.