Digital Library

208 results for Soils classification

Supplementary material : large scale read classification for next generation sequencing

Relevance: 20.00%

Abstract:

Next Generation Sequencing (NGS) has revolutionised molecular biology, resulting in an explosion of data sets and an increasing role in clinical practice. Such applications necessarily require rapid identification of the organism as a prelude to annotation and further analysis. NGS data consist of a substantial number of short sequence reads, given context through downstream assembly and annotation, a process requiring reads consistent with the assumed species or species group. Highly accurate results have been obtained for restricted sets using SVM classifiers, but such methods are difficult to parallelise and success depends on careful attention to feature selection. This work examines the problem at very large scale, using a mix of synthetic and real data with a view to determining the overall structure of the problem and the effectiveness of parallel ensembles of simpler classifiers (principally random forests) in addressing the challenges of large scale genomics.

Classification in the light of modern regulatory approaches

Relevance: 20.00%

Binary image steganographic techniques classification based on multi-class steganalysis

Relevance: 20.00%

Abstract:

In this paper, we propose a new multi-class steganalysis method for binary images. The proposed method can identify the type of steganographic technique used by examining a given binary image. In addition, it can differentiate an image with a hidden message from one without. To do so, we extract features from the binary image using a combination of a method extended from our previous work and new methods proposed in this paper. Based on the extracted feature sets, we construct our multi-class steganalysis from an SVM classifier. We also present empirical results demonstrating that the proposed method can effectively identify five different types of steganography.

Automated classification of limb fractures from free-text radiology reports using a clinician-informed gazetteer methodology

Relevance: 20.00%

Abstract:

Background: Timely diagnosis and reporting of patient symptoms in hospital emergency departments (ED) is a critical component of health services delivery. However, due to dispersed information resources and a vast amount of manual processing of unstructured information, accurate point-of-care diagnosis is often difficult. Aims: This research reports an initial experimental evaluation of a clinician-informed automated method that addresses initial misdiagnoses associated with delayed receipt of unstructured radiology reports. Method: A method was developed that resembles clinical reasoning for identifying limb abnormalities. The method consists of a gazetteer of keywords related to radiological findings; it classifies an X-ray report as abnormal if the report contains evidence listed in the gazetteer. A set of 99 narrative reports of radiological findings was sourced from a tertiary hospital. Reports were manually assessed by two clinicians and discrepancies were validated by a third expert ED clinician; the final manual classification generated by the expert ED clinician was used as ground truth to empirically evaluate the approach. Results: The automated method, which identifies limb abnormalities by searching for keywords expressed by clinicians, achieved an F-measure of 0.80 and an accuracy of 0.80. Conclusion: While the automated clinician-driven method achieved promising performance, a number of avenues for improvement were identified using advanced natural language processing (NLP) and machine learning techniques.
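The gazetteer rule described in this abstract can be sketched as follows. This is a minimal illustration only: the keyword list, the example reports and the function names are hypothetical and are not taken from the study, whose clinician-supplied gazetteer is not given in the abstract.

```python
# Hypothetical gazetteer; the study's actual keyword list is not given in the abstract.
GAZETTEER = {"fracture", "dislocation", "displaced", "avulsion"}


def classify_report(text, gazetteer=GAZETTEER):
    """Return True (abnormal) if any gazetteer keyword appears in the report."""
    tokens = {tok.strip(".,;:()").lower() for tok in text.split()}
    return not gazetteer.isdisjoint(tokens)


def f_measure_and_accuracy(predicted, truth):
    """Compute the F-measure (F1) and accuracy for binary labels."""
    pairs = list(zip(predicted, truth))
    tp = sum(1 for p, t in pairs if p and t)
    fp = sum(1 for p, t in pairs if p and not t)
    fn = sum(1 for p, t in pairs if not p and t)
    tn = sum(1 for p, t in pairs if not p and not t)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(pairs)
    return f1, accuracy


# Invented example reports with invented ground-truth labels.
reports = [
    "Transverse fracture of the distal radius.",
    "No bony abnormality detected.",
    "Soft tissue swelling only.",
    "No fracture seen.",
]
truth = [True, False, False, False]

predicted = [classify_report(r) for r in reports]
f1, accuracy = f_measure_and_accuracy(predicted, truth)
print(predicted, f1, accuracy)
```

Plain keyword matching flags the negated report ("No fracture seen.") as abnormal, which is one reason negation handling is a natural candidate among the NLP improvements the abstract mentions.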

Classification of cancer-related death certificates using machine learning

Relevance: 20.00%

Abstract:

Background: Cancer monitoring and prevention relies on timely notification of cancer cases. However, the abstraction and classification of cancer from the free text of pathology reports and other relevant documents, such as death certificates, are complex and time-consuming activities. Aims: In this paper, approaches for the automatic detection of notifiable cancer cases as the cause of death from free-text death certificates supplied to Cancer Registries are investigated. Method: A number of machine learning classifiers were studied. Features were extracted using natural language techniques and the Medtex toolkit. The features encompassed stemmed words, bi-grams, and concepts from the SNOMED CT medical terminology. The baseline consisted of a keyword spotter using keywords extracted from the long description of ICD-10 cancer related codes. Results: Death certificates with notifiable cancer listed as the cause of death can be effectively identified with the methods studied in this paper. A Support Vector Machine (SVM) classifier achieved the best performance, with an overall F-measure of 0.9866 when evaluated on a set of 5,000 free-text death certificates using the token stem feature set. The SNOMED CT concept plus token stem feature set reached the lowest variance (0.0032) and false negative rate (0.0297) while achieving an F-measure of 0.9864. The SVM classifier accounts for the first 18 of the top 40 evaluated runs, and is the most robust classifier, with a variance of 0.001141, half the variance of the other classifiers. Conclusion: The selection of features had the greatest influence on the performance of the classifiers, although the type of classifier employed also affects performance. In contrast, the feature weighting schema had a negligible effect on performance. Specifically, it is found that stemmed tokens, with or without SNOMED CT concepts, are the most effective features when combined with an SVM classifier.

Two-scale computational modelling of unsaturated flow in soils exhibiting small scale heterogeneities

Relevance: 20.00%

Abstract:

Unsaturated water flow in soil is commonly modelled using Richards’ equation, which requires the hydraulic properties of the soil (e.g., porosity, hydraulic conductivity, etc.) to be characterised. Naturally occurring soils, however, are heterogeneous in nature, that is, they are composed of a number of interwoven homogeneous soils each with their own set of hydraulic properties. When the length scale of these soil heterogeneities is small, numerical solution of Richards’ equation is computationally impractical due to the immense effort and refinement required to mesh the actual heterogeneous geometry. A classic way forward is to use a macroscopic model, where the heterogeneous medium is replaced with a fictitious homogeneous medium, which attempts to give the average flow behaviour at the macroscopic scale (i.e., at a scale much larger than the scale of the heterogeneities). Using the homogenisation theory, a macroscopic equation can be derived that takes the form of Richards’ equation with effective parameters. A disadvantage of the macroscopic approach, however, is that it fails in cases when the assumption of local equilibrium does not hold. This limitation has seen the introduction of two-scale models that include at each point in the macroscopic domain an additional flow equation at the scale of the heterogeneities (microscopic scale). This report outlines a well-known two-scale model and contributes to the literature a number of important advances in its numerical implementation. These include the use of an unstructured control volume finite element method and image-based meshing techniques, that allow for irregular micro-scale geometries to be treated, and the use of an exponential time integration scheme that permits both scales to be resolved simultaneously in a completely coupled manner. 
Numerical comparisons against a classical macroscopic model confirm that only the two-scale model correctly captures the important features of the flow for a range of parameter values.
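For reference, Richards' equation is commonly written in the mixed form below. The notation is an assumption made here and is not taken from the report: θ is the volumetric water content, h the pressure head, K(h) the unsaturated hydraulic conductivity, and z the vertical coordinate, positive upward.

```latex
\frac{\partial \theta(h)}{\partial t}
  = \nabla \cdot \bigl[ K(h)\, \nabla (h + z) \bigr]
```

In the macroscopic model described above, an equation of this form is retained, with K and θ replaced by effective parameters obtained from homogenisation.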

Random projections on manifolds of symmetric positive definite matrices for image classification

Relevance: 20.00%

Abstract:

Recent advances suggest that encoding images through Symmetric Positive Definite (SPD) matrices and then interpreting such matrices as points on Riemannian manifolds can lead to increased classification performance. Taking into account manifold geometry is typically done via (1) embedding the manifolds in tangent spaces, or (2) embedding into Reproducing Kernel Hilbert Spaces (RKHS). While embedding into tangent spaces allows the use of existing Euclidean-based learning algorithms, manifold shape is only approximated which can cause loss of discriminatory information. The RKHS approach retains more of the manifold structure, but may require non-trivial effort to kernelise Euclidean-based learning algorithms. In contrast to the above approaches, in this paper we offer a novel solution that allows SPD matrices to be used with unmodified Euclidean-based learning algorithms, with the true manifold shape well-preserved. Specifically, we propose to project SPD matrices using a set of random projection hyperplanes over RKHS into a random projection space, which leads to representing each matrix as a vector of projection coefficients. Experiments on face recognition, person re-identification and texture classification show that the proposed approach outperforms several recent methods, such as Tensor Sparse Coding, Histogram Plus Epitome, Riemannian Locality Preserving Projection and Relational Divergence Classification.

Automatic classification of human epithelial type 2 cell indirect immunofluorescence images using cell pyramid matching

Relevance: 20.00%

Abstract:

This paper describes a novel system for automatic classification of images obtained from Anti-Nuclear Antibody (ANA) pathology tests on Human Epithelial type 2 (HEp-2) cells using the Indirect Immunofluorescence (IIF) protocol. The IIF protocol on HEp-2 cells has been the hallmark method to identify the presence of ANAs, due to its high sensitivity and the large range of antigens that can be detected. However, it suffers from numerous shortcomings, such as being subjective as well as time and labour intensive. Computer Aided Diagnostic (CAD) systems, which automatically classify a HEp-2 cell image into one of its known patterns (e.g., speckled, homogeneous), have been developed to address these problems. Most existing CAD systems use handpicked features to represent a HEp-2 cell image, which may only work in limited scenarios. We propose a novel automatic cell image classification method termed Cell Pyramid Matching (CPM), which comprises regional histograms of visual words coupled with the Multiple Kernel Learning framework. We present a study of several variations of generating histograms and show the efficacy of the system on two publicly available datasets: the ICPR HEp-2 cell classification contest dataset and the SNPHEp-2 dataset.

Improved image set classification via joint sparse approximated nearest subspaces

Relevance: 20.00%

Abstract:

Existing multi-model approaches for image set classification extract local models by clustering each image set individually only once, with fixed clusters used for matching with other image sets. However, this may result in the two closest clusters representing different characteristics of an object, due to undesirable differences in environmental conditions (such as variations in illumination and pose). To address this problem, we propose to constrain the clustering of each query image set by forcing the clusters to resemble the clusters in the gallery image sets. We first define a Frobenius norm distance between subspaces over Grassmann manifolds based on reconstruction error. We then extract local linear subspaces from a gallery image set via sparse representation. For each local linear subspace, we adaptively construct the corresponding closest subspace from the samples of a probe image set by joint sparse representation. We show that by minimising the sparse representation reconstruction error, we approach the nearest point on a Grassmann manifold. Experiments on Honda, ETH-80 and Cambridge-Gesture datasets show that the proposed method consistently outperforms several other recent techniques, such as Affine Hull based Image Set Distance (AHISD), Sparse Approximated Nearest Points (SANP) and Manifold Discriminant Analysis (MDA).

Resistance of microbial populations in DDT-contaminated and uncontaminated soils

Relevance: 20.00%

Abstract:

One DDT-contaminated soil and two uncontaminated soils were used to enumerate DDT-resistant microbes (bacteria, actinomycetes and fungi) by using soil dilution agar plates in media either with 150 μg DDT ml⁻¹ or without DDT at different temperatures (25, 37 and 55°C). Microbial populations in this study were significantly (p<0.001) affected by DDT in the growth medium. However, the numbers of microbes in long-term contaminated and uncontaminated soils were similar, presumably indicating that DDT-resistant microbes had developed over long-term exposure. The tolerance of isolated soil microbes to DDT varied in the order fungi > actinomycetes > bacteria. Bacteria from contaminated soil were more resistant to DDT than bacteria from uncontaminated soils. Microbes isolated at different temperatures also demonstrated varying degrees of DDT resistance. For example, bacteria and actinomycetes isolated at all incubation temperatures were sensitive to DDT. Conversely, fungi isolated at all temperatures were unaffected by DDT.

DDT resistance and transformation by different microbial strains isolated from DDT-contaminated soils and compost materials

Relevance: 20.00%

Abstract:

Bioremediation is a potential option to treat sites contaminated with 1,1,1-trichloro-2,2-bis(4-chlorophenyl)ethane (DDT). In areas where suitable microbes are not present, the use of DDT-resistant microbial inoculants may be necessary. It is vital that such inoculants do not produce recalcitrant breakdown products, e.g. 1,1-dichloro-2,2-bis(4-chlorophenyl)ethylene (DDE). Therefore, this work aimed to screen DDT-contaminated soil and compost materials for the presence of DDT-resistant microbes for use as potential inoculants. Four compost-amended soils, contaminated with different concentrations of DDT, were used to isolate DDT-resistant microbes in media containing 150 mg l⁻¹ DDT at three temperatures (25, 37 and 55°C). In all soils, bacteria were more sensitive to DDT than actinomycetes and fungi. Bacteria isolated at 55°C from any source were the most DDT sensitive. However, DDT-resistant bacterial strains showed more promise in degrading DDT than isolated fungal strains, as 1,1-dichloro-2,2-bis(4-chlorophenyl)ethane (DDD) was a major bacterial transformation product, while fungi tended to produce more DDE. Further studies on selected bacterial isolates found that the most promising bacterial strain (Bacillus sp. BHD-4) could remove 51% of DDT from liquid culture after 7 days of growth. Of the amount transformed, 6% was found as DDD and 3% as DDE, suggesting that further transformation of DDT and its metabolites occurred.

Polarization of forecast densities : a new approach to time series classification

Relevance: 20.00%

Abstract:

Time series classification has been extensively explored in many fields of study. Most methods are based on the historical or current information extracted from data. However, if interest is in a specific future time period, methods that directly relate to forecasts of time series are much more appropriate. An approach to time series classification is proposed based on a polarization measure of forecast densities of time series. By fitting autoregressive models, forecast replicates of each time series are obtained via the bias-corrected bootstrap, and a stationarity correction is considered when necessary. Kernel estimators are then employed to approximate forecast densities, and discrepancies of forecast densities of pairs of time series are estimated by a polarization measure, which evaluates the extent to which two densities overlap. Following the distributional properties of the polarization measure, a discriminant rule and a clustering method are proposed to conduct the supervised and unsupervised classification, respectively. The proposed methodology is applied to both simulated and real data sets, and the results show desirable properties.
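The idea of comparing kernel-estimated forecast densities by how much they overlap can be illustrated with a small sketch. It uses the overlap coefficient (the integral of the pointwise minimum of two densities) as a stand-in and is not the polarization measure defined by the authors; the forecast replicates, the fixed bandwidth and the function names are all hypothetical.

```python
import math


def gaussian_kde(sample, bandwidth):
    """Return a Gaussian kernel density estimate built from `sample`."""
    norm = len(sample) * bandwidth * math.sqrt(2.0 * math.pi)

    def density(x):
        return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2)
                   for s in sample) / norm

    return density


def overlap_coefficient(sample_a, sample_b, bandwidth=0.5, grid_points=2001):
    """Approximate the integral of min(f, g) for the two estimated densities.

    Values near 1 mean the densities almost coincide; values near 0 mean
    they barely overlap.
    """
    f = gaussian_kde(sample_a, bandwidth)
    g = gaussian_kde(sample_b, bandwidth)
    lo = min(min(sample_a), min(sample_b)) - 5.0 * bandwidth
    hi = max(max(sample_a), max(sample_b)) + 5.0 * bandwidth
    step = (hi - lo) / (grid_points - 1)
    # Riemann sum on a regular grid; the densities are ~0 at both ends.
    total = sum(min(f(lo + i * step), g(lo + i * step))
                for i in range(grid_points))
    return total * step


# Hypothetical forecast replicates for three series (e.g. from a bootstrap).
series_a = [0.0, 0.5, 1.0, 1.5, 2.0]
series_b = [0.0, 0.5, 1.0, 1.5, 2.0]
series_c = [100.0, 100.5, 101.0]

same = overlap_coefficient(series_a, series_b)
apart = overlap_coefficient(series_a, series_c)
print(same, apart)
```

A discriminant or clustering rule of the kind the abstract describes would then operate on such pairwise discrepancies; how the authors derive theirs from the distributional properties of the polarization measure is not stated in the abstract.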

A pilot study on affective classification of facial images for emerging news topics

Relevance: 20.00%

Abstract:

The proliferation of news reports published in online websites and news information sharing among social media users necessitates effective techniques for analysing the image, text and video data related to news topics. This paper presents the first study to classify affective facial images on emerging news topics. The proposed system dynamically monitors and selects the current hot (of great interest) news topics with strong affective interestingness using textual keywords in news articles and social media discussions. Images from the selected hot topics are extracted and classified into three categorized emotions, positive, neutral and negative, based on facial expressions of subjects in the images. Performance evaluations on two facial image datasets collected from real-world resources demonstrate the applicability and effectiveness of the proposed system in affective classification of facial images in news reports. Facial expression shows high consistency with the affective textual content in news reports for positive emotion, while only low correlation has been observed for neutral and negative. The system can be directly used for applications, such as assisting editors in choosing photos with a proper affective semantic for a certain topic during news report preparation.

Large scale read classification for next generation sequencing

Relevance: 20.00%

Abstract:

Next Generation Sequencing (NGS) has revolutionised molecular biology, resulting in an explosion of data sets and an increasing role in clinical practice. Such applications necessarily require rapid identification of the organism as a prelude to annotation and further analysis. NGS data consist of a substantial number of short sequence reads, given context through downstream assembly and annotation, a process requiring reads consistent with the assumed species or species group. Highly accurate results have been obtained for restricted sets using SVM classifiers, but such methods are difficult to parallelise and success depends on careful attention to feature selection. This work examines the problem at very large scale, using a mix of synthetic and real data with a view to determining the overall structure of the problem and the effectiveness of parallel ensembles of simpler classifiers (principally random forests) in addressing the challenges of large scale genomics.

The relationship between N2O, NO, and N2 fluxes from fertilized and irrigated dryland soils of the Aral Sea Basin, Uzbekistan

Relevance: 20.00%

Abstract:

Microbial respiratory reduction of nitrous oxide (N2O) to dinitrogen (N2) via denitrification plays a key role within the global N-cycle since it is the most important process for converting reactive nitrogen back into inert molecular N2. However, due to methodological constraints, we still lack a comprehensive, quantitative understanding of denitrification rates and controlling factors across various ecosystems. We investigated N2, N2O and NO emissions from irrigated cotton fields within the Aral Sea Basin using the He/O2 atmosphere gas flow soil core technique and an incubation assay. NH4NO3 fertilizer, equivalent to 75 kg ha⁻¹, and irrigation water, adjusting the water holding capacity to 70, 100 and 130%, were applied to the incubation vessels to assess their influence on gaseous N emissions. Under soil conditions as they are naturally found after concomitant irrigation and fertilization, denitrification was the dominant process and N2 the main end product of denitrification. The mean ratios of N2/N2O emissions increased with increasing soil moisture content. N2 emissions exceeded N2O emissions by a factor of 5 ± 2 at 70% soil water holding capacity (WHC) and a factor of 55 ± 27 at 130% WHC. The mean ratios of N2O/NO emissions varied between 1.5 ± 0.4 (70% WHC) and 644 ± 108 (130% WHC). The magnitude of N2 emissions for irrigated cotton was estimated to be in the range of 24 ± 9 to 175 ± 65 kg-N ha⁻¹ season⁻¹, while emissions of NO were only of minor importance (between 0.1 and 0.7 kg-N ha⁻¹ season⁻¹). The findings demonstrate that for irrigated dryland soils in the Aral Sea Basin, denitrification is a major pathway of N-loss and that substantial amounts of N-fertilizer are lost as N2 to the atmosphere.
