970 results for Datasets


Relevance: 10.00%

Abstract:

This study examines the potential of next-generation sequencing-based ‘genotyping-by-sequencing’ (GBS) of microsatellite loci for rapid and cost-effective genotyping in large-scale population genetic studies. The recovery of individual genotypes from large sequence pools was achieved by PCR-incorporated combinatorial barcoding using universal primers. Three experimental conditions were employed to explore the possibility of using this approach with existing and novel multiplex marker panels and a weighted amplicon mixture. The GBS approach was validated against microsatellite data generated by capillary electrophoresis. GBS allows access to the underlying nucleotide sequences, which can reveal homoplasy even in large datasets, and facilitates cross-laboratory transfer. GBS of microsatellites, using individual combinatorial barcoding, is potentially faster and cheaper than current microsatellite approaches and yields more and better data.

Relevance: 10.00%

Abstract:

We present a new wrapper feature selection algorithm for human detection. This algorithm is a hybrid feature selection approach combining the benefits of filter and wrapper methods. It allows the selection of an optimal feature vector that well represents the shapes of the subjects in the images. In detail, the proposed feature selection algorithm adopts the k-fold subsampling and sequential backward elimination approach, while the standard linear support vector machine (SVM) is used as the classifier for human detection. We apply the proposed algorithm to the publicly accessible INRIA and ETH pedestrian full image datasets with the PASCAL VOC evaluation criteria. Compared to other state-of-the-art algorithms, our feature-selection-based approach can improve the detection speed of the SVM classifier by over 50% with up to 2% better detection accuracy. Our algorithm also outperforms the equivalent systems introduced in the deformable part model approach, with around 9% improvement in detection accuracy.
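
The sequential backward elimination step named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `score` callback is a hypothetical stand-in for whatever validation measure is used (for example, mean k-fold accuracy of a linear SVM), and `toy_score` exists only to make the example runnable.

```python
from typing import Callable, Sequence, Tuple


def backward_elimination(
    features: Sequence[str],
    score: Callable[[Tuple[str, ...]], float],
    min_features: int = 1,
) -> Tuple[str, ...]:
    """Greedy sequential backward elimination over a feature set.

    `score` is a hypothetical callback returning a validation score for a
    feature subset; higher is better.
    """
    current = tuple(features)
    best = score(current)
    while len(current) > min_features:
        # Try removing each remaining feature in turn.
        candidates = [tuple(f for f in current if f != drop) for drop in current]
        cand_scores = [score(c) for c in candidates]
        top = max(range(len(candidates)), key=cand_scores.__getitem__)
        if cand_scores[top] < best:
            break  # every possible removal lowers the score, so stop
        current, best = candidates[top], cand_scores[top]
    return current


def toy_score(subset: Tuple[str, ...]) -> float:
    """Toy scorer (assumption): rewards features 'a' and 'c', penalises size."""
    useful = {"a", "c"}
    return sum(1 for f in subset if f in useful) - 0.1 * len(subset)


selected = backward_elimination(["a", "b", "c", "d"], toy_score)
print(selected)
```

In practice the scorer would train and evaluate the classifier on each fold, which is where most of the cost of a wrapper method lies.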

Relevance: 10.00%

Abstract:

With the rapid development of internet-of-things (IoT), face scrambling has been proposed for privacy protection during IoT-targeted image/video distribution. Consequently in these IoT applications, biometric verification needs to be carried out in the scrambled domain, presenting significant challenges in face recognition. Since face models become chaotic signals after scrambling/encryption, a typical solution is to utilize traditional data-driven face recognition algorithms. While chaotic pattern recognition is still a challenging task, in this paper we propose a new ensemble approach – Many-Kernel Random Discriminant Analysis (MK-RDA) to discover discriminative patterns from chaotic signals. We also incorporate a salience-aware strategy into the proposed ensemble method to handle chaotic facial patterns in the scrambled domain, where random selections of features are made on semantic components via salience modelling. In our experiments, the proposed MK-RDA was tested rigorously on three human face datasets: the ORL face dataset, the PIE face dataset and the PUBFIG wild face dataset. The experimental results successfully demonstrate that the proposed scheme can effectively handle chaotic signals and significantly improve the recognition accuracy, making our method a promising candidate for secure biometric verification in emerging IoT applications.

Relevance: 10.00%

Abstract:

Purpose:
A number of independent gene expression profiling studies have identified transcriptional subtypes in colorectal cancer (CRC) with potential diagnostic utility, culminating in publication of a CRC Consensus Molecular Subtype classification. The worst prognostic subtype has been defined by genes associated with stem-like biology. Recently, it has been shown that the majority of genes associated with this poor prognostic group are stromal-derived. We investigated the potential for tumor misclassification into multiple diagnostic subgroups based on tumoral region sampled.

Experimental Design:
We performed multi-region tissue RNA extraction/transcriptomic analysis using Colorectal Specific Arrays on invasive front, central tumor and lymph node regions selected from tissue samples from 25 CRC patients.

Results:
We identified a consensus 30-gene list which represents the intratumoral heterogeneity within a cohort of primary CRC tumors. Using a series of online datasets, we showed that this gene list displays prognostic potential (HR = 2.914, CI 0.9286-9.162) in stage II/III CRC patients. In addition, we demonstrated that these genes are stromal-derived, challenging the assumption that poor-prognosis tumors with stem-like biology have undergone a widespread epithelial-mesenchymal transition (EMT). Most importantly, we showed that patients can be simultaneously classified into multiple diagnostically relevant subgroups based purely on the tumoral region analysed.

Conclusions:
Gene expression profiles derived from the non-malignant stromal region can influence assignment of CRC transcriptional subtypes, questioning the current molecular classification dogma and highlighting the need to consider pathology sampling region and degree of stromal infiltration when employing transcription-based classifiers to underpin clinical decision-making in CRC.

Relevance: 10.00%

Abstract:

Morphological changes in the retinal vascular network are associated with future risk of many systemic and vascular diseases. However, uncertainty over the presence and nature of some of these associations exists. Analysis of data from large population-based studies will help to resolve these uncertainties. The QUARTZ (QUantitative Analysis of Retinal vessel Topology and siZe) retinal image analysis system allows automated processing of large numbers of retinal images. However, an image quality assessment module is needed to achieve full automation. In this paper, we propose such an algorithm, which uses the segmented vessel map to determine the suitability of retinal images for use in the creation of vessel morphometric data suitable for epidemiological studies. This includes an effective 3-dimensional feature set and support vector machine classification. A random subset of 800 retinal images from UK Biobank (a large prospective study of 500,000 middle-aged adults, of whom 68,151 underwent retinal imaging) was used to examine the performance of the image quality algorithm. The algorithm achieved a sensitivity of 95.33% and a specificity of 91.13% for the detection of inadequate images. The strong performance of this image quality algorithm will make rapid automated analysis of vascular morphometry feasible on the entire UK Biobank dataset (and other large retinal datasets), with minimal operator involvement, and at low cost.
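
For readers unfamiliar with the two reported metrics, the sketch below shows how sensitivity and specificity are computed from confusion-matrix counts. The abstract does not give the underlying counts, so the numbers used here are hypothetical and chosen only to illustrate the arithmetic.

```python
from typing import Tuple


def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> Tuple[float, float]:
    """Return (sensitivity, specificity) from confusion-matrix counts.

    The "positive" class here is an inadequate image, as in the abstract.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative example")
    sensitivity = tp / (tp + fn)  # share of inadequate images that were flagged
    specificity = tn / (tn + fp)  # share of adequate images that were accepted
    return sensitivity, specificity


# Hypothetical counts (not from the study).
sens, spec = sensitivity_specificity(tp=90, fn=10, tn=180, fp=20)
print(f"sensitivity={sens:.2%}, specificity={spec:.2%}")
```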

Relevance: 10.00%

Abstract:

Urothelial cancer (UC) is highly recurrent and can progress from non-invasive (NMIUC) to a more aggressive muscle-invasive (MIUC) subtype that invades the muscle tissue layer of the bladder. We present a proof-of-principle study showing that network-based features of gene pairs can be used to improve classifier performance and the functional analysis of urothelial cancer gene expression data. In the first step of our procedure, each individual sample of a UC gene expression dataset is inflated by gene pair expression ratios that are defined based on a given network structure. In the second step, an elastic net feature selection procedure for network-based signatures is applied to discriminate between NMIUC and MIUC samples. We performed a repeated random subsampling cross-validation in three independent datasets. The network signatures were characterized by a functional enrichment analysis and studied for the enrichment of known cancer genes. We observed that the network-based gene signatures from meta collections of protein-protein interaction (PPI) databases such as CPDB and the PPI databases HPRD and BioGrid improved the classification performance compared to single-gene-based signatures. The network-based signatures that were derived from PPI databases showed a prominent enrichment of cancer genes (e.g., TP53, TRIM27 and HNRNPA2B1). We provide a novel integrative approach for large-scale gene expression analysis for the identification and development of novel diagnostic targets in bladder cancer. Further, our method allowed us to link cancer gene associations to network-based expression signatures that are not observed in gene-based expression signatures.
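
The first step described in this abstract, inflating a sample with gene pair expression ratios taken from a network, might look like the following minimal sketch. The abstract does not specify the exact ratio definition (direction, log scale, handling of zeros), so the plain ratio, the skipping rules, and all gene names and values below are assumptions made for illustration.

```python
from typing import Dict, Iterable, Tuple


def add_pair_ratios(
    sample: Dict[str, float],
    edges: Iterable[Tuple[str, str]],
) -> Dict[str, float]:
    """Return a copy of `sample` extended with one ratio feature per network edge."""
    inflated = dict(sample)
    for gene_a, gene_b in edges:
        # Skip edges with a missing gene or a zero denominator (assumed policy).
        if gene_a not in sample or gene_b not in sample or sample[gene_b] == 0:
            continue
        inflated[f"{gene_a}/{gene_b}"] = sample[gene_a] / sample[gene_b]
    return inflated


# Hypothetical expression values and network edges.
sample = {"GENE_A": 8.0, "GENE_B": 4.0, "GENE_C": 2.0}
edges = [("GENE_A", "GENE_B"), ("GENE_B", "GENE_C"), ("GENE_A", "GENE_MISSING")]
inflated = add_pair_ratios(sample, edges)
print(inflated)
```

The inflated samples would then be passed to the feature selection step; the original single-gene features are kept so that both kinds of signature can be compared.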

Relevance: 10.00%

Abstract:

An evaluation of the global atmospheric energetics is presented in the framework of the basic decomposition into the zonal mean and eddy components, the zonal wavenumber decomposition, and the three-dimensional normal mode decomposition. An extension to the normal mode energetics formulation is also presented in the study, which enables the explicit evaluation of the conversion rate between available potential energy and kinetic energy along with their generation and dissipation rates, in both the zonal wavenumber and vertical mode domains. In addition, an extended energy cycle diagram is proposed, describing the flow of energy among the zonal mean and eddy components, and also among the barotropic and baroclinic components. The energetics is first assessed for three reanalysis datasets and five state-of-the-art climate model simulations representing the present climate conditions. A comparative analysis is performed between the observationally based energetics and that based on the climate model simulations. In order to appraise possible changes in the atmospheric energetics of a future climate scenario relative to that of the present climate conditions, the analysis is extended using the datasets simulated by the same five climate models for a future climate scenario experiment, as defined in the Special Report on Emissions Scenarios (SRES) of the Intergovernmental Panel on Climate Change (IPCC).

Relevance: 10.00%

Abstract:

Information Visualization is gradually emerging to assist the representation and comprehension of large datasets about Higher Education Institutions, making the data more easily understood. The importance of gaining insights and knowledge regarding higher education institutions is little disputed. Within this knowledge, the emerging and pressing area in need of a systematic understanding is the use of communication technologies, an area that is having a transformative impact on educational practices worldwide. This study focused on the need to visually represent a dataset about how Portuguese Public Higher Education Institutions are using Communication Technologies as a support to teaching and learning processes. Project TRACER identified this need, regarding the Portuguese public higher education context, and carried out a national data collection. This study was developed within project TRACER, and worked with the dataset collected in order to conceptualize an information visualization tool, U-TRACER®. The main goals of this study were to conceptualize the information visualization tool U-TRACER®, to represent the data collected by project TRACER, and to understand higher education decision makers' perception of the usefulness of the tool. The goals allowed us to contextualize the phenomenon of information visualization tools regarding higher education data, identifying the existing trends. The research undertaken was of a qualitative nature, and followed the case study method with four moments of data collection. The first moment regarded the conceptualization of the U-TRACER®, with two focus group sessions with Higher Education professionals, with the aim of defining the interaction features the U-TRACER® should offer. The second data collection moment involved the proposal of the graphical displays that would represent the dataset, whose reading effectiveness was tested by end-users.
The third moment involved the development of a usability test of the U-TRACER®, performed by higher education professionals, which resulted in the proposal of improvements to the final prototype of the tool. The fourth moment of data collection involved conducting exploratory, semi-structured interviews with the institutional decision makers regarding their perceived usefulness of the U-TRACER®. We consider that the results of this study contribute towards two moments of reflection. The first relates to the challenges of involving end-users in the conceptualization of an information visualization tool and to the relevance of effective visual displays for an effective communication of the data and information. The second relates to the reflection about how the higher education decision makers, stakeholders of the U-TRACER® tool, perceive the usefulness of the tool, both for communicating their institutions' data and for benchmarking exercises, as well as a support for decision processes. The results also invite reflection on the main concerns about opening up data about higher education institutions in a global market.

Relevance: 10.00%

Abstract:

Biodiversity continues to decline in the face of increasing anthropogenic pressures such as habitat destruction, exploitation, pollution and introduction of alien species. Existing global databases of species’ threat status or population time series are dominated by charismatic species. The collation of datasets with broad taxonomic and biogeographic extents, and that support computation of a range of biodiversity indicators, is necessary to enable better understanding of historical declines and to project – and avert – future declines. We describe and assess a new database of more than 1.6 million samples from 78 countries representing over 28,000 species, collated from existing spatial comparisons of local-scale biodiversity exposed to different intensities and types of anthropogenic pressures, from terrestrial sites around the world. The database contains measurements taken in 208 (of 814) ecoregions, 13 (of 14) biomes, 25 (of 35) biodiversity hotspots and 16 (of 17) megadiverse countries. The database contains more than 1% of the total number of all species described, and more than 1% of the described species within many taxonomic groups – including flowering plants, gymnosperms, birds, mammals, reptiles, amphibians, beetles, lepidopterans and hymenopterans. The dataset, which is still being added to, is therefore already considerably larger and more representative than those used by previous quantitative models of biodiversity trends and responses. The database is being assembled as part of the PREDICTS project (Projecting Responses of Ecological Diversity In Changing Terrestrial Systems – www.predicts.org.uk). We make site-level summary data available alongside this article. The full database will be publicly available in 2015.

Relevance: 10.00%

Abstract:

Doctoral thesis, Philosophy, Lancaster University, 2010

Relevance: 10.00%

Abstract:

Doctoral thesis, Marine, Earth and Environmental Sciences (Marine Sciences-Physical Oceanography), Faculdade de Ciências e Tecnologia, Univ. do Algarve, 2011

Relevance: 10.00%

Abstract:

Doctoral thesis, Informatics (Bioinformatics), Universidade de Lisboa, Faculdade de Ciências, 2014

Relevance: 10.00%

Abstract:

Doctoral thesis, Geophysical and Geoinformation Sciences (Geophysics), Universidade de Lisboa, Faculdade de Ciências, 2014

Relevance: 10.00%

Abstract:

Doctoral thesis, Informatics (Informatics Engineering), Universidade de Lisboa, Faculdade de Ciências, 2015

Relevance: 10.00%

Abstract:

Doctoral thesis, Informatics (Informatics Engineering), Universidade de Lisboa, Faculdade de Ciências, 2015