841 results for Imbalanced datasets


Relevance:

20.00%

Publisher:

Abstract:

Numerous studies show that increasing species richness leads to higher ecosystem productivity. This effect is often attributed to more efficient partitioning of multiple resources in communities with higher numbers of competing species, indicating the role of resource supply and stoichiometry for biodiversity-ecosystem functioning relationships. Here, we merged theory on ecological stoichiometry with a framework of biodiversity-ecosystem functioning to understand how resource use transfers into primary production. We applied a structural equation model to define patterns of diversity-productivity relationships with respect to available resources. Meta-analysis was used to summarize the findings across ecosystem types ranging from aquatic ecosystems to grasslands and forests. As hypothesized, resource supply increased realized productivity and richness, but we found significant differences between ecosystems and study types. Increased richness was associated with increased productivity, although this effect was not seen in experiments. More even communities had lower productivity, indicating that biomass production is often maintained by a few dominant species, and reduced dominance generally reduced ecosystem productivity. This synthesis, which integrates observational and experimental studies in a variety of ecosystems and geographical regions, exposes common patterns and differences in biodiversity-functioning relationships, and increases the mechanistic understanding of changes in ecosystem productivity.

Relevance:

20.00%

Publisher:

Abstract:

Supply Chain Simulation (SCS) is applied to acquire information to support outsourcing decisions, but obtaining enough detail in key parameters can often be a barrier to making well-informed decisions.
One aspect of SCS that has been relatively unexplored is the impact of inaccurate data around delays within the SC. The impact of the magnitude and variability of process cycle time on typical performance indicators in a SC context is studied.
System cycle time, WIP levels and throughput are more sensitive to the magnitude of deterministic deviations in process cycle time than variable deviations. Manufacturing costs are not very sensitive to these deviations.
Future opportunities include investigating the impact of process failure or product defects, including logistics and transportation between SC members and using alternative costing methodologies.

Relevance:

20.00%

Publisher:

Abstract:

Visual recognition is a fundamental research topic in computer vision. This dissertation explores datasets, features, learning, and models used for visual recognition. In order to train visual models and evaluate different recognition algorithms, this dissertation develops an approach to collect object image datasets on web pages using an analysis of text around the image and of image appearance. This method exploits established online knowledge resources (Wikipedia pages for text; Flickr and Caltech data sets for images). The resources provide rich text and object appearance information. This dissertation describes results on two datasets. The first is Berg’s collection of 10 animal categories; on this dataset, we significantly outperform previous approaches. On an additional set of 5 categories, experimental results show the effectiveness of the method. Images are represented as features for visual recognition. This dissertation introduces a text-based image feature and demonstrates that it consistently improves performance on hard object classification problems. The feature is built using an auxiliary dataset of images annotated with tags, downloaded from the Internet. Image tags are noisy. The method obtains the text features of an unannotated image from the tags of its k-nearest neighbors in this auxiliary collection. A visual classifier presented with an object viewed under novel circumstances (say, a new viewing direction) must rely on its visual examples. This text feature may not change, because the auxiliary dataset likely contains a similar picture. While the tags associated with images are noisy, they are more stable when appearance changes. The performance of this feature is tested using PASCAL VOC 2006 and 2007 datasets. This feature performs well; it consistently improves the performance of visual object classifiers, and is particularly effective when the training dataset is small. 
With more and more collected training data, computational cost becomes a bottleneck, especially when training sophisticated classifiers such as kernelized SVMs. This dissertation proposes a fast training algorithm called Stochastic Intersection Kernel Machine (SIKMA). This training method will be useful for many vision problems, as it can produce a kernel classifier that is more accurate than a linear classifier, and can be trained on tens of thousands of examples in two minutes. It processes training examples one by one in a sequence, so memory cost is no longer the bottleneck when processing large-scale datasets. This dissertation applies this approach to train classifiers of Flickr groups with many group training examples. The resulting Flickr group prediction scores can be used to measure similarity between two images. Experimental results on the Corel dataset and a PASCAL VOC dataset show that the learned Flickr features perform better on image matching, retrieval, and classification than conventional visual features. Visual models are usually trained to best separate positive and negative training examples. However, when recognizing a large number of object categories, there may not be enough training examples for most objects, due to the intrinsic long-tailed distribution of objects in the real world. This dissertation proposes an approach that uses comparative object similarity. The key insight is that, given a set of object categories which are similar and a set of categories which are dissimilar, a good object model should respond more strongly to examples from similar categories than to examples from dissimilar categories. This dissertation develops a regularized kernel machine algorithm to use this category-dependent similarity regularization. Experiments on hundreds of categories show that our method can yield significant improvements for categories with few or even no positive examples.
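The abstract above names an intersection kernel machine but does not define the kernel. As a minimal sketch, assuming the standard histogram intersection kernel (the sum of element-wise minima of two non-negative feature histograms), the kernel and the score of an already-trained kernel classifier could look as follows. The function names, the support vectors, and the coefficients are hypothetical, and this is not the SIKMA training procedure itself.

```python
from typing import Sequence


def intersection_kernel(x: Sequence[float], y: Sequence[float]) -> float:
    """Histogram intersection kernel: sum of element-wise minima."""
    if len(x) != len(y):
        raise ValueError("histograms must have the same number of bins")
    return sum(min(a, b) for a, b in zip(x, y))


def decision_value(
    query: Sequence[float],
    support_vectors: Sequence[Sequence[float]],
    coefficients: Sequence[float],
    bias: float = 0.0,
) -> float:
    """Score of a kernel classifier: bias plus the weighted kernel sum.

    The support vectors and coefficients are assumed to come from some
    training procedure that is not shown here.
    """
    return bias + sum(
        c * intersection_kernel(query, sv)
        for c, sv in zip(coefficients, support_vectors)
    )
```

A positive score would typically be read as membership in the positive class; how the coefficients are learned one example at a time is the part the dissertation contributes and is not reproduced here.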

Relevance:

20.00%

Publisher:

Abstract:

A collaboration between dot.rural at the University of Aberdeen and the iSchool at Northumbria University, POWkist is a pilot study exploring potential usages of currently available linked datasets within the cultural heritage domain. Many privately held family history collections (shoebox archives) remain vulnerable unless a sustainable, affordable and accessible model of citizen-archivist digital preservation can be offered. Citizen-historians have used the web as a platform to preserve cultural heritage; however, with no accessible or sustainable model, these digital footprints have been ad hoc and rarely connected to broader historical research. Similarly, current approaches to connecting material on the web by exploiting linked datasets do not take into account the data characteristics of the cultural heritage domain. Funded by Semantic Media, the POWkist project is investigating how best to capture, curate, connect and present the contents of citizen-historians’ shoebox archives in an accessible and sustainable online collection. Using the Curios platform - an open-source digital archive - we have digitised a collection relating to a prisoner of war during WWII (1939-1945). Following a series of user group workshops, POWkist is now connecting these ‘made digital’ items with the broader web using a semantic technology model and identifying appropriate linked datasets of relevant content such as DBPedia (an archived linked dataset of Wikipedia) and Ordnance Survey Open Data. We are analysing the characteristics of cultural heritage linked datasets, so that these materials are better visualised, contextualised and presented in an attractive and comprehensive user interface. Our paper will consider the issues we have identified and the solutions we are developing, and will include a demonstration of our work in progress.

Relevance:

20.00%

Publisher:

Abstract:

Global land cover maps play an important role in the understanding of the Earth's ecosystem dynamics. Several global land cover maps have been produced recently, namely Global Land Cover Share (GLC-Share) and GlobeLand30. These datasets are very useful sources of land cover information, and potential users and producers are often interested in comparing them. However, these global land cover maps are produced with different techniques and different classification schemes, which makes their standardized interoperability a challenge. The Environmental Information and Observation Network (EIONET) Action Group on Land Monitoring in Europe (EAGLE) concept was developed in order to translate the differences in the classification schemes into a standardized format which allows a comparison between class definitions. This is done by elaborating an EAGLE matrix for each classification scheme, where a bar code is assigned to each class definition that composes a certain land cover class. Ahlqvist (2005) developed an overlap metric to cope with semantic uncertainty of geographical concepts, thereby providing a measure of how closely geographical concepts are related to each other. In this paper, the comparison of global land cover datasets is done by translating each land cover legend into the EAGLE bar coding for the Land Cover Components of the EAGLE matrix. The bar coding values assigned to each class definition are transformed into a fuzzy function that is used to compute the overlap metric proposed by Ahlqvist (2005), and overlap matrices between land cover legends are elaborated. The overlap matrices allow the semantic comparison between the classification schemes of each global land cover map. The proposed methodology is tested on a case study where the overlap metric proposed by Ahlqvist (2005) is computed in the comparison of two global land cover maps for Continental Portugal.
The study yielded the spatial distribution of overlap between the two global land cover maps, GlobeLand30 and GLC-Share. These results show that the GlobeLand30 product overlaps with the GLC-Share product to a degree of 77% in Continental Portugal.

Relevance:

20.00%

Publisher:

Abstract:

Inter-subject parcellation of functional Magnetic Resonance Imaging (fMRI) data based on a standard General Linear Model (GLM) and spectral clustering was recently proposed as a means to alleviate the issues associated with spatial normalization in fMRI. However, for all its appeal, a GLM-based parcellation approach introduces its own biases, in the form of a priori knowledge about the shape of the Hemodynamic Response Function (HRF) and task-related signal changes, or about the subject's behaviour during the task. In this paper, we introduce a data-driven version of the spectral clustering parcellation, based on Independent Component Analysis (ICA) and Partial Least Squares (PLS) instead of the GLM. First, a number of independent components are automatically selected. Seed voxels are then obtained from the associated ICA maps, and we compute the PLS latent variables between the fMRI signal of the seed voxels (which covers regional variations of the HRF) and the principal components of the signal across all voxels. Finally, we parcellate all subjects' data with a spectral clustering of the PLS latent variables. We present results of the application of the proposed method on both single-subject and multi-subject fMRI datasets. Preliminary experimental results, evaluated with intra-parcel variance of GLM t-values and PLS-derived t-values, indicate that this data-driven approach offers an improvement in parcellation accuracy over GLM-based techniques.

Relevance:

20.00%

Publisher:

Abstract:

Forecasting abrupt variations in wind power generation (the so-called ramps) helps achieve large-scale wind power integration. One of the main issues to be confronted when addressing wind power ramp forecasting is the way in which relevant information is identified from large datasets to optimally feed forecasting models. To this end, an innovative methodology oriented to systematically relate multivariate datasets to ramp events is presented. The methodology comprises two stages: the identification of relevant features in the data and the assessment of the dependence between these features and ramp occurrence. As a test case, the proposed methodology was employed to explore the relationships between atmospheric dynamics at the global/synoptic scales and ramp events experienced in two wind farms located in Spain. The results suggested different degrees of connection between these atmospheric scales and ramp occurrence. For one of the wind farms, it was found that ramp events could be partly explained by regional circulations and zonal pressure gradients. To perform a comprehensive analysis of the underlying causes of ramps, the proposed methodology could be applied to datasets related to other stages of the wind-to-power conversion chain.

Relevance:

20.00%

Publisher:

Abstract:

The exploitation of hydrocarbon reservoirs by the oil and gas industries represents one of the most relevant and concerning anthropic stressors in various marine areas worldwide, and the presence of extractive structures can have severe consequences on the marine environment. Environmental monitoring surveys are carried out to monitor the effects and impacts of offshore energy facilities. Macrobenthic communities inhabiting the soft bottom represent a key component of these surveys, given their great responsiveness to natural and anthropic changes. A comprehensive collection of monitoring data from four Italian seas was used to investigate distributional patterns of macrozoobenthos assemblages, confirming a high spatial variability in relation to the environmental variables analyzed. Since these datasets could represent a powerful tool for industrial and scientific research, the steps and standardized procedures needed to obtain robust and comparable high-quality data were investigated and outlined. Over recent years, decommissioning of old platforms has become a growing topic in this sector, involving many actors in the various decision-making processes. A Multi-Criteria Decision Analysis, specific to the Adriatic Sea, was developed to investigate the impacts of decommissioning a gas platform on environmental and socio-economic aspects, in order to select the best decommissioning scenario. Of the scenarios studied, the one with the greatest impact proved to be total removal, which affected all the faunal components considered in the study. Currently, the European nations are increasing the production of energy from offshore wind farms with an exponential expansion. A comparative study of the methodologies used in five North Sea countries was carried out to investigate the best approaches to monitor the effects of wind farms on the benthic communities.
In the foreseeable future, collaboration between industry, scientific communities, and national and international policies is needed to gain knowledge concerning the effects of these industrial activities on the ecological status of the ecosystems.

Relevance:

20.00%

Publisher:

Abstract:

The aim of this thesis is the study of the normal phase of a mass-imbalanced and polarized ultra-cold Fermi gas in the context of the BCS-BEC crossover, using a diagrammatic approach known as the t-matrix approximation. More specifically, the calculations are implemented using the fully self-consistent t-matrix (or Luttinger-Ward) approach, which has already been experimentally and numerically validated for the balanced case. An imbalance (polarization) between the two spin populations works against pairing and superfluidity. For sufficiently large polarization (and not too strong attraction) the system remains in the normal phase even at zero temperature. This phase is expected to be well described by Landau’s Fermi liquid theory. By reducing the spin polarization, a critical imbalance is reached where a quantum phase transition towards a superfluid phase occurs and the Fermi liquid description breaks down. Depending on the strength of the interaction, the exotic superfluid phase at the quantum critical point (QCP) can be either a FFLO phase (Fulde-Ferrell-Larkin-Ovchinnikov) or a Sarma phase. In this regard, the presence of mass imbalance can strongly influence the nature of the QCP, by favouring one of these two exotic types of pairing over the other, depending on whether the majority of the two species is heavier or lighter than the minority. The analysis of the system is made by focusing on the temperature-coupling-polarization phase diagram for different mass ratios of the two components and on the study of different thermodynamic quantities at finite temperature. The evolution towards a non-Fermi liquid behavior at the QCP is investigated by calculating the fermionic quasi-particle residues, the effective masses and the self-energies at zero temperature.

Relevance:

10.00%

Publisher:

Abstract:

Diabetic Retinopathy (DR) is a complication of diabetes that can lead to blindness if not readily discovered. Automated screening algorithms have the potential to improve identification of patients who need further medical attention. However, the identification of lesions must be accurate to be useful for clinical application. The bag-of-visual-words (BoVW) algorithm employs a maximum-margin classifier in a flexible framework that is able to detect the most common DR-related lesions such as microaneurysms, cotton-wool spots and hard exudates. BoVW makes it possible to bypass the need for pre- and post-processing of the retinographic images, as well as the need for specific ad hoc techniques for identification of each type of lesion. An extensive evaluation of the BoVW model, using three large retinograph datasets (DR1, DR2 and Messidor) with different resolutions and collected by different healthcare personnel, was performed. The results demonstrate that the BoVW classification approach can identify different lesions within an image without having to utilize different algorithms for each lesion, reducing processing time and providing a more flexible diagnostic system. Our BoVW scheme is based on sparse low-level feature detection with a Speeded-Up Robust Features (SURF) local descriptor, and mid-level features based on semi-soft coding with max pooling. The best BoVW representation for retinal image classification achieved an area under the receiver operating characteristic curve (AUC-ROC) of 97.8% (exudates) and 93.5% (red lesions), applying a cross-dataset validation protocol. To assess the accuracy for detecting cases that require referral within one year, the sparse extraction technique associated with semi-soft coding and max pooling obtained an AUC of 94.2 ± 2.0%, outperforming current methods.
These results indicate that, for retinal image classification tasks in clinical practice, BoVW equals and, in some instances, surpasses results obtained using dense detection (widely believed to be the best choice in many vision problems) for the low-level descriptors.
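The abstract above reports its results as AUC-ROC values. As an illustrative sketch only, the AUC for a set of classifier scores and binary labels can be computed as the fraction of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as one half. The function name and the toy inputs are hypothetical and are not taken from the study.

```python
from typing import Sequence


def auc_roc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a positive outranks a negative.

    Labels are 1 for positive (e.g. lesion present) and 0 for negative.
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

This pairwise form is quadratic in the number of cases, which is acceptable for a small illustration; a rank-based computation would be the usual choice for large datasets.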

Relevance:

10.00%

Publisher:

Abstract:

High-throughput screening of physical, genetic and chemical-genetic interactions brings important perspectives in the Systems Biology field, as the analysis of these interactions provides new insights into protein/gene function, cellular metabolic variations and the validation of therapeutic targets and drug design. However, such analysis depends on a pipeline connecting different tools that can automatically integrate data from diverse sources and result in a more comprehensive dataset that can be properly interpreted. We describe here the Integrated Interactome System (IIS), an integrative platform with a web-based interface for the annotation, analysis and visualization of the interaction profiles of proteins/genes, metabolites and drugs of interest. IIS works in four connected modules: (i) Submission module, which receives raw data derived from Sanger sequencing (e.g. two-hybrid system); (ii) Search module, which enables the user to search for the processed reads to be assembled into contigs/singlets, or for lists of proteins/genes, metabolites and drugs of interest, and add them to the project; (iii) Annotation module, which assigns annotations from several databases for the contigs/singlets or lists of proteins/genes, generating tables with automatic annotation that can be manually curated; and (iv) Interactome module, which maps the contigs/singlets or the uploaded lists to entries in our integrated database, building networks that gather novel identified interactions, protein and metabolite expression/concentration levels, subcellular localization and computed topological metrics, GO biological processes and KEGG pathways enrichment. This module generates an XGMML file that can be imported into Cytoscape or be visualized directly on the web. We have developed IIS by integrating diverse databases, in response to the need for appropriate tools for a systematic analysis of physical, genetic and chemical-genetic interactions.
IIS was validated with yeast two-hybrid, proteomics and metabolomics datasets, but it is also extendable to other datasets. IIS is freely available online at: http://www.lge.ibi.unicamp.br/lnbio/IIS/.

Relevance:

10.00%

Publisher:

Abstract:

The aim of this study is to test the feasibility and reproducibility of diffusion-weighted magnetic resonance imaging (DW-MRI) evaluations of the fetal brains in cases of twin-twin transfusion syndrome (TTTS). From May 2011 to June 2012, 24 patients with severe TTTS underwent MRI scans for evaluation of the fetal brains. Datasets were analyzed offline on axial DW images and apparent diffusion coefficient (ADC) maps by two radiologists. The subjective evaluation was described as the absence or presence of water diffusion restriction. The objective evaluation was performed by the placement of 20-mm² circular regions of interest on the DW image and ADC maps. Subjective interobserver agreement was assessed by the kappa correlation coefficient. Objective intraobserver and interobserver agreements were assessed by proportionate Bland-Altman tests. Seventy-four DW-MRI scans were performed. Sixty of them (81.1%) were considered to be of good quality. Agreement between the radiologists was 100% for the absence or presence of diffusion restriction of water. For both intraobserver and interobserver agreement of ADC measurements, proportionate Bland-Altman tests showed average percentage differences of less than 1.5% and 95% CI of less than 18% for all sites evaluated. Our data demonstrate that DW-MRI evaluation of the fetal brain in TTTS is feasible and reproducible.
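The two agreement measures named in the abstract above can be sketched as follows. This is a hedged illustration, not the study's analysis: it assumes Cohen's kappa for two raters, and it assumes that a proportionate Bland-Altman analysis expresses each paired difference as a percentage of the pair mean, with limits of agreement at the mean plus or minus 1.96 standard deviations. Function names and inputs are hypothetical.

```python
from collections import Counter
from statistics import mean, stdev
from typing import Hashable, Sequence, Tuple


def cohen_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Cohen's kappa: chance-corrected agreement between two raters."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Agreement expected by chance from each rater's category frequencies.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (observed - expected) / (1.0 - expected)


def percent_bland_altman(
    first: Sequence[float], second: Sequence[float]
) -> Tuple[float, float, float]:
    """Mean percentage difference and assumed 95% limits of agreement."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need at least two paired measurements")
    # Each difference is expressed as a percentage of the pair mean.
    diffs = [100.0 * (a - b) / ((a + b) / 2.0) for a, b in zip(first, second)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread
```

Note that kappa is undefined when both raters assign every case to the same single category, which is why the sketch raises an error in that situation rather than returning a value.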

Relevance:

10.00%

Publisher:

Abstract:

Sickle cell disease (SCD) pathogenesis leads to recurrent vaso-occlusive and hemolytic processes, causing numerous clinical complications including renal damage. As vasoconstrictive mechanisms may be enhanced in SCD, due to endothelial dysfunction and vasoactive protein production, we aimed to determine whether the expression of proteins of the renin-angiotensin system (RAS) may be altered in an animal model of SCD. Plasma angiotensin II (Ang II) was measured in C57BL/6 (WT) mice and mice with SCD by ELISA, while quantitative PCR was used to compare the expression of the genes encoding the angiotensin II receptors 1 and 2 (AT1R and AT2R) and the angiotensin-converting enzymes (ACE1 and ACE2) in the kidneys, hearts, livers and brains of mice. The effects of hydroxyurea (HU; 50-75 mg/kg/day, 4 weeks) treatment on these parameters were also determined. Plasma Ang II was significantly diminished in SCD mice, compared with WT mice, in association with decreased AT1R and ACE1 expression in the kidneys of SCD mice. Treatment of SCD mice with HU reduced leukocyte and platelet counts and increased plasma Ang II to levels similar to those of WT mice. HU also increased AT1R and ACE2 gene expression in the kidney and heart. Results indicate an imbalanced RAS in an SCD mouse model; HU therapy may be able to restore some RAS parameters in these mice. Further investigations regarding Ang II production and the RAS in human SCD may be warranted, as such changes may reflect or contribute to renal damage and alterations in blood pressure.

Relevance:

10.00%

Publisher:

Abstract:

By comparing the SEED and Pfam functional profiles of metagenomes of two Brazilian coral species with 29 publicly available datasets, we were able to identify some functions, such as protein secretion systems, that are overrepresented in the metagenomes of corals and may play a role in the establishment and maintenance of bacteria-coral associations. However, only a small percentage of the reads of these metagenomes could be annotated by these reference databases, which may lead to a strong bias in the comparative studies. For this reason, we searched for identical sequences (99% nucleotide identity) among these metagenomes in order to perform a reference-independent comparative analysis, and we were able to identify groups of microbial communities that may be under similar selective pressures. The identification of sequences shared among the metagenomes was found to be even better for the identification of groups of communities with similar niche requirements than the traditional analysis of functional profiles. This approach is not only helpful for the investigation of similarities between microbial communities with a high proportion of unknown reads, but also enables an indirect overview of gene exchange between communities.