955 results for Large datasets


Relevance: 30.00%

Abstract:

The growth of databases containing ever more difficult images and a larger number of categories is driving the development of image representations that are discriminative when working with multiple classes, and of algorithms that are efficient in learning and classification. This thesis explores the problem of classifying images according to the object they contain when a large number of categories is available. First, we investigate how a hybrid system formed by a generative model and a discriminative model can benefit image classification when the level of human annotation is minimal. For this task we introduce a new vocabulary using a dense representation of color-SIFT descriptors, and then investigate how the different parameters affect the final classification. Next, we propose a method for incorporating spatial information into the hybrid system, showing that context information is of great help for image classification. We then introduce a new shape descriptor that represents the image by its local shape and its spatial shape, together with a kernel that incorporates this spatial information in pyramidal form. Shape is represented by a compact vector, yielding a descriptor well suited for use with kernel-based learning algorithms. The experiments show that this shape information achieves results similar to (and sometimes better than) those of appearance-based descriptors. We also investigate how different features can be combined for image classification, and show that the proposed shape descriptor together with an appearance descriptor substantially improves classification. Finally, we describe an algorithm that automatically detects regions of interest during training and classification. This provides a method for suppressing the image background and adds invariance to the position of objects within the images. We show that using shape and appearance over this region of interest, together with random forest classifiers, improves both classification and computational time. We compare our results with results from the literature using the same databases as the authors, as well as the same training and classification protocols. All the innovations introduced are shown to improve the final classification of the images.

Relevance: 30.00%

Abstract:

The Environmental Data Abstraction Library provides a modular data management library for bringing new and diverse datatypes together for visualisation within numerous software packages, including the ncWMS viewing service, which already has very wide international uptake. The structure of EDAL is presented along with examples of its use to compare satellite, model and in situ data types within the same visualisation framework. We emphasize the value of this capability for cross calibration of datasets and evaluation of model products against observations, including preparation for data assimilation.

Relevance: 30.00%

Abstract:

Recent progress in the technology for single unit recordings has given the neuroscientific community the opportunity to record the spiking activity of large neuronal populations. At the same pace, statistical and mathematical tools were developed to deal with high-dimensional datasets typical of such recordings. A major line of research investigates the functional role of subsets of neurons with significant co-firing behavior: the Hebbian cell assemblies. Here we review three linear methods for the detection of cell assemblies in large neuronal populations that rely on principal and independent component analysis. Based on their performance in spike train simulations, we propose a modified framework that incorporates multiple features of these previous methods. We apply the new framework to actual single unit recordings and show the existence of cell assemblies in the rat hippocampus, which typically oscillate at theta frequencies and couple to different phases of the underlying field rhythm.
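As a rough illustration of the principal-component step mentioned in this abstract, the sketch below z-scores binned spike counts, computes their correlation matrix, and counts eigenvalues above the Marchenko-Pastur upper bound. That bound is one commonly used significance threshold and is an assumption here, not necessarily the authors' exact framework. The function name `count_assemblies` and the simulated data are hypothetical.

```python
import numpy as np

def count_assemblies(spike_counts):
    """Estimate the number of cell assemblies from binned spike counts.

    spike_counts: array of shape (n_neurons, n_bins).
    Returns the number of significant eigenvalues and their eigenvectors.
    """
    n_neurons, n_bins = spike_counts.shape
    # z-score each neuron so the covariance below is a correlation matrix
    mean = spike_counts.mean(axis=1, keepdims=True)
    std = spike_counts.std(axis=1, keepdims=True)
    z = (spike_counts - mean) / std
    corr = z @ z.T / n_bins
    # eigh returns eigenvalues in ascending order for symmetric matrices
    eigvals, eigvecs = np.linalg.eigh(corr)
    # Marchenko-Pastur upper bound for independent, unit-variance signals
    # (assumed threshold; other significance tests are possible)
    lambda_max = (1 + np.sqrt(n_neurons / n_bins)) ** 2
    significant = eigvals > lambda_max
    return int(significant.sum()), eigvecs[:, significant]

# Simulated data (illustrative only): 20 Poisson neurons, 5000 time bins,
# with neurons 0-4 co-activated in about 5% of the bins.
rng = np.random.default_rng(0)
counts = rng.poisson(1.0, size=(20, 5000))
events = rng.random(5000) < 0.05
counts[:5, events] += 3

n_assemblies, patterns = count_assemblies(counts)
```

The last column of `patterns` is the eigenvector of the largest eigenvalue, and its largest weights should fall on the co-activated neurons. An independent component step, as named in the abstract, would typically follow to separate overlapping assemblies; it is not shown here.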

Relevance: 30.00%

Abstract:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevance: 30.00%

Abstract:

Concerns over global change and its effect on coral reef survivorship have highlighted the need for long-term datasets and proxy records to interpret environmental trends and inform policymakers. Citizen science programs have proven to be a valid method for collecting data, reducing financial and time costs for institutions. This study is based on the processing of data collected by recreational divers, and its main purpose is to evaluate changes in the state of coral reef biodiversity in the Red Sea over a long-term period and to validate the volunteer-based monitoring method. Volunteer recreational divers completed a questionnaire after each dive, recording the presence of 72 animal taxa and negative reef conditions. Comparisons were made between records from volunteers and independent records from a marine biologist who performed the same dive at the same time. A total of 500 volunteers were tested in 78 validation trials. Relative values of accuracy, reliability and similarity seem to be comparable to those achieved by volunteer divers on precise transects in other projects, or in community-based terrestrial monitoring. A total of 9301 recreational divers participated in the monitoring program, completing 23,059 survey questionnaires in a 5-year period. The volunteer-sightings-based index showed significant differences between the geographical areas. The area of Hurghada is distinguished by a medium-low biodiversity index and is heavily damaged by uncontrolled anthropogenic exploitation. Coral reefs along the Ras Mohammed National Park at Sharm el Sheikh, conversely, showed a high biodiversity index. The detected pattern seems to be correlated with the conservation measures adopted. In our experience and that of other research institutes, citizen science can complement conventional methods and significantly reduce costs and time. By involving recreational divers, we were able to build a large data set covering a wide geographic area. The main limitation remains the difficulty of obtaining a homogeneous spatial sampling distribution.

Relevance: 30.00%

Abstract:

Deep tissue imaging has become state of the art in biology, but now the problem is to quantify spatial information in a global, organ-wide context. Although access to the raw data is no longer a limitation, the computational tools to extract biologically useful information out of these large data sets are still catching up. In many cases, to understand the mechanism behind a biological process in which molecules or cells interact with each other, it is mandatory to know their mutual positions. We illustrate this principle here with the immune system. Although the general functions of lymph nodes as immune sentinels are well described, many cellular and molecular details governing the interactions of lymphocytes and dendritic cells remain unclear to date and prevent an in-depth mechanistic understanding of the immune system. We imaged ex vivo lymph nodes isolated from both wild-type and transgenic mice lacking key factors for dendritic cell positioning and used software written in MATLAB to determine the spatial distances between the dendritic cells and the internal high endothelial vascular network. This allowed us to quantify the spatial localization of the dendritic cells in the lymph node, which is a critical parameter determining the effectiveness of an adaptive immune response.
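The distance measurement described in this abstract can be sketched as a nearest-neighbour search from each cell centroid to a set of points sampled on the vessel network. The authors used MATLAB; the Python below is only a brute-force illustration, and the function name and coordinates are hypothetical.

```python
import math

def nearest_vessel_distances(cell_centroids, vessel_points):
    """For each cell centroid, return the Euclidean distance to the
    closest point of the vascular network (brute-force search)."""
    return [
        min(math.dist(cell, point) for point in vessel_points)
        for cell in cell_centroids
    ]

# Hypothetical 3D coordinates (e.g. in micrometres), for illustration only
cells = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
vessel = [(3.0, 4.0, 0.0), (10.0, 0.0, 2.0)]

distances = nearest_vessel_distances(cells, vessel)
# distances == [5.0, 2.0]
```

For organ-wide data with many cells and vessel voxels, a spatial index such as a k-d tree would likely be needed in place of the brute-force loop.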

Relevance: 30.00%

Abstract:

Transcriptomics could contribute significantly to the early and specific diagnosis of rejection episodes by defining 'molecular Banff' signatures. Recently, the description of pathogenesis-based transcript sets offered a new opportunity for objective and quantitative diagnosis. Generating high-quality transcript panels is thus critical to define a high-performance diagnostic classifier. In this study, a comparative analysis was performed across four different microarray datasets of heterogeneous sample collections: two published clinical datasets and two of our own datasets, including biopsies for clinical indication and samples from nonhuman primates. We characterized a common transcriptional profile of 70 genes, defined as the acute rejection transcript set (ARTS). ARTS expression is significantly up-regulated in all AR samples as compared with stable allografts or healthy kidneys, and strongly correlates with the severity of Banff AR types. Similarly, ARTS were tested as a classifier in a large collection of 143 independent biopsies recently published by the University of Alberta. Results demonstrate that the 'in silico' approach applied in this study is able to identify a robust and reliable molecular signature for AR, supporting a specific and sensitive molecular diagnostic approach for renal transplant monitoring.

Relevance: 30.00%

Abstract:

A wealth of genetic associations for cardiovascular and metabolic phenotypes in humans has been accumulating over the last decade, in particular a large number of loci derived from recent genome wide association studies (GWAS). True complex disease-associated loci often exert modest effects, so their delineation currently requires integration of diverse phenotypic data from large studies to ensure robust meta-analyses. We have designed a gene-centric 50 K single nucleotide polymorphism (SNP) array to assess potentially relevant loci across a range of cardiovascular, metabolic and inflammatory syndromes. The array utilizes a "cosmopolitan" tagging approach to capture the genetic diversity across approximately 2,000 loci in populations represented in the HapMap and SeattleSNPs projects. The array content is informed by GWAS of vascular and inflammatory disease, expression quantitative trait loci implicated in atherosclerosis, pathway based approaches and comprehensive literature searching. The custom flexibility of the array platform facilitated interrogation of loci at differing stringencies, according to a gene prioritization strategy that allows saturation of high priority loci with a greater density of markers than the existing GWAS tools, particularly in African HapMap samples. We also demonstrate that the IBC array can be used to complement GWAS, increasing coverage in high priority CVD-related loci across all major HapMap populations. DNA from over 200,000 extensively phenotyped individuals will be genotyped with this array with a significant portion of the generated data being released into the academic domain facilitating in silico replication attempts, analyses of rare variants and cross-cohort meta-analyses in diverse populations. These datasets will also facilitate more robust secondary analyses, such as explorations with alternative genetic models, epistasis and gene-environment interactions.

Relevance: 30.00%

Abstract:

Economists and other social scientists often face situations where they have access to two datasets that they can use but one set of data suffers from censoring or truncation. If the censored sample is much bigger than the uncensored sample, it is common for researchers to use the censored sample alone and attempt to deal with the problem of partial observation in some manner. Alternatively, they simply use only the uncensored sample and ignore the censored one so as to avoid biases. It is rarely the case that researchers use both datasets together, mainly because they lack guidance about how to combine them. In this paper, we develop a tractable semiparametric framework for combining the censored and uncensored datasets so that the resulting estimators are consistent, asymptotically normal, and use all information optimally. When the censored sample, which we refer to as the master sample, is much bigger than the uncensored sample (which we call the refreshment sample), the latter can be thought of as providing identification where it is otherwise absent. In contrast, when the refreshment sample is large and could typically be used alone, our methodology can be interpreted as using information from the censored sample to increase efficiency. To illustrate our results in an empirical setting, we show how to estimate the effect of changes in compulsory schooling laws on age at first marriage, a variable that is censored for younger individuals. We also demonstrate how refreshment samples for this application can be created by matching cohort information across census datasets.

Relevance: 30.00%

Abstract:

Species composition, abundance, and biomass of phytoplankton in the surface water layer were determined at 10 stations in the central part of the Western Basin (WB) and at one station in the Eastern Basin (EB) of the Large Aral Sea. 42 algal species were found. Diatoms had the highest number of species. Similarity of phytoplankton composition within the WB was high, whereas phytoplankton composition in the WB and EB differed significantly. In the WB, abundance and biomass of phytoplankton varied from 826x10**3 to 6312x10**3 cells/l (aver. 1877x10**3 cells/l) and from 53 to 241 µg C/l (aver. 95 µg C/l), respectively. In the EB, phytoplankton abundance was 915x10**3 cells/l and biomass was 93 µg C/l. Vertical distribution of phytoplankton in the upper 35 m was investigated at one station in the WB. Maximum values of phytoplankton abundance and biomass were recorded under the thermocline at 20 m depth. Integrated biomass of phytoplankton was 14 g C/m**2.

Relevance: 30.00%

Abstract:

This study characterises the shape of the flow separation zone (FSZ) and wake region over large asymmetric bedforms under tidal flow conditions. High resolution bathymetry, flow velocity and turbulence data were measured along two parallel transects in a tidal channel covered with bedforms. The field data are used to verify the applicability of a numerical model for a systematic study using the Delft3D modelling system and to test the model sensitivity to roughness length. Three experiments are then conducted to investigate how the FSZ size and wake extent vary depending on tidally-varying flow conditions, water levels and bathymetry. During the ebb, a large FSZ occurs over the steep lee side of each bedform. During the flood, no flow separation develops over the bedforms having a flat crest; however, a small FSZ is observed over the steepest part of the crest of some bedforms, where the slope is locally up to 15°. Over a given bedform morphology and constant water levels, no FSZ occurs for velocity magnitudes smaller than 0.1 m s**-1; as the flow accelerates, the FSZ reaches a stable size for velocity magnitudes greater than 0.4 m s**-1. The shape of the FSZ is not influenced by changes in water levels. On the other hand, variations in bed morphology, as recorded from the high-resolution bathymetry collected during the tidal cycle, influence the size and position of the FSZ: a FSZ develops only when the maximum lee side slope over a horizontal distance of 5 m is greater than 10°. The height and length of the wake region are related to the length of the FSZ. The total roughness along the transect lines is an order of magnitude larger during the ebb than during the flood due to flow direction in relation to bedform asymmetry: during the ebb, roughness is created by the large bedforms because a FSZ and wake develop over the steep lee side. The results add to the understanding of hydrodynamics of natural bedforms in a tidal environment and may be used to better parameterise small-scale processes in large-scale studies.
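The slope criterion reported in this abstract (an FSZ develops only when the maximum lee side slope over a horizontal distance of 5 m exceeds 10°) can be checked on a bathymetric profile roughly as follows. This is a minimal sketch that assumes a uniformly spaced profile and flow in the +x direction, so that the lee side is where the bed descends. The function names and the synthetic profile are hypothetical, and the velocity thresholds from the abstract are not included.

```python
import math

def max_lee_slope_deg(x, z, window=5.0):
    """Steepest downward bed slope (degrees) measured over `window` metres.

    x: along-transect positions in metres, uniformly spaced and increasing.
    z: bed elevation in metres at each position.
    Flow is assumed to be in the +x direction (an assumption of this sketch).
    """
    dx = x[1] - x[0]
    step = round(window / dx)  # number of samples spanning the window
    best = 0.0
    for i in range(len(x) - step):
        drop = z[i] - z[i + step]  # positive where the bed descends
        run = x[i + step] - x[i]
        best = max(best, math.degrees(math.atan2(drop, run)))
    return best

def flow_separation_expected(x, z, threshold_deg=10.0):
    """Apply the 10 degree lee-slope criterion quoted in the abstract."""
    return max_lee_slope_deg(x, z) > threshold_deg

def synthetic_bedform(height):
    """Hypothetical profile: 20 m gentle rise, 5 m lee side, then flat bed."""
    x = [float(i) for i in range(31)]
    z = []
    for xi in x:
        if xi <= 20:
            z.append(height * xi / 20)
        elif xi <= 25:
            z.append(height * (25 - xi) / 5)
        else:
            z.append(0.0)
    return x, z

steep_x, steep_z = synthetic_bedform(1.5)    # 1.5 m drop over 5 m
gentle_x, gentle_z = synthetic_bedform(0.5)  # 0.5 m drop over 5 m
```

With these synthetic profiles the steep bedform has a lee slope of about 16.7° and satisfies the criterion, while the gentle one, at about 5.7°, does not.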

Relevance: 30.00%

Abstract:

Marine mammals forage in dynamic environments characterized by variables that are continuously changing in relation to large-scale oceanographic processes. In the present study, behavioural states of satellite-tagged juvenile southern elephant seals (n = 16) from Marion Island were assessed for each reliable location, using variation in turning angle and speed in a state-space modelling framework. A mixed modelling approach was used to analyse the behavioural response of juvenile southern elephant seals to sea-surface temperature and proximity to frontal and bathymetric features. The findings emphasised the importance of frontal features as potentially rewarding areas for foraging juvenile southern elephant seals and provided further evidence of the importance of the area west of Marion Island for higher trophic-level predators. The importance of bathymetric features during the transit phase of juvenile southern elephant seal migrations indicates the use of these features as possible navigational cues.

Relevance: 30.00%

Abstract:

Transmission electron microscopy observations and rock magnetic measurements reveal that alteration of fine- and large-grained iron-titanium oxides can occur at different rates. Fine-grained titanomagnetite occurs as a crystallization product within interstitial glass that originated as an immiscible liquid within a fully differentiated melt; in several samples with ages to 32 Ma it displays very little or no oxidation (z = ca. 0). In contrast, samples with ages of 10 Ma or older are observed to also contain highly oxidized (z ≥ 0.66) large-grained titanomaghemite. These large grains, having originated by direct crystallization from melt, are associated with pore space. Such pore space can serve as a conduit for fluids that promote alteration, whereas fine grains may have been "armored" against alteration by the glass matrix in which they are embedded. Apparently, alteration of oceanic crust is a heterogeneous process on a microscopic scale. The existence of pristine, fine-grained titanomagnetite in the interstitial glass of older ocean-floor basalts that have undergone significant alteration implies that such glassy material is capable of carrying original thermal remanent magnetization and may be suitable for paleointensity determinations.

Relevance: 30.00%

Abstract:

High temporal resolution (three hours) records of temperature, wind speed and sea level pressure recorded at Antarctic research station Neumayer (70°S, 8°W) during 1982-2011 are analysed to identify oscillations from daily to intraseasonal timescales. The diurnal cycle dominates the three-hourly time series of temperature during the Antarctic summer and is almost absent during winter. In contrast, the three-hourly time series of wind speed and sea level pressure show a weak diurnal cycle. The dominant pattern of the intraseasonal variability of these quantities, which captures the out-of-phase variation of temperature and wind speed with sea level pressure, shows enhanced variability at timescales of ~ 40 days and ~ 80 days, respectively. Correlation and composite analysis reveal that these oscillations may be related to tropical intraseasonal oscillations via large-scale eastward propagating atmospheric circulation wave-trains. The second pattern of intraseasonal variability, which captures in-phase variations of temperature, wind and sea level pressure, shows enhanced variability at timescales of ~ 35, ~ 60 and ~ 120 days. These oscillations are attributed to the Southern Annular Mode/Antarctic Oscillation (SAM/AAO) which shows enhanced variability at these timescales. We argue that intraseasonal oscillations of tropical climate and SAM/AAO are related to distinct patterns of climate variables measured at Neumayer.

Relevance: 30.00%

Abstract:

The analysis of time-dependent data is an important problem in many application domains, and interactive visualization of time-series data can help in understanding patterns in large time series data. Many effective approaches already exist for visual analysis of univariate time series, supporting tasks such as assessment of data quality, detection of outliers, or identification of periodically or frequently occurring patterns. However, far fewer approaches exist that support multivariate time series. The existence of multiple values per time stamp makes the analysis task per se harder, and existing visualization techniques often do not scale well. We introduce an approach for visual analysis of large multivariate time-dependent data, based on the idea of projecting multivariate measurements to a 2D display and visualizing the time dimension by trajectories. We use visual data aggregation metaphors based on grouping of similar data elements to scale with multivariate time series. Aggregation procedures can be based either on statistical properties of the data or on data clustering routines. Appropriately defined user controls allow the user to navigate and explore the data and to interactively steer the parameters of the data aggregation to enhance data analysis. We present an implementation of our approach and apply it to a comprehensive data set from the field of earth observation, demonstrating the applicability and usefulness of our approach.