913 results for the query "alta risoluzione Trentino Alto Adige data-set climatologia temperatura giornaliera orografia complessa"


Relevance: 100.00%

Abstract:

In data mining, an important goal is to generate an abstraction of the data. Such an abstraction helps in reducing the space and search-time requirements of the overall decision-making process. Further, it is important that the abstraction be generated from the data with a small number of disk scans. We propose a novel data structure, the pattern count tree (PC-tree), that can be built by scanning the database only once. The PC-tree is a minimal-size complete representation of the data, and it can be used to represent dynamic databases with the help of knowledge that is either static or changing. We show that further compactness can be achieved by constructing the PC-tree on segmented patterns. We exploit the flexibility offered by rough sets to realize a rough PC-tree and use it for efficient and effective rough classification. To be consistent with the sizes of the branches of the PC-tree, we use upper and lower approximations of feature sets in a manner different from conventional rough set theory. We conducted experiments using the proposed classification scheme on a large-scale handwritten-digit data set and use the results to establish the efficacy of the proposed approach.
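The single-scan construction can be pictured as a prefix trie whose nodes count the patterns passing through them, so identical prefixes share branches. The following is a minimal sketch of that idea under this assumption; `PCTreeNode` and `build_pc_tree` are illustrative names, not the authors' code, and the segmented-pattern and rough-set extensions are omitted.

```python
from collections import defaultdict

class PCTreeNode:
    """Node of a pattern count tree (sketch): a prefix trie whose
    nodes count how many database patterns pass through them."""
    __slots__ = ("count", "children")
    def __init__(self):
        self.count = 0
        self.children = defaultdict(PCTreeNode)

def build_pc_tree(database):
    """Single scan over the database: each pattern (a sequence of
    items or feature values) is inserted once, so identical prefixes
    share nodes and the tree stays compact."""
    root = PCTreeNode()
    for pattern in database:          # one disk scan
        node = root
        for item in pattern:
            node = node.children[item]
            node.count += 1           # branch size = number of supporting patterns
    return root

# Example: binary feature patterns (e.g., pixel features of digit images)
db = [(1, 0, 1), (1, 0, 0), (1, 0, 1), (0, 1, 1)]
root = build_pc_tree(db)
print(root.children[1].children[0].children[1].count)  # -> 2
```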

Relevance: 100.00%

Abstract:

Two new statistics, Δχ² and Δχ, based on extreme value theory, were derived by Gupta et al. We use these statistics to study direction dependence in the HST Key Project data, which provides one of the most precise measurements of the Hubble constant. We also study non-Gaussianity in this data set using the same statistics. Our results for Δχ² show that the significance of direction-dependent systematics is restricted to well below the 1σ confidence limit; the presence of non-Gaussian features, however, is subtle. The Δχ statistic, which is more sensitive to direction dependence, shows direction-dependent systematics at a slightly higher confidence level, and non-Gaussian features at a level similar to that of the Δχ² statistic.
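The extreme-value logic can be illustrated with a generic Monte Carlo test: compute a χ² in each direction bin, take the maximum, and compare it with the distribution of maxima under isotropy. This sketch only illustrates that logic; it is not the Δχ² or Δχ estimator of Gupta et al., and the one-degree-of-freedom null per bin is an assumption.

```python
import numpy as np

def max_chi2_significance(chi2_per_direction, n_dir, n_sim=10000, rng=None):
    """Monte Carlo p-value of the maximum chi-square over direction bins,
    under an isotropic null with one degree of freedom per bin (assumed)."""
    rng = rng or np.random.default_rng(0)
    observed_max = np.max(chi2_per_direction)
    sims = rng.chisquare(df=1, size=(n_sim, n_dir)).max(axis=1)  # null maxima
    p_value = np.mean(sims >= observed_max)
    return observed_max, p_value

# Example: 24 direction bins, one mildly deviant direction
obs = np.r_[np.random.default_rng(1).chisquare(1, 23), 6.0]
print(max_chi2_significance(obs, n_dir=24))
```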

Relevance: 100.00%

Abstract:

In most taxa, species boundaries are inferred from differences in morphology or DNA sequences revealed by taxonomic or phylogenetic analyses. In crickets, acoustic mating signals or calling songs have species-specific structures and provide a third data set from which to infer species boundaries. We examined the concordance of species boundaries obtained using acoustic, morphological, and molecular data sets in the field cricket genus Itaropsis. The genus currently comprises a single valid species, Itaropsis tenella, with a broad distribution in western peninsular India and Sri Lanka. Calling songs of males sampled from four sites in peninsular India exhibited significant differences in a number of call features, suggesting the existence of multiple species. Cluster analysis of the acoustic data, molecular phylogenetic analyses, and phylogenetic analyses combining all data sets suggested the existence of three clades. Despite the differences in calling songs, full congruence was not obtained among the data sets, although the resultant lineages were largely concordant with the acoustic clusters. The genus Itaropsis could thus be represented by three morphologically cryptic incipient species in peninsular India; their distributions are congruent with the usual patterns of endemism in the Western Ghats, India. Song evolution is analysed through the divergence in syllable period, syllable and call duration, and dominant frequency.
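A cluster analysis of call features like the one mentioned above can be sketched as follows. The feature set mirrors the call traits named in the abstract, but the random stand-in data, the Ward linkage, and the three-cluster cut are illustrative assumptions, not the authors' pipeline.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical call-feature matrix: rows = recorded males, columns =
# (syllable period, syllable duration, call duration, dominant frequency).
X = np.random.default_rng(1).normal(size=(40, 4))   # stand-in for measurements
Xz = (X - X.mean(axis=0)) / X.std(axis=0)           # z-score so no feature dominates

Z = linkage(Xz, method="ward")                      # hierarchical clustering of calls
clusters = fcluster(Z, t=3, criterion="maxclust")   # cut into three acoustic groups
print(np.bincount(clusters)[1:])                    # group sizes
```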

Relevance: 100.00%

Abstract:

In this study, we applied the integration methodology developed in the companion paper by Aires (2014) to real satellite observations over the Mississippi Basin. The methodology provides basin-scale estimates of the four water budget components (precipitation P, evapotranspiration E, water storage change ΔS, and runoff R) in a two-step process: a Simple Weighting (SW) integration and a Postprocessing Filtering (PF) that imposes water budget closure. A comparison with in situ observations of P and E demonstrated that PF improves the estimation of both components. A Closure Correction Model (CCM) was derived from the integrated product (SW+PF); it corrects each observation data set independently, unlike the SW+PF method, which requires simultaneous estimates of all four components. The CCM standardizes the various data sets for each component and greatly reduces the budget residual (P − E − ΔS − R). As a direct application, the CCM was combined with the water budget equation to reconstruct missing values in any component. A Monte Carlo experiment with synthetic gaps demonstrated the good performance of the method, except for the runoff data, whose variability is of the same order of magnitude as the budget residual. Similarly, we propose a reconstruction of ΔS between 1990 and 2002, where no Gravity Recovery and Climate Experiment data are available. Unlike most studies dealing with water budget closure at the basin scale, this one uses only satellite observations and in situ runoff measurements. Consequently, the integrated data sets are model independent and can be used for model calibration or validation.
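A minimal sketch of the closure idea: given merged estimates of the four components and assumed uncertainties, distribute the budget residual among the components in proportion to their error variances so that P − E − ΔS − R = 0 holds exactly. The function below is a generic constrained least-squares filter in this spirit, not the paper's exact PF or CCM; the sigma values and monthly numbers are made up for illustration.

```python
import numpy as np

def closure_filter(x, sigma):
    """Adjust [P, E, dS, R] estimates to satisfy P - E - dS - R = 0.

    x     : merged (simple-weighted) component estimates
    sigma : assumed 1-sigma uncertainty of each component
    Returns the closest estimates (in the Mahalanobis sense)
    that close the budget exactly."""
    G = np.array([[1.0, -1.0, -1.0, -1.0]])   # closure constraint G x = 0
    S = np.diag(np.asarray(sigma, float) ** 2)
    K = S @ G.T @ np.linalg.inv(G @ S @ G.T)  # gain spreading the residual
    residual = G @ x                          # budget imbalance
    return x - (K @ residual).ravel()

# Example with made-up monthly values (mm/month): residual is -5 before filtering
x = np.array([90.0, 55.0, 10.0, 30.0])        # P, E, dS, R
x_pf = closure_filter(x, sigma=[8.0, 6.0, 4.0, 2.0])
print(x_pf, x_pf[0] - x_pf[1] - x_pf[2] - x_pf[3])  # closes to ~0
```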

Relevance: 100.00%

Abstract:

This study presents a comprehensive evaluation of five widely used multisatellite precipitation estimates (MPEs) against a 1° × 1° gridded rain gauge data set as ground truth over India. One decade of observations is used to assess the performance of the various MPEs (the Climate Prediction Center (CPC)-South Asia data set, the CPC Morphing Technique (CMORPH), Precipitation Estimation From Remotely Sensed Information Using Artificial Neural Networks, the Tropical Rainfall Measuring Mission's Multisatellite Precipitation Analysis (TMPA-3B42), and the Global Precipitation Climatology Project). All MPEs have high rain-detection skill, with a large probability of detection (POD) and small "missing" values. However, detection sensitivity differs from one product (and one region) to another. While CMORPH has the lowest sensitivity for detecting rain, CPC shows the highest and often overdetects rain, as evidenced by a large POD, a large false alarm ratio, and small missing values. All MPEs show higher rain sensitivity over eastern India than over western India. These differential sensitivities are found to alter the biases in rain amount differently. All MPEs show similar spatial patterns of seasonal rain bias and root-mean-square error, but their spatial variability across India is complex and pronounced. The MPEs overestimate rainfall over the dry regions (northwest and southeast India) and severely underestimate it over mountainous regions (the west coast and northeast India), whereas the bias is relatively small over the core monsoon zone. The higher occurrence of virga rain due to subcloud evaporation, and the possible failure of gauges to capture small-scale convective events over the dry regions, are the main reasons for the observed overestimation of rain by MPEs. The decomposed components of the total bias show that the major part of the overestimation is due to false precipitation. The severe underestimation of rain along the west coast is attributed to the predominant occurrence of shallow rain and the underestimation of moderate to heavy rain by MPEs. The decomposed components suggest that missed precipitation and hit bias are the leading error sources for the total bias along the west coast. All evaluation metrics are found to be nearly equal in the two contrasting monsoon seasons (southwest and northeast), indicating that the performance of the MPEs does not change with season, at least over southeast India. Among the various MPEs, TMPA performs best, reproducing most of the spatial variability exhibited by the reference.
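The detection metrics above come from a standard rain/no-rain contingency table. A small sketch follows, assuming a 0.1 mm/day rain threshold (the study's actual threshold is not stated here):

```python
import numpy as np

def detection_skill(sat_rain, gauge_rain, thresh=0.1):
    """Contingency-table skill of a satellite product against gauges.
    thresh (mm/day) separates rain from no-rain; 0.1 is an assumption."""
    hits   = np.sum((sat_rain >= thresh) & (gauge_rain >= thresh))
    misses = np.sum((sat_rain <  thresh) & (gauge_rain >= thresh))
    falses = np.sum((sat_rain >= thresh) & (gauge_rain <  thresh))
    pod = hits / (hits + misses)          # probability of detection
    far = falses / (hits + falses)        # false alarm ratio
    missing = misses / (hits + misses)    # "missing" fraction
    return pod, far, missing

# Example with made-up daily values at co-located grid points (mm/day)
sat   = np.array([0.0, 2.5, 0.4, 7.0, 0.0, 1.2])
gauge = np.array([0.0, 3.0, 0.0, 5.5, 0.8, 1.0])
print(detection_skill(sat, gauge))
```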

Relevance: 100.00%

Abstract:

DNA microarrays provide such a huge amount of data that unsupervised methods are required to reduce the dimension of the data set and to extract meaningful biological information. This work shows that Independent Component Analysis (ICA) is a promising approach for the analysis of genome-wide transcriptomic data. The paper first presents an overview of the most popular algorithms to perform ICA. These algorithms are then applied on a microarray breast-cancer data set. Some issues about the application of ICA and the evaluation of biological relevance of the results are discussed. This study indicates that ICA significantly outperforms Principal Component Analysis (PCA).
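A minimal sketch of the comparison, using scikit-learn's FastICA and PCA on a synthetic stand-in for a genes-by-samples expression matrix (the breast-cancer data set and the specific ICA algorithms surveyed in the paper are not reproduced here):

```python
import numpy as np
from sklearn.decomposition import FastICA, PCA

# Synthetic stand-in for a microarray matrix (genes x samples);
# heavy-tailed values loosely mimic expression data.
rng = np.random.default_rng(0)
X = rng.laplace(size=(500, 40))

# ICA: statistically independent expression modes
ica = FastICA(n_components=5, max_iter=1000, random_state=0)
S = ica.fit_transform(X)        # independent components (genes x modes)

# PCA baseline: orthogonal, variance-ranked components
pca = PCA(n_components=5)
P = pca.fit_transform(X)
print(S.shape, P.shape)
```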

Relevance: 100.00%

Abstract:

This issue of the journal comprises papers presented at Studia Hispanica Medievalia VIII: Actas de las IX Jornadas Internacionales de Literatura Española Medieval, 2008, and at the homage for the five-hundredth anniversary of Amadís de Gaula.

Relevance: 100.00%

Abstract:

In April 2005, a SHOALS 1000T LIDAR system was used as an efficient alternative for safely acquiring data to describe the existing conditions of nearshore bathymetry and the intertidal zone over an approximately 40.7 km² (11.8 nm²) portion of hazardous coastline within the Olympic Coast National Marine Sanctuary (OCNMS). Data were logged from 1,593 km (860 nm) of track lines in just over 21 hours of flight time. Several islands and offshore rocks were also surveyed, and over 24,000 geo-referenced digital still photos were captured to assist with data cleaning and QA/QC. The 1 kHz bathymetry laser obtained a maximum water depth of 22.2 meters. Floating kelp beds, breaking surf lines and turbid water were all challenges to the survey. Although sea state was favorable for this time of year, recent heavy rainfall and a persistent low-lying layer of fog reduced acquisition productivity. The existence of a completed VDatum model covering this geographic region permitted the LIDAR data to be vertically transformed, merged with existing shallow-water multibeam data, and referenced to the mean lower low water (MLLW) tidal datum. Analysis of a multibeam bathymetry-LIDAR difference surface containing over 44,000 samples indicated surface deviations from −24.3 to 8.48 meters, with a mean difference of −0.967 meters and a standard deviation of 1.762 meters. Errors in data cleaning and false detections due to interference from surf, kelp, and turbidity likely account for the larger surface separations, while the remaining general difference trend can partially be attributed to the denser multibeam data set and to the shoal-biased cleaning, binning and gridding applied to it to maintain conservative least depths, which are important for charting dangers to navigation.
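The difference-surface analysis amounts to simple statistics over co-registered grid nodes; a minimal sketch, assuming NaNs mark nodes missing from either survey:

```python
import numpy as np

def difference_surface_stats(multibeam_z, lidar_z):
    """Min/max/mean/std of a multibeam-minus-LIDAR difference surface
    on co-registered grid nodes (a generic sketch of the comparison
    described above, not the survey's processing chain)."""
    d = multibeam_z - lidar_z
    d = d[np.isfinite(d)]            # drop nodes missing in either survey
    return d.min(), d.max(), d.mean(), d.std()

# Example with made-up depth grids (meters, negative down)
mb = np.array([[-10.2, -12.1], [-8.4, np.nan]])
li = np.array([[-10.0, -12.6], [-8.1, -7.9]])
print(difference_surface_stats(mb, li))
```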

Relevance: 100.00%

Abstract:

The connections between convexity and submodularity are explored, for purposes of minimizing and learning submodular set functions.

First, we develop a novel method for minimizing a particular class of submodular functions, which can be expressed as a sum of concave functions composed with modular functions. The basic algorithm uses an accelerated first-order method applied to a smoothed version of its convex extension. The smoothing algorithm is particularly novel in that it allows us to treat general concave potentials without constructing a piecewise-linear approximation, as graph-based techniques must.
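The convex extension in question is the Lovász extension, which can be evaluated by sorting the coordinates and accumulating marginal gains. A minimal sketch for the concave-of-modular class named above (the accelerated smoothed solver itself is not reproduced):

```python
import numpy as np

def lovasz_extension(F, x):
    """Evaluate the Lovász (convex) extension of a set function F at
    x in [0,1]^n: sort coordinates in decreasing order and sum the
    marginal gains F(S_i) - F(S_{i-1}) weighted by the sorted values."""
    order = np.argsort(-np.asarray(x))       # decreasing order of x
    S = set()
    val, prev = 0.0, F(frozenset())
    for i in order:
        S.add(i)
        cur = F(frozenset(S))
        val += x[i] * (cur - prev)            # marginal gain of adding i
        prev = cur
    return val

# Example: a concave function composed with a modular function
# (sqrt is concave, so F is submodular)
w = np.array([1.0, 2.0, 0.5])
F = lambda S: np.sqrt(sum(w[i] for i in S))
print(lovasz_extension(F, np.array([0.2, 0.9, 0.5])))
```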

Second, we derive the general conditions under which it is possible to find a minimizer of a submodular function via a convex problem. This provides a framework for developing submodular minimization algorithms. The framework is then used to develop several algorithms that can be run in a distributed fashion. This is particularly useful for applications where the submodular objective function consists of a sum of many terms, each term dependent on a small part of a large data set.

Lastly, we approach the problem of learning set functions from an unorthodox perspective---sparse reconstruction. We demonstrate an explicit connection between the problem of learning set functions from random evaluations and that of reconstructing sparse signals. Based on the observation that the Fourier transform for set functions satisfies exactly the conditions needed for sparse reconstruction algorithms to work, we examine different function classes under which uniform reconstruction is possible.

Relevance: 100.00%

Abstract:

Crustal structure in Southern California is investigated using travel times from over 200 stations and thousands of local earthquakes. The data are divided into two sets of first arrivals representing a two-layer crust. The Pg arrivals have paths that refract at depths near 10 km and the Pn arrivals refract along the Moho discontinuity. These data are used to find lateral and azimuthal refractor velocity variations and to determine refractor topography.

In Chapter 2 the Pn raypaths are modeled using linear inverse theory. This enables statistical verification that static delays, lateral slowness variations and anisotropy are all significant parameters. However, because of the inherent size limitations of inverse theory, the full array data set could not be processed and the possible resolution was limited. The tomographic backprojection algorithm developed for Chapters 3 and 4 avoids these size problems. This algorithm allows us to process the data sequentially and to iteratively refine the solution. The variance and resolution for tomography are determined empirically using synthetic structures.
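The sequential, iterative character of such a backprojection scheme can be sketched with a generic Kaczmarz/ART update, where each ray's travel-time residual is distributed back along its path. This illustrates the idea only; it is not the thesis' exact algorithm, and the damping and iteration counts are assumptions.

```python
import numpy as np

def art_backprojection(G, t, n_iter=25, damp=1.0):
    """Kaczmarz / ART sketch of sequential tomographic backprojection.
    G : (n_rays, n_cells) path length of each ray in each model cell
    t : (n_rays,) observed travel-time residuals
    Returns slowness perturbations fitting the residuals."""
    m = np.zeros(G.shape[1])
    for _ in range(n_iter):                    # iterative refinement
        for gi, ti in zip(G, t):               # process the data sequentially
            denom = gi @ gi
            if denom > 0.0:
                m += damp * (ti - gi @ m) / denom * gi  # backproject this ray's residual
    return m

# Toy example: three rays crossing four cells
G = np.array([[1.0, 1.0, 0.0, 0.0],
              [0.0, 1.0, 1.0, 0.0],
              [0.0, 0.0, 1.0, 1.0]])
t = np.array([0.2, 0.5, 0.3])
print(art_backprojection(G, t))
```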

The Pg results spectacularly image the San Andreas Fault, the Garlock Fault and the San Jacinto Fault. The Mojave has slower velocities near 6.0 km/s while the Peninsular Ranges have higher velocities of over 6.5 km/s. The San Jacinto block has velocities only slightly above the Mojave velocities. It may have overthrust Mojave rocks. Surprisingly, the Transverse Ranges are not apparent at Pg depths. The batholiths in these mountains are possibly only surficial.

Pn velocities are fast in the Mojave, slow in the Southern California Peninsular Ranges, and slow north of the Garlock Fault. Pn anisotropy of 2% with a NWW fast direction exists in Southern California. A region of thin crust (22 km) centers on the Colorado River, where the crust has undergone Basin and Range-type extension. Station delays reveal the Ventura and Los Angeles Basins but not the Salton Trough, where high-velocity rocks underlie the sediments. The Transverse Ranges have a root in their eastern half but not in their western half. The Southern Coast Ranges also have a thickened crust, but the Peninsular Ranges have no major root.

Relevance: 100.00%

Abstract:

This thesis presents two different forms of the Born approximation for acoustic and elastic wavefields and discusses their application to the inversion of seismic data. The Born approximation is valid for small-amplitude heterogeneities superimposed on a slowly varying background. The first method is related to frequency-wavenumber migration methods. It is shown to properly recover two independent acoustic parameters, within the bandpass of the experiment's source time function, for contrasts of about 5 percent, using data generated with an exact theory for flat interfaces. The independent determination of two parameters is shown to depend on the angle coverage of the medium. For surface data, the impedance profile is well recovered.

The second method explored is mathematically similar to iterative tomographic methods recently introduced in the geophysical literature. Its basis is an integral relation between the scattered wavefield and the medium parameters, obtained by applying a far-field approximation to the first-order Born approximation. The Davidon-Fletcher-Powell algorithm is used since it converges faster than the steepest descent method. It consists essentially of successive backprojections of the recorded wavefield, with angular and propagation weighting coefficients for density and bulk modulus. After each backprojection, the forward problem is computed and the residual evaluated. Each backprojection is similar to a before-stack Kirchhoff migration and is therefore readily applicable to seismic data. Several reconstructions of simple point-scatterer models are performed. Recovery of the amplitudes of the anomalies improves with successive iterations, which also sharpen the images.
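The Davidon-Fletcher-Powell step named above can be sketched as follows; the toy quadratic misfit stands in for the forward problem, and its gradient is literally a backprojection of the residual. The fixed line-search and iteration settings are simplifying assumptions.

```python
import numpy as np

def dfp_minimize(f, grad, x0, n_iter=50):
    """Davidon-Fletcher-Powell quasi-Newton minimizer (sketch)."""
    x, H = np.asarray(x0, float), np.eye(len(x0))   # H approximates the inverse Hessian
    g = grad(x)
    for _ in range(n_iter):
        p = -H @ g                                  # quasi-Newton search direction
        a = 1.0
        while f(x + a * p) > f(x) + 1e-4 * a * (g @ p):
            a *= 0.5                                # backtracking line search
        s = a * p
        g_new = grad(x + s)
        y = g_new - g
        if s @ y > 1e-12:                           # curvature condition keeps H positive definite
            H = H + np.outer(s, s) / (s @ y) \
                  - (H @ np.outer(y, y) @ H) / (y @ H @ y)
        x, g = x + s, g_new
    return x

# Toy misfit ||Gm - d||^2 standing in for the forward problem;
# its gradient G^T (Gm - d) is a backprojection of the residual.
G = np.array([[2.0, 1.0], [1.0, 3.0], [0.5, 1.0]])
d = np.array([1.0, 2.0, 0.7])
f = lambda m: 0.5 * np.sum((G @ m - d) ** 2)
grad = lambda m: G.T @ (G @ m - d)
print(dfp_minimize(f, grad, np.zeros(2)))           # least-squares model estimate
```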

The elastic Born approximation, with the addition of a far-field approximation, is shown to correspond physically to a sum of WKBJ-asymptotic scattered rays. Four types of scattered rays enter the sum, corresponding to P-P, P-S, S-P and S-S pairs of incident and scattered rays. Incident rays propagate in the background medium, interacting only once with the scatterers. Scattered rays likewise propagate as if in the background medium, with no further interaction with the scatterers. An example of P-wave impedance inversion is performed on a VSP data set consisting of three offsets recorded in two wells.

Relevance: 100.00%

Abstract:

In the first part of the thesis we explore three fundamental questions that arise naturally when we conceive a machine learning scenario where the training and test distributions can differ. Contrary to conventional wisdom, we show that in fact mismatched training and test distribution can yield better out-of-sample performance. This optimal performance can be obtained by training with the dual distribution. This optimal training distribution depends on the test distribution set by the problem, but not on the target function that we want to learn. We show how to obtain this distribution in both discrete and continuous input spaces, as well as how to approximate it in a practical scenario. Benefits of using this distribution are exemplified in both synthetic and real data sets.

In order to apply the dual distribution in the supervised learning scenario where the training data set is fixed, it is necessary to use weights to make the sample appear as if it came from the dual distribution. We explore the negative effect that weighting a sample can have. The theoretical decomposition of the effect of weights on the out-of-sample error is easy to understand but not actionable in practice, as the quantities involved cannot be computed. Hence, we propose the Targeted Weighting algorithm, which determines, for a given set of weights, whether out-of-sample performance will improve in a practical setting. This is necessary because the setting assumes there are no labeled points distributed according to the test distribution, only unlabeled samples.
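One standard way to obtain such weights, sketched below, is to train a probabilistic classifier to distinguish training inputs from (unlabeled) test inputs and convert its odds into density-ratio weights. This is a generic covariate-shift device, not the thesis' Targeted Weighting algorithm; all data and settings here are illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def density_ratio_weights(X_train, X_test):
    """Estimate w(x) = p_test(x) / p_train(x) with a probabilistic
    classifier, so a fixed training sample can be reweighted to
    mimic another distribution."""
    X = np.vstack([X_train, X_test])
    y = np.r_[np.zeros(len(X_train)), np.ones(len(X_test))]   # 1 = test domain
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    p = clf.predict_proba(X_train)[:, 1]
    w = p / (1.0 - p) * (len(X_train) / len(X_test))          # odds -> density ratio
    return w / w.mean()                                       # normalize to mean 1

# Example: training inputs vs. a shifted, unlabeled test sample
rng = np.random.default_rng(0)
X_tr = rng.normal(0.0, 1.0, size=(500, 2))
X_te = rng.normal(0.5, 1.0, size=(300, 2))
w = density_ratio_weights(X_tr, X_te)
print(w.min(), w.max())        # larger weights on test-like training points
```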

Finally, we propose a new class of matching algorithms that can be used to match the training set to a desired distribution, such as the dual distribution (or the test distribution). These algorithms can be applied to very large datasets, and we show how they lead to improved performance on a large real dataset such as the Netflix dataset. Their low computational complexity is their main advantage over previous algorithms proposed in the covariate shift literature.

In the second part of the thesis we apply machine learning to the problem of behavior recognition. We develop a specific behavior classifier to study fly aggression, and we develop a system that allows behavior in videos of animals to be analyzed with minimal supervision. The system, which we call CUBA (Caltech Unsupervised Behavior Analysis), detects movemes, actions, and stories from time series describing the positions of animals in videos. The method summarizes the data and provides biologists with a mathematical tool to test new hypotheses. Other benefits of CUBA include finding classifiers for specific behaviors without the need for annotation, as well as providing means to discriminate groups of animals, for example, according to their genetic line.

Relevance: 100.00%

Abstract:

This study evaluates the association between job stress and self-reported medical diagnosis of repetitive strain injury (RSI). It is a cross-sectional study nested in the Pró-Saúde Study, a cohort of technical-administrative employees of a university in the State of Rio de Janeiro. Data were obtained in 2001 through a self-administered questionnaire. The study population consisted of 3,314 employees, of whom 485 self-reported a medical diagnosis of RSI received after being hired by the university. The prevalence of RSI was higher among women (19.4%) than among men (8.8%). Job stress was assessed with the short version of the Job Content Questionnaire developed by Karasek and Theorell, whose items measure psychological demand, control over one's own work, and social support at work. Job stress was analyzed according to the quadrants proposed by Karasek (1979): low strain (low demand and high control), passive work (low demand and low control), high strain (high demand and low control), and active work (high demand and high control). Low strain was used as the reference category because it represents an ideal work setting. After adjustment for socioeconomic and demographic variables (age, schooling, and per capita family income) and occupational variables (years of work at the university and occupation), men and women in high-strain jobs had higher odds of RSI (men: OR = 1.88, 95% CI 1.07-3.29; women: OR = 1.90, 95% CI 1.32-2.02). Additional adjustment for social support at work reduced the strength of the association for both sexes. For women in high-strain jobs the association remained significant (OR = 1.63, 95% CI 1.12-2.37), whereas for men it became marginally significant (OR = 1.62, 95% CI 0.91-2.87). This study reinforces the view that the imbalance between psychological demand and control over one's own work is important in the occurrence of RSI and may therefore be useful in designing preventive measures for this growing public health problem. It is hoped that the hypotheses generated here can be tested in investigations with a longitudinal design, such as the Pró-Saúde Study in which this one is nested.

Relevance: 100.00%

Abstract:

The object of this study was job stress and salivary cortisol levels. The general objective was to evaluate the association between job stress and salivary cortisol variations in nursing workers providing hospital care in Rio de Janeiro. The study hypothesis was that exposure to high job strain is associated with salivary cortisol variations. This is a cross-sectional observational analytic epidemiological study carried out in a state hospital in Rio de Janeiro with a sample of 103 workers. Psychosocial aspects of work were assessed with the Job Content Questionnaire. Salivary cortisol was measured in four samples collected from each participant on a shift day: on waking, 30 minutes later, at 12:00, and at 18:00. Data were collected between March and April 2012 and analyzed with SPSS 18.0. The psychological demand and control dimensions, and their difference, were used as continuous variables in correlation analyses with the covariates and the outcome. Cortisol levels were quantified with five indices: the area under the curve with respect to ground (AUCg), the area under the curve with respect to increase (AUCi), the mean increase (MnInc), cortisol output in the post-waking work period (AUCtrab), and the area under the curve with respect to ground for the diurnal cycle (AUCCD). Associations between covariates, exposure, and outcome were assessed with Mann-Whitney and Kruskal-Wallis tests. Covariates associated with exposure or outcome at the 20% significance level (p ≤ 0.20) were tested in a linear regression model. Correlation analyses used Spearman's correlation coefficient. Nursing workers had mean psychological demand and control scores tending toward the upper limit, as did their difference, characterizing high demand and high control, that is, active work. Mean cortisol at waking, 30 minutes after waking, 12:00, and 18:00 was 5.82 nmol/L (4.86), 16.60 nmol/L (8.31), 7.49 nmol/L (6.97), and 3.93 nmol/L (3.15), respectively. The mean cortisol increase between waking and 30 minutes after waking was 64%. For the adopted cortisol indices, the mean MnInc, AUCg, AUCi, AUCtrab, and AUCCD were 10.78 nmol/L (6.99), 5.61 nmol/L (2.92), 2.69 nmol/L (1.75), 32.51 nmol/L (21.99), and 107.99 nmol/L (61.63), respectively. This study showed that free salivary cortisol levels are not associated with high job strain, even after adjustment for possible confounding or effect-modifying variables; the study hypothesis was not confirmed. The data reveal important aspects of the psychosocial risks to which nursing workers are exposed during the work process, support the implementation of worker health guidance and promotion programs, and contribute to understanding the biological pathways through which job stress influences health.
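For readers unfamiliar with the cortisol indices, AUCg and AUCi are commonly computed with trapezoidal formulas (in the style of Pruessner et al.); a minimal sketch follows, noting that the study's exact definitions and sampling times may differ, and that the waking time used below is purely hypothetical.

```python
import numpy as np

def cortisol_indices(times_h, conc):
    """AUCg and AUCi via the common trapezoidal formulas (sketch).
    times_h : sampling times in hours (waking, +0.5 h, 12:00, 18:00)
    conc    : cortisol concentrations in nmol/L at those times."""
    t = np.asarray(times_h, float)
    c = np.asarray(conc, float)
    aucg = np.trapz(c, t)                 # area with respect to ground
    auci = aucg - c[0] * (t[-1] - t[0])   # area with respect to increase
    return aucg, auci

# Example with the mean values reported in the abstract,
# assuming (hypothetically) waking at 6:00
t = [6.0, 6.5, 12.0, 18.0]
c = [5.82, 16.60, 7.49, 3.93]
print(cortisol_indices(t, c))
```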