884 results for Data Streams Distribution
Abstract:
A number of experimental methods have been reported for estimating the number of genes in a genome, or the closely related coding density of a genome, defined as the fraction of base pairs in codons. Recently, DNA sequence data representative of the genome as a whole have become available for several organisms, making the problem of estimating coding density amenable to sequence analytic methods. Estimates of coding density for a single genome vary widely, so that methods with characterized error bounds have become increasingly desirable. We present a method to estimate the protein coding density in a corpus of DNA sequence data, in which a ‘coding statistic’ is calculated for a large number of windows of the sequence under study, and the distribution of the statistic is decomposed into two normal distributions, assumed to be the distributions of the coding statistic in the coding and noncoding fractions of the sequence windows. The accuracy of the method is evaluated using known data, and the method is applied to the yeast chromosome III sequence and to C. elegans cosmid sequences. It can also be applied to fragmentary data, for example a collection of short sequences determined in the course of STS mapping.
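The decomposition step described above can be illustrated with a minimal sketch. It assumes the per-window coding statistic has already been computed, fits a two-component normal mixture by expectation-maximisation (EM), and reads the coding density off as the weight of one component. The function names, the choice of EM, and the assumption that the coding component is the one with the higher mean are illustrative and are not taken from the abstract.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def fit_two_normals(values, iterations=100):
    """Fit a two-component normal mixture by EM (hypothetical helper).

    Returns two (weight, mean, sd) tuples, ordered by increasing mean.
    """
    xs = sorted(values)
    n = len(xs)
    half = n // 2
    # Crude starting point: means of the lower and upper halves of the data.
    mu = [sum(xs[:half]) / half, sum(xs[half:]) / (n - half)]
    centre = sum(xs) / n
    spread = math.sqrt(sum((x - centre) ** 2 for x in xs) / n) or 1.0
    sigma = [spread, spread]
    weight = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: posterior probability that each window belongs to component 0.
        resp = []
        for x in xs:
            p0 = weight[0] * normal_pdf(x, mu[0], sigma[0])
            p1 = weight[1] * normal_pdf(x, mu[1], sigma[1])
            resp.append(p0 / (p0 + p1) if p0 + p1 > 0 else 0.5)
        n0 = sum(resp)
        n1 = n - n0
        if n0 < 1e-9 or n1 < 1e-9:
            break  # one component has collapsed; keep the last estimate
        # M-step: re-estimate means, standard deviations and weights.
        mu = [sum(r * x for r, x in zip(resp, xs)) / n0,
              sum((1 - r) * x for r, x in zip(resp, xs)) / n1]
        var0 = sum(r * (x - mu[0]) ** 2 for r, x in zip(resp, xs)) / n0
        var1 = sum((1 - r) * (x - mu[1]) ** 2 for r, x in zip(resp, xs)) / n1
        sigma = [max(math.sqrt(var0), 1e-6), max(math.sqrt(var1), 1e-6)]
        weight = [n0 / n, n1 / n]
    return sorted(zip(weight, mu, sigma), key=lambda c: c[1])


if __name__ == "__main__":
    # Synthetic statistic values: 30% "coding" windows (assumed higher mean).
    rng = random.Random(0)
    stats = [rng.gauss(5.0, 1.0) if rng.random() < 0.3 else rng.gauss(0.0, 1.0)
             for _ in range(2000)]
    noncoding, coding = fit_two_normals(stats)
    print("estimated coding density:", round(coding[0], 3))
```

In this sketch the mixture weight of the higher-mean component plays the role of the coding density; how the abstract's method characterises error bounds is not covered here.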
Abstract:
This paper investigates a simple procedure to estimate robustly the mean of an asymmetric distribution. The procedure removes the observations which are larger or smaller than certain limits and takes the arithmetic mean of the remaining observations, the limits being determined with the help of a parametric model, e.g., the Gamma, the Weibull or the Lognormal distribution. The breakdown point, the influence function, the (asymptotic) variance, and the contamination bias of this estimator are explored and compared numerically with those of competing estimates.
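As a rough illustration of the procedure, the sketch below fits a Lognormal model robustly (median and median absolute deviation of the log-values), derives trimming limits from the fitted model's quantiles, and averages the observations that fall inside the limits. The function name, the `tail_prob` parameter, and the choice of a median/MAD fit are assumptions made for illustration. The sketch does not correct for the bias that trimming an asymmetric distribution introduces, and it is not claimed to be the estimator studied in the paper.

```python
import math
import statistics


def lognormal_trimmed_mean(values, tail_prob=0.01):
    """Trimmed mean with limits taken from a robustly fitted Lognormal model.

    Hypothetical helper: returns (mean_of_kept_values, lower_limit, upper_limit).
    All values must be strictly positive.
    """
    logs = [math.log(v) for v in values]
    # Robust location and scale on the log scale.
    mu = statistics.median(logs)
    mad = statistics.median([abs(x - mu) for x in logs])
    sigma = 1.4826 * mad  # MAD rescaled to be consistent with a normal sd
    # Limits are the tail_prob and (1 - tail_prob) quantiles of the fitted model.
    z = statistics.NormalDist().inv_cdf(1.0 - tail_prob)
    lower = math.exp(mu - z * sigma)
    upper = math.exp(mu + z * sigma)
    kept = [v for v in values if lower <= v <= upper]
    if not kept:  # degenerate fit (e.g. zero MAD): fall back to the median
        return statistics.median(values), lower, upper
    return sum(kept) / len(kept), lower, upper


if __name__ == "__main__":
    sample = [8, 9, 10, 11, 12, 10, 9, 11, 10, 1000]  # one gross outlier
    mean, lower, upper = lognormal_trimmed_mean(sample)
    print(mean, lower, upper)
```

Swapping in a Gamma or Weibull model would only change how `lower` and `upper` are computed; the trimming and averaging steps stay the same.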
Abstract:
This paper proposes a novel approach for the analysis of illicit tablets based on their visual characteristics. In particular, the paper concentrates on the problem of ecstasy pill seizure profiling and monitoring. The presented method extracts the visual information from pill images and builds a representation of it, i.e. it builds a pill profile based on the pill's visual appearance. Different visual features are used to build different image similarity measures, which are the basis for a pill monitoring strategy based on both discriminative and clustering models. The discriminative model makes it possible to infer whether two pills come from the same seizure, while the clustering models group pills that share similar visual characteristics. The resulting clustering structure allows relationships between different seizures to be identified visually. The proposed approach was evaluated using a data set of 621 ecstasy pill pictures. The results demonstrate that this is a feasible and cost-effective method for performing pill profiling and monitoring.
Abstract:
In the past, sensor networks in cities have been limited to fixed sensors, embedded in particular locations, under centralised control. Today, new applications can leverage wireless devices and use them as sensors to create aggregated information. In this paper, we show that the emerging patterns unveiled through the analysis of large sets of aggregated digital footprints can provide novel insights into how people experience the city and into some of the drivers behind these emerging patterns. We particularly explore the capacity to quantify the evolution of the attractiveness of urban space with a case study in the area of the New York City Waterfalls, a public art project of four man-made waterfalls rising from the New York Harbor. Methods to study the impact of an event of this nature are traditionally based on the collection of static information such as surveys and ticket-based people counts, which make it possible to estimate visitors’ presence in specific areas over time. In contrast, our contribution makes use of the dynamic data that visitors generate, such as the density and distribution of aggregate phone calls and photos taken in different areas of interest and over time. Our analysis provides novel ways to quantify the impact of a public event on the distribution of visitors and on the evolution of the attractiveness of nearby points of interest. This information has potential uses for local authorities, researchers, and service providers such as mobile network operators.
Abstract:
Shrews of the genus Sorex are characterized by a Holarctic distribution, and relationships among extant taxa have never been fully resolved. Phylogenies have been proposed based on morphological, karyological, and biochemical comparisons, but these analyses often produced controversial and contradictory results. Phylogenetic analyses of partial mitochondrial cytochrome b gene sequences (1011 bp) were used to examine the relationships among 27 Sorex species. The molecular data suggest that Sorex comprises two major monophyletic lineages, one restricted mostly to the New World and one with a primarily Palearctic distribution. Furthermore, several sister-species relationships are revealed by the analysis. Based on the split between the Soricinae and Crocidurinae subfamilies, we used a 95% confidence interval for both the calibration of a molecular clock and the subsequent calculation of major diversification events within the genus Sorex. Our analysis does not support an unambiguous acceleration of the molecular clock in shrews, the estimated rate being similar to other estimates of mammalian mitochondrial clocks. In addition, the data presented here indicate that estimates from the fossil record greatly underestimate divergence dates among Sorex taxa.
Abstract:
Automatic classification of makams from symbolic data is a rarely studied topic. In this paper, a review of an n-gram based approach is first presented using various representations of the symbolic data. While a high degree of precision can be obtained, confusion occurs mainly between makams that use (almost) the same scale and pitch hierarchy but differ in overall melodic progression, seyir. To further improve the system, n-gram based classification is first tested on various sections of the piece, to take into account a feature of the seyir: that melodic progression starts in a certain region of the scale. In a second test, a hierarchical classification structure is designed which uses n-grams and seyir features at different levels to further improve the system.
Abstract:
Predicting which species will occur together in the future, and where, remains one of the greatest challenges in ecology, and requires a sound understanding of how the abiotic and biotic environments interact with dispersal processes and history across scales. Biotic interactions and their dynamics influence species' relationships to climate, and this also has important implications for predicting future distributions of species. It is already well accepted that biotic interactions shape species' spatial distributions at local spatial extents, but the role of these interactions beyond local extents (e.g. 10 km² to global extents) is usually dismissed as unimportant. In this review we consolidate evidence for how biotic interactions shape species distributions beyond local extents and review methods for integrating biotic interactions into species distribution modelling tools. Drawing upon evidence from contemporary and palaeoecological studies of individual species ranges, functional groups, and species richness patterns, we show that biotic interactions have clearly left their mark on species distributions and realised assemblages of species across all spatial extents. We demonstrate this with examples from within and across trophic groups. A range of species distribution modelling tools is available to quantify species environmental relationships and predict species occurrence, such as: (i) integrating pairwise dependencies, (ii) using integrative predictors, and (iii) hybridising species distribution models (SDMs) with dynamic models. These methods have typically only been applied to interacting pairs of species at a single time, require a priori ecological knowledge about which species interact, and due to data paucity must assume that biotic interactions are constant in space and time. To better inform the future development of these models across spatial scales, we call for accelerated collection of spatially and temporally explicit species data. Ideally, these data should be sampled to reflect variation in the underlying environment across large spatial extents, and at fine spatial resolution. Simplified ecosystems where there are relatively few interacting species and sometimes a wealth of existing ecosystem monitoring data (e.g. arctic, alpine or island habitats) offer settings where the development of modelling tools that account for biotic interactions may be less difficult than elsewhere.
Abstract:
The distribution of parvalbumin (PV), calretinin (CR), and calbindin (CB) immunoreactive neurons was studied with the help of an image analysis system (Vidas/Zeiss) in the primary visual area 17 and associative area 18 (Brodmann) of Alzheimer and control brains. In neither of these areas was there a significant difference between Alzheimer and control groups in the mean number of PV, CR, or CB immunoreactive neuronal profiles, counted in a cortical column going from pia to white matter. Significant differences in the mean densities (numbers per square millimeter of cortex) of PV, CR, and CB immunoreactive neuronal profiles were not observed either between groups or areas, but only between superficial, middle, and deep layers within areas 17 and 18. The optical density of the immunoreactive neuropil was also similar in Alzheimer and control brains, correlating with the numerical density of immunoreactive profiles in superficial, middle, and deep layers. The frequency distribution of neuronal areas indicated significant differences between PV, CR, and CB immunoreactive neuronal profiles in both areas 17 and 18, with more large PV than CR and CB positive profiles. There were also significantly more small and fewer large PV and CR immunoreactive neuronal profiles in Alzheimer brains than in controls. Our data show that, although the brain pathology is moderate to severe, there is no prominent decrease of PV, CR and CB positive neurons in the visual cortex of Alzheimer brains, but only selective changes in neuronal perikarya.
Abstract:
Diagnosis Related Groups (DRG) are frequently used to standardize the comparison of consumption variables, such as length of stay (LOS). In order to be reliable, this comparison must control for the presence of outliers, i.e. values far removed from the pattern set by the majority of the data. Indeed, outliers can distort the usual statistical summaries, such as means and variances. A common practice is to trim LOS values according to various empirical rules, but there is little theoretical support for choosing between alternative procedures. This pilot study explores the possibility of describing LOS distributions with parametric models which provide the necessary framework for the use of robust methods.
Abstract:
Aim: To identify the bioclimatic niche of the endangered Andean cat (Leopardus jacobita), one of the rarest and least known felids in the world, by developing a species distribution model. Location: South America, High Andes and Patagonian steppe. Peru, Bolivia, Chile, Argentina. Methods: We used 108 Andean cat records to build the models, and 27 to test them, applying the Maxent algorithm to sets of uncorrelated bioclimatic variables from global databases, including elevation. We based our biogeographical interpretations on the examination of the predicted geographic range, the modelled response curves and latitudinal variations in climatic variables associated with the locality data. Results: Simple bioclimatic models for Andean cats were highly predictive with only 3-4 explanatory variables. The climatic niche of the species was defined by extreme diurnal variations in temperature, cold minimum and moderate maximum temperatures, and aridity, characteristic not only of the Andean highlands but also of the Patagonian steppe. Argentina had the highest representation of suitable climates, and Chile the lowest. The most favourable conditions were centrally located and spanned international boundaries. Discontinuities in suitable climatic conditions coincided with three biogeographical barriers associated with climatic or topographic transitions. Main conclusions: Simple bioclimatic models can produce useful predictions of suitable climatic conditions for rare species, including major biogeographical constraints. In our study case, these constraints are also known to affect the distribution of other Andean species and the genetic structure of Andean cat populations. We recommend surveys of areas with suitable climates and no Andean cat records, including the corridor connecting two core populations. The inclusion of landscape variables at finer scales, crucially the distribution of Andean cat prey, would help refine our predictions for conservation applications.
Abstract:
To ensure efficient energy supply to the highly demanding brain, nutrients are transported into brain cells via specific glucose (GLUT) and monocarboxylate transporters (MCT). Mitochondrial dysfunction and altered glucose metabolism are thought to play an important role in the progression of neurodegenerative diseases, including multiple sclerosis (MS). Here, we investigated the cellular localization of key GLUT and MCT proteins in human brain tissue of non-neurological controls and MS patients. We show that in control brain tissue GLUT and MCT proteins were abundantly expressed in a variety of central nervous system cells, particularly in microglia and endothelial cells. In active MS lesions, GLUTs and MCTs were highly expressed in infiltrating leukocytes and reactive astrocytes. Astrocytes manifest increased MCT1 staining and maintain GLUT expression in inactive lesions, whereas demyelinated axons exhibit significantly reduced GLUT3 and MCT2 immunoreactivity in inactive lesions. Finally, we demonstrated that the co-transcription factor peroxisome proliferator-activated receptor gamma co-activator 1-alpha (PGC-1α), an important protein involved in energy metabolism, is highly expressed in reactive astrocytes in active MS lesions. Overexpression of PGC-1α in astrocyte-like cells resulted in increased production of several GLUT and MCT proteins. In conclusion, we provide for the first time a comprehensive overview of key nutrient transporters in white matter brain samples. Moreover, our data demonstrate an altered expression of these nutrient transporters in MS brain tissue, including a marked reduction of axonal GLUT3 and MCT2 expression in chronic lesions, which may impede efficient nutrient supply to the hypoxic demyelinated axons, thereby contributing to the ongoing neurodegeneration in MS. GLIA 2014;62:1125-1141.
Abstract:
A report produced by the Department of Natural Resources on the historical pattern the rivers take.
Abstract:
Part of Iowa's Water Ambient monitoring Program, produced by the Iowa Department of Natural Resources.
Abstract:
In this paper we propose a subsampling estimator for the distribution of statistics diverging at either known or unknown rates when the underlying time series is strictly stationary and strong mixing. Based on our results we provide a detailed discussion of how to estimate extreme order statistics with dependent data and present two applications to assessing financial market risk. Our method performs well in estimating Value at Risk and provides a superior alternative to Hill's estimator in operationalizing Safety First portfolio selection.
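For context, the Hill estimator that the abstract uses as its point of comparison can be sketched as follows. This is the classical estimator of the tail index from the `k` largest observations, not the subsampling method the paper proposes; the function name and the choice of `k` are illustrative assumptions.

```python
import math
import random


def hill_tail_index(sample, k):
    """Hill estimate of the tail index alpha from the k largest observations.

    Hypothetical helper: the sample must be positive in its upper tail,
    and k must satisfy 0 < k < len(sample).
    """
    if not 0 < k < len(sample):
        raise ValueError("k must satisfy 0 < k < len(sample)")
    ordered = sorted(sample, reverse=True)
    threshold = ordered[k]  # the (k+1)-th largest observation
    if threshold <= 0:
        raise ValueError("the threshold observation must be positive")
    # Mean log-excess of the k largest observations over the threshold.
    gamma = sum(math.log(x / threshold) for x in ordered[:k]) / k
    if gamma == 0:
        raise ValueError("the k largest observations are all equal")
    return 1.0 / gamma


if __name__ == "__main__":
    # Synthetic heavy-tailed data: Pareto with tail index 3.
    rng = random.Random(1)
    losses = [rng.paretovariate(3.0) for _ in range(5000)]
    print("Hill tail index estimate:", hill_tail_index(losses, 500))
```

The estimate is sensitive to the choice of `k`, and the sketch treats the observations as independent, whereas the abstract is concerned with dependent data.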
Abstract:
Do the contests with the largest prizes attract the most able contestants? To what extent do contestants avoid competition? In this paper, we show, theoretically and empirically, that the distribution of abilities plays a crucial role in determining contest choice. Sorting exists only when the proportion of high-ability contestants is sufficiently small. As this proportion increases, contestants shy away from competition and sorting decreases, such that reverse sorting becomes a possibility. We test our theoretical predictions using a large panel data set containing contest choice over three decades. We use exogenous variation in the participation of highly able competitors to provide empirical evidence for the relationship among prizes, competition, and sorting.