866 results for "Compositional data analysis-roots in geosciences"
Abstract:
Anthropogenic aerosols play a crucial role in our environment, climate, and health. Assessment of spatial and temporal variation in anthropogenic aerosols is essential to determine their impact. Aerosols are of natural and anthropogenic origin and together constitute a composite aerosol system; information about either component requires elimination of the other from the composite. In the present work we estimated the anthropogenic aerosol fraction (AF) over the Indian region following two different approaches and inter-compared the estimates. We employed multi-satellite data analysis and model simulations (using the CHIMERE chemical transport model) to derive the natural aerosol distribution, which was subsequently used to estimate AF over the Indian subcontinent. The two approaches differ substantially: the satellite-derived natural-aerosol information was extracted in terms of optical depth, while the model simulations yielded mass concentration. The AF distribution was studied over two periods in 2008, pre-monsoon (March-May) and winter (November-February), given the known distinct seasonality in aerosol loading and type over the Indian region. Although both techniques derived the same property, considerable differences were noted in temporal and spatial distribution. Satellite retrieval of AF showed maximum values during the pre-monsoon and summer months, with the lowest values in winter; model simulations, on the other hand, showed the highest AF in winter and the lowest during the pre-monsoon and summer months. Both techniques yielded annual average AF of comparable magnitude (~0.43 ± 0.06 from the satellite and ~0.48 ± 0.19 from the model).
For the winter months the model-estimated AF was ~0.62 ± 0.09, significantly higher than the satellite estimate (0.39 ± 0.05), while during the pre-monsoon months the satellite-estimated AF was ~0.46 ± 0.06 and the model estimate ~0.53 ± 0.14. Preliminary results from this work indicate that, in view of the general seasonal variation in aerosol concentrations, the model-simulated results are closer to the actual variation than the satellite estimates.
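The fraction estimates above follow from simple arithmetic: once the natural component of the composite aerosol system is known, the anthropogenic fraction is the remainder of the total. A minimal sketch of that step, with made-up optical-depth values (the actual satellite retrieval and CHIMERE pipeline are far more involved):

```python
def anthropogenic_fraction(total_aod, natural_aod):
    """Fraction of aerosol optical depth attributable to anthropogenic sources,
    computed as (total - natural) / total for one grid cell."""
    return (total_aod - natural_aod) / total_aod

# Hypothetical example values, not data from the study.
af = anthropogenic_fraction(0.55, 0.30)
```

Applied per grid cell and averaged over a season, this quantity corresponds to the AF values (e.g. ~0.43) reported above.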
Abstract:
This document provides a simple introduction to research methods and analysis tools for biologists or environmental scientists, with particular emphasis on fish biology in developing countries.
Abstract:
The brain is perhaps the most complex system to have ever been subjected to rigorous scientific investigation. The scale is staggering: over 10^11 neurons, each making an average of 10^3 synapses, with computation occurring on scales ranging from a single dendritic spine to an entire cortical area. Slowly, we are beginning to acquire experimental tools that can gather the massive amounts of data needed to characterize this system. However, to understand and interpret these data will also require substantial strides in inferential and statistical techniques. This dissertation attempts to meet this need, extending and applying the modern tools of latent variable modeling to problems in neural data analysis.
It is divided into two parts. The first begins with an exposition of the general techniques of latent variable modeling. A new and extremely general optimization algorithm, called Relaxation Expectation Maximization (REM), is proposed, which may be used to learn the optimal parameter values of arbitrary latent variable models. This algorithm appears to alleviate the common problem of convergence to local, sub-optimal likelihood maxima. REM leads to a natural framework for model-size selection; in combination with standard model selection techniques, the quality of fits may be further improved, while the appropriate model size is automatically and efficiently determined. Next, a new latent variable model, the mixture of sparse hidden Markov models, is introduced, and approximate inference and learning algorithms are derived for it. This model is applied in the second part of the thesis.
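REM extends the standard Expectation Maximization procedure for latent variable models. As background only, a minimal standard EM loop for a two-component 1-D Gaussian mixture might look as follows; this is an illustration of plain EM, not of the dissertation's REM relaxation scheme, and the initialisation and data are arbitrary:

```python
import math

def em_gmm_1d(data, iters=50):
    """Standard EM for a two-component 1-D Gaussian mixture (illustration only;
    this is not the REM algorithm described in the dissertation)."""
    data = sorted(data)
    n = len(data)
    # Crude initialisation: pick means from the lower and upper quartiles.
    mu = [data[n // 4], data[3 * n // 4]]
    var = [1.0, 1.0]
    w = [0.5, 0.5]  # mixing weights
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each point.
        resp = []
        for x in data:
            p = [w[k] * math.exp(-(x - mu[k]) ** 2 / (2 * var[k]))
                 / math.sqrt(2 * math.pi * var[k]) for k in range(2)]
            s = p[0] + p[1]
            resp.append([p[0] / s, p[1] / s])
        # M-step: re-estimate weights, means, and variances from responsibilities.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            w[k] = nk / n
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var[k] = sum(r[k] * (x - mu[k]) ** 2
                         for r, x in zip(resp, data)) / nk + 1e-9
    return w, mu, var

# Toy usage on two well-separated clusters (made-up values).
sample = [-0.1, 0.0, 0.05, 0.1, 4.9, 5.0, 5.05, 5.1]
weights, means, variances = em_gmm_1d(sample)
```

The local-maxima problem that REM targets arises exactly here: a poor initialisation of `mu` can leave standard EM stuck at an inferior solution.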
The second part brings the technology of part I to bear on two important problems in experimental neuroscience. The first is known as spike sorting; this is the problem of separating the spikes from different neurons embedded within an extracellular recording. The dissertation offers the first thorough statistical analysis of this problem, which then yields the first powerful probabilistic solution. The second problem addressed is that of characterizing the distribution of spike trains recorded from the same neuron under identical experimental conditions. A latent variable model is proposed. Inference and learning in this model leads to new principled algorithms for smoothing and clustering of spike data.
Abstract:
This thesis is an investigation into the nature of data analysis and computer software systems which support this activity.
The first chapter develops the notion of data analysis as an experimental science which has two major components: data-gathering and theory-building. The basic role of language in determining the meaningfulness of theory is stressed, and the informativeness of a language and data base pair is studied. The static and dynamic aspects of data analysis are then considered from this conceptual vantage point. The second chapter surveys the available types of computer systems which may be useful for data analysis. Particular attention is paid to the questions raised in the first chapter about the language restrictions imposed by the computer system and its dynamic properties.
The third chapter discusses the REL data analysis system, which was designed to satisfy the needs of the data analyzer in an operational relational data system. The major limitation on the use of such systems is the amount of access to data stored on a relatively slow secondary memory. This problem of the paging of data is investigated and two classes of data structure representations are found, each of which has desirable paging characteristics for certain types of queries. One representation is used by most of the generalized data base management systems in existence today, but the other is clearly preferred in the data analysis environment, as conceptualized in Chapter I.
This data representation has strong implications for a fundamental process of data analysis -- the quantification of variables. Since quantification is one of the few means of summarizing and abstracting, data analysis systems are under strong pressure to facilitate the process. Two implementations of quantification are studied: one analogous to the form of the lower predicate calculus and another more closely attuned to the data representation. A comparison of these indicates that the use of the "label class" method results in orders-of-magnitude improvement over the lower predicate calculus technique.
Abstract:
Large numbers of fishing vessels operating from ports in Latin America participate in surface longline fisheries in the eastern Pacific Ocean (EPO), and several species of sea turtles inhabit the grounds where these fleets operate. The endangered status of several sea turtle species, and the success of circle hooks ('treatment' hooks) in reducing turtle hookings in other ocean areas, as compared to J-hooks and Japanese-style tuna hooks ('control' hooks), prompted the initiation of a hook exchange program on the west coast of Latin America, the Eastern Pacific Regional Sea Turtle Program (EPRSTP). One of the goals of the EPRSTP is to determine whether circle hooks would be effective at reducing turtle bycatch in artisanal fisheries of the EPO without significantly reducing the catch of marketable fish species. Participating fishers were provided with circle hooks at no cost and asked to replace the J/Japanese-style tuna hooks on their longlines with circle hooks in an alternating manner. Data collected by the EPRSTP show differences in longline gear and operational characteristics within and among countries. These aspects of the data, in addition to difficulties encountered with implementation of the alternating-hook design, pose challenges for analysis of these data.
Abstract:
Gasoline Homogeneous Charge Compression Ignition (HCCI) combustion has been studied widely in the past decade. However, in HCCI engines using negative valve overlap (NVO), there is still uncertainty as to whether the effect of pilot injection during NVO on the start of combustion is primarily due to heat release of the pilot fuel during NVO or whether it is due to pilot fuel reformation. This paper presents data taken on a 4-cylinder gasoline direct injection, spark ignition/HCCI engine with a dual cam system, capable of recompressing residual gas. Engine in-cylinder samples are extracted at various points during the engine cycle through a high-speed sampling system and directly analysed with a gas chromatograph and flame ionisation detector. Engine parameter sweeps are performed for different pilot injection timings and quantities at a medium load point. Results show that for lean engine running conditions, earlier pilot injection timing leads to partial oxidation of the injected pilot fuel during NVO, while the fraction of light hydrocarbons remains constant for all parameter variations investigated. The same applies for a variation in pilot fuel amount. Thus there is evidence that in lean conditions, pilot injection-related NVO effects are dominated by heat release rather than fuel reformation. © 2009 SAE International.
Abstract:
Background: Insects constitute the vast majority of known species, with importance spanning biodiversity, agriculture, and human health. It is likely that the successful adaptation of the Insecta clade depends on specific components in its
Abstract:
This workshop followed on from two previous workshops, held in Colombo, Sri Lanka in 2012 and in Kochi, India in 2013. Fourteen microsatellite markers that had previously been developed for Indian Mackerel (Rastrelliger kanagurta) were used to genotype 31 tissue collections from all eight countries; the genotyping was carried out in India.
Abstract:
There is increasing evidence that many of the mitochondrial DNA (mtDNA) databases published in the fields of forensic science and molecular anthropology are flawed. An a posteriori phylogenetic analysis of the sequences could help to eliminate most of the errors and thus greatly improve data quality. However, previously published caveats and recommendations along these lines have not yet been taken up by all researchers. Here we call for stringent quality control of mtDNA data by haplogroup-directed database comparisons. We take some problematic databases of East Asian mtDNAs, published in the Journal of Forensic Sciences and Forensic Science International, as examples to demonstrate the process of pinpointing obvious errors. Our results show that data sets are not only notoriously plagued by base shifts and artificial recombination but also by lab-specific phantom mutations, especially in the second hypervariable region (HVR-II). (C) 2003 Elsevier Ireland Ltd. All rights reserved.