991 results for Transformed data
Abstract:
Dissertation presented to the Escola Superior de Tecnologia of the Instituto Politécnico de Castelo Branco in fulfilment of the requirements for the degree of Master in Software Development and Interactive Systems, carried out under the scientific supervision of Doutor Eurico Ribeiro Lopes of the Instituto Politécnico de Castelo Branco.
Abstract:
The continuous plankton recorder (CPR) survey is an upper layer plankton monitoring program that has regularly collected samples, at monthly intervals, in the North Atlantic and adjacent seas since 1946. Water from approximately 6 m depth enters the CPR through a small aperture at the front of the sampler and travels down a tunnel where it passes through a silk filtering mesh of 270 µm before exiting at the back of the CPR. The plankton filtered on the silk is analyzed in sections corresponding to 10 nautical miles (approx. 3 m³ of seawater filtered) and the plankton microscopically identified (Richardson et al., 2006 and references therein). In the present study we used the CPR data to investigate the current basin-scale distribution of C. finmarchicus (C5-C6), C. helgolandicus (C5-C6), C. hyperboreus (C5-C6), Pseudocalanus spp. (C6), Oithona spp. (C1-C6), total Euphausiida, total Thecosomata and the presence/absence of Cnidaria, together with the Phytoplankton Colour Index (PCI). The PCI, which is a visual assessment of the greenness of the silk, is used as an indicator of the distribution of total phytoplankton biomass across the Atlantic basin (Batten et al., 2003). Monthly data collected between 2000 and 2009 were gridded using the inverse-distance interpolation method, with the interpolated values computed at the nodes of a 2° by 2° grid. The resulting twelve monthly matrices were then averaged within the year and, in the case of the zooplankton, the data were log-transformed (i.e. log10(x + 1)).
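A minimal sketch of the gridding and transformation steps described above, assuming hypothetical sample coordinates and abundances rather than actual CPR data; the inverse-distance interpolation and the log10(x + 1) transform are implemented directly with NumPy, and the 2° by 2° node spacing follows the abstract.

```python
import numpy as np

def idw_grid(lons, lats, values, grid_lon, grid_lat, power=2.0):
    """Inverse-distance interpolation of scattered samples onto grid nodes."""
    grid = np.empty((grid_lat.size, grid_lon.size))
    for i, glat in enumerate(grid_lat):
        for j, glon in enumerate(grid_lon):
            d = np.hypot(lons - glon, lats - glat)
            if np.any(d == 0):                      # sample falls exactly on the node
                grid[i, j] = values[d == 0].mean()
            else:
                w = 1.0 / d**power
                grid[i, j] = np.sum(w * values) / np.sum(w)
    return grid

# Hypothetical monthly CPR samples (longitude, latitude, abundance per section)
rng = np.random.default_rng(0)
lons = rng.uniform(-60, 0, 500)
lats = rng.uniform(40, 65, 500)
abundance = rng.poisson(5, 500).astype(float)

# 2-degree by 2-degree grid nodes over part of the North Atlantic
grid_lon = np.arange(-60, 1, 2)
grid_lat = np.arange(40, 66, 2)

monthly = idw_grid(lons, lats, abundance, grid_lon, grid_lat)
annual_mean = monthly                        # in practice, average the twelve monthly matrices
log_abundance = np.log10(annual_mean + 1)    # log10(x + 1) transform for the zooplankton
```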
Abstract:
Data were collected during various groundfish surveys carried out by IFREMER from October to December between 1997 and 2011, on the eastern continental shelf of the Bay of Biscay and in the Celtic Sea (EVHOE series). The sampling design was stratified according to latitude and depth. A 36/47 GOV trawl was used with a 20 mm mesh codend liner. Haul duration was 30 minutes at a towing speed of 4 knots. Fishing was restricted to daylight hours. Catch weights and catch numbers were recorded for all species and body size was measured. The weights and numbers per haul were transformed into abundances per km² by considering the swept area of a standard haul (0.069 km²).
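As a simple illustration of this swept-area standardisation, the sketch below divides a per-haul count by the 0.069 km² swept by a standard haul; the catch figure used is hypothetical.

```python
# Convert per-haul catch to abundance per km**2 using the standard swept area.
SWEPT_AREA_KM2 = 0.069            # swept area of a standard 30-minute GOV haul

def abundance_per_km2(catch_per_haul, swept_area=SWEPT_AREA_KM2):
    """Numbers (or weights) per haul divided by the area swept by the trawl."""
    return catch_per_haul / swept_area

# Hypothetical haul: 138 individuals of a given species
print(abundance_per_km2(138))     # -> 2000.0 individuals per km**2
```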
Abstract:
We address the question of how to obtain effective fusion of identification information such that it is robust to the quality of this information. As well as technical issues, data fusion is encumbered with a collection of (potentially confusing) practical considerations. These considerations are described in the early chapters, in which a framework for data fusion is developed. Following this process of diversification, it becomes clear that the original question is not well posed and requires more precise specification. We use the framework to focus on some of the technical issues relevant to the question being addressed. We show that fusion of hard decisions through use of an adaptive version of the maximum a posteriori decision rule yields acceptable performance. Better performance is possible using probability-level fusion, as long as the probabilities are accurate. Of particular interest is the prevalence of overconfidence and the effect it has on fused performance. The production of accurate probabilities from poor-quality data forms the latter part of the thesis. Two approaches are taken. Firstly, the probabilities may be moderated at source (either analytically or numerically). Secondly, the probabilities may be transformed at the fusion centre. In each case an improvement in fused performance is demonstrated. We therefore conclude that, in order to obtain robust fusion, care should be taken to model the probabilities accurately, either at the source or centrally.
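A minimal sketch of the contrast between probability-level fusion and hard-decision fusion, assuming two conditionally independent sources and illustrative class priors and probabilities (none taken from the thesis); the fused maximum a posteriori decision follows the standard Bayesian combination rule rather than the adaptive variant developed in the work.

```python
import numpy as np

# Hypothetical identification problem: 3 classes, 2 independent sources
priors = np.array([0.5, 0.3, 0.2])

# Per-source class probabilities reported for the same object
p_source1 = np.array([0.6, 0.3, 0.1])
p_source2 = np.array([0.2, 0.7, 0.1])

# Probability-level fusion: recover per-source likelihood ratios and combine them
likelihoods = (p_source1 / priors) * (p_source2 / priors)
posterior = priors * likelihoods
posterior /= posterior.sum()
map_decision = int(np.argmax(posterior))      # fused maximum a posteriori class

# Hard-decision fusion: each source only reports its own MAP class (a vote)
votes = [int(np.argmax(p_source1)), int(np.argmax(p_source2))]
print(posterior, map_decision, votes)
```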
Abstract:
In the future, competitors will have more and more opportunities to buy the same information; therefore, companies' competitiveness will not primarily depend on how much information they possess, but rather on how well they can "translate" it into their own language. This study aims to examine the factors that have the most significant impact on the degree to which market studies are utilised by companies. Most of the work in this area has studied the use of information in strategic decisions a priori. This paper, while reflecting on the findings of research on organisational theories of information processing, aims to bridge this gap. It proposes and tests a new conceptual framework that examines the use of managerial market research information in decision-making and knowledge creation within a single model. Survey data collected from the top-income business enterprises in Hungary indicate that market research findings are efficiently incorporated into the marketing information system only if the marketing manager trusts the researcher and believes that the market study is of high quality. Decision-makers are more likely to learn from market studies that facilitate the resolution of a specific problem than from descriptive studies of a more general nature.
How the World Learned to Stop Worrying and Love Failure: Big Data, Resilience and Emergent Causality
Abstract:
In modernity, failure was the discourse of critique; today, it is increasingly the discourse of power: failure has changed its allegiances. Over the last two decades, failure has been enfolded into discourses of power, facilitating the development of new policy approaches. Foremost among governing approaches that seek to include, and to govern through, failure is that of resilience. This article seeks to reflect upon how the understanding of failure has been transformed in this process, particularly linking this transformation to the radical appreciation of contingency and of the limits of instrumental cause-and-effect approaches to rule. Whereas modernity was shaped by a contestation over failure as an epistemological boundary, under conditions of contingency and complexity there appears to be a new consensus on failure as an ontological necessity. This problematic 'ontological turn' is illustrated using examples of changing approaches to risks, especially anthropogenic understandings of environmental threats formerly seen as 'natural'.
Abstract:
A compositional multivariate approach is used to analyse regional-scale soil geochemical data obtained as part of the Tellus Project, generated by the Geological Survey of Northern Ireland (GSNI). The multi-element total concentration data presented comprise XRF analyses of 6862 rural soil samples collected at 20 cm depth on a non-aligned grid at one site per 2 km². Censored data were imputed using published detection limits. Using these imputed values for 46 elements (including LOI), each soil sample site was assigned to the regional geology map provided by GSNI, initially using the dominant lithology for the map polygon. Northern Ireland includes a diversity of geology representing a stratigraphic record from the Mesoproterozoic up to and including the Palaeogene. However, the advance of ice sheets and their meltwaters over the last 100,000 years has left at least 80% of the bedrock covered by superficial deposits, including glacial till and post-glacial alluvium and peat. The question is to what extent the soil geochemistry reflects the underlying geology or the superficial deposits. To address this, the geochemical data were transformed using centered log ratios (clr) to observe the requirements of compositional data analysis and avoid closure issues. Following this, compositional multivariate techniques including compositional principal component analysis (PCA) and minimum/maximum autocorrelation factor (MAF) analysis were used to determine the influence of the underlying geology on the soil geochemistry signature. PCA showed that 72% of the variation was captured by the first four principal components (PCs), implying "significant" structure in the data. Analysis of variance showed that only 10 PCs were necessary to classify the soil geochemical data. To improve on PCA by using the spatial relationships of the data, a classification based on MAF analysis was undertaken using the first 6 dominant factors. Understanding the relationship between soil geochemistry and superficial deposits is important for environmental monitoring of fragile ecosystems such as peat. To explore whether peat cover could be predicted from the classification, the lithology designation was adapted to include the presence of peat, based on GSNI superficial deposit polygons, and linear discriminant analysis (LDA) was undertaken. Prediction accuracy for LDA classification improved from 60.98% based on PCA using 10 principal components to 64.73% using MAF based on the 6 most dominant factors. The misclassification of peat may reflect degradation of peat-covered areas since the creation of the superficial deposit classification. Further work will examine the influence of underlying lithologies on elemental concentrations in peat composition and the effect of this on classification analysis.
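A minimal sketch of the centered log-ratio transform followed by PCA, using a small hypothetical composition matrix; scikit-learn's standard PCA stands in here for the compositional PCA and MAF analysis used in the study.

```python
import numpy as np
from sklearn.decomposition import PCA

def clr(X):
    """Centered log-ratio transform: log of each part over the geometric mean of its row."""
    logX = np.log(X)
    return logX - logX.mean(axis=1, keepdims=True)

# Hypothetical closed compositions (rows sum to 1), e.g. element proportions per sample
rng = np.random.default_rng(1)
raw = rng.lognormal(size=(200, 46))
comp = raw / raw.sum(axis=1, keepdims=True)

# PCA on the clr-transformed data avoids the closure problem of raw proportions
scores = PCA(n_components=4).fit_transform(clr(comp))
print(scores.shape)          # (200, 4): scores on the first four principal components
```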
Abstract:
This paper is part of a special issue of Applied Geochemistry focusing on reliable applications of compositional multivariate statistical methods. This study outlines the application of compositional data analysis (CoDa) to the calibration of geochemical data and to multivariate statistical modelling of geochemistry and grain-size data from a set of Holocene sedimentary cores from the Ganges-Brahmaputra (G-B) delta. Over the last two decades, understanding near-continuous records of sedimentary sequences has required the use of core-scanning X-ray fluorescence (XRF) spectrometry, for both terrestrial and marine sedimentary sequences. Initial XRF data are generally unusable in 'raw' format, requiring data processing to remove instrument bias, as well as informed sequence interpretation. The applicability of conventional calibration equations to core-scanning XRF data is further limited by the constraints posed by unknown measurement geometry and specimen homogeneity, as well as by matrix effects. Log-ratio based calibration schemes have been developed and applied to clastic sedimentary sequences, focusing mainly on energy-dispersive XRF (ED-XRF) core-scanning. This study applied high-resolution core-scanning XRF to Holocene sedimentary sequences from the tide-dominated Indian Sundarbans (Ganges-Brahmaputra delta plain). The Log-Ratio Calibration Equation (LRCE) was applied to a subset of core-scan and conventional ED-XRF data to quantify elemental composition. This provides a robust calibration scheme using reduced major axis regression of log-ratio transformed geochemical data. Through partial least squares (PLS) modelling of geochemical and grain-size data, it is possible to derive robust proxy information for the Sundarbans depositional environment. The application of these techniques to Holocene sedimentary data offers an improved methodological framework for unravelling Holocene sedimentation patterns.
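A minimal sketch of a log-ratio calibration in the spirit of the LRCE, assuming hypothetical paired core-scanner and conventional ED-XRF log-ratios; the reduced major axis slope and intercept are computed from sample means and standard deviations, which is the standard RMA estimator rather than the exact scheme of the paper.

```python
import numpy as np

def rma_fit(x, y):
    """Reduced major axis regression: slope from the ratio of standard deviations."""
    slope = np.sign(np.corrcoef(x, y)[0, 1]) * y.std(ddof=1) / x.std(ddof=1)
    intercept = y.mean() - slope * x.mean()
    return slope, intercept

# Hypothetical paired measurements: log-ratios of an element to a reference element
rng = np.random.default_rng(2)
log_ratio_scanner = rng.normal(0.0, 0.5, 60)                                 # core-scanner counts
log_ratio_edxrf = 1.1 * log_ratio_scanner + 0.2 + rng.normal(0, 0.05, 60)    # conventional ED-XRF

slope, intercept = rma_fit(log_ratio_scanner, log_ratio_edxrf)
calibrated = slope * log_ratio_scanner + intercept   # calibrated log-ratio composition
print(slope, intercept)
```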
Abstract:
Composite plants consisting of a wild-type shoot and a transgenic root are frequently used for functional genomics in legume research. Although transformation of roots using Agrobacterium rhizogenes leads to morphologically normal roots, the question arises as to whether such roots interact with arbuscular mycorrhizal (AM) fungi in the same way as wild-type roots. To address this question, roots transformed with a vector containing the fluorescence marker DsRed were used to analyse AM in terms of mycorrhization rate, morphology of fungal and plant subcellular structures, and transcript and secondary metabolite accumulation. Mycorrhization rate, appearance, and developmental stages of arbuscules were identical in both types of roots. Using Mt16kOLI1Plus microarrays, transcript profiling of mycorrhizal roots showed that 222 genes exhibited at least a 2-fold induction and 73 genes less than half of the expression; most of them are described as AM-regulated in the same direction in wild-type roots. To verify this, typical AM marker genes were analysed by quantitative reverse transcription-PCR, which revealed equal transcript accumulation in transgenic and wild-type roots. Regarding secondary metabolites, several isoflavonoids and apocarotenoids, all known to accumulate in mycorrhizal wild-type roots, were found to be up-regulated in mycorrhizal compared with non-mycorrhizal transgenic roots. This set of data reveals a substantial similarity in the mycorrhization of transgenic and wild-type roots of Medicago truncatula, validating the use of composite plants for studying AM-related effects.
Abstract:
Statistical approaches to the study of extreme events require, by definition, long time series of data. In many scientific disciplines, these series are often subject to variations at different temporal scales that affect the frequency and intensity of their extremes. The assumption of stationarity is therefore violated, and alternatives to conventional stationary extreme value analysis (EVA) must be adopted. Using the example of environmental variables subject to climate change, in this study we introduce the transformed-stationary (TS) methodology for non-stationary EVA. This approach consists of (i) transforming a non-stationary time series into a stationary one, to which the stationary EVA theory can be applied, and (ii) reverse-transforming the result into a non-stationary extreme value distribution. As the transformation, we propose and discuss a simple time-varying normalization of the signal and show that it enables a comprehensive formulation of non-stationary generalized extreme value (GEV) and generalized Pareto distribution (GPD) models with a constant shape parameter. A validation of the methodology is carried out on time series of significant wave height, residual water level, and river discharge, which show varying degrees of long-term and seasonal variability. The results from the proposed approach are comparable with the results from (a) a stationary EVA on quasi-stationary slices of the non-stationary series and (b) the established method for non-stationary EVA. However, the proposed technique comes with advantages in both cases. For example, in contrast to (a), the proposed technique uses the whole time horizon of the series for the estimation of the extremes, allowing for a more accurate estimation of large return levels. Furthermore, with respect to (b), it decouples the detection of non-stationary patterns from the fitting of the extreme value distribution. As a result, the steps of the analysis are simplified and intermediate diagnostics are possible. In particular, the transformation can be carried out by means of simple statistical techniques such as low-pass filters based on the running mean and the standard deviation, and the fitting procedure is a stationary one with a few degrees of freedom that is easy to implement and control. An open-source MATLAB toolbox implementing this methodology has been developed and is available at https://github.com/menta78/tsEva/ (Mentaschi et al., 2016).
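A minimal sketch of the transformed-stationary idea in Python (the tsEva toolbox itself is MATLAB), using a synthetic non-stationary series: a running mean and standard deviation normalize the signal, a stationary GEV is fitted to the annual maxima of the normalized series with scipy.stats.genextreme, and the normalization is reversed to recover a time-varying return level. The series, window length, and return period are illustrative assumptions.

```python
import numpy as np
from scipy.stats import genextreme

def running_stats(x, window):
    """Running mean and standard deviation via a simple moving window (low-pass filter)."""
    kernel = np.ones(window) / window
    mean = np.convolve(x, kernel, mode="same")
    var = np.convolve((x - mean) ** 2, kernel, mode="same")
    return mean, np.sqrt(var)

# Synthetic non-stationary daily series: slow trend plus seasonal cycle plus noise
t = np.arange(365 * 30)
x = 0.0005 * t + np.sin(2 * np.pi * t / 365) + np.random.default_rng(3).gumbel(0, 0.5, t.size)

# (i) transform to a (quasi-)stationary series by time-varying normalization
mu_t, sigma_t = running_stats(x, window=365 * 5)
y = (x - mu_t) / sigma_t

# fit a stationary GEV to the annual maxima of the normalized series
annual_max = y.reshape(30, 365).max(axis=1)
shape, loc, scale = genextreme.fit(annual_max)

# (ii) reverse-transform: time-varying return level for a given return period
rp = 100                                       # years
y_rl = genextreme.ppf(1 - 1 / rp, shape, loc=loc, scale=scale)
x_rl_t = mu_t + sigma_t * y_rl                 # non-stationary 100-year return level
```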