904 results for Data analysis system
Abstract:
Federal Highway Administration, Washington, D.C.
Abstract:
Exploratory data analysis seeks common patterns in order to gain insight into the structure and distribution of the data. In geochemistry it is a valuable means of understanding the complicated processes that make up a petroleum system. Typically, linear visualisation methods such as principal components analysis, linked plots, or brushing are used. These methods cannot be applied directly when data are missing, and although they can capture non-linear structure locally, they struggle to capture it globally. This thesis discusses a complementary approach based on a non-linear probabilistic model. The generative topographic mapping (GTM) enables the effects of many variables to be visualised on a single plot, which can incorporate more structure than a two-dimensional principal components plot. The model can deal with uncertainty and missing data and allows the non-linear structure in the data to be explored. In this thesis a novel approach to initialising the GTM with arbitrary projections is developed. This makes it possible to combine GTM with algorithms such as Isomap and to fit complex non-linear structures such as the Swiss roll. Another novel extension is the incorporation of prior knowledge about the structure of the covariance matrix. This extension greatly enhances the modelling capabilities of the algorithm, resulting in a better fit to the data and better imputation of missing values. Additionally, an extensive benchmark study of the missing-data imputation capabilities of GTM is performed. Furthermore, a novel approach based on missing data is introduced to benchmark the fit of probabilistic visualisation algorithms on unlabelled data. Finally, the work is complemented by evaluating the algorithms on real-life datasets from geochemical projects.
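As a rough illustration of how such a missing-data imputation benchmark can be set up, the sketch below hides a random fraction of entries, imputes them, and scores the result on the hidden values; the iris dataset and the KNN imputer are arbitrary stand-ins, not the GTM-based method developed in the thesis.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.impute import KNNImputer

rng = np.random.default_rng(0)
X = load_iris().data                      # stand-in for a geochemical dataset
mask = rng.random(X.shape) < 0.1          # hide 10% of the entries at random
X_missing = X.copy()
X_missing[mask] = np.nan

# Impute the hidden entries (KNN here; the thesis uses GTM-based imputation)
X_hat = KNNImputer(n_neighbors=5).fit_transform(X_missing)

# Score the imputation only on the entries that were actually hidden
rmse = np.sqrt(np.mean((X_hat[mask] - X[mask]) ** 2))
print(f"imputation RMSE on held-out entries: {rmse:.3f}")
```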
Abstract:
A substantial amount of the information on the Internet is present in the form of text. The value of this semi-structured and unstructured data has been widely acknowledged, with consequent scientific and commercial exploitation. The ever-increasing volume of data produced, however, pushes data analytics platforms to their limits. This thesis proposes techniques for more efficient textual big-data analysis suitable for the Hadoop analytics platform; the research explores the direct processing of compressed textual data. The focus is on developing novel compression methods with a number of desirable properties to support text-based big-data analysis in distributed environments. The novel contributions of this work are as follows. Firstly, a Content-aware Partial Compression (CaPC) scheme is developed. CaPC distinguishes between informational and functional content and compresses only the informational content; the compressed data therefore remains transparent to existing software libraries, which often rely on the functional content to work. Secondly, a context-free, bit-oriented compression scheme (Approximated Huffman Compression) based on the Huffman algorithm is developed. It uses a hybrid data structure that allows pattern searching in compressed data in linear time. Thirdly, several modern compression schemes have been extended so that the compressed data can be safely split with respect to logical data records in distributed file systems. Furthermore, an innovative two-layer compression architecture is used, in which each compression layer is appropriate to the corresponding stage of data processing. Peripheral libraries are developed that seamlessly link the proposed compression schemes to existing analytic platforms and computational frameworks, and that make the use of compressed data transparent to developers. The compression schemes have been evaluated on a number of standard MapReduce analysis tasks using a collection of real-world datasets. In comparison with existing solutions, they show a substantial improvement in performance and a significant reduction in system resource requirements.
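As a toy illustration of the content-aware idea (compressing only informational tokens while leaving functional content such as delimiters untouched, so that record splitters keep working), the sketch below uses an invented log line and a hand-made codebook; it is not the CaPC scheme itself.

```python
import re

line = "GET /index.html HTTP/1.1 200 1043\nGET /logo.png HTTP/1.1 200 512\n"

# Toy dictionary coder: frequent informational tokens get 1-byte codes,
# while functional content (spaces, newlines) is left untouched so that
# whitespace-based splitters and record readers still work.
codebook = {"GET": "\x01", "HTTP/1.1": "\x02", "200": "\x03"}

tokens = re.split(r"( |\n)", line)              # keep separators as tokens
encoded = "".join(codebook.get(t, t) for t in tokens)

print(len(line), "->", len(encoded), "characters")  # payload shrinks, structure intact
```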
Abstract:
Image processing offers unparalleled potential for traffic monitoring and control. For many years engineers have attempted to perfect the art of automatic data abstraction from sequences of video images. This paper outlines a research project undertaken by the authors at Napier University in the field of image processing for automatic traffic analysis. A software-based system implementing TRIP algorithms to count cars and measure vehicle speed has been developed by members of the Transport Engineering Research Unit (TERU) at the University. The TRIP algorithm has been ported to and evaluated on an IBM PC platform with a view to hardware implementation of the pre-processing routines required for vehicle detection. Results show that a software-based traffic counting system is realisable for single-window processing. Because of the high volume of data that must be processed for full frames or multiple lanes, real-time operation is limited, and dedicated hardware therefore needs to be designed. The paper outlines a hardware design implementing inter-frame and background differencing, background updating and shadow-removal techniques. Preliminary results showing the processing time and counting accuracy of the routines implemented in software are presented, and a real-time hardware pre-processing architecture is described.
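A minimal numpy sketch of background differencing with incremental background updating, the kind of pre-processing discussed above; the frames, detection threshold and update rate are synthetic and arbitrary, and the actual TRIP algorithms are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(1)
frames = rng.integers(0, 50, size=(10, 120, 160)).astype(np.float32)  # synthetic scene
frames[5:, 40:60, 70:90] += 120                                       # a "vehicle" appears

background = frames[0].copy()
alpha = 0.05                                  # background update rate (arbitrary)
for frame in frames[1:]:
    diff = np.abs(frame - background)         # background differencing
    vehicle_mask = diff > 60                  # arbitrary detection threshold
    # update the background only where no vehicle is detected
    background = np.where(vehicle_mask, background,
                          (1 - alpha) * background + alpha * frame)
    print("vehicle pixels:", int(vehicle_mask.sum()))
```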
Abstract:
New morpho-bathymetric and tectono-stratigraphic data on the Naples and Salerno Gulfs, derived from bathymetric and seismic data analysis and integrated geological interpretation, are presented here. The CUBE (Combined Uncertainty Bathymetric Estimator) method has been applied to complex morphologies, such as the Capri continental slope and the related geological structures occurring in the Salerno Gulf. The bathymetric data analysis has been carried out for marine geological maps of the whole Campania continental margin at scales ranging from 1:25,000 to 1:10,000, including focused examples in the Naples and Salerno Gulfs, Naples harbour, the Capri and Ischia Islands and the Salerno Valley. Seismic data analysis has allowed the main morpho-structural lineaments recognized at regional scale on multichannel profiles to be correlated with morphological features cropping out at the sea bottom and evident in the bathymetry. The main fault systems in the area are represented on a tectonic sketch map, including the master fault located north of the Salerno Valley half graben. Some normal faults parallel to the master fault have been interpreted from the slope map derived from the bathymetric data. A complex system of antithetic faults bounds two morpho-structural highs located 20 km south of Capri Island. Seismic interpretation also shows some hints of compressional reactivation of normal faults within the extensional setting involving the whole Campania continental margin.
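For context, a slope map of the kind used here to trace normal faults can be derived from a bathymetric grid roughly as follows; the grid is synthetic and the 25 m cell size is an assumption for illustration only.

```python
import numpy as np

# Synthetic bathymetric grid (depths in metres); cell size assumed to be 25 m
cell = 25.0
x, y = np.meshgrid(np.linspace(0, 5000, 201), np.linspace(0, 5000, 201))
depth = -1000 - 0.05 * x - 200 * np.exp(-((x - 2500) ** 2 + (y - 2500) ** 2) / 5e5)

dz_dy, dz_dx = np.gradient(depth, cell)                 # partial derivatives of depth
slope_deg = np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))  # slope angle in degrees

print("max slope (deg):", round(float(slope_deg.max()), 1))
```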
Abstract:
Observing system experiments (OSEs) are carried out over a 1-year period to quantify the impact of Argo observations on the Mercator Ocean 0.25° global ocean analysis and forecasting system. The reference simulation assimilates sea surface temperature (SST), SSALTO/DUACS (Segment Sol multi-missions dALTimetrie, d'orbitographie et de localisation précise/Data unification and Altimeter combination system) altimeter data, and Argo and other in situ observations from the Coriolis data center. Two other simulations are carried out in which all Argo data, or half of the Argo data, are withheld. Assimilating Argo observations has a significant impact on analyzed and forecast temperature and salinity fields at different depths. Without Argo data assimilation, large errors occur in the analyzed fields, as estimated from the differences with respect to in situ observations. For example, in the 0-300 m layer, RMS (root mean square) differences between analyzed fields and observations reach 0.25 psu and 1.25 °C in the western boundary currents and 0.1 psu and 0.75 °C in the open ocean. The impact of the Argo data in reducing observation-model forecast differences is also significant from the surface down to a depth of 2000 m. Differences between in situ observations and forecast fields are reduced by 20 % in the upper layers and by up to 40 % at a depth of 2000 m when Argo data are assimilated. At depth, the most affected regions of the global ocean are the Mediterranean outflow, the Gulf Stream region and the Labrador Sea. A significant degradation can be observed when only half of the data are assimilated. Argo observations therefore matter for constraining the model solution, even in an eddy-permitting model configuration. The impact of assimilating the Argo floats' data on other model variables is briefly assessed: the improvement of the fit to Argo profiles does not lead globally to unphysical corrections to sea surface temperature or sea surface height. The main conclusion is that the performance of the Mercator Ocean 0.25° global data assimilation system depends heavily on the availability of Argo data.
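The layer-wise RMS statistics quoted above are, in essence, computed as in the sketch below; the profile arrays are invented and serve only to show the calculation.

```python
import numpy as np

rng = np.random.default_rng(2)
depth = np.linspace(0, 2000, 101)                     # profile depth levels (m)
obs_temp = 20 * np.exp(-depth / 500) + rng.normal(0, 0.2, depth.size)  # in situ profile
ana_temp = 20 * np.exp(-depth / 500) + rng.normal(0, 0.5, depth.size)  # analyzed field at the profile location

layer = depth <= 300                                  # 0-300 m layer
rms_0_300 = np.sqrt(np.mean((ana_temp[layer] - obs_temp[layer]) ** 2))
print(f"0-300 m RMS analysis-observation difference: {rms_0_300:.2f} degC")
```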
Abstract:
The epoc® blood analysis system (Epocal Inc., Ottawa, Ontario, Canada) is a newly developed in vitro diagnostic hand-held analyzer for testing whole-blood samples at the point of care; it rapidly measures blood gases, electrolytes, ionized calcium, glucose, lactate, and hematocrit/calculated hemoglobin. The analytical performance of the epoc® system was evaluated in a tertiary hospital; see the related research article "Analytical evaluation of the epoc® point-of-care blood analysis system in cardiopulmonary bypass patients" [1]. The data presented comprise the linearity analysis for 9 parameters and the comparison study in 40 cardiopulmonary bypass patients across 3 epoc® meters, the Instrumentation Laboratory GEM4000, the Abbott iSTAT, the Nova CCX, and the Roche Accu-Chek Inform II and Performa glucose meters.
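A linearity analysis of this kind usually amounts to regressing measured against expected concentrations for each parameter; the sketch below uses invented glucose values, not data from the study.

```python
import numpy as np

# Invented glucose linearity data: assigned levels and meter readings (mmol/L)
expected = np.array([2.0, 5.0, 10.0, 15.0, 20.0])
measured = np.array([2.1, 4.9, 10.2, 14.7, 20.3])

slope, intercept = np.polyfit(expected, measured, 1)
r = np.corrcoef(expected, measured)[0, 1]

print(f"slope={slope:.3f} intercept={intercept:.3f} r={r:.4f}")
# A slope near 1, an intercept near 0 and r close to 1 indicate acceptable linearity.
```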
Abstract:
The aim of this novel experimental study is to investigate the behaviour of a 2 m x 2 m model of a masonry groin vault, built by assembling blocks made of a 3D-printed plastic skin filled with mortar. The groin vault was chosen because this vulnerable roofing system is widespread in the historical heritage. Shaking-table tests are carried out to explore the vault response for two support boundary conditions, involving four lateral confinement modes. Processing of the marker displacement data has made it possible to examine the collapse mechanisms of the vault, based on the deformed shapes of the arches. A numerical evaluation then follows, to provide the orders of magnitude of the displacements associated with these mechanisms. Given that these displacements are related to the shortening and elongation of the arches, the final objective is the definition of a critical elongation between two diagonal bricks and, consequently, of a diagonal portion. This study aims to continue the previous work and to take another step forward in the research on ground-motion effects on masonry structures.
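The elongation between two markers can be extracted from displacement data roughly as in the sketch below; the marker trajectories are synthetic and the geometry of the real test is not reproduced.

```python
import numpy as np

t = np.linspace(0, 10, 501)                           # time (s)
# Synthetic 3D trajectories of two markers on adjacent diagonal bricks (metres)
m1 = np.column_stack([0.00 + 0.002 * np.sin(2 * np.pi * t),
                      np.zeros_like(t), np.zeros_like(t)])
m2 = np.column_stack([0.10 + 0.004 * np.sin(2 * np.pi * t + 0.5),
                      np.zeros_like(t), np.zeros_like(t)])

dist = np.linalg.norm(m2 - m1, axis=1)                # instantaneous marker distance
elongation = dist - dist[0]                           # change w.r.t. the undeformed state

print(f"peak elongation: {1000 * elongation.max():.2f} mm")
```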
Abstract:
Hadrontherapy employs high-energy beams of charged particles (protons and heavier ions) to treat deep-seated tumours: these particles have a favourable depth-dose distribution in tissue, characterized by a low dose in the entrance channel and a sharp maximum (Bragg peak) near the end of their path. In these treatments nuclear interactions have to be considered: beam particles can fragment in the human body, releasing a non-zero dose beyond the Bragg peak, while fragments of human-body nuclei can modify the dose released in healthy tissues. These effects are still open questions, given the lack of relevant cross-section data. Space radioprotection can also profit from fragmentation cross-section measurements: interest in long-term manned space missions beyond Low Earth Orbit has been growing in recent years, but such missions have to cope with major health risks due to space radiation. To this end, risk models are under study; however, large gaps in fragmentation cross-section data currently prevent an accurate benchmark of deterministic and Monte Carlo codes. To fill these gaps, the FOOT (FragmentatiOn Of Target) experiment was proposed. It is composed of two independent and complementary setups: an Emulsion Cloud Chamber and an electronic setup composed of several subdetectors providing redundant measurements of the kinematic properties of the fragments produced in nuclear interactions between a beam and a target. FOOT aims to measure double-differential cross sections in both angle and kinetic energy, which is the most complete information with which to address the existing questions. In this Ph.D. thesis, the development of the Trigger and Data Acquisition system for the FOOT electronic setup and a first analysis of data from a 400 MeV/u 16O beam on a carbon target, acquired in July 2021 at GSI (Darmstadt, Germany), are presented. Where possible, a comparison with other available measurements is also reported.
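Schematically, a double-differential cross section is obtained by binning fragment counts in angle and kinetic energy and normalizing by the number of beam particles, the target areal density and the bin widths; the sketch below uses synthetic numbers and omits the efficiency and acceptance corrections that the real FOOT analysis must apply.

```python
import numpy as np

rng = np.random.default_rng(4)
n_beam = 1e9                                 # number of beam ions on target (assumed)
n_t = 5e22                                   # target areal density, nuclei / cm^2 (assumed)

# Synthetic fragment sample: emission angle (deg) and kinetic energy (MeV/u)
theta = np.abs(rng.normal(0, 3, 20000))
ekin = rng.normal(350, 40, 20000)

theta_edges = np.arange(0, 11, 1.0)          # 1-degree angular bins
e_edges = np.arange(200, 501, 20.0)          # 20 MeV/u energy bins
counts, _, _ = np.histogram2d(theta, ekin, bins=[theta_edges, e_edges])

# Solid angle of each angular ring and width of each energy bin
d_omega = 2 * np.pi * (np.cos(np.radians(theta_edges[:-1]))
                       - np.cos(np.radians(theta_edges[1:])))
d_e = np.diff(e_edges)

# d2sigma / (dOmega dE) in cm^2 / (sr * MeV/u)
d2sigma = counts / (n_beam * n_t * d_omega[:, None] * d_e[None, :])
print(d2sigma.shape, d2sigma.max())
```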
Abstract:
Background: Dermatomyositis (DM) and polymyositis (PM) are rare systemic autoimmune rheumatic diseases with high fatality rates. There have been few population-based mortality studies of dermatomyositis and polymyositis in the world, and none had been conducted in Brazil. The objective of the present study was to employ multiple-cause-of-death methodology in the analysis of trends in mortality related to dermatomyositis and polymyositis in the state of Sao Paulo, Brazil, between 1985 and 2007. Methods: We analyzed mortality data from the Sao Paulo State Data Analysis System, selecting all death certificates on which DM or PM was listed as a cause of death. The variables sex, age and underlying, associated or total mentions of causes of death were studied using mortality rates, proportions and historical trends. Statistical analyses were performed with chi-square and Kruskal-Wallis H tests, analysis of variance and linear regression; a p value of less than 0.05 was regarded as significant. Results: Over the 23-year period there were 318 DM-related deaths and 316 PM-related deaths. Overall, DM/PM was designated as an underlying cause in 55.2% of deaths and as an associated cause in 44.8%; among the 634 total deaths, females accounted for 71.5%. During the study period, age- and gender-adjusted DM mortality rates did not change significantly, although PM as an underlying cause and total mentions of PM trended lower (p < 0.05). The mean ages at death were 47.76 ± 20.81 years for DM and 54.24 ± 17.94 years for PM (p = 0.0003). For DM/PM, respectively, as underlying causes, the principal associated causes of death were as follows: pneumonia (in 43.8%/33.5%); respiratory failure (in 34.4%/32.3%); interstitial pulmonary diseases and other pulmonary conditions (in 28.9%/17.6%); and septicemia (in 22.8%/15.9%). For DM/PM, respectively, as associated causes, the principal underlying causes of death were the following: respiratory disorders (in 28.3%/26.0%); circulatory disorders (in 17.4%/20.5%); neoplasms (in 16.7%/13.7%); infectious and parasitic diseases (in 11.6%/9.6%); and gastrointestinal disorders (in 8.0%/4.8%). Of the 318 DM-related deaths, 36 involved neoplasms, compared with 20 of the 316 PM-related deaths (p = 0.03). Conclusions: Our study using multiple causes of death found that DM/PM was identified as the underlying cause of death in only 55.2% of the deaths, indicating that both diseases are underestimated in primary mortality statistics. We observed a predominance of deaths in women and in older individuals, as well as a trend toward stability in the mortality rates. We confirmed that the risk of death is greater when either disease is accompanied by a neoplasm, albeit to a lesser degree in individuals with PM. The investigation of the underlying and associated causes of death related to DM/PM broadens knowledge of the natural history of both diseases and could help integrate mortality data into the evaluation of control measures for DM/PM.
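The DM-versus-PM comparison of neoplasm involvement reported above (36 of 318 vs 20 of 316 deaths, p = 0.03) can be checked with a standard chi-square test on the 2x2 table, as in the sketch below; scipy applies the Yates continuity correction by default, so the p value should come out close to the reported figure.

```python
from scipy.stats import chi2_contingency

# Rows: DM-related deaths, PM-related deaths; columns: with neoplasm, without neoplasm
table = [[36, 318 - 36],
         [20, 316 - 20]]

chi2, p, dof, expected = chi2_contingency(table)   # Yates continuity correction by default
print(f"chi2={chi2:.2f}, p={p:.3f}")               # p of the order of the reported 0.03
```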
Abstract:
In this work a downscaled multicommuted flow injection analysis setup for photometric determination is described. The setup consists of a flow system module and an LED-based photometer, with a total internal volume of about 170 µL. The system was tested by developing an analytical procedure for the photometric determination of iodate in table salt using N,N-diethyl-p-phenylenediamine (DPD) as the chromogenic reagent. Accuracy was assessed by applying the paired t-test to results obtained with the proposed procedure and a reference method, and no significant difference at the 95% confidence level was observed. Other attractive features were also achieved, such as a low reagent consumption of 7.3 µg DPD per determination, a linear response ranging from 0.1 up to 3.0 mg L-1 IO3-, a relative standard deviation of 0.9% (n = 11) for samples containing 0.5 mg L-1 IO3-, a detection limit of 17 µg L-1 IO3-, a sampling throughput of 117 determinations per hour, and a waste generation of only 600 µL per determination.
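Figures of merit such as the linear range and the detection limit typically come from a standard calibration treatment; the sketch below uses invented absorbance readings and assumes the common 3-sigma-of-the-blank convention for the detection limit.

```python
import numpy as np

# Invented calibration data: iodate standards (mg/L) and measured absorbances
conc = np.array([0.1, 0.5, 1.0, 2.0, 3.0])
absorbance = np.array([0.012, 0.058, 0.115, 0.231, 0.349])
blank = np.array([0.0010, 0.0012, 0.0008, 0.0011, 0.0009,
                  0.0010, 0.0013, 0.0009, 0.0011, 0.0010])
replicates = np.array([0.057, 0.058, 0.059, 0.057, 0.058])   # repeated readings of one sample

slope, intercept = np.polyfit(conc, absorbance, 1)
lod = 3 * blank.std(ddof=1) / slope                          # detection limit, 3-sigma convention
rsd = 100 * replicates.std(ddof=1) / replicates.mean()       # repeatability of one sample

print(f"slope={slope:.4f} A per mg/L, LOD={1000 * lod:.1f} ug/L, RSD={rsd:.1f}%")
```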
Abstract:
This paper aims to investigate the influence of some dissolved air flotation (DAF) process variables (specifically, the hydraulic detention time in the contact zone and the supplied dissolved air concentration) and of the pH, as a pretreatment chemical variable, on the micro-bubble size distribution (BSD) in the DAF contact zone. The work was carried out in a pilot plant in which bubbles were measured with an appropriate non-intrusive image acquisition system. The results show that the obtained diameter ranges were in agreement with values reported in the literature (10-100 µm), largely independently of the investigated conditions. The linear (number-average) diameter varied from 20 to 30 µm, while the Sauter diameter (d(3,2)) varied from 40 to 50 µm. In all investigated conditions, D(50) was between 75% and 95%. The BSD may, however, present a different profile (with a bimodal trend) when the volumetric frequency distribution is analyzed, in some cases with peaks at diameters in the 90-100 µm range. Regarding the volumetric frequency analysis, all the investigated parameters can modify the BSD in the DAF contact zone after the release point, thus potentially causing changes in DAF kinetics. This finding prompts further research to verify the effect of these BSD changes on the efficiency of solid particle removal by DAF.
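For reference, the number-average and Sauter (d(3,2)) diameters reported above are computed from the measured bubble diameters as in the sketch below, here with synthetic diameters.

```python
import numpy as np

rng = np.random.default_rng(5)
d = rng.lognormal(mean=np.log(30), sigma=0.4, size=5000)   # synthetic bubble diameters (um)

d_linear = d.mean()                                        # number-average (linear) diameter
d_sauter = (d ** 3).sum() / (d ** 2).sum()                 # Sauter mean diameter d(3,2)
d50 = np.median(d)                                         # number-based median diameter

print(f"mean={d_linear:.1f} um, d32={d_sauter:.1f} um, D50={d50:.1f} um")
```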
Abstract:
The performance of three analytical methods for multiple-frequency bioelectrical impedance analysis (MFBIA) data was assessed. The methods were the established method of Cole and Cole, the newly proposed method of Siconolfi and co-workers, and a modification of this procedure. Method performance was assessed from the adequacy of the curve-fitting techniques, as judged by the correlation coefficient and standard error of the estimate, and from the accuracy of the different methods in determining the theoretical values of impedance parameters describing a set of model electrical circuits. The experimental data were well fitted by all curve-fitting procedures (r = 0.9 with SEE 0.3 to 3.5% or better for most circuit-procedure combinations). Cole-Cole modelling provided the most accurate estimates of circuit impedance values, generally within 1-2% of the theoretical values, followed by the Siconolfi procedure using a sixth-order polynomial regression (1-6% variation). None of the methods, however, accurately estimated circuit parameters when the measured impedances were low (<20 Ω), reflecting the electronic limits of the impedance meter used. These data suggest that Cole-Cole modelling remains the preferred method for the analysis of MFBIA data.
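A minimal sketch of fitting the single-dispersion Cole impedance model Z(w) = R_inf + (R0 - R_inf) / (1 + (j*w*tau)^alpha) to MFBIA data, with real and imaginary parts fitted jointly; the data are synthetic and this is only one common way to set up such a fit, not necessarily the procedure used in the paper.

```python
import numpy as np
from scipy.optimize import curve_fit

def cole_z(f, r_inf, r0, tau, alpha):
    """Single-dispersion Cole impedance model, returned as stacked [Re, Im]."""
    z = r_inf + (r0 - r_inf) / (1 + (2j * np.pi * f * tau) ** alpha)
    return np.concatenate([z.real, z.imag])

freq = np.logspace(3, 6, 30)                       # 1 kHz - 1 MHz sweep
true = (40.0, 80.0, 3e-7, 0.85)                    # R_inf, R0 (ohm), tau (s), alpha
rng = np.random.default_rng(6)
data = cole_z(freq, *true) + rng.normal(0, 0.1, 2 * freq.size)

popt, _ = curve_fit(cole_z, freq, data, p0=(30.0, 100.0, 2e-7, 0.9))
print("fitted R_inf, R0, tau, alpha:", np.round(popt, 4))
```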