56 results for data reduction by factor analysis


Relevance: 100.00%

Abstract:

Bloom filters are a data structure for storing data in a compressed form. They offer excellent space and time efficiency at the cost of some loss of accuracy (so-called lossy compression). This work presents a yes-no Bloom filter, which is a data structure consisting of two parts: the yes-filter, which is a standard Bloom filter, and the no-filter, which is another Bloom filter whose purpose is to represent those objects that were recognised incorrectly by the yes-filter (that is, to recognise the false positives of the yes-filter). By querying the no-filter after an object has been recognised by the yes-filter, we get a chance of rejecting it, which improves the accuracy of data recognition in comparison with a standard Bloom filter of the same total length. A further increase in accuracy is possible if one chooses the objects to include in the no-filter so that it recognises as many false positives as possible but no true positives, thus producing the most accurate yes-no Bloom filter among all yes-no Bloom filters. This paper studies how optimization techniques can be used to maximize the number of false positives recognised by the no-filter, under the constraint that it recognise no true positives. To achieve this aim, an Integer Linear Program (ILP) is proposed for the optimal selection of false positives. In practice the problem size is normally large, making the optimal solution intractable. Given the similarity of the ILP to the Multidimensional Knapsack Problem, an Approximate Dynamic Programming (ADP) model is developed, making use of a reduced ILP for the value function approximation. Numerical results show that the ADP model performs best compared with a number of heuristics as well as the CPLEX built-in branch-and-bound (B&B) solver, and it is the approach recommended for use in yes-no Bloom filters. In the wider context of the study of lossy compression algorithms, our research is an example of how the arsenal of optimization methods can be applied to improving the accuracy of compressed data.
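The query path of such a two-part filter is simple enough to sketch. The following minimal Python sketch illustrates the yes/no query logic described above, assuming a basic double-hashing Bloom filter; the class and parameter names are illustrative, not the authors' implementation, and the optimal selection of false positives for the no-filter is outside its scope.

    import hashlib

    class BloomFilter:
        def __init__(self, m, k):
            self.m, self.k = m, k
            self.bits = [False] * m

        def _positions(self, item):
            # Double hashing: derive k probe positions from one SHA-256 digest.
            digest = hashlib.sha256(item.encode()).digest()
            h1 = int.from_bytes(digest[:8], "big")
            h2 = int.from_bytes(digest[8:16], "big") or 1
            return [(h1 + i * h2) % self.m for i in range(self.k)]

        def add(self, item):
            for p in self._positions(item):
                self.bits[p] = True

        def query(self, item):
            return all(self.bits[p] for p in self._positions(item))

    class YesNoBloomFilter:
        # Yes-filter stores the target set; no-filter stores chosen false positives.
        def __init__(self, m_yes, m_no, k):
            self.yes = BloomFilter(m_yes, k)
            self.no = BloomFilter(m_no, k)

        def query(self, item):
            # Accept only if the yes-filter matches and the no-filter does not veto.
            return self.yes.query(item) and not self.no.query(item)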

Relevance: 100.00%

Abstract:

The differential phase (ΦDP) measured by polarimetric radars is recognized to be a very good indicator of the path-integrated attenuation by rain. Moreover, if a linear relationship is assumed between the specific differential phase (KDP) and the specific attenuation (AH) and specific differential attenuation (ADP), then attenuation can easily be corrected. The coefficients of proportionality, γH and γDP, are, however, known to depend in rain upon drop temperature, drop shapes, drop size distribution, and the presence of large drops causing Mie scattering. In this paper, the authors extensively apply a physically based method, often referred to as the "Smyth and Illingworth constraint," which uses the constraint that the value of the differential reflectivity ZDR on the far side of the storm should be low in order to retrieve the γDP coefficient. More than 30 convective episodes observed by the French operational C-band polarimetric Trappes radar during two summers (2005 and 2006) are used to document the variability of γDP with respect to the intrinsic three-dimensional characteristics of the attenuating cells. The Smyth and Illingworth constraint could be applied to only 20% of all attenuated rays of the 2-yr dataset, so it cannot be considered the unique solution for attenuation correction in an operational setting, but it is useful for characterizing the properties of the strongly attenuating cells. The range of variation of γDP is shown to be extremely large, with minimal, maximal, and mean values equal, respectively, to 0.01, 0.11, and 0.025 dB per degree. Coefficient γDP appears to be almost linearly correlated with the horizontal reflectivity (ZH), differential reflectivity (ZDR), specific differential phase (KDP), and correlation coefficient (ρHV) of the attenuating cells. The temperature effect is negligible with respect to that of the microphysical properties of the attenuating cells. Unusually large values of γDP, above 0.06 dB per degree, often referred to as "hot spots," are reported for a nonnegligible 15% of the rays presenting a significant total differential phase shift (ΔϕDP > 30°). The corresponding strongly attenuating cells are shown to have extremely high ZDR (above 4 dB) and ZH (above 55 dBZ), very low ρHV (below 0.94), and high KDP (above 4° km−1). Analysis of 4 yr of observed raindrop spectra does not reproduce such low values of ρHV, suggesting that (wet) ice is likely to be present in the precipitation medium and responsible for the attenuation and high phase shifts. Furthermore, if melting ice is responsible for the high phase shifts, this suggests that KDP may not be uniquely related to rainfall rate but can result from the presence of wet ice. This hypothesis is supported by the analysis of vertical profiles of horizontal reflectivity and by the values of conventional probability-of-hail indexes.
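The linear relation underlying this kind of correction is easy to state concretely. The short Python sketch below shows the ΦDP-based correction implied above, assuming AH = γH·KDP so that the path-integrated attenuation is approximately γH·ΦDP (and likewise γDP·ΦDP for differential attenuation); the coefficient values and the profiles are illustrative placeholders, not the Trappes radar data.

    import numpy as np

    # Illustrative proportionality coefficients in dB per degree (assumed values).
    gamma_h, gamma_dp = 0.08, 0.025

    # Accumulated differential phase along the ray (degrees) and measured moments.
    phidp = np.array([0.0, 10.0, 25.0, 40.0])
    zh_meas = np.array([48.0, 50.0, 46.0, 44.0])   # measured reflectivity (dBZ)
    zdr_meas = np.array([1.2, 1.5, 0.8, 0.5])      # measured differential reflectivity (dB)

    # Linear PhiDP-based attenuation correction.
    zh_corr = zh_meas + gamma_h * phidp     # adds back the two-way attenuation
    zdr_corr = zdr_meas + gamma_dp * phidp  # adds back the differential attenuation
    print(zh_corr, zdr_corr)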

Relevance: 100.00%

Abstract:

The time course of metabolic events in response to the model hepatotoxin ethionine (800 mg/kg) was investigated over a 7 day period in rats using high-resolution ¹H NMR spectroscopic analysis of urine and multivariate statistics. Complementary information was obtained by multivariate analysis of ¹H MAS NMR spectra of intact liver and by conventional histopathology and clinical chemistry of blood plasma. ¹H MAS NMR spectra of liver showed toxin-induced lipidosis 24 h postdose, consistent with the steatosis observed by histopathology, while hypertaurinuria was suggestive of liver injury. Early biochemical changes in urine included elevation of guanidinoacetate, suggesting impaired methylation reactions. Urinary increases in 5-oxoproline and glycine suggested disruption of the gamma-glutamyl cycle. Signs of ATP depletion, together with impairment of energy metabolism, were indicated by decreased levels of tricarboxylic acid cycle intermediates, the appearance of ketone bodies in urine, the depletion of hepatic glucose and glycogen, and hypoglycemia. The observed increase in nicotinuric acid in urine could indicate an increase in NAD catabolism, a possible consequence of ATP depletion. Effects on the gut microbiota were suggested by the observed urinary reductions in the microbial metabolites 3-/4-hydroxyphenylpropionic acid, dimethylamine, and tryptamine. At later stages of toxicity there was evidence of kidney damage, indicated by the tubular damage observed by histopathology and supported by increased urinary excretion of lactic acid, amino acids, and glucose. These studies have given new insights into the mechanisms of ethionine-induced toxicity and show the value of multisystem-level data integration in the understanding of experimental models of toxicity or disease.

Relevance: 100.00%

Abstract:

We examined the relationship between blood antioxidant enzyme activities, indices of inflammatory status and a number of lifestyle factors in the Caerphilly prospective cohort study of ischaemic heart disease. The study began in 1979 and is based on a representative male population sample. Initially 2512 men were seen in phase I and followed up every 5 years in phases II and III; they have recently been seen in phase IV. Data on social class, smoking habit and alcohol consumption were obtained by questionnaire, and body mass index was measured. Antioxidant enzyme activities and indices of inflammatory status were estimated by standard techniques. Significant associations were observed for: age with α-1-antichymotrypsin (p<0.0001) and with caeruloplasmin, both protein and oxidase (p<0.0001); smoking habit with α-1-antichymotrypsin (p<0.0001), with caeruloplasmin, both protein and oxidase (p<0.0001), and with glutathione peroxidase (GPX) (p<0.0001); social class with α-1-antichymotrypsin (p<0.0001), with caeruloplasmin, both protein (p<0.001) and oxidase (p<0.01), and with GPX (p<0.0001); and body mass index with α-1-antichymotrypsin (p<0.0001) and with caeruloplasmin protein (p<0.001). There was no significant association between alcohol consumption and any of the blood enzymes measured. Factor analysis produced a three-factor model (explaining 65.9% of the variation in the data set) which appeared to indicate close inter-relationships among the antioxidants.
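As a concrete illustration of the data reduction reported in the final sentence, the following Python sketch fits a three-factor model to synthetic stand-in data and reports the share of variance each factor captures; the variable count, random seed and generated data are illustrative assumptions, not the Caerphilly measurements.

    import numpy as np
    from sklearn.decomposition import FactorAnalysis
    from sklearn.preprocessing import StandardScaler

    rng = np.random.default_rng(0)

    # Synthetic stand-in for six blood measures driven by three latent factors.
    latent = rng.normal(size=(500, 3))
    weights = rng.normal(size=(3, 6))
    X = StandardScaler().fit_transform(latent @ weights + 0.5 * rng.normal(size=(500, 6)))

    fa = FactorAnalysis(n_components=3).fit(X)
    loadings = fa.components_.T                       # variables x factors

    # Proportion of total standardised variance captured by each factor.
    explained = (loadings ** 2).sum(axis=0) / X.shape[1]
    print(loadings.round(2))
    print(explained.round(3), "total:", explained.sum().round(3))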

Relevance: 100.00%

Abstract:

Boreal winter wind storm situations over Central Europe are investigated by means of an objective cluster analysis. Surface data from the NCEP reanalysis and from an ECHAM4/OPYC3 climate change GHG simulation (IS92a) are considered. To achieve an optimum separation of clusters of extreme storm conditions, 55 clusters of weather patterns are differentiated. To reduce the computational effort, a PCA is initially performed, leading to a data reduction of about 98%. The clustering itself is computed on 3-day periods constructed from the first six PCs using the k-means clustering algorithm. The applied method enables an evaluation of the time evolution of the synoptic developments. The climate change signal is constructed by projecting the GCM simulation onto the EOFs obtained from the NCEP reanalysis. Consequently, the same clusters are obtained and their frequency distributions can be compared. For Central Europe, four primary storm clusters are identified. These clusters capture almost 72% of the historical extreme storm events while accounting for only 5% of the total relative frequency. Moreover, they show a statistically significant signature in the associated wind fields over Europe. An increased frequency of the Central European storm clusters is detected under enhanced GHG conditions, associated with an intensified pressure gradient over Central Europe. Consequently, more intense wind events over Central Europe are expected. The presented algorithm will be highly valuable for the analysis of the huge data volumes required for, e.g., multi-model ensemble analysis, particularly because of the enormous data reduction.
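A compact sketch of this PCA-plus-k-means pipeline is given below in Python; the 6 retained components and 55 clusters follow the abstract, while the synthetic field dimensions and sample count are illustrative assumptions rather than the NCEP or ECHAM4/OPYC3 data.

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(1)

    # Stand-in for 3-day surface fields: samples x grid points.
    fields = rng.normal(size=(3000, 2500))

    # PCA reduces each field to 6 principal-component scores (large data reduction).
    pca = PCA(n_components=6).fit(fields)
    scores = pca.transform(fields)
    print("variance retained:", pca.explained_variance_ratio_.sum().round(3))

    # k-means on the PC scores yields 55 weather-pattern clusters.
    labels = KMeans(n_clusters=55, n_init=10, random_state=0).fit_predict(scores)
    print("first cluster frequencies:", np.bincount(labels)[:5])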

Relevance: 100.00%

Abstract:

Proneural genes such as Ascl1 are known to promote cell cycle exit and neuronal differentiation when expressed in neural progenitor cells. However, the mechanisms by which proneural genes activate neurogenesis, and in particular the genes that they regulate, are mostly unknown. We performed a genome-wide characterization of the transcriptional targets of Ascl1 in the embryonic brain and in neural stem cell cultures by location analysis and by expression profiling of embryos overexpressing or mutant for Ascl1. The wide range of molecular and cellular functions represented among these targets suggests that Ascl1 directly controls the specification of neural progenitors as well as the later steps of neuronal differentiation and neurite outgrowth. Surprisingly, Ascl1 also regulates the expression of a large number of genes involved in cell cycle progression, including canonical cell cycle regulators and oncogenic transcription factors. Mutational analysis in the embryonic brain and manipulation of Ascl1 activity in neural stem cell cultures revealed that Ascl1 is indeed required for normal proliferation of neural progenitors. This study identified a novel and unexpected activity of the proneural gene Ascl1 and revealed a direct molecular link between the phase of expansion of neural progenitors and the subsequent phases of cell cycle exit and neuronal differentiation.

Relevance: 100.00%

Abstract:

Social networks have gained remarkable attention in the last decade. Accessing social network sites such as Twitter, Facebook, LinkedIn and Google+ through the internet and Web 2.0 technologies has become more affordable. People are becoming more interested in, and more reliant on, social networks for information, news and the opinions of other users on diverse subject matters. This heavy reliance on social network sites causes them to generate massive data characterised by three computational issues, namely size, noise and dynamism. These issues often make social network data very complex to analyse manually, making computational means of analysis pertinent. Data mining provides a wide range of techniques for detecting useful knowledge, such as trends, patterns and rules, from massive datasets [44]. Data mining techniques are used for information retrieval, statistical modelling and machine learning. These techniques employ data pre-processing, data analysis and data interpretation processes in the course of data analysis. This survey discusses the different data mining techniques used to mine diverse aspects of social networks over the past decades, ranging from historical techniques to up-to-date models, including our novel technique named TRCM. All the techniques covered in this survey are listed in Table 1, together with the tools employed and the names of their authors.

Relevance: 100.00%

Abstract:

Prior literature has shown that the Felder and Silverman learning styles model (FSLSM) is widely adopted to cater to the individual styles of learners, whether in traditional or Technology Enhanced Learning (TEL). To infer this model, the Index of Learning Styles (ILS) instrument was proposed. This research aims to analyse the soundness of this instrument in an Arabic sample. Data were integrated from different courses and years. A total of 259 engineering students participated voluntarily in the study. Reliability was analysed by applying internal construct reliability, inter-scale correlation and total item correlation. Construct validity was also considered by running factor analysis. The overall results indicated that the reliability and validity of the perception and input dimensions were moderately supported, whereas the processing and understanding dimensions showed low internal-construct consistency and their items loaded weakly on the associated constructs. Generally, the instrument needs further effort to improve its soundness. However, given the consistency of the results obtained for engineering students irrespective of cross-cultural differences, it can be adopted to diagnose learning styles.
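For readers unfamiliar with the reliability checks named here, the short Python sketch below computes one common internal-consistency measure, Cronbach's alpha, on synthetic stand-in responses; treating alpha as the reliability statistic and scoring 11 dichotomous items per ILS dimension are assumptions made purely for illustration.

    import numpy as np

    def cronbach_alpha(items):
        # items: respondents x items matrix of scored responses (0/1 here).
        items = np.asarray(items, dtype=float)
        k = items.shape[1]
        item_variances = items.var(axis=0, ddof=1).sum()
        total_variance = items.sum(axis=1).var(ddof=1)
        return (k / (k - 1)) * (1 - item_variances / total_variance)

    rng = np.random.default_rng(2)
    responses = rng.integers(0, 2, size=(259, 11))   # 259 students, 11 items of one dimension
    print(round(cronbach_alpha(responses), 3))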

Relevance: 100.00%

Abstract:

ERA-40 is a re-analysis of meteorological observations from September 1957 to August 2002 produced by the European Centre for Medium-Range Weather Forecasts (ECMWF) in collaboration with many institutions. The observing system changed considerably over this re-analysis period, with assimilable data provided by a succession of satellite-borne instruments from the 1970s onwards, supplemented by increasing numbers of observations from aircraft, ocean buoys and other surface platforms, but with a declining number of radiosonde ascents since the late 1980s. The observations used in ERA-40 were accumulated from many sources. The first part of this paper describes the data acquisition and the principal changes in data type and coverage over the period. It also describes the data assimilation system used for ERA-40. This benefited from many of the changes introduced into operational forecasting since the mid-1990s, when the systems used for the 15-year ECMWF re-analysis (ERA-15) and the National Centers for Environmental Prediction/National Center for Atmospheric Research (NCEP/NCAR) re-analysis were implemented. Several of the improvements are discussed. General aspects of the production of the analyses are also summarized. A number of results indicative of the overall performance of the data assimilation system, and implicitly of the observing system, are presented and discussed. The comparison of background (short-range) forecasts and analyses with observations, the consistency of the global mass budget, the magnitude of differences between analysis and background fields, and the accuracy of medium-range forecasts run from the ERA-40 analyses are illustrated. Several results demonstrate the marked improvement that was made to the observing system for the southern hemisphere in the 1970s, particularly towards the end of the decade. In contrast, the synoptic quality of the analysis for the northern hemisphere is sufficient to provide forecasts that remain skilful well into the medium range for all years. Two particular problems are also examined: excessive precipitation over tropical oceans and a too-strong Brewer-Dobson circulation, both of which are pronounced in later years. Several other aspects of the quality of the re-analyses revealed by monitoring and validation studies are summarized. Expectations that the second-generation ERA-40 re-analysis would provide products that are better than those from the first-generation ERA-15 and NCEP/NCAR re-analyses are found to have been met in most cases. © Royal Meteorological Society, 2005. The contributions of N. A. Rayner and R. W. Saunders are Crown copyright.

Relevance: 100.00%

Abstract:

Perchlorate-reducing bacteria fractionate stable chlorine isotopes, providing a powerful approach to monitoring the extent of microbial consumption of perchlorate in contaminated sites undergoing remediation or in naturally perchlorate-containing sites. This study reports the full experimental data and methodology used to re-evaluate the chlorine isotope fractionation of perchlorate reduction, Δ37Cl(Cl−–ClO4−), in duplicate culture experiments of Azospira suillum strain PS at 37 °C, previously reported without a supporting data set by Coleman et al. [Coleman, M.L., Ader, M., Chaudhuri, S., Coates, J.D., 2003. Microbial Isotopic Fractionation of Perchlorate Chlorine. Appl. Environ. Microbiol. 69, 4997-5000] in a reconnaissance study, with the goal of increasing the accuracy and precision of the isotopic fractionation determination. The method, fully described here for the first time, allows the determination of a higher-precision Δ37Cl(Cl−–ClO4−) value, either from the accumulated chloride content and isotopic composition or from the residual perchlorate content and isotopic composition. The two result sets agree perfectly, within error, giving an average Δ37Cl(Cl−–ClO4−) = −14.94 ± 0.15‰. Complementary use of chloride and perchlorate data allowed the identification and rejection of poor quality data by applying mass and isotopic balance checks. This precise Δ37Cl(Cl−–ClO4−) value can serve as a reference point for comparison with future in situ or microcosm studies, but we also note its similarity to the theoretical equilibrium isotopic fractionation between a hypothetical chlorine species of redox state +6 and perchlorate at 37 °C, and suggest that the first electron transfer during perchlorate reduction may occur at isotopic equilibrium between an enzyme-bound chlorine species and perchlorate. (C) 2008 Elsevier B.V. All rights reserved.
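Determinations from either the residual perchlorate or the accumulated chloride are commonly handled with Rayleigh-type fractionation relations; the Python sketch below fits the linearised form of that relation to synthetic data, using the reported −14.94‰ value only to generate the illustration. That the authors' calculation takes exactly this linearised form is an assumption.

    import numpy as np

    eps = -14.94                               # per mil enrichment factor (value from the abstract)
    f = np.array([1.0, 0.8, 0.6, 0.4, 0.2])    # fraction of perchlorate remaining
    delta0 = 0.0                               # assumed initial delta37Cl of perchlorate (per mil)

    # Rayleigh approximation: residual perchlorate becomes isotopically heavier as f falls.
    delta_residual = delta0 + eps * np.log(f)

    # Recover the enrichment factor as the slope of delta37Cl versus ln(f).
    slope = np.polyfit(np.log(f), delta_residual, 1)[0]
    print(round(slope, 2))                     # ~ -14.94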

Relevance: 100.00%

Abstract:

Severe acute respiratory syndrome (SARS) coronavirus infection and growth are dependent on initiating signaling and enzyme actions upon viral entry into the host cell. Proteins packaged during virus assembly may subsequently form the first line of attack and host manipulation upon infection. A complete characterization of virion components is therefore important to understanding the dynamics of the early stages of infection. Mass spectrometry and kinase profiling techniques identified nearly 200 incorporated host and viral proteins. We used published interaction data to identify hubs of connectivity with potential significance for virion formation. Surprisingly, the hub with the most potential connections was not the viral M protein but nonstructural protein 3 (nsp3), which is one of the novel virion components identified by mass spectrometry. Based on new experimental data and a bioinformatics analysis across the Coronaviridae, we propose a higher-resolution functional domain architecture for nsp3 that determines the interaction capacity of this protein. Using recombinant protein domains expressed in Escherichia coli, we identified two additional RNA-binding domains of nsp3. One of these domains is located within the previously described SARS-unique domain, and the other is a nucleic acid chaperone-like domain located immediately downstream of the papain-like proteinase domain. We also identified a novel cysteine-coordinated metal ion-binding domain. Analyses of interdomain interactions and a provisional functional annotation of the remaining, so-far-uncharacterized domains are presented. Overall, the ensemble of data surveyed here paints a more complete picture of nsp3 as a conserved component of the viral protein processing machinery, which is intimately associated with viral RNA in its role as a virion component.

Relevance: 100.00%

Abstract:

The reduction of water-insoluble indigo by the recently isolated moderate thermophile Clostridium isatidis has been studied with the aim of developing a sustainable technology for industrial indigo reduction. The ability to reduce indigo was not shared by C. aurantibutyricum, C. celatum or C. papyrosolvens, although C. papyrosolvens could reduce indigo carmine (indigo-5,5′-disulfonic acid), a soluble indigo derivative. The supernatant from cultures of C. isatidis, but not from cultures of the other bacteria tested, decreased the indigo particle size to one-tenth of its original diameter. Addition of madder powder, anthraquinone-2,6-disulfonic acid and humic acid all stimulated indigo reduction by C. isatidis. Redox potentials of cultures of C. isatidis were about 100 mV more negative than those of C. aurantibutyricum, C. celatum and C. papyrosolvens, and reached −600 mV versus the SCE in the presence of indigo, but the potentials were not consistently affected by the addition of the quinone compounds, which probably act by modifying the surface of the bacteria or of the indigo particles. It is concluded that C. isatidis can reduce indigo because (1) it produces an extracellular factor that decreases indigo particle size, and (2) it generates a sufficiently reducing potential.

Relevance: 100.00%

Abstract:

It is generally thought that catalysts produced by incipient wetness impregnation (IW) are very poor for low temperature CO oxidation, and that it is necessary to use methods such as deposition-precipitation (DP) to make high activity materials. The former is true; indeed, such IW catalysts are poor, and we present reactor data, XPS and TEM analysis which show that this is due to the very negative effect of the chloride anion involved in the preparation, which results in poisoning and excessive sintering of the Au particles. With the DP method, the chloride is largely removed during the preparation and so poisoning and sintering are avoided. However, we show here that, contrary to previous considerations, high activity catalysts can indeed be prepared by the incipient wetness method, provided care is taken to remove the chloride ion during the process. This is achieved by using the double impregnation method (DIM), in which a double impregnation of chloroauric acid and a base is made to precipitate gold hydroxide within the pores of the catalyst, followed by limited washing. This results in a much more active catalyst, which is active for CO oxidation at ambient temperature. The results for DIM and DP are compared, and it is proposed that the DIM method may represent an environmentally and economically more favorable route to high activity gold catalyst production. (C) 2007 Elsevier B.V. All rights reserved.

Relevance: 100.00%

Abstract:

Purpose – The main aim of this paper is to present the results of a study examining managers' attitudes towards the deployment and use of information and communications technology (ICT) in their organisations. The study comes at a time when ICT is being recognised as a major enabler of innovation and new business models, which have the potential to have a major impact on western economies and jobs.

Design/methodology/approach – A questionnaire was specially designed to collect data relating to three research questions; it also included a number of open-ended questions. A total of 181 managers from a wide range of industries across a number of countries participated in the electronic survey. The quantitative responses were analysed using SPSS: exploratory factor analysis with Varimax rotation was applied, and ANOVA was used to compare the responses of different groups.

Findings – The survey showed that many of the respondents appeared equipped to work "any place, any time". However, it also highlighted the challenges managers face in working in a connected operation. The data also suggested that many managers were less than confident about their companies' policies and practices in relation to information management.

Originality/value – A next step from this exploratory research could be the development of a model exploring the impact of ICT on management and organisational performance in terms of the personal characteristics of the manager, the role performed, the context and the ICT provision. Further research could also examine in more detail the differences between management levels.
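The two quantitative steps named in the methodology (varimax-rotated exploratory factor analysis and ANOVA across groups) can be sketched compactly in Python; the synthetic Likert responses, item count, factor count and group labels below are illustrative assumptions, not the survey data, which were analysed in SPSS.

    import numpy as np
    from scipy.stats import f_oneway
    from sklearn.decomposition import FactorAnalysis

    rng = np.random.default_rng(3)

    # Stand-in survey: 181 managers answering 12 Likert-scale items (1-5).
    responses = rng.integers(1, 6, size=(181, 12)).astype(float)

    # Exploratory factor analysis with varimax rotation.
    fa = FactorAnalysis(n_components=3, rotation="varimax").fit(responses)
    print(fa.components_.T.round(2))          # item loadings on the rotated factors

    # One-way ANOVA comparing a summary score across three illustrative manager groups.
    groups = rng.integers(0, 3, size=181)
    score = responses.mean(axis=1)
    print(f_oneway(*(score[groups == g] for g in range(3))))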

Relevance: 100.00%

Abstract:

A wireless sensor network (WSN) is a group of sensors linked by a wireless medium to perform distributed sensing tasks. WSNs have attracted wide interest from academia and industry alike due to their diversity of applications in buildings, including home automation, smart environments and emergency services. The primary goal of a WSN is to collect the data sensed by its sensors. These data are characteristically heavily noisy and exhibit temporal and spatial correlation. In order to extract useful information from such data, as this paper will demonstrate, various techniques are needed to analyse them. Data mining is a process in which a wide spectrum of data analysis methods is used. It is applied in this paper to analyse data collected from WSNs monitoring an indoor environment in a building. A case study is given to demonstrate how data mining can be used to optimise the use of office space in a building.