887 results for server log data analysis, vector space models, matrix methods of data analysis, tensor space modelling of web users, clustering, association rule mining, user profile, group profile, object profiling, recommendation
Abstract:
In this paper, the reduced level of rock at Bangalore, India, is estimated from data for 652 boreholes in an area covering 220 sq. km. To predict the reduced level of rock in the subsurface of Bangalore and to study the spatial variability of the rock depth, ordinary kriging and Support Vector Machine (SVM) models have been developed. In ordinary kriging, knowledge of the semivariogram of the reduced level of rock at the 652 points is used to predict the reduced level of rock at any point in the subsurface of Bangalore where field measurements are not available. A cross-validation (Q1 and Q2) analysis is also carried out for the developed ordinary kriging model. The SVM, a novel type of learning machine based on statistical learning theory, performs regression by introducing an ε-insensitive loss function and has been used to predict the reduced level of rock from a large set of data. A comparison between the ordinary kriging and SVM models demonstrates that the SVM is superior to ordinary kriging in predicting rock depth.
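The ε-insensitive loss named in this abstract can be illustrated with a minimal sketch. The function name and the default value of eps are assumptions for illustration; the abstract does not report the settings used in the study.

```python
import numpy as np

def eps_insensitive_loss(y_true, y_pred, eps=0.1):
    """Epsilon-insensitive loss: residuals no larger than eps cost nothing,
    larger residuals are penalised linearly by the excess over eps."""
    residual = np.abs(np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float))
    return np.maximum(0.0, residual - eps)
```

A prediction that misses by less than eps contributes zero loss, which is what lets support vector regression ignore small deviations and fit with a sparse set of support vectors.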
Abstract:
In most previous research on distributional semantics, Vector Space Models (VSMs) of words are built either from topical information (e.g., documents in which a word is present), or from syntactic/semantic types of words (e.g., dependency parse links of a word in sentences), but not both. In this paper, we explore the utility of combining these two representations to build a VSM for the task of semantic composition of adjective-noun phrases. Through extensive experiments on benchmark datasets, we find that even though a type-based VSM is effective for semantic composition, it is often outperformed by a VSM built using a combination of topic- and type-based statistics. We also introduce a new evaluation task wherein we predict the composed vector representation of a phrase from the brain activity of a human subject reading that phrase. We exploit a large syntactically parsed corpus of 16 billion tokens to build our VSMs, with vectors for both phrases and words, and make them publicly available.
Abstract:
The influence matrix is used in ordinary least-squares applications for monitoring statistical multiple-regression analyses. Concepts related to the influence matrix provide diagnostics on the influence of individual data on the analysis - the analysis change that would occur by leaving one observation out, and the effective information content (degrees of freedom for signal) in any sub-set of the analysed data. In this paper, the corresponding concepts have been derived in the context of linear statistical data assimilation in numerical weather prediction. An approximate method to compute the diagonal elements of the influence matrix (the self-sensitivities) has been developed for a large-dimension variational data assimilation system (the four-dimensional variational system of the European Centre for Medium-Range Weather Forecasts). Results show that, in the boreal spring 2003 operational system, 15% of the global influence is due to the assimilated observations in any one analysis, and the complementary 85% is the influence of the prior (background) information, a short-range forecast containing information from earlier assimilated observations. About 25% of the observational information is currently provided by surface-based observing systems, and 75% by satellite systems. Low-influence data points usually occur in data-rich areas, while high-influence data points are in data-sparse areas or in dynamically active regions. Background-error correlations also play an important role: high correlation diminishes the observation influence and amplifies the importance of the surrounding real and pseudo observations (prior information in observation space). Incorrect specifications of background and observation-error covariance matrices can be identified, interpreted and better understood by the use of influence-matrix diagnostics for the variety of observation types and observed variables used in the data assimilation system. 
Copyright © 2004 Royal Meteorological Society
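For the ordinary least-squares setting mentioned at the start of this abstract, the influence matrix and its diagonal can be sketched as follows. This is only the textbook regression case on synthetic data, not the approximate variational method the paper develops; the function and variable names are illustrative assumptions.

```python
import numpy as np

def influence_matrix(X):
    """Return H = X (X^T X)^{-1} X^T for a full-column-rank design matrix X."""
    X = np.asarray(X, dtype=float)
    # Solve the normal equations instead of forming an explicit inverse.
    return X @ np.linalg.solve(X.T @ X, X.T)

# Synthetic design matrix (assumption): intercept plus two random predictors.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(20), rng.normal(size=(20, 2))])

H = influence_matrix(X)
self_sensitivities = np.diag(H)         # sensitivity of each fitted value to its own observation
dof_signal = self_sensitivities.sum()   # trace(H): degrees of freedom for signal
```

In this setting the trace equals the number of fitted parameters (three here), and each diagonal element lies between 0 and 1, so it can be read as the fraction of a fitted value determined by its own observation.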
Abstract:
Vintage-based vector autoregressive models of a single macroeconomic variable are shown to be a useful vehicle for obtaining forecasts of different maturities of future and past observations, including estimates of post-revision values. The forecasting performance of models which include information on annual revisions is superior to that of models which only include the first two data releases. However, the empirical results indicate that a model which reflects the seasonal nature of data releases more closely does not offer much improvement over an unrestricted vintage-based model which includes three rounds of annual revisions.
Abstract:
We present a framework for fitting multiple random walks to animal movement paths consisting of ordered sets of step lengths and turning angles. Each step and turn is assigned to one of a number of random walks, each characteristic of a different behavioral state. Behavioral state assignments may be inferred purely from movement data or may include the habitat type in which the animals are located. Switching between different behavioral states may be modeled explicitly using a state transition matrix estimated directly from data, or switching probabilities may take into account the proximity of animals to landscape features. Model fitting is undertaken within a Bayesian framework using the WinBUGS software. These methods allow for identification of different movement states using several properties of observed paths and lead naturally to the formulation of movement models. Analysis of relocation data from elk released in east-central Ontario, Canada, suggests a biphasic movement behavior: elk are either in an "encamped" state in which step lengths are small and turning angles are high, or in an "exploratory" state, in which daily step lengths are several kilometers and turning angles are small. Animals encamp in open habitat (agricultural fields and opened forest), but the exploratory state is not associated with any particular habitat type.
Abstract:
This article explores two matrix methods to induce the "shades of meaning" (SoM) of a word. A matrix representation of a word is computed from a corpus of traces based on the given word. Non-negative Matrix Factorisation (NMF) and Singular Value Decomposition (SVD) each compute a set of vectors, each vector corresponding to a potential shade of meaning. The two methods were evaluated based on loss of conditional entropy with respect to two sets of manually tagged data. One set reflects concepts generally appearing in text, and the second set comprises words used for investigations into word sense disambiguation. Results show that NMF consistently outperforms SVD for inducing both the SoM of general concepts and word senses. The problem of inducing the shades of meaning of a word is more subtle than that of word sense induction and hence relevant to thematic analysis of opinion, where nuances of opinion can arise.
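The two factorisations compared in this abstract can be sketched on a toy non-negative matrix. The abstract does not say how the word matrix is built or which NMF algorithm was used, so the random matrix V, the rank of 2, and the multiplicative-update scheme below are all assumptions for illustration.

```python
import numpy as np

def nmf(V, rank, n_iter=200, seed=0):
    """Multiplicative-update NMF: V ~ W @ H with W, H >= 0 (Frobenius objective)."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + 1e-3
    H = rng.random((rank, V.shape[1])) + 1e-3
    tiny = 1e-12  # guards against division by zero
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + tiny)
        W *= (V @ H.T) / (W @ H @ H.T + tiny)
    return W, H

def truncated_svd(V, rank):
    """Best rank-`rank` approximation of V in the Frobenius norm: V ~ A @ B."""
    U, s, Vt = np.linalg.svd(V, full_matrices=False)
    return U[:, :rank] * s[:rank], Vt[:rank]

V = np.random.default_rng(1).random((8, 6))  # toy stand-in for a word matrix
W, H = nmf(V, rank=2)
A, B = truncated_svd(V, rank=2)
```

The rows of H and B play the role of candidate component vectors. The SVD rows may contain negative entries, whereas the NMF rows are non-negative by construction, which generally makes them easier to read as additive parts.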
Abstract:
Mandelstam's argument that PCAC follows from assigning Lorentz quantum number M=1 to the massless pion is examined in the context of the multiparticle dual resonance model. We construct a factorisable dual model for pions which is formulated operatorially on the harmonic oscillator Fock space along the lines of the Neveu-Schwarz model. The model has both m_π and m_ρ as arbitrary parameters unconstrained by the duality requirement. The Adler self-consistency condition is satisfied if and only if the condition m_ρ² − m_π² = 1/2 is imposed, in which case the model reduces to the chiral dual pion model of Neveu and Thorn, and Schwarz. The Lorentz quantum number of the pion in the dual model is shown to be M=0.
Abstract:
The Dependency Structure Matrix (DSM) has proved to be a useful tool for system structure elicitation and analysis. However, as with any modelling approach, the insights gained from analysis are limited by the quality and correctness of input information. This paper explores how the quality of data in a DSM can be enhanced by elicitation methods which include comparison of information acquired from different perspectives and levels of abstraction. The approach is based on comparison of dependencies according to their structural importance. It is illustrated through two case studies: creation of a DSM showing the spatial connections between elements in a product, and a DSM capturing information flows in an organisation. We conclude that considering structural criteria can lead to improved data quality in DSM models, although further research is required to fully explore the benefits and limitations of our proposed approach.
Abstract:
For two multinormal populations with equal covariance matrices, the likelihood ratio discriminant function, an alternative allocation rule to the sample linear discriminant function when n1 ≠ n2, is studied analytically. With the assumption of a known covariance matrix, its distribution is derived and the expectations of its actual and apparent error rates are evaluated and compared with those of the sample linear discriminant function. This comparison indicates that the likelihood ratio allocation rule is robust to unequal sample sizes. The quadratic discriminant function is studied, its distribution reviewed and the evaluation of its probabilities of misclassification discussed. For known covariance matrices, the distribution of the sample quadratic discriminant function is derived. When the known covariance matrices are proportional, exact expressions for the expectations of its actual and apparent error rates are obtained and evaluated. The effectiveness of the sample linear discriminant function for this case is also considered. Estimation of true log-odds for two multinormal populations with equal or unequal covariance matrices is studied. The estimative, Bayesian predictive and kernel methods are compared by evaluating their biases and mean square errors. Some algebraic expressions for these quantities are derived. With equal covariance matrices the predictive method is preferable. The source of this superiority is investigated by considering its performance for various levels of fixed true log-odds. It is also shown that the predictive method is sensitive to n1 ≠ n2. For unequal but proportional covariance matrices the unbiased estimative method is preferred. Product Normal kernel density estimates are used to give a kernel estimator of true log-odds. The effect of correlation in the variables with product kernels is considered. With equal covariance matrices the kernel and parametric estimators are compared by simulation.
For moderately correlated variables and large dimension sizes the product kernel method is a good estimator of true log-odds.
Abstract:
We examined how marine plankton interaction networks, as inferred by multivariate autoregressive (MAR) analysis of time-series, differ based on data collected at a fixed sampling location (L4 station in the Western English Channel) and four similar time-series prepared by averaging Continuous Plankton Recorder (CPR) datapoints in the region surrounding the fixed station. None of the plankton community structures suggested by the MAR models generated from the CPR datasets were well correlated with the MAR model for L4, but of the four CPR models, the one most closely resembling the L4 model was that for the CPR region nearest to L4. We infer that observation error and spatial variation in plankton community dynamics influenced the model performance for the CPR datasets. A modified MAR framework in which observation error and spatial variation are explicitly incorporated could allow the analysis to better handle the diverse time-series data collected in marine environments.
Abstract:
Context. Electron-impact excitation collision strengths are required for the analysis and interpretation of stellar observations.
Aims. This calculation aims to provide effective collision strengths for the Mg V ion for a larger number of transitions and for a greater temperature range than previously available, using collision strength data that include contributions from resonances.
Methods. A 19-state Breit-Pauli R-matrix calculation was performed. The target states are represented by configuration interaction wavefunctions and consist of the 19 lowest LS states, having configurations 2s²2p⁴, 2s2p⁵, 2p⁶, 2s²2p³3s, and 2s²2p³3p. These target states give rise to 37 fine-structure levels and 666 possible transitions. The effective collision strengths were calculated by averaging the electron collision strengths over a Maxwellian distribution of electron velocities.
Results. The non-zero effective collision strengths for transitions between the fine-structure levels are given for electron temperatures in the range log T(K) = 3.0-7.0. Data for transitions among the 5 fine-structure levels arising from the 2s²2p⁴ ground-state configuration, seen in the UV range, are discussed in the paper, along with transitions in the EUV range: transitions from the ground-state ³P levels to the 2s2p⁵ ³P levels. The 2s²2p⁴ ¹D–2s2p⁵ ¹P transition is also noted. Data for the remaining transitions are available at the CDS.
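The Maxwellian averaging described under Methods is conventionally written as below. This is the standard textbook definition, given here as an illustration; the symbols are assumptions, since the abstract itself introduces no notation.

```latex
% Effective collision strength for the transition i -> j:
% \Omega_{ij} is the collision strength, E_j the energy of the scattered
% electron relative to the upper level j, T_e the electron temperature,
% and k the Boltzmann constant.
\Upsilon_{ij}(T_e) = \int_{0}^{\infty} \Omega_{ij}(E_j)\,
    \exp\!\left(-\frac{E_j}{k T_e}\right)
    \mathrm{d}\!\left(\frac{E_j}{k T_e}\right)
```

Because the integrand is weighted by the Boltzmann factor, resonances in the collision strength at low scattered-electron energies can dominate the effective collision strength at low temperatures.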
Abstract:
OBJECTIVE: This work investigates the delivery accuracy of different Varian linear accelerator models using log-file derived MLC RMS values.
METHODS: Seven centres independently created a plan on the same virtual phantom using their own planning system and the log files were analysed following delivery of the plan in each centre to assess MLC positioning accuracy. A single standard plan was also delivered by seven centres to remove variations in complexity and the log files were analysed for Varian TrueBeams and Clinacs (2300IX or 2100CD models).
RESULTS: Varian TrueBeam accelerators had better MLC positioning accuracy (<1.0mm) than the 2300IX (<2.5mm) following delivery of the plans created by each centre and also the standard plan. In one case log files provided evidence that reduced delivery accuracy was not associated with the linear accelerator model but was due to planning issues.
CONCLUSIONS: Log files are useful in identifying differences between linear accelerator models and in isolating errors during end-to-end testing in VMAT audits. Log file analysis can rapidly eliminate the machine delivery from the process and divert attention with confidence to other aspects. Advances in Knowledge: Log file evaluation was shown to be an effective method to rapidly verify satisfactory treatment delivery when a dosimetric evaluation fails during end-to-end dosimetry audits. MLC RMS values for Varian TrueBeams were shown to be much smaller than those for Varian Clinacs for VMAT deliveries.
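A per-leaf MLC RMS value of the kind referred to in this abstract can be sketched as follows. The array layout (one row per log snapshot, one column per leaf, positions in millimetres) and the function name are assumptions; parsing of the vendor log files is not shown, and the abstract does not describe the exact calculation used in the audit.

```python
import numpy as np

def mlc_rms_error(expected_mm, actual_mm):
    """Per-leaf RMS of (actual - expected) leaf position over all log snapshots.

    Both inputs are array-likes of shape (n_snapshots, n_leaves), in millimetres.
    """
    diff = np.asarray(actual_mm, dtype=float) - np.asarray(expected_mm, dtype=float)
    return np.sqrt(np.mean(diff ** 2, axis=0))

# Hypothetical example: leaf 0 oscillates 1 mm around its planned position,
# leaf 1 tracks the plan exactly.
expected = np.zeros((4, 2))
actual = np.array([[1.0, 0.0], [-1.0, 0.0], [1.0, 0.0], [-1.0, 0.0]])
rms = mlc_rms_error(expected, actual)
```

The largest per-leaf value could then be compared against a tolerance such as the 1.0 mm and 2.5 mm figures quoted in the results.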
Abstract:
Vitis vinifera L., the most widely cultivated fruit crop in the world, was the starting point for the development of this PhD thesis. This subject was explored following two current trends: i) the development of rapid, simple, and highly sensitive methodologies with minimal sample handling; and ii) the valorisation of natural products as a source of compounds with potential health benefits. The target compounds under study were the volatile terpenoids (mono- and sesquiterpenoids) and C13 norisoprenoids, since they may have a biological impact, either from the sensory point of view, as regards wine aroma, or through their beneficial properties for human health. Two novel methodologies for the quantification of C13 norisoprenoids in wines were developed. The first methodology, a rapid method, was based on headspace solid-phase microextraction combined with gas chromatography-quadrupole mass spectrometry operating in selected ion monitoring mode (HS-SPME/GC-qMS-SIM), using GC conditions that yielded a C13 norisoprenoid volatile signature. It does not require any pre-treatment of the sample, and the C13 norisoprenoid composition of the wine was evaluated based on the chromatographic profile and specific m/z fragments, without complete chromatographic separation of its components. The second methodology, used as the reference method, was also based on HS-SPME/GC-qMS-SIM, with GC conditions that allowed adequate chromatographic resolution of the wine components. For quantification purposes, external calibration curves were constructed with β-ionone, with regression coefficients (r²) of 0.9968 (RSD 12.51 %) and 0.9940 (RSD 1.08 %) for the rapid method and the reference method, respectively. Low detection limits (1.57 and 1.10 μg L⁻¹) were observed. These methodologies were applied to seventeen white and red table wines. Two vitispirane isomers (158-1529 μg L⁻¹) and 1,1,6-trimethyl-1,2-dihydronaphthalene (TDN) (6.42-39.45 μg L⁻¹) were quantified.
The data obtained for vitispirane isomers and TDN using the two methods were highly correlated (r² of 0.9756 and 0.9630, respectively). A rapid methodology for establishing the varietal volatile profile of Vitis vinifera L. cv. 'Fernão-Pires' (FP) white wines by headspace solid-phase microextraction combined with comprehensive two-dimensional gas chromatography with time-of-flight mass spectrometry (HS-SPME/GC×GC-TOFMS) was developed. Monovarietal wines from different harvests, appellations, and producers were analysed. The study focused on the volatiles that seem to be significant to the varietal character, such as mono- and sesquiterpenic compounds and C13 norisoprenoids. Two-dimensional chromatographic spaces containing the varietal compounds, using the m/z fragments 93, 121, 161, 175 and 204, were established as follows: ¹tR = 255-575 s, ²tR = 0.424-1.840 s, for monoterpenoids; ¹tR = 555-685 s, ²tR = 0.528-0.856 s, for C13 norisoprenoids; and ¹tR = 695-950 s, ²tR = 0.520-0.960 s, for sesquiterpenic compounds. For the three chemical groups under study, from a total of 170 compounds, 45 were found in all wines, which allowed the "varietal volatile profile" of FP wine to be defined. Among these compounds, 15 were detected for the first time in FP wines. This study proposes an HS-SPME/GC×GC-TOFMS-based methodology, combined with a classification-reference sample, for rapid assessment of the varietal volatile profile of wines. This approach is very useful for eliminating the majority of the non-terpenic and non-C13 norisoprenic compounds, allowing the definition of a two-dimensional chromatographic space containing the compounds of interest, simplifying the data compared to the original data, and reducing the time of analysis. The presence of sesquiterpenic compounds in Vitis vinifera L. 
related products, to which several biological properties are attributed, prompted us to investigate the antioxidant, antiproliferative and hepatoprotective activities of some sesquiterpenic compounds. Firstly, the antiradical capacity of trans,trans-farnesol, cis-nerolidol, α-humulene and guaiazulene was evaluated using chemical (DPPH• and hydroxyl radicals) and biological (Caco-2 cells) models. Guaiazulene (IC50 = 0.73 mM) was the sesquiterpene with the highest scavenging capacity against DPPH•, while trans,trans-farnesol (IC50 = 1.81 mM) and cis-nerolidol (IC50 = 1.48 mM) were more active towards hydroxyl radicals. All compounds, with the exception of α-humulene, at non-cytotoxic levels (≤ 1 mM), were able to protect Caco-2 cells from oxidative stress induced by tert-butyl hydroperoxide. The compounds under study were also evaluated as antiproliferative agents. Guaiazulene and cis-nerolidol arrested the cell cycle in the S-phase more effectively than trans,trans-farnesol and α-humulene, the latter being almost inactive. The relative hepatoprotective effect of fifteen sesquiterpenic compounds, presenting different chemical structures and commonly found in plants and plant-derived foods and beverages, was assessed. Endogenous lipid peroxidation and lipid peroxidation induced with tert-butyl hydroperoxide were evaluated in liver homogenates from Wistar rats. With the exception of α-humulene, all the sesquiterpenic compounds under study (1 mM) were effective in reducing the malonaldehyde levels in both endogenous and induced lipid peroxidation, by up to 35% and 70%, respectively. The developed 3D-QSAR models, relating the hepatoprotective activity to molecular properties, showed good fit (R²LOO > 0.819) and good predictive power (Q² > 0.950 and SDEP < 2%) for both models. 
A network of effects associated with structural and chemical features of sesquiterpenic compounds such as shape, branching, symmetry, and presence of electronegative fragments, can modulate the hepatoprotective activity observed for these compounds. In conclusion, this study allowed the development of rapid and in-depth methods for the assessment of varietal volatile compounds that might have a positive impact on sensorial and health attributes related to Vitis vinifera L. These approaches can be extended to the analysis of other related food matrices, including grapes and musts, among others. In addition, the results of in vitro assays open a perspective for the promising use of the sesquiterpenic compounds, with similar chemical structures such as those studied in the present work, as antioxidants, hepatoprotective and antiproliferative agents, which meets the current challenges related to diseases of modern civilization.