524 results for Stochastic Context-Free Grammars
Abstract:
A substantial amount of information on the Internet is present in the form of text. The value of this semi-structured and unstructured data has been widely acknowledged, with consequent scientific and commercial exploitation. The ever-increasing data production, however, pushes data analytic platforms to their limit. This thesis proposes techniques for more efficient textual big data analysis suitable for the Hadoop analytic platform. This research explores the direct processing of compressed textual data. The focus is on developing novel compression methods with a number of desirable properties to support text-based big data analysis in distributed environments. The novel contributions of this work include the following. Firstly, a Content-aware Partial Compression (CaPC) scheme is developed. CaPC makes a distinction between informational and functional content in which only the informational content is compressed. Thus, the compressed data is made transparent to existing software libraries which often rely on functional content to work. Secondly, a context-free bit-oriented compression scheme (Approximated Huffman Compression) based on the Huffman algorithm is developed. This uses a hybrid data structure that allows pattern searching in compressed data in linear time. Thirdly, several modern compression schemes have been extended so that the compressed data can be safely split with respect to logical data records in distributed file systems. Furthermore, an innovative two layer compression architecture is used, in which each compression layer is appropriate for the corresponding stage of data processing. Peripheral libraries are developed that seamlessly link the proposed compression schemes to existing analytic platforms and computational frameworks, and also make the use of the compressed data transparent to developers. The compression schemes have been evaluated for a number of standard MapReduce analysis tasks using a collection of real-world datasets. 
In comparison with existing solutions, they have shown substantial improvement in performance and significant reduction in system resource requirements.
Abstract:
Hierarchical structure with nested nonlocal dependencies is a key feature of human language and can be identified theoretically in most pieces of tonal music. However, previous studies have argued against the perception of such structures in music. Here, we show processing of nonlocal dependencies in music. We presented chorales by J. S. Bach and modified versions in which the hierarchical structure was rendered irregular whereas the local structure was kept intact. Brain electric responses differed between regular and irregular hierarchical structures, in both musicians and nonmusicians. This finding indicates that, when listening to music, humans apply cognitive processes that are capable of dealing with long-distance dependencies resulting from hierarchically organized syntactic structures. Our results reveal that a brain mechanism fundamental for syntactic processing is engaged during the perception of music, indicating that processing of hierarchical structure with nested nonlocal dependencies is not just a key component of human language, but a multidomain capacity of human cognition.
Abstract:
This dissertation has two almost unrelated themes: privileged words and Sturmian words. Privileged words are a new class of words introduced recently. A word is privileged if it is a complete first return to a shorter privileged word, the shortest privileged words being letters and the empty word. Here we give and prove almost all results on privileged words known to date. On the other hand, the study of Sturmian words is a well-established topic in combinatorics on words. In this dissertation, we focus on questions concerning repetitions in Sturmian words, reproving old results and giving new ones, and on establishing completely new research directions. The study of privileged words presented in this dissertation aims to derive their basic properties and to answer basic questions regarding them. We explore a connection between privileged words and palindromes and seek out answers to questions on context-freeness, computability, and enumeration. It turns out that the language of privileged words is not context-free, but privileged words are recognizable by a linear-time algorithm. A lower bound on the number of binary privileged words of given length is proven. The main interest, however, lies in the privileged complexity functions of the Thue-Morse word and Sturmian words. We derive recurrences for computing the privileged complexity function of the Thue-Morse word, and we prove that Sturmian words are characterized by their privileged complexity function. As a slightly separate topic, we give an overview of a certain method of automated theorem-proving and show how it can be applied to study privileged factors of automatic words. The second part of this dissertation is devoted to Sturmian words. We extensively exploit the interpretation of Sturmian words as irrational rotation words. The essential tools are continued fractions and elementary, but powerful, results of Diophantine approximation theory. 
With these tools at our disposal, we reprove old results on powers occurring in Sturmian words with emphasis on the fractional index of a Sturmian word. Further, we consider abelian powers and abelian repetitions and characterize the maximum exponents of abelian powers with given period occurring in a Sturmian word in terms of the continued fraction expansion of its slope. We define the notion of abelian critical exponent for Sturmian words and explore its connection to the Lagrange spectrum of irrational numbers. The results obtained are often specialized for the Fibonacci word; for instance, we show that the minimum abelian period of a factor of the Fibonacci word is a Fibonacci number. In addition, we propose a completely new research topic: the square root map. We prove that the square root map preserves the language of any Sturmian word. Moreover, we construct a family of non-Sturmian optimal squareful words whose language the square root map also preserves. This construction yields examples of aperiodic infinite words whose square roots are periodic.
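The recursive definition of a privileged word given in this abstract can be illustrated with a small sketch. It is a naive, brute-force check written for clarity, not the linear-time recognition algorithm mentioned in the abstract. The function names are hypothetical, and the sketch assumes the usual convention that the shorter privileged word in a complete first return is nonempty and occurs in the word exactly twice, once as a prefix and once as a suffix (occurrences may overlap).

```python
from functools import lru_cache
from itertools import product


def count_occurrences(u: str, w: str) -> int:
    """Count (possibly overlapping) occurrences of u in w."""
    return sum(1 for i in range(len(w) - len(u) + 1) if w.startswith(u, i))


@lru_cache(maxsize=None)
def is_privileged(w: str) -> bool:
    """Naive check of the recursive definition of privileged words."""
    # Base case: the empty word and single letters are privileged.
    if len(w) <= 1:
        return True
    # Try every proper nonempty prefix u as the candidate shorter word.
    for k in range(1, len(w)):
        u = w[:k]
        # w is a complete first return to u if u is both a prefix and a
        # suffix of w and occurs nowhere else in w (exactly two occurrences).
        if w.endswith(u) and count_occurrences(u, w) == 2 and is_privileged(u):
            return True
    return False


def count_privileged(n: int, alphabet: str = "ab") -> int:
    """Count privileged words of length n over the given alphabet."""
    return sum(is_privileged("".join(p)) for p in product(alphabet, repeat=n))


if __name__ == "__main__":
    for word in ["aba", "abba", "ababa", "ab", "aaba"]:
        print(word, is_privileged(word))
```

Because every word length is enumerated exhaustively, `count_privileged` is only practical for short lengths; it is meant as a way to experiment with the definition rather than to reproduce the enumeration bounds proven in the dissertation.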
Abstract:
Olive (Olea europaea L.), one of the main crops in the Mediterranean basin, is mainly propagated by cuttings, a classical propagation method that relies on the ability of the cuttings to form adventitious roots. While some cultivars are easily propagated by this technique, some of the most interesting olive cultivars are considered difficult to root, which poses a challenge for their preservation and commercialization. Therefore, increasing the current knowledge on adventitious root formation is extremely important for species like olive. This research focuses on evaluating the role of free auxins and oxidative enzymes in adventitious root formation of two olive cultivars with different rooting ability - ‘Galega vulgar’ (difficult-to-root) and ‘Cobrançosa’ (easy-to-root). In this context, free auxin levels and enzyme activities were determined in in vitro-cultured ‘Galega vulgar’ microshoots and in semi-hardwood cuttings of cvs. ‘Galega vulgar’ and ‘Cobrançosa’. To attain this goal, an analytical method for the quantification of free indole-3-acetic acid (IAA) and indole-3-butyric acid (IBA) was developed, which is based on dispersive liquid-liquid microextraction followed by microwave derivatization (DLLME-MAD) and gas chromatography-mass spectrometry (GC/MS) analysis. The developed method was validated in terms of linearity, recovery, limit of detection (LOD) and limit of quantification (LOQ) and proved to be useful in the analysis of two very different types of plant tissues. The results from auxin quantification in olive samples point to a relationship between free auxin levels and rooting ability of both microshoots and semi-hardwood cuttings. A defective IBA-IAA conversion, resulting in a peak of free IAA during the initiation phase, seems to be associated with low rooting ability. Likewise, differences in the activity of oxidative enzymes also appear to be related to rooting ability. Higher polyphenol oxidase (PPO) activity is likely related to easy-to-root behavior, while the opposite is true for peroxidase (POX) activity, including IAA oxidase (IAAox). A possible hypothesis for adventitious root formation in olive microcuttings is presented herein for the first time. Free auxins, oxidative enzymes, alternative oxidase (AOX) and reactive oxygen species (ROS) are some of the factors that may be involved in this highly complex physiological process. Interestingly, while temporal changes in auxin levels were similar between microshoots and semi-hardwood cuttings, the conclusions obtained from enzyme activity results in microshoots did not translate to semi-hardwood tissues, showing the emerging need for adaptation of classical agronomical research studies to modern techniques.
Abstract:
Biologists are increasingly conscious of the critical role that noise plays in cellular functions such as genetic regulation, often in connection with fluctuations in small numbers of key regulatory molecules. This has inspired the development of models that capture this fundamentally discrete and stochastic nature of cellular biology - most notably the Gillespie stochastic simulation algorithm (SSA). The SSA simulates a temporally homogeneous, discrete-state, continuous-time Markov process, and of course the corresponding probabilities and numbers of each molecular species must all remain positive. While accurately serving this purpose, the SSA can be computationally inefficient due to very small time stepping, so faster approximations such as the Poisson and Binomial τ-leap methods have been suggested. This work places these leap methods in the context of numerical methods for the solution of stochastic differential equations (SDEs) driven by Poisson noise. This allows analogues of Euler-Maruyama, Milstein and even higher order methods to be developed through the Itô-Taylor expansions as well as similar derivative-free Runge-Kutta approaches. Numerical results demonstrate that these novel methods compare favourably with existing techniques for simulating biochemical reactions, capturing crucial properties such as the mean and variance more accurately.
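To make the contrast between the exact SSA and the Poisson τ-leap approximation concrete, the following is a minimal sketch for a single decay reaction A → ∅. The reaction, the rate constant `c`, the step size `tau`, and both function names are assumptions chosen for illustration; the higher-order methods developed in the paper are not reproduced here.

```python
import numpy as np


def ssa_decay(x0: int, c: float, t_end: float, rng: np.random.Generator) -> int:
    """Gillespie SSA for A -> 0 with propensity a(x) = c * x.

    Simulates every reaction event and returns the copy number at t_end.
    """
    t, x = 0.0, x0
    while x > 0:
        a = c * x                      # total propensity
        t += rng.exponential(1.0 / a)  # waiting time to the next reaction
        if t > t_end:
            break
        x -= 1                         # one molecule decays
    return x


def tau_leap_decay(x0: int, c: float, t_end: float, tau: float,
                   rng: np.random.Generator) -> int:
    """Poisson tau-leap for the same reaction with a fixed step tau.

    Each step fires a Poisson-distributed number of reactions, with the
    propensity frozen at its value at the start of the step.
    """
    x = x0
    for _ in range(int(round(t_end / tau))):
        k = rng.poisson(c * x * tau)
        # Clamp at zero: plain Poisson leaping can overshoot into negative
        # copy numbers, which is one motivation for the Binomial variant.
        x = max(x - k, 0)
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 1000
    ssa = [ssa_decay(100, 1.0, 1.0, rng) for _ in range(n)]
    leap = [tau_leap_decay(100, 1.0, 1.0, 0.01, rng) for _ in range(n)]
    print("exact mean   :", 100 * np.exp(-1.0))
    print("SSA mean/var :", np.mean(ssa), np.var(ssa))
    print("leap mean/var:", np.mean(leap), np.var(leap))
```

For this linear reaction the exact mean is x0·exp(−c·t), so the sample means of both simulators can be compared against it. The τ-leap version takes a fixed number of steps regardless of the molecule count, which is where its speed advantage over the event-by-event SSA comes from.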
A derivative-free explicit method with order 1.0 for solving stochastic delay differential equations
Abstract:
The paper presents a geometry-free approach to assess the variation of covariance matrices of undifferenced triple-frequency GNSS measurements and its impact on positioning solutions. Four independent geometry-free/ionosphere-free (GFIF) models formed from original triple-frequency code and phase signals allow for effective computation of variance-covariance matrices using real data. Variance Component Estimation (VCE) algorithms are implemented to obtain the covariance matrices for three pseudorange and three carrier-phase signals epoch-by-epoch. Covariance results from the triple-frequency Beidou System (BDS) and GPS data sets demonstrate that the estimated standard deviation varies consistently with the amplitude of the actual GFIF error time series. The single point positioning (SPP) results from BDS ionosphere-free measurements at four MGEX stations demonstrate an improvement of up to about 50% in the Up direction relative to the results based on mean square statistics. Additionally, a more extensive SPP analysis at 95 global MGEX stations based on GPS ionosphere-free measurements shows an average improvement of about 10% relative to the traditional results. This finding provides a preliminary confirmation that adequate consideration of the variation of covariance leads to the improvement of GNSS state solutions.
Abstract:
The free vibration characteristics of a beam-column with randomly varying Young's modulus and mass density, subjected to randomly distributed axial loading, are analysed. The material property fluctuations and axial loadings are considered to constitute independent one-dimensional, uni-variate, homogeneous, real, spatially distributed stochastic fields. Hamilton's principle is used to formulate the problem using stochastic FEM. Vibration frequencies and mode shapes are analysed for their statistical descriptions. A numerical example is shown.
Abstract:
The free vibration of strings with randomly varying mass and stiffness is considered. The joint probability density functions of the eigenvalues and eigenfunctions are characterized in terms of the solution of a pair of stochastic non-linear initial value problems. Analytical solutions of these equations based on the method of stochastic averaging are obtained. The effects of the mean and autocorrelation of the mass process are included in the analysis. Numerical results for the marginal probability density functions of eigenvalues and eigenfunctions are obtained and are found to compare well with Monte Carlo simulation results. The random eigenvalues, when normalized with respect to their corresponding deterministic values, are observed to tend to become first order stochastically stationary with respect to the mode count.
Abstract:
This research investigated unconfined flow through dams. The hydraulic conductivity was modeled as a spatially random field following a lognormal distribution. Results showed that the seepage flow produced from the stochastic solution was smaller than its deterministic value. In addition, the free surface was observed to exit at a point lower than that obtained from the deterministic solution. When the hydraulic conductivity was more strongly correlated in the horizontal direction than in the vertical direction, the flow through the dam increased markedly. It is suggested that it may not be necessary to construct a core in dams made from soils that exhibit a high degree of variability.
Abstract:
In 2004, Lost debuted on ABC and quickly became a cultural phenomenon. Its postmodern take on the classic Robinson Crusoe desert island scenario gestures to a variety of issues circulating within the post-9/11 cultural consciousness, such as terrorism, leadership, anxieties involving air travel, torture, and globalization. Lost's complex interwoven flashback and flash-forward narrative structure encourages spectators to creatively hypothesize solutions to the central mysteries of the narrative, while also thematically addressing archetypal questions of freedom of choice versus fate. Through an examination of the narrative structure, the significance of technological shifts in television, and fan cultures in Lost, this thesis discusses the tenuous notion of consumer agency within the current cultural context. Furthermore, I also explore these issues in relation to the wider historical post-9/11 context.
Abstract:
Latent variable models in finance originate both from asset pricing theory and time series analysis. These two strands of literature appeal to two different concepts of latent structures, which are both useful to reduce the dimension of a statistical model specified for a multivariate time series of asset prices. In the CAPM or APT beta pricing models, the dimension reduction is cross-sectional in nature, while in time-series state-space models, dimension is reduced longitudinally by assuming conditional independence between consecutive returns, given a small number of state variables. In this paper, we use the concept of Stochastic Discount Factor (SDF) or pricing kernel as a unifying principle to integrate these two concepts of latent variables. Beta pricing relations amount to characterizing the factors as a basis of a vector space for the SDF. The coefficients of the SDF with respect to the factors are specified as deterministic functions of some state variables which summarize their dynamics. In beta pricing models, it is often said that only the factorial risk is compensated since the remaining idiosyncratic risk is diversifiable. Implicitly, this argument can be interpreted as a conditional cross-sectional factor structure, that is, a conditional independence between contemporaneous returns of a large number of assets, given a small number of factors, as in standard Factor Analysis. We provide this unifying analysis in the context of conditional equilibrium beta pricing as well as asset pricing with stochastic volatility, stochastic interest rates and other state variables. We address the general issue of econometric specifications of dynamic asset pricing models, which cover the modern literature on conditionally heteroskedastic factor models as well as equilibrium-based asset pricing models with an intertemporal specification of preferences and market fundamentals.
We interpret various instantaneous causality relationships between state variables and market fundamentals as leverage effects and discuss their central role relative to the validity of standard CAPM-like stock pricing and preference-free option pricing.