10 results for Compressed text search
in Repositório Científico do Instituto Politécnico de Lisboa - Portugal
Abstract:
Lossless compression algorithms of the Lempel-Ziv (LZ) family are widely used nowadays. Regarding time and memory requirements, LZ encoding is much more demanding than decoding. In order to speed up the encoding process, efficient data structures, like suffix trees, have been used. In this paper, we explore the use of suffix arrays to hold the dictionary of the LZ encoder, and propose an algorithm to search over it. We show that the resulting encoder attains roughly the same compression ratios as those based on suffix trees. However, the amount of memory required by the suffix array is fixed, and much lower than the variable amount of memory used by encoders based on suffix trees (which depends on the text to encode). We conclude that suffix arrays, when compared to suffix trees in terms of the trade-off among time, memory, and compression ratio, may be preferable in scenarios (e.g., embedded systems) where memory is at a premium and high speed is not critical.
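The central operation of such an encoder, finding the longest dictionary match for the look-ahead buffer by searching a suffix array, can be illustrated with a minimal sketch. This is not the algorithm proposed in the paper, only an assumed simplification using plain binary search over the sorted suffixes; the helper names are illustrative.

import bisect

def build_suffix_array(text):
    # Naive O(n^2 log n) construction; adequate for a small illustration.
    return sorted(range(len(text)), key=lambda i: text[i:])

def longest_match(text, sa, pattern):
    # Binary-search the pattern among the sorted suffixes, then take the
    # longest common prefix with the two neighbouring suffixes.
    suffixes = [text[i:] for i in sa]
    pos = bisect.bisect_left(suffixes, pattern)
    best_len, best_off = 0, -1
    for cand in (pos - 1, pos):
        if 0 <= cand < len(sa):
            suf = suffixes[cand]
            k = 0
            while k < len(pattern) and k < len(suf) and pattern[k] == suf[k]:
                k += 1
            if k > best_len:
                best_len, best_off = k, sa[cand]
    return best_off, best_len  # (position of the match in the dictionary, match length)

window = "abracadabra"          # dictionary (already encoded text)
sa = build_suffix_array(window)
print(longest_match(window, sa, "abrac"))   # -> (0, 5)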
Abstract:
We study the implications of the searches based on H → τ⁺τ⁻ by the ATLAS and CMS collaborations for the parameter space of the two-Higgs-doublet model (2HDM). In the 2HDM, the scalars can decay into a tau pair with a branching ratio larger than the SM one, leading to constraints on the 2HDM parameter space. We show that in model II, values of tan β > 1.8 are definitively excluded if the pseudoscalar is in the mass range 110 GeV < m_A < 145 GeV. We also discuss the implications for the 2HDM of the recent dimuon search by the ATLAS collaboration for a CP-odd scalar in the mass range 4-12 GeV.
Abstract:
In the present study we focus on the interaction between the acquisition of new words and text organisation. Regarding the acquisition of new words, we emphasise paradigmatic relations such as hyponymy, meronymy and semantic sets. We worked with a group of girls attending a private school for adolescents in serious difficulties. The subjects come from disadvantaged families, and their writing skills were very poor. When asked to describe a garden, they wrote a short text of a single paragraph; the lexical items were generic, there were no adjectives, and all of them used mainly existential verbs. The intervention plan assumed that the subjects had to be exposed to new words and work out their meaning. In the presence of referents, the subjects were taught new words, making explicit the intended relation of the new term to a term already known. In the classroom, the subjects were asked to write down all the words they knew, drawing the relationships among them. They talked about the words, specifying the relation and making explicit pragmatic directions such as "is a kind of", "is a part of" or "are all x". After that, the subjects were given the task of choosing a perspective. The work presented in this paper accounts for significant differences in the subjects' texts before and after the intervention. While working on new words, the subjects were organising their lexicon and learning to present a whole entity in perspective.
Abstract:
Nowadays, tools such as Facebook, Twitter and YouTube are part of everyday life. From the recent turn of the century to the present, society has changed: we use the Internet more and more, searching for information and sharing content, whether text, photos or videos. The new online communication tools have brought greater interactivity between the sender of a message and its receiver. This investigation seeks to analyse which new online communication tools are used by cultural organisations, namely the theatre companies of Lisboa e Vale do Tejo, between 2000 and 2013, and how they are used. The theoretical framework addresses issues such as organisational communication, the online communication of organisations, the use of the new online tools by theatre companies, and what is meant by websites, social media and social networks. Among other references, we cite Grunig and Hunt (1984), who present the two-way symmetrical model of communication, and Phillips and Young (2009), who discuss the different online communication tools. Studies on the use of these tools by arts organisations, carried out by MTM London (2009) and by the Australia Council for the Arts (2011), are also presented. The investigation is based on the observation and monitoring of the online communication tools used by the theatre companies, surveys of the producers of those companies, and interviews with some of their directors. This work aims to establish which tools are being used by the companies, how regularly, who within the companies manages those tools, and which advantages are perceived, among other aspects.
Abstract:
Project submitted for obtaining the Master's degree in Informatics and Computer Engineering
Abstract:
The article reports density measurements of dipropyl (DPA), dibutyl (DBA) and bis(2-ethylhexyl) (DEHA) adipates, using a vibrating U-tube densimeter, model DMA HP, from Anton Paar GmbH. The measurements were performed in the temperature range (293 to 373) K and at pressures up to about 68 MPa, except for DPA, for which the upper limits were 363 K and 65 MPa, respectively. The density data for each liquid were correlated with temperature and pressure using a modified Tait equation. The expanded uncertainty of the present density results is estimated as 0.2% at a 95% confidence level. No literature density data at pressures higher than 0.1 MPa could be found. DEHA literature data at atmospheric pressure agree with the correlation of the present measurements, in the corresponding temperature range, within ±0.11%. The isothermal compressibility and the isobaric thermal expansion were calculated by differentiation of the modified Tait correlation equation. These two parameters were also calculated for dimethyl adipate (DMA), from density data reported in a previous work. The uncertainties of the isothermal compressibility and the isobaric thermal expansion are estimated to be less than ±1.7% and ±1.1%, respectively, at a 95% confidence level. Literature data on the isothermal compressibility and isobaric thermal expansivity of DMA agree within ±1% and ±2.4%, respectively, with the results calculated in this work.
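For concreteness, a commonly used form of the modified Tait correlation, and the derivative properties obtained from it, can be sketched as follows; the exact parameterization adopted in the article may differ and is assumed here only for illustration:

\[
\rho(T,p) \;=\; \frac{\rho_0(T)}{1 - C\,\ln\!\left(\dfrac{B(T)+p}{B(T)+p_0}\right)},
\qquad
\kappa_T \;=\; \frac{1}{\rho}\left(\frac{\partial\rho}{\partial p}\right)_{T}
\;=\; \frac{C}{\bigl(B(T)+p\bigr)\left[1 - C\,\ln\!\left(\dfrac{B(T)+p}{B(T)+p_0}\right)\right]},
\qquad
\alpha_p \;=\; -\frac{1}{\rho}\left(\frac{\partial\rho}{\partial T}\right)_{p},
\]

where \(\rho_0(T)\) is the density at the reference pressure \(p_0\) and \(B(T)\), \(C\) are fitted parameters; \(\alpha_p\) follows from the temperature derivatives of \(\rho_0(T)\) and \(B(T)\).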
Abstract:
Locating and identifying points as global minimizers is, in general, a hard and time-consuming task. The difficulties increase when the derivatives of the functions defining the problem cannot be used. In this work, we propose a new class of methods suited for global derivative-free constrained optimization. Using direct search of directional type, the algorithm alternates between a search step, where potentially good regions are located, and a poll step, where the previously located promising regions are explored. This exploitation is made by launching several instances of directional direct searches, one in each of the regions of interest. Differently from a simple multistart strategy, direct searches merge when they come sufficiently close. The goal is to end with as many direct searches as there are local minimizers, which would make it easy to locate the global minimum. We describe the algorithmic structure considered, present the corresponding convergence analysis and report numerical results, showing that the proposed method is competitive with commonly used global derivative-free optimization solvers.
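As an illustration only, the interplay between polling several direct-search instances and merging instances whose centres come close can be sketched as below. This is an assumed simplification, not the method proposed in the paper; the function name and parameters are hypothetical.

import numpy as np

def multistart_direct_search(f, starts, step=0.5, merge_radius=0.3,
                             min_step=1e-6, max_iter=200):
    # Each instance: [centre, best value, step size]
    instances = [[np.asarray(x, float), f(np.asarray(x, float)), step] for x in starts]
    for _ in range(max_iter):
        for inst in instances:
            x, fx, h = inst
            improved = False
            # Poll step: probe centre +/- h along each coordinate direction
            for i in range(len(x)):
                for s in (+1, -1):
                    y = x.copy()
                    y[i] += s * h
                    fy = f(y)
                    if fy < fx:
                        inst[0], inst[1] = y, fy
                        improved = True
                        break
                if improved:
                    break
            if not improved:
                inst[2] = h / 2.0          # unsuccessful poll: shrink the step
        # Merge instances whose centres lie within the comparison radius,
        # keeping the one with the best value
        merged = []
        for inst in sorted(instances, key=lambda t: t[1]):
            if all(np.linalg.norm(inst[0] - m[0]) > merge_radius for m in merged):
                merged.append(inst)
        instances = merged
        if all(inst[2] < min_step for inst in instances):
            break
    return [(inst[0], inst[1]) for inst in instances]

# Example: two starts converge to the two minimizers of a double-well function
f = lambda x: (x[0] ** 2 - 1.0) ** 2
print(multistart_direct_search(f, starts=[[-2.0], [2.0]]))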
Abstract:
In order to correctly assess biaxial fatigue material properties, one must experimentally test different load conditions and stress levels. With the rise of new in-plane biaxial fatigue testing machines, which use smaller and more efficient electrical motors instead of the conventional hydraulic machines, it is necessary to reduce the specimen size and to ensure that the specimen geometry is appropriate for the installed load capacity. At present there are no standard specimen geometries, and the guidance available in the literature on how to design an efficient test specimen is insufficient. The main goal of this paper is to present a methodology for obtaining an optimal cruciform specimen geometry, with thickness reduction in the gauge area, appropriate for fatigue crack initiation, as a function of the base material sheet thickness used to build the specimen. The geometry is optimized for maximum stress using several parameters, ensuring that in the gauge area the stress distributions along the loading directions are uniform and maximum under two limit phase-shift loading conditions (δ = 0° and δ = 180°). Therefore, fatigue damage will always initiate at the center of the specimen, avoiding failure outside this region. Using the Renard series of preferred numbers for the base material sheet thickness as a reference, the remaining geometry parameters are optimized using a derivative-free methodology, the direct multisearch (DMS) method. The final optimal geometry, as a function of the base material sheet thickness, is proposed as a guideline for cruciform specimen design and as a possible contribution to a future standard on in-plane biaxial fatigue tests.
Abstract:
The paper reports viscosity measurements of compressed liquid dipropyl (DPA) and dibutyl (DBA) adipates obtained with two vibrating-wire sensors developed in our group. The vibrating-wire instruments were operated in the forced oscillation, or steady-state, mode. The viscosity measurements of DPA were carried out at pressures up to 18 MPa and temperatures from (303 to 333) K, and those of DBA up to 65 MPa and temperatures from (303 to 373) K, covering a total range of viscosities from (1.3 to 8.3) mPa·s. The required density data for the liquid samples were obtained in our laboratory using an Anton Paar vibrating-tube densimeter and were reported in a previous paper. The viscosity results were correlated with density using a modified hard-spheres scheme. The root mean square deviation of the data from the correlation is less than (0.21 and 0.32)%, and the maximum absolute relative deviations are within (0.43 and 0.81)%, for DPA and DBA, respectively. No data for the viscosity of either adipate could be found in the literature. Independent viscosity measurements were also performed, at atmospheric pressure, using an Ubbelohde capillary in order to compare with the vibrating-wire results. The expanded uncertainty of these results is estimated as ±1.5% at a 95% confidence level. The two data sets agree within the uncertainty of both methods.
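One widely used hard-spheres correlation of this kind is the Assael-Dymond scheme, sketched below; whether this is the exact form adopted in the paper is an assumption made here purely for illustration:

\[
\eta^{*} \;=\; 6.035\times10^{8}\left(\frac{1}{MRT}\right)^{1/2}\eta\,V^{2/3},
\qquad
\log_{10}\!\frac{\eta^{*}}{R_{\eta}} \;=\; \sum_{i} a_i\left(\frac{V_0}{V}\right)^{i},
\]

where \(V = M/\rho\) is the molar volume obtained from the measured density, \(V_0\) a characteristic (close-packed) molar volume, \(R_{\eta}\) a roughness factor, and \(a_i\) universal coefficients.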
Abstract:
Arguably, the most difficult task in text classification is to choose an appropriate set of features that allows machine learning algorithms to provide accurate classification. Most state-of-the-art techniques for this task involve careful feature engineering and a pre-processing stage, which may be too expensive in the emerging context of massive collections of electronic texts. In this paper, we propose efficient methods for text classification based on information-theoretic dissimilarity measures, which are used to define dissimilarity-based representations. These methods dispense with any feature design or engineering by mapping texts into a feature space using universal dissimilarity measures; in this space, classical classifiers (e.g. nearest neighbor or support vector machines) can then be used. The reported experimental evaluation of the proposed methods, on sentiment polarity analysis and authorship attribution problems, reveals that they approximate, and sometimes even outperform, previous state-of-the-art techniques, despite being much simpler, in the sense that they require no text pre-processing or feature engineering.
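To make the idea concrete, here is a minimal sketch of one possible information-theoretic dissimilarity, the normalized compression distance computed with zlib, driving a nearest-neighbor classifier. The measures and classifiers used in the paper may differ; the function names and toy prototypes below are illustrative only.

import zlib

def compressed_size(s):
    return len(zlib.compress(s.encode("utf-8"), 9))

def ncd(x, y):
    # Normalized compression distance: small when x and y share structure
    cx, cy, cxy = compressed_size(x), compressed_size(y), compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

def classify_1nn(text, labelled_prototypes):
    # labelled_prototypes: list of (prototype_text, label) pairs
    return min(labelled_prototypes, key=lambda p: ncd(text, p[0]))[1]

prototypes = [("great film, loved every minute of it", "positive"),
              ("terrible plot, a complete waste of time", "negative")]
print(classify_1nn("loved the film, great acting", prototypes))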