920 results for Data representation
Abstract:
We present an account of semantic representation that focuses on distinct types of information from which word meanings can be learned. In particular, we argue that there are at least two major types of information from which we learn word meanings. The first is what we call experiential information. This is data derived both from our sensory-motor interactions with the outside world and from our experience of our own inner states, particularly our emotions. The second type of information is language-based. In particular, it is derived from the general linguistic context in which words appear. The paper spells out this proposal, summarizes research supporting this view and presents new predictions emerging from this framework.
Abstract:
This paper considers an extension to the skew-normal model through the inclusion of an additional parameter which can lead to both uni- and bi-modal distributions. The paper presents various basic properties of this family of distributions and provides a stochastic representation which is useful for obtaining theoretical properties and for simulating from the distribution. Moreover, the singularity of the Fisher information matrix is investigated and maximum likelihood estimation for a random sample with no covariates is considered. The main motivation is thus to avoid using mixtures in fitting bimodal data, as these are well known to be complicated to deal with, particularly because of identifiability problems. Data-based illustrations show that such a model can be useful. Copyright (C) 2009 John Wiley & Sons, Ltd.
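As a rough illustration of what a stochastic representation buys in practice, the sketch below simulates from the standard (unimodal) skew-normal distribution using its well-known representation Z = δ|U0| + sqrt(1 − δ²)·U1, where U0 and U1 are independent standard normals and δ = λ / sqrt(1 + λ²). This is not the extended family proposed in the paper, whose representation is not given in the abstract; the function names and the choice λ = 3 are assumptions made for the example.

```python
import math
import random


def sample_skew_normal(lam, n, seed=0):
    """Draw n samples from the standard skew-normal SN(lam).

    Uses the representation Z = delta*|U0| + sqrt(1 - delta^2)*U1 with
    U0, U1 independent N(0, 1). This is the baseline skew-normal model,
    not the bimodal extension described in the abstract.
    """
    rng = random.Random(seed)
    delta = lam / math.sqrt(1.0 + lam * lam)
    scale = math.sqrt(1.0 - delta * delta)
    samples = []
    for _ in range(n):
        u0 = rng.gauss(0.0, 1.0)
        u1 = rng.gauss(0.0, 1.0)
        samples.append(delta * abs(u0) + scale * u1)
    return samples


def skew_normal_mean(lam):
    """Theoretical mean of SN(lam): delta * sqrt(2 / pi)."""
    delta = lam / math.sqrt(1.0 + lam * lam)
    return delta * math.sqrt(2.0 / math.pi)


# Illustrative check (lam = 3 is an arbitrary choice for the example).
draws = sample_skew_normal(lam=3.0, n=100_000, seed=42)
empirical_mean = sum(draws) / len(draws)
```

Comparing `empirical_mean` with `skew_normal_mean(3.0)` gives a quick sanity check that the sampler follows the intended distribution.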
Abstract:
Motivated by the development of a graphical representation of networks with a large number of vertices, useful for collaborative filtering applications, this work proposes the use of cohesion surfaces over a multidimensionally scaled thematic base. To this end, it uses a combination of classical multidimensional scaling and Procrustes analysis in an iterative algorithm that produces partial solutions, which are then combined into a global solution. Applied to an example of book-loan transactions from the Biblioteca Karl A. Boedecker, the proposed algorithm produces interpretable and thematically coherent outputs, and exhibits lower stress than the classical scaling solution.
Abstract:
Using the Pricing Equation in a panel-data framework, we construct a novel consistent estimator of the stochastic discount factor (SDF) which relies on the fact that its logarithm is the "common feature" in every asset return of the economy. Our estimator is a simple function of asset returns and does not depend on any parametric function representing preferences. The techniques discussed in this paper were applied to two relevant issues in macroeconomics and finance: the first asks what type of parametric preference-representation could be validated by asset-return data, and the second asks whether or not our SDF estimator can price returns in an out-of-sample forecasting exercise. In formal testing, we cannot reject standard preference specifications used in the macro/finance literature. Estimates of the relative risk-aversion coefficient are between 1 and 2, and statistically equal to unity. We also show that our SDF proxy can price reasonably well the returns of stocks with a higher capitalization level, whereas it shows some difficulty in pricing stocks with a lower level of capitalization.
Abstract:
There are four different hypotheses analyzed in the literature that explain deunionization, namely: the decrease in the demand for union representation by the workers; the impact of globalization on unionization rates; technical change; and changes in the legal and political systems against unions. This paper aims to test all of them. We estimate a logistic regression using a panel data procedure with 35 industries from 1973 to 1999 and conclude that the four hypotheses cannot be rejected by the data. We also use a variance analysis decomposition to study the impact of these variables on the drop in unionization rates. In the model with no demographic variables, the results show that these economic (tested) variables can account for 10% to 12% of the drop in unionization. However, when we include demographic variables, these tested variables can account for 10% to 35% of the total variation in unionization rates. In this case the four hypotheses tested can explain up to 50% of the total drop in unionization rates explained by the model.
Abstract:
SOUZA, Anderson A. S. ; SANTANA, André M. ; BRITTO, Ricardo S. ; GONÇALVES, Luiz Marcos G. ; MEDEIROS, Adelardo A. D. Representation of Odometry Errors on Occupancy Grids. In: INTERNATIONAL CONFERENCE ON INFORMATICS IN CONTROL, AUTOMATION AND ROBOTICS, 5., 2008, Funchal, Portugal. Proceedings... Funchal, Portugal: ICINCO, 2008.
Abstract:
Husserl left many unpublished drafts explaining (or trying to explain) his views on spatial representation and geometry, particularly those collected in the second part of Studien zur Arithmetik und Geometrie (Hua XXI), but no fully articulated work on the subject. In this paper, I put forward an interpretation of what those views might have been. Husserl, I claim, distinguished among different conceptions of space: the space of perception (constituted from sensorial data by intentionally motivated psychic functions), that of physical geometry (or idealized perceptual space), the space of the mathematical science of physical nature (in which science, not only raw perception, has a say) and the abstract spaces of mathematics (free creations of the mathematical mind), each of them with its peculiar geometrical structure. Perceptual space is proto-Euclidean and the space of physical geometry Euclidean, but mathematical physics, Husserl allowed, may find it convenient to represent physical space with a non-Euclidean structure. Mathematical spaces, in turn, can be endowed, he thinks, with any geometry mathematicians may find interesting. Many other related questions are addressed here, in particular those concerning the a priori or a posteriori character of the many geometric features of perceptual space (bearing in mind that there are at least two different notions of a priori in Husserl, which we may call the conceptual and the transcendental a priori). I conclude with an overview of Weyl's ideas on the matter, since his philosophical conceptions are often traceable back to his former master, Husserl.
Abstract:
This paper presents a proposal for the semantic treatment of ambiguous homographic forms in Brazilian Portuguese, and offers linguistic strategies for its computational implementation in Systems of Natural Language Processing (SNLP). Pustejovsky's Generative Lexicon was used as a theoretical model. From this model, the Qualia Structure - QS (and the Formal, Telic, Agentive and Constitutive roles) was selected as one of the linguistic and semantic expedients for achieving the disambiguation of homonym forms. So that the analyzed and treated data could be manipulated, we elaborated a Lexical Knowledge Base (LKB) where lexical items are correlated and interconnected by different kinds of semantic relations in the QS and ontological information.
Abstract:
Whereas genome sequencing defines the genetic potential of an organism, transcript sequencing defines the utilization of this potential and links the genome with most areas of biology. To exploit the information within the human genome in the fight against cancer, we have deposited some two million expressed sequence tags (ESTs) from human tumors and their corresponding normal tissues in the public databases. The data currently define approximately 23,500 genes, of which only approximately 1,250 are still represented only by ESTs. Examination of the EST coverage of known cancer-related (CR) genes reveals that <1% do not have corresponding ESTs, indicating that the representation of genes associated with commonly studied tumors is high. The careful recording of the origin of all ESTs we have produced has enabled detailed definition of where the genes they represent are expressed in the human body. More than 100,000 ESTs are available for seven tissues, indicating a surprising variability of gene usage that has led to the discovery of a significant number of genes with restricted expression, and that may thus be therapeutically useful. The ESTs also reveal novel nonsynonymous germline variants (although the one-pass nature of the data necessitates careful validation) and many alternatively spliced transcripts. Although widely exploited by the scientific community, vindicating our totally open source policy, the EST data generated still provide extensive information that remains to be systematically explored, and that may further facilitate progress toward both the understanding and treatment of human cancers.
Abstract:
Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)
Abstract:
Interactive visual representations complement traditional statistical and machine learning techniques for data analysis, allowing users to play a more active role in a knowledge discovery process and making the whole process more understandable. Though visual representations are applicable to several stages of the knowledge discovery process, a common use of visualization is in the initial stages to explore and organize a sometimes unknown and complex data set. In this context, the integrated and coordinated - that is, user actions should be capable of affecting multiple visualizations when desired - use of multiple graphical representations allows data to be observed from several perspectives and offers richer information than isolated representations. In this paper we propose an underlying model for an extensible and adaptable environment that allows independently developed visualization components to be gradually integrated into a user configured knowledge discovery application. Because a major requirement when using multiple visual techniques is the ability to link amongst them, so that user actions executed on a representation propagate to others if desired, the model also allows runtime configuration of coordinated user actions over different visual representations. We illustrate how this environment is being used to assist data exploration and organization in a climate classification problem.
Abstract:
The rise in boiling point of grapefruit juice was experimentally measured at soluble solids concentrations in the range of 9.3-60.6 °Brix and pressures between 6.0 × 10³ and 9.0 × 10⁴ Pa. Different approaches to represent experimental data, including Dühring's rule, the Antoine equation and empirical models proposed in the literature, were tested. In the range of 9.3-29.0 °Brix, the rise in boiling point was nearly independent of pressure, varying only with juice concentration. Considerable deviations from this behavior began to occur at concentrations higher than 29.0 °Brix. Experimental data could be best predicted by fitting an empirical model, which consisted of a single equation that takes into account the dependence of the rise in boiling point on pressure and concentration. © SAGE Publications 2007.
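To make the quantity being modeled concrete, the sketch below computes the boiling point of pure water at a given pressure by inverting the Antoine equation, log10(P) = A − B / (C + T), and defines the boiling point rise as the difference between a solution's boiling temperature and that of water at the same pressure. The constants are commonly tabulated values for water (P in mmHg, T in °C, roughly 1-100 °C) and should be treated as an assumption; they are not taken from the paper, and the function names are hypothetical. The paper's own empirical model is not reproduced here.

```python
import math

# Commonly tabulated Antoine constants for water (assumption, not from the paper):
# log10(P_mmHg) = A - B / (C + T_celsius), valid for roughly 1-100 deg C.
A, B, C = 8.07131, 1730.63, 233.426
PA_PER_MMHG = 133.322  # pascals per millimetre of mercury


def water_vapor_pressure_pa(t_celsius):
    """Saturation pressure of pure water (Pa) from the Antoine equation."""
    return 10 ** (A - B / (C + t_celsius)) * PA_PER_MMHG


def water_boiling_point_c(pressure_pa):
    """Boiling temperature of pure water (deg C), by inverting Antoine."""
    p_mmhg = pressure_pa / PA_PER_MMHG
    return B / (A - math.log10(p_mmhg)) - C


def boiling_point_rise(solution_boiling_c, pressure_pa):
    """Boiling point rise: solution boiling temperature minus that of
    pure water at the same pressure."""
    return solution_boiling_c - water_boiling_point_c(pressure_pa)


# Pure-water boiling points at the two ends of the pressure range
# quoted in the abstract (6.0e3 Pa and 9.0e4 Pa).
t_low = water_boiling_point_c(6.0e3)
t_high = water_boiling_point_c(9.0e4)
```

Given a measured boiling temperature for a juice sample at a known pressure, `boiling_point_rise` returns the quantity that the models in the abstract aim to predict.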
Abstract:
This paper reports the use of chromatographic profiles of volatiles to determine disease markers in plants, in this case leaves of Eucalyptus globulus contaminated by the necrotrophic fungus Teratosphaeria nubilosa. The volatile fraction was isolated by headspace solid phase microextraction (HS-SPME) and analyzed by comprehensive two-dimensional gas chromatography-fast quadrupole mass spectrometry (GC×GC-qMS). For the correlation between the metabolic profile described by the chromatograms and the presence of the infection, unfolded-partial least squares discriminant analysis (U-PLS-DA) with orthogonal signal correction (OSC) was employed. The proposed method was verified to be independent of factors such as the age of the harvested plants. The manipulation of the mathematical model obtained also resulted in graphic representations similar to real chromatograms, which allowed the tentative identification of more than 40 compounds potentially useful as disease biomarkers for this plant/pathogen pair. The proposed methodology can be considered highly reliable, since the diagnosis is based on the whole chromatographic profile rather than on the detection of a single analyte. © 2013 Elsevier B.V.
Abstract:
In the present investigation we mapped the primary visual area of the South American diurnal rodent, Dasyprocta aguti, by standardized electrophysiological mapping techniques. In particular, we performed a series of mapping experiments of the visual streak in the primary visual cortex. We found that the representation of the visual streak in V1 is greatly expanded: the nasal 10 degrees of the visual streak representation occupies ten times more cortical area than equivalent areas in the central or temporal representation. Comparison of these data with those on the density of ganglion cells in the retina at corresponding locations in the visual field reveals a significant mismatch between these two variables. The nasal representation is greatly expanded along the horizontal meridian in V1 as compared to the central and temporal regions, whereas the density of ganglion cells decreases with progression along the visual streak from the central region towards the nasal or temporal visual field. A review of the available data reveals that all lateral-eyed mammals exhibit a similar mismatch between the retinal and cortical representation of the visual field, and this mismatch is greater in those species with well-defined visual streaks, such as rabbit and agouti.