48 resultados para Text processing
em Biblioteca Digital da Produção Intelectual da Universidade de São Paulo (BDPI/USP)
Resumo:
The problem of projecting multidimensional data into lower dimensions has been pursued by many researchers due to its potential application to data analyses of various kinds. This paper presents a novel multidimensional projection technique based on least square approximations. The approximations compute the coordinates of a set of projected points based on the coordinates of a reduced number of control points with defined geometry. We name the technique Least Square Projections ( LSP). From an initial projection of the control points, LSP defines the positioning of their neighboring points through a numerical solution that aims at preserving a similarity relationship between the points given by a metric in mD. In order to perform the projection, a small number of distance calculations are necessary, and no repositioning of the points is required to obtain a final solution with satisfactory precision. The results show the capability of the technique to form groups of points by degree of similarity in 2D. We illustrate that capability through its application to mapping collections of textual documents from varied sources, a strategic yet difficult application. LSP is faster and more accurate than other existing high-quality methods, particularly where it was mostly tested, that is, for mapping text sets.
Resumo:
An implementation of a computational tool to generate new summaries from new source texts is presented, by means of the connectionist approach (artificial neural networks). Among other contributions that this work intends to bring to natural language processing research, the use of a more biologically plausible connectionist architecture and training for automatic summarization is emphasized. The choice relies on the expectation that it may bring an increase in computational efficiency when compared to the sa-called biologically implausible algorithms.
Resumo:
Due to the imprecise nature of biological experiments, biological data is often characterized by the presence of redundant and noisy data. This may be due to errors that occurred during data collection, such as contaminations in laboratorial samples. It is the case of gene expression data, where the equipments and tools currently used frequently produce noisy biological data. Machine Learning algorithms have been successfully used in gene expression data analysis. Although many Machine Learning algorithms can deal with noise, detecting and removing noisy instances from the training data set can help the induction of the target hypothesis. This paper evaluates the use of distance-based pre-processing techniques for noise detection in gene expression data classification problems. This evaluation analyzes the effectiveness of the techniques investigated in removing noisy data, measured by the accuracy obtained by different Machine Learning classifiers over the pre-processed data.
Resumo:
This study addressed the use of conventional and vegetable origin polyurethane foams to extract C. I. Acid Orange 61 dye. The quantitative determination of the residual dye was carried out with an UV/Vis absorption spectrophotometer. The extraction of the dye was found to depend on various factors such as pH of the solution, foam cell structure, contact time and dye and foam interactions. After 45 days, better results were obtained for conventional foam when compared to vegetable foam. Despite presenting a lower percentage of extraction, vegetable foam is advantageous as it is considered a polymer with biodegradable characteristics.
Resumo:
This paper describes a new food classification which assigns foodstuffs according to the extent and purpose of the industrial processing applied to them. Three main groups are defined: unprocessed or minimally processed foods (group 1), processed culinary and food industry ingredients (group 2), and ultra-processed food products (group 3). The use of this classification is illustrated by applying it to data collected in the Brazilian Household Budget Survey which was conducted in 2002/2003 through a probabilistic sample of 48,470 Brazilian households. The average daily food availability was 1,792 kcal/person being 42.5% from group 1 (mostly rice and beans and meat and milk), 37.5% from group 2 (mostly vegetable oils, sugar, and flours), and 20% from group 3 (mostly breads, biscuits, sweets, soft drinks, and sausages). The share of group 3 foods increased with income, and represented almost one third of all calories in higher income households. The impact of the replacement of group 1 foods and group 2 ingredients by group 3 products on the overall quality of the diet, eating patterns and health is discussed.
Resumo:
Orthodox teaching and practice on nutrition and health almost always focuses on nutrients, or else on foods and drinks. Thus, diets that are high in folate and in green leafy vegetables are recommended, whereas diets high in saturated fat and in full-fat milk and other dairy products are not recommended. Food guides such as the US Food Guide Pyramid are designed to encourage consumption of healthier foods, by which is usually meant those higher in vitamins, minerals and other nutrients seen as desirable.What is generally overlooked in such approaches, which currently dominate official and other authoritative information and education programmes, and also food and nutrition public health policies, is food processing. It is now generally acknowledged that the current pandemic of obesity and related chronic diseases has as one of its important causes increased consumption of convenience including pre-prepared foods(1,2). However, the issue of food processing is largely ignored or minimised in education and information about food, nutrition and health, and also in public health policies.A short commentary cannot be comprehensive, and a general proposal such as that made here is bound to have some problems and exceptions. Also, the social, cultural, economic and environmental consequences of food processing are not discussed here. Readers comments and queries are invited
Resumo:
This paper presents a rational approach to the design of a catamaran's hydrofoil applied within a modern context of multidisciplinary optimization. The approach used includes the use of response surfaces represented by neural networks and a distributed programming environment that increases the optimization speed. A rational approach to the problem simplifies the complex optimization model; when combined with the distributed dynamic training used for the response surfaces, this model increases the efficiency of the process. The results achieved using this approach have justified this publication.
Resumo:
Previously, we isolated two strains of spontaneous oxidative (SpOx2 and SpOx3) stress mutants of Lactococcus lactis subsp cremoris. Herein, we compared these mutants to a parental wild- type strain (J60011) and a commercial starter in experimental fermented milk production. Total solid contents of milk and fermentation temperature both affected the acidification profile of the spontaneous oxidative stress- resistant L. lactis mutants during fermented milk production. Fermentation times to pH 4.7 ranged from 6.40 h (J60011) to 9.36 h (SpOx2); V(max) values were inversely proportional to fermentation time. Bacterial counts increased to above 8.50 log(10) cfu/mL. The counts of viable SpOx3 mutants were higher than those of the parental wild strain in all treatments. All fermented milk products showed post-fermentation acidification after 24 h of storage at 4 degrees C; they remained stable after one week of storage.
Resumo:
Background: Various neuroimaging studies, both structural and functional, have provided support for the proposal that a distributed brain network is likely to be the neural basis of intelligence. The theory of Distributed Intelligent Processing Systems (DIPS), first developed in the field of Artificial Intelligence, was proposed to adequately model distributed neural intelligent processing. In addition, the neural efficiency hypothesis suggests that individuals with higher intelligence display more focused cortical activation during cognitive performance, resulting in lower total brain activation when compared with individuals who have lower intelligence. This may be understood as a property of the DIPS. Methodology and Principal Findings: In our study, a new EEG brain mapping technique, based on the neural efficiency hypothesis and the notion of the brain as a Distributed Intelligence Processing System, was used to investigate the correlations between IQ evaluated with WAIS (Whechsler Adult Intelligence Scale) and WISC (Wechsler Intelligence Scale for Children), and the brain activity associated with visual and verbal processing, in order to test the validity of a distributed neural basis for intelligence. Conclusion: The present results support these claims and the neural efficiency hypothesis.
Resumo:
Genetic variation provides a basis upon which populations can be genetically improved. Management of animal genetic resources in order to minimize loss of genetic diversity both within and across breeds has recently received attention at different levels, e. g., breed, national and international levels. A major need for sustainable improvement and conservation programs is accurate estimates of population parameters, such as rate of inbreeding and effective population size. A software system (POPREP) is presented that automatically generates a typeset report. Key parameters for population management, such as age structure, generation interval, variance in family size, rate of inbreeding, and effective population size form the core part of this report. The report includes a default text that describes definition, computation and meaning of the various parameters. The report is summarized in two pdf files, named Population Structure and Pedigree Analysis Reports. In addition, results (e. g., individual inbreeding coefficients, rate of inbreeding and effective population size) are stored in comma-separate-values files that are available for further processing. Pedigree data from eight livestock breeds from different species and countries were used to describe the potential of POPREP and to highlight areas for further research.
Resumo:
Maltose-binding protein is the periplasmic component of the ABC transporter responsible for the uptake of maltose/maltodextrins. The Xanthomonas axonopodis pv. citri maltose-binding protein MalE has been crystallized at 293 Kusing the hanging-drop vapour-diffusion method. The crystal belonged to the primitive hexagonal space group P6(1)22, with unit-cell parameters a = 123.59, b = 123.59, c = 304.20 angstrom, and contained two molecules in the asymetric unit. It diffracted to 2.24 angstrom resolution.
Resumo:
Given the polarity dependent effects of transcranial direct current stimulation (tDCS) in facilitating or inhibiting neuronal processing, and tDCS effects on pitch perception, we tested the effects of tDCS on temporal aspects of auditory processing. We aimed to change baseline activity of the auditory cortex using tDCS as to modulate temporal aspects of auditory processing in healthy subjects without hearing impairment. Eleven subjects received 2mA bilateral anodal, cathodal and sham tDCS over auditory cortex in a randomized and counterbalanced order. Subjects were evaluated by the Random Gap Detection Test (RGDT), a test measuring temporal processing abilities in the auditory domain, before and during the stimulation. Statistical analysis revealed a significant interaction effect of time vs. tDCS condition for 4000 Hz and for clicks. Post-hoc tests showed significant differences according to stimulation polarity on RGDT performance: anodal improved 22.5% and cathodal decreased 54.5% subjects' performance, as compared to baseline. For clicks, anodal also increased performance in 29.4% when compared to baseline. tDCS presented polarity-dependent effects on the activity of the auditory cortex, which results in a positive or negative impact in a temporal resolution task performance. These results encourage further studies exploring tDCS in central auditory processing disorders.
Resumo:
Today several different unsupervised classification algorithms are commonly used to cluster similar patterns in a data set based only on its statistical properties. Specially in image data applications, self-organizing methods for unsupervised classification have been successfully applied for clustering pixels or group of pixels in order to perform segmentation tasks. The first important contribution of this paper refers to the development of a self-organizing method for data classification, named Enhanced Independent Component Analysis Mixture Model (EICAMM), which was built by proposing some modifications in the Independent Component Analysis Mixture Model (ICAMM). Such improvements were proposed by considering some of the model limitations as well as by analyzing how it should be improved in order to become more efficient. Moreover, a pre-processing methodology was also proposed, which is based on combining the Sparse Code Shrinkage (SCS) for image denoising and the Sobel edge detector. In the experiments of this work, the EICAMM and other self-organizing models were applied for segmenting images in their original and pre-processed versions. A comparative analysis showed satisfactory and competitive image segmentation results obtained by the proposals presented herein. (C) 2008 Published by Elsevier B.V.
Resumo:
This article discusses issues related to the organization and reception of information in the context of services and public information systems driven by technology. It stems from the assumption that in a ""technologized"" society, the distance between users and information is almost always of cognitive and socio-cultural nature, a product of our effort to design communication. In this context, we favor the approach of the information sign, seeking to answer how a documentary message turns into information, i.e. a structure recognized as socially useful. Observing the structural, cognitive and communicative aspects of the documentary message, based on Documentary Linguistics, Terminology, as well as on Textual Linguistics, the policy of knowledge management and innovation of the Government of the State of Sao Paulo is analyzed, which authorizes the use of Web 2.0, also questioning to what extent this initiative represents innovation in the environment of libraries.