11 resultados para databases and data mining

em Biblioteca Digital da Produção Intelectual da Universidade de São Paulo (BDPI/USP)


Relevância:

100.00% 100.00%

Publicador:

Resumo:

This work proposes a method based on both preprocessing and data mining with the objective of identify harmonic current sources in residential consumers. In addition, this methodology can also be applied to identify linear and nonlinear loads. It should be emphasized that the entire database was obtained through laboratory essays, i.e., real data were acquired from residential loads. Thus, the residential system created in laboratory was fed by a configurable power source and in its output were placed the loads and the power quality analyzers (all measurements were stored in a microcomputer). So, the data were submitted to pre-processing, which was based on attribute selection techniques in order to minimize the complexity in identifying the loads. A newer database was generated maintaining only the attributes selected, thus, Artificial Neural Networks were trained to realized the identification of loads. In order to validate the methodology proposed, the loads were fed both under ideal conditions (without harmonics), but also by harmonic voltages within limits pre-established. These limits are in accordance with IEEE Std. 519-1992 and PRODIST (procedures to delivery energy employed by Brazilian`s utilities). The results obtained seek to validate the methodology proposed and furnish a method that can serve as alternative to conventional methods.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Melanoma is a highly aggressive and therapy resistant tumor for which the identification of specific markers and therapeutic targets is highly desirable. We describe here the development and use of a bioinformatic pipeline tool, made publicly available under the name of EST2TSE, for the in silico detection of candidate genes with tissue-specific expression. Using this tool we mined the human EST (Expressed Sequence Tag) database for sequences derived exclusively from melanoma. We found 29 UniGene clusters of multiple ESTs with the potential to predict novel genes with melanoma-specific expression. Using a diverse panel of human tissues and cell lines, we validated the expression of a subset of three previously uncharacterized genes (clusters Hs.295012, Hs.518391, and Hs.559350) to be highly restricted to melanoma/melanocytes and named them RMEL1, 2 and 3, respectively. Expression analysis in nevi, primary melanomas, and metastatic melanomas revealed RMEL1 as a novel melanocytic lineage-specific gene up-regulated during melanoma development. RMEL2 expression was restricted to melanoma tissues and glioblastoma. RMEL3 showed strong up-regulation in nevi and was lost in metastatic tumors. Interestingly, we found correlations of RMEL2 and RMEL3 expression with improved patient outcome, suggesting tumor and/or metastasis suppressor functions for these genes. The three genes are composed of multiple exons and map to 2q12.2, 1q25.3, and 5q11.2, respectively. They are well conserved throughout primates, but not other genomes, and were predicted as having no coding potential, although primate-conserved and human-specific short ORFs could be found. Hairpin RNA secondary structures were also predicted. Concluding, this work offers new melanoma-specific genes for future validation as prognostic markers or as targets for the development of therapeutic strategies to treat melanoma.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Background: The inherent complexity of statistical methods and clinical phenomena compel researchers with diverse domains of expertise to work in interdisciplinary teams, where none of them have a complete knowledge in their counterpart's field. As a result, knowledge exchange may often be characterized by miscommunication leading to misinterpretation, ultimately resulting in errors in research and even clinical practice. Though communication has a central role in interdisciplinary collaboration and since miscommunication can have a negative impact on research processes, to the best of our knowledge, no study has yet explored how data analysis specialists and clinical researchers communicate over time. Methods/Principal Findings: We conducted qualitative analysis of encounters between clinical researchers and data analysis specialists (epidemiologist, clinical epidemiologist, and data mining specialist). These encounters were recorded and systematically analyzed using a grounded theory methodology for extraction of emerging themes, followed by data triangulation and analysis of negative cases for validation. A policy analysis was then performed using a system dynamics methodology looking for potential interventions to improve this process. Four major emerging themes were found. Definitions using lay language were frequently employed as a way to bridge the language gap between the specialties. Thought experiments presented a series of ""what if'' situations that helped clarify how the method or information from the other field would behave, if exposed to alternative situations, ultimately aiding in explaining their main objective. Metaphors and analogies were used to translate concepts across fields, from the unfamiliar to the familiar. Prolepsis was used to anticipate study outcomes, thus helping specialists understand the current context based on an understanding of their final goal. Conclusion/Significance: The communication between clinical researchers and data analysis specialists presents multiple challenges that can lead to errors.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In geophysics and seismology, raw data need to be processed to generate useful information that can be turned into knowledge by researchers. The number of sensors that are acquiring raw data is increasing rapidly. Without good data management systems, more time can be spent in querying and preparing datasets for analyses than in acquiring raw data. Also, a lot of good quality data acquired at great effort can be lost forever if they are not correctly stored. Local and international cooperation will probably be reduced, and a lot of data will never become scientific knowledge. For this reason, the Seismological Laboratory of the Institute of Astronomy, Geophysics and Atmospheric Sciences at the University of Sao Paulo (IAG-USP) has concentrated fully on its data management system. This report describes the efforts of the IAG-USP to set up a seismology data management system to facilitate local and international cooperation.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Maltose-binding protein is the periplasmic component of the ABC transporter responsible for the uptake of maltose/maltodextrins. The Xanthomonas axonopodis pv. citri maltose-binding protein MalE has been crystallized at 293 Kusing the hanging-drop vapour-diffusion method. The crystal belonged to the primitive hexagonal space group P6(1)22, with unit-cell parameters a = 123.59, b = 123.59, c = 304.20 angstrom, and contained two molecules in the asymetric unit. It diffracted to 2.24 angstrom resolution.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

P>The determination of normal parameters is an important procedure in the evaluation of the stomatognathic system. We used the surface electromyography standardization protocol described by Ferrario et al. (J Oral Rehabil. 2000;27:33-40, 2006;33:341) to determine reference values of the electromyographic standardized indices for the assessment of muscular symmetry (left and right side, percentage overlapping coefficient, POC), potential lateral displacing components (unbalanced contractile activities of contralateral masseter and temporalis muscles, TC), relative activity (most prevalent pair of masticatory muscles, ATTIV) and total activity (integrated areas of the electromyographic potentials over time, IMPACT) in healthy Brazilian young adults, and the relevant data reproducibility. Electromyography of the right and left masseter and temporalis muscles was performed during maximum teeth clenching in 20 healthy subjects (10 women and 10 men, mean age 23 years, s.d. 3), free from periodontal problems, temporomandibular disorders, oro-facial myofunctional disorder, and with full permanent dentition (28 teeth at least). Data reproducibility was computed for 75% of the sample. The values obtained were POC Temporal (88 center dot 11 +/- 1 center dot 45%), POC masseter (87 center dot 11 +/- 1 center dot 60%), TC (8 center dot 79 +/- 1 center dot 20%), ATTIV (-0 center dot 33 +/- 9 center dot 65%) and IMPACT (110 center dot 40 +/- 23 center dot 69 mu V/mu V center dot s %). There were no statistical differences between test and retest values (P > 0 center dot 05). The Technical Errors of Measurement (TEM) for 50% of subjects assessed during the same session were 1 center dot 5, 1 center dot 39, 1 center dot 06, 3 center dot 83 and 10 center dot 04. For 25% of the subjects assessed after a 6-month interval, the TEM were 0 center dot 80, 1 center dot 03, 0 center dot 73, 12 center dot 70 and 19 center dot 10. For all indices, there was good reproducibility. These electromyographic indices could be used in the assessment of patients with stomatognathic dysfunction.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Several aspects of photoperception and light signal transduction have been elucidated by studies with model plants. However, the information available for economically important crops, such as Fabaceae species, is scarce. In order to incorporate the existing genomic tools into a strategy to advance soybean research, we have investigated publicly available expressed sequence tag ( EST) sequence databases in order to identify Glycine max sequences related to genes involved in light-regulated developmental control in model plants. Approximately 38,000 sequences from open-access databases were investigated, and all bona fide and putative photoreceptor gene families were found in soybean sequence databases. We have identified G. max orthologs for several families of transcriptional regulators and cytoplasmic proteins mediating photoreceptor-induced responses, although some important Arabidopsis phytochrome-signaling components are absent. Moreover, soybean and Arabidopsis gene-family homologs appear to have undergone a distinct expansion process in some cases. We propose a working model of light perception, signal transduction and response-eliciting in G. max, based on the identified key components from Arabidopsis. These results demonstrate the power of comparative genomics between model systems and crop species to elucidate several aspects of plant physiology and metabolism.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Objetivou-se com este trabalho utilizar regras de associação para identificar forças de mercado que regem a comercialização de touros com avaliação genética pelo programa Nelore Brasil. Essas regras permitem evidenciar padrões implícitos nas transações de grandes bases de dados, indicando causas e efeitos determinantes da oferta e comercialização de touros. Na análise foram considerados 19.736 registros de touros comercializados, 17 fazendas e 15 atributos referentes às diferenças esperadas nas progênies dos reprodutores, local e época da venda. Utilizou-se um sistema com interface gráfica usuário-dirigido que permite geração e seleção interativa de regras de associação. Análise de Pareto foi aplicada para as três medidas objetivas (suporte, confiança e lift) que acompanham cada uma das regras de associação, para validação das mesmas. Foram geradas 2.667 regras de associação, 164 consideradas úteis pelo usuário e 107 válidas para lift ≥ 1,0505. As fazendas participantes do programa Nelore Brasil apresentam especializações na oferta de touros, segundo características para habilidade materna, ganho de peso, fertilidade, precocidade sexual, longevidade, rendimento e terminação de carcaça. Os perfis genéticos dos touros são diferentes para as variedades padrão e mocho. Algumas regiões brasileiras são nichos de mercado para touros sem registro genealógico. A análise de evolução de mercado sugere que o mérito genético total, índice oficial do programa Nelore Brasil, tornou-se um importante índice para comercialização dos touros. Com o uso das regras de associação, foi possível descobrir forças do mercado e identificar combinações de atributos genéticos, geográficos e temporais que determinam a comercialização de touros no programa Nelore Brasil.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The productivity associated with commonly available disassembly methods today seldomly makes disassembly the preferred end-of-life solution for massive take back product streams. Systematic reuse of parts or components, or recycling of pure material fractions are often not achievable in an economically sustainable way. In this paper a case-based review of current disassembly practices is used to analyse the factors influencing disassembly feasibility. Data mining techniques were used to identify major factors influencing the profitability of disassembly operations. Case characteristics such as involvement of the product manufacturer in the end-of-life treatment and continuous ownership are some of the important dimensions. Economic models demonstrate that the efficiency of disassembly operations should be increased an order of magnitude to assure the competitiveness of ecologically preferred, disassembly oriented end-of-life scenarios for large waste of electric and electronic equipment (WEEE) streams. Technological means available to increase the productivity of the disassembly operations are summarized. Automated disassembly techniques can contribute to the robustness of the process, but do not allow to overcome the efficiency gap if not combined with appropriate product design measures. Innovative, reversible joints, collectively activated by external trigger signals, form a promising approach to low cost, mass disassembly in this context. A short overview of the state-of-the-art in the development of such self-disassembling joints is included. (c) 2008 CIRP.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Multidimensional Visualization techniques are invaluable tools for analysis of structured and unstructured data with variable dimensionality. This paper introduces PEx-Image-Projection Explorer for Images-a tool aimed at supporting analysis of image collections. The tool supports a methodology that employs interactive visualizations to aid user-driven feature detection and classification tasks, thus offering improved analysis and exploration capabilities. The visual mappings employ similarity-based multidimensional projections and point placement to layout the data on a plane for visual exploration. In addition to its application to image databases, we also illustrate how the proposed approach can be successfully employed in simultaneous analysis of different data types, such as text and images, offering a common visual representation for data expressed in different modalities.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Most multidimensional projection techniques rely on distance (dissimilarity) information between data instances to embed high-dimensional data into a visual space. When data are endowed with Cartesian coordinates, an extra computational effort is necessary to compute the needed distances, making multidimensional projection prohibitive in applications dealing with interactivity and massive data. The novel multidimensional projection technique proposed in this work, called Part-Linear Multidimensional Projection (PLMP), has been tailored to handle multivariate data represented in Cartesian high-dimensional spaces, requiring only distance information between pairs of representative samples. This characteristic renders PLMP faster than previous methods when processing large data sets while still being competitive in terms of precision. Moreover, knowing the range of variation for data instances in the high-dimensional space, we can make PLMP a truly streaming data projection technique, a trait absent in previous methods.