9 results for Knowledge Discovery in Databases
in the Biblioteca Digital da Produção Intelectual da Universidade de São Paulo
Abstract:
Knowledge has been used as a resource for intelligent and effective action planning in organizations. Interest in research on knowledge management processes has intensified in different areas. A systematic literature review was conducted, based on the question: what are the contributions of Brazilian and international journal publications on knowledge management in health? The sample totaled 32 items that complied with the inclusion criteria. The results showed that 78% of the journals that published on the theme are international, 77% of the researchers work in higher education and 65% have a Ph.D. The texts gave rise to five thematic categories, mainly: development of knowledge management systems in health (37.5%), discussion of knowledge management application in health (28.1%) and nurses' function in knowledge management (18.7%).
Abstract:
The article analyzes the process of knowledge creation in Brazilian technology-based companies, against the background of the driving and restrictive factors found in this process. The discussion rests on the four main modes of knowledge conversion in the Japanese model: socialization, externalization, combination and internalization. A comparative case study based on qualitative research was carried out in nine technology-based enterprises that were incubated or had recently passed through the incubation stage (so-called graduated companies) in the Technology Park of São Carlos, state of São Paulo, Brazil. Among the main results, the combination of knowledge was identified as more conscious and structured in graduated companies than in incubated companies. In contrast, incubated companies were found to have an environment with greater opportunities for socialization, internalization and externalization of knowledge.
Abstract:
We compared the microbial community composition in soils from the Brazilian Amazon with two contrasting histories: anthrosols and their adjacent non-anthrosol soils of the same mineralogy. The anthrosols, also known as the Amazonian Dark Earths or terra preta, were managed by indigenous pre-Columbian peoples between 500 and 8,700 years before present and are characterized by unusually high cation exchange capacity, phosphorus (P), and calcium (Ca) contents, and soil carbon pools that contain a high proportion of incompletely combusted biomass as biochar or black carbon (BC). We sampled paired anthrosol and unmodified soils from four locations in the Manaus, Brazil, region that differed in their current land use and soil type. Community DNA was extracted from sampled soils and characterized by use of denaturing gradient gel electrophoresis (DGGE) and terminal restriction fragment length polymorphism. DNA bands of interest from Bacteria and Archaea DGGE gels were cloned and sequenced. In cluster analyses of the DNA fingerprints, microbial communities from the anthrosols grouped together regardless of current land use or soil type and were distinct from those in their respective, paired adjacent soils. For the Archaea, the anthrosol communities diverged from the adjacent soils by over 90%. A greater overall richness was observed for Bacteria sequences as compared with those of the Archaea. Most of the sequences obtained were novel and matched those in databases at less than 98% similarity. Several sequences obtained only from the anthrosols grouped at 93% similarity with the Verrucomicrobia, a genus commonly found in rice paddies in the tropics. Sequences closely related to Proteobacteria and Cyanobacteria sp. were recovered only from adjacent soil samples. Sequences related to Pseudomonas, Acidobacteria, and Flexibacter sp. were recovered from both anthrosols and adjacent soils.
The strong similarities among the microbial communities present in the anthrosols for both the Bacteria and Archaea suggest that the microbial community composition in these soils is controlled more strongly by their historical soil management than by soil type or current land use. The anthrosols had consistently higher concentrations of incompletely combusted organic black carbon material (BC), higher soil pH, and higher concentrations of P and Ca compared to their respective adjacent soils. Such characteristics may help to explain the longevity and distinctiveness of the anthrosols in the Amazonian landscape and guide us in recreating soils with sustained high fertility in otherwise nutrient-poor soils in modern times.
Abstract:
Several variants of the widely used Fuzzy C-Means (FCM) algorithm support clustering data distributed across different sites. Those methods have been studied under different names, such as collaborative and parallel fuzzy clustering. In this study, we augment the two FCM-based clustering algorithms used to cluster distributed data in two ways: we propose constructive ways of determining the essential parameters of the algorithms (including the number of clusters), and we form a set of systematically structured guidelines, such as how to select the specific algorithm depending on the nature of the data environment and the assumptions made about the number of clusters. A thorough complexity analysis, including space, time, and communication aspects, is reported. A series of detailed numeric experiments illustrates the main ideas discussed in the study.
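For orientation, the sketch below shows the standard single-site Fuzzy C-Means iteration that the collaborative and parallel variants build on. It is not the distributed algorithms studied in the paper, whose details the abstract does not give. The function name `fuzzy_c_means`, its parameters (`m`, `max_iter`, `tol`, `seed`) and the random initialization are illustrative assumptions.

```python
import math
import random

def fuzzy_c_means(points, c, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Standard (single-site) Fuzzy C-Means on a list of equal-length tuples.

    Returns (centers, memberships); memberships[i][k] is the degree to which
    point i belongs to cluster k, and each row sums to 1.
    Hypothetical helper for illustration, not the paper's distributed method.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Start from a random membership matrix whose rows are normalized to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    exponent = 2.0 / (m - 1.0)
    for _ in range(max_iter):
        # Center update: mean of the points weighted by u_ik ** m.
        centers = []
        for k in range(c):
            weights = [u[i][k] ** m for i in range(n)]
            total = sum(weights)
            centers.append(tuple(
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dim)))

        # Membership update: u_ik = 1 / sum_j (d_ik / d_ij) ** (2 / (m - 1)).
        shift = 0.0
        for i, p in enumerate(points):
            dists = [math.dist(p, ctr) for ctr in centers]
            zeros = [k for k, d in enumerate(dists) if d == 0.0]
            if zeros:
                # Point coincides with a center: split full membership there.
                new_row = [1.0 / len(zeros) if k in zeros else 0.0
                           for k in range(c)]
            else:
                new_row = [
                    1.0 / sum((dists[k] / dists[j]) ** exponent
                              for j in range(c))
                    for k in range(c)]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row
        # Stop once no membership changed by more than tol.
        if shift < tol:
            break
    return centers, u
```

Calling `fuzzy_c_means(points, c=2)` on a list of coordinate tuples returns the cluster centers and a membership matrix. The number of clusters `c` is passed in by hand here, whereas the paper proposes ways of determining it.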
Abstract:
We review recent visualization techniques aimed at supporting tasks that require the analysis of text documents, from approaches targeted at visually summarizing the relevant content of a single document to those aimed at assisting exploratory investigation of whole collections of documents. Techniques are organized considering their target input material (either single texts or collections of texts) and their focus, which may be at displaying content, emphasizing relevant relationships, highlighting the temporal evolution of a document or collection, or helping users to handle results from a query posed to a search engine. We describe the approaches adopted by distinct techniques and briefly review the strategies they employ to obtain meaningful text models, discuss how they extract the information required to produce representative visualizations, the tasks they intend to support and the interaction issues involved, and strengths and limitations. Finally, we show a summary of techniques, highlighting their goals and distinguishing characteristics. We also briefly discuss some open problems and research directions in the fields of visual text mining and text analytics.
Abstract:
Background: Some organisms can survive extreme desiccation by entering a state of suspended animation known as anhydrobiosis. The free-living mycophagous nematode Aphelenchus avenae can be induced to enter anhydrobiosis by pre-exposure to moderate reductions in relative humidity (RH) prior to extreme desiccation. This preconditioning phase is thought to allow modification of the transcriptome by activation of genes required for desiccation tolerance. Results: To identify such genes, a panel of expressed sequence tags (ESTs) enriched for sequences upregulated in A. avenae during preconditioning was created. A subset of 30 genes with significant matches in databases, together with a number of apparently novel sequences, was chosen for further study. Several of the recognisable genes are associated with water stress, encoding, for example, two new hydrophilic proteins related to the late embryogenesis abundant (LEA) protein family. Expression studies confirmed EST panel members to be upregulated by evaporative water loss, and the majority of genes were also induced by osmotic stress and cold, but rather fewer by heat. We attempted to use RNA interference (RNAi) to demonstrate the importance of this gene set for anhydrobiosis, but found A. avenae to be recalcitrant with the techniques used. Instead, therefore, we developed a cross-species RNAi procedure using A. avenae sequences in another anhydrobiotic nematode, Panagrolaimus superbus, which is amenable to gene silencing. Of 20 A. avenae ESTs screened, a significant reduction in survival of desiccation in treated P. superbus populations was observed with two sequences, one of which was novel, while the other encoded a glutathione peroxidase. To confirm a role for glutathione peroxidases in anhydrobiosis, RNAi with cognate sequences from P. superbus was performed and was also shown to reduce desiccation tolerance in this species.
Conclusions: This study has identified and characterised the expression profiles of members of the anhydrobiotic gene set in A. avenae. It also demonstrates the potential of RNAi for the analysis of anhydrobiosis and provides the first genetic data to underline the importance of effective antioxidant systems in metazoan desiccation tolerance.
Abstract:
Background: The integration of sequencing and gene interaction data and subsequent generation of pathways and networks contained in databases such as KEGG Pathway is essential for the comprehension of complex biological processes. We noticed the absence of a chart or pathway describing the well-studied preimplantation development stages; furthermore, not all genes involved in the process have entries in KEGG Orthology, which is important information for applying this knowledge to other organisms. Results: In this work we sought to develop the regulatory pathway for the preimplantation development stage using text-mining tools such as Medline Ranker and PESCADOR to reveal biointeractions among the genes involved in this process. The genes present in the resulting pathway were also used as seeds for software developed by our group called SeedServer to create clusters of homologous genes. These homologues allowed the determination of the last common ancestor for each gene and revealed that the preimplantation development pathway consists of a conserved ancient core of genes with the addition of modern elements. Conclusions: The generation of regulatory pathways through text-mining tools allows the integration of data generated by several studies for a more complete visualization of complex biological processes. Using the genes in this pathway as “seeds” for the generation of clusters of homologues, the pathway can be visualized for other organisms. The clustering of homologous genes together with determination of the ancestry leads to a better understanding of the evolution of this process.
Abstract:
USP INFORMATION MANDATE – Resolution 6444 – Oct. 22, 2012
•Make public and accessible the knowledge generated by research developed at USP, encouraging the sharing, the use and generation of new content;
•Preserve institutional memory by storing the full text of Intellectual Production (scientific, academic, artistic and technical);
•Increase the impact of the knowledge generated in the university within the scientific community and the general public;
•It is suggested to all members of the USP community to publish the results of their research, preferably, in open-access publication outlets and/or repositories and to include the permission to deposit their production in the BDPI system in their publication agreements.
•Institutional Repository for Intellectual Production;
•Official Source USP Statistical Yearbook.
Abstract:
The ubiquity of time series data across almost all human endeavors has produced a great interest in time series data mining in the last decade. While dozens of classification algorithms have been applied to time series, recent empirical evidence strongly suggests that simple nearest neighbor classification is exceptionally difficult to beat. The choice of distance measure used by the nearest neighbor algorithm is important, and depends on the invariances required by the domain. For example, motion capture data typically requires invariance to warping, and cardiology data requires invariance to the baseline (the mean value). Similarly, recent work suggests that for time series clustering, the choice of clustering algorithm is much less important than the choice of distance measure used. In this work we make a somewhat surprising claim. There is an invariance that the community seems to have missed, complexity invariance. Intuitively, the problem is that in many domains the different classes may have different complexities, and pairs of complex objects, even those which subjectively may seem very similar to the human eye, tend to be further apart under current distance measures than pairs of simple objects. This fact introduces errors in nearest neighbor classification, where some complex objects may be incorrectly assigned to a simpler class. Similarly, for clustering this effect can introduce errors by “suggesting” to the clustering algorithm that subjectively similar, but complex objects belong in a sparser and larger diameter cluster than is truly warranted. We introduce the first complexity-invariant distance measure for time series, and show that it generally produces significant improvements in classification and clustering accuracy. We further show that this improvement does not compromise efficiency, since we can lower bound the measure and use a modification of the triangular inequality, thus making use of most existing indexing and data mining algorithms.
We evaluate our ideas with the largest and most comprehensive set of time series mining experiments ever attempted in a single work, and show that complexity-invariant distance measures can produce improvements in classification and clustering in the vast majority of cases.
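The abstract does not state the measure itself. As a rough illustration of the idea, the sketch below assumes one common formulation: the Euclidean distance multiplied by a correction factor, the ratio of the larger to the smaller complexity estimate, where complexity is estimated from the differences between consecutive values. The function names, the `eps` guard against flat series, and the small 1-nearest-neighbor helper are illustrative assumptions and may differ from the authors' implementation.

```python
import math

def euclidean(q, c):
    """Euclidean distance between two equal-length series."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(q, c)))

def complexity_estimate(series):
    """Root of the summed squared differences between consecutive values.

    Assumed complexity estimate: a flat series scores 0, and a series with
    many large jumps scores high.
    """
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(series, series[1:])))

def complexity_invariant_distance(q, c, eps=1e-12):
    """Euclidean distance scaled by a complexity correction factor >= 1.

    The factor is max(CE) / min(CE). The eps guard for flat series is an
    assumption of this sketch, added only to avoid division by zero.
    """
    ce_q, ce_c = complexity_estimate(q), complexity_estimate(c)
    low, high = min(ce_q, ce_c), max(ce_q, ce_c)
    if high == 0.0:
        return euclidean(q, c)  # both series are flat: no correction
    return euclidean(q, c) * (high / max(low, eps))

def nearest_neighbor_label(query, train, distance):
    """1-NN: return the label of the training series closest to query.

    train is a list of (series, label) pairs; distance is any function
    taking two series, e.g. complexity_invariant_distance.
    """
    _, label = min(train, key=lambda item: distance(query, item[0]))
    return label
```

Because the factor is never below 1, two series of equal complexity keep their Euclidean distance, while a pair with mismatched complexities is pushed further apart. Passing `complexity_invariant_distance` as the `distance` argument of `nearest_neighbor_label` gives a nearest neighbor classifier of the kind the abstract discusses.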