12 results for Frequent itemsets mining
in Biblioteca Digital da Produção Intelectual da Universidade de São Paulo
Abstract:
Background: The multi-relational approach has emerged as an alternative for analyzing structured data such as relational databases, since it allows data mining to be applied to multiple tables directly, thus avoiding expensive join operations and semantic losses. This work proposes an algorithm that follows the multi-relational approach. Methods: To compare the performance of the traditional and multi-relational approaches to mining association rules, this paper presents an empirical study of PatriciaMine, a traditional algorithm, and its proposed multi-relational counterpart, MR-Radix. Results: The study showed performance advantages for the multi-relational approach over several tables, since it avoids the high cost of join operations across multiple tables as well as semantic losses. MR-Radix ran faster than PatriciaMine, despite handling complex multi-relational patterns. Memory usage grew more conservatively for MR-Radix than for PatriciaMine, which indicates that an increase in demand for frequent items does not cause significant growth in memory usage for MR-Radix as it does for PatriciaMine. Conclusion: The comparative study between PatriciaMine and MR-Radix confirmed the efficacy of the multi-relational approach in the data mining process, in terms of both execution time and memory usage. In addition, the proposed multi-relational algorithm, unlike other algorithms of this approach, is efficient for use in large relational databases.
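For readers unfamiliar with the search term, the idea of a frequent itemset can be illustrated with a minimal sketch. It is a brute-force enumeration over a single table of transactions and reproduces neither PatriciaMine nor MR-Radix; the function name `frequent_itemsets`, its parameters, and the sample data are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=3):
    """Count every itemset of up to max_size items and keep those that
    occur in at least min_support transactions (brute force, for
    illustration only; real miners avoid enumerating all combinations)."""
    counts = Counter()
    for transaction in transactions:
        # Deduplicate and sort so each itemset has one canonical tuple form.
        items = sorted(set(transaction))
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[combo] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


# Hypothetical sample data: each inner list is one transaction.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]
result = frequent_itemsets(transactions, min_support=2)
```

With a minimum support of 2, every single item and every pair is kept, while the triple that appears in only one transaction is dropped.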
Abstract:
Serra da Canastra National Park (SCNP) is one of the most important protected areas in the Cerrado biome. Despite its importance to the conservation of rare and endangered species such as the Brazilian Merganser, two bills were approved in 2010 by Brazil's Chamber of Deputies aiming to reduce SCNP's official boundaries and to transform some of its parts into an Environmental Protection Area (EPA). We evaluated whether such changes would make it easier for mining areas to be legally exploited within the park's area, and whether those mining areas would represent a threat to Brazilian Merganser populations at SCNP. Results showed that 55% of the mining areas currently within the National Park will be located within the new EPA, and six hydrographic micro-basins inhabited by the Brazilian Merganser could be affected by environmental impacts caused by mineral exploitation in those areas. For these reasons, we recommend that the two bills be rejected by the Federal Senate.
Abstract:
The use of cover crops affects the support capacity of soil and the least limiting water range for crop growth. The objective of this study was to quantify preconsolidation pressure (sigma(p)), compression index (CI) and least limiting water range (LLWR) of a reclaimed coal mining soil under different cover crops, in Candiota, RS, Brazil. In the experiment, with a randomized block design and four replicates, the following cover crops (treatments) were evaluated: Hemarthria altissima (Poir.) Stapf & C.E. Hubbard, treatment 1 (T1), Paspalum notatum Flugge, treatment 4 (T4), Cynodon dactylon (L) Pers., treatment 5 (T5), control Brachiaria brizantha (Hochst.) Stapf, treatment 7 (T7) and without cover crop treatment 8 (reference treatment, T8). Soil compression and least limiting water range were evaluated with undisturbed samples at a depth of 0.00-0.05 m. In order to evaluate parameters of soil compressibility, the soil samples were saturated with water and subjected to -10 kPa matric potential and then submitted to a uniaxial compression test under the following pressures: 25, 50, 100, 200, 400, 800 and 1600 kPa. Cover crops decreased the preconsolidation pressure of constructed soils after coal mining and the greatest soil reclamation was obtained with the H. altissima cover crop, where the lowest degree of soil compactness and soil load capacity were observed. Soils cultivated under H. altissima or B. brizantha presented the highest least limiting water range and these two cover crops generated similar soil critical bulk density obtained by least limiting water range and soil load support capacity. (C) 2012 Elsevier B.V. All rights reserved.
Abstract:
Purpose: We sought to determine the mechanisms of downregulation of the airway transcription factor Foxa2 in lung cancer and the expression status of Foxa2 in non-small-cell lung cancer (NSCLC). Methods: A series of 25 lung cancer cell lines were evaluated for Foxa2 protein expression, FOXA2 mRNA levels, FOXA2 mutations, FOXA2 copy number changes and for evidence of FOXA2 promoter hypermethylation. In addition, 32 NSCLCs were sequenced for FOXA2 mutations and 173 primary NSCLC tumors were evaluated for Foxa2 expression using an immunohistochemical assay. Results: Out of the 25 cell lines, 13 (52%) had undetectable FOXA2 mRNA. The expression of FOXA2 mRNA and Foxa2 protein was congruent in 19/22 cells (p = 0.001). FOXA2 mutations were not identified in primary NSCLCs and were infrequent in cell lines. Focal or broad chromosomal deletions involving FOXA2 were not present. The promoter region of FOXA2 had evidence of hypermethylation, with an inverse correlation between FOXA2 mRNA expression and presence of CpG dinucleotide methylation (p < 0.0001). In primary NSCLC tumor specimens, there was a high frequency of either absence (42/173, 24.2%) or no/low expression (96/173, 55.4%) of Foxa2. In 130 patients with stage I NSCLC there was a trend towards decreased survival in tumors with no/low expression of Foxa2 (HR of 1.6, 95% CI 0.9-3.1; p = 0.122). Conclusions: Loss of expression of Foxa2 is frequent in lung cancer cell lines and NSCLCs. The main mechanism of downregulation of Foxa2 is epigenetic silencing through promoter hypermethylation. Further elucidation of the involvement of Foxa2 and other airway transcription factors in the pathogenesis of lung cancer may identify novel therapeutic targets. (C) 2012 Elsevier Ireland Ltd. All rights reserved.
Abstract:
The reproductive performance of cattle may be influenced by several factors, but mineral imbalances are crucial in terms of direct effects on reproduction. Several studies have shown that elements such as calcium, copper, iron, magnesium, selenium, and zinc are essential for reproduction and can prevent oxidative stress. However, toxic elements such as lead, nickel, and arsenic can have adverse effects on reproduction. In this paper, we applied a simple and fast method of multi-element analysis to bovine semen samples from Zebu and European classes used in reproduction programs and artificial insemination. Samples were analyzed by inductively coupled plasma mass spectrometry (ICP-MS) using aqueous medium calibration, and the samples were diluted in a proportion of 1:50 in a solution containing 0.01% (vol/vol) Triton X-100 and 0.5% (vol/vol) nitric acid. Rhodium, iridium, and yttrium were used as the internal standards for ICP-MS analysis. To develop a reliable method of tracing the class of bovine semen, we used data mining techniques that make it possible to classify unknown samples after checking the differentiation of known-class samples. Based on the determination of 15 elements in 41 samples of bovine semen, 3 machine-learning tools for classification were applied to determine cattle class. Our results demonstrate the potential of support vector machine (SVM), multilayer perceptron (MLP), and random forest (RF) chemometric tools to identify cattle class. Moreover, the selection tools made it possible to reduce the number of chemical elements needed from 15 to just 8.
Abstract:
Mutations in the critical chromatin modifier ATRX and mutations in CIC and FUBP1, which are potent regulators of cell growth, have been discovered in specific subtypes of gliomas, the most common type of primary malignant brain tumors. However, the frequency of these mutations in many subtypes of gliomas, and their association with clinical features of the patients, is poorly understood. Here we analyzed these loci in 363 brain tumors. ATRX is frequently mutated in grade II-III astrocytomas (71%), oligoastrocytomas (68%), and secondary glioblastomas (57%), and ATRX mutations are associated with IDH1 mutations and with an alternative lengthening of telomeres phenotype. CIC and FUBP1 mutations occurred frequently in oligodendrogliomas (46% and 24%, respectively) but rarely in astrocytomas or oligoastrocytomas (<10%). This analysis allowed us to define two highly recurrent genetic signatures in gliomas: IDH1/ATRX (I-A) and IDH1/CIC/FUBP1 (I-CF). Patients with I-CF gliomas had a significantly longer median overall survival (96 months) than patients with I-A gliomas (51 months) and patients with gliomas that did not harbor either signature (13 months). The genetic signatures distinguished clinically distinct groups of oligoastrocytoma patients, which usually present a diagnostic challenge, and were associated with differences in clinical outcome even among individual tumor types. In addition to providing new clues about the genetic alterations underlying gliomas, the results have immediate clinical implications, providing a tripartite genetic signature that can serve as a useful adjunct to conventional glioma classification that may aid in prognosis, treatment selection, and therapeutic trial design.
Abstract:
This quantitative study aimed to identify the costs of the most frequent nursing activities for highly dependent hospitalized patients at a medical clinic. The non-probabilistic convenience sample comprised 607 observations of oral feeding (OF), blood pressure (BP)/heart rate (HR) verification, body temperature checking (BTC), performance of intimate hygiene and management of the feeding tube. The costs identified were R$2.40 (SD ± 2.64) for OF; R$1.26 (SD ± 0.48) for BP/HR verification; R$1.17 (SD ± 0.46) for BTC; R$15.59 (SD ± 8.62) for intimate hygiene and R$5.95 (SD ± 2.13) for management of the feeding tube. This study will facilitate cost management, with a view to avoiding waste related to unnecessary resource consumption and establishing a correlation between costs and care delivery results. Supported by Pro-Reitoria de Pesquisa, Universidade de Sao Paulo, Brazil.
Abstract:
Multi-element analysis of honey samples was carried out with the aim of developing a reliable method of tracing the origin of honey. Forty-two chemical elements were determined (Al, Cu, Pb, Zn, Mn, Cd, Tl, Co, Ni, Rb, Ba, Be, Bi, U, V, Fe, Pt, Pd, Te, Hf, Mo, Sn, Sb, P, La, Mg, I, Sm, Tb, Dy, Sd, Th, Pr, Nd, Tm, Yb, Lu, Gd, Ho, Er, Ce, Cr) by inductively coupled plasma mass spectrometry (ICP-MS). Then, three machine learning tools for classification and two for attribute selection were applied in order to prove that it is possible to use data mining tools to find the region where honey originated. Our results clearly demonstrate the potential of Support Vector Machine (SVM), Multilayer Perceptron (MLP) and Random Forest (RF) chemometric tools for honey origin identification. Moreover, the selection tools allowed a reduction from 42 trace element concentrations to only 5. (C) 2012 Elsevier Ltd. All rights reserved.
Abstract:
Background: Digestive complications in enteral nutrition (EN) can negatively affect the nutrition clinical outcome of hospitalized patients. Diarrhea and constipation are intestinal motility disorders associated with pharmacotherapy, hydration, nutrition status, and age. The aim of this study was to analyze the frequency of these intestinal motility disorders in patients receiving EN and assess risk factors associated with diarrhea and constipation in hospitalized patients receiving exclusive EN therapy in a general hospital. Materials and Methods: The authors performed a sequential and observational study of 110 hospitalized adult patients fed exclusively by EN through a feeding tube. Patients were categorized according to the type of intestinal transit disorder as follows: group D (diarrhea, 3 or more watery evacuations in 24 hours), group C (constipation, less than 1 evacuation during 3 days), and group N (absence of diarrhea or constipation). All prescription drugs were recorded, and patients were analyzed according to the type and amount of medication received. The authors also investigated the presence of fiber in the enteral formula. Results: Patients classified in group C represented 70% of the study population; group D comprised 13%, and group N represented 17%. There was an association between group C and orotracheal intubation as the indication for EN (P<.001). Enteral formula without fiber was associated with constipation (logistic regression analysis: P<.001). Conclusion: Constipation is more frequent than diarrhea in patients fed exclusively by EN. Enteral diet with fiber may protect against medication-associated intestinal motility disorders. The addition of prokinetic drugs seems to be useful in preventing constipation. (Nutr Clin Pract. XXXX;xx:xx-xx)
Abstract:
Purpose: The pathophysiology of acute coronary syndromes (ACS) after noncardiac surgery is not yet established. Thrombosis over a vulnerable plaque or decreased oxygen supply secondary to anemia or hypotension may be involved. The purpose of this study was to investigate the pathophysiology of ACS complicating noncardiac surgery. Methods: Clinical and angiographic data were prospectively recorded into a database for 120 consecutive patients who had an ACS after noncardiac surgery (PACS), for 120 patients with spontaneous ACS (SACS), and for 240 patients with stable coronary artery disease (CAD). Coronary lesions with obstructions greater than 50% were classified based on two criteria: Ambrose's classification and complex morphology. The presence of Ambrose's type II or complex lesions was compared among the three groups. Results: We analyzed 1470 lesions in 480 patients. In the PACS group, 45% of patients had Ambrose's type II lesions vs. 56.7% in the SACS group and 16.4% in the stable CAD group (P < 0.001). Both PACS and SACS patients had more complex lesions than patients in the stable CAD group (56.7% vs. 79.2% vs. 31.8%, respectively; P < 0.001). Overall, the independent predictors of plaque rupture were being in the PACS group (P < 0.001, OR 2.86; CI, 1.82-4.52 for complex lesions and P < 0.001, OR 3.43; CI, 2.1-5.6 for Ambrose's type II lesions) or the SACS group (P < 0.001, OR 8.71; CI, 5.15-14.73 for complex lesions and P < 0.001, OR 5.99; CI, 3.66-9.81 for Ambrose's type II lesions). Conclusions: Nearly 50% of patients with perioperative ACS have evidence of coronary plaque rupture, characterizing a type 1 myocardial infarction. (C) 2012 Elsevier Ireland Ltd. All rights reserved.
Abstract:
Background: The integration of sequencing and gene interaction data and subsequent generation of pathways and networks contained in databases such as KEGG Pathway is essential for the comprehension of complex biological processes. We noticed the absence of a chart or pathway describing the well-studied preimplantation development stages; furthermore, not all genes involved in the process have entries in KEGG Orthology, which is important information for applying this knowledge to other organisms. Results: In this work we sought to develop the regulatory pathway for the preimplantation development stage using text-mining tools such as Medline Ranker and PESCADOR to reveal biointeractions among the genes involved in this process. The genes present in the resulting pathway were also used as seeds for software developed by our group called SeedServer to create clusters of homologous genes. These homologues allowed the determination of the last common ancestor for each gene and revealed that the preimplantation development pathway consists of a conserved ancient core of genes with the addition of modern elements. Conclusions: The generation of regulatory pathways through text-mining tools allows the integration of data generated by several studies for a more complete visualization of complex biological processes. Using the genes in this pathway as “seeds” for the generation of clusters of homologues, the pathway can be visualized for other organisms. The clustering of homologous genes together with determination of the ancestry leads to a better understanding of the evolution of such processes.
Abstract:
Given a large image set in which very few images have labels, how can we guess labels for the remaining majority? How can we spot images that need brand new labels different from the predefined ones? How can we summarize these data to route the user’s attention to what really matters? Here we answer all these questions. Specifically, we propose QuMinS, a fast, scalable solution to two problems: (i) Low-labor labeling (LLL): given an image set in which very few images have labels, find the most appropriate labels for the rest; and (ii) Mining and attention routing: in the same setting, find clusters, the top-N_O outlier images, and the N_R images that best represent the data. Experiments on satellite images spanning up to 2.25 GB show that, in contrast to state-of-the-art labeling techniques, QuMinS scales linearly with the data size and is up to 40 times faster than top competitors (GCap) while still achieving better or equal accuracy; it spots images that potentially require unpredicted labels; and it works even with tiny initial label sets, i.e., nearly five examples. We also report a case study of our method’s practical usage to show that QuMinS is a viable tool for automatic coffee crop detection from remote sensing images.