868 results for frequent Pattern
Abstract:
Data mining, frequent pattern mining, database mining, mining algorithms in SQL
Abstract:
We present a method to enhance fault localization for software systems based on a frequent pattern mining algorithm. Our method is based on a large set of test cases for a given set of programs in which faults can be detected. The test executions are recorded as function call trees. Based on test oracles the tests can be classified into successful and failing tests. A frequent pattern mining algorithm is used to identify frequent subtrees in successful and failing test executions. This information is used to rank functions according to their likelihood of containing a fault. The ranking suggests an order in which to examine the functions during fault analysis. We validate our approach experimentally using a subset of Siemens benchmark programs.
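The ranking step described above can be illustrated with a much simpler stand-in. The sketch below is not the authors' algorithm: it assumes each test run has already been reduced to the set of functions it called (rather than mining frequent subtrees of call trees), and it scores each function by how much more often it appears in failing runs than in passing ones. The function name `rank_functions` and the scoring formula are illustrative assumptions.

```python
from collections import Counter


def rank_functions(passing, failing):
    """Rank functions by how much more often they occur in failing runs.

    passing, failing: lists of sets of function names, one set per test run.
    Returns (function, score) pairs, most suspicious first.
    NOTE: simplified stand-in for subtree mining; the score is an assumption.
    """
    pass_counts = Counter(f for run in passing for f in run)
    fail_counts = Counter(f for run in failing for f in run)
    n_pass = max(len(passing), 1)  # avoid division by zero
    n_fail = max(len(failing), 1)

    scores = {}
    for f in set(pass_counts) | set(fail_counts):
        # Fraction of failing runs that call f minus fraction of passing runs.
        scores[f] = fail_counts[f] / n_fail - pass_counts[f] / n_pass

    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


passing_runs = [{"main", "parse"}, {"main", "parse", "emit"}]
failing_runs = [{"main", "parse", "fold"}, {"main", "fold"}]
ranked = rank_functions(passing_runs, failing_runs)
```

In this toy input, `fold` is called only in failing runs, so it is examined first; functions called in every run, such as `main`, score zero.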
Abstract:
Analysis and application of data mining processes to the information flow of real-time systems. Implementation and analysis of a self-adaptive algorithm for mining frequent patterns on automated machines.
Abstract:
A large epidemic of serogroup B meningococcal disease (MD) has been occurring in greater São Paulo, Brazil, since 1988. A Cuban-produced vaccine, based on outer-membrane protein (OMP) from serogroup B: serotype 4: serosubtype P1.15 (B:4:P1.15) Neisseria meningitidis, was given to about 2.4 million children aged 3 months to 6 years during 1989 and 1990. Administration of the vaccine had little or no measurable effect on this outbreak. To detect clonal changes that could explain the continued increase in the incidence of disease after vaccination, we serotyped isolates recovered between 1990 and 1996 from 834 patients with systemic disease. Strain B:4:P1.15, which was detected in the area as early as 1977, has been the most prevalent phenotype since 1988. These strains are still prevalent in the area and were responsible for about 68% of the 834 serogroup B cases in the last 7 years. We analyzed 438 (52%) of these strains by restriction fragment length polymorphism (RFLP) analysis of rRNA genes (ribotyping). The most frequent pattern obtained was referred to as Rb1 (68%). We concluded that the same clone of B:4:P1.15-Rb1 strains was the most prevalent and was responsible for the continued increase in the incidence of serogroup B MD cases in greater São Paulo during the last 7 years, in spite of the vaccination trial.
Abstract:
Purpose: In recent years, MRI has emerged as a diagnostic method complementary to US in the diagnosis of congenital lung lesions. Focal homogeneous pulmonary hyperintensity on T2-WI is a frequently observed pattern. Our purpose is to determine whether this finding is associated with a characteristic pulmonary lesion. Materials and methods: Between 01.01.00 and 31.12.07, a total of 50 prenatal MRI examinations were performed at our institution in fetuses with an echographic diagnosis of thoracic pathology, including 12 cases of suspected congenital pulmonary lesions. Prenatal images were correlated with the post-natal diagnosis. Results: In 12 cases, fetal MRI detected congenital pulmonary lesions. In 8 patients, typical signs (cystic lesions, septations, anomalous vasculature) clearly suggested a specific pathology. In 4 cases, MRI showed a focal homogeneous increase in the signal intensity (SI) on T2-WI of the pathologic lung relative to the normal one. The final diagnoses in these fetuses were congenital cystic adenomatoid malformation type III (1 patient), segmental emphysema (1 patient) and bronchial atresia (2 cases). In all 4 cases, a significant post-natal reduction in lesion size relative to the prenatal MRI studies was observed. Conclusion: Our study suggests that a focal increase in the SI of the lung on T2-WI is a nonspecific sign of congenital lung disease, present in different pathologies. Therefore, a prospective diagnosis is not possible.
Abstract:
This paper analyses the role of prosody in parenthetical insertions, a type of structure that is extremely common in both speech and writing. The materials under study come from a corpus of spontaneous speech acts in Central Catalan (with varying degrees of spontaneity), from which a corpus of oral parenthetical insertions has been compiled. The prototypical prosodic features of a parenthetical insertion in Catalan are: prosodic autonomy, limited extension, production between pauses or before a final pause, a tendency towards acceleration, a fall in intensity, a lower pitch range and, finally, a falling or rising melodic pattern. While the final fall is the most frequent pattern in spontaneous conversations with a high degree of confidence between interlocutors, a final rising structure is found in interviews in which the degree of confidence between participants is lower, their roles are unequal, and the interviewee constructs a narrative discourse. We thus suggest that the pitch contour of parenthetical insertions is related to formality and discourse typology (in this case, narrative vs. dialogue). Bearing in mind the discursive functions performed by these insertions, we propose a typology which classifies them with regard to two main functions: completion of information and modalisation.
Abstract:
In this paper, moving flock patterns are mined from spatio-temporal datasets by incorporating a clustering algorithm. A flock is defined as a set of data objects that move together for a certain continuous amount of time. Mining moving flock patterns with clustering algorithms is a promising way to discover frequent patterns of movement in large trajectory datasets. In this approach, SPatial clusteRing algoRithm thrOugh sWarm intelligence (SPARROW) is the clustering algorithm used. The advantage of the SPARROW algorithm is that it can effectively discover clusters of widely varying sizes and shapes in large databases. Variations of the proposed method are discussed, and the experimental results show that the problems of scalability and duplicate pattern formation are addressed. The method also reduces the number of patterns produced.
Abstract:
Frequent pattern discovery in structured data is receiving increasing attention in many scientific application areas. However, the computational complexity and the large amount of data to be explored often make sequential algorithms unsuitable. In this context, high-performance distributed computing becomes a very interesting and promising approach. In this paper we present a parallel formulation of the frequent subgraph mining problem to discover interesting patterns in molecular compounds. The application is characterized by a highly irregular tree-structured computation. No estimate is available for task workloads, which follow a power-law distribution over a wide range. The proposed approach allows dynamic resource aggregation and provides fault and latency tolerance. These features make the distributed application suitable for multi-domain heterogeneous environments, such as computational Grids. The distributed application has been evaluated on the well-known National Cancer Institute HIV-screening dataset.
Abstract:
Traditional dictionary learning algorithms find a sparse representation of high-dimensional data by transforming samples into one-dimensional (1D) vectors. This 1D model loses the inherent spatial structure of the data. An alternative solution is to employ tensor decomposition for dictionary learning on the data in its original structural form (a tensor) by learning multiple dictionaries, one along each mode, and the corresponding sparse representation with respect to the Kronecker product of these dictionaries. To learn tensor dictionaries along each mode, all existing methods update each dictionary iteratively in an alternating manner. Although atoms from each mode dictionary jointly contribute to the sparsity of the tensor, existing works ignore atom correlations between different mode dictionaries by treating each mode dictionary independently. In this paper, we propose a joint multiple dictionary learning method for tensor sparse coding, which explores atom correlations for sparse representation and updates multiple atoms from each mode dictionary simultaneously. In this algorithm, the Frequent-Pattern Tree (FP-tree) mining algorithm is employed to exploit frequent atom patterns in the sparse representation. Inspired by the idea of K-SVD, we develop a new dictionary update method that jointly updates the elements in each pattern. Experimental results demonstrate that our method outperforms other tensor-based dictionary learning algorithms.
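To make the notion of a "frequent atom pattern" concrete, here is a minimal sketch under stated assumptions. It treats the indices of the nonzero coefficients in each sample's sparse code as a transaction and counts which groups of atoms are used together in at least `min_count` samples. For brevity it enumerates combinations directly instead of building an FP-tree, so it is a plain stand-in for the mining step, not the paper's implementation; the names `frequent_atom_patterns`, `min_count`, and `max_size` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_atom_patterns(supports, min_count, max_size=3):
    """Find groups of dictionary atoms that are frequently active together.

    supports: iterable of iterables; each inner iterable lists the indices
              of the atoms with nonzero coefficients for one sample.
    min_count: minimum number of samples in which a group must co-occur.
    max_size: largest group size to consider (brute-force enumeration, so
              keep this small; an FP-tree would avoid the enumeration).
    Returns a dict mapping sorted atom-index tuples to their counts.
    """
    counts = Counter()
    for atoms in supports:
        items = sorted(set(atoms))  # canonical order, duplicates removed
        for k in range(2, max_size + 1):
            for combo in combinations(items, k):
                counts[combo] += 1
    return {p: c for p, c in counts.items() if c >= min_count}


# Nonzero atom indices for four hypothetical samples.
codes = [[0, 3, 5], [0, 3], [3, 5, 7], [0, 3, 5]]
patterns = frequent_atom_patterns(codes, min_count=3)
```

With this input, atoms 0 and 3 co-occur in three samples, as do atoms 3 and 5, so those two pairs would be candidates for a joint update; the triple (0, 3, 5) appears only twice and is filtered out.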
Abstract:
Final project of the Integrated Master's Degree in Medicine, Faculdade de Medicina, Universidade de Lisboa, 2014
Abstract:
Many systems and applications continuously produce events. These events record the status of the system and trace its behavior. By examining them, system administrators can check for potential problems. If the temporal dynamics of the systems are investigated further, the underlying patterns can be discovered. The uncovered knowledge can be leveraged to predict future system behavior or to mitigate potential risks. Moreover, system administrators can use the temporal patterns to set up event management rules that make the system more intelligent. With the popularity of data mining techniques in recent years, these events have gradually become more and more useful. Despite recent advances in data mining techniques, their application to system event mining is still at a rudimentary stage. Most works still focus on episode mining or frequent pattern discovery. These methods are unable to provide a brief yet comprehensible summary that reveals the valuable information from a high-level perspective. Moreover, they provide little actionable knowledge to help system administrators better manage the systems. To make better use of the recorded events, more practical techniques are required. From the perspective of data mining, three correlated directions are considered helpful for system management: (1) provide concise yet comprehensive summaries of the running status of the systems; (2) make the systems more intelligent and autonomous; (3) effectively detect abnormal system behavior. Owing to the richness of the event logs, all these directions can be addressed in a data-driven manner. In this way, the robustness of the systems can be enhanced and the goal of autonomous management can be approached.
This dissertation focuses on the foregoing directions, which leverage temporal mining techniques to facilitate system management. More specifically, three concrete topics are discussed: event summarization, resource demand prediction, and streaming anomaly detection. Besides the theoretical contributions, an experimental evaluation is presented to demonstrate the effectiveness and efficacy of the corresponding solutions.
Abstract:
Gas-liquid two-phase flow is very common in industrial applications, especially in the oil and gas, chemical, and nuclear industries. As operating conditions change, such as the flow rates of the phases, the pipe diameter and the physical properties of the fluids, different configurations called flow patterns take place. In oil production, the most frequent pattern is slug flow, in which continuous liquid plugs (liquid slugs) and gas-dominated regions (elongated bubbles) alternate. Offshore scenarios in which the pipe lies on the seabed with slight changes of direction are extremely common. With those scenarios and issues in mind, this work presents an experimental study of two-phase gas-liquid slug flows in a duct with a slight change of direction, represented by a horizontal section followed by a downward-sloping pipe stretch. The experiments were carried out at NUEM (Núcleo de Escoamentos Multifásicos UTFPR). The flow was initiated and developed under controlled conditions, and its characteristic parameters were measured with resistive sensors installed at four pipe sections. Two high-speed cameras were also used. From the measured results, the influence of a slight direction change on the slug flow structures and on the transition between slug flow and stratified flow in the downward section was evaluated.
Abstract:
Although cartilaginous tumors have low microvascular density, vessels are important for supplying the nutrition that allows the tumor to grow and metastasize. The aim of this study was to assess the value of the vascular pattern classification as a prognostic tool in chondrosarcomas (CSs) and its relation to vascular endothelial growth factor (VEGF) expression. This was a retrospective study of 21 enchondromas and 57 conventional CSs. Clinical data and outcomes were retrieved from medical files. CS histologic grades (on a scale of 1 to 3) were determined according to the World Health Organization classification. The vascular pattern (on a scale of A to C) was assessed through CD34, according to Kalinski. CD105 and VEGF were also evaluated. Poor outcome was significantly associated with vascular pattern groups B and C. Higher vascular patterns were 6.5 times more frequent in moderate-grade and high-grade CSs than in grade 1 CSs. On multivariate analysis, a clear correlation was found between VEGF overexpression and B/C vascular patterns. Only 18 tumors (benign and malignant) stained for CD105. The results support the use of the vascular pattern classification as a prognostic tool in CSs and as a means of differentiating low-grade from moderate-grade/high-grade CSs. The vascular pattern might also be used to complement histologic grade, VEGF immunostaining, and microvascular density in indicating a patient's prognosis. Low-grade CSs develop under low neoangiogenesis, which is consistent with the slow growth rate of these tumors.
Abstract:
Human respiratory syncytial virus (HRSV) is the major cause of lower respiratory tract infections in children under 5 years of age and in the elderly, causing annual disease outbreaks during the fall and winter. Multiple lineages of the HRSVA and HRSVB serotypes co-circulate within a single outbreak and display a strongly temporal pattern of genetic variation, with a replacement of dominant genotypes occurring in consecutive years. In the present study we used phylogenetic methods to detect and map sites subject to adaptive evolution in the G protein of HRSVA and HRSVB. A total of 29 and 23 amino acid sites were found to be putatively positively selected in HRSVA and HRSVB, respectively. Several of these sites defined genotypes and lineages within genotypes in both groups, and correlated well with epitopes previously described in group A. Remarkably, 18 of these positively selected sites tended to revert over time to a previous codon state, producing a "flip-flop" phylogenetic pattern. Such frequent evolutionary reversals in HRSV are indicative of a combination of frequent positive selection, reflecting the changing immune status of the human population, and a limited repertoire of functionally viable amino acids at specific sites.
Abstract:
One hundred forty-two women with polycystic ovary syndrome (PCOS), with an average body mass index (BMI) of 29.1 kg/m² and an average age of 25.12 years, were studied. By BMI, 30.2% were normal, 38.0% were overweight and 31.6% were obese. Thirty-one eumenorrheic women matched for BMI and age, with no evidence of hyperandrogenism, were recruited as controls. The incidence of dyslipidemia in the PCOS group was more than twice that in the control group (76.1% versus 32.25%). The most frequent abnormalities were low high-density lipoprotein cholesterol (HDL-C; 57.6%) and high triglycerides (TG; 28.3%). HDL-C was significantly lower in all subgroups of women with PCOS than in the corresponding subgroups of normal women. No significant differences were seen in total cholesterol (p = 0.307), low-density lipoprotein cholesterol (LDL-C; p = 0.283) or TG (p = 0.113) levels among the subgroups. An independent effect was detected on HDL-C for glucose (p = 0.004) and fasting insulin (p = 0.01); on TG for age (p = 0.003) and homeostatic model assessment of insulin resistance (p = 0.03); and on total cholesterol and LDL-C for age (p = 0.02 and p = 0.033, respectively). In conclusion, dyslipidemia is common in women with PCOS, mainly due to low HDL-C levels. BMI has a significant impact on this abnormality.