17 resultados para Clustering algorithm
em Scielo Saúde Pública - SP
Management zones using fuzzy clustering based on spatial-temporal variability of soil and corn yield
Resumo:
Clustering soil and crop data can be used as a basis for the definition of management zones because the data are grouped into clusters based on the similar interaction of these variables. Therefore, the objective of this study was to identify management zones using fuzzy c-means clustering analysis based on the spatial and temporal variability of soil attributes and corn yield. The study site (18 by 250-m in size) was located in Jaboticabal, São Paulo/Brazil. Corn yield was measured in one hundred 4.5 by 10-m cells along four parallel transects (25 observations per transect) over five growing seasons between 2001 and 2010. Soil chemical and physical attributes were measured. SAS procedure MIXED was used to identify which variable(s) most influenced the spatial variability of corn yield over the five study years. Basis saturation (BS) was the variable that better related to corn yield, thus, semivariograms models were fitted for BS and corn yield and then, data values were krigged. Management Zone Analyst software was used to carry out the fuzzy c-means clustering algorithm. The optimum number of management zones can change over time, as well as the degree of agreement between the BS and corn yield management zone maps. Thus, it is very important take into account the temporal variability of crop yield and soil attributes to delineate management zones accurately.
Resumo:
OBJECTIVE: To estimate the incidence rate of type 1 diabetes in the urban area of Santiago, Chile, from March 21, 1997 to March 20, 1998, and to assess the spatio-temporal clustering of cases during that period. METHODS: All sixty-one incident cases were located temporally (day of diagnosis) and spatially (place of residence) in the area of study. Knox's method was used to assess spatio-temporal clustering of incident cases. RESULTS: The overall incidence rate of type 1 diabetes was 4.11 cases per 100,000 children aged less than 15 years per year (95% confidence interval: 3.06--5.14). The incidence rate seems to have increased since the last estimate of the incidence calculated for the years 1986--1992 in the metropolitan region of Santiago. Different combinations of space-time intervals have been evaluated to assess spatio-temporal clustering. The smallest p-value was found for the combination of critical distances of 750 meters and 60 days (uncorrected p-value = 0.048). CONCLUSIONS: Although these are preliminary results regarding space-time clustering in Santiago, exploratory analysis of the data method would suggest a possible aggregation of incident cases in space-time coordinates.
Resumo:
Classical serological screening assays for Chagas' disease are time consuming and subjective. The objective of the present work is to evaluate the enzyme immuno-assay (ELISA) methodology and to propose an algorithm for blood banks to be applied to Chagas' disease. Seven thousand, nine hundred and ninety nine blood donor samples were screened by both reverse passive hemagglutination (RPHA) and indirect immunofluorescence assay (IFA). Samples reactive on RPHA and/or IFA were submitted to supplementary RPHA, IFA and complement fixation (CFA) tests. This strategy allowed us to create a panel of 60 samples to evaluate the ELISA methodology from 3 different manufacturers. The sensitivity of the screening by IFA and the 3 different ELISA's was 100%. The specificity was better on ELISA methodology. For Chagas disease, ELISA seems to be the best test for blood donor screening, because it showed high sensitivity and specificity, it is not subjective and can be automated. Therefore, it was possible to propose an algorithm to screen samples and confirm donor results at the blood bank.
Resumo:
ABSTRACTThe Amazon várzeas are an important component of the Amazon biome, but anthropic and climatic impacts have been leading to forest loss and interruption of essential ecosystem functions and services. The objectives of this study were to evaluate the capability of the Landsat-based Detection of Trends in Disturbance and Recovery (LandTrendr) algorithm to characterize changes in várzeaforest cover in the Lower Amazon, and to analyze the potential of spectral and temporal attributes to classify forest loss as either natural or anthropogenic. We used a time series of 37 Landsat TM and ETM+ images acquired between 1984 and 2009. We used the LandTrendr algorithm to detect forest cover change and the attributes of "start year", "magnitude", and "duration" of the changes, as well as "NDVI at the end of series". Detection was restricted to areas identified as having forest cover at the start and/or end of the time series. We used the Support Vector Machine (SVM) algorithm to classify the extracted attributes, differentiating between anthropogenic and natural forest loss. Detection reliability was consistently high for change events along the Amazon River channel, but variable for changes within the floodplain. Spectral-temporal trajectories faithfully represented the nature of changes in floodplain forest cover, corroborating field observations. We estimated anthropogenic forest losses to be larger (1.071 ha) than natural losses (884 ha), with a global classification accuracy of 94%. We conclude that the LandTrendr algorithm is a reliable tool for studies of forest dynamics throughout the floodplain.
Resumo:
OBJECTIVE - A population-based prospective study was analysed to: a) determine the prevalence of hypertension; b) investigate the clustering of other cardiovascular risk factors and c) verify whether older differed from younger adults in the pattern of clustering. METHODS - The data comprised a representative sample of the population of Bambuí, Brazil. Multiple logistic regression was used to investigate the independent association between hypertension and selected factors. RESULTS - A total of 820 younger adults (82.5%) and 1494 older adults (85.9%) participated in this study. The overall prevalence of hypertension was 24.8% (SE=1.4 %), being higher in women (26.9±1.5%) than in men (22.0± 1.7%) (p=0.033). Hypertension was positively and significantly associated with physical inactivity, overweight, hypercholesterolemia hyperglycemia and hypertriglyceridemia. The coexistence of hypertension with 4 or more of these risk factors occurred 6 times more than expected by chance, after adjusting for age and sex (OR=6.3; 95%CI: 3.4-11.9). The pattern of risk factor clustering in hypertensive individuals differed with age. CONCLUSION - Our results reinforce the need to increase detection and treatment of hypertension and to approach patients' global risk profiles.
Resumo:
Background:Vascular remodeling, the dynamic dimensional change in face of stress, can assume different directions as well as magnitudes in atherosclerotic disease. Classical measurements rely on reference to segments at a distance, risking inappropriate comparison between dislike vessel portions.Objective:to explore a new method for quantifying vessel remodeling, based on the comparison between a given target segment and its inferred normal dimensions.Methods:Geometric parameters and plaque composition were determined in 67 patients using three-vessel intravascular ultrasound with virtual histology (IVUS-VH). Coronary vessel remodeling at cross-section (n = 27.639) and lesion (n = 618) levels was assessed using classical metrics and a novel analytic algorithm based on the fractional vessel remodeling index (FVRI), which quantifies the total change in arterial wall dimensions related to the estimated normal dimension of the vessel. A prediction model was built to estimate the normal dimension of the vessel for calculation of FVRI.Results:According to the new algorithm, “Ectatic” remodeling pattern was least common, “Complete compensatory” remodeling was present in approximately half of the instances, and “Negative” and “Incomplete compensatory” remodeling types were detected in the remaining. Compared to a traditional diagnostic scheme, FVRI-based classification seemed to better discriminate plaque composition by IVUS-VH.Conclusion:Quantitative assessment of coronary remodeling using target segment dimensions offers a promising approach to evaluate the vessel response to plaque growth/regression.
Resumo:
The study of the Schistosoma mansoni genome, one of the etiologic agents of human schistosomiasis, is essential for a better understanding of the biology and development of this parasite. In order to get an overview of all S. mansoni catalogued gene sequences, we performed a clustering analysis of the parasite mRNA sequences available in public databases. This was made using softwares PHRAP and CAP3. The consensus sequences, generated after the alignment of cluster constituent sequences, allowed the identification by database homology searches of the most expressed genes in the worm. We analyzed these genes and looked for a correlation between their high expression and parasite metabolism and biology. We observed that the majority of these genes is related to the maintenance of basic cell functions, encoding genes whose products are related to the cytoskeleton, intracellular transport and energy metabolism. Evidences are presented here that genes for aerobic energy metabolism are expressed in all the developmental stages analyzed. Some of the most expressed genes could not be identified by homology searches and may have some specific functions in the parasite.
Genetic diversity between improved banana diploids using canonical variables and the Ward-MLM method
Resumo:
The objective of this work was to estimate the genetic diversity of improved banana diploids using data from quantitative analysis and from simple sequence repeats (SSR) marker, simultaneously. The experiment was carried out with 33 diploids, in an augmented block design with 30 regular treatments and three common ones. Eighteen agronomic characteristics and 20 SSR primers were used. The agronomic characteristics and the SSR were analyzed simultaneously by the Ward-MLM, cluster, and IML procedures. The Ward clustering method considered the combined matrix obtained by the Gower algorithm. The Ward-MLM procedure identified three ideal groups (G1, G2, and G3) based on pseudo-F and pseudo-t² statistics. The dendrogram showed relative similarity between the G1 genotypes, justified by genealogy. In G2, 'Calcutta 4' appears in 62% of the genealogies. Similar behavior was observed in G3, in which the 028003-01 diploid is the male parent of the 086079-10 and 042079-06 genotypes. The method with canonical variables had greater discriminatory power than Ward-MLM. Although reduced, the genetic variability available is sufficient to be used in the development of new hybrids.
Resumo:
The objective of this work was to propose a way of using the Tocher's method of clustering to obtain a matrix similar to the cophenetic one obtained for hierarchical methods, which would allow the calculation of a cophenetic correlation. To illustrate the obtention of the proposed cophenetic matrix, we used two dissimilarity matrices - one obtained with the generalized squared Mahalanobis distance and the other with the Euclidean distance - between 17 garlic cultivars, based on six morphological characters. Basically, the proposal for obtaining the cophenetic matrix was to use the average distances within and between clusters, after performing the clustering. A function in R language was proposed to compute the cophenetic matrix for Tocher's method. The empirical distribution of this correlation coefficient was briefly studied. For both dissimilarity measures, the values of cophenetic correlation obtained for the Tocher's method were higher than those obtained with the hierarchical methods (Ward's algorithm and average linkage - UPGMA). Comparisons between the clustering made with the agglomerative hierarchical methods and with the Tocher's method can be performed using a criterion in common: the correlation between matrices of original and cophenetic distances.
Resumo:
It is presented a software developed with Delphi programming language to compute the reservoir's annual regulated active storage, based on the sequent-peak algorithm. Mathematical models used for that purpose generally require extended hydrological series. Usually, the analysis of those series is performed with spreadsheets or graphical representations. Based on that, it was developed a software for calculation of reservoir active capacity. An example calculation is shown by 30-years (from 1977 to 2009) monthly mean flow historical data, from Corrente River, located at São Francisco River Basin, Brazil. As an additional tool, an interface was developed to manage water resources, helping to manipulate data and to point out information that it would be of interest to the user. Moreover, with that interface irrigation districts where water consumption is higher can be analyzed as a function of specific seasonal water demands situations. From a practical application, it is possible to conclude that the program provides the calculation originally proposed. It was designed to keep information organized and retrievable at any time, and to show simulation on seasonal water demands throughout the year, contributing with the elements of study concerning reservoir projects. This program, with its functionality, is an important tool for decision making in the water resources management.
Resumo:
Several equipments and methodologies have been developed to make available precision agriculture, especially considering the high cost of its implantation and sampling. An interesting possibility is to define management zones aim at dividing producing areas in smaller management zones that could be treated differently, serving as a source of recommendation and analysis. Thus, this trial used physical and chemical properties of soil and yield aiming at the generation of management zones in order to identify whether they can be used as recommendation and analysis. Management zones were generated by the Fuzzy C-Means algorithm and their evaluation was performed by calculating the reduction of variance and performing means tests. The division of the area into two management zones was considered appropriate for the present distinct averages of most soil properties and yield. The used methodology allowed the generation of management zones that can serve as source of recommendation and soil analysis; despite the relative efficiency has shown a reduced variance for all attributes in divisions in the three sub-regions, the ANOVA did not show significative differences among the management zones.
Resumo:
The determination of the intersection curve between Bézier Surfaces may be seen as the composition of two separated problems: determining initial points and tracing the intersection curve from these points. The Bézier Surface is represented by a parametric function (polynomial with two variables) that maps a point in the tridimensional space from the bidimensional parametric space. In this article, it is proposed an algorithm to determine the initial points of the intersection curve of Bézier Surfaces, based on the solution of polynomial systems with the Projected Polyhedral Method, followed by a method for tracing the intersection curves (Marching Method with differential equations). In order to allow the use of the Projected Polyhedral Method, the equations of the system must be represented in terms of the Bernstein basis, and towards this goal it is proposed a robust and reliable algorithm to exactly transform a multivariable polynomial in terms of power basis to a polynomial written in terms of Bernstein basis .
Resumo:
In this paper we present an algorithm for the numerical simulation of the cavitation in the hydrodynamic lubrication of journal bearings. Despite the fact that this physical process is usually modelled as a free boundary problem, we adopted the equivalent variational inequality formulation. We propose a two-level iterative algorithm, where the outer iteration is associated to the penalty method, used to transform the variational inequality into a variational equation, and the inner iteration is associated to the conjugate gradient method, used to solve the linear system generated by applying the finite element method to the variational equation. This inner part was implemented using the element by element strategy, which is easily parallelized. We analyse the behavior of two physical parameters and discuss some numerical results. Also, we analyse some results related to the performance of a parallel implementation of the algorithm.
Resumo:
Previous genetic association studies have overlooked the potential for biased results when analyzing different population structures in ethnically diverse populations. The purpose of the present study was to quantify this bias in two-locus association studies conducted on an admixtured urban population. We studied the genetic structure distribution of angiotensin-converting enzyme insertion/deletion (ACE I/D) and angiotensinogen methionine/threonine (M/T) polymorphisms in 382 subjects from three subgroups in a highly admixtured urban population. Group I included 150 white subjects; group II, 142 mulatto subjects, and group III, 90 black subjects. We conducted sample size simulation studies using these data in different genetic models of gene action and interaction and used genetic distance calculation algorithms to help determine the population structure for the studied loci. Our results showed a statistically different population structure distribution of both ACE I/D (P = 0.02, OR = 1.56, 95% CI = 1.05-2.33 for the D allele, white versus black subgroup) and angiotensinogen M/T polymorphism (P = 0.007, OR = 1.71, 95% CI = 1.14-2.58 for the T allele, white versus black subgroup). Different sample sizes are predicted to be determinant of the power to detect a given genotypic association with a particular phenotype when conducting two-locus association studies in admixtured populations. In addition, the postulated genetic model is also a major determinant of the power to detect any association in a given sample size. The present simulation study helped to demonstrate the complex interrelation among ethnicity, power of the association, and the postulated genetic model of action of a particular allele in the context of clustering studies. This information is essential for the correct planning and interpretation of future association studies conducted on this population.
Resumo:
Verbal fluency tests are used as a measure of executive functions and language, and can also be used to evaluate semantic memory. We analyzed the influence of education, gender and age on scores in a verbal fluency test using the animal category, and on number of categories, clustering and switching. We examined 257 healthy participants (152 females and 105 males) with a mean age of 49.42 years (SD = 15.75) and having a mean educational level of 5.58 (SD = 4.25) years. We asked them to name as many animals as they could. Analysis of variance was performed to determine the effect of demographic variables. No significant effect of gender was observed for any of the measures. However, age seemed to influence the number of category changes, as expected for a sensitive frontal measure, after being controlled for the effect of education. Educational level had a statistically significant effect on all measures, except for clustering. Subject performance (mean number of animals named) according to schooling was: illiterates, 12.1; 1 to 4 years, 12.3; 5 to 8 years, 14.0; 9 to 11 years, 16.7, and more than 11 years, 17.8. We observed a decrease in performance in these five educational groups over time (more items recalled during the first 15 s, followed by a progressive reduction until the fourth interval). We conclude that education had the greatest effect on the category fluency test in this Brazilian sample. Therefore, we must take care in evaluating performance in lower educational subjects.