822 results for distance-metrics
Abstract:
Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)
Abstract:
Issues related to association mining have received attention, especially those aimed at discovering and facilitating the search for interesting patterns. A promising approach in this context is the application of clustering in the pre-processing step. In this paper, eleven metrics are proposed to provide an assessment procedure that supports the evaluation of this kind of approach. To propose the metrics, a subjective evaluation was carried out. The metrics are important since they provide criteria to: (a) analyze the methodologies, (b) identify their positive and negative aspects, (c) carry out comparisons among them and, therefore, (d) help users select the most suitable solution for their problems. In addition, the metrics make users think about aspects related to the problems and provide a flexible way to solve them. Some experiments were conducted to show how the metrics can be used and to demonstrate their usefulness.
Abstract:
Research on image processing has shown that combining segmentation methods may lead to a solid approach for extracting semantic information from different sorts of images. Within this context, the Normalized Cut (NCut) is usually used as a final partitioning tool for graphs modeled with some chosen method. This work explores the Watershed Transform as a modeling tool, using different criteria of the hierarchical Watershed to convert an image into an adjacency graph. The Watershed is combined with an unsupervised distance learning step that redistributes the graph weights and redefines the similarity matrix before the final segmentation step using NCut. Using the Berkeley Segmentation Data Set and Benchmark as a reference, our goal is to compare the results obtained with this method against previous work to validate its performance.
Abstract:
It is known that the short distance QCD contribution to the mass difference of pions is quadratic in the quark masses and irrelevant with respect to the long distance part. It is also considered in the literature that its calculation contains infinities, which should be absorbed by the quark mass renormalization. Following a prescription by Craigie, Narison, and Riazuddin of a renormalization-group-improved perturbation theory to deal with the electromagnetic mass shift problem in QCD, we show that the short distance QCD contribution to the electroweak pion mass difference (with m_u = m_d ≠ 0) is finite and, of course, its value is negligible compared to other contributions.
Abstract:
Crops close to small water bodies may exhibit changes in yield if the water mass causes significant changes in the microclimate of areas near the reservoir shoreline. The scientific literature describes this effect as occurring gradually, with higher intensity in the sites near the shoreline and decreasing intensity with distance from the reservoir. Experiments with two soybean cultivars were conducted during four crop seasons to evaluate soybean yield in relation to distance from the Itaipu reservoir and determine the effect of air temperature and water availability on soybean crop yield. Fifteen experimental sites were distributed in three transects perpendicular to the Itaipu reservoir, covering an area at approximately 10 km from the shoreline. The yield gradient between the site closest to the reservoir and the sites farther away in each transect did not show a consistent trend, but varied as a function of distance, crop season, and cultivar. This finding indicates that the Itaipu reservoir does not affect the yield of soybean plants grown within approximately 10 km from the shoreline. In addition, the variation in yield among the experimental sites was not attributed to thermal conditions because the temperature was similar within transects. However, the crop water availability was responsible for higher differences in yield among the neighboring experimental sites related to water stress caused by spatial variability in rainfall, especially during the soybean reproductive period in January and February.
Abstract:
Anaerobic efforts are commonly required in the repeated sprints of many sports, making the anaerobic pathway a target of training. Nevertheless, identifying improvements in this energy pathway requires assessing anaerobic capacity or power, which is usually complex. For this purpose, authors have proposed using short running performances to assess anaerobic ability. Thus, the aim of this study was to find relationships between running performances and anaerobic power, anaerobic capacity, or repeated sprint ability (RSA). Methods Thirteen military personnel performed maximal runs of 50 (P50), 100 (P100) and 300 (P300) m on a track, in addition to the running-based anaerobic sprint test (RAST; RSA and anaerobic power test), the maximal anaerobic running test (MART; RSA and anaerobic capacity test) and the W′ from the critical power model (anaerobic capacity test). Results Among the RAST variables, peak and average power (absolute and relative) and maximum velocity were significantly correlated with P50 (r = −0.68, p = 0.03 and −0.76, p = 0.01; −0.83, p < 0.01 and −0.83, p < 0.01; and −0.78, p < 0.01, respectively). The maximum intensity of MART was negatively and significantly correlated with P100 (r = −0.59), and W′ was not statistically correlated with any of the performances. Conclusion MART and W′ were not correlated with short running performances, showing weak predictive ability, probably because of their longer duration relative to the assessed performances. Based on the RAST outcomes, we propose that this protocol can be used during daily training as a predictor of short running performance.
Abstract:
Background and aims South America and Oceania possess numerous floristic similarities, often confirmed by morphological and molecular data. The carnivorous Drosera meristocaulis (Droseraceae), endemic to the Neblina highlands of northern South America, was known to share morphological characters with the pygmy sundews of Drosera sect. Bryastrum, which are endemic to Australia and New Zealand. The inclusion of D. meristocaulis in a molecular phylogenetic analysis may clarify its systematic position and offer an opportunity to investigate character evolution in Droseraceae and phylogeographic patterns between South America and Oceania. Methods Drosera meristocaulis was included in a molecular phylogenetic analysis of Droseraceae, using nuclear internal transcribed spacer (ITS) and plastid rbcL and rps16 sequence data. Pollen of D. meristocaulis was studied using light microscopy and scanning electron microscopy techniques, and the karyotype was inferred from root tip meristem. Key Results The phylogenetic inferences (maximum parsimony, maximum likelihood and Bayesian approaches) substantiate with high statistical support the inclusion of sect. Meristocaulis and its single species, D. meristocaulis, within the Australian Drosera clade, sister to a group comprising species of sect. Bryastrum. A chromosome number of 2n = approx. 32–36 supports the phylogenetic position within the Australian clade. The undivided styles, conspicuous large setuous stipules, a cryptocotylar (hypogaeous) germination pattern and pollen tetrads with aperture of intermediate type 7–8 are key morphological traits shared between D. meristocaulis and pygmy sundews of sect. Bryastrum from Australia and New Zealand. Conclusions The multidisciplinary approach adopted in this study (using morphological, palynological, cytotaxonomic and molecular phylogenetic data) enabled us to elucidate the relationships of the thus far unplaced taxon D. meristocaulis. 
Long-distance dispersal between southwestern Oceania and northern South America is the most likely scenario to explain the phylogeographic pattern revealed.
Abstract:
This study investigated the availability and use of audiovisual and electronic resources by distance learning students at the National Open University of Nigeria (NOUN). A questionnaire was administered to the distance learning students selected across the various departments of the NOUN. The findings revealed that even though NOUN made provision for audiovisual and electronic resources for students' use, a majority of the audiovisual and electronic resources are available through personal provision by the students. The study also revealed regular use of audiovisual and electronic resources by the distance learning students. Constraints on use include poor power supply, poor infrastructure, lack of adequate skill, and high cost of access.
Abstract:
We develop spatial statistical models for stream networks that can estimate relationships between a response variable and other covariates, make predictions at unsampled locations, and predict an average or total for a stream or a stream segment. There have been very few attempts to develop valid spatial covariance models that incorporate flow, stream distance, or both. The application of typical spatial autocovariance functions based on Euclidean distance, such as the spherical covariance model, is not valid when using stream distance. In this paper we develop a large class of valid models that incorporate flow and stream distance by using spatial moving averages. These methods integrate a moving average function, or kernel, against a white noise process. By running the moving average function upstream from a location, we develop models that use flow, and by construction they are valid models based on stream distance. We show that with proper weighting, many of the usual spatial models based on Euclidean distance have a counterpart for stream networks. Using sulfate concentrations from an example data set, the Maryland Biological Stream Survey (MBSS), we show that models using flow may be more appropriate than models that only use stream distance. For the MBSS data set, we use restricted maximum likelihood to fit a valid covariance matrix that uses flow and stream distance, and then we use this covariance matrix to estimate fixed effects and make kriging and block kriging predictions.
Abstract:
1. Distance sampling is a widely used technique for estimating the size or density of biological populations. Many distance sampling designs and most analyses use the software Distance. 2. We briefly review distance sampling and its assumptions, outline the history, structure and capabilities of Distance, and provide hints on its use. 3. Good survey design is a crucial prerequisite for obtaining reliable results. Distance has a survey design engine, with a built-in geographic information system, that allows properties of different proposed designs to be examined via simulation, and survey plans to be generated. 4. A first step in analysis of distance sampling data is modeling the probability of detection. Distance contains three increasingly sophisticated analysis engines for this: conventional distance sampling, which models detection probability as a function of distance from the transect and assumes all objects at zero distance are detected; multiple-covariate distance sampling, which allows covariates in addition to distance; and mark–recapture distance sampling, which relaxes the assumption of certain detection at zero distance. 5. All three engines allow estimation of density or abundance, stratified if required, with associated measures of precision calculated either analytically or via the bootstrap. 6. Advanced analysis topics covered include the use of multipliers to allow analysis of indirect surveys (such as dung or nest surveys), the density surface modeling analysis engine for spatial and habitat-modeling, and information about accessing the analysis engines directly from other software. 7. Synthesis and applications. Distance sampling is a key method for producing abundance and density estimates in challenging field conditions. The theory underlying the methods continues to expand to cope with realistic estimation situations. 
In step with theoretical developments, we describe state-of-the-art software that implements these methods and makes them accessible to practicing ecologists.
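The conventional distance sampling engine described above models detection probability as a function of distance from the transect, with certain detection at zero distance. The following is a minimal sketch of that idea for a line transect, assuming a half-normal detection function with no truncation and no adjustment terms. It is not the Distance software's implementation, and the function names and the returned dictionary keys are hypothetical, introduced here only for illustration.

```python
import math


def half_normal_detection(x, sigma):
    """Detection probability at perpendicular distance x; equals 1 at x = 0."""
    return math.exp(-(x * x) / (2.0 * sigma * sigma))


def estimate_density(distances, line_length):
    """Estimate density from untruncated perpendicular distances.

    Assumes a half-normal detection function, for which the maximum
    likelihood estimate of sigma^2 is the mean of the squared distances.
    """
    n = len(distances)
    sigma = math.sqrt(sum(x * x for x in distances) / n)
    # Effective strip half-width: integral of the detection function over [0, inf).
    esw = sigma * math.sqrt(math.pi / 2.0)
    # Objects detected divided by the effectively surveyed area (both sides of the line).
    density = n / (2.0 * esw * line_length)
    return {"sigma": sigma, "esw": esw, "density": density}


if __name__ == "__main__":
    # Hypothetical perpendicular distances (m) along 1000 m of transect.
    observed = [2.1, 0.4, 5.3, 1.2, 3.8, 0.9, 7.5, 2.6]
    print(estimate_density(observed, line_length=1000.0))
```

The density is returned in objects per squared distance unit, so distances and line length must share the same unit. Real analyses typically also truncate distant observations and compare several candidate detection functions.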
Abstract:
We consider a fully model-based approach for the analysis of distance sampling data. Distance sampling has been widely used to estimate abundance (or density) of animals or plants in a spatially explicit study area. There is, however, no readily available method of making statistical inference on the relationships between abundance and environmental covariates. Spatial Poisson process likelihoods can be used to simultaneously estimate detection and intensity parameters by modeling distance sampling data as a thinned spatial point process. A model-based spatial approach to distance sampling data has three main benefits: it allows complex and opportunistic transect designs to be employed, it allows estimation of abundance in small subregions, and it provides a framework to assess the effects of habitat or experimental manipulation on density. We demonstrate the model-based methodology with a small simulation study and analysis of the Dubbo weed data set. In addition, a simple ad hoc method for handling overdispersion is also proposed. The simulation study showed that the model-based approach compared favorably to conventional distance sampling methods for abundance estimation. In addition, the overdispersion correction performed adequately when the number of transects was high. Analysis of the Dubbo data set indicated a transect effect on abundance via Akaike’s information criterion model selection. Further goodness-of-fit analysis, however, indicated some potential confounding of intensity with the detection function.
Abstract:
The aim of this study was to determine whether image artifacts caused by orthodontic metal accessories interfere with the accuracy of 3D CBCT model superimposition. A human dry skull was subjected three times to a CBCT scan: at first without orthodontic brackets (T1), then with stainless steel brackets bonded without (T2) and with orthodontic arch wires (T3) inserted into the brackets' slots. The registration of image surfaces and the superimposition of 3D models were performed. Within-subject surface distances between T1-T2, T1-T3 and T2-T3 were computed for comparison among the three data sets. The minimum and maximum Hausdorff Distance units (HDu) computed between the corresponding data points of the T1 and T2 CBCT 3D surface images were 0.000000 and 0.049280 HDu, respectively, and the mean distance was 0.002497 HDu. The minimum and maximum Hausdorff Distances between T1 and T3 were 0.000000 and 0.047440 HDu, respectively, with a mean distance of 0.002585 HDu. In the comparison between T2 and T3, the minimum, maximum and mean Hausdorff Distances were 0.000000, 0.025616 and 0.000347 HDu, respectively. In the current study, the image artifacts caused by metal orthodontic accessories did not compromise the accuracy of the 3D model superimposition. Color-coded maps of overlaid structures complemented the computed Hausdorff Distances and demonstrated a precise fusion between the data sets.
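To make the reported minimum, maximum and mean surface distances concrete, here is a small sketch of how nearest-point distances and the Hausdorff distance between two point sets can be computed. It assumes each surface is given as a plain list of 3D points and uses a brute-force search. The study's actual software, registration step and distance units are not specified here, and the function names are hypothetical.

```python
import math


def nearest_distances(a, b):
    """For every point in a, the distance to its closest point in b."""
    return [min(math.dist(p, q) for q in b) for p in a]


def hausdorff(a, b):
    """Symmetric Hausdorff distance: the larger of the two directed distances."""
    return max(max(nearest_distances(a, b)), max(nearest_distances(b, a)))


def summarize(a, b):
    """Minimum, maximum and mean of the nearest-point distances from a to b."""
    d = nearest_distances(a, b)
    return min(d), max(d), sum(d) / len(d)


if __name__ == "__main__":
    # Two tiny hypothetical surfaces; the second has one vertex shifted along z.
    t1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
    t2 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.1), (0.0, 1.0, 0.0)]
    print(summarize(t1, t2))
    print(hausdorff(t1, t2))
```

The brute-force search costs time proportional to the product of the two set sizes, so real surface meshes would normally use a spatial index such as a k-d tree.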
Abstract:
Methods from statistical physics, such as those involving complex networks, have been increasingly used in the quantitative analysis of linguistic phenomena. In this paper, we represented pieces of text with different levels of simplification in co-occurrence networks and found that topological regularity correlated negatively with textual complexity. Furthermore, in less complex texts the distance between concepts, represented as nodes, tended to decrease. The complex network metrics were treated with multivariate pattern recognition techniques, which allowed us to distinguish between original texts and their simplified versions. For each original text, two simplified versions were generated manually with an increasing number of simplification operations. As expected, distinction was easier for the strongly simplified versions, where the most relevant metrics were node strength, shortest paths and diversity. Also, the discrimination of complex texts was improved with higher hierarchical network metrics, thus pointing to the usefulness of considering wider contexts around the concepts. Though the accuracy rate in the distinction was not as high as in methods using deep linguistic knowledge, the complex network approach is still useful for a rapid screening of texts whenever assessing complexity is essential to guarantee accessibility to readers with limited reading ability. Copyright (c) EPLA, 2012
Abstract:
This work proposes a method for data clustering based on complex networks theory. A data set is represented as a network by considering different metrics to establish the connection between each pair of objects. The clusters are obtained by taking into account five community detection algorithms. The network-based clustering approach is applied to two real-world databases and two sets of artificially generated data. The obtained results suggest that the exponential of the Minkowski distance is the most suitable metric to quantify the similarities between pairs of objects. In addition, the community identification method based on greedy optimization provides the best cluster solution. We compare the network-based clustering approach with some traditional clustering algorithms and verify that it provides the lowest classification error rate. (C) 2012 Elsevier B.V. All rights reserved.
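As an illustration of turning a data set into a weighted network, the sketch below computes the Minkowski distance between objects and maps it to a similarity through a decaying exponential. The exact form used in the paper is not given in the abstract, so the mapping exp(-alpha * d), the parameter `alpha` and the function names are assumptions made for this example only.

```python
import math


def minkowski(x, y, p=2.0):
    """Minkowski distance of order p (p=1 Manhattan, p=2 Euclidean)."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def similarity(x, y, p=2.0, alpha=1.0):
    """Assumed similarity: decays exponentially with the Minkowski distance."""
    return math.exp(-alpha * minkowski(x, y, p))


def similarity_network(points, p=2.0, alpha=1.0):
    """Symmetric weight matrix of a fully connected network (no self-loops)."""
    n = len(points)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            w = similarity(points[i], points[j], p, alpha)
            weights[i][j] = weights[j][i] = w
    return weights


if __name__ == "__main__":
    # Hypothetical objects: two close together, one far away.
    data = [(0.0, 0.0), (0.1, 0.0), (3.0, 4.0)]
    for row in similarity_network(data):
        print(row)
```

A community detection algorithm would then be run on this weight matrix, so that strongly connected groups of objects are reported as clusters.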
Abstract:
Content-based image retrieval (CBIR) is still a challenging issue due to the inherent complexity of images and the choice of the most discriminant descriptors. Recent developments in the field have introduced multidimensional projections to boost accuracy in the retrieval process, but many issues, such as the introduction of pattern recognition tasks and deeper user intervention to assist the process of choosing the most discriminant features, still remain unaddressed. In this paper, we present a novel framework for CBIR that combines pattern recognition tasks, class-specific metrics, and multidimensional projection to devise an effective and interactive image retrieval system. User interaction plays an essential role in the computation of the final multidimensional projection from which image retrieval will be attained. Results have shown that the proposed approach outperforms existing methods, turning out to be a very attractive alternative for managing image data sets.