913 results for Nearest Neighbor


Relevance:

10.00%

Abstract:

The substitution of missing values, also called imputation, is an important data preparation task in many domains. Ideally, the substitution of missing values should not insert biases into the dataset. This aspect has usually been assessed by measures of the prediction capability of imputation methods. Such measures rely on simulating missing entries for some attributes whose values are actually known. These artificially missing values are imputed and then compared with the original values. Although this evaluation is useful, it does not allow the influence of imputed values on the ultimate modelling task (e.g. classification) to be inferred. We argue that imputation cannot be properly evaluated apart from the modelling task. Thus, alternative approaches are needed. This article elaborates on the influence of imputed values in classification. In particular, a practical procedure for estimating the inserted bias is described. As an additional contribution, we have used this procedure to empirically illustrate the performance of three imputation methods (majority, naive Bayes and Bayesian networks) on three datasets. Three classifiers (decision tree, naive Bayes and nearest neighbours) have been used as modelling tools in our experiments. The results illustrate a variety of situations that can arise in data preparation practice.
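The two kinds of evaluation contrasted in this abstract can be sketched as follows. This is a minimal illustration, not the authors' procedure: the toy dataset, the choice of hidden rows, the Hamming distance, and the leave-one-out 1-nearest-neighbour classifier are all assumptions made for the example. It compares the prediction capability of majority imputation with the classification accuracy before and after imputation.

```python
from collections import Counter

# Hypothetical toy data: (outlook, windy, class label); all values are known.
rows = [
    ("sunny", "no", "yes"),
    ("sunny", "no", "yes"),
    ("sunny", "yes", "no"),
    ("rainy", "yes", "no"),
    ("rainy", "no", "yes"),
    ("sunny", "yes", "no"),
]
hidden = {4, 5}  # rows whose first attribute is treated as artificially missing


def majority_impute(data, hidden, col=0):
    """Fill column `col` of the hidden rows with the most frequent observed value."""
    observed = [r[col] for i, r in enumerate(data) if i not in hidden]
    fill = Counter(observed).most_common(1)[0][0]
    return [r[:col] + (fill,) + r[col + 1:] if i in hidden else r
            for i, r in enumerate(data)]


def hamming(a, b):
    """Number of attribute positions where two rows differ."""
    return sum(x != y for x, y in zip(a, b))


def loo_1nn_accuracy(data):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier (last field is the label)."""
    hits = 0
    for i, row in enumerate(data):
        others = [r for j, r in enumerate(data) if j != i]
        nearest = min(others, key=lambda r: hamming(r[:-1], row[:-1]))
        hits += nearest[-1] == row[-1]
    return hits / len(data)


imputed = majority_impute(rows, hidden)
# Prediction capability: share of hidden values recovered exactly.
imputation_accuracy = sum(imputed[i][0] == rows[i][0] for i in hidden) / len(hidden)
# Influence on the modelling task: classifier accuracy with and without imputation.
acc_original = loo_1nn_accuracy(rows)
acc_imputed = loo_1nn_accuracy(imputed)
```

On this toy data only half of the hidden values are recovered, yet the classifier accuracy is the same before and after imputation, which illustrates why the two measures can disagree.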

Relevance:

10.00%

Abstract:

In this work the interaction of cyclopentene with a set of InP(001) surfaces is investigated by means of density functional theory. We propose a simple approach for evaluating the surface strain and, based on it, we have found a linear relation between the bond and strain energies and the adsorption energy. Our results also indicate that the higher the bond energy, the more dispersed the charge distribution around the adsorption site associated with the high occupied state, a key feature that characterizes the adsorption process. Different adsorption coverages are used to evaluate the proposed equation. Our results suggest that the proposed approach might be extended to other systems where the interaction between the semiconductor surface and the molecule is restricted to first-neighbor sites. (C) 2011 Elsevier B.V. All rights reserved.

Relevance:

10.00%

Abstract:

This work focuses on the theoretical investigation of the block copolymer poly[oxyoctyleneoxy-(2,6-dimethoxy-1,4-phenylene-1,2-ethinylene-phenanthrene-2,4-diyl)], named LaPPS19, recently proposed for optoelectronic applications. For this we used a variety of methods, from molecular mechanics to quantum semiempirical techniques (AM1, ZINDO/S-CIS). Our results show that, as expected, isolated LaPPS19 chains present relevant electron localization over the phenanthrene group. We found, however, that LaPPS19 could assemble in a pi-stacked form, leading to strong interchain interaction. The stacking induces electronic delocalization between neighboring chains and introduces new states below the phenanthrene-related absorption. These results allowed us to associate the red shift of the absorption edge, seen in the experimental results, with spontaneous pi-stack aggregation of the chains. (C) 2009 Wiley Periodicals, Inc. Int J Quantum Chem 110: 885-892, 2010

Relevance:

10.00%

Abstract:

Structured meaning-signal mappings, i.e., mappings that preserve neighborhood relationships by associating similar signals with similar meanings, are advantageous in an environment where signals are corrupted by noise and sub-optimal meaning inferences are rewarded as well. The evolution of these mappings, however, cannot be explained within a traditional language evolutionary game scenario in which individuals meet randomly, because the evolutionary dynamics is trapped in local maxima that do not reflect the structure of the meaning and signal spaces. Here we use a simple game theoretical model to show analytically that when individuals adopting the same communication code meet more frequently than individuals using different codes (a result of the spatial organization of the population), then advantageous linguistic innovations can spread and take over the population. In addition, we report results of simulations in which an individual can communicate only with its K nearest neighbors and show that the probability that the lineage of a mutant using a more efficient communication code becomes fixed decreases exponentially with increasing K. These findings support the mother tongue hypothesis that human language evolved as a communication system used among kin, especially between mothers and offspring.

Relevance:

10.00%

Abstract:

The relationship between the structure and function of biological networks constitutes a fundamental issue in systems biology. In particular, the structure of protein-protein interaction networks is related to important biological functions. In this work, we investigated how the resilience of these networks is determined by their large-scale features. Four species are taken into account, namely yeast Saccharomyces cerevisiae, worm Caenorhabditis elegans, fly Drosophila melanogaster and Homo sapiens. We adopted two entropy-related measurements (degree entropy and dynamic entropy) in order to quantify the overall degree of robustness of these networks. We verified that while they exhibit similar structural variations under random node removal, they differ significantly when subjected to intentional attacks (hub removal). In fact, more complex species tended to exhibit more robust networks. More specifically, we quantified how six important measurements of the networks' topology (namely clustering coefficient, average degree of neighbors, average shortest path length, diameter, assortativity coefficient, and slope of the power law degree distribution) correlated with the two entropy measurements. Our results revealed that the fraction of hubs and the average neighbor degree contribute significantly to the resilience of networks. In addition, the topological analysis of the removed hubs indicated that the presence of alternative paths between the proteins connected to hubs tends to reinforce resilience. The analysis helps to explain how resilience arises in networks and can be applied to the development of protein network models.
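Two of the quantities named in this abstract can be illustrated with a short sketch. It assumes that "degree entropy" means the Shannon entropy of the degree distribution, which the abstract does not spell out, and it uses a hypothetical star-shaped toy network rather than any protein interaction data. The dynamic entropy is not covered here.

```python
import math
from collections import Counter

# Hypothetical toy network as an adjacency dict: a star with hub 0 and four leaves.
graph = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}


def degree_entropy(g):
    """Shannon entropy (natural log) of the degree distribution (assumed definition)."""
    counts = Counter(len(nbrs) for nbrs in g.values())
    n = len(g)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def average_neighbor_degree(g, node):
    """Mean degree of the neighbours of `node` (0.0 for an isolated node)."""
    nbrs = g[node]
    return sum(len(g[v]) for v in nbrs) / len(nbrs) if nbrs else 0.0


def remove_hub(g):
    """Intentional attack: delete the highest-degree node and its edges."""
    hub = max(g, key=lambda v: len(g[v]))
    return {v: nbrs - {hub} for v, nbrs in g.items() if v != hub}


h_before = degree_entropy(graph)
h_after = degree_entropy(remove_hub(graph))
```

In the star network, removing the single hub leaves only isolated nodes, so the degree entropy drops to zero. A network with alternative paths between the hub's neighbours would lose less structure under the same attack.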

Relevance:

10.00%

Abstract:

In this paper we propose a scheme for quasi-perfect state transfer in a network of dissipative harmonic oscillators. We consider ideal sender and receiver oscillators connected by a chain of nonideal transmitter oscillators coupled by nearest-neighbour resonances. From the algebraic properties of the dynamical quantities describing the evolution of the network state, we derive a criterion, fixing the coupling strengths between all the oscillators, apart from their natural frequencies, enabling perfect state transfer in the particular case of ideal transmitter oscillators. Our criterion provides an easily manipulated formula enabling perfect state transfer in the special case where the network nonidealities are disregarded. We also extend this criterion to dissipative networks where the fidelity of the transferred state decreases due to the loss mechanisms. To circumvent almost completely the adverse effect of decoherence, we propose a protocol to achieve quasi-perfect state transfer in nonideal networks. By adjusting the common frequency of the sender and the receiver oscillators to be out of resonance with that of the transmitters, we demonstrate that the sender's state tunnels to the receiver oscillator by virtually exciting the nonideal transmitter chain. This virtual process renders the decay rate associated with the transmitter line negligible, at the expense of lengthening the time interval for the state transfer process. Apart from our analytical results, numerical computations are presented to illustrate our protocol.

Relevance:

10.00%

Abstract:

The possibility of compressing analyte bands at the beginning of CE runs has many advantages. Analytes at low concentration can be analyzed with high signal-to-noise ratios by using so-called sample stacking methods. Moreover, sample injections with very narrow initial band widths (small initial standard deviations) are sometimes useful, especially if high resolution among the bands is required in the shortest run time. In the present work, a method of sample stacking is proposed and demonstrated. It is based on BGEs with highly temperature-sensitive pH (high dpH/dT) and analytes with low dpK(a)/dT. High thermal sensitivity means that the working pK(a) of the BGE has a dpK(a)/dT of large magnitude. For instance, Tris and ethanolamine have dpH/dT = -0.028/°C and -0.029/°C, respectively, whereas carboxylic acids have low dpK(a)/dT values, i.e. in the -0.002/°C to +0.002/°C range. The action of cooling and heating sections along the capillary during the runs also affects the local viscosity, conductivity, and electric field strength. The effect of these variables on electrophoretic velocity and band compression is theoretically calculated using a simple model. Finally, this stacking method was demonstrated for amino acids derivatized with naphthalene-2,3-dicarboxaldehyde and fluorescamine using a temperature difference of 70 °C between two neighboring sections and Tris as separation buffer. In this case, the BGE has a high pH thermal coefficient whereas the carboxylic groups of the analytes have low pK(a) thermal coefficients. The application of these dynamic thermal gradients increased the peak heights by a factor of two (and decreased the standard deviations of the peaks by a factor of two) for aspartic acid and glutamic acid derivatized with naphthalene-2,3-dicarboxaldehyde and for serine derivatized with fluorescamine. The effect of thermal compression of bands was not observed when runs were performed using phosphate buffer at pH 7 (negative control). Phosphate has a low dpH/dT in this pH range, similar to the dK(a)/dT of the analytes. It is shown that |dK(a)/dT - dpH/dT| >> 0 is a determining factor for significant stacking produced by dynamic thermal junctions.

Relevance:

10.00%

Abstract:

This study analyses the effects of firm relocation on firm profits, using longitudinal data on Swedish limited liability firms and employing a difference-in-difference propensity score method in the empirical analysis. Using propensity score matching, the pre-relocation differences between relocating and non-relocating firms are balanced. In addition, a difference-in-difference estimator is employed in order to control for all time-invariant unobserved heterogeneity among firms. For matching, nearest neighbour matching using the one, two and three nearest neighbours is employed. The balancing results indicate that matching achieves a good balance, and that similar relocating and non-relocating firms are being compared. The estimated average treatment effects on the treated indicate that relocation has a significant effect on the profits of the relocating firms. In other words, firms that relocate increase their profits significantly, in comparison to what the profits would have been had the firms not relocated. This effect is estimated to vary between 3 and 11 percentage points, depending on the length of the analysed period after relocation.
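The combination of nearest neighbour matching and a difference-in-difference estimator described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions: the propensity scores are taken as already estimated, all firm values are hypothetical, and the function name `att_did_matching` is invented for the example. It is not the study's estimation code.

```python
# Each firm: (propensity score, profit before, profit after). All numbers are hypothetical.
treated = [(0.60, 10, 16), (0.40, 8, 11)]  # relocating firms
controls = [(0.58, 10, 12), (0.65, 9, 12), (0.36, 7, 8), (0.47, 8, 10), (0.10, 5, 5)]


def att_did_matching(treated, controls, k=1):
    """Average treatment effect on the treated: profit change of each treated firm
    minus the mean profit change of its k nearest controls by propensity score."""
    effects = []
    for score, before, after in treated:
        # k nearest neighbours in propensity score (matching with replacement).
        nearest = sorted(controls, key=lambda c: abs(c[0] - score))[:k]
        control_change = sum(c[2] - c[1] for c in nearest) / k
        effects.append((after - before) - control_change)
    return sum(effects) / len(effects)


att_1nn = att_did_matching(treated, controls, k=1)
att_2nn = att_did_matching(treated, controls, k=2)
```

Differencing each firm's profit over time removes time-invariant firm characteristics, and matching on the score restricts the comparison to similar firms. Varying `k` from one to three mirrors the one, two and three nearest neighbours mentioned in the abstract.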

Relevance:

10.00%

Abstract:

The p-median model is used to locate P facilities to serve a geographically distributed population. Conventionally, it is assumed that the population always travels to the nearest facility. Drezner and Drezner (2006, 2007) provide three arguments on why this assumption might be incorrect, and they introduce the extended gravity p-median model to relax the assumption. We favour the gravity p-median model, but we note that in an applied setting, Drezner and Drezner’s arguments are incomplete. In this communication, we point to the existence of a fourth compelling argument for the gravity p-median model.
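The nearest-facility assumption of the p-median model can be made concrete with a small brute-force sketch. The demand points on a line, their weights, and the exhaustive search over all candidate subsets are assumptions chosen to keep the example tiny. Realistic instances need heuristics or integer programming.

```python
from itertools import combinations

# Hypothetical demand: position on a line -> population weight.
demand = {0: 1, 1: 1, 2: 1, 10: 2, 11: 1}
candidates = sorted(demand)  # facilities may be placed at any demand point


def p_median_cost(facilities, demand):
    """Total weighted distance when everyone travels to the nearest facility."""
    return sum(w * min(abs(x - f) for f in facilities) for x, w in demand.items())


def solve_p_median(candidates, demand, p):
    """Exhaustively pick the p sites with the smallest total weighted distance."""
    return min(combinations(candidates, p), key=lambda fs: p_median_cost(fs, demand))


best = solve_p_median(candidates, demand, 2)
best_cost = p_median_cost(best, demand)
```

With two facilities the search places one in each cluster of demand points, at positions 1 and 10, for a total weighted distance of 3.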

Relevance:

10.00%

Abstract:

An administrative border might hinder the optimal allocation of a given set of resources by restricting the flow of goods, services, and people. In this paper we address the question: do administrative borders lead to poor accessibility to public services such as hospitals? To answer the question, we have examined the case of Sweden and its regional borders. We have used detailed data on the Swedish road network, its hospitals, and its geo-coded population. We have assessed the population’s spatial accessibility to Swedish hospitals by computing the inhabitants’ distance to the nearest hospital. We have also elaborated several scenarios, ranging from strongly confining regional borders to no border confinement, and recomputed the accessibility. Our findings imply that administrative borders worsen accessibility only marginally.

Relevance:

10.00%

Abstract:

A customer is presumed to gravitate to a facility according to its distance and attractiveness. However, regarding the location of the facility, the presumption is that the customer opts for the shortest route to the nearest facility. This paradox was recently solved by the introduction of the gravity p-median model. The model is yet to be implemented and tested empirically. We implemented the model in an empirical problem of locating locksmiths, vehicle inspections, and retail stores of vehicle spare-parts, and we compared the solutions with those of the p-median model. We found the gravity p-median model to be of limited use for the problem of locating facilities, as it either gives solutions similar to the p-median model or gives unstable solutions due to a non-concave objective function.
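The contrast between the two presumptions can be illustrated with a sketch of a gravity-type objective. The abstract gives no formula, so the form used here is an assumption: each customer patronizes facility j with probability proportional to its attractiveness times exp(-lam * distance), and the cost is the expected weighted travel distance. The parameter `lam`, the function name, and the toy data are all hypothetical.

```python
import math


def gravity_cost(facilities, demand, attractiveness, lam):
    """Expected weighted travel distance when customers split among facilities
    in proportion to attractiveness * exp(-lam * distance) (assumed form)."""
    total = 0.0
    for x, w in demand.items():
        pulls = [attractiveness[f] * math.exp(-lam * abs(x - f)) for f in facilities]
        norm = sum(pulls)
        total += w * sum(p / norm * abs(x - f) for p, f in zip(pulls, facilities))
    return total


# Hypothetical data: two customers on a line, one facility at each customer's position.
demand = {0: 1, 10: 1}
attractiveness = {0: 1.0, 10: 1.0}

cost_no_decay = gravity_cost((0, 10), demand, attractiveness, lam=0.0)
cost_strong_decay = gravity_cost((0, 10), demand, attractiveness, lam=50.0)
```

With `lam = 0` distance plays no role and each customer splits evenly between the two facilities. With a very large `lam` almost all demand goes to the nearest facility, so the objective approaches the ordinary p-median cost.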

Relevance:

10.00%

Abstract:

The p-median model is used to locate P facilities to serve a geographically distributed population. Conventionally, it is assumed that the population patronizes the nearest facility and that the distance between the resident and the facility may be measured by the Euclidean distance. Carling, Han, and Håkansson (2012) compared two network distances with the Euclidean distance in a rural region with a sparse, heterogeneous network and a non-symmetric distribution of the population. For a coarse network and small P, they found, in contrast to the literature, the Euclidean distance to be problematic. In this paper we extend their work by using a refined network and systematically studying the case when P is of varying size (2-100 facilities). We find that the network distance gives as good a solution as the travel-time network. The Euclidean distance gives solutions some 2-7 per cent worse than the network distances, and the solutions deteriorate with increasing P. Our conclusions extend to intra-urban location problems.

Relevance:

10.00%

Abstract:

Data mining can be used in the healthcare industry to “mine” clinical data to discover hidden information for intelligent and effective decision making. Hidden patterns and relationships often go undiscovered, and advanced data mining techniques can help remedy this. This thesis mainly deals with Intelligent Prediction of Chronic Renal Disease (IPCRD). The data cover blood tests, urine tests, and external symptoms used to predict chronic renal disease. Data from the database are first transferred to Weka (3.6) and the Chi-Square method is used for feature selection. After normalizing the data, three classifiers are applied and the efficiency of their output is evaluated: Decision Tree, Naïve Bayes, and the K-Nearest Neighbour algorithm. Results show that each technique has its unique strength in realizing the objectives of the defined mining goals. The efficiency of Decision Tree and KNN was almost the same, but Naïve Bayes showed a comparative edge over the others. Further, sensitivity and specificity tests are used as statistical measures to examine the performance of a binary classification. Sensitivity (also called recall rate in some fields) measures the proportion of actual positives which are correctly identified, while specificity measures the proportion of negatives which are correctly identified. The CRISP-DM methodology is applied to build the mining models. It consists of six major phases: business understanding, data understanding, data preparation, modeling, evaluation, and deployment.
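The definitions of sensitivity and specificity given above translate directly into a few lines of code. The labels below are hypothetical and are not taken from the thesis data. A value of 1 marks a positive case (disease present) and 0 a negative one.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, where 1 means positive."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))  # true positives
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))  # false negatives
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))  # true negatives
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))  # false positives
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical ground truth and classifier output for ten patients.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

sensitivity, specificity = sensitivity_specificity(y_true, y_pred)
```

Here three of the four actual positives are detected (sensitivity 0.75) and four of the six actual negatives are correctly rejected (specificity about 0.67).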

Relevance:

10.00%

Abstract:

Regarding the location of a facility, the presumption in the widely used p-median model is that the customer opts for the shortest route to the nearest facility. However, this assumption is problematic in free markets, since the customer is presumed to gravitate to a facility according to its distance and attractiveness. The recently introduced gravity p-median model offers an extension to the p-median model that accounts for this. The model is therefore potentially interesting, although it has not yet been implemented and tested empirically. In this paper, we have implemented the model in an empirical problem of locating vehicle inspections, locksmiths, and retail stores of vehicle spare-parts for the purpose of investigating its superiority to the p-median model. We found, however, the gravity p-median model to be of limited use for the problem of locating facilities, as it either gives solutions similar to the p-median model or gives unstable solutions due to a non-concave objective function.

Relevance:

10.00%

Abstract:

Over the last decade, we have seen a massive increase in the construction of wind farms in northern Fennoscandia. Wind farms comprising hundreds of wind turbines are being built, with little knowledge of the possible cumulative adverse effects on the habitat use and migration of semi-domesticated free-ranging reindeer. We assessed how reindeer responded to wind farm construction in an already fragmented landscape, with specific reference to the effects on use of movement corridors and reindeer habitat selection. We used GPS data from reindeer during calving and post-calving in the Malå reindeer herding community in Sweden. We analysed data from the pre-development years compared to the construction years of two relatively small wind farms. During construction of the wind farms, use of original migration routes and movement corridors within 2 km of development declined by 76 %. This decline in use corresponded to an increase in reindeer activity, measured by increased step lengths within 0-5 km. The step length was highest nearest the development and declined with distance, as animals moved towards migration corridors and turned around or were observed in holding patterns while not crossing. During construction, reindeer avoided the wind farms at both the regional and landscape scales of selection. The construction activities associated with even a few wind turbines, combined with power lines and roads in or close to central movement corridors, caused a reduction in the use of such corridors and grazing habitat and increased the fragmentation of the reindeer calving ranges.