971 results for K-Nearest Neighbors


Relevance: 80.00%

Abstract:

Structurally neighboring residues are categorized according to their separation in the primary sequence as proximal (1-4 positions apart) and otherwise distal, which in turn is divided into near (5-20 positions), far (21-50 positions), very far ( > 50 positions), and interchain (from different chains of the same structure). These categories describe the linear distance histogram (LDH) for three-dimensional neighboring residue types. Among the main results are the following: (i) nearest-neighbor hydrophobic residues tend to be increasingly distally separated in the linear sequence, thus most often connecting distinct secondary structure units. (ii) The LDHs of oppositely charged nearest-neighbors emphasize proximal positions with a subsidiary maximum for very far positions. (iii) Cysteine-cysteine structural interactions rarely involve proximal positions. (iv) The greatest numbers of interchain specific nearest-neighbors in protein structures are composed of oppositely charged residues. (v) The largest fraction of side-chain neighboring residues from beta-strands involves near positions, emphasizing associations between consecutive strands. (vi) Exposed residue pairs are predominantly located in proximal linear positions, while buried residue pairs principally correspond to far or very far distal positions. The results are principally invariant to protein sizes, amino acid usages, linear distance normalizations, and over- and underrepresentations among nearest-neighbor types. Interpretations and hypotheses concerning the LDHs, particularly those of hydrophobic and charged pairings, are discussed with respect to protein stability and functionality. The pronounced occurrence of oppositely charged interchain contacts is consistent with many observations on protein complexes where multichain stabilization is facilitated by electrostatic interactions.
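The separation categories defined in this abstract can be expressed as a small lookup. This is a minimal sketch, assuming each residue is identified by a hypothetical (chain ID, sequence position) pair; the function name and this representation are illustrative and are not taken from the paper.

```python
def separation_category(res_a, res_b):
    """Classify a pair of structurally neighboring residues by sequence separation.

    Each residue is a (chain_id, position) tuple; this representation is an
    assumption made for illustration.
    """
    chain_a, pos_a = res_a
    chain_b, pos_b = res_b
    if chain_a != chain_b:
        return "interchain"      # residues from different chains
    gap = abs(pos_a - pos_b)
    if gap == 0:
        raise ValueError("a residue is not its own neighbor")
    if gap <= 4:
        return "proximal"        # 1-4 positions apart
    if gap <= 20:
        return "near"            # 5-20 positions
    if gap <= 50:
        return "far"             # 21-50 positions
    return "very far"            # more than 50 positions
```

Counting the categories returned for every neighboring pair of a given residue type would give one linear distance histogram in the sense used above.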

Relevance: 80.00%

Abstract:

Cultural inheritance can be considered as a mechanism of adaptation made possible by communication, which has reached its greatest development in humans and can allow long-term conservation or rapid change of culturally transmissible traits depending on circumstances and needs. Conservativeness/flexibility is largely modulated by mechanisms of sociocultural transmission. An analysis was carried out by testing the fit of three models to 47 cultural traits (classified in six groups) in 277 African societies. Model A (demic diffusion) is conservation over generations, as shown by correlations of cultural traits with language, used as a measure of historical connection. Model B (environmental adaptation) is measured by correlation to the natural environment. Model C (cultural diffusion) is the spread to neighbors by social contact in an epidemic-like fashion and was tested by measuring the tightness of geographic clustering of the traits. Most traits examined, in particular those affecting family structure and kinship, showed great conservation over generations, as shown by the fit of model A. They are most probably transmitted by family members. This is in agreement with the theoretical demonstration that cultural transmission in the family (vertical) is the most conservative one. Some traits show environmental effects, indicating the importance of adaptation to physical environment. Only a few of the 47 traits showed tight geographic clustering indicating that their spread to nearest neighbors follows model C, as is usually the case for transmission among unrelated people (called horizontal transmission).

Relevance: 80.00%

Abstract:

Which magnetic properties change when Fe/Co atoms are grouped into quasi-2D structures, compared with FexCo1-x nanowires (quasi-1D)? And how do these properties respond to changes in the Fe/Co ratio of the clusters? To answer these questions, FexCo1-x trimers deposited on Pt(111) are investigated using the first-principles Real Space-Linear Muffin-Tin Orbital-Atomic Sphere Approximation (RS-LMTO-ASA) method within Density Functional Theory (DFT). Different configurations of triangular trimers are considered, varying the positions and the concentration of the Fe/Co atoms. This work demonstrates a strictly decreasing nonlinear trend of the average orbital moments as a function of Fe concentration, distinct from what is found both for FexCo1-x nanowires (linear dependence) and for the corresponding monolayer (nonlinear dependence). The results also show that the orbital moments vary with the local environment and with the magnetization direction, especially those associated with Co atoms, in agreement with previous publications. The change in dimensionality from quasi-1D (nanowires) to quasi-2D (compact trimers) does not affect the behavior of the spin moments, which remain described by a linear function of the Fe/Co ratio. Both the shape and the Fe concentration of the systems play an important role in the magnetic anisotropy energy values. In addition, the Pt substrate was observed to play an active role in defining the magnetic properties of the clusters. Although all linear and compact configurations of the FexCo1-x clusters are stable and exhibit strongly ferromagnetic interactions between nearest neighbors, not all of them showed collinear ordering as the ground state, presenting a non-negligible Dzyaloshinskii-Moriya interaction induced by spin-orbit coupling. These specific cases are the pure Co triangular trimer and the pure Fe linear trimer (nanowire), for which Ruderman-Kittel-Kasuya-Yosida-type coupling between the constituent Fe atoms was verified. The results contribute to understanding which mechanisms define the magnetism of FexCo1-x/Pt(111) trimers, and address questions currently open in the literature on these systems.

Relevance: 80.00%

Abstract:

Induction motors play an important role in industry, which underscores the importance of correctly diagnosing and classifying faults while they are still at an early stage, making it possible to increase productivity and, above all, to avoid serious damage to processes and machines. The proposal of this thesis is therefore to present an intelligent multiclassifier that diagnoses healthy motors, stator winding short-circuit faults, rotor faults, and bearing faults in three-phase induction motors driven by different models of frequency inverters, by analyzing the amplitudes of the stator current signals in the time domain. To evaluate classification accuracy across the various levels of fault severity, the performance of four distinct machine learning techniques was compared: (i) Fuzzy ARTMAP network, (ii) Multilayer Perceptron network, (iii) Support Vector Machine, and (iv) k-Nearest Neighbors. Experimental results from 13,574 experimental tests are presented to validate the study, covering a wide range of operating frequencies as well as load torque conditions on 5 different motors.

Relevance: 80.00%

Abstract:

We study quantum phase transitions in ultracold bosonic gases trapped in optical lattices. The physics of these systems is captured by a Bose-Hubbard-type model which, for a system without disorder in which the atoms have short-range interactions and tunneling occurs only between nearest-neighbor sites, predicts the superfluid-Mott insulator (SF-MI) quantum phase transition as the depth of the optical lattice potential is varied. In a first study, we examine how the phase diagram of this transition changes when going from a square lattice to a hexagonal one. In a second, we investigate how disorder modifies this transition. In the study with the hexagonal lattice, we present the phase diagram of the SF-MI transition and an estimate of the critical point of the first Mott lobe. These results were obtained using the quantum Monte Carlo algorithm known as Worm. We compare our results with those obtained from a mean-field approximation and with those of a system with a square optical lattice. When disorder is introduced into the system, a new phase emerges in the ground-state phase diagram, intervening between the superfluid and Mott insulator phases. This new phase is known as the Bose glass (BG), and the SF-BG quantum phase transition that occurs in this system has generated much controversy since the first studies in the late 1980s. Despite progress toward a complete understanding of this transition, the basic characterization of its critical properties is still debated. Our study was motivated by the publication of experimental and numerical results for three-dimensional systems [Yu et al. Nature 489, 379 (2012), Yu et al. PRB 86, 134421 (2012)] that violate the scaling law $\phi = \nu z$, where $\phi$ is the critical temperature exponent, $z$ is the dynamical critical exponent, and $\nu$ is the correlation length exponent. We address this controversy numerically through a finite-size scaling analysis using the Worm algorithm in its quantum and classical versions. Our results demonstrate that previous works on the dependence of the superfluid-normal liquid transition temperature on the chemical potential (or magnetic field, in spin systems), $T_c \propto (\mu-\mu_c)^\phi$, misinterpreted a transient behavior on the approach to the genuine critical region. When the model parameters are modified so as to enlarge the quantum critical region, simulations with both the classical and quantum models reveal that the scaling law $\phi = \nu z$ [with $\phi=2.7(2)$, $z=3$, and $\nu = 0.88(5)$] holds. We also estimate the order parameter critical exponent, finding $\beta=1.5(2)$.
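The consistency of the quoted exponents with the scaling law can be checked by direct arithmetic. The following worked check assumes that $z=3$ is exact and that the uncertainty on $\nu$ propagates linearly; this error treatment is an assumption made for illustration and is not the authors' analysis.

```latex
\begin{align*}
\nu z &= 0.88 \times 3 = 2.64, \\
\sigma_{\nu z} &= z\,\sigma_{\nu} = 3 \times 0.05 = 0.15, \\
|\phi - \nu z| &= |2.7 - 2.64| = 0.06 < \sigma_{\phi} = 0.2 .
\end{align*}
```

Under these assumptions the product $\nu z = 2.64(15)$ agrees with $\phi = 2.7(2)$ within the stated uncertainties.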

Relevance: 80.00%

Abstract:

Three-phase induction motors are the main elements for converting electrical energy into mechanical motive power in various productive sectors. Identifying a defect in a motor in operation, before it fails, can provide greater confidence in maintenance decision-making, lower costs, and increased availability. This thesis first presents a literature review and the general methodology for reproducing defects in the motors and for applying the discretization technique to the current and voltage signals in the time domain. It also develops a comparative study of pattern classification methods for identifying defects in these machines, namely: Naive Bayes, k-Nearest Neighbor, Support Vector Machine (Sequential Minimal Optimization), Artificial Neural Network (Multilayer Perceptron), Repeated Incremental Pruning to Produce Error Reduction, and C4.5 Decision Tree. The concept of Multi-Agent Systems (MAS) was also applied to support the distributed use of multiple concurrent methods for recognizing patterns of defects in faulty bearings, broken bars in the rotor squirrel cage, and short circuits between the coils of the stator winding of three-phase induction motors. In addition, some strategies for determining the severity of these defects were explored, including an investigation of how voltage unbalance in the machine's power supply influences the determination of these anomalies. The experimental data were acquired on a laboratory test bench with 1 and 2 cv (metric horsepower) motors connected directly to the power grid, operating under various voltage unbalance conditions and variations of the mechanical load applied to the motor shaft.

Relevance: 80.00%

Abstract:

This paper proposes a new feature representation method based on the construction of a Confidence Matrix (CM). This representation consists of posterior probability values provided by several weak classifiers, each one trained and used on a different set of features from the original sample. The CM allows the final classifier to abstract itself from discovering underlying groups of features. In this work the CM is applied to isolated character image recognition, for which several sets of features can be extracted from each sample. Experimentation has shown that the use of the CM permits a significant improvement in accuracy in most cases, while the others remain the same. The results were obtained after experimenting with four well-known corpora, using evolved meta-classifiers with the k-Nearest Neighbor rule as a weak classifier and by applying statistical significance tests.
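The Confidence Matrix idea can be sketched as one row of class posteriors per feature set. This is a minimal sketch, assuming the posterior of a kNN weak classifier is estimated as the fraction of the k nearest neighbors belonging to each class; that estimate, the function names, and the data layout are assumptions for illustration, since the abstract does not specify them.

```python
from collections import Counter
import math

def knn_posteriors(query, train, classes, k):
    """Estimate class posteriors as the fraction of the k nearest neighbors per class."""
    nearest = sorted(train, key=lambda s: math.dist(query, s[0]))[:k]
    counts = Counter(label for _, label in nearest)
    return [counts[c] / len(nearest) for c in classes]

def confidence_matrix(query_views, train_views, classes, k=3):
    """Build one row of posteriors per feature set (one weak classifier each).

    query_views[i] is the query described by feature set i, and
    train_views[i] is the list of (vector, label) pairs for that feature set.
    """
    return [knn_posteriors(q, t, classes, k)
            for q, t in zip(query_views, train_views)]
```

The resulting matrix (feature sets by classes) would then be flattened and passed to the final classifier in place of the raw features.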

Relevance: 80.00%

Abstract:

Prototype Selection (PS) algorithms allow a faster Nearest Neighbor classification by keeping only the most profitable prototypes of the training set. In turn, these schemes typically lower the classification accuracy. In this work a new strategy for multi-label classification tasks is proposed to solve this accuracy drop without the need to use the whole training set. For that, given a new instance, the PS algorithm is used as a fast recommender system which retrieves the most likely classes. Then, the actual classification is performed considering only the prototypes from the initial training set that belong to the suggested classes. Results show that this strategy provides a large set of trade-off solutions which fills the gap between PS-based classification efficiency and conventional kNN accuracy. Furthermore, this scheme is not only able to, at best, reach the performance of conventional kNN with barely a third of the distances computed, but it also outperforms the latter in noisy scenarios, proving to be a much more robust approach.
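The two-stage scheme described in this abstract can be sketched as follows. This is a minimal sketch, assuming classes are proposed by ranking them on the distance to their nearest prototype in the reduced set; that ranking rule, the function names, and the parameter `n_classes` are assumptions for illustration, not the authors' exact procedure.

```python
from collections import Counter
import math

def knn(query, samples, k):
    """Return the k (point, label) pairs nearest to query by Euclidean distance."""
    return sorted(samples, key=lambda s: math.dist(query, s[0]))[:k]

def classify_with_class_proposals(query, training_set, prototypes, k=1, n_classes=2):
    """Two-stage kNN: the reduced set proposes classes, the full set decides.

    training_set and prototypes are lists of (point, label) pairs; prototypes
    is assumed to be the output of some Prototype Selection algorithm.
    """
    # Stage 1: distance from the query to the nearest prototype of each class.
    best = {}
    for point, label in prototypes:
        d = math.dist(query, point)
        if d < best.get(label, math.inf):
            best[label] = d
    proposed = set(sorted(best, key=best.get)[:n_classes])
    # Stage 2: ordinary kNN, restricted to original samples of the proposed classes.
    candidates = [s for s in training_set if s[1] in proposed]
    votes = Counter(label for _, label in knn(query, candidates, k))
    return votes.most_common(1)[0][0]
```

Raising `n_classes` moves the result toward conventional kNN on the full training set, while lowering it reduces the number of distances computed in the second stage.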

Relevance: 80.00%

Abstract:

In the current Information Age, data production and processing demands are ever increasing. This has motivated the appearance of large-scale distributed information. This phenomenon also applies to Pattern Recognition, so that classic and common algorithms, such as the k-Nearest Neighbour, are unable to be used. To improve the efficiency of this classifier, Prototype Selection (PS) strategies can be used. Nevertheless, current PS algorithms were not designed to deal with distributed data, and their performance is therefore unknown under these conditions. This work is devoted to carrying out an experimental study on a simulated framework in which PS strategies can be compared under classical conditions as well as those expected in distributed scenarios. Our results report a general behaviour that degrades as conditions approach more realistic scenarios. However, our experiments also show that some methods are able to achieve a fairly similar performance to that of the non-distributed scenario. Thus, although there is a clear need for developing specific PS methodologies and algorithms for tackling these situations, those that reported a higher robustness against such conditions may be good candidates from which to start.

Relevance: 80.00%

Abstract:

With the rapid increase in both centralized video archives and distributed WWW video resources, content-based video retrieval is gaining importance. To support such applications efficiently, content-based video indexing must be addressed. Typically, each video is represented by a sequence of frames. Due to the high dimensionality of frame representation and the large number of frames, video indexing introduces an additional degree of complexity. In this paper, we address the problem of content-based video indexing and propose an efficient solution, called the Ordered VA-File (OVA-File), based on the VA-file. OVA-File is a hierarchical structure and has two novel features: 1) partitioning the whole file into slices such that only a small number of slices are accessed and checked during k Nearest Neighbor (kNN) search and 2) efficient handling of insertions of new vectors into the OVA-File, such that the average distance between the new vectors and those approximations near that position is minimized. To facilitate a search, we present an efficient approximate kNN algorithm named Ordered VA-LOW (OVA-LOW) based on the proposed OVA-File. OVA-LOW first chooses possible OVA-Slices by ranking the distances between their corresponding centers and the query vector, and then visits all approximations in the selected OVA-Slices to work out approximate kNN. The number of possible OVA-Slices is controlled by a user-defined parameter delta. By adjusting delta, OVA-LOW provides a trade-off between the query cost and the result quality. Query by video clip consisting of multiple frames is also discussed. Extensive experimental studies using real video data sets were conducted and the results showed that our methods can yield a significant speed-up over an existing VA-file-based method and iDistance with high query result quality. Furthermore, by incorporating temporal correlation of video content, our methods achieved much more efficient performance.
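The slice-ranking step of the approximate search can be sketched in a few lines. This is a minimal sketch, not the OVA-LOW algorithm itself: it assumes that delta is simply the number of slices to visit and that each slice stores raw vectors rather than VA-file approximations; both are simplifications, and the function name and data layout are illustrative.

```python
import heapq
import math

def approximate_knn(query, slices, k, delta):
    """Approximate kNN that only scans the delta slices whose centers are closest.

    slices is a list of (center, vectors) pairs. Treating delta as the number
    of slices to visit, and storing raw vectors rather than VA-file
    approximations, are simplifying assumptions.
    """
    # Rank slices by the distance between their center and the query.
    ranked = sorted(slices, key=lambda s: math.dist(query, s[0]))
    # Visit every vector in the selected slices only.
    candidates = [v for _, vectors in ranked[:delta] for v in vectors]
    return heapq.nsmallest(k, candidates, key=lambda v: math.dist(query, v))
```

A larger delta scans more slices and so costs more per query, but it can recover true neighbors that sit in a slice whose center is not the closest one.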

Relevance: 80.00%

Abstract:

A visualization plot of a data set of molecular data is a useful tool for gaining insight into a set of molecules. In chemoinformatics, most visualization plots are of molecular descriptors, and the statistical model most often used to produce a visualization is principal component analysis (PCA). This paper takes PCA, together with four other statistical models (NeuroScale, GTM, LTM, and LTM-LIN), and evaluates their ability to produce clustering in visualizations not of molecular descriptors but of molecular fingerprints. Two different tasks are addressed: understanding structural information (particularly combinatorial libraries) and relating structure to activity. The quality of the visualizations is compared both subjectively (by visual inspection) and objectively (with global distance comparisons and local k-nearest-neighbor predictors). On the data sets used to evaluate clustering by structure, LTM is found to perform significantly better than the other models. In particular, the clusters in LTM visualization space are consistent with the relationships between the core scaffolds that define the combinatorial sublibraries. On the data sets used to evaluate clustering by activity, LTM again gives the best performance but by a smaller margin. The results of this paper demonstrate the value of using both a nonlinear projection map and a Bernoulli noise model for modeling binary data.

Relevance: 80.00%

Abstract:

Neutron diffraction was used to measure the total structure factors for several rare-earth ion R3+ (La3+ or Ce3+) phosphate glasses with composition close to RAl0.35P3.24O10.12. By assuming isomorphic structures, difference function methods were employed to separate, essentially, those correlations involving R3+ from the remainder. A self-consistent model of the glass structure was thereby developed in which the Al correlations were taken into explicit account. The glass network was found to be made from interlinked PO4 tetrahedra having 2.2(1) terminal oxygen atoms, OT, at 1.51(1) Angstrom, and 1.8(1) bridging oxygen atoms, OB, at 1.60(1) Angstrom. Rare-earth cations bonded to an average of 7.5(2) OT nearest neighbors in a broad and asymmetric distribution. The Al3+ ion acted as a network modifier and formed OT-Al-OT linkages that helped strengthen the glass. The connectivity of the R-centered coordination polyhedra was quantified in terms of a parameter f(s) and used to develop a model for the dependence on composition of the Al-OT coordination number in R-Al-P-O glasses. By using recent 27Al nuclear-magnetic-resonance data, it was shown that this connectivity decreases monotonically with increasing Al content. The chemical durability of the glasses appeared to be at a maximum when the connectivity of the R-centered coordination polyhedra was at a minimum. The relation of f(s) to the glass transition temperature, Tg, was discussed.

Relevance: 80.00%

Abstract:

The relative distribution of rare-earth ions R3+ (Dy3+ or Ho3+) in the phosphate glass RAl0.30P3.05O9.62 was measured by employing the method of isomorphic substitution in neutron diffraction. It is found that 7.9(7) R-R nearest neighbors reside at 5.62(6) Angstrom in a network made from interlinked PO4 tetrahedra. Provided that the role of Al is explicitly considered, a self-consistent account of the local matrix atom correlations can be developed in which there are 1.68(9) bridging and 2.32(9) terminal oxygen atoms per phosphorus.

Relevance: 80.00%

Abstract:

This paper describes a method (the BIDIMS algorithm) for displaying multivariate objects in a two-dimensional structure in which the sum of the differences between the properties of each object and those of its nearest neighbors is minimal. With this ordering, the basic regularities in the set of objects become evident. Moreover, such structures (tables) have high inductive potential: many latent properties of an object can be predicted from its coordinates in the table. The capabilities of the method are illustrated by a two-dimensional ordering of the chemical elements. The resulting table practically coincides with Mendeleev's periodic table.

Relevance: 80.00%

Abstract:

Allergy is an overreaction by the immune system to a previously encountered, ordinarily harmless substance - typically proteins - resulting in skin rash, swelling of mucous membranes, sneezing or wheezing, or other abnormal conditions. The use of modified proteins is increasingly widespread: their presence in food, commercial products, such as washing powder, and medical therapeutics and diagnostics, makes predicting and identifying potential allergens a crucial societal issue. The prediction of allergens has been explored widely using bioinformatics, with many tools being developed in the last decade; many of these are freely available online. Here, we report a set of novel models for allergen prediction utilizing amino acid E-descriptors, auto- and cross-covariance transformation, and several machine learning methods for classification, including logistic regression (LR), decision tree (DT), naïve Bayes (NB), random forest (RF), multilayer perceptron (MLP) and k nearest neighbours (kNN). The best performing method was kNN with 85.3% accuracy at 5-fold cross-validation. The resulting model has been implemented in a revised version of the AllerTOP server (http://www.ddg-pharmfac.net/AllerTOP). © Springer-Verlag 2014.
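The evaluation protocol named in this abstract, kNN scored by 5-fold cross-validation, can be sketched in plain Python. This is a minimal sketch, assuming feature vectors have already been computed (the E-descriptor and auto/cross-covariance steps are not reproduced) and using Euclidean distance with majority voting; the function names, the default k, and the interleaved fold assignment are assumptions for illustration, not the AllerTOP implementation.

```python
from collections import Counter
import math

def knn_predict(query, train, k):
    """Majority vote among the k training samples nearest to query."""
    nearest = sorted(train, key=lambda s: math.dist(query, s[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def cross_val_accuracy(samples, k=3, folds=5):
    """Mean accuracy of kNN over `folds` interleaved folds.

    samples is a list of (feature_vector, label) pairs. Fold i holds every
    folds-th sample starting at index i; real studies usually shuffle first.
    """
    scores = []
    for i in range(folds):
        test = samples[i::folds]
        train = [s for j, s in enumerate(samples) if j % folds != i]
        hits = sum(knn_predict(x, train, k) == y for x, y in test)
        scores.append(hits / len(test))
    return sum(scores) / folds
```

Each sample is used for testing exactly once and is never part of the training set that classifies it, which is what makes the averaged accuracy an out-of-sample estimate.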