991 results for maximum family sizes
Abstract:
Learning from Positive and Unlabelled examples (LPU) has emerged as an important problem in data mining and information retrieval applications. Existing techniques are not ideally suited for real-world scenarios where the datasets are linearly inseparable, as they either build linear classifiers or use non-linear classifiers that fail to achieve the desired performance. In this work, we propose to extend maximum margin clustering ideas and present an iterative procedure to design a non-linear classifier for LPU. In particular, we build a least squares support vector classifier, which is suitable for handling this problem due to the symmetry of its loss function. Further, we present techniques for appropriately initializing the labels of unlabelled examples and for enforcing the ratio of positive to negative examples while obtaining these labels. Experiments on real-world datasets demonstrate that the non-linear classifier designed using the proposed approach gives significantly better generalization performance than the existing relevant approaches for LPU.
Abstract:
This article presents the details of the estimation of fracture parameters for high strength concrete (HSC, HSC1) and ultra high strength concrete (UHSC). Brief details about the characterization of the ingredients of HSC, HSC1 and UHSC are provided. Experiments were carried out on beams made of HSC, HSC1 and UHSC, considering various sizes and notch depths. Fracture characteristics such as size-independent fracture energy (G_f), size of the fracture process zone (C_f), fracture toughness (K_IC) and crack tip opening displacement (CTOD_c) have been estimated based on the experimental observations. From the studies, it is observed that (i) UHSC has high fracture energy and ductility in spite of having a very low value of C_f; (ii) UHSC is relatively much more homogeneous than the other concretes, because of the absence of coarse aggregates and the presence of well-graded smaller size particles; (iii) the critical stress intensity factor (K_IC) values increase with increasing beam depth and decrease with increasing notch depth. Generally, there is a significant increase in fracture toughness and CTOD_c: in HSC1 they are about 7 times, and in UHSC about 10 times, those in HSC; (iv) for a notch-to-depth ratio of 0.1, Bazant's size effect model slightly overestimates the maximum failure loads compared to the experimental observations, and Karihaloo's model slightly underestimates them. For notch-to-depth ratios ranging from 0.2 to 0.4 in the case of UHSC, both size effect models predict maximum failure loads that are more or less similar to the corresponding experimental values.
Abstract:
We consider the problem of characterizing the minimum average delay, or equivalently the minimum average queue length, of message symbols randomly arriving to the transmitter queue of a point-to-point link which dynamically selects an (n, k) block code from a given collection. The system is modeled by a discrete time queue with an IID batch arrival process and batch service. We obtain a lower bound on the minimum average queue length, which is the optimal value for a linear program, using only the mean (λ) and variance (σ²) of the batch arrivals. For a finite collection of (n, k) codes the minimum achievable average queue length is shown to be Θ(1/ε) as ε ↓ 0, where ε is the difference between the maximum code rate and λ. We obtain a sufficient condition for code rate selection policies to achieve this optimal growth rate. A simple family of policies that use only one block code each, as well as two other heuristic policies, are shown to be weakly optimal in the sense of achieving the 1/ε growth rate. An appropriate selection from the family of policies that use only one block code each is also shown to achieve the optimal coefficient σ²/2 of the 1/ε growth rate. We compare the performance of the heuristic policies with the minimum achievable average queue length and the lower bound numerically. For a countable collection of (n, k) codes, the optimal average queue length is shown to be Ω(1/ε). We illustrate the selectivity among policies of the growth rate optimality criterion for both finite and countable collections of (n, k) block codes.
Abstract:
In this paper, we consider the setting of the pattern maximum likelihood (PML) problem studied by Orlitsky et al. We present a well-motivated heuristic algorithm for deciding the question of when the PML distribution of a given pattern is uniform. The algorithm is based on the concept of a "uniform threshold". This is a threshold at which the uniform distribution exhibits an interesting phase transition in the PML problem, going from being a local maximum to being a local minimum.
Abstract:
In addition to the chemical nature of the surface, the dimensions of the confining host exert a significant influence on confined protein structures; this has immense biological implications, especially concerning the enzymatic activity of the protein. This study probes the structure of hemoglobin (Hb), a model protein, confined inside silica tubes with pore diameters that vary by one order of magnitude (≈20-200 nm). The effect of confinement on the protein structure is probed by comparison with the structure of the protein in solution. Small-angle neutron scattering (SANS), which provides information on protein tertiary and quaternary structures, is employed to study the influence of the tube pore diameter on the structure and configuration of the confined protein in detail. Confinement significantly influences the structural stability of Hb, and the structure depends on the Si-tube pore diameter. The high radius of gyration (R_g) and polydispersity of Hb in the 20 nm diameter Si-tube indicate that Hb undergoes a significant amount of aggregation. However, for Si-tube diameters greater than or equal to 100 nm, the R_g of Hb is found to be very close to that obtained from the structure reported in the Protein Data Bank (PDB) (R_g of native Hb = 23.8 Å). This strongly indicates that the protein has a preference for the more native-like, non-aggregated state if confined inside tubes of diameter greater than or equal to 100 nm. Further insight into the Hb structure is obtained from the distance distribution function, p(r), and ab initio models calculated from the SANS patterns. These also suggest that the Si-tube size is a key parameter for protein stability and structure.
Abstract:
The maximum entropy approach to classification is very well studied in applied statistics and machine learning, and almost all the methods that exist in the literature are discriminative in nature. In this paper, we introduce a maximum entropy classification method with feature selection for large dimensional data such as text datasets that is generative in nature. To tackle the curse of dimensionality of large data sets, we employ the conditional independence assumption (Naive Bayes) and perform feature selection simultaneously, by enforcing a "maximum discrimination" between estimated class conditional densities. For two-class problems, the proposed method uses Jeffreys (J) divergence to discriminate the class conditional densities. To extend our method to the multi-class case, we propose a completely new approach by considering a multi-distribution divergence: we replace Jeffreys divergence by Jensen-Shannon (JS) divergence to discriminate conditional densities of multiple classes. In order to reduce computational complexity, we employ a modified Jensen-Shannon divergence (JS_GM), based on the AM-GM inequality. We show that the resulting divergence is a natural generalization of Jeffreys divergence to the case of multiple distributions. As for theoretical justification, we show that when one intends to select the best features in a generative maximum entropy approach, maximum discrimination using J-divergence emerges naturally in binary classification. The performance of the proposed algorithms and a comparative study are demonstrated on large dimensional text and gene expression datasets, showing that our methods scale up very well with large dimensional datasets.
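As a rough illustration of the two divergences named in this abstract, the Python sketch below computes the Jeffreys divergence between two discrete distributions and the standard weighted Jensen-Shannon divergence among several. The function names and the example distributions are assumptions made for illustration only; the modified JS_GM divergence of the paper is not reproduced here.

```python
import math

def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) for discrete distributions (natural log)."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # the term 0 * log(0 / q) is taken as 0
        if qi == 0.0:
            return math.inf  # p puts mass where q has none
        total += pi * math.log(pi / qi)
    return total

def jeffreys(p, q):
    """Jeffreys divergence: the symmetrized sum KL(p || q) + KL(q || p)."""
    return kl(p, q) + kl(q, p)

def jensen_shannon(dists, weights=None):
    """Weighted Jensen-Shannon divergence among several discrete distributions.

    Computed as sum_i w_i * KL(P_i || M), where M is the weighted mixture.
    Uniform weights are assumed when none are given.
    """
    m = len(dists)
    if weights is None:
        weights = [1.0 / m] * m
    mixture = [sum(w * d[i] for w, d in zip(weights, dists))
               for i in range(len(dists[0]))]
    return sum(w * kl(d, mixture) for w, d in zip(weights, dists))

# Hypothetical class-conditional distributions over three features.
p = [0.7, 0.2, 0.1]
q = [0.1, 0.3, 0.6]
r = [0.3, 0.4, 0.3]
print(jeffreys(p, q))
print(jensen_shannon([p, q, r]))
```

The Jeffreys divergence is only defined for a pair of distributions, which is why a multi-distribution divergence such as Jensen-Shannon is needed once there are more than two classes.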
Abstract:
The phylogenetic structure of Asclepiadoideae (Apocynaceae) has been elucidated at the tribal and subtribal levels in the last two decades. However, to date, the systematic positions of seven Asian genera, Cosmostigma, Graphistemma, Holostemma, Pentasachme, Raphistemma, Seshagiria and Treutlera, have not been investigated. In this study, we examine the evolutionary relationships among these seven small enigmatic Asian genera and clarify their positions in Asclepiadoideae, using a combination of plastid sequences of rbcL, rps16, trnL and trnL-F regions. Cosmostigma and Treutlera are resolved as members of the non-Hoya clade of Marsdenieae with strong support (maximum parsimony bootstrap support value BSMP = 96, maximum likelihood bootstrap support value BSML = 98, Bayesian-inferred posterior probability PP = 1.0). Pentasachme is resolved as sister of Stapeliinae to Ceropegieae with moderate support (BSMP = 64, BSML = 66, PP = 0.94). Graphistemma, Holostemma, Raphistemma and Seshagiria are all nested in the Asclepiadeae-Cynanchinae clade (BSMP = 97, BSML = 100, PP = 1.0). The study confirms the generally accepted tribal and subtribal structure of the subfamily. One exception is Eustegia minuta, which is placed here as sister to all Asclepiadeae (BSMP = 58, BSML = 76, PP = 0.99) and not as sister to the Marsdenieae + Ceropegieae clade. The weak support and conflicting position indicate the need for a placement of Eustegia as an independent tribe. In Asclepiadeae, a sister group position of Cynanchinae to the Asclepiadinae + Tylophorinae clade is favoured (BSMP = 84, BSML = 88, PP = 1.0), whereas Schizostephanus is retrieved as unresolved. Oxystelma appears as an early-branching member of Asclepiadinae with weak support (BSMP = 52, BSML = 74, PP = 0.69). Calciphila and Solenostemma are also associated with Asclepiadinae with weak support (BSMP = 37, BSML = 45, PP = 0.79), but all alternative positions are essentially without support.
The position of Indian Asclepiadoideae in the family phylogeny is discussed. © 2014 The Linnean Society of London, Botanical Journal of the Linnean Society, 2014, 174, 601-619.
Abstract:
Climate change impact assessment studies involve downscaling large-scale atmospheric predictor variables (LSAPVs) simulated by general circulation models (GCMs) to site-scale meteorological variables. This article presents a least-squares support vector machine (LS-SVM)-based methodology for multi-site downscaling of maximum and minimum daily temperature series. The methodology involves (1) delineation of sites in the study area into clusters based on the correlation structure of the predictands, (2) downscaling LSAPVs to monthly time series of predictands at a representative site identified in each of the clusters, (3) translation of the downscaled information in each cluster from the representative site to the other sites using LS-SVM inter-site regression relationships, and (4) disaggregation of the information at each site from monthly to daily time scale using a k-nearest neighbour disaggregation methodology. The effectiveness of the methodology is demonstrated by application to data pertaining to four sites in the catchment of the Beas river basin, India. Simulations of the Canadian coupled global climate model (CGCM3.1/T63) for four IPCC SRES scenarios, namely A1B, A2, B1 and COMMIT, were downscaled to future projections of the predictands in the study area. Comparison of the results with those based on the recently proposed multivariate multiple linear regression (MMLR) based downscaling method and the multi-site multivariate statistical downscaling (MMSD) method indicates that the proposed method is promising and can be considered a feasible choice in statistical downscaling studies. The performance of the method in downscaling daily minimum temperature was found to be better than that in downscaling daily maximum temperature. Results indicate an increase in annual average maximum and minimum temperatures at all the sites for the A1B, A2 and B1 scenarios. The projected increase is highest for the A2 scenario, followed by those for the A1B, B1 and COMMIT scenarios.
Projections, in general, indicated an increase in mean monthly maximum and minimum temperatures during January to February and October to December.
Abstract:
Purpose: Weill-Marchesani syndrome (WMS) is a rare connective tissue disorder, characterized by short stature, micro-spherophakic lens, and stubby hands and feet (brachydactyly). WMS is caused by mutations in the FBN1, ADAMTS10, and LTBP2 genes. Mutations in the LTBP2 and ADAMTS17 genes cause a WMS-like syndrome, in which the affected individuals show major features of WMS but do not display brachydactyly and joint stiffness. The main purpose of our study was to determine the genetic cause of WMS in an Indian family. Methods: Whole exome sequencing (WES) was used to identify the genetic cause of WMS in the family. The cosegregation of the mutation was determined with Sanger sequencing. Reverse transcription (RT)-PCR analysis was used to assess the effect of a splice-site mutation on splicing of the ADAMTS17 transcript. Results: The WES analysis identified a homozygous novel splice-site mutation c.873+1G>T in a known WMS-like syndrome gene, ADAMTS17, in the family. RT-PCR analysis in the patient showed that exon 5 was skipped, which resulted in the deletion of 28 amino acids in the ADAMTS17 protein. Conclusions: The mutation in the WMS-like syndrome gene ADAMTS17 also causes WMS in an Indian family. The present study will be helpful in genetic diagnosis of this family and increases the number of mutations of this gene to six.
Abstract:
We address the issue of stability of recently proposed significantly super-Chandrasekhar white dwarfs. We present stable solutions of magnetostatic equilibrium models for super-Chandrasekhar white dwarfs pertaining to various magnetic field profiles. These have been obtained by self-consistently including the effects of the magnetic pressure gradient and total magnetic density in a general relativistic framework. We estimate that the maximum stable mass of magnetized white dwarfs could be more than 3 solar masses. This result helps explain peculiar, overluminous type Ia supernovae that do not conform to the traditional Chandrasekhar mass limit.
Abstract:
Let $\mathbb{Z}_n$ denote the ring of integers modulo $n$. A permutation of $\mathbb{Z}_n$ is a sequence of $n$ distinct elements of $\mathbb{Z}_n$. Addition and subtraction of two permutations are defined element-wise. In this paper we consider two extremal problems on permutations of $\mathbb{Z}_n$, namely, the maximum size of a collection of permutations such that the sum of any two distinct permutations in the collection is again a permutation, and the maximum size of a collection of permutations such that no sum of two distinct permutations in the collection is a permutation. Let the sizes be denoted by $s(n)$ and $t(n)$ respectively. The case when $n$ is even is trivial in both cases, with $s(n) = 1$ and $t(n) = n!$. For $n$ odd, we prove $\frac{n\,\varphi(n)}{2^k} \le s(n) \le \frac{n! \cdot 2^{-(n-1)/2}}{((n-1)/2)!}$ and $2^{(n-1)/2} \cdot \left(\frac{n-1}{2}\right)! \le t(n) \le \frac{2^k \cdot (n-1)!}{\varphi(n)}$, where $k$ is the number of distinct prime divisors of $n$ and $\varphi$ is Euler's totient function.
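The two quantities defined in this abstract can be checked by exhaustive search for very small n. The Python sketch below is only an illustration under that assumption: the function names are hypothetical, and the search over all subsets of the n! permutations is feasible only for n up to about 3. It is not the technique used in the paper.

```python
from itertools import combinations, permutations

def is_permutation(seq, n):
    """True if seq contains each element of Z_n exactly once."""
    return sorted(seq) == list(range(n))

def add_mod(p, q, n):
    """Element-wise sum of two sequences modulo n."""
    return tuple((a + b) % n for a, b in zip(p, q))

def extremal_sizes(n):
    """Brute-force s(n) and t(n) by trying every family of permutations.

    s(n): largest family in which every pairwise sum is a permutation.
    t(n): largest family in which no pairwise sum is a permutation.
    """
    perms = list(permutations(range(n)))
    best_s = best_t = 0
    for size in range(1, len(perms) + 1):
        for family in combinations(perms, size):
            flags = [is_permutation(add_mod(p, q, n), n)
                     for p, q in combinations(family, 2)]
            if all(flags):
                best_s = max(best_s, size)
            if not any(flags):
                best_t = max(best_t, size)
    return best_s, best_t

print(extremal_sizes(2))  # even case: s(2) = 1, t(2) = 2! = 2
print(extremal_sizes(3))  # odd case: both bounds are tight, s(3) = 3, t(3) = 2
```

For n = 3 there is one distinct prime divisor (k = 1) and the totient is 2, so the stated bounds give 3 ≤ s(3) ≤ 3 and 2 ≤ t(3) ≤ 2, which agrees with the exhaustive search.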
Abstract:
We investigate the parameterized complexity of the following edge coloring problem motivated by the problem of channel assignment in wireless networks. For an integer q >= 2 and a graph G, the goal is to find a coloring of the edges of G with the maximum number of colors such that every vertex of the graph sees at most q colors. This problem is NP-hard for q >= 2, and has been well-studied from the point of view of approximation. Our main focus is the case when q = 2, which is already theoretically intricate and practically relevant. We show fixed-parameter tractable algorithms for both the standard and the dual parameter, and for the latter problem, the result is based on a linear vertex kernel.
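To make the problem statement concrete, the following Python sketch checks whether an edge coloring lets every vertex see at most q colors, and finds the maximum number of colors by exhaustive search on tiny graphs. The function names and example graphs are assumptions for illustration; the exponential search is not the fixed-parameter algorithm of the paper.

```python
from itertools import product

def is_valid_q_coloring(edges, coloring, q):
    """True if every vertex is incident to edges of at most q distinct colors."""
    seen = {}
    for (u, v), color in zip(edges, coloring):
        seen.setdefault(u, set()).add(color)
        seen.setdefault(v, set()).add(color)
    return all(len(colors) <= q for colors in seen.values())

def max_edge_q_coloring(edges, q):
    """Largest number of colors in a valid coloring, by trying all colorings.

    Runs in time exponential in the number of edges, so only tiny graphs work.
    """
    m = len(edges)
    best = 0
    for coloring in product(range(m), repeat=m):
        if is_valid_q_coloring(edges, coloring, q):
            best = max(best, len(set(coloring)))
    return best

triangle = [(0, 1), (1, 2), (0, 2)]
star = [(0, 1), (0, 2), (0, 3)]
print(max_edge_q_coloring(triangle, 2))  # each vertex has degree 2, so 3 colors fit
print(max_edge_q_coloring(star, 2))      # the center limits the coloring to 2 colors
```

In the channel assignment reading, vertices are devices, edges are links, colors are channels, and q is the number of interfaces per device.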
Abstract:
We demonstrate the first STM evaluation of the Young's modulus (E) of nanoparticles (NPs) of different sizes. The sample deformation induced by tip-sample interaction has been determined using current-distance (I-Z) spectroscopy. As a result of the tip-sample interaction and the induced surface deformations, the I-Z curves deviate from a pure exponential dependence. Normally, in order to analyze the deformation quantitatively, the tip radius must be known. We show that this necessity is eliminated by measuring the deformation on a substrate with a known Young's modulus (Au(111)) and estimating the tip radius, and afterwards using the same tip (with a known radius) to measure the (unknown) Young's modulus of another sample (nanoparticles of CdS). The Young's modulus values found for three NP samples with average diameters of 3.7, 6 and 7.5 nm were E ≈ 73%, 78% and 88% of the bulk value, respectively. These results are in good agreement with the theoretically predicted reduction of the Young's modulus due to the changes in hydrostatic stresses that result from surface tension in nanoparticles of different sizes. Our calculation using third-order elastic constants gives a reduction of E which scales linearly with 1/r (where r is the NP radius). This demonstrates the applicability of scanning tunneling spectroscopy for local mechanical characterization of nanoobjects. The method does not include a direct measurement of the tip-sample force but is rather based on the study of the relative elastic response. © 2014 Elsevier B.V. All rights reserved.
Abstract:
Central to network tomography is the problem of identifiability, the ability to identify internal network characteristics uniquely from end-to-end measurements. This problem is often underconstrained even when internal network characteristics such as link delays are modeled as additive constants. While it is known that the network topology can play a role in determining the extent of identifiability, there is a lack of fundamental understanding of how to quantify it for a given network. In this paper, we consider the problem of identifying additive link metrics in an arbitrary undirected network using measurement nodes and establishing paths/cycles between them. For a given placement of measurement nodes, we define and derive the "link rank" of the network, that is, the maximum number of linearly independent cycles/paths that may be established between the measurement nodes. We achieve this in linear time. The link rank helps quantify the exact extent of identifiability in a network. We also develop a quadratic time algorithm to compute a set of cycles/paths that achieves the maximum rank.
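The notion of linearly independent measurement paths can be illustrated by computing the rank of a path-link incidence matrix. The Python sketch below is a minimal illustration under stated assumptions: the helper names are hypothetical, each path is assumed to traverse a link at most once, and plain Gaussian elimination is used rather than the linear-time derivation described in the abstract.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank of a matrix over the rationals via Gaussian elimination."""
    mat = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(mat[0]) if mat else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below the current rank with a nonzero entry in this column.
        pivot = next((r for r in range(rank, len(mat)) if mat[r][col] != 0), None)
        if pivot is None:
            continue
        mat[rank], mat[pivot] = mat[pivot], mat[rank]
        for r in range(rank + 1, len(mat)):
            factor = mat[r][col] / mat[rank][col]
            mat[r] = [a - factor * b for a, b in zip(mat[r], mat[rank])]
        rank += 1
    return rank

def measurement_rank(links, paths):
    """Rank of the 0/1 path-link incidence matrix for the given measurement paths."""
    index = {link: i for i, link in enumerate(links)}
    rows = []
    for path in paths:
        row = [0] * len(links)
        for link in path:
            row[index[link]] = 1
        rows.append(row)
    return matrix_rank(rows)

# Hypothetical three-link network: each measurement is the sum of its link metrics.
links = ["a", "b", "c"]
paths = [["a", "b"], ["b", "c"], ["a", "c"]]
print(measurement_rank(links, paths))         # 3: every link metric is identifiable
print(measurement_rank(["a", "b"], [["a", "b"]]))  # 1: only the sum a + b is known
```

When the rank equals the number of links, the additive link metrics can be solved for uniquely; a smaller rank means some links can only be determined in combination.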