956 results for Conditional CAPM
Abstract:
The Generalized Distributive Law (GDL) is a message passing algorithm which can efficiently solve a certain class of computational problems, and includes as special cases the Viterbi algorithm, the BCJR algorithm, the Fast Fourier Transform, and Turbo and LDPC decoding algorithms. In this paper, GDL-based maximum-likelihood (ML) decoding of Space-Time Block Codes (STBCs) is introduced and a sufficient condition for an STBC to admit low GDL decoding complexity is given. Fast decoding and multigroup decoding are the two approaches used in the literature to ML decode STBCs with low complexity. An algorithm which exploits the advantages of both is called Conditional ML (CML) decoding. It is shown in this paper that the GDL decoding complexity of any STBC is upper bounded by its CML decoding complexity, and that there exist codes for which the GDL complexity is strictly less than the CML complexity. Explicit examples of two such families of STBCs are given in this paper. Thus CML decoding is in general suboptimal in reducing the ML decoding complexity of a code, and one should design codes with low GDL complexity rather than low CML complexity.
Abstract:
The present work involves a computational study of soot formation and transport in the case of a laminar acetylene diffusion flame perturbed by a convecting line vortex. The topology of the soot contours (as in an earlier experimental work [4]) has been investigated. More soot was produced when the vortex was introduced from the air side than with a fuel side vortex. The soot topography was also more diffused in the case of the air side vortex. The computational model was found to be in good agreement with the experimental work [4]. The computational simulation enabled a study of the various parameters affecting soot transport. Temperatures were found to be higher in the case of the air side vortex as compared to the fuel side vortex. In the case of the fuel side vortex, abundance of fuel in the vortex core resulted in stoichiometrically rich combustion in the vortex core and a more discrete soot topography; overall soot production was also low. In the case of the air side vortex, abundance of air in the core resulted in higher temperatures and more soot yield. Statistical techniques such as the probability density function, correlation coefficient and conditional probability function were introduced to explain the transient dependence of soot yield and transport on parameters such as temperature and acetylene concentration.
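The conditional statistics mentioned above can be illustrated with a short, generic post-processing sketch (not taken from the paper; the variable names, binning, and threshold below are assumptions):

```python
import numpy as np

def conditional_stats(temperature, soot, n_bins=20):
    """Correlation and conditional statistics of soot yield given temperature.

    A generic illustration of the statistical post-processing described in the
    abstract (correlation coefficient, conditional probability/expectation);
    the binning and threshold choices are assumptions, not the paper's code.
    """
    # Pearson correlation coefficient between the two fields
    corr = np.corrcoef(temperature, soot)[0, 1]

    # Bin temperature, then estimate E[soot | T in bin] and P(soot > threshold | T in bin)
    edges = np.linspace(temperature.min(), temperature.max(), n_bins + 1)
    idx = np.clip(np.digitize(temperature, edges) - 1, 0, n_bins - 1)
    threshold = np.median(soot)

    cond_mean = np.full(n_bins, np.nan)
    cond_prob = np.full(n_bins, np.nan)
    for b in range(n_bins):
        mask = idx == b
        if mask.any():
            cond_mean[b] = soot[mask].mean()
            cond_prob[b] = (soot[mask] > threshold).mean()
    centers = 0.5 * (edges[:-1] + edges[1:])
    return corr, centers, cond_mean, cond_prob

# Example with synthetic fields standing in for simulation output
rng = np.random.default_rng(0)
T = rng.uniform(300.0, 2100.0, 10_000)          # temperature samples [K]
fv = np.exp(-((T - 1700.0) / 300.0) ** 2) + 0.05 * rng.standard_normal(T.size)
print(conditional_stats(T, fv)[0])
```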
Abstract:
The problem of designing good space-time block codes (STBCs) with low maximum-likelihood (ML) decoding complexity has gathered much attention in the literature. All the known low ML decoding complexity techniques utilize the same approach of exploiting either the multigroup decodable or the fast-decodable (conditionally multigroup decodable) structure of a code. We refer to this well-known technique of decoding STBCs as conditional ML (CML) decoding. In this paper, we introduce a new framework to construct ML decoders for STBCs based on the generalized distributive law (GDL) and the factor-graph-based sum-product algorithm. We say that an STBC is fast GDL decodable if the order of GDL decoding complexity of the code, with respect to the constellation size M, is strictly less than M^λ, where λ is the number of independent symbols in the STBC. We give sufficient conditions for an STBC to admit fast GDL decoding, and show that both multigroup and conditionally multigroup decodable codes are fast GDL decodable. For any STBC, whether fast GDL decodable or not, we show that the GDL decoding complexity is strictly less than the CML decoding complexity. For instance, for any STBC obtained from cyclic division algebras which is not multigroup or conditionally multigroup decodable, the GDL decoder provides about a 12-fold reduction in complexity compared to the CML decoder. Similarly, for the Golden code, which is conditionally multigroup decodable, the GDL decoder is only half as complex as the CML decoder.
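The complexity saving offered by the GDL can be illustrated with a minimal min-sum message-passing example on a chain of pairwise metrics; this is a generic illustration of the distributive-law idea, not the STBC decoder of the paper:

```python
import numpy as np
import itertools

def min_sum_chain(factors):
    """Minimize sum_i f_i(x_i, x_{i+1}) over all assignments by message passing.

    Each element of `factors` is an (M x M) cost table f_i. Brute force costs
    O(M**n); the distributive law (GDL, min-sum form) reduces this to O(n * M**2).
    Generic illustration only, not the paper's decoder.
    """
    msg = np.zeros(factors[0].shape[0])          # message into the first variable
    for f in factors:
        # minimize over the left variable, for every value of the right variable
        msg = np.min(msg[:, None] + f, axis=0)
    return msg.min()

# Tiny check against brute force: M = 4 symbols, chain of 5 pairwise metrics
rng = np.random.default_rng(1)
fs = [rng.random((4, 4)) for _ in range(5)]

brute = min(sum(f[a, b] for f, (a, b) in zip(fs, zip(assign, assign[1:])))
            for assign in itertools.product(range(4), repeat=6))
assert np.isclose(min_sum_chain(fs), brute)
print(min_sum_chain(fs))
```

The same reorganization of minimization and summation is what allows a GDL decoder to avoid enumerating all M^λ symbol vectors.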
Abstract:
Chebyshev-inequality-based convex relaxations of Chance-Constrained Programs (CCPs) are shown to be useful for learning classifiers on massive datasets. In particular, an algorithm that integrates efficient clustering procedures and CCP approaches for computing classifiers on large datasets is proposed. The key idea is to identify high-density regions, or clusters, from the individual class conditional densities and then use a CCP formulation to learn a classifier on the clusters. The CCP formulation ensures that most of the data points in a cluster are correctly classified by employing a Chebyshev-inequality-based convex relaxation. This relaxation depends heavily on second-order statistics; however, this formulation, and in general such relaxations that depend on the second-order moments, is susceptible to moment estimation errors. One of the contributions of the paper is to propose several formulations that are robust to such errors. In particular, a generic way of making such formulations robust to moment estimation errors is illustrated using two novel confidence sets. An important contribution is to show that when either of the confidence sets is employed, for the special case of a spherical normal distribution of clusters, the robust variant of the formulation can be posed as a second-order cone program. Empirical results show that the robust formulations achieve accuracies comparable to those obtained with the true moments, even when the moment estimates are erroneous. The results also illustrate the benefits of employing the proposed methodology for robust classification of large-scale datasets.
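The Chebyshev-based relaxation referred to above typically rests on the one-sided multivariate Chebyshev (Marshall-Olkin) bound; the notation below is illustrative and not necessarily the paper's:

```latex
% For any random vector x with mean \mu and covariance \Sigma,
\[
  \inf_{x \sim (\mu,\Sigma)} \Pr\{\, w^\top x \ge b \,\} \;\ge\; \eta
  \quad\Longleftrightarrow\quad
  w^\top \mu - b \;\ge\; \kappa(\eta)\,\sqrt{w^\top \Sigma\, w},
  \qquad \kappa(\eta) = \sqrt{\eta/(1-\eta)} .
\]
% Requiring that most points of a cluster (mean \mu, covariance \Sigma) fall on
% the correct side of the hyperplane w^\top x = b therefore reduces to a
% second-order cone constraint in (w, b), keeping the learning problem convex.
```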
Abstract:
Transaction processing is a key constituent of the IT workload of commercial enterprises (e.g., banks, insurance companies). Even today, in many large enterprises, transaction processing is done by legacy "batch" applications, which run offline and process accumulated transactions. Developers acknowledge the presence of multiple loosely coupled pieces of functionality within individual applications. Identifying such pieces of functionality (which we call "services") is desirable for the maintenance and evolution of these legacy applications. This is a hard problem, which enterprises grapple with, and one without satisfactory automated solutions. In this paper, we propose a novel static-analysis-based solution to the problem of identifying services within transaction-processing programs. We provide a formal characterization of services in terms of control-flow and data-flow properties, which is well-suited to the idioms commonly exhibited by business applications. Our technique combines program slicing with the detection of conditional code regions to identify services in accordance with our characterization. A preliminary evaluation, based on a manual analysis of three real business programs, indicates that our approach can be effective in identifying useful services from batch applications.
Abstract:
The problem of designing good Space-Time Block Codes (STBCs) with low maximum-likelihood (ML) decoding complexity has gathered much attention in the literature. All the known low ML decoding complexity techniques utilize the same approach of exploiting either the multigroup decodable or the fast-decodable (conditionally multigroup decodable) structure of a code. We refer to this well-known technique of decoding STBCs as Conditional ML (CML) decoding. In [1], we introduced a framework to construct ML decoders for STBCs based on the Generalized Distributive Law (GDL) and the factor-graph-based Sum-Product Algorithm, and showed that for two specific families of STBCs, the Toeplitz codes and the Overlapped Alamouti Codes (OACs), the GDL-based ML decoders have strictly less complexity than the CML decoders. In this paper, we introduce a `traceback' step to the GDL decoding algorithm of STBCs, which enables roughly a 4-fold reduction in the complexity of the GDL decoders proposed in [1]. Utilizing this complexity reduction from `traceback', we then show that for any STBC (not just the Toeplitz and Overlapped Alamouti Codes), the GDL decoding complexity is strictly less than the CML decoding complexity. For instance, for any STBC obtained from Cyclic Division Algebras that is not multigroup or conditionally multigroup decodable, the GDL decoder provides approximately a 12-fold reduction in complexity compared to the CML decoder. Similarly, for the Golden code, which is conditionally multigroup decodable, the GDL decoder is only about half as complex as the CML decoder.
Abstract:
Stochastic modelling is a useful way of simulating complex hard-rock aquifers, as hydrological properties (permeability, porosity etc.) can be described using random variables with known statistics. However, very few studies have assessed the influence of topological uncertainty (i.e. the variability of thickness of conductive zones in the aquifer), probably because it is not easy to retrieve accurate statistics of the aquifer geometry, especially in a hard-rock context. In this paper, we assessed the potential of using geophysical surveys to describe the geometry of a hard-rock aquifer in a stochastic modelling framework. The study site was a small experimental watershed in South India, where the aquifer consisted of a clayey to loamy-sandy zone (regolith) underlain by a conductive fissured rock layer (protolith) and the unweathered gneiss (bedrock) at the bottom. The spatial variability of the thickness of the regolith and fissured layers was estimated from electrical resistivity tomography (ERT) profiles, which were performed along a few cross sections in the watershed. For stochastic analysis using Monte Carlo simulation, the generated random layer thickness was made conditional on the available geophysical data. In order to simulate steady-state flow in the irregular domain with variable geometry, we used an isoparametric finite element method to discretize the flow equation over an unstructured grid with irregular hexahedral elements. The results indicated that the spatial variability of the layer thickness had a significant effect on reducing the simulated effective steady seepage flux and that using the conditional simulations reduced the uncertainty of the simulated seepage flux. In conclusion, combining information on the aquifer geometry obtained from geophysical surveys with stochastic modelling is a promising methodology to improve the simulation of groundwater flow in complex hard-rock aquifers. (C) 2013 Elsevier B.V. All rights reserved.
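A minimal sketch of conditional simulation of layer thickness (a 1-D profile conditioned on a few observed values, with an assumed exponential covariance model; not the study's actual geostatistical setup) is:

```python
import numpy as np

def conditional_thickness_sim(x_grid, x_obs, z_obs, sill=4.0, corr_len=200.0,
                              n_real=100, seed=0):
    """Conditional Gaussian simulation of layer thickness along a 1-D profile.

    A minimal sketch of making simulated thickness fields honour observed
    (e.g. ERT-derived) values by drawing from the multivariate normal
    conditioned on the observations. The covariance model, its parameters and
    the 1-D geometry are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)

    def cov(a, b):
        d = np.abs(a[:, None] - b[None, :])
        return sill * np.exp(-d / corr_len)

    m0 = z_obs.mean()                                  # simple constant prior mean
    C_oo = cov(x_obs, x_obs) + 1e-8 * np.eye(len(x_obs))
    C_go = cov(x_grid, x_obs)
    C_gg = cov(x_grid, x_grid)

    W = np.linalg.solve(C_oo, C_go.T).T                # kriging weights
    mu_c = m0 + W @ (z_obs - m0)                       # conditional mean
    C_c = C_gg - W @ C_go.T                            # conditional covariance
    L = np.linalg.cholesky(C_c + 1e-8 * np.eye(len(x_grid)))

    # Each realization reproduces the observed thicknesses (up to the jitter)
    return mu_c + (L @ rng.standard_normal((len(x_grid), n_real))).T

# Example: condition on three observed thickness values along a 1 km profile
grid = np.linspace(0.0, 1000.0, 101)
reals = conditional_thickness_sim(grid, np.array([100.0, 450.0, 900.0]),
                                  np.array([12.0, 18.0, 9.0]))
print(reals.shape)  # (100, 101)
```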
Abstract:
Structural Support Vector Machines (SSVMs) and Conditional Random Fields (CRFs) are popular discriminative methods used for classifying structured and complex objects like parse trees, image segments and part-of-speech tags. The datasets involved are very high-dimensional, and the models designed using typical training algorithms for SSVMs and CRFs are non-sparse. This non-sparse nature of the models results in slow inference; thus, there is a need to devise new algorithms for sparse SSVM and CRF classifier design. Use of the elastic net and the L1-regularizer has already been explored for solving primal CRF and SSVM problems, respectively, to design sparse classifiers. In this work, we focus on the dual elastic-net-regularized SSVM and CRF. By exploiting the weakly coupled structure of these convex programming problems, we propose a new sequential alternating proximal (SAP) algorithm to solve these dual problems. This algorithm works by sequentially visiting each training set example and solving a simple subproblem restricted to a small subset of variables associated with that example. Numerical experiments on various benchmark sequence labeling datasets demonstrate that the proposed algorithm scales well. Further, the classifiers designed are sparser than those designed by solving the respective primal problems and demonstrate comparable generalization performance. Thus, the proposed SAP algorithm is a useful alternative for sparse SSVM and CRF classifier design.
Abstract:
Overland rain retrieval using spaceborne microwave radiometers presents a myriad of complications, as land presents itself as a radiometrically warm and highly variable background. Hence, land rainfall algorithms of the Tropical Rainfall Measuring Mission (TRMM) Microwave Imager (TMI) have traditionally incorporated empirical relations of microwave brightness temperature (Tb) with rain rate, rather than relying on physically based radiative transfer modeling of rainfall (as implemented in the TMI ocean algorithm). In this paper, a sensitivity analysis is conducted, using the Spearman rank correlation coefficient as the benchmark, to estimate the best combination of TMI low-frequency channels, i.e., the one most sensitive to the near-surface rainfall rate from the TRMM Precipitation Radar (PR). Results indicate that the TMI channel combinations not only contain information about rainfall wherein liquid water drops are the dominant hydrometeors but also aid in surface noise reduction over a predominantly vegetative land surface background. Furthermore, the variations of the rainfall signature in these channel combinations are not understood properly, due to their inherent uncertainties and highly nonlinear relationship with rainfall. Copula theory is a powerful tool to characterize the dependence between complex hydrological variables as well as to aid in uncertainty modeling by ensemble generation. Hence, this paper proposes a regional model using Archimedean copulas to study the dependence of TMI channel combinations on precipitation over the land regions of the Mahanadi basin, India, using version 7 orbital data from the passive and active sensors on board TRMM, namely, TMI and PR. Studies conducted for different rainfall regimes over the study area show the suitability of the Clayton and Gumbel copulas for modeling convective and stratiform rainfall types for the majority of the intraseasonal months. Furthermore, large ensembles of TMI Tb (from the most sensitive TMI channel combination) were generated conditional on various quantiles (25th, 50th, 75th, and 95th) of the convective and the stratiform rainfall. Comparatively greater ambiguity was observed in modeling extreme values of the convective rain type. Finally, the efficiency of the proposed model was tested by comparing the results with traditionally employed linear and quadratic models. Results reveal the superior performance of the proposed copula-based technique.
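Conditional ensemble generation from an Archimedean copula can be sketched for the Clayton family as follows; the parameter value and marginals below are placeholders, not the fitted regional model:

```python
import numpy as np

def clayton_conditional_sample(u, theta, n, seed=0):
    """Draw n samples of V | U = u from a Clayton copula with parameter theta > 0.

    A minimal sketch of the conditional-ensemble idea in the abstract
    (generating Tb ensembles conditional on a rainfall quantile); the value of
    theta and the marginals are illustrative placeholders, not fitted values.
    """
    t = np.random.default_rng(seed).uniform(size=n)
    # Closed-form inverse of the Clayton conditional distribution C(v | u)
    return (u ** (-theta) * (t ** (-theta / (theta + 1.0)) - 1.0) + 1.0) ** (-1.0 / theta)

# Example: rainfall fixed at its 75th percentile on the copula (uniform) scale,
# with an arbitrary dependence parameter for illustration
v = clayton_conditional_sample(u=0.75, theta=2.0, n=1000)
# The uniform ensemble would then be mapped back to brightness temperatures
# through the (hypothetical) fitted Tb marginal distribution's quantile function.
print(v.mean(), v.min(), v.max())
```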
Abstract:
The maximum entropy approach to classification is very well studied in applied statistics and machine learning, and almost all the methods that exist in the literature are discriminative in nature. In this paper, we introduce a generative maximum entropy classification method with feature selection for high-dimensional data such as text datasets. To tackle the curse of dimensionality of large data sets, we employ a conditional independence assumption (Naive Bayes) and perform feature selection simultaneously, by enforcing `maximum discrimination' between the estimated class conditional densities. For two-class problems, the proposed method uses Jeffreys (J) divergence to discriminate between the class conditional densities. To extend the method to the multi-class case, we propose a completely new approach based on a multi-distribution divergence: we replace Jeffreys divergence with Jensen-Shannon (JS) divergence to discriminate the conditional densities of multiple classes. In order to reduce computational complexity, we employ a modified Jensen-Shannon divergence (JS(GM)), based on the AM-GM inequality. We show that the resulting divergence is a natural generalization of Jeffreys divergence to the multiple-distribution case. As far as theoretical justification is concerned, we show that when one intends to select the best features in a generative maximum entropy approach, maximum discrimination using the J-divergence emerges naturally in binary classification. The performance of the proposed algorithms is demonstrated in a comparative study on high-dimensional text and gene expression datasets, which shows that our methods scale up very well with high-dimensional data.
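For reference, the Jeffreys and Jensen-Shannon divergences used above are straightforward to compute for discrete class conditional densities; the sketch below uses the standard definitions and does not reproduce the paper's JS(GM) variant (which, per the abstract, modifies the mixture via the AM-GM inequality):

```python
import numpy as np

def kl(p, q, eps=1e-12):
    """Kullback-Leibler divergence KL(p || q) for discrete distributions."""
    p, q = np.asarray(p, float) + eps, np.asarray(q, float) + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))

def jeffreys(p, q):
    """Jeffreys (J) divergence: symmetrized KL, used for the two-class case."""
    return kl(p, q) + kl(q, p)

def jensen_shannon(ps, ws=None):
    """Jensen-Shannon divergence: JS = H(sum_i w_i p_i) - sum_i w_i H(p_i)."""
    ps = [np.asarray(p, float) / np.sum(p) for p in ps]
    ws = np.full(len(ps), 1.0 / len(ps)) if ws is None else np.asarray(ws, float)
    mix = sum(w * p for w, p in zip(ws, ps))
    entropy = lambda p: -np.sum(p * np.log(p + 1e-12))
    return float(entropy(mix) - sum(w * entropy(p) for w, p in zip(ws, ps)))

# Two hypothetical class-conditional word distributions
p1, p2 = [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]
print(jeffreys(p1, p2), jensen_shannon([p1, p2]))
```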
Abstract:
The association of sigma factors with the RNA polymerase dictates the expression profile of a bacterial cell. Major changes to the transcription profile are achieved by the use of multiple sigma factors that confer distinct promoter selectivity to the holoenzyme. The cellular concentration of a sigma factor is regulated by diverse mechanisms involving transcription, translation and post-translational events. The number of sigma factors varies substantially across bacteria. The diversity in the interactions between sigma factors also varies, ranging from collaboration and competition to partial redundancy in some cellular or environmental contexts. These interactions can be rationalized by a mechanistic model referred to as the partitioning of sigma space model of bacterial transcription. The structural similarity between different sigma/anti-sigma complexes despite poor sequence conservation and cellular localization reveals an elegant route to incorporate diverse regulatory mechanisms within a structurally conserved scaffold. These features are described here with a focus on sigma/anti-sigma complexes from Mycobacterium tuberculosis. In particular, we discuss recent data on the conditional regulation of sigma/anti-sigma factor interactions. Specific stages of M. tuberculosis infection, such as the latent phase, as well as the remarkable adaptability of this pathogen to diverse environmental conditions, can be rationalized by the synchronized action of different sigma factors.
Abstract:
Global conservation policy is increasingly debating the feasibility of reconciling wildlife conservation and human resource requirements in land uses outside protected areas (PAs). However, there are few quantitative assessments of whether, or to what extent, these `wildlife-friendly' land uses fulfill a fundamental function of PAs: to separate biodiversity from anthropogenic threats. We distinguish the role of wildlife-friendly land uses as being (a) subsidiary, whereby they augment PAs with secondary habitat, or (b) substitutive, wherein they provide comparable habitat to PAs. We tested our hypotheses by investigating the influence of land use and human presence on space-use intensity of the endangered Asian elephant (Elephas maximus) in a fragmented landscape comprising PAs and wildlife-friendly land uses. We applied multistate occupancy models to spatial data on elephant occurrence to estimate and model the overall probability of elephants using a site, and the conditional probability of high-intensity use given that elephants use a site. The probability of elephants using a site, regardless of intensity, did not vary between PAs and wildlife-friendly land uses. However, high-intensity use declined with distance to PAs, and this effect was accentuated by an increase in village density. Therefore, while wildlife-friendly land uses did play a subsidiary conservation role, their potential to substitute for PAs was offset by a strong human presence. Our findings demonstrate the need to evaluate the role of wildlife-friendly land uses in landscape-scale conservation; for species that have conflicting resource requirements with people, PAs are likely to provide crucial refuge from growing anthropogenic threats. (C) 2014 Elsevier Ltd. All rights reserved.
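A standard multistate occupancy parameterization consistent with the quantities estimated above (the symbols are ours, not necessarily the paper's) is:

```latex
% psi = probability that a site is used by elephants at all,
% R   = conditional probability of high-intensity use given that the site is used.
\[
  \Pr(\text{no use}) = 1-\psi, \qquad
  \Pr(\text{low-intensity use}) = \psi\,(1-R), \qquad
  \Pr(\text{high-intensity use}) = \psi R ,
\]
% with psi and R typically modeled on the logit scale as functions of covariates
% such as land use, distance to PAs, and village density.
```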
Abstract:
Developments in statistical extreme value theory, which allow non-stationary modeling of changes in the frequency and severity of extremes, are explored to analyze changes in the return levels of droughts for the Colorado River. The transient future return levels (conditional quantiles) derived from regional drought projections using appropriate extreme value models are compared with those from observed naturalized streamflows. The time of detection is computed as the time at which significant differences exist between the observed and future extreme drought levels, accounting for the uncertainties in their estimates. Projections from multiple climate model-scenario combinations are considered; no uniform pattern of changes in drought quantiles is observed across all the projections. While some projections indicate a shift to another stationary regime, for many projections which are found to be non-stationary, detection of change in the tail quantiles of droughts occurs within the 21st century, with no unanimity in the time of detection. Earlier detection is observed for drought levels with a higher probability of exceedance. (C) 2014 Elsevier Ltd. All rights reserved.
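A minimal sketch of a non-stationary extreme value fit with a linear trend in the location parameter, of the kind such analyses typically use (the trend form, synthetic data, and use of scipy's genextreme with shape c = -xi are illustrative assumptions, not the study's model), is:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import genextreme

def fit_nonstationary_gev(x, t):
    """Fit a GEV with a time-varying location parameter, mu(t) = mu0 + mu1 * t."""
    def nll(params):
        mu0, mu1, log_sigma, c = params
        # Negative log-likelihood; -inf log-densities outside the support give +inf
        return -np.sum(genextreme.logpdf(x, c, loc=mu0 + mu1 * t,
                                         scale=np.exp(log_sigma)))
    start = np.array([x.mean(), 0.0, np.log(x.std()), 0.1])
    return minimize(nll, start, method="Nelder-Mead").x

def return_level(params, t, p=0.01):
    """Conditional quantile at time t (e.g. the 100-year level for p = 0.01)."""
    mu0, mu1, log_sigma, c = params
    return genextreme.ppf(1.0 - p, c, loc=mu0 + mu1 * t, scale=np.exp(log_sigma))

# Synthetic annual extremes with a drifting location, standing in for projections
years = np.arange(50)
data = genextreme.rvs(-0.1, loc=10.0 + 0.05 * years, scale=2.0,
                      size=years.size, random_state=0)
p_hat = fit_nonstationary_gev(data, years)
print(return_level(p_hat, t=49))
```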
Abstract:
The structural annotation of proteins with no detectable homologs of known 3D structure identified using sequence-search methods is a major challenge today. We propose an original method that computes the conditional probabilities for the amino-acid sequence of a protein to fit known protein 3D structures using a structural alphabet, known as Protein Blocks (PBs). PBs constitute a library of 16 local structural prototypes that approximate every part of protein backbone structures. It is used to encode 3D protein structures into 1D PB sequences and to capture sequence-to-structure relationships. Our method relies on amino acid occurrence matrices, one for each PB, to score global and local threading of query amino acid sequences onto protein folds encoded into PB sequences. It does not use any information from residue contacts or sequence-search methods, or any explicit incorporation of the hydrophobic effect. The performance of the method was assessed with independent test datasets derived from SCOP 1.75A. With a Z-score cutoff that achieved 95% specificity (i.e., less than 5% false positives), global and local threading showed sensitivities of 64.1% and 34.2%, respectively. We further tested its performance on 57 difficult CASP10 targets that had no known homologs in the PDB: 38 compatible templates were identified by our approach, and 66% of these hits yielded correctly predicted structures. This method scales up well and offers promising perspectives for structural annotation at the genomic level. It has been implemented in the form of a web server that is freely available at http://www.bo-protscience.fr/forsa.
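A minimal sketch of the occurrence-matrix scoring idea described above (the matrices, background model, and Z-score procedure below are illustrative placeholders, not the server's actual data or implementation) is:

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PB_STATES = "abcdefghijklmnop"            # the 16 Protein Blocks

rng = np.random.default_rng(0)
# Hypothetical occurrence matrices: P(amino acid | PB), one row per PB
occ = rng.dirichlet(np.ones(len(AMINO_ACIDS)), size=len(PB_STATES))

def threading_score(aa_seq, pb_seq):
    """Sum of log-odds of observing each residue given the aligned PB state."""
    bg = 1.0 / len(AMINO_ACIDS)           # uniform background frequency (assumed)
    return sum(np.log(occ[PB_STATES.index(pb), AMINO_ACIDS.index(aa)] / bg)
               for aa, pb in zip(aa_seq, pb_seq))

def z_score(aa_seq, pb_seq, n_shuffle=200):
    """Z-score of the alignment score against shuffled-sequence scores."""
    real = threading_score(aa_seq, pb_seq)
    null = [threading_score("".join(rng.permutation(list(aa_seq))), pb_seq)
            for _ in range(n_shuffle)]
    return (real - np.mean(null)) / np.std(null)

print(z_score("ACDEFGHIKLMNPQRSTVWY", "abcdefghijklmnopabcd"))
```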