1000 resultados para Linear coregionalization
Resumo:
Background: A genetic network can be represented as a directed graph in which a node corresponds to a gene and a directed edge specifies the direction of influence of one gene on another. The reconstruction of such networks from transcript profiling data remains an important yet challenging endeavor. A transcript profile specifies the abundances of many genes in a biological sample of interest. Prevailing strategies for learning the structure of a genetic network from high-dimensional transcript profiling data assume sparsity and linearity. Many methods consider relatively small directed graphs, inferring graphs with up to a few hundred nodes. This work examines large undirected graphs representations of genetic networks, graphs with many thousands of nodes where an undirected edge between two nodes does not indicate the direction of influence, and the problem of estimating the structure of such a sparse linear genetic network (SLGN) from transcript profiling data. Results: The structure learning task is cast as a sparse linear regression problem which is then posed as a LASSO (l1-constrained fitting) problem and solved finally by formulating a Linear Program (LP). A bound on the Generalization Error of this approach is given in terms of the Leave-One-Out Error. The accuracy and utility of LP-SLGNs is assessed quantitatively and qualitatively using simulated and real data. The Dialogue for Reverse Engineering Assessments and Methods (DREAM) initiative provides gold standard data sets and evaluation metrics that enable and facilitate the comparison of algorithms for deducing the structure of networks. The structures of LP-SLGNs estimated from the INSILICO1, INSILICO2 and INSILICO3 simulated DREAM2 data sets are comparable to those proposed by the first and/or second ranked teams in the DREAM2 competition. The structures of LP-SLGNs estimated from two published Saccharomyces cerevisae cell cycle transcript profiling data sets capture known regulatory associations. In each S. cerevisiae LP-SLGN, the number of nodes with a particular degree follows an approximate power law suggesting that its degree distributions is similar to that observed in real-world networks. Inspection of these LP-SLGNs suggests biological hypotheses amenable to experimental verification. Conclusion: A statistically robust and computationally efficient LP-based method for estimating the topology of a large sparse undirected graph from high-dimensional data yields representations of genetic networks that are biologically plausible and useful abstractions of the structures of real genetic networks. Analysis of the statistical and topological properties of learned LP-SLGNs may have practical value; for example, genes with high random walk betweenness, a measure of the centrality of a node in a graph, are good candidates for intervention studies and hence integrated computational – experimental investigations designed to infer more realistic and sophisticated probabilistic directed graphical model representations of genetic networks. The LP-based solutions of the sparse linear regression problem described here may provide a method for learning the structure of transcription factor networks from transcript profiling and transcription factor binding motif data.
Resumo:
L-Alanylglycyl-L-alanine, C8H15N3O4, exists as zwitter-ion in the crystal with the N terminus protonated and the C terminus in an ionized form, Both the peptide units are in trans configurations and deviate significantly from planarity. Backbone torsion angles are psi(1)=172.7(2), omega(1)=-178.2(2), phi(2)=91.7(2), phi(2)=-151.9(2), omega(2)=-176.9(2), phi(3)=-71.3(2), phi(31)=-7.0(3) and psi(32) 172.4(2)degrees. The protonated NH3+ group forms three hydrogen bonds with atoms of symmetry-related molecules.
Resumo:
In this paper, expressions for convolution multiplication properties of DCT IV and DST IV are derived starting from equivalent DFT representations. Using these expressions methods for implementing linear filtering through block convolution in the DCT IV and DST IV domain are proposed. Techniques developed for DCT IV and DST IV are further extended to MDCT and MDST where the filter implementation is near exact for symmetric filters and approximate for non-symmetric filters. No additional overlapping is required for implementing the symmetric filtering in the MDCT domain and hence the proposed algorithm is computationally competitive with DFT based systems. Moreover, inherent 50% overlap between the adjacent frames used for MDCT/MDST domain reduces the blocking artifacts due to block processing or quantization. The techniques are computationally efficient for symmetric filters and provides a new alternative to DFT based convolution.
Resumo:
The distribution of black leaf nodes at each level of a linear quadtree is of significant interest in the context of estimation of time and space complexities of linear quadtree based algorithms. The maximum number of black nodes of a given level that can be fitted in a square grid of size 2n × 2n can readily be estimated from the ratio of areas. We show that the actual value of the maximum number of nodes of a level is much less than the maximum obtained from the ratio of the areas. This is due to the fact that the number of nodes possible at a level k, 0≤k≤n − 1, should consider the sum of areas occupied by the actual number of nodes present at levels k + 1, k + 2, …, n − 1.
Resumo:
Abstract-To detect errors in decision tables one needs to decide whether a given set of constraints is feasible or not. This paper describes an algorithm to do so when the constraints are linear in variables that take only integer values. Decision tables with such constraints occur frequently in business data processing and in nonnumeric applications. The aim of the algorithm is to exploit. the abundance of very simple constraints that occur in typical decision table contexts. Essentially, the algorithm is a backtrack procedure where the the solution space is pruned by using the set of simple constrains. After some simplications, the simple constraints are captured in an acyclic directed graph with weighted edges. Further, only those partial vectors are considered from extension which can be extended to assignments that will at least satisfy the simple constraints. This is how pruning of the solution space is achieved. For every partial assignment considered, the graph representation of the simple constraints provides a lower bound for each variable which is not yet assigned a value. These lower bounds play a vital role in the algorithm and they are obtained in an efficient manner by updating older lower bounds. Our present algorithm also incorporates an idea by which it can be checked whether or not an (m - 2)-ary vector can be extended to a solution vector of m components, thereby backtracking is reduced by one component.
Resumo:
We propose an iterative estimating equations procedure for analysis of longitudinal data. We show that, under very mild conditions, the probability that the procedure converges at an exponential rate tends to one as the sample size increases to infinity. Furthermore, we show that the limiting estimator is consistent and asymptotically efficient, as expected. The method applies to semiparametric regression models with unspecified covariances among the observations. In the special case of linear models, the procedure reduces to iterative reweighted least squares. Finite sample performance of the procedure is studied by simulations, and compared with other methods. A numerical example from a medical study is considered to illustrate the application of the method.
Resumo:
Statistical methods are often used to analyse commercial catch and effort data to provide standardised fishing effort and/or a relative index of fish abundance for input into stock assessment models. Achieving reliable results has proved difficult in Australia's Northern Prawn Fishery (NPF), due to a combination of such factors as the biological characteristics of the animals, some aspects of the fleet dynamics, and the changes in fishing technology. For this set of data, we compared four modelling approaches (linear models, mixed models, generalised estimating equations, and generalised linear models) with respect to the outcomes of the standardised fishing effort or the relative index of abundance. We also varied the number and form of vessel covariates in the models. Within a subset of data from this fishery, modelling correlation structures did not alter the conclusions from simpler statistical models. The random-effects models also yielded similar results. This is because the estimators are all consistent even if the correlation structure is mis-specified, and the data set is very large. However, the standard errors from different models differed, suggesting that different methods have different statistical efficiency. We suggest that there is value in modelling the variance function and the correlation structure, to make valid and efficient statistical inferences and gain insight into the data. We found that fishing power was separable from the indices of prawn abundance only when we offset the impact of vessel characteristics at assumed values from external sources. This may be due to the large degree of confounding within the data, and the extreme temporal changes in certain aspects of individual vessels, the fleet and the fleet dynamics.
Resumo:
The non-linear equations of motion of a rotating blade undergoing extensional and flapwise bending vibration are derived, including non-linearities up to O (ε3). The strain-displacement relationship derived is compared with expressions derived by earlier investigators and the errors and the approximations made in some of those are brought out. The equations of motion are solved under the inextensionality condition to obtain the influence of the amplitude on the fundamental flapwise natural frequency of the rotating blade. It is found that large finite amplitudes have a softening effect on the flapwise frequency and that this influence becomes stronger at higher speeds of rotation.
Resumo:
The paper presents two new algorithms for the direct parallel solution of systems of linear equations. The algorithms employ a novel recursive doubling technique to obtain solutions to an nth-order system in n steps with no more than 2n(n −1) processors. Comparing their performance with the Gaussian elimination algorithm (GE), we show that they are almost 100% faster than the latter. This speedup is achieved by dispensing with all the computation involved in the back-substitution phase of GE. It is also shown that the new algorithms exhibit error characteristics which are superior to GE. An n(n + 1) systolic array structure is proposed for the implementation of the new algorithms. We show that complete solutions can be obtained, through these single-phase solution methods, in 5n−log2n−4 computational steps, without the need for intermediate I/O operations.
Resumo:
Functional dependencies in relational databases are investigated. Eight binary relations, viz., (1) dependency relation, (2) equipotence relation, (3) dissidence relation, (4) completion relation, and dual relations of each of them are described. Any one of these eight relations can be used to represent the functional dependencies in a database. Results from linear graph theory are found helpful in obtaining these representations. The dependency relation directly gives the functional dependencies. The equipotence relation specifies the dependencies in terms of attribute sets which functionally determine each other. The dissidence relation specifies the dependencies in terms of saturated sets in a very indirect way. Completion relation represents the functional dependencies as a function, the range of which turns out to be a lattice. Depletion relation which is the dual of the completion relation can also represent functional dependencies and similarly can the duals of dependency, equipotence, and dissidence relations. The class of depleted sets, which is the dual of saturated sets, is defined and used in the study of depletion relations.
Resumo:
Analogue and digital techniques for linearization of non-linear input-output relationship of transducers are briefly reviewed. The condition required for linearizing a non-linear function y = f(x) using a non-linear analogue-to-digital converter, is explained. A simple technique to construct a non-linear digital-to-analogue converter, based on ' segments of equal digital interval ' is described. The technique was used to build an N-DAC which can be employed in a successive approximation or counter-ramp type ADC to linearize the non-linear transfer function of a thermistor-resistor combination. The possibility of achieving an order of magnitude higher accuracy in the measurement of temperature is shown.
Resumo:
Lateral or transaxial truncation of cone-beam data can occur either due to the field of view limitation of the scanning apparatus or iregion-of-interest tomography. In this paper, we Suggest two new methods to handle lateral truncation in helical scan CT. It is seen that reconstruction with laterally truncated projection data, assuming it to be complete, gives severe artifacts which even penetrates into the field of view. A row-by-row data completion approach using linear prediction is introduced for helical scan truncated data. An extension of this technique known as windowed linear prediction approach is introduced. Efficacy of the two techniques are shown using simulation with standard phantoms. A quantitative image quality measure of the resulting reconstructed images are used to evaluate the performance of the proposed methods against an extension of a standard existing technique.
Resumo:
This paper presents an approach, based on Lean production philosophy, for rationalising the processes involved in the production of specification documents for construction projects. Current construction literature erroneously depicts the process for the creation of construction specifications as a linear one. This traditional understanding of the specification process often culminates in process-wastes. On the contrary, the evidence suggests that though generalised, the activities involved in producing specification documents are nonlinear. Drawing on the outcome of participant observation, this paper presents an optimised approach for representing construction specifications. Consequently, the actors typically involved in producing specification documents are identified, the processes suitable for automation are highlighted and the central role of tacit knowledge is integrated into a conceptual template of construction specifications. By applying the transformation, flow, value (TFV) theory of Lean production the paper argues that value creation can be realised by eliminating the wastes associated with the traditional preparation of specification documents with a view to integrating specifications in digital models such as Building Information Models (BIM). Therefore, the paper presents an approach for rationalising the TFV theory as a method for optimising current approaches for generating construction specifications based on a revised specification writing model.
Resumo:
In this paper, we generalize the existing rate-one space frequency (SF) and space-time frequency (STF) code constructions. The objective of this exercise is to provide a systematic design of full-diversity STF codes with high coding gain. Under this generalization, STF codes are formulated as linear transformations of data. Conditions on these linear transforms are then derived so that the resulting STF codes achieve full diversity and high coding gain with a moderate decoding complexity. Many of these conditions involve channel parameters like delay profile (DP) and temporal correlation. When these quantities are not available at the transmitter, design of codes that exploit full diversity on channels with arbitrary DIP and temporal correlation is considered. Complete characterization of a class of such robust codes is provided and their bit error rate (BER) performance is evaluated. On the other hand, when channel DIP and temporal correlation are available at the transmitter, linear transforms are optimized to maximize the coding gain of full-diversity STF codes. BER performance of such optimized codes is shown to be better than those of existing codes.
Resumo:
An efficient geometrical design rule checker is proposed, based on operations on quadtrees, which represent VLSI mask layouts. The time complexity of the design rule checker is O(N), where N is the number of polygons in the mask. A pseudoPascal description is provided of all the important algorithms for geometrical design rule verification.