24 results for linear-regression

at Indian Institute of Science - Bangalore - India


Relevance:

100.00%

Publisher:

Abstract:

Chemical composition of rainwater changes from sea to inland under the influence of several major factors: the topographic location of the area, its distance from the sea, and annual rainfall. A model is developed here to quantify the variation in precipitation chemistry under the influence of inland distance and rainfall amount. Various sites in India categorized as 'urban', 'suburban' and 'rural' have been considered for model development. pH, HCO3, NO3 and Mg do not change much from coast to inland, while changes in SO4 and Ca are subject to local emissions. Cl and Na originate solely from sea salinity and are the chemistry parameters in the model. Non-linear multiple regressions performed for the various categories revealed that both rainfall amount and precipitation chemistry obeyed a power law reduction with distance from the sea. Cl and Na decrease rapidly for the first 100 km from the sea, then decrease marginally for the next 100 km, and later stabilize. Regression parameters estimated for different cases were found to be consistent (R^2 ≈ 0.8). Variation in one of the parameters accounted for urbanization. The model was validated using data points from the southern peninsular region of the country. Estimates are found to be within the 99.9% confidence interval. Finally, this relationship between the three parameters (rainfall amount, coastline distance, and concentration in terms of Cl and Na) was validated with experiments conducted in a small experimental watershed in south-west India. Chemistry estimated using the model was in good correlation with observed values, with a relative error of about 5%. Monthly variation in the chemistry is predicted from a downscaling model and then compared with the observed data. Hence, the model developed for rain chemistry is useful in estimating the concentrations at different spatio-temporal scales and is especially applicable for the south-west region of India. (C) 2008 Elsevier Ltd. All rights reserved.
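
A power law decay of concentration with distance from the sea can be illustrated with a small sketch. The abstract reports non-linear multiple regression; the version below is a simplification that assumes a single predictor and fits c = a * d**b by ordinary least squares on log-transformed values. The function name, the coefficients and the data are all hypothetical and are not taken from the paper.

```python
import math

def fit_power_law(distances_km, concentrations):
    """Fit c = a * d**b by least squares on (log d, log c)."""
    xs = [math.log(d) for d in distances_km]
    ys = [math.log(c) for c in concentrations]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                        # exponent (negative for a decay)
    a = math.exp(mean_y - b * mean_x)    # scale factor
    return a, b

# Synthetic, noise-free example values (assumed, for illustration only).
distances = [5, 20, 50, 100, 200, 400]
conc = [120.0 * d ** -0.6 for d in distances]
a_hat, b_hat = fit_power_law(distances, conc)
```

With real measurements the log transform changes the error weighting, so a direct non-linear fit may give somewhat different parameters.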

Relevance:

100.00%

Publisher:

Abstract:

Processor architects have the challenging task of evaluating a large design space consisting of several interacting parameters and optimizations. In order to assist architects in making crucial design decisions, we build linear regression models that relate processor performance to micro-architecture parameters, using simulation-based experiments. We obtain good approximate models using an iterative process in which Akaike's information criterion is used to extract a good linear model from a small set of simulations, and limited further simulation is guided by the model using D-optimal experimental designs. The iterative process is repeated until desired error bounds are achieved. We used this procedure to establish the relationship of the CPI performance response to 26 key micro-architectural parameters using a detailed cycle-by-cycle superscalar processor simulator. The resulting models provide a significance ordering on all micro-architectural parameters and their interactions, and explain the performance variations of micro-architectural techniques.
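
The role of Akaike's information criterion in picking a linear model can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes greedy forward selection, uses the common least-squares form AIC = n ln(RSS/n) + 2k, and omits the D-optimal design step entirely. The three columns stand in for hypothetical micro-architectural parameters and the response for a made-up CPI value.

```python
import math
import random

def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_aic(rows, y, columns):
    """Fit intercept + chosen columns by least squares and return the AIC."""
    design = [[1.0] + [row[c] for c in columns] for row in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    rss = sum((t - sum(b * v for b, v in zip(beta, d))) ** 2
              for d, t in zip(design, y))
    n = len(y)
    return n * math.log(rss / n) + 2 * k

def forward_select(rows, y):
    """Greedily add the column that lowers the AIC most; stop when none does."""
    chosen, remaining = [], list(range(len(rows[0])))
    best = fit_aic(rows, y, chosen)
    while remaining:
        score, col = min((fit_aic(rows, y, chosen + [c]), c) for c in remaining)
        if score >= best:
            break
        best = score
        chosen.append(col)
        remaining.remove(col)
    return sorted(chosen), best

# Synthetic data: the response depends on columns 0 and 2 only (assumed).
rng = random.Random(0)
rows = [[rng.random() for _ in range(3)] for _ in range(40)]
y = [2.0 + 3.0 * r[0] - 1.5 * r[2] + rng.gauss(0.0, 0.1) for r in rows]
chosen, aic = forward_select(rows, y)
```

The penalty term 2k is what keeps the search from adding parameters that reduce the residual only slightly.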

Relevance:

100.00%

Publisher:

Abstract:

Multiple input multiple output (MIMO) systems with a large number of antennas have been gaining wide attention as they enable very high throughputs. A major impediment is the complexity at the receiver needed to detect the transmitted data. To this end we propose a new receiver, called LRR (Linear Regression of MMSE Residual), which improves the MMSE receiver by learning a linear regression model for the error of the MMSE receiver. The LRR receiver uses pilot data to estimate the channel, and then uses locally generated training data (not transmitted over the channel) to find the linear regression parameters. The proposed receiver is suitable for applications where the channel remains constant for a long period (slow-fading channels) and performs quite well: at a bit error rate (BER) of 10^-3, the SNR gain over the MMSE receiver is about 7 dB for a 16 x 16 system; for a 64 x 64 system the gain is about 8.5 dB. For large coherence time, the complexity order of the LRR receiver is the same as that of the MMSE receiver, and in simulations we find that it needs about 4 times as many floating point operations. We also show that a further gain of about 4 dB is obtained by local search around the estimate given by the LRR receiver.

Relevance:

70.00%

Publisher:

Abstract:

Background: A genetic network can be represented as a directed graph in which a node corresponds to a gene and a directed edge specifies the direction of influence of one gene on another. The reconstruction of such networks from transcript profiling data remains an important yet challenging endeavor. A transcript profile specifies the abundances of many genes in a biological sample of interest. Prevailing strategies for learning the structure of a genetic network from high-dimensional transcript profiling data assume sparsity and linearity. Many methods consider relatively small directed graphs, inferring graphs with up to a few hundred nodes. This work examines large undirected graph representations of genetic networks, graphs with many thousands of nodes where an undirected edge between two nodes does not indicate the direction of influence, and the problem of estimating the structure of such a sparse linear genetic network (SLGN) from transcript profiling data. Results: The structure learning task is cast as a sparse linear regression problem, which is then posed as a LASSO (l1-constrained fitting) problem and finally solved by formulating a Linear Program (LP). A bound on the Generalization Error of this approach is given in terms of the Leave-One-Out Error. The accuracy and utility of LP-SLGNs are assessed quantitatively and qualitatively using simulated and real data. The Dialogue for Reverse Engineering Assessments and Methods (DREAM) initiative provides gold standard data sets and evaluation metrics that enable and facilitate the comparison of algorithms for deducing the structure of networks. The structures of LP-SLGNs estimated from the INSILICO1, INSILICO2 and INSILICO3 simulated DREAM2 data sets are comparable to those proposed by the first and/or second ranked teams in the DREAM2 competition. The structures of LP-SLGNs estimated from two published Saccharomyces cerevisiae cell cycle transcript profiling data sets capture known regulatory associations.
In each S. cerevisiae LP-SLGN, the number of nodes with a particular degree follows an approximate power law, suggesting that its degree distribution is similar to that observed in real-world networks. Inspection of these LP-SLGNs suggests biological hypotheses amenable to experimental verification. Conclusion: A statistically robust and computationally efficient LP-based method for estimating the topology of a large sparse undirected graph from high-dimensional data yields representations of genetic networks that are biologically plausible and useful abstractions of the structures of real genetic networks. Analysis of the statistical and topological properties of learned LP-SLGNs may have practical value; for example, genes with high random walk betweenness, a measure of the centrality of a node in a graph, are good candidates for intervention studies and hence integrated computational – experimental investigations designed to infer more realistic and sophisticated probabilistic directed graphical model representations of genetic networks. The LP-based solutions of the sparse linear regression problem described here may provide a method for learning the structure of transcription factor networks from transcript profiling and transcription factor binding motif data.

Relevance:

70.00%

Publisher:

Abstract:

In this paper, we present a novel algorithm for piecewise linear regression which can learn continuous as well as discontinuous piecewise linear functions. The main idea is to repeatedly partition the data and learn a linear model in each partition. The proposed algorithm is similar in spirit to the k-means clustering algorithm. We show that our algorithm can also be viewed as a special case of an EM algorithm for maximum likelihood estimation under a reasonable probability model. We empirically demonstrate the effectiveness of our approach by comparing its performance with that of state-of-the-art algorithms on various datasets. (C) 2014 Elsevier Inc. All rights reserved.
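
The alternation between partitioning the data and fitting a linear model per partition can be sketched in a few lines. The abstract does not state the partitioning rule, so this sketch assumes one: each point is reassigned to the line with the smallest squared residual, in the way k-means reassigns points to the nearest centroid. It handles one input dimension only, and every name in it is hypothetical.

```python
def fit_line(points):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0.0:                      # degenerate partition: constant model
        return 0.0, mean_y
    slope = sum((x - mean_x) * (y - mean_y) for x, y in points) / sxx
    return slope, mean_y - slope * mean_x

def piecewise_fit(points, k, max_iter=50):
    """Alternate between fitting k lines and reassigning points to them."""
    ordered = sorted(points)
    size = -(-len(ordered) // k)        # ceiling division
    # Initial partition: k contiguous chunks along the x axis.
    labels = [min(i // size, k - 1) for i in range(len(ordered))]
    models = [(0.0, 0.0)] * k
    for _ in range(max_iter):
        for j in range(k):
            members = [p for p, lab in zip(ordered, labels) if lab == j]
            if members:                 # keep the old model if a group empties
                models[j] = fit_line(members)
        new_labels = [
            min(range(k), key=lambda j: (y - (models[j][0] * x + models[j][1])) ** 2)
            for x, y in ordered
        ]
        if new_labels == labels:        # assignments stable: converged
            break
        labels = new_labels
    return models, list(zip(ordered, labels))

# Synthetic discontinuous target: y = 2x + 1 for x < 1, y = -x + 5 otherwise.
xs = [i * 0.5 - 5 for i in range(20)]
data = [(x, 2 * x + 1 if x < 1 else -x + 5) for x in xs]
models, assignment = piecewise_fit(data, k=2)
```

Because assignment depends on residuals rather than on position along x, nothing forces the fitted pieces to meet at their boundary, which is why a discontinuous target can be represented.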

Relevance:

60.00%

Publisher:

Abstract:

For the problem of speaker adaptation in speech recognition, the performance depends on the availability of adaptation data. In this paper, we have compared several existing speaker adaptation methods, viz. maximum likelihood linear regression (MLLR), eigenvoice (EV), eigenspace-based MLLR (EMLLR), segmental eigenvoice (SEV) and hierarchical eigenvoice (HEV) based methods. We also develop a new method by modifying the existing HEV method to achieve further performance improvement when only limited data is available. With respect to the amount of available adaptation data, the new modified HEV (MHEV) method is shown to perform better than all the existing methods throughout the range of operation, except for MLLR when more adaptation data is available.

Relevance:

60.00%

Publisher:

Abstract:

Temperature data collected over several years from rocket grenade and other experiments at Point Barrow (Alaska), Fort Churchill (Canada) and Wallops Island (Virginia) have been analysed to determine the effect of geomagnetic activity on the neutral temperature in the mesosphere and to study the latitudinal variation of this effect. The analysis has revealed almost certainly significant correlations between the temperature and the geomagnetic indices Kp and Ap at Fort Churchill and marginally significant correlations at Barrow and Wallops. This has also been substantiated by a linear regression analysis. The results indicate two types of interdependence between mesospheric temperature and geomagnetic field variations. The first type is the direct heating effect, during a geomagnetic disturbance, which has been observed in the present analysis with a time lag of 3–15 hr at the high latitudes and 36 hr at the middle latitudes. The magnitude of this heating effect has been found to decrease at the lower altitudes. The second type of interrelation which has been observed is temperature perturbations preceding geomagnetic field variations, both presumably caused by a disturbance in atmospheric circulation at these levels.

Relevance:

60.00%

Publisher:

Abstract:

This paper proposes the use of empirical modeling techniques for building microarchitecture-sensitive models for compiler optimizations. The models we build relate program performance to settings of compiler optimization flags, associated heuristics and key microarchitectural parameters. Unlike traditional analytical modeling methods, this relationship is learned entirely from data obtained by measuring performance at a small number of carefully selected compiler/microarchitecture configurations. We evaluate three different learning techniques in this context, viz. linear regression, adaptive regression splines and radial basis function networks. We use the generated models to a) predict program performance at arbitrary compiler/microarchitecture configurations, b) quantify the significance of complex interactions between optimizations and the microarchitecture, and c) efficiently search for 'optimal' settings of optimization flags and heuristics for any given microarchitectural configuration. Our evaluation using benchmarks from the SPEC CPU2000 suite suggests that accurate models (< 5% average error in prediction) can be generated using a reasonable number of simulations. We also find that using compiler settings prescribed by a model-based search can improve program performance by as much as 19% (with an average of 9.5%) over highly optimized binaries.

Relevance:

60.00%

Publisher:

Abstract:

The absorption produced by the audience in concert halls is considered a random variable. Beranek's proposal [L. L. Beranek, Music, Acoustics and Architecture (Wiley, New York, 1962), p. 543] that audience absorption is proportional to the area they occupy and not to their number is subjected to a statistical hypothesis test. A two variable linear regression model of the absorption with audience area and residual area as regressor variables is postulated for concert halls without added absorptive materials. Since Beranek's contention amounts to the statement that audience absorption is independent of the seating density, the test of the hypothesis lies in categorizing halls by seating density and examining for significant differences among slopes of regression planes of the different categories. Such a test shows that Beranek's hypothesis can be accepted. It is also shown that the audience area is a better predictor of the absorption than the audience number. The absorption coefficients and their 95% confidence limits are given for the audience and residual areas. A critique of the regression model is presented.

Relevance:

60.00%

Publisher:

Abstract:

Predictions of two popular closed-form models for unsaturated hydraulic conductivity (K) are compared with in situ measurements made in a sandy loam field soil. Whereas the Van Genuchten model estimates were very close to field measured values, the Brooks-Corey model predictions were higher by about one order of magnitude in the wetter range. Estimation of parameters of the Van Genuchten soil moisture characteristic (SMC) equation, however, involves the use of non-linear regression techniques. The Brooks-Corey SMC equation has the advantage of being amenable to application of linear regression techniques for estimation of its parameters from retention data. A conversion technique, whereby known Brooks-Corey model parameters may be converted into Van Genuchten model parameters, is formulated. The proposed conversion algorithm may be used to obtain the parameters of the preferred Van Genuchten model from in situ retention data, without the use of non-linear regression techniques.
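
The reason the Brooks-Corey retention equation suits linear regression is that it is a straight line in log-log space: with effective saturation Se = (theta - theta_r) / (theta_s - theta_r) and Se = (h_b / h)**lam for h > h_b, taking logarithms gives ln Se = lam * ln h_b - lam * ln h. The sketch below fits that line. It assumes the residual and saturated water contents are already known, uses synthetic data, and does not attempt the paper's conversion to Van Genuchten parameters, which the abstract does not spell out. All names are hypothetical.

```python
import math

def fit_brooks_corey(suctions, water_contents, theta_r, theta_s):
    """Estimate lam and h_b from a least-squares line through (ln h, ln Se)."""
    xs = [math.log(h) for h in suctions]
    ys = [math.log((t - theta_r) / (theta_s - theta_r)) for t in water_contents]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    lam = -slope                        # pore-size distribution index
    h_b = math.exp(intercept / lam)     # air-entry suction head
    return lam, h_b

# Synthetic retention data (assumed values), suction heads in cm, all above h_b.
theta_r, theta_s = 0.05, 0.40
heads = [30, 50, 100, 200, 500, 1000]
thetas = [theta_r + (theta_s - theta_r) * (20.0 / h) ** 0.4 for h in heads]
lam_hat, hb_hat = fit_brooks_corey(heads, thetas, theta_r, theta_s)
```

Only points drier than the air-entry value belong in the fit, since the power law form does not hold for h below h_b.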

Relevance:

60.00%

Publisher:

Abstract:

In this paper, we propose a first approximation of the thickness of Quaternary sediment and of the late Quaternary early Tertiary topography for part of the lower reaches of the Narmada valley, obtained in a systematic way using the shallow seismic method, which records both horizontal and vertical components of the microtremor (ambient noise) caused by natural processes. The measurements of microtremors were carried out at 31 sites spaced at a grid interval of 5 km using a Lennartz seismometer (5 s period) and a City shark-II data acquisition system. The signals recorded were analysed for the horizontal-to-vertical (H/V) spectral ratio using GEOPSY software. For the present study, we concentrate on the frequency range between 0.2 Hz and 10 Hz. The thickness of unconsolidated sediments at various sites is calculated based on non-linear regression equations proposed by Ibs-von Seht and Wohlenberg (1999) and Parolai et al. (2002). The estimated thickness is used to plot a digital elevation model and cross profiles correlating with the geomorphology and geology of the study area. (C) 2011 Elsevier Ltd. All rights reserved.

Relevance:

60.00%

Publisher:

Abstract:

Analysis of climate change impacts on streamflow by perturbing the climate inputs has been a concern for many authors in the past few years, but there are few analyses of the impacts on water quality. To examine the impact of change in climate variables on the water quality parameters, the water quality input variables have to be perturbed. The primary input variables that can be considered for such an analysis are streamflow and water temperature, which are affected by changes in precipitation and air temperature, respectively. Using hypothetical scenarios to represent both greenhouse warming and streamflow changes, the sensitivity of the water quality parameters has been evaluated under conditions of altered river flow and river temperature in this article. Historical data analysis of hydroclimatic variables is carried out, which includes flow duration exceedance percentage (e.g. Q90), single low-flow indices (e.g. 7Q10, 30Q10) and relationships between climatic variables and surface variables. For the study region of the Tunga-Bhadra river in India, low flows are found to be decreasing and water temperatures are found to be increasing. As a result, there is a reduction in dissolved oxygen (DO) levels found in recent years. Water quality responses of six hypothetical climate change scenarios were simulated by the water quality model, QUAL2K. A simple linear regression relation between air and water temperature is used to generate the scenarios for river water temperature. The results suggest that all the hypothetical climate change scenarios would cause impairment in water quality. It was found that there is a significant decrease in DO levels due to the impact of climate change on temperature and flows, even when the discharges were at safe permissible levels set by pollution control agencies (PCAs).
The necessity to improve the standards of PCA and develop adaptation policies for the dischargers to account for climate change is examined through a fuzzy waste load allocation model developed earlier. Copyright (C) 2011 John Wiley & Sons, Ltd.

Relevance:

60.00%

Publisher:

Abstract:

Prediction of variable bit rate compressed video traffic is critical to dynamic allocation of resources in a network. In this paper, we propose a technique for preprocessing the dataset used for training a video traffic predictor. The technique involves identifying the noisy instances in the data using a fuzzy inference system. We focus on three prediction techniques, namely, linear regression, neural network and support vector regression and analyze their performance on H.264 video traces. Our experimental results reveal that data preprocessing greatly improves the performance of linear regression and neural network, but is not effective on support vector regression.

Relevance:

60.00%

Publisher:

Abstract:

In this study, an effort has been made to study heavy rainfall events during cyclonic storms over the Indian Ocean. This estimate is based on microwave observations from the Tropical Rainfall Measuring Mission (TRMM) Microwave Imager (TMI). A regional scattering index (SI) developed for the Indian region based on measurements of 19-, 21- and 85-GHz brightness temperature, and the polarization corrected temperature (PCT) at 85 GHz, have been utilized in this study. These PCT and SI are collocated against the Precipitation Radar (PR) onboard TRMM to establish a relationship between rainfall rate, PCT and SI. The retrieval technique using both linear and nonlinear regressions has been developed utilizing SI, PCT and the combination of SI and PCT. The results have been compared with the observations from PR. It was found that a nonlinear algorithm using the combination of SI and PCT is more accurate than a linear algorithm or a nonlinear algorithm using either SI or PCT. Statistical comparison with PR exhibits correlation coefficients (CC) of 0.68, 0.66 and 0.70, and root mean square errors (RMSE) of 1.78, 1.96 and 1.68 mm/h from the observations of SI, PCT and the combination of SI and PCT, respectively, using linear regressions. When nonlinear regression is used, CC of 0.73, 0.71 and 0.79 and RMSE of 1.64, 1.95 and 1.54 mm/h are observed from the observations of SI, PCT and the combination of SI and PCT, respectively. The error statistics for high rain events (above 10 mm/h) show CC of 0.58, 0.59 and 0.60 and RMSE of 5.07, 5.47 and 5.03 mm/h from the observations of SI, PCT and the combination of SI and PCT, respectively, using linear regression; on the other hand, use of nonlinear regression yields CC of 0.66, 0.64 and 0.71 and RMSE of 4.68, 5.78 and 4.02 mm/h from the observations of SI, PCT and combined SI and PCT, respectively.

Relevance:

60.00%

Publisher:

Abstract:

In this paper, we estimate the trends and variability in Advanced Very High Resolution Radiometer (AVHRR)-derived terrestrial net primary productivity (NPP) over India for the period 1982-2006. We find an increasing trend of 3.9% per decade (r = 0.78, R^2 = 0.61) during the analysis period. A multivariate linear regression of NPP with temperature, precipitation, atmospheric CO2 concentration, soil water and surface solar radiation (r = 0.80, R^2 = 0.65) indicates that the increasing trend is partly driven by increasing atmospheric CO2 concentration and the consequent CO2 fertilization of the ecosystems. However, human interventions may have also played a key role in the NPP increase: non-forest NPP growth is largely driven by increases in irrigated area and fertilizer use, while forest NPP is influenced by plantation and forest conservation programs. A similar multivariate regression of interannual NPP anomalies with temperature, precipitation, soil water, solar radiation and CO2 anomalies suggests that the interannual variability in NPP is primarily driven by precipitation and temperature variability. Mean seasonal NPP is largest during post-monsoon and lowest during the pre-monsoon period, thereby indicating the importance of soil moisture for vegetation productivity.