160 results for Maithilisharan Gupta


Relevance:

10.00%

Publisher:

Abstract:

Lipid extraction is an integral part of biodiesel production, as it facilitates the release of fatty acids from algal cells. To utilise thraustochytrids as a potential source for lipid production, we evaluated the extraction efficiency of various solvents and solvent combinations for lipid extraction from Schizochytrium sp. S31 and Thraustochytrium sp. AMCQS5-5. The maximum lipid extraction yield was 22%, obtained using a chloroform:methanol ratio of 2:1. We then compared various cell disruption methods to improve lipid extraction yields, including grinding with liquid nitrogen, bead vortexing, osmotic shock, water bath, sonication and shake milling. The highest lipid extraction yields were obtained using osmotic shock: 48.7% from Schizochytrium sp. S31 and 29.1% from Thraustochytrium sp. AMCQS5-5. The saturated and monounsaturated fatty acid content of Schizochytrium sp. S31 was more than 60%, which suggests its suitability for biodiesel production.

Relevance:

10.00%

Publisher:

Abstract:

Medical interventions critically determine clinical outcomes, but prediction models either ignore interventions or dilute their impact by amalgamating interventions with other features into a single prediction rule. One rule across all interventions may not capture their differential effects. Moreover, interventions change over time as innovations are made, requiring prediction models to evolve with them. To address these gaps, we propose a prediction framework that explicitly models interventions by extracting a set of latent intervention groups through a Hierarchical Dirichlet Process (HDP) mixture. Data are split into temporal windows, and for each window a separate distribution over the intervention groups is learnt. This ensures that the model evolves with changing interventions. The outcome is modeled as conditional on both the latent grouping and the patient's condition, through a Bayesian logistic regression. Learning a distribution for each time window results in an over-complex model when interventions do not change in every window. We show that by replacing the HDP with a dynamic HDP prior, a more compact set of distributions can be learnt. Experiments performed on two hospital datasets demonstrate the superiority of our framework over many existing clinical and traditional prediction frameworks.
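
A minimal sketch of the kind of pipeline this abstract describes, with loud simplifications: sklearn's BayesianGaussianMixture (a truncated Dirichlet-process mixture) stands in for the HDP/dHDP prior, plain LogisticRegression stands in for the Bayesian logistic regression, and the synthetic data, dimensions and variable names are all hypothetical.

```python
# Sketch only: DP mixture + logistic regression as stand-ins for the
# paper's HDP mixture and Bayesian logistic regression.
import numpy as np
from sklearn.mixture import BayesianGaussianMixture
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_patients, n_intervention_feats, n_condition_feats = 500, 10, 5
X_interventions = rng.random((n_patients, n_intervention_feats))  # intervention indicators (hypothetical)
X_condition = rng.random((n_patients, n_condition_feats))         # patient condition features (hypothetical)
y = (X_condition[:, 0] + X_interventions[:, 0] > 1).astype(int)   # synthetic outcome

# 1. Discover latent intervention groups.  Fitting one mixture per temporal
#    window would mimic the window-wise distributions described above.
mix = BayesianGaussianMixture(
    n_components=10,
    weight_concentration_prior_type="dirichlet_process",
    random_state=0,
).fit(X_interventions)
group_posteriors = mix.predict_proba(X_interventions)   # soft group memberships

# 2. Model the outcome conditionally on the latent grouping and the condition.
features = np.hstack([group_posteriors, X_condition])
clf = LogisticRegression(max_iter=1000).fit(features, y)
print("training accuracy:", clf.score(features, y))
```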

Relevance:

10.00%

Publisher:

Abstract:

The support vector machine (SVM) is a popular method for classification, well known for finding the maximum-margin hyperplane. Combining the SVM with an l1-norm penalty further enables it to perform feature selection and margin maximization simultaneously within a single framework. However, the l1-norm SVM is unstable in selecting features in the presence of correlated features. We propose a new method to increase the stability of the l1-norm SVM by encouraging similarities between feature weights based on feature correlations, captured via a feature covariance matrix. Our proposed method can capture both positive and negative correlations between features. We formulate the model as a convex optimization problem and propose a solution based on alternating minimization. Using both synthetic and real-world datasets, we show that our model achieves better stability and classification accuracy compared to several state-of-the-art regularized classification methods.
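
The abstract does not spell out the covariance-based penalty, so the cvxpy sketch below uses one plausible form as an assumption: hinge loss plus an l1 term plus a term that pulls w_i toward sign(C_ij)*w_j with strength |C_ij|, rewarding similar weights for positively correlated features and opposite-signed weights for negatively correlated ones. A generic convex solver replaces the alternating-minimization scheme, and the data and regularization constants are hypothetical.

```python
# Hedged sketch of a covariance-regularised l1-norm SVM; the penalty form
# below is an assumption, not the paper's exact formulation.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
n, d = 200, 20
X = rng.standard_normal((n, d))
y = np.sign(X[:, 0] + X[:, 1] + 0.1 * rng.standard_normal(n))
C = np.corrcoef(X, rowvar=False)              # feature correlation matrix

w, b = cp.Variable(d), cp.Variable()
lam1, lam2 = 0.1, 0.1
hinge = cp.sum(cp.pos(1 - cp.multiply(y, X @ w + b)))      # SVM hinge loss
l1 = lam1 * cp.norm1(w)                                    # sparsity (feature selection)
pairs = [abs(C[i, j]) * cp.square(w[i] - np.sign(C[i, j]) * w[j])
         for i in range(d) for j in range(i + 1, d)]
corr_penalty = lam2 * sum(pairs)                           # weight-similarity penalty
prob = cp.Problem(cp.Minimize(hinge / n + l1 + corr_penalty))
prob.solve()
print("selected features:", np.flatnonzero(np.abs(w.value) > 1e-4))
```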

Relevance:

10.00%

Publisher:

Abstract:

The discovery of contexts is important for context-aware applications in pervasive computing. This is a challenging problem because of the streaming nature of the data and the complexity and changing nature of contexts. We propose a Bayesian nonparametric model for the detection of co-location contexts from Bluetooth signals. By using an Indian buffet process as the prior distribution, the model can discover the number of contexts automatically. We introduce a novel fixed-lag particle filter that processes data incrementally. This sampling scheme is especially suitable for pervasive computing, as the computational requirements remain constant in spite of growing data. We examine our model on a synthetic dataset and two real-world datasets. To verify the discovered contexts, we compare them to the communities detected by the Louvain method, showing a strong correlation between the results of the two methods. The fixed-lag particle filter is compared with Gibbs sampling in terms of normalized factorization error, which shows comparable performance between the two inference methods. As the fixed-lag particle filter processes each small chunk of data as it arrives and does not need to be restarted, its execution time is significantly shorter than that of Gibbs sampling.
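
As a small, self-contained illustration of the prior that makes the number of contexts open-ended, the sketch below draws a binary observation-by-context assignment matrix from an Indian buffet process using its sequential (restaurant) construction. The concentration alpha and the number of observations are hypothetical, and the fixed-lag particle filter itself is not shown.

```python
# Illustrative only: sampling from an Indian buffet process prior, the
# mechanism that lets the number of latent contexts grow with the data.
import numpy as np

def sample_ibp(n_customers, alpha, rng):
    """Sequential (restaurant) construction of an IBP sample."""
    dishes = []            # one list of 0/1 assignments per latent context
    for i in range(1, n_customers + 1):
        # existing contexts: customer i joins each with probability m_k / i
        for column in dishes:
            column.append(1 if rng.random() < sum(column) / i else 0)
        # brand-new contexts for this customer: Poisson(alpha / i) of them
        for _ in range(rng.poisson(alpha / i)):
            dishes.append([0] * (i - 1) + [1])
    Z = np.zeros((n_customers, len(dishes)), dtype=int)
    for k, column in enumerate(dishes):
        Z[:len(column), k] = column
    return Z

rng = np.random.default_rng(0)
Z = sample_ibp(n_customers=20, alpha=2.0, rng=rng)   # 20 observations (e.g. Bluetooth scans)
print("number of discovered contexts:", Z.shape[1])
print(Z)
```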

Relevance:

10.00%

Publisher:

Abstract:

Privacy-preserving data mining has become an active focus of the research community in domains where data are sensitive and personal in nature. For example, highly sensitive digital repositories of medical or financial records offer enormous value for risk prediction and decision making. However, prediction models derived from such repositories should maintain strict privacy of individuals. We propose a novel random forest algorithm under the framework of differential privacy. Unlike previous works that strictly follow differential privacy and keep the complete data distribution approximately invariant to a change in one data instance, we keep only the necessary statistics (e.g. the variance of the estimate) invariant. This relaxation results in significantly higher utility. To realize our approach, we propose a novel differentially private decision tree induction algorithm and use it to create an ensemble of decision trees. We also propose feasible adversary models for inferring the attributes and class label of unknown data given knowledge of all other data. Under these adversary models, we derive bounds on the maximum number of trees allowed in the ensemble while maintaining privacy. We focus on the binary classification problem and demonstrate our approach on four real-world datasets. Compared to existing privacy-preserving approaches, we achieve significantly higher utility.
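
The sketch below is a toy differentially private ensemble in the most common textbook style (data-independent random splits plus Laplace-noised leaf counts). It only illustrates the Laplace mechanism and is not the relaxed induction algorithm or the privacy accounting proposed above; epsilon, the depth and the data are hypothetical, and a full analysis would also split the privacy budget across trees.

```python
# Toy sketch: random decision trees with Laplace-noised leaf class counts.
import numpy as np

rng = np.random.default_rng(0)

def noisy_leaf_counts(y_leaf, epsilon):
    # Each instance affects one leaf count by at most 1, so sensitivity = 1.
    counts = np.array([np.sum(y_leaf == c) for c in (0, 1)], dtype=float)
    return counts + rng.laplace(scale=1.0 / epsilon, size=2)

def random_tree_predict(X, y, X_test, depth, epsilon):
    # Random axis-aligned splits define the partition independently of the
    # data, so only the leaf counts consume privacy budget.
    feats = rng.integers(0, X.shape[1], size=depth)
    thresh = rng.random(depth)
    leaf_id = lambda A: sum((A[:, f] > t).astype(int) << i
                            for i, (f, t) in enumerate(zip(feats, thresh)))
    train_leaves, test_leaves = leaf_id(X), leaf_id(X_test)
    preds = np.zeros(len(X_test), dtype=int)
    for leaf in np.unique(test_leaves):
        counts = noisy_leaf_counts(y[train_leaves == leaf], epsilon)
        preds[test_leaves == leaf] = int(np.argmax(counts))
    return preds

# Hypothetical data in [0, 1]; majority vote over a few noisy trees.
X = rng.random((300, 5)); y = (X[:, 0] > 0.5).astype(int)
X_test = rng.random((50, 5)); y_test = (X_test[:, 0] > 0.5).astype(int)
votes = sum(random_tree_predict(X, y, X_test, depth=3, epsilon=1.0) for _ in range(5))
print("accuracy:", np.mean((votes >= 3) == y_test))
```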

Relevance:

10.00%

Publisher:

Abstract:

Feature selection is an important step in building predictive models for most real-world problems. One of the most popular feature selection methods is the Lasso. However, it is unstable in selecting features when dealing with correlated features. In this work, we propose a new method that aims to increase the stability of the Lasso by encouraging similarities between features based on their relatedness, which is captured via a feature covariance matrix. Besides modeling positive feature correlations, our method can also identify negative correlations between features. We propose a convex formulation for our model along with an alternating optimization algorithm that can learn the weights of the features as well as the relationships between them. Using both synthetic and real-world data, we show that the proposed method is more stable than the Lasso and many state-of-the-art shrinkage and feature selection methods. Moreover, its predictive performance is comparable to that of other methods.
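
A hedged numerical sketch of one plausible reading of the objective: squared loss plus an l1 term plus a correlation-driven quadratic term that pulls w_i toward sign(R_ij)*w_j with strength |R_ij|, solved with a plain proximal-gradient (ISTA) loop rather than the alternating algorithm mentioned above. The penalty form, step size and data are assumptions.

```python
# Sketch: correlation-regularised Lasso-type objective
#   (1/n)||y - Xw||^2 / 2 + lam1*||w||_1 + lam2 * sum_{i<j} |R_ij| (w_i - sign(R_ij) w_j)^2
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 15
X = rng.standard_normal((n, d))
X[:, 1] = X[:, 0] + 0.05 * rng.standard_normal(n)      # two highly correlated features
y = X[:, 0] + X[:, 1] + 0.5 * rng.standard_normal(n)

R = np.corrcoef(X, rowvar=False)
# Quadratic penalty matrix M so that w^T M w equals the pairwise penalty above.
M = np.zeros((d, d))
for i in range(d):
    for j in range(d):
        if i != j:
            M[i, i] += abs(R[i, j])
            M[i, j] -= R[i, j]          # = -sign(R_ij) * |R_ij|

lam1, lam2, step = 0.1, 0.5, 1e-3
w = np.zeros(d)
for _ in range(2000):
    grad = X.T @ (X @ w - y) / n + 2 * lam2 * (M @ w)
    w = w - step * grad
    w = np.sign(w) * np.maximum(np.abs(w) - step * lam1, 0.0)   # soft-thresholding
print("weights of the two correlated features:", w[:2])
```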

Relevance:

10.00%

Publisher:

Abstract:

Prognosis, such as predicting mortality, is common in medicine. When confronted with small numbers of samples, as in rare medical conditions, the task is challenging. We propose a framework for classification with small numbers of samples. Conceptually, our solution is a hybrid of multi-task and transfer learning: it employs data samples from source tasks, as in transfer learning, but considers all tasks together, as in multi-task learning. Each task is modelled jointly with other related tasks by directly augmenting its data with data from those tasks. The degree of augmentation depends on the task relatedness and is estimated directly from the data. We apply the model to three diverse real-world datasets (healthcare data, handwritten digit data and face data) and show that our method outperforms several state-of-the-art multi-task learning baselines. We extend the model to online multi-task learning, where the model parameters are incrementally updated given new data or new tasks. The novelty of our method lies in offering a hybrid multi-task/transfer learning model that exploits sharing across tasks at the data level together with joint parameter learning.
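
A minimal conceptual sketch of the data-level sharing idea: each task's classifier is trained on its own data plus the other tasks' data, weighted by an estimated relatedness. Estimating relatedness from the cosine similarity of rough single-task weight vectors is an assumption made for this demo, not the estimator used in the work above.

```python
# Sketch: relatedness-weighted data augmentation across tasks.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
tasks = []
for t in range(3):                                # three small related tasks
    w_true = np.array([1.0, 1.0, 0.0]) + 0.2 * rng.standard_normal(3)
    X = rng.standard_normal((30, 3))
    y = (X @ w_true > 0).astype(int)
    tasks.append((X, y))

# Step 1: rough single-task fits to estimate relatedness (demo heuristic).
single = [LogisticRegression(max_iter=1000).fit(X, y).coef_.ravel() for X, y in tasks]
relatedness = np.array([[max(0.0, np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
                         for b in single] for a in single])

# Step 2: per-task model trained on pooled, relatedness-weighted data.
models = []
for t, (X_t, y_t) in enumerate(tasks):
    X_aug = np.vstack([X for X, _ in tasks])
    y_aug = np.concatenate([y for _, y in tasks])
    weights = np.concatenate([np.full(len(y), relatedness[t, s])
                              for s, (_, y) in enumerate(tasks)])
    models.append(LogisticRegression(max_iter=1000).fit(X_aug, y_aug, sample_weight=weights))
print("task-0 coefficients:", models[0].coef_.ravel())
```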

Relevance:

10.00%

Publisher:

Abstract:

Emerging Electronic Medical Records (EMRs) have reformed modern healthcare. These records have great potential to be used for building clinical prediction models. However, a problem in using them is their high dimensionality. Since much of the information may not be relevant for prediction, the underlying complexity of the prediction models may not be high. A popular way to deal with this problem is to employ feature selection. Lasso and l1-norm based feature selection methods have shown promising results, but, in the presence of correlated features, they select features that change considerably with small changes in the data. This prevents clinicians from obtaining a stable feature set, which is crucial for clinical decision making. Grouping correlated variables together can improve the stability of feature selection; however, such a grouping is usually not known and needs to be estimated for optimal performance. Addressing this problem, we propose a new model that can simultaneously learn the grouping of correlated features and perform stable feature selection. We formulate the model as a constrained optimization problem and provide an efficient solution with guaranteed convergence. Our experiments with both synthetic and real-world datasets show that the proposed model is significantly more stable than Lasso and many existing state-of-the-art shrinkage and classification methods. We further show that, in terms of prediction performance, the proposed method consistently outperforms Lasso and other baselines. Our model can be used to select stable risk factors for a variety of healthcare problems and can thus assist clinicians toward accurate decision making.
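
The model above learns the grouping and the selection jointly; the sketch below illustrates the underlying intuition in a simpler, sequential way by clustering correlated features and then running a Lasso-type model on per-group averages, so that correlated features enter or leave the model together. The clustering threshold, the use of LassoCV and the synthetic data are all assumptions made for the demo.

```python
# Sketch: group correlated features first, then select at the group level.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(0)
n, d = 200, 12
base = rng.standard_normal((n, 4))
X = np.hstack([base + 0.05 * rng.standard_normal((n, 4)) for _ in range(3)])  # 3 noisy copies -> correlated groups
y = base[:, 0] + 0.5 * rng.standard_normal(n)

# 1. Group features whose absolute correlation is high (distance below 0.3).
dist = 1.0 - np.abs(np.corrcoef(X, rowvar=False))
groups = fcluster(linkage(dist[np.triu_indices(d, 1)], method="average"),
                  t=0.3, criterion="distance")

# 2. Select at the group level by fitting on per-group mean features.
X_grouped = np.column_stack([X[:, groups == g].mean(axis=1) for g in np.unique(groups)])
model = LassoCV(cv=5).fit(X_grouped, y)
selected_groups = np.unique(groups)[np.abs(model.coef_) > 1e-6]
print("selected feature groups:", selected_groups)
```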

Relevance:

10.00%

Publisher:

Abstract:

This paper presents a hypothesis, and its experimental validation, that simultaneous improvement in the hardness and corrosion resistance of aluminium can be achieved by combining a suitable processing route with alloying additions. More specifically, the corrosion resistance and hardness of Al-xCr (x = 0-10 wt.%) alloys produced via high-energy ball milling were significantly higher than those of pure Al and AA7075-T651. The improved properties of the Al-xCr alloys were attributed to the Cr addition and high-energy ball milling, which produced a nanocrystalline structure, extended solubility of Cr in Al, and uniformly distributed fine intermetallic phases in the Al-Cr matrix.

Relevance:

10.00%

Publisher:

Abstract:

The applications of omega-3 fatty acids for human health are rapidly expanding, which necessitates exploring alternative sources to fish. Many marine microorganisms across different kingdoms can store a significant oil content; however, they are difficult to cultivate. Among marine microbes, thraustochytrids are considered a good source for the production of high-value compounds such as polyunsaturated fatty acids (PUFAs). Optimization of culture conditions will help further enhance cellular lipid content to suit fatty acid synthesis. This chapter describes some recent advances in the development of marine microbes for fatty acid production, with a special emphasis on thraustochytrids for biotechnological applications, focusing particularly on methods to enhance docosahexaenoic acid (DHA) and eicosapentaenoic acid (EPA) production.

Relevance:

10.00%

Publisher:

Abstract:

Multi-task learning offers a way to benefit from the synergy of multiple related prediction tasks via their joint modeling. Current multi-task techniques model related tasks jointly, assuming that the tasks share the same relationship uniformly across all features. This assumption is seldom true, as tasks may be related across some features but not others. Addressing this problem, we propose a new multi-task learning model that learns separate task relationships along different features. This added flexibility gives our model a finer and differential level of control in the joint modeling of tasks along different features. We formulate the model as an optimization problem and provide an efficient, iterative solution. We illustrate the behavior of the proposed model using a synthetic dataset in which we induce varied feature-dependent task relationships: positive, negative and no relationship. Using four real datasets, we evaluate the effectiveness of the proposed model on several multi-task regression and classification problems and demonstrate its superiority over other state-of-the-art multi-task learning models.
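
A hedged sketch of feature-wise task coupling for two regression tasks, mirroring the kind of synthetic setup described above (some features positively related, some negatively, some unrelated). The bilinear coupling term, the tanh-based relationship update and all constants are illustrative assumptions, not the proposed formulation or its iterative solution.

```python
# Sketch: learn a per-feature relationship rho_f between two tasks while
# fitting their weights; positive rho_f pulls the weights together,
# negative rho_f pulls them toward opposite signs, rho_f ~ 0 decouples them.
import numpy as np

rng = np.random.default_rng(0)
n, d = 150, 9
# Ground truth: features 0-2 positively related, 3-5 negatively, 6-8 unrelated.
w1 = np.array([1, 1, 1,  1,  1,  1, 1.0, 0.0, 0.5])
w2 = np.array([1, 1, 1, -1, -1, -1, 0.0, 1.0, 0.0])
X1, X2 = rng.standard_normal((n, d)), rng.standard_normal((n, d))
y1 = X1 @ w1 + 0.1 * rng.standard_normal(n)
y2 = X2 @ w2 + 0.1 * rng.standard_normal(n)

lam, step = 0.3, 5e-3
a, b = np.zeros(d), np.zeros(d)          # weight estimates for the two tasks
rho = np.zeros(d)                        # per-feature task relationship
for _ in range(2000):
    # Gradient step on squared loss plus coupling term -lam * rho_f * a_f * b_f.
    ga = X1.T @ (X1 @ a - y1) / n - lam * rho * b
    gb = X2.T @ (X2 @ b - y2) / n - lam * rho * a
    a, b = a - step * ga, b - step * gb
    # Re-estimate the relationship from the (bounded) product of weights.
    rho = np.tanh(a * b)
print("estimated per-feature relationship:", np.round(rho, 2))
```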

Relevance:

10.00%

Publisher:

Abstract:

Prediction of patient outcomes is critical for planning resources in a hospital emergency department. We present a method that exploits longitudinal data from Electronic Medical Records (EMRs) whilst modelling multiple patient outcomes. We divide the EMR data into segments, where each segment is a task, and all tasks are associated with multiple patient outcomes over 3, 6 and 12 month periods. We propose a model that learns a prediction function for each task-label pair, with the functions interacting through two subspaces: the first subspace is used to impose sharing across all tasks for a given label, while the second subspace captures task-specific variations and is shared across all labels for a given task. The proposed model is formulated as an iterative optimization problem and solved using a scalable and efficient block coordinate descent (BCD) method. We apply the proposed model to two hospital cohorts, cancer and Acute Myocardial Infarction (AMI) patients, collected over a two-year period from a large hospital emergency department. We show that the predictive performance of our proposed models is significantly better than that of several state-of-the-art multi-task and multi-label learning methods.
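
A simplified numpy sketch of the two-subspace idea: the weights for each (task, label) pair are written as w_{t,l} = u_l + v_t, where u_l is shared across tasks for a given label and v_t is shared across labels for a given task, and the two blocks are updated alternately in a crude block coordinate descent. The additive decomposition, the squared loss and the synthetic data are assumptions made to keep the sketch short, not the exact model above.

```python
# Sketch: alternating (block coordinate) updates of a label-shared
# subspace U and a task-specific subspace V.
import numpy as np

rng = np.random.default_rng(0)
T, L, n, d = 4, 3, 80, 10                  # tasks (EMR segments), labels (e.g. 3/6/12-month outcomes)
X = [rng.standard_normal((n, d)) for _ in range(T)]
true_u = rng.standard_normal((L, d)); true_v = 0.3 * rng.standard_normal((T, d))
Y = [[X[t] @ (true_u[l] + true_v[t]) + 0.1 * rng.standard_normal(n)
      for l in range(L)] for t in range(T)]

U, V = np.zeros((L, d)), np.zeros((T, d))
step, lam = 1e-2, 0.1
for it in range(500):
    # Block 1: update label-shared subspace U with V fixed.
    for l in range(L):
        g = sum(X[t].T @ (X[t] @ (U[l] + V[t]) - Y[t][l]) / n for t in range(T))
        U[l] -= step * (g + lam * U[l])
    # Block 2: update task-specific subspace V with U fixed.
    for t in range(T):
        g = sum(X[t].T @ (X[t] @ (U[l] + V[t]) - Y[t][l]) / n for l in range(L))
        V[t] -= step * (g + lam * V[t])
print("norms of recovered label-shared components:", np.linalg.norm(U, axis=1).round(2))
```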

Relevance:

10.00%

Publisher:

Abstract:

Learning from a small number of examples is a challenging problem in machine learning. An effective way to improve performance is to exploit knowledge from other related tasks. Multi-task learning (MTL) is one such useful paradigm, aiming to improve performance by jointly modeling multiple related tasks. Although numerous classification and regression models exist in the machine learning literature, most MTL models are built around ridge or logistic regression. There are some limited works that propose multi-task extensions of techniques such as support vector machines and Gaussian processes. However, all these MTL models are tied to specific classification or regression algorithms, and there is no single MTL algorithm that can be used at a meta level with any given learning algorithm. Addressing this problem, we propose a generic, model-agnostic joint modeling framework that can take any classification or regression algorithm of a practitioner's choice (standard or custom-built) and build its MTL variant. The key observation that drives our framework is that, due to the small number of examples, the estimates of task parameters are usually poor, and we show that this leads to an under-estimation of the task relatedness between any two tasks with high probability. We derive an algorithm that brings the tasks closer to their true relatedness by improving the estimates of the task parameters. This is achieved by appropriate sharing of data across tasks. We provide the detailed theoretical underpinning of the algorithm. Through experiments with both synthetic and real datasets, we demonstrate that the multi-task variants of several classifiers and regressors (logistic regression, support vector machine, k-nearest neighbour, random forest, ridge regression, support vector regression) convincingly outperform their single-task counterparts. We also show that the proposed model performs comparably to or better than many state-of-the-art MTL and transfer learning baselines.
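
A hedged sketch of the meta-level idea: a wrapper that, for each task, borrows rows from the other tasks in proportion to an estimated relatedness and then fits whatever base learner the practitioner supplies (here a random forest and a k-NN, purely as examples). The relatedness proxy and the sampling rule are assumptions for the demo, not the estimator derived in the work above.

```python
# Sketch: model-agnostic multi-task wrapper via relatedness-weighted
# data sharing; works with any fit/score-style estimator.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

def make_task(shift):
    X = rng.standard_normal((40, 4))
    y = ((X[:, 0] + shift * X[:, 1]) > 0).astype(int)
    return X, y

tasks = [make_task(s) for s in (1.0, 0.9, -1.0)]   # the third task is less related

def relatedness(task_a, task_b):
    # Crude proxy: how well a model fitted on task_a predicts task_b.
    (Xa, ya), (Xb, yb) = task_a, task_b
    score = KNeighborsClassifier(3).fit(Xa, ya).score(Xb, yb)
    return max(0.0, 2 * score - 1)                  # rescale [0.5, 1] -> [0, 1]

def fit_multitask(tasks, make_estimator):
    models = []
    for t, (X_t, y_t) in enumerate(tasks):
        Xs, ys = [X_t], [y_t]
        for s, (X_s, y_s) in enumerate(tasks):
            if s == t:
                continue
            k = int(relatedness((X_s, y_s), (X_t, y_t)) * len(y_s))   # rows to borrow
            idx = rng.choice(len(y_s), size=k, replace=False)
            Xs.append(X_s[idx]); ys.append(y_s[idx])
        models.append(make_estimator().fit(np.vstack(Xs), np.concatenate(ys)))
    return models

rf_models = fit_multitask(tasks, lambda: RandomForestClassifier(n_estimators=50, random_state=0))
knn_models = fit_multitask(tasks, lambda: KNeighborsClassifier(5))
print("per-task RF / kNN training accuracy:",
      [round(m.score(*t), 2) for m, t in zip(rf_models, tasks)],
      [round(m.score(*t), 2) for m, t in zip(knn_models, tasks)])
```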

Relevance:

10.00%

Publisher:

Abstract:

Monitoring daily physical activity plays an important role in disease prevention and intervention. This paper proposes an approach to monitoring body movement intensity levels from accelerometer data. We collect the data using the accelerometer in a realistic setting without any supervision; the ground truth of activities is provided by the participants themselves using an experience sampling application running on their mobile phones. We compute a novel feature that has a strong correlation with movement intensity. We use the hierarchical Dirichlet process (HDP) model to detect the activity levels from this feature. With Bayesian nonparametric priors over its parameters, the model can infer the number of levels automatically. By demonstrating the approach on the publicly available USC-HAD dataset, which includes ground-truth activity labels, we show a strong correlation between the discovered activity levels and the movement intensity of the activities. This correlation is further confirmed using our newly collected dataset. We further use the extracted patterns as features for clustering and classifying the activity sequences to improve performance.
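
A small sketch of this pipeline's flavour on synthetic accelerometer data: a per-window intensity feature (the standard deviation of the acceleration magnitude, used here as a stand-in for the novel feature mentioned above) is fed to sklearn's BayesianGaussianMixture, a truncated Dirichlet-process mixture standing in for the HDP, which decides how many intensity levels carry non-negligible weight. The synthetic signal, window length and threshold are hypothetical.

```python
# Sketch: window-wise intensity feature + DP mixture as an HDP stand-in.
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(0)
# Synthetic tri-axial accelerometer stream: interleaved rest / walk / run segments.
segments = [rng.normal(0, s, size=(500, 3)) for s in (0.05, 0.4, 1.2) for _ in range(4)]
signal = np.vstack(segments) + np.array([0.0, 0.0, 9.81])       # gravity on the z-axis

window = 100
magnitude = np.linalg.norm(signal, axis=1)
intensity = np.array([magnitude[i:i + window].std()
                      for i in range(0, len(magnitude) - window, window)])

dpmm = BayesianGaussianMixture(
    n_components=10,
    weight_concentration_prior_type="dirichlet_process",
    random_state=0,
).fit(intensity.reshape(-1, 1))
levels = dpmm.predict(intensity.reshape(-1, 1))
print("distinct levels used:", len(np.unique(levels)))
print("levels with non-negligible weight:", int(np.sum(dpmm.weights_ > 0.05)))
```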