157 results for Least-squares support vector machine





We consider the problem of binary classification where the classifier can, for a particular cost, choose not to classify an observation. Just as in the conventional classification problem, minimization of the sample average of the cost is a difficult optimization problem. As an alternative, we propose the optimization of a certain convex loss function φ, analogous to the hinge loss used in support vector machines (SVMs). Its convexity ensures that the sample average of this surrogate loss can be efficiently minimized. We study its statistical properties. We show that minimizing the expected surrogate loss—the φ-risk—also minimizes the risk. We also study the rate at which the φ-risk approaches its minimum value. We show that fast rates are possible when the conditional probability P(Y=1|X) is unlikely to be close to certain critical values.
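The surrogate-loss idea in the abstract above can be sketched in a few lines. The abstract does not give the formula for φ, so the piecewise-linear form below (a hinge loss with a steeper slope for negative margins, governed by a rejection cost `cost` between 0 and 1/2) is an assumption chosen to match the description, and the names `reject_hinge_loss`, `empirical_phi_risk` and `predict_with_reject` are illustrative only.

```python
def reject_hinge_loss(margin, cost):
    """Piecewise-linear convex surrogate for classification with a reject option.

    margin: y * f(x), with label y in {-1, +1} and real-valued score f(x).
    cost:   assumed rejection cost d, with 0 < d < 1/2 (not stated in the abstract).
    """
    if not 0.0 < cost < 0.5:
        raise ValueError("cost must lie strictly between 0 and 1/2")
    slope = (1.0 - cost) / cost  # steeper penalty on the wrong side of zero
    if margin < 0.0:
        return 1.0 - slope * margin
    if margin < 1.0:
        return 1.0 - margin
    return 0.0


def empirical_phi_risk(scores, labels, cost):
    """Sample average of the surrogate loss, the quantity to be minimized."""
    losses = [reject_hinge_loss(y * f, cost) for f, y in zip(scores, labels)]
    return sum(losses) / len(losses)


def predict_with_reject(score, threshold):
    """Return +1 or -1, or 0 (reject) when the score is within threshold of zero."""
    if abs(score) <= threshold:
        return 0
    return 1 if score > 0 else -1
```

Because the slopes of the three linear pieces increase from left to right, the loss is convex in the margin, which is what makes its sample average tractable to minimize.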





One of the nice properties of kernel classifiers such as SVMs is that they often produce sparse solutions. However, the decision functions of these classifiers cannot always be used to estimate the conditional probability of the class label. We investigate the relationship between these two properties and show that these are intimately related: sparseness does not occur when the conditional probabilities can be unambiguously estimated. We consider a family of convex loss functions and derive sharp asymptotic results for the fraction of data that becomes support vectors. This enables us to characterize the exact trade-off between sparseness and the ability to estimate conditional probabilities for these loss functions.





Kernel-based learning algorithms work by embedding the data into a Euclidean space, and then searching for linear relations among the embedded data points. The embedding is performed implicitly, by specifying the inner products between each pair of points in the embedding space. This information is contained in the so-called kernel matrix, a symmetric and positive definite matrix that encodes the relative positions of all points. Specifying this matrix amounts to specifying the geometry of the embedding space and inducing a notion of similarity in the input space -- classical model selection problems in machine learning. In this paper we show how the kernel matrix can be learned from data via semi-definite programming (SDP) techniques. When applied to a kernel matrix associated with both training and test data this gives a powerful transductive algorithm -- using the labelled part of the data one can learn an embedding also for the unlabelled part. The similarity between test points is inferred from training points and their labels. Importantly, these learning problems are convex, so we obtain a method for learning both the model class and the function without local minima. Furthermore, this approach leads directly to a convex method to learn the 2-norm soft margin parameter in support vector machines, solving another important open problem. Finally, the novel approach presented in the paper is supported by positive empirical results.
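As a small illustration of the kernel matrix described in the abstract above, the sketch below builds one matrix over training and test points together. It does not attempt the semi-definite programming step; the Gaussian kernel, the `gamma` parameter and the toy points are assumptions made for the example.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel: an inner product in an implicit embedding space."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def kernel_matrix(points, kernel):
    """Gram matrix K with K[i][j] = kernel(points[i], points[j])."""
    return [[kernel(p, q) for q in points] for p in points]


def quadratic_form(K, v):
    """v^T K v, which is non-negative for every v when K is positive semidefinite."""
    n = len(v)
    return sum(v[i] * K[i][j] * v[j] for i in range(n) for j in range(n))


# Training and test points share one matrix, as in the transductive setting.
train = [(0.0, 0.0), (1.0, 0.0)]
test = [(0.0, 1.0)]
K = kernel_matrix(train + test, rbf_kernel)
```

The matrix is symmetric by construction, and `quadratic_form` gives a quick spot check of positive semidefiniteness for any chosen vector.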





We have used microarray gene expression profiling and machine learning to predict the presence of BRAF mutations in a panel of 61 melanoma cell lines. The BRAF gene was found to be mutated in 42 samples (69%) and intragenic mutations of the NRAS gene were detected in seven samples (11%). No cell line carried mutations of both genes. Using support vector machines, we have built a classifier that differentiates between melanoma cell lines based on BRAF mutation status. As few as 83 genes are able to discriminate between BRAF mutant and BRAF wild-type samples, with clear separation observed using hierarchical clustering. Multidimensional scaling was used to visualize the relationship between a BRAF mutation signature and that of a generalized mitogen-activated protein kinase (MAPK) activation (either BRAF or NRAS mutation) in the context of the discriminating gene list. We observed that samples carrying NRAS mutations lie somewhere between those with and those without BRAF mutations. These observations suggest that there are gene-specific mutation signals in addition to a common MAPK activation that result from the pleiotropic effects of either BRAF or NRAS on other signaling pathways, leading to measurably different transcriptional changes.










This paper focuses on information sharing with key suppliers and seeks to explore the factors that might influence its extent and depth. We also investigate how information sharing affects a company’s performance with regard to resource usage, output, and flexibility. Drawing from transaction cost and contingency theories, several factors, namely environmental uncertainty, demand uncertainty, dependency, and the product life cycle stage, are proposed to explain the level of information shared with key suppliers. We develop a model in which information sharing mediates the relationship between the (contingent) factors and company performance. A mail survey was used to collect data from Finnish and Swedish companies. Partial Least Squares analysis was performed separately for each country (n=119, n=102). There was consistent evidence that environmental uncertainty, demand uncertainty and supplier/buyer dependency had explanatory power, whereas no significance was found for the relationship between product life cycle stage and information sharing. The results also confirm previous studies by providing support for a positive relationship between information sharing and performance, where output performance was found to be the most strongly related.





Background: Depression is a major public health problem worldwide and is currently ranked second to heart disease for years lost due to disability. For many decades, international research has found that depressive symptoms occur more frequently among individuals of low socioeconomic status (SES) than among their more-advantaged peers. However, the reasons why those in low socioeconomic groups suffer more depressive symptoms are not well understood. Studies investigating the prevalence of depression and its association with SES emanate largely from developed countries, with little research among developing countries. In particular, there is a serious dearth of research on depression and no investigation of its association with SES in Vietnam. The aims of the research presented in this thesis are to estimate the prevalence of depressive symptoms among Vietnamese adults, to examine the nature and extent of the association between SES and depression, and to elucidate causal pathways linking SES to depressive symptoms. Methods: The research was conducted between September 2008 and November 2009 in Hue city in central Vietnam and used a combination of qualitative (in-depth interviews) and quantitative (survey) data collection methods. The qualitative study contributed to the development of the theoretical model and to the refinement of culturally appropriate data collection instruments for the quantitative study. The main survey comprised a cross-sectional population-based survey with randomised cluster sampling. A sample of 1976 respondents aged 25-55 years from ten randomly selected residential zones (quarters) of Hue city completed the questionnaire (response rate 95.5%). Measures: SES was classified using three indicators: education, occupation and income. The Center for Epidemiologic Studies-Depression (CES-D) scale was used to measure depressive symptoms (range 0-51, mean=11.0, SD=8.5). Three cut-off points for the CES-D scores were applied: ‘at risk for clinical depression’ (16 or above), ‘depressive symptoms’ (above 21) and ‘depression’ (above 25). Six psychosocial indicators (lifetime trauma, chronic stress, recent life events, social support, self-esteem, and mastery) were hypothesized to mediate the association between SES and depressive symptoms. Analyses: The prevalence of depressive symptoms was analysed using bivariate analyses. The multivariable analytic phase comprised ordinary least squares regression, in accordance with Baron and Kenny’s three-step framework for mediation modeling. All analyses were adjusted for a range of confounders, including age, marital status, smoking, drinking and chronic diseases, and the mediation models were stratified by gender. Results: Among these Vietnamese adults, 24.3% were at or above the cut-off for being ‘at risk for clinical depression’, 11.9% were classified as having depressive symptoms and 6.8% were categorised as having depression. SES was inversely related to depressive symptoms: the least educated, those with low occupational status and those with the lowest incomes reported more depressive symptoms. Socioeconomically disadvantaged individuals were more likely to report experiencing stress (lifetime trauma, chronic stress or recent life events), perceived less social support and reported fewer personal resources (self-esteem and mastery) than their more-advantaged counterparts. These psychosocial resources were all significantly associated with depressive symptoms independent of SES. Each psychosocial factor showed a significant mediating effect on the association between SES and depressive symptoms. This was found for all measures of SES, and for males and females. In particular, personal resources (mastery, self-esteem) and chronic stress accounted for a substantial proportion of the variation in depressive symptoms between socioeconomic groups. Social support and recent life events contributed modestly to socioeconomic differences in depressive symptoms, whereas lifetime trauma contributed the least to these inequalities. Conclusion: This is the first known study in Vietnam or any developing country to systematically examine the extent to which psychosocial factors mediate the relationship between SES and depression. The study contributes new evidence regarding the burden of depression in Vietnam. The findings have practical relevance for advocacy, for mental health promotion and health-care services, and point to the need for programs that focus on building a sense of personal mastery and self-esteem. More broadly, the work presented in this thesis contributes to the international scientific literature on the social determinants of depression.





This item provides supplementary materials for the paper mentioned in the title, specifically a range of organisms used in the study. The full abstract for the main paper is as follows: Next Generation Sequencing (NGS) technologies have revolutionised molecular biology, allowing clinical sequencing to become a matter of routine. NGS data sets consist of short sequence reads obtained from the machine, given context and meaning through downstream assembly and annotation. For these techniques to operate successfully, the collected reads must be consistent with the assumed species or species group, and not corrupted in some way. The common bacterium Staphylococcus aureus may cause severe and life-threatening infections in humans, with some strains exhibiting antibiotic resistance. In this paper, we apply an SVM classifier to the important problem of distinguishing S. aureus sequencing projects from alternative pathogens, including closely related Staphylococci. Using a sequence k-mer representation, we achieve precision and recall above 95%, implicating features with important functional associations.
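The k-mer representation and the precision and recall figures mentioned in the abstract above can be illustrated with a short sketch. The value of k, the vocabulary and the helper names are assumptions for the example; the abstract does not say which k or which normalisation the authors used, and no SVM is trained here.

```python
from collections import Counter


def kmer_counts(sequence, k):
    """Count every overlapping substring of length k in a sequence read."""
    sequence = sequence.upper()
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def kmer_vector(sequence, k, vocabulary):
    """Relative k-mer frequencies in a fixed vocabulary order (a feature vector)."""
    counts = kmer_counts(sequence, k)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(vocabulary)
    return [counts[kmer] / total for kmer in vocabulary]


def precision_recall(true_labels, predicted_labels, positive=1):
    """Precision and recall for the class labelled `positive`."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

A vector produced by `kmer_vector` is the kind of fixed-length input that a classifier such as an SVM could consume, one vector per sequencing project.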





Language has been of interest to numerous economists since the late 20th century, with the majority of studies focusing on its effects on immigrants’ labour market outcomes, earnings in particular. However, language is an endogenous variable, which, along with its susceptibility to measurement error, causes biases in ordinary least squares estimates. The instrumental variables method overcomes the shortcomings of ordinary least squares in modelling endogenous explanatory variables. In this dissertation, age at arrival combined with country of origin forms an instrument, creating a difference-in-difference scenario, to address the issue of endogeneity and attenuation error in language proficiency. The first half of the study investigates the extent to which the English-speaking ability of immigrants improves their labour market outcomes and social assimilation in Australia, using the 2006 Census. The findings provide evidence that supports the earlier studies. As expected, immigrants in Australia with better language proficiency earn higher incomes, attain a higher level of education, have a higher probability of completing tertiary studies, and work more hours per week. Language proficiency also improves social integration, leading to a higher probability of marriage to a native and a higher probability of obtaining citizenship. The second half of the study further investigates whether language proficiency has similar effects on a migrant’s physical and mental wellbeing, health care access and lifestyle choices, using three National Health Surveys. However, only limited evidence has been found with respect to the hypothesised causal relationship between language and health for Australian immigrants.
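To make the contrast between ordinary least squares and instrumental variables in the abstract above concrete, here is a minimal sketch with a single instrument and invented numbers. It is not the dissertation's specification: the variable names, the toy data and the true slope of 2 are all assumptions, with `u` standing in for an unobserved factor that affects both the regressor and the outcome.

```python
def mean(values):
    return sum(values) / len(values)


def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def ols_slope(x, y):
    """Slope from regressing y on x; biased when x is correlated with the error."""
    return covariance(x, y) / covariance(x, x)


def iv_slope(z, x, y):
    """Simple instrumental-variables slope: cov(z, y) / cov(z, x)."""
    return covariance(z, y) / covariance(z, x)


# Toy data: z is the instrument, u an unobserved confounder uncorrelated with z.
z = [-1.0, -1.0, 1.0, 1.0]
u = [-1.0, 1.0, -1.0, 1.0]
x = [zi + ui for zi, ui in zip(z, u)]              # endogenous regressor
y = [2.0 * xi + 3.0 * ui for xi, ui in zip(x, u)]  # outcome, true slope 2
```

On this toy data `ols_slope(x, y)` comes out at about 3.5 because the confounder inflates it, while `iv_slope(z, x, y)` recovers the slope of 2 used to generate the outcome.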





In Australia and increasingly worldwide, methamphetamine is one of the most commonly seized drugs analysed by forensic chemists. The current well-established GC/MS methods used to identify and quantify methamphetamine are lengthy, expensive processes, but rapid analysis is often requested by undercover police, leading to interest in developing a new analytical technique. Ninety-six illicit drug seizures containing methamphetamine (0.1% - 78.6%) were analysed using Fourier Transform Infrared Spectroscopy with an Attenuated Total Reflectance attachment and chemometrics. Two Partial Least Squares models were developed, one using the principal Infrared Spectroscopy peaks of methamphetamine and the other a Hierarchical Partial Least Squares model. Both models were refined to choose the variables that were most closely associated with the methamphetamine % vector. Both models performed well: the principal-peaks Partial Least Squares model had a Root Mean Square Error of Prediction of 3.8, an R² of 0.9779 and a lower limit of quantification of 7% methamphetamine, while the Hierarchical Partial Least Squares model had a lower limit of quantification of 0.3% methamphetamine, a Root Mean Square Error of Prediction of 5.2 and an R² of 0.9637. Such models offer rapid and effective methods for screening illicit drug samples to determine the percentage of methamphetamine they contain.
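The two figures of merit quoted in the abstract above, Root Mean Square Error of Prediction and R², can be computed as in the sketch below. The abstract does not say which definition of R² was used, so the coefficient of determination here is an assumption (chemometrics software sometimes reports the squared correlation instead), and the numbers used to try it out are invented.

```python
import math


def rmsep(reference, predicted):
    """Root mean square error of prediction over a validation set."""
    squared = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(reference, predicted):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot


# Invented example: reference vs. predicted methamphetamine percentages.
reference = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
```

RMSEP is expressed in the same units as the reference values (here, percentage points of methamphetamine), which is why it can be read directly against the concentration range of the samples.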





Person re-identification is particularly challenging due to significant appearance changes across separate camera views. In order to re-identify people, a representative human signature should effectively handle differences in illumination, pose and camera parameters. While general appearance-based methods are modelled in Euclidean spaces, it has been argued that some applications in image and video analysis are better modelled via non-Euclidean manifold geometry. To this end, recent approaches represent images as covariance matrices, and interpret such matrices as points on Riemannian manifolds. As direct classification on such manifolds can be difficult, in this paper we propose to represent each manifold point as a vector of similarities to class representers, via a recently introduced form of Bregman matrix divergence known as the Stein divergence. This is followed by using a discriminative mapping of similarity vectors for final classification. The use of similarity vectors is in contrast to the traditional approach of embedding manifolds into tangent spaces, which can suffer from representing the manifold structure inaccurately. Comparative evaluations on benchmark ETHZ and iLIDS datasets for the person re-identification task show that the proposed approach obtains better performance than recent techniques such as Histogram Plus Epitome, Partial Least Squares, and Symmetry-Driven Accumulation of Local Features.
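The Stein divergence named in the abstract above has a closed form, S(X, Y) = log det((X + Y)/2) - (1/2) log det(XY), for symmetric positive definite matrices. The sketch below evaluates it for 2x2 matrices only, to stay dependency-free. Turning divergences into similarities with exp(-scale * S) is an assumption for illustration; the abstract says only that each point is represented by a vector of similarities to class representers.

```python
import math


def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def stein_divergence(x, y):
    """S(X, Y) = log det((X + Y) / 2) - 0.5 * log det(X Y) for 2x2 SPD matrices."""
    mid = [[(x[i][j] + y[i][j]) / 2.0 for j in range(2)] for i in range(2)]
    # det(X Y) = det(X) * det(Y), so the product matrix is never formed.
    return math.log(det2(mid)) - 0.5 * (math.log(det2(x)) + math.log(det2(y)))


def similarity_vector(point, representers, scale=1.0):
    """Similarities of one covariance matrix to each class representer (assumed form)."""
    return [math.exp(-scale * stein_divergence(point, r)) for r in representers]


# Toy covariance matrices standing in for image descriptors.
identity = [[1.0, 0.0], [0.0, 1.0]]
scaled = [[4.0, 0.0], [0.0, 4.0]]
```

The divergence is zero when the two matrices coincide and symmetric in its arguments, so a point that matches a representer gets similarity 1 under the assumed exponential mapping.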





The thesis investigates “where were the auditors in asset securitizations”, a criticism of the audit profession before and after the onset of the global financial crisis (GFC). Asset securitizations increase audit complexity and audit risks, which are expected to increase audit fees. Using US bank holding company data from 2003 to 2009, this study examines the association between asset securitization risks and audit fees, and its changes during the global financial crisis. The main test is based on an ordinary least squares (OLS) model, which is adapted from the Fields et al. (2004) bank audit fee model. I employ a principal components analysis to address high correlations among asset securitization risks. Individual securitization risks are also separately tested. A suite of sensitivity tests indicates that the results are robust. These include model alterations, sample variations, further controls in the tests, and correcting for the securitizer self-selection problem. A partial least squares (PLS) path modelling methodology is introduced as a separate test, which allows for high intercorrelations, self-selection correction, and sequential order hypotheses in one simultaneous model. The PLS results are consistent with the main results. The study finds significant and positive associations between securitization risks and audit fees. After the commencement of the global financial crisis in 2007, there was an increased focus on the role of audits in asset securitization risks resulting from bank failures; I therefore expect that auditors would become more sensitive to bank asset securitization risks after the commencement of the crisis. I find that auditors appear to focus on different aspects of asset securitization risks during the crisis and that auditors appear to charge a GFC premium for banks. Overall, the results support the view that auditors consider asset securitization risks and market changes, and adjust their audit effort and risk considerations accordingly.





Healthcare organizations in all OECD countries have continued to undergo change. These changes have been found to have a negative effect on the work engagement of nursing staff. While the extent to which nursing staff dealt with these changes has been documented in the literature, little is known of how they utilized their personal resources to deal with the consequences of these changes. This study addresses this gap by integrating the Job Demands-Resources theoretical perspective with Positive Psychology, in particular psychological capital (PsyCap). PsyCap is operationalized as a source of personal resources. Data were collected from 401 nurses in Australia and analyses were undertaken using Partial Least Squares modelling and moderation analysis. Two types of changes to nursing work were identified. The first was an increase in changes to the work environment of nursing. These changes, which included increasing administrative workload and amount of work, resulted in more job demands and job resources. The second type of change related to reductions in training and management support, which resulted in fewer job demands. Nurses with more job demands utilized more job resources to address these increasing demands. We found PsyCap to be a crucial source of personal resources that has a moderating effect on the negative effects of job demands and role stress. PsyCap and job resources were both critical in enhancing the work engagement of nurses as they encountered changes to nursing work. These findings provide empirical support for a positive psychological perspective on understanding nursing engagement.





Purpose: The purpose of this paper is to explore the role of marketing in today's enterprises and to examine the antecedents of the marketing department's influence and its relationship with market orientation and firm performance. Design/methodology/approach: Data were collected from the West (i.e. the USA and Europe) and the East (i.e. Asia). Partial least squares (PLS) was used to estimate structural models. Findings: First, the findings support the idea that a strong and influential marketing department contributes positively to firm performance. This finding holds for Western and Asian firms, and for small/medium and large firms alike. Second, the marketing department's influence in a firm depends more on its responsibilities and resources, and less on internal contingency factors (i.e. a firm's competitive strategy or institutional attributes). Third, a marketing department's influence in the West affects firm performance both directly and indirectly (via market orientation). In contrast, this relationship is fully mediated among Eastern firms. Fourth, low-cost strategies enhance the influence of a firm's marketing department in the East, but not in the West. Research limitations/implications: The paper assumes explicitly that a marketing department's influence is an antecedent of its market orientation. While the paper finds support for this link, it did not test for dual causality between the constructs. Originality/value: Countering the frequent claim in anecdotal and journalistic work that the role of the marketing department is diminishing, the findings show that across different geographic regions and firm sizes, strong marketing departments improve firm performance (especially in the marketing-savvy West), and that they should continue to play an important role in firms.





Previous research identifies various reasons companies invest in information technology (IT), often as a means to generate value. To add to the discussion of IT value generation, this study investigates investments in enterprise software systems that support business processes. Managers of more than 500 Swiss small and medium-sized enterprises (SMEs) responded to a survey regarding the levels of their IT investment in enterprise software systems and the perceived utility of those investments. The authors use logistic and ordinary least squares regression to examine whether IT investments in two business processes affect SMEs' performance and competitive advantage. Using cluster analysis, they also develop a firm typology with four distinct groups that differ in their investments in enterprise software systems. These findings offer key implications for both research and managerial practice.