947 results for stochastic search variable selection


Relevance:

30.00%

Publisher:

Abstract:

Experimental work and analysis were done to investigate engine startup robustness and emissions of a flex-fuel spark ignition (SI) direct injection (DI) engine. The vaporization and other characteristics of ethanol fuel blends present a challenge at engine startup. Strategies to reduce the enrichment requirement for the first engine startup cycle, and the emissions of the second and third fired cycles, at 25°C ± 1°C engine and intake air temperature were investigated. Research was conducted on a single-cylinder SIDI engine with gasoline and E85 fuels to study effects on the first fired cycle of engine startup. Piston configurations that included a compression ratio change (11 vs 15.5) and a piston geometry change (flat-top vs bowl) were tested, along with changes in intake cam timing (95, 110, 125) and fuel pressure (0.4 MPa vs 3 MPa). The goal was to replicate the engine speed, manifold pressure, fuel pressure, and testing temperature from an engine startup trace in order to investigate the first fired cycle. Results showed that the bowl piston enabled engine starts at lower equivalence ratios with gasoline fuel, while also showing lower IMEP at the same equivalence ratio compared to the flat-top piston. With E85, the bowl piston showed reduced IMEP as compression ratio increased at the same equivalence ratio. A preference for constant intake valve timing across fuels suggested that the flat-top piston might be a good flex-fuel piston. Significant improvements were seen with the higher-CR bowl piston for high fuel pressure starts, but no improvement was seen at low fuel pressures. Simulation work was conducted in GT-POWER to analyze the initial three cycles of engine startup for the same set of hardware used in the experiments. A steady-state validated model was modified for startup conditions. The results provided an understanding of the relative residual levels and IMEP at the test points in the cam phasing space. This allowed selecting additional test points that make use of higher residual levels, while eliminating those whose trapped mass was too small to produce the IMEP required for proper engine turnover. The second phase of experimental testing, for the second and third startup cycles, revealed that both E10 and E85 prefer the same SOI of 240°bTDC at the second and third startup cycles for the flat-top piston and high injection pressures. The optimal cam timing for startup showed that E85 tolerates more residuals than E10. Higher internal residuals drive down the equivalence ratio (Ø) requirement for both fuels up to their combustion stability limit; this is thought to be a direct benefit to vaporization due to increased cycle start temperature. Benefits are shown for an advanced IMOP and retarded EMOP strategy at engine startup. Overall, the amount of residuals preferred by the engine for E10 fuel at startup appears to be constant across engine speed, which could enable easier selection of optimized cam positions across the startup speeds.

Relevance:

30.00%

Publisher:

Abstract:

Wind energy has been one of the fastest-growing sectors of the nation's renewable energy portfolio for the past decade, and the same tendency is projected for the upcoming years given the aggressive governmental policies for the reduction of fossil fuel dependency. So-called Horizontal Axis Wind Turbine (HAWT) technologies have shown great technological promise and outstanding commercial penetration. Given this broad acceptance, the size of wind turbines has grown exponentially over time. However, safety and economic concerns have emerged as a result of new design tendencies for massive-scale wind turbine structures presenting high slenderness ratios and complex shapes, typically located in remote areas (e.g., offshore wind farms). In this regard, safe operation requires not only first-hand information about actual structural dynamic conditions under aerodynamic action, but also a deep understanding of the environmental factors in which these multibody rotating structures operate. Given the cyclo-stochastic patterns of the wind loading exerting pressure on a HAWT, a probabilistic framework is appropriate to characterize the risk of failure in terms of resistance and serviceability conditions at any given time. Furthermore, sources of uncertainty such as material imperfections, buffeting and flutter, aeroelastic damping, gyroscopic effects, and turbulence, among others, have called for a more sophisticated mathematical framework that can properly handle all these sources of indetermination. The modeling complexity that arises from these characterizations demands a data-driven experimental validation methodology to calibrate and corroborate the model. To this end, System Identification (SI) techniques offer a spectrum of well-established numerical methods appropriate for stationary, deterministic, data-driven numerical schemes, capable of predicting actual dynamic states (eigenrealizations) of traditional time-invariant dynamic systems. Consequently, a modified data-driven SI metric is proposed, based on so-called Subspace Realization Theory, now adapted for stochastic, non-stationary, and time-varying systems, as is the case for a HAWT's complex aerodynamics. Simultaneously, this investigation explores the characterization of the turbine loading and response envelopes for critical failure modes of the structural components of which the wind turbine is made. In the long run, both the aerodynamic framework (theoretical model) and system identification (experimental model) will be merged in a numerical engine formulated as a search algorithm for model updating, known as the Adaptive Simulated Annealing (ASA) process. This iterative engine is based on a set of function minimizations computed by a metric called the Modal Assurance Criterion (MAC). In summary, the Thesis is composed of four major parts: (1) development of an analytical aerodynamic framework that predicts interacting wind-structure stochastic loads on wind turbine components; (2) development of a novel tapered-swept-curved Spinning Finite Element (SFE) that includes damped-gyroscopic effects and axial-flexural-torsional coupling; (3) a novel data-driven structural health monitoring (SHM) algorithm via stochastic subspace identification methods; and (4) a numerical search (optimization) engine based on ASA and MAC capable of updating the SFE aerodynamic model.
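The Modal Assurance Criterion mentioned above is a standard modal correlation measure; a minimal sketch in Python follows (the function name and example vectors are illustrative, not taken from the thesis):

```python
import numpy as np

def mac(phi_a: np.ndarray, phi_x: np.ndarray) -> float:
    """Modal Assurance Criterion between two mode-shape vectors.
    Returns a value in [0, 1]: 1 means the shapes coincide up to
    scaling, 0 means they are orthogonal."""
    num = np.abs(np.vdot(phi_a, phi_x)) ** 2        # |phi_a^H phi_x|^2
    den = np.vdot(phi_a, phi_a).real * np.vdot(phi_x, phi_x).real
    return float(num / den)

# Compare an analytical mode shape with an identified one:
phi_analytical = np.array([0.00, 0.31, 0.59, 0.81, 0.95, 1.00])
phi_identified = np.array([0.00, 0.33, 0.57, 0.83, 0.93, 1.00])
print(f"MAC = {mac(phi_analytical, phi_identified):.4f}")    # close to 1
```

In a model-updating loop such as the ASA search described above, one natural objective is to minimize 1 - MAC summed over paired analytical and identified modes.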

Relevance:

30.00%

Publisher:

Abstract:

Invasive species often evolve rapidly in response to the novel biotic and abiotic conditions in their introduced range. Such adaptive evolutionary changes might play an important role in the success of some invasive species. Here, we investigated whether introduced European populations of the South African ragwort Senecio inaequidens (Asteraceae) have genetically diverged from native populations. We carried out a greenhouse experiment in which 12 South African and 11 European populations were grown for several months at two levels of nutrient availability, as well as in the presence or absence of a generalist insect herbivore. We found that, in contrast to a current hypothesis, plants from introduced populations had a significantly lower reproductive output, but higher allocation to root biomass, and they were more tolerant of insect herbivory. Moreover, introduced populations were less genetically variable, but displayed greater plasticity in response to fertilization. Finally, introduced populations were phenotypically most similar to a subset of native populations from mountainous regions in southern Africa. Taking into account the species' likely history of introduction, our data support the idea that the invasion success of Senecio inaequidens in Central Europe is based on selective introduction of specific preadapted and plastic genotypes rather than on adaptive evolution in the introduced range.

Relevance:

30.00%

Publisher:

Abstract:

BACKGROUND: Osteoarthritis is the most common form of joint disease and the leading cause of pain and physical disability in the elderly. Transcutaneous electrical nerve stimulation (TENS), interferential current stimulation and pulsed electrostimulation are used widely to control both acute and chronic pain arising from several conditions, but some policy makers regard the efficacy evidence as insufficient. OBJECTIVES: To compare transcutaneous electrostimulation with sham or no specific intervention in terms of effects on pain and withdrawals due to adverse events in patients with knee osteoarthritis. SEARCH STRATEGY: We updated the search in CENTRAL, MEDLINE, EMBASE, CINAHL and PEDro up to 5 August 2008, checked conference proceedings and reference lists, and contacted authors. SELECTION CRITERIA: Randomised or quasi-randomised controlled trials that compared transcutaneously applied electrostimulation with a sham intervention or no intervention in patients with osteoarthritis of the knee. DATA COLLECTION AND ANALYSIS: We extracted data using standardised forms and contacted investigators to obtain missing outcome information. Main outcomes were pain and withdrawals or dropouts due to adverse events. We calculated standardised mean differences (SMDs) for pain and relative risks for safety outcomes, and used inverse-variance random-effects meta-analysis. The analysis of pain was based on predicted estimates from meta-regression using the standard error as explanatory variable. MAIN RESULTS: In this update we identified 14 additional trials, resulting in the inclusion of 18 small trials in 813 patients. Eleven trials used TENS, four used interferential current stimulation, one used both TENS and interferential current stimulation, and two used pulsed electrostimulation. The methodological quality and the quality of reporting were poor, and a high degree of heterogeneity among the trials (I² = 80%) was revealed. The funnel plot for pain was asymmetrical (P < 0.001). The predicted SMD of pain intensity in trials as large as the largest trial was -0.07 (95% CI -0.46 to 0.32), corresponding to a difference in pain scores between electrostimulation and control of 0.2 cm on a 10 cm visual analogue scale. There was little evidence that SMDs differed by type of electrostimulation (P = 0.94). The relative risk of being withdrawn or dropping out due to adverse events was 0.97 (95% CI 0.2 to 6.0). AUTHORS' CONCLUSIONS: In this update, we could not confirm that transcutaneous electrostimulation is effective for pain relief. The current systematic review is inconclusive, hampered by the inclusion of only small trials of questionable quality. Appropriately designed trials of adequate power are warranted.
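A minimal sketch of the inverse-variance random-effects pooling used in such reviews (DerSimonian-Laird estimate of the between-trial variance; the per-trial numbers below are illustrative, not data from this review):

```python
import numpy as np

def random_effects_pool(smd, se):
    """Inverse-variance random-effects meta-analysis of SMDs
    (DerSimonian-Laird between-trial variance estimate)."""
    smd, se = np.asarray(smd, float), np.asarray(se, float)
    w = 1.0 / se**2                               # fixed-effect weights
    mu_fe = np.sum(w * smd) / np.sum(w)
    q = np.sum(w * (smd - mu_fe) ** 2)            # Cochran's Q
    df = len(smd) - 1
    c = np.sum(w) - np.sum(w**2) / np.sum(w)
    tau2 = max(0.0, (q - df) / c)                 # between-trial variance
    w_re = 1.0 / (se**2 + tau2)                   # random-effects weights
    mu = np.sum(w_re * smd) / np.sum(w_re)
    se_mu = np.sqrt(1.0 / np.sum(w_re))
    i2 = 100.0 * max(0.0, (q - df) / q) if q > 0 else 0.0
    return mu, (mu - 1.96 * se_mu, mu + 1.96 * se_mu), i2

mu, ci, i2 = random_effects_pool([-0.9, -0.3, 0.1, -1.2], [0.30, 0.25, 0.20, 0.40])
print(f"pooled SMD = {mu:.2f}, 95% CI ({ci[0]:.2f}, {ci[1]:.2f}), I2 = {i2:.0f}%")
```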

Relevance:

30.00%

Publisher:

Abstract:

The first section of this chapter starts with the Buffon problem, which is one of the oldest in stochastic geometry, and then continues with the definition of measures on the space of lines. The second section defines random closed sets and related measurability issues, explains how to characterize distributions of random closed sets by means of capacity functionals and introduces the concept of a selection. Based on this concept, the third section starts with the definition of the expectation and proves its convexifying effect that is related to the Lyapunov theorem for ranges of vector-valued measures. Finally, the strong law of large numbers for Minkowski sums of random sets is proved and the corresponding limit theorem is formulated. The chapter is concluded by a discussion of the union-scheme for random closed sets and a characterization of the corresponding stable laws.
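For reference, the two central objects of the chapter can be written out; these are the standard formulations from the random-sets literature, not verbatim from the text:

```latex
% Capacity functional of a random closed set X (by the Choquet
% theorem it determines the distribution of X):
T_X(K) = \mathbf{P}\{\, X \cap K \neq \emptyset \,\}, \qquad K \subset \mathbb{R}^d \text{ compact}.

% Selection (Aumann) expectation: the closure of the set of
% expectations of all integrable selections \xi of X:
\mathbb{E}X = \operatorname{cl}\bigl\{\, \mathbb{E}\xi \;:\; \xi \in L^1,\ \xi(\omega) \in X(\omega) \ \text{a.s.} \,\bigr\}.
```

The convexifying effect mentioned above is the fact that, over a non-atomic probability space, the selection expectation is convex even when X itself is not, a consequence of the Lyapunov theorem cited in the abstract.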

Relevance:

30.00%

Publisher:

Abstract:

During the school-to-work transition, adolescents develop values and prioritize what is important in their life. Values are concepts or beliefs about desirable states or behaviors that guide the selection or evaluation of behavior and events, and are ordered by their relative importance (Schwartz & Bilsky, 1987). Stressing the important role of values, career research has intensively studied the effect of values on educational decisions and early career development (e.g., Eccles, 2005; Hirschi, 2010; Rimann, Udris, & Weiss, 2000). Few researchers, however, have so far investigated how values develop in the early career phase and how value trajectories are influenced by individual characteristics. Values can be oriented towards specific life domains, such as work or family. Work values include intrinsic and extrinsic aspects of work (e.g., self-development, cooperation with others, income) (George & Jones, 1997). Family values include the importance of partnership, the creation of one's own family, and having children (Mayer, Kuramschew, & Trommsdorff, 2009). Research indicates that work values change considerably during early career development (Johnson, 2001; Lindsay & Knox, 1984). Individual differences in work values and value trajectories have been found, e.g., in relation to gender (Duffy & Sedlacek, 2007), parental background (Loughlin & Barling, 2001), personality (Lowry et al., 2012), education (Battle, 2003), and the anticipated timing of the school-to-work transition (Porfeli, 2007). In contrast to work values, research on family value trajectories is rare, and knowledge about their development during the school-to-work transition and early career development is lacking. This paper aims at filling this research gap. Focusing on family values and intrinsic work values, we expect (a) family and work values to change between ages 16 and 25, and (b) initial levels of family and work values, as well as value change, to be predicted by gender, reading literacy, ambition, and expected duration of education.

Method. Using data from 2620 young adults (59.5% female) who participated in the Swiss longitudinal study TREE, latent growth modeling was employed to estimate the initial level and growth rate per year for work and family values. Analyses are based on TREE waves 1 (year 2001, first year after compulsory school) to 8 (year 2010). Variables in the models included family values and intrinsic work values, gender, reading literacy, ambition, and expected duration of education. Language region was included as a control variable.

Results. Family values did not change significantly over the first four years after leaving compulsory school (mean slope = -.03, p = .36). They did, however, increase significantly from five years after compulsory school onwards (mean slope = .13, p < .001). Intercept (.23, p < .001), first slope (.02, p < .001), and second slope (.01, p < .001) showed significant variance. Initial levels were higher for men and for those with higher ambitions. Increases were steeper for males, as well as for participants with lower educational duration expectations and lower reading skills. Intrinsic work values increased over the first four years (mean slope = .03, p < .05) and showed a tendency to decrease in years five to ten (mean slope = -.01, p < .10). Intercept (.21, p < .001), first slope (.01, p < .001), and second slope (.01, p < .001) showed significant variance, meaning that there are individual differences in initial levels and growth rates. Initial levels were higher for females and for those with higher ambitions, longer expected educational pathways, and lower reading skills. Growth rates were lower in the first phase and steeper in the second phase for males compared to females.

Discussion. In general, results showed different patterns of work and family value trajectories, and different individual factors related to initial levels and development after compulsory school. These developments seem to fit major life and career roles: in the first years after compulsory school, young adults may be engaged in becoming established in a job; later on, raising a family becomes more important. That we found significant gender differences in work and family value trajectories may reflect attempts to overcome traditional roles, as overall, women increase in work values and men increase in family values, resulting in an overall trend to converge.
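The two-slope latent growth model sketched in the Method section can be written generically as follows (a piecewise specification with illustrative basis loadings, not the authors' exact parameterization):

```latex
% Value of person i at wave t: intercept plus two phase-specific
% slopes (years 1-4 and years 5-10 after compulsory school),
% with a_{1t}, a_{2t} fixed basis loadings splitting the time axis.
y_{it} = \eta_{0i} + \eta_{1i}\, a_{1t} + \eta_{2i}\, a_{2t} + \varepsilon_{it},
\qquad
(\eta_{0i}, \eta_{1i}, \eta_{2i})^{\top} \sim \mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\Psi}),
```

where the entries of μ correspond to the reported mean intercept and slopes, the diagonal of Ψ to the significant variances, and gender, reading literacy, ambition, and expected duration of education enter as predictors of the latent factors.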

Relevance:

30.00%

Publisher:

Abstract:

In this paper we examined whether defenders of victims of school bullying befriended similar peers, and whether such similarity was due to selection processes, influence processes, or both. We also examined whether these processes result in different degrees of similarity between peers depending on teachers' self-efficacy and the school climate. We analyzed longitudinal data on 478 Swiss school students employing actor-based stochastic models. Our analyses showed that similarity in defending behavior among friends was due to selection rather than influence. The extent to which adolescents selected peers showing similar defending behavior was related to contextual factors: lower teacher self-efficacy and a positive school climate were associated with increased selection effects in terms of defending behavior.

Relevance:

30.00%

Publisher:

Abstract:

AIM Information regarding the selection procedure for selective dorsal rhizotomy (SDR) in children with spastic cerebral palsy (CP) is scarce. The aim of this study was therefore to summarize the selection criteria for SDR in children with spastic CP. METHOD A systematic review was carried out using the following databases: MEDLINE, CINAHL, EMBASE, PEDro, and the Cochrane Library. Additional studies were identified in the reference lists. Search terms included 'selective dorsal rhizotomy', 'functional posterior rhizotomy', 'selective posterior rhizotomy', and 'cerebral palsy'. Studies were selected if they mainly studied children (<18y of age) with spastic CP, if the intervention was SDR, if they gave a detailed description of the selection criteria, and if they were in English. The levels of evidence, conduct of studies, and selection criteria for SDR were scored. RESULTS Fifty-two studies were included. Selection criteria were reported in 16 International Classification of Functioning, Disability and Health model domains, including 'body structure and function' (details concerning spasticity [94%], other movement abnormalities [62%], and strength [54%]), 'activity' (gross motor function [27%]), and 'personal and environmental factors' (age [44%], diagnosis [50%], motivation [31%], previous surgery [21%], and follow-up therapy [31%]). Most selection criteria were not based on standardized measurements. INTERPRETATION Selection criteria for SDR vary considerably. Future studies should clearly describe the selection procedure. International meetings of experts should develop more uniform consensus guidelines, which could form the basis for selecting candidates for SDR.

Relevance:

30.00%

Publisher:

Abstract:

A search is performed for WH production with a light Higgs boson decaying to hidden-sector particles resulting in clusters of collimated electrons, known as electron-jets. The search is performed with 2.04 fb⁻¹ of data collected in 2011 with the ATLAS detector at the Large Hadron Collider in proton-proton collisions at √s = 7 TeV. One event satisfying the signal selection criteria is observed, which is consistent with the expected background rate. Limits on the product of the WH production cross section and the branching ratio of the Higgs boson decaying to prompt electron-jets are calculated as a function of the Higgs boson mass in the range from 100 to 140 GeV.

Relevance:

30.00%

Publisher:

Abstract:

OBJECTIVE To systematically analyze the regenerative effect of the available biomaterials, either alone or in various combinations, for the treatment of periodontal intrabony defects, as evaluated in preclinical histologic studies. DATA SOURCES A protocol covered all aspects of the systematic review methodology. A literature search was performed in Medline, including hand searching. Combinations of search terms and several criteria were applied for study identification, selection, and inclusion. The primary outcome variable was periodontal regeneration after reconstructive surgery obtained with the various regenerative materials, as demonstrated through histologic/histomorphometric analysis. New periodontal ligament, new cementum, and new bone formation, as a linear measurement in mm or as a percentage of the instrumented root length, were recorded. Data were extracted based on the general characteristics, study characteristics, methodologic characteristics, and conclusions. Study selection was limited to preclinical studies involving histologic analysis, evaluating the use of potential regenerative materials (ie, barrier membranes, grafting materials, or growth factors/proteins) for the treatment of periodontal intrabony defects. Any type of biomaterial alone or in various combinations was considered. All studies reporting histologic outcome measures with a healing period of at least 6 weeks were included. A meta-analysis was not possible due to the heterogeneity of the data. CONCLUSION Flap surgery in conjunction with most of the evaluated biomaterials, used either alone or in various combinations, has been shown to promote periodontal regeneration to a greater extent than control therapy (flap surgery without biomaterials). Among the biomaterials used, autografts revealed the most favorable outcomes, whereas most biologic factors showed inferior results compared with flap surgery alone.

Relevance:

30.00%

Publisher:

Abstract:

A search is conducted for non-resonant new phenomena in dielectron and dimuon final states, originating from either contact interactions or large extra spatial dimensions. The LHC 2012 proton-proton collision dataset recorded by the ATLAS detector is used, corresponding to 20 fb⁻¹ at √s = 8 TeV. The dilepton invariant mass spectrum is a discriminating variable in both searches, with the contact interaction search additionally utilizing the dilepton forward-backward asymmetry. No significant deviations from the Standard Model expectation are observed. Lower limits are set on the ℓℓqq contact interaction scale Λ between 15.4 TeV and 26.3 TeV, at the 95% credibility level. For large extra spatial dimensions, lower limits are set on the string scale MS between 3.2 TeV and 5.0 TeV.

Relevance:

30.00%

Publisher:

Abstract:

Bargaining is the building block of many economic interactions, ranging from bilateral to multilateral encounters and from situations in which the actors are individuals to negotiations between firms or countries. In all these settings, economists have long been intrigued by the fact that some projects, trades, or agreements are not realized even though they are mutually beneficial. On the one hand, this has been explained by incomplete information. A firm may not be willing to offer a wage that is acceptable to a qualified worker, because it knows that there are also unqualified workers and cannot distinguish between the two types. This phenomenon is known as adverse selection. On the other hand, it has been argued that even with complete information, the presence of externalities may impede efficient outcomes. To see this, consider the example of climate change. If a subset of countries agrees to curb emissions, non-participant regions benefit from the signatories' efforts without incurring costs. These free-riding opportunities give rise to incentives to strategically improve one's bargaining power that work against the formation of a global agreement. This thesis is concerned with extending our understanding of both factors, adverse selection and externalities. The findings are based on empirical evidence from original laboratory experiments as well as game-theoretic modeling. On a very general note, it is demonstrated that the institutions through which agents interact matter to a large extent. Insights are provided about which institutions we should expect to perform better than others, at least in terms of aggregate welfare.

Chapters 1 and 2 focus on the problem of adverse selection. Effective operation of markets and other institutions often depends on good information transmission properties. In terms of the example introduced above, a firm is only willing to offer high wages if it receives enough positive signals about the worker's quality during the application and wage bargaining process. In Chapter 1, it will be shown that repeated interaction coupled with time costs facilitates information transmission. By making the wage bargaining process costly for the worker, the firm is able to obtain more accurate information about the worker's type. The cost could be pure time cost from delaying agreement or cost of effort arising from a multi-step interviewing process. In Chapter 2, I abstract from time costs and show that communication can play a similar role. The simple fact that a worker states that he is of high quality may be informative. In Chapter 3, the focus is on a different source of inefficiency. Agents strive for bargaining power and thus may be motivated by incentives that are at odds with the socially efficient outcome. I have already mentioned the example of climate change. Other examples are coalitions within committees that are formed to secure voting power to block outcomes, or groups that commit to different technological standards although a single standard would be optimal (e.g., the format war between HD DVD and Blu-ray). It will be shown that such inefficiencies are directly linked to the presence of externalities and a certain degree of irreversibility in actions.

I now discuss the three articles in more detail. In Chapter 1, Olivier Bochet and I study a simple bilateral bargaining institution that eliminates trade failures arising from incomplete information. In this setting, a buyer makes offers to a seller in order to acquire a good. Whenever an offer is rejected by the seller, the buyer may submit a further offer. Bargaining is costly, because both parties suffer a (small) time cost after any rejection. The difficulties arise because the good can be of low or high quality, and the quality of the good is known only to the seller. Indeed, without the possibility of making repeated offers, it is too risky for the buyer to offer prices that allow for trade of high-quality goods. When repeated offers are allowed, however, both types of goods trade with probability one in equilibrium. We provide an experimental test of these predictions. Buyers gather information about sellers using specific price offers, and rates of trade are high, much in line with the model's qualitative predictions. We also observe a persistent over-delay before trade occurs, which reduces efficiency substantially. Possible channels for over-delay are identified in the form of two behavioral assumptions missing from the standard model, loss aversion (buyers) and haggling (sellers), which reconcile the data with the theoretical predictions.

Chapter 2 also studies adverse selection, but interaction between buyers and sellers now takes place within a market rather than in isolated pairs. Remarkably, in a market it suffices to let agents communicate in a very simple manner to mitigate trade failures. The key insight is that better-informed agents (sellers) are willing to truthfully reveal their private information, because by doing so they are able to reduce search frictions and attract more buyers. Behavior observed in the experimental sessions closely follows the theoretical predictions. As a consequence, costless and non-binding communication (cheap talk) significantly raises rates of trade and welfare. Previous experiments have documented that cheap talk alleviates inefficiencies due to asymmetric information. These findings are explained by pro-social preferences and lie aversion. I use appropriate control treatments to show that such considerations play only a minor role in our market. Instead, the experiment highlights the ability to organize markets as a new channel through which communication can facilitate trade in the presence of private information.

In Chapter 3, I theoretically explore coalition formation via multilateral bargaining under complete information. The environment studied is extremely rich in the sense that the model allows for all kinds of externalities. This is achieved by using so-called partition functions, which pin down a coalitional worth for each possible coalition in each possible coalition structure. It is found that although binding agreements can be written, efficiency is not guaranteed, because the negotiation process is inherently non-cooperative. The prospects of cooperation are shown to depend crucially on (i) the degree to which players can renegotiate and gradually build up agreements and (ii) the absence of a certain type of externalities that can loosely be described as incentives to free ride. Moreover, the willingness to concede bargaining power is identified as a novel reason for gradualism. Another key contribution of the study is that it identifies a strong connection between the Core, one of the most important concepts in cooperative game theory, and the set of environments for which efficiency is attained even without renegotiation.
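For reference, the partition-function and Core concepts used in Chapter 3 can be stated as follows; these are the standard cooperative-game definitions, not quotations from the thesis:

```latex
% Partition function: the worth of a coalition S depends on the whole
% coalition structure \pi; this dependence is what encodes externalities.
v(S; \pi) \in \mathbb{R}, \qquad \pi \text{ a partition of } N,\; S \in \pi.

% Core of a characteristic-function game (N, v): feasible payoff
% vectors that no coalition can profitably block.
C(v) = \Bigl\{ x \in \mathbb{R}^N \;:\; \textstyle\sum_{i \in N} x_i = v(N),\;\;
       \textstyle\sum_{i \in S} x_i \ge v(S) \;\; \forall S \subseteq N \Bigr\}.
```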

Relevance:

30.00%

Publisher:

Abstract:

This paper presents a parallel surrogate-based global optimization method for computationally expensive objective functions that is more effective for larger numbers of processors. To reach this goal, we integrated concepts from multi-objective optimization and tabu search into single-objective surrogate optimization. Our proposed derivative-free algorithm, called SOP, uses non-dominated sorting of points for which the expensive function has been previously evaluated. The two objectives are the expensive function value of the point and the minimum distance of the point to previously evaluated points. Based on the results of non-dominated sorting, P points from the sorted fronts are selected as centers, from which many candidate points are generated by random perturbations. Based on the surrogate approximation, the best candidate point is then selected for expensive evaluation for each of the P centers, with simultaneous computation on P processors. Centers that previously did not generate good solutions are made tabu for a given tenure. We show almost sure convergence of this algorithm under some conditions. The performance of SOP is compared with two RBF-based methods. The test results show that SOP is an efficient method that can reduce the time required to find a good near-optimal solution. In a number of cases the efficiency of SOP is so good that SOP with 8 processors found an accurate answer in less wall-clock time than the other algorithms did with 32 processors.
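A minimal sketch of one SOP-style iteration as described above; the perturbation scale, tabu handling, and choice of RBF surrogate are illustrative assumptions, not the paper's exact algorithm:

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

def non_dominated_fronts(F):
    """Sort rows of F (n x 2, both objectives minimized) into Pareto fronts."""
    remaining = list(range(len(F)))
    fronts = []
    while remaining:
        front = [i for i in remaining
                 if not any(np.all(F[j] <= F[i]) and np.any(F[j] < F[i])
                            for j in remaining if j != i)]
        fronts.append(front)
        remaining = [i for i in remaining if i not in front]
    return fronts

def sop_iteration(X, y, lo, hi, P, tabu, rng, n_cand=100, sigma=0.1):
    """Pick P centers and propose P points for parallel expensive evaluation."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(D, np.inf)
    # Objective 1: expensive value; objective 2: negated minimum distance
    # to other evaluated points (rewards exploration when minimized).
    F = np.column_stack([y, -D.min(axis=1)])
    order = [i for fr in non_dominated_fronts(F)
             for i in sorted(fr, key=lambda k: y[k])]
    centers = [i for i in order if i not in tabu][:P]

    surrogate = RBFInterpolator(X, y)          # cheap stand-in for f
    proposals = []
    for c in centers:
        cand = X[c] + sigma * (hi - lo) * rng.standard_normal((n_cand, X.shape[1]))
        cand = np.clip(cand, lo, hi)
        proposals.append(cand[np.argmin(surrogate(cand))])
    return centers, np.array(proposals)        # evaluate f on P processors

# Illustrative use on a toy 2-D function:
rng = np.random.default_rng(0)
f = lambda x: float(np.sum((x - 0.3) ** 2))
X = rng.uniform(0.0, 1.0, (12, 2))
y = np.array([f(x) for x in X])
centers, props = sop_iteration(X, y, 0.0, 1.0, P=4, tabu=set(), rng=rng)
```

Centers whose proposals fail to improve on the incumbent would then be added to `tabu` for a fixed tenure, mirroring the tabu-search ingredient of the method.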

Relevance:

30.00%

Publisher:

Abstract:

In population studies, most current methods focus on identifying one outcome-related SNP at a time by testing for differences of genotype frequencies between disease and healthy groups or among different population groups. However, testing a great number of SNPs simultaneously poses a multiple-testing problem and will give false-positive results. Although this problem can be dealt with effectively through several approaches, such as Bonferroni correction, permutation testing, and false discovery rates, patterns of joint effects of several genes, each with a weak effect, might not be detectable. With the availability of high-throughput genotyping technology, searching for multiple scattered SNPs over the whole genome and modeling their joint effect on the target variable has become possible. Exhaustive search of all SNP subsets is computationally infeasible for millions of SNPs in a genome-wide study. Several effective feature selection methods combined with classification functions have been proposed to search for an optimal SNP subset among big data sets where the number of feature SNPs far exceeds the number of observations.

In this study, we took two steps to achieve this goal. First, we selected 1000 SNPs through an effective filter method; then we performed feature selection wrapped around a classifier to identify an optimal SNP subset for predicting disease. We also developed a novel classification method, the sequential information bottleneck (sIB) method, wrapped inside different search algorithms to identify an optimal subset of SNPs for classifying the outcome variable. This new method was compared with classical linear discriminant analysis (LDA) in terms of classification performance. Finally, we performed chi-square tests to examine the relationship between each SNP and disease from another point of view.

In general, our results show that filtering features using the harmonic mean of sensitivity and specificity (HMSS) through LDA is better than using LDA training accuracy or mutual information in our study. Our results also demonstrate that exhaustive search of a small subset (one SNP, two SNPs, or a 3-SNP subset based on the best 100 composite 2-SNPs) can find an optimal subset, and that further inclusion of more SNPs through a heuristic algorithm does not always increase the performance of SNP subsets. Although sequential forward floating selection can be applied to prevent the nesting effect of forward selection, it does not always outperform the latter, due to overfitting from observing more complex subset states.

Our results also indicate that HMSS, as a criterion to evaluate the classification ability of a function, can be used on imbalanced data without modifying the original dataset, in contrast to classification accuracy. Our four studies suggest that the sequential information bottleneck (sIB), a new unsupervised technique, can be adopted to predict the outcome, and that its ability to detect the target status is superior to traditional LDA in this study.

From our results we can see that the best test probability-HMSS for predicting CVD, stroke, CAD, and psoriasis through sIB is 0.59406, 0.641815, 0.645315, and 0.678658, respectively. In terms of group prediction accuracy, the highest test accuracy of sIB for diagnosing a normal status among controls can reach 0.708999, 0.863216, 0.639918, and 0.850275, respectively, in the four studies if the test accuracy among cases is required to be not less than 0.4. On the other hand, the highest test accuracy of sIB for diagnosing a disease among cases can reach 0.748644, 0.789916, 0.705701, and 0.749436, respectively, in the four studies if the test accuracy among controls is required to be at least 0.4.

A further genome-wide association study through chi-square testing shows that no significant SNPs are detected at the cut-off level 9.09451E-08 in the Framingham Heart Study of CVD. Study results in WTCCC detect only two significant SNPs associated with CAD. In the genome-wide study of psoriasis, most of the top 20 SNP markers with impressive classification accuracy are also significantly associated with the disease through the chi-square test at the cut-off value 1.11E-07.

Although our classification methods can achieve high accuracy in the study, complete descriptions of those classification results (95% confidence intervals or statistical tests of differences) require more cost-effective methods or an efficient computing system, neither of which could be accomplished currently in our genome-wide study. We should also note that the purpose of this study is to identify subsets of SNPs with high prediction ability; SNPs with good discriminant power are not necessarily causal markers for the disease.
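A minimal sketch of the HMSS filter criterion described above, scoring one-SNP LDA classifiers by cross-validation (the genotype coding and the CV setup are illustrative assumptions):

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_predict

def hmss(y_true, y_pred):
    """Harmonic mean of sensitivity and specificity."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    return 2 * sens * spec / (sens + spec) if sens + spec else 0.0

def filter_snps(genotypes, status, k=1000):
    """Rank SNPs by cross-validated HMSS of a one-SNP LDA classifier."""
    scores = []
    for j in range(genotypes.shape[1]):
        x = genotypes[:, [j]].astype(float)        # genotype coded 0/1/2
        pred = cross_val_predict(LinearDiscriminantAnalysis(), x, status, cv=5)
        scores.append(hmss(status, pred))
    return np.argsort(scores)[::-1][:k]            # indices of top-k SNPs

# Illustrative use on simulated genotypes (200 subjects, 50 SNPs):
rng = np.random.default_rng(1)
G = rng.integers(0, 3, size=(200, 50))
y = rng.integers(0, 2, size=200)
top = filter_snps(G, y, k=10)
```

HMSS, unlike raw accuracy, cannot be inflated by always predicting the majority class, which is why it suits the imbalanced case-control data discussed above.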

Relevance:

30.00%

Publisher:

Abstract:

Strategies are compared for the development of a linear regression model with stochastic (multivariate normal) regressor variables and the subsequent assessment of its predictive ability. Bias and mean squared error of four estimators of predictive performance are evaluated in simulated samples from 32 population correlation matrices. Models including all of the available predictors are compared with those obtained using selected subsets. The subset selection procedures investigated include two stopping rules, C_p and S_p, each combined with an 'all possible subsets' or 'forward selection' of variables. The estimators of performance utilized include parametric (MSEP_m) and non-parametric (PRESS) assessments in the entire sample, and two data-splitting estimates restricted to a random or balanced (Snee's DUPLEX) 'validation' half sample. The simulations were performed as a designed experiment, with population correlation matrices representing a broad range of data structures.

The techniques examined for subset selection do not generally result in improved predictions relative to the full model. Approaches using 'forward selection' result in slightly smaller prediction errors and less biased estimators of predictive accuracy than 'all possible subsets' approaches, but no differences are detected between the performances of C_p and S_p. In every case, prediction errors of models obtained by subset selection in either of the half splits exceed those obtained using all predictors and the entire sample.

Only the random split estimator is conditionally (on β) unbiased; however, MSEP_m is unbiased on average and PRESS is nearly so in unselected (fixed form) models. When subset selection techniques are used, MSEP_m and PRESS always underestimate prediction errors, by as much as 27 percent (on average) in small samples. Despite their bias, the mean squared errors (MSE) of these estimators are at least 30 percent less than that of the unbiased random split estimator. The DUPLEX split estimator suffers from large MSE as well as bias, and seems of little value within the context of stochastic regressor variables.

To maximize predictive accuracy while retaining a reliable estimate of that accuracy, it is recommended that the entire sample be used for model development, and that a leave-one-out statistic (e.g. PRESS) be used for assessment.
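A minimal sketch of the PRESS statistic recommended above, computed through the hat-matrix identity rather than an explicit leave-one-out loop (function and variable names are illustrative):

```python
import numpy as np

def press(X, y):
    """PRESS: sum of squared leave-one-out prediction errors for OLS.
    Uses the identity e_(i) = e_i / (1 - h_ii), so no refitting is needed."""
    X1 = np.column_stack([np.ones(len(y)), X])     # add intercept
    H = X1 @ np.linalg.solve(X1.T @ X1, X1.T)      # hat matrix
    resid = y - H @ y                              # ordinary residuals
    return float(np.sum((resid / (1.0 - np.diag(H))) ** 2))

# Illustrative use with stochastic (multivariate normal) regressors:
rng = np.random.default_rng(2)
X = rng.multivariate_normal(np.zeros(3), np.eye(3), size=40)
y = X @ np.array([1.0, 0.5, 0.0]) + rng.normal(0.0, 1.0, size=40)
print(f"PRESS = {press(X, y):.2f}")
```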