994 results for sample subset optimization (SSO)


Relevance:

100.00%

Publisher:

Abstract:

Data in many biological problems are often compounded by imbalanced class distribution; that is, the positive examples may be largely outnumbered by the negative examples. Many classification algorithms, such as the support vector machine (SVM), are sensitive to data with imbalanced class distribution and produce suboptimal classifications. It is desirable to compensate for the imbalance effect during model training to achieve more accurate classification. In this study, we propose a sample subset optimization technique for classifying biological data with moderately to extremely imbalanced class distributions. By using this optimization technique with an ensemble of SVMs, we build multiple roughly balanced SVM base classifiers, each trained on an optimized sample subset. The experimental results demonstrate that the ensemble of SVMs created by our sample subset optimization technique achieves a higher area under the ROC curve (AUC) than popular sampling approaches such as random over-/under-sampling and SMOTE, and than widely used ensemble approaches such as bagging and boosting.
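A minimal sketch of the general scheme described above (an ensemble of SVMs, each trained on a roughly balanced sample subset) is given below. It uses randomly drawn balanced subsets in place of the paper's optimized subsets, so the function names and parameters are illustrative assumptions, not the authors' SSO procedure.

```python
# Hypothetical sketch: ensemble of roughly balanced SVM base classifiers.
# The paper's sample subset optimization (SSO) step, which *selects* the
# subsets, is not reproduced here; random balanced subsets stand in for it.
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics import roc_auc_score

def balanced_svm_ensemble(X, y, n_estimators=10, random_state=0):
    """Train one SVM per randomly drawn, roughly balanced subset."""
    rng = np.random.default_rng(random_state)
    pos_idx = np.flatnonzero(y == 1)          # minority class
    neg_idx = np.flatnonzero(y == 0)          # majority class
    models = []
    for _ in range(n_estimators):
        sub_neg = rng.choice(neg_idx, size=len(pos_idx), replace=False)
        idx = np.concatenate([pos_idx, sub_neg])
        clf = SVC(kernel="rbf")
        clf.fit(X[idx], y[idx])
        models.append(clf)
    return models

def ensemble_scores(models, X):
    """Average the SVM decision values over the ensemble."""
    return np.mean([m.decision_function(X) for m in models], axis=0)

# Usage: auc = roc_auc_score(y_test, ensemble_scores(models, X_test))
```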

Relevance:

100.00%

Publisher:

Abstract:

Background: Multi-drug resistance and severe/complicated cases are emerging phenotypes of vivax malaria that may undermine current anti-malarial control measures. The emergence of these phenotypes could be associated with either of the two Plasmodium vivax lineages. The two lineages had been categorized as Old World and New World, based on geographical sub-division and on genetic and phenotypic markers. This study revisited the lineage hypothesis of P. vivax by typing the distribution of lineages among global isolates and evaluated their genetic relatedness using a panel of new mini-satellite markers. Methods: The 18S SSU rRNA S-type gene was amplified from 420 P. vivax field isolates collected from different geographical regions of India, Thailand and Colombia, as well as from four P. vivax strains originating from Nicaragua, Panama, Thailand (Pak Chang), and Vietnam (ONG). A mini-satellite marker panel was then developed to estimate population genetic parameters and tested on a sample subset of both lineages. Results: 18S SSU rRNA S-type gene typing revealed that both lineages (Old World and New World) are present in all geographical regions, although their distribution was highly variable in every region. The lack of geographical sub-division between lineages suggests that both are globally distributed. Ten mini-satellites were scanned from the P. vivax genome sequence; these tandem repeats were located on eight of the chromosomes. The mini-satellites revealed substantial allelic diversity (7-21, AE = 14.6 ± 2.0) and heterozygosity (He = 0.697-0.924, AE = 0.857 ± 0.033) per locus. Comparison of the two lineages at these mini-satellites revealed similarly high genetic diversity, similar allele frequencies, and a high degree of allele sharing. A Neighbour-Joining phylogenetic tree derived from genetic distance data obtained from the ten mini-satellites also placed both lineages together in every cluster. Conclusions: The global lineage distribution, lack of genetic distance, similar pattern of genetic diversity, and allele sharing strongly suggest that the two lineages are a single species, and thus that newly emerging phenotypes associated with vivax malaria cannot be clearly assigned to a particular lineage on the basis of geographical origin.
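For reference, per-locus heterozygosity figures like those quoted above are typically computed from observed allele frequencies; a minimal sketch follows, assuming the simple 1 − Σp² gene-diversity form rather than whichever (possibly unbiased) estimator the study actually used.

```python
# Hypothetical sketch: expected heterozygosity (gene diversity) at one
# mini-satellite locus from observed allele labels, using He = 1 - sum(p_i^2).
from collections import Counter

def expected_heterozygosity(alleles):
    """alleles: list of allele labels observed at one locus, one per isolate."""
    n = len(alleles)
    freqs = [count / n for count in Counter(alleles).values()]
    return 1.0 - sum(p * p for p in freqs)

# Example: four distinct alleles observed among six isolates
print(expected_heterozygosity(["A1", "A1", "A2", "A3", "A3", "A4"]))  # ~0.72
```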

Relevance:

100.00%

Publisher:

Abstract:

Two hydraulic piston cores containing the complete Quaternary suite were analyzed quantitatively for their planktonic foraminiferal content. For the Early Pleistocene, the Caribbean standard zonation (BOLLI & PREMOLI-SILVA) can be adopted and refined by the introduction of an additional subzone at its base (the Globorotalia triangula subzone). Local substages are proposed for the Late Pleistocene because index fossils are missing. The use of the transfer-function technique resulted in paleotemperature and paleosalinity curves resolving cycles of about 4-68,000 years duration. The Early Pleistocene paleoenvironment is characterized by small oscillations of the surface-water temperatures, followed by a distinct cooling trend during the Globorotalia viola subzone, a period of smoothed cycles during the Globorotalia hessi subzone, and distinctly developed cycles during the Late Pleistocene since oxygen isotope termination III. Grain-size distribution and several dissolution indices gave evidence of current activity on the top of the Walvis Ridge, where the amount of fine-grained components in the sediment is reduced in comparison with that of the flanks.

Relevance:

100.00%

Publisher:

Abstract:

Despite the frequency with which fevers occur in children ages 1–3 years, lack of knowledge and understanding about the implications of fever and methods of fever management often results in anxiety among caretakers, sometimes prompting them to seek help at nearby emergency departments. Caretakers often look to health care professionals for advice and guidance over the telephone. The purpose of this study was to investigate caretakers' knowledge of the implications of fever, methods of fever management, perceptions of pediatric telephone triage and advice services regarding fever, and the effectiveness of after-hours telephone triage directed toward improving caretakers' ability to manage their child's fever at home. Pre-triage questionnaires were completed by 72 caretakers over the telephone before the triage encounter. Twenty-two of those caretakers, whose children were triaged using the fever guideline, completed and returned the mailed post-triage questionnaire. Descriptive statistics were used to analyze responses for the larger pre-intervention group and to compare the pre- and post-triage responses in the smaller sample subset (n = 22).

Relevance:

40.00%

Publisher:

Abstract:

The growing interest in higher-throughput sequencing over the last decade has led to the development of new sequencing applications. This thesis concentrates on optimizing DNA library preparation for the Illumina Genome Analyzer II sequencer. The library preparation steps that were optimized include fragmentation, PCR purification, and quantification. DNA fragmentation was performed with focused sonication at different concentrations and durations. Two column-based PCR purification methods, a gel-matrix method, and a magnetic-bead-based method were compared. Quantitative PCR and on-chip gel electrophoresis were compared for DNA quantification. The magnetic-bead purification was found to be the most efficient and flexible purification method. The fragmentation protocol was changed to produce longer fragments compatible with longer sequencing reads. Quantitative PCR correlates better with the cluster number and should therefore be considered the default quantification method for sequencing. As a result of this study, more data have been acquired from sequencing at lower cost, and troubleshooting has become easier because qualification steps have been added to the protocol. New sequencing instruments and applications will create demand for further optimization in the future.

Relevance:

30.00%

Publisher:

Abstract:

This program of research examines the experience of chronic pain in a community sample. While it is clear that, as in patient samples, chronic pain in non-patient samples is associated with psychological distress and physical disability, the experience of pain across the total spectrum of pain conditions (including acute and episodic pain conditions) and during the early course of chronic pain is less clear. Information about these aspects of the pain experience is important because effective early intervention for chronic pain relies on identifying people who are likely to progress to chronicity post-injury. A conceptual model of the transition from acute to chronic pain was proposed by Gatchel (1991a). In brief, Gatchel's model describes three stages that individuals with a serious pain experience move through, each with worsening psychological dysfunction and physical disability. The aims of this program of research were to describe the experience of pain in a community sample, in order to obtain pain-specific data on the problem of pain in Queensland, and to explore the usefulness of Gatchel's model in a non-clinical sample. Additionally, five risk factors and six protective factors were proposed as possible extensions to Gatchel's model. To address these aims, a prospective longitudinal mixed-method research design was used. Quantitative data were collected in Phase 1 via a comprehensive postal questionnaire. Phase 2 consisted of a follow-up questionnaire three months post-baseline. Phase 3 consisted of semi-structured interviews with a subset of the original sample 12 months post-follow-up, using qualitative data to provide a further in-depth examination of the experience and process of chronic pain from the respondents' point of view. The results indicate that chronic pain is associated with high levels of anxiety and depressive symptoms. However, the levels of disability reported by this Queensland sample were generally lower than those reported by clinical samples and consistent with disability data reported in a New South Wales population-based study. With regard to the second aim, while some elements of this sample's pain experience were consistent with Gatchel's model, overall the model was not a good fit with the experience of this non-clinical sample. The findings indicate that passive coping strategies (minimising activity), catastrophising, self-efficacy, optimism, social support, active strategies (use of distraction), and the belief that emotions affect pain may be important for understanding the processes that underlie the transition to and continuation of chronic pain.

Relevance:

30.00%

Publisher:

Abstract:

Postconcussion symptoms are relatively common in the acute recovery period following mild traumatic brain injury (MTBI). However, for a small subset of patients, self-reported postconcussion symptoms continue long after injury. Many factors have been proposed to account for the presence of persistent postconcussion symptoms; the influence of personality traits is one proposed explanation. The purpose of this study was to examine the relation between postconcussion-like symptom reporting and personality traits in a sample of 96 healthy participants. Participants completed the British Columbia Postconcussion Symptom Inventory (BC-PSI) and the Millon Clinical Multiaxial Inventory III (MCMI-III). There was a strong positive relation between the majority of MCMI-III scales and postconcussion-like symptom reporting. Approximately half of the sample met the International Classification of Diseases-10 Criterion C symptoms for Postconcussional Syndrome (PCS). Compared with participants who did not meet this criterion, the PCS group had significant elevations on the negativistic, depression, major depression, dysthymia, anxiety, dependent, sadistic, somatic, and borderline scales of the MCMI-III. These findings support the hypothesis that personality traits can play a contributing role in self-reported postconcussion-like symptoms.

Relevance:

30.00%

Publisher:

Abstract:

OBJECTIVES This study examined the associations between physical activity and other health behaviors in a representative sample of US adolescents. METHODS In the 1990 Youth Risk Behavior Survey, 11,631 high school students provided information on physical activity, diet, substance use, and other negative health behaviors. Logistic regression analyses examined associations between physical activity and other health behaviors in a subset of 2,652 high-active and 1,641 low-active students. RESULTS Low activity was associated with cigarette smoking, marijuana use, lower fruit and vegetable consumption, greater television watching, failure to wear a seat belt, and low perception of academic performance. For consumption of fruit, television watching, and alcohol consumption, significant interactions were found with race/ethnicity or sex, suggesting that sociocultural factors may affect the relationships between physical activity and some health behaviors. CONCLUSIONS Low physical activity was associated with several other negative health behaviors in teenagers. Future studies should examine whether interventions for increasing physical activity in youth can be effective in reducing negative health behaviors.

Relevance:

30.00%

Publisher:

Abstract:

In transport networks, Origin-Destination matrices (ODM) are classically estimated from road traffic counts, whereas recent technologies also grant access to sample car trajectories. One example is the deployment in cities of Bluetooth scanners that measure the trajectories of Bluetooth-equipped cars. Exploiting such sample trajectory information, the classical ODM estimation problem is here extended to a link-dependent ODM (LODM) one. This much larger estimation problem is formulated in variational form as an inverse problem. We develop a convex optimization resolution algorithm that incorporates network constraints. We study the results of the proposed algorithm on simulated network traffic.
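A minimal sketch of the kind of convex inverse problem involved is given below, assuming a known link-OD incidence matrix, a quadratic data-fidelity term against link counts, and a quadratic prior built from sample trajectories; the paper's link-dependent ODM formulation and its network constraints are richer than this toy.

```python
# Hypothetical sketch: recover nonnegative OD flows x from link counts c = A x
# plus a prior drawn from sample (e.g. Bluetooth) trajectories.
import cvxpy as cp
import numpy as np

n_links, n_od = 8, 12
rng = np.random.default_rng(0)
A = rng.integers(0, 2, size=(n_links, n_od)).astype(float)  # assumed link-OD incidence
x_true = rng.uniform(0, 100, n_od)
c = A @ x_true                                   # observed link counts
x_prior = x_true * rng.uniform(0.7, 1.3, n_od)   # noisy trajectory-based prior

x = cp.Variable(n_od, nonneg=True)               # nonnegativity as a network constraint
objective = cp.Minimize(cp.sum_squares(A @ x - c) + 0.1 * cp.sum_squares(x - x_prior))
cp.Problem(objective).solve()
print(np.round(x.value, 1))
```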

Relevance:

30.00%

Publisher:

Abstract:

Modern database systems incorporate a query optimizer to identify the most efficient "query execution plan" for executing the declarative SQL queries submitted by users. A dynamic-programming-based approach is used to exhaustively enumerate the combinatorially large search space of plan alternatives and, using a cost model, to identify the optimal choice. While dynamic programming (DP) works very well for moderately complex queries with up to around a dozen base relations, it usually fails to scale beyond this point due to its inherent exponential space and time complexity. Therefore, DP becomes practically infeasible for complex queries with a large number of base relations, such as those found in current decision-support and enterprise management applications. To address this problem, a variety of approaches have been proposed in the literature. Some completely jettison the DP approach and resort to alternative techniques such as randomized algorithms, whereas others retain DP but use heuristics to prune the search space to computationally manageable levels. In the latter class, a well-known strategy is "iterative dynamic programming" (IDP), wherein DP is employed bottom-up until it hits its feasibility limit and is then iteratively restarted with a significantly reduced subset of the execution plans currently under consideration. The experimental evaluation of IDP indicated that, by appropriate choice of algorithmic parameters, it was possible to almost always obtain "good" plans (within a factor of two of the optimal), in most of the remaining cases "acceptable" plans (within an order of magnitude of the optimal), and only rarely a "bad" plan. While IDP is certainly an innovative and powerful approach, we have found that there are a variety of common query frameworks wherein it can fail to consistently produce good plans, let alone the optimal choice. This is especially so when star or clique components are present, increasing the complexity of the join graphs. Worse, this shortcoming is exacerbated when the number of relations participating in the query is scaled upwards.
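For context, a textbook-style sketch of the exhaustive dynamic programming over relation subsets that IDP prunes is shown below; the cost model, selectivity, and cardinalities are toy placeholders, not the paper's.

```python
# Toy Selinger-style DP for join ordering: enumerate every subset of relations,
# keeping the cheapest plan per subset.  The 2^n table is exactly the
# exponential growth that makes plain DP infeasible for large queries.
from itertools import combinations

def dp_join_order(card, selectivity=0.01):
    """card: dict mapping relation name -> estimated cardinality."""
    rels = sorted(card)
    # best[S] = (cheapest cost of joining the relations in S, result cardinality, plan)
    best = {frozenset([r]): (0.0, float(card[r]), r) for r in rels}
    for size in range(2, len(rels) + 1):
        for subset in combinations(rels, size):
            s = frozenset(subset)
            for k in range(1, size // 2 + 1):
                for left in combinations(subset, k):
                    l = frozenset(left)
                    r = s - l
                    lcost, lcard, lplan = best[l]
                    rcost, rcard, rplan = best[r]
                    cost = lcost + rcost + lcard * rcard      # toy nested-loop join cost
                    out_card = lcard * rcard * selectivity    # toy join selectivity
                    if s not in best or cost < best[s][0]:
                        best[s] = (cost, out_card, (lplan, rplan))
    return best[frozenset(rels)]

print(dp_join_order({"A": 1000, "B": 10, "C": 100}))
```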

Relevance:

30.00%

Publisher:

Abstract:

There are a number of large networks that occur in many problems dealing with the flow of power, communication signals, water, gas, transportable goods, etc. Both design and planning of these networks involve optimization problems. The first part of this paper introduces the common characteristics of a nonlinear network (the network may be linear and the objective function nonlinear, or both may be nonlinear). The second part develops a mathematical model that puts together some important constraints based on the abstraction of a general network. The third part deals with solution procedures; it converts the network to a matrix-based system of equations, gives the characteristics of the matrix, and suggests two solution procedures, one of them new. The fourth part handles spatially distributed networks and evolves a number of decomposition techniques so that the problem can be solved with the help of a distributed computer system. Algorithms for parallel processors and spatially distributed systems are described.

There are a number of common features that pertain to networks. A network consists of a set of nodes and arcs. In addition, at every node there is the possibility of an input (such as power, water, a message, goods, etc.), an output, or neither. Normally, the network equations describe the flows among nodes through the arcs. These network equations couple variables associated with nodes. Invariably, variables pertaining to arcs are constants; the result required is the flows through the arcs. To solve the normal base problem, we are given input flows at nodes, output flows at nodes, and certain physical constraints on other variables at nodes, and we must find the flows through the network (variables at nodes will be referred to as across variables).

The optimization problem involves selecting inputs at nodes so as to optimise an objective function; the objective may be a cost function based on the inputs to be minimised, a loss function, or an efficiency function. The above mathematical model can be solved using the Lagrange multiplier technique, since the equalities are strong compared to the inequalities. The Lagrange multiplier technique divides the solution procedure into two stages per iteration: stage one calculates the problem variables and stage two the multipliers lambda. It is shown that the Jacobian matrix used in stage one (for solving a nonlinear system of necessary conditions) also occurs in stage two.

A second solution procedure has been embedded into the first one. This is called the total residue approach; it changes the equality constraints so as to obtain faster convergence of the iterations. Both solution procedures are found to converge in 3 to 7 iterations for a sample network.

The availability of distributed computer systems, both LAN and WAN, suggests the need for algorithms to solve the optimization problems. Two types of algorithms have been proposed: one based on the physics of the network and the other on the properties of the Jacobian matrix. Three algorithms have been devised, one of them for the local-area case. These algorithms are called the regional distributed algorithm, the hierarchical regional distributed algorithm (both using the physical properties of the network), and the locally distributed algorithm (a multiprocessor-based approach with a local area network configuration). The approach used was to define an algorithm that is faster and uses minimum communication. These algorithms are found to converge at the same rate as the non-distributed (unitary) case.
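A minimal sketch of the Lagrange multiplier idea on a toy linear network with a quadratic input cost is given below; here the KKT system is solved in one shot, whereas the nonlinear networks treated in the paper require the iterative two-stage (variables, then multipliers) scheme with the shared Jacobian. The matrices and cost are illustrative assumptions.

```python
# Toy example: minimize 0.5 * x^T Q x  subject to  A x = b, where A x = b
# stands for the flow-balance (network) equations and b the specified demands.
import numpy as np

Q = np.diag([1.0, 2.0, 4.0])          # assumed convex cost on node inputs
A = np.array([[1.0, 1.0, 1.0]])       # single balance equation: total input = demand
b = np.array([10.0])

n, m = Q.shape[0], A.shape[0]
# KKT system:  [Q  A^T] [x]     [0]
#              [A   0 ] [lam] = [b]
kkt = np.block([[Q, A.T], [A, np.zeros((m, m))]])
rhs = np.concatenate([np.zeros(n), b])
sol = np.linalg.solve(kkt, rhs)
x, lam = sol[:n], sol[n:]
print("inputs x:", x, "multiplier lambda:", lam)
```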

Relevance:

30.00%

Publisher:

Abstract:

Based on a method proposed by Reddy and Daum, the equations governing the steady inviscid nonreacting gasdynamic laser (GDL) flow in a supersonic nozzle are reduced to a universal form so that the solutions depend on a single parameter which combines all the other parameters of the problem. Solutions are obtained for a sample case of available data and compared with existing results to validate the present approach. Also, similar solutions for a sample case are presented.

Relevance:

30.00%

Publisher:

Abstract:

Clustered VLIW architectures solve the scalability problem associated with flat VLIW architectures by partitioning the register file and connecting only a subset of the functional units to each register file. However, inter-cluster communication in clustered architectures leads to increased leakage in functional components and a high number of register accesses. In this paper, we propose compiler scheduling algorithms targeting two previously ignored power-hungry components of clustered VLIW architectures, viz., the instruction decoder and the register file. We consider a split decoder design and propose a new energy-aware instruction scheduling algorithm that provides 14.5% and 17.3% average reductions in decoder power consumption over a purely hardware-based scheme in the context of 2-clustered and 4-clustered VLIW machines. For register files, we propose two new scheduling algorithms that exploit a limited register snooping capability to reduce extra register file accesses. The proposed algorithms reduce register file power consumption on average by 6.85% and 11.90% (10.39% and 17.78%), respectively, along with performance improvements of 4.81% and 5.34% (9.39% and 11.16%) over a traditional greedy algorithm for a 2-clustered (4-clustered) VLIW machine.

Relevance:

30.00%

Publisher:

Abstract:

A robust aeroelastic optimization is performed to minimize helicopter vibration with uncertainties in the design variables. Polynomial response surfaces and space-filling experimental designs are used to generate a surrogate model of the aeroelastic analysis code. Aeroelastic simulations are performed at sample inputs generated by Latin hypercube sampling. Response values that do not satisfy the frequency constraints are eliminated from the data used for model fitting; this step increases the accuracy of the response surface models in the feasible design space. It is found that the response surface models are able to capture the robust optimal regions of the design space. The optimal designs show a reduction of 10 percent in the objective function comprising six vibratory hub loads, and reductions of 1.5 to 80 percent for the individual vibratory forces and moments. This study demonstrates that second-order response surface models with space-filling designs can be a favorable choice for computationally intensive robust aeroelastic optimization.
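A minimal sketch of the surrogate-building step is given below, assuming a two-variable stand-in function in place of the aeroelastic analysis code; the frequency-constraint filtering and the robust optimization itself are not modeled.

```python
# Hypothetical sketch: fit a second-order (quadratic) response surface to
# Latin hypercube samples of an expensive analysis, here a placeholder function.
import numpy as np
from scipy.stats import qmc

def expensive_analysis(X):                 # stand-in for the aeroelastic code
    return np.sum(X**2, axis=1) + 0.3 * X[:, 0] * X[:, 1]

sampler = qmc.LatinHypercube(d=2, seed=1)  # space-filling design in 2 variables
X = qmc.scale(sampler.random(n=30), l_bounds=[-1, -1], u_bounds=[1, 1])
y = expensive_analysis(X)

def quadratic_basis(X):
    # Second-order polynomial basis: 1, x1, x2, x1*x2, x1^2, x2^2
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([np.ones_like(x1), x1, x2, x1 * x2, x1**2, x2**2])

coef, *_ = np.linalg.lstsq(quadratic_basis(X), y, rcond=None)   # least-squares fit
print("surrogate coefficients:", np.round(coef, 3))
```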