886 resultados para Optimal test set


Relevância:

80.00% 80.00%

Publicador:

Resumo:

The relationship between sleep apnoea–hypopnoea syndrome (SAHS) severity and the regularity of nocturnal oxygen saturation (SaO2) recordings was analysed. Three different methods were proposed to quantify regularity: approximate entropy (AEn), sample entropy (SEn) and kernel entropy (KEn). A total of 240 subjects suspected of suffering from SAHS took part in the study. They were randomly divided into a training set (96 subjects) and a test set (144 subjects) for the adjustment and assessment of the proposed methods, respectively. According to the measurements provided by AEn, SEn and KEn, higher irregularity of oximetry signals is associated with SAHS-positive patients. Receiver operating characteristic (ROC) and Pearson correlation analyses showed that KEn was the most reliable predictor of SAHS. It provided an area under the ROC curve of 0.91 in two-class classification of subjects as SAHS-negative or SAHS-positive. Moreover, KEn measurements from oximetry data exhibited a linear dependence on the apnoea–hypopnoea index, as shown by a correlation coefficient of 0.87. Therefore, these measurements could be used for the development of simplified diagnostic techniques in order to reduce the demand for polysomnographies. Furthermore, KEn represents a convincing alternative to AEn and SEn for the diagnostic analysis of noisy biomedical signals.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Allocating resources optimally is a nontrivial task, especially when multiple

self-interested agents with conflicting goals are involved. This dissertation

uses techniques from game theory to study two classes of such problems:

allocating resources to catch agents that attempt to evade them, and allocating

payments to agents in a team in order to stabilize it. Besides discussing what

allocations are optimal from various game-theoretic perspectives, we also study

how to efficiently compute them, and if no such algorithms are found, what

computational hardness results can be proved.

The first class of problems is inspired by real-world applications such as the

TOEFL iBT test, course final exams, driver's license tests, and airport security

patrols. We call them test games and security games. This dissertation first

studies test games separately, and then proposes a framework of Catcher-Evader

games (CE games) that generalizes both test games and security games. We show

that the optimal test strategy can be efficiently computed for scored test

games, but it is hard to compute for many binary test games. Optimal Stackelberg

strategies are hard to compute for CE games, but we give an empirically

efficient algorithm for computing their Nash equilibria. We also prove that the

Nash equilibria of a CE game are interchangeable.

The second class of problems involves how to split a reward that is collectively

obtained by a team. For example, how should a startup distribute its shares, and

what salary should an enterprise pay to its employees. Several stability-based

solution concepts in cooperative game theory, such as the core, the least core,

and the nucleolus, are well suited to this purpose when the goal is to avoid

coalitions of agents breaking off. We show that some of these solution concepts

can be justified as the most stable payments under noise. Moreover, by adjusting

the noise models (to be arguably more realistic), we obtain new solution

concepts including the partial nucleolus, the multiplicative least core, and the

multiplicative nucleolus. We then study the computational complexity of those

solution concepts under the constraint of superadditivity. Our result is based

on what we call Small-Issues-Large-Team games and it applies to popular

representation schemes such as MC-nets.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

To provide biological insights into transcriptional regulation, a couple of groups have recently presented models relating the promoter DNA-bound transcription factors (TFs) to downstream gene’s mean transcript level or transcript production rates over time. However, transcript production is dynamic in response to changes of TF concentrations over time. Also, TFs are not the only factors binding to promoters; other DNA binding factors (DBFs) bind as well, especially nucleosomes, resulting in competition between DBFs for binding at same genomic location. Additionally, not only TFs, but also some other elements regulate transcription. Within core promoter, various regulatory elements influence RNAPII recruitment, PIC formation, RNAPII searching for TSS, and RNAPII initiating transcription. Moreover, it is proposed that downstream from TSS, nucleosomes resist RNAPII elongation.

Here, we provide a machine learning framework to predict transcript production rates from DNA sequences. We applied this framework in the S. cerevisiae yeast for two scenarios: a) to predict the dynamic transcript production rate during the cell cycle for native promoters; b) to predict the mean transcript production rate over time for synthetic promoters. As far as we know, our framework is the first successful attempt to have a model that can predict dynamic transcript production rates from DNA sequences only: with cell cycle data set, we got Pearson correlation coefficient Cp = 0.751 and coefficient of determination r2 = 0.564 on test set for predicting dynamic transcript production rate over time. Also, for DREAM6 Gene Promoter Expression Prediction challenge, our fitted model outperformed all participant teams, best of all teams, and a model combining best team’s k-mer based sequence features and another paper’s biologically mechanistic features, in terms of all scoring metrics.

Moreover, our framework shows its capability of identifying generalizable fea- tures by interpreting the highly predictive models, and thereby provide support for associated hypothesized mechanisms about transcriptional regulation. With the learned sparse linear models, we got results supporting the following biological insights: a) TFs govern the probability of RNAPII recruitment and initiation possibly through interactions with PIC components and transcription cofactors; b) the core promoter amplifies the transcript production probably by influencing PIC formation, RNAPII recruitment, DNA melting, RNAPII searching for and selecting TSS, releasing RNAPII from general transcription factors, and thereby initiation; c) there is strong transcriptional synergy between TFs and core promoter elements; d) the regulatory elements within core promoter region are more than TATA box and nucleosome free region, suggesting the existence of still unidentified TAF-dependent and cofactor-dependent core promoter elements in yeast S. cerevisiae; e) nucleosome occupancy is helpful for representing +1 and -1 nucleosomes’ regulatory roles on transcription.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

In this chapter four combinations of input features and the feedforward, cascade forward and recurrent architectures are compared for the task of forecast tourism time series. The input features of the ANNs consist in the combination of the previous 12 months, the index time modeled by two nodes used to the year and month and one input with the daily hours of sunshine (insolation duration). The index time features associated to the previous twelve values of the time series proved its relevance in this forecast task. The insolation variable can improved results with some architectures, namely the cascade forward architecture. Finally, the experimented ANN models/architectures produced a mean absolute percentage error between 4 and 6%, proving the ability of the ANN models based to forecast this time series. Besides, the feedforward architecture behaved better considering validation and test sets, with 4.2% percentage error in test set.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Dissertação (mestrado)—Universidade de Brasília, Faculdade Gama, Programa de Pós-Graduação em Engenharia Biomédica, 2016.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Dissertação (mestrado)—Universidade de Brasília, Faculdade de Tecnologia, Departamento de Engenharia Mecânica, 2016.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Abstract : Recently, there is a great interest to study the flow characteristics of suspensions in different environmental and industrial applications, such as snow avalanches, debris flows, hydrotransport systems, and material casting processes. Regarding rheological aspects, the majority of these suspensions, such as fresh concrete, behave mostly as non-Newtonian fluids. Concrete is the most widely used construction material in the world. Due to the limitations that exist in terms of workability and formwork filling abilities of normal concrete, a new class of concrete that is able to flow under its own weight, especially through narrow gaps in the congested areas of the formwork was developed. Accordingly, self-consolidating concrete (SCC) is a novel construction material that is gaining market acceptance in various applications. Higher fluidity characteristics of SCC enable it to be used in a number of special applications, such as densely reinforced sections. However, higher flowability of SCC makes it more sensitive to segregation of coarse particles during flow (i.e., dynamic segregation) and thereafter at rest (i.e., static segregation). Dynamic segregation can increase when SCC flows over a long distance or in the presence of obstacles. Therefore, there is always a need to establish a trade-off between the flowability, passing ability, and stability properties of SCC suspensions. This should be taken into consideration to design the casting process and the mixture proportioning of SCC. This is called “workability design” of SCC. An efficient and non-expensive workability design approach consists of the prediction and optimization of the workability of the concrete mixtures for the selected construction processes, such as transportation, pumping, casting, compaction, and finishing. Indeed, the mixture proportioning of SCC should ensure the construction quality demands, such as demanded levels of flowability, passing ability, filling ability, and stability (dynamic and static). This is necessary to develop some theoretical tools to assess under what conditions the construction quality demands are satisfied. Accordingly, this thesis is dedicated to carry out analytical and numerical simulations to predict flow performance of SCC under different casting processes, such as pumping and tremie applications, or casting using buckets. The L-Box and T-Box set-ups can evaluate flow performance properties of SCC (e.g., flowability, passing ability, filling ability, shear-induced and gravitational dynamic segregation) in casting process of wall and beam elements. The specific objective of the study consists of relating numerical results of flow simulation of SCC in L-Box and T-Box test set-ups, reported in this thesis, to the flow performance properties of SCC during casting. Accordingly, the SCC is modeled as a heterogeneous material. Furthermore, an analytical model is proposed to predict flow performance of SCC in L-Box set-up using the Dam Break Theory. On the other hand, results of the numerical simulation of SCC casting in a reinforced beam are verified by experimental free surface profiles. The results of numerical simulations of SCC casting (modeled as a single homogeneous fluid), are used to determine the critical zones corresponding to the higher risks of segregation and blocking. The effects of rheological parameters, density, particle contents, distribution of reinforcing bars, and particle-bar interactions on flow performance of SCC are evaluated using CFD simulations of SCC flow in L-Box and T-box test set-ups (modeled as a heterogeneous material). Two new approaches are proposed to classify the SCC mixtures based on filling ability and performability properties, as a contribution of flowability, passing ability, and dynamic stability of SCC.

Relevância:

50.00% 50.00%

Publicador:

Resumo:

This paper considers the one-sample sign test for data obtained from general ranked set sampling when the number of observations for each rank are not necessarily the same, and proposes a weighted sign test because observations with different ranks are not identically distributed. The optimal weight for each observation is distribution free and only depends on its associated rank. It is shown analytically that (1) the weighted version always improves the Pitman efficiency for all distributions; and (2) the optimal design is to select the median from each ranked set.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

An iterative based strategy is proposed for finding the optimal rating and location of fixed and switched capacitors in distribution networks. The substation Load Tap Changer tap is also set during this procedure. A Modified Discrete Particle Swarm Optimization is employed in the proposed strategy. The objective function is composed of the distribution line loss cost and the capacitors investment cost. The line loss is calculated using estimation of the load duration curve to multiple levels. The constraints are the bus voltage and the feeder current which should be maintained within their standard range. For validation of the proposed method, two case studies are tested. The first case study is the semi-urban 37-bus distribution system which is connected at bus 2 of the Roy Billinton Test System which is located in the secondary side of a 33/11 kV distribution substation. The second case is a 33 kV distribution network based on the modification of the 18-bus IEEE distribution system. The results are compared with prior publications to illustrate the accuracy of the proposed strategy.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

In this paper, we consider the bi-criteria single machine scheduling problem of n jobs with a learning effect. The two objectives considered are the total completion time (TC) and total absolute differences in completion times (TADC). The objective is to find a sequence that performs well with respect to both the objectives: the total completion time and the total absolute differences in completion times. In an earlier study, a method of solving bi-criteria transportation problem is presented. In this paper, we use the methodology of solvin bi-criteria transportation problem, to our bi-criteria single machine scheduling problem with a learning effect, and obtain the set of optimal sequences,. Numerical examples are presented for illustrating the applicability and ease of understanding.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Matrix metalloproteinase (MMP) -8, collagenase-2, is a key mediator of irreversible tissue destruction in chronic periodontitis and detectable in gingival crevicular fluid (GCF). MMP-8 mostly originates from neutrophil leukocytes, the first line of defence cells which exist abundantly in GCF, especially in inflammation. MMP-8 is capable of degrading almost all extra-cellular matrix and basement membrane components and is especially efficient against type I collagen. Thus the expression of MMP-8 in GCF could be valuable in monitoring the activity of periodontitis and possibly offers a diagnostic means to predict progression of periodontitis. In this study the value of MMP-8 detection from GCF in monitoring of periodontal health and disease was evaluated with special reference to its ability to differentiate periodontal health and different disease states of the periodontium and to recognise the progression of periodontitis, i.e. active sites. For chair-side detection of MMP-8 from the GCF or peri-implant sulcus fluid (PISF) samples, a dip-stick test based on immunochromatography involving two monoclonal antibodies was developed. The immunoassay for the detection of MMP-8 from GCF was found to be more suitable for monitoring of periodontitis than detection of GCF elastase concentration or activity. Periodontally healthy subjects and individuals suffering of gingivitis or of periodontitis could be differentiated by means of GCF MMP-8 levels and dipstick testing when the positive threshold value of the MMP-8 chair-side test was set at 1000 µg/l. MMP-8 dipstick test results from periodontally healthy and from subjects with gingivitis were mainly negative while periodontitis patients sites with deep pockets ( 5 mm) and which were bleeding on probing were most often test positive. Periodontitis patients GCF MMP-8 levels decreased with hygiene phase periodontal treatment (scaling and root planing, SRP) and even reduced during the three month maintenance phase. A decrease in GCF MMP-8 levels could be monitored with the MMP-8 test. Agreement between the test stick and the quantitative assay was very good (κ = 0.81) and the test provided a baseline sensitivity of 0.83 and specificity of 0.96. During the 12-month longitudinal maintenance phase, periodontitis patients progressing sites (sites with an increase in attachment loss ≥ 2 mm during the maintenance phase) had elevated GCF MMP-8 levels compared with stable sites. General mean MMP-8 concentrations in smokers (S) sites were lower than in non-smokers (NS) sites but in progressing S and NS sites concentrations were at an equal level. Sites with exceptionally and repeatedly elevated MMP-8 concentrations during the maintenance phase were clustered in smoking patients with poor response to SRP (refractory patients). These sites especially were identified by the MMP-8 test. Subgingival plaque samples from periodontitis patients deep periodontal pockets were examined by polymerase chain reaction (PCR) to find out if periodontal lesions may serve as a niche for Chlamydia pneumoniae. Findings were compared with the clinical periodontal parameters and GCF MMP-8 levels to determine the correlation with periodontal status. Traces of C. pneumoniae were identified from one periodontitis patient s pooled subgingival plaque sample by means of PCR. After periodontal treatment (SRP) the sample was negative for C. pneumoniae. Clinical parameters or biomarkers (MMP-8) of the patient with the positive C. pneumoniae finding did not differ from other study patients. In this study it was concluded that MMP-8 concentrations in GCF of sites from periodontally healthy individuals, subjects with gingivitis or with periodontitis are at different levels. The cut-off value of the developed MMP-8 test is at an optimal level to differentiate between these conditions and can possibly be utilised in identification of individuals at the risk of the transition of gingivitis to periodontitis. In periodontitis patients, repeatedly elevated GCF MMP-8 concentrations may indicate sites at risk of progression of periodontitis as well as patients with poor response to conventional periodontal treatment (SRP). This can be monitored by MMP-8 testing. Despite the lower mean GCF MMP-8 concentrations in smokers, a fraction of smokers sites expressed very high MMP-8 concentrations together with enhanced periodontal activity and could be identified with MMP-8 specific chair-side test. Deep periodontal lesions may be niches for non-periodontopathogenic micro-organisms with systemic effects like C. pneumoniae and possibly play a role in the transmission from one subject to another.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

We propose here a local exponential divergence plot which is capable of providing an alternative means of characterizing a complex time series. The suggested plot defines a time-dependent exponent and a ''plus'' exponent. Based on their changes with the embedding dimension and delay time, a criterion for estimating simultaneously the minimal acceptable embedding dimension, the proper delay time, and the largest Lyapunov exponent has been obtained. When redefining the time-dependent exponent LAMBDA(k) curves on a series of shells, we have found that whether a linear envelope to the LAMBDA(k) curves exists can serve as a direct dynamical method of distinguishing chaos from noise.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

In the first part of the thesis we explore three fundamental questions that arise naturally when we conceive a machine learning scenario where the training and test distributions can differ. Contrary to conventional wisdom, we show that in fact mismatched training and test distribution can yield better out-of-sample performance. This optimal performance can be obtained by training with the dual distribution. This optimal training distribution depends on the test distribution set by the problem, but not on the target function that we want to learn. We show how to obtain this distribution in both discrete and continuous input spaces, as well as how to approximate it in a practical scenario. Benefits of using this distribution are exemplified in both synthetic and real data sets.

In order to apply the dual distribution in the supervised learning scenario where the training data set is fixed, it is necessary to use weights to make the sample appear as if it came from the dual distribution. We explore the negative effect that weighting a sample can have. The theoretical decomposition of the use of weights regarding its effect on the out-of-sample error is easy to understand but not actionable in practice, as the quantities involved cannot be computed. Hence, we propose the Targeted Weighting algorithm that determines if, for a given set of weights, the out-of-sample performance will improve or not in a practical setting. This is necessary as the setting assumes there are no labeled points distributed according to the test distribution, only unlabeled samples.

Finally, we propose a new class of matching algorithms that can be used to match the training set to a desired distribution, such as the dual distribution (or the test distribution). These algorithms can be applied to very large datasets, and we show how they lead to improved performance in a large real dataset such as the Netflix dataset. Their computational complexity is the main reason for their advantage over previous algorithms proposed in the covariate shift literature.

In the second part of the thesis we apply Machine Learning to the problem of behavior recognition. We develop a specific behavior classifier to study fly aggression, and we develop a system that allows analyzing behavior in videos of animals, with minimal supervision. The system, which we call CUBA (Caltech Unsupervised Behavior Analysis), allows detecting movemes, actions, and stories from time series describing the position of animals in videos. The method summarizes the data, as well as it provides biologists with a mathematical tool to test new hypotheses. Other benefits of CUBA include finding classifiers for specific behaviors without the need for annotation, as well as providing means to discriminate groups of animals, for example, according to their genetic line.