942 results for Classification error rate
Abstract:
The turn-on process of a multimode VCSEL is investigated from a statistical point of view. Special attention is paid to quantities such as time jitter and bit error rate. The single-mode performance of VCSELs during current modulation is compared to that of edge-emitting lasers.
Abstract:
We conduct a large-scale comparative study on linearly combining superparent-one-dependence estimators (SPODEs), a popular family of seminaive Bayesian classifiers. Altogether, 16 model selection and weighing schemes, 58 benchmark data sets, and various statistical tests are employed. This paper's main contributions are threefold. First, it formally presents each scheme's definition, rationale, and time complexity and hence can serve as a comprehensive reference for researchers interested in ensemble learning. Second, it offers bias-variance analysis for each scheme's classification error performance. Third, it identifies effective schemes that meet various needs in practice. This leads to accurate and fast classification algorithms which have an immediate and significant impact on real-world applications. Another important feature of our study is using a variety of statistical tests to evaluate multiple learning methods across multiple data sets.
Abstract:
The value of earmarks as an efficient means of personal identification is still subject to debate. It has been argued that the field is lacking a firm systematic and structured data basis to help practitioners to form their conclusions. Typically, there is a paucity of research on the selectivity of the features used in the comparison process between an earmark and reference earprints taken from an individual. This study proposes a system for the automatic comparison of earprints and earmarks, operating without any manual extraction of key-points or manual annotations. For each donor, a model is created using multiple reference prints, hence capturing the donor's within-source variability. For each comparison between a mark and a model, images are automatically aligned and a proximity score, based on a normalized 2D correlation coefficient, is calculated. Appropriate use of this score allows deriving a likelihood ratio that can be explored under known state of affairs (both in cases where it is known that the mark has been left by the donor that gave the model and conversely in cases when it is established that the mark originates from a different source). To assess the system performance, a first dataset containing 1229 donors elaborated during the FearID research project was used. Based on these data, for mark-to-print comparisons, the system performed with an equal error rate (EER) of 2.3%, and about 88% of marks were found in the first 3 positions of a hitlist. When performing print-to-print transactions, results show an equal error rate of 0.5%. The system was then tested using real-case data obtained from police forces.
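The equal error rate quoted above is the operating point at which the false acceptance rate and the false rejection rate coincide. The sketch below is a minimal illustration, not the system's actual procedure: the function name `equal_error_rate` and the toy scores are hypothetical, and it assumes similarity scores where a higher value means "same source".

```python
def equal_error_rate(genuine, impostor):
    """Return (eer, threshold) for similarity scores (higher = more similar).

    genuine:  scores from same-source comparisons
    impostor: scores from different-source comparisons
    """
    best = None
    # Try every observed score as a decision threshold.
    for t in sorted(set(genuine) | set(impostor)):
        # False rejection: a same-source comparison scored below the threshold.
        frr = sum(s < t for s in genuine) / len(genuine)
        # False acceptance: a different-source comparison scored at or above it.
        far = sum(s >= t for s in impostor) / len(impostor)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            # Report the midpoint of FAR and FRR where they are closest.
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Toy scores, invented purely for illustration.
genuine = [0.9, 0.8, 0.75, 0.6, 0.4]
impostor = [0.7, 0.5, 0.3, 0.2, 0.1]
eer, thr = equal_error_rate(genuine, impostor)
print(f"EER = {eer:.2f} at threshold {thr}")
```

With finite data the two error curves rarely cross exactly, so the sketch reports the midpoint at the closest threshold; interpolating between thresholds is a common refinement.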
Abstract:
BACKGROUND: Therapy of chronic hepatitis C (CHC) with pegIFNα/ribavirin achieves a sustained virologic response (SVR) in ∼55% of patients. Pre-activation of the endogenous interferon system in the liver is associated with non-response (NR). Recently, genome-wide association studies described associations of allelic variants near the IL28B (IFNλ3) gene with treatment response and with spontaneous clearance of the virus. We investigated if the IL28B genotype determines the constitutive expression of IFN stimulated genes (ISGs) in the liver of patients with CHC. METHODS: We genotyped 93 patients with CHC for 3 IL28B single nucleotide polymorphisms (SNPs, rs12979860, rs8099917, rs12980275), extracted RNA from their liver biopsies and quantified the expression of IL28B and of 8 previously identified classifier genes which discriminate between SVR and NR (IFI44L, RSAD2, ISG15, IFI27, LAMP3, OAS3, LGALS3BP and HTATIP2). Decision tree ensembles in the form of a random forest classifier were used to calculate the relative predictive power of these different variables in a multivariate analysis. RESULTS: The minor IL28B allele (bad risk for treatment response) was significantly associated with increased expression of ISGs, and, unexpectedly, with decreased expression of IL28B. Stratification of the patients into SVR and NR revealed that ISG expression was conditionally independent from the IL28B genotype, i.e. there was an increased expression of ISGs in NR compared to SVR irrespective of the IL28B genotype. The random forest feature score (RFFS) identified IFI27 (RFFS = 2.93), RSAD2 (1.88) and HTATIP2 (1.50) expression and the HCV genotype (1.62) as the strongest predictors of treatment response. ROC curves of the IL28B SNPs showed an AUC of 0.66 with an error rate (ERR) of 0.38. A classifier with the 3 best classifying genes showed an excellent test performance with an AUC of 0.94 and ERR of 0.15.
The addition of IL28B genotype information did not improve the predictive power of the 3-gene classifier. CONCLUSIONS: IL28B genotype and hepatic ISG expression are conditionally independent predictors of treatment response in CHC. There is no direct link between altered IFNλ3 expression and pre-activation of the endogenous system in the liver. Hepatic ISG expression is by far the better predictor for treatment response than IL28B genotype.
Abstract:
This article describes the development of an Open Source shallow-transfer machine translation system from Czech to Polish in the Apertium platform. It gives details of the methods and resources used in constructing the system. Although the resulting system has quite a high error rate, it is still competitive with other systems.
Abstract:
In this paper, we investigate the average and outage performance of spatial multiplexing multiple-input multiple-output (MIMO) systems with channel state information at both sides of the link. Such systems result, for example, from exploiting the channel eigenmodes in multiantenna systems. Due to the complexity of obtaining the exact expression for the average bit error rate (BER) and the outage probability, we derive approximations in the high signal-to-noise ratio (SNR) regime assuming an uncorrelated Rayleigh flat-fading channel. More exactly, capitalizing on previous work by Wang and Giannakis, the average BER and outage probability versus SNR curves of spatial multiplexing MIMO systems are characterized in terms of two key parameters: the array gain and the diversity gain. Finally, these results are applied to analyze the performance of a variety of linear MIMO transceiver designs available in the literature.
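For orientation, the two-parameter high-SNR characterization mentioned above is usually written in the following generic form. The symbols are assumptions introduced here, not taken from the abstract: $\rho$ is the average SNR, $G_a$ the array gain and $G_d$ the diversity gain. The paper's exact expressions may differ.

```latex
\bar{P}_b(\rho) \;\approx\; \left( G_a \, \rho \right)^{-G_d}, \qquad \rho \to \infty
```

On a log-log plot of average BER versus SNR, $G_d$ sets the slope of the curve and $G_a$ shifts it horizontally.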
Abstract:
This paper presents a Bayesian approach to the design of transmit prefiltering matrices in closed-loop schemes robust to channel estimation errors. The algorithms are derived for a multiple-input multiple-output (MIMO) orthogonal frequency division multiplexing (OFDM) system. Two different optimization criteria are analyzed: the minimization of the mean square error and the minimization of the bit error rate. In both cases, the transmitter design is based on the singular value decomposition (SVD) of the conditional mean of the channel response, given the channel estimate. The performance of the proposed algorithms is analyzed, and their relationship with existing algorithms is indicated. As with other previously proposed solutions, the minimum bit error rate algorithm converges to the open-loop transmission scheme for very poor CSI estimates.
Abstract:
Localization, which is the ability of a mobile robot to estimate its position within its environment, is a key capability for autonomous operation of any mobile robot. This thesis presents a system for indoor coarse and global localization of a mobile robot based on visual information. The system is based on image matching and uses SIFT features as natural landmarks. Features extracted from training images are stored in a database for use in localization later. During localization, an image of the scene is captured using the on-board camera of the robot, features are extracted from the image and the best match is searched for in the database. Feature matching is done using the k-d tree algorithm. Experimental results showed that localization accuracy increases with the number of training features used in the training database, while, on the other hand, an increasing number of features tended to have a negative impact on the computational time. For some parts of the environment the error rate was relatively high due to a strong correlation of features taken from those places across the environment.
Abstract:
In the future, portable devices such as mobile phones and palmtop computers will be able to establish network connections using different access methods in different situations. The access methods differ in their communication characteristics with respect to latency, bandwidth, error rate, and so on. Wireless access methods are also characterized by strong variation of the connection properties with the environment. To achieve the best performance and usability, a portable device must be able to adapt to the access method in use and to changes in the communication environment. Protocol stacks are an essential part of data communication: they enable communication between systems and provide network services to the user applications of a terminal device. For protocol stacks to adapt to the characteristics of a given communication environment, it must be possible to change the behaviour of the stack at run time. Traditionally, however, protocol stacks have been built to be static, so that adaptation to this extent is very difficult, if not impossible, to implement. This master's thesis deals with building adaptive protocol stacks using a component-based software framework that enables protocol stacks to be modified at run time. By implementing an example system and measuring its performance in a varying communication environment, we show that adaptive protocol stacks are feasible to build and that they offer significant benefits, especially in future portable devices.
Abstract:
Several clinical studies have reported that EEG synchrony is affected by Alzheimer’s disease (AD). In this paper a frequency band analysis of AD EEG signals is presented, with the aim of improving the diagnosis of AD using EEG signals. In this paper, multiple synchrony measures are assessed through statistical tests (Mann–Whitney U test), including correlation, phase synchrony and Granger causality measures. Moreover, linear discriminant analysis (LDA) is conducted with those synchrony measures as features. For the data set at hand, the frequency range (5-6Hz) yields the best accuracy for diagnosing AD, which lies within the classical theta band (4-8Hz). The corresponding classification error is 4.88% for directed transfer function (DTF) Granger causality measure. Interestingly, results show that EEG of AD patients is more synchronous than in healthy subjects within the optimized range 5-6Hz, which is in sharp contrast with the loss of synchrony in AD EEG reported in many earlier studies. This new finding may provide new insights about the neurophysiology of AD. Additional testing on larger AD datasets is required to verify the effectiveness of the proposed approach.
Abstract:
Biomedical research is currently facing a new type of challenge: an excess of information, both in terms of raw data from experiments and in the number of scientific publications describing their results. Mirroring the focus on data mining techniques to address the issues of structured data, there has recently been great interest in the development and application of text mining techniques to make more effective use of the knowledge contained in biomedical scientific publications, accessible only in the form of natural human language. This thesis describes research done in the broader scope of projects aiming to develop methods, tools and techniques for text mining tasks in general and for the biomedical domain in particular. The work described here involves more specifically the goal of extracting information from statements concerning relations of biomedical entities, such as protein-protein interactions. The approach taken is one using full parsing (syntactic analysis of the entire structure of sentences) and machine learning, aiming to develop reliable methods that can further be generalized to apply also to other domains. The five papers at the core of this thesis describe research on a number of distinct but related topics in text mining. In the first of these studies, we assessed the applicability of two popular general English parsers to biomedical text mining and, finding their performance limited, identified several specific challenges to accurate parsing of domain text. In a follow-up study focusing on parsing issues related to specialized domain terminology, we evaluated three lexical adaptation methods. We found that the accurate resolution of unknown words can considerably improve parsing performance and introduced a domain-adapted parser that reduced the error rate of the original by 10% while also roughly halving parsing time.
To establish the relative merits of parsers that differ in the applied formalisms and the representation given to their syntactic analyses, we have also developed evaluation methodology, considering different approaches to establishing comparable dependency-based evaluation results. We introduced a methodology for creating highly accurate conversions between different parse representations, demonstrating the feasibility of unification of diverse syntactic schemes under a shared, application-oriented representation. In addition to allowing formalism-neutral evaluation, we argue that such unification can also increase the value of parsers for domain text mining. As a further step in this direction, we analysed the characteristics of publicly available biomedical corpora annotated for protein-protein interactions and created tools for converting them into a shared form, thus contributing also to the unification of text mining resources. The introduced unified corpora allowed us to perform a task-oriented comparative evaluation of biomedical text mining corpora. This evaluation established clear limits on the comparability of results for text mining methods evaluated on different resources, prompting further efforts toward standardization. To support this and other research, we have also designed and annotated BioInfer, the first domain corpus of its size combining annotation of syntax and biomedical entities with a detailed annotation of their relationships. The corpus represents a major design and development effort of the research group, with manual annotation that identifies over 6000 entities, 2500 relationships and 28,000 syntactic dependencies in 1100 sentences. In addition to combining these key annotations for a single set of sentences, BioInfer was also the first domain resource to introduce a representation of entity relations that is supported by ontologies and able to capture complex, structured relationships.
Part I of this thesis presents a summary of this research in the broader context of a text mining system, and Part II contains reprints of the five included publications.
Abstract:
This work studies data transmission with different modulations, bit rates and amplitude levels, and the results are examined using the bit error ratio. Signals were also transmitted in coded form, and the advantages and disadvantages of coding were compared with uncoded data. The data stream travels in an AXMK cable, either alongside the direct current or in the grounding conductor. The results showed that a higher bit rate did not increase the amount of losses. The use of coding, on the other hand, reduced the number of bit errors.
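The bit error ratio used as the figure of merit here is the number of received bits that differ from the transmitted bits, divided by the total number of bits sent. The sketch below only illustrates that definition. The function name `bit_error_rate` and the example bit sequences are hypothetical and are not measurements from the thesis.

```python
def bit_error_rate(sent, received):
    """Fraction of positions where the received bit differs from the sent bit."""
    if len(sent) != len(received):
        raise ValueError("sequences must have the same length")
    if not sent:
        raise ValueError("sequences must not be empty")
    # Count the positions at which the two sequences disagree.
    errors = sum(a != b for a, b in zip(sent, received))
    return errors / len(sent)


# Invented example: 2 of the 8 bits are flipped in transit.
sent = [1, 0, 1, 1, 0, 0, 1, 0]
received = [1, 0, 0, 1, 0, 1, 1, 0]
ber = bit_error_rate(sent, received)
print(f"BER = {ber}")
```

To compare coded and uncoded transmission with such a function, the coded stream would normally be decoded first, so that both ratios are measured on the information bits.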
Abstract:
Longitudinal surveys are increasingly used to collect event history data on person-specific processes such as transitions between labour market states. Survey-based event history data pose a number of challenges for statistical analysis. These challenges include survey errors due to sampling, non-response, attrition and measurement. This study deals with non-response, attrition and measurement errors in event history data and the bias caused by them in event history analysis. The study also discusses some choices faced by a researcher using longitudinal survey data for event history analysis and demonstrates their effects. These choices include whether a design-based or a model-based approach is taken, which subset of data to use and, if a design-based approach is taken, which weights to use. The study takes advantage of the possibility to use combined longitudinal survey register data. Data from the Finnish subset of the European Community Household Panel (FI ECHP) survey for waves 1–5 were linked at person-level with longitudinal register data. Unemployment spells were used as study variables of interest. Lastly, a simulation study was conducted in order to assess the statistical properties of the Inverse Probability of Censoring Weighting (IPCW) method in a survey data context. The study shows how combined longitudinal survey register data can be used to analyse and compare the non-response and attrition processes, test the missingness mechanism type and estimate the size of bias due to non-response and attrition. In our empirical analysis, initial non-response turned out to be a more important source of bias than attrition. Reported unemployment spells were subject to seam effects, omissions, and, to a lesser extent, overreporting. The use of proxy interviews tended to cause spell omissions. An often-ignored phenomenon, classification error in reported spell outcomes, was also found in the data.
Neither the Missing At Random (MAR) assumption about non-response and attrition mechanisms, nor the classical assumptions about measurement errors, turned out to be valid. Both measurement errors in spell durations and spell outcomes were found to cause bias in estimates from event history models. Low measurement accuracy affected the estimates of baseline hazard most. The design-based estimates based on data from respondents to all waves of interest and weighted by the last wave weights displayed the largest bias. Using all the available data, including the spells by attriters until the time of attrition, helped to reduce attrition bias. Lastly, the simulation study showed that the IPCW correction to design weights reduces bias due to dependent censoring in design-based Kaplan-Meier and Cox proportional hazard model estimators. The study discusses implications of the results for survey organisations collecting event history data, researchers using surveys for event history analysis, and researchers who develop methods to correct for non-sampling biases in event history data.
Abstract:
Imaging studies have shown reduced frontal lobe resources following total sleep deprivation (TSD). The anterior cingulate cortex (ACC) in the frontal region plays a role in performance monitoring and cognitive control; both error detection and response inhibition are impaired following sleep loss. Event-related potentials (ERPs) are an electrophysiological tool used to index the brain's response to stimuli and information processing. In the Flanker task, the error-related negativity (ERN) and error positivity (Pe) ERPs are elicited after erroneous button presses. In a Go/NoGo task, NoGo-N2 and NoGo-P3 ERPs are elicited during high conflict stimulus processing. Research investigating the impact of sleep loss on ERPs during performance monitoring is equivocal, possibly due to task differences, sample size differences and varying degrees of sleep loss. Based on the effects of sleep loss on frontal function and prior research, it was expected that the sleep deprivation group would have lower accuracy, slower reaction time and impaired remediation on performance monitoring tasks, along with attenuated and delayed stimulus- and response-locked ERPs. In the current study, 49 young adults (24 male) were screened to be healthy good sleepers and then randomly assigned to a sleep deprived (n = 24) or rested control (n = 25) group. Participants slept in the laboratory on a baseline night, followed by a second night of sleep or wake. Flanker and Go/NoGo tasks were administered in a battery at 10:30 am (i.e., 27 hours awake for the sleep deprivation group) to measure performance monitoring. On the Flanker task, the sleep deprivation group was significantly slower than controls (p's < .05), but groups did not differ on accuracy. No group differences were observed in post-error slowing, but a trend was observed for less remedial accuracy in the sleep deprived group compared to controls (p = .09), suggesting impairment in the ability to take remedial action following TSD.
Delayed P300s were observed in the sleep deprived group on congruent and incongruent Flanker trials combined (p = .001). On the Go/NoGo task, the hit rate (i.e., Go accuracy) was significantly lower in the sleep deprived group compared to controls (p < .001), but no differences were found on false alarm rates (i.e., NoGo accuracy). For the sleep deprived group, the Go-P3 was significantly smaller (p = .045) and there was a trend for a smaller NoGo-N2 compared to controls (p = .08). The ERN amplitude was reduced in the TSD group compared to controls in both the Flanker and Go/NoGo tasks. Error rate was significantly correlated with the amplitude of response-locked ERNs in control (r = -.55, p = .005) and sleep deprived groups (r = -.46, p = .021); error rate was also correlated with Pe amplitude in controls (r = .46, p = .022) and a trend was found in the sleep deprived participants (r = .39, p = .052). An exploratory analysis showed significantly larger Pe mean amplitudes (p = .025) in the sleep deprived group compared to controls for participants who made more than 40 errors on the Flanker task. Altered stimulus processing as indexed by delayed P3 latency during the Flanker task and smaller amplitude Go-P3s during the Go/NoGo task indicate impairment in stimulus evaluation and/or context updating during frontal lobe tasks. ERN and NoGo-N2 reductions in the sleep deprived group confirm impairments in the monitoring system. These data add to a body of evidence showing that the frontal brain region is particularly vulnerable to sleep loss. Understanding the neural basis of these deficits in performance monitoring abilities is particularly important for our increasingly sleep deprived society and for safety and productivity in situations like driving and sustained operations.
Abstract:
Background. The ABO and Rh(D) phenotypes of blood donors and of transfused patients are routinely analysed to ensure full compatibility. These analyses are performed by agglutination following an antibody-antigen reaction. However, because of prohibitive costs and analysis times, blood donations are not routinely tested for minor blood antigens. This gap can lead to alloimmunization of recipient patients against one or more minor antigens and thus cause severe complications in future transfusions. Study design and methods. To address the problem, we produced a genetic panel based on Beckman Coulter's "GenomeLab SNPstream" technology, with a view to analysing 22 minor blood antigens simultaneously. The DNA source is the patients' white blood cells, previously isolated on FTA paper. Results. The results show that the genotype discordance rate, measured by the correlation of genotyping results from both directions of the DNA, and the genotyping failure rate are very low (0.1%). Likewise, the correlation between phenotypes predicted by genotyping and the actual phenotypes obtained by serology of red blood cells and platelets ranges from 97% to 100%. Experimental or database-processing errors, as well as rare polymorphisms affecting antigen conformation, could explain the differences in results. However, given that phenotyping results obtained from genotypes will always be cross-checked before any blood transfusion using standard technologies approved by government bodies, the correlation rates obtained far exceed the success criteria expected for the project. Conclusion.
Genetic profiling of minor blood antigens will make it possible to create a centralized computerized database of donor phenotypes, allowing blood banks to quickly find compatible donor and recipient profiles.