929 results for Statistical evaluation


Relevance: 30.00%

Abstract:

Statistical Machine Translation (SMT) is one of the potential applications in the field of Natural Language Processing. The translation process in SMT is carried out by acquiring translation rules automatically from parallel corpora. However, for many language pairs (e.g. Malayalam-English), such corpora are available only in very limited quantities. Therefore, for these language pairs a large portion of the phrases encountered at run-time will be unknown. This paper focuses on methods for handling such out-of-vocabulary (OOV) words in Malayalam that cannot be translated to English using conventional phrase-based statistical machine translation systems. The OOV words in the source sentence are pre-processed to obtain the root word and its suffix. Different inflected forms of the OOV root are generated, and a match for these word variants is looked up in the phrase translation table of the translation model. A vocabulary filter is used to choose the best among the translations of these word variants by finding the unigram count. A match for the OOV suffix is also looked up in the phrase entries, and the target translations are filtered out. The filtered phrases are then structured, and the SMT translation model is extended by adding the OOV words with their new phrase translations. Manual evaluation shows that the number of OOV words in the input is reduced considerably.

Relevance: 30.00%

Abstract:

In Statistical Machine Translation from English to Malayalam, an unseen English sentence is translated into its equivalent Malayalam sentence using statistical models. A parallel English-Malayalam corpus is used in the training phase. Word-to-word alignments have to be set up among the sentence pairs of the source and target languages before they are used for training. This paper deals with techniques that can be adopted to improve the alignment model of SMT. Incorporating part-of-speech information into the bilingual corpus has eliminated many of the insignificant alignments. Identifying the named entities and cognates present in the sentence pairs has also proved advantageous when setting up the alignments. The presence of Malayalam words with predictable translations has further reduced the insignificant alignments. Moreover, the reduction of unwanted alignments has led to better training results. Experiments conducted on a sample corpus have generated reasonably good Malayalam translations, and the results are verified with the F-measure, BLEU and WER evaluation metrics.
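Of the three metrics named above, word error rate (WER) is the simplest to illustrate. The following is a minimal sketch, assuming whitespace tokenisation and the usual definition of WER as the word-level edit distance divided by the reference length; the function name and the example sentences are hypothetical and are not taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are separated by whitespace, which may be too
    crude for an agglutinative language such as Malayalam.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One word of the six-word reference is missing from the hypothesis.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Lower values are better, and the score can exceed 1.0 when the hypothesis is much longer than the reference.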

Relevance: 30.00%

Abstract:

While channel coding is a standard method of improving a system’s energy efficiency in digital communications, its practice does not extend to high-speed links. Increasing demands in network speeds are placing a large burden on the energy efficiency of high-speed links and render the benefit of channel coding for these systems a timely subject. The low error rates of interest and the presence of residual intersymbol interference (ISI) caused by hardware constraints impede the analysis and simulation of coded high-speed links. Focusing on the residual ISI and combined noise as the dominant error mechanisms, this paper analyses error correlation through concepts of error region, channel signature, and correlation distance. This framework provides a deeper insight into joint error behaviours in high-speed links, extends the range of statistical simulation for coded high-speed links, and provides a case against the use of biased Monte Carlo methods in this setting

Relevance: 30.00%

Abstract:

In Statistical Machine Translation from English to Malayalam, an unseen English sentence is translated into its equivalent Malayalam translation using statistical components such as a translation model, a language model and a decoder. A parallel English-Malayalam corpus is used in the training phase. Word-to-word alignments have to be set up among the sentence pairs of the source and target languages before they are used for training. This paper deals with techniques that can be adopted to improve the alignment model of SMT. Incorporating part-of-speech information into the bilingual corpus has eliminated many of the insignificant alignments. Identifying the named entities and cognates present in the sentence pairs has also proved advantageous when setting up the alignments. Moreover, the reduction of unwanted alignments has led to better training results. Experiments conducted on a sample corpus have generated reasonably good Malayalam translations, and the results are verified with the F-measure, BLEU and WER evaluation metrics.

Relevance: 30.00%

Abstract:

The identification of compositional changes in fumarolic gases of active and quiescent volcanoes is one of the most important targets in monitoring programs. From a general point of view, many systematic (often cyclic) and random processes control the chemistry of gas discharges, making it difficult to produce a convincing mathematical-statistical model. Changes in the chemical composition of volcanic gases sampled at Vulcano Island (Aeolian Arc, Sicily, Italy) from eight different fumaroles located in the northern sector of the summit crater (La Fossa) have been analysed by considering their dependence on time in the period 2000-2007. Each intermediate chemical composition has been considered as potentially derived from the contribution of the two temporal extremes, represented by the 2000 and 2007 samples respectively, by using inverse modelling methodologies for compositional data. Data pertaining to fumaroles F5 and F27, located on the rim and in the inner part of La Fossa crater, respectively, have been used to achieve the proposed aim. The statistical approach has allowed us to highlight the presence of random and non-random fluctuations, features that help to understand how the volcanic system works and that open new perspectives in sampling strategies and in the evaluation of the natural risk related to a quiescent volcano.

Relevance: 30.00%

Abstract:

In the B-ISDN there is provision for four classes of service, all of them supported by a single transport network (the ATM network). Three of these services, the connection-oriented (CO) ones, permit connection access control (CAC), but the fourth, the connectionless-oriented (CLO) one, does not. Therefore, when the CLO service and CO services have to share the same ATM link, a conflict may arise. This is because a bandwidth allocation that obtains maximum statistical gain can damage the contracted ATM quality of service (QOS); conversely, in order to guarantee the contracted QOS, the statistical gain has to be sacrificed. The paper presents a performance evaluation study of the influence of the CLO service on a CO service (a circuit emulation service or a variable bit-rate service) when both share the same link.

Relevance: 30.00%

Abstract:

In the present work the toxic activity of extracts of Eupatorium microphyllum L.F. was evaluated on 4th instar larvae of the mosquito Aedes aegypti (Linnaeus) under laboratory conditions. Aqueous extracts were used at concentrations of 500 mg L-1, 1,500 mg L-1 and 2,500 mg L-1, and acetone extracts at concentrations of 10 mg L-1, 20 mg L-1, 30 mg L-1, 40 mg L-1 and 50 mg L-1. The bioassays were carried out in triplicate, each with 20 larvae exposed for 24 hours to 150 mL of solution. Control groups were employed in all the bioassays. In the evaluation of the acetone extracts, a negative control was employed to rule out larval mortality caused by the solvent. The aqueous extracts showed low to moderate activity, with larval mortality below 20%. In contrast, the acetone extracts produced 15% mortality at 10 and 20 mg L-1, and 22 to 38% mortality at 30 and 40 mg L-1. At 50 mg L-1, however, mortality reached 95.4%, a highly statistically significant result. The acetone extracts proved the most efficient for controlling the selected mosquitoes. Both types of extract were toxic to larvae of A. aegypti, but the acetone extracts had a greater effect than the aqueous extracts of E. microphyllum, which constitutes a viable alternative in the search for new larvicides from natural compounds.

Relevance: 30.00%

Abstract:

Regular visual observations of persistent contrails over Reading, UK, have been used to evaluate radiosonde measurements of temperature and humidity defining cold ice-supersaturated atmospheric regions which are assumed to be a necessary condition for persistent condensation trails (contrails) to form. Results show a good correlation between observations and predictions using data from Larkhill, 63 km from Reading. A statistical analysis of this result and the forecasts using data from four additional UK radiosonde stations are presented. The horizontal extent of supersaturated layers could be inferred from this to be several hundred kilometres. The necessity of bias corrections to radiosonde humidity measurements is discussed and an analysis of measured ice-supersaturated atmospheric layers in the troposphere is presented. It is found that ice supersaturation is more likely to occur in winter than in summer, with frequencies of 17.3% and 9.4%, respectively, which is mostly due to the layers being thicker in winter than in summer. The most probable height for them to occur is about 10 km.

Relevance: 30.00%

Abstract:

A total of 86 profiles from meat and egg strains of chickens (male and female) were used in this study. Different flexible growth functions were evaluated with regard to their ability to describe the relationship between live weight and age and were compared with the Gompertz and logistic equations, which have a fixed point of inflection. Six growth functions were used: Gompertz, logistic, Lopez, Richards, France, and von Bertalanffy. A comparative analysis was carried out based on model behavior and statistical performance. The results of this study confirmed the initial concern about the limitation of a fixed point of inflection, such as in the Gompertz equation. Therefore, flexible growth functions are recommended as alternatives to the simpler equations (with a fixed point of inflection) for describing the relationship between live weight and age, for the following reasons: they are easy to fit; because of their flexibility they very often give a closer fit to the data points, and therefore a smaller RSS value, than the simpler models; and they encompass the simpler models at the cost of one extra parameter, which is especially important when the behavior of a particular data set is not known in advance.
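The contrast between a fixed and a flexible point of inflection can be made concrete with a short sketch. The parameterisations below are common textbook forms chosen for illustration and are assumptions, since the abstract does not give the equations used in the study; the Lopez, France and von Bertalanffy functions are omitted. In these forms the Gompertz curve always inflects at 1/e of the final weight and the logistic curve at 1/2, whereas the Richards curve inflects at (1 + nu)^(-1/nu) of the final weight, which moves with the extra shape parameter nu.

```python
import math


def gompertz(t: float, w_final: float, k: float, t_inflection: float) -> float:
    """Gompertz curve (assumed form); inflection weight is w_final / e."""
    return w_final * math.exp(-math.exp(-k * (t - t_inflection)))


def logistic(t: float, w_final: float, k: float, t_inflection: float) -> float:
    """Logistic curve (assumed form); inflection weight is w_final / 2."""
    return w_final / (1.0 + math.exp(-k * (t - t_inflection)))


def richards(t: float, w_final: float, k: float, t_inflection: float, nu: float) -> float:
    """Richards curve (assumed form, nu > 0) with a movable inflection weight.

    nu = 1 reproduces the logistic curve; nu -> 0 approaches Gompertz.
    """
    if nu <= 0:
        raise ValueError("this sketch assumes nu > 0")
    return w_final * (1.0 + nu * math.exp(-k * (t - t_inflection))) ** (-1.0 / nu)


if __name__ == "__main__":
    # Hypothetical parameters: final weight 2500 g, rate 0.05 per day,
    # inflection at day 40. These are not values from the study.
    w_final, k, t_i = 2500.0, 0.05, 40.0
    print("Gompertz ratio at inflection:", gompertz(t_i, w_final, k, t_i) / w_final)
    print("Logistic ratio at inflection:", logistic(t_i, w_final, k, t_i) / w_final)
    for nu in (0.5, 1.0, 3.0):
        ratio = richards(t_i, w_final, k, t_i, nu) / w_final
        print(f"Richards ratio at inflection (nu={nu}):", ratio)
```

Because the Richards form reduces to the logistic curve for one value of nu and approaches the Gompertz curve for another, it illustrates how a flexible function can encompass the simpler models through a single additional parameter.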

Relevance: 30.00%

Abstract:

As the ideal method of assessing the nutritive value of a feedstuff, namely offering it to the appropriate class of animal and recording the production response obtained, is neither practical nor cost effective, a range of feed evaluation techniques has been developed. Each of these balances some degree of compromise with the practical situation against data generation. However, due to the impact of animal-feed interactions over and above that of feed composition, the target animal remains the ultimate arbiter of nutritional value. In this review current in vitro feed evaluation techniques are examined according to the degree of animal-feed interaction. Chemical analysis provides absolute values and therefore differs from the majority of in vitro methods, which simply rank feeds. However, with no host animal involvement, estimates of nutritional value are inferred by statistical association. In addition, given the costs involved, the practical value of many of the analyses conducted should be reviewed. The in sacco technique has made a substantial contribution both to understanding rumen microbial degradative processes and to the rapid evaluation of feeds, especially in developing countries. However, the numerous shortfalls of the technique, common to many in vitro methods, together with the desire to eliminate the use of surgically modified animals for routine feed evaluation and parallel improvements in in vitro techniques, will see this technique increasingly replaced. The majority of in vitro systems use substrate disappearance to assess degradation; however, this provides no information regarding the quantity of derived end-products available to the host animal. As measurement of volatile fatty acids or microbial biomass production greatly increases analytical costs, fermentation gas release, a simple and non-destructive measurement, has been used as an alternative.
However, as gas release alone is of little use, gas-based systems, where both degradation and fermentation gas release are measured simultaneously, are attracting considerable interest. Alternative microbial inocula are being considered, as is the potential of using multi-enzyme systems to examine degradation dynamics. It is concluded that while chemical analysis will continue to form an indispensable part of feed evaluation, enhanced use will be made of increasingly complex in vitro systems. It is vital, however, that the function and limitations of each methodology are fully understood and that the temptation to over-interpret the data is avoided, so that the appropriate conclusions are drawn. With careful selection and correct application, in vitro systems offer powerful research tools with which to evaluate feedstuffs. (C) 2003 Elsevier B.V. All rights reserved.

Relevance: 30.00%

Abstract:

When Ian Wilson and Carlos Barahona of the Statistical Services Centre at the University of Reading were asked to review an evaluation of the effectiveness of an aid package in Malawi, they expected a simple enough task. But few things in the developing world are simple. Where aid for the poorest is concerned, is evidence collected and analysed with enough rigour to enable well-informed decisions to be made?

Relevance: 30.00%

Abstract:

As a continuing effort to establish the structure-activity relationships (SARs) within the series of the angiotensin II antagonists (sartans), a pharmacophoric model was built by using novel TOPP 3D descriptors. Statistical values were satisfactory (PC4: r² = 0.96, q² (5 random groups) = 0.84; SDEP = 0.26) and encouraged the synthesis and consequent biological evaluation of a series of new pyrrolidine derivatives. SAR together with a combined 3D quantitative SAR and high-throughput virtual screening showed that the newly synthesized 1-acyl-N-(biphenyl-4-ylmethyl)pyrrolidine-2-carboxamides may represent an interesting starting point for the design of new antihypertensive agents. In particular, biological tests performed on CHO-hAT1 cells stably expressing the human AT1 receptor showed that the length of the acyl chain is crucial for the receptor interaction and that the valeric chain is the optimal one.

Relevance: 30.00%

Abstract:

Substantial resources are used for surveillance of bovine spongiform encephalopathy (BSE) despite an extremely low detection rate, especially in healthy slaughtered cattle. We have developed a method based on the geometric waiting time distribution to establish and update the statistical evidence for BSE-freedom for defined birth cohorts using continued surveillance data. The results suggest that currently (data included up to September 2004) a birth cohort of Danish cattle born after March 1999 is free from BSE with a probability (power) of 0.8746 or 0.8509, depending on the choice of model for the diagnostic sensitivity. These results apply to an assumed design prevalence of 1 in 10,000 and account for prevalence heterogeneity. The age-dependent diagnostic sensitivity for the detection of BSE has been identified as a major determinant of the power. The incorporation of heterogeneity was deemed adequate on scientific grounds and led to improved power values. We propose our model as a decision tool for possible future modification of the BSE surveillance and discuss public health and international trade implications.

Relevance: 30.00%

Abstract:

Very large scale scheduling and planning tasks cannot be effectively addressed by fully automated schedule optimisation systems, since many key factors which govern 'fitness' in such cases are unformalisable. This raises the question of an interactive (or collaborative) approach, where fitness is assigned by the expert user. Though well-researched in the domains of interactively evolved art and music, this method is as yet rarely used in logistics. This paper concerns a difficulty shared by all interactive evolutionary systems (IESs), but especially those used for logistics or design problems. The difficulty is that objective evaluation of IESs is severely hampered by the need for expert humans in the loop. This makes it effectively impossible to, for example, determine with statistical confidence any ranking among a decent number of configurations for the parameters and strategy choices. We make headway into this difficulty with an Automated Tester (AT) for such systems. The AT replaces the human in experiments, and has parameters controlling its decision-making accuracy (modelling human error) and a built-in notion of a target solution which may typically be at odds with the solution which is optimal in terms of formalisable fitness. Using the AT, plausible evaluations of alternative designs for the IES can be done, allowing for (and examining the effects of) different levels of user error. We describe such an AT for evaluating an IES for very large scale planning.
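The role of such an Automated Tester can be illustrated with a small sketch. Everything here is hypothetical: the class name, the representation of a solution as a numeric vector, the absolute-difference distance and the pairwise-choice interface are assumptions made for illustration, not details from the paper. The sketch only captures the two properties the abstract names, namely a built-in target solution and a parameter that controls decision-making accuracy.

```python
import random
from typing import Optional, Sequence


class AutomatedTester:
    """Stand-in for an expert user who compares two candidate solutions.

    The tester prefers the candidate closer to its hidden target, but
    picks the other one with probability error_rate (modelled user error).
    """

    def __init__(self, target: Sequence[float], error_rate: float,
                 seed: Optional[int] = None) -> None:
        if not 0.0 <= error_rate <= 1.0:
            raise ValueError("error_rate must lie in [0, 1]")
        self.target = list(target)
        self.error_rate = error_rate
        self._rng = random.Random(seed)

    def _distance(self, candidate: Sequence[float]) -> float:
        # Assumed notion of closeness: sum of absolute differences.
        return sum(abs(c - t) for c, t in zip(candidate, self.target))

    def prefer(self, a: Sequence[float], b: Sequence[float]) -> Sequence[float]:
        """Return the candidate the simulated user would choose."""
        if self._distance(a) <= self._distance(b):
            better, worse = a, b
        else:
            better, worse = b, a
        # With probability error_rate the simulated user chooses wrongly.
        if self._rng.random() < self.error_rate:
            return worse
        return better


if __name__ == "__main__":
    tester = AutomatedTester(target=[0.0, 0.0], error_rate=0.2, seed=42)
    picks = [tester.prefer([1.0, 1.0], [3.0, 3.0]) for _ in range(1000)]
    share_correct = sum(p == [1.0, 1.0] for p in picks) / len(picks)
    print("share of correct choices:", share_correct)
```

Because the simulated user is cheap to query, the same experiment can be repeated many times per configuration, which is what makes statistical comparison of alternative IES designs feasible.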

Relevance: 30.00%

Abstract:

Peak picking is an early key step in MS data analysis. We compare three commonly used approaches to peak picking and discuss their merits by means of statistical analysis. Methods investigated encompass signal-to-noise ratio, continuous wavelet transform, and a correlation-based approach using a Gaussian template. Functionality of the three methods is illustrated and discussed in a practical context using a mass spectral data set created with MALDI-TOF technology. Sensitivity and specificity are investigated using a manually defined reference set of peaks. As an additional criterion, the robustness of the three methods is assessed by a perturbation analysis and illustrated using ROC curves.
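The first of the three approaches, picking peaks by signal-to-noise ratio, can be sketched in a few lines. This is a minimal illustration under stated assumptions and not one of the implementations compared in the paper: the baseline is taken as the global median, the noise level is estimated from the median absolute deviation (scaled by 1.4826), and a peak is any local maximum whose height above the baseline is at least min_snr times the noise. The function name, the threshold and the toy spectrum are hypothetical.

```python
import statistics
from typing import List, Sequence


def pick_peaks_snr(intensities: Sequence[float], min_snr: float = 3.0) -> List[int]:
    """Return indices of local maxima whose signal-to-noise ratio >= min_snr.

    Assumptions: flat baseline (global median) and a robust noise
    estimate from the median absolute deviation (MAD).
    """
    values = list(intensities)
    if len(values) < 3:
        return []

    baseline = statistics.median(values)
    mad = statistics.median([abs(v - baseline) for v in values])
    noise = 1.4826 * mad  # MAD scaled to match a Gaussian standard deviation
    if noise == 0:
        raise ValueError("noise estimate is zero; the SNR is undefined")

    peaks = []
    for i in range(1, len(values) - 1):
        is_local_max = values[i] > values[i - 1] and values[i] >= values[i + 1]
        if is_local_max and (values[i] - baseline) / noise >= min_snr:
            peaks.append(i)
    return peaks


if __name__ == "__main__":
    # Toy spectrum: low-level noise with one clear peak at index 4.
    spectrum = [1.0, 1.2, 0.9, 1.1, 9.0, 1.0, 0.8, 1.1, 1.0, 1.2, 0.9]
    print(pick_peaks_snr(spectrum))
```

Comparing the returned indices against a manually defined reference set of peaks, as described in the abstract, would then yield sensitivity and specificity for a given threshold.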