993 results for word prediction
Abstract:
The work is based on the assumption that words with similar syntactic usage have similar meaning, which was proposed by Zellig S. Harris (1954, 1968). We study his assumption from two aspects: Firstly, different meanings (word senses) of a word should manifest themselves in different usages (contexts), and secondly, similar usages (contexts) should lead to similar meanings (word senses). If we start with the different meanings of a word, we should be able to find distinct contexts for the meanings in text corpora. We separate the meanings by grouping and labeling contexts in an unsupervised or weakly supervised manner (Publication 1, 2 and 3). We are confronted with the question of how best to represent contexts in order to induce effective classifiers of contexts, because differences in context are the only means we have to separate word senses. If we start with words in similar contexts, we should be able to discover similarities in meaning. We can do this monolingually or multilingually. In the monolingual material, we find synonyms and other related words in an unsupervised way (Publication 4). In the multilingual material, we find translations by supervised learning of transliterations (Publication 5). In both the monolingual and multilingual case, we first discover words with similar contexts, i.e., synonym or translation lists. In the monolingual case we also aim at finding structure in the lists by discovering groups of similar words, e.g., synonym sets. In this introduction to the publications of the thesis, we consider the larger background issues of how meaning arises, how it is quantized into word senses, and how it is modeled. We also consider how to define, collect and represent contexts. We discuss how to evaluate the trained context classifiers and discovered word sense classifications, and finally we present the word sense discovery and disambiguation methods of the publications.
This work supports Harris' hypothesis by implementing three new methods modeled on his hypothesis. The methods have practical consequences for creating thesauruses and translation dictionaries, e.g., for information retrieval and machine translation purposes. Keywords: Word senses, Context, Evaluation, Word sense disambiguation, Word sense discovery.
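As a rough illustration of the idea that words used in similar contexts tend to have similar meanings, the sketch below counts neighbouring words for each word and compares the resulting count vectors with cosine similarity. The toy corpus, the window size and the function names are assumptions made for this example only; the thesis's own context representations and classifiers are not described in enough detail here to reproduce.

```python
from collections import Counter
from math import sqrt


def context_vectors(sentences, window=2):
    """For each word, count the words that occur within `window` positions of it."""
    vectors = {}
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            context = tokens[lo:i] + tokens[i + 1:hi]
            vectors.setdefault(word, Counter()).update(context)
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Toy corpus (invented for illustration).
corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases mice",
    "the dog chases cats",
    "a car needs fuel",
]
vecs = context_vectors(corpus)
sim_cat_dog = cosine(vecs["cat"], vecs["dog"])    # shared contexts: the, drinks, chases
sim_cat_fuel = cosine(vecs["cat"], vecs["fuel"])  # no shared contexts
```

Here "cat" and "dog" share most of their contexts and score higher than "cat" and "fuel", which share none. Ranking all words by this score is one simple way to obtain candidate lists of related words.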
Abstract:
The Brix content of pineapple fruit can be non-invasively predicted from the second derivative of near infrared reflectance spectra. Correlations obtained using a NIRSystems 6500 spectrophotometer through multiple linear regression and modified partial least squares analyses using a post-dispersive configuration were comparable with those from a pre-dispersive configuration in terms of accuracy (e.g. coefficient of determination, R², 0.73; standard error of cross validation, SECV, 1.01 °Brix). The effective depth of sample assessed was slightly greater using the post-dispersive technique (about 20 mm for pineapple fruit), as expected in relation to the higher incident light intensity, relative to the pre-dispersive configuration. The effects of environmental variables such as temperature, humidity and external light, and of instrumental variables such as the number of scans averaged to form a spectrum, were considered with respect to the accuracy and precision of the measurement of absorbance at 876 nm, as a key term in the calibration for Brix, and predicted Brix. The application of post-dispersive near infrared technology to in-line assessment of intact fruit in a packing shed environment is discussed.
Abstract:
Lateral or transaxial truncation of cone-beam data can occur either because of the field-of-view limitation of the scanning apparatus or in region-of-interest tomography. In this paper, we suggest two new methods to handle lateral truncation in helical scan CT. It is seen that reconstruction with laterally truncated projection data, assuming it to be complete, gives severe artifacts which even penetrate into the field of view. A row-by-row data completion approach using linear prediction is introduced for helical scan truncated data. An extension of this technique, known as the windowed linear prediction approach, is introduced. The efficacy of the two techniques is shown using simulation with standard phantoms. A quantitative image quality measure of the resulting reconstructed images is used to evaluate the performance of the proposed methods against an extension of a standard existing technique.
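To make the notion of completing a truncated detector row by linear prediction more concrete, here is a minimal sketch. It fits a second-order predictor x[n] ≈ a1·x[n-1] + a2·x[n-2] to the measured samples by least squares and then extrapolates beyond the truncation edge. The predictor order, the fitting procedure, the function names and the sinusoidal test signal are all assumptions for illustration; the abstract does not specify the paper's actual predictor or its windowed variant.

```python
import math


def fit_lp2(x):
    """Least-squares fit of x[n] ~ a1*x[n-1] + a2*x[n-2] over the measured samples."""
    r11 = r12 = r22 = b1 = b2 = 0.0
    for n in range(2, len(x)):
        r11 += x[n - 1] * x[n - 1]
        r12 += x[n - 1] * x[n - 2]
        r22 += x[n - 2] * x[n - 2]
        b1 += x[n] * x[n - 1]
        b2 += x[n] * x[n - 2]
    # Solve the 2x2 normal equations [[r11, r12], [r12, r22]] @ [a1, a2] = [b1, b2].
    det = r11 * r22 - r12 * r12
    if abs(det) < 1e-12:
        raise ValueError("normal equations are singular")
    a1 = (b1 * r22 - b2 * r12) / det
    a2 = (b2 * r11 - b1 * r12) / det
    return a1, a2


def extend_row(x, extra):
    """Append `extra` predicted samples to a truncated row."""
    a1, a2 = fit_lp2(x)
    out = list(x)
    for _ in range(extra):
        out.append(a1 * out[-1] + a2 * out[-2])
    return out


# Synthetic "measured" row: a sampled cosine, which obeys an exact order-2 recurrence.
measured = [math.cos(0.3 * n) for n in range(20)]
completed = extend_row(measured, 5)
```

A pure cosine is extrapolated almost exactly because it satisfies a second-order recurrence. Real projection rows would not, so a practical scheme would likely need a higher order and some control over how far the extrapolation is trusted.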
Abstract:
Traps baited with synthetic aggregation pheromone and fermenting bread dough were used to monitor seasonal incidence and abundance of the ripening fruit pests, Carpophilus hemipterus (L.), C. mutilatus Erichson and C. davidsoni Dobson in stone fruit orchards in the Leeton district of southern New South Wales during five seasons (1991-96). Adult beetles were trapped from September-May, but abundance varied considerably between years with the amount of rainfall in December-January having a major influence on population size and damage potential during the canning peach harvest (late February-March). Below average rainfall in December-January was associated with mean trap catches of < 10 beetles/trap/week in low dose pheromone traps during the harvest period in 1991/92 and 1993/94 and no reported damage to ripening fruit. Rainfall in December-January 1992/93 was more than double the average and mean trap catches ranged from 8-27 beetles/week during the harvest period with substantial damage to the peach crop. December-January rainfall was also above average in 1994/95 and 1995/96 and means of 50-300 beetles/trap/week were recorded in high dose pheromone traps during harvest periods. Carpophilus spp. caused economic damage to peach crops in both seasons. These data indicate that it may be possible to predict the likelihood of Carpophilus beetle damage to ripening stone fruit in inland areas of southern Australia, by routine pheromone-based monitoring of beetle populations and summer temperatures and rainfall.
Abstract:
We study which factors in terms of trading environment and trader characteristics determine individual information acquisition in experimental asset markets. Traders with larger endowments, existing inconclusive information, lower risk aversion, and less experience in financial markets tend to acquire more information. Overall, we find that traders overacquire information, so that informed traders on average obtain negative profits net of information costs. Information acquisition and the associated losses do not diminish over time. This overacquisition phenomenon is inconsistent with predictions of rational expectations equilibrium, and we argue it resembles the overdissipation results from the contest literature. We find that more acquired information in the market leads to smaller differences between fundamental asset values and prices. Thus, the overacquisition phenomenon is a novel explanation for the high forecasting accuracy of prediction markets.
Abstract:
Aptitude-based student selection: A study concerning the admission processes of some technically oriented healthcare degree programmes in Finland (Orthotics and Prosthetics, Dental Technology and Optometry). The data studied consisted of convenience samples of preadmission information and the results of the admission processes of three technically oriented healthcare degree programmes (Orthotics and Prosthetics, Dental Technology and Optometry) in Finland during the years 1977-1986 and 2003. The number of the subjects tested and interviewed in the first samples was 191, 615 and 606, and in the second 67, 64 and 89, respectively. The questions of the six studies were: I. How were different kinds of preadmission data related to each other? II. Which were the major determinants of the admission decisions? III. Did the graduated students and those who dropped out differ from each other? IV. Was it possible to predict how well students would perform in the programmes? V. How was the student selection executed in the year 2003? VI. Should clinical vs. statistical prediction or both be used? (Some remarks are presented on Meehl's argument: "Always, we might as well face it, the shadow of the statistician hovers in the background; always the actuary will have the final word.") The main results of the study were as follows: Ability tests, dexterity tests and judgements of personality traits (communication skills, initiative, stress tolerance and motivation) provided unique, non-redundant information about the applicants. Available demographic variables did not bias the judgements of personality traits. In all three programme settings, four-factor solutions (personality, reasoning, gender-technical and age-vocational with factor scores) could be extracted by the Maximum Likelihood method with graphical Varimax rotation. The personality factor dominated the final aptitude judgements and very strongly affected the selection decisions.
There were no clear differences between graduated students and those who had dropped out in regard to the four factors. In addition, the factor scores did not predict how well the students performed in the programmes. Meehl's argument on the uncertainty of clinical prediction was supported by the results, which on the other hand did not provide any relevant data for rules on statistical prediction. No clear arguments for or against aptitude-based student selection were presented. However, the structure of the aptitude measures and their impact on the admission process are now better known. The concept of "personal aptitude" is not necessarily included in the values and preferences of those in charge of organizing the schooling. Thus, obviously the most well-founded and cost-effective way to execute student selection is to rely on e.g. the grade point averages of the matriculation examination and/or written entrance exams. This procedure, according to the present study, would result in a student group which has a quite different makeup (60%) from the group selected on the basis of aptitude tests. For the recruiting organizations, instead, "personal aptitude" may be a matter of great importance. The employers, of course, decide on personnel selection. The psychologists, if consulted, are responsible for the proper use of psychological measures.
Abstract:
Near infrared spectroscopy (NIRS) can be used for the on-line, non-invasive assessment of fruit for eating quality attributes such as total soluble solids (TSS). The robustness of multivariate calibration models, based on NIRS in a partial transmittance optical geometry, for the assessment of TSS of intact rockmelons (Cucumis melo) was assessed. The mesocarp TSS was highest around the fruit equator and increased towards the seed cavity. Inner mesocarp TSS levels decreased towards both the proximal and distal ends of the fruit, but more so towards the proximal end. The equatorial region of the fruit was chosen as representative of the fruit for near infrared assessment of TSS. The spectral window for model development was optimised at 695-1045 nm, and the data pre-treatment procedure was optimised to second-derivative absorbance without scatter correction. The 'global' modified partial least squares (MPLS) regression modelling procedure of WINISI (ver. 1.04) was found to be superior with respect to root mean squared error of prediction (RMSEP) and bias for model predictions of TSS across seasons, compared with the 'local' MPLS regression procedure. Updating of the model with samples selected randomly from the independent validation population demonstrated improvement in both RMSEP and bias with addition of approximately 15 samples.
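The two validation statistics named above, RMSEP and bias, can be computed from paired predicted and reference values as sketched below. The sign convention for bias (predicted minus reference) is an assumption, since conventions differ between software packages, and the TSS values are invented for illustration rather than taken from the study.

```python
from math import sqrt


def bias(predicted, reference):
    """Mean signed error; positive means the model over-predicts on average."""
    return sum(p - r for p, r in zip(predicted, reference)) / len(reference)


def rmsep(predicted, reference):
    """Root mean squared error of prediction over a validation set."""
    squared = sum((p - r) ** 2 for p, r in zip(predicted, reference))
    return sqrt(squared / len(reference))


# Hypothetical TSS values (percent) for four validation fruit.
reference_tss = [10.0, 11.0, 12.0, 13.0]
predicted_tss = [10.5, 11.5, 12.5, 13.5]

validation_bias = bias(predicted_tss, reference_tss)
validation_rmsep = rmsep(predicted_tss, reference_tss)
```

In this made-up case every prediction is 0.5 too high, so bias and RMSEP are both 0.5. When the two statistics are this close, most of the error is a systematic offset of the kind that adding a few samples from the new population to the calibration set is meant to reduce.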
Abstract:
This paper studies the problem of selecting users in an online social network for targeted advertising so as to maximize the adoption of a given product. In previous work, two families of models have been considered to address this problem: direct targeting and network-based targeting. The former approach targets users with the highest propensity to adopt the product, while the latter approach targets users with the highest influence potential – that is users whose adoption is most likely to be followed by subsequent adoptions by peers. This paper proposes a hybrid approach that combines a notion of propensity and a notion of influence into a single utility function. We show that targeting a fixed number of high-utility users results in more adoptions than targeting either highly influential users or users with high propensity.
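One way to picture a single utility that combines propensity and influence is a weighted blend of the two scores, followed by selection of the k highest-utility users. The linear form, the weight `alpha`, the field names and the user scores below are all assumptions for illustration; the abstract does not state the functional form the paper actually uses.

```python
def hybrid_utility(propensity, influence, alpha=0.5):
    """Assumed linear blend: alpha weights propensity, (1 - alpha) weights influence."""
    return alpha * propensity + (1 - alpha) * influence


def select_targets(users, k, alpha=0.5):
    """Return the ids of the k users with the highest blended utility."""
    ranked = sorted(
        users,
        key=lambda u: hybrid_utility(u["propensity"], u["influence"], alpha),
        reverse=True,
    )
    return [u["id"] for u in ranked[:k]]


# Hypothetical users with scores in [0, 1].
users = [
    {"id": "a", "propensity": 0.9, "influence": 0.1},
    {"id": "b", "propensity": 0.2, "influence": 0.9},
    {"id": "c", "propensity": 0.6, "influence": 0.6},
    {"id": "d", "propensity": 0.1, "influence": 0.2},
]

hybrid_pick = select_targets(users, k=2, alpha=0.5)      # blended ranking
propensity_pick = select_targets(users, k=2, alpha=1.0)  # direct targeting only
influence_pick = select_targets(users, k=2, alpha=0.0)   # network-based targeting only
```

Setting `alpha` to 1 or 0 recovers the two pure strategies, which makes it easy to compare the selected sets. Here the blended ranking puts user "c", who is moderately strong on both scores, ahead of the specialists "a" and "b".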
Abstract:
Recent advances in neural language models have contributed new methods for learning distributed vector representations of words (also called word embeddings). Two such methods are the continuous bag-of-words model and the skipgram model. These methods have been shown to produce embeddings that capture higher order relationships between words that are highly effective in natural language processing tasks involving the use of word similarity and word analogy. Despite these promising results, there has been little analysis of the use of these word embeddings for retrieval. Motivated by these observations, in this paper, we set out to determine how these word embeddings can be used within a retrieval model and what the benefit might be. To this aim, we use neural word embeddings within the well known translation language model for information retrieval. This language model captures implicit semantic relations between the words in queries and those in relevant documents, thus producing more accurate estimations of document relevance. The word embeddings used to estimate neural language models produce translations that differ from previous translation language model approaches; differences that deliver improvements in retrieval effectiveness. The models are robust to choices made in building word embeddings and, even more so, our results show that embeddings do not even need to be produced from the same corpus being used for retrieval.
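A translation language model scores a query term q against a document d as the sum over document words w of P_t(q | w) · P(w | d), so a document can match a query term it never contains. The sketch below estimates P_t from cosine similarity between word vectors, normalised over the vocabulary. That estimator, the clipping of negative similarities to zero, the tiny 2-D vectors and the absence of collection smoothing are assumptions for illustration, not details confirmed by the abstract.

```python
from math import log, sqrt

# Hypothetical 2-D "embeddings"; real ones would come from a trained model.
EMB = {
    "car": (1.0, 0.1),
    "automobile": (0.9, 0.2),
    "banana": (0.1, 1.0),
    "fruit": (0.2, 0.9),
}


def cos(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))


def p_translate(target, source):
    """P_t(target | source): similarity to `source`, normalised over the vocabulary."""
    total = sum(max(cos(EMB[v], EMB[source]), 0.0) for v in EMB)
    return max(cos(EMB[target], EMB[source]), 0.0) / total


def query_log_likelihood(query, doc):
    """log P(query | doc) under the translation model, without smoothing."""
    score = 0.0
    for q in query:
        p = sum(p_translate(q, w) * doc.count(w) / len(doc) for w in set(doc))
        score += log(p)
    return score


query = ["car"]
score_auto = query_log_likelihood(query, ["automobile"])
score_fruit = query_log_likelihood(query, ["fruit"])
```

Neither document contains "car", yet the one containing "automobile" scores higher because its word vector lies close to the query term's.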
Abstract:
The paper presents, in three parts, a new approach to improve the detection and tracking performance of a track-while-scan radar. Part 1 presents a review of the current status of the subject. Part 2 details the new approach. It shows how a priori information provided by the tracker can be used to improve detection. It also presents a new multitarget tracking algorithm. In the present Part, analytical derivations are presented for assessing, a priori, the performance of the TWS radar system. True track initiation, false track initiation, true track continuation and false track deletion characteristics have been studied. It indicates how the various thresholds can be chosen by the designer to optimise performance. Simulation results are also presented.
Abstract:
BACKGROUND: The inability to consistently guarantee internal quality of horticulture produce is of major importance to the primary producer, marketers and ultimately the consumer. Currently, commercial avocado maturity estimation is based on the destructive assessment of percentage dry matter (%DM), and sometimes percentage oil, both of which are highly correlated with maturity. In this study the utility of Fourier transform (FT) near-infrared spectroscopy (NIRS) was investigated for the first time as a non-invasive technique for estimating %DM of whole intact 'Hass' avocado fruit. Partial least squares regression models were developed from the diffuse reflectance spectra to predict %DM, taking into account effects of intra-seasonal variation and orchard conditions. RESULTS: It was found that combining three harvests (early, mid and late) from a single farm in the major production district of central Queensland yielded a predictive model for %DM with a coefficient of determination for the validation set of 0.76 and a root mean square error of prediction of 1.53% for DM in the range 19.4-34.2%. CONCLUSION: The results of the study indicate the potential of FT-NIRS in diffuse reflectance mode to non-invasively predict %DM of whole 'Hass' avocado fruit. When the FT-NIRS system was assessed on whole avocados, the results compared favourably against data from other NIRS systems identified in the literature that have been used in research applications on avocados.
Abstract:
In this paper, a refined classic noise prediction method based on the VISSIM and FHWA noise prediction model is formulated to analyze the sound level contributed by traffic on the Nanjing Lukou airport connecting freeway before and after widening. The aim of this research is to (i) assess the traffic noise impact on the Nanjing University of Aeronautics and Astronautics (NUAA) campus before and after freeway widening, (ii) compare the prediction results with field data to test the accuracy of this method, (iii) analyze the relationship between traffic characteristics and sound level. The results indicate that the mean difference between model predictions and field measurements is acceptable. The traffic composition impact study indicates that buses (including mid-sized trucks) and heavy goods vehicles contribute a significant proportion of total noise power despite their low traffic volume. In addition, speed analysis offers an explanation for the minor differences in noise level across time periods. Future work will aim at reducing model error, by focusing on noise barrier analysis using the FEM/BEM method and modifying the vehicle noise emission equation by conducting field experimentation.
Abstract:
A quantitative expression has been obtained for the equivalent resistance of an internal short in rechargeable cells under constant voltage charging.