20 results for Large amounts
Abstract:
Relatively little research on dialect variation has been based on corpora of naturally occurring language. Instead, dialect variation has been studied based primarily on language elicited through questionnaires and interviews. Eliciting dialect data has several advantages, including allowing dialectologists to select individual informants, control the communicative situation in which language is collected, elicit rare forms directly, and make high-quality audio recordings. Although far less common, a corpus-based approach to data collection also has several advantages, including allowing dialectologists to collect large amounts of data from a large number of informants, observe dialect variation across a range of communicative situations, and analyze quantitative linguistic variation in large samples of natural language. Although both approaches allow dialect variation to be observed, they provide different perspectives on language variation and change. The corpus-based approach to dialectology has therefore produced a number of new findings, many of which challenge traditional assumptions about the nature of dialect variation. Most important, this research has shown that dialect variation involves a wider range of linguistic variables and exists across a wider range of language varieties than has previously been assumed. The goal of this chapter is to introduce this emerging approach to dialectology. The first part of this chapter reviews the growing body of research that analyzes dialect variation in corpora, including research on variation across nations, regions, genders, ages, and classes, in both speech and writing, and from both a synchronic and diachronic perspective, with a focus on dialect variation in the English language. Although collections of language data elicited through interviews and questionnaires are now commonly referred to as corpora in sociolinguistics and dialectology (e.g. see Bauer 2002; Tagliamonte 2006; Kretzschmar et al. 2006; D'Arcy 2011), this review focuses on corpora of naturally occurring texts and discourse. The second part of this chapter presents the results of an analysis of variation in "not" contraction across region, gender, and time in a corpus of American English letters to the editor in order to exemplify a corpus-based approach to dialectology.
Abstract:
Agriculture accounts for ~70% of freshwater usage worldwide. Seawater desalination alone cannot meet the growing needs for irrigation and food production, particularly in hot, desert environments. Greenhouse cultivation of high-value crops uses just a fraction of the freshwater per unit of food produced when compared with open-field cultivation. However, desert greenhouse producers face three main challenges: freshwater supply, plant nutrient supply, and cooling of the greenhouse. The common practice of evaporative cooling for greenhouses consumes large amounts of freshwater. In Saudi Arabia, the most common greenhouse cooling schemes are freshwater-based evaporative cooling, often using fossil groundwater or energy-intensive desalinated water, and traditional refrigeration-based direct expansion cooling, largely powered by the burning of fossil fuels. The coastal deserts have ambient conditions that are seasonally too humid to support adequate evaporative cooling, necessitating additional energy consumption in the dehumidification process of refrigeration-based cooling. This project evaluates the use of a combined-system liquid desiccant dehumidifier and membrane distillation unit that can meet the dual needs of cooling and freshwater supply for a greenhouse in a hot and humid environment.
Abstract:
Adjuvants are substances that boost the protective immune response to vaccine antigens. The majority of known adjuvants have been identified through the use of empirical approaches. Our aim was to identify novel adjuvants with well-defined cellular and molecular mechanisms by combining knowledge of immunoregulatory mechanisms with an in silico approach. CD4+CD25+FoxP3+ regulatory T cells (Tregs) inhibit the protective immune responses to vaccines by suppressing the activation of antigen-presenting cells such as dendritic cells (DCs). In this chapter, we describe the identification and functional validation of small molecule antagonists to CCR4, a chemokine receptor expressed on Tregs. CCR4 binds the chemokines CCL22 and CCL17, which are produced in large amounts by activated innate cells including DCs. Small molecule CCR4 antagonists identified in silico inhibited the migration of Tregs both in vitro and in vivo and, when combined with vaccine antigens, significantly enhanced protective immune responses in experimental models.
Abstract:
N-tuple recognition systems (RAMnets) are normally modeled using a small number of input lines to each RAM, because the address space grows exponentially with the number of inputs. It is impossible to implement an arbitrarily large address space as physical memory. But given modest amounts of training data, correspondingly modest numbers of bits will be set in that memory. Hash arrays can therefore be used instead of a direct implementation of the required address space. This paper describes some exploratory experiments using the hash array technique to investigate the performance of RAMnets with very large numbers of input lines. An argument is presented which concludes that performance should peak at a relatively small n-tuple size, but the experiments carried out so far contradict this. Further experiments are needed to confirm this unexpected result.
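The hash array idea can be illustrated with a minimal Python sketch. It is not the paper's implementation: the class name `HashedRamNet`, the random choice of input lines, and the use of Python sets as the hash arrays are all assumptions made for illustration. Each RAM stores only the addresses actually seen in training, so memory grows with the amount of training data rather than with 2^n.

```python
import random


class HashedRamNet:
    """Illustrative n-tuple classifier; each RAM is a sparse set of seen addresses."""

    def __init__(self, input_bits, n_tuple_size, num_rams, seed=0):
        rng = random.Random(seed)
        # Assumption: each RAM is wired to a fixed random tuple of input lines.
        self.tuples = [
            tuple(rng.randrange(input_bits) for _ in range(n_tuple_size))
            for _ in range(num_rams)
        ]
        # Maps class label -> list of sets (one hash array per RAM).
        self.memory = {}

    def _addresses(self, pattern):
        # A RAM's address is the tuple of bit values on its input lines.
        return [tuple(pattern[i] for i in lines) for lines in self.tuples]

    def train(self, pattern, label):
        rams = self.memory.setdefault(label, [set() for _ in self.tuples])
        for ram, address in zip(rams, self._addresses(pattern)):
            ram.add(address)  # "set the bit" only for addresses actually seen

    def scores(self, pattern):
        addresses = self._addresses(pattern)
        # Score per class = number of RAMs that have seen this address.
        return {
            label: sum(addr in ram for ram, addr in zip(rams, addresses))
            for label, rams in self.memory.items()
        }

    def classify(self, pattern):
        s = self.scores(pattern)
        return max(s, key=s.get)
```

With 32 input lines per RAM, a direct implementation would need 2^32 bits per RAM, whereas here each training pattern adds at most one entry per RAM.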
Abstract:
One of the main challenges of classifying clinical data is determining how to handle missing features. Most research favours imputing missing values or discarding records that include missing data, both of which can degrade accuracy when missing values exceed a certain level. In this research we propose a methodology to handle data sets with a large percentage of missing values and with high variability in which particular data are missing. Feature selection is effected by picking variables sequentially in order of maximum correlation with the dependent variable and minimum correlation with variables already selected. Classification models are generated individually for each test case based on its particular feature set and the matching data values available in the training population. The method was applied to real patients' anonymous mental-health data where the task was to predict the suicide risk judgement clinicians would give for each patient's data, with eleven possible outcome classes: zero to ten, representing no risk to maximum risk. The results compare favourably with alternative methods and have the advantage of ensuring explanations of risk are based only on the data given, not imputed data. This is important for clinical decision support systems using human expertise for modelling and explaining predictions.
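The sequential feature selection step can be sketched roughly as follows. The abstract does not say how the two correlation criteria are combined, so this sketch assumes a simple score: absolute Pearson correlation with the target minus the mean absolute correlation with the features already chosen. The function names `pearson` and `select_features` are hypothetical, and the sketch assumes complete numeric columns rather than the per-case handling of missing values described above.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no correlation information
    return cov / sqrt(vx * vy)


def select_features(columns, target, k):
    """Greedily pick up to k feature names from `columns` (name -> values).

    Assumed score: relevance to the target minus mean redundancy
    with the features already selected.
    """
    selected = []
    remaining = list(columns)

    def score(name):
        relevance = abs(pearson(columns[name], target))
        if not selected:
            return relevance
        redundancy = sum(
            abs(pearson(columns[name], columns[s])) for s in selected
        ) / len(selected)
        return relevance - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

Under this scoring, a near-duplicate of an already selected feature is penalised heavily, so a weaker but independent predictor is preferred over it.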