870 results for Classifier Generalization Ability
Abstract:
Speech and language ability is not a unitary concept; rather, it is made up of multiple abilities such as grammar, articulation and vocabulary. Young children from socio-economically deprived areas are more likely to experience language difficulties than those living in more affluent areas. However, less is known about individual differences in language difficulties amongst young children from socio-economically deprived backgrounds. The present research examined 172 four-year-old children from socio-economically deprived areas on standardised measures of core language, receptive vocabulary, articulation, information conveyed and grammar. Of the total sample, 26% had difficulty in at least one area of language. While most children with speech and language difficulty had generally low performance in all areas, around one in 10 displayed more uneven language abilities. For example, some children had generally good speech and language ability, but had specific difficulty with grammar. In such cases their difficulty is masked somewhat by good overall performance on language tests, but they could still benefit from intervention in a specific area. The analysis also identified a number of typically achieving children with borderline speech and language difficulty who should be closely monitored.
Abstract:
Metagenomic studies use high-throughput sequence data to investigate microbial communities in situ. However, considerable challenges remain in the analysis of these data, particularly with regard to speed and reliable analysis of microbial species as opposed to higher level taxa such as phyla. We here present Genometa, a computationally undemanding graphical user interface program that enables identification of bacterial species and gene content from datasets generated by inexpensive high-throughput short read sequencing technologies. Our approach was first verified on two simulated metagenomic short read datasets, detecting 100% and 94% of the bacterial species included with few false positives or false negatives. Subsequent comparative benchmarking analysis against three popular metagenomic algorithms on an Illumina human gut dataset revealed Genometa to attribute the most reads to bacteria at species level (i.e. including all strains of that species) and demonstrate similar or better accuracy than the other programs. Lastly, speed was demonstrated to be many times that of BLAST due to the use of modern short read aligners. Our method is highly accurate if bacteria in the sample are represented by genomes in the reference sequence but cannot find species absent from the reference. This method is one of the most user-friendly and resource efficient approaches and is thus feasible for rapidly analysing millions of short reads on a personal computer.
Abstract:
As the spatial resolution of satellite optical sensors increases, new strategies must be developed to classify remote sensing images. The abundance of detail in these images sharply reduces the effectiveness of spectral classification, and many textural classification methods, notably statistical approaches, are no longer suitable. Structural approaches, by contrast, offer a promising alternative: these object-oriented approaches study the structure of the image in order to interpret its meaning. An algorithm of this type is proposed in the first part of this thesis. Based on the detection and analysis of keypoints (KPC: KeyPoint-based Classification), it offers an effective solution to the problem of classifying images with very high spatial resolution. The classifications performed on the data show in particular its ability to differentiate visually similar textures. Furthermore, the literature has shown that evidential fusion, based on Dempster-Shafer theory, is well suited to remote sensing images because of its ability to incorporate concepts such as ambiguity and uncertainty. However, few studies have applied this theory to complex textural data such as those produced by structural classifications. The second part of this thesis aims to fill this gap by addressing the fusion of multi-scale KPC classifications using Dempster-Shafer theory. The tests carried out show that this multi-scale approach improves the final classification when the initial image is of low quality. In addition, the study highlights the potential improvement offered by estimating the reliability of the intermediate classifications, and suggests avenues for carrying out these estimates.
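The fusion step described in this abstract can be illustrated with a minimal sketch, assuming each scale's classification is expressed as a mass function over sets of class labels. The class names, mass values and reliability coefficient below are hypothetical and are not taken from the thesis. `combine` applies Dempster's rule of combination, and `discount` applies Shafer's classical discounting, which is only one possible way to account for the reliability of an intermediate classification.

```python
from itertools import product

def combine(m1, m2):
    """Dempster's rule: fuse two mass functions (frozenset of labels -> mass)."""
    fused = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            # Agreeing evidence supports the intersection of the two focal sets.
            fused[common] = fused.get(common, 0.0) + mass_a * mass_b
        else:
            # Disjoint focal sets contribute to the conflict mass.
            conflict += mass_a * mass_b
    if 1.0 - conflict <= 1e-12:
        raise ValueError("sources are in total conflict; rule is undefined")
    return {focal: mass / (1.0 - conflict) for focal, mass in fused.items()}

def discount(m, reliability, frame):
    """Shafer discounting: move (1 - reliability) of the mass to the whole frame."""
    out = {f: reliability * mass for f, mass in m.items() if f != frame}
    out[frame] = 1.0 - reliability + reliability * m.get(frame, 0.0)
    return out

# Hypothetical two-class frame and two per-scale classifications of one pixel.
FOREST = frozenset({"forest"})
URBAN = frozenset({"urban"})
FRAME = FOREST | URBAN  # mass on FRAME encodes ignorance

coarse = {FOREST: 0.6, FRAME: 0.4}
fine = {FOREST: 0.5, URBAN: 0.3, FRAME: 0.2}

fused = combine(coarse, fine)
fused_discounted = combine(coarse, discount(fine, 0.5, FRAME))
```

Mass assigned to the full frame represents ignorance, which is how the theory lets an unreliable scale contribute less without being discarded.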
Abstract:
This empirical study compares the ability of unrestricted vector autoregressive (VAR) models to forecast the term structure of interest rates in Colombia. Simple VAR models are compared with VAR models augmented with Colombian and US macroeconomic and financial factors. We find that including information on oil prices, Colombian credit risk and an international indicator of risk aversion improves the out-of-sample forecasting ability of unrestricted VAR models for short-term maturities at monthly frequency. For medium- and long-term maturities, the models without macroeconomic variables produce better forecasts, suggesting that medium- and long-term yield curves already incorporate all the information relevant to forecasting them. This finding has important implications for portfolio managers, market participants and policymakers.
Abstract:
Second language (L2) learning outcomes may depend on the structure of the input and learners’ cognitive abilities. This study tested whether less predictable input might facilitate learning and generalization of L2 morphology while evaluating contributions of statistical learning ability, nonverbal intelligence, phonological short-term memory, and verbal working memory. Over three sessions, 54 adults were exposed to a Russian case-marking paradigm with a balanced or skewed item distribution in the input. Whereas statistical learning ability and nonverbal intelligence predicted learning of trained items, only nonverbal intelligence also predicted generalization of case-marking inflections to new vocabulary. Neither measure of temporary storage capacity predicted learning. Balanced, less predictable input was associated with higher accuracy in generalization but only in the initial test session. These results suggest that individual differences in pattern extraction play a more sustained role in L2 acquisition than instructional manipulations that vary the predictability of lexical items in the input.
Abstract:
Volunteer organizations operate in a challenging environment and their management practices toward volunteers have become increasingly influenced by the private sector. This case study explores the impact of brand heritage on the experience of volunteering in such managed environments. We use data from the U.K. Scouts to show that brand heritage has a positive bearing on the level of engagement volunteers experience and on their reported attitude to the way(s) in which they are managed within the volunteer organization. We then use these findings to establish the salience of brand heritage to both long established and recently formed organizations, extending current volunteer management theory; consequently, we suggest volunteer managers utilize the power of brand heritage through unlocking its ability to retain engaged and satisfied volunteers.
Abstract:
Scheduling problems are generally NP-hard combinatorial problems, and a lot of research has been done to solve these problems heuristically. However, most of the previous approaches are problem-specific and research into the development of a general scheduling algorithm is still in its infancy. Mimicking the natural evolutionary process of the survival of the fittest, Genetic Algorithms (GAs) have attracted much attention in solving difficult scheduling problems in recent years. Some obstacles exist when using GAs: there is no canonical mechanism to deal with constraints, which are commonly met in most real-world scheduling problems, and small changes to a solution are difficult. To overcome both difficulties, indirect approaches have been presented (in [1] and [2]) for nurse scheduling and driver scheduling, where GAs are used by mapping the solution space, and separate decoding routines then build solutions to the original problem. In our previous indirect GAs, learning is implicit and is restricted to the efficient adjustment of weights for a set of rules that are used to construct schedules. The major limitation of those approaches is that they learn in a non-human way: like most existing construction algorithms, once the best weight combination is found, the rules used in the construction process are fixed at each iteration. However, normally a long sequence of moves is needed to construct a schedule and using fixed rules at each move is thus unreasonable and not coherent with human learning processes. When a human scheduler is working, he normally builds a schedule step by step following a set of rules. After much practice, the scheduler gradually masters the knowledge of which solution parts go well with others. He can identify good parts and is aware of the solution quality even if the scheduling process is not completed yet, thus having the ability to finish a schedule by using flexible, rather than fixed, rules.
In this research we intend to design more human-like scheduling algorithms, by using ideas derived from Bayesian Optimization Algorithms (BOA) and Learning Classifier Systems (LCS) to implement explicit learning from past solutions. BOA can be applied to learn to identify good partial solutions and to complete them by building a Bayesian network of the joint distribution of solutions [3]. A Bayesian network is a directed acyclic graph with each node corresponding to one variable, and each variable corresponding to an individual rule by which a schedule will be constructed step by step. The conditional probabilities are computed according to an initial set of promising solutions. Subsequently, each new instance for each node is generated by using the corresponding conditional probabilities, until values for all nodes have been generated. Another set of rule strings will be generated in this way, some of which will replace previous strings based on fitness selection. If stopping conditions are not met, the Bayesian network is updated again using the current set of good rule strings. The algorithm thereby tries to explicitly identify and mix promising building blocks. It should be noted that for most scheduling problems the structure of the network model is known and all the variables are fully observed. In this case, the goal of learning is to find the rule values that maximize the likelihood of the training data. Thus learning can amount to 'counting' in the case of multinomial distributions. In the LCS approach, each rule has its strength showing its current usefulness in the system, and this strength is constantly assessed [4]. To implement sophisticated learning based on previous solutions, an improved LCS-based algorithm is designed, which consists of the following three steps. The initialization step is to assign each rule at each stage a constant initial strength. Then rules are selected by using the Roulette Wheel strategy.
The next step is to reinforce the strengths of the rules used in the previous solution, keeping the strength of unused rules unchanged. The selection step is to select fitter rules for the next generation. It is envisaged that the LCS part of the algorithm will be used as a hill climber to the BOA algorithm. This is exciting and ambitious research, which might provide the stepping-stone for a new class of scheduling algorithms. Data sets from nurse scheduling and mall problems will be used as test-beds. It is envisaged that once the concept has been proven successful, it will be implemented into general scheduling algorithms. It is also hoped that this research will give some preliminary answers about how to include human-like learning into scheduling algorithms and may therefore be of interest to researchers and practitioners in areas of scheduling and evolutionary computation.
References
1. Aickelin, U. and Dowsland, K. (2003) 'Indirect Genetic Algorithm for a Nurse Scheduling Problem', Computers & Operations Research (in print).
2. Li, J. and Kwan, R.S.K. (2003) 'Fuzzy Genetic Algorithm for Driver Scheduling', European Journal of Operational Research 147(2), pp 334-344.
3. Pelikan, M., Goldberg, D. and Cantu-Paz, E. (1999) 'BOA: The Bayesian Optimization Algorithm', IlliGAL Report No 99003, University of Illinois.
4. Wilson, S. (1994) 'ZCS: A Zeroth-level Classifier System', Evolutionary Computation 2(1), pp 1-18.
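The remark in this abstract that learning "can amount to 'counting'" when the network structure is known can be illustrated with a minimal sketch. It assumes, purely for illustration, a chain-shaped network in which the rule chosen at each construction step depends only on the rule chosen at the previous step; the abstract does not state the structure. The rule strings, the smoothing constant and the function names are hypothetical, and the fitness-based replacement step is omitted.

```python
import random
from collections import Counter, defaultdict

def learn_chain(rule_strings, n_rules, smoothing=1.0):
    """Estimate P(rule_0) and P(rule_i | rule_{i-1}) by counting occurrences."""
    length = len(rule_strings[0])
    first = Counter(s[0] for s in rule_strings)
    trans = [defaultdict(Counter) for _ in range(length - 1)]
    for s in rule_strings:
        for i in range(1, length):
            trans[i - 1][s[i - 1]][s[i]] += 1

    def normalise(counts):
        # Laplace smoothing (an assumption) keeps unseen rules possible.
        total = sum(counts[r] for r in range(n_rules)) + smoothing * n_rules
        return [(counts[r] + smoothing) / total for r in range(n_rules)]

    p_first = normalise(first)
    p_trans = [{prev: normalise(t[prev]) for prev in range(n_rules)}
               for t in trans]
    return p_first, p_trans

def sample_string(p_first, p_trans, rng):
    """Generate a new rule string node by node from the learned probabilities."""
    rules = list(range(len(p_first)))
    string = [rng.choices(rules, weights=p_first)[0]]
    for table in p_trans:
        string.append(rng.choices(rules, weights=table[string[-1]])[0])
    return string

# Hypothetical set of promising rule strings (two rules, four steps each).
promising = [[0, 1, 0, 1], [0, 1, 0, 1], [0, 1, 1, 1]]
p_first, p_trans = learn_chain(promising, n_rules=2)
new_string = sample_string(p_first, p_trans, random.Random(0))
```

In a full loop, the sampled strings would be decoded into schedules, evaluated, and used to replace weaker strings before the probabilities are re-estimated.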
Abstract:
Two ideas taken from Bayesian optimization and classifier systems are presented for personnel scheduling based on choosing a suitable scheduling rule from a set for each person's assignment. Unlike our previous work of using genetic algorithms whose learning is implicit, the learning in both approaches is explicit, i.e. we are able to identify building blocks directly. To achieve this target, the Bayesian optimization algorithm builds a Bayesian network of the joint probability distribution of the rules used to construct solutions, while the adapted classifier system assigns each rule a strength value that is constantly updated according to its usefulness in the current situation. Computational results from 52 real data instances of nurse scheduling demonstrate the success of both approaches. It is also suggested that the learning mechanism in the proposed approaches might be suitable for other scheduling problems.
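The strength-based rule choice in the adapted classifier system can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the number of rules and stages, the initial strength, the reward value and the unconditional reinforcement are all hypothetical. Each assignment stage keeps one strength per rule, a rule is drawn with probability proportional to its strength (roulette wheel), and only the rules that were used are reinforced.

```python
import random

def roulette_select(strengths, rng):
    """Pick a rule index with probability proportional to its strength."""
    pick = rng.uniform(0.0, sum(strengths))
    cumulative = 0.0
    for rule, strength in enumerate(strengths):
        cumulative += strength
        if pick < cumulative:
            return rule
    return len(strengths) - 1  # guard against floating-point round-off

def build_solution(strength_table, rng):
    """Choose one rule per stage (e.g. per person's assignment)."""
    return [roulette_select(row, rng) for row in strength_table]

def reinforce(strength_table, solution, reward):
    """Raise the strength of each used rule; unused rules stay unchanged."""
    for stage, rule in enumerate(solution):
        strength_table[stage][rule] += reward

# Hypothetical setup: 4 stages, 3 candidate rules, constant initial strength.
rng = random.Random(42)
strengths = [[1.0, 1.0, 1.0] for _ in range(4)]
solution = build_solution(strengths, rng)
reinforce(strengths, solution, reward=0.5)
```

A practical variant would presumably tie the reward to the quality of the schedule that the selected rules produce.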
Abstract:
Modern software application testing, such as the testing of software driven by graphical user interfaces (GUIs) or leveraging event-driven architectures in general, requires paying careful attention to context. Model-based testing (MBT) approaches first acquire a model of an application, then use the model to construct test cases covering relevant contexts. A major shortcoming of state-of-the-art automated model-based testing is that many test cases proposed by the model are not actually executable. These infeasible test cases threaten the integrity of the entire model-based suite, and any coverage of contexts the suite aims to provide. In this research, I develop and evaluate a novel approach for classifying the feasibility of test cases. I identify a set of pertinent features for the classifier, and develop novel methods for extracting these features from the outputs of MBT tools. I use a supervised logistic regression approach to obtain a model of test case feasibility from a randomly selected training suite of test cases. I evaluate this approach with a set of experiments. The outcomes of this investigation are as follows: I confirm that infeasibility is prevalent in MBT, even for test suites designed to cover a relatively small number of unique contexts. I confirm that the frequency of infeasibility varies widely across applications. I develop and train a binary classifier for feasibility with average overall error, false positive, and false negative rates under 5%. I find that unique event IDs are key features of the feasibility classifier, while model-specific event types are not. I construct three types of features from the event IDs associated with test cases, and evaluate the relative effectiveness of each within the classifier.
To support this study, I also develop a number of tools and infrastructure components for scalable execution of automated jobs, which use state-of-the-art container and continuous integration technologies to enable parallel test execution and the persistence of all experimental artifacts.
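The idea of a feasibility classifier built on event-ID features can be sketched as follows. This is a toy illustration only: the event IDs, the labelled test cases, the binary presence encoding and the training settings are hypothetical, and the abstract does not say which three feature types or which solver were used. The sketch fits a logistic regression with plain stochastic gradient descent on vectors that record which event IDs a test case contains.

```python
import math

def featurize(test_case, vocabulary):
    """Binary vector: 1.0 if the event ID occurs in the test case, else 0.0."""
    present = set(test_case)
    return [1.0 if event in present else 0.0 for event in vocabulary]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def score(weights, bias, x):
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

def train(X, y, epochs=1000, lr=0.5):
    """Fit logistic regression by stochastic gradient descent on log-loss."""
    weights, bias = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, label in zip(X, y):
            error = score(weights, bias, x) - label
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias

def predict(weights, bias, x):
    """True means the test case is predicted to be feasible."""
    return score(weights, bias, x) >= 0.5

# Toy training suite: a case is infeasible (0) whenever event "e3" occurs.
VOCAB = ["e1", "e2", "e3", "e4"]
suite = [(("e1", "e2"), 1), (("e1", "e3"), 0), (("e2", "e4"), 1),
         (("e3", "e4"), 0), (("e1", "e4"), 1), (("e2", "e3"), 0)]
X = [featurize(case, VOCAB) for case, _ in suite]
y = [label for _, label in suite]
weights, bias = train(X, y)
```

On real MBT output the training suite would be a random sample of executed test cases, and error rates would be measured on held-out cases rather than on the training data.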
Abstract:
Carotenoids are currently valuable bioactive molecules for several industries, such as the chemical, pharmaceutical, food and cosmetics industries, owing to their multiple benefits as natural colorants, antioxidants and vitamin precursors. The increasing interest in these high added-value products has therefore led to a search for more cost-effective, higher-yield alternatives for their industrial production. Microbial metabolism offers a promising option for carotenoid production. Here we show the potential of the dibenzothiophene-desulfurizing bacterium Gordonia alkanivorans strain 1B as a high carotenoid-producing microorganism. The novel carotenoids, produced under different culture conditions, were extracted with DMSO and then analyzed by both spectrophotometry and HPLC. When grown in glucose-sulfate-light, strain 1B was able to achieve 2015 µg carotenoids per g DCW in shake-flask assays, with about 60% corresponding to lutein, canthaxanthin and astaxanthin. Further optimization studies open a new focus of research aiming to obtain a hyperpigment-producing strain that may be applied in different industrial sectors.
Abstract:
In the present studies I investigated whether college students’ perceptions of effort source influenced their perceptions of the relation between levels of their own effort and ability in mathematics. In Study 1 (N = 210), I found using hypothetical vignettes that perceptions of task-elicited effort (i.e., effort that arises due to the subjective difficulty or ease of the task) led to perceptions of an inverse relation between one’s effort and ability, and perceptions of self-initiated effort (i.e., effort that arises due to one’s own motivation or lack of motivation) led to perceptions of a positive relation between one’s effort and ability, consistent with my hypotheses and prior research. In Study 2 (N = 160), participants completed an academic task and I used open-ended questions to manipulate their perceptions of effort source. I found that participants in the task-elicited condition endorsed no overall relation between effort and ability, and participants in the self-initiated condition endorsed an overall inverse relation, which is inconsistent with my hypotheses and prior research. Possible explanations for the findings, as well as broader theoretical and educational implications are discussed.
Abstract:
Prior research shows that both cognitive ability (Schmidt & Hunter, 1998) and personality measures (Poropat, 2009; Hough & Furnham, 2003) are valid predictors of job performance. The dynamic nature of the relationships between cognitive ability and personality measures with performance over time spent on the job is less understood and thus this paper explores their relationships. Although there is much research to suggest that the predictive relationship between cognitive ability and performance decreases over years of tenure (e.g., Hulin, Henry, & Noon, 1990), other research suggests that the relationship between cognitive ability and performance will increase over time (Kolz, McFarland, & Silverman, 1988). In regard to personality, this study provides a critical test of two competing theories. The first position holds that the validity of personality degrades over time. Support for this position comes from the “ubiquitous” nature of the simplex pattern in individual differences (Humphreys, 1985). It follows that personality validities should perform like cognitive ability in this respect, and thus decline over time. In contrast to this viewpoint, the alternative position contends that the predictive relationship between personality variables and performance increases over time, with the correlation becoming larger in magnitude and more positive in direction over years of tenure. The results of this study support the latter position; personality validities predicted long term performance outcomes.