905 results for Reinforcement Learning, resource-constrained devices, iOS devices, on-device machine learning
Abstract:
Artificial Intelligence has been applied to dynamic games for many years. The ultimate goal is to create responses in virtual entities that display human-like reasoning in the definition of their behaviors. However, virtual entities that can be mistaken for real persons are still far from being achieved. This paper presents an adaptive-learning-based methodology for defining players’ profiles, with the purpose of supporting the decisions of virtual entities. The proposed methodology is based on reinforcement learning algorithms, which choose, over time and as experience is gathered, the most appropriate of a set of different learning approaches. These learning approaches are very distinct in nature, ranging from mathematical to artificial intelligence and data analysis methodologies, so that the methodology is prepared for very different situations. It is thus equipped with a variety of tools, each of which can be useful in a particular situation. The proposed methodology is tested first on two simpler computer-versus-human games: rock-paper-scissors and a penalty-shootout simulation. Finally, the methodology is applied to the definition of action profiles of electricity market players, who compete in a dynamic, game-like environment in which the main goal is to achieve the highest possible profits in the market.
Abstract:
We explore the finish-to-start precedence relations of project activities used in scheduling problems. From these relations, we devise a method to identify groups of activities that could execute concurrently, i.e., all activities in the same group can execute in parallel. The method derives a new set of relations that describes the concurrency. These relations are then represented as an undirected graph, and the groups are identified by solving the maximal cliques problem on that graph. We provide a running example using a project from our previous studies on resource-constrained project cost minimization, together with an example application of the concurrency detection method: the evaluation of resource stress.
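The abstract does not spell out how the concurrency relations are derived, so the following is only a minimal sketch under one assumption: two activities are treated as potentially concurrent when neither precedes the other, directly or transitively. The function names and the four-activity project are hypothetical, and the clique enumeration uses the basic Bron-Kerbosch algorithm rather than whatever procedure the authors used.

```python
from itertools import combinations


def transitive_closure(activities, precedes):
    """Return all (a, b) pairs where a must finish before b starts,
    either directly or through a chain of finish-to-start relations."""
    successors = {a: set() for a in activities}
    for a, b in precedes:
        successors[a].add(b)
    closure = set()
    for a in activities:
        stack, seen = list(successors[a]), set()
        while stack:
            b = stack.pop()
            if b not in seen:
                seen.add(b)
                stack.extend(successors[b])
        closure.update((a, b) for b in seen)
    return closure


def concurrency_graph(activities, precedes):
    """Assumption: link two activities when no precedence chain orders them."""
    closure = transitive_closure(activities, precedes)
    adj = {a: set() for a in activities}
    for a, b in combinations(activities, 2):
        if (a, b) not in closure and (b, a) not in closure:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques with Bron-Kerbosch (no pivoting)."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


# Hypothetical project: A precedes B and C, which both precede D.
activities = ["A", "B", "C", "D"]
precedes = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]
groups = maximal_cliques(concurrency_graph(activities, precedes))
```

In this small example only B and C are unordered with respect to each other, so they form the single group of size two, while A and D each form a group on their own.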
Abstract:
One of the authors (S.M.) acknowledges Direction des Relations Extérieures of Ecole Polytechnique for financial support.
Abstract:
This paper reports the microstructural analysis of S-rich CuIn(S,Se)2 layers produced by electrodeposition of CuInSe2 precursors and annealing under sulfurizing conditions, as a function of the sulfurization temperature. Characterization of the layers by Raman scattering, scanning electron microscopy, Auger electron spectroscopy, and XRD techniques revealed a strong dependence of the crystalline quality of these layers on the sulfurization temperature: higher sulfurization temperatures lead to films with improved crystallinity, larger average grain size, and lower density of structural defects. However, higher temperatures also favor the formation of a thicker MoS2 interphase layer between the CuInS2 absorber layer and the Mo back contact. Decreasing the sulfurization temperature leads to a significant decrease in the thickness of this intermediate layer and is accompanied by significant changes in the composition of the interface region between the absorber and the MoS2 layer, which becomes Cu rich. The characterization of devices fabricated with these absorbers corroborates the significant impact of all these features on device parameters, such as the open-circuit voltage and fill factor, that determine the efficiency of the solar cells.
Abstract:
Research was undertaken to define an appropriate level of use of traffic control devices on rural secondary roads that carry very low traffic volumes. The goal of this research was to improve the safety and efficiency of travel on the rural secondary road system. This goal was to be accomplished by providing County Engineers with guidance concerning the cost-effective use of traffic control devices on very low volume rural roads. A further objective was to define the range of traffic volumes on the roads for which the recommendations would be appropriate. Little previous research has been directed toward roads that carry very low traffic volumes. Consequently, the factual input for this research was developed by conducting an inventory of the signs and markings actually in use on 2,069 miles of rural road in Iowa. Most of these roads carried 15 or fewer vehicles per day. Additional input was provided by a survey of the opinions of County Engineers and Supervisors in Iowa. Data from both the inventory and the opinion survey indicated a considerable lack of uniformity in the application of signs on very low volume rural roads. The number of warning signs installed varied from 0.24 per mile to 3.85 per mile in the 21 counties in which the inventory was carried out. The use of specific signs not only varied quite widely among counties but also indicated a lack of uniform application within counties. County officials generally favored varying the elaborateness of signing depending upon the type of surface and the volume of traffic on different roads. Less elaborate signing would be installed on an unpaved road than on a paved road. The consensus opinion was that roads carrying fewer than 25 vehicles per day should have fewer signs than roads carrying higher volumes. Although roads carrying 0 to 24 vehicles per day constituted over 24% of the total rural secondary system, they carried less than 3% of the total travel on that system. 
Virtually all of these roads are classified as area service roads and would thus be expected to carry only short trips primarily by local motorists. Consequently, it was concluded that the need for warning signs rarely can be demonstrated on unpaved rural roads with traffic volumes of fewer than 25 vehicles per day. It is recommended that each county designate a portion of its roads as an Area Service Level B system. All road segments with very low traffic volumes should be considered for inclusion in this system. Roads included in this system may receive a lesser level of maintenance and a reduced level of signing. The county is also afforded protection from liability arising from accidents occurring on roads designated as part of an Area Service Level B system. A uniform absence of warning signs on roads of this nature is not expected to have any discernible effect on the safety or quality of service on these very low volume roads. The resources conserved may be expended more effectively to upgrade maintenance and traffic control on roads carrying higher volumes where the beneficial effect on highway safety and service will be much more consequential.
Abstract:
This research consisted of five laboratory experiments designed to address the following two objectives in an integrated analysis: (1) To discriminate between the symbol Stop Ahead warning sign and a small set of other signs (which included the word-legend Stop Ahead sign); and (2) To analyze sign detection, recognizability, and processing characteristics by drivers. A set of 16 signs was used in each of three experiments. A tachistoscope was used to display each sign image to a respondent for a brief interval in a controlled viewing experiment. The first experiment was designed to test detection of a sign in the driver's visual field; the second experiment was designed to test the driver's ability to recognize a given sign in the visual field; and the third experiment was designed to test the speed and accuracy of a driver's response to each sign as a command to perform a driving action. A fourth experiment tested the meanings drivers associated with an eight-sign subset of the 16 signs used in the first three experiments. A fifth experiment required all persons to select which (if any) signs they considered to be appropriate for use on two scale model county road intersections. The conclusions are that word-legend Stop Ahead signs are more effective driver communication devices than symbol stop-ahead signs; that it is helpful to drivers to have a word plate supplementing the symbol sign if a symbol sign is used; and that the guidance in the Manual on Uniform Traffic Control Devices on the placement of advance warning signs should not supplant engineering judgment in providing proper sign communication at an intersection.
Abstract:
When individuals learn by trial-and-error, they perform randomly chosen actions and then reinforce those actions that led to a high payoff. However, individuals do not always have to physically perform an action in order to evaluate its consequences. Rather, they may be able to mentally simulate actions and their consequences without actually performing them. Such fictitious learners can select actions with high payoffs without making long chains of trial-and-error learning. Here, we analyze the evolution of an n-dimensional cultural trait (or artifact) by learning, in a payoff landscape with a single optimum. We derive the stochastic learning dynamics of the distance to the optimum in trait space when choice between alternative artifacts follows the standard logit choice rule. We show that for both trial-and-error and fictitious learners, the learning dynamics stabilize at an approximate distance of root n/(2 lambda(e)) away from the optimum, where lambda(e) is an effective learning performance parameter depending on the learning rule under scrutiny. Individual learners are thus unlikely to reach the optimum when traits are complex (n large), and so face a barrier to further improvement of the artifact. We show, however, that this barrier can be significantly reduced in a large population of learners performing payoff-biased social learning, in which case lambda(e) becomes proportional to population size. Overall, our results illustrate the effects of errors in learning, levels of cognition, and population size for the evolution of complex cultural traits. (C) 2013 Elsevier Inc. All rights reserved.
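For readers unfamiliar with the logit choice rule mentioned above, the sketch below shows its usual textbook form, in which the probability of choosing an alternative is proportional to the exponential of its payoff scaled by a sensitivity parameter. The function name and the parameter `lam` are illustrative, and the exact parameterization used in the paper may differ.

```python
import math


def logit_choice_probabilities(payoffs, lam):
    """Standard logit (softmax) choice: P(i) is proportional to exp(lam * payoff_i).

    `lam` is a hypothetical sensitivity parameter; lam = 0 gives uniform choice,
    and larger values concentrate probability on the highest payoff.
    """
    top = max(payoffs)  # subtract the maximum for numerical stability
    weights = [math.exp(lam * (p - top)) for p in payoffs]
    total = sum(weights)
    return [w / total for w in weights]


probs = logit_choice_probabilities([1.0, 2.0], lam=1.0)
uniform = logit_choice_probabilities([1.0, 2.0], lam=0.0)
```

With a finite `lam` the lower-payoff alternative keeps a nonzero probability, which is the kind of choice error that prevents learners from settling exactly on the optimum.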
Abstract:
Efforts to improve safety and traffic flow through merge areas on high volume/high speed roadways have included early merge and late merge concepts and several studies of the effectiveness of these concepts, many using Intelligent Transportation Systems for implementation. The Iowa Department of Transportation (Iowa DOT) planned to employ a system of dynamic message signs (DMS) to enhance standard temporary traffic control for lane closures and traffic merges at two bridge construction projects in western Iowa (Adair and Cass counties) on I-80 during the 2008 construction season. To evaluate the DMS system’s effectiveness for impacting driver merging actions, the Iowa DOT contracted with Iowa State University’s Center for Transportation Research and Education to perform the evaluation and make recommendations for future use of this system based on the results. Data were collected over four weekends, beginning August 1–4 and ending October 16–20, 2008. Two weekends yielded sufficient data for evaluation, one of transition traffic flow and the other with a period of congestion. For both of these periods, a statistical review of collected data did not indicate a significant impact on driver merging actions when the DMS messaging was activated as compared to free flow conditions with no messaging. Collection of relevant project data proved to be problematic for several reasons. In addition to personnel safety issues associated with the placement and retrieval of counting devices on a high speed roadway, unsatisfactory equipment performance and insufficient congestion to activate the DMS messaging hampered efforts. A review of the collected data revealed that the tube counters produced different results from the older-model plate counters. 
Although variations were not significant from a practical standpoint, a statistical evaluation showed that the data, including volumes, speeds, and classifications from the two sources were not comparable at a 95% level of confidence. Comparison of data from the Iowa DOT’s automated traffic recorders (ATRs) in the area also suggested variations in results from these data collection systems. Additional comparison studies were recommended.
Abstract:
As a result of climate change, streams are warming and their runoff has been decreasing in most temperate areas. These changes can affect consumers directly by increasing their metabolic rates and modifying their physiology and indirectly by changing the quality of the resources on which organisms depend. In this study, a common stream detritivore (Echinogammarus berilloni Catta) was reared at two temperatures (15 and 20°C) and fed Populus nigra L. leaves that had been conditioned either in an intermittent or permanent reach to evaluate the effects of resource quality and increased temperatures on detritivore performance, stoichiometry and nutrient cycling. The lower quality (i.e., lower protein, soluble carbohydrates and higher C:P and N:P ratios) of leaves conditioned in pools resulted in compensatory feeding and lower nutrient retention capacity by E. berilloni. This effect was especially marked for phosphorus, which was unexpected based on predictions of ecological stoichiometry. When individuals were fed pool-conditioned leaves at warmer temperatures, their growth rates were higher, but consumers exhibited less efficient assimilation and higher mortality. Furthermore, the shifts to lower C:P ratios and higher lipid concentrations in shredder body tissues suggest that structural molecules such as phospholipids are preserved over other energetic C-rich macromolecules such as carbohydrates. These effects on consumer physiology and metabolism were further translated into feces and excreta nutrient ratios. Overall, our results show that the effects of reduced leaf quality on detritivore nutrient retention were more severe at higher temperatures because the shredders were not able to offset their increased metabolism with increased consumption or more efficient digestion when fed pool-conditioned leaves. 
Consequently, the synergistic effects of impaired food quality and increased temperatures might not only affect the physiology and survival of detritivores but also extend to other trophic compartments through detritivore-mediated nutrient cycling.
Abstract:
The IoT essentially consists of thousands of tiny sensor nodes connected to the internet, each of which executes its programmed functions under memory and power limitations. The sensor nodes are distributed mainly to gather data in various situations. The IoT envisions future technologies such as e-health, smart cities, automobile automation, construction site automation, and smart homes. Secure communication of data under memory and energy constraints is a major challenge in the IoT. Authentication is the first and an important phase of secure communication. This study presents a protocol for authenticating resource-constrained devices in physical proximity solely by using their shared wireless communication interfaces. This model of authentication relies only on the abundance of ambient radio signals and authenticates in less than a second. To evaluate the designed protocol, SkyMotes are emulated in a network environment simulated by Contiki/COOJA. The results presented in this study show that the approach is immune to passive and active attacks. An adversary located as near as two meters away can be identified in less than a second with minimal energy expenditure. Since the radio is the only hardware required for authentication, the technique is scalable and interoperable across the heterogeneous devices of the IoT.
Abstract:
The Feedback-Related Negativity (FRN) is thought to reflect the dopaminergic prediction error signal from the subcortical areas to the ACC (i.e., a bottom-up signal). Two studies were conducted in order to test a new model of FRN generation, which includes direct modulating influences of medial PFC (i.e., top-down signals) on the ACC at the time of the FRN. Study 1 examined the effects of one’s sense of control (top-down) and of informative cues (bottom-up) on the FRN measures. In Study 2, sense of control and instruction-based (top-down) and probability-based expectations (bottom-up) were manipulated to test the proposed model. The results suggest that any influences of medial PFC on the activity of the ACC that occur in the context of incentive tasks are not direct. The FRN was shown to be sensitive to salient stimulus characteristics. The results of this dissertation partially support the reinforcement learning theory, in that the FRN is a marker for prediction error signal from subcortical areas. However, the pattern of results outlined here suggests that prediction errors are based on salient stimulus characteristics and are not reward specific. A second goal of this dissertation was to examine whether ACC activity, measured through the FRN, is altered in individuals at-risk for problem-gambling behaviour (PG). Individuals in this group were more sensitive to the valence of the outcome in a gambling task compared to not at-risk individuals, suggesting that gambling contexts increase the sensitivity of the reward system to valence of the outcome in individuals at risk for PG. Furthermore, at-risk participants showed an increased sensitivity to reward characteristics and a decreased response to loss outcomes. This contrasts with those not at risk whose FRNs were sensitive to losses. 
As the results did not replicate previous research showing attenuated FRNs in pathological gamblers, it is likely that the size and time of the FRN does not change gradually with increasing risk of maladaptive behaviour. Instead, changes in ACC activity reflected by the FRN in general can be observed only after behaviour becomes clinically maladaptive or through comparison between different types of gain/loss outcomes.
Abstract:
The objective of this thesis is to present different applications of the distributed conditional computation research program. It is hoped that these applications, together with the theory presented here, will lead to a general solution to the problem of artificial intelligence, particularly with regard to the need for efficiency. The vision of distributed conditional computation is to accelerate the evaluation and training of deep models, which is very different from the usual objective of improving their generalization and optimization capacity. The work presented here is closely related to mixture-of-experts models. In Chapter 2, we present a new deep learning algorithm that uses a simple form of reinforcement learning on a neural-network-based decision tree model. We demonstrate the need for a balancing constraint to keep the distribution of examples to experts uniform and to prevent monopolies. To make the computation efficient, training and evaluation are constrained to be sparse by using a router that samples experts from a multinomial distribution given an example. In Chapter 3, we present a new deep model consisting of a sparse representation divided into segments of experts. A neural-network-based language model is built from the sparse transformations between these segments. The block-sparse operation is implemented for use on graphics cards. Its speed is compared with two dense operations of the same caliber to demonstrate the real computational gain that can be obtained. A deep model using sparse operations controlled by a router distinct from the experts is trained on a one-billion-word dataset. 
A new data partitioning algorithm is applied to a set of words to make the output layer of a language model hierarchical, thereby making it much more efficient. The work presented in this thesis is central to the vision of distributed conditional computation put forward by Yoshua Bengio. It attempts to apply research in the field of mixtures of experts to deep models in order to improve their speed as well as their optimization capacity. We believe that the theory and experiments of this thesis are an important step on the path to distributed conditional computation because they frame the problem well, especially with regard to the competitiveness of expert systems.
Abstract:
The main focus of this PhD thesis is the growth of III-V semiconductor nanostructures (quantum dots (QDs) and quantum dashes) on silicon substrates using the molecular beam epitaxy (MBE) technique. The investigation of the influence of the major growth parameters on their basic properties (density, geometry, composition, size, etc.) and the systematic characterization of their structural and optical properties are the core of the research work. The monolithic integration of III-V optoelectronic devices with silicon electronic circuits could bring enormous prospects for existing semiconductor technology. Our challenging approach is to combine the superior passive optical properties of silicon with the superior optical emission properties of III-V material by reducing the amount of III-V material to the very limit of the active region. Different heteroepitaxial integration approaches have been investigated to overcome the materials issues between III-V and Si. These include the self-assembled growth of InAs and InGaAs QDs in silicon and GaAs matrices directly on flat silicon substrates, the site-controlled growth of (GaAs/In0.15Ga0.85As/GaAs) QDs on pre-patterned Si substrates, and the direct growth of GaP on Si using migration enhanced epitaxy (MEE) and MBE growth modes. An efficient ex-situ buffered HF (BHF) and in-situ surface cleaning sequence, based on atomic hydrogen (AH) cleaning at 500 °C combined with thermal oxide desorption within a temperature range of 700-900 °C, has been established. The removal of the oxide was confirmed by semicircular streaky reflection high energy electron diffraction (RHEED) patterns indicating a 2D smooth surface construction prior to the MBE growth. The evolution of the size, density and shape of the QDs is characterized ex situ by atomic-force microscopy (AFM) and transmission electron microscopy (TEM). 
The InAs QD density increases strongly from 10^8 to 10^11 cm^-2 at V/III ratios in the range of 15-35 (beam equivalent pressure values). InAs QD formation is not observed at temperatures of 500 °C and above. Growth experiments on (111) substrates show orientation-dependent QD formation behaviour. A significant shape and size transition with elongated InAs quantum dots and dashes has been observed on the (111) orientation and at a higher indium growth rate of 0.3 ML/s. The 2D strain mapping derived from high-resolution TEM of InAs QDs embedded in a silicon matrix confirmed semi-coherent and fully relaxed QDs embedded in a defect-free silicon matrix. The strain is relaxed by dislocation loops exclusively localized along the InAs/Si interfaces and by partial dislocations with stacking faults inside the InAs clusters. The site-controlled growth of GaAs/In0.15Ga0.85As/GaAs nanostructures has been demonstrated for the first time with 1 μm spacing and very low nominal deposition thicknesses, directly on pre-patterned Si without the use of a SiO2 mask. A thin planar GaP layer was successfully grown through migration enhanced epitaxy (MEE) to initiate a planar GaP wetting layer at the polar/non-polar interface, which works as a virtual GaP substrate for the subsequent GaP MBE growth on the GaP-MEE layer, with a total thickness of 50 nm. The best root mean square (RMS) roughness value was 1.3 nm. These results are highly encouraging for the realization of III-V optical devices on silicon for potential applications.
Abstract:
Babies are born with simple manipulation capabilities such as reflexes to perceived stimuli. Initial discoveries by babies are accidental until they become coordinated and curious enough to actively investigate their surroundings. This thesis explores the development of such primitive learning systems using an embodied light-weight hand with three fingers and a thumb. It is self-contained having four motors and 36 exteroceptor and proprioceptor sensors controlled by an on-palm microcontroller. Primitive manipulation is learned from sensory inputs using competitive learning, back-propagation algorithm and reinforcement learning strategies. This hand will be used for a humanoid being developed at the MIT Artificial Intelligence Laboratory.
Abstract:
Recent developments in the area of reinforcement learning have yielded a number of new algorithms for the prediction and control of Markovian environments. These algorithms, including the TD(lambda) algorithm of Sutton (1988) and the Q-learning algorithm of Watkins (1989), can be motivated heuristically as approximations to dynamic programming (DP). In this paper we provide a rigorous proof of convergence of these DP-based learning algorithms by relating them to the powerful techniques of stochastic approximation theory via a new convergence theorem. The theorem establishes a general class of convergent algorithms to which both TD(lambda) and Q-learning belong.
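As an illustration of the kind of DP-based update the abstract refers to, the sketch below applies the tabular Q-learning rule, Q(s,a) ← Q(s,a) + α [r + γ max_b Q(s',b) − Q(s,a)], to a tiny two-state chain. The environment, the function names, and the 1/n step-size schedule are assumptions made for this example, not details taken from the paper; decaying step sizes of this kind are the usual setting in stochastic approximation arguments.

```python
import random
from collections import defaultdict


def q_update(q, s, a, r, s_next, actions, alpha, gamma, terminal):
    """One tabular Q-learning step toward the sampled one-step DP target."""
    best_next = 0.0 if terminal else max(q[(s_next, b)] for b in actions)
    q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])


def step(s, a):
    """Hypothetical chain: action 1 moves right (reward 1 on leaving state 1),
    action 0 ends the episode immediately with no reward."""
    if a == 0:
        return None, 0.0, True
    if s == 0:
        return 1, 0.0, False
    return None, 1.0, True


def run(episodes=5000, gamma=0.9, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)
    visits = defaultdict(int)
    actions = (0, 1)
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            a = rng.choice(actions)          # uniform random exploration
            s_next, r, done = step(s, a)
            visits[(s, a)] += 1
            alpha = 1.0 / visits[(s, a)]     # decaying step size (assumed schedule)
            q_update(q, s, a, r, s_next, actions, alpha, gamma, done)
            s = s_next
    return q


q = run()
```

For this chain the optimal values are Q(1,1) = 1 and Q(0,1) = 0.9 with γ = 0.9, and the learned table approaches them as the number of visits grows.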