4 resultados para Machine to Machine
em DigitalCommons@University of Nebraska - Lincoln
Resumo:
Active machine learning algorithms are used when large numbers of unlabeled examples are available and getting labels for them is costly (e.g. requiring consulting a human expert). Many conventional active learning algorithms focus on refining the decision boundary, at the expense of exploring new regions that the current hypothesis misclassifies. We propose a new active learning algorithm that balances such exploration with refining of the decision boundary by dynamically adjusting the probability to explore at each step. Our experimental results demonstrate improved performance on data sets that require extensive exploration while remaining competitive on data sets that do not. Our algorithm also shows significant tolerance of noise.
Resumo:
One problem with using component-based software development approach is that once software modules are reused over generations of products, they form legacy structures that can be challenging to understand, making validating these systems difficult. Therefore, tools and methodologies that enable engineers to see interactions of these software modules will enhance their ability to make these software systems more dependable. To address this need, we propose SimSight, a framework to capture dynamic call graphs in Simics, a widely adopted commercial full-system simulator. Simics is a software system that simulates complete computer systems. Thus, it performs nearly identical tasks to a real system but at a much lower speed while providing greater execution observability. We have implemented SimSight to generate dynamic call graphs of statically and dynamically linked functions in x86/Linux environment. A case study illustrates how we can use SimSight to identify sources of software errors. We then evaluate its performance using 12 integer programs from SPEC CPU2006 benchmark suite.
Resumo:
In active learning, a machine learning algorithmis given an unlabeled set of examples U, and is allowed to request labels for a relatively small subset of U to use for training. The goal is then to judiciously choose which examples in U to have labeled in order to optimize some performance criterion, e.g. classification accuracy. We study how active learning affects AUC. We examine two existing algorithms from the literature and present our own active learning algorithms designed to maximize the AUC of the hypothesis. One of our algorithms was consistently the top performer, and Closest Sampling from the literature often came in second behind it. When good posterior probability estimates were available, our heuristics were by far the best.
Resumo:
"How large a sample is needed to survey the bird damage to corn in a county in Ohio or New Jersey or South Dakota?" Like those in the Bureau of Sport Fisheries and Wildlife and the U.S.D.A. who have been faced with a question of this sort we found only meager information on which to base an answer, whether the problem related to a county in Ohio or to one in New Jersey, or elsewhere. Many sampling methods and rates of sampling did yield reliable estimates but the judgment was often intuitive or based on the reasonableness of the resulting data. Later, when planning the next study or survey, little additional information was available on whether 40 samples of 5 ears each or 5 samples of 200 ears should be examined, i.e., examination of a large number of small samples or a small number of large samples. What information is needed to make a reliable decision? Those of us involved with the Agricultural Experiment Station regional project concerned with the problems of bird damage to crops, known as NE-49, thought we might supply an ans¬wer if we had a corn field in which all the damage was measured. If all the damage were known, we could then sample this field in various ways and see how the estimates from these samplings compared to the actual damage and pin-point the best and most accurate sampling procedure. Eventually the investigators in four states became involved in this work1 and instead of one field we were able to broaden the geographical base by examining all the corn ears in 2 half-acre sections of fields in each state, 8 sections in all. When the corn had matured well past the dough stage, damage on each corn ear was assessed, without removing the ear from the stalk, by visually estimating the percent of the kernel surface which had been destroyed and rating it in one of 5 damage categories. Measurements (by row-centimeters) of the rows of kernels pecked by birds also were made on selected ears representing all categories and all parts of each field section. These measurements provided conversion factors that, when fed into a computer, were applied to the more than 72,000 visually assessed ears. The machine now had in its memory and could supply on demand a map showing each ear, its location and the intensity of the damage.