900 results for Artificial intelligence -- Data processing


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The decreasing number of women who are graduating in the Science, Technology, Engineering and Mathematics (STEM) fields continues to be a major concern. Despite national support in the form of grants provided by the National Science Foundation and the National Center for Information and Technology, and legislation such as the Deficit Reduction Act of 2005 that encourages women to enter the STEM fields, the number of women actually graduating in these fields is surprisingly low. This research study focuses on a robotics competition and its ability to engage female adolescents in STEM curricula. Data have been collected to help explain why young women are reluctant to take technology- or engineering-type courses in high school and college. Factors that have been described include attitudes, parental support, social aspects, peer pressure, and lack of role models. Often these courses were thought to have masculine and “nerdy” overtones. The courses usually had majority-male enrollments and appeared to be very competitive. With more female adolescents engaging in this type of competitive atmosphere, this study gathered information to discover what about the competition appealed to these young women. Focus groups were used to gather information from adolescent females who were participating in the First Lego League (FLL) and CEENBoT competitions. What enticed them to participate in a curriculum that data demonstrated many of their peers avoided? FLL and CEENBoT are robotics programs based on curricula that are taught in afterschool programs in non-formal environments. These programs culminate in a very large robotics competition. My research questions included: What are the factors that encouraged participants to participate in the robotics competition? What was the original enticement to the FLL and CEENBoT programs? What will make participants want to come back and what are the participants’ plans for the future?
My research mirrored previous findings: lack of role models, the need for parental support, social stigma, and peer pressure are still major factors that determine whether adolescent females seek out STEM activities. An interesting finding, which was an exception to previous findings, was that these female adolescents enjoyed the challenge of the competition. The informal learning environments encouraged an atmosphere of social engagement and cooperative learning. Many volunteers who led the afterschool programs were women (role models), and a majority of parents showed support by accommodating an afterschool situation. The young women who were engaged in the competition noted it was a friendly competition, but they were all there to win. All who participated in the competition had a similar learning environment: competitive but cooperative. Further research is needed to determine whether it is the learning environment that lures adolescent females to the program and entices them to continue in the STEM fields, or the competitive aspect of the culminating activity. Advisors: James King and Allen Steckelberg

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Semi-supervised learning is one of the important topics in machine learning, concerned with pattern classification where only a small subset of the data is labeled. In this paper, a new network-based (or graph-based) semi-supervised classification model is proposed. It employs a combined random-greedy walk of particles, with competition and cooperation mechanisms, to propagate class labels to the whole network. Due to the competition mechanism, the proposed model spreads labels locally, i.e., each particle only visits a portion of nodes potentially belonging to it, while it is not allowed to visit those nodes definitely occupied by particles of other classes. In this way, a "divide-and-conquer" effect is naturally embedded in the model. As a result, the proposed model can achieve a good classification rate while exhibiting a low order of computational complexity in comparison to other network-based semi-supervised algorithms. Computer simulations carried out for synthetic and real-world data sets provide a numeric quantification of the performance of the method.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Even though the digital processing of documents is increasingly widespread in industry, printed documents are still largely in use. To process the contents of printed documents electronically, information must be extracted from digital images of the documents. When dealing with complex documents, in which the contents of different regions and fields can be highly heterogeneous with respect to layout, printing quality and the use of fonts and typing standards, reconstructing the contents of documents from digital images can be a difficult problem. In this article we present an efficient solution to this problem, in which the semantic contents of fields in a complex document are extracted from a digital image.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

There are some variants of the widely used Fuzzy C-Means (FCM) algorithm that support clustering data distributed across different sites. Those methods have been studied under different names, like collaborative and parallel fuzzy clustering. In this study, we offer some augmentation of the two FCM-based clustering algorithms used to cluster distributed data by arriving at some constructive ways of determining essential parameters of the algorithms (including the number of clusters) and forming a set of systematically structured guidelines such as a selection of the specific algorithm depending on the nature of the data environment and the assumptions being made about the number of clusters. A thorough complexity analysis, including space, time, and communication aspects, is reported. A series of detailed numeric experiments is used to illustrate the main ideas discussed in the study.
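For orientation, the baseline that these distributed variants build on can be sketched as follows. This is a minimal sketch of the standard single-site Fuzzy C-Means updates only, not of the collaborative or parallel algorithms the abstract refers to; the function name `fuzzy_c_means`, its parameters, and the random initialization are assumptions made for illustration.

```python
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, seed=0):
    """Plain (single-site) Fuzzy C-Means; hypothetical helper, not the paper's code.

    points: list of equal-length numeric tuples.
    m: fuzzifier (> 1); larger values give softer memberships.
    Returns (centers, u) where u[i][j] is the membership of point i in cluster j.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships, each row normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(n_iter):
        # Center update: mean of the points weighted by u[i][j] ** m.
        centers = []
        for j in range(n_clusters):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / total for d in range(dim)]
            )

        # Membership update: u_ij is proportional to d_ij ** (-2 / (m - 1)).
        for i, p in enumerate(points):
            d2 = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            if min(d2) == 0.0:
                # The point coincides with a center: assign it crisply.
                hit = d2.index(0.0)
                u[i] = [1.0 if j == hit else 0.0 for j in range(n_clusters)]
            else:
                inv = [d ** (-1.0 / (m - 1.0)) for d in d2]  # d2 holds squared distances
                s = sum(inv)
                u[i] = [v / s for v in inv]
    return centers, u
```

The number of clusters and the fuzzifier `m` are exactly the kind of parameters the study proposes guidelines for; here they are simply passed in by the caller.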

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper presents an optimum user-steered boundary tracking approach for image segmentation, which simulates the behavior of water flowing through a riverbed. The riverbed approach was devised using the image foresting transform with a never-exploited connectivity function. We analyze its properties in the derived image graphs and discuss its theoretical relation with other popular methods such as live wire and graph cuts. Several experiments show that riverbed can significantly reduce the number of user interactions (anchor points), as compared to live wire for objects with complex shapes. This paper also includes a discussion about how to combine different methods in order to take advantage of their complementary strengths.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We review recent visualization techniques aimed at supporting tasks that require the analysis of text documents, from approaches targeted at visually summarizing the relevant content of a single document to those aimed at assisting exploratory investigation of whole collections of documents. Techniques are organized considering their target input material (either single texts or collections of texts) and their focus, which may be on displaying content, emphasizing relevant relationships, highlighting the temporal evolution of a document or collection, or helping users to handle results from a query posed to a search engine. We describe the approaches adopted by distinct techniques and briefly review the strategies they employ to obtain meaningful text models, discuss how they extract the information required to produce representative visualizations, the tasks they intend to support and the interaction issues involved, and their strengths and limitations. Finally, we show a summary of techniques, highlighting their goals and distinguishing characteristics. We also briefly discuss some open problems and research directions in the fields of visual text mining and text analytics.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Semi-supervised learning techniques have gained increasing attention in the machine learning community, as a result of two main factors: (1) the available data is exponentially increasing; (2) the task of data labeling is cumbersome and expensive, involving human experts in the process. In this paper, we propose a network-based semi-supervised learning method inspired by the modularity greedy algorithm, which was originally applied for unsupervised learning. Changes have been made in the process of modularity maximization in a way to adapt the model to propagate labels throughout the network. Furthermore, a network reduction technique is introduced, as well as an extensive analysis of its impact on the network. Computer simulations are performed for artificial and real-world databases, providing a numerical quantitative basis for the performance of the proposed method.
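As background for the abstract above, the quantity that the original (unsupervised) greedy algorithm maximizes is Newman's modularity Q. The sketch below only evaluates standard modularity for a given partition of an undirected, unweighted graph; the abstract does not specify how the semi-supervised method modifies the maximization, so none of that is modeled here. The function name `modularity` and its input format are assumptions for illustration.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity Q of a partition (hypothetical helper).

    edges: list of (u, v) pairs, each undirected edge listed once, no self-loops.
    community: dict mapping every node to its community label.
    Uses Q = sum over communities c of  L_c / m - (D_c / (2 m)) ** 2,
    where L_c is the number of edges inside c, D_c the total degree of c,
    and m the total number of edges.
    """
    m = len(edges)
    internal = defaultdict(int)    # L_c
    degree_sum = defaultdict(int)  # D_c
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)
```

A greedy scheme would repeatedly merge the pair of communities whose merge increases this value the most; in a label-propagation setting, merges would additionally have to respect the known labels.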

Relevance:

100.00% 100.00%

Publisher:

Abstract:

A deep theoretical analysis of the graph cut image segmentation framework presented in this paper simultaneously translates into important contributions in several directions. The most important practical contribution of this work is a full theoretical description, and implementation, of a novel powerful segmentation algorithm, GC(max). The output of GC(max) coincides with a version of a segmentation algorithm known as Iterative Relative Fuzzy Connectedness, IRFC. However, GC(max) is considerably faster than the classic IRFC algorithm, which we prove theoretically and show experimentally. Specifically, we prove that, in the worst case scenario, the GC(max) algorithm runs in linear time with respect to the variable $M=|C|+|Z|$, where $|C|$ is the image scene size and $|Z|$ is the size of the allowable range, $Z$, of the associated weight/affinity function. For most implementations, $Z$ is identical to the set of allowable image intensity values, and its size can be treated as small with respect to $|C|$, meaning that $O(M)=O(|C|)$. In such a situation, GC(max) runs in linear time with respect to the image size $|C|$. We show that the output of GC(max) constitutes a solution of a graph cut energy minimization problem, in which the energy is defined as the $\ell_\infty$ norm $\|F_P\|_\infty$ of the map $F_P$ that associates, with every element $e$ from the boundary of an object $P$, its weight $w(e)$. This formulation brings IRFC algorithms to the realm of the graph cut energy minimizers, with energy functions $\|F_P\|_q$ for $q\in[1,\infty]$. Of these, the best known minimization problem is for the energy $\|F_P\|_1$, which is solved by the classic min-cut/max-flow algorithm, referred to often as the Graph Cut algorithm. We notice that a minimization problem for $\|F_P\|_q$, $q\in[1,\infty)$, is identical to that for $\|F_P\|_1$, when the original weight function $w$ is replaced by $w^q$.
Thus, any algorithm GC(sum) solving the $\|F_P\|_1$ minimization problem also solves the one for $\|F_P\|_q$ with $q\in[1,\infty)$, so just two algorithms, GC(sum) and GC(max), are enough to solve all $\|F_P\|_q$-minimization problems. We also show that, for any fixed weight assignment, the solutions of the $\|F_P\|_q$-minimization problems converge to a solution of the $\|F_P\|_\infty$-minimization problem ($\|F_P\|_\infty=\lim_{q\to\infty}\|F_P\|_q$ is not enough to deduce that). An experimental comparison of the performance of the GC(max) and GC(sum) algorithms is included. This concentrates on comparing the algorithms' actual (as opposed to provable worst-case) running time, as well as the influence of the choice of the seeds on the output.
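The norms used as energies in this abstract can be written out explicitly. The following is a sketch using the standard definitions of the $\ell_q$ and $\ell_\infty$ norms; the notation $\mathrm{bd}(P)$ for the boundary of the object $P$ is an assumption introduced here, not taken from the abstract.

```latex
\[
  \|F_P\|_q = \Bigl(\sum_{e \in \mathrm{bd}(P)} w(e)^q\Bigr)^{1/q}
  \quad (1 \le q < \infty),
  \qquad
  \|F_P\|_\infty = \max_{e \in \mathrm{bd}(P)} w(e).
\]
% Since t -> t^{1/q} is increasing on [0, \infty), minimizing the q-norm
% is the same as minimizing the plain sum of the weights w^q:
\[
  \operatorname*{arg\,min}_{P} \|F_P\|_q
  = \operatorname*{arg\,min}_{P} \sum_{e \in \mathrm{bd}(P)} w(e)^q ,
\]
% which is the 1-norm problem for the replaced weight function w^q.
% For a fixed P, the q-norms are squeezed towards the maximum:
\[
  \|F_P\|_\infty \;\le\; \|F_P\|_q \;\le\; |\mathrm{bd}(P)|^{1/q}\,\|F_P\|_\infty
  \;\xrightarrow[q \to \infty]{}\; \|F_P\|_\infty .
\]
```

The last line gives only pointwise convergence of the energy values for each fixed $P$; as the abstract notes, that alone does not imply that the minimizers converge, which is why the paper proves it separately.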

Relevance:

100.00% 100.00%

Publisher:

Abstract:

XML similarity evaluation has become a central issue in the database and information communities, its applications ranging over document clustering, version control, data integration and ranked retrieval. Various algorithms for comparing hierarchically structured data, XML documents in particular, have been proposed in the literature. Most of them make use of techniques for finding the edit distance between tree structures, XML documents being commonly modeled as Ordered Labeled Trees. Yet, a thorough investigation of current approaches led us to identify several similarity aspects, i.e., sub-tree related structural and semantic similarities, which are not sufficiently addressed while comparing XML documents. In this paper, we provide an integrated and fine-grained comparison framework to deal with both structural and semantic similarities in XML documents (detecting the occurrences and repetitions of structurally and semantically similar sub-trees), and to allow the end-user to adjust the comparison process according to her requirements. Our framework consists of four main modules for (i) discovering the structural commonalities between sub-trees, (ii) identifying sub-tree semantic resemblances, (iii) computing tree-based edit operations costs, and (iv) computing tree edit distance. Experimental results demonstrate higher comparison accuracy with respect to alternative methods, while timing experiments reflect the impact of semantic similarity on overall system performance.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The ability to transmit and amplify weak signals is fundamental to signal processing of artificial devices in engineering. Using a multilayer feedforward network of coupled double-well oscillators as well as Fitzhugh-Nagumo oscillators, we here investigate the conditions under which a weak signal received by the first layer can be transmitted through the network with or without amplitude attenuation. We find that the coupling strength and the nodes' states of the first layer act as two-state switches, which determine whether the transmission is significantly enhanced or exponentially decreased. We hope this finding is useful for designing artificial signal amplifiers.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Competitive learning is an important machine learning approach which is widely employed in artificial neural networks. In this paper, we present a rigorous definition of a new type of competitive learning scheme realized on large-scale networks. The model consists of several particles walking within the network and competing with each other to occupy as many nodes as possible, while attempting to reject intruder particles. The particle's walking rule is composed of a stochastic combination of random and preferential movements. The model has been applied to solve community detection and data clustering problems. Computer simulations reveal that the proposed technique presents high precision of community and cluster detections, as well as low computational complexity. Moreover, we have developed an efficient method for estimating the most likely number of clusters by using an evaluator index that monitors the information generated by the competition process itself. We hope this paper will provide an alternative way to the study of competitive learning.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Semisupervised learning is a machine learning approach that is able to employ both labeled and unlabeled samples in the training process. In this paper, we propose a semisupervised data classification model based on a combined random-preferential walk of particles in a network (graph) constructed from the input dataset. The particles of the same class cooperate among themselves, while the particles of different classes compete with each other to propagate class labels to the whole network. A rigorous model definition is provided via a nonlinear stochastic dynamical system and a mathematical analysis of its behavior is carried out. A numerical validation presented in this paper confirms the theoretical predictions. An interesting feature brought by the competitive-cooperative mechanism is that the proposed model can achieve good classification rates while exhibiting low computational complexity order in comparison to other network-based semisupervised algorithms. Computer simulations conducted on synthetic and real-world datasets reveal the effectiveness of the model.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper addresses the m-machine no-wait flow shop problem where the set-up time of a job is separated from its processing time. The performance measure considered is the total flowtime. A new hybrid metaheuristic Genetic Algorithm-Cluster Search is proposed to solve the scheduling problem. The performance of the proposed method is evaluated and the results are compared with the best method reported in the literature. Experimental tests show superiority of the new method for the test problems set, regarding the solution quality. (c) 2012 Elsevier Ltd. All rights reserved.