797 results for Educational data mining


Relevance:

80.00%

Publisher:

Abstract:

Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)

Relevance:

80.00%

Publisher:

Abstract:

Presentations sponsored by the Patent and Trademark Depository Library Association (PTDLA) at the American Library Association Annual Conference, New Orleans, June 25, 2006.
Speaker #1: Nan Myers, Associate Professor; Government Documents, Patents and Trademarks Librarian, Wichita State University, Wichita, KS. Title: Intellectual Property Roundup: Copyright, Trademarks, Trade Secrets, and Patents. Abstract: This presentation provides a capsule overview of the distinctive coverage of the four types of intellectual property: what they are, why they are important, how to get them, what they cost, and how long they last. Emphasis will be on the questions patrons ask most, along with the answers. It includes coverage of the mission of Patent & Trademark Depository Libraries (PTDLs) and other sources of business information outside of libraries, such as Small Business Development Centers.
Speaker #2: Jan Comfort, Government Information Reference Librarian, Clemson University, Clemson, SC. Title: Patents as a Source of Competitive Intelligence Information. Abstract: Large corporations often have R&D departments, or large numbers of staff whose jobs are to monitor the activities of their competitors. This presentation will review strategies that small business owners can employ to do their own competitive intelligence analysis. The focus will be on features of the patent database that is available free of charge on the USPTO website, as well as commercial databases available at many public and academic libraries across the country.
Speaker #3: Virginia Baldwin, Professor; Engineering Librarian, University of Nebraska-Lincoln, Lincoln, NE. Title: Mining Online Patent Data for Business Information. Abstract: The United States Patent and Trademark Office (USPTO) website and the websites of international databases contain information about granted patents and patent applications and the technologies they represent. Statistical information about patents, their technologies, geographical information, and patenting entities is compiled and available as reports on the USPTO website. Other valuable information from these websites can be obtained using data mining techniques. This presentation will provide the keys to opening these resources and obtaining valuable data.
Speaker #4: Donna Hopkins, Engineering Librarian, Rensselaer Polytechnic Institute, Troy, NY. Title: Searching the USPTO Trademark Database for Wordmarks and Logos. Abstract: This presentation provides an overview of wordmark searching in www.uspto.gov, followed by a review of the techniques of searching for non-word US trademarks using codes from the Design Search Code Manual. These codes are used in an electronic search, either on the USPTO website or on CASSIS DVDs. The search is sometimes supplemented by consulting the Official Gazette. A specific example of using a section of the codes for searching is included. Similar searches on the Madrid Express database of WIPO, using the Vienna Classification, will also be briefly described.

Relevance:

80.00%

Publisher:

Abstract:

Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)

Relevance:

80.00%

Publisher:

Abstract:

We review recent visualization techniques aimed at supporting tasks that require the analysis of text documents, from approaches targeted at visually summarizing the relevant content of a single document to those aimed at assisting exploratory investigation of whole collections of documents. Techniques are organized considering their target input material (either single texts or collections of texts) and their focus, which may be at displaying content, emphasizing relevant relationships, highlighting the temporal evolution of a document or collection, or helping users to handle results from a query posed to a search engine. We describe the approaches adopted by distinct techniques and briefly review the strategies they employ to obtain meaningful text models, discuss how they extract the information required to produce representative visualizations, the tasks they intend to support and the interaction issues involved, and strengths and limitations. Finally, we show a summary of techniques, highlighting their goals and distinguishing characteristics. We also briefly discuss some open problems and research directions in the fields of visual text mining and text analytics.

Relevance:

80.00%

Publisher:

Abstract:

In this paper we have quantified the consistency of word usage in written texts represented by complex networks, where words were taken as nodes, by measuring the degree of preservation of the node neighborhood. Words were considered highly consistent if the authors used them with the same neighborhood. When ranked according to the consistency of use, the words obeyed a log-normal distribution, in contrast to Zipf's law that applies to the frequency of use. Consistency correlated positively with the familiarity and frequency of use, and negatively with ambiguity and age of acquisition. An inspection of some highly consistent words confirmed that they are used in very limited semantic contexts. A comparison of consistency indices for eight authors indicated that these indices may be employed for author recognition. Indeed, as expected, authors of novels could be distinguished from those who wrote scientific texts. Our analysis demonstrated the suitability of the consistency indices, which can now be applied in other tasks, such as emotion recognition.
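
The abstract does not define its consistency index, so the following is only an illustrative sketch of the general idea of "preservation of the node neighborhood". It assumes a word-adjacency network built with a sliding window and uses the Jaccard overlap of a word's neighbor sets in two texts as a stand-in measure. The function names, the window size, and the choice of Jaccard overlap are all assumptions, not the authors' method.

```python
def neighbors(tokens, window=1):
    """Map each word to the set of words found within `window` positions of it."""
    graph = {}
    for i, word in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                graph.setdefault(word, set()).add(tokens[j])
    return graph


def neighborhood_overlap(word, graph_a, graph_b):
    """Jaccard overlap of a word's neighbor sets in two networks (assumed measure)."""
    a = graph_a.get(word, set())
    b = graph_b.get(word, set())
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Two toy "texts" that differ in a single word.
text_a = "the cat sat on the mat".split()
text_b = "the cat lay on the mat".split()
graph_a, graph_b = neighbors(text_a), neighbors(text_b)

# "mat" keeps exactly the same neighborhood; "cat" only partly.
mat_score = neighborhood_overlap("mat", graph_a, graph_b)
cat_score = neighborhood_overlap("cat", graph_a, graph_b)
```

Under this stand-in measure, a word used in the same contexts in both texts scores close to 1, while a word whose surroundings change scores lower.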

Relevance:

80.00%

Publisher:

Abstract:

The dipteran studied here is a native Brazilian insect that has become a valuable model system for developmental biology research because it provides an interesting opportunity to study a different type of insect oogenesis. Sequences from a cDNA library that was constructed with poly A+ RNA from the ovaries of larvae at different ages were analyzed. Molecular characterization confirmed interesting findings, such as the presence of nanos. The nanos gene encodes a conserved RNA-binding protein that is required during early development for the maintenance and division of the primordial germ cells of Diptera. Nanos plays an important role in specifying the posterior regions of insect embryos and is important for abdomen formation. In the present work, we showed the spatial and temporal expression profiles of this important gene, which is involved in oogenesis and early development. Data mining techniques were used to obtain the complete sequence of nanos. Bioinformatic tools were used to determine the following: (1) the secondary structure of the 3'-untranslated region of the mRNA, (2) the encoded protein of the isolated gene, (3) the conserved zinc-finger domains of the Nanos protein, and (4) phylogenetic analyses. Furthermore, RNA in situ hybridization and immunolocalization were used to determine mRNA and protein expression in the tissues that were studied and to define Nanos as a germ cell molecular marker.

Relevance:

80.00%

Publisher:

Abstract:

Background: Mycelium-to-yeast transition in the human host is essential for pathogenicity by the fungus Paracoccidioides brasiliensis, and both cell types are therefore critical to the establishment of paracoccidioidomycosis (PCM), a systemic mycosis endemic to Latin America. The infected population is about 10 million individuals, 2% of whom will eventually develop the disease. Previously, transcriptome analysis of mycelium and yeast cells resulted in the assembly of 6,022 sequence groups. Gene expression analysis, using both in silico EST subtraction and cDNA microarray, revealed genes that were differential to yeast or mycelium, and we discussed those involved in sugar metabolism. To advance our understanding of the molecular mechanisms of dimorphic transition, we performed an extended analysis of gene expression profiles using the methods mentioned above.
Results: In this work, continuous data mining revealed 66 new differentially expressed sequences that were categorised with MIPS (Munich Information Center for Protein Sequences) according to the cellular process in which they are presumably involved. Two well-represented classes were chosen for further analysis: (i) control of cell organisation (cell wall, membrane and cytoskeleton), whose representatives were hex (encoding a hexagonal peroxisome protein) and bgl (encoding a 1,3-β-glucosidase) in mycelium cells, and ags (an α-1,3-glucan synthase), cda (a chitin deacetylase) and vrp (a verprolin) in yeast cells; (ii) ion metabolism and transport: two genes putatively implicated in ion transport were confirmed to be highly expressed in mycelium cells, isc and ktp, respectively an iron-sulphur cluster-like protein and a cation transporter, along with a putative P-type cation pump (pct) in yeast. Also, several enzymes from the cysteine de novo biosynthesis pathway were shown to be up-regulated in the yeast form, including ATP sulphurylase, APS kinase and PAPS reductase.
Conclusion: Taken together, these data show that several genes involved in cell organisation and ion metabolism/transport are expressed differentially along dimorphic transition. Hyper-expression in yeast of the enzymes of sulphur metabolism reinforces the idea that this metabolic pathway could be important for this process. Functional analysis of such genes may lead to a better understanding of the infective process, thus providing new targets and strategies to control PCM.

Relevance:

80.00%

Publisher:

Abstract:

The ubiquity of time series data across almost all human endeavors has produced a great interest in time series data mining in the last decade. While dozens of classification algorithms have been applied to time series, recent empirical evidence strongly suggests that simple nearest neighbor classification is exceptionally difficult to beat. The choice of distance measure used by the nearest neighbor algorithm is important, and depends on the invariances required by the domain. For example, motion capture data typically requires invariance to warping, and cardiology data requires invariance to the baseline (the mean value). Similarly, recent work suggests that for time series clustering, the choice of clustering algorithm is much less important than the choice of distance measure used. In this work we make a somewhat surprising claim. There is an invariance that the community seems to have missed, complexity invariance. Intuitively, the problem is that in many domains the different classes may have different complexities, and pairs of complex objects, even those which subjectively may seem very similar to the human eye, tend to be further apart under current distance measures than pairs of simple objects. This fact introduces errors in nearest neighbor classification, where some complex objects may be incorrectly assigned to a simpler class. Similarly, for clustering this effect can introduce errors by “suggesting” to the clustering algorithm that subjectively similar, but complex objects belong in a sparser and larger diameter cluster than is truly warranted. We introduce the first complexity-invariant distance measure for time series, and show that it generally produces significant improvements in classification and clustering accuracy. We further show that this improvement does not compromise efficiency, since we can lower bound the measure and use a modification of the triangular inequality, thus making use of most existing indexing and data mining algorithms.
We evaluate our ideas with the largest and most comprehensive set of time series mining experiments ever attempted in a single work, and show that complexity-invariant distance measures can produce improvements in classification and clustering in the vast majority of cases.
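
The abstract does not state the formula for its measure. As a minimal sketch, the code below assumes the form in which a complexity-invariant distance is usually quoted: the Euclidean distance multiplied by a correction factor, where the factor is the ratio of the larger to the smaller complexity estimate and a series' complexity estimate is the square root of its summed squared successive differences. The function names and the toy series are hypothetical, and the one-nearest-neighbor helper is a generic illustration, not the authors' experimental setup.

```python
import math


def complexity_estimate(series):
    """Square root of the summed squared differences between successive points."""
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(series, series[1:])))


def euclidean(q, c):
    """Plain Euclidean distance between two equal-length series."""
    if len(q) != len(c):
        raise ValueError("series must have equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(q, c)))


def cid(q, c):
    """Euclidean distance scaled by the ratio of the two complexity estimates."""
    ce_q, ce_c = complexity_estimate(q), complexity_estimate(c)
    low, high = min(ce_q, ce_c), max(ce_q, ce_c)
    if high == 0.0:
        factor = 1.0            # both series are flat: no correction needed
    elif low == 0.0:
        factor = float("inf")   # assumption: flat vs. non-flat is maximally penalised
    else:
        factor = high / low
    return euclidean(q, c) * factor


def nearest_neighbor_label(query, labelled_series, dist):
    """Return the label of the training series closest to `query` under `dist`."""
    return min(labelled_series, key=lambda item: dist(query, item[0]))[1]


smooth = [0.0, 1.0, 2.0, 3.0]
jagged = [0.0, 2.0, 0.0, 2.0]
training = [(smooth, "simple"), (jagged, "complex")]
label = nearest_neighbor_label([0.0, 2.1, 0.1, 1.9], training, cid)
```

For `smooth` and `jagged` the complexity estimates differ by a factor of two, so `cid` is twice the Euclidean distance; for two series of equal complexity it reduces to the Euclidean distance.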

Relevance:

80.00%

Publisher:

Abstract:

Master's Degree in Intelligent Systems and Numerical Applications in Engineering (SIANI)

Relevance:

80.00%

Publisher:

Abstract:

[EN] Indoor position estimation has become an attractive research topic due to growing interest in location-aware services. Nevertheless, satisfactory solutions that balance accuracy and system complexity have not yet been found. Both characteristics are extremely important for lightweight mobile devices, because processing power and energy availability are limited. Hence, an indoor localization system with high computational complexity can drain the battery completely within a few hours. In our research, we use a data mining technique named boosting to develop a localization system that predicts the device location from multiple weighted decision trees, since it has high accuracy and low computational complexity.
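
As a rough illustration of boosting over weighted decision trees, the sketch below implements a small AdaBoost with depth-one trees (decision stumps) in plain Python. The abstract does not describe the input features or the number of locations, so the signal-strength readings from two access points, the two zones labelled +1 and -1, and all function names are assumptions made for the example.

```python
import math


def train_stump(X, y, w):
    """Pick the one-feature threshold rule with the lowest weighted error."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({row[f] for row in X}):
            for sign in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if (sign if row[f] >= thr else -sign) != yi)
                if best is None or err < best[0]:
                    best = (err, f, thr, sign)
    return best


def stump_predict(stump, row):
    """Apply a stump: `sign` at or above the threshold, `-sign` below it."""
    _, f, thr, sign = stump
    return sign if row[f] >= thr else -sign


def adaboost(X, y, rounds=5):
    """Train a weighted ensemble of stumps on labels in {+1, -1}."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = max(stump[0], 1e-10)      # avoid division by zero for a perfect stump
        if err >= 0.5:                  # no better than chance: stop adding trees
            break
        alpha = 0.5 * math.log((1.0 - err) / err)
        model.append((alpha, stump))
        # Raise the weight of misclassified samples, lower the rest, renormalise.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def predict(model, row):
    """Weighted vote of all stumps."""
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in model)
    return 1 if score >= 0 else -1


# Hypothetical signal strengths (dBm) from two access points; zone +1 is near the first.
X = [[-40, -80], [-45, -75], [-50, -70], [-78, -42], [-72, -48], [-69, -51]]
y = [1, 1, 1, -1, -1, -1]
model = adaboost(X, y)
zone_near_first = predict(model, [-42, -79])
zone_near_second = predict(model, [-75, -45])
```

Prediction costs only one comparison per stump, which is the kind of low run-time cost the abstract argues for on battery-limited devices.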

Relevance:

80.00%

Publisher:

Abstract:

Final degree project for the double degree in Computer Engineering and Business Administration and Management.

Relevance:

80.00%

Publisher:

Abstract:

In recent years, Intelligent Tutoring Systems (ITS) have been a very successful way of improving the learning experience. Many issues must be addressed before this technology can be considered mature. One of the main problems with Intelligent Tutoring Systems is content authoring: knowledge acquisition and manipulation are difficult tasks because they require specialised skills in computer programming and knowledge engineering. In this thesis we discuss a general framework for knowledge management in an Intelligent Tutoring System and propose a mechanism based on first-order data mining to partially automate the acquisition of the knowledge that the ITS uses during the tutoring process. Such a mechanism can be applied in the Constraint-Based Tutor and in the Pseudo-Cognitive Tutor. We design and implement part of the proposed architecture, mainly the module for knowledge acquisition from examples based on first-order data mining. We then show that the algorithm can be applied to at least two different domains: first-order algebra equations and some topics of the C programming language. Finally, we discuss the limitations of the current approach and possible improvements to the whole framework.

Relevance:

80.00%

Publisher:

Abstract:

Precision horticulture and spatial analysis applied to orchards are a growing and evolving part of precision agriculture technology. The aim of this discipline is to reduce production costs by monitoring and analysing orchard-derived information to improve crop performance in an environmentally sound manner. Georeferencing and geostatistical analysis coupled with point-specific data mining make it possible to devise and implement management decisions tailored to the single orchard. Potential applications include verifying in real time, throughout the season, the effectiveness of cultural practices in achieving production targets in terms of fruit size, number, yield and, in the near future, fruit quality traits. These data will affect not only the pre-harvest sector but also the post-harvest sector of the fruit chain. Chapter 1 provides an updated overview of precision horticulture, while Chapter 2 provides a preliminary spatial statistical analysis of the variability in apple orchards before and after manual thinning, together with an interpretation of this variability and of how it can be managed to maximize orchard performance. Chapter 3 then stratifies the spatial data into management classes in order to interpret and manage spatial variation in the orchard. An inverse model approach is also applied to verify whether crop production explains environmental variation. Chapter 4 integrates the techniques adopted in the earlier chapters and offers a new key for reading the information gathered within the field. The overall goal of this dissertation was to probe the feasibility, desirability and effectiveness of a precision approach to fruit growing, following the lines of other areas of agriculture that already adopt this management tool. As existing applications of precision horticulture had already shown, crop specificity is an important factor to be accounted for.
This work focused on apple because of its importance in the area where the work was carried out, and worldwide.

Relevance:

80.00%

Publisher:

Abstract:

Today more than ever it is essential to be able to extract relevant information and knowledge from the large amount of data that can reach us from a variety of contexts, such as databases connected to satellites and automatic sensors, user-generated repositories, and the data warehouses of large companies. One of the current challenges concerns the development of data mining techniques for handling uncertainty. The aim of this thesis is to extend current uncertainty-handling techniques, in particular those concerning classification with decision trees, so that uncertainty on the class attribute can also be handled.

Relevance:

80.00%

Publisher:

Abstract:

The first part of the dissertation 'Aspects of Modelling: Concepts and Application in Respiratory Physiology', titled 'Analysis of the Foundations', lays out the foundations. Starting from the definition of modular dynamic systems in Chapter 1, it presents basic concepts of models, simulation and model development (Chapter 2), followed by a chapter on network models. The second part examines 'the process of operationalisation'. Chapter 4 introduces 'the coordinate system of modelling', a general life cycle for model building. Chapter 5, on 'model development', is the core of the work; it develops a generic structure for modular level-rate models. The chapter ends with a concept for calibrating models that is based on data mining of model data. The process of operationalisation ends with validation in Chapter 6. The third part, 'Validation Using the Example of Respiratory Physiology', applies the theory developed in the two preceding parts. First, the project 'Evita-Weaning-System', in which the work originated, is introduced. The necessary medical foundations of respiratory physiology are then analysed (Chapter 7). A detailed description of the model of respiratory physiology and of the algorithms developed for it follows in Chapter 8. The work concludes with a chapter on the validation of the physiological model.