924 resultados para automated lexical analysis


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Background: A major challenge for assessing students’ conceptual understanding of STEM subjects is the capacity of assessment tools to reliably and robustly evaluate student thinking and reasoning. Multiple-choice tests are typically used to assess student learning and are designed to include distractors that can indicate students’ incomplete understanding of a topic or concept based on which distractor the student selects. However, these tests fail to provide the critical information uncovering the how and why of students’ reasoning for their multiple-choice selections. Open-ended or structured response questions are one method for capturing higher level thinking, but are often costly in terms of time and attention to properly assess student responses. Purpose: The goal of this study is to evaluate methods for automatically assessing open-ended responses, e.g. students’ written explanations and reasoning for multiple-choice selections. Design/Method: We incorporated an open response component for an online signals and systems multiple-choice test to capture written explanations of students’ selections. The effectiveness of an automated approach for identifying and assessing student conceptual understanding was evaluated by comparing results of lexical analysis software packages (Leximancer and NVivo) to expert human analysis of student responses. In order to understand and delineate the process for effectively analysing text provided by students, the researchers evaluated strengths and weakness for both the human and automated approaches. Results: Human and automated analyses revealed both correct and incorrect associations for certain conceptual areas. For some questions, that were not anticipated or included in the distractor selections, showing how multiple-choice questions alone fail to capture the comprehensive picture of student understanding. The comparison of textual analysis methods revealed the capability of automated lexical analysis software to assist in the identification of concepts and their relationships for large textual data sets. We also identified several challenges to using automated analysis as well as the manual and computer-assisted analysis. Conclusions: This study highlighted the usefulness incorporating and analysing students’ reasoning or explanations in understanding how students think about certain conceptual ideas. The ultimate value of automating the evaluation of written explanations is that it can be applied more frequently and at various stages of instruction to formatively evaluate conceptual understanding and engage students in reflective

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Derivational morphology proposes meaningful connections between words and is largely unrepresented in lexical databases. This thesis presents a project to enrich a lexical database with morphological links and to evaluate their contribution to disambiguation. A lexical database with sense distinctions was required. WordNet was chosen because of its free availability and widespread use. Its suitability was assessed through critical evaluation with respect to specifications and criticisms, using a transparent, extensible model. The identification of serious shortcomings suggested a portable enrichment methodology, applicable to alternative resources. Although 40% of the most frequent words are prepositions, they have been largely ignored by computational linguists, so addition of prepositions was also required. The preferred approach to morphological enrichment was to infer relations from phenomena discovered algorithmically. Both existing databases and existing algorithms can capture regular morphological relations, but cannot capture exceptions correctly; neither of them provide any semantic information. Some morphological analysis algorithms are subject to the fallacy that morphological analysis can be performed simply by segmentation. Morphological rules, grounded in observation and etymology, govern associations between and attachment of suffixes and contribute to defining the meaning of morphological relationships. Specifying character substitutions circumvents the segmentation fallacy. Morphological rules are prone to undergeneration, minimised through a variable lexical validity requirement, and overgeneration, minimised by rule reformulation and restricting monosyllabic output. Rules take into account the morphology of ancestor languages through co-occurrences of morphological patterns. Multiple rules applicable to an input suffix need their precedence established. The resistance of prefixations to segmentation has been addressed by identifying linking vowel exceptions and irregular prefixes. The automatic affix discovery algorithm applies heuristics to identify meaningful affixes and is combined with morphological rules into a hybrid model, fed only with empirical data, collected without supervision. Further algorithms apply the rules optimally to automatically pre-identified suffixes and break words into their component morphemes. To handle exceptions, stoplists were created in response to initial errors and fed back into the model through iterative development, leading to 100% precision, contestable only on lexicographic criteria. Stoplist length is minimised by special treatment of monosyllables and reformulation of rules. 96% of words and phrases are analysed. 218,802 directed derivational links have been encoded in the lexicon rather than the wordnet component of the model because the lexicon provides the optimal clustering of word senses. Both links and analyser are portable to an alternative lexicon. The evaluation uses the extended gloss overlaps disambiguation algorithm. The enriched model outperformed WordNet in terms of recall without loss of precision. Failure of all experiments to outperform disambiguation by frequency reflects on WordNet sense distinctions.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Background Prescription medicine samples provided by pharmaceutical companies are predominantly newer and more expensive products. The range of samples provided to practices may not represent the drugs that the doctors desire to have available. Few studies have used a qualitative design to explore the reasons behind sample use. Objective The aim of this study was to explore the opinions of a variety of Australian key informants about prescription medicine samples, using a qualitative methodology. Methods Twenty-three organizations involved in quality use of medicines in Australia were identified, based on the authors' previous knowledge. Each organization was invited to nominate 1 or 2 representatives to participate in semistructured interviews utilizing seeding questions. Each interview was recorded and transcribed verbatim. Leximancer v2.25 text analysis software (Leximancer Pty Ltd., Jindalee, Queensland, Australia) was used for textual analysis. The top 10 concepts from each analysis group were interrogated back to the original transcript text to determine the main emergent opinions. Results A total of 18 key interviewees representing 16 organizations participated. Samples, patient, doctor, and medicines were the major concepts among general opinions about samples. The concept drug became more frequent and the concept companies appeared when marketing issues were discussed. The Australian Pharmaceutical Benefits Scheme and cost were more prevalent in discussions about alternative sample distribution models, indicating interviewees were cognizant of budgetary implications. Key interviewee opinions added richness to the single-word concepts extracted by Leximancer. Conclusions Participants recognized that prescription medicine samples have an influence on quality use of medicines and play a role in the marketing of medicines. They also believed that alternative distribution systems for samples could provide benefits. The cost of a noncommercial system for distributing samples or starter packs was a concern. These data will be used to design further research investigating alternative models for distribution of samples.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Engineers must have deep and accurate conceptual understanding of their field and Concept inventories (CIs) are one method of assessing conceptual understanding and providing formative feedback. Current CI tests use Multiple Choice Questions (MCQ) to identify misconceptions and have undergone reliability and validity testing to assess conceptual understanding. However, they do not readily provide the diagnostic information about students’ reasoning and therefore do not effectively point to specific actions that can be taken to improve student learning. We piloted the textual component of our diagnostic CI on electrical engineering students using items from the signals and systems CI. We then analysed the textual responses using automated lexical analysis software to test the effectiveness of these types of software and interviewed the students regarding their experience using the textual component. Results from the automated text analysis revealed that students held both incorrect and correct ideas for certain conceptual areas and provided indications of student misconceptions. User feedback also revealed that the inclusion of the textual component is helpful to students in assessing and reflecting on their own understanding.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper, we present the results of an exploratory study that examined the problem of automating content analysis of student online discussion transcripts. We looked at the problem of coding discussion transcripts for the levels of cognitive presence, one of the three main constructs in the Community of Inquiry (CoI) model of distance education. Using Coh-Metrix and LIWC features, together with a set of custom features developed to capture discussion context, we developed a random forest classification system that achieved 70.3% classification accuracy and 0.63 Cohen's kappa, which is significantly higher than values reported in the previous studies. Besides improvement in classification accuracy, the developed system is also less sensitive to overfitting as it uses only 205 classification features, which is around 100 times less features than in similar systems based on bag-of-words features. We also provide an overview of the classification features most indicative of the different phases of cognitive presence that gives an additional insights into the nature of cognitive presence learning cycle. Overall, our results show great potential of the proposed approach, with an added benefit of providing further characterization of the cognitive presence coding scheme.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Quantitative analysis of penetrative deformation in sedimentary rocks of fold and thrust belts has largely been carried out using clast based strain analysis techniques. These methods analyse the geometric deviations from an original state that populations of clasts, or strain markers, have undergone. The characterisation of these geometric changes, or strain, in the early stages of rock deformation is not entirely straight forward. This is in part due to the paucity of information on the original state of the strain markers, but also the uncertainty of the relative rheological properties of the strain markers and their matrix during deformation, as well as the interaction of two competing fabrics, such as bedding and cleavage. Furthermore one of the single largest setbacks for accurate strain analysis has been associated with the methods themselves, they are traditionally time consuming, labour intensive and results can vary between users. A suite of semi-automated techniques have been tested and found to work very well, but in low strain environments the problems discussed above persist. Additionally these techniques have been compared to Anisotropy of Magnetic Susceptibility (AMS) analyses, which is a particularly sensitive tool for the characterisation of low strain in sedimentary lithologies.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Nistor, N., Dascalu, M., Stavarache, L.L., Tarnai, C., & Trausan-Matu, S. (2015). Predicting Newcomer Integration in Online Knowledge Communities by Automated Dialog Analysis. In Y. Li, M. Chang, M. Kravcik, E. Popescu, R. Huang, Kinshuk & N.-S. Chen (Eds.), State-of-the-Art and Future Directions of Smart Learning (Vol. Lecture Notes in Educational Technology, pp. 13–17). Berlin, Germany: Springer-Verlag Singapur

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A novel methodology has been developed to quantify important saltwater intrusion parameters in a sandbox style experiment using image analysis. Existing methods found in the literature are based mainly on visual observations, which are subjective, labour intensive and limits the temporal and spatial resolutions that can be analysed. A robust error analysis was undertaken to determine the optimum methodology to convert image light intensity to concentration. Results showed that defining a relationship on a pixel-wise basis provided the most accurate image to concentration conversion and allowed quantification of the width of mixing zone between the saltwater and freshwater. A large image sample rate was used to investigate the transient dynamics of saltwater intrusion, which rendered analysis by visual observation unsuitable. This paper presents the methodologies developed to minimise human input and promote autonomy, provide high resolution image to concentration conversion and allow the quantification of intrusion parameters under transient conditions.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The discovery and clinical application of molecular biomarkers in solid tumors, increasingly relies on nucleic acid extraction from FFPE tissue sections and subsequent molecular profiling. This in turn requires the pathological review of haematoxylin & eosin (H&E) stained slides, to ensure sample quality, tumor DNA sufficiency by visually estimating the percentage tumor nuclei and tumor annotation for manual macrodissection. In this study on NSCLC, we demonstrate considerable variation in tumor nuclei percentage between pathologists, potentially undermining the precision of NSCLC molecular evaluation and emphasising the need for quantitative tumor evaluation. We subsequently describe the development and validation of a system called TissueMark for automated tumor annotation and percentage tumor nuclei measurement in NSCLC using computerized image analysis. Evaluation of 245 NSCLC slides showed precise automated tumor annotation of cases using Tissuemark, strong concordance with manually drawn boundaries and identical EGFR mutational status, following manual macrodissection from the image analysis generated tumor boundaries. Automated analysis of cell counts for % tumor measurements by Tissuemark showed reduced variability and significant correlation (p < 0.001) with benchmark tumor cell counts. This study demonstrates a robust image analysis technology that can facilitate the automated quantitative analysis of tissue samples for molecular profiling in discovery and diagnostics.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents the applications of a novel methodology to quantify saltwater intrusion parameters in laboratory-scale experiments. The methodology uses an automated image analysis procedure, minimizing manual inputs and the subsequent systematic errors that can be introduced. This allowed the quantification of the width of the mixing zone which is difficult to measure in experimental methods that are based on visual observations. Glass beads of different grain sizes were tested for both steady-state and transient conditions. The transient results showed good correlation between experimental and numerical intrusion rates. The experimental intrusion rates revealed that the saltwater wedge reached a steady state condition sooner while receding than advancing. The hydrodynamics of the experimental mixing zone exhibited similar
traits; a greater increase in the width of the mixing zone was observed in the receding saltwater wedge, which indicates faster fluid velocities and higher dispersion. The angle of intrusion analysis revealed the formation of a volume of diluted saltwater at the toe position when the saltwater wedge is prompted to recede. In addition, results of different physical repeats of the experiment produced an average coefficient of variation less than 0.18 of the measured toe length and width of the mixing zone.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this thesis, different techniques for image analysis of high density microarrays have been investigated. Most of the existing image analysis techniques require prior knowledge of image specific parameters and direct user intervention for microarray image quantification. The objective of this research work was to develop of a fully automated image analysis method capable of accurately quantifying the intensity information from high density microarrays images. The method should be robust against noise and contaminations that commonly occur in different stages of microarray development.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

SAGA (System for Automated Geographic Analysis) es un SIG libre con capacidades para el manejo y análisis de información tanto vectorial como ráster, con un especial enfoque en esta última. Asimismo, es su enfoque analítico el que constituye su característica más destacable, siendo una herramienta de primer orden para la extracción de información a partir de todo tipo de capas de datos georeferenciados. (...)

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We have combined several key sample preparation steps for the use of a liquid matrix system to provide high analytical sensitivity in automated ultraviolet -- matrix-assisted laser desorption/ionisation -- mass spectrometry (UV-MALDI-MS). This new sample preparation protocol employs a matrix-mixture which is based on the glycerol matrix-mixture described by Sze et al. The low-femtomole sensitivity that is achievable with this new preparation protocol enables proteomic analysis of protein digests comparable to solid-state matrix systems. For automated data acquisition and analysis, the MALDI performance of this liquid matrix surpasses the conventional solid-state MALDI matrices. Besides the inherent general advantages of liquid samples for automated sample preparation and data acquisition the use of the presented liquid matrix significantly reduces the extent of unspecific ion signals in peptide mass fingerprints compared to typically used solid matrices, such as 2,5-dihydroxybenzoic acid (DHB) or alpha-cyano-hydroxycinnamic acid (CHCA). In particular, matrix and low-mass ion signals and ion signals resulting from cation adduct formation are dramatically reduced. Consequently, the confidence level of protein identification by peptide mass mapping of in-solution and in-gel digests is generally higher.