908 results for CLASSIFICATION AND REGRESSION TREE


Relevance:

100.00%

Publisher:

Abstract:

Research across several countries has shown that degree classification (i.e. the final grade awarded to students successfully completing university) is an important determinant of graduates’ first destination outcomes. Graduates leaving university with higher degree classifications have better employment opportunities and a higher likelihood of continuing education than those with lower degree classifications. This article investigates whether one reason for this result is that employers and higher education institutions use degree classification as a signalling device for recent graduates’ ability. Given the large number of applicants and the time and resources typically required to assess their skills, employers and higher education institutions may decide to rely on this measure when forming beliefs about recent graduates’ abilities. Using data on two cohorts of recent graduates from a UK university, the results suggest that an Upper Second degree classification may have a signalling role.

Relevance:

100.00%

Publisher:

Abstract:

Malware detection is a growing problem, particularly on the Android mobile platform, owing to its increasing popularity and the accessibility of numerous third-party app markets. The problem is made worse by the increasingly sophisticated detection-avoidance techniques employed by emerging malware families, which calls for more effective techniques for detecting and classifying Android malware. Hence, in this paper we present an n-opcode analysis approach that uses machine learning to classify and categorize Android malware. The approach enables automated feature discovery, eliminating the need for expert or domain knowledge to define the required features. Our experiments on 2520 samples, using up to 10-gram opcode features, show that an F-measure of 98% is achievable with this approach.
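As a rough illustration of this kind of pipeline, the sketch below treats each app's opcode trace as a space-separated string and uses word-level n-grams as stand-ins for the n-opcode features; the toy traces, labels, classifier choice, and the reduced n-gram range are all assumptions, not the paper's setup.

```python
# Minimal sketch, assuming toy data: n-gram opcode features + a linear classifier.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

traces = [
    "move const invoke-virtual return-void",         # toy benign trace
    "const-string invoke-static goto goto return",   # toy malicious trace
    "move const invoke-virtual if-eqz return-void",  # toy benign trace
    "const-string goto invoke-static goto return",   # toy malicious trace
]
labels = [0, 1, 0, 1]  # 0 = benign, 1 = malware

# token_pattern=r"\S+" keeps hyphenated opcodes intact; ngram_range=(1, 3)
# keeps the toy example small (the paper evaluates up to 10-gram opcodes).
model = make_pipeline(
    CountVectorizer(analyzer="word", ngram_range=(1, 3), token_pattern=r"\S+"),
    LogisticRegression(max_iter=1000),
)
model.fit(traces, labels)
print(model.predict(["move const invoke-virtual return-void"]))
```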

Relevance:

100.00%

Publisher:

Abstract:

In order to optimize frontal detection in sea surface temperature fields at 4 km resolution, a combined statistical and expert-based approach is applied to test different spatial smoothings of the data prior to the detection process. Fronts are usually detected at 1 km resolution using the histogram-based, single-image edge detection (SIED) algorithm developed by Cayula and Cornillon in 1992, with a standard preliminary smoothing using a median filter and a 3 × 3 pixel kernel. Here, detections are performed in three study regions (off Morocco, the Mozambique Channel, and north-western Australia) and across the Indian Ocean basin using the combination of multiple windows (CMW) method developed by Nieto, Demarcq and McClatchie in 2012, which improves on the original Cayula and Cornillon algorithm. Detections at 4 km and 1 km resolution are compared. Fronts are divided into two intensity classes (“weak” and “strong”) according to their thermal gradient. A preliminary smoothing is applied prior to detection using different convolutions: three types of filters (median, average and Gaussian) combined with four kernel sizes (3 × 3, 5 × 5, 7 × 7, and 9 × 9 pixels) and three detection window sizes (16 × 16, 24 × 24 and 32 × 32 pixels), to test the effect of these smoothing combinations on reducing the background noise of the data and thereby improving the frontal detection. The performance of the combinations on 4 km data is evaluated using two criteria: detection efficiency and front length. We find that the optimal combination of preliminary smoothing parameters for enhancing detection efficiency while preserving front length includes a median filter, a 16 × 16 pixel window size, and a 5 × 5 pixel kernel for strong fronts or a 7 × 7 pixel kernel for weak fronts. Results show an improvement in detection performance (from the largest to the smallest window size) of 71% for strong fronts and 120% for weak fronts. Despite the small window used (16 × 16 pixels), the length of the fronts is preserved relative to that found with 1 km data. This optimal preliminary smoothing and the CMW detection algorithm on 4 km sea surface temperature data are then used to describe the spatial distribution of the monthly frequencies of occurrence of both strong and weak fronts across the Indian Ocean basin. In general, strong fronts are observed in coastal areas whereas weak fronts, with some seasonal exceptions, are mainly located in the open ocean. This study shows that adequate noise reduction through preliminary smoothing of the data considerably improves the frontal detection efficiency as well as the overall quality of the results. Consequently, the use of 4 km data enables frontal detections similar to those from 1 km data (using a standard median 3 × 3 convolution) in terms of detectability, length and location. This method, using 4 km data, is easily applicable to large regions or at the global scale, with far fewer constraints on data manipulation and processing time relative to 1 km data.
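The preliminary smoothing step can be sketched in a few lines; the snippet below applies the three filter types and four kernel sizes discussed above to a synthetic field. The synthetic data and the sigma mapping for the Gaussian filter are assumptions, and the SIED/CMW detection itself is not reproduced.

```python
# Sketch of the preliminary smoothing combinations on a synthetic SST field.
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(0)
sst = rng.normal(20.0, 0.5, size=(256, 256))  # stand-in 4 km SST tile (degC)

smoothed = {}
for k in (3, 5, 7, 9):                        # kernel sizes tested in the study
    smoothed[("median", k)] = ndimage.median_filter(sst, size=k)
    smoothed[("average", k)] = ndimage.uniform_filter(sst, size=k)
    # gaussian_filter takes sigma, not a kernel size; sigma ~ k/6 is a common
    # rule of thumb (an assumption here, not a parameter from the paper).
    smoothed[("gaussian", k)] = ndimage.gaussian_filter(sst, sigma=k / 6.0)

# Rough proxy for noise reduction: high-frequency variance removed by each
# combination before any fronts are detected.
for (name, k), field in sorted(smoothed.items()):
    print(f"{name:8s} {k}x{k}: residual std = {np.std(sst - field):.3f}")
```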

Relevance:

100.00%

Publisher:

Abstract:

Endogenous and environmental variables are fundamental in explaining variations in fish condition. Based on more than 20 yr of fish weight and length data, relative condition indices were computed for anchovy and sardine caught in the Gulf of Lions. Classification and regression trees (CART) were used to identify endogenous factors affecting fish condition and to group years of similar condition. Both species showed a similar annual cycle, with condition being minimal in February and maximal in July. CART identified 3 groups of years in which the fish populations generally showed poor, average or good condition, and within which condition differed between age classes but not according to sex. In particular, during the period of poor condition (mostly recent years), sardines older than 1 yr appeared to be more strongly affected than younger individuals. Time series were analyzed using generalized linear models (GLMs) to examine the effects of oceanographic abiotic (temperature, Western Mediterranean Oscillation [WeMO] and Rhone outflow) and biotic (chlorophyll a and 6 plankton classes) factors on fish condition. The selected models explained 48% and 35% of the variance of anchovy and sardine condition, respectively. Sardine condition was negatively related to temperature but positively related to the WeMO and to mesozooplankton and diatom concentrations. A positive effect of mesozooplankton and Rhone runoff on anchovy condition was detected. These results highlight the importance of increasing temperatures and reduced water mixing in the NW Mediterranean Sea, which affect planktonic productivity and thus fish condition through bottom-up control processes. Changes in plankton quality, quantity and phenology could lead to insufficient or inadequate food supply for both species.
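A minimal sketch of the CART step, assuming synthetic data: the variable coding and the toy condition index below are illustrative, not the study's dataset, but they show how a regression tree groups years and age classes by condition.

```python
# Regression tree on a synthetic relative-condition dataset with the
# endogenous factors named in the abstract (year, age class, sex).
import numpy as np
from sklearn.tree import DecisionTreeRegressor, export_text

rng = np.random.default_rng(0)
n = 2000
year = rng.integers(1992, 2013, n)
age = rng.integers(0, 4, n)   # age class in years
sex = rng.integers(0, 2, n)   # 0 = female, 1 = male
# Toy condition index: worse in recent years, small age effect, no sex effect,
# loosely echoing the reported pattern.
kn = 1.0 - 0.004 * (year - 1992) + 0.01 * age + rng.normal(0, 0.03, n)

tree = DecisionTreeRegressor(max_depth=3, min_samples_leaf=50)
tree.fit(np.column_stack([year, age, sex]), kn)
print(export_text(tree, feature_names=["year", "age", "sex"]))
```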

Relevance:

100.00%

Publisher:

Abstract:

Blastic plasmacytoid dendritic cell neoplasm (BPDCN) is a rare subtype of leukemia/lymphoma whose diagnosis can be difficult to achieve due to its clinical and biological heterogeneity, as well as its overlapping features with other hematologic malignancies. In this study we investigated the association between the maturational stage of tumor cells and the clinico-biological and prognostic features of the disease, based on the analysis of 46 BPDCN cases classified into three maturation-associated subgroups on immunophenotypic grounds. Our results show that blasts from cases with an immature plasmacytoid dendritic cell (pDC) phenotype exhibit an uncommon CD56- phenotype, coexisting with CD34+ non-pDC tumor cells, typically in the absence of extramedullary (e.g. skin) disease at presentation. Conversely, patients with a more mature blast cell phenotype more frequently displayed skin/extramedullary involvement and spread into secondary lymphoid tissues. Despite the dismal outcome, acute lymphoblastic leukemia-type therapy (with central nervous system prophylaxis) and/or allogeneic stem cell transplantation appeared to be the only effective therapies. Overall, our findings indicate that the maturational profile of pDC blasts in BPDCN is highly heterogeneous and translates into a wide clinical spectrum (from acute leukemia to mature lymphoma-like behavior), which may also lead to variable diagnosis and treatment.

Relevance:

100.00%

Publisher:

Abstract:

Understanding the mode-locked response of excitable systems to periodic forcing has important applications in neuroscience. For example, it is known that spatially extended place cells in the hippocampus are driven by the theta rhythm to generate a code conveying information about spatial location. Thus it is important to explore the role of neuronal dendrites in generating the response to periodic current injection. In this paper we pursue this using a compartmental model, with linear dynamics for each compartment, coupled to an active soma model that generates action potentials. By working with the piecewise-linear McKean model for the soma we show how the response of the whole neuron model (soma and dendrites) can be written in closed form. We exploit this to construct a stroboscopic map describing the response of the spatially extended model to periodic forcing. A linear stability analysis of this map, together with a careful treatment of the non-differentiability of the soma model, allows us to construct the Arnol'd tongue structure for 1:q states (one action potential for q cycles of forcing). Importantly, we show how the presence of quasi-active membrane in the dendrites can influence the shape of the tongues. Direct numerical simulations confirm our theory and further indicate that resonant dendritic membrane can enlarge the windows in parameter space for chaotic behavior. These simulations also show that the spatially extended neuron model responds differently to global as opposed to point forcing. In the former case, spatio-temporal patterns of activity within an Arnol'd tongue are standing waves, whilst in the latter they are traveling waves.
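For context, one common form of the McKean caricature (a piecewise-linear variant of the FitzHugh-Nagumo equations; the paper's exact parametrisation is not reproduced here, so the form below is an assumption) is

```latex
% McKean caricature of FitzHugh-Nagumo (one common form; parameters assumed)
\dot{v} = f(v) - w + I(t), \qquad \dot{w} = \epsilon\,(v - \gamma w), \qquad
f(v) =
\begin{cases}
  -v,    & v < a/2, \\
  v - a, & a/2 \le v \le (1+a)/2, \\
  1 - v, & v > (1+a)/2.
\end{cases}
```

Because every branch of $f$ is linear, trajectories between switching events solve a linear system, which is what makes the closed-form response and the explicit stroboscopic map mentioned in the abstract possible.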

Relevance:

100.00%

Publisher:

Abstract:

Master's in Actuarial Science (Mestrado em Ciências Actuariais)

Relevance:

100.00%

Publisher:

Abstract:

Araucaria angustifolia is a native species with potential for Brazilian silviculture. However, a number of technical challenges and limitations still persist and hinder its silvicultural expansion, most notably the lack of technologies for cloning superior genetic material and for evaluating it under field conditions. The objective of this study was therefore to assess the potential of araucaria seedlings produced from cuttings and from seeds for timber production, by evaluating survival and growth in the field. Clones from male and female mother trees, obtained from different types of cuttings, and seed-derived seedlings were planted at a 3 x 3 m spacing. The experiment followed a completely randomized design with three treatments and single-tree plots (one tree plot). Female clones from cuttings containing the apex showed the greatest growth in diameter at breast height (6.4 cm) and total height (3.6 m) at 74 months after planting, followed by seed-derived seedlings and the remaining clones, which performed similarly. We conclude that cutting propagation is a promising technique for producing araucaria seedlings for timber purposes and that it is favored by the use of cuttings taken from female mother trees and containing the apex.

Relevance:

100.00%

Publisher:

Abstract:

In knowledge technology work, as expressed by the scope of this conference, there are a number of communities, each uncovering new methods, theories, and practices. The Library and Information Science (LIS) community is one such community. This community, through tradition and innovation, theory and practice, organizes knowledge and develops knowledge technologies formed by iterative research hewn to the values of equal access and discovery for all. The Information Modeling community is another contributor to knowledge technologies. It concerns itself with the construction of symbolic models that capture the meaning of information and organize it in ways that are computer-based but human-understandable. A recent paper that examines certain assumptions in information modeling builds a bridge between these two communities, offering a forum for a discussion of common aims from a common perspective. In a June 2000 article, Parsons and Wand separate classes from instances in information modeling in order to free instances from what they call the “tyranny” of classes. They attribute a number of problems in information modeling to inherent classification, that is, to the disregard of the fact that instances can be conceptualized independently of any class assignment. By faceting instances from classes, Parsons and Wand strike a sonorous chord with classification theory as understood in LIS. In the practice community and in the publications of LIS, faceted classification has shifted the paradigm of knowledge organization theory in the twentieth century. Here, with the proposal of inherent classification and the resulting layered information modeling, a clear line joins the LIS classification theory community and the information modeling community. Both communities have their eyes turned toward networked resource discovery, and with this conceptual conjunction a new paradigmatic conversation can take place. Parsons and Wand propose that the layered information model can facilitate schema integration, schema evolution, and interoperability. These three spheres of information modeling have their own connotations, but they are not distant from the aims of classification research in LIS. In this new conceptual conjunction, established by Parsons and Wand, information modeling through the layered information model can expand the horizons of classification theory beyond LIS, promoting a cross-fertilization of ideas on the interoperability of subject access tools such as classification schemes, thesauri, taxonomies, and ontologies. This paper examines the common ground between the layered information model and faceted classification, establishing a vocabulary and outlining some common principles. It then turns to the issue of schema, the horizons of conventional classification, and the differences between Information Modeling and Library and Information Science. Finally, a framework is proposed that deploys an interpretation of the layered information modeling approach in a knowledge technologies context. In order to design subject access systems that will integrate, evolve and interoperate in a networked environment, knowledge organization specialists must consider a semantic class independence of the kind Parsons and Wand propose for information modeling.

Relevance:

100.00%

Publisher:

Abstract:

This paper outlines three information organization frameworks: library classification, social tagging, and boundary infrastructures. It then outlines the functionality of these frameworks. Taking a neo-pragmatic approach, the paper finds that these frameworks are complementary, and that by understanding the differences and similarities between them, researchers and developers can begin to craft a vocabulary of evaluation.

Relevance:

100.00%

Publisher:

Abstract:

Deep learning methods are extremely promising machine learning tools for analyzing neuroimaging data. However, their potential use in clinical settings is limited by the existing challenges of applying these methods to neuroimaging data. In this study, first, a type of data leakage caused by slice-level data splitting during training and validation of a 2D CNN is surveyed, and a quantitative assessment of the resulting overestimation of model performance is presented. Second, interpretable, leakage-free deep learning software, written in Python with a wide range of options, has been developed to conduct both classification and regression analysis. The software was applied to the study of mild cognitive impairment (MCI) in patients with small vessel disease (SVD) using multi-parametric MRI data, where the cognitive performance of 58 patients, measured by five neuropsychological tests, was predicted using a multi-input CNN model taking brain images and demographic data. Each of the cognitive test scores was predicted using different MRI-derived features. As MCI due to SVD has been hypothesized to be the effect of white matter damage, the DTI-derived features MD and FA produced the best prediction outcome for the TMT-A score, which is consistent with the existing literature. In a second study, an interpretable deep learning system was developed, aimed at (1) classifying Alzheimer's disease patients and healthy subjects, (2) examining the neural correlates of the disease that cause cognitive decline in AD patients using CNN visualization tools, and (3) highlighting the potential of interpretability techniques to detect a biased deep learning model. Structural magnetic resonance imaging (MRI) data from 200 subjects were used by the proposed CNN model, which was trained using a transfer-learning-based approach and produced a balanced accuracy of 71.6%. Brain regions in the frontal and parietal lobes showing cerebral cortex atrophy were highlighted by the visualization tools.
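The slice-level leakage described above arises when slices of the same subject end up in both the training and validation sets. Below is a minimal sketch of the subject-level split that avoids it, assuming synthetic data and scikit-learn; the thesis software itself is not reproduced here.

```python
# Subject-level split: all slices from one subject land in the same fold.
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

rng = np.random.default_rng(0)
n_subjects, slices_per_subject = 20, 30
subject_ids = np.repeat(np.arange(n_subjects), slices_per_subject)
X = rng.random((len(subject_ids), 64, 64))  # stand-in 2D slices
y = np.repeat(rng.integers(0, 2, n_subjects), slices_per_subject)

# A naive slice-level split would let slices of the same subject appear in
# both sets (the leakage); grouping by subject id prevents that.
splitter = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=0)
train_idx, test_idx = next(splitter.split(X, y, groups=subject_ids))
assert set(subject_ids[train_idx]).isdisjoint(subject_ids[test_idx])
print(f"train subjects: {len(set(subject_ids[train_idx]))}, "
      f"test subjects: {len(set(subject_ids[test_idx]))}")
```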

Relevance:

100.00%

Publisher:

Abstract:

The dissertation addresses the still unsolved challenges of source-based digital 3D reconstruction, visualisation and documentation in the domains of archaeology and art and architecture history. The emerging BIM methodology and the IFC data exchange format are changing the way collaboration, visualisation and documentation happen in the planning, construction and facility management process. The introduction and development of the Semantic Web (Web 3.0), spreading the idea of structured, formalised and linked data, offers semantically enriched human- and machine-readable data. In contrast to civil engineering and cultural heritage, academic object-oriented disciplines such as archaeology and art and architecture history are acting as outside spectators. Since the 1990s, it has been argued that a 3D model is not likely to be considered a scientific reconstruction unless it is grounded in accurate documentation and visualisation. However, these standards are still missing and the validation of the outcomes is not fulfilled. Meanwhile, the digital research data remain ephemeral and continue to fill the growing digital cemeteries. This study therefore focuses on the evaluation of source-based digital 3D reconstructions and, especially, on uncertainty assessment in the case of hypothetical reconstructions of destroyed or never-built artefacts according to scientific principles, making the models shareable and reusable by a potentially wide audience. The work initially focuses on terminology and on the definition of a workflow, especially as related to the classification and visualisation of uncertainty. The workflow is then applied to specific cases of 3D models uploaded to the DFG repository of the AI Mainz. In this way, the available methods of documenting, visualising and communicating uncertainty are analysed. In the end, this process leads to a validation or a correction of the workflow and the initial assumptions, but also (when dealing with different hypotheses) to a better definition of the levels of uncertainty.

Relevance:

100.00%

Publisher:

Abstract:

Spectral sensors are a wide class of devices that are extremely useful for detecting essential information about the environment and materials with a high degree of selectivity. Recently, they have achieved high levels of integration and low implementation cost, making them suitable for fast, small, and non-invasive monitoring systems. However, the useful information is hidden in the spectra and is difficult to decode, so mathematical algorithms are needed to infer the values of the variables of interest from the acquired data. Among the different families of predictive modeling, Principal Component Analysis and the techniques stemming from it provide very good performance with small computational and memory requirements, which allows the prediction to be implemented even in embedded and autonomous devices. In this thesis, I present 4 practical applications of these algorithms to the prediction of different variables: moisture of soil, moisture of concrete, freshness of anchovies/sardines, and concentration of gases. In all of these cases, the workflow is the same. Initially, an acquisition campaign was performed to acquire both spectra and the variables of interest from samples. These data were then used as input for the creation of the prediction models, to solve both classification and regression problems. From these models, an array of calibration coefficients is derived and used to implement the prediction in an embedded system. The results show that this workflow was successfully applied to very different scientific fields, yielding autonomous and non-invasive devices able to predict the value of physical parameters of choice from new spectral acquisitions.
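A hedged sketch of this workflow using principal component regression on synthetic spectra; the data, component count, and target variable are assumptions. It also shows how the two fitted stages collapse into the single array of calibration coefficients mentioned above.

```python
# Principal component regression on synthetic spectra, then reduction of the
# fitted pipeline to one calibration-coefficient array per wavelength.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(1)
spectra = rng.random((100, 256))  # 100 acquisitions, 256 wavelengths (toy)
moisture = spectra[:, 40] * 3 + rng.normal(0, 0.05, 100)  # toy target

model = make_pipeline(PCA(n_components=5), LinearRegression())
model.fit(spectra, moisture)

# Folding PCA and regression into a single linear map is what makes the
# prediction cheap enough for an embedded device.
pca, reg = model.named_steps["pca"], model.named_steps["linearregression"]
calib = pca.components_.T @ reg.coef_        # shape: (256,)
offset = reg.intercept_ - pca.mean_ @ calib
pred_embedded = spectra @ calib + offset     # equals model.predict(spectra)
assert np.allclose(pred_embedded, model.predict(spectra))
```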

Relevance:

100.00%

Publisher:

Abstract:

The workflow was as follows. Preliminary phase: identification of 18 formalin-fixed, paraffin-embedded (FFPE) samples from 9 patients (9 matched AK lesions and 9 SCC lesions). Working on the biopsy samples, we performed RNA extraction and analysis with droplet digital PCR (ddPCR), followed by data analysis. Second and final phase: evaluation of an additional 39 subjects (36 men and 3 women). Results: We evaluated and compared the following miRNAs: miR-320 (involved in apoptosis and cell proliferation control), miR-204 (involved in cell proliferation) and miR-16-5p (involved in apoptosis). Conclusion: Our data suggest that there is no significant variation in the expression of the three tested microRNAs between adjacent AK lesions and squamous-cell carcinoma, although a relevant trend was observed. Furthermore, evaluating the miRNA expression trend between keratosis and carcinoma in the same patient shows no "uniform trend": in some samples expression rises in the transition from AK to SCC, and vice versa.

Relevance:

100.00%

Publisher:

Abstract:

This study analyzes data from the 2004 entrance examination (vestibular) of the Universidade Federal de Minas Gerais using a non-parametric regression model, the Classification and Regression Tree. Its goal was to identify the main factors behind approval and to verify whether these factors were the same for daytime and evening courses. Answering these questions would make it possible to assess whether the university's expansion of evening courses was promoting greater social inclusion. It was observed that, in general, completing secondary education at federal public or private schools, knowledge of a foreign language, and belonging to a high socioeconomic group are factors strongly associated with a candidate's approval. It was also found that socioeconomic variables carry more weight in evening courses, whereas in daytime courses the candidate's educational background matters more. Finally, the average socioeconomic factor tends to be higher among approved candidates.
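A minimal sketch of this kind of CART analysis on synthetic admission data; the variable names, coding, and toy approval rule below are assumptions that loosely echo the reported findings, not the UFMG dataset.

```python
# Classification tree on synthetic admission outcomes.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(42)
n = 1000
school_type = rng.integers(0, 3, n)   # 0 = state, 1 = federal, 2 = private
foreign_lang = rng.integers(0, 2, n)  # knows a foreign language
ses = rng.integers(0, 3, n)           # socioeconomic group: low/mid/high
evening = rng.integers(0, 2, n)       # evening-course applicant

# Toy approval rule loosely echoing the reported factors.
approved = ((school_type > 0).astype(int) + foreign_lang
            + (ses == 2).astype(int)) >= 2

X = np.column_stack([school_type, foreign_lang, ses, evening])
tree = DecisionTreeClassifier(max_depth=3).fit(X, approved)
print(export_text(tree,
                  feature_names=["school_type", "foreign_lang", "ses", "evening"]))
```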