Digital Library

270 results for tree size classes

Pyktree: a K-tree implementation in Python

Relevance:

20.00%

Publisher:

Abstract:

In this paper we present pyktree, an implementation of the K-tree algorithm in the Python programming language. The K-tree algorithm provides highly balanced search trees for vector quantization that scale up to very large data sets. Pyktree is highly modular and well suited to rapid prototyping of novel distance measures and centroid representations. It is easy to install and provides a Python package for library use as well as command-line tools.

Sustainable assessment for large science classes: non-multiple choice, randomised assignments through a Learning Management System

Relevance:

20.00%

Publisher:

Abstract:

This paper reports on the development of a tool that generates randomised, non-multiple choice assessment within the BlackBoard Learning Management System interface. An accepted weakness of multiple-choice assessment is that it cannot elicit learning outcomes from upper levels of Biggs’ SOLO taxonomy. However, written assessment items require extensive resources for marking, and are susceptible to copying as well as marking inconsistencies for large classes. This project developed an assessment tool which is valid, reliable and sustainable and that addresses the issues identified above. The tool provides each student with an assignment assessing the same learning outcomes, but containing different questions, with responses in the form of words or numbers. Practice questions are available, enabling students to obtain feedback on their approach before submitting their assignment. Thus, the tool incorporates automatic marking (essential for large classes), randomised tasks for each student (reducing copying), the capacity to give credit for working (feedback on the application of theory), and the capacity to target higher order learning outcomes by requiring students to derive their answers rather than choosing them. Results and feedback from students are presented, along with technical implementation details.
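The combination of per-student randomisation and automatic marking described in this abstract can be illustrated with a minimal sketch. It is not the tool's implementation and does not use any Blackboard API: the function names, the mass-to-moles question, the parameter ranges and the 1% marking tolerance are all hypothetical choices made for illustration.

```python
import hashlib
import math
import random


def student_rng(student_id: str, assignment_id: str) -> random.Random:
    """Derive a reproducible random generator from student and assignment IDs."""
    digest = hashlib.sha256(f"{assignment_id}:{student_id}".encode()).digest()
    return random.Random(int.from_bytes(digest[:8], "big"))


def make_question(student_id: str, assignment_id: str = "demo-1") -> dict:
    """Same learning outcome for every student (mass to moles), different numbers."""
    rng = student_rng(student_id, assignment_id)
    mass_g = rng.randint(10, 99)                 # hypothetical range, grams
    molar_mass = rng.choice([18.0, 44.0, 58.5])  # illustrative values, g/mol
    return {
        "text": (
            f"How many moles are in {mass_g} g of a compound "
            f"with molar mass {molar_mass} g/mol?"
        ),
        "answer": mass_g / molar_mass,
    }


def mark(response: float, answer: float, rel_tol: float = 0.01) -> bool:
    """Automatic marking: accept a numeric response within a relative tolerance."""
    return math.isclose(response, answer, rel_tol=rel_tol)


if __name__ == "__main__":
    question = make_question("s1234567")
    print(question["text"])
    print(mark(question["answer"], question["answer"]))
```

Seeding the generator from the student and assignment identifiers means the same student always sees the same question, so a submission can be re-marked later without storing the generated numbers.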

Multiple evolutionary rate classes in animal genome evolution

Relevance:

20.00%

Publisher:

Abstract:

The proportion of functional sequence in the human genome is currently a subject of debate. The most widely accepted figure is that approximately 5% is under purifying selection. In Drosophila, estimates are an order of magnitude higher, though this corresponds to a similar quantity of sequence. These estimates depend on the difference between the distribution of genomewide evolutionary rates and that observed in a subset of sequences presumed to be neutrally evolving. Motivated by the widening gap between these estimates and experimental evidence of genome function, especially in mammals, we developed a sensitive technique for evaluating such distributions and found that they are much more complex than previously apparent. We found strong evidence for at least nine well-resolved evolutionary rate classes in an alignment of four Drosophila species and at least seven classes in an alignment of four mammals, including human. We also identified at least three rate classes in human ancestral repeats. By positing that the largest of these ancestral repeat classes is neutrally evolving, we estimate that the proportion of nonneutrally evolving sequence is 30% of human ancestral repeats and 45% of the aligned portion of the genome. However, we also question whether any of the classes represent neutrally evolving sequences and argue that a plausible alternative is that they reflect variable structure-function constraints operating throughout the genomes of complex organisms.

Prediction of protein structural classes by recurrence quantification analysis based on chaos game representation

Relevance:

20.00%

Publisher:

Synthesis And Characterisation of Titania Nanotubes: Effect of Phase And Crystallite Size on Nanotube Formation

Relevance:

20.00%

Publisher:

Preparation of Mesoporous Cadmium Sulfide Nanoparticles with Moderate Pore Size

Relevance:

20.00%

Publisher:

Whole-proteome phylogeny of large dsDNA viruses and parvoviruses through a composition vector method related to dynamical language model

Relevance:

20.00%

Publisher:

Abstract:

Background: The vast sequence divergence among different virus groups has presented a great challenge to alignment-based analysis of virus phylogeny. Due to the problems caused by the uncertainty in alignment, existing tools for phylogenetic analysis based on multiple alignment could not be directly applied to the whole-genome comparison and phylogenomic studies of viruses. There has been a growing interest in alignment-free methods for phylogenetic analysis using complete genome data. Among the alignment-free methods, a dynamical language (DL) method proposed by our group has successfully been applied to the phylogenetic analysis of bacteria and chloroplast genomes. Results: In this paper, the DL method is used to analyze the whole-proteome phylogeny of 124 large dsDNA viruses and 30 parvoviruses, two data sets with a large difference in genome size. The trees from our analyses are in good agreement with the latest classification of large dsDNA viruses and parvoviruses by the International Committee on Taxonomy of Viruses (ICTV). Conclusions: The present method provides a new way of recovering the phylogeny of large dsDNA viruses and parvoviruses, and also some insights into the affiliation of a number of unclassified viruses. In comparison, some alignment-free methods such as the CV Tree method can be used for recovering the phylogeny of large dsDNA viruses, but they are not suitable for resolving the phylogeny of parvoviruses with a much smaller genome size.

Mesoporous structure with size controllable anatase attached on silicate layers for efficient photocatalysis

Relevance:

20.00%

Publisher:

Size-controllable synthesis of chromium oxyhydroxide nanomaterials using a soft chemical hydrothermal route

Relevance:

20.00%

Publisher:

Thermal performance optimization of building aspect ratio and south window size in five cities having different climatic characteristics of Turkey

Relevance:

20.00%

Publisher:

Abstract:

The aim of the study is to establish optimum building aspect ratios and south window sizes of residential buildings from a thermal performance point of view. The effects of six different building aspect ratios and eight different south window sizes for each building aspect ratio are analyzed for apartments located on intermediate floors of buildings, with the aid of the computer-based thermal analysis program SUNCODE-PC, in five cities of Turkey: Erzurum, Ankara, Diyarbakir, Izmir, and Antalya. The results are evaluated in terms of annual energy consumption and the optimum values are derived. The optimum values and the total energy consumption rates are compared among the analyzed cities.

The sample complexity of pattern classification with neural networks: the size of the weights is more important than the size of the network

Relevance:

20.00%

Publisher:

Abstract:

Sample complexity results from computational learning theory, when applied to neural network learning for pattern classification problems, suggest that for good generalization performance the number of training examples should grow at least linearly with the number of adjustable parameters in the network. Results in this paper show that if a large neural network is used for a pattern classification problem and the learning algorithm finds a network with small weights that has small squared error on the training patterns, then the generalization performance depends on the size of the weights rather than the number of weights. For example, consider a two-layer feedforward network of sigmoid units, in which the sum of the magnitudes of the weights associated with each unit is bounded by $A$ and the input dimension is $n$. We show that the misclassification probability is no more than a certain error estimate (that is related to squared error on the training set) plus $A^3 \sqrt{(\log n)/m}$ (ignoring $\log A$ and $\log m$ factors), where $m$ is the number of training patterns. This may explain the generalization performance of neural networks, particularly when the number of training examples is considerably smaller than the number of weights. It also supports heuristics (such as weight decay and early stopping) that attempt to keep the weights small during training. The proof techniques appear to be useful for the analysis of other pattern classifiers: when the input domain is a totally bounded metric space, we use the same approach to give upper bounds on misclassification probability for classifiers with decision boundaries that are far from the training examples.
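To make the scaling of the weight-size term concrete, the following sketch evaluates A^3 * sqrt(log(n) / m) for a few sample sizes. The abstract gives this term only up to constants and log factors, so the numbers illustrate how the term scales and are not an actual guarantee. The function name and the example values of A, n and m are assumptions chosen for illustration.

```python
import math


def weight_size_term(A: float, n: int, m: int) -> float:
    """Return A^3 * sqrt(log(n) / m), with constants and log A, log m factors ignored.

    A: bound on the sum of weight magnitudes per unit
    n: input dimension
    m: number of training patterns
    """
    if A <= 0 or n < 2 or m < 1:
        raise ValueError("expected A > 0, n >= 2 and m >= 1")
    return A ** 3 * math.sqrt(math.log(n) / m)


if __name__ == "__main__":
    # The number of weights in the network does not appear anywhere in the term.
    for m in (1_000, 100_000, 10_000_000):
        print(m, weight_size_term(A=2.0, n=100, m=m))
```

Because the term grows with the cube of A but only with the square root of log n, halving the weight bound reduces it by a factor of eight, which fits the abstract's remark that heuristics keeping the weights small are supported by the result.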

Identification of area-level influences on regions of high cancer incidence in Queensland, Australia: a classification tree approach

Relevance:

20.00%

Publisher:

Abstract:

Background: Strategies for cancer reduction and management are targeted at both individual and area levels. Area-level strategies require careful understanding of geographic differences in cancer incidence, in particular the association with factors such as socioeconomic status, ethnicity and accessibility. This study aimed to identify the complex interplay of area-level factors associated with high area-specific incidence of Australian priority cancers using a classification and regression tree (CART) approach. Methods: Area-specific smoothed standardised incidence ratios were estimated for priority-area cancers across 478 statistical local areas in Queensland, Australia (1998-2007, n=186,075). For those cancers with significant spatial variation, CART models were used to identify whether area-level accessibility, socioeconomic status and ethnicity were associated with high area-specific incidence. Results: The accessibility of a person’s residence had the most consistent association with the risk of cancer diagnosis across the specific cancers. Many cancers were likely to have high incidence in more urban areas, although male lung cancer and cervical cancer tended to have high incidence in more remote areas. The impact of socioeconomic status and ethnicity on these associations differed by type of cancer. Conclusions: These results highlight the complex interactions between accessibility, socioeconomic status and ethnicity in determining cancer incidence risk.

Quantifying the effect size of walking interventions to reduce the risk of coronary heart disease: meta-analysis

Relevance:

20.00%

Publisher:

Monte Carlo modelling of a-Si EPID response: the effect of spectral variations with field size and position

Relevance:

20.00%

Publisher:

Differential expression of focimatrix and steroidogenic enzymes before size deviation during waves of follicular development in bovine ovarian follicles

Relevance:

20.00%

Publisher:
