Digital Library

879 results for Gender classification model

Expressive power of binary relevance and chain classifiers based on Bayesian Networks for multi-label classification

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Bayesian network classifiers are widely used in machine learning because they intuitively represent causal relations. Multi-label classification problems require each instance to be assigned a subset of a defined set of h labels. This problem is equivalent to finding a multi-valued decision function that predicts a vector of h binary classes. In this paper we obtain the decision boundaries of two widely used Bayesian network approaches for building multi-label classifiers: Multi-label Bayesian network classifiers built using the binary relevance method and Bayesian network chain classifiers. We extend our previous single-label results to multi-label chain classifiers, and we prove that, as expected, chain classifiers provide a more expressive model than the binary relevance method.
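The expressiveness gap between binary relevance and chain classifiers can be illustrated with a toy sketch. It does not reproduce the paper's Bayesian network analysis: a simple perceptron stands in as the base classifier (an assumption made only for illustration), and the two labels (AND and XOR of two binary features) are hypothetical. Binary relevance predicts each label from the features alone, while the chain also feeds the first label into the second classifier.

```python
from itertools import product


def train_perceptron(rows, targets, epochs=1000):
    """Train a single threshold unit; returns (weights, bias)."""
    w = [0.0] * len(rows[0])
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for x, t in zip(rows, targets):
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            if pred != t:
                step = t - pred  # +1 or -1
                w = [wi + step * xi for wi, xi in zip(w, x)]
                b += step
                errors += 1
        if errors == 0:  # converged on the training set
            break
    return w, b


def predict(model, x):
    w, b = model
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0


# Hypothetical toy problem: two binary features, two labels.
X = [list(p) for p in product([0, 1], repeat=2)]
Y1 = [x1 & x2 for x1, x2 in X]  # label 1: AND (linearly separable)
Y2 = [x1 ^ x2 for x1, x2 in X]  # label 2: XOR (not linearly separable in X)

# Binary relevance: label 2 is predicted from the features alone.
br_y2 = train_perceptron(X, Y2)
br_acc = sum(predict(br_y2, x) == t for x, t in zip(X, Y2)) / len(X)

# Chain: label 2 is predicted from the features plus label 1.
chain_y1 = train_perceptron(X, Y1)
chain_y2 = train_perceptron([x + [y1] for x, y1 in zip(X, Y1)], Y2)
chain_pred = [predict(chain_y2, x + [predict(chain_y1, x)]) for x in X]
chain_acc = sum(p == t for p, t in zip(chain_pred, Y2)) / len(X)

print("binary relevance accuracy on label 2:", br_acc)
print("chain accuracy on label 2:", chain_acc)
```

With a linear base classifier, the chain can fit the second label because XOR becomes linearly separable once AND is available as an extra input; binary relevance cannot.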

DAEDALUS at PAN 2014: Guessing tweet author's gender and age

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This paper describes our participation in the PAN 2014 author profiling task. Our idea was to define, develop and evaluate a simple machine learning classifier able to guess the gender and the age of a given user based on his/her texts, which could become part of the solution portfolio of the company. We were interested not in finding the best possible classifier that achieves the highest accuracy, but in finding the optimum balance between performance and throughput using the simplest strategy and the least dependence on external systems. Results show that our software, using Multinomial Naive Bayes with a term vector model representation of the text, ranks quite well among the participants in terms of accuracy.
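As a rough illustration of a multinomial Naive Bayes classifier over a term vector representation, the sketch below uses only the Python standard library. It is not the authors' software: the whitespace tokenizer, the Laplace smoothing parameter `alpha`, and the toy documents and labels are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def train_mnb(docs, labels, alpha=1.0):
    """Collect per-class term counts (the term vector model) and class priors."""
    term_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()  # assumed tokenizer: whitespace split
        term_counts[label].update(tokens)
        vocab.update(tokens)
    return {
        "term_counts": term_counts,
        "class_counts": Counter(labels),
        "vocab": vocab,
        "alpha": alpha,
        "n_docs": len(docs),
    }


def predict_mnb(model, text):
    """Return the class with the highest log posterior for the given text."""
    tokens = [t for t in text.lower().split() if t in model["vocab"]]
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, n_class in model["class_counts"].items():
        counts = model["term_counts"][label]
        total = sum(counts.values())
        score = math.log(n_class / model["n_docs"])  # log prior
        for tok in tokens:
            # Laplace-smoothed log likelihood of each known term
            score += math.log((counts[tok] + model["alpha"])
                              / (total + model["alpha"] * vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy corpus; the class names are arbitrary placeholders.
docs = ["goal match team win", "team coach match",
        "recipe oven bake", "bake flour recipe sugar"]
labels = ["class_a", "class_a", "class_b", "class_b"]

model = train_mnb(docs, labels)
print(predict_mnb(model, "match team"))
print(predict_mnb(model, "bake sugar"))
```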

Multi-dimensional classification of GABAergic interneurons with Bayesian network-modeled label uncertainty

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Interneuron classification is an important and long-debated topic in neuroscience. A recent study provided a data set of digitally reconstructed interneurons classified by 42 leading neuroscientists according to a pragmatic classification scheme composed of five categorical variables, namely, of the interneuron type and four features of axonal morphology. From this data set we now learned a model which can classify interneurons, on the basis of their axonal morphometric parameters, into these five descriptive variables simultaneously. Because of differences in opinion among the neuroscientists, especially regarding neuronal type, for many interneurons we lacked a unique, agreed-upon classification, which we could use to guide model learning. Instead, we guided model learning with a probability distribution over the neuronal type and the axonal features, obtained, for each interneuron, from the neuroscientists’ classification choices. We conveniently encoded such probability distributions with Bayesian networks, calling them label Bayesian networks (LBNs), and developed a method to predict them. This method predicts an LBN by forming a probabilistic consensus among the LBNs of the interneurons most similar to the one being classified. We used 18 axonal morphometric parameters as predictor variables, 13 of which we introduce in this paper as quantitative counterparts to the categorical axonal features. We were able to accurately predict interneuronal LBNs. Furthermore, when extracting crisp (i.e., non-probabilistic) predictions from the predicted LBNs, our method outperformed related work on interneuron classification. Our results indicate that our method is adequate for multi-dimensional classification of interneurons with probabilistic labels. Moreover, the introduced morphometric parameters are good predictors of interneuron type and the four features of axonal morphology and thus may serve as objective counterparts to the subjective, categorical axonal features.
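The idea of forming a probabilistic consensus among the most similar instances can be sketched in a heavily simplified form. The abstract's LBNs are Bayesian networks over five variables; here each instance carries only a single categorical distribution, the consensus is a plain average, and similarity is Euclidean distance. All three simplifications, along with the function name and the toy data, are assumptions for illustration.

```python
import math


def knn_consensus(query, points, label_dists, k=3):
    """Average the label distributions of the k points nearest to the query."""
    ranked = sorted(range(len(points)),
                    key=lambda i: math.dist(query, points[i]))
    nearest = ranked[:k]
    labels = label_dists[0].keys()
    return {lab: sum(label_dists[i][lab] for i in nearest) / len(nearest)
            for lab in labels}


# Hypothetical feature vectors and expert-derived label distributions.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
label_dists = [
    {"type_a": 0.8, "type_b": 0.2},
    {"type_a": 0.6, "type_b": 0.4},
    {"type_a": 0.1, "type_b": 0.9},
]

consensus = knn_consensus((0.0, 0.4), points, label_dists, k=2)
crisp = max(consensus, key=consensus.get)  # crisp (non-probabilistic) prediction
print(consensus, crisp)
```

Taking the most probable value at the end corresponds to the crisp prediction the abstract mentions.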

Multi-dimensional classification with super-classes

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The multi-dimensional classification problem is a generalisation of the recently-popularised task of multi-label classification, where each data instance is associated with multiple class variables. There has been relatively little research carried out specific to multi-dimensional classification and, although one of the core goals is similar (modelling dependencies among classes), there are important differences; namely a higher number of possible classifications. In this paper we present a method for multi-dimensional classification, drawing from the most relevant multi-label research, and combining it with important novel developments. Using a fast method to model the conditional dependence between class variables, we form super-class partitions and use them to build multi-dimensional learners, learning each super-class as an ordinary class, and thus explicitly modelling class dependencies. Additionally, we present a mechanism to deal with the many class values inherent to super-classes, and thus make learning efficient. To investigate the effectiveness of this approach we carry out an empirical evaluation on a range of multi-dimensional datasets, under different evaluation metrics, and in comparison with high-performing existing multi-dimensional approaches from the literature. Analysis of results shows that our approach offers important performance gains over competing methods, while also exhibiting tractable running time.
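Learning each super-class as an ordinary class can be pictured as a relabelling step: the class variables in one partition group are merged into a single compound class value. The sketch below shows only that step. The partition is supplied by hand, whereas the paper derives it from conditional dependence between class variables; the function name and toy labels are hypothetical.

```python
def to_super_classes(label_rows, partition):
    """Map each row of class values to one compound value per partition group.

    label_rows: list of tuples, one value per class variable.
    partition:  list of tuples of column indices forming each super-class.
    """
    return [tuple(tuple(row[i] for i in group) for group in partition)
            for row in label_rows]


# Hypothetical data set with three class variables per instance.
label_rows = [("a", 0, "x"), ("b", 1, "y"), ("a", 0, "y")]

# Assumed partition: variables 0 and 1 form one super-class, variable 2 another.
partition = [(0, 1), (2,)]

super_rows = to_super_classes(label_rows, partition)
print(super_rows)

# Distinct compound values that an ordinary classifier would have to learn
# for the first super-class.
print(sorted({row[0] for row in super_rows}))
```

The number of distinct compound values can grow quickly with group size, which is the issue the abstract's mechanism for handling many class values addresses.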

Classifying GABAergic interneurons with semi-supervised projected model-based clustering

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Objectives: A recently introduced pragmatic scheme promises to be a useful catalog of interneuron names. We sought to automatically classify digitally reconstructed interneuronal morphologies according to this scheme. Simultaneously, we sought to discover possible subtypes of these types that might emerge during automatic classification (clustering). We also investigated which morphometric properties were most relevant for this classification. Materials and methods: A set of 118 digitally reconstructed interneuronal morphologies classified into the common basket (CB), horse-tail (HT), large basket (LB), and Martinotti (MA) interneuron types by 42 of the world's leading neuroscientists, quantified by five simple morphometric properties of the axon and four of the dendrites. We labeled each neuron with the type most commonly assigned to it by the experts. We then removed this class information for each type separately, and applied semi-supervised clustering to those cells (keeping the others' cluster membership fixed), to assess separation from other types and look for the formation of new groups (subtypes). We performed this same experiment unlabeling the cells of two types at a time, and of half the cells of a single type at a time. The clustering model is a finite mixture of Gaussians which we adapted for the estimation of local (per-cluster) feature relevance. We performed the described experiments on three different subsets of the data, formed according to how many experts agreed on type membership: at least 18 experts (the full data set), at least 21 (73 neurons), and at least 26 (47 neurons). Results: Interneurons with more reliable type labels were classified more accurately. We classified HT cells with 100% accuracy, MA cells with 73% accuracy, and CB and LB cells with 56% and 58% accuracy, respectively. We identified three subtypes of the MA type, one subtype of CB and LB types each, and no subtypes of HT (it was a single, homogeneous type). We obtained maximum (adapted) Silhouette width and ARI values of 1, 0.83, 0.79, and 0.42, when unlabeling the HT, CB, LB, and MA types, respectively, confirming the quality of the formed cluster solutions. The subtypes identified when unlabeling a single type also emerged when unlabeling two types at a time, confirming their validity. Axonal morphometric properties were more relevant than dendritic ones, with the axonal polar histogram length in the [pi, 2pi) angle interval being particularly useful. Conclusions: The applied semi-supervised clustering method can accurately discriminate among CB, HT, LB, and MA interneuron types while discovering potential subtypes, and is therefore useful for neuronal classification. The discovery of potential subtypes suggests that some of these types are more heterogeneous than previously thought. Finally, axonal variables seem to be more relevant than dendritic ones for distinguishing among the CB, HT, LB, and MA interneuron types.

Bayesian network modeling of the consensus between experts: an application to neuron classification

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Neuronal morphology is hugely variable across brain regions and species, and neuronal classification strategies are a matter of intense debate in neuroscience. GABAergic cortical interneurons have been a challenge because it is difficult to find a set of morphological properties which clearly define neuronal types. A group of 48 neuroscience experts around the world were asked to classify a set of 320 cortical GABAergic interneurons according to the main features of their three-dimensional morphological reconstructions. A methodology for building a model which captures the opinions of all the experts was proposed. First, one Bayesian network was learned for each expert, and we proposed an algorithm for clustering Bayesian networks corresponding to experts with similar behaviors. Then, a Bayesian network which represents the opinions of each group of experts was induced. Finally, a consensus Bayesian multinet which models the opinions of the whole group of experts was built. A thorough analysis of the consensus model identified different behaviors between the experts when classifying the interneurons in the experiment. A set of characterizing morphological traits for the neuronal types was defined by performing inference in the Bayesian multinet. These findings were used to validate the model and to gain some insights into neuron morphology.

Autonomous classification models in ubiquitous environments

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The stream-mining approach is defined as a set of cutting-edge techniques designed to process streams of data in real time in order to extract knowledge. In the particular case of classification, stream mining has to adapt its behaviour to volatile underlying data distributions, a phenomenon known as concept drift. Moreover, concept drift may lead to situations where predictive models become invalid and therefore have to be updated to represent the actual concepts that the data pose. In this context, there is a specific type of concept drift, known as recurrent concept drift, where the concepts represented by the data have already appeared in the past. In those cases the learning process could be avoided, or at least minimized, by applying a previously trained model. This could be extremely useful in ubiquitous environments, which are characterized by resource-constrained devices. To deal with this scenario, meta-models can be used to enhance the drift detection mechanisms used by data stream algorithms, by representing and predicting when the change will occur. There are some real-world situations where a concept reappears, as in the case of intrusion detection systems (IDS), where the same incidents, or adaptations of them, usually reappear over time. In these environments the early prediction of drift by means of better knowledge of past models can help to anticipate the change, thus improving the efficiency of the model with regard to the training instances needed. Using meta-models as a recurrent drift detection mechanism opens up the ability to share concept representations among different data mining processes. Such exchanges could improve the accuracy of the resulting local model, as that model may benefit from patterns similar to the local concept that were observed in other scenarios but not yet locally. This would also improve efficiency with respect to the training instances used during the classification process, as the exchange of models would aid in the application of already trained recurrent models that have previously been seen by any of the collaborating devices. That is to say, the scope of recurrence detection and representation is broadened. In fact, the detection, representation and exchange of concept drift patterns would be extremely useful for law enforcement activities fighting cyber crime. Since information exchange is one of the main pillars of cooperation, national units would benefit from the experience and knowledge gained by third parties. Moreover, in the specific scope of critical infrastructure protection it is crucial to have information exchange mechanisms, from both a strategic and a technical perspective. The exchange of concept drift detection schemes in cyber security environments would aid in the process of preventing, detecting and effectively responding to threats in cyberspace. Furthermore, as a complement to meta-models, a mechanism to assess the similarity between classification models is also needed when dealing with recurrent concepts. In this context, when reusing a previously trained model, a rough comparison between concepts is usually made by applying Boolean logic. Introducing fuzzy logic comparisons between models could lead to a more efficient reuse of previously seen concepts, by applying not just equal models but also similar ones. This work addresses these open issues by means of: the MMPRec system, which integrates a meta-model mechanism and a fuzzy similarity function; a collaborative environment to share meta-models between different devices; and a recurrent drift generator that makes it possible to test the usefulness of recurrent drift systems such as MMPRec. Moreover, this thesis presents an experimental validation of the proposed contributions using synthetic and real datasets.

Atlantic salmon (Salmo salar) parr as a model to predict the optimum inclusion of air classified faba bean protein concentrate in feeds for seawater salmon

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Peer reviewed

Estimating the probability for a protein to have a new fold: A statistical computational model

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Structural genomics aims to solve a large number of protein structures that represent the protein space. Currently an exhaustive solution for all structures seems prohibitively expensive, so the challenge is to define a relatively small set of proteins with new, currently unknown folds. This paper presents a method that assigns each protein with a probability of having an unsolved fold. The method makes extensive use of protomap, a sequence-based classification, and scop, a structure-based classification. According to protomap, the protein space encodes the relationship among proteins as a graph whose vertices correspond to 13,354 clusters of proteins. A representative fold for a cluster with at least one solved protein is determined after superposition of all scop (release 1.37) folds onto protomap clusters. Distances within the protomap graph are computed from each representative fold to the neighboring folds. The distribution of these distances is used to create a statistical model for distances among those folds that are already known and those that have yet to be discovered. The distribution of distances for solved/unsolved proteins is significantly different. This difference makes it possible to use Bayes' rule to derive a statistical estimate that any protein has a yet undetermined fold. Proteins that score the highest probability to represent a new fold constitute the target list for structural determination. Our predicted probabilities for unsolved proteins correlate very well with the proportion of new folds among recently solved structures (new scop 1.39 records) that are disjoint from our original training set.
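The step where Bayes' rule turns two distance distributions into a probability of a new fold can be written out as a small sketch. The function and parameter names are hypothetical, and the numeric inputs are placeholders rather than values from the paper: in practice the two likelihoods would come from the fitted distance distributions for unsolved and solved folds, and the prior from an estimate of how common new folds are.

```python
def posterior_new_fold(prior_new, lik_new, lik_known):
    """Bayes' rule for P(new fold | observed graph distance).

    prior_new: prior probability that a protein has a new fold (assumed input).
    lik_new:   density of the observed distance under the "new fold" model.
    lik_known: density of the observed distance under the "known fold" model.
    """
    evidence = lik_new * prior_new + lik_known * (1.0 - prior_new)
    return lik_new * prior_new / evidence


# Placeholder numbers chosen only to exercise the formula.
p = posterior_new_fold(prior_new=0.2, lik_new=0.6, lik_known=0.1)
print(round(p, 3))
```

Ranking proteins by this posterior would give a target list of the kind the abstract describes.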

Innovation and the fashion industry: a model of innovation in styles and trends.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This work aims to propose an innovation model for the women's fashion industry. The model seeks to understand the behavior of styles and trends determined and disseminated by companies. Building this model is justified by the contribution that a study of innovation can make to the fashion industry, which faces low levels of competitiveness in both foreign and domestic markets. Moreover, although there are many articles on the subject, this research found few innovation models for the fashion industry. An assessment of these models indicated that there is room to propose a model that addresses the behavior of styles and trends over time. The structure of the model rests on three conceptual pillars: neo-Schumpeterian economic theory, innovation models, and innovation models for the fashion industry. The central feature of the model is to assess whether there are styles that remain in fashion continuously or discontinuously. Since there is conceptual similarity among styles with respect to gender identity (androgyny and femininity), some styles were grouped under these denominations. Not all styles fit this classification, so those styles were labeled neutral. As the research takes a phenomenological, qualitative and longitudinal approach, the hypothetico-deductive method was adopted to build the model. To check the validity of the hypotheses, an exploratory data analysis was carried out using descriptive statistics and a decomposition of the variability structure through principal component analysis (PCA). Both analyses provided evidence regarding the hypotheses in question, which were also tested with a binomial test and a permutation-based multivariate analysis of variance. The results confirmed that there are styles that remain in fashion continuously and that there are periods of polarization of the style groupings.

Breaking the silence ceiling: how gender violence became into a social problem in Spain through media and politics

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This research presents an explanatory model of the process of reconstruction of the "social problem" of Intimate Partner Violence (I.P.V.) in Spain during the last five years, with special attention to the role of the media in this process. Using a content analysis of the three most widely circulated general newspapers, a content analysis of the minutes of the Parliament, and the statistics on police reports and murders, from January 1997 to December 2001, it examines the relationship between the evolution of the incidence of I.P.V. (measured by the number of deaths and the number of police reports) and the evolution of press stories on this topic. It also studies the interconnection of these two variables with the political response to the problem (measured by the interventions on I.P.V. in the Senate and in the Congress). The data show that, even though police reports have increased due to the contribution of politics and the media, I.P.V. murders keep rising.

A Model to Visualize Information in a Complex Streets’ Network

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This paper discusses a process to graphically view and analyze information obtained from a network of urban streets, using an algorithm that establishes a ranking of importance of the nodes of the network itself. The basis of this process is to quantify the network information by assigning numerical values to each node. These values are used to construct a data matrix that allows us to apply an algorithm that classifies the nodes of the network in order of importance. From this numerical ranking of the nodes, the process finishes with the graphical visualization of the network. An example is shown to illustrate the whole process.

Gender Equality in the Case Law of the European Court of Justice. IES WORKING PAPER 2/2009

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The principle of gender equality forms a part of the EU’s social policy and serves equally men and women. So far, fourteen directives concerning gender equality have been adopted in the EU, with the New Equal Treatment Directive as the latest one. The EU has developed different models to promote gender equality: equal treatment, positive action and most recently gender mainstreaming. The equal treatment model is primarily concerned with formal equality and it unfortunately prevails in the ECJ’s rulings. Indeed, this paper argues that so far, the ECJ has not managed to develop a firm and consistent case law on gender equality, nor to stretch it coherently to positive action and gender mainstreaming. It seems that in spite of some progress in promoting the position of women, the ECJ’s case law has recently taken a step backwards with its conservative judgments in e.g. the Cadman case. Overall, this paper aims at summing up and evaluating the most important cases of the ECJ on gender equality.

The classification of technical equipment states, using a neural nets approach

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The Master's thesis is devoted to developing a model for estimating the technical state of a gas turbine engine (GTE). Approaches to data preparation, especially the handling of unbalanced data, are presented in the thesis. A special metric was chosen to estimate model performance efficiently. The goal of the thesis is to analyze monitoring parameter data and to develop a model for estimating the technical state of the GTE based on these data.
