877 results for document clustering


Relevance:

20.00% 20.00%

Publisher:

Abstract:

Defining digital humanities might be an endless debate if we stick to discussing the boundaries of this concept as an academic "discipline". In an attempt to concretely identify this field and its actors, this paper shows that it is possible to analyse them through Twitter, a social media platform widely used by this "community of practice". Based on a network analysis of 2,500 users identified as members of this movement, the visualisation of the "who's following who?" graph allows us to highlight the structure of the network's relationships and to identify users who occupy distinctive positions. Specifically, we show that linguistic groups are key factors in explaining clustering within a network whose characteristics resemble those of a small world.
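As a rough illustration of how clustering in such a follower network could be quantified, the sketch below computes the local and average clustering coefficient, a measure commonly reported when a network is compared with a small world. It is not the paper's method: the follow graph is treated as undirected for simplicity, and the function names and the toy graph are hypothetical.

```python
from itertools import combinations
from typing import Dict, Set


def local_clustering(graph: Dict[str, Set[str]], node: str) -> float:
    """Share of a node's neighbour pairs that are themselves connected."""
    neighbours = graph[node]
    k = len(neighbours)
    if k < 2:
        return 0.0  # fewer than two neighbours: no pairs to check
    links = sum(1 for a, b in combinations(neighbours, 2) if b in graph[a])
    return 2.0 * links / (k * (k - 1))


def average_clustering(graph: Dict[str, Set[str]]) -> float:
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(graph, n) for n in graph) / len(graph)


# Hypothetical toy graph: a triangle (a, b, c) plus one pendant user d.
toy_graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
```

A high average clustering coefficient combined with short path lengths is what is usually meant by a small-world structure; path lengths are not computed in this sketch.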

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Peer-reviewed

Relevance:

20.00% 20.00%

Publisher:

Abstract:

One of the major problems in machine vision is the segmentation of images of natural scenes. This paper presents a new proposal for the image segmentation problem based on the integration of edge and region information. The main contours of the scene are detected and used to guide the subsequent region-growing process. The algorithm places a number of seeds on both sides of a contour, which allows a set of concurrent growing processes to start. A prior analysis of the seeds makes it possible to adjust the homogeneity criterion to the characteristics of the regions. A new homogeneity criterion based on clustering analysis and convex hull construction is proposed.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Pantoea agglomerans strains are among the most promising biocontrol agents for a variety of bacterial and fungal plant diseases, particularly fire blight of apple and pear. However, commercial registration of P. agglomerans biocontrol products is hampered because this species is currently listed as a biosafety level 2 (BL2) organism due to clinical reports as an opportunistic human pathogen. This study compares plant-origin and clinical strains in a search for discriminating genotypic/phenotypic markers using multi-locus phylogenetic analysis and fluorescent amplified fragment length polymorphism (fAFLP) fingerprinting. Results: The majority of the clinical isolates from culture collections were found to be improperly designated as P. agglomerans after sequence analysis. The frequent taxonomic rearrangements undergone by the Enterobacter agglomerans/Erwinia herbicola complex may be a major problem in assessing clinical associations within P. agglomerans. In the P. agglomerans sensu stricto (in the stricter sense) group, there was no discrete clustering of clinical/biocontrol strains and no marker was identified that was uniquely associated with clinical strains. A putative biocontrol-specific fAFLP marker was identified only in biocontrol strains. The partial ORF located in this band corresponded to an ABC transporter that was found in all P. agglomerans strains. Conclusion: Taxonomic mischaracterization was identified as a major problem with P. agglomerans, and current techniques removed a majority of clinical strains from this species. Although clear discrimination between P. agglomerans plant and clinical strains was not obtained with phylogenetic analysis, a single marker characteristic of biocontrol strains was identified which may be of use in strain biosafety determinations. In addition, the lack of Koch's postulate fulfilment, rare retention of clinical strains for subsequent confirmation, and the polymicrobial nature of P. agglomerans clinical reports should be considered in biosafety assessment of beneficial strains in this species.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Although different models of the advance directive document (Document de Voluntats Anticipades, DVA) have been available in our region since the 1980s, they remain unknown to both the public and health professionals. This situation led us to set as the objective of this study to describe whether there is a correlation between providing information about the DVA and the motivation to complete one. The sample consisted of users of the psychogeriatrics service of the Fundació Sociosanitaria de Manresa, Hospital de Sant Andreu de Manresa, taking into account the recommendations of the 2005 Document Sitges and of other authors who recommend completing the DVA in cases of mild or moderate dementia. The high prevalence of this condition was also taken into account. A community clinical trial was designed, with randomisation of two consulting rooms of a psychogeriatrics service. Physicians in the consulting room assigned to the control group followed usual practice regarding the DVA, that is, they did not inform the patients they saw about the existence and characteristics of the DVA, while physicians in the consulting room assigned to the intervention group gave their patients structured information about the DVA. At inclusion, sociodemographic and clinical information was recorded in order to classify the participants, and three weeks later all subjects included in the trial were given a telephone survey to assess their opinion and knowledge of the DVA. The survey responses show that more than 90% of the subjects in the control group do not know about the DVA. It is also observed, to a significant degree, that people in the intervention group talk with the physician, the nurse and/or the family about dependency and death, bearing in mind that death and dependency remain a taboo subject and that most of the study population do not plan how they want to be cared for. Nevertheless, 2.3% had already completed the DVA and 22.7% express their willingness to complete one. The study concludes that providing information about the DVA to users of the psychogeriatrics service encourages them to be motivated to complete this document; at the same time it also encourages care planning and talking about topics such as death and/or dependency with the family, the physician or the nurse.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Speaker diarization is the process of sorting speech segments according to the speaker. Diarization helps to search and retrieve what a certain speaker uttered in a meeting. Applications of diarization systems extend to domains other than meetings, for example lectures, telephone, television, and radio. In addition, diarization enhances the performance of several speech technologies such as speaker recognition, automatic transcription, and speaker tracking. Methodologies previously used in developing diarization systems are discussed. Prior results and techniques are studied and compared. Methods such as Hidden Markov Models and Gaussian Mixture Models that are used in speaker recognition and other speech technologies are also used in speaker diarization. The objective of this thesis is to develop a speaker diarization system in the meeting domain. The experimental part of this work indicates that the zero-crossing rate can be used effectively to break the audio stream into segments, and that adaptive Gaussian Models adequately fit short audio segments. Results show that 35 Gaussian Models and one second as the average length of each segment are optimum values for building a diarization system for the tested data. Uniting the segments uttered by the same speaker is done by bottom-up clustering, using a new approach of categorizing the mixture weights.
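To make the zero-crossing-rate idea concrete, here is a minimal sketch of computing the rate per frame and flagging frames where it changes sharply. The abstract does not describe how the thesis turns the rate into segment boundaries, so the jump-threshold rule, the non-overlapping frames, and all function names are assumptions made for illustration.

```python
from typing import List


def zero_crossing_rate(frame: List[float]) -> float:
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def frame_zcr(signal: List[float], frame_len: int) -> List[float]:
    """Rate of each non-overlapping frame; a trailing partial frame is dropped."""
    return [
        zero_crossing_rate(signal[i:i + frame_len])
        for i in range(0, len(signal) - frame_len + 1, frame_len)
    ]


def change_points(zcr: List[float], threshold: float) -> List[int]:
    """Frame indices where the rate jumps by more than `threshold` (assumed rule)."""
    return [
        i for i in range(1, len(zcr)) if abs(zcr[i] - zcr[i - 1]) > threshold
    ]
```

For example, `change_points(frame_zcr(samples, 8), 0.5)` would return the indices of frames whose rate differs from the previous frame by more than 0.5; in practice the frame length and threshold would have to be tuned on real audio.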

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Clustering soil and crop data can be used as a basis for defining management zones because the data are grouped into clusters based on the similar interaction of these variables. The objective of this study was therefore to identify management zones using fuzzy c-means clustering analysis based on the spatial and temporal variability of soil attributes and corn yield. The study site (18 by 250 m) was located in Jaboticabal, São Paulo, Brazil. Corn yield was measured in one hundred 4.5 by 10 m cells along four parallel transects (25 observations per transect) over five growing seasons between 2001 and 2010. Soil chemical and physical attributes were measured. The SAS procedure MIXED was used to identify which variable(s) most influenced the spatial variability of corn yield over the five study years. Base saturation (BS) was the variable best related to corn yield, so semivariogram models were fitted for BS and corn yield, and the data values were then kriged. Management Zone Analyst software was used to carry out the fuzzy c-means clustering algorithm. The optimum number of management zones can change over time, as can the degree of agreement between the BS and corn yield management zone maps. It is therefore very important to take into account the temporal variability of crop yield and soil attributes in order to delineate management zones accurately.
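The fuzzy c-means step can be sketched in a few lines. The version below is a textbook one-dimensional variant with fuzzifier m, written only to show how membership degrees and cluster centres are updated in turn; it is not the Management Zone Analyst implementation, and the function names, initial centres, and sample values are hypothetical.

```python
from typing import List, Tuple


def memberships_for(values: List[float], centers: List[float], m: float) -> List[List[float]]:
    """Row k holds the membership degree of values[k] in each cluster (rows sum to 1)."""
    exponent = 2.0 / (m - 1.0)
    rows = []
    for x in values:
        dists = [abs(x - c) for c in centers]
        if 0.0 in dists:
            # The point sits exactly on a centre: give it full membership there.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            rows.append([h / sum(hits) for h in hits])
        else:
            rows.append(
                [1.0 / sum((d_i / d_j) ** exponent for d_j in dists) for d_i in dists]
            )
    return rows


def fuzzy_c_means(
    values: List[float], centers: List[float], m: float = 2.0, iterations: int = 50
) -> Tuple[List[float], List[List[float]]]:
    """Alternate membership and centre updates for a fixed number of iterations."""
    centers = list(centers)
    for _ in range(iterations):
        u = memberships_for(values, centers, m)
        # Each centre becomes the mean of the values weighted by membership ** m.
        centers = [
            sum(row[i] ** m * x for row, x in zip(u, values))
            / sum(row[i] ** m for row in u)
            for i in range(len(centers))
        ]
    return centers, memberships_for(values, centers, m)


# Hypothetical one-dimensional readings forming two obvious groups.
sample = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
final_centers, final_memberships = fuzzy_c_means(sample, [0.0, 6.0])
```

Unlike hard clustering, each observation keeps a degree of membership in every zone, which is why the number of zones and their boundaries can be compared across seasons rather than fixed once.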

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This thesis discusses the adoption of a new project management tool at the ABB Oy Motors and Generators business unit, Synchronous Machines profit centre. The thesis studies project modeling in general and delves into the Gate Model used at ABB Synchronous Machines. It is essential to understand the Gate Model because the new project management tool, called the Project Master Document, is built on the existing project model. The thesis also analyzes the goals and structure of the Project Master Document in order to ease implementation of the new tool. The Project Master Document aims to improve customer order fulfillment by clarifying the order handover interface. The office process, especially responsibilities and target dates, also becomes clearer after the Master Document is implemented. The document is built to be a framework for the whole order fulfillment process, including check points for each gate of the project model and updated memos from all project meetings. Furthermore, project progress will be clearly stated by status markings and visualized with colors.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This bachelor’s thesis, written for Lappeenranta University of Technology and implemented in a medium-sized enterprise (SME), examines a distributed document migration system. The system was created to migrate a large number of electronic documents, along with their metadata, from one document management system to another, so as to enable a rapid switchover of enterprise resource planning systems inside the company. The paper examines, through theoretical analysis, messaging as a possible enabler of distributed applications and how it naturally fits an event-based model, whereby system transitions and states are expressed through recorded behaviours. This is put into practice by analysing the implemented migration system and how the core components, MassTransit, RabbitMQ and MongoDB, were orchestrated together to realize such a system. As a result, the paper presents an architecture for a scalable and distributed system that could migrate hundreds of thousands of documents over a weekend, serving its goal of enabling a rapid system switchover.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Presentation at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The purpose of this thesis is to find out whether all peer-to-peer borrowers are unworthy of credit, and whether there are single qualities or combinations of qualities that determine the probability of default of a person or group of people. Distinguishing qualities are searched for with self-organizing maps (SOM). The qualities and groups of people found by the self-organizing map are then compared to the average. The comparison is carried out by looking at what proportion of the borrowers meeting the criteria are two months or more behind with their payments. The research data were collected by an Estonian peer-to-peer lending company during the years 2011-2014. The data consist of peer-to-peer borrowers and the information gathered from them.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Previous genetic association studies have overlooked the potential for biased results when analyzing different population structures in ethnically diverse populations. The purpose of the present study was to quantify this bias in two-locus association studies conducted on an admixed urban population. We studied the genetic structure distribution of angiotensin-converting enzyme insertion/deletion (ACE I/D) and angiotensinogen methionine/threonine (M/T) polymorphisms in 382 subjects from three subgroups in a highly admixed urban population. Group I included 150 white subjects; group II, 142 mulatto subjects; and group III, 90 black subjects. We conducted sample size simulation studies using these data in different genetic models of gene action and interaction and used genetic distance calculation algorithms to help determine the population structure for the studied loci. Our results showed a statistically significant difference in the population structure distribution of both the ACE I/D polymorphism (P = 0.02, OR = 1.56, 95% CI = 1.05-2.33 for the D allele, white versus black subgroup) and the angiotensinogen M/T polymorphism (P = 0.007, OR = 1.71, 95% CI = 1.14-2.58 for the T allele, white versus black subgroup). Different sample sizes are predicted to determine the power to detect a given genotypic association with a particular phenotype when conducting two-locus association studies in admixed populations. In addition, the postulated genetic model is also a major determinant of the power to detect any association in a given sample size. The present simulation study helped to demonstrate the complex interrelation among ethnicity, power of the association, and the postulated genetic model of action of a particular allele in the context of clustering studies. This information is essential for the correct planning and interpretation of future association studies conducted on this population.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This master's thesis introduces the fuzzy tolerance/equivalence relation and its application in cluster analysis. The work presents the construction of fuzzy equivalence relations using increasing generators. We investigate the role of increasing generators in the creation of intersection, union and complement operators. The objective is to develop different varieties of fuzzy tolerance/equivalence relations using different varieties of increasing generators. Finally, we perform a comparative study of these varieties of fuzzy tolerance/equivalence relations in their application to a clustering method.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Verbal fluency tests are used as a measure of executive functions and language, and can also be used to evaluate semantic memory. We analyzed the influence of education, gender and age on scores in a verbal fluency test using the animal category, and on number of categories, clustering and switching. We examined 257 healthy participants (152 females and 105 males) with a mean age of 49.42 years (SD = 15.75) and a mean educational level of 5.58 years (SD = 4.25). We asked them to name as many animals as they could. Analysis of variance was performed to determine the effect of demographic variables. No significant effect of gender was observed for any of the measures. However, age seemed to influence the number of category changes, as expected for a sensitive frontal measure, after controlling for the effect of education. Educational level had a statistically significant effect on all measures except clustering. Subject performance (mean number of animals named) according to schooling was: illiterates, 12.1; 1 to 4 years, 12.3; 5 to 8 years, 14.0; 9 to 11 years, 16.7; and more than 11 years, 17.8. We observed a decrease in performance in these five educational groups over time (more items recalled during the first 15 s, followed by a progressive reduction until the fourth interval). We conclude that education had the greatest effect on the category fluency test in this Brazilian sample. Therefore, we must take care in evaluating the performance of subjects with lower educational levels.