48 results for speaker clustering
Abstract:
A new multimodal biometric database, designed and acquired within the framework of the European BioSecure Network of Excellence, is presented. It comprises more than 600 individuals acquired simultaneously in three scenarios: 1) over the Internet, 2) in an office environment with a desktop PC, and 3) in indoor/outdoor environments with mobile portable hardware. The three scenarios include a common part of audio/video data. Signature and fingerprint data were also acquired with both desktop PC and mobile portable hardware. Additionally, hand and iris data were acquired in the second scenario using a desktop PC. Acquisition was conducted by 11 European institutions. Additional features of the BioSecure Multimodal Database (BMDB) are: two acquisition sessions, several sensors in certain modalities, balanced gender and age distributions, multimodal realistic scenarios with simple and quick tasks per modality, cross-European diversity, availability of demographic data, and compatibility with other multimodal databases. The novel acquisition conditions of the BMDB allow new, challenging research and evaluation of either monomodal or multimodal biometric systems, as in the recent BioSecure Multimodal Evaluation campaign. A description of this campaign, including baseline results of individual modalities from the new database, is also given. The database is expected to be available for research purposes through the BioSecure Association during 2008.
Abstract:
This work describes a methodology for classifying Catalan verbs according to their syntactic behavior. The goal is to acquire a small number of basic classes with high precision while using few resources. Obtaining syntactic class information is a long and costly process, but it is useful for many NLP tasks. We show how to obtain this information using only a corpus with part-of-speech annotation. We explored both supervised and unsupervised techniques. We first present the experiments that use a supervised method to distinguish automatically between transitive and intransitive verbs. Our system has an error rate of 4.65%. For the unsupervised methods (clustering), we present two experiments. The first aims to classify verbs as transitive, intransitive, or verbs that alternate with the particle se. The second aims to subclassify intransitive verbs into pure intransitives and prepositional verbs. The resulting F-scores are 0.84 and 0.88, respectively.
Abstract:
Hierarchical clustering is a popular method for finding structure in multivariate data, resulting in a binary tree constructed on the particular objects of the study, usually sampling units. The user must decide where to cut the binary tree in order to determine the number of clusters to interpret, and there are various ad hoc rules for arriving at a decision. A simple permutation test is presented that diagnoses whether non-random levels of clustering are present in the set of objects and, if so, indicates the specific level at which the tree can be cut. The test is validated against random matrices to verify the type I error probability, and a power study is performed on data sets with known clusteredness to study the type II error.
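The general idea of a permutation test on a clustering tree can be sketched as follows. This is only an illustration under stated assumptions, not the procedure from the paper: the abstract does not specify the test statistic or the permutation scheme, so the sketch assumes average-linkage clustering on Euclidean distances, a null distribution built by shuffling each variable (column) independently, and a per-level comparison of merge heights. All function names (`merge_heights`, `permute_columns`, `permutation_test`) and the toy data are hypothetical.

```python
import random


def euclid(a, b):
    """Euclidean distance between two equal-length rows."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def merge_heights(rows):
    """Naive average-linkage agglomerative clustering.

    Returns the list of merge heights, one per merge, in merge order.
    """
    n = len(rows)
    dist = {(i, j): euclid(rows[i], rows[j])
            for i in range(n) for j in range(i + 1, n)}
    clusters = [[i] for i in range(n)]
    heights = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                total = sum(dist[min(i, j), max(i, j)]
                            for i in clusters[a] for j in clusters[b])
                avg = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or avg < best[0]:
                    best = (avg, a, b)
        avg, a, b = best
        heights.append(avg)
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return heights


def permute_columns(rows, rng):
    """Shuffle each column independently (assumed null model)."""
    cols = [list(c) for c in zip(*rows)]
    for c in cols:
        rng.shuffle(c)
    return [list(r) for r in zip(*cols)]


def permutation_test(rows, n_perm=99, seed=0):
    """Return observed merge heights and a p-value for each merge level.

    A small p-value at a level means the observed merge height is
    unusually low compared with the permuted data at that same level.
    """
    rng = random.Random(seed)
    observed = merge_heights(rows)
    null = [merge_heights(permute_columns(rows, rng)) for _ in range(n_perm)]
    p_values = []
    for level, h in enumerate(observed):
        count = sum(1 for nh in null if nh[level] <= h)
        p_values.append((count + 1) / (n_perm + 1))
    return observed, p_values


# Toy data: two well-separated groups of 7 points in 2 dimensions.
gen = random.Random(0)
data = [[gen.gauss(c, 0.3), gen.gauss(c, 0.3)]
        for c in (0.0, 10.0) for _ in range(7)]
observed, p_values = permutation_test(data, n_perm=99, seed=1)
```

In this sketch, `p_values[-2]` corresponds to the merge that reduces three clusters to two. For the toy data that merge still happens inside one of the two true groups, so its height is far lower than in the shuffled data, which suggests cutting the tree at two clusters.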