922 results for Supervised and Unsupervised Classification
Abstract:
This paper presents a novel prosody model in the context of computer text-to-speech synthesis applications for tone languages. We have demonstrated its applicability using the Standard Yorùbá (SY) language. Our approach is motivated by the theory that abstract and realised forms of various prosody dimensions should be modelled within a modular and unified framework [Coleman, J.S., 1994. Polysyllabic words in the YorkTalk synthesis system. In: Keating, P.A. (Ed.), Phonological Structure and Forms: Papers in Laboratory Phonology III, Cambridge University Press, Cambridge, pp. 293–324]. We have implemented this framework using the Relational Tree (R-Tree) technique. R-Tree is a sophisticated data structure for representing a multi-dimensional waveform in the form of a tree. The underlying assumption of this research is that it is possible to develop a practical prosody model by using appropriate computational tools and techniques which combine acoustic data with an encoding of the phonological and phonetic knowledge provided by experts. To implement the intonation dimension, fuzzy logic based rules were developed using speech data from native speakers of Yorùbá. The Fuzzy Decision Tree (FDT) and the Classification and Regression Tree (CART) techniques were tested in modelling the duration dimension. For practical reasons, we have selected the FDT for implementing the duration dimension of our prosody model. To establish the effectiveness of our prosody model, we have also developed a Stem-ML prosody model for SY. We have performed both quantitative and qualitative evaluations on our implemented prosody models. The results suggest that, although the R-Tree model does not predict the numerical speech prosody data as accurately as the Stem-ML model, it produces synthetic speech prosody with better intelligibility and naturalness. The R-Tree model is particularly suitable for speech prosody modelling for languages with limited language resources and expertise, e.g. 
African languages. Furthermore, the R-Tree model is easy to implement, interpret and analyse.
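The duration dimension above was modelled with tree-based learners (FDT and CART). As a rough illustration of the CART family only, not the paper's actual model or feature set, a one-level regression tree (a "stump") predicting syllable duration from a single hypothetical feature could look like:

```python
def build_stump(samples, feature):
    """One-level regression tree (a CART stump): choose the split on
    `feature` that minimises squared error, predict the mean duration
    on each side of the split."""
    best = None
    values = sorted({s[feature] for s in samples})
    for t in values[1:]:
        left = [s["dur"] for s in samples if s[feature] < t]
        right = [s["dur"] for s in samples if s[feature] >= t]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        err = (sum((d - ml) ** 2 for d in left)
               + sum((d - mr) ** 2 for d in right))
        if best is None or err < best[0]:
            best = (err, t, ml, mr)
    _, t, ml, mr = best
    return lambda s: ml if s[feature] < t else mr
```

A full CART recursively applies such splits; the `tone` feature and durations below are invented for illustration.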
Abstract:
Objective: Much recent research has proposed nature-inspired algorithms for complex machine learning tasks. Ant colony optimization (ACO) is one such algorithm, based on swarm intelligence and derived from a model inspired by the collective foraging behavior of ants. Taking advantage of ACO traits such as self-organization and robustness, this paper investigates ant-based algorithms for gene expression data clustering and associative classification. Methods and material: An ant-based clustering algorithm (Ant-C) and an ant-based association rule mining algorithm (Ant-ARM) are proposed for gene expression data analysis. The proposed algorithms exploit the natural behavior of ants, such as cooperation and adaptation, to allow a flexible, robust search for good candidate solutions. Results: Ant-C has been tested on three datasets selected from the Stanford Genomic Resource Database and achieved relatively high accuracy compared to other classical clustering methods. Ant-ARM has been tested on the acute lymphoblastic leukemia (ALL)/acute myeloid leukemia (AML) dataset and generated about 30 classification rules with high accuracy. Conclusions: Ant-C can generate an optimal number of clusters without incorporating any other algorithm such as K-means or agglomerative hierarchical clustering. For associative classification, while well-known algorithms such as Apriori, FP-growth and Magnum Opus are unable to mine any association rules from the ALL/AML dataset within a reasonable period of time, Ant-ARM is able to extract associative classification rules.
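As a toy illustration of the general idea behind ant-based clustering, not the actual Ant-C algorithm, one can let pheromone trails bias item-to-cluster assignments and reinforce the best (lowest within-cluster spread) solution found so far:

```python
import random

def ant_cluster(points, k=2, ants=10, iters=30, rho=0.1, seed=0):
    """Toy ant-based clustering over 1-D points: pheromone trails bias
    item-to-cluster assignments; the best-so-far assignment is reinforced
    each iteration, others evaporate at rate rho."""
    rng = random.Random(seed)
    n = len(points)
    tau = [[1.0] * k for _ in range(n)]  # pheromone: item i -> cluster c

    def cost(assign):
        total = 0.0
        for c in range(k):
            members = [points[i] for i in range(n) if assign[i] == c]
            if not members:
                return float("inf")  # penalise empty clusters
            centre = sum(members) / len(members)
            total += sum((p - centre) ** 2 for p in members)
        return total

    best, best_cost = None, float("inf")
    for _ in range(iters):
        for _ in range(ants):
            assign = [rng.choices(range(k), weights=tau[i])[0]
                      for i in range(n)]
            c = cost(assign)
            if c < best_cost:
                best, best_cost = assign, c
        if best is not None:
            # evaporate all trails, then reinforce the best-so-far one
            for i in range(n):
                tau[i] = [(1 - rho) * t for t in tau[i]]
                tau[i][best[i]] += 1.0
    return best
```

On two well-separated groups of points this sketch recovers the obvious partition; real gene expression data would need a distance over expression profiles rather than scalars.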
Abstract:
Solving many scientific problems requires effective regression and/or classification models for large high-dimensional datasets. Experts from these problem domains (e.g. biologists, chemists, financial analysts) have insights which can be helpful in developing powerful models, but they need a modelling framework that helps them use these insights. Data visualisation is an effective technique for presenting data and eliciting feedback from the experts. A single global regression model can rarely capture the full behavioural variability of a huge multi-dimensional dataset. Instead, local regression models, each focused on a separate area of the input space, often work better, since the behaviour of different areas may vary. Classical local models such as Mixture of Experts segment the input space automatically, which is not always effective and also gives domain experts no way to guide a meaningful segmentation of the input space. In this paper we address this issue by allowing domain experts to interactively segment the input space using data visualisation. The segmentation obtained is then used to develop effective local regression models.
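The idea of fitting one local regression model per expert-chosen segment can be sketched as follows; the single split point and one-dimensional input are illustrative simplifications of the interactive, multi-dimensional segmentation the paper describes:

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b (closed form)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return a, my - a * mx

def local_models(xs, ys, split):
    """Fit one linear model per expert-chosen segment of the input space."""
    left = [(x, y) for x, y in zip(xs, ys) if x < split]
    right = [(x, y) for x, y in zip(xs, ys) if x >= split]
    return {"left": fit_line(*zip(*left)), "right": fit_line(*zip(*right))}

def predict(models, split, x):
    """Route a query to the local model owning its segment."""
    a, b = models["left"] if x < split else models["right"]
    return a * x + b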
Abstract:
This paper considers the neural-like growing networks used in an intelligent image recognition system. All operations performed on the image at the pre-processing stage, as well as the classification and storage of information about the images and their subsequent identification, are carried out entirely by the mechanisms of neural-like networks, without complex algorithms requiring considerable amounts of computation. With suitable hardware support, these neural network methods considerably increase the effectiveness of solving this class of problems while preserving high accuracy and fast response, both in training mode and in identification mode.
Abstract:
Nowadays, the scientific and social significance of research into climatic effects has become outstanding. To be able to predict the ecological effects of global climate change, it is necessary to study monitoring databases of the past and explore connections. For the case study mentioned in the title, historical weather data series from the Hungarian Meteorological Service and Szaniszló Priszter's monitoring data on the phenology of geophytes have been used. These data describe on which days the observed geophytes budded, bloomed and withered. In our research we have found that the classification of the observed years according to phenological events and their classification according to the frequency distribution of meteorological parameters show similar patterns, and each variable group is suitable for explaining the pattern shown by the other. Furthermore, an important result is that the dates of all three observed phenophases correlate significantly with the average daily temperature fluctuation in the given period. The second most often significant parameter is the number of frosty days, which also seems to be determinant for all phenophases. The usual approaches based on temperature sum and average temperature do not appear to be particularly important in this respect. According to the results of the research, the phenology of geophytes can be well modelled with a linear combination of suitable meteorological parameters.
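The significance findings above rest on correlating meteorological parameters with phenophase dates; the underlying quantity is a plain Pearson correlation, which might be computed as:

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient, e.g. between a meteorological
    parameter (average daily temperature fluctuation) and phenophase
    onset dates expressed as day-of-year."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)
```

Significance testing of r (as the abstract reports) would additionally need the sample size and a t- or permutation test.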
Abstract:
In this study we have reviewed the economic questions of Natura 2000 forests through the concept of ecosystem services, building on the international and Hungarian literature. We have also taken as our basis the notion of close-to-nature forest management and, within it, management aimed at continuous forest cover. In what follows we make some summarizing statements and outline further research directions. _____ This study has been prepared within the LIFEinFORESTS – Improved communication, cooperation and capacity building for preserving biodiversity in Natura 2000 forests (B2 action, LIFE13 INF/HU/001163) – project in the framework of LIFE+ Information and Communication, under the contract signed with the Duna-Ipoly National Park Directorate. The main aim of the study is to summarize the international and Hungarian economic and environmental economic literature related to Natura 2000 forests, and to serve as a background study for the communication with, and training of, forest owners and users operating at Natura 2000 sites. The concept of ecosystem services (ESs) is used as the overall framework of the study. In our opinion it is able to show all the benefits provided by forests and can also help to reveal that the benefits of nature-oriented, continuous cover forest management (CCF) can exceed the benefits of traditional rotation forest management (RFM). The definition and classification of the Millennium Ecosystem Assessment (MA, 2003, 2005) are used throughout the study, so provisioning, cultural, regulating and supporting services are distinguished.
Abstract:
As traffic congestion worsens and new roadway construction is severely constrained by the limited availability of land, the high cost of land acquisition, and communities' opposition to the building of major roads, new solutions have to be sought to either make roadway use more efficient or reduce travel demand. There is general agreement that travel demand is affected by land use patterns. However, traditional aggregate four-step models, presently the prevailing modeling approach, assume when estimating trip generation that traffic conditions do not affect people's decision on whether to make a trip. Existing survey data indicate, however, that trip rates differ across geographic areas. The reasons for such differences have not been carefully studied, and attempts to quantify the influence of land use on travel demand beyond employment, households, and their characteristics have been of limited use to the traditional four-step models. There may be a number of reasons: the representation of the influence of land use on travel demand is aggregated rather than explicit, and land use variables such as density, mix, and accessibility as measured by travel time and congestion have not been adequately considered. This research employs the artificial neural network (ANN) technique to investigate the potential effects of land use and accessibility on trip productions. Sixty-two variables that may potentially influence trip production are studied, including demographic, socioeconomic, land use and accessibility variables. Different ANN architectures are tested. Sensitivity analysis of the models shows that land use does have an effect on trip production, and so do traffic conditions. The ANN models are compared with linear regression models and cross-classification models using the same data. The results show that the ANN models outperform the linear regression and cross-classification models in terms of RMSE. Future work may focus on finding a representation of traffic conditions based on existing network and population data that would be available when the variables are needed for prediction.
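The model comparison rests on RMSE; for reference, the metric itself is simply:

```python
def rmse(predicted, observed):
    """Root-mean-square error, the criterion used above to compare
    ANN, linear-regression and cross-classification trip-production
    models on held-out data."""
    n = len(predicted)
    return (sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n) ** 0.5
```

A lower RMSE on the same test set is what "better" means in the comparison above.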
Abstract:
The development of 3G (third-generation telecommunication) value-added services brings higher requirements for Quality of Service (QoS). Wideband Code Division Multiple Access (WCDMA) is one of the three 3G standards, and enhancing QoS for the WCDMA Core Network (CN) is becoming ever more important for users and carriers. This dissertation focuses on QoS enhancement for the WCDMA CN; its purpose is to realize the DiffServ (Differentiated Services) QoS model for the WCDMA CN. Based on the parallelism characteristics of Network Processors (NPs), NP programming models are classified as Pool of Threads (POTs) and Hyper Task Chaining (HTC). In this study, an integrated programming model combining the two was designed. This model is highly efficient and flexible, and also solves the problems of sharing conflicts and packet ordering. We used it as the programming model to realize DiffServ QoS for the WCDMA CN. The realization mechanism of the DiffServ model mainly consists of NP-based buffer management, packet scheduling and packet classification algorithms. First, we proposed an adaptive buffer management algorithm called Packet Adaptive Fair Dropping (PAFD), which takes both fairness and throughput into consideration and has smooth service curves. Then, an improved packet scheduling algorithm called Priority-based Weighted Fair Queuing (PWFQ) was introduced to ensure fairness of packet scheduling and reduce the queueing time of data packets, while keeping delay and jitter within a small range. Thirdly, a multi-dimensional packet classification algorithm called Classification Based on Network Processors (CBNP) was designed; it effectively reduces memory accesses and storage space, and has lower time and space complexity. Lastly, an integrated hardware and software system implementing the DiffServ QoS model for the WCDMA CN was proposed and implemented on the NP IXP2400. According to the corresponding experimental results, the proposed system significantly enhances QoS for the WCDMA CN: it improves response-time consistency, reduces display distortion, improves sound-image synchronization, and thus increases network efficiency and saves network resources.
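PWFQ builds on weighted fair queuing; a minimal sketch of the base WFQ discipline (virtual finish times for packets that are all backlogged at time zero), not the paper's priority extension, is:

```python
def wfq_order(packets, weights):
    """Simplified weighted fair queuing: each packet of a flow gets a
    virtual finish time F = previous_F(flow) + size / weight(flow);
    packets are then served in increasing F (ties broken by arrival
    order). Higher-weight flows therefore drain faster."""
    last_finish = {flow: 0.0 for flow in weights}
    scheduled = []
    for idx, (flow, size) in enumerate(packets):
        f = last_finish[flow] + size / weights[flow]
        last_finish[flow] = f
        scheduled.append((f, idx, flow))
    return [flow for _, _, flow in sorted(scheduled)]
```

With a voice flow weighted twice a data flow, voice packets of equal size are served roughly twice as often, which is the fairness property PWFQ refines with priorities.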
Abstract:
Voice communication systems such as Voice-over IP (VoIP), Public Switched Telephone Networks, and Mobile Telephone Networks, are an integral means of human tele-interaction. These systems pose distinctive challenges due to their unique characteristics such as low volume, burstiness and stringent delay/loss requirements across heterogeneous underlying network technologies. Effective quality evaluation methodologies are important for system development and refinement, particularly by adopting user feedback based measurement. Presently, most of the evaluation models are system-centric (Quality of Service or QoS-based), which questioned us to explore a user-centric (Quality of Experience or QoE-based) approach as a step towards the human-centric paradigm of system design. We research an affect-based QoE evaluation framework which attempts to capture users' perception while they are engaged in voice communication. Our modular approach consists of feature extraction from multiple information sources including various affective cues and different classification procedures such as Support Vector Machines (SVM) and k-Nearest Neighbor (kNN). The experimental study is illustrated in depth with detailed analysis of results. The evidences collected provide the potential feasibility of our approach for QoE evaluation and suggest the consideration of human affective attributes in modeling user experience.
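Of the two classifiers mentioned, kNN is the simpler to sketch; the feature vectors and labels below are purely illustrative, standing in for affective features extracted from a voice session:

```python
from collections import Counter

def knn_classify(train, query, k=3):
    """k-nearest-neighbour majority vote: find the k training feature
    vectors closest to the query (Euclidean distance) and return the
    most common label among them."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

An SVM draws a maximum-margin boundary instead of voting, but both consume the same feature vectors.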
Abstract:
Automatic detection of blood components is an important topic in the field of hematology. Segmentation is an important stage because it allows components to be grouped into common areas and processed separately, while leukocyte differential classification enables them to be analyzed individually. With auto-segmentation and differential classification, this work contributes to the analysis of blood components by providing tools that reduce manual labor and increase its accuracy and efficiency. Using digital image processing techniques combined with a generic and automatic fuzzy approach, this work proposes two Fuzzy Inference Systems, named I and II, for auto-segmentation of blood components and leukocyte differential classification, respectively, in microscopic smear images. Using Fuzzy Inference System I, the proposed technique segments the image into four regions: the leukocyte's nucleus and cytoplasm, the erythrocyte area and the plasma area; using Fuzzy Inference System II and the segmented leukocyte (nucleus and cytoplasm), it differentially classifies leukocytes into five types: basophils, eosinophils, lymphocytes, monocytes and neutrophils. For testing, 530 images containing microscopic samples of blood smears prepared with different methods were used. The images were processed, and their accuracy indices and Gold Standards were calculated and compared with manual results and with other results found in the literature for the same problems. Regarding segmentation, the developed technique showed accuracy of 97.31% for leukocytes, 95.39% for erythrocytes and 95.06% for blood plasma. As for the differential classification, the accuracy varied between 92.98% and 98.39% for the different leukocyte types. In addition to providing auto-segmentation and differential classification, the proposed technique also contributes to the definition of new descriptors and the construction of an image database using various hematological staining processes.
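Fuzzy Inference System I assigns pixels to regions by fuzzy membership; a heavily simplified grey-level sketch, with invented breakpoints rather than the paper's tuned membership functions, is:

```python
def trimf(x, a, b, c):
    """Triangular fuzzy membership function rising from a to b and
    falling from b to c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def classify_pixel(intensity):
    """Assign a grey-level pixel (0-255) to the region with the highest
    fuzzy membership. The breakpoints below are illustrative only."""
    regions = {
        "nucleus":     trimf(intensity, -1, 0, 80),
        "cytoplasm":   trimf(intensity, 40, 100, 160),
        "erythrocyte": trimf(intensity, 120, 170, 220),
        "plasma":      trimf(intensity, 180, 255, 256),
    }
    return max(regions, key=regions.get)
```

A real system would operate on colour channels of stained smears and combine several membership functions through fuzzy rules before defuzzifying.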
Abstract:
Epithelial changes observed in actinic cheilitis (AC) and squamous cell carcinoma of the lower lip (LLSCC) are mainly caused by chronic exposure to ultraviolet (UV) rays and are studied using different immunohistochemical markers in an attempt to evaluate the process of carcinogenesis. The objective of this study was to comparatively evaluate the expression of the Ki-67 and IMP-3 proteins in AC and LLSCC, in order to contribute additional information on lower lip carcinogenesis. A total of 33 cases of AC and 33 cases of LLSCC were studied, analyzing the clinical and pathological features and the immunostaining of Ki-67 and IMP-3. Immunohistochemical analysis of Ki-67 was made through determination of the proliferation index (PI) and subsequent classification of the cases according to the scores: 0 (0% positive cells), +1 (≤30%), +2 (>30% and ≤60%) and +3 (>60%). For the statistical tests, cases were classified as unmarked (score 0), low expression (score +1) and high expression (scores +2 and +3). For the expression of IMP-3, the percentage of immunostained epithelial cells was established and scores were assigned: 0 (corresponding to 0%), +1 (up to 30% of positive cells), +2 (from 30% to 60% of immunostained cells) and +3 (over 60% of positive cells). The chi-square, Mann-Whitney and Wilcoxon statistical tests were used, with a significance level of 5%. Most AC cases were male (78.8%), with a mean age of 50 years; LLSCC cases were also mostly male (69.89%), with a mean age of 62 years. Ki-67 was expressed in all cases of AC and LLSCC, with score +2 predominating in both lesions, corresponding to 81.8% of AC cases and 54.5% of LLSCC cases. IMP-3 expression occurred in 72.7% of AC cases, predominantly with score +1 (36.3% of cases); in LLSCC, IMP-3 was expressed in 60.6% of cases, especially with score +3 (27.3% of cases). These results allow us to conclude that IMP-3 expression and proliferative activity are early events in lower lip carcinogenesis, independent of the lesion's stage.
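The Ki-67 scoring scheme described above is a simple binning of the proliferation index, which could be expressed as:

```python
def ki67_score(positive_pct):
    """Map a Ki-67 proliferation index (% positive cells) to the scores
    used above: 0 (0%), +1 (<=30%), +2 (>30% and <=60%), +3 (>60%)."""
    if positive_pct == 0:
        return 0
    if positive_pct <= 30:
        return 1
    if positive_pct <= 60:
        return 2
    return 3

def expression_level(score):
    """Collapse scores for the statistical tests: unmarked (score 0),
    low expression (+1), high expression (+2 and +3)."""
    return {0: "unmarked", 1: "low"}.get(score, "high")
```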
Abstract:
Dry eye syndrome is a multifactorial disease of the tear film, resulting from instability of the lacrimal functional unit, which produces changes in tear volume, composition or distribution. In intensive care patients the condition is aggravated by various risk factors, such as mechanical ventilation, sedation, lagophthalmos and low temperatures, among others. The purpose of this study is to build an instrument for assessing dry eye severity in patients in intensive care units, based on the systematization of nursing care and its classification systems. This is a methodological study conducted in three stages: context analysis, concept analysis, and construction of the operational definitions and magnitudes of the nursing outcome. For the first stage, the methodological framework of Hinds, Chaves and Cypress (1992) was used. For the second stage, the model of Walker and Avant was used, together with an integrative review according to Whittemore and Knafl (2005). This stage enabled the identification of the concept's attributes, antecedents and consequences, and the construction of the definitions for the nursing outcome severity of dry eye. For the construction of the operational definitions and magnitudes, the psychometric approach proposed by Pasquali (1999) was used. The context analysis showed that the matter should be discussed and that nursing needs to pay attention to the problem of eye injury, so that strategies can be created to minimize this highly prevalent event. The integrative review searches located 19,853 titles, of which 215 were selected; from their abstracts, 96 articles were read in full. After reading, 10 were excluded, culminating in a sample of 86 articles that were used for the concept analysis and the construction of the definitions. The selected articles were found in greatest numbers in the Scopus database (55.82%), were conducted in the United States (39.53%), and were published mainly in the last five years (48.82%). In the concept analysis, the following antecedents were identified: age, lagophthalmos, environmental factors, medication use, systemic diseases, mechanical ventilation and ophthalmic surgery. As attributes: TBUT < 10 s, Schirmer I test < 5 mm, Schirmer II test < 10 mm, and reduced osmolarity. As consequences: ocular surface damage, ocular discomfort and visual instability. The definitions were built, and indicators such as decreased blink mechanism and eyestrain were added.
Abstract:
The codification and organization of the field of Communication Sciences is studied, focusing on the Unesco codes. The objective is to propose a change in these codes, since the way a scientific domain is classified and organized has operative and epistemological consequences for scientific work itself. This paper examines the current classification, which shows a scarce and scattered presence of terms linked to Communication. It describes the practical and theoretical difficulties involved in its reorganization, the possible sources (curricula, conferences, scientific journals, documentary proposals and keywords) and the working methods that can be employed, taking as theoretical bases the domains of Knowledge Organization and of Communication. Finally, two different disciplinary areas (History of Communication and Communication Technologies) are analyzed by means of information on undergraduate and official master's courses from 12 Spanish universities, collected in a database. It is also observed that this kind of proposal requires the knowledge derived from documentary instruments such as classifications and thesauri.
Abstract:
When we undertake research in any documentation centre holding early printed collections, we often come across papers or booklets of only a few leaves which have frequently gone unnoticed and remain uncatalogued. Their nature is diverse and, at times, their identification can be complex, no doubt because many of them appear out of context. For this reason, this article aims to establish a systematic guide that encompasses, classifies and describes those most commonly found in archives, in order to facilitate their identification.
Abstract:
This paper aims to investigate the use of collocations in DELE B1. We select the reading texts from DELE B1 (2010 to 2014) as research data. The investigation proceeds as follows. First, we study the theory of collocation and its classification, as well as its application to foreign language learning and teaching. Second, we analyze the types of collocation annotated by Corpus Tool. Third, we calculate the frequency of use of each type of collocation in the Spanish reading texts. Next, we discuss the interrelationship between collocations and text themes. Finally, we compare the differences in collocation use between two corpus tools, Corpus Tool and Corpus del Español, in order to understand native speakers' preferences in using collocations and to provide supplementary materials for the teaching of Spanish reading. We hope that the results of our research will offer useful references for improving students' Spanish reading comprehension so that they can pass the DELE B1 examination.
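Collocation frequency counting of the simplest kind, adjacent word pairs, can be sketched as below; real collocation extraction (as in Corpus Tool) additionally uses POS patterns and association measures rather than raw bigram counts:

```python
from collections import Counter

def bigram_frequencies(text):
    """Count adjacent word pairs as a crude stand-in for collocation
    frequency extraction from a reading text."""
    words = text.lower().split()
    return Counter(zip(words, words[1:]))
```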