15 results for Automatic classification
in Helda - Digital Repository of University of Helsinki
Abstract:
Hereditary nonpolyposis colorectal cancer (HNPCC) is the most common known clearly hereditary cause of colorectal and endometrial cancer (CRC and EC). Dominantly inherited mutations in one of the known mismatch repair (MMR) genes predispose to HNPCC. Defective MMR leads to an accumulation of mutations especially in repeat tracts, presenting microsatellite instability. HNPCC is clinically a very heterogeneous disease. The age at onset varies and the target tissue may vary. In addition, families that fulfill the diagnostic criteria for HNPCC but fail to show any predisposing mutation in MMR genes exist. Our aim was to evaluate the genetic background of familial CRC and EC. We performed comprehensive molecular and DNA copy number analyses of CRCs fulfilling the diagnostic criteria for HNPCC. We studied the role of five pathways (MMR, Wnt, p53, CIN, PI3K/AKT) and divided the tumors into two groups, one with MMR gene germline mutations and the other without. We observed that MMR proficient familial CRC consist of two molecularly distinct groups that differ from MMR deficient tumors. Group A shows paucity of common molecular and chromosomal alterations characteristic of colorectal carcinogenesis. Group B shows molecular features similar to classical microsatellite stable tumors with gross chromosomal alterations. Our finding of a unique tumor profile in group A suggests the involvement of novel predisposing genes and pathways in colorectal cancer cohorts not linked to MMR gene defects. We investigated the genetic background of familial ECs. Among 22 families with clustering of EC, two (9%) were due to MMR gene germline mutations. The remaining familial site-specific ECs are largely comparable with HNPCC associated ECs, the main difference between these groups being MMR proficiency vs. deficiency. 
We studied the role of the PI3K/AKT pathway in familial ECs as well and observed that PIK3CA amplifications are characteristic of familial site-specific EC without MMR gene germline mutations. Most of the high-level amplifications occurred in tumors with stable microsatellites, suggesting that these tumors are more likely associated with chromosomal rather than microsatellite instability and MMR defect. The existence of site-specific endometrial carcinoma as a separate entity remains equivocal until predisposing genes are identified. It is possible that no single highly penetrant gene for this proposed syndrome exists; it may, for example, be due to a combination of multiple low-penetrance genes. Despite advances in deciphering the molecular genetic background of HNPCC, it is poorly understood why certain organs are more susceptible than others to cancer development. We found that important determinants of the HNPCC tumor spectrum are, in addition to different predisposing germline mutations, organ-specific target genes and different instability profiles, loss of heterozygosity at the MLH1 locus, and MLH1 promoter methylation. This study provided a more precise molecular classification of families with CRC and EC. Our observations on familial CRC and EC are likely to have broader significance that extends to sporadic CRC and EC as well.
Abstract:
The aim of this thesis is to develop a fully automatic lameness detection system that operates in a milking robot. The instrumentation, measurement software, algorithms for data analysis and a neural network model for lameness detection were developed. Automatic milking has become a common practice in dairy husbandry, and in the year 2006 about 4000 farms worldwide used over 6000 milking robots. There is a worldwide movement with the objective of fully automating every process from feeding to milking. Increase in automation is a consequence of increasing farm sizes, the demand for more efficient production and the growth of labour costs. As the level of automation increases, the time that the cattle keeper uses for monitoring animals often decreases. This has created a need for systems for automatically monitoring the health of farm animals. The popularity of milking robots also offers a new and unique possibility to monitor animals in a single confined space up to four times daily. Lameness is a crucial welfare issue in the modern dairy industry. Limb disorders cause serious welfare, health and economic problems especially in loose housing of cattle. Lameness causes losses in milk production and leads to early culling of animals. These costs could be reduced with early identification and treatment. At present, only a few methods for automatically detecting lameness have been developed, and the most common methods used for lameness detection and assessment are various visual locomotion scoring systems. The problem with locomotion scoring is that it needs experience to be conducted properly, it is labour intensive as an on-farm method and the results are subjective. A four balance system for measuring the leg load distribution of dairy cows during milking in order to detect lameness was developed and set up in the University of Helsinki Research farm Suitia. 
The leg weights of 73 cows were successfully recorded during almost 10,000 robotic milkings over a period of 5 months. The cows were locomotion scored weekly, and the lame cows were inspected clinically for hoof lesions. Unsuccessful measurements, caused by cows standing outside the balances, were removed from the data with a special algorithm, and the mean leg loads and the number of kicks during milking were calculated. In order to develop an expert system to automatically detect lameness cases, a model was needed. A probabilistic neural network (PNN) classifier model was chosen for the task. The data was divided into two parts, and 5,074 measurements from 37 cows were used to train the model. The operation of the model was evaluated for its ability to detect lameness in the validation dataset, which had 4,868 measurements from 36 cows. The model was able to classify 96% of the measurements correctly as sound or lame cows, and 100% of the lameness cases in the validation data were identified. The share of measurements causing false alarms was 1.1%. The developed model has the potential to be used for on-farm decision support and can be used in a real-time lameness monitoring system.
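The idea behind a probabilistic neural network can be illustrated with a minimal sketch. It is not the model from the thesis: the feature layout (the fraction of body weight on each of four legs), the toy data, the function name `pnn_classify` and the smoothing parameter `sigma` are all assumptions made for illustration. A PNN estimates, for each class, a Gaussian Parzen-window density from the training examples and assigns a new measurement to the class with the highest density.

```python
import math


def pnn_classify(train_x, train_y, x, sigma=1.0):
    """Return the class whose mean Gaussian kernel response at x is highest."""
    sums = {}    # per-class sum of kernel responses
    counts = {}  # per-class number of training examples
    for features, label in zip(train_x, train_y):
        sq_dist = sum((a - b) ** 2 for a, b in zip(features, x))
        kernel = math.exp(-sq_dist / (2.0 * sigma ** 2))
        sums[label] = sums.get(label, 0.0) + kernel
        counts[label] = counts.get(label, 0) + 1
    # Averaging per class keeps unequal class sizes from biasing the decision.
    return max(sums, key=lambda label: sums[label] / counts[label])


# Hypothetical toy data: fraction of body weight carried by each leg.
train_x = [
    (0.27, 0.27, 0.23, 0.23),
    (0.26, 0.28, 0.23, 0.23),
    (0.30, 0.30, 0.28, 0.12),
    (0.31, 0.29, 0.12, 0.28),
]
train_y = ["sound", "sound", "lame", "lame"]

prediction = pnn_classify(train_x, train_y, (0.30, 0.30, 0.27, 0.13), sigma=0.05)
print(prediction)
```

In this sketch `sigma` controls how far the influence of each training example reaches; a real system would tune it on held-out data.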
Abstract:
A new rock mass classification scheme, the Host Rock Classification system (HRC-system) has been developed for evaluating the suitability of volumes of rock mass for the disposal of high-level nuclear waste in Precambrian crystalline bedrock. To support the development of the system, the requirements of host rock to be used for disposal have been studied in detail and the significance of the various rock mass properties have been examined. The HRC-system considers both the long-term safety of the repository and the constructability in the rock mass. The system is specific to the KBS-3V disposal concept and can be used only at sites that have been evaluated to be suitable at the site scale. By using the HRC-system, it is possible to identify potentially suitable volumes within the site at several different scales (repository, tunnel and canister scales). The selection of the classification parameters to be included in the HRC-system is based on an extensive study on the rock mass properties and their various influences on the long-term safety, the constructability and the layout and location of the repository. The parameters proposed for the classification at the repository scale include fracture zones, strength/stress ratio, hydraulic conductivity and the Groundwater Chemistry Index. The parameters proposed for the classification at the tunnel scale include hydraulic conductivity, Q´ and fracture zones and the parameters proposed for the classification at the canister scale include hydraulic conductivity, Q´, fracture zones, fracture width (aperture + filling) and fracture trace length. The parameter values will be used to determine the suitability classes for the volumes of rock to be classified. 
The HRC-system includes four suitability classes at the repository and tunnel scales and three suitability classes at the canister scale and the classification process is linked to several important decisions regarding the location and acceptability of many components of the repository at all three scales. The HRC-system is, thereby, one possible design tool that aids in locating the different repository components into volumes of host rock that are more suitable than others and that are considered to fulfil the fundamental requirements set for the repository host rock. The generic HRC-system, which is the main result of this work, is also adjusted to the site-specific properties of the Olkiluoto site in Finland and the classification procedure is demonstrated by a test classification using data from Olkiluoto. Keywords: host rock, classification, HRC-system, nuclear waste disposal, long-term safety, constructability, KBS-3V, crystalline bedrock, Olkiluoto
Abstract:
In this thesis we present and evaluate two pattern matching based methods for answer extraction in textual question answering systems. A textual question answering system is a system that seeks answers to natural language questions from unstructured text. Textual question answering systems are an important research problem because, as the amount of natural language text in digital format keeps growing, the need for novel methods for pinpointing important knowledge in vast textual databases becomes more and more urgent. We concentrate on developing methods for the automatic creation of answer extraction patterns. A new type of extraction pattern is also developed. The pattern matching based approach chosen is interesting because of its language and application independence. The answer extraction methods are developed in the framework of our own question answering system. Publicly available datasets in English are used as training and evaluation data for the methods. The techniques developed are based on the well-known methods of sequence alignment and hierarchical clustering. The similarity metric used is based on edit distance. The main conclusions of the research are that answer extraction patterns consisting of the most important words of the question and of the following information extracted from the answer context: plain words, part-of-speech tags, punctuation marks and capitalization patterns, can be used in the answer extraction module of a question answering system. This type of pattern and the two new methods for generating answer extraction patterns provide average results when compared to those produced by other systems using the same dataset. However, most answer extraction methods in the question answering systems tested with the same dataset are both hand crafted and based on a system-specific and fine-grained question classification. The new methods developed in this thesis require no manual creation of answer extraction patterns. 
As a source of knowledge, they require a dataset of sample questions and answers, as well as a set of text documents that contain answers to most of the questions. The question classification used in the training data is a standard one and provided already in the publicly available data.
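An edit-distance-based similarity of the kind mentioned above can be sketched as follows. This is an illustrative assumption rather than the metric used in the thesis: it applies the standard Levenshtein distance to token sequences (words or part-of-speech tags) and normalizes by the longer sequence length. The names `edit_distance` and `similarity` and the example token sequences are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or token lists)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, item_a in enumerate(a, start=1):
        curr = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Map the distance to [0, 1]; 1.0 means the sequences are identical."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


# Hypothetical answer contexts mixing plain words and part-of-speech tags.
context_a = ["was", "born", "in", "NNP"]
context_b = ["was", "born", "on", "CD"]
print(edit_distance(context_a, context_b), similarity(context_a, context_b))
```

A pairwise similarity like this is the kind of input that hierarchical clustering needs in order to group similar answer contexts before generalizing them into patterns.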
Abstract:
In visual object detection and recognition, classifiers have two interesting characteristics: accuracy and speed. Accuracy depends on the complexity of the image features and classifier decision surfaces. Speed depends on the hardware and the computational effort required to use the features and decision surfaces. When attempts to increase accuracy lead to increases in complexity and effort, it is necessary to ask how much are we willing to pay for increased accuracy. For example, if increased computational effort implies quickly diminishing returns in accuracy, then those designing inexpensive surveillance applications cannot aim for maximum accuracy at any cost. It becomes necessary to find trade-offs between accuracy and effort. We study efficient classification of images depicting real-world objects and scenes. Classification is efficient when a classifier can be controlled so that the desired trade-off between accuracy and effort (speed) is achieved and unnecessary computations are avoided on a per input basis. A framework is proposed for understanding and modeling efficient classification of images. Classification is modeled as a tree-like process. In designing the framework, it is important to recognize what is essential and to avoid structures that are narrow in applicability. Earlier frameworks are lacking in this regard. The overall contribution is two-fold. First, the framework is presented, subjected to experiments, and shown to be satisfactory. Second, certain unconventional approaches are experimented with. This allows the separation of the essential from the conventional. To determine if the framework is satisfactory, three categories of questions are identified: trade-off optimization, classifier tree organization, and rules for delegation and confidence modeling. Questions and problems related to each category are addressed and empirical results are presented. 
For example, related to trade-off optimization, we address the problem of computational bottlenecks that limit the range of trade-offs. We also ask if accuracy versus effort trade-offs can be controlled after training. For another example, regarding classifier tree organization, we first consider the task of organizing a tree in a problem-specific manner. We then ask if problem-specific organization is necessary.
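The notion of delegation and of controlling the accuracy versus effort trade-off after training can be illustrated with a small sketch. It is not the framework from the thesis: it assumes a simple chain of classifiers (a degenerate tree) in which each stage returns a label and a confidence, and an input is passed to the next, costlier stage only when the confidence falls below a threshold. The toy classifiers `cheap` and `expensive`, their costs and the threshold values are hypothetical.

```python
def classify_with_delegation(x, stages, threshold):
    """Run stages in order; stop at the first one that is confident enough.

    stages: list of (classifier, cost), where classifier(x) -> (label, confidence).
    Returns the final label and the total effort spent on this input.
    """
    effort = 0.0
    label = None
    for classifier, cost in stages:
        label, confidence = classifier(x)
        effort += cost
        if confidence >= threshold:
            break  # confident enough: skip the remaining, costlier stages
    return label, effort


def cheap(x):
    """Hypothetical fast stage: confidence grows with distance from the boundary."""
    label = "object" if x >= 0.5 else "background"
    return label, min(1.0, abs(x - 0.5) * 2)


def expensive(x):
    """Hypothetical slow stage with a refined boundary; always fully confident."""
    return ("object" if x >= 0.45 else "background"), 1.0


stages = [(cheap, 1.0), (expensive, 10.0)]
print(classify_with_delegation(0.95, stages, threshold=0.8))  # easy input
print(classify_with_delegation(0.48, stages, threshold=0.8))  # delegated input
print(classify_with_delegation(0.48, stages, threshold=0.0))  # effort capped
```

Changing `threshold` at run time shifts the balance between accuracy and effort without retraining any stage, and effort is spent on a per-input basis.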
Abstract:
Climate change contributes directly or indirectly to changes in species distributions, and there is very high confidence that recent climate warming is already affecting ecosystems. The Arctic has already experienced the greatest regional warming in recent decades, and the trend is continuing. However, studies on the northern ecosystems are scarce compared to more southerly regions. Better understanding of the past and present environmental change is needed to be able to forecast the future. Multivariate methods were used to explore the distributional patterns of chironomids in 50 shallow (≤ 10m) lakes in relation to 24 variables determined in northern Fennoscandia at the ecotonal area from the boreal forest in the south to the orohemiarctic zone in the north. Highest taxon richness was noted at middle elevations around 400 m a.s.l. Significantly lower values were observed from cold lakes situated in the tundra zone. Lake water alkalinity had the strongest positive correlation with the taxon richness. Many taxa had preference for lakes either on tundra area or forested area. The variation in the chironomid abundance data was best correlated with sediment organic content (LOI), lake water total organic carbon content, pH and air temperature, with LOI being the strongest variable. Three major lake groups were separated on the basis of their chironomid assemblages: (i) small and shallow organic-rich lakes, (ii) large and base-rich lakes, and (iii) cold and clear oligotrophic tundra lakes. Environmental variables best discriminating the lake groups were LOI, taxon richness, and Mg. When repeated, this kind of an approach could be useful and efficient in monitoring the effects of global change on species ranges. Many species of fast spreading insects, including chironomids, show a remarkable ability to track environmental changes. Based on this ability, past environmental conditions have been reconstructed using their chitinous remains in the lake sediment profiles. 
In order to study the Holocene environmental history of subarctic aquatic systems, and quantitatively reconstruct the past temperatures at or near the treeline, long sediment cores covering the last 10,000 years (the Holocene) were collected from three lakes. Lower temperature values than expected based on the presence of pine in the catchment during the mid-Holocene were reconstructed from a lake with great water volume and depth. The lake provided thermal refuge for profundal, cold-adapted taxa during the warm period. In a shallow lake, the decrease in the reconstructed temperatures during the late Holocene may reflect the indirect response of the midges to climate change through, e.g., pH change. The results from three lakes indicated that the response of chironomids to climate has been more or less indirect. However, concurrent shifts in assemblages of chironomids and vegetation in two lakes during the Holocene indicated that the midges together with the terrestrial vegetation had responded to the same ultimate cause, which most likely was the Holocene climate change. This was also supported by the similarity in the long-term trends in faunal succession for the chironomid assemblages in several lakes in the area. In northern Finnish Lapland the distribution of chironomids was significantly correlated with physical and limnological factors that are most likely to change as a result of future climate change. The indirect and individualistic response of aquatic systems, as reconstructed using the chironomid assemblages, to the climate change in the past suggests that in the future, the lake ecosystems in the north will not respond in one predictable way to the global climate change. Lakes in the north may respond to global climate change in various ways that are dependent on the initial characteristics of the catchment area and the lake.
Abstract:
The aims of the thesis are (1) to present a systematic evaluation of generation and its relevance as a sociological concept, (2) to reflect on how generational consciousness, i.e. generation as an object of collective identification that has social significance, can emerge and take shape, (3) to analyze empirically the generational experiences and consciousness of one specific generation, namely Finnish baby boomers (b. 1945-1950). The thesis contributes to the discussion on the social (as distinct from its genealogical) meaning of the concept of generation, launched by Karl Mannheim's classic Das Problem der Generationen (1928), in which the central idea is that a certain group of people is bonded together by a shared experience and that this bonding can result in a distinct self-consciousness. The thesis comprises six original articles and an extensive summarizing chapter. In the empirical articles, the baby boomers are studied on the basis of nationally representative survey data (N = 2628) and narrative life-story interviews (N = 38). In the article that discusses the connection of generations and social movements, the analysis is based on the member survey of Attac Finland (N = 1096). Three main themes were clarified in the thesis. (1) In the social sense the concept of generation is a modern, problematic, and ultimately a political concept. It served the interests of the intellectuals who developed the concept in the early 20th century and provided them, as an alternative to the concept of social class, a new way of thinking about social change and progress. The concept of generation is always coupled with the concept of Zeitgeist or some other controversial way of defining what is essential, i.e. what creates generations, in a given culture. Thus generation is, as a product of definition and classification struggles, a contested concept. 
The concept also clearly carries elitist connotations; the idea of some kind of vanguard (the elite) that represents an entire generation by proclaiming itself as its spokesman automatically creates a counterpart, namely the others in the peer group who are thought to be represented (the masses). (2) Generational consciousness cannot emerge as a result of any kind of automatic process or endogenously; it must be made. There has to be somebody who represents the generation in order for that generation to exist in people's minds and as an object of identification; generational experiences and their meanings must be articulated. Hence, social generations are, in a fundamental manner, discursively constructed. The articulations of generational experiences (speeches, writings, manifestos, labels etc.) can be called the discursive dimension of social generations, and through this notion, how public discourse shapes people's generational consciousness can be seen. Another important element in the process is collective memory, as generational consciousness often takes form only retrospectively. (3) Finnish baby boomers are not a united or homogeneous generation but are divided into many smaller sections with specific generational experiences and consciousnesses. The content of the generational consciousness of the baby boomers is heavily politically charged. A salient dividing line inside the age group is formed by individual attitudes towards so-called 1960s radicalism. Identification with the 1960s generation functions today as a positive self-definition of a certain small leftist elite group, and the values and characteristics usually connected with the idea of the 1960s generation do not represent the whole age group. On the contrary, among some of the baby boomers, the generational identification is still directed by the experience of how traditional values were disgraced in the 1960s. 
As objects of identification, the neutral term baby boomers and the charged 1960s generation are totally different things, and therefore they should not be used as synonyms. Although the significance of the group of the 1960s generation is often overestimated, they are however special with respect to generational consciousness because they have presented themselves as the voice of the entire generation. Their generational interpretations have spread through the media with the help of certain iconic images of the generation insomuch that 1960s radicalism has become an indirect generational experience for other parts of the baby boom cohort as well.
Abstract:
A new classification and linear sequence of the gymnosperms based on previous molecular and morphological phylogenetic and other studies is presented. Currently accepted genera are listed for each family and arranged according to their (probable) phylogenetic position. A full synonymy is provided, and types are listed for accepted genera. An index to genera assists in easy access to synonymy and family placement of genera.
Abstract:
Traumatic brain injury (TBI) affects people of all ages and is a cause of long-term disability. In recent years, the epidemiological patterns of TBI have been changing. TBI is a heterogeneous disorder with different forms of presentation and highly individual outcome regarding functioning and health-related quality of life (HRQoL). The meaning of disability differs from person to person based on the individual's personality, value system, past experience, and the purpose he or she sees in life. Understanding of all these viewpoints is needed in comprehensive rehabilitation. This study examines the epidemiology of TBI in Finland as well as functioning and HRQoL after TBI, and compares the subjective and objective assessments of outcome. The frame of reference is the International Classification of Functioning, Disability and Health (ICF). The subjects of Study I represent the population of Finnish TBI patients who experienced their first TBI between 1991 and 2005. The 55 Finnish subjects of Studies II and IV participated in the first wave of the international Quality of life after brain injury (QOLIBRI) validation study. The 795 subjects from six language areas of Study III formed the second wave of the QOLIBRI validation study. The average annual incidence of Finnish hospitalised TBI patients during the years 1991-2005 was 101 per 100,000 in patients who had TBI as the primary diagnosis and did not have a previous TBI in their medical history. Males (59.2%) were at considerably higher risk of getting a TBI than females. Falls were the most common external cause of injury in all age groups. The number of TBI patients ≥ 70 years of age increased by 59.4% while the number of inhabitants older than 70 years increased by 30.3% in the population of Finland during the same time period. The functioning of a sample of 55 persons with TBI was assessed by extracting information from the patients' medical documents using the ICF checklist. 
The most common problems were found in the ICF components of Body Functions (b) and Activities and Participation (d). HRQoL was assessed with the QOLIBRI, which showed the highest level of satisfaction on the Emotions, Physical Problems and Daily Life and Autonomy scales. The highest scores were obtained by the youngest participants and participants living independently without the help of other people, and by people who were working. The relationship between the functional outcome and HRQoL was not straightforward. The procedure of linking the QOLIBRI and the GOSE to the ICF showed that these two outcome measures cover the relevant domains of TBI patients' functioning. The QOLIBRI provides the patients' subjective view, while the GOSE summarises the objective elements of functioning. Our study indicates that there are certain domains of functioning that are not traditionally sufficiently documented but are important for the HRQoL of persons with TBI. This was the finding especially in the domains of interpersonal relationships, social and leisure activities, self, and the environment. Rehabilitation aims to optimize functioning and to minimize the experience of disability among people with health conditions, and it needs to be based on a comprehensive understanding of human functioning. As an integrative model, the ICF may serve as a frame of reference in achieving such an understanding.
Abstract:
Road transport and infrastructure are of fundamental importance to the developing world. Poor quality and inadequate coverage of roads, lack of maintenance operations and outdated road maps continue to hinder economic and social development in the developing countries. This thesis focuses on studying the present state of road infrastructure and its mapping in the Taita Hills, south-east Kenya. The study is part of the TAITA-project by the Department of Geography, University of Helsinki. The road infrastructure of the study area is studied by remote sensing and GIS based methodology. The principal dataset, true colour airborne digital camera data from 2004, was used to generate an aerial image mosaic of the study area. Auxiliary data includes SPOT satellite imagery from 2003, field spectrometry data of road surfaces and relevant literature. Road infrastructure characteristics are interpreted from three test sites using pixel-based supervised classification, object-oriented supervised classifications and visual interpretation. Road infrastructure of the test sites is interpreted visually from a SPOT image. Road centrelines are then extracted from the object-oriented classification results with an automatic vectorisation process. The road infrastructure of the entire image mosaic is mapped by applying the most appropriate assessed data and techniques. The spectral characteristics and reflectance of various road surfaces are considered with the acquired field spectra and relevant literature. The results are compared with the road mapping methods tested. This study concludes that classification and extraction of roads remains a difficult task, and that the accuracy of the results is inadequate regardless of the high spatial resolution of the image mosaic used in this thesis. Of all the methods tested in this thesis, visual interpretation is the most straightforward, accurate and valid technique for road mapping. 
Certain road surfaces have spectral characteristics and reflectance values similar to those of other land cover and land use types. This has a great influence on digital analysis techniques in particular. Road mapping is made even more complicated by rich vegetation and tree canopy, clouds, shadows, low contrast between roads and surroundings, and the narrowness of roads in relation to the spatial resolution of the imagery used. The results of this thesis may be applied to road infrastructure mapping in developing countries in a more general context, although with certain limits. In particular, unclassified rural roads require updated road mapping schemas to improve road transport possibilities and to assist in the development of the developing world.