11 results for clustered data
in Helda - Digital Repository of University of Helsinki
Abstract:
In this paper, I look into a grammatical phenomenon found among speakers of the Cambridgeshire dialect of English. According to my hypothesis, the phenomenon is a new entry into the past BE verb paradigm in the English language. I claim that the structure I have found complements the two existing verb forms, was and were, with a third form that I have labelled ‘intermediate past BE’. The paper is divided into two parts. In the first part, I introduce the theoretical ground for the study of variation, which is founded on empiricist principles. In variationist linguistics, the main claim is that heterogeneous language use is structured and ordered. Over the last 50 years of modern linguistics, this claim has been controversial. In the 1960s, the generativist movement spearheaded by Noam Chomsky diverted attention away from grammatical theories that are based on empirical observations. The generativists steered away from language diversity, variation and change in favour of generalisations, abstractions and universalist claims. The theoretical part of my paper goes through the main points of the variationist agenda and concludes that abandoning the concept of language variation in linguistics is harmful for both theory and methodology. In the method part of the paper, I present the Helsinki Archive of Regional English Speech (HARES) corpus. It is an audio archive that contains interviews conducted in England in the 1970s and 1980s. The interviews were conducted in accordance with methods generally used in traditional dialectology. The informants are mostly elderly men who have lived in the same region throughout their lives and who left school at an early age. The interviews are in effect conversations: the interviewer allowed the informant to pick the topic of conversation to induce a maximally relaxed and comfortable atmosphere and thus allow the most natural dialect variant to emerge in the informant’s speech.
In the paper, the corpus chapter introduces some of the transcription and annotation problems associated with spoken language corpora (especially those containing dialectal speech). Questions surrounding the concept of variation are present in this part of the paper too, since transcription work in particular is troubled by the fundamental problem of having to describe the fluctuations of everyday speech in text. In the empirical section of the paper, I use HARES to analyse the speech of four informants, with special focus on the emergence of the intermediate past BE variant. My observations and the subsequent analysis permit me to claim that my hypothesis seems to hold. The intermediate variant occupies almost all contexts where one would expect was or were in the informants’ speech. This means that the new variant is integrated into the speakers’ grammars and exemplifies the kind of variation that is at the heart of this paper.
Abstract:
Lost boys. A multiple case study of the complex school career and life-course of male students who have attended special classes for the emotionally and behaviourally maladjusted. The purpose of this thesis is to describe the school career and the life-course of eight former special-class students from the comprehensive school to their further education and into adulthood. The members of the target group were students in special classes for pupils with emotional and behavioural difficulties in southern Finland. The interviews were conducted at school in 1994-1997, with follow-up interviews in 2002-2003, when the participants were already adults. Six mothers were also interviewed. The qualitative data was gathered using individual interviews and the Adult Attachment Interview. The aim was to explore the life-histories of the subjects from early childhood to early adulthood. Information was also gathered from documents concerning the students' school attendance. Each life-history is illustrated as a life-course graphic. The data has been analysed using different frames of reference and combining different theories. In addition to theories considering developmental risk factors and protective factors, the data is considered using theories of control over life, attribution, self-efficacy, identity and attitudes towards education. The experiential living mode of the students has been studied as well. The results show that the frames of reference used complement each other. The target students clustered identically regardless of the frame of reference. As a result, the study has illustrated the same phenomenon from different points of view. The study identifies three types of school career: the winding career, the vicious circle career and the straight career. The three careers differ from each other in developmental risk and protective factors and in the students' post-school life-courses.
The type of childhood family, and especially the fathers' attention to their sons' schooling and free time, was important. Keywords: Pupils with emotional and behavioural difficulties, maladjustment to school, life-course, identity
Abstract:
The aim of this study was to measure seasonal variation in mood and behaviour. The dual vulnerability and latitude effect hypotheses, the risk that increased appetite, weight and other seasonal symptoms pose for developing metabolic syndrome, and the role of perceived low illumination in quality of life and mental well-being were assessed. These variations are prevalent in persons who live at high latitudes and need to balance metabolic processes to adapt to seasonal environmental changes. A random sample of 8028 adults aged 30 and over (55% women) participated in an epidemiological health examination study, Health 2000, which applied the probability proportional to population size method for a range of socio-demographic characteristics. The participants took part in a face-to-face interview at home and a health status examination. The questionnaires included the modified versions of the Seasonal Pattern Assessment Questionnaire (SPAQ) and Beck Depression Inventory (BDI), the Health Related Quality of Life (HRQoL) instrument 15D, and the General Health Questionnaire (GHQ). The structured and computerized Munich Composite International Diagnostic Interview (M-CIDI) was used as part of the interview to assess diagnoses of mental disorders, and the National Cholesterol Education Program Adult Treatment Panel III (NCEP-ATPIII) criteria were assessed using all the available information to detect metabolic syndrome. A key finding was that 85% of this nationwide representative sample had seasonal variation in mood and behaviour. Approximately 9% of the study population presented combined seasonal and depressive symptoms with a significant association between their scores, and 2.6% had symptoms that corresponded to Seasonal Affective Disorder (SAD) in severity. Seasonal variations in weight and appetite are two important components that increase the risk of metabolic syndrome.
Other factors such as waist circumference and major depressive disorder contributed to the metabolic syndrome as well. Persons who reported seasonal symptoms had a poorer quality of life and compromised mental well-being, especially if indoor illumination at home and/or at work was experienced as low. Seasonal and circadian misalignments are suggested to be associated with metabolic disorders, and may become noticeable when individuals perceive low illumination levels at home and/or at work that affect health-related quality of life and mental well-being. Keywords: depression, health-related quality of life, illumination, latitude, mental well-being, metabolic syndrome, seasonal variation, winter.
Abstract:
In genetic epidemiology, population-based disease registries are commonly used to collect genotype or other risk factor information concerning affected subjects and their relatives. This work presents two new approaches for the statistical inference of ascertained data: a conditional and a full likelihood approach for a disease with a variable age-at-onset phenotype, using familial data obtained from a population-based registry of incident cases. The aim is to obtain statistically reliable estimates of the general population parameters. The statistical analysis of familial data with variable age at onset becomes more complicated when some of the study subjects are non-susceptible, that is to say, these subjects never get the disease. A statistical model for a variable age at onset with long-term survivors is proposed for studies of familial aggregation, using a latent variable approach, as well as for prospective genetic association studies with candidate genes. In addition, we explore the possibility of a genetic explanation of the observed increase in the incidence of Type 1 diabetes (T1D) in Finland in recent decades and the hypothesis of non-Mendelian transmission of T1D-associated genes. Both classical and Bayesian statistical inference were used in the modelling and estimation. Although this work contains five studies with different statistical models, they all concern data obtained from nationwide registries of T1D and the genetics of T1D. In the analyses of T1D data, non-Mendelian transmission of T1D susceptibility alleles was not observed. In addition, non-Mendelian transmission of T1D susceptibility genes did not provide a plausible explanation for the increase in T1D incidence in Finland. Instead, the Human Leucocyte Antigen associations with T1D were confirmed in the population-based analysis, which combines T1D registry information, a reference sample of healthy subjects and birth cohort information of the Finnish population.
Finally, a substantial familial variation in the susceptibility to T1D nephropathy was observed. The presented studies show the benefits of sophisticated statistical modelling in exploring risk factors for complex diseases.
Abstract:
Helicobacter pylori infection is a risk factor for gastric cancer, which is a major health issue worldwide. Gastric cancer has a poor prognosis due to the unnoticeable progression of the disease, and surgery is the only available treatment. Therefore, gastric cancer patients would greatly benefit from the identification of biomarker genes that would improve diagnostic and prognostic prediction and provide targets for molecular therapies. DNA copy number amplifications are the hallmarks of cancers in various anatomical locations. Mechanisms of amplification predict that DNA double-strand breaks occur at the margins of the amplified region. The first objective of this thesis was to identify the genes that were differentially expressed in H. pylori infection as well as the transcription factors and signal transduction pathways that were associated with the gene expression changes. The second objective was to identify putative biomarker genes in gastric cancer with correlated expression and copy number, and the last objective was to characterize cancers based on DNA copy number amplifications. DNA microarrays, an in vitro model and real-time polymerase chain reaction were used to measure gene expression changes in H. pylori-infected AGS cells. In order to identify the transcription factors and signal transduction pathways that were activated after H. pylori infection, gene expression profiling data from the H. pylori experiments and a bioinformatics approach accompanied by experimental validation were used. Genome-wide expression and copy number microarray analysis of clinical gastric cancer samples and immunohistochemistry on tissue microarray were used to identify putative gastric cancer genes. Data mining and machine learning techniques were applied to study amplifications in a cross-section of cancers. FOS and various stress response genes were regulated by H. pylori infection.
H. pylori-regulated genes were enriched in the chromosomal regions that are frequently changed in gastric cancer, suggesting that molecular pathways of gastric cancer and premalignant H. pylori infection that induces gastritis are interconnected. Sixteen transcription factors were identified as being associated with H. pylori infection-induced changes in gene expression. The NF-κB transcription factor and its p50 and p65 subunits were verified using electrophoretic mobility shift assays. ERBB2 and other genes located in 17q12-q21 were found to be up-regulated in association with copy number amplification in gastric cancer. Cancers with similar cell type and origin clustered together based on the genomic localization of the amplifications. Cancer genes and large genes were co-localized with amplified regions; fragile sites, telomeres, centromeres and light chromosome bands were enriched at the amplification boundaries. H. pylori-activated transcription factors and signal transduction pathways function in cellular mechanisms that might be capable of promoting carcinogenesis of the stomach. Intestinal and diffuse type gastric cancers showed distinct molecular genetic profiles. Integration of gene expression and copy number microarray data allowed the identification of genes that might be involved in gastric carcinogenesis and have clinical relevance. Gene amplifications were demonstrated to be non-random genomic instabilities. Cell lineage, properties of precursor stem cells, tissue microenvironment and genomic map localization of specific oncogenes define the site specificity of DNA amplifications, whereas labile genomic features define the structures of amplicons. These conclusions suggest that the definition of genomic changes in cancer is based on the interplay between the cancer cell and the tumor microenvironment.
Abstract:
During the last decades there has been a global shift in forest management from a focus solely on timber management to ecosystem management that endorses all aspects of forest functions: ecological, economic and social. This has resulted in a shift in paradigm from sustained yield to sustained diversity of values, goods and benefits obtained at the same time, introducing new temporal and spatial scales into forest resource management. The purpose of the present dissertation was to develop methods that would enable spatial and temporal scales to be introduced into the storage, processing, access and utilization of forest resource data. The methods developed are based on a conceptual view of a forest as a hierarchically nested collection of objects that can have a dynamically changing set of attributes. The temporal aspect of the methods consists of lifetime management for the objects and their attributes and of a temporal succession linking the objects together. Development of the forest resource data processing method concentrated on the extensibility and configurability of the data content and model calculations, allowing for a diverse set of processing operations to be executed using the same framework. The contribution of this dissertation to the utilisation of multi-scale forest resource data lies in the development of a reference data generation method to support forest inventory methods in approaching single-tree resolution.
Abstract:
This thesis examines the feasibility of a forest inventory method based on two-phase sampling in estimating forest attributes at the stand or substand levels for forest management purposes. The method is based on multi-source forest inventory combining auxiliary data consisting of remote sensing imagery or other geographic information and field measurements. Auxiliary data are utilized as first-phase data for covering all inventory units. Various methods were examined for improving the accuracy of the forest estimates. Pre-processing of auxiliary data in the form of correcting the spectral properties of aerial imagery was examined (I), as was the selection of aerial image features for estimating forest attributes (II). Various spatial units were compared for extracting image features in a remote sensing aided forest inventory utilizing very high resolution imagery (III). A number of data sources were combined and different weighting procedures were tested in estimating forest attributes (IV, V). Correction of the spectral properties of aerial images proved to be a straightforward and advantageous method for improving the correlation between the image features and the measured forest attributes. Testing different image features that can be extracted from aerial photographs (and other very high resolution images) showed that the images contain a wealth of relevant information that can be extracted only by utilizing the spatial organization of the image pixel values. Furthermore, careful selection of image features for the inventory task generally gives better results than inputting all extractable features to the estimation procedure. When the spatial units for extracting very high resolution image features were examined, an approach based on image segmentation generally showed advantages compared with a traditional sample plot-based approach. Combining several data sources resulted in more accurate estimates than any of the individual data sources alone. 
The best combined estimate can be derived by weighting the estimates produced by the individual data sources by the inverse values of their mean square errors. Despite the fact that the plot-level estimation accuracy in two-phase sampling inventory can be improved in many ways, the accuracy of forest estimates based mainly on single-view satellite and aerial imagery is a relatively poor basis for making stand-level management decisions.
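The inverse mean-square-error weighting described above can be illustrated with a minimal Python sketch. The function name `combine_inverse_mse` and the example figures are hypothetical and are not taken from the thesis; the sketch only assumes that each data source supplies one estimate together with a positive mean square error.

```python
def combine_inverse_mse(estimates, mses):
    """Combine estimates of one forest attribute, weighting each by 1 / MSE.

    Hypothetical helper for illustration: `estimates` and `mses` are
    equal-length sequences, one entry per data source.
    """
    if not estimates or len(estimates) != len(mses):
        raise ValueError("need one MSE per estimate and at least one estimate")
    if any(m <= 0 for m in mses):
        raise ValueError("mean square errors must be positive")
    # A source with a smaller MSE receives a proportionally larger weight.
    weights = [1.0 / m for m in mses]
    total = sum(weights)
    # Normalise so that the weights sum to one.
    return sum(w * e for w, e in zip(weights, estimates)) / total


# Made-up example: two sources estimate the same attribute; the first is
# three times more precise (MSE 1.0 versus 3.0) and so dominates the result.
combined = combine_inverse_mse([100.0, 200.0], [1.0, 3.0])
print(combined)
```

With equal mean square errors the function reduces to a plain average, so the weighting only matters when the sources differ in accuracy.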
Abstract:
The Royal Forest Department began establishing Pinus kesiya Roy. ex Gord. plantations in Thailand in the 1960s. The aim was to reforest abandoned swidden areas and grasslands in order to reduce erosion and to produce timber and fuel wood. Today there are about 150,000 ha of P. kesiya plantations in northern Thailand. Most of these plantations cannot be harvested due to a national logging ban. Previous studies have suggested that Pinus kesiya plantations have potential as a foster environment for native broadleaved tree species, but little is known about the extent of regeneration in these plantations. The general aim of the study was to clarify the extent of forest regeneration and the interactions behind it in Pinus kesiya plantations of the Ping River basin, northern Thailand. Based on the results of this study and previous literature, forest management proposals were produced for the area studied. In four different pine plantation areas, a total of seven plantations were assessed using systematic data collection with clustered circular sample plots. Vegetation and environmental data were statistically analysed to identify the key factors affecting regeneration. Regeneration had occurred in all plantations studied. Regeneration of broadleaved trees was negatively affected by forest fire and canopy coverage. A high basal area of mature broadleaved trees affected the regeneration process positively. Forest fire disturbance also had a strong effect on plantation structure and species composition. Because the future forest management setting is unclear as regards forest laws in Thailand, a management system that enables various future utilisation possibilities and emphasises local participation is recommended for P. kesiya watershed plantations of northern Thailand.
Abstract:
The aim of this thesis is to develop a fully automatic lameness detection system that operates in a milking robot. The instrumentation, measurement software, algorithms for data analysis and a neural network model for lameness detection were developed. Automatic milking has become a common practice in dairy husbandry, and in 2006 about 4000 farms worldwide used over 6000 milking robots. There is a worldwide movement with the objective of fully automating every process from feeding to milking. The increase in automation is a consequence of increasing farm sizes, the demand for more efficient production and the growth of labour costs. As the level of automation increases, the time that the cattle keeper spends monitoring animals often decreases. This has created a need for systems that automatically monitor the health of farm animals. The popularity of milking robots also offers a new and unique possibility to monitor animals in a single confined space up to four times daily. Lameness is a crucial welfare issue in the modern dairy industry. Limb disorders cause serious welfare, health and economic problems, especially in loose housing of cattle. Lameness causes losses in milk production and leads to early culling of animals. These costs could be reduced with early identification and treatment. At present, only a few methods for automatically detecting lameness have been developed, and the most common methods used for lameness detection and assessment are various visual locomotion scoring systems. The problem with locomotion scoring is that it requires experience to be conducted properly, it is labour intensive as an on-farm method and the results are subjective. A four-balance system for measuring the leg load distribution of dairy cows during milking in order to detect lameness was developed and set up at the University of Helsinki research farm Suitia.
The leg weights of 73 cows were successfully recorded during almost 10,000 robotic milkings over a period of 5 months. The cows were locomotion scored weekly, and the lame cows were inspected clinically for hoof lesions. Unsuccessful measurements, caused by cows standing outside the balances, were removed from the data with a special algorithm, and the mean leg loads and the number of kicks during milking were calculated. In order to develop an expert system to automatically detect lameness cases, a model was needed. A probabilistic neural network (PNN) classifier model was chosen for the task. The data was divided into two parts, and 5,074 measurements from 37 cows were used to train the model. The operation of the model was evaluated for its ability to detect lameness in the validation dataset, which had 4,868 measurements from 36 cows. The model was able to classify 96% of the measurements correctly as coming from sound or lame cows, and 100% of the lameness cases in the validation data were identified. The number of measurements causing false alarms was 1.1%. The developed model has the potential to be used for on-farm decision support and can be used in a real-time lameness monitoring system.
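To make the classifier idea concrete, the following is a minimal Python sketch of a generic probabilistic neural network (a Gaussian Parzen-window classifier). It is not the model from the thesis: the function name `pnn_classify`, the smoothing parameter `sigma`, the choice of features (the fraction of body weight carried by each of the four legs) and all numbers are assumptions made purely for illustration.

```python
import math


def pnn_classify(x, training, sigma=0.05):
    """Return the class whose training patterns give the highest mean kernel value.

    `training` maps a class label to a list of feature vectors. Each training
    pattern acts as one pattern-layer unit with a Gaussian kernel; the
    summation layer averages the units of each class, and the decision layer
    picks the largest average. Equal class priors are assumed.
    """
    best_label, best_score = None, -1.0
    for label, patterns in training.items():
        if not patterns:
            continue  # a class without training patterns cannot be scored
        total = 0.0
        for p in patterns:
            sq_dist = sum((a - b) ** 2 for a, b in zip(x, p))
            total += math.exp(-sq_dist / (2.0 * sigma ** 2))
        score = total / len(patterns)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up leg-load fractions (front left, front right, hind left, hind right).
training = {
    "sound": [[0.27, 0.27, 0.23, 0.23], [0.28, 0.26, 0.23, 0.23]],
    "lame": [[0.30, 0.30, 0.28, 0.12], [0.31, 0.29, 0.27, 0.13]],
}
print(pnn_classify([0.27, 0.27, 0.24, 0.22], training))
print(pnn_classify([0.30, 0.30, 0.27, 0.13], training))
```

In this kind of classifier there is no iterative training step: the training measurements themselves form the network, and `sigma` controls how far the influence of each one reaches.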