906 results for Data Interpretation, Statistical
Abstract:
An application that translates raw thermal melt curve data into more easily assimilated knowledge is described. This program, called ‘Meltdown’, performs a number of data remediation steps before classifying melt curves and estimating melting temperatures. The final output is a report that summarizes the results of a differential scanning fluorimetry experiment. Meltdown uses a Bayesian classification scheme, enabling reproducible identification of various trends commonly found in DSF datasets. The goal of Meltdown is not to replace human analysis of the raw data, but to provide a sensible interpretation of the data to make this useful experimental technique accessible to naïve users, as well as providing a starting point for detailed analyses by more experienced users.
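As a rough illustration of the two steps the abstract describes, the sketch below estimates a melting temperature (Tm) from a single melt curve and classifies curve shape with a simple Bayesian (naive Bayes) model. The feature set, class labels and use of scikit-learn are assumptions made for illustration; they are not Meltdown's actual implementation.

```python
# Illustrative sketch only; not the Meltdown program's implementation.
import numpy as np
from sklearn.naive_bayes import GaussianNB

def estimate_tm(temps, fluor):
    """Estimate Tm as the temperature of the steepest fluorescence increase."""
    dF = np.gradient(fluor, temps)
    return temps[np.argmax(dF)]

def curve_features(temps, fluor):
    """Simple shape descriptors used as classifier inputs (hypothetical choices)."""
    f = (fluor - fluor.min()) / (fluor.max() - fluor.min() + 1e-12)  # normalise to 0..1
    dF = np.gradient(f, temps)
    return [f[0], f[-1], dF.max(), temps[np.argmax(dF)]]

# Hypothetical training data: feature vectors labelled with curve classes such
# as "clean transition", "aggregated at start", or "no transition".
clf = GaussianNB()
# clf.fit(X_train, y_train)                            # X_train: feature vectors
# label = clf.predict([curve_features(temps, fluor)])  # classify a new curve
```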
Abstract:
Compositional data analysis usually deals with relative information between parts where the total (abundances, mass, amount, etc.) is unknown or uninformative. This article addresses the question of what to do when the total is known and is of interest. Tools used in this case are reviewed and analysed, in particular the relationship between the positive orthant of D-dimensional real space, the product space of the real line times the D-part simplex, and their Euclidean space structures. The first alternative corresponds to analysing the data by taking logarithms of each component, and the second to treating a log-transformed total jointly with a composition describing the distribution of component amounts. Real data on total abundances of phytoplankton in an Australian river motivated the present study and are used for illustration.
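A minimal sketch of the two representations the abstract contrasts, under assumed notation: for a positive D-part vector of amounts, (a) componentwise logarithms, and (b) the log-transformed total paired with a centred log-ratio (clr) view of the closed composition. The clr choice and the example numbers are illustrative assumptions, not the article's exact treatment.

```python
import numpy as np

def componentwise_log(x):
    """Alternative (a): analyse the log of each amount directly."""
    return np.log(x)

def total_plus_composition(x):
    """Alternative (b): log-transformed total plus clr coordinates of the composition."""
    total = x.sum()
    comp = x / total                              # composition on the D-part simplex
    clr = np.log(comp) - np.log(comp).mean()      # centred log-ratio coordinates
    return np.log(total), clr

x = np.array([12.0, 3.5, 0.8, 20.1])              # e.g. abundances of D = 4 taxa (made up)
print(componentwise_log(x))
print(total_plus_composition(x))
```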
Abstract:
Population dynamics are generally viewed as the result of intrinsic (purely density dependent) and extrinsic (environmental) processes. Both components, and potential interactions between the two, have to be modelled in order to understand and predict the dynamics of natural populations; a topic that is of great importance in population management and conservation. This thesis focuses on modelling environmental effects in population dynamics and on how the effects of potentially relevant environmental variables can be statistically identified and quantified from time series data. Chapter I presents some useful models of multiplicative environmental effects for unstructured density dependent populations. The presented models can be written as standard multiple regression models that are easy to fit to data. Chapters II–IV constitute empirical studies that statistically model environmental effects on the population dynamics of several migratory bird species with different life history characteristics and migration strategies. In Chapter II, spruce cone crops are found to have a strong positive effect on the population growth of the great spotted woodpecker (Dendrocopos major), while cone crops of pine, another important food resource for the species, do not effectively explain population growth. The study compares rate- and ratio-dependent effects of cone availability, using state-space models that distinguish between process and observation error in the time series data. Chapter III shows how drought, in combination with settling behaviour during migration, produces asymmetric spatially synchronous patterns of population dynamics in North American ducks (genus Anas). Chapter IV investigates the dynamics of a Finnish population of skylark (Alauda arvensis), and points out effects of rainfall and habitat quality on population growth. Because the skylark time series and some of the environmental variables included show strong positive autocorrelation, statistical significances are calculated using a Monte Carlo method in which random autocorrelated time series are generated. Chapter V is a simulation-based study showing that ignoring observation error in analyses of population time series data can bias the estimated effects and measures of uncertainty if the environmental variables are autocorrelated. It is concluded that the use of state-space models is an effective way to reach more accurate results. In summary, there are several biological assumptions and methodological issues that can affect the inferential outcome when estimating environmental effects from time series data, and that therefore need special attention. The functional form of the environmental effects and potential interactions between environment and population density are important to deal with. Other issues that should be considered are assumptions about density dependent regulation, modelling potential observation error, and, when needed, accounting for spatial and/or temporal autocorrelation.
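The following sketch illustrates, under an assumed functional form, how a multiplicative environmental effect can be written as a standard multiple regression: a Gompertz-type model in which the log growth rate depends on log density and an environmental covariate. It is not the thesis's exact model; the simulated data and parameter values are made up.

```python
import numpy as np
import statsmodels.api as sm

def fit_growth_model(N, E):
    """Fit r_t = log(N_t / N_{t-1}) = a + b*log(N_{t-1}) + c*E_t + error."""
    r = np.diff(np.log(N))                                   # yearly log growth rates
    X = sm.add_constant(np.column_stack([np.log(N[:-1]), E[1:]]))
    return sm.OLS(r, X).fit()

# Example with simulated data (illustration only).
rng = np.random.default_rng(0)
E = rng.normal(size=30)                                      # standardised environmental variable
N = [100.0]
for t in range(29):
    r = 1.0 - 0.25 * np.log(N[-1]) + 0.3 * E[t + 1] + rng.normal(scale=0.1)
    N.append(N[-1] * np.exp(r))
print(fit_growth_model(np.array(N), E).params)               # estimates of (a, b, c)
```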
Abstract:
Understanding the functioning of a neural system in terms of its underlying circuitry is an important problem in neuroscience. Recent developments in electrophysiology and imaging allow one to simultaneously record the activities of hundreds of neurons. Inferring the underlying neuronal connectivity patterns from such multi-neuronal spike train data streams is a challenging statistical and computational problem. This task involves finding significant temporal patterns in vast amounts of symbolic time series data. In this paper we show that frequent episode mining methods from the field of temporal data mining can be very useful in this context. In the frequent episode discovery framework, the data is viewed as a sequence of events, each of which is characterized by an event type and its time of occurrence, and episodes are certain types of temporal patterns in such data. Here we show that, using the set of discovered frequent episodes from multi-neuronal data, one can infer different types of connectivity patterns in the neural system that generated it. For this purpose, we introduce the notion of mining for frequent episodes under certain temporal constraints; the structure of these temporal constraints is motivated by the application. We present algorithms for discovering serial and parallel episodes under these temporal constraints. Through extensive simulation studies we demonstrate that these methods are useful for unearthing patterns of neuronal network connectivity.
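A simplified sketch of the core idea, not the paper's algorithms: counting non-overlapped occurrences of a serial episode A → B in a timed event sequence, subject to a temporal constraint on the inter-event delay. The event types and the delay window below are hypothetical.

```python
def count_serial_episode(events, src, dst, min_delay, max_delay):
    """Count non-overlapped occurrences of src -> dst in a time-sorted list of
    (time, event_type) pairs, with the inter-event delay constrained to
    [min_delay, max_delay]."""
    count = 0
    pending = None                       # time of an unmatched 'src' event
    for t, etype in events:
        if pending is not None and etype == dst:
            if min_delay <= t - pending <= max_delay:
                count += 1
                pending = None           # non-overlapped counting: consume the source event
        if etype == src:
            pending = t                  # keep the most recent candidate source event
    return count

# Hypothetical spike events: neuron "n3" tends to precede neuron "n7" by ~1 ms.
spikes = [(0.0, "n3"), (1.2, "n7"), (5.0, "n3"), (5.8, "n7"), (9.0, "n7")]
print(count_serial_episode(spikes, "n3", "n7", 0.5, 2.0))   # -> 2
```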
Abstract:
This work focuses on the role of macroseismology in the assessment of seismicity and probabilistic seismic hazard in Northern Europe. The main type of data under consideration is a set of macroseismic observations available for a given earthquake. The macroseismic questionnaires used to collect earthquake observations from local residents since the late 1800s constitute a special part of the seismological heritage in the region. Information on the earthquakes felt on the coasts of the Gulf of Bothnia between 31 March and 2 April 1883 and on 28 July 1888 was retrieved from contemporary Finnish and Swedish newspapers, while the earthquake of 4 November 1898 GMT is an example of an early systematic macroseismic survey in the region. A data set of more than 1200 macroseismic questionnaires is available for the earthquake in Central Finland on 16 November 1931. Basic macroseismic investigations, including the preparation of new intensity data point (IDP) maps, were conducted for these earthquakes. Previously disregarded usable observations were found in the press. The improved collection of IDPs for the 1888 earthquake shows that this event was a rare occurrence in the area: in contrast to earlier notions, it was felt on both sides of the Gulf of Bothnia. The data on the earthquake of 4 November 1898 GMT were augmented with historical background information discovered in various archives and libraries. This earthquake was of some concern to the authorities, because extra fire inspections were conducted in at least three towns, namely Tornio, Haparanda and Piteå, located in the centre of the area of perceptibility. This event posed the indirect hazard of fire, although its magnitude, around 4.6, was minor on the global scale. The distribution of slightly damaging intensities was larger than previously outlined. This may have resulted from the amplification of ground shaking in the soft soil of the coast and river valleys, where most of the population was located. The large data set of the 1931 earthquake provided an opportunity to apply statistical methods and to assess methodologies that can be used when dealing with macroseismic intensity. The intensity data were evaluated using correspondence analysis. Different approaches, such as gridding, were tested to estimate the macroseismic field from intensity values distributed irregularly in space. In general, the characteristics of intensity warrant careful consideration, and a more pervasive perception of intensity as an ordinal quantity affected by uncertainties is advocated. A parametric earthquake catalogue comprising entries from both the macroseismic and instrumental eras was used for probabilistic seismic hazard assessment. The parametric-historic methodology was applied to estimate seismic hazard at a given site in Finland and to prepare a seismic hazard map for Northern Europe. The interpretation of these results is an important issue, because the recurrence times of damaging earthquakes may well exceed thousands of years in an intraplate setting such as Northern Europe. This application may therefore be seen as an example of short-term hazard assessment.
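As one simple illustration of gridding irregularly spaced intensity data points, the sketch below interpolates IDPs onto a regular grid with inverse-distance weighting. This is an assumption about one possible approach, not the thesis's procedure, and it treats intensity as continuous rather than ordinal, a simplification the abstract itself cautions against. The coordinates and intensities are made up.

```python
import numpy as np

def idw_grid(x, y, vals, grid_x, grid_y, power=2.0):
    """Interpolate point intensities onto a regular grid with naive inverse-distance weighting."""
    gx, gy = np.meshgrid(grid_x, grid_y)
    field = np.zeros_like(gx, dtype=float)
    for i in range(gx.shape[0]):
        for j in range(gx.shape[1]):
            d = np.hypot(x - gx[i, j], y - gy[i, j])
            if d.min() < 1e-9:                       # grid node coincides with an IDP
                field[i, j] = vals[d.argmin()]
            else:
                w = 1.0 / d**power
                field[i, j] = np.sum(w * vals) / np.sum(w)
    return field

# Hypothetical IDPs (km coordinates, macroseismic intensities).
x = np.array([0.0, 10.0, 25.0, 40.0])
y = np.array([0.0, 20.0, 5.0, 30.0])
vals = np.array([5.0, 4.0, 4.0, 3.0])
print(idw_grid(x, y, vals, np.linspace(0, 40, 5), np.linspace(0, 30, 4)))
```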
Abstract:
An efficient and statistically robust solution for the identification of asteroids among numerous sets of astrometry is presented. In particular, numerical methods have been developed for the short-term identification of asteroids at discovery, and for the long-term identification of scarcely observed asteroids over apparitions, a task that has lacked a robust method until now. The methods are based on the solid foundation of statistical orbital inversion, properly taking into account the observational uncertainties, which allows for the detection of practically all correct identifications. Through the use of dimensionality-reduction techniques and efficient data structures, the exact methods have a log-linear, that is, O(n log n), computational complexity, where n is the number of included observation sets. The methods developed are thus suitable for future large-scale surveys, which anticipate a substantial increase in the astrometric data rate. Due to the discontinuous nature of asteroid astrometry, separate sets of astrometry must be linked to a common asteroid from the very first discovery detections onwards. The reason for the discontinuity in the observed positions is the rotation of the observer with the Earth as well as the motion of the asteroid and the observer about the Sun. Therefore, the aim of identification is to find a set of orbital elements that reproduces the observed positions with residuals similar to the inevitable observational uncertainty. Unless the astrometric observation sets are linked, the corresponding asteroid is eventually lost as the uncertainty of the predicted positions grows too large to allow successful follow-up. Whereas the presented identification theory and the numerical comparison algorithm are generally applicable, that is, also in fields other than astronomy (e.g., in the identification of space debris), the numerical methods developed for asteroid identification can immediately be applied to all objects on heliocentric orbits with negligible effects due to non-gravitational forces in the time frame of the analysis. The methods developed have been successfully applied to various identification problems. Simulations have shown that the methods developed are able to find virtually all correct linkages despite challenges such as numerous scarce observation sets, astrometric uncertainty, numerous objects confined to a limited region on the celestial sphere, long linking intervals, and substantial parallaxes. Tens of previously unknown main-belt asteroids have been identified with the short-term method in a preliminary study to locate asteroids among numerous unidentified sets of single-night astrometry of moving objects, and scarce astrometry obtained nearly simultaneously with Earth-based and space-based telescopes has been successfully linked despite a substantial parallax. Using the long-term method, thousands of realistic 3-linkages, typically spanning several apparitions, have so far been found among designated observation sets each spanning less than 48 hours.
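A minimal sketch of the general idea behind the O(n log n) behaviour, under assumptions that go beyond the abstract: if each observation set can be reduced to a low-dimensional comparison point, a k-d tree yields candidate pairs for linkage without an all-pairs comparison. The two-dimensional "addresses" below are placeholders, not the thesis's actual dimensionality reduction, and surviving candidates would still need verification by orbital inversion.

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(1)
addresses = rng.uniform(size=(1000, 2))        # placeholder reduced coordinates, one per observation set
tree = cKDTree(addresses)                      # building the tree costs O(n log n)
candidate_pairs = tree.query_pairs(r=0.01)     # all pairs of sets within a tolerance
print(len(candidate_pairs), "candidate linkages to verify with orbital inversion")
```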
Abstract:
Protein conformations and dynamics can be studied by nuclear magnetic resonance spectroscopy using dilute liquid crystalline samples. This work clarifies the interpretation of residual dipolar coupling data yielded by such experiments. It was discovered that unfolded proteins without any additional structure beyond that of a mere polypeptide chain exhibit residual dipolar couplings. It was also found that molecular dynamics induce fluctuations in the molecular alignment and in doing so affect residual dipolar couplings. This finding clarified the origins of the low order parameter values observed earlier. The work required the development of new analytical and computational methods for the prediction of intrinsic residual dipolar coupling profiles for unfolded proteins. The presented characteristic chain model is able to reproduce the general trend of experimental residual dipolar couplings for denatured proteins. The details of experimental residual dipolar coupling profiles are beyond the analytical model, but improvements are proposed to achieve greater accuracy. A computational method for the rapid prediction of unfolded protein residual dipolar couplings was also developed. Protein dynamics were shown to modulate the effective molecular alignment in a dilute liquid crystalline medium. The effects were investigated using experimental and molecular dynamics generated conformational ensembles of folded proteins. It was noted that dynamics-induced alignment is significant, especially for the interpretation of molecular dynamics in small, globular proteins. A method of correction was presented. Residual dipolar couplings offer an attractive possibility for the direct observation of protein conformational preferences and dynamics. The presented models and methods of analysis provide significant advances in the interpretation of residual dipolar coupling data from proteins.
Abstract:
"We thank MrGilder for his considered comments and suggestions for alternative analyses of our data. We also appreciate Mr Gilder’s support of our call for larger studies to contribute to the evidence base for preoperative loading with high-carbohydrate fluids..."
Abstract:
Image fusion is a formal framework expressed as means and tools for the alliance of multisensor, multitemporal, and multiresolution data. Multisource data vary in spectral, spatial and temporal resolution, necessitating advanced analytical or numerical techniques for enhanced interpretation capabilities. This paper reviews seven pixel-based image fusion techniques: intensity-hue-saturation, Brovey, high pass filter (HPF), high pass modulation (HPM), principal component analysis, Fourier transform and correspondence analysis. Validation of these techniques on IKONOS data (panchromatic band at 1 m spatial resolution and four multispectral bands at 4 m spatial resolution) reveals that the HPF and HPM methods synthesise the images closest to those the corresponding multisensors would observe at the high resolution level.
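A minimal sketch of the two best-performing methods in their textbook form, assuming the panchromatic and multispectral arrays are already co-registered and resampled to the same grid: HPF adds the high-frequency detail of the pan band to each multispectral band, and HPM scales each band by the ratio of the pan band to its low-pass version. The filter size and the uniform (box) low-pass filter are illustrative choices, not necessarily those of the review.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def hpf_fusion(ms_band, pan, size=5):
    """High pass filter fusion: fused = MS + (PAN - lowpass(PAN))."""
    return ms_band + (pan - uniform_filter(pan, size))

def hpm_fusion(ms_band, pan, size=5, eps=1e-6):
    """High pass modulation fusion: fused = MS * PAN / lowpass(PAN)."""
    return ms_band * pan / (uniform_filter(pan, size) + eps)
```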
Abstract:
Background We aimed to assess the effect of afatinib on overall survival of patients with EGFR mutation-positive lung adenocarcinoma through an analysis of data from two open-label, randomised, phase 3 trials. Methods Previously untreated patients with EGFR mutation-positive stage IIIB or IV lung adenocarcinoma were enrolled in LUX-Lung 3 (n=345) and LUX-Lung 6 (n=364). These patients were randomly assigned in a 2:1 ratio to receive afatinib or chemotherapy (pemetrexed-cisplatin [LUX-Lung 3] or gemcitabine-cisplatin [LUX-Lung 6]), stratified by EGFR mutation (exon 19 deletion [del19], Leu858Arg, or other) and ethnic origin (LUX-Lung 3 only). We planned analyses of mature overall survival data in the intention-to-treat population after 209 (LUX-Lung 3) and 237 (LUX-Lung 6) deaths. These ongoing studies are registered with ClinicalTrials.gov, numbers NCT00949650 and NCT01121393. Findings Median follow-up in LUX-Lung 3 was 41 months (IQR 35–44); 213 (62%) of 345 patients had died. Median follow-up in LUX-Lung 6 was 33 months (IQR 31–37); 246 (68%) of 364 patients had died. In LUX-Lung 3, median overall survival was 28·2 months (95% CI 24·6–33·6) in the afatinib group and 28·2 months (20·7–33·2) in the pemetrexed-cisplatin group (HR 0·88, 95% CI 0·66–1·17, p=0·39). In LUX-Lung 6, median overall survival was 23·1 months (95% CI 20·4–27·3) in the afatinib group and 23·5 months (18·0–25·6) in the gemcitabine-cisplatin group (HR 0·93, 95% CI 0·72–1·22, p=0·61). However, in preplanned analyses, overall survival was significantly longer for patients with del19-positive tumours in the afatinib group than in the chemotherapy group in both trials: in LUX-Lung 3, median overall survival was 33·3 months (95% CI 26·8–41·5) in the afatinib group versus 21·1 months (16·3–30·7) in the chemotherapy group (HR 0·54, 95% CI 0·36–0·79, p=0·0015); in LUX-Lung 6, it was 31·4 months (95% CI 24·2–35·3) versus 18·4 months (14·6–25·6), respectively (HR 0·64, 95% CI 0·44–0·94, p=0·023). By contrast, there were no significant differences by treatment group for patients with EGFR Leu858Arg-positive tumours in either trial: in LUX-Lung 3, median overall survival was 27·6 months (19·8–41·7) in the afatinib group versus 40·3 months (24·3–not estimable) in the chemotherapy group (HR 1·30, 95% CI 0·80–2·11, p=0·29); in LUX-Lung 6, it was 19·6 months (95% CI 17·0–22·1) versus 24·3 months (19·0–27·0), respectively (HR 1·22, 95% CI 0·81–1·83, p=0·34). In both trials, the most common afatinib-related grade 3–4 adverse events were rash or acne (37 [16%] of 229 patients in LUX-Lung 3 and 35 [15%] of 239 patients in LUX-Lung 6), diarrhoea (33 [14%] and 13 [5%]), paronychia (26 [11%] in LUX-Lung 3 only), and stomatitis or mucositis (13 [5%] in LUX-Lung 6 only). In LUX-Lung 3, neutropenia (20 [18%] of 111 patients), fatigue (14 [13%]) and leucopenia (nine [8%]) were the most common chemotherapy-related grade 3–4 adverse events, while in LUX-Lung 6, the most common chemotherapy-related grade 3–4 adverse events were neutropenia (30 [27%] of 113 patients), vomiting (22 [19%]), and leucopenia (17 [15%]). Interpretation Although afatinib did not improve overall survival in the whole population of either trial, overall survival was improved with the drug for patients with del19 EGFR mutations. The absence of an effect in patients with Leu858Arg EGFR mutations suggests that EGFR del19-positive disease might be distinct from Leu858Arg-positive disease and that these subgroups should be analysed separately in future trials.
Abstract:
Accelerator mass spectrometry (AMS) is an ultrasensitive technique for measuring the concentration of a single isotope. The electric and magnetic fields of an electrostatic accelerator system are used to filter out other isotopes from the ion beam. The high velocity means that molecules can be destroyed and removed from the measurement background. As a result, concentrations down to one atom in 10^16 atoms are measurable. This thesis describes the construction of the new AMS system in the Accelerator Laboratory of the University of Helsinki. The system is described in detail along with the relevant ion optics. System performance and some of the 14C measurements made with the system are described. In the second part of the thesis, a novel statistical model for the analysis of AMS data is presented. Bayesian methods are used in order to make the best use of the available information. In the new model, instrumental drift is modelled with a continuous first-order autoregressive process. This enables rigorous normalization to standards measured at different times. The Poisson statistical nature of a 14C measurement is also taken into account properly, so that uncertainty estimates are much more stable. It is shown that, overall, the new model improves both the accuracy and the precision of AMS measurements. In particular, the results can be improved for samples with very low 14C concentrations or for samples measured only a few times.
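An illustrative sketch of the two model ingredients the abstract names, not the thesis's full Bayesian treatment: instrument drift following a first-order autoregressive process evaluated at irregular measurement times, and Poisson-distributed counts whose expected rate is scaled by that drift. All parameter values are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_measurements(true_rate, times, phi=0.9, sigma=0.05, live_time=100.0):
    """Simulate counts n_i ~ Poisson(live_time * true_rate * drift_i), where the
    drift follows a stationary AR(1) process evaluated at the measurement times."""
    drift = np.empty(len(times))
    drift[0] = 1.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        a = phi**dt                                   # AR(1) correlation over the time gap
        drift[i] = 1.0 + a * (drift[i - 1] - 1.0) + rng.normal(scale=sigma * np.sqrt(1 - a**2))
    return rng.poisson(live_time * true_rate * drift)

print(simulate_measurements(true_rate=2.0, times=np.arange(10.0)))
```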
Abstract:
Changes in alcohol pricing have been documented as inversely associated with changes in consumption and alcohol-related problems. Evidence of the association between price changes and health problems is nevertheless patchy and is based to a large extent on cross-sectional state-level data, or time series of such cross-sectional analyses. Natural experimental studies have been called for. There was a substantial reduction in the price of alcohol in Finland in 2004 due to a reduction in alcohol taxes of one third, on average, and the abolition of duty-free allowances for travellers from the EU. These changes in the Finnish alcohol policy could be considered a natural experiment, which offered a good opportunity to study what happens with regard to alcohol-related problems when prices go down. The present study investigated the effects of this reduction in alcohol prices on (1) alcohol-related and all-cause mortality, and mortality due to cardiovascular diseases, (2) alcohol-related morbidity in terms of hospitalisation, (3) socioeconomic differentials in alcohol-related mortality, and (4) small-area differences in interpersonal violence in the Helsinki Metropolitan area. Differential trends in alcohol-related mortality prior to the price reduction were also analysed. A variety of population-based register data was used in the study. Time-series intervention analysis modelling was applied to monthly aggregations of deaths and hospitalisation for the period 1996-2006. These and other mortality analyses were carried out for men and women aged 15 years and over. Socioeconomic differentials in alcohol-related mortality were assessed on a before/after basis, mortality being followed up in 2001-2003 (before the price reduction) and 2004-2005 (after). Alcohol-related mortality was defined in all the studies on mortality on the basis of information on both underlying and contributory causes of death. Hospitalisation related to alcohol meant that there was a reference to alcohol in the primary diagnosis. Data on interpersonal violence was gathered from 86 administrative small areas in the Helsinki Metropolitan area and was also assessed on a before/after basis, with follow-up in 2002-2003 and 2004-2005. The statistical methods employed to analyse these data sets included time-series analysis, and Poisson and linear regression. The results of the study indicate that alcohol-related deaths increased substantially among men aged 40-69 years and among women aged 50-69 after the price reduction when trends and seasonal variation were taken into account. The increase was mainly attributable to chronic causes, particularly liver diseases. Mortality due to cardiovascular diseases and all-cause mortality, on the other hand, decreased considerably among the over-69-year-olds. The increase in alcohol-related mortality in absolute terms among the 30-59-year-olds was largest among the unemployed and early-age pensioners, and those with a low level of education, low social class or low income. The relative differences in change between the education and social class subgroups were small. The employed and those under the age of 35 did not suffer from increased alcohol-related mortality in the two years following the price reduction. The gap between the age and education groups, which was substantial in the 1980s, thus further broadened.
With regard to alcohol-related hospitalisation, there was an increase in both chronic and acute causes among men under the age of 70, and among women in the 50-69-year age group, when trends and seasonal variation were taken into account. Alcohol dependence and other alcohol-related mental and behavioural disorders were the largest category both in the total number of chronic hospitalisations and in the increase. There was no increase in the rate of interpersonal violence in the Helsinki Metropolitan area, and there was even a decrease in domestic violence. There was a significant relationship between measures of social disadvantage at the area level and interpersonal violence, although the differences in the effects of the price reduction between the different areas were small. The findings of the present study suggest that a reduction in alcohol prices may lead to a substantial increase in alcohol-related mortality and morbidity. However, large population group differences were observed regarding responsiveness to the price changes. In particular, the less privileged, such as the unemployed, were most sensitive. In contrast, at least in the Finnish context, the younger generations and the employed do not appear to be adversely affected, and those in the older age groups may even benefit from cheaper alcohol in terms of decreased rates of CVD mortality. The results also suggest that reductions in alcohol prices do not necessarily affect interpersonal violence. The population group differences in the effects of the price changes on alcohol-related harm should be acknowledged, and policy actions should therefore focus on the population subgroups that are primarily responsive to the price reduction.
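A generic sketch of an interrupted time-series analysis of monthly counts, in the spirit of the intervention and Poisson regression modelling the abstract mentions but not the study's exact specification: a Poisson regression with a linear trend, monthly seasonality, and a step term switching on at the March 2004 price reduction. The data below are simulated.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
months = pd.date_range("1996-01", "2006-12", freq="MS")
t = np.arange(len(months))
after = (months >= "2004-03").astype(float)                        # intervention indicator
season = pd.get_dummies(pd.Series(months.month), drop_first=True).astype(float)

# Simulated monthly alcohol-related deaths with a level shift after March 2004.
mu = np.exp(3.0 + 0.001 * t + 0.18 * after)
deaths = rng.poisson(mu)

X = sm.add_constant(np.column_stack([t, after, season]))
fit = sm.GLM(deaths, X, family=sm.families.Poisson()).fit()
print(fit.params[:3])            # intercept, trend, and intervention effect (log scale)
```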
Abstract:
This article reports on a 6-year study that examined the association between pre-admission variables and field placement performance in an Australian bachelor of social work program (N=463). Very few of the pre-admission variables were found to be significantly associated with performance. These findings and the role of the admissions process are discussed. In addition to the usual academic criteria, the authors urge schools to include a focus on nonacademic criteria during the admissions process and the ongoing educational program.
Abstract:
The aim of the study was to find out how the consumption of the population in Finland became a target of social interest and of the production of statistical data in the early 20th century, and what efforts have been made to influence consumption with social policy measures at different times. Questions concerning consumption are examined through the practices employed in the compilation of statistics on it. The interpretive framework of the study is Michel Foucault's perspective of modern liberal government. This mode of government is typified by the pursuit of efficiency and a search for equilibrium between economic government and a government of the processes of life. It shows aspirations towards both integration and individualisation. The government is based on practices of freedom. It also implies knowledge-based ways of conceptualising reality. Statistical data are of specific significance in this context. The connection between the government of consumption and the compilation of statistics on it is studied through the theoretical, socio-political and statistical conceptualisation of consumption. The research material consisted of Finnish and international documentation on the compilation of statistics on consumption, publications of social programmes, and reports of studies on consumption. The analysis of the material focused especially on the problematisations related to consumption found in these documents and on changes in them over history. There have been both clearly observable changes and historical stratification and diversity in the rationalities and practices of consumption government during the 20th century. Consumption has been influenced by pluralistic government, based at different times and in varying ways on the logics of solidarity and markets. The difference between these is that in the former risks are prepared for collectively, while in the latter risks are individualised. Despite the differences, the characteristic common to both logics is a certain kind of contractuality. They are both permeated by a household logic, which differs from them in that it is based on the normative and ethical demands imposed on an individual. There has been a clear interactive connection between statistical data and consumption government. Statistical practices have followed changes in the way consumption has been conceptualised in society. This has been reflected in the statistical phenomena of interest, concepts, classifications and indicators. New ways of compiling statistics have in their turn shaped perceptions of reality. Statistical data have also facilitated a variety of rational calculations with which the consequences of the population's consumption habits have been evaluated at the level of the economy at large and of individuals.
Abstract:
Two algorithms are outlined, each of which has interesting features for modeling the spatial variability of rock depth. In this paper, the reduced level of rock at Bangalore, India, is derived from data from 652 boreholes in an area covering 220 sq. km. Support vector machine (SVM) and relevance vector machine (RVM) models have been utilized to predict the reduced level of rock in the subsurface of Bangalore and to study the spatial variability of the rock depth. The support vector machine, which is firmly grounded in statistical learning theory, uses a regression technique based on an epsilon-insensitive loss function. The RVM is a probabilistic model similar to the widespread SVM, but the training takes place in a Bayesian framework. Prediction results show the ability of these learning machines to build accurate models of the spatial variability of rock depth with strong predictive capabilities. The paper also highlights the capability of RVM over the SVM model.
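A minimal sketch of the SVM half of the comparison, assuming borehole coordinates as inputs and reduced rock level as the target: scikit-learn's SVR implements epsilon-insensitive support vector regression, while a relevance vector machine is not part of scikit-learn and is therefore omitted. The coordinates, depths and hyperparameters below are made up for illustration.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# (easting, northing) of boreholes and the reduced level of rock at each (hypothetical).
X = np.array([[771.0, 1425.0], [772.5, 1426.1], [774.2, 1423.8], [770.3, 1428.0]])
y = np.array([906.2, 902.7, 910.1, 899.5])

model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=100.0, epsilon=0.5))
model.fit(X, y)
print(model.predict([[773.0, 1425.5]]))   # predicted reduced level at a new site
```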