969 results for on-disk data layout
Abstract:
Recent years have brought great advances in instrumentation technology. The amount of available data has been increasing thanks to the simplicity, speed and accuracy of current spectroscopic instruments. Most of these data are, however, meaningless without proper analysis. This has been one of the reasons for the ever-growing success of multivariate handling of such data. Industrial data are commonly not designed data; in other words, there is no exact experimental design, but rather the data have been collected as a routine procedure during an industrial process. This places certain demands on the multivariate modeling, as the selection of samples and variables can have an enormous effect. Common approaches in the modeling of industrial data are PCA (principal component analysis) and PLS (projection to latent structures, or partial least squares), but there are also other methods that should be considered. The more advanced methods include multi-block modeling and nonlinear modeling. In this thesis it is shown that the results of data analysis vary according to the modeling approach used, thus making the selection of the modeling approach dependent on the purpose of the model. If the model is intended to provide accurate predictions, the approach should be different than in the case where the purpose of modeling is mostly to obtain information about the variables and the process. For industrial applicability it is essential that the methods are robust and sufficiently simple to apply; in this way the methods and the results can be compared and an approach selected that suits the intended purpose. In this thesis, data analysis methods are compared using data from different fields of industry. In the first two papers, the multi-block method is considered for data originating from the oil and fertilizer industries, and the results are compared with those from PLS and priority PLS. The third paper considers the applicability of multivariate models to process control for a reactive crystallization process. In the fourth paper, nonlinear modeling is examined with a data set from the oil industry: the response has a nonlinear relation to the descriptor matrix, and the results are compared between linear modeling, polynomial PLS and nonlinear modeling using nonlinear score vectors.
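A minimal sketch of the two workhorse methods named above, using scikit-learn on synthetic, undesigned data; the variables and parameters are illustrative and not taken from the thesis:

```python
# Contrast PCA (summarizes X alone) with PLS (finds latent structure
# that is predictive of the response y). Data are synthetic.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))                 # spectroscopic-style descriptors
y = X[:, 0] - 2 * X[:, 3] + rng.normal(scale=0.1, size=200)   # toy response

scores = PCA(n_components=2).fit_transform(X)  # unsupervised projection
pls = PLSRegression(n_components=2).fit(X, y)  # supervised latent structure
print("PLS R^2:", pls.score(X, y))
```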
Abstract:
In the present study, we modeled a reaching task as a two-link mechanism. The upper arm and forearm motion trajectories during vertical arm movements were estimated from the angular accelerations measured with dual-axis accelerometers. A data set of reaching synergies from able-bodied individuals was used to train a radial basis function artificial neural network with upper arm/forearm tangential angular accelerations. The trained radial basis function artificial neural network for the specific movements predicted forearm motion from new upper arm trajectories with high correlation (mean 0.9149-0.941). For all other movements, prediction was low (range 0.0316-0.8302). The results suggest that the proposed algorithm is successful in generalizing over similar motions and subjects. Such networks may be used as a high-level controller that could predict forearm kinematics from voluntary movements of the upper arm. This methodology is suitable for restoring the upper limb functions of individuals with motor disabilities of the forearm, but not of the upper arm. The developed control paradigm is applicable to upper-limb orthotic systems employing functional electrical stimulation. The proposed approach is of particular significance for humans with spinal cord injuries in a free-living environment. The measurement system with dual-axis accelerometers developed for this study also has implications for the evaluation of movement during the course of rehabilitation. For this purpose, training-related changes in synergies apparent from movement kinematics during rehabilitation would characterize the extent and the course of recovery. As such, a simple system using this methodology is of particular importance for stroke patients. The results underline the important issue of upper-limb coordination.
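As a rough illustration of the idea, the sketch below uses kernel ridge regression with an RBF kernel as a stand-in for the radial basis function network described above; the signals are synthetic and the coupling between upper arm and forearm is an assumption, not the study's data:

```python
# Predict a forearm acceleration signal from an upper-arm acceleration
# signal via an RBF-kernel regressor; all signals are synthetic.
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(1)
t = np.linspace(0, 1, 300)
upper_arm_acc = np.sin(2 * np.pi * t) + 0.05 * rng.normal(size=t.size)
forearm_acc = 0.8 * np.sin(2 * np.pi * t + 0.3)   # assumed synergy coupling

model = KernelRidge(kernel="rbf", gamma=10.0, alpha=1e-3)
model.fit(upper_arm_acc.reshape(-1, 1), forearm_acc)
pred = model.predict(upper_arm_acc.reshape(-1, 1))
print("correlation:", np.corrcoef(pred, forearm_acc)[0, 1])
```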
Abstract:
This thesis presents an overview of the Open Data research area, quantifies the available evidence, and establishes the state of the research evidence by means of a Systematic Mapping Study (SMS). A total of 621 publications from between 2005 and 2014 were identified, of which 243 were selected in the review process. The thesis highlights the implications of the proliferation of Open Data principles in the emerging era of accessibility, reusability and sustainability of data transparency. The findings of the mapping study are described with quantitative and qualitative measures based on organizational affiliation, country, year of publication, research method, star rating and the units of analysis identified. Furthermore, the units of analysis were categorized by development lifecycle, linked open data, type of data, technical platforms, organizations, ontology and semantics, adoption and awareness, intermediaries, security and privacy, and supply of data, which are important components in providing quality open data applications and services. The results of the mapping study help organizations (such as academia, government and industry), researchers and software developers to understand existing trends in open data, the latest research developments and the demand for future research. In addition, the proposed conceptual framework of Open Data research can be adopted and expanded to strengthen and improve current open data applications.
Abstract:
There are many ways to generate geometrical models for numerical simulation, and most of them start with a segmentation step to extract the boundaries of the regions of interest. This paper presents an algorithm to generate a patient-specific three-dimensional geometric model, based on a tetrahedral mesh, without an initial extraction of contours from the volumetric data. Using information directly available in the data, such as gray levels, we built a metric to drive a mesh adaptation process. The metric is used to specify the size and orientation of the tetrahedral elements everywhere in the mesh. Our method, which produces anisotropic meshes, gives good results with synthetic and real MRI data. The quality of the resulting models has been evaluated qualitatively and quantitatively by comparison with an analytical solution and with a segmentation made by an expert. Results show that, in 90% of the cases, our method gives meshes as good as or better than those of a similar isotropic method, based on the accuracy of the volume reconstruction for a given mesh size. Moreover, a comparison of the Hausdorff distances between the adapted meshes of both methods and ground-truth volumes shows that our method reduces reconstruction errors faster. Copyright © 2015 John Wiley & Sons, Ltd.
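The abstract does not give the metric construction, so the following is only a plausible minimal sketch: build an anisotropic metric tensor from the gray-level Hessian, so that element size and orientation follow intensity variations. The function name and regularization are illustrative:

```python
# Derive a 2x2 metric tensor per pixel from a 2D gray-level image,
# using the (symmetrized, positive-definite) intensity Hessian.
import numpy as np

def intensity_metric(img, eps=1e-3):
    gy, gx = np.gradient(img.astype(float))   # first derivatives
    gyy, gyx = np.gradient(gy)                 # second derivatives
    gxy, gxx = np.gradient(gx)
    H = np.stack([np.stack([gxx, gxy], -1),
                  np.stack([gyx, gyy], -1)], -2)
    # Make the Hessian positive definite: symmetrize, take |eigenvalues|,
    # and floor them so the metric never degenerates.
    w, v = np.linalg.eigh(0.5 * (H + np.swapaxes(H, -1, -2)))
    w = np.maximum(np.abs(w), eps)
    return np.einsum('...ij,...j,...kj->...ik', v, w, v)
```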
Abstract:
These notes have been prepared as support for a short course on compositional data analysis. Their aim is to transmit the basic concepts and skills for simple applications, thus setting the premises for more advanced projects.
Abstract:
One of the disadvantages of old age is that there is more past than future: this, however, may be turned into an advantage if the wealth of experience and, hopefully, wisdom gained in the past can be reflected upon and throw some light on possible future trends. To an extent, then, this talk is necessarily personal, certainly nostalgic, but also self-critical and inquisitive about our understanding of the discipline of statistics. A number of almost philosophical themes will run through the talk: the search for appropriate modelling in relation to the real problem envisaged, emphasis on a sensible balance between simplicity and complexity, the relative roles of theory and practice, the nature of communication of inferential ideas to the statistical layman, and the inter-related roles of teaching, consultation and research. A list of keywords might be: identification of the sample space and its mathematical structure, choices between transform and stay, the role of parametric modelling, the role of a sample-space metric, the underused hypothesis lattice, and the nature of compositional change, particularly in relation to the modelling of processes. While the main theme will be relevance to compositional data analysis, we shall point to substantial implications for general multivariate analysis arising from experience of the development of compositional data analysis…
Abstract:
Usually, psychometricians apply classical factor analysis to evaluate the construct validity of rank-order scales. Nevertheless, these scales have particular characteristics that must be taken into account: total scores and ranks are highly relevant.
Abstract:
Self-organizing maps (Kohonen 1997) are a type of artificial neural network developed to explore patterns in high-dimensional multivariate data. The conventional version of the algorithm uses the Euclidean metric in the process of adaptation of the model vectors, thus rendering, in theory, the whole methodology incompatible with non-Euclidean geometries. In this contribution we explore the two main aspects of the problem: 1. whether the conventional approach using the Euclidean metric can yield valid results with compositional data; 2. whether a modification of the conventional approach, replacing vector addition and scalar multiplication by the canonical operators in the simplex (i.e. perturbation and powering), can converge to an adequate solution. Preliminary tests showed that both methodologies can be used on compositional data. However, the modified version of the algorithm performs worse than the conventional version, in particular when the data are pathological. Moreover, the conventional approach converges faster to a solution when the data are "well-behaved". Key words: self-organizing map; artificial neural networks; compositional data
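The canonical simplex operators mentioned above have a compact implementation; the following sketch (the function names are ours, not the authors') shows perturbation, powering, and the corresponding SOM-style adaptation step:

```python
# Aitchison-geometry operators on the simplex and a single SOM update.
import numpy as np

def closure(x):
    """Rescale a composition so its parts sum to 1."""
    x = np.asarray(x, dtype=float)
    return x / x.sum()

def perturbation(x, y):
    """Simplex 'addition': component-wise product, then closure."""
    return closure(np.asarray(x) * np.asarray(y))

def powering(x, a):
    """Simplex 'scalar multiplication': component-wise power, then closure."""
    return closure(np.asarray(x, dtype=float) ** a)

def som_update_simplex(model, sample, lr):
    """Move `model` a fraction `lr` toward `sample` along the simplex
    geodesic: model (+) lr (*) (sample (-) model)."""
    diff = perturbation(sample, powering(model, -1.0))  # sample (-) model
    return perturbation(model, powering(diff, lr))

m = closure([0.2, 0.3, 0.5])
s = closure([0.6, 0.3, 0.1])
print(som_update_simplex(m, s, lr=0.25))
```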
Abstract:
A new tropopause definition, based on a flow-dependent blending of the traditional thermal tropopause with one based on potential vorticity, has been developed. The benefits of such a blending algorithm are most apparent in regions with synoptic-scale fluctuations between tropical and extratropical airmasses. The properties of the local airmass determine the relative contributions to the location of the blended tropopause, rather than this being determined by a specified function of latitude. Global climatologies of tropopause height, temperature, potential temperature and zonal wind, based on European Centre for Medium-Range Weather Forecasts (ECMWF) ERA-Interim reanalysis data, are presented for the period 1989-2007. Features of the seasonal-mean tropopause are discussed on a global scale, alongside a focus on selected monthly climatologies for the two high-latitude regions and the tropical belt. The height differences between climatologies based on ERA-Interim and ERA-40 data are also presented. Key spatial and temporal features seen in earlier climatologies, based mainly on the World Meteorological Organization thermal tropopause definition, are reproduced with the new definition. Tropopause temperatures are consistent with those from earlier climatologies, despite some differences in height in the extratropics.
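The abstract does not specify the blending function, so the following is only a hedged illustration of the idea: weight the potential-vorticity tropopause more strongly in airmasses with extratropical character. The weighting function, its scale, and the sample values are all assumptions:

```python
# Blend a thermal (lapse-rate) tropopause pressure with a dynamical
# (PV-based) one; the weight grows with |PV| as a proxy for how
# extratropical the local airmass is. Entirely illustrative.
import numpy as np

def blended_tropopause(p_thermal, p_pv, pv_at_tropopause, pv_scale=2.0):
    w = np.clip(np.abs(pv_at_tropopause) / pv_scale, 0.0, 1.0)
    return w * p_pv + (1.0 - w) * p_thermal  # pressure in hPa

print(blended_tropopause(110.0, 250.0, 0.4))  # tropical: mostly thermal
print(blended_tropopause(300.0, 280.0, 3.5))  # extratropical: mostly PV-based
```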
Abstract:
A new tropopause definition involving a flow-dependent blending of the traditional thermal tropopause with one based on potential vorticity has been developed and applied to the European Centre for Medium-Range Weather Forecasts (ECMWF) reanalyses (ERA), ERA-40 and ERA-Interim. Global and regional trends in tropopause characteristics for annual and solsticial seasonal means are presented here, with emphasis on significant results for the newer ERA-Interim data for 1989-2007. The global-mean tropopause is rising at a rate of 47 m decade⁻¹, with pressure falling at 1.0 hPa decade⁻¹ and temperature falling at 0.18 K decade⁻¹. The Antarctic tropopause shows decreasing heights, warming, and increasing westerly winds. The Arctic tropopause also shows a warming, but with decreasing westerly winds. In the tropics the trends are small, but at the latitudes of the subtropical jets they are almost double the global values. It is found that these changes are mainly concentrated in the eastern hemisphere. Previous and new metrics for the rate of broadening of the tropics, based on both height and wind, give trends in the range 0.9° decade⁻¹ to 2.2° decade⁻¹. For ERA-40, the global height and pressure trends for the period 1979-2001 are similar: 39 m decade⁻¹ and −0.8 hPa decade⁻¹. These values are smaller than those found from the thermal tropopause definition with this data set, as was used in most previous studies.
Abstract:
Variational data assimilation in continuous time is revisited. The central techniques applied in this paper are in part adopted from the theory of optimal nonlinear control. Alternatively, the investigated approach can be considered a continuous-time generalization of what is known as weakly constrained four-dimensional variational assimilation (4D-Var) in the geosciences. The technique allows trajectories to be assimilated in the case of partial observations and in the presence of model error. Several mathematical aspects of the approach are studied. Computationally, it amounts to solving a two-point boundary value problem. For imperfect models, the trade-off between small dynamical error (i.e. the trajectory obeys the model dynamics) and small observational error (i.e. the trajectory closely follows the observations) is investigated. This trade-off turns out to be trivial if the model is perfect. However, even in this situation, allowing for minute deviations from the perfect model is shown to have positive effects, namely to regularize the problem. The presented formalism is dynamical in character: no statistical assumptions on dynamical or observational noise are imposed.
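As a toy illustration of the trade-off described above (not the paper's formalism), the sketch below discretizes a weak-constraint cost function in which one weight penalizes the model-equation residual and the other the data misfit; the dynamics, weights, and observation pattern are assumptions:

```python
# Weak-constraint variational assimilation on a toy scalar model:
# minimize w_dyn * |dynamics residual|^2 + w_obs * |data misfit|^2.
import numpy as np
from scipy.optimize import minimize

dt, n = 0.05, 80
def f(x):                      # toy scalar model dynamics
    return np.sin(x)

rng = np.random.default_rng(2)
truth = np.empty(n); truth[0] = 0.5
for k in range(n - 1):         # synthetic truth trajectory
    truth[k + 1] = truth[k] + dt * f(truth[k])
obs_idx = np.arange(0, n, 10)  # sparse, noisy partial observations
obs = truth[obs_idx] + 0.05 * rng.normal(size=obs_idx.size)

# Large w_dyn pushes toward the model (strong-constraint limit);
# small w_dyn lets the trajectory bend toward the observations.
w_dyn, w_obs = 100.0, 1.0
def cost(x):
    dyn_err = x[1:] - x[:-1] - dt * f(x[:-1])  # model-equation residual
    obs_err = x[obs_idx] - obs                  # data misfit
    return w_dyn * np.sum(dyn_err**2) + w_obs * np.sum(obs_err**2)

xa = minimize(cost, np.zeros(n), method="L-BFGS-B").x  # assimilated trajectory
print("rms error vs truth:", np.sqrt(np.mean((xa - truth) ** 2)))
```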