989 results for data matching


Relevance:

70.00%

Publisher:

Abstract:

In a networked business environment, the visibility requirements on supply operations and the customer interface have become stricter. The case company's master data is seen as an enabler for meeting those requirements; however, its current state and quality are not considered good enough. The target of this thesis was to develop a process for managing master data quality as a continuous activity, and to find solutions for cleansing the current customer and supplier data to meet the quality requirements defined in that process. Based on the theory of Master Data Management and data cleansing, a small amount of master data was analyzed and cleansed using one commercial data cleansing solution available on the market. This was conducted in cooperation with the vendor as a proof of concept, which demonstrated the cleansing solution's applicability to improving the quality of the current master data. Based on these findings and data management theory, recommendations and proposals for improving data quality were given. The results also showed that the biggest causes of poor data quality are the lack of data governance in the company and the restrictions of the current master data solutions.

Relevance:

60.00%

Publisher:

Abstract:

This thesis illustrates the construction of a mathematical model of a hydraulic system, oriented to the design of a model predictive control (MPC) algorithm. The modeling procedure starts from the basic formulation of a piston-servovalve system, a complex nonlinear system with unknown and unmeasurable effects that make modeling challenging. A first approximation of the system parameters is obtained from datasheet information, workbench tests, and other data provided by the company. To validate and refine the model, open-loop simulations were then matched against characteristics obtained from real acquisitions. The final set of ODEs captures the main peculiarities of the system, apart from some characteristics caused by highly varying and unknown hydraulic effects, such as the unmodeled resistive elements of the pipes. Since the model presents many internal complexities, a simplified version is derived after careful analysis and used to linearize and discretize the nonlinear model correctly. On this basis, an MPC algorithm for reference tracking with linear constraints is implemented. The results show the potential of MPC in this kind of industrial application: high-quality tracking performance while satisfying state and input constraints, with increased robustness and flexibility compared with the standard control techniques, such as PID controllers, usually adopted for these systems. The simulations for model validation and for the controlled system were carried out in Python.
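As a concrete illustration of the control scheme this abstract describes, here is a minimal sketch of linear MPC reference tracking on a discretized state-space model, in Python. The matrices, horizon, weights, and bounds are illustrative assumptions, not the thesis's actual piston-servovalve model.

```python
# Minimal linear MPC sketch: receding-horizon tracking with input bounds.
# A, B, N, Q, R, and the bounds are placeholder values for illustration.
import numpy as np
from scipy.optimize import minimize

A = np.array([[1.0, 0.01], [0.0, 0.98]])  # hypothetical discrete-time dynamics
B = np.array([[0.0], [0.05]])
nx, nu = B.shape
N = 20            # prediction horizon
Q, R = 10.0, 0.1  # tracking and input-effort weights
u_min, u_max = -1.0, 1.0  # actuator (input) constraints

def mpc_control(x0, x_ref):
    """Solve the horizon-N tracking problem; return only the first input."""
    def cost(u_seq):
        u_seq = u_seq.reshape(N, nu)
        x, J = x0.copy(), 0.0
        for k in range(N):
            x = A @ x + B @ u_seq[k]  # predict forward with the linear model
            J += Q * np.sum((x - x_ref) ** 2) + R * np.sum(u_seq[k] ** 2)
        return J
    res = minimize(cost, np.zeros(N * nu),
                   bounds=[(u_min, u_max)] * (N * nu), method="SLSQP")
    return res.x[:nu]  # receding horizon: apply only the first move

# Closed-loop simulation toward a position set-point
x, x_ref = np.zeros(2), np.array([0.5, 0.0])
for _ in range(30):
    u = mpc_control(x, x_ref)
    x = A @ x + B @ u
print("final state:", x)
```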

Relevance:

40.00%

Publisher:

Abstract:

Medication data retrieved from Australian Repatriation Pharmaceutical Benefits Scheme (RPBS) claims for 44 veterans residing in nursing homes, and from Pharmaceutical Benefits Scheme (PBS) claims for 898 nursing home residents, were compared with medication data from nursing home records to determine the optimal time interval for retrieving claims data and its validity. Optimal matching was achieved using 12 weeks of RPBS claims data: 60% of medications in the RPBS claims were located in nursing home administration records, and 78% of medications administered to nursing home residents were identified in RPBS claims. In comparison, 48% of medications administered to nursing home residents could be found in 12 weeks of PBS data, and 56% of medications present in PBS claims could be matched with nursing home administration records. RPBS claims data were superior to PBS data, owing to the larger number of scheduled items available to veterans and to the veteran's file number, which acts as a unique identifier. These findings should be taken into account when using prescription claims data for medication histories, prescriber feedback, drug utilisation, intervention, or epidemiological studies.
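To make the windowed matching concrete, here is a minimal sketch of comparing medications claimed within a 12-week retrieval window against administration records, reporting coverage in both directions. The records and field layout are invented for illustration, not the RPBS/PBS schema.

```python
# Windowed claims-vs-records matching sketch with made-up data.
from datetime import date, timedelta

WINDOW = timedelta(weeks=12)
index_date = date(2001, 6, 1)

# (resident_id, drug, supply_date) claims; (resident_id, drug) administered
claims = [(1, "frusemide", date(2001, 4, 2)),
          (1, "digoxin", date(2001, 1, 10)),   # outside the 12-week window
          (2, "paracetamol", date(2001, 5, 20))]
administered = {(1, "frusemide"), (1, "warfarin"), (2, "paracetamol")}

in_window = {(rid, drug) for rid, drug, d in claims
             if index_date - WINDOW <= d <= index_date}

# Coverage in both directions, as in the study's two percentages
claims_matched = len(in_window & administered) / len(in_window)
admin_matched = len(in_window & administered) / len(administered)
print(f"claims matched: {claims_matched:.0%}, administered matched: {admin_matched:.0%}")
```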

Relevance:

40.00%

Publisher:

Abstract:

Seismic analysis, horizon matching, fault tracking, marked point process, stochastic annealing

Relevance:

40.00%

Publisher:

Abstract:

Magdeburg, Univ., Faculty of Computer Science, Diss., 2010

Relevance:

40.00%

Publisher:

Abstract:

An extensive sample (2%) of private vehicles in Italy is equipped with a GPS device that periodically measures position and dynamical state for insurance purposes. Access to this type of data enables theoretical and practical applications of great interest: real-time reconstruction of the traffic state in a region, development of accurate models of vehicle dynamics, and study of the cognitive dynamics of drivers. For these applications to be possible, the paths taken by vehicles on the road network must first be reconstructed from the raw GPS data. These data are affected by positioning errors and are often widely spaced (~2 km apart), so path identification is not straightforward. This thesis describes the approach we followed to reliably identify vehicle paths from this kind of low-sampling-rate data. The problem of matching data points to roads is solved with a Bayesian maximum-likelihood approach, while the path taken between two consecutive GPS measurements is identified with a specifically developed optimal routing algorithm based on the A* algorithm. The procedure was applied to an off-line urban data sample and proved robust and accurate. Future developments will extend the procedure to real-time execution and nation-wide coverage.
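A minimal sketch of the two-step procedure the abstract outlines, under simplifying assumptions: each GPS fix is matched to its maximum-likelihood road node under an isotropic Gaussian error model, and consecutive matches are connected with A* on a toy road graph. The graph, coordinates, and error standard deviation are illustrative placeholders.

```python
# Two-step path identification sketch: ML node matching, then A* routing.
import heapq, math

graph = {"a": {"b": 1.0}, "b": {"a": 1.0, "c": 2.0}, "c": {"b": 2.0}}
coords = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (3.0, 0.0)}
SIGMA = 0.3  # assumed GPS error std (same units as coords)

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def most_likely_node(fix):
    # Maximum likelihood under isotropic Gaussian noise = nearest node
    return max(graph, key=lambda n: -dist(coords[n], fix) ** 2 / (2 * SIGMA**2))

def a_star(src, dst):
    # Straight-line distance is an admissible heuristic on a road network
    frontier, seen = [(dist(coords[src], coords[dst]), 0.0, src, [src])], set()
    while frontier:
        _, g, node, path = heapq.heappop(frontier)
        if node == dst:
            return path
        if node in seen:
            continue
        seen.add(node)
        for nxt, w in graph[node].items():
            heapq.heappush(frontier,
                           (g + w + dist(coords[nxt], coords[dst]),
                            g + w, nxt, path + [nxt]))
    return None

fixes = [(0.1, 0.05), (2.9, -0.1)]            # two sparse GPS measurements
nodes = [most_likely_node(f) for f in fixes]  # -> ['a', 'c']
print(a_star(nodes[0], nodes[1]))             # -> ['a', 'b', 'c']
```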

Relevance:

40.00%

Publisher:

Abstract:

Molecular and fragment ion data of intact 8- to 43-kDa proteins from electrospray Fourier-transform tandem mass spectrometry are matched against the corresponding data in sequence data bases. Extending the sequence tag concept of Mann and Wilm for matching peptides, a partial amino acid sequence in the unknown is first identified from the mass differences of a series of fragment ions, and the mass position of this sequence is defined from molecular weight and the fragment ion masses. For three studied proteins, a single sequence tag retrieved only the correct protein from the data base; a fourth protein required the input of two sequence tags. However, three of the data base proteins differed by having an extra methionine or by missing an acetyl or heme substitution. The positions of these modifications in the protein examined were greatly restricted by the mass differences of its molecular and fragment ions versus those of the data base. To characterize the primary structure of an unknown represented in the data base, this method is fast and specific and does not require prior enzymatic or chemical degradation.
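To illustrate the sequence-tag idea, the sketch below reads successive fragment-ion mass differences as amino acid residues and searches the resulting tag in a small sequence collection. The residue masses are standard monoisotopic values, but the ion ladder, tolerance, and database are hypothetical, and real workflows must also handle ion series and modifications.

```python
# Sequence-tag sketch: mass differences in a fragment-ion ladder -> residues.
RESIDUE_MASS = {"G": 57.02146, "A": 71.03711, "S": 87.03203, "V": 99.06841,
                "L": 113.08406, "E": 129.04259, "K": 128.09496}
TOL = 0.02  # mass tolerance in Da

def tag_from_ions(ion_masses):
    """Convert a ladder of fragment-ion masses into a residue tag."""
    tag = []
    for lo, hi in zip(ion_masses, ion_masses[1:]):
        delta = hi - lo
        match = [aa for aa, m in RESIDUE_MASS.items() if abs(m - delta) < TOL]
        if not match:
            return None  # gap in the ladder: no residue fits this difference
        tag.append(match[0])
    return "".join(tag)

# Hypothetical b-ion ladder whose steps correspond to G, A, V
ions = [500.10, 557.12, 628.16, 727.23]
tag = tag_from_ions(ions)                   # -> "GAV"
database = {"prot1": "MKTAYIAKGAVQR", "prot2": "MSLEEKVLG"}
hits = [name for name, seq in database.items() if tag in seq]
print(tag, hits)                            # -> GAV ['prot1']
```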

Relevance:

30.00%

Publisher:

Abstract:

In this article, we evaluate the use of simple Lee-Goldburg cross-polarization (LG-CP) NMR experiments for obtaining quantitative information on molecular motion in the intermediate regime. In particular, we introduce the measurement of Hartmann-Hahn matching profiles for the assessment of heteronuclear dipolar couplings, as well as dynamics, as a reliable and robust alternative to the more common analysis of build-up curves. We have carried out spin-dynamics simulations to test the method's sensitivity to intermediate motion and to address its limitations concerning possible experimental imperfections. We further demonstrate the successful use of simple theoretical concepts, most prominently Anderson-Weiss (AW) theory, to analyze the data. We also propose an alternative way to estimate activation energies of molecular motions, based on the acquisition of only two LG-CP spectra per temperature at different temperatures. As experimental tests, molecular jumps in imidazole methyl sulfonate, trimethylsulfoxonium iodide, and bisphenol A polycarbonate were investigated with the new method.
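The proposed route from temperature-dependent spectra to activation energies ultimately rests on an Arrhenius relation, k = A exp(-Ea/RT). As a hedged illustration of that final step only, the sketch below computes an activation energy from motional rates at two temperatures; the rates are invented numbers, not data from the article.

```python
# Two-point Arrhenius estimate of an activation energy.
import math

R = 8.314  # gas constant, J/(mol K)

def activation_energy(T1, k1, T2, k2):
    """Ea from rates k1, k2 (s^-1) at temperatures T1, T2 (K)."""
    return R * math.log(k2 / k1) / (1.0 / T1 - 1.0 / T2)

Ea = activation_energy(300.0, 1.0e3, 320.0, 5.0e3)
print(f"Ea = {Ea / 1000:.1f} kJ/mol")  # ~64 kJ/mol for these example rates
```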

Relevance:

30.00%

Publisher:

Abstract:

This paper presents a new relative measure of signal complexity, referred to here as relative structural complexity, which is based on the matching pursuit (MP) decomposition. By relative, we mean that this new measure is highly dependent on the decomposition dictionary used by MP. The structural part of the definition points to the fact that this new measure is related to the structure, or composition, of the signal under analysis. After a formal definition, the proposed relative structural complexity measure is used in the analysis of newborn EEG. To do this, a time-frequency (TF) decomposition dictionary is first designed specifically to represent the newborn EEG seizure state compactly using MP. We then show, through the analysis of synthetic and real newborn EEG data, that the relative structural complexity measure can indicate changes in EEG structure as it transitions between the two EEG states, namely seizure and background (non-seizure).
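One way to picture a dictionary-dependent complexity measure is to count how many matching pursuit atoms are needed to capture a fixed fraction of the signal energy. The sketch below is an illustrative stand-in under that assumption, not the paper's exact definition; the random cosine dictionary and the 95% threshold are arbitrary choices.

```python
# Matching pursuit with a toy dictionary; complexity = atoms to 95% energy.
import numpy as np

def make_dictionary(n, n_atoms=512):
    """Unit-norm random cosine atoms as a stand-in for a TF dictionary."""
    rng = np.random.default_rng(0)
    t = np.arange(n)
    D = np.cos(np.outer(rng.uniform(0, np.pi, n_atoms), t)
               + rng.uniform(0, 2 * np.pi, (n_atoms, 1)))
    return D / np.linalg.norm(D, axis=1, keepdims=True)

def mp_complexity(x, D, energy_frac=0.95, max_iter=500):
    """Number of MP atoms needed to explain `energy_frac` of the energy."""
    residual = x.astype(float)
    target = (1 - energy_frac) * np.sum(x ** 2)
    for k in range(max_iter):
        if np.sum(residual ** 2) <= target:
            return k  # fewer atoms -> structurally simpler w.r.t. D
        corr = D @ residual                       # inner products with atoms
        best = int(np.argmax(np.abs(corr)))       # greedy best-matching atom
        residual = residual - corr[best] * D[best]
    return max_iter

n = 128
D = make_dictionary(n)
structured = 2.0 * D[3] + 1.0 * D[17]  # lies in the span of two atoms
noise = np.random.default_rng(1).standard_normal(n)
print(mp_complexity(structured, D), mp_complexity(noise, D))  # few vs. many
```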

Relevance:

30.00%

Publisher:

Abstract:

This exploratory-descriptive cross-national study focused on the career development of 11- to 14-year-old children, in particular whether they can match their personal characteristics with their occupational aspirations. The study further explored whether this matching may be explained in terms of a fit between person and environment, using Holland's theory as an example. Participants included 511 South African and 372 Australian children. Findings relate to two items of the Revised Career Awareness Survey that require children to relate personal-social knowledge to their favorite occupation. Data were analyzed in three stages using descriptive statistics, i.e., mean scores, frequencies, and percentage agreement. The study indicated that children perceived their personal characteristics to be related to their occupational aspirations. However, how this matching takes place is not adequately accounted for by a career theory such as Holland's.

Relevance:

30.00%

Publisher:

Abstract:

Dissertation presented as a partial requirement for the degree of Master in Statistics and Information Management

Relevance:

30.00%

Publisher:

Abstract:

Over recent years, (intelligent) software agents have been employed as a way to overcome the difficulties associated with managing, sharing, and reusing an ever-growing volume of information, while ontologies have been used to model that information in a semantically explicit and rich format. As the popularity of the Semantic Web grows and ever more information is shared in the form of ontologies, the problem of integrating this information is amplified. In such a context, two agents that wish to cooperate cannot be expected to use the same ontology to describe their conceptualization of the world. Agents may even need to interact without prior knowledge of the ontologies used by the others, reconciling them at run time in a process commonly known as Ontology Mapping [1]. Ontology mapping is normally offered as a service to business agents and can be requested whenever an alignment needs to be produced. However, given that each agent has its own needs and goals, as well as the inherently subjective nature of the ontologies it uses, agents may have different interests regarding the alignment process and may even resort to whichever mapping services they find most convenient [1]. Different matchers can produce distinct, and even contradictory, results, thus creating conflicts between the agents. An attempt must then be made to resolve the existing conflicts through a negotiation process, so that the agents can reach a consensus on the correspondences to be used when translating the messages they exchange. Conflict resolution is considered a metric of great importance for the negotiation process [2]: the fewer the conflicts left unresolved by the negotiation process that generated it, the greater the confidence associated with an alignment. Thus, an alignment with a high number of unresolved conflicts carries lower confidence than the same alignment with a high number of resolved conflicts. The negotiation process by which two or more agents generate and agree on an alignment is called Ontology Mapping Negotiation. To date, two approaches have been proposed in the literature: (i) Argumentation-based (e.g., [3] [4]) and (ii) Relaxation-based [5] [6]. Each proposal has a number of advantages and limitations. Several ways of combining the two techniques have been proposed [2], with the goal of benefiting from their advantages while overcoming their limitations. However, to date, no documented experiments are known that could prove this claim, so it cannot be attested that such combinations actually bring the intended benefit. The work presented here aims to provide such experiments and to verify whether the claim of improvement over the results of the individual techniques holds. To enable the combination and to address the identified shortcomings, a new Relaxation-based approach was proposed, which is then combined with the Argumentation-based approaches.
Its results, together with those of the combination, are presented and discussed here; it is possible to identify differences in the results generated by different combinations, as well as their possible contexts of use.

Relevance:

30.00%

Publisher:

Abstract:

The development of high-spatial-resolution airborne and spaceborne sensors has improved the capability of ground-based data collection in the fields of agriculture, geography, geology, mineral identification, detection [2, 3], and classification [4–8]. The signal read by the sensor from a given spatial element of resolution and at a given spectral band is a mixture of components originated by the constituent substances, termed endmembers, located at that element of resolution. This chapter addresses hyperspectral unmixing, which is the decomposition of the pixel spectra into a collection of constituent spectra, or spectral signatures, and their corresponding fractional abundances indicating the proportion of each endmember present in the pixel [9, 10]. Depending on the mixing scales at each pixel, the observed mixture is either linear or nonlinear [11, 12]. The linear mixing model holds when the mixing scale is macroscopic [13]; the nonlinear model holds when the mixing scale is microscopic (i.e., intimate mixtures) [14, 15]. The linear model assumes negligible interaction among distinct endmembers [16, 17], whereas the nonlinear model assumes that incident solar radiation is scattered by the scene through multiple bounces involving several endmembers [18]. Under the linear mixing model, and assuming that the number of endmembers and their spectral signatures are known, hyperspectral unmixing is a linear problem, which can be addressed, for example, by the maximum likelihood setup [19], the constrained least-squares approach [20], spectral signature matching [21], the spectral angle mapper [22], and subspace projection methods [20, 23, 24]. Orthogonal subspace projection [23] reduces the data dimensionality, suppresses undesired spectral signatures, and detects the presence of a spectral signature of interest. The basic concept is to project each pixel onto a subspace that is orthogonal to the undesired signatures. As shown in Settle [19], the orthogonal subspace projection technique is equivalent to the maximum likelihood estimator. This projection technique was extended by three unconstrained least-squares approaches [24] (signature space orthogonal projection, oblique subspace projection, and target signature space orthogonal projection). Other works using the maximum a posteriori probability (MAP) framework [25] and projection pursuit [26, 27] have also been applied to hyperspectral data. In most cases, however, the number of endmembers and their signatures are not known. Independent component analysis (ICA) is an unsupervised source separation process that has been applied with success to blind source separation, feature extraction, and unsupervised recognition [28, 29]. ICA consists of finding a linear decomposition of observed data that yields statistically independent components. Given that hyperspectral data are, in given circumstances, linear mixtures, ICA comes to mind as a possible tool to unmix this class of data. In fact, the application of ICA to hyperspectral data has been proposed in reference 30, where endmember signatures are treated as sources and the mixing matrix is composed of the abundance fractions, and in references 9, 25, and 31–38, where the sources are the abundance fractions of each endmember. The first approach faces two problems: (1) the number of samples is limited to the number of channels, and (2) the process of pixel selection, playing the role of mixed sources, is not straightforward.
In the second approach, ICA is based on the assumption of mutually independent sources, which is not the case for hyperspectral data, since the sum of the abundance fractions is constant, implying dependence among the abundances. This dependence compromises the applicability of ICA to hyperspectral images. In addition, hyperspectral data are immersed in noise, which degrades ICA performance. IFA [39] was introduced as a method for recovering independent hidden sources from their observed noisy mixtures. IFA implements two steps: first, source densities and noise covariance are estimated from the observed data by maximum likelihood; second, sources are reconstructed by an optimal nonlinear estimator. Although IFA is a well-suited technique to unmix independent sources under noisy observations, the dependence among abundance fractions in hyperspectral imagery compromises IFA performance, as in the ICA case. Under the linear mixing model, hyperspectral observations lie in a simplex whose vertices correspond to the endmembers. Several approaches [40–43] have exploited this geometric feature of hyperspectral mixtures [42]. The minimum volume transform (MVT) algorithm [43] determines the simplex of minimum volume containing the data. MVT-type approaches are computationally complex: usually, these algorithms first find the convex hull defined by the observed data and then fit a minimum-volume simplex to it. Aiming at lower computational complexity, algorithms such as vertex component analysis (VCA) [44], the pixel purity index (PPI) [42], and N-FINDR [45] still find the minimum-volume simplex containing the data cloud, but they assume the presence in the data of at least one pure pixel of each endmember. This is a strong requisite that may not hold in some data sets; in any case, these algorithms find the set of purest pixels in the data. Hyperspectral sensors collect spatial images over many narrow contiguous bands, yielding large amounts of data. For this reason, the processing of hyperspectral data, including unmixing, is very often preceded by a dimensionality reduction step to reduce computational complexity and to improve the signal-to-noise ratio (SNR). Principal component analysis (PCA) [46], maximum noise fraction (MNF) [47], and singular value decomposition (SVD) [48] are three well-known projection techniques widely used in remote sensing in general and in unmixing in particular. A newly introduced method [49] exploits the structure of hyperspectral mixtures, namely the fact that spectral vectors are nonnegative. The computational complexity associated with these techniques is an obstacle to real-time implementations; to overcome this problem, band selection [50] and non-statistical [51] algorithms have been introduced. This chapter addresses hyperspectral data source dependence and its impact on ICA and IFA performance. The study considers simulated and real data and is based on mutual information minimization. Hyperspectral observations are described by a generative model that takes into account the degradation mechanisms normally found in hyperspectral applications, namely signature variability [52–54], abundance constraints, topography modulation, and system noise. The computation of mutual information is based on fitting mixtures of Gaussians (MOG) to the data. The MOG parameters (number of components, means, covariances, and weights) are inferred using a minimum description length (MDL) based algorithm [55].
We study the behavior of the mutual information as a function of the unmixing matrix. The conclusion is that the unmixing matrix minimizing the mutual information might be very far from the true one. Nevertheless, some abundance fractions might be well separated, mainly in the presence of strong signature variability, a large number of endmembers, and high SNR. We end this chapter by sketching a new methodology to blindly unmix hyperspectral data, where abundance fractions are modeled as a mixture of Dirichlet sources. This model enforces the positivity and constant-sum (full additivity) constraints. The mixing matrix is inferred by an expectation-maximization (EM)-type algorithm. This approach is in the vein of references 39 and 56, replacing independent sources represented by MOG with a mixture of Dirichlet sources. Compared with the geometric-based approaches, the advantage of this model is that there is no need for pure pixels in the observations. The chapter is organized as follows. Section 6.2 presents a spectral radiance model and formulates spectral unmixing as a linear problem accounting for abundance constraints, signature variability, topography modulation, and system noise. Section 6.3 presents a brief summary of the ICA and IFA algorithms. Section 6.4 illustrates the performance of IFA and of some well-known ICA algorithms on experimental data. Section 6.5 studies the limitations of ICA and IFA in unmixing hyperspectral data. Section 6.6 presents results of ICA based on real data. Section 6.7 describes the new blind unmixing scheme and some illustrative examples. Section 6.8 concludes with some remarks.
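As a small illustration of the constrained least-squares unmixing mentioned above, the sketch below solves for nonnegative abundances with a soft sum-to-one constraint via the standard trick of appending a weighted ones-row to a nonnegative least-squares problem. The endmember signatures and the pixel are synthetic placeholders, not data from the chapter.

```python
# Linear unmixing sketch with known endmembers: nonnegativity enforced by
# NNLS, full additivity (sum-to-one) enforced softly via an augmented row.
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)
L, p = 50, 3                                # spectral bands, endmembers
M = np.abs(rng.standard_normal((L, p)))     # synthetic endmember signatures
alpha_true = np.array([0.6, 0.3, 0.1])      # true abundances (sum to 1)
y = M @ alpha_true + 0.01 * rng.standard_normal(L)  # observed noisy pixel

delta = 10.0  # weight of the sum-to-one row; larger enforces it more tightly
M_aug = np.vstack([M, delta * np.ones((1, p))])
y_aug = np.concatenate([y, [delta]])
alpha_hat, _ = nnls(M_aug, y_aug)  # nonnegative LS with soft full additivity
print(alpha_true, alpha_hat.round(3))
```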