912 results for forward selection component analysis


Relevance: 100.00%

Abstract:

The spatial variability of soil and plant properties exerts great influence on the yield of agricultural crops. This study analyzed the spatial variability of the fertility of a Humic Rhodic Hapludox under Arabica coffee, using principal component analysis, cluster analysis and geostatistics in combination. The experiment was carried out in an area under Coffea arabica L., variety Catucai 20/15 - 479. The soil was sampled at a depth of 0.20 m, at 50 points of a sampling grid. The following chemical properties were determined: P, K+, Ca2+, Mg2+, Na+, S, Al3+, pH, H + Al, SB, t, T, V, m, OM, Na saturation index (SSI), remaining phosphorus (P-rem), and micronutrients (Zn, Fe, Mn, Cu and B). The data were analyzed with descriptive statistics, followed by principal component and cluster analyses. Geostatistics was used to check and quantify the degree of spatial dependence of the properties represented by the principal components. Principal component analysis allowed a dimensional reduction of the problem, providing interpretable components with little information loss. Despite the information loss inherent to principal component analysis, its combination with geostatistical analysis was efficient for quantifying and determining the structure of spatial dependence of soil fertility. In general, the availability of soil mineral nutrients was low and the levels of acidity and exchangeable Al were high.
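As an illustration of the combined approach, the sketch below runs PCA on standardized soil variables and then computes an experimental semivariogram of the first component over the sampling grid. It is a minimal stand-in, not the study's code: the coordinates, variable values and lag bins are synthetic.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
coords = rng.uniform(0, 100, size=(50, 2))   # 50 grid points (x, y), hypothetical metres
soil = rng.normal(size=(50, 20))             # 20 chemical properties per point (synthetic)

scores = PCA(n_components=3).fit_transform(StandardScaler().fit_transform(soil))
pc1 = scores[:, 0]                           # first principal component at each point

def semivariogram(xy, z, lags):
    """Classical estimator: gamma(h) = half the mean squared increment per lag bin."""
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
    i, j = np.triu_indices(len(z), k=1)      # all point pairs
    sq = 0.5 * (z[i] - z[j]) ** 2
    return [sq[(d[i, j] >= lo) & (d[i, j] < hi)].mean() for lo, hi in lags]

lags = [(0, 10), (10, 20), (20, 30), (30, 40)]
print(semivariogram(coords, pc1, lags))      # spatial dependence structure of PC1
```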

Relevance: 100.00%

Abstract:

The model plant Arabidopsis thaliana was studied in the search for new metabolites involved in wound signalling. Diverse LC approaches were evaluated in terms of efficiency and analysis time, and a 7-min gradient on a UPLC-TOF-MS system with a short column was chosen for metabolite fingerprinting. This screening step was designed to allow the comparison of a large number of samples over a wide range of time points after stress induction, in both positive and negative ionisation modes. After data treatment, clear discrimination was obtained, providing lists of potential stress-induced ions. In a second step, the fingerprinting conditions were transferred to a longer column, whose higher peak capacity made it possible to demonstrate the presence of isomers among the highlighted compounds.

Relevance: 100.00%

Abstract:

The aim of this work is to study the influence of several analytical parameters on the variability of Raman spectra of paint samples. In the present study, microtome thin sectioning and direct analysis (no preparation) are considered as sample preparation methods. In order to evaluate their influence on the measurements, a fractional factorial experimental design with seven factors (including the sampling process) is applied, for a total of 32 experiments representing 160 measurements. Once the influence of sample preparation has been highlighted, a depth profile of a paint sample is carried out by changing the focusing plane, in order to measure the colored layer under a clearcoat. This is undertaken to avoid sample preparation such as microtome sectioning. Finally, chemometric treatments such as principal component analysis are applied to the resulting spectra. The findings of this study indicate the importance of sample preparation, or more specifically of surface roughness, on the variability of measurements of the same sample. Moreover, the depth profile experiment highlights the influence of the refractive index of the upper layer (clearcoat) when measuring through a transparent layer.
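A seven-factor design with 32 runs is a 2^(7-2) fractional factorial. The sketch below generates one such design; the generator relations F = ABCD and G = ABDE are a common textbook choice, not necessarily the ones used by the authors.

```python
import itertools

runs = []
for a, b, c, d, e in itertools.product((-1, 1), repeat=5):  # full 2^5 base design
    f = a * b * c * d        # generator F = ABCD (assumed, for illustration)
    g = a * b * d * e        # generator G = ABDE (assumed, for illustration)
    runs.append((a, b, c, d, e, f, g))

print(len(runs))             # 32 experiments, 7 factors at two levels each
for r in runs[:4]:
    print(r)
```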

Relevance: 100.00%

Abstract:

Recognition by the T-cell receptor (TCR) of immunogenic peptides presented by class I major histocompatibility complexes (MHCs) is the determining event in the specific cellular immune response against virus-infected cells or tumor cells. It is of great interest, therefore, to elucidate the molecular principles upon which the selectivity of a TCR is based. These principles can in turn be used to design therapeutic approaches, such as peptide-based immunotherapies of cancer. In this study, free energy simulation methods are used to analyze the binding free energy difference of a particular TCR (A6) for a wild-type peptide (Tax) and a mutant peptide (Tax P6A), both presented in HLA A2. The computed free energy difference is 2.9 kcal/mol, in good agreement with the experimental value. The simulation results can therefore be used to understand the origin of the free energy difference, which was not accessible from the experiments. A free energy component analysis decomposes the free energy difference between the binding of the wild-type and mutant peptide into its contributions. Of particular interest is the fact that better solvation of the mutant peptide when bound to the MHC molecule is an important contribution to the greater affinity of the TCR for the latter. The results make it possible to identify the residues of the TCR that are important for the selectivity, providing an understanding of the molecular principles that govern the recognition. The possibility of using free energy simulations in designing peptide derivatives for cancer immunotherapy is briefly discussed.
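The arithmetic behind the reported number is a thermodynamic cycle: the relative binding free energy is the difference between the alchemical WT → P6A transformation computed in the TCR-bound and unbound states. The sketch below illustrates this and the idea of a component decomposition; all individual values are placeholders, not the paper's results.

```python
# Hypothetical leg values of the cycle (kcal/mol); only their difference is
# physically meaningful, and it is chosen here to match the abstract's total.
dG_mut_bound = 10.4   # WT -> P6A with the TCR bound (placeholder)
dG_mut_free  =  7.5   # WT -> P6A in the free MHC-peptide complex (placeholder)

ddG_bind = dG_mut_bound - dG_mut_free
print(f"ddG_bind = {ddG_bind:.1f} kcal/mol")   # 2.9, as reported

# A free energy component analysis splits ddG into additive contributions,
# e.g. per interaction type (again placeholder numbers that sum to the total):
components = {"electrostatics": 1.1, "van der Waals": 0.6, "solvation": 1.2}
print(sum(components.values()))                # components sum back to ddG_bind
```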

Relevance: 100.00%

Abstract:

In research on organizational trust, trust is usually seen as a phenomenon between individuals, such as an employee's trust in co-workers, the supervisor or top management. Organizational trust also has a non-personified dimension, however: so-called institutional trust. So far only a few researchers have included institutional trust as part of organizational trust in their studies. The aim of this work is to develop the concept of institutional trust and an instrument for measuring it in an organizational environment. The development process consisted of three phases. In the first phase, the statements for the instrument were developed and content validity was assessed. The second phase comprised data collection, reduction of the statements and comparison of alternative models. In the third phase, construct validity and reliability were evaluated. The empirical part of the work was carried out as an internet survey among adult students. The data were analyzed using principal component analysis and confirmatory factor analysis. Institutional trust consists of two dimensions: capability and fairness. Capability comprises five subcomponents: organization of operational activities, organizational permanence, capability in business and people management, technological reliability, and competitiveness. Fairness, in turn, comprises HRM practices, the spirit of fair play prevailing in the organization, and communication. The final instrument contains 18 statements for capability and 13 for fairness. The instrument developed in this work enables better and more reliable measurement of organizational trust. To the author's knowledge, this is the first comprehensive instrument for measuring institutional trust.
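A minimal sketch (synthetic data) of the kind of item analysis used in such scale development: principal component analysis of Likert-type items followed by a Cronbach's alpha reliability estimate for one extracted dimension. The item counts mirror the abstract; the data and loadings are invented.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
latent = rng.normal(size=(200, 1))                   # one latent trust dimension
items = latent + 0.7 * rng.normal(size=(200, 18))    # 18 capability items (synthetic)

# A dominant first component suggests the items tap a single dimension.
print(PCA(n_components=2).fit(items).explained_variance_ratio_)

def cronbach_alpha(X):
    """Standard alpha: k/(k-1) * (1 - sum of item variances / total score variance)."""
    k = X.shape[1]
    return k / (k - 1) * (1 - X.var(axis=0, ddof=1).sum()
                              / X.sum(axis=1).var(ddof=1))

print(cronbach_alpha(items))                         # internal consistency estimate
```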

Relevance: 100.00%

Abstract:

This Master's thesis examines spectral images from the viewpoint of a statistical image model. The first part of the thesis studies the effect of the distributions of statistical parameters on colors and highlights under different illumination conditions. It was observed that the relationships between the statistical parameters do not depend on the illumination conditions, but do depend on how noise-free the image is. It also emerged that high kurtosis may be caused by color saturation. In addition, a texture synthesis algorithm based on the statistical spectral model was developed. It achieved good results when the dependencies between the statistical parameters held. In the second part of the work, various spectral images were studied using independent component analysis (ICA). The following ICA algorithms were examined: JADE, fixed-point ICA and moment-based ICA. The studies emphasized the quality of the separation. The best separation was achieved with the JADE algorithm, although the differences between the algorithms were not significant. The algorithm divided an image into two independent components, either highlighted and non-highlighted or chromatic and achromatic. Finally, the relationship of kurtosis to image properties such as highlights and color saturation is discussed. The last part of the work proposes possible topics for further research.
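A sketch of the fixed-point ICA step using scikit-learn's FastICA (JADE is not in scikit-learn, so the thesis' three-algorithm comparison is not reproduced here). The mixtures stand in for flattened spectral-image channels; sources and mixing matrix are synthetic.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(2)
s = np.c_[np.sign(rng.normal(size=1000)),      # two non-Gaussian sources
          rng.laplace(size=1000)]
A = np.array([[1.0, 0.6], [0.4, 1.0]])          # mixing matrix (synthetic)
x = s @ A.T                                     # observed mixtures

sources = FastICA(n_components=2, random_state=0).fit_transform(x)
print(sources.shape)   # two independent components, analogous to the thesis'
                       # highlighted/non-highlighted or chromatic/achromatic split
```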

Relevance: 100.00%

Abstract:

This study presents a catalogue of synoptic patterns of torrential rainfall in the northeast of the Iberian Peninsula (IP). These circulation patterns were obtained by applying a T-mode Principal Component Analysis (PCA) to a daily data grid (NCEP/NCAR reanalysis) of sea level pressure (SLP). The analysis made use of 304 days which recorded >100 mm in one or more stations in the provinces of Barcelona, Girona and Tarragona (coastal area of Catalonia) throughout the 1950-2005 period. The catalogue comprises 7 circulation patterns showing a great variety of atmospheric conditions and seasonal or monthly distributions. Likewise, we computed the mean value of the Western Mediterranean Oscillation index (WeMOi) for each synoptic pattern by averaging over all days grouped in that pattern. The results showed a clear association between negative values of this teleconnection index and torrential rainfall in the northeast of the IP. We therefore put forward the WeMO as an essential tool for forecasting heavy rainfall in the northeast of Spain.
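In T-mode PCA the days are the variables and the grid points the observations, and each day is assigned to the component on which it loads most strongly. The sketch below illustrates this classification on synthetic SLP fields; the grid size and the exact rotation/retention choices of the study are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(3)
slp = rng.normal(size=(500, 304))        # 500 grid points x 304 torrential days (synthetic)

z = (slp - slp.mean(0)) / slp.std(0)     # standardize each day (column)
corr = np.corrcoef(z, rowvar=False)      # day-by-day correlation matrix (T-mode)
vals, vecs = np.linalg.eigh(corr)
loadings = vecs[:, ::-1][:, :7] * np.sqrt(vals[::-1][:7])   # 7 retained components

pattern = np.abs(loadings).argmax(axis=1)    # circulation type for each day
print(np.bincount(pattern))                  # number of days per synoptic pattern
```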

Relevance: 100.00%

Abstract:

Recent years have produced great advances in instrumentation technology. The amount of available data has been increasing due to the simplicity, speed and accuracy of current spectroscopic instruments. Most of these data are, however, meaningless without proper analysis, which has been one of the reasons for the growing success of multivariate handling of such data. Industrial data are commonly not designed data; in other words, there is no exact experimental design, but rather the data have been collected as a routine procedure during an industrial process. This places certain demands on the multivariate modeling, as the selection of samples and variables can have an enormous effect. Common approaches to the modeling of industrial data are PCA (principal component analysis) and PLS (projection to latent structures, or partial least squares), but there are also other methods that should be considered. The more advanced methods include multiblock modeling and nonlinear modeling. In this thesis it is shown that the results of data analysis vary according to the modeling approach used, making the selection of the modeling approach dependent on the purpose of the model. If the model is intended to provide accurate predictions, the approach should be different than when the purpose of modeling is mostly to obtain information about the variables and the process. For industrial applicability it is essential that the methods are robust and sufficiently simple to apply; in this way the methods and results can be compared and an approach selected that is suitable for the intended purpose. In this thesis, differences between data analysis methods are compared with data from different fields of industry. In the first two papers, the multiblock method is considered for data originating from the oil and fertilizer industries, and the results are compared to those from PLS and priority PLS. The third paper considers the applicability of multivariate models to process control for a reactive crystallization process. In the fourth paper, nonlinear modeling is examined with a data set from the oil industry. The response has a nonlinear relation to the descriptor matrix, and the results are compared between linear modeling, polynomial PLS and nonlinear modeling using nonlinear score vectors.
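A minimal PLS example with scikit-learn, standing in for the baseline latent-variable models compared in the thesis (the multiblock, priority PLS and nonlinear variants are not shown). Data, coefficients and component count are arbitrary illustration choices.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(4)
X = rng.normal(size=(100, 30))           # e.g. 30 process or spectral variables
y = X[:, :3] @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=100)

pls = PLSRegression(n_components=3).fit(X, y)
print(pls.score(X, y))                   # R^2 of the fitted latent-variable model
print(pls.x_loadings_.shape)             # loadings relate variables to components
```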

Relevance: 100.00%

Abstract:

The uncertainty of any analytical determination depends on both analysis and sampling. Uncertainty arising from sampling is usually not controlled, and methods for its evaluation are still little known. Pierre Gy's sampling theory is currently the most complete theory of sampling, and also takes the design of the sampling equipment into account. Guides dealing with the practical issues of sampling also exist, published by international organizations such as EURACHEM, IUPAC (International Union of Pure and Applied Chemistry) and ISO (International Organization for Standardization). In this work Gy's sampling theory was applied to several cases, including the analysis of chromite concentration estimated from SEM (Scanning Electron Microscope) images and estimation of the total uncertainty of a drug dissolution procedure. The results clearly show that Gy's sampling theory can be utilized in both of the above-mentioned cases and that the uncertainties achieved are reliable. Variographic experiments, introduced in Gy's sampling theory, are beneficially applied in analyzing the uncertainty of autocorrelated data sets such as industrial process data and environmental discharges. The periodic behaviour of these kinds of processes can be observed by variographic analysis, as well as with the fast Fourier transform and autocorrelation functions. With variographic analysis, the uncertainties are estimated as a function of the sampling interval. This is advantageous when environmental or process data are analyzed, as it can easily be estimated how the sampling interval affects the overall uncertainty. If the sampling frequency is too high, unnecessary resources will be used; on the other hand, if the frequency is too low, the uncertainty of the determination may be unacceptably high. Variographic methods can also be utilized to estimate the uncertainty of spectral data produced by modern instruments. Since spectral data are multivariate, methods such as Principal Component Analysis (PCA) are needed when the data are analyzed. Optimization of a sampling plan increases the reliability of the analytical process, which may in the end have beneficial effects on the economics of chemical analysis.
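The core of the variographic experiment is simple to state: for an autocorrelated process series, the variogram V(j) is half the mean squared difference between samples taken j sampling intervals apart. A sketch on a synthetic process series (the drifting, periodic signal is invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(5)
# Synthetic autocorrelated process with a periodic component
x = np.cumsum(rng.normal(size=500)) + 10 * np.sin(np.arange(500) / 20)

def variogram(x, max_lag):
    """Experimental variogram V(j) = 0.5 * mean((x[i+j] - x[i])^2) for each lag j."""
    return np.array([0.5 * np.mean((x[j:] - x[:-j]) ** 2)
                     for j in range(1, max_lag + 1)])

v = variogram(x, 50)
print(v[:5])   # uncertainty grows with the sampling interval; periodicity in
               # the process shows up as oscillation in V(j)
```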

Relevance: 100.00%

Abstract:

Raw measurement data do not always immediately convey useful information, but applying statistical analysis tools to the measurement data can improve the situation. Data analysis can offer benefits such as acquiring meaningful insight from the dataset, basing critical decisions on the findings, and ruling out human bias through proper statistical treatment. In this thesis we analyze data from an industrial mineral processing plant with the aim of studying the possibility of forecasting the quality of the final product, given by one variable, with a model based on the other variables. For the study, mathematical tools such as Qlucore Omics Explorer (QOE) and Sparse Bayesian regression (SB) are used. Linear regression is then used to build a model based on a subset of variables that have the most significant weights in the SB model. The results obtained from QOE show that the variable representing the desired final product does not correlate with the other variables. For SB and linear regression, the results show that both models built on 1-day averaged data seriously underestimate the variance of the true data, whereas the two models built on 1-month averaged data are reliable and able to explain a larger proportion of the variability in the available data, making them suitable for prediction purposes. However, it is concluded that no single model fits the whole available dataset well. It is therefore proposed, as future work, to build piecewise nonlinear regression models if the same dataset is used, or for the plant to provide another dataset, collected in a more systematic fashion than the present data, for further analysis.
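A sketch of the two-stage workflow using scikit-learn's ARDRegression as a stand-in for the thesis' Sparse Bayesian regression (the exact implementation used there is not specified here): sparse weights select variables, then an ordinary linear model is refit on the selected subset. Data and threshold are synthetic.

```python
import numpy as np
from sklearn.linear_model import ARDRegression, LinearRegression

rng = np.random.default_rng(6)
X = rng.normal(size=(200, 15))                            # 15 process variables
y = 2 * X[:, 0] - X[:, 3] + 0.5 * rng.normal(size=200)    # only two matter

sb = ARDRegression().fit(X, y)                 # sparse Bayesian weights
keep = np.flatnonzero(np.abs(sb.coef_) > 0.1)  # variables with significant weight
print(keep)                                    # expected: [0, 3]

lin = LinearRegression().fit(X[:, keep], y)    # refit on the selected subset
print(lin.score(X[:, keep], y))                # R^2 of the reduced linear model
```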

Relevance: 100.00%

Abstract:

The hydroalcoholic extracts prepared from standard leaves of Maytenus ilicifolia and commercial samples of espinheira-santa were evaluated qualitatively (fingerprinting) and quantitatively. In this paper, fingerprinting chromatograms coupled with Principal Component Analysis (PCA) are described for the metabolomic analysis of standard and commercial espinheira-santa samples. Epicatechin was used as an external standard for the development and validation of a quantitative method for the analysis of herbal medicines using a photodiode array detector. This method was applied for the quantification of epicatechin in herbal medicines commercialized as espinheira-santa in Brazil and in the standard sample of M. ilicifolia.
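External-standard quantification reduces to fitting a calibration line from standard solutions and inverting it for a sample. A sketch with invented concentrations and peak areas (the paper's calibration data are not reproduced):

```python
import numpy as np

conc = np.array([5.0, 10.0, 25.0, 50.0, 100.0])    # epicatechin standards, ug/mL (invented)
area = np.array([12.1, 24.3, 60.2, 121.5, 240.8])  # detector peak areas (invented)

slope, intercept = np.polyfit(conc, area, 1)       # linear calibration curve
sample_area = 75.0                                 # hypothetical sample measurement
print((sample_area - intercept) / slope)           # epicatechin concentration in the sample
```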

Relevance: 100.00%

Abstract:

The aim of this study was to investigate the effect of pre-slaughter handling on the occurrence of PSE (Pale, Soft, and Exudative) meat in swine slaughtered at a commercial slaughterhouse located in the metropolitan region of Dourados, Mato Grosso do Sul, Brazil. Based on the database (n = 1,832 carcasses), an integrated multivariate analysis was applied in order to identify, among the selected variables, those of greatest relevance to this study. Results of the Principal Component Analysis showed that the first five components explained 89.28% of the total variance. In the Factor Analysis, the first factor represented the thermal stress and fatiguing conditions for swine during pre-slaughter handling. In general, this study indicated the importance of the pre-slaughter handling stages, identifying those of greatest stress and threat to animal welfare and pork quality: transport time, resting period, lairage time before unloading, unloading time, and ambience.
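A minimal sketch of the two reported steps: cumulative explained variance from PCA (the study's first five components reached 89.28%) and loadings from a factor analysis. The pre-slaughter dataset itself is not public, so the data, variable count and factor count below are synthetic.

```python
import numpy as np
from sklearn.decomposition import PCA, FactorAnalysis

rng = np.random.default_rng(7)
X = rng.normal(size=(1832, 12))                  # 1,832 carcasses x 12 variables (synthetic)

pca = PCA().fit(X)
print(np.cumsum(pca.explained_variance_ratio_)[:5])   # variance explained by 5 PCs

fa = FactorAnalysis(n_components=3).fit(X)
print(fa.components_[0])    # loadings of the first factor (interpreted in the
                            # study as thermal stress and fatigue)
```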

Relevance: 100.00%

Abstract:

This study aimed to develop a methodology based on multivariate statistical analysis of principal components and cluster analysis in order to identify the most representative variables in studies of minimum streamflow regionalization, and to optimize the identification of hydrologically homogeneous regions for the Doce river basin. Ten variables referring to the climatic and morphometric characteristics of the river basin were used, individualized for each of the 61 gauging stations: three dependent variables indicative of minimum streamflow (Q7,10, Q90 and Q95), and seven independent variables concerning climatic and morphometric characteristics of the basin (total annual rainfall – Pa; total semiannual rainfall of the dry and of the rainy season – Pss and Psc; watershed drainage area – Ad; length of the main river – Lp; total length of the rivers – Lt; and average watershed slope – SL). The results of the principal component analysis pointed out that the variable SL was the least representative for the study, and it was therefore discarded. The most representative independent variables were Ad and Psc. The best divisions into hydrologically homogeneous regions for the three studied flow characteristics were obtained using the Mahalanobis similarity matrix and the complete-linkage clustering method. The cluster analysis enabled the identification of four hydrologically homogeneous regions in the Doce river basin.
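A sketch of the clustering step: Mahalanobis distances between gauging stations and complete-linkage hierarchical clustering, cut at four regions. Station data are synthetic stand-ins for the basin variables.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

rng = np.random.default_rng(8)
X = rng.normal(size=(61, 7))                 # 61 stations x 7 basin variables (synthetic)

d = pdist(X, metric="mahalanobis")           # pairwise Mahalanobis distances
tree = linkage(d, method="complete")         # complete-linkage clustering
regions = fcluster(tree, t=4, criterion="maxclust")   # cut into 4 regions
print(np.bincount(regions)[1:])              # station count per homogeneous region
```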

Relevance: 100.00%

Abstract:

In today's logistics environment, there is a tremendous need for accurate cost information and cost allocation. Companies searching for a proper solution often come across activity-based costing (ABC) or one of its variations, which utilizes cost drivers to allocate the costs of activities to cost objects. The selection of appropriate cost drivers is essential for allocating costs accurately and reliably and for realizing the benefits of the costing system. The purpose of this study is to validate the transportation cost drivers of a Finnish wholesaler company and ultimately to select the best possible driver alternatives for the company. The use of cost driver combinations as an alternative is also studied. The study was conducted as part of the case company's applied ABC project, using statistical research as the main method, supported by a theoretical, literature-based method. The main research tools featured in the study are simple and multiple regression analyses, which, together with a practicality analysis based on the literature and observations, form the basis for the advanced methods. The results suggest that the most appropriate cost driver alternatives are delivery drops and internal delivery weight. The use of cost driver combinations is not recommended, as they do not provide substantially better results while increasing measurement costs, complexity and the burden of use. The use of internal freight cost drivers is also questionable, as the results indicate a weakening trend in their cost allocation capability towards the end of the period. More research on internal freight cost drivers should therefore be conducted before taking them into use.
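A hedged sketch of the driver-validation idea: regress transportation costs on each candidate driver separately, then on the combination, and compare explanatory power. All figures are invented; the case company's data and model specifications are not reproduced.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(9)
drops = rng.poisson(30, size=120).astype(float)      # delivery drops per period (invented)
weight = rng.gamma(5, 200, size=120)                 # internal delivery weight (invented)
cost = 15 * drops + 0.04 * weight + rng.normal(0, 20, size=120)

for name, x in [("drops", drops), ("weight", weight)]:
    r2 = LinearRegression().fit(x[:, None], cost).score(x[:, None], cost)
    print(name, round(r2, 3))                        # simple regressions per driver

X = np.c_[drops, weight]                             # driver combination
print("both", round(LinearRegression().fit(X, cost).score(X, cost), 3))
```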

Relevance: 100.00%

Abstract:

Identification of low-dimensional structures and main sources of variation from multivariate data are fundamental tasks in data analysis. Many methods aimed at these tasks involve the solution of an optimization problem. The objective of this thesis is therefore to develop computationally efficient and theoretically justified methods for solving such problems. Most of the thesis is based on a statistical model in which ridges of the density estimated from the data are considered as the relevant features. Finding ridges, which are generalized maxima, necessitates the development of advanced optimization methods. An efficient and convergent trust region Newton method for projecting a point onto a ridge of the underlying density is developed for this purpose. The method is utilized in a differential equation-based approach for tracing ridges and computing projection coordinates along them. The density estimation is done nonparametrically using Gaussian kernels, which allows the application of ridge-based methods with only mild assumptions on the underlying structure of the data. The statistical model and the ridge finding methods are adapted to two different applications. The first is the extraction of curvilinear structures from noisy data mixed with background clutter. The second is a novel nonlinear generalization of principal component analysis (PCA) and its extension to time series data. The methods have a wide range of potential applications where most of the earlier approaches are inadequate; examples include the identification of faults from seismic data and of filaments from cosmological data. The applicability of the nonlinear PCA to climate analysis and to the reconstruction of periodic patterns from noisy time series data is also demonstrated. Other contributions of the thesis include the development of an efficient semidefinite optimization method for embedding graphs into Euclidean space. The method produces structure-preserving embeddings that maximize interpoint distances. It is primarily developed for dimensionality reduction, but also has potential applications in graph theory and various areas of physics, chemistry and engineering. The asymptotic behaviour of ridges and maxima of Gaussian kernel densities is also investigated as the kernel bandwidth approaches infinity. The results are applied to the nonlinear PCA and to finding significant maxima of such densities, which is a typical problem in visual object tracking.
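The thesis develops a trust-region Newton method for ridge projection; as a simpler, well-known stand-in, the sketch below projects a point onto a density ridge of a Gaussian kernel density with a subspace-constrained mean-shift iteration: the mean-shift step is restricted to the Hessian eigendirection of smallest eigenvalue, so the point moves across, not along, the ridge. Data (a noisy circle) and bandwidth are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(10)
t = rng.uniform(0, 2 * np.pi, 400)
data = np.c_[np.cos(t), np.sin(t)] + 0.1 * rng.normal(size=(400, 2))  # noisy circle
h = 0.3                                                # Gaussian kernel bandwidth

def scms_step(x, data, h):
    w = np.exp(-np.sum((data - x) ** 2, axis=1) / (2 * h * h))
    mean = w @ data / w.sum()                          # mean-shift target
    diff = (data - x) / h
    # Local Hessian estimate (up to positive factors): weighted second moment - I
    H = (w[:, None, None] * (diff[:, :, None] * diff[:, None, :])).sum(0) \
        / w.sum() - np.eye(2)
    vals, vecs = np.linalg.eigh(H)
    v = vecs[:, :1]                                    # direction of smallest eigenvalue
    return x + (v @ v.T) @ (mean - x)                  # shift constrained to that direction

x = np.array([1.4, 0.1])                               # start off the ridge
for _ in range(50):
    x = scms_step(x, data, h)
print(x, np.linalg.norm(x))                            # converges near the unit circle
```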