64 results for Minimum Mean Square Error of Intensity Distribution
at Queensland University of Technology - ePrints Archive
Abstract:
The performance of an adaptive filter may be studied through the behaviour of the optimal and adaptive coefficients in a given environment. This thesis investigates the performance of finite impulse response adaptive lattice filters for two classes of input signals: (a) frequency modulated signals with polynomial phases of order p in complex Gaussian white noise (as nonstationary signals), and (b) impulsive autoregressive processes with alpha-stable distributions (as non-Gaussian signals). Initially, an overview is given of linear prediction and adaptive filtering. The convergence and tracking properties of the stochastic gradient algorithms are discussed for stationary and nonstationary input signals. It is explained that the stochastic gradient lattice algorithm has many advantages over the least-mean square algorithm, including a modular structure, easily guaranteed stability, lower sensitivity to the eigenvalue spread of the input autocorrelation matrix, and straightforward quantization of the filter coefficients (normally called reflection coefficients). We then characterize the performance of the stochastic gradient lattice algorithm for frequency modulated signals through the optimal and adaptive lattice reflection coefficients. This is a difficult task due to the nonlinear dependence of the adaptive reflection coefficients on the preceding stages and the input signal. To ease the derivations, we assume that the reflection coefficients of each stage are independent of the inputs to that stage. The optimal lattice filter is then derived for frequency modulated signals by computing the optimal values of the residual errors, reflection coefficients, and recovery errors. Next, we show the tracking behaviour of the adaptive reflection coefficients for frequency modulated signals by computing the average tracking model of these coefficients for the stochastic gradient lattice algorithm.
The second-order convergence of the adaptive coefficients is investigated by modeling the theoretical asymptotic variance of the gradient noise at each stage. The accuracy of the analytical results is verified by computer simulations. Using these analytical results, we demonstrate a new property of adaptive lattice filters: the polynomial order reducing property, which may be used to reduce the order of the polynomial phase of input frequency modulated signals. Considering two examples, we show how this property may be used in processing frequency modulated signals. In the first example, a detection procedure is carried out on a frequency modulated signal with a second-order polynomial phase in complex Gaussian white noise. We show that this technique yields a better probability of detection for the reduced-order phase signals than the traditional energy detector. It is also empirically shown that the distribution of the gradient noise in the first adaptive reflection coefficients approximates the Gaussian law. In the second example, the instantaneous frequency of the same observed signal is estimated. We show that this technique achieves a lower mean square error for the estimated frequencies at high signal-to-noise ratios than the adaptive line enhancer. The performance of adaptive lattice filters is then investigated for the second type of input signals, i.e., impulsive autoregressive processes with alpha-stable distributions. The concept of alpha-stable distributions is first introduced. We show that the stochastic gradient algorithm, which performs well for finite-variance input signals (such as frequency modulated signals in noise), does not converge quickly for infinite-variance stable processes (because it relies on the minimum mean-square error criterion).
To deal with such problems, the minimum dispersion criterion, fractional lower order moments, and recently developed algorithms for stable processes are introduced. We then study the possibility of using the lattice structure for impulsive stable processes. Accordingly, two new algorithms, the least-mean p-norm lattice algorithm and its normalized version, are proposed for lattice filters based on fractional lower order moments. Simulation results show that the proposed algorithms achieve faster convergence for parameter estimation of autoregressive stable processes with low to moderate degrees of impulsiveness than many other algorithms. We also discuss the effect of the impulsiveness of stable processes on the misalignment between the estimated parameters and their true values. Due to the infinite variance of stable processes, the performance of the proposed algorithms is investigated using extensive computer simulations only.
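The reflection-coefficient update at the heart of the stochastic gradient lattice algorithm can be sketched for a single stage as follows. This is an illustrative toy, not the thesis's derivation: the step size, power-smoothing constant and the AR(1) demo are invented here, and the least-mean p-norm variant discussed above would replace the squared-error gradient with a |e|^(p-1)·sign(e) term.

```python
import numpy as np

# Single-stage gradient adaptive lattice (GAL) sketch with a normalized
# step. Assumed demo values: mu=0.02, power smoothing 0.99.
def gal_stage(x, mu=0.02):
    k = 0.0          # reflection coefficient of stage 1
    power = 1.0      # running estimate of input power (normalization)
    b_prev = 0.0     # delayed backward prediction error b0(n-1)
    for sample in x:
        f0 = b0 = sample                 # stage-0 forward/backward errors
        f1 = f0 + k * b_prev             # forward error out of stage 1
        b1 = b_prev + k * f0             # backward error out of stage 1
        power = 0.99 * power + 0.01 * (f0 * f0 + b_prev * b_prev)
        k -= (mu / power) * (f1 * b_prev + b1 * f0)   # gradient step
        k = min(1.0, max(-1.0, k))       # lattice stability constraint
        b_prev = b0
    return k

# AR(1) demo: x[n] = 0.8 x[n-1] + w[n]; the optimal first reflection
# coefficient is -r(1)/r(0) = -0.8.
rng = np.random.default_rng(0)
x = np.zeros(20000)
w = rng.standard_normal(20000)
for n in range(1, len(x)):
    x[n] = 0.8 * x[n - 1] + w[n]
k_hat = gal_stage(x)
print(round(k_hat, 2))
```

The estimate should settle near the optimal value -0.8 for this process; a p-norm variant would trade some steady-state accuracy here for robustness to impulsive (alpha-stable) inputs.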
Abstract:
In Australia and increasingly worldwide, methamphetamine is one of the most commonly seized drugs analysed by forensic chemists. The current well-established GC/MS methods used to identify and quantify methamphetamine are lengthy and expensive, yet undercover police often request rapid analysis, motivating the development of a faster analytical technique. Ninety-six illicit drug seizures containing methamphetamine (0.1% - 78.6%) were analysed using Fourier Transform Infrared Spectroscopy with an Attenuated Total Reflectance attachment and chemometrics. Two Partial Least Squares models were developed, one using the principal infrared peaks of methamphetamine and the other a Hierarchical Partial Least Squares model. Both models were refined to choose the variables most closely associated with the methamphetamine % vector, and both performed excellently: the principal-peaks Partial Least Squares model had a Root Mean Square Error of Prediction of 3.8, an R2 of 0.9779 and a lower limit of quantification of 7% methamphetamine, while the Hierarchical Partial Least Squares model had a lower limit of quantification of 0.3% methamphetamine, a Root Mean Square Error of Prediction of 5.2 and an R2 of 0.9637. Such models offer rapid and effective methods for screening illicit drug samples to determine the percentage of methamphetamine they contain.
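For reference, the figures of merit reported for such calibration models (RMSEP and R2) are computed from predicted versus reference concentrations as below; the arrays are made-up stand-ins, not data from the study.

```python
import numpy as np

# Root Mean Square Error of Prediction and coefficient of determination,
# evaluated on a held-out prediction set.
def rmsep(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))

def r_squared(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

y_ref  = [10.0, 25.0, 40.0, 55.0, 70.0]   # hypothetical reference meth %
y_pred = [12.0, 24.0, 43.0, 52.0, 71.0]   # hypothetical model predictions
print(round(rmsep(y_ref, y_pred), 3), round(r_squared(y_ref, y_pred), 4))
```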
Abstract:
Background: The overuse of antibiotics is an increasing concern. Antibiotic resistance, which increases both the burden of disease and the cost of health services, is perhaps the most profound impact of antibiotic overuse. Attempts have been made to develop instruments to measure the psychosocial constructs underlying antibiotic use; however, none of these instruments has undergone thorough psychometric validation. This study evaluates the psychometric properties of the Parental Perceptions on Antibiotics (PAPA) scales, which attempt to measure the factors influencing parental use of antibiotics in children. Methods: 1111 parents of children younger than 12 years old were recruited from primary schools’ parental meetings in the Eastern Province of Saudi Arabia from September 2012 to January 2013. The structure of the PAPA instrument was validated using Confirmatory Factor Analysis (CFA), with measurement model fit evaluated using the raw and scaled χ2, the Goodness of Fit Index, and the Root Mean Square Error of Approximation. Results: A five-factor model was confirmed, with the model showing good fit. Constructs in the model include: Knowledge and Beliefs, Behaviors, Sources of information, Adherence, and Awareness about antibiotic resistance. The instrument was shown to have good internal consistency, and good discriminant and convergent validity. Conclusion: The availability of an instrument able to measure the psychosocial factors underlying antibiotic usage allows the risk factors underlying antibiotic use and overuse to now be investigated.
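The Root Mean Square Error of Approximation used to judge model fit here (and in several abstracts below) is derived from the model chi-square; a minimal sketch of the standard formula, with illustrative values rather than those of the PAPA study:

```python
import math

# RMSEA from the model chi-square, degrees of freedom, and sample size:
# sqrt(max(0, (chi2 - df) / (df * (n - 1)))). Values below are invented.
def rmsea(chi2, df, n):
    return math.sqrt(max(0.0, (chi2 - df) / (df * (n - 1))))

# e.g. chi2 = 20 on df = 10 with n = 201 respondents
print(round(rmsea(20.0, 10, 201), 4))
```

Values near zero indicate close fit; when chi-square does not exceed its degrees of freedom, the index is truncated at zero.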
Abstract:
Background and Aims Research into craving is hampered by a lack of theoretical specification and a plethora of substance-specific measures. This study aimed to develop a generic measure of craving based on elaborated intrusion (EI) theory. Confirmatory factor analysis (CFA) examined whether a generic measure replicated the three-factor structure of the Alcohol Craving Experience (ACE) scale over different consummatory targets and time-frames. Design Twelve studies were pooled for CFA. Targets included alcohol, cigarettes, chocolate and food. Focal periods varied from the present moment to the previous week. Separate analyses were conducted for strength and frequency forms. Setting Nine studies included university students, with single studies drawn from an internet survey, a community sample of smokers and alcohol-dependent out-patients. Participants A heterogeneous sample of 1230 participants. Measurements Adaptations of the ACE questionnaire. Findings Both craving strength [comparative fit index (CFI) = 0.974, root mean square error of approximation (RMSEA) = 0.039, 95% confidence interval (CI) = 0.035–0.044] and frequency (CFI = 0.971, RMSEA = 0.049, 95% CI = 0.044–0.055) gave an acceptable three-factor solution across desired targets that mapped onto the structure of the original ACE (intensity, imagery, intrusiveness), after removing an item, re-allocating another and taking intercorrelated error terms into account. Similar structures were obtained across time-frames and targets. Preliminary validity data on the resulting 10-item Craving Experience Questionnaire (CEQ) for cigarettes and alcohol were strong. Conclusions The CEQ is a brief, conceptually grounded and psychometrically sound measure of desires. It demonstrates a consistent factor structure across a range of consummatory targets in both laboratory and clinical contexts.
Abstract:
OBJECTIVE To examine the psychometric properties of a Chinese version of the Problem Areas In Diabetes (PAID-C) scale. RESEARCH DESIGN AND METHODS The reliability and validity of the PAID-C were evaluated in a convenience sample of 205 outpatients with type 2 diabetes. Confirmatory factor analysis, Bland-Altman analysis, and Spearman's correlations facilitated the psychometric evaluation. RESULTS Confirmatory factor analysis confirmed a one-factor structure of the PAID-C (χ2/df ratio = 1.894, goodness-of-fit index = 0.901, comparative fit index = 0.905, root mean square error of approximation = 0.066). The PAID-C was associated with A1C (rs = 0.15; P < 0.05) and diabetes self-care behaviors in general diet (rs = −0.17; P < 0.05) and exercise (rs = −0.17; P < 0.05). The 4-week test-retest reliability demonstrated satisfactory stability (rs = 0.83; P < 0.01). CONCLUSIONS The PAID-C is a reliable and valid measure to determine diabetes-related emotional distress in Chinese people with type 2 diabetes.
Abstract:
In this paper, spatially offset Raman spectroscopy (SORS) is demonstrated for non-invasively investigating the composition of drug mixtures inside an opaque plastic container. The mixtures consisted of three components including a target drug (acetaminophen or phenylephrine hydrochloride) and two diluents (glucose and caffeine). The target drug concentrations ranged from 5% to 100%. After conducting SORS analysis to ascertain the Raman spectra of the concealed mixtures, principal component analysis (PCA) was performed on the SORS spectra to reveal trends within the data. Partial least squares (PLS) regression was used to construct models that predicted the concentration of each target drug, in the presence of the other two diluents. The PLS models were able to predict the concentration of acetaminophen in the validation samples with a root-mean-square error of prediction (RMSEP) of 3.8% and the concentration of phenylephrine hydrochloride with an RMSEP of 4.6%. This work demonstrates the potential of SORS, used in conjunction with multivariate statistical techniques, to perform non-invasive, quantitative analysis on mixtures inside opaque containers. This has applications for pharmaceutical analysis, such as monitoring the degradation of pharmaceutical products on the shelf, in forensic investigations of counterfeit drugs, and for the analysis of illicit drug mixtures which may contain multiple components.
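The PCA step described above can be sketched with plain SVD on a matrix of mean-centred spectra. The "spectra" below are synthetic two-component mixtures with invented band positions and noise level, standing in for real SORS data:

```python
import numpy as np

# Synthetic mixtures of two mock component spectra (Gaussian bands),
# with target-drug fraction varying from 5% to 100%.
rng = np.random.default_rng(1)
wavenumbers = np.linspace(0, 1, 200)
comp_a = np.exp(-((wavenumbers - 0.3) / 0.05) ** 2)   # mock drug band
comp_b = np.exp(-((wavenumbers - 0.7) / 0.05) ** 2)   # mock diluent band
conc = np.linspace(0.05, 1.0, 30)                     # target-drug fraction
X = np.outer(conc, comp_a) + np.outer(1 - conc, comp_b)
X += 0.01 * rng.standard_normal(X.shape)              # measurement noise

# PCA via SVD of the mean-centred data matrix.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = U * s                                        # sample scores
explained = s ** 2 / np.sum(s ** 2)                   # variance ratios
print(round(float(explained[0]), 3))
```

For a two-component mixture series like this, the first principal component captures nearly all variance and its scores track the concentration trend, which is the kind of structure PCA reveals before PLS modelling.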
Abstract:
Background Multi-attribute utility instruments (MAUIs) are preference-based measures that comprise a health state classification system (HSCS) and a scoring algorithm that assigns a utility value to each health state in the HSCS. When developing a MAUI from a health-related quality of life (HRQOL) questionnaire, a HSCS must first be derived. This typically involves selecting a subset of domains and items, because HRQOL questionnaires typically have too many items to be amenable to the valuation task required to develop the scoring algorithm for a MAUI. Currently, exploratory factor analysis (EFA) followed by Rasch analysis is recommended for deriving a MAUI from a HRQOL measure. Aim To determine whether confirmatory factor analysis (CFA) is more appropriate and efficient than EFA for deriving a HSCS from the European Organisation for Research and Treatment of Cancer’s core HRQOL questionnaire, the Quality of Life Questionnaire (QLQ-C30), given its well-established domain structure. Methods QLQ-C30 (Version 3) data were collected from 356 patients receiving palliative radiotherapy for recurrent/metastatic cancer (various primary sites). The dimensional structure of the QLQ-C30 was tested with EFA and CFA, the latter informed by the established QLQ-C30 structure and the views of both patients and clinicians on which items are most relevant. Dimensions determined by EFA or CFA were then subjected to Rasch analysis. Results CFA results generally supported the proposed QLQ-C30 structure (comparative fit index = 0.99, Tucker–Lewis index = 0.99, root mean square error of approximation = 0.04). EFA revealed fewer factors, and some items cross-loaded on multiple factors. Further assessment of dimensionality with Rasch analysis allowed better alignment of the EFA dimensions with those detected by CFA. Conclusion CFA was more appropriate and efficient than EFA in producing clinically interpretable results for the HSCS for a proposed new cancer-specific MAUI.
Our findings suggest that CFA should be recommended generally when deriving a preference-based measure from a HRQOL measure that has an established domain structure.
Abstract:
Study design Retrospective validation study. Objectives To propose a method to evaluate, from a clinical standpoint, the ability of a finite-element model (FEM) of the trunk to simulate orthotic correction of spinal deformity and to apply it to validate a previously described FEM. Summary of background data Several FEMs of the scoliotic spine have been described in the literature. These models can prove useful in understanding the mechanisms of scoliosis progression and in optimizing its treatment, but their validation has often been lacking or incomplete. Methods Three-dimensional (3D) geometries of 10 patients before and during conservative treatment were reconstructed from biplanar radiographs. The effect of bracing was simulated by modeling displacements induced by the brace pads. Simulated clinical indices (Cobb angle, T1–T12 and T4–T12 kyphosis, L1–L5 lordosis, apical vertebral rotation, torsion, rib hump) and vertebral orientations and positions were compared to those measured in the patients' 3D geometries. Results Errors in clinical indices were of the same order of magnitude as the uncertainties due to 3D reconstruction; for instance, Cobb angle was simulated with a root mean square error of 5.7°, and rib hump error was 5.6°. Vertebral orientation was simulated with a root mean square error of 4.8° and vertebral position with an error of 2.5 mm. Conclusions The methodology proposed here allowed in-depth evaluation of subject-specific simulations, confirming that FEMs of the trunk have the potential to accurately simulate brace action. These promising results provide a basis for ongoing 3D model development, toward the design of more efficient orthoses.
Abstract:
Our results demonstrate that photorefractive residual amplitude modulation (RAM) noise in electro-optic modulators (EOMs) can be reduced by modifying the incident beam intensity distribution. Here we report an order of magnitude reduction in RAM when beams with uniform intensity (flat-top) profiles, generated with an LCOS-SLM, are used instead of the usual fundamental Gaussian mode (TEM00). RAM arises from the photorefractive amplified scatter noise off the defects and impurities within the crystal. A reduction in RAM is observed with increasing intensity uniformity (flatness), which is attributed to a reduction in space charge field on the beam axis. The level of RAM reduction that can be achieved is physically limited by clipping at EOM apertures, with the observed results agreeing well with a simple model. These results are particularly important in applications where the reduction of residual amplitude modulation to 10^-6 is essential.
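A flat-top profile of the kind described is commonly modelled as a super-Gaussian; the sketch below (arbitrary demo parameters, not the experimental beam or the paper's model) shows how the on-axis intensity flattens as the super-Gaussian order grows:

```python
import numpy as np

# Super-Gaussian intensity profile: I(r) = exp(-2 (|r|/w)^(2N)).
# N = 1 recovers the fundamental Gaussian (TEM00-like) profile;
# large N approaches a uniform "flat-top".
r = np.linspace(-1.5, 1.5, 1001)   # transverse coordinate in units of w

def super_gaussian(r, w=1.0, N=1):
    return np.exp(-2.0 * (np.abs(r) / w) ** (2 * N))

gauss = super_gaussian(r, N=1)     # Gaussian beam profile
flat  = super_gaussian(r, N=8)     # nearly uniform flat-top profile

# Crude flatness metric: fractional intensity variation over |r| < 0.5 w.
def flatness(I):
    core = I[np.abs(r) < 0.5]
    return float(np.ptp(core) / core.max())

print(round(flatness(gauss), 3), round(flatness(flat), 4))
```

The sharply reduced central intensity variation of the high-order profile is the "flatness" property the abstract links to a smaller on-axis space charge field.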
Abstract:
Since the celebrated linear minimum mean square error (MMSE) Kalman filter cannot guarantee robust performance in an integrated GPS/INS system, an H(infinity) filter with respect to polytopic uncertainty is designed. The purpose of this paper is to illustrate this application and contrast it with the traditional Kalman filter. A game-theoretic H(infinity) filter is first reviewed; next, we utilize the linear matrix inequality (LMI) approach to design the robust H(infinity) filter. For the specific INS/GPS model, the unstable model case is considered. We explain why the Kalman filter diverges under an uncertain dynamic system and simultaneously investigate the relationship between the H(infinity) filter and the Kalman filter. A loosely coupled INS/GPS simulation system is used to verify this application. Results show that the robust H(infinity) filter performs better when the system suffers from uncertainty and is more robust than the conventional Kalman filter.
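For contrast, the Kalman baseline the paper argues against can be sketched as a scalar recursion. Everything here is invented for the demo (a constant-state model with assumed noise levels); the paper's H(infinity)/LMI design is not reproduced:

```python
import numpy as np

# Scalar Kalman filter for a constant state observed in white noise.
# MMSE-optimal only when the model and noise statistics are correct,
# which is exactly the assumption the H-infinity design relaxes.
def kalman_constant(z, q=1e-5, r=0.5 ** 2):
    x, p = 0.0, 1.0                  # state estimate and its variance
    for zk in z:
        p += q                       # predict (random-walk process noise q)
        K = p / (p + r)              # Kalman gain
        x += K * (zk - x)            # measurement update
        p *= (1.0 - K)
    return x, p

rng = np.random.default_rng(2)
true_x = 3.0
z = true_x + 0.5 * rng.standard_normal(500)   # noisy measurements
x_hat, p_hat = kalman_constant(z)
print(round(x_hat, 1))
```

When the true dynamics deviate from the assumed model, this recursion can diverge, which motivates the worst-case-bounded H(infinity) alternative in the paper.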
Abstract:
This thesis aimed to investigate the way in which distance runners modulate their speed in an effort to understand the key processes and determinants of speed selection when encountering hills in natural outdoor environments. One factor which has limited the expansion of knowledge in this area has been a reliance on the motorized treadmill, which constrains runners to constant speeds and gradients and only linear paths. Conversely, limits in the portability or storage capacity of available technology have restricted field research to brief durations and level courses. Therefore, another aim of this thesis was to evaluate the capacity of lightweight, portable technology to measure running speed in outdoor undulating terrain. The first study of this thesis assessed the validity of a non-differential GPS to measure speed, displacement and position during human locomotion. Three healthy participants walked and ran over straight and curved courses for 59 and 34 trials, respectively. A non-differential GPS receiver provided speed data by Doppler shift and change in GPS position over time, which were compared with actual speeds determined by chronometry. Displacement data from the GPS were compared with a surveyed 100 m section, while static positions were collected for 1 hour and compared with the known geodetic point. GPS speed values on the straight course were found to be closely correlated with actual speeds (Doppler shift: r = 0.9994, p < 0.001, Δ GPS position/time: r = 0.9984, p < 0.001). Actual speed errors were lowest using the Doppler shift method (90.8% of values within ± 0.1 m.sec⁻¹). Speed was slightly underestimated on a curved path, though still highly correlated with actual speed (Doppler shift: r = 0.9985, p < 0.001, Δ GPS distance/time: r = 0.9973, p < 0.001). Distance measured by GPS was 100.46 ± 0.49 m, while 86.5% of static points were within 1.5 m of the actual geodetic point (mean error: 1.08 ± 0.34 m, range 0.69-2.10 m).
Non-differential GPS demonstrated a highly accurate estimation of speed across a wide range of human locomotion velocities using only the raw signal data, with a minimal decrease in accuracy around bends. This high level of resolution was matched by accurate displacement and position data. Coupled with reduced size, cost and ease of use, a non-differential receiver offers a valid alternative to differential GPS in the study of overground locomotion. The second study of this dissertation examined speed regulation during overground running on a hilly course. Following an initial laboratory session to calculate physiological thresholds (VO2 max and ventilatory thresholds), eight experienced long distance runners completed a self-paced time trial over three laps of an outdoor course involving uphill, downhill and level sections. A portable gas analyser, GPS receiver and activity monitor were used to collect physiological, speed and stride frequency data. Participants ran 23% slower on uphills and 13.8% faster on downhills compared with level sections. Speeds on level sections were significantly different for 78.4 ± 7.0 seconds following an uphill and 23.6 ± 2.2 seconds following a downhill. Speed changes were primarily regulated by stride length, which was 20.5% shorter uphill and 16.2% longer downhill, while stride frequency was relatively stable. Oxygen consumption averaged 100.4% of runners’ individual ventilatory thresholds on uphills, 78.9% on downhills and 89.3% on level sections. Group-level speed was highly predicted using a modified gradient factor (r2 = 0.89). Individuals adopted distinct pacing strategies, both across laps and as a function of gradient. Speed was best predicted using a weighted factor to account for prior and current gradients. Oxygen consumption (VO2) limited runners’ speeds only on uphill sections, and was maintained in line with individual ventilatory thresholds.
Running speed showed larger individual variation on downhill sections, while speed on the level was systematically influenced by the preceding gradient. Runners who varied their pace more as a function of gradient showed a more consistent level of oxygen consumption. These results suggest that optimising time on the level sections after hills offers the greatest potential to minimise overall time when running over undulating terrain. The third study of this thesis investigated the effect of implementing an individualised pacing strategy on running performance over an undulating course. Six trained distance runners completed three trials involving four laps (9968m) of an outdoor course involving uphill, downhill and level sections. The initial trial was self-paced in the absence of any temporal feedback. For the second and third field trials, runners were paced for the first three laps (7476m) according to two different regimes (Intervention or Control) by matching desired goal times for subsections within each gradient. The fourth lap (2492m) was completed without pacing. Goals for the Intervention trial were based on findings from study two using a modified gradient factor and elapsed distance to predict the time for each section. To maintain the same overall time across all paced conditions, times were proportionately adjusted according to split times from the self-paced trial. The alternative pacing strategy (Control) used the original split times from this initial trial. Five of the six runners increased their range of uphill to downhill speeds on the Intervention trial by more than 30%, but this was unsuccessful in achieving a more consistent level of oxygen consumption with only one runner showing a change of more than 10%. Group level adherence to the Intervention strategy was lowest on downhill sections. Three runners successfully adhered to the Intervention pacing strategy which was gauged by a low Root Mean Square error across subsections and gradients. 
Of these three, the two who had the largest change in uphill-downhill speeds ran their fastest overall time. This suggests that for some runners the strategy of varying speeds systematically to account for gradients and transitions may benefit race performances on courses involving hills. In summary, a non-differential receiver was found to offer highly accurate measures of speed, distance and position across the range of human locomotion speeds. Self-selected speed was found to be best predicted using a weighted factor to account for prior and current gradients. Oxygen consumption limited runners’ speeds only on uphills, speed on the level was systematically influenced by preceding gradients, while there was a much larger individual variation on downhill sections. Individuals were found to adopt distinct but unrelated pacing strategies as a function of durations and gradients, while runners who varied pace more as a function of gradient showed a more consistent level of oxygen consumption. Finally, the implementation of an individualised pacing strategy to account for gradients and transitions greatly increased runners’ range of uphill-downhill speeds and was able to improve performance in some runners. The efficiency of various gradient-speed trade-offs and the factors limiting faster downhill speeds will however require further investigation to further improve the effectiveness of the suggested strategy.
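The Δ GPS position/time speed estimate evaluated in study one can be sketched as the distance between consecutive fixes divided by the fix interval; the coordinates and 1 Hz fix rate below are invented demo values:

```python
import math

# Great-circle (haversine) distance between two lat/lon fixes, in metres.
def haversine_m(lat1, lon1, lat2, lon2):
    R = 6371000.0                      # mean Earth radius, metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * R * math.asin(math.sqrt(a))

# Speed series from consecutive fixes taken dt seconds apart.
def speeds(fixes, dt=1.0):             # fixes: [(lat, lon), ...]
    return [haversine_m(*fixes[i], *fixes[i + 1]) / dt
            for i in range(len(fixes) - 1)]

# Hypothetical runner heading due north at about 4 m/s
# (4 m is roughly 3.6e-5 degrees of latitude).
fixes = [(-27.0 + i * 3.595e-5, 153.0) for i in range(5)]
print([round(v, 2) for v in speeds(fixes)])
```

The Doppler-shift method favoured in the thesis reads velocity directly from the carrier frequency instead of differencing noisy position fixes, which is why it showed the smaller errors.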
Abstract:
This thesis deals with the problem of instantaneous frequency (IF) estimation of sinusoidal signals. This topic plays a significant role in signal processing and communications. Depending on the type of the signal, two major approaches are considered. For IF estimation of single-tone or digitally-modulated sinusoidal signals (like frequency shift keying signals) the approach of digital phase-locked loops (DPLLs) is considered, and this is Part-I of this thesis. For FM signals the approach of time-frequency analysis is considered, and this is Part-II of the thesis. In Part-I we have utilized sinusoidal DPLLs with a non-uniform sampling scheme, as this type is widely used in communication systems. The digital tanlock loop (DTL) has introduced significant advantages over other existing DPLLs. In the last 10 years many efforts have been made to improve DTL performance. However, this loop and all of its modifications utilize a Hilbert transformer (HT) to produce a signal-independent 90-degree phase-shifted version of the input signal. The Hilbert transformer can be realized approximately using a finite impulse response (FIR) digital filter. This realization introduces further complexity into the loop, in addition to approximations and frequency limitations on the input signal. We have tried to avoid the practical difficulties associated with the conventional tanlock scheme while keeping its advantages. A time-delay is utilized in the tanlock scheme of the DTL to produce a signal-dependent phase shift. This gave rise to the time-delay digital tanlock loop (TDTL). Fixed point theorems are used to analyze the behavior of the new loop. As such, TDTL combines the two major approaches in DPLLs: the non-linear approach of the sinusoidal DPLL based on fixed point analysis, and the linear tanlock approach based on arctan phase detection. TDTL preserves the main advantages of the DTL despite its reduced structure. An application of TDTL in FSK demodulation is also considered.
This idea of replacing the HT by a time-delay may be of interest in other signal processing systems. Hence we have analyzed and compared the behaviors of the HT and the time-delay in the presence of additive Gaussian noise. Based on the above analysis, the behavior of the first and second-order TDTLs has been analyzed in additive Gaussian noise. Since DPLLs need time for locking, they are normally not efficient in tracking the continuously changing frequencies of non-stationary signals, i.e. signals with time-varying spectra. Nonstationary signals are of importance in synthetic and real-life applications. An example is the frequency-modulated (FM) signals widely used in communication systems. Part-II of this thesis is dedicated to the IF estimation of non-stationary signals. For such signals the classical spectral techniques break down, due to the time-varying nature of their spectra, and more advanced techniques should be utilized. For the purpose of instantaneous frequency estimation of non-stationary signals there are two major approaches: parametric and non-parametric. We chose the non-parametric approach, which is based on time-frequency analysis. This approach is computationally less expensive and more effective in dealing with multicomponent signals, which are the main aim of this part of the thesis. A time-frequency distribution (TFD) of a signal is a two-dimensional transformation of the signal to the time-frequency domain. Multicomponent signals can be identified by multiple energy peaks in the time-frequency domain. Many real-life and synthetic signals are of a multicomponent nature and there is little in the literature concerning IF estimation of such signals. This is why we have concentrated on multicomponent signals in Part-II. An adaptive algorithm for IF estimation using quadratic time-frequency distributions has been analyzed. A class of time-frequency distributions that are more suitable for this purpose has been proposed.
The kernels of this class are time-only or one-dimensional, rather than the time-lag (two-dimensional) kernels, and hence this class has been named the T-class. If the parameters of these TFDs are properly chosen, they are more efficient than the existing fixed-kernel TFDs in terms of resolution (energy concentration around the IF) and artifact reduction. The T-distributions have been used in the adaptive IF algorithm and proved to be efficient in tracking rapidly changing frequencies. They also enable direct amplitude estimation for the components of a multicomponent signal.
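The analytic-signal route to IF estimation touched on in this abstract can be sketched with an FFT-based Hilbert transform. The linear-FM chirp parameters below are invented for illustration; this is the basic phase-difference estimator, not the thesis's adaptive TFD algorithm:

```python
import numpy as np

# Analytic signal via an FFT-based Hilbert transform: zero the negative
# frequencies and double the positive ones.
def analytic_signal(x):
    X = np.fft.fft(x)
    h = np.zeros(len(x))
    h[0] = 1.0
    h[1:(len(x) + 1) // 2] = 2.0
    if len(x) % 2 == 0:
        h[len(x) // 2] = 1.0          # Nyquist bin for even lengths
    return np.fft.ifft(X * h)

fs = 1000.0                            # sample rate, Hz (demo value)
t = np.arange(0, 1.0, 1.0 / fs)
f0, rate = 50.0, 100.0                 # chirp: 50 Hz sweeping up at 100 Hz/s
x = np.cos(2 * np.pi * (f0 * t + 0.5 * rate * t ** 2))

# IF from the derivative (here: first difference) of the unwrapped phase.
phase = np.unwrap(np.angle(analytic_signal(x)))
inst_freq = np.diff(phase) * fs / (2 * np.pi)
print(round(float(inst_freq[500]), 1))   # mid-signal IF, near 50 + 100*0.5 Hz
```

This simple estimator degrades quickly in noise and fails for multicomponent signals, which is precisely where the quadratic TFDs and the proposed T-class are aimed.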
Abstract:
Background The residue-wise contact order (RWCO) describes the sequence separations between a residue of interest and its contacting residues in a protein sequence. It is a new kind of one-dimensional protein structure descriptor that represents the extent of long-range contacts and is considered a generalization of contact order. Together with secondary structure, accessible surface area, the B factor, and contact number, RWCO provides comprehensive and indispensable information for reconstructing a protein's three-dimensional structure from a set of one-dimensional structural properties. Accurately predicting RWCO values could have many important applications in protein three-dimensional structure prediction and protein folding rate prediction, and give deep insights into protein sequence-structure relationships. Results We developed a novel approach to predict residue-wise contact order values in proteins based on support vector regression (SVR), starting from primary amino acid sequences. We explored seven different sequence encoding schemes to examine their effects on the prediction performance, including local sequence in the form of PSI-BLAST profiles, local sequence plus amino acid composition, local sequence plus molecular weight, local sequence plus secondary structure predicted by PSIPRED, local sequence plus molecular weight and amino acid composition, local sequence plus molecular weight and predicted secondary structure, and local sequence plus molecular weight, amino acid composition and predicted secondary structure. When using local sequences with multiple sequence alignments in the form of PSI-BLAST profiles, we could predict the RWCO distribution with a Pearson correlation coefficient (CC) between the predicted and observed RWCO values of 0.55, and a root mean square error (RMSE) of 0.82, based on a well-defined dataset with 680 protein sequences.
Moreover, by incorporating global features such as molecular weight and amino acid composition we could further improve the prediction performance, raising the CC to 0.57 and lowering the RMSE to 0.79. In addition, combining the secondary structure predicted by PSIPRED was found to significantly improve the prediction performance and yielded the best prediction accuracy, with a CC of 0.60 and an RMSE of 0.78, performance at least comparable to that of the other existing methods. Conclusion The SVR method shows a prediction performance competitive with or at least comparable to the previously developed linear regression-based methods for predicting RWCO values. In contrast to support vector classification (SVC), SVR is very good at estimating the raw value profiles of the samples. The successful application of the SVR approach in this study reinforces the fact that support vector regression is a powerful tool in extracting the protein sequence-structure relationship and in estimating protein structural profiles from amino acid sequences.
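The encode-then-regress pipeline, and the CC/RMSE evaluation used above, can be illustrated with a toy stand-in. Note the substitutions: plain least squares on one-hot windows of a random "sequence" with a synthetic target, rather than SVR on PSI-BLAST profiles with real RWCO values:

```python
import numpy as np

# Toy local-sequence encoding: one-hot vectors over 20 residue types for
# a sliding window of width W, regressed onto a synthetic per-residue
# target that is a hidden linear function of the window.
rng = np.random.default_rng(3)
seq = rng.integers(0, 20, size=400)          # mock sequence (20 residue types)
W = 5                                        # local-sequence window size

def encode(seq, i, w=W):                     # one-hot window centred at i
    v = np.zeros(w * 20)
    for j, k in enumerate(range(i - w // 2, i + w // 2 + 1)):
        v[j * 20 + seq[k]] = 1.0
    return v

true_w = rng.standard_normal(W * 20)         # hidden "ground truth" weights
idx = np.arange(W // 2, len(seq) - W // 2)
X = np.array([encode(seq, i) for i in idx])
y = X @ true_w + 0.1 * rng.standard_normal(len(idx))

ntr = 300                                    # train / test split
coef, *_ = np.linalg.lstsq(X[:ntr], y[:ntr], rcond=None)
pred = X[ntr:] @ coef

cc = float(np.corrcoef(pred, y[ntr:])[0, 1])            # Pearson CC
rmse = float(np.sqrt(np.mean((pred - y[ntr:]) ** 2)))   # RMSE
print(round(cc, 2))
```

Because the toy target really is linear in the features, the CC here lands far above the paper's 0.55-0.60 range; real RWCO prediction is much harder, hence the move to SVR and richer profile features.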
Abstract:
Obesity is a major public health problem in both developed and developing countries. The body mass index (BMI) is the most common index used to define obesity. The universal application of the same BMI classification across different ethnic groups is being challenged because the index cannot differentiate fat mass (FM) from fat-free mass (FFM), and because of recognized ethnic differences in body composition. A better understanding of the body composition of Asian children from different backgrounds would help to better understand the obesity-related health risks of people in this region. Moreover, the limitations of BMI underscore the necessity of using, where possible, more accurate measures of body fat in research and clinical settings in addition to BMI, particularly when monitoring prevention and treatment efforts. The aim of the first study was to determine the ethnic differences in the relationship between BMI and percent body fat (%BF) in pre-pubertal Asian children from China, Lebanon, Malaysia, the Philippines, and Thailand. A total of 1039 children aged 8-10 y were recruited from the five countries using a non-random purposive sampling approach aiming to encompass a wide BMI range. Percent body fat was determined using the deuterium dilution technique to quantify total body water (TBW) and subsequently derive the proportions of FM and FFM. The study highlighted the sex and ethnic differences in the BMI-%BF relationship in Asian children from different countries. Girls had approximately 4.0% higher %BF than boys at a given BMI. Filipino boys tended to have a lower %BF than their Chinese, Lebanese, Malay and Thai counterparts at the same age and BMI level (corrected mean %BF was 25.7±0.8%, 27.4±0.4%, 27.1±0.6%, 27.7±0.5%, and 28.1±0.5% for Filipino, Chinese, Lebanese, Malay and Thai boys, respectively), although the difference was significant only relative to Thai and Malay boys.
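The deuterium-dilution route from TBW to %BF described above can be sketched as follows. The FFM hydration fraction of 0.73 used here is an assumed illustrative constant (the classic adult value); age- and sex-specific hydration values are normally used for children.

```python
def percent_body_fat(tbw_kg, weight_kg, ffm_hydration=0.73):
    """Derive %BF from total body water: FFM = TBW / hydration fraction,
    FM = body weight - FFM, %BF = 100 * FM / body weight."""
    ffm = tbw_kg / ffm_hydration
    fm = weight_kg - ffm
    return 100.0 * fm / weight_kg
```

For example, a 40 kg child with 21.9 kg of TBW would have an FFM of 30 kg and therefore 25% body fat under this assumed hydration constant.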
Thai girls had approximately 2.0% higher %BF values than their Chinese, Lebanese, Filipino and Malay counterparts at a given BMI (corrected mean %BF was 31.1±0.5%, 28.6±0.4%, 29.2±0.6%, 29.5±0.6%, and 29.5±0.5% for Thai, Chinese, Lebanese, Malay and Filipino girls, respectively), although no significant difference was seen among the latter four ethnic groups. However, the ethnic difference in the BMI-%BF relationship varied with BMI. Compared with Caucasians, Asian children had a BMI 3-6 units lower for a given %BF. More than one third of the obese Asian children in the study were not identified using the WHO classification, and more than half were not identified using the International Obesity Task Force (IOTF) classification. However, use of the Chinese classification increased the sensitivity by 19.7%, 18.1%, 2.3%, 2.3%, and 11.3% for Chinese, Lebanese, Malay, Filipino and Thai girls, respectively. A further aim of the first study was to determine the ethnic difference in body fat distribution in pre-pubertal Asian children from China, Lebanon, Malaysia, and Thailand. The skinfold thicknesses, height, weight, waist circumference (WC) and total adiposity (as determined by the deuterium dilution technique) of 922 children from the four countries were assessed. Chinese boys and girls had a trunk-to-extremity skinfold thickness ratio similar to that of their Thai counterparts, and both groups had higher ratios than the Malays and Lebanese at a given total FM. At a given BMI, both Chinese and Thai boys and girls had a higher WC than Malays and Lebanese (corrected mean WC was 68.1±0.2 cm, 67.8±0.3 cm, 65.8±0.4 cm, and 64.1±0.3 cm for Chinese, Thai, Lebanese and Malay boys, respectively; 64.2±0.2 cm, 65.0±0.3 cm, 62.9±0.4 cm, and 60.6±0.3 cm for Chinese, Thai, Lebanese and Malay girls, respectively). Chinese boys and girls had a lower trunk-fat-adjusted subscapular/suprailiac skinfold ratio than their Lebanese and Malay counterparts.
The second study aimed to develop and cross-validate bioelectrical impedance analysis (BIA) prediction equations of TBW and FFM for Asian pre-pubertal children from China, Lebanon, Malaysia, the Philippines, and Thailand. Data on height, weight, age, gender, and the resistance and reactance measured by BIA were collected from 948 Asian children (492 boys and 456 girls) aged 8-10 y from the five countries. The deuterium dilution technique was used as the criterion method for the estimation of TBW and FFM. The BIA equations were developed in a validation group (630 children randomly selected from the total sample) using stepwise multiple regression analysis and cross-validated in a separate group (318 children) using the Bland-Altman approach. Age, gender and ethnicity influenced the relationship between the resistance index (RI = height²/resistance), TBW and FFM. The BIA prediction equation for the estimation of TBW was: TBW (kg) = 0.231×height² (cm²)/resistance (Ω) + 0.066×height (cm) + 0.188×weight (kg) + 0.128×age (yr) + 0.500×sex (male=1, female=0) − 0.316×ethnicity (Thai=1, others=0) − 4.574; and for the estimation of FFM: FFM (kg) = 0.299×height² (cm²)/resistance (Ω) + 0.086×height (cm) + 0.245×weight (kg) + 0.260×age (yr) + 0.901×sex (male=1, female=0) − 0.415×ethnicity (Thai=1, others=0) − 6.952. The R² was 88.0% (root mean square error, RMSE = 1.3 kg) and 88.3% (RMSE = 1.7 kg) for the TBW and FFM equations, respectively. No significant difference was found between measured and predicted TBW or between measured and predicted FFM for the whole cross-validation sample (bias = −0.1±1.4 kg, pure error = 1.4±2.0 kg for TBW; bias = −0.2±1.9 kg, pure error = 1.8±2.6 kg for FFM). However, the prediction equations tended to overestimate TBW and FFM at lower levels and underestimate them at higher levels.
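The two BIA prediction equations transcribed into code as a sketch. Note one caveat: the sign of the Thai-ethnicity coefficient in the TBW equation is garbled in the source abstract and is assumed negative here, in parallel with the FFM equation.

```python
def predict_tbw(height_cm, resistance_ohm, weight_kg, age_yr, male, thai):
    """TBW (kg) from the thesis' BIA equation (coefficients as reported;
    the sign of the Thai-ethnicity term is an assumption)."""
    ri = height_cm ** 2 / resistance_ohm  # resistance index, cm^2/ohm
    return (0.231 * ri + 0.066 * height_cm + 0.188 * weight_kg
            + 0.128 * age_yr + 0.500 * int(male) - 0.316 * int(thai) - 4.574)

def predict_ffm(height_cm, resistance_ohm, weight_kg, age_yr, male, thai):
    """FFM (kg) from the corresponding BIA equation."""
    ri = height_cm ** 2 / resistance_ohm
    return (0.299 * ri + 0.086 * height_cm + 0.245 * weight_kg
            + 0.260 * age_yr + 0.901 * int(male) - 0.415 * int(thai) - 6.952)
```

For a hypothetical 9-year-old non-Thai boy of height 140 cm, weight 35 kg and resistance 700 Ω, these equations give a TBW of about 19.4 kg and an FFM of about 25.3 kg, consistent with FFM being roughly TBW divided by its hydration fraction.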
The accuracy of the general equations for TBW and FFM compared favorably with both BMI-specific and ethnic-specific equations. There were significant differences between the TBW and FFM predicted from external BIA equations derived in Caucasian populations and the values measured in Asian children. The third study had three specific aims. The first was to explore the relationship between obesity and the metabolic syndrome and metabolic abnormalities in Chinese children. A total of 608 boys and 800 girls aged 6-12 y were recruited from four cities in China. Three definitions of the pediatric metabolic syndrome and metabolic abnormalities were used: the International Diabetes Federation (IDF) definition and the National Cholesterol Education Program (NCEP) definition for adults as modified by Cook et al. and by de Ferranti et al. The prevalence of the metabolic syndrome varied with the definition used: it was highest using the de Ferranti definition (5.4%, 24.6% and 42.0% for normal-weight, overweight and obese children, respectively), followed by the Cook definition (1.5%, 8.1% and 25.1%, respectively) and the IDF definition (0.5%, 1.8% and 8.3%, respectively). Overweight and obese children had a higher risk of developing the metabolic syndrome than normal-weight children (the odds ratio varied with the definition, from 3.958 to 6.866 for overweight children and from 12.640 to 26.007 for obese children). Overweight and obesity also increased the risk of developing metabolic abnormalities. Central obesity and high triglycerides (TG) were the most common abnormalities, while hyperglycemia was the least frequent in Chinese children, regardless of the definition. The second aim was to determine the best obesity index among BMI, %BF, WC and waist-to-height ratio (WHtR) for the prediction of cardiovascular (CV) risk factor clustering across a 2-y follow-up in Chinese children.
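For reference, the odds ratios reported above come from 2×2 counts of outcome by weight status; a minimal sketch (the example counts in the test are made up, not the study's data):

```python
def odds_ratio(cases_exposed, noncases_exposed, cases_unexposed, noncases_unexposed):
    """Odds ratio of an outcome (e.g. metabolic syndrome) in an exposed group
    (e.g. overweight children) relative to an unexposed group (normal weight):
    the ratio of the odds of the outcome in the two groups."""
    return (cases_exposed / noncases_exposed) / (cases_unexposed / noncases_unexposed)
```

An odds ratio above 1 indicates the outcome is more likely in the exposed group, as with the overweight and obese children here.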
Height, weight, WC, %BF as determined by BIA, blood pressure, TG, high-density lipoprotein cholesterol (HDL-C), and fasting glucose were collected at baseline and 2 years later in 292 boys and 277 girls aged 8-10 y. The percentage of children who remained overweight/obese after 2 years, defined on the basis of BMI, WC, WHtR and %BF, was 89.7%, 93.5%, 84.5%, and 80.4%, respectively. Obesity indices at baseline correlated significantly with TG, HDL-C, and blood pressure both at baseline and 2 years later, with a similar strength of correlation. BMI at baseline explained the greatest variance in later blood pressure. WC at baseline explained the greatest variance in later HDL-C and glucose, while WHtR at baseline was the main predictor of later TG. Receiver-operating characteristic (ROC) analysis was used to explore the ability of the four indices to identify the later presence of CV risk. Overweight/obese children defined on the basis of BMI, WC, WHtR or %BF were more likely to develop CV risk 2 years later, with relative risk (RR) scores of 3.670, 3.762, 2.767, and 2.804, respectively. The final aim of the third study was to develop age- and gender-specific percentiles of WC and WHtR, and cut-off points of WC and WHtR for the prediction of CV risk, in Chinese children. Smoothed percentile curves of WC and WHtR were produced for 2830 boys and 2699 girls aged 6-12 y randomly selected from southern and northern China using the LMS method. The optimal age- and gender-specific thresholds of WC and WHtR for the prediction of cardiovascular risk factor clustering were derived in a sub-sample (n=1845) by ROC analysis. Age- and gender-specific WC and WHtR percentiles were constructed. The WC thresholds were at the 90th and 84th percentiles for Chinese boys and girls, respectively, with sensitivity and specificity ranging from 67.2% to 83.3%.
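A common way to derive such ROC-based cut-offs is to choose the threshold that maximizes Youden's J (sensitivity + specificity - 1) over all candidate values; the abstract does not state which criterion the thesis used, so the sketch below is illustrative only.

```python
def youden_threshold(values, labels):
    """Return the cut-off t on `values` (e.g. WC in cm) that maximizes
    Youden's J = sensitivity + specificity - 1 for binary `labels`
    (True = risk-factor clustering present), classifying value >= t as positive."""
    best_t, best_j = None, -1.0
    for t in sorted(set(values)):
        tp = sum(v >= t and y for v, y in zip(values, labels))
        fp = sum(v >= t and not y for v, y in zip(values, labels))
        fn = sum(v < t and y for v, y in zip(values, labels))
        tn = sum(v < t and not y for v, y in zip(values, labels))
        sens = tp / (tp + fn) if tp + fn else 0.0
        spec = tn / (tn + fp) if tn + fp else 0.0
        if sens + spec - 1.0 > best_j:
            best_j, best_t = sens + spec - 1.0, t
    return best_t, best_j
```

The chosen threshold would then be mapped back onto the age- and gender-specific percentile curves, as with the 90th/84th WC percentiles reported above.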
The WHtR thresholds were at the 91st and 94th percentiles for Chinese boys and girls, respectively, with sensitivity and specificity ranging from 78.6% to 88.9%. The cut-offs of both WC and WHtR were age- and gender-dependent. In conclusion, this thesis quantifies the ethnic differences in the BMI-%BF relationship and in body fat distribution among Asian children of different origins, and confirms the necessity of considering ethnic differences in body composition when developing BMI and other obesity-index criteria for Asian children. Moreover, ethnicity is also important in BIA prediction equations. In addition, the WC and WHtR percentiles and thresholds for the prediction of CV risk in Chinese children differ from those of other populations. Although WC and WHtR showed no advantage over BMI or %BF in the prediction of CV risk, obese children had a higher risk of developing the metabolic syndrome and metabolic abnormalities than normal-weight children, regardless of the obesity index used.