57 results for Collinearity


Relevance: 20.00%

Abstract:

Research problem: Overfitting and collinearity problems are common in current construction cost estimation applications and prevent researchers and practitioners from achieving better modelling results. Research objective and method: A hybrid approach combining Akaike information criterion (AIC) stepwise regression and principal component regression (PCR) is proposed to help solve overfitting and collinearity problems. The use of this approach in linear regression is validated by comparing it with other commonly used approaches. The mean square error obtained by leave-one-out cross validation (MSE_LOOCV) is used in model selection to decide on the predictive variables.
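As a rough illustration of how a leave-one-out mean square error can rank candidate predictors, here is a minimal sketch in plain Python. It is not the authors' hybrid AIC/PCR procedure: it fits only one-predictor least-squares lines, and the data and the names `area`, `noise` and `cost` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def mse_loocv(xs, ys):
    """Mean squared prediction error under leave-one-out cross-validation."""
    errors = []
    for i in range(len(xs)):
        # Refit on every observation except the i-th, then predict the i-th.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        a, b = fit_line(train_x, train_y)
        errors.append((ys[i] - (a + b * xs[i])) ** 2)
    return sum(errors) / len(errors)


# Hypothetical toy data: cost tracks floor area, not the unrelated column.
area = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
noise = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0]
cost = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]

scores = {"area": mse_loocv(area, cost), "noise": mse_loocv(noise, cost)}
best = min(scores, key=scores.get)  # predictor with the lowest MSE_LOOCV
```

The candidate with the smallest out-of-sample error is kept, which penalizes predictors that only fit the training points well.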

Relevance: 20.00%

Abstract:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevance: 20.00%

Abstract:

Meindl et al. (Adv Space Res 51(7):1047–1064, 2013) showed that the geocenter z-component estimated from observations of global navigation satellite systems (GNSS) is strongly correlated with a particular parameter of the solar radiation pressure (SRP) model developed by Beutler et al. (Manuscr Geod 19:367–386, 1994). They analyzed the forces caused by SRP and the impact on the satellites’ orbits. The authors achieved their results using perturbation theory and celestial mechanics. Rebischung et al. (J Geod doi:10.1016/j.asr.2012.10.026, 2013) also deal with geocenter determination with GNSS. The authors carried out a collinearity diagnosis of the associated parameter estimation problem. They conclude “without much exaggerating that current GNSS are insensitive to any component of geocenter motion”. They explain this inability by the high degree of collinearity of the geocenter coordinates, mainly with satellite clock corrections. Based on these results and additional experiments, they state that the conclusions drawn by Meindl et al. (Adv Space Res 51(7):1047–1064, 2013) are questionable. We do not agree with these conclusions and present our arguments in this article. In the first part, we review and highlight the main characteristics of the studies performed by Meindl et al. (Adv Space Res 51(7):1047–1064, 2013) to show that the experiments are quite different from those performed by Rebischung et al. (J Geod doi:10.1016/j.asr.2012.10.026, 2013). In the second part, we show that normal equation (NEQ) systems are regular when estimating geocenter coordinates, implying that the covariance matrices associated with the NEQ systems may be used to assess the sensitivity to geocenter coordinates in a standard way. The sensitivity of GNSS to the components of the geocenter is discussed. Finally, we comment on the arguments raised by Rebischung et al. (J Geod doi:10.1016/j.asr.2012.10.026, 2013) against the results of Meindl et al. (Adv Space Res 51(7):1047–1064, 2013).

Relevance: 10.00%

Abstract:

We present multifrequency Very Large Array (VLA) observations of two giant quasars, 0437-244 and 1025-229, from the Molonglo Complete Sample. These sources have well-defined FR II radio structure, possible one-sided jets, no significant depolarization between 1365 and 4935 MHz and low rotation measure (|RM| < 20 rad m^-2). The giant sources are defined to be those with overall projected size greater than or equal to 1 Mpc. We have compiled a sample of about 50 known giant radio sources from the literature, and have compared some of their properties with a complete sample of 3CR radio sources of smaller sizes to investigate the evolution of giant sources, and test their consistency with the unified scheme for radio galaxies and quasars. We find an inverse correlation between the degree of core prominence and total radio luminosity, and show that the giant radio sources have similar core strengths to smaller sources of similar total luminosity. Hence their large sizes are unlikely to be caused by stronger nuclear activity. The degree of collinearity of the giant sources is also similar to that of the sample of smaller sources. The luminosity-size diagram shows that the giant sources are less luminous than our sample of smaller sized 3CR sources, consistent with evolutionary scenarios in which the giants have evolved from the smaller sources, losing energy as they expand to these large dimensions. For the smaller sources, radiative losses resulting from synchrotron radiation are more significant, while for the giant sources the equipartition magnetic fields are smaller and inverse Compton loss owing to microwave background radiation is the dominant process. The radio properties of the giant radio galaxies and quasars are consistent with the unified scheme.

Relevance: 10.00%

Abstract:

The Hanuman langur is one of the most widely distributed and extensively studied non-human diurnal primates in India. Until recently it was believed to be a single species, Semnopithecus entellus. Recent molecular and morphological studies suggest that the Hanuman langurs consist of at least three species: S. entellus, S. hypoleucos and S. priam. Furthermore, morphological studies suggested that S. hypoleucos and S. priam each have at least three subspecies. We explored the use of ecological niche modeling (ENM) to confirm the validity of these seven taxa and an additional taxon, S. johnii, belonging to the same genus. The MaxEnt modeling tool was used with 19 bioclimatic, 12 vegetation and 6 hydrological environmental layers. We reduced the total environmental variables to 14 layers after testing for collinearity, and an independent test for model prediction was done using ENMTools. A total of 196 non-overlapping data points from primary and secondary sources were used as inputs for ENM. Results showed eight distinct ecological boundaries, corroborating the eight taxa mentioned above and thereby confirming their validity. The study, for the first time, provided ecological variables that determined the ecological requirements and distribution of members of the Hanuman langur species complex in the Indian peninsula.

Relevance: 10.00%

Abstract:

Defining types of seafloor substrate and relating them to the distribution of fish and invertebrates is an important but difficult goal. An examination of the processing steps of a commercial acoustics analyzing software program, as well as the data values produced by the proprietary first echo measurements, revealed potential benefits and drawbacks for distinguishing acoustically distinct seafloor substrates. The positive aspects were convenient processing steps such as gain adjustment, accurate bottom picking, ease of bad data exclusion, and the ability to average across successive pings in order to increase the signal-to-noise ratio. A noteworthy drawback with the processing was the potential for accidental inclusion of a second echo as if it were part of the first echo. Detailed examination of the echogram measurements quantified the amount of collinearity, revealed the lack of standardization (subtraction of mean, division by standard deviation) before principal components analysis (PCA), and showed correlations of individual echogram measurements with depth and seafloor slope. Despite the facility of the software, these previously unknown processing pitfalls and echogram measurement characteristics may have created data artifacts that generated user-derived substrate classifications, rather than actual seafloor substrate types.
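The standardization step named in this abstract (subtract the mean, divide by the standard deviation) and a simple pairwise measure of collinearity can be sketched as follows. This is a minimal illustration and not the software's own processing; the measurement names `energy` and `rise_time` and their values are hypothetical.

```python
from statistics import mean, pstdev


def standardize(values):
    """Subtract the mean and divide by the (population) standard deviation."""
    m = mean(values)
    s = pstdev(values)
    return [(v - m) / s for v in values]


def pearson(xs, ys):
    """Pearson correlation, computed as the mean product of z-scores."""
    zx, zy = standardize(xs), standardize(ys)
    return sum(a * b for a, b in zip(zx, zy)) / len(xs)


# Hypothetical echo measurements recorded on very different scales.
energy = [120.0, 150.0, 180.0, 210.0, 260.0]
rise_time = [0.12, 0.16, 0.17, 0.22, 0.25]

z_energy = standardize(energy)   # zero mean, unit standard deviation
r = pearson(energy, rise_time)   # values near +1 or -1 indicate collinearity
```

Without standardization, a PCA on raw values would be dominated by whichever measurement has the largest numeric range, regardless of how informative it is.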

Relevance: 10.00%

Abstract:

Spatial pattern metrics have routinely been applied to characterize and quantify structural features of terrestrial landscapes and have demonstrated great utility in landscape ecology and conservation planning. The important role of spatial structure in ecology and management is now commonly recognized, and recent advances in marine remote sensing technology have facilitated the application of spatial pattern metrics to the marine environment. However, it is not yet clear whether concepts, metrics, and statistical techniques developed for terrestrial ecosystems are relevant for marine species and seascapes. To address this gap in our knowledge, we reviewed, synthesized, and evaluated the utility and application of spatial pattern metrics in the marine science literature over the past 30 yr (1980 to 2010). In total, 23 studies characterized seascape structure, of which 17 quantified spatial patterns using a 2-dimensional patch-mosaic model and 5 used a continuously varying 3-dimensional surface model. Most seascape studies followed terrestrial-based studies in their search for ecological patterns and applied or modified existing metrics. Only 1 truly unique metric was found (hydrodynamic aperture applied to Pacific atolls). While there are still relatively few studies using spatial pattern metrics in the marine environment, they have suffered from similar misuse as reported for terrestrial studies, such as the lack of a priori considerations or the problem of collinearity between metrics. Spatial pattern metrics offer great potential for ecological research and environmental management in marine systems, and future studies should focus on (1) the dynamic boundary between the land and sea; (2) quantifying 3-dimensional spatial patterns; and (3) assessing and monitoring seascape change.

Relevance: 10.00%

Abstract:

Liu, Yonghuai. Improving ICP with Easy Implementation for Free Form Surface Matching. Pattern Recognition, vol. 37, no. 2, pp. 211-226, 2004.

Relevance: 10.00%

Abstract:

Background: Selection bias in HIV prevalence estimates occurs if non-participation in testing is correlated with HIV status. Longitudinal data suggest that individuals who know or suspect they are HIV positive are less likely to participate in testing in HIV surveys, in which case methods to correct for missing data which are based on imputation and observed characteristics will produce biased results. Methods: The identity of the HIV survey interviewer is typically associated with HIV testing participation, but is unlikely to be correlated with HIV status. Interviewer identity can thus be used as a selection variable allowing estimation of Heckman-type selection models. These models produce asymptotically unbiased HIV prevalence estimates, even when non-participation is correlated with unobserved characteristics, such as knowledge of HIV status. We introduce a new random effects method to these selection models which overcomes non-convergence caused by collinearity, small sample bias, and incorrect inference in existing approaches. Our method is easy to implement in standard statistical software, and allows the construction of bootstrapped standard errors which adjust for the fact that the relationship between testing and HIV status is uncertain and needs to be estimated. Results: Using nationally representative data from the Demographic and Health Surveys, we illustrate our approach with new point estimates and confidence intervals (CI) for HIV prevalence among men in Ghana (2003) and Zambia (2007). In Ghana, we find little evidence of selection bias as our selection model gives an HIV prevalence estimate of 1.4% (95% CI 1.2% – 1.6%), compared to 1.6% among those with a valid HIV test. In Zambia, our selection model gives an HIV prevalence estimate of 16.3% (95% CI 11.0% – 18.4%), compared to 12.1% among those with a valid HIV test. Therefore, those who decline to test in Zambia are found to be more likely to be HIV positive.
Conclusions: Our approach corrects for selection bias in HIV prevalence estimates, is possible to implement even when HIV prevalence or non-participation is very high or very low, and provides a practical solution to account for both sampling and parameter uncertainty in the estimation of confidence intervals. The wide confidence intervals estimated in an example with high HIV prevalence indicate that it is difficult to correct statistically for the bias that may occur when a large proportion of people refuse to test.

Relevance: 10.00%

Abstract:

Statistical techniques are fundamental in science, and linear regression analysis is perhaps one of the most widely used methodologies. It is well known from the literature that, under certain conditions, linear regression is an extremely powerful statistical tool. Unfortunately, in practice, some of these conditions are rarely satisfied and the regression models become ill-posed, which precludes the use of traditional estimation methods. This work presents some contributions to maximum entropy theory in the estimation of ill-posed models, in particular the estimation of linear regression models with small samples affected by collinearity and outliers. The research is developed along three lines, namely the estimation of technical efficiency with state-contingent production frontiers, the estimation of the ridge parameter in ridge regression and, finally, new developments in maximum entropy estimation. In the estimation of technical efficiency with state-contingent production frontiers, the work shows that maximum entropy estimators perform better than the maximum likelihood estimator. This good performance is notable in models with few observations per state and in models with a large number of states, which are commonly affected by collinearity. The use of maximum entropy estimators is expected to contribute to the much-desired increase in empirical work with these production frontiers. In ridge regression the greatest challenge is the estimation of the ridge parameter. Although numerous procedures are available in the literature, none outperforms all the others. This work proposes a new estimator of the ridge parameter that combines analysis of the ridge trace with maximum entropy estimation.
The results of the simulation studies suggest that this new estimator is among the best procedures in the literature for estimating the ridge parameter. The Leuven maximum entropy estimator is based on the least squares method, Shannon entropy and concepts from quantum electrodynamics. This estimator overcomes the main criticism levelled at the generalized maximum entropy estimator, since it dispenses with the supports for the parameters and errors of the regression model. This work presents new contributions to maximum entropy theory in the estimation of ill-posed models, based on the Leuven maximum entropy estimator, information theory and robust regression. The estimators developed perform well in linear regression models with small samples affected by collinearity and outliers. Finally, some computer code for maximum entropy estimation is presented, thereby adding to the scarce computational resources currently available.
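For readers unfamiliar with the ridge parameter mentioned here, the following is a minimal sketch of ordinary ridge regression with two nearly collinear predictors, evaluated over a few values of the ridge parameter k (a crude ridge trace). It does not implement the maximum entropy estimator proposed in the work; the function name `ridge_2d` and the data are hypothetical.

```python
import math


def ridge_2d(x1, x2, y, k):
    """Solve (X'X + k*I) b = X'y for two predictors without an intercept."""
    a11 = sum(u * u for u in x1) + k
    a22 = sum(v * v for v in x2) + k
    a12 = sum(u * v for u, v in zip(x1, x2))
    c1 = sum(u * t for u, t in zip(x1, y))
    c2 = sum(v * t for v, t in zip(x2, y))
    det = a11 * a22 - a12 * a12
    # Cramer's rule for the 2x2 linear system.
    return ((c1 * a22 - a12 * c2) / det, (a11 * c2 - a12 * c1) / det)


# Hypothetical, nearly collinear predictors (x2 is x1 plus small noise).
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x2 = [-2.1, -0.9, 0.1, 0.9, 2.0]
y = [-4.2, -1.8, 0.3, 1.7, 4.1]

# Coefficients for increasing k; k = 0 is ordinary least squares.
trace = {k: ridge_2d(x1, x2, y, k) for k in (0.0, 0.1, 1.0, 10.0)}
norms = {k: math.hypot(*b) for k, b in trace.items()}
```

Inspecting how the coefficients in `trace` stabilize as k grows is the idea behind the ridge trace; choosing k remains the difficult part that the abstract refers to.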

Relevance: 10.00%

Abstract:

The aim of this thesis is to study the differentiated impact of satisfaction with individual bonuses and with collective bonuses on the intention to stay (within a given company) of workers in the information and communications technology (ICT) sector. To study this question, three research hypotheses were formulated using the following theories: 1- agency theory, 2- expectancy theory and 3- Blau's (1964) social exchange theory. According to the first hypothesis, satisfaction with individual bonuses increases ICT workers' intention to stay. The second hypothesis holds that satisfaction with collective bonuses increases ICT workers' intention to stay. Finally, the last hypothesis holds that satisfaction with individual bonuses has a greater impact on ICT workers' intention to stay than satisfaction with collective bonuses. The data used to test our hypotheses were collected as part of a survey on "the relationships between compensation, training and skills development and the attraction and retention of key employees". These longitudinal data come from a Canadian company in the ICT sector. The population studied comprises new employees hired between April 1, 2009 and September 30, 2010. Our results confirm Hypothesis 1, that satisfaction with individual bonuses increases ICT workers' intention to stay. Conversely, the results do not support Hypothesis 2: satisfaction with collective bonuses has no significant impact on the intention to stay.
Despite a collinearity problem, our results suggest confirming Hypothesis 3, that satisfaction with individual bonuses has a greater impact on ICT workers' intention to stay than satisfaction with collective bonuses. The results also indicate that education level and organizational commitment have a positive impact on workers' intention to stay. The longitudinal analyses reveal that differences between workers' characteristics explain the intention to stay more than differences over time within the same worker.

Relevance: 10.00%

Abstract:

This paper is concerned with the use of a genetic algorithm to select financial ratios for corporate distress classification models. For this purpose, the fitness value associated with a set of ratios is made to reflect the requirements of maximizing the amount of information available for the model and minimizing the collinearity between the model inputs. A case study involving 60 failed and continuing British firms in the period 1997-2000 is used for illustration. The classification model based on ratios selected by the genetic algorithm compares favorably with a model employing ratios usually found in the financial distress literature.

Relevance: 10.00%

Abstract:

Ingestion of caesium (Cs) radioisotopes poses a health risk to humans. Crop varieties that accumulate less Cs in their edible tissues may provide a useful countermeasure. This study was performed to determine whether quantitative genetics on a model plant (Arabidopsis thaliana) might inform such 'safe'-crop strategies. Arabidopsis accessions and recombinant inbred lines (RILs), from Landsberg erecta (Ler) x Cape Verde Islands (Cvi), Ler x Columbia (Col), and Niederzenz (Nd) x Col mapping populations, were grown on agar supplemented with subtoxic levels of Cs. Shoot Cs concentration varied up to three-fold, and shoot fresh weight varied up to 25-fold within populations. The heritability of growth and Cs accumulation traits ranged from 0.06 to 0.28. Four quantitative trait loci (QTL) accounted for > 80% of the genetic contribution to the total phenotypic variation in shoot Cs concentration in the Ler x Col population. QTL identified in this study, in particular QTL co-localizing to the top and bottom regions of Chromosomes I and V in two different mapping populations, are amenable to positional cloning and, through collinearity, may inform selection or breeding strategies for the development of 'safe' crops.

Relevance: 10.00%

Abstract:

Background: Lipoxygenases (LOXs), a type of non-haem iron-containing dioxygenase, are ubiquitous enzymes in plants and participate in the formation of fruit aroma, which is a very important aspect of fruit quality. Amongst the various aroma volatiles, saturated and unsaturated alcohols and aldehydes provide the characteristic aroma of the fruit. These compounds are formed from unsaturated fatty acids through oxidation, pyrolysis and reduction steps. This biosynthetic pathway involves at least four enzymes, including LOX, the enzyme responsible for lipid oxidation. Although some studies have been conducted on the LOX gene family in several species including Arabidopsis, soybean, cucumber and apple, there is no information from pear, and the evolutionary history of this gene family in the Rosaceae is still not resolved. Results: In this study we identified 107 LOX homologous genes from five Rosaceous species (Pyrus bretschneideri, Malus × domestica, Fragaria vesca, Prunus mume and Prunus persica); 23 of these sequences were from pear. By using structure analysis, phylogenetic analysis and collinearity analysis, we identified variation in gene structure and revealed the phylogenetic evolutionary relationship of this gene family. Expression of certain pear LOX genes during fruit development was verified by analysis of transcriptome data. Conclusions: 23 LOX genes were identified in pear and these genes were found to have undergone a duplication 30–45 MYA; most of these 23 genes are functional. Specific gene duplication was found on chromosome 4 in the pear genome. Useful information was provided for future research on the evolutionary history of, and transgenic research on, LOX genes.