919 results for VLE data sets


Relevance: 80.00%

Abstract:

We present two new support vector approaches for ordinal regression. These approaches find the concentric spheres with minimum volume that contain most of the training samples. Both approaches guarantee that the radii of the spheres are properly ordered at the optimal solution. The size of the optimization problem is linear in the number of training samples. The popular SMO algorithm is adapted to solve the resulting optimization problem. Numerical experiments on some real-world data sets verify the usefulness of our approaches for data mining.

Relevance: 80.00%

Abstract:

This paper discusses a method for scaling SVM with Gaussian kernel function to handle large data sets by using a selective sampling strategy for the training set. It employs a scalable hierarchical clustering algorithm to construct cluster indexing structures of the training data in the kernel induced feature space. These are then used for selective sampling of the training data for SVM to impart scalability to the training process. Empirical studies made on real world data sets show that the proposed strategy performs well on large data sets.

Relevance: 80.00%

Abstract:

Deterministic models have been widely used to predict water quality in distribution systems, but their calibration requires extensive and accurate data sets for numerous parameters. In this study, alternative data-driven modeling approaches based on artificial neural networks (ANNs) were used to predict temporal variations of two important characteristics of water quality: chlorine residual and biomass concentrations. The authors considered three types of ANN algorithms. Of these, the Levenberg-Marquardt algorithm provided the best results in predicting residual chlorine and biomass with both error-free and "noisy" data. The ANN models developed here can generate water quality scenarios of piped systems in real time to help utilities identify weak points of low chlorine residual and high biomass concentration and select optimum remedial strategies.

Relevance: 80.00%

Abstract:

The research analyzes product quality from a customer perspective in the case of the wood products industry. Of specific interest is a better understanding of how customers perceive environmental quality. The empirical material comprises four data sets from Finland, Germany and the UK, collected during 1992–2004. The methods consist of a set of quantitative statistical analyses. The results indicate that perceived quality from a customer perspective can be presented using a multidimensional and hierarchical construct with tangible and intangible dimensions that is common to different markets and products. This applies in the case of wood products but also more generally, at least for some other construction materials. For wood products, tangible product quality has two main sub-dimensions: technical quality and appearance. For product intangibles, a few main quality dimensions seem to be detectable: quality of intangibles related to the physical product, such as environmental issues and product-related information; supplier-related characteristics; and service and sales personnel behavior. Environmental quality and information are often perceived as being inter-related. Technical performance and appearance are the most important considerations for customers in the case of wood products. Organizational customers in particular also clearly consider certain intangible quality dimensions to be important, such as service and supplier reliability. High technical quality may be considered a license to operate, but product appearance and intangible quality provide potential for differentiation to attract certain market segments. Intangible quality issues are those where Nordic suppliers underperform in comparison to their Central European competitors on the important German markets. Environmental quality may not have been used to its full extent to attract customers. One possibility is to increase the availability of environment-related information, or to develop environment-related product characteristics so that they also provide some individual benefits. Information technology provides clear potential to facilitate information-based quality improvements, which was clearly recognized by the Finnish forest industry already in the early 1990s. The results indeed indicate that wood products markets are segmented with regard to quality demands.

Relevance: 80.00%

Abstract:

It is important to identify the "correct" number of topics in mechanisms like Latent Dirichlet Allocation (LDA), as this number determines the quality of the features presented to classifiers like SVM. In this work we propose a measure to identify the correct number of topics and offer empirical evidence in its favor in terms of classification accuracy and the number of topics that are naturally present in the corpus. We show the merit of the measure by applying it to real-world as well as synthetic data sets (both text and images). In proposing this measure, we view LDA as a matrix factorization mechanism, wherein a given corpus C is split into two matrix factors M1 and M2 as given by $C_{d \times w} = M1_{d \times t} \times Q_{t \times w}$, where d is the number of documents present in the corpus and w is the size of the vocabulary. The quality of the split depends on t, the number of topics chosen. The measure is computed in terms of the symmetric KL divergence of salient distributions that are derived from these matrix factors. We observe that the divergence values are higher for non-optimal numbers of topics; this is shown by a "dip" at the right value of t.
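The symmetric KL divergence mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say which distributions are derived from the matrix factors, so the two distributions below are hypothetical placeholders, and the symmetric form used here (the sum of the two directed divergences) is one common definition, assumed rather than taken from the paper.

```python
import math


def kl_divergence(p, q):
    """Directed KL divergence KL(p || q) for two discrete distributions.

    Both inputs are equal-length lists of probabilities; q must be
    positive wherever p is positive.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def symmetric_kl(p, q):
    """Symmetric KL divergence, taken here as KL(p || q) + KL(q || p)."""
    return kl_divergence(p, q) + kl_divergence(q, p)


def normalize(values):
    """Scale non-negative values so that they sum to 1."""
    total = sum(values)
    return [v / total for v in values]


# Hypothetical distributions, standing in for the "salient distributions"
# that the abstract derives from the two matrix factors.
p = normalize([5.0, 3.0, 2.0])
q = normalize([4.0, 4.0, 2.0])
divergence = symmetric_kl(p, q)
print(f"symmetric KL divergence: {divergence:.4f}")
```

Under this reading, one would compute the divergence for each candidate number of topics and look for the value of t at which it dips.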

Relevance: 80.00%

Abstract:

Background and context: Since the economic reforms of 1978, China has been acclaimed as a remarkable economy, achieving 9% annual growth per head for more than 25 years. However, China's health sector has not fared well. Population health gains slowed down and health disparities increased. In the field of health and health care, significant progress in maternal care has been achieved. However, important disparities remain between urban and rural areas, and among rural areas, in terms of economic development. The excess female infant deaths and the rapidly increasing sex ratio at birth in the last decade aroused serious concerns among policy makers and scholars. Decentralization of the government administration and health sector reform affect maternal care. Many studies using census data have been conducted to explore the determinants of a high sex ratio at birth, but no agreement has so far been reached on the possible contributing factors. No study using family planning system data has been conducted to explore perinatal mortality and sex ratio at birth, and only a few studies have examined the impact of the decentralization of government and health sector reforms on the provision and organization of maternal care in rural China.

Objectives: The general objective of this study was to investigate the state of perinatal health and maternal care and their determinants in rural China under the historic context of major socioeconomic reforms and the one-child family planning policy. The specific objectives of the study were: 1) to study pregnancy outcomes and perinatal health and their correlates in a rural Chinese county; 2) to examine the issue of sex ratio at birth and its determinants in a rural Chinese county; 3) to explore the patterns of provision, utilization, and content of maternal care in a rural Chinese county; 4) to investigate the changes in the use of maternal care in China from 1991 to 2003.

Materials and Methods: This study is based on a project for evaluating the prenatal care programme in Dingyuan county, Anhui province, China, in 1999-2003, and on a nationwide household health survey used to describe the changes in maternal care utilization. The approaches used included a retrospective cohort study, cross-sectional interview surveys, informant interviews, observations and the use of statistical data. The data sources included the following: 1) a cohort of pregnant women followed from pregnancy up to 7 days after birth in 20 townships in the study county, with information on pregnancy outcomes collected from family planning records; 2) a questionnaire interview survey given to women who gave birth between 2001 and 2003; 3) various statistical and informant survey data collected from the study county; 4) three national household health interview survey data sets (1993-2003), which were reanalyzed to describe the changes in maternity care utilization. Relative risks (RR) and their confidence intervals (CI) were calculated for comparison between parity, approval status, infant sex and township groups. The chi-square test was used to analyse the disparity in the use of maternal care between and within urban and rural areas and its trend across the years in China. Logistic regression was used to analyse the factors associated with hospital delivery in rural areas.

Results: There were 3697 pregnancies in the study cohort, resulting in 3092 live births in a total population of 299463 in the 20 study townships during 1999-2000. The average age at pregnancy in the cohort was 25.9 years. Of the women, 61% were childless, 38% already had one child and 0.3% had two children before the current pregnancy. About 90% of approved pregnancies ended in a live birth, while 73% of the unapproved ones were aborted. The perinatal mortality rate was 69 per thousand births. If the 30 induced abortions in which the gestational age was more than 28 weeks had been counted as perinatal deaths, the perinatal mortality rate would have been as high as 78 per thousand. The perinatal mortality rate was negatively associated with the wealth of the township. Approximately two thirds of the perinatal deaths occurred in the early neonatal period. Both the stillbirth rate and the early neonatal death rate increased with parity. The risk of a stillbirth in a second pregnancy was almost four times that for a first pregnancy, while the risk of early neonatal death doubled. The early neonatal mortality rate was twice as high for female as for male infants. The sex difference in the early neonatal mortality rate was mainly attributable to mortality in second births. The male early neonatal mortality rate was not affected by parity, while the female early neonatal mortality rate increased dramatically with parity: it was about six times higher for second births than for first births. About 82% of early neonatal deaths happened within 24 hours after birth, and during that time girls were almost three times more likely to die than boys. The death rate of females on the day of birth increased much more sharply with parity than that of males. The total sex ratio at birth of the 3697 registered pregnancies was 152 males to 100 females, with 118 and 287 in first and second pregnancies, respectively. Among unapproved pregnancies, there were almost 5 live-born boys for each girl. Most prenatal and delivery care was to be provided in township hospitals. At the village level, there were small private clinics. There was no limitation period for the provision of prenatal and postnatal care by private practitioners. They were not permitted by the county health bureau to provide delivery care, but as some 12% of all births occurred either at home or at private clinics, some village health workers might have been involved. The county-level hospitals served as the referral centers for the township hospitals in the county. However, there was no formal regulation or guideline on how the referral system should work. Whether or not a woman was referred to a higher-level hospital depended on the individual midwife's professional judgment and on the client's compliance. The county health bureau had little power over township hospitals, because township hospitals had in the decentralization process become directly accountable to the township government. In the township and county hospitals only 10-20% of the recurrent costs were funded by local government (the township hospital was funded by the township government and the county hospital by the county government), and the hospitals collected user fees to balance their budgets. Staff salaries also depended on the hospital's fee income. The hospitals could define the user charges themselves. Prenatal care consultations were, however, free in most township hospitals. None of the midwives made postnatal home visits, because of the low profit of these services. Data from the three national household health surveys showed that the proportion of women receiving their first prenatal visit within 12 weeks increased greatly from the early to the middle 1990s in all areas except large cities. The increase was much larger in the rural areas, reducing the urban-rural difference from more than 4 times to about 1.4 times. The proportion of women in the rural areas who received antenatal care visits meeting the Ministry of Health's standard (at least 5 visits) increased sharply from 12% in 1991-1993 to 36% in 2001-2003. In rural areas, this proportion increased much faster in less developed areas than in developed areas. The hospital delivery rate increased slightly from 90% to 94% in urban areas, while it increased from 27% to 69% in rural areas. The fastest change was found in type 4 rural areas, where utilization quadrupled. The overall difference between rural and urban areas narrowed substantially over the period. Multiple logistic regression analysis showed that time period, residency in rural or urban areas, income level, age group, education level, delivery history, occupation, health insurance and distance from the nearest health care facility were significantly associated with hospital delivery rates.

Conclusions: 1. Perinatal mortality in this study was much higher than that for urban areas as well as any reported rate from specific studies in rural areas of China. Previous studies in which calculations of infant mortality were not based on epidemiological surveys have been shown to underestimate the rates by more than 50%. 2. Routine statistics collected by the Chinese family planning system proved to be a reliable data source for studying perinatal health, including stillbirths, neonatal deaths, and sex ratio at birth and among newborns. National Household Health Survey data proved to be a useful and reliable data source for studying population health and health services. Prior to this research there were few studies in these areas available to international audiences. 3. Although the perinatal mortality rate was negatively associated with the level of township economic development, the excess female early neonatal mortality rate contributed much more to the high perinatal mortality rate than economic factors did. This was likely a result of the family planning policy and the traditional preference for sons, which lead to lethal neglect of female newborns and high perinatal mortality. 4. Selective abortions of female foetuses were likely to contribute most to the high sex ratio at birth. The underreporting of female births seemed to have played a secondary role. The higher early neonatal mortality rate in second-born as compared to first-born children, particularly in females, may indicate that neglect or poorer care of female newborn infants also contributes to the high sex ratio at birth or among newborns. Existing family planning policy proved ineffective in controlling the steadily increasing sex ratio at birth. 5. The rural-urban gap in the utilization of maternal healthcare services in China narrowed significantly on average from 1991 to 2003. This demonstrates that significant achievements in reducing inequities can be made through a combination of socio-economic development and targeted investments in improving health services, including infrastructure, staff capacities, and subsidies to reduce the costs of service utilization for the poorest. However, the huge gap that persisted among cities of different sizes and among different types of rural areas indicated the need for further efforts to support the poorest areas. 6. Hospital delivery care in the study county was better accepted by women because most women considered delivery care very important, whereas they did not regard prenatal and postnatal care as important. Hospital delivery care was more systematically provided and promoted than prenatal and postnatal care by township hospitals in the study area. The reliance of hospital staff income on user fees gave the hospitals an incentive to put more emphasis on revenue-generating activities such as delivery care instead of prenatal and postnatal care, since delivery care generated much more profit than prenatal and postnatal care.

Recommendations: 1. It is essential for the central government to re-assess and modify existing family planning policies. In order to maintain a national sex balance, the existing practice of one child per couple in urban areas and at least one son per couple in rural areas should be gradually changed to a two-children-per-couple policy throughout the country. The government should establish a favourable social security policy for couples, especially for rural couples who have only daughters, with particular emphasis on their pension and medical care insurance, combined with an educational campaign for equal rights for boys and girls in society. 2. There is currently no routine vital-statistics registration system in rural China. Using the findings of this study, the central government could set up a routine vital-statistics registration system based on routine family planning work records, which could be used by policy makers and researchers. 3. It is possible for the central and provincial governments to invest more in the less developed and poor rural areas to increase the access of pregnant women in these areas to maternal care services. Central government together with local government should gradually provide free maternal care, including prenatal, postnatal and delivery care, to women in poor and less developed rural areas. 4. Future research could explore whether the county and township level health care sector and the family planning system could be merged to increase the effectiveness and efficiency of maternal and child care. 5. Future research could explore the relative contribution of maternal care, economic development and family planning policy to perinatal and child health using prospective cohort studies and community-based randomized trials.

Key words: perinatal health, perinatal mortality, stillbirth, neonatal death, sex selective abortion, sex ratio at birth, family planning, son preference, maternal care, prenatal care, postnatal care, equity, China
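The abstract above reports relative risks (RR) with confidence intervals but does not state how the intervals were computed. As a minimal sketch, the following uses the widely taught log-transform (Katz) approximation; the choice of method and all counts in the example are assumptions made for illustration, not figures from the study.

```python
import math


def relative_risk(events_a, total_a, events_b, total_b, z=1.96):
    """Relative risk of group A versus group B with an approximate CI.

    The interval is built on the log scale (Katz method); z=1.96 gives
    an approximate 95% confidence interval.
    """
    risk_a = events_a / total_a
    risk_b = events_b / total_b
    rr = risk_a / risk_b
    # Standard error of log(RR) for two independent binomial samples.
    se_log_rr = math.sqrt(
        1 / events_a - 1 / total_a + 1 / events_b - 1 / total_b
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts: 40 deaths among 1000 second births versus
# 20 deaths among 1000 first births.
rr, lower, upper = relative_risk(40, 1000, 20, 1000)
print(f"RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

An interval that excludes 1 is usually read as evidence that the two groups differ in risk.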

Relevance: 80.00%

Abstract:

Ewing sarcoma is an aggressive and poorly differentiated malignancy of bone and soft tissue. It primarily affects children, adolescents, and young adults, with a slight male predominance. It is characterized by a translocation between chromosomes 11 and 22 resulting in the EWSR1-FLI1 fusion transcription factor. The aim of this study is to identify putative Ewing sarcoma target genes through an integrative analysis of three microarray data sets. Array comparative genomic hybridization is used to measure changes in DNA copy number, and the results are analyzed to detect common chromosomal aberrations. mRNA and miRNA microarrays are used to measure expression of protein-coding and miRNA genes, and these results are integrated with the copy number data. Chromosomal aberrations typically also contain bystanders in addition to the driving tumor suppressors and oncogenes, and integration with expression helps to identify the true targets. Correlation between expression of miRNAs and their predicted target mRNAs is also evaluated to assess the results of post-transcriptional miRNA regulation on mRNA levels. The highest frequencies of copy number gains were identified in chromosome 8, 1q, and X. Losses were most frequent in 9p21.3, which also showed an enrichment of copy number breakpoints relative to the rest of the genome. Copy number losses in 9p21.3 were found to have a statistically significant effect on the expression of MTAP, but not on CDKN2A, which is a known tumor suppressor in the same locus. MTAP was also down-regulated in the Ewing sarcoma cell lines compared to mesenchymal stem cells. Genes exhibiting elevated expression in association with copy number gains and up-regulation compared to the reference samples included DCAF7, ENO2, MTCP1, and STK40. Differentially expressed miRNAs were detected by comparing Ewing sarcoma cell lines against mesenchymal stem cells. In total, 21 up-regulated and 32 down-regulated miRNAs were identified, including miR-145, which has been previously linked to Ewing sarcoma. The EWSR1-FLI1 fusion gene represses miR-145, which in turn targets FLI1, forming a mutually repressive feedback loop. In addition to showing higher expression linked to copy number gains and relative to mesenchymal stem cells, STK40 was also found to be a target of four different miRNAs that were all down-regulated in Ewing sarcoma cell lines compared to the reference samples. SLCO5A1 was identified as the only up-regulated gene within a frequently gained region in chromosome 8. This region was gained in over 90% of the cell lines, and with a higher frequency than the neighboring regions. In addition, SLCO5A1 was found to be a target of three miRNAs that were down-regulated compared to the mesenchymal stem cells.

Relevance: 80.00%

Abstract:

Metabolomics is a rapidly growing research field that studies the response of biological systems to environmental factors, disease states and genetic modifications. It aims at measuring the complete set of endogenous metabolites, i.e. the metabolome, in a biological sample such as plasma or cells. Because metabolites are the intermediates and end products of biochemical reactions, metabolite compositions and metabolite levels in biological samples can provide a wealth of information on on-going processes in a living system. Due to the complexity of the metabolome, metabolomic analysis poses a challenge to analytical chemistry. Adequate sample preparation is critical to accurate and reproducible analysis, and the analytical techniques must have high resolution and sensitivity to allow detection of as many metabolites as possible. Furthermore, as the information contained in the metabolome is immense, the data set collected from metabolomic studies is very large. In order to extract the relevant information from such large data sets, efficient data processing and multivariate data analysis methods are needed. In the research presented in this thesis, metabolomics was used to study mechanisms of polymeric gene delivery to retinal pigment epithelial (RPE) cells. The aim of the study was to detect differences in metabolomic fingerprints between transfected cells and non-transfected controls, and thereafter to identify metabolites responsible for the discrimination. The plasmid pCMV-β was introduced into RPE cells using the vector polyethyleneimine (PEI). The samples were analyzed using high performance liquid chromatography (HPLC) and ultra performance liquid chromatography (UPLC) coupled to a triple quadrupole (QqQ) mass spectrometer (MS). The software MZmine was used for raw data processing and principal component analysis (PCA) was used in statistical data analysis. 
The results revealed differences in metabolomic fingerprints between transfected cells and non-transfected controls. However, reliable fingerprinting data could not be obtained because of low analysis repeatability. Therefore, no attempts were made to identify metabolites responsible for discrimination between sample groups. Repeatability and accuracy of analyses can be influenced by protocol optimization. However, in this study, optimization of analytical methods was hindered by the very small number of samples available for analysis. In conclusion, this study demonstrates that obtaining reliable fingerprinting data is technically demanding, and the protocols need to be thoroughly optimized in order to approach the goals of gaining information on mechanisms of gene delivery.

Relevance: 80.00%

Abstract:

Habitat fragmentation produces patches of suitable habitat surrounded by unfavourable matrix habitat. A species may persist in such a fragmented landscape in an equilibrium between the extinctions and recolonizations of local populations, thus forming a metapopulation. Migration between local populations is necessary for the long-term persistence of a metapopulation. The Glanville fritillary butterfly (Melitaea cinxia) forms a metapopulation in the Åland islands in Finland. There is migration between the populations, the extent of which is affected by several environmental factors and by variation in the phenotype of individual butterflies. Different allelic forms of the glycolytic enzyme phosphoglucose isomerase (Pgi) have been identified as a possible genetic factor influencing flight performance and migration rate in this species. The frequency of a certain Pgi allele, Pgi-f, follows the same pattern in relation to population age and connectivity as migration propensity. Furthermore, variation in flight metabolic performance, which is likely to affect migration propensity, has been linked to genetic variation in Pgi or a closely linked locus. The aim of this study was to investigate the association between Pgi genotype and migration propensity in the Glanville fritillary at both the individual and population levels using a statistical modelling approach. A mark-release-recapture (MRR) study was conducted in a habitat patch network of M. cinxia in Åland to collect data on the movements of individual butterflies. Larval samples from the study area were also collected for population-level examinations. Each butterfly and larva was genotyped at the Pgi locus. The MRR data were used to parameterise two mathematical models of migration: the Virtual Migration (VM) model and the spatially explicit diffusion model. The numbers of emigrants predicted by the VM model were compared with the observed numbers for populations with high and low frequencies of Pgi-f. Posterior predictive data sets were simulated based on the parameters of the diffusion model. Lack of fit between the observed values and the model-predicted values was detected for several descriptors of movement, and the effect of Pgi genotype on the deviations was assessed by randomizations including the genotype information. This study revealed a possible difference in the effect of Pgi genotype on migration propensity between the two sexes in the Glanville fritillary. Females with the Pgi-f allele and males without it moved more between habitat patches, which is probably related to differences in the function of flight in the two sexes. Females may use their high flight capacity to migrate between habitat patches to find suitable oviposition sites, whereas males may use it to acquire mates by keeping a territory and fighting off other intruding males, possibly causing them to emigrate. The results were consistent across different movement descriptors and at the individual and population levels. The effect of Pgi is likely to depend on the structure of the landscape and the prevailing environmental conditions.

Relevance: 80.00%

Abstract:

The factors affecting the non-industrial, private forest landowners' (hereafter referred to using the acronym NIPF) strategic decisions in management planning are studied. A genetic algorithm is used to induce a set of rules predicting potential cut of the landowners' choices of preferred timber management strategies. The rules are based on variables describing the characteristics of the landowners and their forest holdings. The predictive ability of a genetic algorithm is compared to linear regression analysis using identical data sets. The data are cross-validated seven times applying both genetic algorithm and regression analyses in order to examine the data-sensitivity and robustness of the generated models. The optimal rule set derived from genetic algorithm analyses included the following variables: mean initial volume, landowner's positive price expectations for the next eight years, landowner being classified as farmer, and preference for the recreational use of forest property. When tested with previously unseen test data, the optimal rule set resulted in a relative root mean square error of 0.40. In the regression analyses, the optimal regression equation consisted of the following variables: mean initial volume, proportion of forestry income, intention to cut extensively in future, and positive price expectations for the next two years. The R2 of the optimal regression equation was 0.34 and the relative root mean square error obtained from the test data was 0.38. In both models, mean initial volume and positive stumpage price expectations were entered as significant predictors of potential cut of preferred timber management strategy. When tested with the complete data set of 201 observations, both the optimal rule set and the optimal regression model achieved the same level of accuracy.
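The abstract above compares the two models by relative root mean square error but does not define the statistic. A minimal sketch follows, assuming the common definition in which the RMSE is divided by the mean of the observed values; both that definition and the sample numbers are assumptions made for illustration.

```python
import math


def relative_rmse(observed, predicted):
    """RMSE of the predictions divided by the mean of the observed values."""
    n = len(observed)
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    rmse = math.sqrt(sum(squared_errors) / n)
    mean_observed = sum(observed) / n
    return rmse / mean_observed


# Hypothetical potential-cut values and model predictions.
observed = [100.0, 150.0, 200.0]
predicted = [110.0, 140.0, 210.0]
print(f"relative RMSE: {relative_rmse(observed, predicted):.3f}")
```

Because the statistic is scaled by the mean, values from models fitted to the same test data can be compared directly, as the abstract does with 0.40 and 0.38.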

Relevance: 80.00%

Abstract:

A sensitive framework has been developed for modelling young radiata pine survival, its growth and its size class distribution, from time of planting to age 5 or 6 years. The data and analysis refer to the Central North Island region of New Zealand. The survival function is derived from a Weibull probability density function, to reflect diminishing mortality with the passage of time in young stands. An anamorphic family of trends was used, as very little between-tree competition can be expected in young stands. An exponential height function was found to fit best the lower portion of its sigmoid form. The most appropriate basal area/ha exponential function included an allometric adjustment which resulted in compatible mean height and basal area/ha models. Each of these equations successfully represented the effects of several establishment practices by making coefficients linear functions of site factors, management activities and their interactions. Height and diameter distribution modelling techniques that ensured compatibility with stand values were employed to represent the effects of management practices on crop variation. Model parameters for this research were estimated using data from site preparation experiments in the region and were tested with some independent data sets.
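The abstract above says the survival function is derived from a Weibull density so that mortality diminishes over time, but it gives no parameterization. The sketch below uses the standard two-parameter Weibull form, in which a shape parameter below 1 produces a decreasing hazard; the parameter values are hypothetical and are not estimates from the study.

```python
import math


def weibull_survival(t, shape, scale):
    """Proportion of trees surviving to age t: S(t) = exp(-(t/scale)**shape)."""
    return math.exp(-((t / scale) ** shape))


def weibull_hazard(t, shape, scale):
    """Instantaneous mortality rate at age t (defined for t > 0)."""
    return (shape / scale) * (t / scale) ** (shape - 1)


# Hypothetical parameters: shape < 1 gives mortality that diminishes with age.
SHAPE = 0.5
SCALE = 20.0  # years

for age in (1, 2, 3, 4, 5, 6):
    s = weibull_survival(age, SHAPE, SCALE)
    h = weibull_hazard(age, SHAPE, SCALE)
    print(f"age {age}: survival {s:.3f}, hazard {h:.4f}")
```

In the framework described in the abstract, coefficients such as these would be linear functions of site factors and management activities rather than fixed constants.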

Relevance: 80.00%

Abstract:

Understorey trees that develop beneath even-aged forest stands matter for timber harvesting, forest regeneration, visibility and landscape analyses, and the assessment of biodiversity and carbon balance. Airborne laser scanning has proven to be an efficient remote sensing method for measuring mature tree stands. Adopting laser scanning in operational forest planning makes it possible to produce more accurate information on the understorey than before, provided that understorey characteristics can be interpreted from laser data. This study used accurately measured field plots and discrete-return LiDAR data sets from several years (flying altitude 1–2 km, 0.9–9.7 pulses per m²). The LiDAR data had been acquired with Optech ALTM3100 and Leica ALS50-II sensors. The plots represent even-aged Finnish pine stands at different stages of development. The research questions were: 1) What is the laser signal obtained from the understorey like at the level of a single pulse, and which factors affect the signal? 2) What is the explanatory power of the area-based laser features used in practical applications when predicting the characteristics of understorey trees? In particular, the aim was to determine how the energy lost by a laser pulse to the upper canopy layers affects the received signal, and whether the intensity of laser echoes can be corrected for these energy losses. Differences in echo intensity between tree species were small and varied from one scan to another. The potential for using intensity to interpret understorey tree species is therefore very limited. Energy losses to the upper canopy layers introduced noise into the laser signal from the understorey. The energy-loss correction was applied to the second and third echoes of laser pulses returned from the understorey. The correction reduced the within-target variation in intensity and improved the classification accuracy of targets in the understorey layer. With second echoes, the percentage of correct classifications between ground and the most common tree species was 49.2–54.9% before correction and 57.3–62.0% after it. The corresponding kappa values were 0.03–0.13 and 0.10–0.22. The most important factor explaining the energy losses was the intensity of the earlier echoes from the same pulse, although the geometry of the pulse's intersection with trees in the upper canopy layer also had a minor effect. Classification accuracy also improved for third echoes. Tree species differed in how readily they produce an echo when a laser pulse hits the tree. Spruce produced an echo with a higher probability than broadleaved trees. This difference was especially clear for pulses that had undergone energy losses. Height-distribution features of laser echoes may therefore depend on tree species. Clear differences in intensity distributions were observed between sensors, which complicates combining data sets acquired with different sensors. Echo probabilities also differed somewhat between sensors, causing small differences in the echo height distributions. Among the area-based laser features, some explained understorey stem number and mean height well when the analysis was restricted to trees taller than 1 m. The explanatory power of the features was better for stem number than for mean height. Explanatory power did not decrease significantly as pulse density decreased, which is encouraging for practical applications. The proportion of broadleaved trees could not be explained. Based on the results, discrete-return laser scanning may be usable, for example, in assessing the need for pre-harvest clearing. More detailed classification of the understorey (e.g. tree species interpretation), however, may be difficult. The very smallest understorey trees cannot be detected. Further studies are needed to generalize the results to different kinds of stands.
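
The abstract above reports both a percentage of correct classifications and a kappa value. As a hedged illustration of how the two relate, the Python sketch below computes overall accuracy and Cohen's kappa from a confusion matrix. The function name and the counts are hypothetical and are not data from the study.

```python
def accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] = number of samples of true class i predicted as class j.
    """
    n = sum(sum(row) for row in confusion)
    k = len(confusion)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / n
    # Chance agreement: product of row and column marginals, summed over classes.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(k)
    ) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Hypothetical counts for a two-class problem (ground vs. one tree species).
example = [[60, 40],
           [35, 65]]
acc, kappa = accuracy_and_kappa(example)
print(f"accuracy={acc:.3f}, kappa={kappa:.3f}")
```

For a roughly balanced two-class problem, chance agreement is close to 50%, so accuracies only slightly above 50% correspond to kappa values near zero.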

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Gaussian Processes (GPs) are promising Bayesian methods for classification and regression problems. They have also been used for semi-supervised learning tasks. In this paper, we propose a new algorithm for solving semi-supervised binary classification problems using sparse GP regression (GPR) models. It is closely related to semi-supervised learning based on support vector regression (SVR) and maximum margin clustering. The proposed algorithm is simple and easy to implement. Unlike the SVR-based algorithm, it gives a sparse solution directly. The hyperparameters are also estimated easily, without resorting to expensive cross-validation. The use of a sparse GPR model helps make the proposed algorithm scalable. Preliminary results on synthetic and real-world data sets demonstrate the efficacy of the new algorithm.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

An algorithm is presented for generating a minimal spanning tree when the nodes, with their coordinates in some m-dimensional Euclidean space, and the corresponding metric are given. The algorithm is tested on manually generated data sets. Its worst-case time complexity is O(n log2n) for a collection of n data samples.
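
For readers unfamiliar with the problem setting, the Python sketch below builds a minimal spanning tree over points in m-dimensional Euclidean space. It uses the textbook dense form of Prim's algorithm, which runs in O(n^2) time. It is not the algorithm from the abstract and does not achieve the complexity stated there. The function name and sample points are assumptions made for illustration.

```python
import math

def euclidean_mst(points):
    """Return MST edges (i, j, length) for points in m-dimensional space.

    Dense O(n^2) Prim's algorithm with the Euclidean metric.
    """
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n   # cheapest known link from the tree to each node
    best_from = [-1] * n         # tree node that provides that link
    best_dist[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the closest node that is not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]),
                key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_from[u] != -1:
            edges.append((best_from[u], u, best_dist[u]))
        # Update link costs using the newly added node.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_from[v] = u
    return edges

# Hypothetical 2-D sample; any dimension m works as long as all points share it.
pts = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
for i, j, d in euclidean_mst(pts):
    print(i, j, round(d, 3))
```

A tree over n points always has n - 1 edges, which gives a quick sanity check on the output.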

Relevância:

80.00% 80.00%

Publicador:

Resumo:

We propose an efficient and parameter-free scoring criterion, the factorized conditional log-likelihood (f̂CLL), for learning Bayesian network classifiers. The proposed score is an approximation of the conditional log-likelihood criterion. The approximation is devised to guarantee decomposability over the network structure, as well as efficient estimation of the optimal parameters, achieving the same time and space complexity as the traditional log-likelihood scoring criterion. The resulting criterion has an information-theoretic interpretation based on interaction information, which exhibits its discriminative nature. To evaluate the performance of the proposed criterion, we present an empirical comparison with state-of-the-art classifiers. Results on a large suite of benchmark data sets from the UCI repository show that f̂CLL-trained classifiers achieve accuracy at least as good as that of the best compared classifiers, using significantly fewer computational resources.
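
For orientation, the two criteria contrasted in the abstract can be written as follows. These are the standard definitions of log-likelihood and conditional log-likelihood for a classifier B on data D = {(x_t, c_t)}, t = 1..N. The notation is an assumption made here and is not taken from the paper, and the f̂CLL approximation itself is not reproduced.

```latex
\mathrm{LL}(B \mid D)  = \sum_{t=1}^{N} \log P_B(c_t, x_t)

\mathrm{CLL}(B \mid D) = \sum_{t=1}^{N} \log P_B(c_t \mid x_t)
                       = \sum_{t=1}^{N} \Bigl[ \log P_B(c_t, x_t) - \log \sum_{c} P_B(c, x_t) \Bigr]
```

The log-of-sum term in the second expression does not split into independent per-variable terms, which is why the exact conditional log-likelihood is not decomposable over the network structure and an approximation is needed.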