957 resultados para Statistical approach
Resumo:
Subcycling, or the use of different timesteps at different nodes, can be an effective way of improving the computational efficiency of explicit transient dynamic structural solutions. The method that has been most widely adopted uses a nodal partition. extending the central difference method, in which small timestep updates are performed interpolating on the displacement at neighbouring large timestep nodes. This approach leads to narrow bands of unstable timesteps or statistical stability. It also can be in error due to lack of momentum conservation on the timestep interface. The author has previously proposed energy conserving algorithms that avoid the first problem of statistical stability. However, these sacrifice accuracy to achieve stability. An approach to conserve momentum on an element interface by adding partial velocities is considered here. Applied to extend the central difference method. this approach is simple. and has accuracy advantages. The method can be programmed by summing impulses of internal forces, evaluated using local element timesteps, in order to predict a velocity change at a node. However, it is still only statistically stable, so an adaptive timestep size is needed to monitor accuracy and to be adjusted if necessary. By replacing the central difference method with the explicit generalized alpha method. it is possible to gain stability by dissipating the high frequency response that leads to stability problems. However. coding the algorithm is less elegant, as the response depends on previous partial accelerations. Extension to implicit integration, is shown to be impractical due to the neglect of remote effects of internal forces acting across a timestep interface. (C) 2002 Elsevier Science B.V. All rights reserved.
Resumo:
The design of randomized controlled trials entails decisions that have economic as well as statistical implications. In particular, the choice of an individual or cluster randomization design may affect the cost of achieving the desired level of power, other things being equal. Furthermore, if cluster randomization is chosen, the researcher must decide how to balance the number of clusters, or sites, and the size of each site. This article investigates these interrelated statistical and economic issues. Its principal purpose is to elucidate the statistical and economic trade-offs to assist researchers to employ randomized controlled trials that have desired economic, as well as statistical, properties. (C) 2003 Elsevier Inc. All rights reserved.
Resumo:
Wyner - Ziv (WZ) video coding is a particular case of distributed video coding (DVC), the recent video coding paradigm based on the Slepian - Wolf and Wyner - Ziv theorems which exploits the source temporal correlation at the decoder and not at the encoder as in predictive video coding. Although some progress has been made in the last years, WZ video coding is still far from the compression performance of predictive video coding, especially for high and complex motion contents. The WZ video codec adopted in this study is based on a transform domain WZ video coding architecture with feedback channel-driven rate control, whose modules have been improved with some recent coding tools. This study proposes a novel motion learning approach to successively improve the rate-distortion (RD) performance of the WZ video codec as the decoding proceeds, making use of the already decoded transform bands to improve the decoding process for the remaining transform bands. The results obtained reveal gains up to 2.3 dB in the RD curves against the performance for the same codec without the proposed motion learning approach for high motion sequences and long group of pictures (GOP) sizes.
Resumo:
Water covers over 70% of the Earth's surface, and is vital for all known forms of life. But only 3% of the Earth's water is fresh water, and less than 0.3% of all freshwater is in rivers, lakes, reservoirs and the atmosphere. However, rivers and lakes are an important part of fresh surface water, amounting to about 89%. In this Master Thesis dissertation, the focus is on three types of water bodies – rivers, lakes and reservoirs, and their water quality issues in Asian countries. The surface water quality in a region is largely determined both by the natural processes such as climate or geographic conditions, and the anthropogenic influences such as industrial and agricultural activities or land use conversion. The quality of the water can be affected by pollutants discharge from a specific point through a sewer pipe and also by extensive drainage from agriculture/urban areas and within basin. Hence, water pollutant sources can be divided into two categories: Point source pollution and Non-point source (NPS) pollution. Seasonal variations in precipitation and surface run-off have a strong effect on river discharge and the concentration of pollutants in water bodies. For example, in the rainy season, heavy and persistent rain wash off the ground, the runoff flow increases and may contain various kinds of pollutants and, eventually, enters the water bodies. In some cases, especially in confined water bodies, the quality may be positive related with rainfall in the wet season, because this confined type of fresh water systems allows high dilution of pollutants, decreasing their possible impacts. During the dry season, the quality of water is largely related to industrialization and urbanization pollution. The aim of this study is to identify the most common water quality problems in Asian countries and to enumerate and analyze the methodologies used for assessment of water quality conditions of both rivers and confined water bodies (lakes and reservoirs). Based on the evaluation of a sample of 57 papers, dated between 2000 and 2012, it was found that over the past decade, the water quality of rivers, lakes, and reservoirs in developing countries is being degraded. Water pollution and destruction of aquatic ecosystems have caused massive damage to the functions and integrity of water resources. The most widespread NPS in Asian countries and those which have the greatest spatial impacts are urban runoff and agriculture. Locally, mine waste runoff and rice paddy are serious NPS problems. The most relevant point pollution sources are the effluents from factories, sewage treatment plant, and public or household facilities. It was found that the most used methodology was unquestionably the monitoring activity, used in 49 of analyzed studies, accounting for 86%. Sometimes, data from historical databases were used as well. It can be seen that taking samples from the water body and then carry on laboratory work (chemical analyses) is important because it can give an understanding of the water quality. 6 papers (11%) used a method that combined monitoring data and modeling. 6 papers (11%) just applied a model to estimate the quality of water. Modeling is a useful resource when there is limited budget since some models are of free download and use. In particular, several of used models come from the U.S.A, but they have their own purposes and features, meaning that a careful application of the models to other countries and a critical discussion of the results are crucial. 5 papers (9%) focus on a method combining monitoring data and statistical analysis. When there is a huge data matrix, the researchers need an efficient way of interpretation of the information which is provided by statistics. 3 papers (5%) used a method combining monitoring data, statistical analysis and modeling. These different methods are all valuable to evaluate the water quality. It was also found that the evaluation of water quality was made as well by using other types of sampling different than water itself, and they also provide useful information to understand the condition of the water body. These additional monitoring activities are: Air sampling, sediment sampling, phytoplankton sampling and aquatic animal tissues sampling. Despite considerable progress in developing and applying control regulations to point and NPS pollution, the pollution status of rivers, lakes, and reservoirs in Asian countries is not improving. In fact, this reflects the slow pace of investment in new infrastructure for pollution control and growing population pressures. Water laws or regulations and public involvement in enforcement can play a constructive and indispensable role in environmental protection. In the near future, in order to protect water from further contamination, rapid action is highly needed to control the various kinds of effluents in one region. Environmental remediation and treatment of industrial effluent and municipal wastewaters is essential. It is also important to prevent the direct input of agricultural and mine site runoff. Finally, stricter environmental regulation for water quality is required to support protection and management strategies. It would have been possible to get further information based in the 57 sample of papers. For instance, it would have been interesting to compare the level of concentrations of some pollutants in the diferente Asian countries. However the limit of three months duration for this study prevented further work to take place. In spite of this, the study objectives were achieved: the work provided an overview of the most relevant water quality problems in rivers, lakes and reservoirs in Asian countries, and also listed and analyzed the most common methodologies.
Resumo:
TPM Vol. 21, No. 4, December 2014, 435-447 – Special Issue © 2014 Cises.
Resumo:
Modern real-time systems, with a more flexible and adaptive nature, demand approaches for timeliness evaluation based on probabilistic measures of meeting deadlines. In this context, simulation can emerge as an adequate solution to understand and analyze the timing behaviour of actual systems. However, care must be taken with the obtained outputs under the penalty of obtaining results with lack of credibility. Particularly important is to consider that we are more interested in values from the tail of a probability distribution (near worst-case probabilities), instead of deriving confidence on mean values. We approach this subject by considering the random nature of simulation output data. We will start by discussing well known approaches for estimating distributions out of simulation output, and the confidence which can be applied to its mean values. This is the basis for a discussion on the applicability of such approaches to derive confidence on the tail of distributions, where the worst-case is expected to be.
Resumo:
This Thesis describes the application of automatic learning methods for a) the classification of organic and metabolic reactions, and b) the mapping of Potential Energy Surfaces(PES). The classification of reactions was approached with two distinct methodologies: a representation of chemical reactions based on NMR data, and a representation of chemical reactions from the reaction equation based on the physico-chemical and topological features of chemical bonds. NMR-based classification of photochemical and enzymatic reactions. Photochemical and metabolic reactions were classified by Kohonen Self-Organizing Maps (Kohonen SOMs) and Random Forests (RFs) taking as input the difference between the 1H NMR spectra of the products and the reactants. The development of such a representation can be applied in automatic analysis of changes in the 1H NMR spectrum of a mixture and their interpretation in terms of the chemical reactions taking place. Examples of possible applications are the monitoring of reaction processes, evaluation of the stability of chemicals, or even the interpretation of metabonomic data. A Kohonen SOM trained with a data set of metabolic reactions catalysed by transferases was able to correctly classify 75% of an independent test set in terms of the EC number subclass. Random Forests improved the correct predictions to 79%. With photochemical reactions classified into 7 groups, an independent test set was classified with 86-93% accuracy. The data set of photochemical reactions was also used to simulate mixtures with two reactions occurring simultaneously. Kohonen SOMs and Feed-Forward Neural Networks (FFNNs) were trained to classify the reactions occurring in a mixture based on the 1H NMR spectra of the products and reactants. Kohonen SOMs allowed the correct assignment of 53-63% of the mixtures (in a test set). Counter-Propagation Neural Networks (CPNNs) gave origin to similar results. The use of supervised learning techniques allowed an improvement in the results. They were improved to 77% of correct assignments when an ensemble of ten FFNNs were used and to 80% when Random Forests were used. This study was performed with NMR data simulated from the molecular structure by the SPINUS program. In the design of one test set, simulated data was combined with experimental data. The results support the proposal of linking databases of chemical reactions to experimental or simulated NMR data for automatic classification of reactions and mixtures of reactions. Genome-scale classification of enzymatic reactions from their reaction equation. The MOLMAP descriptor relies on a Kohonen SOM that defines types of bonds on the basis of their physico-chemical and topological properties. The MOLMAP descriptor of a molecule represents the types of bonds available in that molecule. The MOLMAP descriptor of a reaction is defined as the difference between the MOLMAPs of the products and the reactants, and numerically encodes the pattern of bonds that are broken, changed, and made during a chemical reaction. The automatic perception of chemical similarities between metabolic reactions is required for a variety of applications ranging from the computer validation of classification systems, genome-scale reconstruction (or comparison) of metabolic pathways, to the classification of enzymatic mechanisms. Catalytic functions of proteins are generally described by the EC numbers that are simultaneously employed as identifiers of reactions, enzymes, and enzyme genes, thus linking metabolic and genomic information. Different methods should be available to automatically compare metabolic reactions and for the automatic assignment of EC numbers to reactions still not officially classified. In this study, the genome-scale data set of enzymatic reactions available in the KEGG database was encoded by the MOLMAP descriptors, and was submitted to Kohonen SOMs to compare the resulting map with the official EC number classification, to explore the possibility of predicting EC numbers from the reaction equation, and to assess the internal consistency of the EC classification at the class level. A general agreement with the EC classification was observed, i.e. a relationship between the similarity of MOLMAPs and the similarity of EC numbers. At the same time, MOLMAPs were able to discriminate between EC sub-subclasses. EC numbers could be assigned at the class, subclass, and sub-subclass levels with accuracies up to 92%, 80%, and 70% for independent test sets. The correspondence between chemical similarity of metabolic reactions and their MOLMAP descriptors was applied to the identification of a number of reactions mapped into the same neuron but belonging to different EC classes, which demonstrated the ability of the MOLMAP/SOM approach to verify the internal consistency of classifications in databases of metabolic reactions. RFs were also used to assign the four levels of the EC hierarchy from the reaction equation. EC numbers were correctly assigned in 95%, 90%, 85% and 86% of the cases (for independent test sets) at the class, subclass, sub-subclass and full EC number level,respectively. Experiments for the classification of reactions from the main reactants and products were performed with RFs - EC numbers were assigned at the class, subclass and sub-subclass level with accuracies of 78%, 74% and 63%, respectively. In the course of the experiments with metabolic reactions we suggested that the MOLMAP / SOM concept could be extended to the representation of other levels of metabolic information such as metabolic pathways. Following the MOLMAP idea, the pattern of neurons activated by the reactions of a metabolic pathway is a representation of the reactions involved in that pathway - a descriptor of the metabolic pathway. This reasoning enabled the comparison of different pathways, the automatic classification of pathways, and a classification of organisms based on their biochemical machinery. The three levels of classification (from bonds to metabolic pathways) allowed to map and perceive chemical similarities between metabolic pathways even for pathways of different types of metabolism and pathways that do not share similarities in terms of EC numbers. Mapping of PES by neural networks (NNs). In a first series of experiments, ensembles of Feed-Forward NNs (EnsFFNNs) and Associative Neural Networks (ASNNs) were trained to reproduce PES represented by the Lennard-Jones (LJ) analytical potential function. The accuracy of the method was assessed by comparing the results of molecular dynamics simulations (thermal, structural, and dynamic properties) obtained from the NNs-PES and from the LJ function. The results indicated that for LJ-type potentials, NNs can be trained to generate accurate PES to be used in molecular simulations. EnsFFNNs and ASNNs gave better results than single FFNNs. A remarkable ability of the NNs models to interpolate between distant curves and accurately reproduce potentials to be used in molecular simulations is shown. The purpose of the first study was to systematically analyse the accuracy of different NNs. Our main motivation, however, is reflected in the next study: the mapping of multidimensional PES by NNs to simulate, by Molecular Dynamics or Monte Carlo, the adsorption and self-assembly of solvated organic molecules on noble-metal electrodes. Indeed, for such complex and heterogeneous systems the development of suitable analytical functions that fit quantum mechanical interaction energies is a non-trivial or even impossible task. The data consisted of energy values, from Density Functional Theory (DFT) calculations, at different distances, for several molecular orientations and three electrode adsorption sites. The results indicate that NNs require a data set large enough to cover well the diversity of possible interaction sites, distances, and orientations. NNs trained with such data sets can perform equally well or even better than analytical functions. Therefore, they can be used in molecular simulations, particularly for the ethanol/Au (111) interface which is the case studied in the present Thesis. Once properly trained, the networks are able to produce, as output, any required number of energy points for accurate interpolations.
Resumo:
Dissertação apresentada na Faculdade de Ciências e Tecnologia da Universidade Nova de Lisboa para a obtenção do grau de Mestre em Engenharia do Ambiente
Resumo:
In the current context of serious climate changes, where the increase of the frequency of some extreme events occurrence can enhance the rate of periods prone to high intensity forest fires, the National Forest Authority often implements, in several Portuguese forest areas, a regular set of measures in order to control the amount of fuel mass availability (PNDFCI, 2008). In the present work we’ll present a preliminary analysis concerning the assessment of the consequences given by the implementation of prescribed fire measures to control the amount of fuel mass in soil recovery, in particular in terms of its water retention capacity, its organic matter content, pH and content of iron. This work is included in a larger study (Meira-Castro, 2009(a); Meira-Castro, 2009(b)). According to the established praxis on the data collection, embodied in multidimensional matrices of n columns (variables in analysis) by p lines (sampled areas at different depths), and also considering the quantitative data nature present in this study, we’ve chosen a methodological approach that considers the multivariate statistical analysis, in particular, the Principal Component Analysis (PCA ) (Góis, 2004). The experiments were carried out in a soil cover over a natural site of Andaluzitic schist, in Gramelas, Caminha, NW Portugal, who was able to maintain itself intact from prescribed burnings from four years and was submit to prescribed fire in March 2008. The soils samples were collected from five different plots at six different time periods. The methodological option that was adopted have allowed us to identify the most relevant relational structures inside the n variables, the p samples and in two sets at the same time (Garcia-Pereira, 1990). Consequently, and in addition to the traditional outputs produced from the PCA, we have analyzed the influence of both sampling depths and geomorphological environments in the behavior of all variables involved.
Resumo:
Hyperspectral remote sensing exploits the electromagnetic scattering patterns of the different materials at specific wavelengths [2, 3]. Hyperspectral sensors have been developed to sample the scattered portion of the electromagnetic spectrum extending from the visible region through the near-infrared and mid-infrared, in hundreds of narrow contiguous bands [4, 5]. The number and variety of potential civilian and military applications of hyperspectral remote sensing is enormous [6, 7]. Very often, the resolution cell corresponding to a single pixel in an image contains several substances (endmembers) [4]. In this situation, the scattered energy is a mixing of the endmember spectra. A challenging task underlying many hyperspectral imagery applications is then decomposing a mixed pixel into a collection of reflectance spectra, called endmember signatures, and the corresponding abundance fractions [8–10]. Depending on the mixing scales at each pixel, the observed mixture is either linear or nonlinear [11, 12]. Linear mixing model holds approximately when the mixing scale is macroscopic [13] and there is negligible interaction among distinct endmembers [3, 14]. If, however, the mixing scale is microscopic (or intimate mixtures) [15, 16] and the incident solar radiation is scattered by the scene through multiple bounces involving several endmembers [17], the linear model is no longer accurate. Linear spectral unmixing has been intensively researched in the last years [9, 10, 12, 18–21]. It considers that a mixed pixel is a linear combination of endmember signatures weighted by the correspondent abundance fractions. Under this model, and assuming that the number of substances and their reflectance spectra are known, hyperspectral unmixing is a linear problem for which many solutions have been proposed (e.g., maximum likelihood estimation [8], spectral signature matching [22], spectral angle mapper [23], subspace projection methods [24,25], and constrained least squares [26]). In most cases, the number of substances and their reflectances are not known and, then, hyperspectral unmixing falls into the class of blind source separation problems [27]. Independent component analysis (ICA) has recently been proposed as a tool to blindly unmix hyperspectral data [28–31]. ICA is based on the assumption of mutually independent sources (abundance fractions), which is not the case of hyperspectral data, since the sum of abundance fractions is constant, implying statistical dependence among them. This dependence compromises ICA applicability to hyperspectral images as shown in Refs. [21, 32]. In fact, ICA finds the endmember signatures by multiplying the spectral vectors with an unmixing matrix, which minimizes the mutual information among sources. If sources are independent, ICA provides the correct unmixing, since the minimum of the mutual information is obtained only when sources are independent. This is no longer true for dependent abundance fractions. Nevertheless, some endmembers may be approximately unmixed. These aspects are addressed in Ref. [33]. Under the linear mixing model, the observations from a scene are in a simplex whose vertices correspond to the endmembers. Several approaches [34–36] have exploited this geometric feature of hyperspectral mixtures [35]. Minimum volume transform (MVT) algorithm [36] determines the simplex of minimum volume containing the data. The method presented in Ref. [37] is also of MVT type but, by introducing the notion of bundles, it takes into account the endmember variability usually present in hyperspectral mixtures. The MVT type approaches are complex from the computational point of view. Usually, these algorithms find in the first place the convex hull defined by the observed data and then fit a minimum volume simplex to it. For example, the gift wrapping algorithm [38] computes the convex hull of n data points in a d-dimensional space with a computational complexity of O(nbd=2cþ1), where bxc is the highest integer lower or equal than x and n is the number of samples. The complexity of the method presented in Ref. [37] is even higher, since the temperature of the simulated annealing algorithm used shall follow a log( ) law [39] to assure convergence (in probability) to the desired solution. Aiming at a lower computational complexity, some algorithms such as the pixel purity index (PPI) [35] and the N-FINDR [40] still find the minimum volume simplex containing the data cloud, but they assume the presence of at least one pure pixel of each endmember in the data. This is a strong requisite that may not hold in some data sets. In any case, these algorithms find the set of most pure pixels in the data. PPI algorithm uses the minimum noise fraction (MNF) [41] as a preprocessing step to reduce dimensionality and to improve the signal-to-noise ratio (SNR). The algorithm then projects every spectral vector onto skewers (large number of random vectors) [35, 42,43]. The points corresponding to extremes, for each skewer direction, are stored. A cumulative account records the number of times each pixel (i.e., a given spectral vector) is found to be an extreme. The pixels with the highest scores are the purest ones. N-FINDR algorithm [40] is based on the fact that in p spectral dimensions, the p-volume defined by a simplex formed by the purest pixels is larger than any other volume defined by any other combination of pixels. This algorithm finds the set of pixels defining the largest volume by inflating a simplex inside the data. ORA SIS [44, 45] is a hyperspectral framework developed by the U.S. Naval Research Laboratory consisting of several algorithms organized in six modules: exemplar selector, adaptative learner, demixer, knowledge base or spectral library, and spatial postrocessor. The first step consists in flat-fielding the spectra. Next, the exemplar selection module is used to select spectral vectors that best represent the smaller convex cone containing the data. The other pixels are rejected when the spectral angle distance (SAD) is less than a given thresh old. The procedure finds the basis for a subspace of a lower dimension using a modified Gram–Schmidt orthogonalizati on. The selected vectors are then projected onto this subspace and a simplex is found by an MV T pro cess. ORA SIS is oriented to real-time target detection from uncrewed air vehicles using hyperspectral data [46]. In this chapter we develop a new algorithm to unmix linear mixtures of endmember spectra. First, the algorithm determines the number of endmembers and the signal subspace using a newly developed concept [47, 48]. Second, the algorithm extracts the most pure pixels present in the data. Unlike other methods, this algorithm is completely automatic and unsupervised. To estimate the number of endmembers and the signal subspace in hyperspectral linear mixtures, the proposed scheme begins by estimating sign al and noise correlation matrices. The latter is based on multiple regression theory. The signal subspace is then identified by selectin g the set of signal eigenvalue s that best represents the data, in the least-square sense [48,49 ], we note, however, that VCA works with projected and with unprojected data. The extraction of the end members exploits two facts: (1) the endmembers are the vertices of a simplex and (2) the affine transformation of a simplex is also a simplex. As PPI and N-FIND R algorithms, VCA also assumes the presence of pure pixels in the data. The algorithm iteratively projects data on to a direction orthogonal to the subspace spanned by the endmembers already determined. The new end member signature corresponds to the extreme of the projection. The algorithm iterates until all end members are exhausted. VCA performs much better than PPI and better than or comparable to N-FI NDR; yet it has a computational complexity between on e and two orders of magnitude lower than N-FINDR. The chapter is structure d as follows. Section 19.2 describes the fundamentals of the proposed method. Section 19.3 and Section 19.4 evaluate the proposed algorithm using simulated and real data, respectively. Section 19.5 presents some concluding remarks.
Resumo:
The importance of wind power energy for energy and environmental policies has been growing in past recent years. However, because of its random nature over time, the wind generation cannot be reliable dispatched and perfectly forecasted, becoming a challenge when integrating this production in power systems. In addition the wind energy has to cope with the diversity of production resulting from alternative wind power profiles located in different regions. In 2012, Portugal presented a cumulative installed capacity distributed over 223 wind farms [1]. In this work the circular data statistical methods are used to analyze and compare alternative spatial wind generation profiles. Variables indicating extreme situations are analyzed. The hour (s) of the day where the farm production attains its maximum daily production is considered. This variable was converted into circular variable, and the use of circular statistics enables to identify the daily hour distribution for different wind production profiles. This methodology was applied to a real case, considering data from the Portuguese power system regarding the year 2012 with a 15-minutes interval. Six geographical locations were considered, representing different wind generation profiles in the Portuguese system.In this work the circular data statistical methods are used to analyze and compare alternative spatial wind generation profiles. Variables indicating extreme situations are analyzed. The hour (s) of the day where the farm production attains its maximum daily production is considered. This variable was converted into circular variable, and the use of circular statistics enables to identify the daily hour distribution for different wind production profiles. This methodology was applied to a real case, considering data from the Portuguese power system regarding the year 2012 with a 15-minutes interval. Six geographical locations were considered, representing different wind generation profiles in the Portuguese system.
Resumo:
Spent coffee grounds (SCG) represent a high-volume food waste worldwide, and several reuse approaches have been attempted. Herein, a greenhouse field experiment was carried out by cultivating Batavia lettuce with 5%, 10%, 15%, 20% and 30% (v/v) espresso SCG directly composted in the soil. Healthy vegetables were obtained for all treatments, without yield loss for up to 10% SCG. A progressive increment of green color intensity with increasing SCG content was observed, corroborated by the increase of their photosynthetic pigments (chlorophylls and carotenoids). Furthermore, total ascorbic acid and tocopherols showed statistical significant increases (p < 0.001) between control and all tested groups. Marked variations of nutritionally relevant minerals, particularly potassium, phosphorous and sodium were also revealed at higher percentage treatments (20% and 30%). This approach constitutes a clean, direct, simple and cost-effective measure to produce value-added vegetables, while reducing food waste disposal.
Resumo:
Dissertação para obtenção do Grau de Mestre em Engenharia Electrotécnica e de Computadores
Resumo:
Dissertation submitted in partial fulfillment of the requirements for the Degree of Master of Science in Geospatial Technologies.
Resumo:
BACKGROUND: Geographical differences in asthma prevalence are currently accepted, but evidence is sparse due to the lack of multicentre studies using the same protocol. OBJECTIVES: To compare the prevalence of asthma and atopy among schoolchildren from Portuguese speaking countries (ISAAC and Portuguese Study) and evaluate some environmental variables, such as house dust mite exposure. MATERIAL AND METHODS: Significant random samples of schoolchildren studied with standard validated methods--questionnaires, skin prick tests, methacholine bronchial challenge tests; dust bed sampling for analysis of mite antigens. RESULTS: In the ISAAC study, in the 13-14 year-old age group, statistical significant differences were found, with higher wheezing prevalence in Brazil than in Portugal (two-fold). In the Portuguese Study, atopy prevalence ranged between 6.0 and 11.9% in Sal and S. Vicente (Cape Verde), up to 48.6 and 54.1% in Macau and Madeira. Active asthma had the higher values in Madeira (14.6%), and the lower in Macau (1.3%). Cape Verde had intermediate asthma prevalence (10.6 and 7.0%). The bronchial challenge test was positive in 25, 66 and 70% of asthmatic children from Sal, S. Vicente and Madeira respectively. Significant HDM antigen concentrations (Der p1) were found in Cape Verde and Madeira. CONCLUSIONS: There are significant variations in asthma and atopy prevalence between these pediatric populations. The reasons remain under discussion, but genetics linked to race, seem to play a central role, modulated by environmental and lifestyle variables.