26 results for Input-output data


Relevance: 30.00%

Publisher:

Abstract:

SUMMARY: Large sets of data, such as expression profiles from many samples, require analytic tools to reduce their complexity. The Iterative Signature Algorithm (ISA) is a biclustering algorithm. It was designed to decompose a large set of data into so-called 'modules'. In the context of gene expression data, these modules consist of subsets of genes that exhibit a coherent expression profile only over a subset of microarray experiments. Genes and arrays may be attributed to multiple modules and the level of required coherence can be varied resulting in different 'resolutions' of the modular mapping. In this short note, we introduce two BioConductor software packages written in GNU R: The isa2 package includes an optimized implementation of the ISA and the eisa package provides a convenient interface to run the ISA, visualize its output and put the biclusters into biological context. Potential users of these packages are all R and BioConductor users dealing with tabular (e.g. gene expression) data. AVAILABILITY: http://www.unil.ch/cbg/ISA CONTACT: sven.bergmann@unil.ch
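Although the isa2 package implements the ISA in R, the core fixed-point iteration is easy to sketch. Below is a minimal, illustrative NumPy version (not the package's actual code): the matrix, thresholds and seed are all invented, and only up-regulated modules are searched.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy expression matrix with one planted module:
# genes 0-9 are up-regulated in conditions 0-4.
E = rng.normal(size=(100, 30))
E[:10, :5] += 8.0

# ISA scores genes and conditions on differently normalised copies of
# the data: one standardised per gene (row), one per condition (column).
Eg = (E - E.mean(axis=1, keepdims=True)) / E.std(axis=1, keepdims=True)
Ec = (E - E.mean(axis=0, keepdims=True)) / E.std(axis=0, keepdims=True)

def isa_module(seed_genes, t_g=2.0, t_c=2.0, n_iter=100):
    """One ISA fixed-point search: alternately score conditions over the
    current gene set and genes over the current condition set, keeping
    only entries above a threshold (up-regulated modules only here)."""
    genes = np.zeros(E.shape[0], dtype=bool)
    genes[seed_genes] = True
    for _ in range(n_iter):
        c_score = Ec[genes].mean(axis=0)            # score conditions
        conds = c_score > t_c * c_score.std()
        g_score = Eg[:, conds].mean(axis=1)         # score genes
        new_genes = g_score > t_g * g_score.std()
        if np.array_equal(new_genes, genes):        # fixed point reached
            break
        genes = new_genes
    return np.flatnonzero(genes), np.flatnonzero(conds)

genes, conds = isa_module(seed_genes=[0, 1, 2])
```

Varying the gene and condition thresholds changes the required coherence and hence the 'resolution' of the modular decomposition described above.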

Relevance: 30.00%

Publisher:

Abstract:

Distribution of socio-economic features in urban space is an important source of information for land and transportation planning. The metropolization phenomenon has changed the distribution of professions in space and has given rise to different spatial patterns that the urban planner must know in order to plan a sustainable city. Such distributions can be discovered by statistical and learning algorithms through different methods. In this paper, an unsupervised classification method and a cluster detection method are discussed and applied to analyze the socio-economic structure of Switzerland. The unsupervised classification method, based on Ward's classification and self-organized maps, is used to classify the municipalities of the country and makes it possible to reduce high-dimensional input information in order to interpret the socio-economic landscape. The cluster detection method, the spatial scan statistic, is used in a more specific manner to detect hot spots of certain types of service activities. The method is applied to distribution services in the agglomeration of Lausanne. Results show the emergence of new centralities and can be analyzed in both transportation and social terms.
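As an illustration of the first method, Ward's minimum-variance clustering can be run on a synthetic stand-in for the municipality data (the real study combines it with self-organized maps, which are not sketched here); all data below are invented.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(1)

# Hypothetical socio-economic profiles (4 indicators) for 60
# "municipalities" drawn from three latent groups.
X = np.vstack([
    rng.normal(loc=m, scale=0.5, size=(20, 4))
    for m in ([0, 0, 0, 0], [3, 3, 0, 0], [0, 0, 3, 3])
])

# Ward's minimum-variance agglomeration, then cut into 3 classes.
Z = linkage(X, method="ward")
labels = fcluster(Z, t=3, criterion="maxclust")
```

Cutting the same dendrogram at different heights yields coarser or finer typologies of municipalities without re-running the clustering.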

Relevance: 30.00%

Publisher:

Abstract:

This study examined the validity and reliability of a sequential "Run-Bike-Run" test (RBR) in age-group triathletes. Eight Olympic distance (OD) specialists (age 30.0 ± 2.0 years, mass 75.6 ± 1.6 kg, run VO2max 63.8 ± 1.9 ml· kg(-1)· min(-1), cycle VO2peak 56.7 ± 5.1 ml· kg(-1)· min(-1)) performed four trials over 10 days. Trial 1 (TRVO2max) was an incremental treadmill running test. Trials 2 and 3 (RBR1 and RBR2) involved: 1) a 7-min run at 15 km· h(-1) (R1) plus a 1-min transition to 2) cycling to fatigue (2 W· kg(-1) body mass, then 30 W more each 3 min); 3) 10-min cycling at 3 W· kg(-1) (Bsubmax); another 1-min transition; and 4) a second 7-min run at 15 km· h(-1) (R2). Trial 4 (TT) was a 30-min cycle plus 20-min run time trial. No significant differences in absolute oxygen uptake (VO2), heart rate (HR), or blood lactate concentration ([BLA]) were evidenced between RBR1 and RBR2. For all measured physiological variables, the limits of agreement were similar, and the mean differences were physiologically unimportant, between trials. Low levels of test-retest error (i.e. ICC >0.8, CV <10%) were observed for most (logged) measurements. However, [BLA] post R1 (ICC 0.87, CV 25.1%), [BLA] post Bsubmax (ICC 0.99, CV 16.3%) and [BLA] post R2 (ICC 0.51, CV 22.9%) were least reliable. These error ranges may help coaches detect real changes in training status over time. Moreover, RBR test variables can be used to predict discipline-specific and overall TT performance. Cycle VO2peak, cycle peak power output, and the change between R1 and R2 (deltaR1R2) in [BLA] were most highly related to overall TT distance (r = 0.89, p < 0.01; r = 0.94, p < 0.02; r = 0.86, p < 0.05, respectively). The percentage of TR VO2max at 15 km· h(-1), and deltaR1R2 HR, were also related to run TT distance (r = -0.83 and 0.86, both p < 0.05).
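The reliability statistics used here (mean bias, 95% limits of agreement, typical error and CV between two repeated trials) take only a few lines to compute. The paired values below are invented for illustration and are not the study's data.

```python
import numpy as np

# Hypothetical paired VO2 measurements (L/min) from two identical
# run-bike-run trials in eight athletes.
rbr1 = np.array([4.1, 4.5, 4.3, 4.8, 4.0, 4.6, 4.4, 4.2])
rbr2 = np.array([4.2, 4.4, 4.4, 4.7, 4.1, 4.5, 4.5, 4.1])

diff = rbr2 - rbr1
bias = diff.mean()                                   # mean difference
sd = diff.std(ddof=1)
loa = (bias - 1.96 * sd, bias + 1.96 * sd)           # limits of agreement

# Typical error and coefficient of variation between trials.
te = sd / np.sqrt(2)
cv = 100 * te / np.concatenate([rbr1, rbr2]).mean()
```

Log-transforming the raw values first (as the study does) makes the CV multiplicative rather than additive, which suits variables such as blood lactate.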

Relevance: 30.00%

Publisher:

Abstract:

In recent years there has been explosive growth in the development of adaptive and data-driven methods. One efficient data-driven approach is based on statistical learning theory (Vapnik 1998). The theory rests on the Structural Risk Minimisation (SRM) principle and has a solid statistical background. When applying SRM we try not only to reduce the training error (to fit the available data with a model) but also to reduce the complexity of the model and thereby the generalisation error. Many nonlinear learning procedures recently developed in neural networks and statistics can be understood and interpreted in terms of the structural risk minimisation inductive principle. A recent methodology based on SRM is Support Vector Machines (SVM). At present SLT is still under intensive development and SVM are finding new areas of application (www.kernel-machines.org). SVM produce robust and nonlinear data models with excellent generalisation abilities, which is very important for both monitoring and forecasting. SVM are extremely good when the input space is high-dimensional and the training data set is not big enough to develop a corresponding nonlinear model. Moreover, SVM use only support vectors to derive decision boundaries. This opens a way to sampling optimisation, estimation of noise in data, quantification of data redundancy, etc. A presentation of SVM for spatially distributed data is given in (Kanevski and Maignan 2004).
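A minimal sketch of the support-vector idea, using scikit-learn on an invented high-dimensional toy problem: only a subset of the training samples (the support vectors) ends up defining the decision boundary.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Small, high-dimensional toy set: 40 samples, 20 features, with the
# classes separated along the first feature only.
X = rng.normal(size=(40, 20))
y = (X[:, 0] > 0).astype(int)
X[y == 1, 0] += 1.0            # widen the margin between the classes

clf = SVC(kernel="linear", C=1.0).fit(X, y)

# Only the support vectors determine the decision boundary; samples
# far from the margin receive zero weight in the solution.
n_sv = len(clf.support_)
```

The fact that the boundary depends only on the support vectors is what opens the door to the sampling optimisation and redundancy quantification mentioned above.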

Relevance: 30.00%

Publisher:

Abstract:

This contribution introduces Data Envelopment Analysis (DEA), a performance measurement technique. DEA helps decision makers for the following reasons: (1) by calculating an efficiency score, it indicates whether a firm is efficient or has capacity for improvement; (2) by setting target values for input and output, it calculates how much input must be decreased or output increased for the firm to become efficient; (3) by identifying the nature of returns to scale, it indicates whether a firm has to decrease or increase its scale (or size) in order to minimise average total cost; (4) by identifying a set of benchmarks, it specifies which other firms' processes a firm should analyse in order to improve its own practices. This contribution presents the essentials of DEA, alongside a case study that builds an intuitive understanding of its application. It also introduces Win4DEAP, a software package that conducts efficiency analysis based on the DEA methodology. The methodological background of DEA is presented for more demanding readers. Finally, four advanced topics of DEA are treated: adjustment to the environment, preferences, sensitivity analysis and time series data.
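The efficiency score in point (1) comes from solving one linear program per firm. A minimal input-oriented, constant-returns-to-scale (CCR) sketch with invented single-input, single-output data, using scipy.optimize.linprog:

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical data: 4 firms, one input, one output.
X = np.array([[2.0], [4.0], [6.0], [8.0]])   # inputs
Y = np.array([[2.0], [6.0], [6.0], [8.0]])   # outputs

def ccr_efficiency(o):
    """Input-oriented CCR score of firm o: minimise theta subject to
    sum_j lam_j * X_j <= theta * X_o and sum_j lam_j * Y_j >= Y_o."""
    n, m = X.shape
    s = Y.shape[1]
    c = np.r_[1.0, np.zeros(n)]                # variables: [theta, lam]
    A_in = np.c_[-X[o].reshape(m, 1), X.T]     # lam.X - theta*X_o <= 0
    A_out = np.c_[np.zeros((s, 1)), -Y.T]      # -lam.Y <= -Y_o
    A = np.vstack([A_in, A_out])
    b = np.r_[np.zeros(m), -Y[o]]
    res = linprog(c, A_ub=A, b_ub=b,
                  bounds=[(None, None)] + [(0, None)] * n)
    return res.x[0]

scores = [round(ccr_efficiency(o), 3) for o in range(4)]
```

Here firm 1 attains the best output/input ratio and is scored 1.0; the inefficient firms could reach the frontier by scaling their input down to two-thirds, which is exactly the target-setting use described in point (2).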

Relevance: 30.00%

Publisher:

Abstract:

BACKGROUND: Finding genes that are differentially expressed between conditions is an integral part of understanding the molecular basis of phenotypic variation. In the past decades, DNA microarrays have been used extensively to quantify the abundance of mRNA corresponding to different genes, and more recently high-throughput sequencing of cDNA (RNA-seq) has emerged as a powerful competitor. As the cost of sequencing decreases, it is conceivable that the use of RNA-seq for differential expression analysis will increase rapidly. To exploit the possibilities and address the challenges posed by this relatively new type of data, a number of software packages have been developed especially for differential expression analysis of RNA-seq data. RESULTS: We conducted an extensive comparison of eleven methods for differential expression analysis of RNA-seq data. All methods are freely available within the R framework and take as input a matrix of counts, i.e. the number of reads mapping to each genomic feature of interest in each of a number of samples. We evaluate the methods based on both simulated data and real RNA-seq data. CONCLUSIONS: Very small sample sizes, which are still common in RNA-seq experiments, impose problems for all evaluated methods and any results obtained under such conditions should be interpreted with caution. For larger sample sizes, the methods combining a variance-stabilizing transformation with the 'limma' method for differential expression analysis perform well under many different conditions, as does the nonparametric SAMseq method.
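A toy count-matrix analysis can illustrate the general transform-then-test pattern the comparison evaluates; note that this uses a plain per-gene t-test on log-CPM values as a stand-in, not limma's moderated statistics, and all counts are simulated.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Toy count matrix: 200 genes x 6 samples (3 per condition);
# the first 10 genes are 4-fold up in condition B.
counts = rng.poisson(lam=50, size=(200, 6))
counts[:10, 3:] *= 4

# Variance-stabilising-style transform: log2 counts per million,
# with a small offset to avoid log of zero.
lib = counts.sum(axis=0)
logcpm = np.log2(1e6 * (counts + 0.5) / lib)

# Per-gene two-sample t-test between the two conditions.
t, p = stats.ttest_ind(logcpm[:, :3], logcpm[:, 3:], axis=1)
de = np.flatnonzero(p < 0.01)
```

With only three replicates per group, the variance estimates behind each t-test are unstable, which is precisely why the compared methods either moderate the variance (limma) or avoid parametric assumptions (SAMseq).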

Relevance: 30.00%

Publisher:

Abstract:

Measuring school efficiency is a challenging task. First, a performance measurement technique has to be selected. Within Data Envelopment Analysis (DEA), one such technique, alternative models have been developed in order to deal with environmental variables. The majority of these models lead to diverging results. Second, the choice of input and output variables to be included in the efficiency analysis is often dictated by data availability. The choice of the variables remains an issue even when data is available. As a result, the choice of technique, model and variables is probably, and ultimately, a political judgement. Multi-criteria decision analysis methods can help decision makers to select the most suitable model. The number of selection criteria should remain parsimonious and not be oriented towards the results of the models, in order to avoid opportunistic behaviour. The selection criteria should also be backed by the literature or by an expert group. Once the most suitable model is identified, the principle of permanence of methods should be applied in order to avoid a change of practices over time. Within DEA, the two-stage model developed by Ray (1991) is the most convincing model allowing for an environmental adjustment. In this model, an efficiency analysis is conducted with DEA, followed by an econometric analysis to explain the efficiency scores. An environmental variable of particular interest, tested in this thesis, is the fact that certain schools operate on multiple sites. Results show that being located on more than one site has a negative influence on efficiency. A likely way to mitigate this negative influence would be to improve the use of ICT in school management and teaching. The planning of new schools should also consider the advantages of a single site, which allows a critical size in terms of pupils and teachers to be reached.
The fact that underprivileged pupils perform worse than privileged pupils has been public knowledge since Coleman et al. (1966). As a result, underprivileged pupils have a negative influence on school efficiency. This is confirmed by this thesis for the first time in Switzerland. Several countries have developed priority education policies in order to compensate for the negative impact of disadvantaged socioeconomic status on school performance. These policies have failed. As a result, other actions need to be taken. In order to define these actions, one has to identify the social-class differences which explain why disadvantaged children underperform. Childrearing and literary practices, health characteristics, housing stability and economic security influence pupil achievement. Rather than allocating more resources to schools, policymakers should therefore focus on related social policies. For instance, they could define pre-school, family, health, housing and benefits policies in order to improve the conditions for disadvantaged children.
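Ray's two-stage idea (DEA scores first, then an econometric second stage explaining them) can be sketched with an invented environmental dummy for multi-site operation; here the "efficiency scores" are simulated rather than computed by DEA, purely to show the second-stage regression.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical second stage: regress efficiency scores on an
# environmental dummy ("school operates on several sites").
n = 80
multi_site = rng.integers(0, 2, size=n)
scores = np.clip(0.9 - 0.1 * multi_site + rng.normal(0, 0.05, n), 0, 1)

# Ordinary least squares via the normal equations (lstsq).
X = np.c_[np.ones(n), multi_site]
beta, *_ = np.linalg.lstsq(X, scores, rcond=None)
# beta[1] estimates the efficiency penalty of multi-site operation.
```

A negative and significant coefficient on the dummy is the pattern the thesis reports: operating on more than one site depresses measured efficiency once the DEA scores are put through the econometric stage.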

Relevance: 30.00%

Publisher:

Abstract:

Abstract: This work is concerned with the development and application of novel unsupervised learning methods, having in mind two target applications: the analysis of forensic case data and the classification of remote sensing images. First, a method based on a symbolic optimization of the inter-sample distance measure is proposed to improve the flexibility of spectral clustering algorithms, and applied to the problem of forensic case data. This distance is optimized using a loss function related to the preservation of neighborhood structure between the input space and the space of principal components, and solutions are found using genetic programming. Results are compared to a variety of state-of-the-art clustering algorithms. Subsequently, a new large-scale clustering method based on a joint optimization of feature extraction and classification is proposed and applied to various databases, including two hyperspectral remote sensing images. The algorithm makes use of a functional model (e.g., a neural network) for clustering, which is trained by stochastic gradient descent. Results indicate that such a technique can easily scale to huge databases, can avoid the so-called out-of-sample problem, and can compete with or even outperform existing clustering algorithms on both artificial data and real remote sensing images. This is verified on small databases as well as very large problems.
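A minimal sketch of the spectral-clustering setting, using scikit-learn with a precomputed affinity matrix: in the thesis the inter-sample distance is itself optimised by genetic programming, whereas here a plain Euclidean-based RBF affinity and invented 2-D data stand in for it.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

rng = np.random.default_rng(3)

# Two well-separated Gaussian blobs standing in for case profiles.
A = rng.normal([0, 0], 0.3, size=(30, 2))
B = rng.normal([3, 3], 0.3, size=(30, 2))
X = np.vstack([A, B])

def affinity(X, gamma=1.0):
    """RBF affinity over a plain Euclidean distance; in the thesis
    this distance is replaced by a symbolically optimised one."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

labels = SpectralClustering(
    n_clusters=2, affinity="precomputed", random_state=0,
).fit_predict(affinity(X))
```

Because the affinity matrix is passed in precomputed, any learned or symbolic distance can be swapped in without touching the spectral machinery, which is the flexibility the first method exploits.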

Relevance: 30.00%

Publisher:

Abstract:

Isotopic analyses on bulk carbonates are considered a useful tool for palaeoclimatic reconstruction, assuming that calcite precipitation occurs at oxygen isotope equilibrium with local water and that detrital carbonate input is absent or insignificant. We present results from Lake Neuchatel (western Switzerland) that demonstrate equilibrium precipitation of calcite, except during high-productivity periods, and the presence of detrital and resuspended calcite. Mineralogy, geochemistry and stable isotope values of Lake Neuchatel trap sediments and of suspended matter from adjacent rivers were studied. The mineralogy of suspended matter in the major inflowing rivers documents an important contribution of detrital carbonates, predominantly calcite with minor amounts of dolomite and ankerite. Using mineralogical data, the quantity of allochthonous calcite can be estimated by comparing the ratio (ankerite + dolomite)/(calcite + ankerite + dolomite) in the inflowing rivers and in the traps. Material taken from sediment traps shows an evolution from practically pure endogenic calcite in summer (10-20% detrital material) to higher percentages of detrital material in winter (up to 20-40%). Reflecting these mineralogical variations, delta(13)C and delta(18)O values of calcite from sediment traps are more negative in summer than in winter. Since no significant variations in the isotopic composition of lake water were detected over one year, the factors controlling the oxygen isotopic composition of calcite in sediment traps are the precipitation temperature and the percentage of resuspended and detrital calcite. Samples taken close to the river inflow generally have higher delta values than the others, confirming the detrital influence. SEM and isotopic studies on different size fractions (<2, 2-6, 6-20, 20-60, >60 mu m) of winter and summer samples allowed the recognition of resuspension and the separation of new endogenic calcite from detrital calcite.
Fractions >60 and <2 mu m have the highest percentage of detritus, while fractions 2-6 and 6-20 mu m are typical of the new endogenic calcite in summer, as given by calculations assuming isotopic equilibrium with local water. In winter these fractions show values similar to those in summer, indicating resuspension. Using the isotopic composition of sediment trap material and of the different size fractions, as well as the isotopic composition of lake water, the water temperature measurements and the mineralogy, we re-evaluated the potential of bulk carbonate for palaeoclimatic reconstruction in the presence of detrital and resuspended calcite. This re-evaluation leads to the following conclusions: (1) the endogenic signal can be amplified by applying a particle-size separation, once the size of the endogenic calcite is known from SEM study; (2) resuspended calcite does not alter the endogenic signal, but it lowers the time resolution; (3) detrital input decreases at increasing distances from the source, and it modifies the isotopic signal only when very abundant; (4) the influence of detrital calcite on the bulk sediment isotopic composition can be calculated. (C) 1998 Elsevier Science B.V. All rights reserved.
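The mineralogical estimate of the detrital share follows from comparing the (ankerite + dolomite)/(total carbonate) ratio in traps and rivers, since dolomite and ankerite are assumed to be exclusively detrital. A small sketch with invented weight fractions (not the paper's measurements):

```python
def detrital_fraction(trap, river):
    """Share of detrital carbonate in trap sediment, assuming dolomite
    and ankerite are exclusively detrital and that the river's
    (ankerite + dolomite)/(total carbonate) ratio also characterises
    the detrital material delivered to the lake."""
    r_trap = (trap["ank"] + trap["dol"]) / (
        trap["cal"] + trap["ank"] + trap["dol"])
    r_river = (river["ank"] + river["dol"]) / (
        river["cal"] + river["ank"] + river["dol"])
    return r_trap / r_river

# Hypothetical carbonate compositions (weight %), chosen to mimic the
# reported summer (10-20%) vs winter (20-40%) detrital range.
summer = {"cal": 95.0, "ank": 1.0, "dol": 1.5}
winter = {"cal": 85.0, "ank": 3.0, "dol": 5.0}
river = {"cal": 75.0, "ank": 10.0, "dol": 15.0}
```

The same two-endmember logic underlies conclusion (4): once the detrital share is known, its contribution to the bulk isotopic signal can be subtracted.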

Relevance: 30.00%

Publisher:

Abstract:

In French the adjective petit 'small, little' has a special status: it fulfills various pragmatic functions in addition to semantic meanings and is thus highly frequent in discourse. This study, based on the data of two children aged 1;6 to 2;11, argues that petit and its pragmatic meanings play a specific role in the acquisition of French adjectives. In contrast to what is expected in child language, petit favours the early development of a noun-phrase pattern with a prenominal attributive adjective. The emergence and distribution of petit in the children's production is examined and related to its distribution in the input, and the detailed pragmatic meanings and functions of petit are analysed. Prenominal petit emerges early as the preferred and most productive adjective. Pragmatic meanings of petit appear to be predominant at this early age and are of two main types: expressions of endearment (in noun phrases) and mitigating devices whose scope is the entire utterance. These results, as well as instances of children's pragmatic overgeneralizations, provide new evidence that at least some pragmatic meanings are prior to semantic meanings in early acquisition.

Relevance: 30.00%

Publisher:

Abstract:

INTRODUCTION: Handwriting is a modality of language production whose cerebral substrates remain poorly known although the existence of specific regions is postulated. The description of brain damaged patients with agraphia and, more recently, several neuroimaging studies suggest the involvement of different brain regions. However, results vary with the methodological choices made and may not always discriminate between "writing-specific" and motor or linguistic processes shared with other abilities. METHODS: We used the "Activation Likelihood Estimate" (ALE) meta-analytical method to identify the cerebral network of areas commonly activated during handwriting in 18 neuroimaging studies published in the literature. Included contrasts were also classified according to the control tasks used, whether non-specific motor/output-control or linguistic/input-control. These data were included in two secondary meta-analyses in order to reveal the functional role of the different areas of this network. RESULTS: An extensive, mainly left-hemisphere network of 12 cortical and sub-cortical areas was obtained; three of which were considered as primarily writing-specific (left superior frontal sulcus/middle frontal gyrus area, left intraparietal sulcus/superior parietal area, right cerebellum) while others related rather to non-specific motor (primary motor and sensorimotor cortex, supplementary motor area, thalamus and putamen) or linguistic processes (ventral premotor cortex, posterior/inferior temporal cortex). CONCLUSIONS: This meta-analysis provides a description of the cerebral network of handwriting as revealed by various types of neuroimaging experiments and confirms the crucial involvement of the left frontal and superior parietal regions. These findings provide new insights into cognitive processes involved in handwriting and their cerebral substrates.
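The ALE principle itself is compact: each reported focus is modelled as a Gaussian probability of activation, the per-study maps are combined, and the ALE value at each voxel is the probabilistic union across studies. A deliberately simplified 1-D sketch with invented foci (real ALE works on 3-D coordinates with sample-size-dependent kernels):

```python
import numpy as np

# Minimal ALE-style computation on a 1-D "brain" of 100 voxels.
x = np.arange(100)

def ma_map(foci, fwhm=8.0):
    """Modelled-activation map for one study: each focus is blurred
    with a Gaussian kernel; the voxel-wise maximum over foci is a
    common simplification of the per-study combination."""
    sigma = fwhm / 2.3548
    maps = np.exp(-(x[None, :] - np.array(foci)[:, None]) ** 2
                  / (2 * sigma ** 2))
    return maps.max(axis=0)

# Three hypothetical studies reporting nearby foci around voxels
# ~31 and ~70.
studies = [[30, 70], [32], [31, 69]]
ma = np.stack([ma_map(f) for f in studies])

# ALE value: probabilistic union of the study-wise MA maps.
ale = 1.0 - np.prod(1.0 - ma, axis=0)
```

Voxels where foci from several studies converge receive ALE values near 1, which is what the permutation-based thresholding in the real method then tests against chance convergence.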