938 results for k-means clustering


Relevance:

30.00%

Publisher:

Abstract:

The aim of this work is to develop entrepreneurship on the technology side of universities. This is done through research, development and innovation projects carried out in collaboration with SMEs. Such collaboration supports and enables the growth of SMEs, the internationalization of higher education, and the employability of students in SMEs. A further aim is to create new startup companies through cooperation between SMEs and universities, and in particular to help companies find a successor generation by improving interaction between the parties. Creating new growth-oriented entrepreneurial businesses with the help of SMEs is expected to support business and open up more opportunities. Portfolio entrepreneurship is a form of company growth in which the size of the company itself does not change much. It is an alternative for entrepreneurs who have the opportunity to expand but prefer to keep the company as a family business. Expansion can take varied forms if the SMEs have a common interest in it. Joint projects between SMEs and universities are seen to significantly affect the wellbeing of society as a whole, even at the international level. Cooperation is also considered to affect the quality of learning and teachers' professional development. Higher education supports the change toward innovation when it is done with SMEs. The research material was collected by interviewing experts who have already distinguished themselves in their fields. Problem-focused interviews, as a method, showed how university students and SMEs can proceed and promote themselves in the industrial sector.

Relevance:

30.00%

Publisher:

Abstract:

Prediction of variety composite means was shown to be feasible without diallel crossing the parental varieties. Thus, the predicted mean for a quantitative trait of a composite is given by $Y_k = a_1 \sum V_j + a_2 \sum T_j + a_3 \bar{V} - a_4 \bar{T}$, with coefficients $a_1 = (n - 2k)/[k^2(n - 2)]$, $a_2 = 2n(k - 1)/[k^2(n - 2)]$, $a_3 = n(k - 1)/[k(n - 1)(n - 2)]$ and $a_4 = n^2(k - 1)/[k(n - 1)(n - 2)]$; summation is for $j = 1$ to $k$, where $k$ is the size of the composite (number of parental varieties of a particular composite) and $n$ is the total number of parent varieties. $V_j$ is the mean of varieties and $T_j$ is the mean of topcrosses (pool of varieties as tester), and $\bar{V}$ and $\bar{T}$ are the respective average values in the whole set. Yield data from a 7 x 7 variety diallel cross were used for the variety means and for the "simulated" topcross means to illustrate the proposed procedure. The proposed prediction procedure was as effective as the prediction based on $Y_k = \bar{H} - (\bar{H} - \bar{V})/k$, where $\bar{H}$ and $\bar{V}$ refer to the mean of hybrids (F1) and parental varieties, respectively, in a variety diallel cross. It was also shown in the analysis of variance that the total sum of squares due to treatments (varieties and topcrosses) can be orthogonally partitioned following the reduced model $Y_{jj'} = \mu + \tfrac{1}{2}(v_j + v_{j'}) + \bar{h} + h_j + h_{j'}$, thus making possible an F test for varieties, average heterosis and variety heterosis. Least squares estimates of these effects are also given.
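The prediction formula in this abstract can be turned into a small worked example. The sketch below is illustrative only: it assumes that the two symbols lost from the formula are the whole-set averages of the variety means and of the topcross means, multiplied by a3 and a4 respectively. The function name and the sample numbers are hypothetical, not taken from the paper.

```python
from typing import Sequence


def predict_composite_mean(
    variety_means: Sequence[float],
    topcross_means: Sequence[float],
    composite: Sequence[int],
) -> float:
    """Predicted mean Y_k of the composite built from the varieties in `composite`.

    `composite` holds 0-based indices into the full set of n parent varieties.
    Assumption: a3 multiplies the overall variety mean and a4 the overall
    topcross mean (these symbols are missing from the abstract as extracted).
    """
    n = len(variety_means)
    k = len(composite)
    if n < 3 or len(topcross_means) != n:
        raise ValueError("need at least 3 varieties and one topcross mean per variety")
    if not 1 <= k <= n or len(set(composite)) != k:
        raise ValueError("composite must list between 1 and n distinct varieties")

    # Coefficients exactly as given in the abstract.
    a1 = (n - 2 * k) / (k**2 * (n - 2))
    a2 = 2 * n * (k - 1) / (k**2 * (n - 2))
    a3 = n * (k - 1) / (k * (n - 1) * (n - 2))
    a4 = n**2 * (k - 1) / (k * (n - 1) * (n - 2))

    sum_v = sum(variety_means[j] for j in composite)   # sum of V_j over the composite
    sum_t = sum(topcross_means[j] for j in composite)  # sum of T_j over the composite
    v_bar = sum(variety_means) / n                     # average over the whole set
    t_bar = sum(topcross_means) / n
    return a1 * sum_v + a2 * sum_t + a3 * v_bar - a4 * t_bar


# Hypothetical yields for n = 4 varieties and their topcrosses.
V = [5.0, 6.0, 7.0, 8.0]
T = [6.5, 7.0, 7.5, 8.0]
print(predict_composite_mean(V, T, [0, 2]))
```

Two limiting cases give a quick sanity check on the coefficients: for k = 1 the prediction reduces to the variety's own mean, and for k = n it reduces to the average topcross mean.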

Relevance:

30.00%

Publisher:

Abstract:

Viewed naively, the process of evolution is a succession of duplication events and gradual mutations in the genome that lead to changes in the functions and interactions of the proteome. The family of Ras-like guanosine triphosphate hydrolases (GTPases) is a good working model for understanding this fundamental phenomenon, because this protein family contains a limited number of members that differ in functionality and interactions. Overall, we wish to understand how single mutations in GTPases affect cell morphology, and the extent of their impact on asynchronous populations. My master's work aims to classify, in a meaningful way, different phenotypes of the yeast Saccharomyces cerevisiae by analysing several morphological criteria of strains expressing mutated and native GTPases. Our approach, based on microscopy and bioinformatic analysis of DIC (differential interference contrast microscopy) images, makes it possible to distinguish the phenotypes specific to native cells and to mutants. Using this method allowed automated detection and characterization of the mutant phenotypes associated with over-expression of constitutively active GTPases. The constitutively active GTPase mutants Cdc42 Q61L, Rho5 Q91H, Ras1 Q68L and Rsr1 G12V were successfully analysed. Indeed, implementing different clustering algorithms makes it possible to analyse data that combine morphological measurements of native and mutant populations. Our results show that the Fuzzy C-Means algorithm partitions native or mutant cells effectively, with the different cell types classified according to several cell shape factors obtained from the DIC images.
This analysis shows that the mutations Cdc42 Q61L, Rho5 Q91H, Ras1 Q68L and Rsr1 G12V induce amorphous, elongated, round and large phenotypes, respectively, which are represented by distinct shape-factor vectors. These distinctions are observed in different proportions (mutant morphology / native morphology) in the mutant populations. The development of new automated methods for morphological analysis of native and mutant cells proves extremely useful for studying the GTPase family as well as the specific residues that dictate their functions and interaction network. We can now consider producing GTPase mutants that invert their function by targeting divergent residues. The functional substitution is then detected at the morphological level using our new quantitative strategy. This type of analysis can also be transposed to other protein families and contribute significantly to the field of evolutionary biology.
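The Fuzzy C-Means partitioning named in this abstract can be illustrated with a minimal standard-library sketch. It is a generic textbook version of the algorithm, not the thesis's implementation. The function names, the fuzzifier default m = 2, the caller-supplied initial centers and the sample shape-factor values are all assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple

Point = Sequence[float]


def _memberships(point: Point, centers: Sequence[Point], m: float) -> List[float]:
    """Membership of one point in every cluster; the values sum to 1."""
    dists = [math.dist(point, c) for c in centers]
    zeros = [i for i, d in enumerate(dists) if d == 0.0]
    if zeros:  # the point sits exactly on one or more centers
        return [1.0 / len(zeros) if i in zeros else 0.0 for i in range(len(centers))]
    power = 2.0 / (m - 1.0)
    inverse = [d ** -power for d in dists]
    total = sum(inverse)
    return [v / total for v in inverse]


def fuzzy_c_means(
    points: Sequence[Point],
    initial_centers: Sequence[Point],
    m: float = 2.0,        # fuzzifier, must be > 1
    max_iter: int = 100,
    tol: float = 1e-6,
) -> Tuple[List[List[float]], List[List[float]]]:
    """Return (centers, memberships) after alternating the two update steps."""
    centers = [list(c) for c in initial_centers]
    dims = len(points[0])
    for _ in range(max_iter):
        u = [_memberships(p, centers, m) for p in points]
        new_centers = []
        for i in range(len(centers)):
            weights = [row[i] ** m for row in u]
            total = sum(weights)
            if total == 0.0:  # no point belongs to this cluster: keep the old center
                new_centers.append(list(centers[i]))
                continue
            new_centers.append(
                [sum(w * p[d] for w, p in zip(weights, points)) / total for d in range(dims)]
            )
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, [_memberships(p, centers, m) for p in points]


# Hypothetical two-dimensional shape factors (e.g. elongation, roundness) for six cells.
cells = [(0.9, 0.2), (1.0, 0.25), (0.95, 0.3), (0.3, 0.9), (0.25, 1.0), (0.35, 0.95)]
centers, memberships = fuzzy_c_means(cells, [cells[0], cells[3]])
print(centers)
```

Unlike k-means, each cell receives a graded membership in every cluster, which suits mixed populations in which only a proportion of cells shows the mutant morphology.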

Relevance:

30.00%

Publisher:

Abstract:

The influence of the pseudopotential on both the structure and the self-diffusion of liquid rubidium at the melting point has been investigated by means of molecular-dynamics calculations. The model potential considered has been computed from the pseudopotential of Ashcroft, the dielectric function of Geldart and Vosko, and a Born-Mayer term. Four different values for the core radius which enters as input in the pseudopotential have been considered. In this way we have been able to observe and interpret the effect of this contribution on the properties of the liquid.

Relevance:

30.00%

Publisher:

Abstract:

An overview of known spatial clustering algorithms. The space of interest can be the two-dimensional abstraction of the surface of the earth or a man-made space like the layout of a VLSI design, a volume containing a model of the human brain, or another 3D space representing the arrangement of chains of protein molecules. The data consist of geometric information and can be either discrete or continuous. The explicit location and extension of spatial objects define implicit relations of spatial neighborhood (such as topological, distance and direction relations), which are used by spatial data mining algorithms. Therefore, spatial data mining algorithms are required for spatial characterization and spatial trend analysis. Spatial data mining, or knowledge discovery in spatial databases, differs from regular data mining in ways analogous to the differences between non-spatial data and spatial data. The attributes of a spatial object stored in a database may be affected by the attributes of the spatial neighbors of that object. In addition, spatial location, and implicit information about the location of an object, may be exactly the information that can be extracted through spatial data mining.

Relevance:

30.00%

Publisher:

Abstract:

Knowledge discovery in databases is the non-trivial process of identifying valid, novel, potentially useful and ultimately understandable patterns from data. The term data mining refers to the process that performs exploratory analysis on the data and builds a model from it. To infer patterns from data, data mining involves different approaches such as association rule mining, classification techniques or clustering techniques. Among the many data mining techniques, clustering plays a major role, since it helps to group related data for assessing properties and drawing conclusions. Most clustering algorithms act on a dataset with a uniform format, since the similarity or dissimilarity between data points is a significant factor in finding the clusters. If a dataset consists of mixed attributes, i.e. a combination of numerical and categorical variables, a preferred approach is to convert the different formats into a uniform format. The research study explores various techniques for converting mixed data sets to a numerical equivalent, so that statistical and similar algorithms can be applied to them. The results of clustering mixed category data after conversion to a numeric data type have been demonstrated using a crime data set. The thesis also proposes an extension to the well-known algorithm for handling mixed data types, to deal with data sets having only categorical data. The proposed conversion has been validated on a breast cancer data set. Moreover, another issue with the clustering process is the visualization of output. Different geometric techniques such as scatter plots or projection plots are available, but none of them displays the result for the whole database; they show only attribute-pairwise analyses.
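As an illustration of converting mixed attributes to a uniform numeric format, the sketch below min-max scales numeric columns and one-hot encodes categorical ones. This is one common conversion, shown as an assumption: the abstract does not say which techniques the thesis actually evaluates, and the function name and sample records are hypothetical.

```python
from typing import List, Sequence, Union

Value = Union[int, float, str]


def encode_mixed(rows: Sequence[Sequence[Value]]) -> List[List[float]]:
    """Convert rows of mixed numeric/categorical values to purely numeric rows.

    Numeric columns are min-max scaled to [0, 1]; every other column is
    one-hot encoded with one indicator per distinct level (sorted by name).
    """
    n_cols = len(rows[0])
    columns = [[row[c] for row in rows] for c in range(n_cols)]
    encoded_columns: List[List[List[float]]] = []
    for col in columns:
        if all(isinstance(v, (int, float)) for v in col):
            lo, hi = min(col), max(col)
            span = (hi - lo) or 1.0  # avoid division by zero for constant columns
            encoded_columns.append([[(v - lo) / span] for v in col])
        else:
            levels = sorted({str(v) for v in col})
            encoded_columns.append(
                [[1.0 if str(v) == level else 0.0 for level in levels] for v in col]
            )
    # Stitch the per-column encodings back together row by row.
    return [
        [x for col in encoded_columns for x in col[i]]
        for i in range(len(rows))
    ]


# Hypothetical records: (offender age, offence type).
records = [[10, "theft"], [20, "fraud"], [30, "theft"]]
for row in encode_mixed(records):
    print(row)
```

After this conversion every record is a point in a numeric space, so distance-based algorithms such as k-means can be applied directly. The relative weighting of scaled numeric columns and indicator columns remains a modelling choice.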

Relevance:

30.00%

Publisher:

Abstract:

Our essay aims at studying suitable statistical methods for the clustering of compositional data in situations where observations are constituted by trajectories of compositional data, that is, by sequences of composition measurements along a domain. Observed trajectories are known as “functional data” and several methods have been proposed for their analysis. In particular, methods for clustering functional data, known as Functional Cluster Analysis (FCA), have been applied by practitioners and scientists in many fields. To our knowledge, FCA techniques have not been extended to cope with the problem of clustering compositional data trajectories. In order to extend FCA techniques to the analysis of compositional data, FCA clustering techniques have to be adapted by using a suitable compositional algebra. The present work centres on the following question: given a sample of compositional data trajectories, how can we formulate a segmentation procedure giving homogeneous classes? To address this problem we follow the steps described below. First of all we adapt the well-known spline smoothing techniques in order to cope with the smoothing of compositional data trajectories. In fact, an observed curve can be thought of as the sum of a smooth part plus some noise due to measurement errors. Spline smoothing techniques are used to isolate the smooth part of the trajectory: clustering algorithms are then applied to these smooth curves. The second step consists in building suitable metrics for measuring the dissimilarity between trajectories: we propose a metric that accounts for differences in both shape and level, and a metric accounting for differences in shape only. A simulation study is performed in order to evaluate the proposed methodologies, using both hierarchical and partitional clustering algorithms. The quality of the obtained results is assessed by means of several indices.
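One possible reading of the two dissimilarities described here is sketched below. It assumes the centred log-ratio (clr) transform as the compositional algebra, trajectories observed on a common grid with strictly positive parts, and "shape only" meaning that each trajectory's average clr level is removed before comparison. These choices and all names are assumptions for illustration, not the authors' definitions.

```python
import math
from typing import List, Sequence

Composition = Sequence[float]


def clr(composition: Composition) -> List[float]:
    """Centred log-ratio transform of one composition (all parts must be > 0)."""
    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def _remove_level(clr_trajectory: List[List[float]]) -> List[List[float]]:
    """Subtract the trajectory's average clr vector, leaving only its shape."""
    n = len(clr_trajectory)
    means = [sum(col) / n for col in zip(*clr_trajectory)]
    return [[v - m for v, m in zip(row, means)] for row in clr_trajectory]


def trajectory_distance(
    traj_a: Sequence[Composition],
    traj_b: Sequence[Composition],
    shape_only: bool = False,
) -> float:
    """Root-mean-square clr (Aitchison) distance between two sampled trajectories."""
    a = [clr(c) for c in traj_a]
    b = [clr(c) for c in traj_b]
    if shape_only:
        a, b = _remove_level(a), _remove_level(b)
    squared = [
        sum((x - y) ** 2 for x, y in zip(row_a, row_b)) for row_a, row_b in zip(a, b)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical three-part compositions measured at three grid points.
first = [[0.2, 0.3, 0.5], [0.3, 0.3, 0.4], [0.4, 0.3, 0.3]]
second = [[0.1, 0.3, 0.6], [0.2, 0.3, 0.5], [0.3, 0.3, 0.4]]
print(trajectory_distance(first, second), trajectory_distance(first, second, shape_only=True))
```

With this reading, two trajectories that differ only by a constant perturbation have zero shape-only distance but a positive shape-and-level distance.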

Relevance:

30.00%

Publisher:

Abstract:

Our purpose is to provide a set-theoretical framework for clustering fuzzy relational data, based essentially on the cardinality of the fuzzy subsets that represent objects and of their complements, without applying any crisp property. From this perspective we define a family of fuzzy similarity indexes, which includes a set of fuzzy indexes introduced by Tolias et al., and we analyze the conditions under which it defines a fuzzy proximity relation. Following an original idea due to S. Miyamoto, we evaluate the similarity between objects and features by means of the same mathematical procedure. Combining these concepts and methods, we establish an algorithm for clustering fuzzy relational data. Finally, we present an example to clarify the whole process.
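To make the idea of a cardinality-based fuzzy similarity concrete, the sketch below computes the familiar fuzzy Jaccard index: the sigma-count of the intersection (pointwise minimum) divided by that of the union (pointwise maximum). It is offered only as a well-known example of this kind of index. Whether it belongs to the family defined in the paper is not stated in the abstract, and the function name and membership values are hypothetical.

```python
from typing import Sequence


def fuzzy_jaccard(a: Sequence[float], b: Sequence[float]) -> float:
    """Fuzzy Jaccard similarity of two fuzzy subsets given as membership vectors.

    Cardinality is the sigma-count (sum of membership degrees); intersection
    and union use the pointwise minimum and maximum.
    """
    if len(a) != len(b):
        raise ValueError("both fuzzy subsets must be defined on the same universe")
    intersection = sum(min(x, y) for x, y in zip(a, b))
    union = sum(max(x, y) for x, y in zip(a, b))
    return 1.0 if union == 0.0 else intersection / union


# Hypothetical membership degrees of two objects over three features.
obj_a = [0.2, 0.8, 1.0]
obj_b = [0.4, 0.6, 1.0]
print(fuzzy_jaccard(obj_a, obj_b))
```

The index is reflexive and symmetric, the two properties a fuzzy proximity relation requires.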

Relevance:

30.00%

Publisher:

Abstract:

The clustering in time (seriality) of extratropical cyclones is responsible for large cumulative insured losses in western Europe, though surprisingly little scientific attention has been given to this important property. This study investigates and quantifies the seriality of extratropical cyclones in the Northern Hemisphere using a point-process approach. A possible mechanism for serial clustering is the time-varying effect of the large-scale flow on individual cyclone tracks. Another mechanism is the generation by one parent cyclone of one or more offspring through secondary cyclogenesis. A long cyclone-track database was constructed for extended October–March winters from 1950 to 2003 using 6-h analyses of 850-mb relative vorticity derived from the NCEP–NCAR reanalysis. A dispersion statistic based on the variance-to-mean ratio of monthly cyclone counts was used as a measure of clustering. It reveals extensive regions of statistically significant clustering in the European exit region of the North Atlantic storm track and over the central North Pacific. Monthly cyclone counts were regressed on time-varying teleconnection indices with a log-linear Poisson model. Five independent teleconnection patterns were found to be significant factors over Europe: the North Atlantic Oscillation (NAO), the east Atlantic pattern, the Scandinavian pattern, the east Atlantic–western Russian pattern, and the polar Eurasian pattern. The NAO alone is not sufficient for explaining the variability of cyclone counts in the North Atlantic region and western Europe. Rate dependence on time-varying teleconnection indices accounts for the variability in monthly cyclone counts, and a cluster process did not need to be invoked.
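The dispersion statistic mentioned in this abstract can be sketched in a few lines. The abstract says only that it is "based on" the variance-to-mean ratio. The form below, the ratio minus one using the sample variance, is an assumption chosen so that a Poisson (non-clustered) process gives roughly zero, positive values indicate clustering and negative values indicate regularity. The function name and the counts are hypothetical.

```python
from typing import Sequence


def dispersion_statistic(counts: Sequence[int]) -> float:
    """Variance-to-mean ratio of the counts, minus one.

    Approximately 0 for Poisson-distributed counts, > 0 for overdispersed
    (clustered) counts, < 0 for underdispersed (regular) counts.
    """
    n = len(counts)
    if n < 2:
        raise ValueError("need at least two counts")
    mean = sum(counts) / n
    if mean == 0:
        raise ValueError("mean count is zero; the ratio is undefined")
    variance = sum((c - mean) ** 2 for c in counts) / (n - 1)  # sample variance
    return variance / mean - 1.0


# Hypothetical monthly cyclone counts at one grid point.
monthly_counts = [3, 7, 2, 9, 1, 8]
print(dispersion_statistic(monthly_counts))
```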

Relevance:

30.00%

Publisher:

Abstract:

In this work a new method for clustering and building a topographic representation of a bacteria taxonomy is presented. The method is based on the analysis of stable parts of the genome, the so-called “housekeeping genes”. The proposed method generates topographic maps of the bacteria taxonomy, where relations among different type strains can be visually inspected and verified. Two well-known DNA alignment algorithms are applied to the genomic sequences. Topographic maps are optimized to represent the similarity among the sequences according to their evolutionary distances. The experimental analysis is carried out on 147 type strains of the Gammaproteobacteria class by means of the 16S rRNA housekeeping gene. Complete sequences of the gene have been retrieved from the NCBI public database. In the experimental tests the maps show clusters of homologous type strains and present some singular cases potentially due to incorrect classification or erroneous annotations in the database.

Relevance:

30.00%

Publisher:

Abstract:

The measurement of the impact of technical change has received significant attention within the economics literature. One popular method of quantifying the impact of technical change is the use of growth accounting index numbers. However, in a recent article Nelson and Pack (1999) criticise the use of such index numbers in situations where technical change is likely to be biased in favour of one or other of the inputs. In particular, they criticise the common approach of applying observed cost shares, as proxies for partial output elasticities, to weight the change in quantities, which they claim is only valid under Hicks neutrality. Recent advances in the measurement of product and factor biases of technical change developed by Balcombe et al. (2000) provide a relatively straightforward means of correcting product and factor shares in the face of biased technical progress. This paper demonstrates the correction of both revenue and cost shares used in the construction of a TFP index for UK agriculture over the period 1953 to 2000, using both revenue and cost function share equations appended with stochastic latent variables to capture the bias effect. Technical progress is shown to be biased between both individual input and output groups. Output and input quantity aggregates are then constructed using both observed and corrected share weights, and the resulting TFPs are compared. There does appear to be some significant bias in TFP if the effect of biased technical progress is not taken into account when constructing the weights.
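The share-weighted growth accounting that this abstract discusses can be sketched as follows. This is a simplified single-output illustration of the observed-share approach, not the paper's corrected index: TFP growth is taken as log output growth minus the cost-share-weighted sum of log input growth rates. The function name and figures are hypothetical.

```python
import math
from typing import Sequence


def tfp_growth(
    output_start: float,
    output_end: float,
    inputs_start: Sequence[float],
    inputs_end: Sequence[float],
    cost_shares: Sequence[float],
) -> float:
    """Log TFP growth between two periods using cost shares as input weights.

    Assumes a single aggregate output and cost shares that sum to one; the
    shares stand in for partial output elasticities, which is the practice
    the abstract says holds only under Hicks-neutral technical change.
    """
    if abs(sum(cost_shares) - 1.0) > 1e-9:
        raise ValueError("cost shares must sum to one")
    output_growth = math.log(output_end / output_start)
    input_growth = sum(
        share * math.log(x_end / x_start)
        for share, x_start, x_end in zip(cost_shares, inputs_start, inputs_end)
    )
    return output_growth - input_growth


# Hypothetical figures: output rises 10% while labour falls and capital rises.
growth = tfp_growth(100.0, 110.0, [50.0, 80.0], [48.0, 84.0], [0.6, 0.4])
print(growth)
```

Replacing `cost_shares` with bias-corrected shares changes only the weights, which is why the comparison of observed and corrected weights described in the abstract translates directly into different TFP estimates.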