877 results for GALAXIES, CLUSTERING
Abstract:
Background and Aim: The identification of gastric carcinomas (GC) has traditionally been based on histomorphology. Recently, DNA microarrays have successfully been used to identify tumors through clustering of the expression profiles. Random forest clustering is widely used for tissue microarrays and other immunohistochemical data, because it handles highly skewed tumor marker expressions well, and weighs the contribution of each marker according to its relatedness with other tumor markers. In the present study, we identified biologically and clinically meaningful groups of GC by hierarchical clustering analysis of immunohistochemical protein expression. Methods: We selected 28 proteins (p16, p27, p21, cyclin D1, cyclin A, cyclin B1, pRb, p53, c-met, c-erbB-2, vascular endothelial growth factor, transforming growth factor [TGF]-beta I, TGF-beta II, MutS homolog-2, bcl-2, bax, bak, bcl-x, adenomatous polyposis coli, clathrin, E-cadherin, beta-catenin, mucin [MUC] 1, MUC2, MUC5AC, MUC6, matrix metalloproteinase [MMP]-2, and MMP-9) to be investigated by immunohistochemistry in 482 GC. The analyses of the data were done using a random forest clustering method. Results: Proteins related to cell cycle, growth factor, cell motility, cell adhesion, apoptosis, and matrix remodeling were highly expressed in GC. We identified protein expressions associated with poor survival in diffuse-type GC. Conclusions: Based on the expression analysis of 28 proteins, we identified two groups of GC that could not be explained by any clinicopathological variables, and a subgroup of long-surviving diffuse-type GC patients with a distinct molecular profile. These results provide not only a new molecular basis for understanding the biological properties of GC, but also better prediction of survival than the classic pathological grouping.
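For illustration only, a minimal sketch of random forest clustering in the Shi-Horvath style (synthetic reference data, tree proximities, then hierarchical clustering); the marker matrix and all parameter values below are hypothetical placeholders, not the study's actual data or settings.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

rng = np.random.default_rng(0)
X = rng.random((482, 28))                     # placeholder: 482 tumours x 28 marker scores

# 1. Synthetic reference data: permute each column independently,
#    destroying the correlation structure between markers.
X_synth = np.column_stack([rng.permutation(X[:, j]) for j in range(X.shape[1])])
X_all = np.vstack([X, X_synth])
y_all = np.r_[np.ones(len(X)), np.zeros(len(X_synth))]

# 2. Train a random forest to separate real from synthetic observations.
rf = RandomForestClassifier(n_estimators=500, random_state=0).fit(X_all, y_all)

# 3. Proximity of two real samples = fraction of trees in which they share a leaf.
leaves = rf.apply(X)                          # shape (n_samples, n_trees)
prox = np.zeros((len(X), len(X)))
for t in range(leaves.shape[1]):
    prox += leaves[:, t][:, None] == leaves[:, t][None, :]
prox /= leaves.shape[1]

# 4. Hierarchical clustering on the induced dissimilarity.
dist = np.sqrt(1.0 - prox)
np.fill_diagonal(dist, 0.0)
labels = fcluster(linkage(squareform(dist, checks=False), method="average"),
                  t=2, criterion="maxclust")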
Abstract:
This paper addresses the m-machine no-wait flow shop problem in which the set-up time of a job is separated from its processing time. The performance measure considered is the total flowtime. A new hybrid metaheuristic, Genetic Algorithm-Cluster Search, is proposed to solve the scheduling problem. The performance of the proposed method is evaluated and the results are compared with the best method reported in the literature. Experimental tests show the superiority of the new method on the test problem set with regard to solution quality. (c) 2012 Elsevier Ltd. All rights reserved.
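For illustration, a minimal sketch of the objective being optimised, assuming sequence-independent, anticipatory setup times; the function and variable names (total_flowtime, p, s, sequence) are hypothetical, and this is not the paper's algorithm.

def total_flowtime(sequence, p, s):
    """p[j][k]: processing time of job j on machine k.
       s[j][k]: setup time of job j on machine k (may run before the job arrives).
       sequence: permutation of job indices."""
    m = len(p[0])
    finish = [0.0] * m               # completion time of the previous job on each machine
    flowtime = 0.0
    for j in sequence:
        head = [sum(p[j][:k]) for k in range(m)]   # offset of job j's start on machine k
        # No-wait: once started, the job flows through the machines without waiting,
        # so its start on machine 1 must respect setup availability on every machine.
        start = max(max(finish[k] + s[j][k] - head[k] for k in range(m)), 0.0)
        for k in range(m):
            finish[k] = start + head[k] + p[j][k]
        flowtime += finish[m - 1]    # completion on the last machine
    return flowtime

# Example: 3 jobs, 2 machines.
p = [[2, 3], [4, 1], [3, 2]]
s = [[1, 1], [2, 1], [1, 2]]
print(total_flowtime([0, 1, 2], p, s))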
Abstract:
Background: Transcript enumeration methods such as SAGE, MPSS, and sequencing-by-synthesis EST "digital northern" are important high-throughput techniques for digital gene expression measurement. Like other counting or voting processes, these measurements constitute compositional data, exhibiting properties particular to the simplex space, where the summation of the components is constrained. These properties are not present in regular Euclidean spaces, in which hybridization-based microarray data are often modeled. Therefore, pattern recognition methods commonly used for microarray data analysis may be non-informative for the data generated by transcript enumeration techniques, since they ignore certain fundamental properties of this space. Results: Here we present a software tool, Simcluster, designed to perform clustering analysis for data on the simplex space. We present Simcluster as a stand-alone command-line C package and as a user-friendly on-line tool. Both versions are available at: http://xerad.systemsbiology.net/simcluster. Conclusion: Simcluster is designed in accordance with a well-established mathematical framework for compositional data analysis, which provides principled procedures for dealing with the simplex space, and is thus applicable in a number of contexts, including enumeration-based gene expression data.
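As a minimal illustration of the compositional-data viewpoint (not Simcluster's own algorithm), count vectors can be mapped from the simplex to Euclidean space with a centred log-ratio transform before clustering; the counts below are made up.

import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

counts = np.array([[120, 30,  5, 45],     # hypothetical tag counts per library
                   [ 90, 40, 10, 60],
                   [ 10, 80, 70, 40]], dtype=float)

comp = (counts + 0.5) / (counts + 0.5).sum(axis=1, keepdims=True)   # closure + pseudo-count
clr = np.log(comp) - np.log(comp).mean(axis=1, keepdims=True)        # centred log-ratio

labels = fcluster(linkage(clr, method="ward"), t=2, criterion="maxclust")
print(labels)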
Abstract:
Background: A common approach to time-series gene expression data analysis is the clustering of genes with similar expression patterns over time. Clustered gene expression profiles point to the joint contribution of groups of genes to a particular cellular process. However, since genes belong to intricate networks, other features, besides comparable expression patterns, should provide additional information for the identification of functionally similar genes. Results: In this study we perform gene clustering through the identification of Granger causality between and within sets of time-series gene expression data. Granger causality is based on the idea that the cause of an event cannot come after its consequence. Conclusions: This kind of analysis can be used as a complementary approach to functional clustering, wherein genes would be clustered not solely on the basis of their expression similarity but on their topological proximity, built according to the intensity of Granger causality among them.
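A hypothetical sketch of how pairwise Granger causality can drive a clustering, using statsmodels; the data, lag choice and distance construction are illustrative, not the authors' exact procedure.

import numpy as np
from statsmodels.tsa.stattools import grangercausalitytests
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

rng = np.random.default_rng(1)
expr = rng.standard_normal((30, 5))           # 30 time points x 5 hypothetical genes
n = expr.shape[1]

# Strength of "gene i Granger-causes gene j": -log10 of the lag-2 F-test p-value.
strength = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        if i == j:
            continue
        res = grangercausalitytests(expr[:, [j, i]], maxlag=2)  # prints per-lag summaries by default
        strength[i, j] = -np.log10(res[2][0]["ssr_ftest"][1])

# Symmetrise into a similarity, convert to a distance, and cluster.
sim = 0.5 * (strength + strength.T)
dist = 1.0 / (1.0 + sim)
np.fill_diagonal(dist, 0.0)
labels = fcluster(linkage(squareform(dist, checks=False), method="average"),
                  t=2, criterion="maxclust")
print(labels)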
Abstract:
We investigate the nature of extremely red galaxies (ERGs), objects whose colours are redder than those found in the red sequence present in colour–magnitude diagrams of galaxies. We selected from the Sloan Digital Sky Survey Data Release 7 a volume-limited sample of such galaxies in the redshift interval 0.010 < z < 0.030, brighter than Mr = −17.8 (magnitudes dereddened, corrected for the Milky Way extinction) and with (g − r) colours larger than those of galaxies in the red sequence. This sample contains 416 ERGs, which were classified visually. Our classification was cross-checked with other classifications available in the literature. We found from our visual classification that the majority of objects in our sample are edge-on spirals (73 per cent). Other spirals correspond to 13 per cent, whereas elliptical galaxies comprise only 11 per cent of the objects. After comparing the morphological mix and the distributions of Hα/Hβ and axial ratios of ERGs and objects in the red sequence, we suggest that dust, more than stellar population effects, is the driver of the red colours found in these extremely red galaxies.
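A hedged sketch of this type of colour selection; the red-sequence fit (slope, zero-point, scatter) and the column names are hypothetical placeholders.

import numpy as np

def select_ergs(z, M_r, g_r, rs_slope=-0.03, rs_zero=0.35, rs_sigma=0.05):
    """Flag galaxies redder than the red sequence in a volume-limited sample."""
    red_sequence = rs_zero + rs_slope * (M_r + 20.0)      # assumed linear (g-r) vs M_r fit
    in_volume = (z > 0.010) & (z < 0.030) & (M_r < -17.8) # brighter than Mr = -17.8
    redder = g_r > red_sequence + 3.0 * rs_sigma          # e.g. 3 sigma above the ridge line
    return in_volume & redder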
Abstract:
Thanks to the Chandra and XMM–Newton surveys, the hard X-ray sky is now probed down to a flux limit where the bulk of the X-ray background is almost completely resolved into discrete sources, at least in the 2–8 keV band. Extensive programs of multiwavelength follow-up observations showed that the large majority of hard X-ray selected sources are identified with Active Galactic Nuclei (AGN) spanning a broad range of redshifts, luminosities and optical properties. A sizable fraction of relatively luminous X-ray sources hosting an active, presumably obscured, nucleus would not have been easily recognized as such on the basis of optical observations, because they are characterized by "peculiar" optical properties. In my PhD thesis, I focus on the nature of two classes of hard X-ray selected "elusive" sources: those characterized by high X-ray-to-optical flux ratios and red optical-to-near-infrared colors, a fraction of which are associated with Type 2 quasars, and the X-ray bright optically normal galaxies, also known as XBONGs. In order to characterize the properties of these classes of elusive AGN, the datasets of several deep and large-area surveys have been fully exploited. The first class of "elusive" sources is characterized by X-ray-to-optical flux ratios (X/O) significantly higher than those generally observed from unobscured quasars and Seyfert galaxies. The properties of well-defined samples of high X/O sources detected at bright X-ray fluxes suggest that X/O selection is highly efficient in sampling high-redshift obscured quasars. At the limits of deep Chandra surveys (~10^-16 erg cm^-2 s^-1), high X/O sources are generally characterized by extremely faint optical magnitudes, hence their spectroscopic identification is hardly feasible even with the largest telescopes. In this framework, a detailed investigation of their X-ray properties may provide useful information on the nature of this important component of the X-ray source population. The X-ray data of the deepest X-ray observations ever performed, the Chandra deep fields, allow us to characterize the average X-ray properties of the high X/O population. The results of spectral analysis clearly indicate that the high X/O sources represent the most obscured component of the X-ray background. Their spectra are harder (Γ ~ 1) than those of any other class of sources in the deep fields, and also harder than the XRB spectrum (Γ ≈ 1.4). In order to better understand AGN physics and evolution, a much better knowledge of the redshift, luminosity and spectral energy distributions (SEDs) of elusive AGN is of paramount importance. The recent COSMOS survey provides the necessary multiwavelength database to characterize the SEDs of a statistically robust sample of obscured sources. The combination of high X/O and red colors offers a powerful tool to select obscured luminous objects at high redshift. A large sample of X-ray emitting extremely red objects (R − K > 5) has been collected and their optical-infrared properties have been studied. In particular, using an appropriate SED-fitting procedure, the nuclear and host-galaxy components have been deconvolved over a large range of wavelengths, and optical nuclear extinctions, black hole masses and Eddington ratios have been estimated. It is important to remark that the combination of hard X-ray selection and extreme red colors is highly efficient in picking up highly obscured, luminous sources at high redshift.
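For reference, the X-ray-to-optical flux ratio used in this kind of selection is commonly defined as below (shown here for the R band; the additive constant depends on the optical band and X-ray energy range adopted):

\[
X/O \equiv \log\!\left(\frac{f_X}{f_R}\right) = \log f_X + \frac{R}{2.5} + 5.5 .
\]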
Although XBONGs do not represent a new source population, the nature of these sources has attracted renewed attention after the discovery of several examples in recent Chandra and XMM–Newton surveys. Even though several possibilities have been proposed in the recent literature to explain why a relatively luminous (L_X = 10^42 - 10^43 erg s^-1) hard X-ray source does not leave any significant signature of its presence in terms of optical emission lines, the very nature of XBONGs is still a subject of debate. Good-quality photometric near-infrared data (ISAAC/VLT) of 4 low-redshift XBONGs from the HELLAS2XMM survey have been used to search for the presence of the putative nucleus, applying the surface-brightness decomposition technique. In two out of the four sources, the presence of a weak nuclear component hosted by a bright galaxy has been revealed. The results indicate that moderate amounts of gas and dust, covering a large solid angle (possibly 4π) at the nuclear source, may explain the lack of optical emission lines. A weak nucleus not able to produce sufficient UV photons may provide an alternative or additional explanation. On the basis of an admittedly small sample, we conclude that XBONGs constitute a mixed bag rather than a new source population. When the presence of a nucleus is revealed, it turns out to be mildly absorbed and hosted by a bright galaxy.
Abstract:
The present work proposes a method based on CLV (Clustering around Latent Variables) for identifying groups of consumers in L-shaped data. This kind of data structure is very common in consumer studies, where a panel of consumers is asked to assess the global liking of a certain number of products and the preference scores are then arranged in a two-way table Y. External information on both products (physical-chemical description or sensory attributes) and consumers (socio-demographic background, purchase behaviour or consumption habits) may be available in a row descriptor matrix X and in a column descriptor matrix Z, respectively. The aim of the method is to automatically provide a consumer segmentation in which all three matrices play an active role in the classification, yielding groups that are homogeneous from all points of view: preference, products and consumer characteristics. The proposed clustering method is illustrated on data from preference studies on food products: juices based on berry fruits and traditional cheeses from Trentino. The hedonic ratings given by the consumer panel on the products under study were explained with respect to the product chemical compounds, sensory evaluation and consumer socio-demographic information, purchase behaviour and consumption habits.
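An illustrative sketch of the L-shaped data layout (Y, X, Z) with a naive preference-only segmentation for comparison; this is not the CLV algorithm, and all names and dimensions are made up.

import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
n_products, n_consumers = 8, 120
Y = rng.random((n_products, n_consumers))   # hedonic scores: products x consumers
X = rng.random((n_products, 5))             # product descriptors (chemistry, sensory)
Z = rng.random((n_consumers, 4))            # consumer descriptors (socio-demographics, habits)

# Naive baseline: segment consumers on their preference profiles alone;
# CLV additionally makes X and Z active in defining the groups.
groups = fcluster(linkage(Y.T, method="ward"), t=3, criterion="maxclust")
for g in np.unique(groups):
    print(g, Z[groups == g].mean(axis=0))   # profile each segment on Z afterwards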
Abstract:
Seyfert galaxies are the closest active galactic nuclei. As such, we can use them to test the physical properties of the entire class of objects. To investigate their general properties, I took advantage of different methods of data analysis. In particular I used three different samples of objects that, despite frequent overlaps, were chosen to best tackle different topics: the heterogeneous BeppoSAX sample was considered best suited to testing the average hard X-ray (E above 10 keV) properties of nearby Seyfert galaxies; the X-CfA sample was considered best suited to comparing the properties of low-luminosity sources with those of higher-luminosity ones and was thus also used to test emission-mechanism models; finally, the XMM–Newton sample was extracted from the X-CfA sample so as to ensure a truly unbiased and well-defined sample of objects with which to define the average properties of Seyfert galaxies.
Taking advantage of the broad-band coverage of the BeppoSAX MECS and PDS instruments (between ~2 and 100 keV), I infer the average X-ray spectral properties of nearby Seyfert galaxies and in particular the photon index (
Abstract:
The intensity of regional specialization in specific activities, and conversely, the level of industrial concentration in specific locations, has been used as complementary evidence for the existence and significance of externalities. Additionally, economists have mainly focused the debate on disentangling the sources of specialization and concentration processes according to three vectors: natural advantages, internal scale economies, and external scale economies. The arbitrariness of spatial partitions plays a key role in capturing these effects, and the selected partition should reflect the actual characteristics of the economy. Thus, the identification of spatial boundaries for measuring specialization becomes critical, since the model will most likely have to adapt to different scales of distance and be influenced by different types of externalities or agglomeration economies, which rest on mechanisms of interaction with particular requirements of spatial proximity. This work analyses the spatial dimension of economic specialization, using the manufacturing industry as a case study. The main objective is to propose, for discrete and continuous space: i) a measure of global specialization; ii) a local disaggregation of the global measure; and iii) a spatial clustering method for the identification of specialized agglomerations.
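To make the notion of regional specialization concrete, here is a short sketch of a standard baseline (the Krugman dissimilarity index); it is not the global or local measure proposed in this work, and the employment matrix is hypothetical.

import numpy as np

def krugman_index(emp):
    """emp[r, i]: employment of region r in industry i.
       Returns, for each region, sum_i |regional share_i - national share_i|."""
    reg_shares = emp / emp.sum(axis=1, keepdims=True)   # industry mix within each region
    nat_shares = emp.sum(axis=0) / emp.sum()            # national industry mix
    return np.abs(reg_shares - nat_shares).sum(axis=1)

emp = np.array([[120., 30., 50.],    # hypothetical employment counts, 3 regions x 3 industries
                [ 40., 90., 10.],
                [ 60., 60., 60.]])
print(krugman_index(emp))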
Abstract:
The aim of this PhD thesis is the study of the nuclear properties of radio-loud AGN. Multiple and/or recent mergers in the host galaxy and/or the presence of a cool core in the galaxy cluster can play a role in the formation and evolution of the radio source. Being a unique class of objects (Lin & Mohr 2004), Brightest Cluster Galaxies (BCGs) are our focus. We investigate their parsec-scale radio emission with VLBI (Very Long Baseline Interferometry) observations. From the literature or from new data, we collect and analyse VLBA (Very Long Baseline Array) observations at 5 GHz of a complete sample of BCGs and of ``normal'' radio galaxies (the Bologna Complete Sample, BCS). Results on the nuclear properties of BCGs come from the comparison with the results for the BCS. Our analysis finds a possible dichotomy between BCGs in cool-core clusters and those in non-cool-core clusters. Only one-sided BCGs have kinematic properties similar to those of FRIs. Furthermore, the dominance of two-sided jet structures only in cooling clusters suggests sub-relativistic jet velocities. The different jet properties can be related to a different jet origin or to interaction with a different ISM. We discuss possible explanations of this at greater length.
Abstract:
The purpose of this Thesis is to develop a robust and powerful method to classify galaxies from large surveys, in order to establish and confirm the connections between the principal observational parameters of the galaxies (spectral features, colours, morphological indices), and to help unveil the evolution of these parameters from $z \sim 1$ to the local Universe. Within the framework of the zCOSMOS-bright survey, and making use of its large database of objects ($\sim 10\,000$ galaxies in the redshift range $0 < z \lesssim 1.2$) and its great reliability in redshift and spectral property determinations, we first adopt and extend the \emph{classification cube method}, as developed by Mignoli et al. (2009), to exploit the bimodal properties of galaxies (spectral, photometric and morphological) separately, and then combine these three subclassifications. We use this classification method as a test for a newly devised statistical classification, based on Principal Component Analysis and the Unsupervised Fuzzy Partition clustering method (PCA+UFP), which is able to define the galaxy population by exploiting their natural global bimodality, considering simultaneously up to 8 different properties. The PCA+UFP analysis is a very powerful and robust tool to probe the nature and evolution of galaxies in a survey. It allows the classification of galaxies to be defined with smaller uncertainties and adds the flexibility to be adapted to different parameters: being a fuzzy classification, it avoids the problems of a hard classification, such as the classification cube presented in the first part of the article. The PCA+UFP method can be easily applied to different datasets: it does not rely on the nature of the data, and for this reason it can be successfully employed with other observables (magnitudes, colours) or derived properties (masses, luminosities, SFRs, etc.). The agreement between the two classification cluster definitions is very high. ``Early'' and ``late'' type galaxies are well defined by the spectral, photometric and morphological properties, both when considering them separately and then combining the classifications (classification cube) and when treating them as a whole (PCA+UFP cluster analysis). Differences arise in the definition of outliers: the classification cube is much more sensitive to single measurement errors or misclassifications in one property than the PCA+UFP cluster analysis, in which errors are ``averaged out'' during the process. This method allowed us to observe the \emph{downsizing} effect taking place in the PC spaces: the migration from the blue cloud towards the red clump happens at higher redshifts for galaxies of larger mass. The determination of $M_{\mathrm{cross}}$, the transition mass, is in good agreement with other values in the literature.
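A minimal sketch of a PCA-plus-fuzzy-clustering pipeline, using plain fuzzy c-means as a stand-in for the UFP algorithm; the feature matrix and parameter choices are hypothetical, not those of the Thesis.

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
features = rng.standard_normal((1000, 8))    # e.g. colours, spectral features, morphology indices

pcs = PCA(n_components=3).fit_transform(features)

def fuzzy_cmeans(X, c=2, m=2.0, n_iter=100, seed=0):
    """Plain fuzzy c-means: returns the membership matrix U (n_samples x c)."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        centres = (W.T @ X) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2) + 1e-12
        U = 1.0 / (d ** (2.0 / (m - 1.0)))
        U /= U.sum(axis=1, keepdims=True)
    return U

U = fuzzy_cmeans(pcs, c=2)        # soft "early"/"late" memberships
labels = U.argmax(axis=1)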
Abstract:
The goal of this thesis is to analyze the possibility of using early-type galaxies (ETGs) to place evolutionary and cosmological constraints, both by disentangling whether mass or environment is the main driver of ETG evolution, and by developing a technique to constrain H(z) and the cosmological parameters from the ETG age-redshift relation. The (U-V) rest-frame color distribution is studied as a function of mass and environment for two samples of ETGs up to z=1, extracted from the zCOSMOS survey with a new selection criterion. The color distributions and the slopes of the color-mass and color-environment relations are studied, finding a strong dependence on mass and a minor dependence on environment. The spectral analysis performed on the D4000 and Hδ features gives results that validate the previous analysis. The main driver of galaxy evolution is found to be the galaxy mass, with the environment playing a subdominant but non-negligible role. The age distribution of ETGs is also analyzed as a function of mass, providing strong evidence supporting a downsizing scenario. The possibility of setting cosmological constraints by studying the age-redshift relation is investigated, discussing the relevant degeneracies and model dependencies. A new approach is developed, aiming to minimize the impact of systematics on the “cosmic chronometer” method. Analyzing theoretical models, it is demonstrated that the D4000 is a feature correlated almost linearly with age at fixed metallicity, depending only weakly on the assumed models or the chosen SFH. The analysis of an SDSS sample of ETGs shows that it is possible to use the differential D4000 evolution of the galaxies to set constraints on cosmological parameters in an almost model-independent way. Values of the Hubble constant and of the dark-energy EoS parameter are found that are not only fully compatible with the latest results, but also have a comparable error budget.
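A brief sketch of the relation exploited by the cosmic-chronometer approach, under the stated assumption that D4000 depends almost linearly on age at fixed metallicity (the notation below is generic, not necessarily that of the thesis):

\[
H(z) = -\frac{1}{1+z}\,\frac{dz}{dt}, \qquad
D4000 \simeq A(Z)\,t + B \;\Rightarrow\;
H(z) \simeq -\frac{A(Z)}{1+z}\,\frac{dz}{dD4000},
\]

so that a measurement of the differential D4000 evolution with redshift, together with the model-derived slope A(Z), yields H(z) without assuming a cosmological model.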