34 results for Network scale-up method


Relevance: 100.00%

Abstract:

Generally classifiers tend to overfit if there is noise in the training data or there are missing values. Ensemble learning methods are often used to improve a classifier's classification accuracy. Most ensemble learning approaches aim to improve the classification accuracy of decision trees. However, alternative classifiers to decision trees exist. The recently developed Random Prism ensemble learner for classification aims to improve an alternative classification rule induction approach, the Prism family of algorithms, which addresses some of the limitations of decision trees. However, Random Prism suffers like any ensemble learner from a high computational overhead due to replication of the data and the induction of multiple base classifiers. Hence even modest sized datasets may impose a computational challenge to ensemble learners such as Random Prism. Parallelism is often used to scale up algorithms to deal with large datasets. This paper investigates parallelisation for Random Prism, implements a prototype and evaluates it empirically using a Hadoop computing cluster.

Relevance: 100.00%

Abstract:

Global communication requirements and load imbalance of some parallel data mining algorithms are the major obstacles to exploiting the computational power of large-scale systems. This work investigates how non-uniform data distributions can be exploited to remove the global communication requirement and to reduce the communication cost in parallel data mining algorithms and, in particular, in the k-means algorithm for cluster analysis. In the straightforward parallel formulation of the k-means algorithm, data and computation loads are uniformly distributed over the processing nodes. This approach has excellent load balancing characteristics that may suggest it could scale up to large and extreme-scale parallel computing systems. However, at each iteration step the algorithm requires a global reduction operation which hinders the scalability of the approach. This work studies a different parallel formulation of the algorithm where the requirement of global communication is removed, while maintaining the same deterministic nature of the centralised algorithm. The proposed approach exploits a non-uniform data distribution which can be either found in real-world distributed applications or can be induced by means of multi-dimensional binary search trees. The approach can also be extended to accommodate an approximation error which allows a further reduction of the communication costs. The effectiveness of the exact and approximate methods has been tested in a parallel computing system with 64 processors and in simulations with 1024 processing elements.
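To make the bottleneck concrete, the following is a minimal sketch of one iteration of the straightforward parallel formulation described above, with the processing nodes simulated as a list of data partitions. It does not reproduce the paper's proposed method; the function names (`local_step`, `global_reduce`, `kmeans_iteration`) and the sample data are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def nearest(point: Point, centroids: Sequence[Point]) -> int:
    """Index of the centroid closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centroids[k])),
    )


def local_step(partition: Sequence[Point], centroids: Sequence[Point]):
    """Work done independently on one node: per-cluster sums and counts."""
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for point in partition:
        k = nearest(point, centroids)
        counts[k] += 1
        for d in range(dim):
            sums[k][d] += point[d]
    return sums, counts


def global_reduce(partials, centroids) -> List[Point]:
    """The global reduction: every node's partial result is combined."""
    dim = len(centroids[0])
    new_centroids = []
    for k, old in enumerate(centroids):
        total = sum(counts[k] for _, counts in partials)
        if total == 0:
            new_centroids.append(tuple(old))  # empty cluster keeps its centroid
            continue
        new_centroids.append(
            tuple(sum(sums[k][d] for sums, _ in partials) / total for d in range(dim))
        )
    return new_centroids


def kmeans_iteration(partitions, centroids) -> List[Point]:
    # In a real system each local_step runs on its own node in parallel.
    partials = [local_step(part, centroids) for part in partitions]
    return global_reduce(partials, centroids)
```

Every node must contribute to `global_reduce` before the next iteration can start, which is the step the abstract identifies as the scalability obstacle. The result is identical to running the same iteration on the pooled data.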

Relevance: 100.00%

Abstract:

Introduction: Facing the challenging treatment of neurodegenerative diseases as well as complex craniofacial injuries such as those common after cancer therapy, the field of regenerative medicine increasingly relies on stem cell transplantation strategies. Here, neural crest-derived stem cells (NCSCs) offer many promising applications, although scale-up of clinical-grade processes prior to potential transplantations is currently limiting. In this study, we aimed to establish a clinical-grade, cost-reducing cultivation system for NCSCs isolated from the adult human nose using cGMP-grade Afc-FEP bags. Methods: We cultivated human neural crest-derived stem cells from inferior turbinate (ITSCs) in a cell culture bag system using Afc-FEP bags in human blood plasma-supplemented medium. Investigations of viability, proliferation and expression profile of bag-cultured ITSCs were followed by DNA-content and telomerase activity determination. Cultivated ITSCs were introduced to directed in vitro differentiation assays to assess their potential for mesodermal and ectodermal differentiation. Mesodermal differentiation was determined using an enzyme activity assay (alkaline phosphatase, ALP), respective stainings (Alizarin Red S, Von Kossa and Oil Red O), and RT-PCR, while immunocytochemistry and synaptic vesicle recycling were applied to assay neuroectodermal differentiation of ITSCs. Results: When cultivated within Afc-FEP bags, ITSCs grew three-dimensionally in a human blood plasma-derived matrix, thereby showing unchanged morphology, proliferation capability, viability and expression profile in comparison to three-dimensionally cultured ITSCs growing in standard cell culture plastics. Genetic stability of bag-cultured ITSCs was further accompanied by unchanged telomerase activity. Importantly, ITSCs retained their potential to differentiate into mesodermal cell types, particularly including ALP-active, Alizarin Red S-, and Von Kossa-positive osteogenic cell types, as well as adipocytes positive in Oil Red O assays. Bag culture further did not affect the potential of ITSCs to undergo differentiation into neuroectodermal cell types coexpressing β-III-tubulin and MAP2 and exhibiting the capability for synaptic vesicle recycling. Conclusions: Here, we report for the first time the successful cultivation of human NCSCs within cGMP-grade Afc-FEP bags using a human blood plasma-supplemented medium. Our findings particularly demonstrate the unchanged differentiation capability and genetic stability of the cultivated NCSCs, suggesting the great potential of this culture system for future medical applications in the field of regenerative medicine.

Relevance: 100.00%

Abstract:

Biaxially oriented films produced from semi-crystalline, semi-aromatic polyesters are utilised extensively as components within various applications, including the specialist packaging, flexible electronic and photovoltaic markets. However, the thermal performance of such polyesters, specifically poly(ethylene terephthalate) (PET) and poly(ethylene-2,6-naphthalate) (PEN), is inadequate for several applications that require greater dimensional stability at higher operating temperatures. The work described in this project is therefore primarily focussed upon the copolymerisation of rigid comonomers with PET and PEN, in order to produce novel polyester-based materials that exhibit superior thermomechanical performance, with retention of crystallinity, to achieve biaxial orientation. Rigid biphenyldiimide comonomers were readily incorporated into PEN and poly(butylene-2,6-naphthalate) (PBN) via a melt-polycondensation route. For each copoly(ester-imide) series, retention of semi-crystalline behaviour is observed throughout entire copolymer composition ratios. This phenomenon may be rationalised by cocrystallisation between isomorphic biphenyldiimide and naphthalenedicarboxylate residues, which enables statistically random copolymers to melt-crystallise despite high proportions of imide sub-units being present. In terms of thermal performance, the glass transition temperature, Tg, linearly increases with imide comonomer content for both series. This facilitated the production of several high performance PEN-based biaxially oriented films, which displayed analogous drawing, barrier and optical properties to PEN. Selected PBN copoly(ester-imide)s also possess the ability to either melt-crystallise, or form a mesophase from the isotropic state depending on the applied cooling rate. 
An equivalent synthetic approach based upon isomorphic comonomer crystallisation was subsequently applied to PET by copolymerisation with rigid diimide and Kevlar®-type amide comonomers, to afford several novel high performance PET-based copoly(ester-imide)s and copoly(ester-amide)s that all exhibited increased Tgs. Retention of crystallinity was achieved in these copolymers by either melt-crystallisation or thermal annealing. The initial production of a semi-crystalline, PET-based biaxially oriented film with a Tg in excess of 100 °C was successful, and this material has obvious scope for further industrial scale-up and process development.

Relevance: 40.00%

Abstract:

Despite the success of studies attempting to integrate remotely sensed data and flood modelling and the need to provide near-real time data routinely on a global scale as well as setting up online data archives, there is to date a lack of spatially and temporally distributed hydraulic parameters to support ongoing efforts in modelling. Therefore, the objective of this project is to provide a global evaluation and benchmark data set of floodplain water stages with uncertainties and assimilation in a large scale flood model using space-borne radar imagery. An algorithm is developed for automated retrieval of water stages with uncertainties from a sequence of radar imagery and data are assimilated in a flood model using the Tewkesbury 2007 flood event as a feasibility study. The retrieval method that we employ is based on possibility theory which is an extension of fuzzy sets and that encompasses probability theory. In our case we first attempt to identify main sources of uncertainty in the retrieval of water stages from radar imagery for which we define physically meaningful ranges of parameter values. Possibilities of values are then computed for each parameter using a triangular ‘membership’ function. This procedure allows the computation of possible values of water stages at maximum flood extents along a river at many different locations. At a later stage in the project these data are then used in assimilation, calibration or validation of a flood model. The application is subsequently extended to a global scale using wide swath radar imagery and a simple global flood forecasting model thereby providing improved river discharge estimates to update the latter.
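As an illustration of the triangular membership function mentioned above, here is a minimal sketch of how the possibility of a single parameter value could be computed. The function name and the parameter range in the usage note are hypothetical; the abstract does not give the ranges the authors used.

```python
def triangular_possibility(x: float, low: float, mode: float, high: float) -> float:
    """Possibility of `x` under a triangular membership function.

    The possibility rises linearly from 0 at `low` to 1 at `mode`
    and falls linearly back to 0 at `high`.
    """
    if not low <= mode <= high:
        raise ValueError("expected low <= mode <= high")
    if x == mode:
        return 1.0
    if x <= low or x >= high:
        return 0.0
    if x < mode:
        return (x - low) / (mode - low)
    return (high - x) / (high - mode)
```

For a parameter assumed to lie between 10.0 and 13.0 with a most plausible value of 11.0, `triangular_possibility(10.5, 10.0, 11.0, 13.0)` gives 0.5, and values outside the range give 0.0.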

Relevance: 40.00%

Abstract:

The field site network (FSN) plays a central role in conducting joint research within all Assessing Large-scale Risks for biodiversity with tested Methods (ALARM) modules and provides a mechanism for integrating research on different topics in ALARM on the same site for measuring multiple impacts on biodiversity. The network covers most European climates and biogeographic regions, from Mediterranean through central European and boreal to subarctic. The project links databases with the European-wide field site network FSN, including geographic information system (GIS)-based information to characterise the test location for ALARM researchers for joint on-site research. Maps are provided in a standardised way and merged with other site-specific information. The application of GIS for these field sites and the information management promotes the use of the FSN for research and to disseminate the results. We conclude that ALARM FSN sites together with other research sites in Europe jointly could be used as a future backbone for research proposals

Relevance: 40.00%

Abstract:

The tagged microarray marker (TAM) method allows high-throughput differentiation between predicted alternative PCR products. Typically, the method is used as a molecular marker approach to determining the allelic states of single nucleotide polymorphisms (SNPs) or insertion-deletion (indel) alleles at genomic loci in multiple individuals. Biotin-labeled PCR products are spotted, unpurified, onto a streptavidin-coated glass slide and the alternative products are differentiated by hybridization to fluorescent detector oligonucleotides that recognize corresponding allele-specific tags on the PCR primers. The main attractions of this method are its high throughput (thousands of PCRs are analyzed per slide), flexibility of scoring (any combination, from a single marker in thousands of samples to thousands of markers in a single sample, can be analyzed) and flexibility of scale (any experimental scale, from a small lab setting up to a large project). This protocol describes an experiment involving 3,072 PCRs scored on a slide. The whole process from the start of PCR setup to receiving the data spreadsheet takes 2 d.

Relevance: 40.00%

Abstract:

Cloud optical depth is one of the most poorly observed climate variables. The new “cloud mode” capability in the Aerosol Robotic Network (AERONET) will inexpensively yet dramatically increase cloud optical depth observations in both number and accuracy. Cloud mode optical depth retrievals from AERONET were evaluated at the Atmospheric Radiation Measurement program’s Oklahoma site in sky conditions ranging from broken clouds to overcast. For overcast cases, the 1.5 min average AERONET cloud mode optical depths agreed to within 15% of those from a standard ground-based flux method. For broken cloud cases, AERONET retrievals also captured rapid variations detected by the microwave radiometer. For a 3-year climatology derived from all nonprecipitating clouds, AERONET monthly mean cloud optical depths are generally larger than cloud radar retrievals because the current cloud mode observation strategy is biased toward measurements of optically thick clouds. This study has demonstrated a new way to enhance the existing AERONET infrastructure to observe cloud optical properties on a global scale.

Relevance: 40.00%

Abstract:

Previous studies have reported that cheese curd syneresis kinetics can be monitored by dilution of chemical tracers, such as Blue Dextran, in whey. The objective of this study was to evaluate an improved tracer method to monitor whey volumes expelled over time during syneresis. Two experiments with different ranges of milk fat (0-5% and 2.3-3.5%) were carried out in an 11 L double-O laboratory scale cheese vat. Tracer was added to the curd-whey mixture during the cutting phase of cheese making and samples were taken at 10 min intervals up to 75 min after cutting. The volume of whey expelled was measured gravimetrically and the dilution of tracer in the whey was measured by absorbance at 620 nm. The volumes of whey expelled were significantly reduced at higher milk fat levels. Whey yield was predicted with a SEP ranging from 3.2 to 6.3 g whey/100 mL of milk and a CV ranging from 2.03 to 2.7% at different milk fat levels.

Relevance: 40.00%

Abstract:

Motivation: In order to enhance genome annotation, the fully automatic fold recognition method GenTHREADER has been improved and benchmarked. The previous version of GenTHREADER consisted of a simple neural network which was trained to combine sequence alignment score, length information and energy potentials derived from threading into a single score representing the relationship between two proteins, as designated by CATH. The improved version incorporates PSI-BLAST searches, which have been jumpstarted with structural alignment profiles from FSSP, and now also makes use of PSIPRED predicted secondary structure and bi-directional scoring in order to calculate the final alignment score. Pairwise potentials and solvation potentials are calculated from the given sequence alignment which are then used as inputs to a multi-layer, feed-forward neural network, along with the alignment score, alignment length and sequence length. The neural network has also been expanded to accommodate the secondary structure element alignment (SSEA) score as an extra input and it is now trained to learn the FSSP Z-score as a measurement of similarity between two proteins. Results: The improvements made to GenTHREADER increase the number of remote homologues that can be detected with a low error rate, implying higher reliability of score, whilst also increasing the quality of the models produced. We find that up to five times as many true positives can be detected with low error rate per query. Total MaxSub score is doubled at low false positive rates using the improved method.

Relevance: 40.00%

Abstract:

Following a malicious or accidental atmospheric release in an outdoor environment it is essential for first responders to ensure safety by identifying areas where human life may be in danger. For this to happen quickly, reliable information is needed on the source strength and location, and the type of chemical agent released. We present here an inverse modelling technique that estimates the source strength and location of such a release, together with the uncertainty in those estimates, using a limited number of measurements of concentration from a network of chemical sensors considering a single, steady, ground-level source. The technique is evaluated using data from a set of dispersion experiments conducted in a meteorological wind tunnel, where simultaneous measurements of concentration time series were obtained in the plume from a ground-level point-source emission of a passive tracer. In particular, we analyze the response to the number of sensors deployed and their arrangement, and to sampling and model errors. We find that the inverse algorithm can generate acceptable estimates of the source characteristics with as few as four sensors, providing these are well-placed and that the sampling error is controlled. Configurations with at least three sensors in a profile across the plume were found to be superior to other arrangements examined. Analysis of the influence of sampling error due to the use of short averaging times showed that the uncertainty in the source estimates grew as the sampling time decreased. This demonstrated that averaging times greater than about 5min (full scale time) lead to acceptable accuracy.

Relevance: 40.00%

Abstract:

The K-Means algorithm for cluster analysis is one of the most influential and popular data mining methods. Its straightforward parallel formulation is well suited for distributed memory systems with reliable interconnection networks, such as massively parallel processors and clusters of workstations. However, in large-scale geographically distributed systems the straightforward parallel algorithm can be rendered useless by a single communication failure or high latency in communication paths. The lack of scalable and fault tolerant global communication and synchronisation methods in large-scale systems has hindered the adoption of the K-Means algorithm for applications in large networked systems such as wireless sensor networks, peer-to-peer systems and mobile ad hoc networks. This work proposes a fully distributed K-Means algorithm (Epidemic K-Means) which does not require global communication and is intrinsically fault tolerant. The proposed distributed K-Means algorithm provides a clustering solution which can approximate the solution of an ideal centralised algorithm over the aggregated data as closely as desired. A comparative performance analysis is carried out against the state-of-the-art sampling methods and shows that the proposed method overcomes the limitations of the sampling-based approaches for skewed cluster distributions. The experimental analysis confirms that the proposed algorithm is very accurate and fault tolerant under unreliable network conditions (message loss and node failures) and is suitable for asynchronous networks of very large and extreme scale.
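The abstract does not say which aggregation protocol the algorithm uses. As an assumed illustration of how a global quantity can be approximated without global communication, the sketch below shows plain pairwise gossip averaging: each node repeatedly averages its value with one randomly chosen peer. The function name, the round count and the seed are hypothetical, and this is not the authors' implementation.

```python
import random
from typing import List, Sequence


def gossip_average(values: Sequence[float], rounds: int, seed: int = 0) -> List[float]:
    """Simulate pairwise gossip averaging among len(values) nodes.

    In each round every node contacts one random peer and both replace
    their values with the pair's mean. The total is conserved, so all
    local values drift toward the global mean using only pairwise messages.
    """
    rng = random.Random(seed)
    state = list(values)
    n = len(state)
    if n < 2:
        return state
    for _ in range(rounds):
        for i in range(n):
            j = rng.randrange(n - 1)
            if j >= i:
                j += 1  # pick a peer other than node i
            mean = (state[i] + state[j]) / 2.0
            state[i] = state[j] = mean
    return state
```

In a K-Means setting, the averaged quantities would be per-cluster coordinate sums and counts held by each node, so that every node can estimate the centroids locally. More rounds give a closer approximation, which matches the abstract's "as closely as desired".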

Relevance: 40.00%

Abstract:

In this paper, we develop a method, termed the Interaction Distribution (ID) method, for analysis of quantitative ecological network data. In many cases, quantitative network data sets are under-sampled, i.e. many interactions are poorly sampled or remain unobserved. Hence, the output of statistical analyses may fail to differentiate between patterns that are statistical artefacts and those which are real characteristics of ecological networks. The ID method can support assessment and inference of under-sampled ecological network data. In the current paper, we illustrate and discuss the ID method based on the properties of plant-animal pollination data sets of flower visitation frequencies. However, the ID method may be applied to other types of ecological networks. The method can supplement existing network analyses based on two definitions of the underlying probabilities for each combination of pollinator and plant species: (1) $p_{i,j}$: the probability for a visit made by the i’th pollinator species to take place on the j’th plant species; (2) $q_{i,j}$: the probability for a visit received by the j’th plant species to be made by the i’th pollinator. The method applies the Dirichlet distribution to estimate these two probabilities, based on a given empirical data set. The estimated mean values for $p_{i,j}$ and $q_{i,j}$ reflect the relative differences between recorded numbers of visits for different pollinator and plant species, and the estimated uncertainty of $p_{i,j}$ and $q_{i,j}$ decreases with higher numbers of recorded visits.
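The closing claim, that uncertainty shrinks as more visits are recorded, can be illustrated with the standard Dirichlet mean and variance formulas. The sketch below assumes a symmetric prior with concentration 1 added to each count; the abstract does not state the prior, and the function name and visit counts are hypothetical.

```python
from typing import List, Sequence, Tuple


def dirichlet_estimates(
    counts: Sequence[float], prior: float = 1.0
) -> Tuple[List[float], List[float]]:
    """Mean and variance of each probability under a Dirichlet distribution.

    `counts` holds the recorded visits of one pollinator species across
    plant species (for p), or of one plant species across pollinators (for q).
    The symmetric `prior` is an assumption, not taken from the paper.
    """
    alphas = [c + prior for c in counts]
    total = sum(alphas)
    means = [a / total for a in alphas]
    variances = [a * (total - a) / (total ** 2 * (total + 1.0)) for a in alphas]
    return means, variances
```

With counts `[8, 1, 0]` the first mean is 0.75. Scaling the data to `[80, 10, 0]` leaves the relative ranking unchanged but reduces every variance, and the unobserved interaction keeps a small non-zero probability instead of being treated as impossible.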