947 results for Data pre-processing


Relevance:

40.00%

Publisher:

Abstract:

This paper presents the simulation and development work undertaken to produce a signal equaliser that improves the data rates from oil well logging instruments. The instruments are lowered into the drill bore hole suspended by a cable which has poor electrical characteristics. The equaliser described in the paper corrects for the distortions introduced by the cable (dispersion and attenuation), with the result that the instrument can send data at 100 kbit/s along its own suspension cable, 12 km in length. Simulation techniques and tools were invaluable in generating a model for the distortions and proved useful when site testing was not available.

Relevance:

40.00%

Publisher:

Abstract:

Advances in hardware and software technology enable us to collect, store and distribute data on a very large scale. Automatically discovering and extracting hidden knowledge in the form of patterns from these large data volumes is known as data mining. Data mining technology is not only a part of business intelligence, but is also used in many other application areas such as research, marketing and financial analytics. For example, medical scientists can use patterns extracted from historic patient data to determine whether a new patient is likely to respond positively to a particular treatment; marketing analysts can use patterns extracted from customer data for future advertisement campaigns; and finance experts have an interest in patterns that forecast the development of certain stock market shares for investment recommendations. However, extracting knowledge in the form of patterns from massive data volumes imposes a number of computational challenges in terms of processing time, memory, bandwidth and power consumption. These challenges have led to the development of parallel and distributed data analysis approaches and the utilisation of Grid and Cloud computing. This chapter gives an overview of parallel and distributed computing approaches and how they can be used to scale up data mining to large datasets.

Relevance:

40.00%

Publisher:

Abstract:

This article analyses the results of an empirical study on the 200 most popular UK-based websites in various sectors of e-commerce services. The study provides empirical evidence on unlawful processing of personal data. It comprises a survey on the methods used to seek and obtain consent to process personal data for direct marketing and advertisement, and a test on the frequency of unsolicited commercial emails (UCE) received by customers as a consequence of their registration and submission of personal information to a website. Part One of the article presents a conceptual and normative account of data protection, with a discussion of the ethical values on which EU data protection law is grounded and an outline of the elements that must be in place to seek and obtain valid consent to process personal data. Part Two discusses the outcomes of the empirical study, which reveals a significant gap between EU legal theory and practice in data protection. Although a wide majority of the websites in the sample (69%) have in place a system to ask for separate consent to marketing activities, only 16.2% of them obtain consent that is valid under the standards set by EU law. The test with UCE shows that only one out of three websites (30.5%) respects the will of the data subject not to receive commercial communications. It also shows that, when submitting personal data in online transactions, there is a high probability (50%) of encountering a website that will ignore the refusal of consent and will send UCE. The article concludes that there is a severe lack of compliance by UK online service providers with essential requirements of data protection law. In this respect, it suggests that the standard of implementation, information and supervision by the UK authorities is inadequate, especially in light of the clarifications provided at EU level.

Relevance:

40.00%

Publisher:

Abstract:

Background: Expression microarrays are increasingly used to obtain large scale transcriptomic information on a wide range of biological samples. Nevertheless, there is still much debate on the best ways to process data, to design experiments and to analyse the output. Furthermore, many of the more sophisticated mathematical approaches to data analysis in the literature remain inaccessible to much of the biological research community. In this study we examine ways of extracting and analysing a large data set obtained using the Agilent long oligonucleotide transcriptomics platform, applied to a set of human macrophage and dendritic cell samples. Results: We describe and validate a series of data extraction, transformation and normalisation steps which are implemented via a new R function. Analysis of replicate normalised reference data demonstrates that intrarray variability is small (only around 2% of the mean log signal), while interarray variability from replicate array measurements has a standard deviation (SD) of around 0.5 log2 units (6% of mean). The common practice of working with ratios of Cy5/Cy3 signal offers little further improvement in terms of reducing error. Comparison to expression data obtained using Arabidopsis samples demonstrates that the large number of genes in each sample showing a low level of transcription reflects the real complexity of the cellular transcriptome. Multidimensional scaling is used to show that the processed data identify an underlying structure which reflects some of the key biological variables that define the data set. This structure is robust, allowing reliable comparison of samples collected over a number of years and by a variety of operators. Conclusions: This study outlines a robust and easily implemented pipeline for extracting, transforming, normalising and visualising transcriptomic array data from the Agilent expression platform. The analysis is used to obtain quantitative estimates of the SD arising from experimental (non-biological) intra- and interarray variability, and of a lower threshold for determining whether an individual gene is expressed. The study provides a reliable basis for further, more extensive studies of the systems biology of eukaryotic cells.
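As a rough illustration of the kind of variability estimate described in this abstract, the sketch below log2-transforms replicate array intensities and reports the interarray SD both in log2 units and as a percentage of the mean log signal. It is only a minimal sketch under stated assumptions: the abstract does not say how the authors' R function computes these quantities, so the averaging scheme and the names `log2_signals` and `interarray_variability` are hypothetical.

```python
import math
import statistics


def log2_signals(raw):
    """Log2-transform raw intensities (assumed to be strictly positive)."""
    return [math.log2(x) for x in raw]


def interarray_variability(replicates):
    """Estimate variability across replicate arrays.

    `replicates` is a list of arrays, each a list of raw intensities for
    the same probes in the same order (an assumed input layout).
    Returns (mean per-probe SD in log2 units, that SD as a percentage of
    the grand mean log2 signal).
    """
    logged = [log2_signals(array) for array in replicates]
    per_probe = list(zip(*logged))  # one tuple of replicate values per probe
    sds = [statistics.stdev(values) for values in per_probe]
    mean_sd = statistics.mean(sds)
    grand_mean = statistics.mean(v for values in per_probe for v in values)
    return mean_sd, 100.0 * mean_sd / grand_mean


# Two hypothetical replicate arrays measuring the same two probes.
mean_sd, percent_of_mean = interarray_variability([[128, 512], [512, 2048]])
```

Averaging the per-probe SDs is one simple choice; a pooled variance or a robust estimator would be equally defensible here.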

Relevance:

40.00%

Publisher:

Abstract:

The Environmental Data Abstraction Library provides a modular data management library for bringing new and diverse datatypes together for visualisation within numerous software packages, including the ncWMS viewing service, which already has very wide international uptake. The structure of EDAL is presented along with examples of its use to compare satellite, model and in situ data types within the same visualisation framework. We emphasize the value of this capability for cross calibration of datasets and evaluation of model products against observations, including preparation for data assimilation.

Relevance:

40.00%

Publisher:

Abstract:

The present study examines the processing of subject-verb (SV) number agreement with coordinate subjects in pre-verbal and post-verbal positions in Greek. Greek is a language with morphological number marked on nominal and verbal elements. Coordinate SV agreement, however, is special in Greek as it is sensitive to the coordinate subject's position: when pre-verbal, the verb is marked for plural while when post-verbal the verb can be in the singular. We conducted two experiments, an acceptability judgment task with adult monolinguals as a pre-study (Experiment 1) and a self-paced reading task as the main study (Experiment 2) in order to obtain acceptance as well as processing data. Forty adult monolingual speakers of Greek participated in Experiment 1 and a hundred and forty one in Experiment 2. Seventy one children participated in Experiment 2: 30 Albanian-Greek sequential bilingual children and 41 Greek monolingual children aged 10–12 years. The adult data in Experiment 1 establish the difference in acceptability between singular VPs in SV and VS constructions reaffirming our hypothesis. Meanwhile, the adult data in Experiment 2 show that plural verbs accelerate processing regardless of subject position. The child online data show that sequential bilingual children have longer reading times (RTs) compared to the age-matched monolingual control group. However, both child groups follow a similar processing pattern in both pre-verbal and post-verbal constructions showing longer RTs immediately after a singular verb when the subject was pre-verbal indicating a grammaticality effect. In the post-verbal coordinate subject sentences, both child groups showed longer RTs on the first subject following the plural verb due to the temporary number mismatch between the verb and the first subject. 
This effect was resolved in monolingual children but was still present at the end of the sentence for bilingual children, indicating difficulty in reanalyzing and integrating information. Taken together, these findings demonstrate that (a) 10–12 year-old sequential bilingual children are sensitive to number agreement in SV coordinate constructions, parsing sentences in the same way as monolingual children even though their vocabulary abilities are lower than those of age-matched monolingual peers, and (b) bilinguals are slower in processing overall.

Relevance:

40.00%

Publisher:

Abstract:

In eukaryotes, pre-rRNA processing depends on a large number of nonribosomal trans-acting factors that form intriguingly organized complexes. One of the early stages of pre-rRNA processing includes formation of the two intermediate complexes pre-40S and pre-60S, which then form the mature ribosome subunits. Each of these complexes contains specific pre-rRNAs, ribosomal proteins and processing factors. The yeast nucleolar protein Nop53p has previously been identified in the pre-60S complex and shown to affect pre-rRNA processing by directly binding to 5.8S rRNA, and to interact with Nop17p and Nip7p, which are also involved in this process. Here we show that Nop53p binds 5.8S rRNA co-transcriptionally through its N-terminal region, and that this protein portion can also partially complement growth of the conditional mutant strain Δnop53/GAL:NOP53. Nop53p interacts with Rrp6p and activates the exosome in vitro. These results indicate that Nop53p may recruit the exosome to 7S pre-rRNA for processing. Consistent with this observation, and similar to what is observed in exosome mutants, depletion of Nop53p leads to accumulation of polyadenylated pre-rRNAs.

Relevance:

40.00%

Publisher:

Abstract:

U3 snoRNA is transcribed from two intron-containing genes in yeast, snR17A and snR17B. Although the assembly of the U3 snoRNP has not been precisely determined, at least some of the core box C/D proteins are known to bind pre-U3 co-transcriptionally, thereby affecting splicing and 3′-end processing of this snoRNA. We identified the interaction between the box C/D assembly factor Nop17p and Cwc24p, a novel yeast RING finger protein that had been previously isolated in a complex with the splicing factor Cef1p. Here we show that, consistent with the protein interaction data, Cwc24p localizes to the cell nucleus, and its depletion leads to the accumulation of both U3 pre-snoRNAs. U3 snoRNA is involved in the early cleavages of 35S pre-rRNA, and the defective splicing of pre-U3 detected in cells depleted of Cwc24p causes the accumulation of the 35S precursor rRNA. These results led us to the conclusion that Cwc24p is involved in pre-U3 snoRNA splicing, indirectly affecting pre-rRNA processing.

Relevance:

40.00%

Publisher:

Abstract:

In eukaryotes, pre-rRNA processing depends on a large number of nonribosomal trans-acting factors that form intriguingly organized complexes. Two intermediate complexes, pre-40S and pre-60S, are formed at the early stages of 35S pre-rRNA processing and give rise to the mature ribosome subunits. Each of these complexes contains specific pre-rRNAs, some ribosomal proteins and processing factors. The novel yeast protein Utp25p has previously been identified in the nucleolus, an indication that this protein could be involved in ribosome biogenesis. Here we show that Utp25p interacts with the SSU processome proteins Sas10p and Mpp10p, and affects 18S rRNA maturation. Depletion of Utp25p leads to accumulation of the pre-rRNA 35S and the aberrant rRNA 23S, and to a severe reduction in 40S ribosomal subunit levels. Our results indicate that Utp25p is a novel SSU processome subunit involved in pre-40S maturation.

Relevance:

40.00%

Publisher:

Abstract:

The Shwachman-Bodian-Diamond syndrome protein (SBDS) is a member of a highly conserved protein family whose function is not well understood, with putative orthologues found in organisms ranging from Archaea, yeast and plants to vertebrate animals. The yeast orthologue of SBDS, Sdo1p, has been previously identified in association with the 60S ribosomal subunit and is proposed to participate in ribosomal recycling. Here we show that Sdo1p interacts with nucleolar rRNA processing factors and ribosomal proteins, indicating that it might bind the pre-60S complex and remain associated with it during processing and transport to the cytoplasm. Corroborating the protein interaction data, Sdo1p localizes to the nucleus and cytoplasm and co-immunoprecipitates precursors of 60S and 40S subunits, as well as the mature rRNAs. Sdo1p binds RNA directly, suggesting that it may also associate with the ribosomal subunits through RNA interaction. Copyright (C) 2009 John Wiley & Sons, Ltd.

Relevance:

40.00%

Publisher:

Abstract:

A student from the Data Processing program at the New York Trade School is shown working. Black and white photograph with some edge damage due to writing in black along the top.