979 results for Massive Data
Abstract:
Cloud computing offers massive scalability and elasticity required by many scientific and commercial applications. Combining the computational and data handling capabilities of clouds with parallel processing also has the potential to tackle Big Data problems efficiently. Science gateway frameworks and workflow systems enable application developers to implement complex applications and make these available for end-users via simple graphical user interfaces. The integration of such frameworks with Big Data processing tools on the cloud opens new opportunities for application developers. This paper investigates how workflow systems and science gateways can be extended with Big Data processing capabilities. A generic approach based on infrastructure-aware workflows is suggested and a proof of concept is implemented based on the WS-PGRADE/gUSE science gateway framework and its integration with the Hadoop parallel data processing solution based on the MapReduce paradigm in the cloud. The provided analysis demonstrates that the methods described to integrate Big Data processing with workflows and science gateways work well in different cloud infrastructures and application scenarios, and can be used to create massively parallel applications for scientific analysis of Big Data.
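To make the MapReduce paradigm referenced above concrete, the sketch below shows a minimal Hadoop Streaming style word-count job in Python. It is a generic illustration only, not the WS-PGRADE/gUSE workflow integration described in the paper; the script name and the invocation pattern are assumptions.

```python
#!/usr/bin/env python3
"""Minimal Hadoop Streaming word-count pair, illustrating the MapReduce
paradigm referred to in the abstract (not the WS-PGRADE/gUSE integration)."""
import sys


def mapper():
    # Emit (word, 1) pairs, one per line, tab-separated as Hadoop Streaming expects.
    for line in sys.stdin:
        for word in line.split():
            print(f"{word}\t1")


def reducer():
    # Input arrives sorted by key; sum the counts per word.
    current, total = None, 0
    for line in sys.stdin:
        word, count = line.rstrip("\n").split("\t")
        if word != current:
            if current is not None:
                print(f"{current}\t{total}")
            current, total = word, 0
        total += int(count)
    if current is not None:
        print(f"{current}\t{total}")


if __name__ == "__main__":
    # Run as: wordcount.py map   or   wordcount.py reduce
    (mapper if sys.argv[1] == "map" else reducer)()
```

Under a typical Hadoop Streaming setup this script would be passed as both the mapper ("wordcount.py map") and the reducer ("wordcount.py reduce") of a streaming job; inside a science gateway workflow the same job would simply appear as a single workflow node.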
Abstract:
Big Data Analytics is an emerging field, since massive storage and computing capabilities have been made available by advanced e-infrastructures. Earth and Environmental sciences are likely to benefit from Big Data Analytics techniques supporting the processing of the large number of Earth Observation datasets currently acquired and generated through observations and simulations. However, Earth Science data and applications present specificities in terms of relevance of the geospatial information, wide heterogeneity of data models and formats, and complexity of processing. Therefore, Big Earth Data Analytics requires specifically tailored techniques and tools. The EarthServer Big Earth Data Analytics engine offers a solution for coverage-type datasets, built around a high-performance array database technology and the adoption and enhancement of standards for service interaction (OGC WCS and WCPS). The EarthServer solution, guided by requirements collected from scientific communities and international initiatives, provides a holistic approach that ranges from query languages and scalability up to mobile access and visualization. The result is demonstrated and validated through the development of lighthouse applications in the Marine, Geology, Atmospheric, Planetary and Cryospheric science domains.
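As a rough illustration of the standards-based service interaction mentioned above, the snippet below submits a WCPS query over HTTP to a hypothetical rasdaman-style WCS endpoint. The URL, coverage name and time axis are placeholders, and the exact request layout of a given EarthServer deployment may differ.

```python
"""Illustrative only: submit a WCPS query to a hypothetical EarthServer-style
endpoint.  The service URL and coverage name are placeholders, and the request
layout assumes a rasdaman-style WCS ProcessCoverages interface."""
import requests

ENDPOINT = "https://example.org/rasdaman/ows"   # hypothetical service URL
COVERAGE = "AvgTemperature"                     # hypothetical coverage id

# WCPS: average a one-year temporal slice of the coverage and return it as CSV.
wcps_query = f"""
for c in ({COVERAGE})
return encode(avg(c[ansi("2014-01":"2014-12")]), "text/csv")
"""

response = requests.get(
    ENDPOINT,
    params={
        "service": "WCS",
        "version": "2.0.1",
        "request": "ProcessCoverages",
        "query": wcps_query,
    },
    timeout=60,
)
response.raise_for_status()
print(response.text)
```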
Abstract:
Key Performance Indicators (KPIs) and their predictions are widely used by enterprises for informed decision making. Nevertheless, a very important factor, which is generally overlooked, is that top-level strategic KPIs are actually driven by operational-level business processes. These two domains are, however, mostly segregated and analysed in silos with different Business Intelligence solutions. In this paper, we propose an approach for advanced Business Simulations, which converges the two domains by utilising process execution and business data, together with concepts from Business Dynamics (BD) and Business Ontologies, to promote better system understanding and detailed KPI predictions. Our approach incorporates the automated creation of Causal Loop Diagrams, thus empowering the analyst to critically examine the complex dependencies hidden in the massive amounts of available enterprise data. We have further evaluated the proposed approach in the context of a retail use case that involved verification of the automatically generated causal models by a domain expert.
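As a loose illustration of how causal-loop-style structure can be derived from enterprise data, the sketch below links metrics whose pairwise correlation exceeds a threshold and signs each link by the correlation's sign. This is not the paper's algorithm (which also draws on Business Ontologies and process execution data); the metric names, values and threshold are invented.

```python
"""A minimal sketch of deriving a causal-loop-style diagram from enterprise
data: edges connect metrics whose correlation exceeds a threshold, signed as
reinforcing ('+') or balancing ('-').  Generic illustration only."""
import pandas as pd
import networkx as nx

# Hypothetical process/KPI time series (rows = reporting periods).
data = pd.DataFrame({
    "marketing_spend": [10, 12, 15, 14, 18, 21],
    "store_visits":    [200, 230, 260, 255, 300, 340],
    "stock_outs":      [5, 4, 6, 7, 3, 2],
    "weekly_revenue":  [50, 55, 60, 58, 70, 80],
})

corr = data.corr()
THRESHOLD = 0.7
G = nx.DiGraph()

for src in corr.columns:
    for dst in corr.columns:
        if src != dst and abs(corr.loc[src, dst]) >= THRESHOLD:
            # '+' for reinforcing influence, '-' for balancing influence.
            G.add_edge(src, dst, polarity="+" if corr.loc[src, dst] > 0 else "-")

for u, v, attrs in G.edges(data=True):
    print(f"{u} --({attrs['polarity']})--> {v}")
```

Because correlation is symmetric, each link appears in both directions here; in the paper's setting, direction and causality would come from the business ontology and the process execution data rather than from correlation alone.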
Abstract:
The astrophysical context of this thesis project concerns the comprehension of the mutual interaction between accretion onto a Super Massive Black Hole (SMBH) and the Star Formation (SF) that take place in the host galaxy. This is one of the key topics of modern extragalactic astrophysical research. Indeed, it is widely accepted that, to understand the physics of a galaxy, the contribution of a possible central AGN must be taken into account. The aim of this thesis is the study of the physical processes of the nearby Seyfert galaxy NGC 34. This source was selected because of the wide collection of multiwavelength data available in the literature. In addition, it has recently been observed with the Atacama Large Millimeter/submillimeter Array (ALMA) in Band 9. This project is divided into two main parts: first, we reduced and analyzed the ALMA data, obtaining the continuum and CO(6-5) maps; then, we looked for a coherent explanation of the physical characteristics of NGC 34. In particular, we focused on the ISM physics, in order to understand its properties in terms of density, chemical composition and dominant radiation field (SF or accretion). This work has been done through the analysis of the spectral line energy distribution of several CO transitions as a function of the transition number (CO SLED), obtained by joining the CO(6-5) line with other transitions available in the literature. More precisely, the observed CO SLED has been compared with ISM models, including Photo-Dissociation Regions (PDRs) and X-ray-Dominated Regions (XDRs). These models have been obtained with the state-of-the-art photoionization code CLOUDY. Along with the observed CO SLED, we have taken into account other physical properties of NGC 34, such as the Star Formation Rate (SFR), the gas mass and the X-ray luminosity.
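The CO SLED comparison described above can be illustrated with a minimal chi-square fit of model line fluxes to observed ones, as sketched below. All numbers are placeholders rather than the NGC 34 measurements, and the model grids merely stand in for actual CLOUDY PDR/XDR outputs.

```python
"""A minimal sketch of a CO SLED comparison: observed CO line fluxes are
compared against PDR- and XDR-like model SLEDs with a weighted chi-square,
fitting a single free scaling factor per model.  All values are placeholders."""
import numpy as np

J_up = np.array([1, 2, 3, 6])                 # upper rotational levels observed
observed = np.array([1.0, 3.5, 6.0, 9.0])     # hypothetical fluxes (arbitrary units)
errors = np.array([0.2, 0.5, 0.8, 1.2])

# Hypothetical model SLEDs sampled at the same transitions.
models = {
    "PDR": np.array([1.0, 3.0, 5.0, 6.5]),
    "XDR": np.array([1.0, 3.8, 6.2, 9.5]),
}

for name, model in models.items():
    # Analytic weighted least-squares solution for the free scaling factor
    # (absorbing beam filling / gas mass normalisation).
    scale = np.sum(model * observed / errors**2) / np.sum(model**2 / errors**2)
    chi2 = np.sum(((observed - scale * model) / errors) ** 2)
    print(f"{name}: scale={scale:.2f}, chi2={chi2:.2f}")
```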
Abstract:
Massive Open Online Courses (MOOCs) generate enormous amounts of data. The University of Southampton has run and is running dozens of MOOC instances. The vast amount of data resulting from our MOOCs can provide highly valuable information to all parties involved in the creation and delivery of these courses. However, analysing and visualising such data is a task that not all educators have the time or skills to undertake. The recently developed MOOC Dashboard is a tool aimed at bridging this gap: it provides reports and visualisations based on the data generated by learners in MOOCs. Speakers: Manuel Leon is currently a Lecturer in Online Teaching and Learning in the Institute for Learning Innovation and Development (ILIaD). Adriana Wilde is a Teaching Fellow in Electronics and Computer Science, with research interests in MOOCs and Learning Analytics. Darron Tang (4th Year BEng Computer Science) and Jasmine Cheng (BSc Mathematics & Actuarial Science, starting MSc Data Science shortly) have been working as interns over the Summer of 2016 and have been developing the MOOC Dashboard.
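A toy example of the kind of aggregation behind such dashboard reports is sketched below: step-level completion rates computed from learner activity records with pandas. The column names and records are hypothetical and do not reflect the actual MOOC Dashboard data model.

```python
"""A toy example of a dashboard-style report from learner activity data.
The record layout (learner_id, step, completed_at) is an assumption, not the
MOOC Dashboard's real schema."""
import pandas as pd

# Hypothetical step-activity records.
activity = pd.DataFrame({
    "learner_id":   ["a", "a", "b", "b", "c"],
    "step":         ["1.1", "1.2", "1.1", "1.2", "1.1"],
    "completed_at": ["2016-07-01", "2016-07-02", "2016-07-01", None, "2016-07-03"],
})

# Completion rate per step: share of step visits that were marked completed.
summary = (
    activity.assign(completed=activity["completed_at"].notna())
            .groupby("step")["completed"]
            .agg(visits="count", completions="sum")
)
summary["completion_rate"] = summary["completions"] / summary["visits"]
print(summary)
```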
Abstract:
Americans are accustomed to a wide range of data collection in their lives: census, polls, surveys, user registrations, and disclosure forms. When logging onto the Internet, users' actions are being tracked everywhere: clicking, typing, tapping, swiping, searching, and placing orders. All of this data is stored to create data-driven profiles of each user. Social network sites, furthermore, set the voluntary sharing of personal data as the default mode of engagement. But people's time and energy devoted to creating this massive amount of data, on paper and online, are taken for granted. Few people would consider their time and energy spent on data production as labor. Even if some people do acknowledge their labor for data, they believe it is accessory to the activities at hand. In the face of pervasive data collection and the rising time spent on screens, why do people keep ignoring their labor for data? How has labor for data become invisible, as something that is disregarded by many users? What does invisible labor for data imply for everyday cultural practices in the United States? Invisible Labor for Data addresses these questions. I argue that three intertwined forces contribute to framing data production as being void of labor: data production institutions throughout history, the Internet's technological infrastructure (especially with the implementation of algorithms), and the multiplication of virtual spaces. There is a common tendency in the framework of human interactions with computers to deprive data and bodies of their materiality. My Introduction and Chapter 1 offer theoretical interventions by reinstating embodied materiality and redefining labor for data as an ongoing process. The middle chapters present case studies explaining how labor for data is pushed to the margin of the narratives about data production. I focus on a nationwide debate in the 1960s on whether the U.S. should build a databank, contemporary Big Data practices in the data broker and Internet industries, and the group of people who are hired to produce data for other people's avatars in virtual games. I conclude with a discussion on how the new development of crowdsourcing projects may usher in a new chapter in exploiting invisible and discounted labor for data.
Abstract:
In today's big data world, data is being produced in massive volumes, at great velocity, and from a variety of different sources such as mobile devices, sensors, a plethora of small devices hooked to the internet (Internet of Things), social networks, communication networks and many others. Interactive querying and large-scale analytics are being increasingly used to derive value out of this big data. A large portion of this data is being stored and processed in the Cloud due to the several advantages provided by the Cloud, such as scalability, elasticity, availability, low cost of ownership and the overall economies of scale. There is thus a growing need for large-scale cloud-based data management systems that can support real-time ingest, storage and processing of large volumes of heterogeneous data. However, in the pay-as-you-go Cloud environment, the cost of analytics can grow linearly with the time and resources required. Reducing the cost of data analytics in the Cloud thus remains a primary challenge. In my dissertation research, I have focused on building efficient and cost-effective cloud-based data management systems for different application domains that are predominant in cloud computing environments. In the first part of my dissertation, I address the problem of reducing the cost of transactional workloads on relational databases to support database-as-a-service in the Cloud. The primary challenges in supporting such workloads include choosing how to partition the data across a large number of machines, minimizing the number of distributed transactions, providing high data availability, and tolerating failures gracefully. I have designed, built and evaluated SWORD, an end-to-end scalable online transaction processing system that utilizes workload-aware data placement and replication to minimize the number of distributed transactions, and that incorporates a suite of novel techniques to significantly reduce the overheads incurred both during the initial placement of data and during query execution at runtime. In the second part of my dissertation, I focus on sampling-based progressive analytics as a means to reduce the cost of data analytics in the relational domain. Sampling has traditionally been used by data scientists to get progressive answers to complex analytical tasks over large volumes of data. Typically, this involves manually extracting samples of increasing data size (progressive samples) for exploratory querying. This provides data scientists with user control, repeatable semantics, and result provenance. However, such solutions result in tedious workflows that preclude the reuse of work across samples. On the other hand, existing approximate query processing systems report early results, but do not offer the above benefits for complex ad-hoc queries. I propose a new progressive data-parallel computation framework, NOW!, that provides support for progressive analytics over big data. In particular, NOW! enables progressive relational (SQL) query support in the Cloud using unique progress semantics that allow efficient and deterministic query processing over samples, providing meaningful early results and provenance to data scientists. NOW! enables the provision of early results using significantly fewer resources, thereby enabling a substantial reduction in the cost incurred during such analytics. Finally, I propose NSCALE, a system for efficient and cost-effective complex analytics on large-scale graph-structured data in the Cloud.
The system is based on the key observation that a wide range of complex analysis tasks over graph data require processing and reasoning about a large number of multi-hop neighborhoods or subgraphs in the graph; examples include ego network analysis, motif counting in biological networks, finding social circles in social networks, personalized recommendations, link prediction, etc. These tasks are not well served by existing vertex-centric graph processing frameworks, whose computation and execution models limit the user program to directly accessing the state of a single vertex, resulting in high execution overheads. Further, the lack of support for extracting the relevant portions of the graph that are of interest to an analysis task and loading them onto distributed memory leads to poor scalability. NSCALE allows users to write programs at the level of neighborhoods or subgraphs rather than at the level of vertices, and to declaratively specify the subgraphs of interest. It enables the efficient distributed execution of these neighborhood-centric complex analysis tasks over large-scale graphs, while minimizing resource consumption and communication cost, thereby substantially reducing the overall cost of graph data analytics in the Cloud. The results of our extensive experimental evaluation of these prototypes with several real-world data sets and applications validate the effectiveness of our techniques, which provide orders-of-magnitude reductions in the overheads of distributed data querying and analysis in the Cloud.
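The neighborhood-centric abstraction described above can be illustrated on a single machine: the user program receives extracted ego networks instead of single vertices. The sketch below uses networkx and a toy graph; it shows the programming model only, not the NSCALE API or its distributed execution and subgraph-placement machinery.

```python
"""A single-machine sketch of neighborhood-centric graph analysis: user code
operates on extracted k-hop ego networks rather than on individual vertices.
Illustrates the abstraction only, not NSCALE itself."""
import networkx as nx

G = nx.karate_club_graph()          # stand-in for a large graph


def ego_analysis(subgraph, center):
    # Example neighborhood-level task: triangle count and density around `center`.
    return {
        "center": center,
        "nodes": subgraph.number_of_nodes(),
        "triangles": sum(nx.triangles(subgraph).values()) // 3,
        "density": nx.density(subgraph),
    }


# Declaratively chosen subgraphs of interest: 1-hop ego networks of high-degree nodes.
centers = [n for n, d in G.degree() if d >= 10]
results = [ego_analysis(nx.ego_graph(G, c, radius=1), c) for c in centers]

for r in results:
    print(r)
```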
Abstract:
The speed with which data has moved from being scarce, expensive and valuable, justifying detailed and careful verification and analysis, to a situation where the streams of detailed data are almost too large to handle has caused a series of shifts to occur. Legal systems already have severe problems keeping up with, or even in touch with, the rate at which unexpected outcomes flow from information technology. The capacity to harness massive quantities of existing data has driven Big Data applications until recently. Now the data flows in real time are rising swiftly, becoming more invasive and offering monitoring potential that is eagerly sought by commerce and government alike. The ambiguities as to who owns this often quite remarkably intrusive personal data need to be resolved, and rapidly, but resolution is likely to encounter rising resistance from industrial and commercial bodies who see this data flow as 'theirs'. There have been many changes in ICT that have led to stresses in the resolution of the conflicts between IP exploiters and their customers, but this one is of a different scale due to the wide potential for individual customisation of pricing, identification and the rising commercial value of integrated streams of diverse personal data. A new reconciliation between the parties involved is needed: new business models, and a shift of the current confusion over who owns what data into alignments that better accord with community expectations. After all, they are the customers, and the emergence of information monopolies needs to be balanced by appropriate consumer/subject rights. This will be a difficult discussion, but one that is needed to realise the great benefits to all that are clearly available if these issues can be positively resolved. The customers need to make these data flows contestable in some form. These Big Data flows are only going to grow and become ever more instructive. A better balance is necessary. For the first time these changes are directly affecting the governance of democracies, as the very effective micro-targeting tools deployed in recent elections have shown. Yet the data gathered is not available to the subjects. This is not a survivable social model. The Private Data Commons needs our help. Businesses and governments exploit big data without regard for issues of legality, data quality, disparate data meanings, and process quality. This often results in poor decisions, with individuals bearing the greatest risk. The threats harbored by big data extend far beyond the individual, however, and call for new legal structures, business processes, and concepts such as a Private Data Commons. This Web extra is the audio part of a video in which author Marcus Wigan expands on his article "Big Data's Big Unintended Consequences" and discusses these issues.
Abstract:
This thesis investigates how web search evaluation can be improved using historical interaction data. Modern search engines combine offline and online evaluation approaches in a sequence of steps that a tested change needs to pass through to be accepted as an improvement and subsequently deployed. We refer to such a sequence of steps as an evaluation pipeline. In this thesis, we consider the evaluation pipeline to contain three sequential steps: an offline evaluation step, an online evaluation scheduling step, and an online evaluation step. In this thesis we show that historical user interaction data can aid in improving the accuracy or efficiency of each of the steps of the web search evaluation pipeline. As a result of these improvements, the overall efficiency of the entire evaluation pipeline is increased. Firstly, we investigate how user interaction data can be used to build accurate offline evaluation methods for query auto-completion mechanisms. We propose a family of offline evaluation metrics for query auto-completion that represents the effort the user has to spend in order to submit their query. The parameters of our proposed metrics are trained against a set of user interactions recorded in the search engine’s query logs. From our experimental study, we observe that our proposed metrics are significantly more correlated with an online user satisfaction indicator than the metrics proposed in the existing literature. Hence, fewer changes will pass the offline evaluation step to be rejected after the online evaluation step. As a result, this would allow us to achieve a higher efficiency of the entire evaluation pipeline. Secondly, we state the problem of the optimised scheduling of online experiments. We tackle this problem by considering a greedy scheduler that prioritises the evaluation queue according to the predicted likelihood of success of a particular experiment. This predictor is trained on a set of online experiments, and uses a diverse set of features to represent an online experiment. Our study demonstrates that a higher number of successful experiments per unit of time can be achieved by deploying such a scheduler on the second step of the evaluation pipeline. Consequently, we argue that the efficiency of the evaluation pipeline can be increased. Next, to improve the efficiency of the online evaluation step, we propose the Generalised Team Draft interleaving framework. Generalised Team Draft considers both the interleaving policy (how often a particular combination of results is shown) and click scoring (how important each click is) as parameters in a data-driven optimisation of the interleaving sensitivity. Further, Generalised Team Draft is applicable beyond domains with a list-based representation of results, i.e. in domains with a grid-based representation, such as image search. Our study using datasets of interleaving experiments performed both in document and image search domains demonstrates that Generalised Team Draft achieves the highest sensitivity. A higher sensitivity indicates that the interleaving experiments can be deployed for a shorter period of time or use a smaller sample of users. Importantly, Generalised Team Draft optimises the interleaving parameters w.r.t. historical interaction data recorded in the interleaving experiments. Finally, we propose to apply the sequential testing methods to reduce the mean deployment time for the interleaving experiments. We adapt two sequential tests for the interleaving experimentation. 
We demonstrate that one can achieve a significant decrease in experiment duration by using such sequential testing methods. The highest efficiency is achieved by the sequential tests that adjust their stopping thresholds using historical interaction data recorded in diagnostic experiments. Our further experimental study demonstrates that cumulative gains in the online experimentation efficiency can be achieved by combining the interleaving sensitivity optimisation approaches, including Generalised Team Draft, and the sequential testing approaches. Overall, the central contributions of this thesis are the proposed approaches to improve the accuracy or efficiency of the steps of the evaluation pipeline: the offline evaluation frameworks for the query auto-completion, an approach for the optimised scheduling of online experiments, a general framework for the efficient online interleaving evaluation, and a sequential testing approach for the online search evaluation. The experiments in this thesis are based on massive real-life datasets obtained from Yandex, a leading commercial search engine. These experiments demonstrate the potential of the proposed approaches to improve the efficiency of the evaluation pipeline.
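For reference, the sketch below implements plain Team Draft interleaving, the baseline that Generalised Team Draft extends, together with the usual click-credit assignment. It omits everything the thesis adds: data-driven interleaving policies, learned click scoring, grid-based result layouts and sequential stopping rules.

```python
"""A compact implementation of standard Team Draft interleaving and simple
click-credit assignment.  A baseline sketch only, not the thesis's
Generalised Team Draft framework."""
import random


def team_draft_interleave(ranking_a, ranking_b):
    interleaved, team, seen = [], [], set()
    all_docs = set(ranking_a) | set(ranking_b)
    while len(seen) < len(all_docs):
        # Coin flip decides which ranker picks first in this round.
        order = ["A", "B"] if random.random() < 0.5 else ["B", "A"]
        for side in order:
            source = ranking_a if side == "A" else ranking_b
            doc = next((d for d in source if d not in seen), None)
            if doc is not None:
                interleaved.append(doc)
                team.append(side)
                seen.add(doc)
    return interleaved, team


def credit(team, clicked_positions):
    # Each click credits the ranker that contributed the clicked document.
    wins = {"A": 0, "B": 0}
    for pos in clicked_positions:
        wins[team[pos]] += 1
    return wins


ranking_a = ["d1", "d2", "d3", "d4"]
ranking_b = ["d3", "d1", "d5", "d2"]
interleaved, team = team_draft_interleave(ranking_a, ranking_b)
print(interleaved, credit(team, clicked_positions=[0, 2]))
```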
Abstract:
The proliferation of new mobile communication devices, such as smartphones and tablets, has led to an exponential growth in network traffic. The demand for supporting the fast-growing consumer data rates urges wireless service providers and researchers to seek a new efficient radio access technology, the so-called 5G technology, beyond what current 4G LTE can provide. On the other hand, ubiquitous RFID tags, sensors, actuators, mobile phones, etc. cut across many areas of modern-day living, offering the ability to measure, infer and understand environmental indicators. The proliferation of these devices has given rise to the term Internet of Things (IoT). For researchers and engineers in the field of wireless communication, the exploration of new effective techniques to support 5G communication and the IoT has become an urgent task, which not only leads to fruitful research but also enhances the quality of our everyday life. Massive MIMO, which has shown great potential in improving the achievable rate with a very large number of antennas, has become a popular candidate. However, the requirement of deploying a large number of antennas at the base station may not be feasible in indoor scenarios. Does there exist a good alternative that can achieve similar system performance to massive MIMO in indoor environments? In this dissertation, we address this question by proposing the time-reversal (TR) technique as a counterpart of massive MIMO in indoor scenarios with a massive multipath effect. It is well known that radio signals experience many multipaths due to reflection from various scatterers, especially in indoor environments. The traditional TR waveform is able to create a focusing effect at the intended receiver with very low transmitter complexity in a severe multipath channel. TR's focusing effect is in essence a spatial-temporal resonance effect that brings all the multipaths to arrive at a particular location at a specific moment. We show that, by using time-reversal signal processing with a sufficiently large bandwidth, one can harvest the massive multipaths naturally existing in a rich-scattering environment to form a large number of virtual antennas and achieve the desired massive multipath effect with a single antenna. Further, we explore the optimal bandwidth for the TR system to achieve maximal spectral efficiency. Through evaluating the spectral efficiency, the optimal bandwidth for the TR system is found to be determined by the system parameters, e.g., the number of users and the backoff factor, rather than by the waveform type. Moreover, we investigate the tradeoff between complexity and performance by establishing a generalized relationship between system performance and waveform quantization in a practical communication system. It is shown that 4-bit quantized waveforms can be used to achieve a bit-error-rate similar to that of a TR system with perfect-precision waveforms. Besides 5G technology, the Internet of Things (IoT) is another area that has recently attracted more and more attention from both academia and industry. In the second part of this dissertation, the heterogeneity issue within the IoT is explored. One significant form of heterogeneity, considering the massive number of devices in the IoT, is device heterogeneity, i.e., heterogeneous bandwidths and associated radio-frequency (RF) components.
Traditional middleware techniques result in the fragmentation of the whole network, hampering the interoperability of objects and slowing down the development of a unified reference model for the IoT. We propose a novel TR-based heterogeneous system, which can address the bandwidth heterogeneity and maintain the benefit of TR at the same time. The increased complexity in the proposed system lies in the digital processing at the access point (AP), instead of at the devices' end, and can be easily handled with a more powerful digital signal processor (DSP). Meanwhile, the complexity of the terminal devices stays low and therefore satisfies the low-complexity and scalability requirements of the IoT. Since there is no middleware in the proposed scheme and the additional physical-layer complexity is concentrated on the AP side, the proposed heterogeneous TR system better satisfies the low-complexity and energy-efficiency requirements for the terminal devices (TDs) compared with the middleware approach.
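The focusing effect discussed above can be reproduced numerically: pre-filtering with the time-reversed, conjugated channel impulse response turns the multipath channel into its autocorrelation, which peaks sharply at a single delay. The sketch below uses a synthetic exponentially decaying channel, which is a simplifying assumption rather than a measured indoor response.

```python
"""A small numerical illustration of the time-reversal (TR) focusing effect:
pre-filtering with the time-reversed, conjugated channel impulse response
concentrates the received energy into one tap.  The random exponentially
decaying channel is a simplifying assumption."""
import numpy as np

rng = np.random.default_rng(0)
L = 64                                   # number of multipath taps
h = (rng.normal(size=L) + 1j * rng.normal(size=L)) * np.exp(-np.arange(L) / 20)

# TR waveform: time-reversed complex conjugate of the channel, energy-normalised.
g = np.conj(h[::-1])
g /= np.linalg.norm(g)

# Equivalent channel seen at the intended receiver: h convolved with g,
# i.e. the (scaled) autocorrelation of h, which peaks at delay L-1.
equivalent = np.convolve(h, g)
peak = np.max(np.abs(equivalent))
sidelobe = np.sort(np.abs(equivalent))[-2]

print(f"focusing gain (peak / strongest sidelobe): {peak / sidelobe:.2f}")
```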
Abstract:
By providing vehicle-to-vehicle and vehicle-to-infrastructure wireless communications, vehicular ad hoc networks (VANETs), also known as the "networks on wheels", can greatly enhance traffic safety, traffic efficiency and the driving experience for intelligent transportation systems (ITS). However, the unique features of VANETs, such as high mobility and uneven distribution of vehicular nodes, impose critical challenges on the efficiency and reliability of VANET implementations. This dissertation is motivated by the great application potential of VANETs in the design of efficient in-network data processing and dissemination. Considering the significance of message aggregation, data dissemination and data collection, this dissertation research targets enhancing traffic safety and traffic efficiency, as well as developing novel commercial applications based on VANETs, along four aspects: 1) accurate and efficient message aggregation to detect on-road safety-relevant events, 2) reliable data dissemination to reliably notify remote vehicles, 3) efficient and reliable spatial data collection from vehicular sensors, and 4) novel promising applications to exploit the commercial potential of VANETs. Specifically, to enable cooperative detection of safety-relevant events on the roads, the structure-less message aggregation (SLMA) scheme is proposed to improve communication efficiency and message accuracy. The scheme of relative position based message dissemination (RPB-MD) is proposed to reliably and efficiently disseminate messages to all intended vehicles in the zone-of-relevance under varying traffic density. Given the large amount of vehicular sensor data available in VANETs, the scheme of compressive sampling based data collection (CS-DC) is proposed to efficiently collect spatially relevant data at a large scale, especially in dense traffic. In addition, with novel and efficient solutions proposed for the application-specific issues of data dissemination and data collection, several appealing value-added applications for VANETs are developed to exploit the commercial potential of VANETs, namely general purpose automatic survey (GPAS), VANET-based ambient ad dissemination (VAAD) and VANET-based vehicle performance monitoring and analysis (VehicleView). Thus, by improving the efficiency and reliability of in-network data processing and dissemination, including message aggregation, data dissemination and data collection, together with the development of novel promising applications, this dissertation will help push VANETs further towards the stage of massive deployment.
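As a generic illustration of the compressive-sampling idea behind CS-DC, the sketch below recovers a sparse vector of sensor readings from far fewer random linear measurements using orthogonal matching pursuit. It is a textbook example with arbitrary dimensions, not the CS-DC scheme or its in-network measurement design.

```python
"""A generic compressive-sampling illustration: a spatially sparse signal is
recovered from far fewer random measurements than sensors.  Textbook sketch
only; dimensions and sparsity level are arbitrary."""
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

rng = np.random.default_rng(1)
n_sensors, n_measurements, sparsity = 200, 60, 8

# Sparse "road event" vector: only a few sensor locations report non-zero values.
x_true = np.zeros(n_sensors)
support = rng.choice(n_sensors, size=sparsity, replace=False)
x_true[support] = rng.normal(size=sparsity)

# Random measurement matrix (e.g., random linear combinations collected in-network).
Phi = rng.normal(size=(n_measurements, n_sensors)) / np.sqrt(n_measurements)
y = Phi @ x_true

omp = OrthogonalMatchingPursuit(n_nonzero_coefs=sparsity, fit_intercept=False)
omp.fit(Phi, y)
x_hat = omp.coef_

print("relative reconstruction error:",
      np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```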
Abstract:
NEW DATA ON THE CHRONOLOGY OF THE VALE DO FORNO SEDIMENTARY SEQUENCE (LOWER TAGUS RIVER TERRACE STAIRCASE) AND ITS RELEVANCE AS FLUVIAL ARCHIVE OF THE MIDDLE PLEISTOCENE IN WESTERN IBERIA
Pedro P. Cunha (1), António A. Martins (2), Jan-Pieter Buylaert (3,4), Andrew S. Murray (4), Luis Raposo (5), Paolo Mozzi (6), Martin Stokes (7)
(1) MARE - Marine and Environmental Sciences Centre, Department of Earth Sciences, University of Coimbra, Portugal; pcunha@dct.uc.pt
(2) MARE - Marine and Environmental Sciences Centre, Dep. Geociências, University of Évora, Portugal; aam@uevora.pt
(3) Centre for Nuclear Technologies, Technical University of Denmark, Risø Campus, Denmark; jabu@dtu.dk
(4) Nordic Laboratory for Luminescence Dating, Aarhus University, Risø DTU, Denmark; anmu@dtu.dk
(5) Museu Nacional de Arqueologia, Lisboa, Portugal; 3raposos@sapo.pt
(6) Department of Geosciences, University of Padova, Italy; paolo.mozzi@unipd.it
(7) School of Geography, Earth and Environmental Sciences, University of Plymouth, UK; m.stokes@plymouth.ac.uk
The stratigraphic units that record the evolution of the Tagus River in Portugal (study area between Vila Velha de Ródão and Porto Alto villages; Fig. 1) have different sedimentary characteristics and lithic industries (Cunha et al., 2012):
- a culminant sedimentary unit (the ancestral Tagus, before the drainage network entrenchment) – SLD13 (+142 to 262 m above river bed – a.r.b.; probable age ca. 3.6 to 1.8 Ma), without artefacts;
- T1 terrace (+84 to 180 m; ca. 1000? to 900 ka), without artefacts;
- T2 terrace (+57 to 150 m; top deposits with a probable age ca. 600 ka), without artefacts;
- T3 terrace (+43 to 113 m; ca. 460 to 360? ka), without artefacts;
- T4 terrace (+26 to 55 m; ca. 335 to 155 ka), Lower Paleolithic (Acheulian) at basal and middle levels but early Middle Paleolithic at top levels;
- T5 terrace (+5 to 34 m; 135 to 73 ka), Middle Paleolithic (Mousterian; Levallois technique);
- T6 terrace (+3 to 14 m; 62 to 32 ka), late Middle Paleolithic (late Mousterian);
- Carregueira Sands (aeolian sands) and colluvium (+3 to ca. 100 m; 32 to 12 ka), Upper Paleolithic to Epipaleolithic;
- alluvial plain (+0 to 8 m; ca. 12 ka to present), Mesolithic and more recent industries.
The differences in elevation (a.r.b.) of the several terrace staircases result from differential uplift due to active faults. Longitudinal correlation with the terrace levels indicates that a graded profile ca. 200 km long was achieved during terrace formation periods and that a strong control by sea base level was determinant for terrace formation. The Neogene sedimentary units constituted the main source of sediments for the fluvial terraces (Fig. 2). Geomorphological mapping, coupled with lithostratigraphy, sedimentology and luminescence dating (quartz-OSL and K-feldspar post-IRIR290), was used in this study focused on the T4 terrace, which comprises a Lower Gravels (LG) unit and an Upper Sand (US) unit. The thick, coarse and dominantly massive gravels of the LG unit indicate deposition by a coarse bed-load braided river, with strong sediment supply, high gradient and fluvial competence, during conditions of rapidly rising sea level. Luminescence dating only provided minimum ages, but it is probable that the LG unit corresponds to the earlier part of MIS9 (ca. 335 to 325 ka), immediately postdating the incision promoted by the very low sea level (reaching ca. -140 m) during MIS10 (362 to 337 ka), a period of relatively cold climate conditions with weak vegetation cover on slopes and low sea level.
Fig. 1. Main Portuguese reaches in which the Tagus River can be divided (Lower Tagus Basin): I – from the Spanish border to Arneiro (a general E–W trend, mainly consisting of polygonal segments); II – from Arneiro to Gavião (NE–SW); III – from Gavião to Arripiado (E–W); IV – from Arripiado to Vila Franca de Xira (NNE–SSW); V – from Vila Franca de Xira to the Atlantic shoreline. The faults considered to be the limits of the referred fluvial sectors are: F1 – Ponsul-Arneiro fault (WSW–ENE); F2 – Gavião fault (NW–SE); F3 – Ortiga fault (NW–SE); F4 – Vila Nova da Barquinha fault (W–E); F5 – Arripiado-Chamusca fault (NNE–SSW). 1 – estuary; 2 – terraces; 3 – faults; 4 – Tagus main channel. The main Iberian drainage basins are also represented (inset).
The lower and middle parts of the US unit, comprising an alternation of clayish silts with paleosols and minor sands to the east (flood-plain deposits) and sand deposits to the west (channel belt), have a probable age of ca. 325 to 200 ka. This points to formation during MIS9 to MIS7, under high to medium sea levels and warm to mild climatic conditions. The upper part of the US unit, dominated by sand facies and with OSL ages of ca. 200 to 154 ka, correlates with the early part of MIS6. During this period, progradation resulted from climate deterioration and relative depletion of vegetation that promoted enhanced sediment production in the catchment, coupled with the initiation of sea-level lowering that increased the longitudinal slope. The Vale do Forno and Vale da Atela archaeological sites (Alpiarça, central Portugal) document the earliest human occupation in the Lower Tagus River, well established in geomorphological and environmental terms, within the Middle Pleistocene. The Lower Palaeolithic sites were found on the T4 terrace (+26 m, a.r.b.). The oldest artefacts, previously found in the LG unit, display crude bifacial forms that can be attributed to the Acheulian, with a probable age of ca. 335 to 325 ka. The T4 US unit has archaeological sites stratigraphically documenting successive phases of an evolved Acheulian, probably dating to ca. 325 to 300 ka. Notably, these Lower Palaeolithic artisans were able to produce tools with different sophistication levels simply by applying different strategies: more elaborate reduction sequences in the case of bifaces and simple reduction sequences to obtain cleavers.
Fig. 2. Simplified geologic map of the Lower Tagus Cenozoic basin (adapted from the Carta Geológica de Portugal, 1/500000, 1992). The study area (comprising the Vale do Forno and Vale de Atela sites) is located on the most upstream sector of the Lower Tagus River reach IV, between Arripiado and Chamusca villages. 1 – alluvium (Holocene); 2 – terraces (Pleistocene); 3 – sands, silts and gravels (Paleogene to Pliocene); 4 – Sintra Massif (Cretaceous); 5 – limestones, marls, silts and sandstones (Mesozoic); 6 – quartzites (Ordovician); 7 – basement (Proterozoic to Palaeozoic); 8 – main fault. The main Portuguese reaches of the Tagus River are identified (I to V).
The VF3 site (Milharós), containing a Final Acheulian industry with fine and elaborate bifaces, found in a stratigraphic level located between the T4 terrace deposits and a colluvium associated with Late Pleistocene aeolian sands (32 to 12 ka), has an age younger than ca. 154 ka but much older than 32 ka. In the study area, the sedimentary units of the T4 terrace seem to record the river response to sea-level changes and climatically-driven fluctuations in sediment supply.
REFERENCES
Cunha, P. P., Almeida, N. A. C., Aubry, T., Martins, A. A., Murray, A. S., Buylaert, J.-P., Sohbati, R., Raposo, L., Rocha, L., 2012. Records of human occupation from Pleistocene river terrace and aeolian sediments in the Arneiro depression (Lower Tejo River, central eastern Portugal). Geomorphology, vol. 165-166, pp. 78-90.
Abstract:
Recent years have witnessed an increasing evolution of wireless mobile networks, with intensive research work aimed at developing new efficient techniques for the future 6G standards. In the framework of massive machine-type communication (mMTC), emerging Internet of Things (IoT) applications, in which sensor nodes and smart devices transmit short data packets unpredictably and sporadically without coordination, are gaining increasing interest. In this work, new medium access control (MAC) protocols for massive IoT, capable of supporting non-instantaneous feedback from the receiver, are studied. These schemes tolerate a long delay for the acknowledgment (ACK) messages from the base station (BS) without a significant performance loss. Then, an error floor analysis of the considered protocols is performed in order to obtain useful guidelines for the system design. Furthermore, non-orthogonal multiple access (NOMA) coded random access (CRA) schemes based on the power domain are developed. The introduction of power diversity makes it possible to resolve more packet collisions at the physical (PHY) layer, with a significant reduction of the packet loss rate (PLR) for a given number of active users in the system. The proposed solutions aim to improve current grant-free protocols while respecting the stringent scalability, reliability and latency constraints required by 6G networks.
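The coded random access principle underlying these schemes can be illustrated with a small slotted-frame simulation: each active user transmits two replicas of its packet in random slots, and the receiver repeatedly decodes singleton slots and cancels the corresponding replicas elsewhere. The sketch below models only this baseline behaviour; the power-domain NOMA component and the delayed-feedback mechanisms studied in the work are not included, and the frame parameters are arbitrary.

```python
"""A toy simulation of coded random access with successive interference
cancellation (SIC): users place two packet replicas in random slots, and the
receiver iteratively decodes singleton slots and cancels their replicas.
Baseline illustration only; no power-domain NOMA or delayed feedback."""
import random


def simulate_frame(n_users=40, n_slots=100, n_replicas=2, seed=0):
    rng = random.Random(seed)
    slots = [set() for _ in range(n_slots)]   # slots[s] = users with a replica in slot s
    placements = {}
    for u in range(n_users):
        chosen = rng.sample(range(n_slots), n_replicas)
        placements[u] = chosen
        for s in chosen:
            slots[s].add(u)

    decoded, progress = set(), True
    while progress:
        progress = False
        for s in range(n_slots):
            if len(slots[s]) == 1:            # singleton slot: packet decodable
                u = next(iter(slots[s]))
                decoded.add(u)
                for s2 in placements[u]:      # cancel all replicas of user u
                    slots[s2].discard(u)
                progress = True
    return 1 - len(decoded) / n_users         # packet loss rate


print("PLR:", simulate_frame())
```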
Abstract:
High-throughput screening of physical, genetic and chemical-genetic interactions brings important perspectives to the Systems Biology field, as the analysis of these interactions provides new insights into protein/gene function, cellular metabolic variations and the validation of therapeutic targets and drug design. However, such analysis depends on a pipeline connecting different tools that can automatically integrate data from diverse sources and result in a more comprehensive dataset that can be properly interpreted. We describe here the Integrated Interactome System (IIS), an integrative platform with a web-based interface for the annotation, analysis and visualization of the interaction profiles of proteins/genes, metabolites and drugs of interest. IIS works in four connected modules: (i) the Submission module, which receives raw data derived from Sanger sequencing (e.g. two-hybrid system); (ii) the Search module, which enables the user to search for the processed reads to be assembled into contigs/singlets, or for lists of proteins/genes, metabolites and drugs of interest, and add them to the project; (iii) the Annotation module, which assigns annotations from several databases to the contigs/singlets or lists of proteins/genes, generating tables with automatic annotation that can be manually curated; and (iv) the Interactome module, which maps the contigs/singlets or the uploaded lists to entries in our integrated database, building networks that gather newly identified interactions, protein and metabolite expression/concentration levels, subcellular localization, computed topological metrics, and GO biological process and KEGG pathway enrichment. This module generates an XGMML file that can be imported into Cytoscape or visualized directly on the web. We developed IIS by integrating diverse databases, following the need for appropriate tools for a systematic analysis of physical, genetic and chemical-genetic interactions. IIS was validated with yeast two-hybrid, proteomics and metabolomics datasets, but it is also extendable to other datasets. IIS is freely available online at: http://www.lge.ibi.unicamp.br/lnbio/IIS/.
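To illustrate the Interactome module's output format, the sketch below writes a tiny interaction network as XGMML so it can be opened in Cytoscape. The interactions, attribute names and file name are invented, and the element layout is a minimal approximation of XGMML rather than the exact structure IIS generates.

```python
"""A minimal sketch of writing an interaction network as XGMML for Cytoscape
import.  The interactions and attributes are made up; this approximates the
XGMML layout rather than reproducing IIS output exactly."""
import xml.etree.ElementTree as ET

interactions = [("TP53", "MDM2", "physical"),
                ("TP53", "ATM", "physical"),
                ("ATM", "CHEK2", "genetic")]

root = ET.Element("graph", {"label": "toy_interactome", "directed": "0",
                            "xmlns": "http://www.cs.rpi.edu/XGMML"})

# One <node> element per distinct protein/gene name.
node_ids = {}
for name in {n for edge in interactions for n in edge[:2]}:
    node_ids[name] = str(len(node_ids) + 1)
    ET.SubElement(root, "node", {"id": node_ids[name], "label": name})

# One <edge> element per interaction, annotated with its interaction type.
for source, target, kind in interactions:
    edge = ET.SubElement(root, "edge", {"source": node_ids[source],
                                        "target": node_ids[target],
                                        "label": f"{source}-{target}"})
    ET.SubElement(edge, "att", {"name": "interaction_type",
                                "value": kind, "type": "string"})

ET.ElementTree(root).write("toy_interactome.xgmml",
                           xml_declaration=True, encoding="utf-8")
```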