872 results for Repositories Mining
Abstract:
Almost 182 million citizens of the European Union (37.5% of the total population) live in approximately 130 border and cross-border regions. These regions contribute significantly to the process of European integration. Their importance is documented by the 2007-2013 Structural Funds package, which was presented by the European Commission and recently approved by the European Parliament. Whereas the EU spent some €4,875 million on cross-border, transnational and interregional cooperation under the Interreg initiative in the 2000-2006 period, European territorial cooperation will become one of the three objectives of the Structural Funds and will receive €7.75 billion (€5.57 billion for cross-border cooperation alone) for the 2007-2013 period (European Commission, 2006a, 2006b). In addition, a new set of rules for establishing a "European Grouping of Territorial Cooperation" (EGTC) has been adopted, which will facilitate cross-border, transnational and interregional cooperation in the EU. This paper deals with the structures of institutionalization, decision-making and implementation, and with the policies, of the "Greater Region" / "Großregion" (hereafter: GR or Greater Region).
Abstract:
Gold-mining may play an important role in the maintenance of malaria worldwide. Gold-mining, mostly illegal, has significantly expanded in Colombia during the last decade in areas with limited health care and disease prevention. We report a descriptive study that was carried out to determine the malaria prevalence in gold-mining areas of Colombia, using data from the public health surveillance system (National Health Institute) during the period 2010-2013. Gold-mining was more prevalent in the departments of Antioquia, Córdoba, Bolívar, Chocó, Nariño, Cauca, and Valle, which contributed 89.3% (270,753 cases) of the national malaria incidence from 2010 to 2013; 31.6% of malaria cases were from mining areas. Mining regions, such as El Bagre, Zaragoza, and Segovia, in Antioquia, Puerto Libertador and Montelíbano, in Córdoba, and Buenaventura, in Valle del Cauca, were the most endemic areas. The annual parasite index (API) correlated with gold production (R² = 0.82, p < 0.0001); for every 100 kg of gold produced, the API increased by 0.54 cases per 1,000 inhabitants. Lack of malaria control activities, together with high migration and proliferation of mosquito breeding sites, contribute to malaria in gold-mining regions. Specific control activities must be introduced to control this significant source of malaria in Colombia.
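The reported slope can be read as simple arithmetic. The sketch below assumes the usual definition of the annual parasite index (confirmed cases per 1,000 inhabitants per year). Only the slope of 0.54 per 100 kg of gold comes from the abstract; the function names and the municipality figures are hypothetical and are not the study's data.

```python
def annual_parasite_index(cases: int, population: int) -> float:
    """Return confirmed malaria cases per 1,000 inhabitants for one year."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 1000.0 * cases / population


# Slope reported in the abstract: +0.54 API per 100 kg of gold produced.
SLOPE_PER_100_KG = 0.54


def expected_api_increase(gold_kg: float) -> float:
    """Linear reading of the reported slope for a given gold output in kg."""
    return SLOPE_PER_100_KG * gold_kg / 100.0


if __name__ == "__main__":
    # Hypothetical municipality: 450 cases among 30,000 inhabitants.
    print(annual_parasite_index(cases=450, population=30000))
    # Hypothetical output of 250 kg of gold.
    print(expected_api_increase(250))
```

The slope describes an association across mining areas; it does not by itself show that an extra 100 kg of gold causes the increase.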
Abstract:
The objective of the PANACEA ICT-2007.2.2 EU project is to build a platform that automates the stages involved in the acquisition, production, updating and maintenance of the large language resources required by, among others, MT systems. The development of a Corpus Acquisition Component (CAC) for extracting monolingual and bilingual data from the web is one of the most innovative building blocks of PANACEA. The CAC, which is the first stage in the PANACEA pipeline for building Language Resources, adopts an efficient and distributed methodology to crawl for web documents with rich textual content in specific languages and predefined domains. The CAC includes modules that can acquire parallel data from sites with in-domain content available in more than one language. In order to extrinsically evaluate the CAC methodology, we have conducted several experiments that used crawled parallel corpora for the identification and extraction of parallel sentences using sentence alignment. The corpora were then successfully used for domain adaptation of Machine Translation Systems.
Abstract:
The present work, entitled "Data and Text Mining Techniques for the Annotation of a Digital Archive", aims to test the feasibility of using automatic text processing techniques to annotate the sessions of the parliamentary debates of the Assembleia da República of Portugal. The work covers concepts such as knowledge discovery technologies (KDD), the process of knowledge discovery in text, the characterization of the various stages of text processing, and the description of some open-source tools for text mining. The methodology was based on experimenting with several text processing techniques using the open-source tool R/tm. The results presented show the influence of preprocessing, document size and corpus size on the outcome of processing with the knnflex algorithm.
Abstract:
Data mining can be defined as the extraction of previously unknown and potentially useful information from large datasets. The main principle is to devise computer programs that run through databases and automatically seek deterministic patterns. It is applied in different fields, e.g., remote sensing, biometry, and speech recognition, but has seldom been applied to forensic case data. The intrinsic difficulty related to the use of such data lies in its heterogeneity, which comes from the many different sources of information. The aim of this study is to highlight potential uses of pattern recognition that would provide relevant results from a criminal intelligence point of view. The role of data mining within a global crime analysis methodology is to detect all types of structures in a dataset. Once filtered and interpreted, those structures can point to previously unseen criminal activities. The interpretation of patterns for intelligence purposes is the final stage of the process. It allows the researcher to validate the whole methodology and to refine each step if necessary. An application to cutting agents found in illicit drug seizures was performed. A combinatorial approach was taken, using the presence and the absence of products. Methods from graph theory were used to extract patterns in data constituted by links between products and the place and date of seizure. A data mining process completed using graphing techniques is called "graph mining". Patterns were detected that had to be interpreted and compared with preliminary knowledge to establish their relevancy. The illicit drug profiling process is actually an intelligence process that uses preliminary illicit drug classes to classify new samples. Methods proposed in this study could be used a priori to compare structures from preliminary and post-detection patterns.
This new knowledge of a repeated structure may provide valuable complementary information to profiling and become a source of intelligence.
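As an illustration of the presence/absence approach described above, the following minimal sketch links products that appear in the same seizure and then extracts groups of linked products. It is not the authors' method: the seizure identifiers, the product lists and the `min_weight` threshold are hypothetical, and place and date of seizure are left out for brevity.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_graph(seizures):
    """Count, for each unordered pair of products, the seizures containing both."""
    weights = defaultdict(int)
    for products in seizures.values():
        for a, b in combinations(sorted(set(products)), 2):
            weights[(a, b)] += 1
    return dict(weights)


def connected_components(edges, min_weight=1):
    """Group products linked by edges seen in at least min_weight seizures."""
    adjacency = defaultdict(set)
    for (a, b), weight in edges.items():
        if weight >= min_weight:
            adjacency[a].add(b)
            adjacency[b].add(a)
    seen, components = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # depth-first traversal from the start node
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(sorted(component))
    return components


# Hypothetical seizures: identifier -> cutting agents detected (presence only).
seizures = {
    "S1": ["caffeine", "paracetamol"],
    "S2": ["caffeine", "paracetamol", "lidocaine"],
    "S3": ["lidocaine", "phenacetin"],
    "S4": ["caffeine", "paracetamol"],
}

if __name__ == "__main__":
    edges = cooccurrence_graph(seizures)
    print(connected_components(edges, min_weight=2))
```

Raising `min_weight` keeps only repeated associations, which is one simple way to filter patterns before they are interpreted.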
Abstract:
This work, carried out under the regulations of the undergraduate programmes of the Universidade Jean Piaget de Cabo Verde, seeks to highlight the importance of collecting data on the Web today. It also presents a CMS (Content Management System) used in website development, showing that useful data about how websites are accessed and used can be obtained by equipping them with components developed for these systems.
Abstract:
Ultra-high-throughput sequencing (UHTS) techniques are evolving rapidly and may soon become an affordable and routine tool for sequencing plant DNA, even in smaller plant biology labs. Here we review recent insights into intraspecific genome variation gained from UHTS, which offers a glimpse of the rather unexpected levels of structural variability among Arabidopsis thaliana accessions. The challenges that will need to be addressed to efficiently assemble and exploit this information are also discussed.
Abstract:
Introduction. The DRIVER I project drew up a detailed report of European repositories based on data gathered in a survey in which Spain's participation was very low. This created a highly distorted image of the implementation of repositories in Spain. This study aims to analyse the current state of Spanish open-access institutional repositories and to describe their characteristics. Method. The data were gathered through a Web survey. The questionnaire was based on that used by DRIVER I: coverage; technical infrastructure and technical issues; institutional policies; services created; and stimulators and inhibitors for establishing, filling and maintaining their digital institutional repositories. Analysis. Data were tabulated and analysed systematically according to the responses obtained from the questionnaire and grouped by coverage. Results. Responses were obtained from 38 of the 104 institutions contacted, which had 29 institutional repositories. This represents 78.3% of the Spanish repositories according to the BuscaRepositorios directory. Spanish repositories contained mainly full-text materials (journal articles and doctoral theses) together with metadata. The software most used was DSpace, followed by EPrints. The metadata standard most used was Dublin Core. Spanish repositories offered more usage statistics and fewer author-oriented services than the European average. The priorities for the future development of the repositories are the need for clear policies on access to scientific production based on public funding and the need for quality control indicators. Conclusions. This is the first detailed study of Spanish institutional repositories. The key stimulants for establishing, filling and maintaining repositories were, in order of importance, the increase of visibility and citation, the interest of decision-makers, simplicity of use and search services.
On the other hand, the main inhibitors identified were the absence of policies, the lack of integration with other national and international systems and the lack of awareness efforts among academia.
Abstract:
Although research on influenza has lasted for more than 100 years, it is still one of the most prominent diseases, causing half a million human deaths every year. With the recent observation of new highly pathogenic H5N1 and H7N7 strains, and the appearance of the influenza pandemic caused by the H1N1 swine-like lineage, a collaborative effort to share observations on the evolution of this virus in both animals and humans has been established. The OpenFlu database (OpenFluDB) is a part of this collaborative effort. It contains genomic and protein sequences, as well as epidemiological data from more than 27,000 isolates. The isolate annotations include virus type, host, geographical location and experimentally tested antiviral resistance. Putative enhanced pathogenicity as well as human adaptation propensity are computed from protein sequences. Each virus isolate can be associated with the laboratories that collected, sequenced and submitted it. Several analysis tools including multiple sequence alignment, phylogenetic analysis and sequence similarity maps enable rapid and efficient mining. The contents of OpenFluDB are supplied by direct user submission, as well as by a daily automatic procedure importing data from public repositories. Additionally, a simple mechanism facilitates the export of OpenFluDB records to GenBank. This resource has been successfully used to rapidly and widely distribute the sequences collected during the recent human swine flu outbreak and also as an exchange platform during the vaccine selection procedure. Database URL: http://openflu.vital-it.ch.
Abstract:
Tannery residues and coal mine waste are heavily polluting sources in Brazil, mainly in the Southern States of Rio Grande do Sul and Santa Catarina. In order to study the effects of residues of chrome leather tanning (sludge and leather shavings) and coal waste on soybean and maize crops, a field experiment has been in progress since 1996 at the Federal University of Rio Grande do Sul Experimental Station, county of Eldorado do Sul, Brazil. The residues were applied twice (growing seasons 1996/97 and 1999/00). The amounts of tannery residues were applied according to their neutralizing value, at rates of up to 86.8 t ha⁻¹, supplying from 671 to 1,342 kg ha⁻¹ Cr(III); coal waste was applied at a total rate of 164 t ha⁻¹. Crop yield and dry matter production were evaluated, as well as the nutrients (N, P, K, Ca, Mg, Cu and Zn) and Cr contents. Crop yields with tannery sludge application were similar to those obtained with N and lime supplied with mineral amendments. Plant Cr absorption did not increase significantly with the residue application. Tannery sludge can also be used to neutralize the high acidity developed in the soil by coal mine waste.
Abstract:
The current research project is both a process and impact evaluation of community policing in Switzerland's five major urban areas - Basel, Bern, Geneva, Lausanne, and Zurich. Community policing is both a philosophy and an organizational strategy that promotes a renewed partnership between the police and the community to solve problems of crime and disorder. The process evaluation data on police internal reforms were obtained through semi-structured interviews with key administrators from the five police departments as well as from police internal documents and additional public sources.
The impact evaluation uses official crime records and census statistics as contextual variables as well as Swiss Crime Survey (SCS) data on fear of crime, perceptions of disorder, and public attitudes towards the police as outcome measures. The SCS is a standing survey instrument that has polled residents of the five urban areas repeatedly since the mid-1980s. The process evaluation produced a "Calendar of Action" to create panel data to measure community policing implementation progress over six evaluative dimensions in intervals of five years between 1990 and 2010. The impact evaluation, carried out ex post facto, uses an observational design that analyzes the impact of the different community policing models between matched comparison areas across the five cities. Using ZIP code districts as proxies for urban neighborhoods, geospatial data mining algorithms serve to develop a neighborhood typology in order to match the comparison areas. To this end, both unsupervised and supervised algorithms are used to analyze high-dimensional data on crime, the socio-economic and demographic structure, and the built environment in order to classify urban neighborhoods into clusters of similar type. In a first step, self-organizing maps serve as tools to develop a clustering algorithm that reduces the within-cluster variance in the contextual variables and simultaneously maximizes the between-cluster variance in survey responses. The random forests algorithm then serves to assess the appropriateness of the resulting neighborhood typology and to select the key contextual variables in order to build a parsimonious model that makes a minimum of classification errors. 
Finally, for the impact analysis, propensity score matching methods are used to match the survey respondents of the pretest and posttest samples on age, gender, and their level of education for each neighborhood type identified within each city, before conducting a statistical test of the observed difference in the outcome measures. Moreover, all significant results were subjected to a sensitivity analysis to assess the robustness of these findings in the face of potential bias due to some unobserved covariates. The study finds that over the last fifteen years, all five police departments have undertaken major reforms of their internal organization and operating strategies and forged strategic partnerships in order to implement community policing. The resulting neighborhood typology reduced the within-cluster variance of the contextual variables and accounted for a significant share of the between-cluster variance in the outcome measures prior to treatment, suggesting that geocomputational methods help to balance the observed covariates and hence to reduce threats to the internal validity of an observational design. Finally, the impact analysis revealed that fear of crime dropped significantly over the 2000-2005 period in the neighborhoods in and around the urban centers of Bern and Zurich. These improvements are fairly robust in the face of bias due to some unobserved covariate and covary temporally and spatially with the implementation of community policing. The alternative hypothesis that the observed reductions in fear of crime were at least in part a result of community policing interventions thus appears at least as plausible as the null hypothesis of absolutely no effect, even if the observational design cannot completely rule out selection and regression to the mean as alternative explanations.
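The matching step can be illustrated with a minimal sketch of greedy one-to-one nearest-neighbour matching on propensity scores. This is an assumption-laden illustration, not the study's procedure: the abstract does not say how the scores were estimated or matched, so the scores below are taken as already computed (for example from age, gender and education), and the respondent identifiers and the `caliper` value are hypothetical.

```python
def match_on_propensity(pretest, posttest, caliper=0.05):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    pretest and posttest map respondent ids to precomputed propensity scores.
    A posttest respondent is matched to the closest unused pretest respondent,
    provided the score difference does not exceed the caliper.
    """
    available = dict(pretest)
    pairs = []
    # Process posttest respondents in order of increasing score.
    for post_id, post_score in sorted(posttest.items(), key=lambda kv: kv[1]):
        if not available:
            break
        best_id = min(
            available,
            key=lambda pid: (abs(available[pid] - post_score), pid),
        )
        if abs(available[best_id] - post_score) <= caliper:
            pairs.append((post_id, best_id))
            del available[best_id]  # each pretest respondent is used once
    return pairs


if __name__ == "__main__":
    # Hypothetical scores for one neighbourhood type in one city.
    pretest = {"a": 0.30, "b": 0.52, "c": 0.90}
    posttest = {"x": 0.50, "y": 0.31, "z": 0.70}
    print(match_on_propensity(pretest, posttest))
```

Respondents without a close enough counterpart (here `z`) are left unmatched, which trades sample size for better balance on the observed covariates.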
Abstract:
Tests for bioaccessibility are useful in human health risk assessment. No research data with the objective of determining bioaccessible arsenic (As) in areas affected by gold mining and smelting activities have been published so far in Brazil. Samples were collected from four areas: a private natural land reserve of Cerrado; mine tailings; overburden; and refuse from gold smelting of a mining company in Paracatu, Minas Gerais. The total, bioaccessible and Mehlich-1-extractable As levels were determined. Based on the reproducibility and the accuracy/precision of the in vitro gastrointestinal (IVG) determination method of bioaccessible As in the reference material NIST 2710, it was concluded that this procedure is adequate to determine bioaccessible As in soil and tailing samples from gold mining areas in Brazil. All samples from the studied mining area contained low percentages of bioaccessible As.
Abstract:
The construction of a soil after surface coal mining involves heavy machinery traffic during the topographic regeneration of the area, resulting in compaction of the relocated soil layers. This leads to problems with water infiltration and redistribution along the new profile, causing water erosion and consequently hampering the revegetation of the reconstructed soil. The planting of species useful in the process of soil decompaction is a promising strategy for the recovery of the soil structural quality. This study investigated the influence of different perennial grasses, planted in September/October 2007, on the recovery of reconstructed soil aggregation in a coal mining area of the Companhia Riograndense de Mineração, located in Candiota-RS. The treatments consisted of planting: T1 - Cynodon dactylon cv vaquero; T2 - Urochloa brizantha; T3 - Panicum maximum; T4 - Urochloa humidicola; T5 - Hemarthria altissima; T6 - Cynodon dactylon cv tifton 85. Bare reconstructed soil, adjacent to the experimental area, was used as control treatment (T7) and natural soil adjacent to the mining area covered with native vegetation was used as reference area (T8). Disturbed and undisturbed soil samples were collected in October 2009 (layers 0.00-0.05 and 0.10-0.15 m) to determine the percentage of macro- and microaggregates, mean weight diameter (MWD) of aggregates, organic matter content, bulk density, and macro- and microporosity. The lower values of macroaggregates and MWD in the surface than in the subsurface layer of the reconstructed soil resulted from the high degree of compaction caused by the traffic of heavy machinery on the clay material. After 24 months, all experimental grass treatments showed improvements in soil aggregation compared to the bare reconstructed soil (control), mainly in the 0.00-0.05 m layer, particularly in the two Urochloa treatments (T2 and T4) and Hemarthria altissima (T5).
However, the great differences between the treatments with grasses and natural soil (reference) indicate that the recovery of the pre-mining soil structure could take decades.
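For readers unfamiliar with the MWD indicator, the sketch below computes it under the common definition: the sum over sieve size classes of the class mean diameter multiplied by the fraction of sample mass retained in that class. The abstract does not give the sieve classes or masses used in the study, so the values below are hypothetical.

```python
def mean_weight_diameter(classes):
    """Compute MWD in mm from (lower_mm, upper_mm, mass_g) sieve classes.

    Each class contributes its mean diameter weighted by its share of the
    total aggregate mass.
    """
    total_mass = sum(mass for _, _, mass in classes)
    if total_mass <= 0:
        raise ValueError("total aggregate mass must be positive")
    return sum(
        ((lower + upper) / 2.0) * (mass / total_mass)
        for lower, upper, mass in classes
    )


if __name__ == "__main__":
    # Hypothetical sieve classes: (lower bound mm, upper bound mm, mass g).
    sample = [
        (4.76, 9.52, 10.0),
        (2.00, 4.76, 20.0),
        (1.00, 2.00, 30.0),
        (0.25, 1.00, 25.0),
        (0.00, 0.25, 15.0),
    ]
    print(round(mean_weight_diameter(sample), 3))
```

A sample dominated by large aggregates yields a higher MWD, which is why compaction of the surface layer shows up as a lower value.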