Abstract:
In Statistical Machine Translation from English to Malayalam, an unseen English sentence is translated into its equivalent Malayalam translation using statistical models: a translation model, a language model and a decoder. A parallel English-Malayalam corpus is used in the training phase. Word-to-word alignments have to be set up among the sentence pairs of the source and target languages before subjecting them to training. This paper deals with techniques that can be adopted for improving the alignment model of SMT. Incorporating part-of-speech information into the bilingual corpus eliminated many of the insignificant alignments. Identifying the named entities and cognates present in the sentence pairs also proved advantageous while setting up the alignments. Moreover, reducing the unwanted alignments brought better training results. Experiments conducted on a sample corpus generated reasonably good Malayalam translations, and the results were verified with the F-measure, BLEU and WER evaluation metrics.
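Translation quality in the experiments above is scored with BLEU, among other metrics. As a rough illustration (not the paper's code), a unigram-only BLEU with clipped precision and a brevity penalty can be sketched as:

```python
import math
from collections import Counter

def unigram_bleu(candidate, reference):
    """Toy unigram BLEU: clipped word precision times a brevity penalty.

    Real evaluations combine clipped n-gram precisions up to n=4;
    this sketch keeps only the unigram term.
    """
    cand, ref = candidate.split(), reference.split()
    cand_counts, ref_counts = Counter(cand), Counter(ref)
    # Clip each candidate word's count by its count in the reference.
    clipped = sum(min(n, ref_counts[w]) for w, n in cand_counts.items())
    precision = clipped / len(cand)
    # Penalise candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * precision
```

A full evaluation toolkit (e.g. sacrebleu) would be used in practice; this only shows the shape of the metric.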
Abstract:
This paper presents the application of wavelet processing in the domain of handwritten character recognition. To attain a high recognition rate, robust feature extractors and powerful classifiers that are invariant to the degree of variability of human writing are needed. The proposed scheme consists of two stages: a feature extraction stage based on the Haar wavelet transform, and a classification stage that uses a support vector machine classifier. Experimental results show that the proposed method is effective.
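The Haar transform behind the feature extraction stage can be sketched in one level. The snippet below is an illustrative 1D version, not the paper's implementation; for character images the same step is applied along rows and then along columns.

```python
import math

def haar_step(signal):
    """One level of the Haar wavelet transform: pairwise averages
    (approximation) and pairwise differences (detail), scaled by 1/sqrt(2)."""
    assert len(signal) % 2 == 0
    s = 1 / math.sqrt(2)
    approx = [(a + b) * s for a, b in zip(signal[0::2], signal[1::2])]
    detail = [(a - b) * s for a, b in zip(signal[0::2], signal[1::2])]
    return approx, detail
```

Repeating the step on the approximation coefficients yields the multi-level decomposition whose coefficients serve as features.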
Abstract:
In our study we use a kernel-based regression technique, Support Vector Machine Regression, for predicting the melting point of drug-like compounds in terms of topological descriptors, topological charge indices, connectivity indices and 2D autocorrelations. The machine learning model was designed, trained and tested using a dataset of 100 compounds, and it was found that an SVMReg model with an RBF kernel could predict the melting point with a mean absolute error of 15.5854 and a root mean squared error of 19.7576.
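An SVM regression prediction with an RBF kernel takes the form of a kernel-weighted sum over support vectors plus a bias. The following is a minimal sketch of that prediction form, with hypothetical support vectors and coefficients rather than the trained model from the study:

```python
import math

def rbf_kernel(x, z, gamma=0.5):
    """Gaussian RBF kernel k(x, z) = exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

def svr_predict(x, support_vectors, dual_coefs, bias, gamma=0.5):
    """SVR prediction: bias plus a kernel-weighted sum over support vectors."""
    return bias + sum(c * rbf_kernel(x, sv, gamma)
                      for c, sv in zip(dual_coefs, support_vectors))
```

In practice the support vectors, dual coefficients and bias come from solving the SVR optimisation problem (e.g. with a library such as scikit-learn); here they are placeholders to show the functional form.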
Abstract:
Energy production from biomass and the conservation of ecologically valuable grassland habitats are two important issues in agriculture today. The combination of a bioenergy production that minimises environmental impacts and competition with food production for land, with a conversion of semi-natural grasslands through new utilisation alternatives for the biomass, led to the development of the IFBB process. Its basic principle is the separation of biomass into a liquid fraction (press fluid, PF) for the production of electric and thermal energy after anaerobic digestion to biogas, and a solid fraction (press cake, PC) for the production of thermal energy through combustion. This study was undertaken to explore mass and energy flows as well as quality aspects of energy carriers within the IFBB process, and to determine their dependency on biomass-related and technical parameters. Two experiments were conducted in which biomass from semi-natural grassland was conserved as silage and subjected to hydrothermal conditioning and a subsequent mechanical dehydration with a screw press. Methane yield of the PF and of the untreated silage was determined in anaerobic digestion experiments in batch fermenters at 37°C, with a fermentation time of 13-15 days for the PF and 27-35 days for the silage. Concentrations of dry matter (DM), ash, crude protein (CP), crude fibre (CF), ether extract (EE), neutral detergent fibre (NDF), acid detergent fibre (ADF), acid detergent lignin (ADL) and elements (K, Mg, Ca, Cl, N, S, P, C, H) were determined in the untreated biomass and the PC. Higher heating value (HHV) and ash softening temperature (AST) were calculated based on elemental concentrations. The chemical composition of the PF and the mass flows of all plant compounds into the PF were calculated.
In the first experiment, biomass from five different semi-natural grassland swards (Arrhenaterion I and II, Caricion fuscae, Filipendulion ulmariae, Polygono-Trisetion) was harvested on one late sampling date (19 July or 31 August) and ensiled. Each silage was subjected to three different temperature treatments (5°C, 60°C, 80°C) during hydrothermal conditioning. Based on observed methane yields and HHV as energy output parameters, as well as literature-based and observed energy input parameters, energy and greenhouse gas (GHG) balances were calculated for IFBB and two reference conversion processes: whole-crop digestion of untreated silage (WCD) and combustion of hay (CH). In the second experiment, biomass from one single semi-natural grassland sward (Arrhenaterion) was harvested at eight consecutive dates (27/04, 02/05, 09/05, 16/05, 24/05, 31/05, 11/06, 21/06) and ensiled. Each silage was subjected to six different treatments (no hydrothermal conditioning, and hydrothermal conditioning at 10°C, 30°C, 50°C, 70°C, 90°C). Energy balance was calculated for IFBB and WCD. Multiple regression models were developed to predict mass flows, concentrations of elements in the PC, concentrations of organic compounds in the PF and energy conversion efficiency of the IFBB process from the temperature of hydrothermal conditioning as well as the NDF and DM concentrations in the silage. Results showed a relative reduction of ash and of all elements detrimental for combustion in the PC, compared to the untreated biomass, of 20-90%. Reduction was highest for K and Cl and lowest for N. HHV of the PC and of the untreated biomass were in a comparable range (17.8-19.5 MJ kg-1 DM), but AST of the PC was higher (1156-1254°C). Methane yields of the PF, which ranged from 332 to 458 LN kg-1 VS, were higher than those of WCD when the biomass was harvested late (end of May and later) and in a comparable range when the biomass was harvested early.
Regarding the energy and GHG balances, IFBB, with a net energy yield of 11.9-14.1 MWh ha-1, a conversion efficiency of 0.43-0.51, and a GHG mitigation of 3.6-4.4 t CO2eq ha-1, performed better than WCD but worse than CH. WCD produces thermal and electric energy with low efficiency; CH produces only thermal energy, from a low-quality solid fuel, with high efficiency; IFBB produces thermal and electric energy, with a solid fuel of high quality, at medium efficiency. The regression models were able to predict the target parameters with high accuracy (R2=0.70-0.99). Increasing the temperature of hydrothermal conditioning increased mass flows, decreased element concentrations in the PC and had a differing effect on energy conversion efficiency. Increasing the NDF concentration of the silage had a differing effect on mass flows, decreased element concentrations in the PC and increased energy conversion efficiency. Increasing the DM concentration of the silage decreased mass flows, increased element concentrations in the PC and increased energy conversion efficiency. Based on the models, an optimised IFBB process would be obtained with a medium temperature of hydrothermal conditioning (50°C), high NDF concentrations in the silage and medium DM concentrations of the silage.
Abstract:
The increasing interconnection of information and communication systems leads to a further increase in complexity and thus also to a further growth in security vulnerabilities. Classical protection mechanisms such as firewall systems and anti-malware solutions have long ceased to offer adequate protection against intrusions into IT infrastructures. Intrusion detection systems (IDS) have established themselves as a very effective instrument for protection against cyber attacks. Such systems collect and analyse information from network components and hosts in order to detect unusual behaviour and security violations automatically. While signature-based approaches can only detect already known attack patterns, anomaly-based IDS are also able to recognise new, previously unknown attacks (zero-day attacks) at an early stage. The core problem of intrusion detection systems, however, lies in the optimal processing of the enormous volume of network data and in the development of an adaptive detection model that works in real time. To address these challenges, this dissertation provides a framework consisting of two main parts. The first part, called OptiFilter, uses a dynamic queuing concept to process the continuously arriving network data, continuously assembles network connections, and exports structured input data for the IDS. The second part is an adaptive classifier comprising a classifier model based on an Enhanced Growing Hierarchical Self-Organizing Map (EGHSOM), a model of the normal network state (NNB) and an update model. Within OptiFilter, tcpdump and SNMP traps are used to continuously aggregate network packets and host events. These aggregated network packets and host events are analysed further and converted into connection vectors.
To improve the detection rate of the adaptive classifier, the artificial neural network GHSOM is investigated intensively and substantially extended. In this dissertation, different approaches are proposed and discussed: a classification-confidence margin threshold is defined to uncover unknown malicious connections; the stability of the growth topology is increased through novel approaches for initialising the weight vectors and through the strengthening of the winner neurons; and a self-adaptive procedure is introduced so that the model can be updated continuously. Furthermore, the main task of the NNB model is the further examination of the unknown connections detected by the EGHSOM, verifying whether they are in fact normal. However, network traffic changes constantly due to the concept-drift phenomenon, which leads to the generation of non-stationary network data in real time. This phenomenon is handled by the update model. The EGHSOM model can detect new anomalies effectively, and the NNB model adapts well to changes in the network data. In the experimental investigations, the framework showed promising results. In the first experiment, the framework was evaluated in offline mode: OptiFilter was assessed with offline, synthetic and realistic data, and the adaptive classifier was evaluated with 10-fold cross-validation to estimate its accuracy. In the second experiment, the framework was installed on a 1 to 10 GB network link and evaluated online in real time. OptiFilter successfully converted the enormous volume of network data into structured connection vectors, and the adaptive classifier classified them precisely.
The comparative study between the developed framework and other well-known IDS approaches shows that the proposed IDS framework outperforms all other approaches. This can be attributed to the following key points: the processing of the collected network data, the achievement of the best performance (e.g. overall accuracy), the detection of unknown connections, and the development of a real-time intrusion detection model.
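The conversion of aggregated packet events into connection vectors can be illustrated with a toy grouping by connection key. This is a simplified stand-in for OptiFilter's queuing concept, with hypothetical field names, not the dissertation's implementation:

```python
from collections import defaultdict

def to_connection_vectors(packets):
    """Aggregate raw packet events into per-connection feature records,
    keyed by a (src, dst, dst_port, proto) tuple.

    Each packet is a dict with hypothetical keys: src, dst, dport,
    proto, len. A real system would track many more features
    (durations, flags, host events) per connection.
    """
    conns = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for pkt in packets:
        key = (pkt["src"], pkt["dst"], pkt["dport"], pkt["proto"])
        conns[key]["packets"] += 1
        conns[key]["bytes"] += pkt["len"]
    return dict(conns)
```

The resulting fixed-shape records are what a classifier such as the EGHSOM would consume.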
Abstract:
Since dwarf napiergrass (Pennisetum purpureum Schumach.) must be propagated vegetatively due to its lack of viable seeds, root splitting and stem cuttings are generally used to obtain true-to-type plant populations. These ordinary methods are laborious and costly, and are the greatest barriers to expanding the cultivation area of this crop. The objectives of this research were to develop nursery production of dwarf napiergrass in cell trays and to compare the efficiency of mechanical versus manual methods for cell-tray propagation and field transplanting. After defoliation of herbage either with a sickle (manually) or with a hand-mowing machine, every potential aerial tiller bud was cut to a single one for transplanting into cell trays as stem cuttings and placed in a glasshouse over winter. The following June, nursery plants were trimmed to a 25-cm length and transplanted in an experimental field (sandy soil) at 20,000 plants ha^(−1), either by shovel (manually) or by a Welsh onion planter. Labour time was recorded for each process. Manual defoliation of plants required 44% more labour time for preparing the stem cuttings (0.73 person-min. stem-cutting^(−1)) compared to using the hand-mowing machinery (0.51 person-min. stem-cutting^(−1)). In contrast, labour time for transplanting required an extra 0.30 person-min. m^(−2) (14%) using the machinery compared to manual transplanting, possibly due to the limited plot size for machinery operation. The transplanting method had no significant effect on plant establishment or plant growth, except for herbage yield 110 days after planting. Defoliation of herbage by machinery, production using a cell-tray nursery and mechanical transplanting reduced the labour intensity of dwarf napiergrass propagation.
Abstract:
Machine translation has been a particularly difficult problem in the area of Natural Language Processing for over two decades. Early approaches to translation failed in part because interaction effects of complex phenomena made translation appear to be unmanageable. Later approaches to the problem have succeeded (although only bilingually), but are based on many language-specific rules of a context-free nature. This report presents an alternative approach to natural language translation that relies on principle-based descriptions of grammar rather than rule-oriented descriptions. The model that has been constructed is based on abstract principles as developed by Chomsky (1981) and several other researchers working within the "Government and Binding" (GB) framework. Thus, the grammar is viewed as a modular system of principles rather than a large set of ad hoc language-specific rules.
Abstract:
The dataflow model of computation exposes and exploits parallelism in programs without requiring programmer annotation; however, instruction-level dataflow is too fine-grained to be efficient on general-purpose processors. A popular solution is to develop a "hybrid" model of computation where regions of dataflow graphs are combined into sequential blocks of code. I have implemented such a system to allow the J-Machine to run Id programs, leaving a large amount of parallelism exposed, such as among loop iterations. I describe this system and provide an analysis of its strengths and weaknesses, and those of the J-Machine, along with ideas for improvement.
Abstract:
In this thesis, I designed and implemented a virtual machine (VM) for a monomorphic variant of Athena, a type-omega denotational proof language (DPL). This machine attempts to maintain the minimum state required to evaluate Athena phrases. This thesis also includes the design and implementation of a compiler for monomorphic Athena that compiles to the VM. Finally, it includes details on my implementation of a read-eval-print loop that glues together the VM core and the compiler to provide a full, user-accessible interface to monomorphic Athena. The Athena VM provides the same basis for DPLs that the SECD machine does for pure, functional programming and the Warren Abstract Machine does for Prolog.
Abstract:
We compare Naive Bayes and Support Vector Machines on the task of multiclass text classification. Using a variety of approaches to combine the underlying binary classifiers, we find that SVMs substantially outperform Naive Bayes. We present full multiclass results on two well-known text data sets, including the lowest error to date on both data sets. We develop a new indicator of binary performance to show that the SVM's lower multiclass error is a result of its improved binary performance. Furthermore, we demonstrate and explore the surprising result that one-vs-all classification performs favorably compared to other approaches even though it has no error-correcting properties.
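The one-vs-all scheme discussed above reduces a multiclass problem to one binary scorer per class and picks the class whose scorer gives the largest margin. A minimal sketch, with hypothetical scorers standing in for the trained binary SVMs:

```python
def one_vs_all_predict(x, binary_scorers):
    """One-vs-all multiclass prediction.

    binary_scorers maps each class label to a real-valued scoring
    function (e.g. a binary SVM's decision function); the predicted
    label is the one whose scorer assigns x the largest score.
    """
    return max(binary_scorers, key=lambda label: binary_scorers[label](x))
```

The abstract's finding is that this simple winner-take-all combination is competitive despite having no error-correcting properties.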
Abstract:
Support Vector Machine Regression (SVMR) is a regression technique recently introduced by V. Vapnik and his collaborators (Vapnik, 1995; Vapnik, Golowich and Smola, 1996). In SVMR the goodness of fit is measured not by the usual quadratic loss function (the mean square error), but by a different loss function called Vapnik's $\epsilon$-insensitive loss function, which is similar to the "robust" loss functions introduced by Huber (Huber, 1981). The quadratic loss function is well justified under the assumption of Gaussian additive noise. However, the noise model underlying the choice of Vapnik's loss function is less clear. In this paper the use of Vapnik's loss function is shown to be equivalent to a model of additive Gaussian noise, where the variance and mean of the Gaussian are random variables. The probability distributions for the variance and mean are stated explicitly. While this work is presented in the framework of SVMR, it can be extended to justify non-quadratic loss functions in any Maximum Likelihood or Maximum A Posteriori approach. It applies not only to Vapnik's loss function, but to a much broader class of loss functions.
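The contrast between the two loss functions can be made concrete. A minimal, illustrative sketch of Vapnik's epsilon-insensitive loss next to the usual quadratic loss:

```python
def eps_insensitive(residual, eps=0.1):
    """Vapnik's epsilon-insensitive loss: zero inside the eps-tube
    around the fit, linear (L1-like, hence robust) outside it."""
    return max(0.0, abs(residual) - eps)

def quadratic(residual):
    """The usual quadratic loss (squared error), for comparison."""
    return residual ** 2
```

Small residuals inside the tube cost nothing under the epsilon-insensitive loss, while the quadratic loss penalises every deviation and grows much faster for outliers.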
Predicting sense of community and participation by applying machine learning to open government data
Abstract:
Community capacity is used to monitor socio-economic development. It is composed of a number of dimensions, which can be measured to understand the possible issues in the implementation of a policy or the outcome of a project targeting a community. Measuring community capacity dimensions is usually expensive and time-consuming, requiring locally organised surveys. Therefore, we investigate a technique to estimate them by applying the Random Forests algorithm to secondary open government data. This research focuses on the prediction of measures for two dimensions: sense of community and participation. The most important variables for this prediction were determined. The variables included in the datasets used to train the predictive models complied with two criteria: nationwide availability, and a sufficiently fine-grained geographic breakdown, i.e. neighbourhood level. The models explained 77% of the sense of community measures and 63% of participation. Due to the low geographic detail of the available outcome measures, further research is required to apply the predictive models at a neighbourhood level. The variables found to be most determinant for prediction were only partially in agreement with the factors that, according to the social science literature consulted, are the most influential for sense of community and participation. This finding should be further investigated from a social science perspective in order to be understood in depth.
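The explained-variance figures above (77% and 63%) correspond to the coefficient of determination R2, which such regression models are typically scored with. A minimal sketch of that metric (illustrative, not the study's pipeline):

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: the fraction of variance in
    y_true explained by the predictions y_pred."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean) ** 2 for t in y_true)               # total sum of squares
    return 1 - ss_res / ss_tot
```

A model that always predicts the mean scores 0; a perfect model scores 1.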
Abstract:
This study presents the prevalence of occupational disease in a group of workers affiliated with an occupational risk insurer (ARL) in Colombia. It compares occupational morbidity between two groups of workers, exposed and not exposed to agricultural work, and, within the group of agricultural workers, across the activities of sugarcane cutting and banana and flower cultivation. A descriptive cross-sectional study was carried out for the period 2011-2012 by reviewing an occupational morbidity database. Univariate and bivariate analyses were performed, and morbidity was compared against sociodemographic data, agricultural versus non-agricultural worker groups, and productive activity within the agricultural sector. A total of 3129 occupational disease diagnoses were reviewed during the study period; 433 diagnoses were in agricultural workers and 2696 belonged to other groups of workers. Musculoskeletal disorders were the most prevalent diagnoses in both the agricultural group (92%) and the non-agricultural group (86%), as well as in the activities of sugarcane cutting and banana and flower cultivation. Between the agricultural and non-agricultural groups, significant differences were found in the following diagnoses: carpal tunnel syndrome, rotator cuff syndrome, other synovitis and tenosynovitis, unspecified lumbago, bilateral sensorineural hearing loss and lateral epicondylitis. Differences were likewise found between sugarcane cutting and banana and flower cultivation for the diagnoses of epicondylitis, synovitis, carpal tunnel syndrome and lumbar disorder. The most prevalent risk factor in the agricultural group was ergonomic, accounting for 92.8% of cases.
Abstract:
A preliminary study for the construction of a machine capable of recognising printed text for use by blind people. The authors establish recognition methods that process a minimum of information while achieving an acceptable recognition level, with the aim of making the resulting device as inexpensive as possible. The system, from the input of the light information to the Braille output, was simulated on a computer. The results obtained were satisfactory: with a small camera for capturing light information, containing approximately 50 photoreceptor elements, more than 90 percent recognition is obtained, independently of the speed at which the camera moves relative to the text and with a data sample of rather mediocre quality. The authors believe that these excellent results can still be improved, and that this preliminary study and its results will now allow them to carry out the final construction of the machine.