2009

Optimum Data Partitioning for Shared Nothing Environment C.Brightwin(05C20),S.Karthiga Renga(05C38),R.Mahesh Ranadeeran(05C50)

This project aims at partitioning a large database with multiple relations and placing the generated fragments on the nodes of a cluster machine for concurrent execution of queries, such that dependencies between the partitions are minimal. It is best suited for data warehouse applications that involve processing huge datasets. The database is fragmented and the partitions are generated using an enhanced grouping algorithm. The generated partitions are then subjected to another phase of the algorithm, which we have proposed, to identify those partitions that are to be placed on a single node of the cluster and those that need not be. After the fragment information is obtained, scripts are run to place the fragments on the nodes of the cluster machine. Query processing then involves only the nodes on which the partitions for that attribute are found, instead of the entire database.
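The placement idea above can be illustrated with a minimal sketch. This is not the enhanced grouping algorithm itself; it assumes simple hash partitioning on one attribute, with the node count and attribute names chosen for illustration, to show why a point query touches only one node.

```python
# Illustrative sketch only: hash-partition rows of a relation across
# cluster nodes so a query on the partitioning attribute touches one node.
# Node count and attribute names are assumptions, not the project's own.

def partition(rows, key, num_nodes):
    """Assign each row to a node by hashing its partitioning attribute."""
    nodes = [[] for _ in range(num_nodes)]
    for row in rows:
        nodes[hash(row[key]) % num_nodes].append(row)
    return nodes

def nodes_for_value(value, num_nodes):
    """A point query on the key involves only one node, not all of them."""
    return [hash(value) % num_nodes]
```

A query such as `id = 7` is routed with `nodes_for_value(7, num_nodes)` and scans a single fragment rather than the whole database.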

Real Time Alert Clustering and Classification to Reduce False Positives in Intrusion Detection, George Mathew (07CS04)

Intrusion Detection Systems (IDS) monitor a secured network for evidence of malicious activities originating either inside or outside. Upon identifying suspicious traffic, the IDS generates and logs an alert. Unfortunately, most of the alerts generated are either false positives, i.e. benign traffic that has been classified as intrusions, or irrelevant, i.e. attacks that are not successful. The abundance of false positive alerts makes it difficult for the security analyst to find successful attacks and take remedial action. This paper describes a two-phase automatic alert classification system to assist the human analyst in identifying the false positives. In the first phase, the alerts collected from one or more sensors are normalized and similar alerts are grouped to form a meta-alert. These meta-alerts are passively verified against an asset database to find irrelevant alerts. In addition, alert generalization is also performed for root cause analysis, thereby reducing false positives with human interaction. In the second phase, the reduced alerts are labeled and passed to an alert classifier which uses machine learning techniques for building the classification rules. This helps the analyst in automatic classification of the alerts. The system is tested in real environments and found to be effective in dramatically reducing the number of alerts as well as false positives, thereby reducing the workload of the human analyst.
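The first-phase fusion step can be sketched as follows. This is a simplified assumption of the grouping criterion (signature, source, destination); the real system normalizes alerts from multiple sensors and applies richer similarity rules.

```python
# Sketch of alert fusion: group alerts sharing signature, source and
# destination into one meta-alert. The field names are illustrative.
from collections import defaultdict

def fuse_alerts(alerts):
    """Fuse similar alerts into meta-alerts carrying a count and time window."""
    groups = defaultdict(list)
    for a in alerts:
        groups[(a["sig"], a["src"], a["dst"])].append(a)
    meta = []
    for (sig, src, dst), members in groups.items():
        times = [m["time"] for m in members]
        meta.append({"sig": sig, "src": src, "dst": dst,
                     "count": len(members),
                     "first": min(times), "last": max(times)})
    return meta
```

A port scan that raises hundreds of near-identical alerts collapses into a single meta-alert, which is what makes the second-phase classification tractable.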

DDoS Attack Detection and Classification using Testbed, Suneel Reddy C(08CS21)

The objective of this project is to present an automatic alert classification system that filters FPs from the output of the IDS; the system is accurate and maintainable. The system contains five phases of filtering the FPs. In the first phase, DDoS attacks are generated in the testbed and the traffic is dumped to a file. In the second phase, the dump file is given as input to the Snort sensor to generate alerts. The third phase consists of alert fusion and generalization, by which the alerts are fused to form meta-alerts and the number of alerts is greatly reduced. In the fourth phase, the reduced alerts are classified as DDoS attacks by eliminating the FPs using a Fuzzy Inference System (FIS) and a threshold-based classifier. The FIS is used to form the classification rules. This system is tested in real environments and found to be effective in dramatically reducing the number of alerts as well as the FPs, thus reducing the workload of the intrusion analyst.
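The threshold-based side of the fourth phase can be sketched as below. The rate threshold and field names are illustrative assumptions; the project's Fuzzy Inference System, which softens this hard cutoff, is not reproduced here.

```python
# Hypothetical threshold-based classifier: a meta-alert whose alert rate
# exceeds a cutoff is flagged as DDoS, otherwise as a probable FP.
# The 50 alerts/second threshold is an illustrative assumption.

def classify_meta_alert(count, window_seconds, rate_threshold=50.0):
    """Classify a meta-alert by its alert rate over the observation window."""
    rate = count / max(window_seconds, 1e-9)
    return "ddos" if rate > rate_threshold else "false-positive"
```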

Dynamic Critical Path Scheduling Of Highly Communicating Task Graphs, R.Dolores Scholastica(08CS08)

Scheduling highly communicating task graphs incurs higher overhead and complexity than scheduling ordinary task graphs. The aim of this project is to introduce a hybrid scheduling methodology for such task graphs, which maps task nodes to processors in a static mapping stage and assigns start times to the nodes in an online scheduling stage. The strategy has three phases: task mapping, selective duplication, and an online scheduling algorithm. In this project, a Gaussian Elimination task graph was generated and statically scheduled on the processors using the Dynamic Critical Path (DCP) scheduling algorithm. Then the generalized concurrency graph was generated for the Gaussian Elimination application. The generated static schedule and concurrency graph will be used for task duplication and online scheduling in the next phase of this project. Future work includes task duplication and online scheduling of highly communicating task graphs, thereby producing an efficient approach under memory and time constraints.
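A core ingredient of critical-path schedulers like DCP is the priority attached to each task. The sketch below computes the bottom level (b-level) of every task, the longest path from that task to an exit node; it is a simplified component, not the DCP algorithm itself, and the graph representation is an assumption.

```python
# Sketch: bottom levels (b-levels) of a task graph, the longest-path
# priority used by critical-path list schedulers. Communication costs on
# edges are omitted here for brevity.

def bottom_levels(tasks, succ, cost):
    """Return {task: length of longest path from task to an exit node}."""
    memo = {}
    def bl(t):
        if t not in memo:
            memo[t] = cost[t] + max((bl(s) for s in succ.get(t, [])), default=0)
        return memo[t]
    for t in tasks:
        bl(t)
    return memo
```

Tasks are then considered in decreasing b-level order, so nodes on the critical path are mapped first.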

Classification Of Tamil Web Documents Using Natural Language Processing And Probabilistic Approach, M.Mangaiyarkarasi(05C51)

This project, titled "Classification of Tamil Web Documents Using Natural Language Processing and Probabilistic Approach", aims at categorising various Tamil web pages and comparing the correctness of classifiers built using a Probabilistic Neural Network and a Support Vector Machine. The domains of study are medicine (maruththuvam), music (isai), education (kalvi), business (porulathaaram), and biotechnology (uyiriyalthozhilnutpam). After being converted into an intermediate form, the web documents are used for classification.
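The intermediate-form step can be sketched with term-frequency vectors. This is a deliberately simple nearest-centroid stand-in, not the PNN or SVM classifiers the project compares; the category names follow the domains listed above, and the example tokens are invented for illustration.

```python
# Sketch: documents as term-frequency vectors, classified by cosine
# similarity to per-category centroids. A stand-in for the real
# PNN/SVM classifiers, shown only to illustrate the intermediate form.
from collections import Counter
import math

def tf_vector(tokens):
    """Intermediate form: a tokenized document as a term-frequency vector."""
    return Counter(tokens)

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def classify(doc, centroids):
    """Assign the document to the category with the most similar centroid."""
    return max(centroids, key=lambda c: cosine(doc, centroids[c]))
```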

Expert Debugging Systems For Interface Drivers, R.Andrews Roberta Mary(05C05)

This project titled "Expert Debugging Systems For Interface Drivers" aims at the development of a debugger for Fibre Channel cards. The debugger developed is capable of giving a report of the possible errors. When a system crashes, it produces a crash dump that stores information about all statistics at the time of the crash. This crash dump is loaded into HP's kernel debugger "KWDB". The dynamically loadable kernel modules for Fibre Channel devices are then loaded. The debugger Perl script is called, which displays all statistics and their values. With a quick look at the values of the statistics, an expert can easily spot the errors that occurred when the system crashed. This project involves the following submodules:

  • Tachlite Analyzer
  • QLogic Fibre Channel Device Analyzer
  • Automator for Analyzers
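The analyzer step described above, scanning dumped statistics for values that betray an error, can be sketched as follows. The statistic names and limits are illustrative assumptions, not HP's actual driver fields.

```python
# Hypothetical sketch of an analyzer pass: scan a parsed statistics dump
# and flag counters that exceed allowed limits, hinting at likely errors.
# Statistic names and limits here are invented for illustration.

def flag_errors(stats, limits):
    """Return the statistics whose values exceed their allowed limits."""
    return {name: value for name, value in stats.items()
            if name in limits and value > limits[name]}
```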

CPBC: A Comprehensive Solution For Grammatical Inference, D.Jena Cathereine Bel(08CS10)

This project addresses the problem of learning rules from language corpora to classify whether a sentence belongs to the language or not, based on its syntactic structure. The project proposes the use of an unsupervised learning technique to learn rules from the corpora based on the covering principle, and is hence called the Covering Principle Based Classifier (CPBC). The rules inferred are in Chomsky Normal Form (CNF). The sentences to be classified are parsed using a dynamic top-down strategy. If an input sentence belongs to the language, then the rules needed to form the sentence are generated. The fitness of a rule is evaluated by counting the number of positive and negative sentences in which the rule appears. Benchmark data sets, namely tom, brown-a, brown-e and the children corpus, are given as input to the proposed system.

Out of the entire corpus, 30% of the sentences are used for training and rule generation, and 70% are used for testing the inferred rules. The performance measures considered are fitness, true positive rate, false negative rate, and number of iterations. The fitness of the grammar is found to be 89.02%, 91.19%, 92.7% and 92.05% for the tom, brown-a, brown-e and children data sets respectively. The true positive rates for all the data sets are found to be higher than those of traditional methods like the Grammar-based Classifier System (GCS) [9] and the genetic algorithmic approach (AKM) proposed in [1]. Our results prove advantageous, as the false negative rates are found to be reduced for large corpora like brown-e and tom.
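Membership testing against CNF rules can be sketched with a standard CYK recognizer. This is a textbook bottom-up formulation rather than the project's dynamic top-down parser, and the grammar encoding is an assumption, but it shows how CNF rules decide whether a sentence belongs to the language.

```python
# Sketch: CYK membership test for a grammar in Chomsky Normal Form.
# lexical maps a terminal to the nonterminals deriving it (A -> w);
# binary maps (B, C) to the nonterminals A with a rule A -> B C.

def cyk_accepts(words, lexical, binary, start="S"):
    """Return True iff the word sequence is derivable from the start symbol."""
    n = len(words)
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, w in enumerate(words):
        table[i][1] = set(lexical.get(w, set()))
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            for split in range(1, span):
                for b in table[i][split]:
                    for c in table[i + split][span - split]:
                        table[i][span] |= set(binary.get((b, c), set()))
    return start in table[0][n]
```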

Contention Awareness In Task Scheduling Using Tabu Search, R.Shanmugapriya(07CS21)

In this project we propose an effective and efficient algorithm called the Migration Scheduling Algorithm (MSA), based on Tabu search and extended from list scheduling. The edges among the tasks are also scheduled, by treating the communication links between the processors as resources. To demonstrate the effectiveness of the proposed algorithm, we compared it with the Dynamic Level Scheduling (DLS) algorithm and a list scheduling algorithm without contention. The proposed algorithm has admissible time complexity and is suitable for regular as well as irregular task graph structures. Experimental results show that our algorithm with Tabu search produces optimal schedules in reasonable time.
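The Tabu search core can be sketched as below. This is a minimal assumption-laden model, ignoring precedence edges and link contention, that moves one task between processors per step while a short tabu list of recent moves prevents immediate cycling; MSA's actual neighborhood and cost model are richer.

```python
# Minimal Tabu search sketch for task-to-processor mapping. Precedence
# constraints and communication contention are deliberately omitted;
# the cost model (independent task weights) is an assumption.
import itertools

def makespan(assign, cost, procs):
    """Finish time of the most loaded processor in this toy model."""
    loads = [0.0] * procs
    for task, p in assign.items():
        loads[p] += cost[task]
    return max(loads)

def tabu_search(cost, procs, iters=50, tabu_len=5):
    """Best-improvement moves with a tabu list of recent (task, proc) moves."""
    assign = {t: i % procs for i, t in enumerate(cost)}
    best = dict(assign)
    tabu = []
    for _ in range(iters):
        candidates = []
        for t, p in itertools.product(cost, range(procs)):
            if assign[t] != p and (t, p) not in tabu:
                trial = dict(assign)
                trial[t] = p
                candidates.append((makespan(trial, cost, procs), t, p, trial))
        if not candidates:
            break
        _, t, p, assign = min(candidates, key=lambda c: c[0])
        tabu.append((t, p))
        if len(tabu) > tabu_len:
            tabu.pop(0)
        if makespan(assign, cost, procs) < makespan(best, cost, procs):
            best = dict(assign)
    return best
```

The tabu list lets the search accept non-improving moves and still escape local optima, which is the property the project relies on for irregular graphs.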

Automatic Speech Recognition Using Isolated Words, Akilandeshwari(08CS02)

Speech recognition is the process of interpreting human speech in a computer; more technically, it means building a system for mapping acoustic signals to a string of words. An isolated-word automatic speech recognition system (IWASR) automatically extracts the spoken word from the input signal and returns the correctly matched word from the corpus to the user. The system matches an input speech signal against a trained set of speech signals stored in the database/codebook and returns the matching result. The main goal of this system is to produce a better quality Distributed Speech Recognition (DSR) system and to improve accuracy and matching standards, which is achieved by competent algorithms such as Mel Frequency Cepstral Coefficient (MFCC) computation for the feature extraction process. Subsequently, Vector Quantization is applied to all feature vectors generated from MFCC. Experiments in this environment show that recognition accuracy can be significantly improved for all types of noise and all SNR conditions.
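The matching stage can be sketched as below. MFCC extraction itself is assumed to have already produced the feature vectors; recognition then picks the word whose Vector Quantization codebook yields the lowest average distortion. The codebooks and vectors in the test are toy values.

```python
# Sketch of VQ-based matching: each trained word has a codebook of
# codewords; an utterance is scored by the average distance of its
# MFCC feature vectors to their nearest codewords. MFCC extraction
# is assumed done elsewhere.
import math

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def vq_distortion(features, codebook):
    """Average distance from each feature vector to its nearest codeword."""
    return sum(min(euclidean(f, c) for c in codebook)
               for f in features) / len(features)

def recognize(features, codebooks):
    """Return the word whose codebook fits the utterance best."""
    return min(codebooks, key=lambda w: vq_distortion(features, codebooks[w]))
```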


Implementing SCSI-3 Persistent Reservation For An Open-Source iSCSI Target, J.Aswin(05C15), R.Nagarajan(05C59)

The project focuses on providing persistent reservation support for the iSCSI Enterprise Target. iSCSI basically runs over the internet: the SCSI commands are passed over existing TCP/IP sockets on to an iSCSI target, where they are processed. This project provides support for the SCSI commands PERSISTENT RESERVE IN (opcode 5Eh) and PERSISTENT RESERVE OUT (opcode 5Fh), with a variety of their subcommands, called service actions, in the iSCSI target. Persistent Reservation is a concept wherein multiple iSCSI initiators accessing an iSCSI target establish a reservation policy on the exposed logical unit (LUN) by means of a reservation key. The PERSISTENT RESERVE IN command is used to read existing registrations and reservations. The PERSISTENT RESERVE OUT command is used to create new registrations and reservations and to update existing ones.
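The command layout can be sketched by building the 10-byte PERSISTENT RESERVE IN CDB from the opcode the abstract names. The opcode and the service-action placement follow the SCSI-3/SPC definition; the specific service-action values shown (READ KEYS, READ RESERVATION) are standard, but this is an illustrative builder, not the project's target-side code.

```python
# Sketch: build a PERSISTENT RESERVE IN (opcode 5Eh) CDB. The service
# action sits in the low 5 bits of byte 1 and the allocation length in
# bytes 7-8, per the SCSI-3/SPC command layout.
import struct

PERSISTENT_RESERVE_IN = 0x5E   # opcode named in the abstract
READ_KEYS = 0x00               # service action: list registered keys
READ_RESERVATION = 0x01        # service action: show current reservation

def pr_in_cdb(service_action, alloc_len):
    """Return the 10-byte PERSISTENT RESERVE IN command descriptor block."""
    cdb = bytearray(10)
    cdb[0] = PERSISTENT_RESERVE_IN
    cdb[1] = service_action & 0x1F
    cdb[7:9] = struct.pack(">H", alloc_len)  # big-endian allocation length
    return bytes(cdb)
```

On the target side, the project's job is the inverse: parse such a CDB, look up the registrations keyed by reservation key, and return them in the data-in buffer.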