Implementation of a Mini Search Engine

Published on Nov 30, 2023

Introduction

In this project, we will design and implement a mini search engine that searches through a collection of documents. The data structures used are files for storing, hash tables for indexing, and trees for searching the documents.

The documents will be stored in files. Given a set of texts and a query, the search engine will locate all the documents that contain the keywords in that query. The purpose of this project is to provide an overview of how a search engine works and to give hands-on experience with hash tables, files and trees.

Indexing

The documents stored as files will be indexed by their words (tokens) using hash functions. Hashing a token gives direct access to its entry, so the documents containing a given word can be retrieved without scanning every file.
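The indexing step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the names `tokenize` and `build_index` and the sample documents are assumptions, and the documents are held in an in-memory dict here rather than read from files. Python's built-in `dict` is a hash table, so it stands in for a hand-written one.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each token to the set of document names that contain it.

    `documents` is a dict of {name: text}. In the project the text
    would be read from files on disk (assumption for this sketch).
    """
    index = defaultdict(set)  # dict is hash-based: token -> document names
    for name, text in documents.items():
        for token in tokenize(text):
            index[token].add(name)
    return index


# Hypothetical sample documents, used only for illustration.
docs = {
    "doc1.txt": "Hash tables make lookups fast.",
    "doc2.txt": "Trees keep keys in sorted order.",
    "doc3.txt": "Hash tables and trees are both used here.",
}
index = build_index(docs)
print(sorted(index["hash"]))   # ['doc1.txt', 'doc3.txt']
print(sorted(index["trees"]))  # ['doc2.txt', 'doc3.txt']
```

Storing a set of document names per token means a word that appears several times in the same document is recorded only once for that document.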

Searching

Searching will be done using trees; depending on the efficiency and complexity of the algorithm, we will use AVL trees or another kind of balanced binary search tree. To allow efficient searching, a list of the documents in which it occurs will be stored for every word. Queries may contain the simple Boolean operators AND and OR, which behave like the corresponding logical operators. For each such query, the documents that satisfy it will be displayed.

For instance, a query:

Keyword1 AND Keyword2 -- retrieves all documents that contain both keywords.

Keyword1 OR Keyword2 -- retrieves all documents that contain at least one of the two keywords.
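One way the tree lookup and the two operators could fit together is sketched below. It rests on several assumptions: the tree is a plain, unbalanced binary search tree (an AVL tree would add rebalancing on insert but be searched the same way), queries contain at most one operator, and the names `Node`, `insert`, `lookup`, `search` and the sample documents are all hypothetical.

```python
class Node:
    """BST node keyed by a word, holding the documents that contain it."""

    def __init__(self, word, doc):
        self.word = word
        self.docs = {doc}
        self.left = None
        self.right = None


def insert(root, word, doc):
    """Record that `doc` contains `word`; return the subtree root.

    No rebalancing is done here, unlike in an AVL tree.
    """
    if root is None:
        return Node(word, doc)
    if word < root.word:
        root.left = insert(root.left, word, doc)
    elif word > root.word:
        root.right = insert(root.right, word, doc)
    else:
        root.docs.add(doc)
    return root


def lookup(root, word):
    """Return the set of documents containing `word` (empty if absent)."""
    while root is not None and root.word != word:
        root = root.left if word < root.word else root.right
    return set(root.docs) if root is not None else set()


def search(root, query):
    """Evaluate 'word', 'a AND b' or 'a OR b' against the tree."""
    parts = query.split()
    if len(parts) == 1:
        return lookup(root, parts[0].lower())
    if len(parts) != 3 or parts[1] not in ("AND", "OR"):
        raise ValueError("unsupported query: " + query)
    left = lookup(root, parts[0].lower())
    right = lookup(root, parts[2].lower())
    # AND is set intersection, OR is set union.
    return left & right if parts[1] == "AND" else left | right


# Hypothetical sample documents, used only for illustration.
docs = {
    "doc1.txt": "search engine design",
    "doc2.txt": "hash table design",
    "doc3.txt": "search with a hash table",
}
root = None
for name, text in docs.items():
    for word in text.lower().split():
        root = insert(root, word, name)

print(sorted(search(root, "search AND hash")))  # ['doc3.txt']
print(sorted(search(root, "engine OR table")))  # ['doc1.txt', 'doc2.txt', 'doc3.txt']
```

Representing each word's document list as a set reduces AND to an intersection and OR to a union, which mirrors the logical operators directly.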
