GCSP | Fall 2019
Developing Parallel and Efficient Algorithms for Compact Similarity Joins
The similarity join is a fundamental operation in data cleaning and integration. The objective of this research is to study, design, and implement a distributed, scalable Compact Similarity Join operator for big data. The work focuses on designing an efficient algorithm for the operator, implementing it, and evaluating its performance and scalability. The resulting operator is intended to ease the management of big data and help organizations analyze large datasets efficiently.
Hometown: Chandigarh, Punjab, India
Graduation date: Spring 2020