Semantic Association Processing in a Distributed Environment
Matt Perry, Maciej Janik, Conrad Ibanez, and Cartic Ramakrishnan

Project Description

This project will investigate ways to evaluate semantic association queries over distributed data sets. The distributed environment may or may not follow a peer-to-peer architecture. The data sets will most likely overlap, and the start and end entities of a query may lie in a single data set or in separate data sets. The fundamental problem to solve is how to efficiently find paths that traverse more than one data set.
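To make the cross-data-set problem concrete, the following is a minimal sketch of a naive baseline, not the method this project will develop. It assumes each data set is an in-memory adjacency map from an entity to its outgoing (property, target) edges, and it only considers directed paths of bounded length. The names DataSet, neighbors, and find_paths are hypothetical and do not come from any existing API.

```python
from typing import Dict, Iterator, List, Set, Tuple

# Assumed representation: entity -> list of outgoing (property, target) edges.
DataSet = Dict[str, List[Tuple[str, str]]]
# A path alternates entities and properties: (e0, p1, e1, p2, e2, ...).
Path = Tuple[str, ...]


def neighbors(datasets: List[DataSet], entity: str) -> Iterator[Tuple[str, str]]:
    """Yield the outgoing edges of `entity` from every data set.

    Overlapping data sets may hold the same edge, so duplicates are dropped.
    """
    seen: Set[Tuple[str, str]] = set()
    for dataset in datasets:
        for edge in dataset.get(entity, []):
            if edge not in seen:
                seen.add(edge)
                yield edge


def find_paths(datasets: List[DataSet], start: str, end: str, max_len: int) -> List[Path]:
    """Enumerate simple directed paths from start to end with at most max_len edges."""
    results: List[Path] = []

    def dfs(node: str, path: List[str], visited: Set[str]) -> None:
        if node == end and len(path) > 1:
            results.append(tuple(path))
            return
        if (len(path) - 1) // 2 >= max_len:  # edge count of the current path
            return
        for prop, target in neighbors(datasets, node):
            if target in visited:
                continue
            visited.add(target)
            path.extend([prop, target])
            dfs(target, path, visited)
            del path[-2:]
            visited.remove(target)

    dfs(start, [start], {start})
    return results


# Two overlapping data sets: both contain the (alice, worksFor, acme) edge.
ds_a: DataSet = {"alice": [("worksFor", "acme")]}
ds_b: DataSet = {
    "alice": [("worksFor", "acme")],
    "acme": [("locatedIn", "athens")],
}

print(find_paths([ds_a, ds_b], "alice", "athens", 3))
# prints [('alice', 'worksFor', 'acme', 'locatedIn', 'athens')]
```

In this sketch every expansion step consults every data set, which in a real distributed setting would mean one remote request per data set per visited entity. Reducing that cost is the kind of inefficiency the project aims to address.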

As a deliverable, we will develop an API that executes semantic association queries over a distributed data set. The API will most likely be built on top of the new Semdis data store developed by Maciej Janik. Data sets can be synthetically generated, or we could possibly use the SWETO data set. For evaluation, we can compare statistics such as completeness across three configurations: query execution on a centralized data set, a naive implementation on a distributed data set, and our implementation on a distributed data set.
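One possible way to quantify completeness is sketched below, under the assumption that completeness means the fraction of paths found on the centralized data set that the distributed implementation also returns. The proposal does not fix this definition, and the function name completeness and the tuple encoding of paths are hypothetical.

```python
from typing import Iterable, Tuple

# Assumed encoding: a path is a tuple alternating entities and properties.
Path = Tuple[str, ...]


def completeness(reference_paths: Iterable[Path], candidate_paths: Iterable[Path]) -> float:
    """Return the fraction of reference paths that also appear among the candidates.

    The reference is taken to be the result on the centralized data set.
    An empty reference is treated as fully covered.
    """
    reference = set(reference_paths)
    if not reference:
        return 1.0
    return len(reference & set(candidate_paths)) / len(reference)


# Illustrative values only, not measured results.
centralized = [("a", "p", "b"), ("a", "q", "c")]
distributed = [("a", "p", "b")]
print(completeness(centralized, distributed))  # prints 0.5
```

The same function could be applied to the naive and the optimized distributed implementations so that both are scored against the same centralized baseline.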