Web 3.0, as originally envisioned, is an evolution of the network as a global information-sharing platform. This evolution, known as the "semantic web", would dramatically increase the capacity to use information on the web effectively. This capability is crucial in the big data era, where data is abundant but understanding and processing it are still largely manual. There are notable roadblocks to implementing the semantic web, namely data extraction and information labeling. In this project, we aim to identify the practical weaknesses of the semantic web and how to overcome them. We address several data extraction challenges by introducing contextually rich coordinates derived from HTML web pages and pairing them with a multi-pass, minimal-context evaluation. Finally, we discuss the implications and practicality of handling distributed “truth” systems in the context of the semantic web.
Jason Carpenter is a Ph.D. student in the Department of Computer Science & Engineering, mentored by Dr. Zhi-Li Zhang. His research interests include future Internet, distributed systems, and network infrastructure.