What Good Is Linked Data?

Note: This is a conceptual overview; for a technical look, see here or here.

To follow up my previous post on raw data, I thought it would be good to discuss linked data. First of all, it must be emphasized that linked data cannot exist without access to raw data. So “raw data now,” then linked data.

This whole notion of linked data is really the idea of making data useful, really useful. At a high level, a good example of linked data is Wikipedia. In particular, take a look at this article about BSD. As I type that sentence, I realize that many do not know that BSD stands for Berkeley Software Distribution. Nor would many guess that BSD is the precursor to many operating systems, among them FreeBSD, NetBSD, Mac OS X, and DragonFly BSD. The point I want to draw from the Wikipedia article is that it holds a plethora of information, and also a plethora of links that one can follow from it. Thus, if one wanted to learn about FreeBSD, the BSD article already links to it. Further, one could read the FreeBSD page and find a nice graphical derivative, PC-BSD. Without links, these connections would be much more difficult to come by.
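To make the idea of following links concrete, here is a minimal sketch in Python. It treats a handful of the pages mentioned above as a tiny link graph and counts how many clicks it takes to get from one page to another. The `LINKS` table and the `reachable` function are illustrative assumptions of mine, not anything Wikipedia provides, and a real article has far more links than this.

```python
from collections import deque

# Toy link graph built only from the pages named in this post.
# This is an assumption for illustration, not real Wikipedia link data.
LINKS = {
    "BSD": ["FreeBSD", "NetBSD", "Mac OS X", "DragonFly BSD"],
    "FreeBSD": ["PC-BSD"],
}


def reachable(links, start):
    """Return each page reachable from `start`, mapped to its distance in clicks."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        # Pages with no recorded outgoing links simply contribute nothing.
        for target in links.get(page, []):
            if target not in distance:
                distance[target] = distance[page] + 1
                queue.append(target)
    return distance


if __name__ == "__main__":
    for page, clicks in reachable(LINKS, "BSD").items():
        print(f"{page}: {clicks} click(s) from BSD")
```

In this toy graph PC-BSD sits two clicks away from BSD, by way of FreeBSD. Remove the links and there is no path at all, which is the whole point.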

On the internet, linking is the way to go. If we zoom in a little, we notice some interesting features of linking. Let’s stick with Wikipedia: its main page boasts 27 different languages, which is impressive. Now return to our BSD article and select another language on the bottom left. You now have the same information in a completely different language. The data on the German page and the data on the English page should conceptually be the same, yet because it is presented in both languages the article is useful to many more people. Multiply that by 27 and it is easy to see why Wikipedia has gained incredible worldwide appreciation. How many languages can you get Encyclopedia Britannica in?

Alright, so those examples deal mainly with information in the form we are used to seeing online: web pages. What happens when we look at the data itself? Tim Berners-Lee uses census data as an example in his TED talk, but I thought it would be more interesting to look at Scriptural data. In the field of Biblical Studies we have a lot of manuscripts. What we don’t have is easy access to those manuscripts or easy ways to compare them. However, that is changing! As more of these manuscripts become available online (see these projects), we gain the ability to link them together. The Manuscript Comparator is a prototype of this linkage. It systematically links the data found in the manuscripts so that they can be compared simply and completely. Sure, someone could get hard copies of each manuscript and compare them by hand. But anyone who has studied ancient languages will surely appreciate the beauty and simplicity of this application. Being able to type in the passage one is studying and then easily view the discrepancies is a huge resource! Not only that, it demonstrates the power of linked data.
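As a rough sketch of what this kind of comparison involves, the Python below links two texts by a shared passage reference and reports where their wording differs. This is not how the Manuscript Comparator is actually implemented; the manuscript names, the passage reference, and the readings are invented placeholders, and `compare_passage` is a hypothetical helper. The word-level diff uses `difflib.SequenceMatcher` from the Python standard library.

```python
from difflib import SequenceMatcher

# Invented placeholder data: manuscript name -> {passage reference -> text}.
# None of this is real manuscript text; it only illustrates the shape of the data.
MANUSCRIPTS = {
    "manuscript_a": {"Example 1:1": "the quick brown fox jumps"},
    "manuscript_b": {"Example 1:1": "the quick red fox jumps"},
}


def compare_passage(manuscripts, reference, base, other):
    """Return word-level differences between two manuscripts for one passage.

    Each difference is a tuple (kind, base_words, other_words), where kind is
    "replace", "delete", or "insert" as reported by SequenceMatcher.
    """
    base_words = manuscripts[base][reference].split()
    other_words = manuscripts[other][reference].split()
    matcher = SequenceMatcher(a=base_words, b=other_words, autojunk=False)
    differences = []
    for kind, i1, i2, j1, j2 in matcher.get_opcodes():
        if kind != "equal":
            differences.append((kind, base_words[i1:i2], other_words[j1:j2]))
    return differences


if __name__ == "__main__":
    for kind, ours, theirs in compare_passage(
        MANUSCRIPTS, "Example 1:1", "manuscript_a", "manuscript_b"
    ):
        print(kind, ours, theirs)
```

The shared passage reference is what does the linking here: once every manuscript is keyed the same way, comparing any two of them is a lookup followed by a diff.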

This is only the beginning for Biblical Studies. If you want to see what the collective mind of Open Scriptures dreams about when we consider linked data, check out the Potential Applications page.

One Comment

Justin

June 11th, 2009 at 4:35 am

You step into uncertain territory when attributing data any implicit meaning or ascribing it any inherent value. You can do fantastic things with it, but please tread carefully when drawing conclusions.
