A Linguistic View of the Semantic Web?
Alternatives to the Semantic Web?
What follows is largely half-baked, stream-of-consciousness thoughtstuff. You've been warned.
Lately, at least in the Ruby community, there's been discussion of a language-based approach to programming. Domain-Specific Languages are all the rage these days. See, for example, Martin Fowler's excellent article Using the Rake Build Language. (And from RubyConf, I've now got some more reading to do on the Forth language that I successfully sidestepped during college.)
The World as Language
Now, all of this has me thinking quite a lot about the world as language. Conversations, the context of a discussion, verbs & nouns as implemented in code, "executable words", etc. It would make Wittgenstein proud! (Actually, it would probably make him berate me until I was reduced to an illogical quivering mass, but I'll go with "proud" just the same.)
Aside: Levels of Complexity
So in the context of the Semantic Web, I'm considering an analog of the physical world where, in considering the human brain for instance, we have something like:
Physics->Chemistry->Biology->Neurophysiology->Cognitive Neuroscience
(Note that this is not meant to be at all a rigorous taxonomy, just an example.)
Each of these fields looks at the physical world at a different level of abstraction, and they are all useful. While a physicist may quip that all of biology is reducible to physics, and while that may be true, it would be a notably bad idea to fire all the biologists. Biologists discuss their world at a level of abstraction that corresponds to the complexity of the physical world at the biological level.
So what makes us think that, as we build a "Semantic Web", we wouldn't need similar levels of abstraction?
"But XML allows markup of any concept, no matter how concrete or abstract. I think we've got it covered."
Well, yes, but that statement is quite similar to the statement that biology is reducible to physics. What are those intermediate concepts (or levels of complexity)?
Another way to say this in terms of contemporary programming is as follows:
We talk about stuff and meta-stuff these days. Programming and metaprogramming. Content and metacontent. But it seems that this is a relatively simplistic view of our programs, content and by extension our world. I think we need some intermediate (meta) levels.
Back to Language
And this brings me back to the current thought. Is a linguistic view of the world useful as a conceptual building block akin to biology in our example? Or is it an optional and orthogonal way of considering the content on the Web?
Well, until the content on the Web looks like pure and granular semantic markup, we're left with language as a communication platform. I can imagine content directly in the form of object structures captured in XML (or its more readable forms like YAML - see some of my other posts) and without sentences:
lent:
  from: Bob
  to: Sally
  thing:
    book:
      title: "A Tale of Two Cities"
      id: ...
Yet even in this form, the markup carries quite a lot of linguistic baggage. 'Lent' is a verb, but it can also be considered an operation. 'from' and 'to' are prepositions, but can be considered attributes of the type or "dynamically typed object" called 'lent'. Similarly, 'book' is the object of the structure, even if it isn't a sentence, but it is also an object in the OO sense.
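To make that double reading a bit more concrete, here's a rough Ruby sketch. Everything in it is my own illustration rather than any standard vocabulary: the `Lent` and `Book` structs, the `lend` method and `to_sentence` are all hypothetical names, and I've left out the `id` field since its value isn't given above. The idea is just that the verb shows up as an operation, the prepositions as keyword arguments (and attributes), and the object as, well, an object.

```ruby
require "yaml"

# Hypothetical types named after the words in the markup.
Book = Struct.new(:title, keyword_init: true)

Lent = Struct.new(:from, :to, :thing, keyword_init: true) do
  # Read the structure back out as an English sentence.
  def to_sentence
    "#{from} lent #{thing.title} to #{to}."
  end
end

# The verb as an operation; the prepositions as keyword arguments.
def lend(thing, from:, to:)
  Lent.new(from: from, to: to, thing: thing)
end

# The same structure as above, minus the elided id.
doc = YAML.safe_load(<<~DOC)
  lent:
    from: Bob
    to: Sally
    thing:
      book:
        title: "A Tale of Two Cities"
DOC

fields = doc.fetch("lent")
book   = Book.new(title: fields.dig("thing", "book", "title"))
record = lend(book, from: fields.fetch("from"), to: fields.fetch("to"))

puts record.to_sentence
```

Note that the call `lend(book, from: ..., to: ...)` already reads like a clipped sentence, and turning the record back into prose takes one line of string interpolation. The grammar never really went away.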
We can't seem to extract the 'linguistic' from the structure, which brings up another possibility: that language is (a) already considered in the notions of the Semantic Web and (b) so deeply embedded that it's there but not explicit.
So for now I'll leave it as an interesting open question where language fits into the Semantic Web. (Mostly because I need to get on with my day.)