The Web 3.0 Pipeline, Part 1: The Semantic Web

By David M. Freedman
updated 9/5/08

“The semantic web” is the concept most commonly trotted out to speculate about what Web 3.0 might look like. It is only one of several web technologies in development that could define the Web 3.0 experience.

In the early days of Web 1.0, or the “read-only web,” content flowed mostly one way: from websites to users. Then the web became more interactive, but website developers and publishers generally defined the online experience.

Web 2.0 is characterized by (a) user-generated content, (b) widely distributed and massive collaboration, and (c) community networks. In Web 2.0, users play a larger role in defining their individual and collective online experience.

Web 3.0 will be characterized by one or more of the following five technologies, which are still in development—plus some that we haven’t yet imagined.

The semantic web
The media-centric web
The 3D web
The pervasive web
The sensory web

This article will focus on the semantic web. The term was coined by Tim Berners-Lee, who is credited with inventing the World Wide Web. It refers to the ability of computers to read and understand websites and web content almost as humans do, using software that performs logical reasoning operations. The most obvious application will likely be in search engines, which will act like intelligent “agents” to (a) interpret your request in natural language, (b) find the information you need from a variety of sources, and (c) assemble that information into a kind of report, instead of merely returning a list of links to each of those sources.

Context, relevancy, and patterns
Consider the difference between making a phone call and mailing a letter. When you dial a phone number, intending to call your relative Joe in Kokomo, if you misdial one digit the call will fail, or you will reach the wrong person. Neither the telephone nor the phone company’s equipment can guess who you really intended to call and reroute you to Joe in Kokomo—there is no human intervention involved, and the machines aren’t intelligent enough to figure it out.

On the other hand, when you snail-mail a letter and write the wrong street number on the envelope, if it is within a short distance of your intended destination, the mail carrier can probably recognize that it’s a mistake, figure out the correct street number, and deliver it to Joe. The human mail carrier, who is familiar with the neighborhood and the relationships between residents’ names and their street numbers (data patterns), deduces that Joe does not live at the address on the envelope, Joe does in fact live a block away, and he can easily make the correction and deliver the letter to Joe's house.

Now go back to the telephone example. If the phone or the phone company’s switches had semantic-type software, that software could review the massive amounts of data available on the web, including the phone company’s own records. It could derive your name from the phone number where the call originates; it would know the geographic location of the phone number you are calling, which is Kokomo; it could discover from a database on the web that you have a relative in Kokomo named Joe; and it would know that the number you dialed is not Joe’s and that, in fact, you have never dialed this number before, whereas you have dialed Joe’s number twice in the past six months. Just as Google can often correct typos in your search query and offer alternatives, the phone could offer you the alternative of connecting to Joe or continuing with the call as you dialed it.

This isn’t quite artificial intelligence, although it approaches it. Semantic software, whether in a phone switch or a search engine, can employ something like reasoning because it quickly perceives context, relevancy, relationships, and patterns among vast stores of data.
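The telephone scenario can be sketched in a few lines of code. This is only an illustration under stated assumptions: the Contact record, the function names, the phone numbers, and the "one digit off" rule are all invented here, and a real semantic system would draw on far richer data than a hard-coded list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contact:
    """A hypothetical record linking a person to a number and a city."""
    name: str
    number: str
    city: str
    relation: str


def digits_differing(a: str, b: str) -> int:
    """Count the positions at which two numbers differ."""
    if len(a) != len(b):
        return max(len(a), len(b))
    return sum(1 for x, y in zip(a, b) if x != y)


def suggest_alternative(dialed, dialed_city, contacts, call_history):
    """Return the contact the caller probably meant, or None."""
    if dialed in call_history:
        return None  # the caller has used this number before, so trust it
    candidates = [
        c for c in contacts
        if c.city == dialed_city                      # same destination city
        and c.number in call_history                  # a number dialed before
        and digits_differing(dialed, c.number) == 1   # exactly one digit off
    ]
    # Only make a suggestion when the evidence points to a single person.
    return candidates[0] if len(candidates) == 1 else None


# Hypothetical data, invented for illustration only.
contacts = [Contact("Joe", "765-555-0142", "Kokomo", "relative")]
history = ["765-555-0142", "312-555-0199", "765-555-0142"]

match = suggest_alternative("765-555-0143", "Kokomo", contacts, history)
if match:
    print(f"Did you mean {match.name} ({match.number})?")
```

The sketch only offers an alternative; as in the scenario above, the caller would still choose between connecting to Joe and continuing with the number as dialed.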

Examples: ads, health care, and vacations
Searches in the semantic web could be much more complex and use more sophisticated logical reasoning operations (perhaps involving probabilistic approaches), with access to cosmically grander stores of content and data (all of which will be “folksonomically” organized with both publisher-generated and user-generated metadata, tags, reviews, ratings, comments, sharing, diggs, bookmarks, and other Web 2.0 social media tools).

The semantic web will change other Internet applications besides search, of course. Advertising models, for example, will shift from quantity (in terms of reach, frequency, page views, clicks, etc.) to quality (relevancy, context, user consumption patterns, etc.).

Here is another example, in the health care field. In fact, this hypothetical example was used by Berners-Lee in a 2001 article that he coauthored in Scientific American. Lucy’s mother is advised by her doctor to consult with a specialist. Lucy enters a set of commands on her handheld web browser. The browser retrieves information about her mom’s illness and the prescribed treatment (from her doctor’s records), looks up several specialists within a 20-mile radius who are members of her insurance plan and who have excellent trust ratings, checks potential appointment times against her mom’s schedule, and makes an appointment with one of the specialists, rescheduling other appointments around it if necessary. All of this happens in one seamless operation.
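To make the steps in Lucy’s scenario concrete, here is a minimal sketch of the filtering and scheduling logic. Everything in it is an assumption made for illustration: the Specialist record, the 0-to-5 trust scale, the plan names, and the sample doctors are invented, and the rescheduling step is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Specialist:
    """A hypothetical listing an agent might assemble from several sources."""
    name: str
    miles_away: float
    plans: frozenset      # insurance plans the specialist accepts
    trust: float          # rating on an assumed 0.0 to 5.0 scale
    open_slots: tuple     # available appointment times


def book_specialist(specialists, plan, free_slots, max_miles=20.0, min_trust=4.5):
    """Return (name, slot) for the best eligible specialist, or None."""
    eligible = [
        s for s in specialists
        if s.miles_away <= max_miles      # within the search radius
        and plan in s.plans               # member of the insurance plan
        and s.trust >= min_trust          # excellent trust rating
    ]
    # Try the highest-rated specialists first; break ties by distance.
    for s in sorted(eligible, key=lambda s: (-s.trust, s.miles_away)):
        for slot in s.open_slots:
            if slot in free_slots:        # fits the patient's schedule
                return s.name, slot
    return None


# Hypothetical data, invented for illustration only.
specialists = [
    Specialist("Dr. Adams", 12.0, frozenset({"PlanA"}), 4.8, ("Tue 10:00", "Thu 14:00")),
    Specialist("Dr. Baker", 35.0, frozenset({"PlanA"}), 4.9, ("Mon 09:00",)),
    Specialist("Dr. Chen", 8.0, frozenset({"PlanB"}), 4.7, ("Tue 10:00",)),
    Specialist("Dr. Diaz", 5.0, frozenset({"PlanA"}), 4.6, ("Fri 11:00",)),
]
moms_free_slots = {"Thu 14:00", "Fri 11:00"}

appointment = book_specialist(specialists, "PlanA", moms_free_slots)
print(appointment)
```

The hard part in practice is not this filter but obtaining the inputs: each field would have to be published by a different party in a form that software can read and trust.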

Another example commonly cited is planning a vacation: booking flights, making reservations, purchasing tickets, planning itineraries, recommending restaurants, and mapping out transportation routes, etc., and delivering the final results in a neatly organized report with contact information and confirmation numbers—all charged to a credit card, of course.

On the light side, an example you’ll find in the literature about Web 3.0—there must be a reference to Paris Hilton in any definitive body of knowledge—is that a search engine will be able to distinguish between the ubiquitous blond celebrity and the hotel in France, given the context.
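A toy version of that distinction is sketched below. It simply counts cue words in the query, which is far cruder than what semantic web tools aim for; the cue lists, the sense labels, and the function name are all assumptions invented for this example.

```python
# Hypothetical cue words for each sense of "Paris Hilton".
SENSES = {
    "celebrity": {"heiress", "celebrity", "actress", "paparazzi", "perfume", "reality"},
    "hotel": {"hotel", "rooms", "booking", "rates", "reservation", "france", "stay"},
}


def disambiguate(query: str) -> str:
    """Pick the sense whose cue words overlap most with the query."""
    words = set(query.lower().split())
    scores = {sense: len(words & cues) for sense, cues in SENSES.items()}
    best = max(scores, key=scores.get)
    top = scores[best]
    # With no evidence, or a tie between senses, decline to guess.
    if top == 0 or list(scores.values()).count(top) > 1:
        return "ambiguous"
    return best


print(disambiguate("Paris Hilton room rates in France"))
print(disambiguate("Paris Hilton reality show"))
print(disambiguate("Paris Hilton"))
```

The bare query stays ambiguous, which is the point: without context, there is nothing to reason from.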

Reannotation and RDF
To make the semantic web possible, a heck of a lot of data will have to be “reannotated” with metadata, reviews, tags, diggs, and all those social sharing tools, but also with new tools like Resource Description Framework (RDF, sometimes described as XML on steroids*) and Web Ontology Language (OWL), which are being developed now at universities, research and consulting firms, big corporations, and government agencies (especially defense). The MIT-based World Wide Web Consortium (W3C) created a Semantic Web Initiative, which in 2004 set “core standards” for this array of new technologies.
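RDF expresses each fact as a three-part statement: a subject, a predicate, and an object. The sketch below imitates that idea with plain Python tuples rather than a real RDF library. The example.org namespace, the property names, and the match helper are hypothetical and chosen only for illustration; actual RDF data would use published vocabularies.

```python
EX = "http://example.org/"  # hypothetical namespace, not a real vocabulary

# Each statement is a (subject, predicate, object) triple.
triples = {
    (EX + "ParisHilton", EX + "type", EX + "Person"),
    (EX + "ParisHilton", EX + "occupation", "celebrity"),
    (EX + "HiltonParis", EX + "type", EX + "Hotel"),
    (EX + "HiltonParis", EX + "locatedIn", EX + "France"),
}


def match(store, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Which things are hotels?
hotels = [t[0] for t in match(triples, p=EX + "type", o=EX + "Hotel")]
print(hotels)

# What is known about the person?
for _, predicate, obj in match(triples, s=EX + "ParisHilton"):
    print(predicate, "->", obj)
```

Because the person and the hotel have different identifiers, software can tell them apart even though their names share the same two words.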

coming next
The Web 3.0 Pipeline, Part 2: The Media-centric Web



* Dirk-Willem van Gulik, chief technology officer, Joost.





CONTEXT: WEB 2.0 BASICS
Web 2.0 is more collaborative and “user-generated” than the previous version of the World Wide Web. In the early days, content flowed mainly one way: from websites to users. Web 2.0, also called the social Web, enables non-tech users to create content and form communities of content creators. Content now flows every which way and back again—it’s a conversation.

What’s fueled this groundswell shift from traditional to social media is new technology that makes the web more collaborative and participatory. This new tech includes social networking websites, wikis, blogs, and tools that let users publish, review, rate, rank, tag, and share content, all without web programming skills or HTML knowledge.


ABOUT THE AUTHOR
David M. Freedman has worked as a financial, legal, and technology journalist since 1978. He has been a Chicago-based website content developer and media relations consultant since 1999.

© 2008 David M. Freedman, Chicago/Northbrook, IL