Monday 28 November 2011

The Semantic Web

Expressing meaning:
Most of the Web's content today is designed for humans to read, not for computer programs to manipulate meaningfully. Computers can adeptly parse Web pages for layout and routine processing (here a header, here a link to another page), but in general they have no reliable way to process the semantics. The Semantic Web promises significant new functionality as machines become much better able to process and "understand" the data that they merely display at present.

The Web has developed most rapidly as a medium of documents for people rather than for data and information that can be processed automatically. The Semantic Web aims to make up for this.

The Semantic Web will be as decentralized as possible.

Knowledge Representation:
For the semantic web to function, computers must have access to structured collections of information and sets of inference rules that they can use to conduct automated reasoning.

Traditional knowledge-representation systems typically have been centralized, requiring everyone to share exactly the same definition of common concepts such as "parent" or "vehicle."

The challenge of the Semantic Web, therefore, is to provide a language that expresses both data and rules for reasoning about the data and that allows rules from any existing knowledge-representation system to be exported onto the Web.

Two important technologies for developing the Semantic Web are already in place: eXtensible Markup Language (XML) and the Resource Description Framework (RDF). Meaning is expressed by RDF, which encodes it in sets of triples, each triple being rather like the subject, verb and object of an elementary sentence.

The triples of RDF form webs of information about related things.
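
To make the idea of triples more concrete, here is a minimal sketch in Python. It uses plain tuples rather than real RDF, and the "ex:" identifiers and the `match` helper are names I made up for illustration, so take it as a toy model only.

```python
# Toy triple store: each fact is a (subject, predicate, object) tuple.
# The "ex:" identifiers are hypothetical placeholders, not real URIs.
triples = [
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:bob", "ex:knows", "ex:carol"),
    ("ex:alice", "ex:worksFor", "ex:acme"),
]


def match(store, subject=None, predicate=None, obj=None):
    """Return every triple that agrees with the given parts (None = wildcard)."""
    return [
        (s, p, o)
        for (s, p, o) in store
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


# Everything the store says about alice.
about_alice = match(triples, subject="ex:alice")

# Following links through the web of triples: whom do alice's acquaintances know?
friends_of_friends = [
    o2
    for (_, _, o1) in match(triples, "ex:alice", "ex:knows")
    for (_, _, o2) in match(triples, o1, "ex:knows")
]

print(about_alice)
print(friends_of_friends)
```

Because the object of one triple can be the subject of another, following the links is what turns a flat list of facts into a web of related things.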

Ontologies
Two databases may use different identifiers for what is in fact the same concept, such as zip code. A program that wants to compare or combine information across the two databases has to know that these two terms are being used to mean the same thing.

A solution to this problem is provided by the third basic component of the Semantic Web, collections of information called ontologies. In philosophy, an ontology is a theory about the nature of existence, of what types of things exist; ontology as a discipline studies such theories. Artificial-intelligence and Web researchers have co-opted the term for their own jargon, and for them an ontology is a document or file that formally defines the relations among terms. The most typical kind of ontology for the Web has a taxonomy and a set of inference rules.
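
As a rough sketch of what such an ontology could look like in code, the Python below combines a small taxonomy, an equivalence between two database identifiers, and a single inference rule (subclass links are transitive). The identifiers `db1:zip` and `db2:postcode` and all term names are assumptions of mine, not part of any real ontology.

```python
# Toy ontology: a taxonomy (subclass links), an equivalence between terms,
# and one inference rule. All names here are hypothetical.
SUBCLASS_OF = {
    "zip_code": "postal_code",
    "postal_code": "address_part",
    "city": "address_part",
}

# Two databases use different identifiers for what is the same concept.
SAME_AS = {"db1:zip": "zip_code", "db2:postcode": "zip_code"}


def canonical(term):
    """Map a database-specific identifier to the shared ontology term."""
    return SAME_AS.get(term, term)


def ancestors(term):
    """Inference rule: subclass-of is transitive, so walk up the taxonomy."""
    found = []
    current = canonical(term)
    while current in SUBCLASS_OF:
        current = SUBCLASS_OF[current]
        found.append(current)
    return found


def same_concept(a, b):
    """Two identifiers mean the same thing if they share a canonical term."""
    return canonical(a) == canonical(b)


print(same_concept("db1:zip", "db2:postcode"))
print(ancestors("db2:postcode"))
```

A program combining the two databases would call `same_concept` before comparing columns, and could use `ancestors` to conclude that a zip code is also an address part without anyone stating that fact directly.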

References:
The semantic web. Tim Berners-Lee, et al. http://www.dblab.ntua.gr/~bikakis/SW.pdf
Semantics of Business Vocabulary and Business Rules (SBVR)
Originated on December 11, 2007. This is a new standard for representing business semantics for machines.

The majority of specialists in the field of semantics focus on what symbols denote, rather than what they connote. There are two fundamental tests for whether a machine is 'doing semantics'.

Test 1: Can the machine determine that some instance (of something) does or does not fall into some class of things the machine knows about? For example, if a machine were handed a fruit electronically, could it determine that the fruit was or was not an apple? What the machine 'knows' about the fruit would have to satisfy all the encoded rules for 'apple-ness'. Representing such rules is a primary focus of semantic languages, including in particular those proposed for the semantic web.
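
A minimal sketch of Test 1 in Python, assuming the "rules for apple-ness" can be written as simple checks on a description of the fruit. The three rules and the attribute names (`kind`, `genus`, `has_core`) are invented for illustration; a real semantic language would express them very differently.

```python
# Hypothetical encoded rules for "apple-ness"; a real ontology would be far richer.
APPLE_RULES = [
    lambda f: f.get("kind") == "fruit",
    lambda f: f.get("genus") == "Malus",
    lambda f: f.get("has_core") is True,
]


def is_instance_of(thing, rules):
    """An instance belongs to the class only if it satisfies every rule."""
    return all(rule(thing) for rule in rules)


# What the machine 'knows' about two fruits it was handed electronically.
granny_smith = {"kind": "fruit", "genus": "Malus", "has_core": True}
orange = {"kind": "fruit", "genus": "Citrus", "has_core": False}

print(is_instance_of(granny_smith, APPLE_RULES))
print(is_instance_of(orange, APPLE_RULES))
```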

Test 2: Can the machine determine whether or not two expressions mean the same? For example, suppose humans take customer and client (and perhaps cliente in Spanish) to designate the same concept (think synonyms). If humans specify as much to machines, then the machines will 'know' the meaning denoted by the symbols is the same one, not different.
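
Test 2 can be sketched in the same spirit. Here humans declare which words designate which concept, and the machine only compares concept identifiers. The table `DESIGNATIONS` and the `concept:` identifiers are my own assumptions, not SBVR notation.

```python
# Humans declare which symbols designate which concept; the machine only
# compares concept identifiers. The "concept:" identifiers are made up.
DESIGNATIONS = {
    "customer": "concept:customer",
    "client": "concept:customer",
    "cliente": "concept:customer",  # Spanish
    "supplier": "concept:supplier",
}


def mean_the_same(expr_a, expr_b, designations=DESIGNATIONS):
    """True only when both expressions are known and denote one concept."""
    a = designations.get(expr_a.lower())
    b = designations.get(expr_b.lower())
    return a is not None and a == b


print(mean_the_same("customer", "cliente"))
print(mean_the_same("client", "supplier"))
```

Note that an unknown word is never treated as a synonym of anything: the machine only 'knows' what humans have specified.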

The essence of SBVR then turns out to be a vocabulary: a painfully deep and complete vocabulary covering all the discrete ideas needed to structure the semantics of rule-ish sentences.

What makes SBVR so unique?
SBVR is a vocabulary (or more accurately, a set of inter-related sub-vocabularies) that permits the capture of semantics for the kinds of sentences commonly used to express business rules.

Why SBVR practices?
1. You need to retain business know-how. You need your operational business knowledge to be explicit, rather than tacit. Your company runs the risk of losing key people. You need a pragmatic approach to knowledge retention.

2. Your business doesn't really know what its business rules are. You need a better way to manage and disseminate business rules consistently across various parts of the organization.

3. You need to develop operational decision logic and other kinds of shared, knowledge-rich specifications directly with business people in business terms.

What is the Semantic Web?

Definitions:
1. The Semantic Web provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries. It is a collaborative effort led by W3C with participation from a large number of researchers and industrial partners. It is based on the Resource Description Framework (RDF). See also the separate FAQ for further information [1].

2. Right now HTML+CSS is centered more on structure and presentation. Semantics is about the meaning of the information. In the Semantic Web you use shared ontologies to establish the meaning (semantics) of objects and the meaning of the relations between them. The best-known ontologies are FOAF and Dublin Core.

Typically, semantics would be expressed in a specialized language such as RDF or OWL. RDF can be embedded within XHTML using eRDF or W3C's RDFa.

A less structured alternative to eRDF/RDFa is microformats. [2]

3. This is a more practical definition, which I understand much better. I found it on Stack Overflow as well.

4. Real world example

5. Introduction to the Semantic web

6. Semantic web frameworks

7. Semantic web and Syntactic web resources

8. Commercial applications using Semantic web

9. Semantic web stack

10. Semantic web site

11. The Semantic Web is about two things. It is about common formats for the integration and combination of data drawn from diverse sources, whereas the original Web mainly concentrated on the interchange of documents. It is also about a language for recording how the data relates to real-world objects [3].

12. Semantic Web Standards Wiki

13. Semantic Overflow

14. Semantic Web Case Studies and Use Cases
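
To tie definitions 2 and 11 together, here is a small Python sketch of data from two sources being combined because both use shared vocabularies. The FOAF and Dublin Core namespace URIs are the published ones; the example.org resources, the two "sources" and the `to_ntriples` helper are assumptions made up for this example, and the helper handles only plain literal values.

```python
# Sketch: two sources describe resources with shared vocabularies, so
# their triples can simply be pooled. The example.org URIs are made up.
FOAF = "http://xmlns.com/foaf/0.1/"
DC = "http://purl.org/dc/elements/1.1/"

source_a = {
    ("http://example.org/people/ana", FOAF + "name", "Ana"),
}
source_b = {
    ("http://example.org/posts/1", DC + "title", "The Semantic Web"),
    ("http://example.org/posts/1", DC + "creator", "Ana"),
}

# Combining data drawn from diverse sources is just a set union here.
merged = source_a | source_b


def to_ntriples(triples):
    """Serialize (subject, predicate, literal) triples as N-Triples lines."""
    lines = []
    for s, p, o in sorted(triples):
        literal = o.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'<{s}> <{p}> "{literal}" .')
    return "\n".join(lines)


print(to_ntriples(merged))
```

The union works without any conversion step only because both sources agreed on what `name`, `title` and `creator` mean by pointing at the same vocabulary URIs.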


References:

ONTORULE Project

While reading some material on the development of business rules and their integration with ontologies and the Semantic Web, I found a project that aims to do so. Its site describes the 1st International Workshop on Business Models, Business Rules and Ontologies (BuRO 2010).

Workshop Description
Three views on the business organization:
1. The view of the business analyst using a formal and validated business model
2. The view of the knowledge engineer via ontologies and rules
3. The view of the IT department via an operationalization in applications

These views can be glued with an end-to-end point solution:
1. Conceptualization and where possible acquisition of business models and their transformation into ontologies and rules.
2. Their management and maintenance
3. The transparent operationalization in IT applications

The vision at the heart of the Semantic Web is of high relevance in a business setting as well. The workshop deals with the different issues that arise in a company that wishes to have a transparent and where possible a useful and semi-automatic transfer of knowledge present in business documents expressing, e.g., policies, to an IT operationalization. Moreover, the workshop uses a holistic perspective, raising awareness for the overall picture, instead of focusing on stand-alone issues. E.g., although OWL is well-investigated it is unclear how business knowledge expressed in SBVR can be mapped to it.

Topics of interest 
- The acquisition of ontologies and rules from unstructured text via Natural Language Processing (NLP) techniques.
- The development of a complete, formal and validated business model, taking all possible inputs into account (people and documents, structured and unstructured, some of which as output from an NLP phase), using the Semantics of Business Vocabulary and Business Rules (SBVR).
- Transformation from structured business representations, from SBVR, to RDF/OWL and/or rules.
- The management and maintenance of business models, ontologies and rules, e.g., consistency maintenance and the integration of rules and ontologies (semantics, algorithms).
- Implementations of such management systems.
- Use cases and field reports.
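
To get a feeling for the "transformation from SBVR to RDF/OWL" topic in the list above, here is a deliberately naive Python sketch. It is not the SBVR metamodel and not an official mapping (the workshop text itself says such a mapping is unclear); the `FactType` class, the `to_triples` function and the "ex:" names are all hypothetical.

```python
# Toy mapping from a structured, SBVR-style fact type to RDF/OWL-style
# triples. NOT an official mapping; FactType and "ex:" are hypothetical.
from dataclasses import dataclass


@dataclass(frozen=True)
class FactType:
    subject_term: str  # noun concept, e.g. "customer"
    verb_phrase: str   # e.g. "places"
    object_term: str   # noun concept, e.g. "order"


def to_triples(fact):
    """Noun concepts become classes; the verb becomes an object property."""
    subj = "ex:" + fact.subject_term.title().replace(" ", "")
    obj = "ex:" + fact.object_term.title().replace(" ", "")
    prop = "ex:" + fact.verb_phrase.replace(" ", "_")
    return [
        (subj, "rdf:type", "owl:Class"),
        (obj, "rdf:type", "owl:Class"),
        (prop, "rdf:type", "owl:ObjectProperty"),
        (prop, "rdfs:domain", subj),
        (prop, "rdfs:range", obj),
    ]


# "customer places order" as a structured business statement.
triples = to_triples(FactType("customer", "places", "order"))
for t in triples:
    print(t)
```

Even this toy version shows where the difficulty lies: quantifiers and modality in a real business rule ("each customer must place at least one order") have no place in these five triples.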

Further readings:
1. The Semantic Web by Tim Berners-Lee
2. The Semantic Web. Recompilation of references and definitions
3. Semantics of Business Vocabulary and Business Rules (SBVR)

References
The ONTORULE Project http://ontorule-project.eu/dissemination/events/buro2010

Starting on the Semantic Web and related topics

I am going to be researching ontologies, the Semantic Web and business rules. At the moment I do not feel quite comfortable with these topics, and there are many things I do not understand. However, I am willing to invest my time in learning these important subjects. I am an IT student and I like programming, especially PHP and related frameworks such as CakePHP and, more recently, Yii. The reason I want to study ontologies and the Semantic Web is that I believe these technologies are going to be the next generation of the Internet. This knowledge will contribute enormously to my professional development, so this blog is meant to help me with that task. However, if I find a better way to organise my ideas, I will move away from this environment.

Note: If I can help somebody else along the way, that will be even better.