Saturday, September 29, 2007

Microformats: One Step towards Semantic Web (Part 1)

So what really is the Semantic Web? It is an extension of the World Wide Web in which content is expressed in a format that can be read and used by software, not only by humans, to facilitate the searching, sharing and integration of information. In simple words, it is a way to standardize how data and information are represented on the World Wide Web, so that ordinary software can understand their semantics without the need for complicated, intelligent programs. Several technologies pave the road to the Semantic Web, such as the Resource Description Framework (RDF) and the Web Ontology Language (OWL).

The Semantic Web is one of the ideas that Tim Berners-Lee, the father of the World Wide Web, has been dreaming about since he launched the first website in 1991. His vision was that this giant network should not be limited to human-to-human communication; machines should have a role too. His exact words were "a single Web of meaning, about everything and for everyone." He made his ideas freely available, with no patent, so that anyone could adopt them easily.

During the keynote at Microsoft MIX06, Bill Gates said, “We need microformats and to get people to agree on them. It is going to bootstrap exchanging data on the Web . . . we need them for things like contact cards, events, directions.”



What are Microformats?

Microformats are, according to the definition of microformats.org,

“Designed for humans first and machines second, microformats are a set of simple, open data formats built upon existing and widely adopted standards. Instead of throwing away what works today, microformats intend to solve simpler problems first by adapting to current behaviors and usage patterns (e.g., XHTML, blogging).”

In other words, it is a web-based data formatting methodology that reuses existing content as metadata, using only XHTML and HTML classes and attributes. Humans can still read the page as they always have, and software agents can process it easily to extract the information they want.

Microformats might catch on with the web development community faster than RDF or OWL, because they build on the existing skills of web developers. They don’t require developers to learn whole new technologies, throw away their existing code bases, or wait years for browser developers to catch up and actually implement support for the technology. Microformats simply use features of HTML that have been around for years and are familiar to most web developers.

Current microformats allow the encoding and extraction of events, contact information, social relationships, and so on. More are being developed.

The XHTML and HTML standards allow semantics to be embedded in the attributes of markup tags. Microformats indicate the presence of metadata using the following attributes: class, rel and rev.

Let us look at an example to understand how microformats work. The following markup uses a microformat to describe contact information:

<div class="vcard">
  <div class="fn">Fady Younan</div>
  <div class="org">ITWorx</div>
  <div class="tel">000-555-1234</div>
  <a class="url" href="http://www.fady-younan.com">http://www.fady-younan.com</a>
</div>


Here is another example, showing how to describe geographic information:

<span class="geo">
  <span class="latitude">28.47</span>,
  <span class="longitude">19.89</span>
</span>


As you may have noticed, microformats use ordinary XHTML and CSS class names to embed semantics in the web page: the class "geo" marks the element as geographic information, "latitude" and "longitude" mark the coordinates, and so on.
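To make the idea of "software agents can extract the information" more concrete, here is a minimal sketch in Python of how a program might read the two examples above. It uses only the standard library's html.parser. The names MicroformatParser, extract_properties and KNOWN_PROPERTIES are made up for this illustration, and the property list is only a small subset chosen to match the examples, not the full hCard or geo specification.

```python
from html.parser import HTMLParser

# Illustrative subset of property class names (an assumption for this sketch).
KNOWN_PROPERTIES = {"fn", "org", "tel", "url", "latitude", "longitude"}


class MicroformatParser(HTMLParser):
    """Collects the text of elements whose class names a known property.

    Assumes well-formed markup in which every start tag has an end tag.
    """

    def __init__(self):
        super().__init__()
        self.properties = {}  # property name -> text content
        self._open = []       # one entry per open element: property name or None

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        match = next((c for c in classes if c in KNOWN_PROPERTIES), None)
        self._open.append(match)
        if match:
            self.properties[match] = ""

    def handle_endtag(self, tag):
        if self._open:
            self._open.pop()

    def handle_data(self, data):
        # Text is credited to the innermost open element, if it is a property.
        if self._open and self._open[-1]:
            self.properties[self._open[-1]] += data.strip()


def extract_properties(html):
    """Return a dict of microformat properties found in the given markup."""
    parser = MicroformatParser()
    parser.feed(html)
    parser.close()
    return parser.properties


HCARD = """
<div class="vcard">
<div class="fn">Fady Younan</div>
<div class="org">ITWorx</div>
<div class="tel">000-555-1234</div>
<a class="url" href="http://www.fady-younan.com">http://www.fady-younan.com</a>
</div>
"""

GEO = """
<span class="geo">
<span class="latitude">28.47</span>,
<span class="longitude">19.89</span>
</span>
"""

card = extract_properties(HCARD)
position = extract_properties(GEO)
print(card["fn"], card["org"], position["latitude"], position["longitude"])
```

The point of the sketch is that the parser needs no intelligence at all: it only looks at class names that the page author already agreed to use.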

Here is a list of some of the available microformats; others are under development:
  • hCard: describes contact information for people or organizations.
  • hCalendar: describes events such as conferences, meetings and parties.
  • XFN: describes the relationships between people.
  • hReview: describes reviews of movies, books, etc.
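XFN differs from the others in this list because it puts its values in the rel attribute of a link rather than in class. As a rough sketch, the Python below collects links that carry a rel value. The class name RelLinkParser, the function find_relationships and the example.com URLs are made up for this illustration; "friend" and "met" are XFN relationship values.

```python
from html.parser import HTMLParser


class RelLinkParser(HTMLParser):
    """Collects (href, rel values) for every <a> element that has a rel attribute."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").split()
        if rel_values and attributes.get("href"):
            self.links.append((attributes["href"], rel_values))


def find_relationships(html):
    """Return a list of (href, [rel values]) pairs found in the markup."""
    parser = RelLinkParser()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical page fragment: one XFN link and one plain link.
SAMPLE = (
    '<a href="http://example.com/jane" rel="friend met">Jane</a> '
    '<a href="http://example.com/about">About</a>'
)

relationships = find_relationships(SAMPLE)
print(relationships)
```

Only the first link is reported, since the second one says nothing about a relationship.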
To be continued ...

Saturday, September 1, 2007

Erlang: Web Development (Part 4)

Erlang was developed mainly for telecom applications. It had to satisfy the requirements of those applications: distribution, massive concurrency, real-time behavior and high availability. These requirements are similar to those placed on Internet-based applications, which makes Erlang a good candidate for developing web-based services. So is Erlang ready to be an enterprise web development framework?

Erlang's first attempt to win over the web development community came in 1997, when Ericsson developed its first web server, called INETS. It is still part of the Open Telecom Platform (OTP). At the time it was written in about 10,000 lines of code (don’t forget it is a functional language) and achieved 80% of the functionality of the Apache server, which then consisted of about 100,000 lines of code.


Yaws (Yet Another Web Server) is another high-performance HTTP web server, particularly well suited for dynamic-content web applications. It is written entirely in Erlang and was released as open source in 2002. It is a multithreaded web server in which a lightweight process handles each client request. The performance of Yaws comes from the underlying Erlang system and its ability to handle concurrent processes efficiently. In terms of performance, it was a revolutionary web server.
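The "one lightweight process per request" model can be illustrated with a small Python sketch. This is only an analogy, not how Yaws is implemented: asyncio tasks run cooperatively on a single thread, while Erlang processes are isolated and scheduled by the Erlang runtime. The names handle_request and serve, and the simulated delay, are made up for this illustration.

```python
import asyncio


async def handle_request(request_id):
    # Stand-in for real work such as reading a file or querying a database.
    await asyncio.sleep(0.01)
    return f"response {request_id}"


async def serve(request_count):
    # Start one lightweight task per request; all of them run concurrently.
    tasks = [asyncio.create_task(handle_request(i)) for i in range(request_count)]
    # gather returns the results in the same order as the tasks were given.
    return await asyncio.gather(*tasks)


responses = asyncio.run(serve(1000))
print(len(responses), responses[0], responses[-1])
```

Because each handler is so cheap to create, starting a thousand of them at once is unremarkable, which is the property the paragraph above attributes to Erlang processes.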

Let us not forget the famous comparison between Yaws and Apache. In a denial of service attack, the number of parallel connections needed to crash the Erlang web server was about 20 times as many as for an Apache web server running on the same hardware. In the comparison graph, Apache (blue and green) dies when subjected to a load of about 4,000 parallel requests, while Yaws (red) keeps working up to 80,000 concurrent requests.


I will try to give you a simple example of how to create a dynamic web page with Yaws.

This is a simple Hello World for Yaws.

<html>
<h1> Title</h1>
<erl>
out(Arg) -> {html, "Hello World"}.
</erl>
<h1> Something</h1>
</html>

Thanks to Yariv Sadan’s efforts, Erlang is getting involved in the web development community through ErlyWeb, a web framework that helps you easily build database-driven applications using the MVC architecture. It's similar to Ruby on Rails, except that it's written in Erlang. There is also ErlyDB, a database abstraction layer generator. ErlyDB taps into Erlang’s runtime metaprogramming powers to generate an abstraction layer for your database on the fly.


I am going to propose an architecture for web development using Erlang. It consists of five layers:

  • Hardware/Operating System: This layer represents the physical layer of the system. It can be a Linux system running on a network file system (NFS) to make use of the concurrency and distribution features in Erlang.
  • DBMS: This layer represents the persistence layer of the system. It can be either Mnesia or MySQL.
  • Database Abstraction Layer.
  • Erlang/Open Telecom Platform: The standard libraries of the Erlang language.
  • Presentation Layer: This layer consists of two components: Yaws as the web server and ErlyWeb as the rapid development framework.
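The proposed stack can be written down as a small data structure, which makes it easy to see what each layer sits on. This is only a sketch in Python: the names Layer, STACK and layers_below are made up for this illustration, the list is assumed to run from bottom to top, and placing ErlyDB in the database abstraction layer is my assumption based on the description of ErlyDB earlier in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str
    components: tuple


# Assumed order: bottom of the stack first, top of the stack last.
STACK = (
    Layer("Hardware/Operating System", ("Linux", "NFS")),
    Layer("DBMS", ("Mnesia", "MySQL")),
    Layer("Database Abstraction Layer", ("ErlyDB",)),  # assumption, see above
    Layer("Erlang/Open Telecom Platform", ("standard libraries",)),
    Layer("Presentation Layer", ("Yaws", "ErlyWeb")),
)


def layers_below(name):
    """Return the names of the layers that the given layer sits on."""
    names = [layer.name for layer in STACK]
    return names[: names.index(name)]


print(layers_below("Presentation Layer"))
```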

Now, I can see that Erlang is starting to involve itself in the web development community. The problem is that the Erlang community isn't big enough to support this involvement. Is this going to change?



Erlang: Open Telecom Platform (Part 3)

Is Erlang really a new language? Do we have to write our old libraries and applications again from scratch? Actually, Erlang is an old language, older than both Java and .NET: it was developed in the late 1980s. Today I will talk about one of the strengths of Erlang, the Open Telecom Platform (OTP). The OTP is a development platform for building telecommunications applications based on Erlang. It was released as open source in 1998.

The OTP is a development system platform for building, and a control system platform for running, telecommunications applications. It is not a monolithic platform, but is made up of sets of tools and building blocks. Most of it is written in Erlang, although some components are written in C++. The OTP architecture consists of three layers, as shown in the figure.


The Bottom layer: The system hardware sits in this layer. This is merely an architectural view; in real systems, the bottom layer contains many computers, which may be of different types.

The Middle layer: Support for telecommunications requirements is provided by robust real-time components. Its main modules are:
  • Erlang run-time system: The basic system that supports the execution of Erlang programs.
  • Web server: A Web server that serves HTTP requests by executing server-side applications written in Erlang.
  • Mnesia: A real-time fault-tolerant distributed DBMS that supports fast transactions for the telecommunications application, and a query language, called Mnemosyne, for handling complex queries.
  • SASL: The systems architecture support libraries (SASL) contain basic software that supports system start/restart, live system software updates, and process management.
  • SNMP support: SNMP provides run-time support through an extensible agent.
The Top layer: All applications have access to Mnesia and SASL. The SNMP agent and the Web server may also invoke functions that are provided by the applications in this layer.

Using this platform, with its libraries and utilities, you will be able to write robust, real-time systems. You will be surprised at how powerful Erlang is. Next time I will talk about existing systems that use Erlang.

To be continued ...