by Pablo Cesar, Dick Bulterman and Jack Jansen

Making multimedia a first-class citizen on the Web requires major efforts across the community. European projects such as Passepartout (ITEA) and SPICE (IST IP) show that there is a need for a standardized mechanism to provide rich interaction for continuous media content. CWI is helping to build a framework that adds a temporal dimension to existing a-temporal Web browsers.

Web 2.0 is not so much a technological revolution as an evolution in the attitude of end-users towards the Web. What started as a global library is becoming a social meeting place in which users can share views and content. Where the initial focus was on a repository of static documents, the future focus will be on the provision of dynamic document services, such as the one shown in Figure 1.

Figure 1: Screenshot of the e-tourism scenario.

Some of the future scenarios that motivate our work are:

  • E-tourism: an online guide to a city that includes videos of places of interest. The presentation can dynamically export the coordinates of the locations presented in the videos, which can be used to represent the guided tour in an external engine such as Google Maps, as shown in Figure 1. In addition, a GPS-enabled phone can request the current coordinates and import them into a presentation that shows the location of the user.
  • E-learning: an e-learning portal that includes synchronized videos and slides. An interactive test controlled by an external calculation engine can provide results to the media player, allowing the learning material to be adapted to the knowledge of the student.
  • E-commercials: media-based commercials can be customized to a specific user by, for example, displaying the name of the user and adapting the media presentation based on user preferences.
  • Persistent segmentation: the user can explicitly pause a presentation and then resume it at some later point, possibly days or weeks later.

In order to realize the services-oriented vision with video, an interaction model needs to be defined that transcends the traditional control set of start, stop and pause. The content within the video element will need to be triggered from external, peer-level content, as in the e-commercial scenario. That content in turn needs to trigger related content within the context of a higher-level embedding, as in the e-tourism scenario.

The scenarios show that there is a clear need for richer temporal semantics when integrating a conventional (X)HTML browser interface with multimedia documents. To this end, we wrap videos with an external data model, to extend content-related (not content-based) interaction. The data model, rather than the video encoding, is the focal point for sharing, mashing and reusing individual objects.

Following the lead of XForms, our data model is defined as a small XML document. This data model is language-independent and can be shared between different XML-based documents such as (X)HTML, SMIL, or SVG. In addition, the framework provides support for defining and manipulating the value of variables in the data model. Moreover, the framework provides the mechanism by which variables can be evaluated at runtime and the state variable values saved for the next time the media document is played.
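As a rough illustration of the idea, the following Python sketch keeps a small XML data model, updates variables at runtime, and serializes the state so it can be restored the next time the document is played. The element names (`data`, `userName`, `resumeAt`) and the helper functions are hypothetical and chosen only for this example; they are not the smilState syntax or the Ambulant implementation.

```python
import xml.etree.ElementTree as ET

# Hypothetical data model: element names are illustrative only,
# not taken from smilState or any other specification.
INITIAL_MODEL = """<data>
  <userName>guest</userName>
  <resumeAt>0</resumeAt>
</data>"""


def load_model(xml_text):
    """Parse the XML text of a data model into an element tree."""
    return ET.fromstring(xml_text)


def get_value(model, name):
    """Return the current text value of a variable in the data model."""
    node = model.find(name)
    if node is None:
        raise KeyError(name)
    return node.text or ""


def set_value(model, name, value):
    """Update a variable at runtime; unknown variables are rejected."""
    node = model.find(name)
    if node is None:
        raise KeyError(name)
    node.text = str(value)


def save_model(model):
    """Serialize the data model so it can be stored between sessions."""
    return ET.tostring(model, encoding="unicode")


# Runtime updates, e.g. the user identifies themselves and pauses at 42.5 s.
model = load_model(INITIAL_MODEL)
set_value(model, "userName", "Alice")
set_value(model, "resumeAt", 42.5)

# Persist the state and restore it in a later session.
saved = save_model(model)
restored = load_model(saved)
```

Because the state lives in a plain XML fragment rather than inside the player, the same values could in principle be read by any XML-aware host document, which is the point of keeping the data model language-independent.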

By exporting the data model to the outside world, it becomes possible for the media document to affect other contexts, e.g. the (X)HTML presentation. At the same time, external engines can affect the media presentation. So, unlike embedded video players, in our scenarios the video plays an active role in the Web page.
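The two-way exchange can be pictured as a shared state object that notifies interested parties whenever a variable changes. The sketch below is a minimal illustration under that assumption; the `SharedState` class, the variable names and the listeners are all hypothetical and do not reflect the actual framework API.

```python
class SharedState:
    """Hypothetical shared data model with change notification."""

    def __init__(self, initial):
        self._values = dict(initial)
        self._listeners = []

    def subscribe(self, callback):
        """Register a callback invoked as callback(name, value) on changes."""
        self._listeners.append(callback)

    def get(self, name):
        return self._values[name]

    def set(self, name, value):
        """Store a new value and notify every registered listener."""
        self._values[name] = value
        for callback in list(self._listeners):
            callback(name, value)


state = SharedState({"lat": 0.0, "lon": 0.0, "score": 0})

map_log = []     # stands in for an external map engine in the host page
player_log = []  # stands in for the media player reacting to test results


def on_change_for_map(name, value):
    # The host page only cares about exported coordinates.
    if name in ("lat", "lon"):
        map_log.append((name, value))


def on_change_for_player(name, value):
    # The player only cares about the score set by an external engine.
    if name == "score":
        player_log.append(value)


state.subscribe(on_change_for_map)
state.subscribe(on_change_for_player)

# The media presentation exports the coordinates of the current video.
state.set("lat", 52.3564)
state.set("lon", 4.9519)

# An external calculation engine reports a test result to the player.
state.set("score", 8)
```

In this sketch neither side calls the other directly: the presentation and the external engine both read and write the shared model, which mirrors the e-tourism and e-learning scenarios described earlier.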

At the moment, the framework is implemented in the Ambulant open-source SMIL player. The work sketched in this article has been submitted to the W3C's SYMM working group under the name of smilState. It is expected to be integrated in the SMIL 3.0 release in early 2008. We are also actively participating in the W3C Backplane work to use the results from this and other Web groups to integrate a broadly consistent framework for sharing the data model across XML-based languages.

This work has been funded by the Dutch Bsik BRICKS project, the ITEA Project Passepartout and the FP6 IST project SPICE. Development of the open-source Ambulant Player and CWI's participation in the SMIL standardization effort have been funded by the NLnet foundation.


Please contact:
Pablo Cesar, CWI, The Netherlands
Tel: +31 20 592 4332
