The HPC Lab at ISTI-CNR has developed a service provider ranker (SPRanker) tool. The SPRanker module mediates between the three main stakeholders of a service-oriented architecture (SOA): 'providers' that publish services, 'users' that discover and bind services, and 'brokers' through which providers spread information on their services to users. Users, in turn, rely on a broker to locate and select the services they need. Figure 1 depicts the publish-discover-bind process that typically takes place in real-world Grid-service-based SOAs.
Modern approaches to quality of service (QoS) within SOAs adopt service-level agreements (SLAs) as a way of defining constraints that must be satisfied by customers and providers. In this respect, SLAs represent a 'best effort' strategy for selecting service providers. In this kind of system the broker is usually a UDDI (Universal Description, Discovery and Integration) registry that publishes the WSDL (Web Services Description Language) descriptions of the stored services. Providers publish a service by pushing its description into the registry. Customers discover services by querying the registry for a service URI. Finally, the binding phase consists of invoking the actual service through the Simple Object Access Protocol (SOAP).
We hypothesize that in the near future, the world will be populated by thousands of millions of services from different providers. Like the Web today, where the same information is supplied by different Web sites, many different providers will supply diverse services in the future Net.
To develop a reliable, scalable, highly available and high-performance service, the discovery phase must provide the best possible set of matching services (i.e. it must have a high level of precision). In SLA-based SOAs, once a service has been found it is bound to the client only if the SLA template the provider offers is appropriate for the customer.
SPRanker does not simply return a flat list of results: it ranks the various providers according to a quality metric. The service designer then chooses the provider from the ranked list. Note that SPRanker differs from UDDI registries, which are not capable of retrieval on the basis of QoS information. The use of UDDI corresponds to composing workflows according to a 'best effort' strategy.
Our ranked discovery service implements a novel ranking schema based on solid information retrieval theory, namely the vector space model, by considering historical information on expired SLAs. The ranking score is in fact based on the assumption that provider performance (in terms of QoS) should be evaluated collaboratively by considering user feedback.
The vector space model represents an object as a vector in ℜ^n, where each dimension corresponds to a separate term. If a term occurs in the object, its value in the corresponding vector entry is non-zero. SLAs are the objects that are modelled as vectors in ℜ^n. Here, n is the number of possible SLA terms, and each SLA term is mapped into a particular dimension. To keep the model as simple as possible we consider only unit vectors. The normalization is done in such a way that all the vector coordinates range between 0 and 1/n.
An SLA-vector is defined as a unit vector representing a successfully completed service-level agreement issued to a service provider at a given time. Each value of the vector is the value associated with a term of the SLA template. For example, an SLA-vector could represent the SLA of a service provided at time T running on a 2-way SMP, with 1 TB of free disk space and at a cost of $0.04.
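As a rough illustration of this encoding, the following Python sketch maps the terms of an SLA onto a vector with one coordinate per template term. The term names (cpus, disk_tb, cost_usd), their upper bounds and the scaling rule are assumptions made for the example; the article only states that each coordinate lies between 0 and 1/n, and SPRanker's actual normalization may differ.

```python
from typing import Dict, List

# Hypothetical SLA template: term name -> assumed upper bound used for scaling.
TEMPLATE: Dict[str, float] = {"cpus": 8.0, "disk_tb": 4.0, "cost_usd": 0.10}


def sla_to_vector(sla: Dict[str, float],
                  template: Dict[str, float] = TEMPLATE) -> List[float]:
    """Map an SLA onto a vector with one coordinate per template term.

    Assumed normalization: each raw value is scaled to [0, 1] by its upper
    bound and then divided by n, so every coordinate lies in [0, 1/n].
    A term that does not occur in the SLA yields a zero entry.
    """
    n = len(template)
    vector = []
    for term, upper in template.items():
        raw = sla.get(term, 0.0)
        scaled = min(max(raw / upper, 0.0), 1.0)  # clamp to [0, 1]
        vector.append(scaled / n)
    return vector


# The example from the text: a 2-way SMP, 1 TB of free disk, cost of $0.04.
example = sla_to_vector({"cpus": 2, "disk_tb": 1, "cost_usd": 0.04})
```

With these assumed bounds, a term that is absent from the SLA maps to a zero entry, in line with the vector space model described above.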
Queries in SPRanker are called Query-SLAs. A Query-SLA is a unit vector where each value can assume one of the following values:
1. A reference value for term Ti of the SLA template;
2. ο, meaning that we do not want to take into account the i-th term; and
3. •, meaning that the i-th term may assume any value between 0 and 1/n.
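One possible in-memory representation of a Query-SLA is sketched below in Python. The Wildcard enumeration and the validate_query helper are hypothetical names introduced for illustration; only the three kinds of entry come from the description above.

```python
from enum import Enum
from typing import List, Union


class Wildcard(Enum):
    """Hypothetical markers for the two non-numeric Query-SLA entries."""
    IGNORE = "ο"  # the i-th term is not taken into account
    ANY = "•"     # the i-th term may assume any value between 0 and 1/n


QueryTerm = Union[float, Wildcard]


def validate_query(query: List[QueryTerm], n: int) -> bool:
    """Check that a Query-SLA has n entries and that every reference value
    lies between 0 and 1/n, as stated for normalized SLA terms."""
    if len(query) != n:
        return False
    for term in query:
        if isinstance(term, Wildcard):
            continue
        if not isinstance(term, (int, float)):
            return False
        if not 0.0 <= term <= 1.0 / n:
            return False
    return True


# A query over three terms: a reference value, an ignored term, a free term.
query = [0.1, Wildcard.IGNORE, Wildcard.ANY]
```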
Consider a Query-SLA and a set of SLA-vectors, each representing an SLA successfully issued at time T by a provider for a particular service. We define a similarity function that takes into account the provider, the service name and the SLA issue time. We set sim = 0 if either the provider or the service name differs. Otherwise, the similarity is the sum of the terms shared by the Query-SLA and the SLA-vector, weighted by the time elapsed since the SLA was issued. Given a Query-SLA, SPRanker returns a list of the providers offering a particular service, ordered by similarity to the query.
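The exact similarity formula is not given here, so the Python sketch below is only one plausible reading of it. It assumes that shared terms are scored as a product of coordinates (in the spirit of the vector space model), that a '•' entry counts as the largest admissible reference value 1/n, that the time weight is an exponential decay with an arbitrary 30-day half-life, and that a provider's score is the sum of the similarities of its SLA-vectors. All names (Wildcard, SLARecord, similarity, rank_providers) are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Sequence, Tuple, Union


class Wildcard(Enum):
    IGNORE = "ο"  # term not taken into account
    ANY = "•"     # term may assume any value between 0 and 1/n


QueryTerm = Union[float, Wildcard]


@dataclass
class SLARecord:
    provider: str
    service: str
    issued_at: float         # issue time T, in seconds
    vector: Sequence[float]  # normalized SLA-vector


def similarity(provider: str, service: str, query: Sequence[QueryTerm],
               record: SLARecord, now: float,
               half_life: float = 30 * 86400.0) -> float:
    """Time-weighted similarity between a Query-SLA and one SLA-vector."""
    # sim = 0 if either the provider or the service name differs.
    if record.provider != provider or record.service != service:
        return 0.0
    n = len(record.vector)
    shared = 0.0
    for q, v in zip(query, record.vector):
        if q is Wildcard.IGNORE:
            continue
        reference = 1.0 / n if q is Wildcard.ANY else q  # assumption for '•'
        shared += reference * v
    age = max(now - record.issued_at, 0.0)
    return shared * 0.5 ** (age / half_life)  # assumed exponential decay


def rank_providers(service: str, query: Sequence[QueryTerm],
                   records: Sequence[SLARecord], now: float,
                   half_life: float = 30 * 86400.0) -> List[Tuple[str, float]]:
    """Return (provider, score) pairs for a service, best score first."""
    scores = {}
    for r in records:
        if r.service != service:
            continue
        s = similarity(r.provider, service, query, r, now, half_life)
        scores[r.provider] = scores.get(r.provider, 0.0) + s
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Under these assumptions, two providers with identical SLA-vectors are separated only by recency: the provider whose SLA was issued more recently receives the higher score.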
The architecture of SPRanker comprises three modules: gatherer, indexer and query server. The gatherer collects data from (positively) expired SLAs. We only consider positively expired SLAs because we want to discriminate between good and bad service provisioning, and because we want to prevent malicious partners from submitting false negative judgements in order to lower a provider's score. The gatherer can operate in two different modes: push-based and pull-based.
In push-based mode, the gatherer receives SLAs directly from providers and customers. In contrast, pull-based mode is used to periodically poll known providers for up-to-date information. The indexer transforms the SLAs collected by the gatherer into a machine-readable format. The query server has been implemented as a Web service. It offers two distinct methods, one for each kind of query answered by SPRanker.
This research has been carried out within the framework of S-Cube, the "European Network of Excellence in Software Services and Systems" funded by the European Commission's Seventh Framework Programme.
Franco Maria Nardini
Tel: +39 0050 3153055