NAME

zoom — Metaproxy ZOOM Module

DESCRIPTION

This filter implements a generic client based on ZOOM of YAZ. The client implements the protocols that ZOOM C does: Z39.50, SRU (GET, POST, SOAP) and SOLR.

This filter only deals with Z39.50 on input. The following services are supported: init, search, present and close. The backend target is selected based on the database given as part of search, not as part of init.

This filter is an alternative to the z3950_client filter, but it also shares properties with the virt_db filter, in that the target is selected for a specific database.

The ZOOM filter relies on a target profile description, which is XML based. It picks the profile for a given database from a web service, or the profile may be given locally for each unique database (AKA virtual database in virt_db). Target profiles are directly and indirectly given as part of the torus element in the configuration.

CONFIGURATION

The configuration consists of five parts: torus, fieldmap, cclmap, contentProxy and log.

torus

The torus element specifies target profiles and takes the following content:

attribute url

URL of Web service to be used to fetch target profile for a given database (udb) of type searchable. The special sequence %db of the URL is replaced by the actual database specified as part of Search.

The special sequence %realm is replaced by the value of the realm attribute or by the realm DATABASE argument.

attribute content_url

URL of Web service to be used to fetch target profile for a given database (udb) of type content. Semantics otherwise like url attribute above.

attribute realm

The default realm value. Used for %realm in URL, unless specified in DATABASE argument.

attribute proxy

HTTP proxy to be used for fetching target profiles.

attribute xsldir

Directory that is searched for XSL stylesheets. Stylesheets are specified in the target profile by the transform element.

attribute element_transform

Specifies the element that triggers retrieval and transform using the parameters elementSet, recordEncoding, requestSyntax, transform from the target profile. Default value is "pz2", due to the fact that for historical reasons the common format is that used in Pazpar2.

attribute element_raw

Specifies an element that triggers retrieval using the parameters elementSet, recordEncoding, requestSyntax from the target profile. Same actions as for element_transform, but without the XSL transform. Useful for debugging. The default value is "raw".

element records

Local target profiles. This element may include zero or more record elements (one per target profile). See section TARGET PROFILE.
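
As a minimal sketch, a torus element that combines a web service URL with one local target profile could look as follows. The host names torus.example.com and z3950.example.org, the realm value, the xsldir path and the udb name are placeholders for illustration, not values from an actual installation.

    <torus
       url="http://torus.example.com/%realm/records/?query=udb%3D%db"
       realm="library"
       xsldir="/usr/share/metaproxy/xsl"
       >
      <records>
        <record>
          <udb>localdb</udb>
          <zurl>z3950.example.org:210/catalog</zurl>
        </record>
      </records>
    </torus>

With this configuration, %realm in the URL is replaced by "library" unless a realm DATABASE argument is given, and %db is replaced by the database name given in Search.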

fieldmap

The fieldmap element may be specified zero or more times. It specifies the mapping from CQL fields to CCL fields and takes the following content:

attribute cql

CQL field that we are mapping "from".

attribute ccl

CCL field that we are mapping "to".

cclmap

The third part of the configuration consists of zero or more cclmap elements that specify the base CCL profile to be used for all targets. This configuration is thus combined with the cclmap definitions from the target profile.
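
The following sketch shows how a fieldmap and a base cclmap might work together for a title field. The qualifier name ti, the Bib-1 use attribute 4 (title) and the structure attribute 2 (word) are chosen for illustration; the attribute values that apply to an actual target may differ.

    <fieldmap cql="dc.title" ccl="ti"/>

    <cclmap>
      <qual name="ti">
        <attr type="u" value="4"/>
        <attr type="s" value="2"/>
      </qual>
    </cclmap>

Here the CQL field dc.title is mapped to the CCL field ti, and the base CCL profile defines ti in terms of a use attribute (u) and a structure attribute (s).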

contentProxy

The contentProxy element controls content proxy'ing. This section is optional and must only be defined if content proxy'ing is enabled.

attribute server

Specifies the content proxy host. The host is of the form host[:port]. That is, without a method (such as HTTP) and with an optional port number.

attribute tmp_file

Specifies a filename of a session file for content proxy'ing. The file should be an absolute filename that includes XXXXXX which is replaced by a unique filename using the mkstemp(3) system call. The default value of this setting is /tmp/cf.XXXXXX.p.
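
A contentProxy element could be sketched as below. The host name and port are placeholders; the tmp_file value shown is the default mentioned above.

    <contentProxy
       server="cproxy.example.com:8080"
       tmp_file="/tmp/cf.XXXXXX.p"
       />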

log

The log element controls logging for the ZOOM filter.

attribute apdu

If the value of apdu is "true", then protocol packages (APDUs and HTTP packages) from the ZOOM filter will be logged to the yaz_log system. A value of "false" will not perform logging of protocol packages (the default behavior).

QUERY HANDLING

The ZOOM filter accepts three query types: RPN (Type-1), CCL and CQL.

Queries are converted in two separate steps. In the first step the input query is converted to RPN/Type-1. This is always the common internal format between step 1 and step 2. In step 2 the query is converted to the native query type of the target.

Step 1: For RPN, the query is passed unmodified to step 2.

Step 1: For CCL, the query is converted to RPN via the cclmap elements that are part of the target profile as well as the base CCL maps.

Step 1: For CQL, the query is converted to CCL. The mappings of CQL fields to CCL fields are handled by fieldmap elements as part of the target profile. The resulting CCL query is then converted to RPN using the schema mentioned earlier (via cclmap).

Step 2: If the target is Z39.50-based, it is passed verbatim (RPN). If the target is SRU-based, the RPN will be converted to CQL. If the target is SOLR-based, the RPN will be converted to SOLR's query type.
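
To illustrate the two steps, consider the CQL query dc.title=computer sent to two hypothetical local targets, one Z39.50 based and one SRU based. The udb names, host names, realm and cclmap_ti values below are assumptions made for this sketch, and the comments describe the expected conversions only roughly.

    <filter type="zoom">
      <torus
         url="http://torus.example.com/records/?query=udb%3D%db"
         realm="library"
         >
        <records>
          <record>
            <!-- Z39.50 target: numeric use attribute -->
            <udb>z-target</udb>
            <cclmap_ti>1=4</cclmap_ti>
            <zurl>z3950.example.org:210/catalog</zurl>
          </record>
          <record>
            <!-- SRU target: string use attribute -->
            <udb>sru-target</udb>
            <cclmap_ti>1=title</cclmap_ti>
            <sru>get</sru>
            <zurl>sru.example.org/catalog</zurl>
          </record>
        </records>
      </torus>
      <!-- Step 1: CQL dc.title=computer becomes CCL ti=computer -->
      <fieldmap cql="dc.title" ccl="ti"/>
    </filter>

For z-target, step 1 would be expected to yield an RPN query along the lines of @attr 1=4 computer, which step 2 passes verbatim. For sru-target, step 1 would yield an RPN query with the string use attribute title, which step 2 converts back to a CQL query on the index title.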

SORTING

The ZOOM module actively handles CQL sorting, using the SORTBY parameter which was introduced in SRU version 1.2. The conversion from a SORTBY clause to the native sort for a given target is driven by two parameters: sortStrategy and sortmap_field.

A sort field that does not have an equivalent sortmap_ mapping is passed unmodified through the conversion. No diagnostic is thrown.

TARGET PROFILE

The ZOOM module is driven by a number of settings that specify how to handle each target. Note that unknown elements are silently ignored.

The elements, in alphabetical order, are:

authentication

Authentication parameters to be sent to the target. For Z39.50 targets, this will be sent as part of the Init Request. Authentication consists of two components: username and password, separated by a slash.

If this value is omitted or empty, no authentication information is sent.

cclmap_field

This value specifies the CCL field (qualifier) definition for a field. For Z39.50 targets this will most likely specify the mapping to a numeric use attribute plus a structure attribute. For SRU targets, the use attribute should be string based, in order to make the RPN to CQL conversion work properly (step 2).

cfAuth

When cfAuth is defined, its value will be used as authentication to the backend target, and the authentication setting will be specified as part of a database. This is like a "proxy" for authentication and is used for Connector Framework based targets.

cfProxy

Specifies HTTP proxy for the target in the form host:port.

cfSubDB

Specifies sub database for a Connector Framework based target.

contentConnector

Specifies a database for content-based proxy'ing.

elementSet

Specifies the elementSet to be sent to the target if record transform is enabled (not to be confused with the record_transform module). The record transform is enabled only if the client uses record syntax = XML and an element set determined by the element_transform / element_raw from the configuration. By default these are the element sets pz2 and raw. If record transform is not enabled, this setting is not used and the element set specified by the client is passed verbatim.

literalTransform

Specifies an XSL stylesheet to be used if record transform is enabled; see description of elementSet. The XSL transform is only used if the element set is set to the value of element_transform in the configuration.

The value of literalTransform is the XSL stylesheet itself, encoded as a string.

piggyback

A value of 1/true is a hint to the ZOOM module that this Z39.50 target supports piggyback searches, i.e. Search Response with records. Any other value (false) will prevent the ZOOM module from making use of piggyback (all records are then part of Present Response).

queryEncoding

If this value is defined, all queries will be converted to this encoding. This should be used for all Z39.50 targets that do not use UTF-8 for query terms.

recordEncoding

Specifies the character encoding of records that are returned by the target. This is primarily used for targets where records are not UTF-8 encoded already. This setting is only used if the record transform is enabled (see description of elementSet).

requestSyntax

Specifies the record syntax to be specified for the target if record transform is enabled; see description of elementSet. If record transform is not enabled, the record syntax of the client is passed verbatim to the target.

sortmap_field

This value specifies the native sort field for a target. The form of the value is given by sortStrategy.

sortStrategy

Specifies the sort strategy for a target. One of: z3950, type7, cql, sru11 or embed. The embed strategy chooses type-7 or CQL sortby depending on whether Type-1 or CQL is actually sent to the target.

sru

If this setting is set, it specifies that the target is web service based. The value must be one of: get, post, soap or solr.

sruVersion

Specifies the SRU version to use. If unset, version 1.2 will be used. Some servers do not support this version, in which case version 1.1 or even 1.0 can be set instead.

transform

Specifies an XSL stylesheet filename to be used if record transform is enabled; see description of elementSet. The XSL transform is only used if the element set is set to the value of element_transform in the configuration.

udb

This value is required and specifies the unique database for this profile. All target profiles should hold a unique database.

urlRecipe

The value of this field is a string that generates a dynamic link based on record content. If the resulting string is non-zero in length, a new field, metadata with attribute type="generated-url", is generated. The content of this field is the result of the URL recipe conversion. The urlRecipe value may refer to an existing metadata element by ${field[pattern/result/flags]}, which will take the content of field and perform a regular expression conversion using the pattern given. For example: ${md-title[\s+/+/g]} takes metadata element title and converts one or more spaces to a plus character.

If the contentConnector setting is also defined, the resulting value is augmented with a session string as well as the host name of the content proxy server.

zurl

This setting is mandatory and specifies the ZURL of the target in the form of host/database. The HTTP method should not be provided as this is guessed from the "sru" attribute value.
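
As a rough sketch, a local target profile for a Z39.50 target that returns MARC records might combine several of the settings above. All values here (credentials, encodings, the stylesheet name marc.xsl, host names and the link target) are hypothetical and only meant to show where each setting goes. The elements appear in the order used by the schema in section SCHEMA.

    <record>
      <authentication>guest/secret</authentication>
      <piggyback>1</piggyback>
      <queryEncoding>iso-8859-1</queryEncoding>
      <udb>marcdb</udb>
      <cclmap_ti>1=4</cclmap_ti>
      <elementSet>F</elementSet>
      <recordEncoding>MARC-8</recordEncoding>
      <requestSyntax>usmarc</requestSyntax>
      <transform>marc.xsl</transform>
      <urlRecipe>http://www.example.com/search/${md-title[\s+/+/g]}</urlRecipe>
      <zurl>z3950.example.org:210/catalog</zurl>
    </record>

The transform stylesheet would be looked up in the directory given by the xsldir attribute of the torus element, and it would only be applied when the client asks for XML records with the element set named by element_transform.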

DATABASE parameters

Extra information may be carried in the Z39.50 Database or SRU path, such as authentication to be passed to backend etc. Some of the parameters override TARGET profile values. The format is

udb,parm1=value1&parm2=value2&...

Where udb is the unique database recognised by the backend and parm1, value1, .. are parameters to be passed. The following describes the supported parameters. Like form values in HTTP, the parameters and values are URL encoded. The separator between udb and parameters, though, is a comma rather than a question mark. What follows a question mark are HTTP arguments (in this case SRU arguments).

user

Specifies user to be passed to backend. If this parameter is omitted, the user will be taken from TARGET profile setting authentication.

password

Specifies password to be passed to backend. If this parameter is omitted, the password will be taken from TARGET profile setting authentication.

proxy

Specifies proxy to be used for backend. If this parameter is omitted, the proxy will be taken from TARGET profile setting cfProxy.

cproxysession

Session ID for content proxy. This parameter is, generally, not used by anything but the content proxy itself.

realm

Session realm to be used for this target. This changes the URL used for getting a target profile, by changing the value that gets substituted for the %realm string.

x-parm

All parameters that have the prefix x- are passed verbatim to the backend.
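
For example, a client that wants to search the unique database mydb as user guest, with a password containing an ampersand, and with one extra backend parameter could give the following database name. The names mydb, guest, p&ss and x-lang are made up for illustration.

    mydb,user=guest&password=p%26ss&x-lang=en

The ampersand inside the password is URL encoded as %26 so that it is not taken as a parameter separator.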

SCHEMA

# Metaproxy XML config file schema

namespace mp = "http://indexdata.com/metaproxy"

filter_zoom =
  attribute type { "zoom" },
  attribute id { xsd:NCName }?,
  attribute name { xsd:NCName }?,
  element mp:torus {
    attribute url { xsd:string },
    attribute content_url { xsd:string }?,
    attribute realm { xsd:string },
    attribute xsldir { xsd:string }?,
    attribute element_transform { xsd:string }?,
    attribute element_raw { xsd:string }?,
    attribute proxy { xsd:string }?,
    element mp:records {
      element mp:record {
        element mp:authentication { xsd:string }?,
        element mp:piggyback { xsd:string }?,
        element mp:queryEncoding { xsd:string }?,
        element mp:udb { xsd:string },
        element mp:cclmap_au { xsd:string }?,
        element mp:cclmap_date { xsd:string }?,
        element mp:cclmap_isbn { xsd:string }?,
        element mp:cclmap_su { xsd:string }?,
        element mp:cclmap_term { xsd:string }?,
        element mp:cclmap_ti { xsd:string }?,
        element mp:elementSet { xsd:string }?,
        element mp:recordEncoding { xsd:string }?,
        element mp:requestSyntax { xsd:string }?,
        element mp:sru { xsd:string }?,
        element mp:sruVersion { xsd:string }?,
        element mp:transform { xsd:string }?,
        element mp:literalTransform { xsd:string }?,
        element mp:urlRecipe { xsd:string }?,
        element mp:zurl { xsd:string },
        element mp:cfAuth { xsd:string }?,
        element mp:cfProxy { xsd:string }?,
        element mp:cfSubDB { xsd:string }?,
        element mp:contentConnector { xsd:string }?,
        element mp:sortStrategy { xsd:string }?,
        element mp:sortmap_author { xsd:string }?,
        element mp:sortmap_date { xsd:string }?,
        element mp:sortmap_title { xsd:string }?
      }*
    }?
  }?,
  element mp:fieldmap {
    attribute cql { xsd:string },
    attribute ccl { xsd:string }?
  }*,
  element mp:cclmap {
    element mp:qual {
      attribute name { xsd:string },
      element mp:attr {
        attribute type { xsd:string },
        attribute value { xsd:string }
      }+
    }*
  }?,
  element mp:contentProxy {
    attribute server { xsd:string }?,
    attribute tmp_file { xsd:string }?
  }?,
  element mp:log {
    attribute apdu { xsd:boolean }?
  }?

EXAMPLES

The following configuration illustrates most of the facilities:

    <filter type="zoom">
      <torus
         url="http://torus.indexdata.com/src/records/?query=udb%3D%db"
         proxy="localhost:3128"
         />
      <fieldmap cql="cql.anywhere"/>
      <fieldmap cql="cql.serverChoice"/>
      <fieldmap cql="dc.creator" ccl="au"/>
      <fieldmap cql="dc.title" ccl="ti"/>
      <fieldmap cql="dc.subject" ccl="su"/>
      
      <cclmap>
        <qual name="ocn">
          <attr type="u" value="12"/>
          <attr type="s" value="107"/>
        </qual>
      </cclmap>
      <log apdu="true"/>
    </filter>

   

SEE ALSO

metaproxy(1)

virt_db(3mp)

COPYRIGHT

Copyright (C) 2005-2011 Index Data