5. Result sets

This section covers the queries used by IrTcl, and how searches and presents are handled.

A search operation and a result set are described by the ir set object. The ir set object is defined by the ir-set command, which takes two parameters. The first is the name of the new ir set object, and the second, which is optional, is the name of an association -- an ir object. The second argument is required if the ir set object should be able to perform searches and presents. It is not required if only ``local'' operations are done with the ir set object.

When the ir set object is created, a number of settings are inherited from the ir object, such as the selected databases, query type, etc. Thus, the ir object contains what we could call default settings.

5.1 Queries

Search requests are sent by the search action, which takes a query as its parameter. There are two types of queries, RPN and CCL, controlled by the setting queryType. A string representation of the query is used in IrTcl since Tcl has reasonably powerful string manipulation capabilities. The RPN query used in IrTcl is the prefix query notation also used in the YAZ test client.

The CCL query is an uninterpreted octet-string which is parsed by the target. We refer to the standard: ISO 8777. Note that only a few targets actually support the CCL query and the interpretation of the standard may vary.

The prefix query notation (which is converted to RPN) offers a few operators. They are:

@attr list op

The attributes in list are applied to op

@and op1 op2

Boolean and on op1 and op2

@or op1 op2

Boolean or on op1 and op2

@not op1 op2

Boolean and-not on op1 and op2 (records matching op1 but not op2)

@prox list op1 op2

Proximity operation on op1 and op2. Not implemented yet.

@set name

Result set reference

@attrset set

Whole query uses the specified attribute set. If this operator is used it must be defined at the beginning of the query.

It is simple to build RPN queries in IrTcl. Search terms are sequences of characters, as in:

   science

Boolean operators use the prefix notation (instead of the suffix/RPN), as in:

   @and science technology

Search terms may be associated with attributes. These attributes are indicated by the @attr operator. Assuming the bib-1 attribute set, we can set the use-attribute (type is 1) to title (value is 4):

   @attr 1=4 science

Also, it is possible to apply attributes to a range of search terms. In the query below, both search terms have use=title but the tech term is right truncated:

   @attr 1=4 @and @attr 5=1 tech beta

To search for the DatabaseInfo records from an Explain server, we could use

   @attrset exp1 @attr 1=1 DatabaseInfo
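Since queries are plain strings, they can also be assembled with ordinary Tcl procedures. The sketch below is only an illustration: the helper names pqf-attr and pqf-and are hypothetical and not part of IrTcl, and single-word search terms are assumed. It rebuilds the truncated-title query shown earlier in this section.

```tcl
# Hypothetical helpers (not part of IrTcl) that build prefix-notation
# query strings. Single-word search terms are assumed.

# Prefix an operand with one @attr operator per type=value pair.
proc pqf-attr {attrs operand} {
    set q ""
    foreach a $attrs {
        append q "@attr $a "
    }
    return "$q$operand"
}

# Combine two operands with the boolean and operator.
proc pqf-and {op1 op2} {
    return "@and $op1 $op2"
}

# use=title (1=4) on both terms, right truncation (5=1) on tech only.
set q [pqf-attr {1=4} [pqf-and [pqf-attr {5=1} tech] beta]]
puts $q   ;# prints: @attr 1=4 @and @attr 5=1 tech beta
```

The resulting string can be handed to the search action like any hand-written query.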

5.2 Search

The settings that affect the search are listed below:

databaseNames list

Database-names.

smallSetUpperBound integer

Small set upper bound. Default 0.

largeSetLowerBound integer

Large set lower bound. Default 1.

mediumSetPresentNumber integer

Medium set present number. Default 0.

replaceIndicator boolean

Replace-indicator. Default true (1).

setName string

Name of result set. Default name of set is default.

queryType rpn|ccl

Query type-1 or query type-2. Default rpn (type-1).

preferredRecordSyntax string

Preferred record syntax -- UNIMARC, USMARC, etc.

smallSetElementSetNames string

small-set-element-set names. If string is empty the element set is not set. Default is empty (not set).

mediumSetElementSetNames string

medium-set-element-set names. If string is empty the element set is not set. Default is empty (not set).

nextResultSetPosition returns integer

Next result set position.

referenceId string

Reference-id. If string is empty no reference-id is used.

searchResponse list

Search-response Tcl script.

callback list

General response Tcl script. Only used if searchResponse is not specified. This setting is valid only for the ir object -- not the ir-set object.

Setting the databaseNames is mandatory. All other settings have reasonable defaults. The search-response handler, specified by the callback or the searchResponse setting, should read some of the settings shown below:

searchStatus returns boolean

Search-status. True if search operation was successful; false otherwise.

responseStatus returns list

Response status information.

resultCount returns integer

Result-count: the number of hits.

numberOfRecordsReturned returns integer

Number of records returned.

referenceId returns string

Reference-id of search response.

The responseStatus signals one of three conditions, indicated by the value of the first item in the list:

NSD

indicates that the target has returned one or more non-surrogate diagnostic messages. The NSD item is followed by a list with all non-surrogate messages. Each non-surrogate message consists of three items. The first item is the error code (integer); the next item is a textual representation of the error code in plain English; the third item is additional information, possibly empty if no additional information was returned by the target.

DBOSD

indicates a successful operation where the target has returned one or more records. Each record may be either a database record or a surrogate diagnostic.

OK

indicates a successful operation -- no records are returned from the target.

Example

We continue with the multiple-targets example. The init-response procedure will attempt to make searches:

proc init-response {assoc} {
    puts "$assoc connected"
    ir-set ${assoc}.1 $assoc
    $assoc.1 queryType rpn
    $assoc.1 databaseNames base-a base-b
    $assoc callback [list search-response $assoc ${assoc}.1]
    $assoc.1 search "@attr 1=4 @and @attr 5=1 tech beta"
}

An ir set object is defined and is told the name of the ir object it belongs to. The ir set object uses the name of the ir object as a prefix.

Then, the query-type is defined to be RPN, i.e. we will use the prefix query notation later on.

Two databases, base-a and base-b, are selected.

A search-response handler is defined with the ir object and the ir-set object as parameters and the search is executed.

The first part of the search-response looks like:

proc search-response {assoc rset} {
    set status [$rset responseStatus]
    set type [lindex $status 0]
    if {$type == "NSD"} {
        set code [lindex $status 1]
        set msg [lindex $status 2]
        set addinfo [lindex $status 3]
        puts "NSD $code: $msg: $addinfo"
        return
    } 
    set hits [$rset resultCount]
    if {$type == "DBOSD"} {
        set ret [$rset numberOfRecordsReturned]
        ...
    }
}

The response status is stored in the variable status, and the first element indicates the condition. If non-surrogate diagnostics are returned, they are displayed. Otherwise, the search was a success and the number of hits is read. Finally, it is tested whether the search response returned records (database or diagnostic).

Note that we actually didn't inspect the search status (setting searchStatus) to determine whether the search was successful or not, because the standard specifies that one or more non-surrogate diagnostics should be returned by the target in case of errors.

End of example
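The handler in the example reads only the first diagnostic. As a minimal sketch, the procedure below collects every non-surrogate diagnostic from a responseStatus value. It assumes that the items after NSD come in flat groups of three (code, message, additional information), which is what the lindex calls in the example suggest. The name nsd-messages is hypothetical and the sample status value is invented for illustration.

```tcl
# Hypothetical helper (not part of IrTcl): split a responseStatus value
# into a list of {code message addinfo} triples.
# Assumption: the diagnostics follow the NSD item as a flat sequence.
proc nsd-messages {status} {
    set out {}
    if {[lindex $status 0] ne "NSD"} {
        return $out
    }
    foreach {code msg addinfo} [lrange $status 1 end] {
        lappend out [list $code $msg $addinfo]
    }
    return $out
}

# Invented sample value with two diagnostics.
set sample {NSD 114 {Unsupported Use attribute} 4 109 {Database unavailable} base-b}
foreach m [nsd-messages $sample] {
    puts "NSD [lindex $m 0]: [lindex $m 1]: [lindex $m 2]"
}
```

In a real handler the argument would be the value returned by the responseStatus setting of the ir set object.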

If one or more records are returned from the target, they will be stored in the result set object. A search response that contains records is very similar to a present response. Therefore, some settings are common to both situations.

5.3 Present

The present action sends a present request. The action is followed by two optional integers. The first integer is the result-set starting position, which defaults to 1. The second integer is the number of records requested, which defaults to 10. The settings that can be modified before a present action are:

preferredRecordSyntax string

Preferred record syntax -- UNIMARC, USMARC, etc.

elementSetNames string

Element-set names. If string is empty the element set is not set. Default is empty (not set).

referenceId string

Reference-id. If string is empty no reference-id is used.

presentResponse list

Present-response Tcl script.

callback list

General response Tcl script. Only used if presentResponse is not specified. This setting is valid only for the ir object -- not the ir-set object.

The present-response handler should inspect the settings listed below. Note that the responseStatus and numberOfRecordsReturned settings were also used in the search-response case.

As in the search response case, records returned from the target are stored in the result set object.

presentStatus returns boolean

Present-status.

responseStatus returns list

Response status information.

numberOfRecordsReturned returns integer

Number of records returned.

nextResultSetPosition returns integer

Next result set position.

referenceId returns string

Reference-id of present response.

5.4 Records

Search responses and present responses may result in one or more records stored in the ir set object if the responseStatus setting indicates database or surrogate diagnostics (DBOSD). The individual records, indexed by an integer position offset, should then be inspected.

If element set names have been specified either in the search requests (smallSetElementSetNames / mediumSetElementSetNames) or in present requests (elementSetNames), the individual records in the ir set object are assigned the appropriate element set ids. In this mode, records at a given position are treated as distinct as long as they have different element set ids. To inspect records with a particular element set id in subsequent operations, use the recordElements setting followed by the id. If you have more than one record at a given position and you do not use recordElements, the record selected at the given position is undefined.

The action type followed by an integer returns information about a given position in an ir set. There are three possibilities:

SD

The item is a surrogate diagnostic record.

empty

There is no record at the specified position.

DB

The item is a database record.

To handle the first case, a surrogate diagnostic record, the Diag action should be used. It returns three items: error code (integer), text representation in plain English (string), and additional information (string, possibly empty).

In the second case, no record, note that there still might be a record at the position but with an id that differs from that specified by recordElements.

In the third case, a database record, the recordType action should be used. It returns the record type at the given position. Some record types are:

UNIMARC INTERMARC CCF USMARC UKMARC NORMARC LIBRISMARC DANMARC FINMARC SUTRS

Example

We continue our search-response example. In the DBOSD case, we should inspect the result set items. Recall that the ir set name was passed to the search-response handler as argument rset.

    if {$type == "DBOSD"} {
        set ret [$rset numberOfRecordsReturned]
        for {set i 1} {$i<=$ret} {incr i} {
            set itype [$rset type $i]
            if {$itype == "SD"} {
                set diag [$rset Diag $i]
                set code [lindex $diag 0]
                set msg [lindex $diag 1]
                set addinfo [lindex $diag 2]
                puts "$i: SD $code: $msg: $addinfo"
            } elseif {$itype == "DB"} {
                set rtype [$rset recordType $i]
                puts "$i: type is $rtype"
            }
        }
    }

Each item in the result set is examined. If an item is a diagnostic message, it is displayed; otherwise, if it is a database record, its type is displayed.

End of example

5.5 MARC records

When there is a MARC record at a given position, we want to display it somehow. The getMarc action is what we need. It is followed by a position integer and the type of extraction we want to make: field or line.

The field and line type are followed by three parameters that serve as extraction masks. They are called tag, indicator and field. If the mask matches a tag/indicator/field of a record the information is extracted. Two characters have special meaning in masks: the dot (any character) and star (any number of any character).
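To make the mask rules concrete, the sketch below imitates them in plain Tcl by translating a mask into a regular expression. This is not how IrTcl itself performs the match, which the text does not specify. The name mask-match is hypothetical, and matching the mask against the whole value (rather than a part of it) is an assumption.

```tcl
# Hypothetical helper (not part of IrTcl): test a value against a mask
# in which "." stands for any single character and "*" for any number
# of any characters. Assumption: the mask must cover the whole value.
proc mask-match {mask value} {
    set re ""
    foreach ch [split $mask ""] {
        if {$ch eq "*"} {
            append re ".*"
        } elseif {$ch eq "."} {
            append re "."
        } elseif {[string is alnum $ch]} {
            append re $ch
        } else {
            # Escape any other character so it is matched literally.
            append re "\\$ch"
        }
    }
    return [regexp -- "^${re}\$" $value]
}

puts [mask-match 245 245]   ;# 1
puts [mask-match 2.. 245]   ;# 1
puts [mask-match * 10]      ;# 1
puts [mask-match 24 245]    ;# 0
```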

The field type returns one or more lists of field information that matches the mask specification. Only the content of fields is returned.

The line type, on the other hand, returns a Tcl list that completely describes the layout of the MARC record -- including tags, fields, etc.

The field type is sufficient and efficient when only a small number of fields are extracted, and when no further processing (in Tcl) is necessary.

However, if the MARC record is to be edited or altered in any way, the line extraction is more powerful -- only limited by the Tcl language itself.

Example

Consider the record below:

001       11224466 
003    DLC
005    00000000000000.0
008    910710c19910701nju           00010 eng  
010    $a    11224466 
040    $a DLC $c DLC
050 00 $a 123-xyz
100 10 $a Jack Collins
245 10 $a How to program a computer
260 1  $a Penguin
263    $a 8710
300    $a p. cm.

Assuming this record is at position 1 in ir-set z.1, we might extract the title-field (245 * a), with the following command:

z.1 getMarc 1 field 245 * a

which gives:

{How to program a computer}

Using the line instead of field gives:

{245 {10} {{a {How to program a computer}} }}

If we wish to extract the whole record as a list, we use:

z.1 getMarc 1 line * * *

giving:

{001 {} {{{} {   11224466 }} }}
{003 {} {{{} DLC} }}
{005 {} {{{} 00000000000000.0} }}
{008 {} {{{} {910710c19910701nju           00010 eng  }} }}
{010 {  } {{a {   11224466 }} }}
{040 {  } {{a DLC} {c DLC} }}
{050 {00} {{a 123-xyz} }}
{100 {10} {{a {Jack Collins}} }}
{245 {10} {{a {How to program a computer}} }}
{260 {1 } {{a Penguin} }}
{263 {  } {{a 8710} }}
{300 {  } {{a {p. cm.}} }}

End of example

Example

This example demonstrates how Tcl can be used to examine a MARC record in the list notation.

The procedure extract-format extracts fields from a MARC record based on a number of masks. There are five parameters: r, a record in list notation; tag, a regular expression to match the record tags; ind, a regular expression to match indicators; field, a regular expression to match fields; and finally text, a regular expression to match the content of a field.

proc extract-format {r tag ind field text} {
    foreach line $r {
        if {[regexp $tag [lindex $line 0]] && \
                [regexp $ind [lindex $line 1]]} {
            foreach f [lindex $line 2] {
                if {[regexp $field [lindex $f 0]]} {
                    if {[regexp $text [lindex $f 1]]} {
                        puts [lindex $f 1]
                    }
                }
            }
        }
    }
}

To match comput followed by any number of characters in the 245 fields of the record from the previous example, we could use:

set r [z.1 getMarc 1 line * * *]

extract-format $r 245 .. . comput

which gives:

How to program a computer

End of example

The putMarc action does the opposite of getMarc. It copies a record in Tcl list notation to an ir set object and is needed if a result set must be updated with a record that was modified in Tcl (for example, one edited by the user).
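Editing a record in list notation needs nothing beyond Tcl's list commands. The following sketch replaces the content of one subfield. The procedure name marc-set-subfield is hypothetical, the record is a two-line excerpt of the earlier example, and the new title is made up. Only the list manipulation is shown; the call to putMarc is left out because its arguments are not described here.

```tcl
# Hypothetical helper (not part of IrTcl): return a copy of a record in
# list notation in which every subfield "code" of every field "tag"
# has its content replaced by "value".
proc marc-set-subfield {rec tag code value} {
    set out {}
    foreach line $rec {
        if {[lindex $line 0] eq $tag} {
            set fields {}
            foreach f [lindex $line 2] {
                if {[lindex $f 0] eq $code} {
                    set f [list $code $value]
                }
                lappend fields $f
            }
            # Rebuild the line: tag, indicator, list of subfields.
            set line [list [lindex $line 0] [lindex $line 1] $fields]
        }
        lappend out $line
    }
    return $out
}

# Two lines taken from the record in the earlier example.
set rec {
    {100 {10} {{a {Jack Collins}} }}
    {245 {10} {{a {How to program a computer}} }}
}
set new [marc-set-subfield $rec 245 a {How to program a computer, 2nd ed.}]
puts [lindex $new 1]
```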

5.6 SUTRS

In IrTcl a SUTRS record is treated as one single string. To retrieve a SUTRS record, use getSutrs followed by an index.

5.7 XML

In IrTcl an XML record is treated as one single string. To retrieve an XML record, use getXml followed by an index.

5.8 GRS-1

A GRS-1 record in IrTcl is represented as a list of elements. Each element specifies a tag as well as data. The data may be a subtree, which is represented as a list, and so on.

The method getGrs is followed by a record index and optional specifiers that select a specific sub-tree. Each element consists of five items:

tag-set

Tag set number.

value-type

Type of tag value. May be either numeric or string.

value

The value itself.

data-type

May be either octets, numeric, ext, string, bool, intUnit, empty, notRequested, diagnostic or subtree.

data

The data associated with element of given type as indicated before. If data-type is numeric or string then data is encoded as a single Tcl token. The data-type bool is encoded as 0 or 1 for false and true respectively. If the data-type is subtree the data is a sub-list. In all other cases, the data is the empty string.

Example

Consider the GRS-1 record below as shown by the YAZ client program:

(1,1) OID: GILS-schema
(1,14) 2
(2,1)  UTAH EARTHQUAKE EPICENTERS
    class=4,type=1,value=us
(4,52) UTAH GEOLOGICAL AND MINERAL SURVEY
(3,Local-Subject-Index) APPALACHIAN VALLEY; EARTHQUAKE; EPICENTER
(2,6)
    (1,19) Five files of epicenter data arranged by ...
    (3,Format) DIGITAL DATA SETS
    (3,Data-Category) TERRESTRIAL
    (3,Comments) Data are supplied by the University of Utah ...
(4,70)
    (4,90)
        (2,10) UTAH GEOLOGICAL AND MINERAL SURVEY
        (4,2) 606 BLACK HAWK WAY
        (4,3) SALT LAKE CITY
        (3,State) UT
        (3,Zip-Code) 84108
        (2,16) USA
        (2,14) (801) 581-6831
    (4,7) UTAH EARTHQUAKE EPICENTERS
(4,1) ESDD0006
(1,16) 198903

The record may be fetched from the result set, z.1, at position 1 by using:

z.1 getGrs 1

which will return:

{ 1 numeric 1 oid 1.2.840.10003.13.2 }
{ 1 numeric 14 string 2 }
{ 2 numeric 1 string
   { UTAH EARTHQUAKE EPICENTERS} }
{ 4 numeric 52 string {UTAH GEOLOGICAL AND MINERAL SURVEY} }
{ 3 string Local-Subject-Index string
   {APPALACHIAN VALLEY; EARTHQUAKE; EPICENTER} }
{ 2 numeric 6 subtree
   { { 1 numeric 19 string
      {Five files of epicenter data arranged by ...} }
   { 3 string Format string {DIGITAL DATA SETS} }
   { 3 string Data-Category string TERRESTRIAL }
   { 3 string Comments string   
      {Data are supplied by the University of Utah ...} } } }
{ 4 numeric 70 subtree
   { { 4 numeric 90 subtree
      { { 2 numeric 10 string
         {UTAH GEOLOGICAL AND MINERAL SURVEY} }
      { 4 numeric 2 string {606 BLACK HAWK WAY} }
      { 4 numeric 3 string {SALT LAKE CITY} }
      { 3 string State string UT }
      { 3 string Zip-Code string 84108 }
      { 2 numeric 16 string USA }
      { 2 numeric 14 string {(801) 581-6831} } } }
      { 4 numeric 7 string {UTAH EARTHQUAKE EPICENTERS} } } }
{ 4 numeric 1 string ESDD0006 }
{ 1 numeric 16 string 198903 } 

We can choose only to get the path (2,6) by using:

z.1 getGrs 1 (2,6)

and we'll get:

{ 2 numeric 6 subtree { { 1 numeric 19 string
   {Five files of epicenter data arranged by ...} }
   { 3 string Format string {DIGITAL DATA SETS} }
   { 3 string Data-Category string TERRESTRIAL }
   { 3 string Comments
      string {Data are supplied by the University of Utah ...} } } }

To get the well known (1,19) within the subject (2,6) we use

z.1 getGrs 1 (2,6) (1,19)

and get:

{ 2 numeric 6 subtree
   { { 1 numeric 19 string
      {Five files of epicenter data arranged by ...} } } }

End of example
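Because subtrees are nested lists, a GRS-1 record in this notation is naturally processed with a recursive procedure. The sketch below prints a record in roughly the indented form shown by the YAZ client. The name grs-dump is hypothetical, the sample is a shortened excerpt of the record above, and lassign requires Tcl 8.5 or later.

```tcl
# Hypothetical helper (not part of IrTcl): turn a GRS-1 record in list
# notation into a list of indented text lines, one per element.
proc grs-dump {rec {indent ""}} {
    set lines {}
    foreach e $rec {
        # Each element: tag-set, value-type, value, data-type, data.
        lassign $e tagset vtype value dtype data
        if {$dtype eq "subtree"} {
            lappend lines "${indent}($tagset,$value)"
            # Descend into the sub-list with a deeper indent.
            foreach l [grs-dump $data "$indent    "] {
                lappend lines $l
            }
        } else {
            # For data types without a value, data is the empty string.
            lappend lines "${indent}($tagset,$value) $data"
        }
    }
    return $lines
}

# Shortened excerpt of the record shown above.
set rec {
    { 1 numeric 14 string 2 }
    { 2 numeric 6 subtree
        { { 1 numeric 19 string {Five files of epicenter data arranged by ...} }
          { 3 string Format string {DIGITAL DATA SETS} } } }
    { 4 numeric 1 string ESDD0006 }
}
set lines [grs-dump $rec]
puts [join $lines \n]
```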

5.9 Explain

Explain records are retrieved like other records. The method getExplain is followed by an index and an optional Explain record pattern.

The returned record is a canonical representation of the Explain record. An ASN.1 sequence is represented as a list. Each item in the list consists of the name of the element, followed by its value if the value is supplied.

The optional pattern that follows the index after getExplain consists of one or more elements that are matched against the elements of the actual record.

Example

One of the few targets that support explain is the ATT research server at z3950.research.att.com.

The targetInfo record was returned by the target and it's stored in position 1 in the result set, z.1. To retrieve the whole record we must use

z.1 getExplain 1

and we get in return

{targetInfo commonInfo {name {Lucent Technologies Research Server}}
recentNews icon {namedResultSets 1} {multipleDBsearch 0}
{maxResultSets 100} {maxResultSize 600000} maxTerms timeoutInterval
{welcomeMessage {strings { {language eng}
{text
{Salutations - this is Lucent Technologies experimental Z39.50 server.
No guarentees, but free and unlimited access!}} } } }
{contactInfo {name {Robert Waldstein}} {description {strings
{ {language eng}
{text {Librarian system designer - no legal anythings}} } } }
{address {strings { {language eng} {text {Room 3D-591
600 Mountain Ave
Murray Hill
N.J. USA 07974}} } } } {email wald@lucent.com} {phone {908 582-6171}} }
description nicknames {usageRest {strings { {language eng}
{text {None - as long as nonProfit research}} } } } paymentAddr
{hours {strings { {language eng} {text {Should never be down}} } } }
dbCombinations addresses commonAccessInfo } 

The targetInfo above indicates that the record is really a targetInfo record. The commonInfo, which is optional, is not supplied by this server. The name, however, is supplied, with the value Lucent Technologies Research Server.

To retrieve the contactInfo from the record above we can extract the element from the record by using Tcl's list manipulation facilities, for example by doing

set ti [z.1 getExplain 1]
lindex [lindex $ti 0] 12

which will return

contactInfo {name {Robert Waldstein}} {description {strings
{ {language eng}
{text {Librarian system designer - no legal anythings}} }
} } {address {strings { {language eng} {text {Room 3D-591
600 Mountain Ave
Murray Hill
N.J. USA 07974}} } } } {email wald@lucent.com} {phone {908 582-6171}}

We can also extract almost the same by doing

z.1 getExplain 1 targetInfo contactInfo

which will return

{name {Robert Waldstein}} {description {strings { {language eng}
{text {Librarian system designer - no legal anythings}} } } }
{address {strings { {language eng} {text {Room 3D-591
600 Mountain Ave
Murray Hill
N.J. USA 07974}} } } } {email wald@lucent.com} {phone {908 582-6171}}

End of example
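Looking up elements by position, as with index 12 above, breaks as soon as a target supplies a different set of elements. A lookup by name is a small alternative, sketched below in plain Tcl. The name explain-value is hypothetical, and the sample list is a shortened version of the contactInfo items returned above.

```tcl
# Hypothetical helper (not part of IrTcl): find the item called "name"
# in a list of Explain items and return whatever follows its name.
# Returns an empty list if the item is absent or carries no value.
proc explain-value {items name} {
    foreach item $items {
        if {[lindex $item 0] eq $name} {
            return [lrange $item 1 end]
        }
    }
    return {}
}

# Shortened version of the contactInfo items shown above.
set contact {{name {Robert Waldstein}} {email wald@lucent.com} {phone {908 582-6171}}}
puts [lindex [explain-value $contact name] 0]    ;# prints: Robert Waldstein
puts [lindex [explain-value $contact phone] 0]   ;# prints: 908 582-6171
```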

