This section covers queries, searches, and the handling of result sets.
A search operation and its result set are described by the ir set object.
The ir set object is defined by the ir-set command, which has two
parameters. The first is the name of the new ir set object, and
the second, which is optional, is the name of an association -- an ir
object. The second argument is required if the ir set object should be able
to perform searches and presents. However, it is not required if
only ``local'' operations are done with the ir set object.
When the ir set object is created, a number of settings are inherited from the ir object, such as the selected databases, query type, etc. Thus, the ir object contains what we could call default settings.
Search requests are sent by the search action, which
takes a query as parameter. There are two types of queries,
RPN and CCL, controlled by the setting queryType.
A string representation is used for the query.
The CCL query is an uninterpreted octet-string which is parsed by the target. We refer to the standard: ISO 8777. Note that only a few targets actually support the CCL query and the interpretation of the standard may vary.
The prefix query notation (which is converted to RPN) offers a few operators. They are:
@attr list op
The attributes in list are applied to op.
@and op1 op2
Boolean and on op1 and op2.
@or op1 op2
Boolean or on op1 and op2.
@not op1 op2
Boolean not on op1 and op2.
@prox list op1 op2
Proximity operation on op1 and op2. Not implemented yet.
@set name
Result set reference.
@attrset set
Whole query uses the specified attribute set. If this operator is used it must be defined at the beginning of the query.
RPN queries are simple to build. The simplest query is a single search term, as in:
science
Boolean operators use the prefix notation (instead of the suffix/RPN), as in:
@and science technology
Search terms may be associated with attributes. These
attributes are indicated by the @attr operator.
Assuming the bib-1 attribute set, we can set the use-attribute
(type is 1) to title (value is 4):
@attr 1=4 science
Also, it is possible to apply attributes to a range of search terms.
In the query below, both search terms have use=title but the tech
term is right truncated:
@attr 1=4 @and @attr 5=1 tech beta
To search for the DatabaseInfo records from an Explain server, we could use
@attrset exp1 @attr 1=1 DatabaseInfo
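Because prefix queries are plain strings, they can be assembled with small Tcl procedures. The sketch below is only an illustration: the helper names pqf-attr, pqf-and and pqf-or are hypothetical and not part of the toolkit, and search terms are assumed to be single words. It rebuilds the right-truncation query shown above.

```tcl
# Hypothetical helpers for composing prefix queries as strings.
# They are not part of the toolkit; terms are assumed to be single words.

# Apply one attribute (type=value) to a term or sub-query.
proc pqf-attr {type value query} {
    return "@attr $type=$value $query"
}

# Combine two sub-queries with a Boolean operator.
proc pqf-and {q1 q2} { return "@and $q1 $q2" }
proc pqf-or  {q1 q2} { return "@or $q1 $q2" }

# Both terms get use=title (1=4); only "tech" is right truncated (5=1).
set query [pqf-attr 1 4 [pqf-and [pqf-attr 5 1 tech] beta]]
puts $query
```

The resulting string can be passed to the search action in the same way as a hand-written query.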
The settings that affect the search are listed below:
databaseNames list
Database-names.
smallSetUpperBound integer
Small set upper bound. Default 0.
largeSetLowerBound integer
Large set lower bound. Default 1.
mediumSetPresentNumber integer
Medium set present number. Default 0.
replaceIndicator boolean
Replace-indicator. Default true (1).
setName string
Name of result set. Default name of set is default.
queryType rpn|ccl
Query type-1 or query type-2. Default rpn (type-1).
preferredRecordSyntax string
Preferred record syntax -- UNIMARC, USMARC, etc.
smallSetElementSetNames string
Small-set-element-set names. If string is empty the element set is not set. Default is empty (not set).
mediumSetElementSetNames string
Medium-set-element-set names. If string is empty the element set is not set. Default is empty (not set).
nextResultSetPosition returns integer
Next result set position.
referenceId string
Reference-id. If string is empty no reference-id is used.
searchResponse list
Search-response Tcl script.
callback list
General response Tcl script. Only used if searchResponse is not specified. This setting is valid only for the ir object -- not the ir-set object.
Setting the databaseNames is mandatory. All other settings
have reasonable defaults.
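When several settings must be changed before a search, it can be convenient to keep them together in one list. The following is a minimal sketch under that assumption; apply-settings is a hypothetical helper, not a toolkit command. It relies only on the calling convention used in this section, where a setting is changed by invoking the object with the setting name followed by its arguments.

```tcl
# Hypothetical helper: apply setting/value pairs to an ir or ir set object.
# Each value is itself a list holding the arguments for that setting.
proc apply-settings {obj settings} {
    foreach {name value} $settings {
        # Invokes "<obj> <name> <arg> ..." for each pair.
        eval [list $obj $name] $value
    }
}

# Usage, assuming an ir set object named z.1 already exists:
#   apply-settings z.1 {
#       databaseNames {base-a base-b}
#       queryType rpn
#       preferredRecordSyntax USMARC
#   }
```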
The search-response handler, specified by the callback or
the searchResponse setting,
should read some of the settings shown below:
searchStatus returns boolean
Search-status. True if search operation was successful; false otherwise.
responseStatus returns list
Response status information.
resultCount returns integer
Result-count.
numberOfRecordsReturned returns integer
Number of records returned.
referenceId returns string
Reference-id of search response.
The responseStatus signals one of three conditions, which
is indicated by the value of the first item in the list:
NSD
Indicates that the target has returned one or
more non-surrogate diagnostic messages. The NSD item is followed by
a list with all non-surrogate messages. Each non-surrogate message consists
of three items. The first item is the error
code (integer); the next item is a textual representation of the error
code in plain English; the third item is additional information, possibly
empty if no additional information was returned by the target.
DBOSD
Indicates a successful operation where the target has returned one or more records. Each record may be either a database record or a surrogate diagnostic.
OK
Indicates a successful operation -- no records are returned from the target.
Example
We continue with the multiple-targets example.
The init-response procedure will attempt to make searches:
proc init-response {assoc} {
    puts "$assoc connected"
    ir-set ${assoc}.1 $assoc
    $assoc.1 queryType rpn
    $assoc.1 databaseNames base-a base-b
    $assoc callback [list search-response $assoc ${assoc}.1]
    $assoc.1 search "@attr 1=4 @and @attr 5=1 tech beta"
}
An ir set object is defined and told the name of its ir object. The ir set object uses the name of the ir object as prefix.
Then, the query-type is defined to be RPN, i.e. we will use the prefix query notation later on.
Two databases, base-a and base-b, are selected.
A search-response handler is defined with the
ir object and the ir-set object as parameters, and
the search is executed.
The first part of the search-response looks like:
proc search-response {assoc rset} {
    set status [$rset responseStatus]
    set type [lindex $status 0]
    if {$type == "NSD"} {
        set code [lindex $status 1]
        set msg [lindex $status 2]
        set addinfo [lindex $status 3]
        puts "NSD $code: $msg: $addinfo"
        return
    }
    set hits [$rset resultCount]
    if {$type == "DBOSD"} {
        set ret [$rset numberOfRecordsReturned]
        ...
    }
}
The response status is stored in variable status, and
the first element indicates the condition.
If non-surrogate diagnostics are returned they are displayed.
Otherwise, the search was a success and the number of hits
is read. Finally, it is tested whether the search response
returned records (database or diagnostic).
Note that we actually didn't inspect the search status (setting
searchStatus) to determine whether the search was successful or not,
because the standard specifies that one or more non-surrogate
diagnostics should be returned by the target in case of errors.
End of example
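The search-response procedure above prints only the first non-surrogate diagnostic. A target may return several, so a handler might want to walk all of them. The sketch below assumes the flat layout implied by the lindex calls in the example, i.e. NSD followed by code, message and additional information for each diagnostic in turn; nsd-messages is a hypothetical helper name.

```tcl
# Hypothetical helper: format every non-surrogate diagnostic in a
# responseStatus value. Assumes the flat layout
#   NSD code1 msg1 addinfo1 code2 msg2 addinfo2 ...
proc nsd-messages {status} {
    set out {}
    if {[lindex $status 0] ne "NSD"} {
        # Not a diagnostic response: nothing to report.
        return $out
    }
    foreach {code msg addinfo} [lrange $status 1 end] {
        lappend out "NSD $code: $msg: $addinfo"
    }
    return $out
}

# Usage inside a response handler:
#   foreach m [nsd-messages [$rset responseStatus]] { puts $m }
```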
If one or more records are returned from the target they will be stored in the result set object. In the case in which the search response contains records, it is very similar to the present response case. Therefore, some settings are common to both situations.
The present action sends a present request. The present action is
followed by two optional integers. The first integer is the
result-set starting position -- defaults to 1. The second integer
is the number of records requested -- defaults to 10.
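When a search yields more hits than one present request should fetch, the starting position and count for each request have to be computed. The following sketch shows one way to do that in plain Tcl; present-ranges is a hypothetical helper, and how the individual requests are sequenced (for example from a present-response handler) is left open here.

```tcl
# Hypothetical helper: split a result set of $hits records into
# {start count} pairs of at most $batch records each.
# Positions start at 1, matching the default of the present action.
proc present-ranges {hits {batch 10}} {
    set ranges {}
    for {set start 1} {$start <= $hits} {incr start $batch} {
        set count [expr {$hits - $start + 1}]
        if {$count > $batch} {
            set count $batch
        }
        lappend ranges [list $start $count]
    }
    return $ranges
}

# Each pair supplies the two optional integers of the present action,
# e.g. for the pair {11 10}:  $rset present 11 10
puts [present-ranges 25 10]
```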
The settings which could be modified before a present action are:
preferredRecordSyntax string
Preferred record syntax -- UNIMARC, USMARC, etc.
elementSetNames string
Element-set names. If string is empty the element set is not set. Default is empty (not set).
referenceId string
Reference-id. If string is empty no reference-id is used.
presentResponse list
Present-response Tcl script.
callback list
General response Tcl script. Only used if presentResponse is not specified. This setting is valid only for the ir object -- not the ir-set object.
The present-response handler should inspect the settings
shown in the list below.
Note that the responseStatus and numberOfRecordsReturned
settings were also used in the search-response case.
As in the search response case, records returned from the target are stored in the result set object.
presentStatus returns boolean
Present-status.
responseStatus returns list
Response status information.
numberOfRecordsReturned returns integer
Number of records returned.
nextResultSetPosition returns integer
Next result set position.
referenceId returns string
Reference-id of present response.
Search responses and present responses may result in
one or more records stored in the ir set object if
the responseStatus setting indicates database or
surrogate diagnostics (DBOSD). The individual
records, indexed by an integer position offset, should then be
inspected.
If element set names have been specified either in the
search requests (smallSetElementSetNames / mediumSetElementSetNames)
or present requests (elementSetNames), the individual records in the
ir set object are assigned appropriate element set ids.
In this mode, records at a given position are treated as different as
long as they have different element set ids.
To inspect records with a particular element set id in subsequent
operations, use the recordElements setting followed by the id.
If you have more than one record at a given position and you do not
use recordElements, the record selected at the given position
is undefined.
The action type followed by an integer returns information
about a given position in an ir set. There are three possibilities:
SD
The item is a surrogate diagnostic record.
There is no record at the specified position.
DB
The item is a database record.
To handle the first case, a surrogate diagnostic record, the
Diag action should be used. It returns three
items: error code (integer), text representation in plain English
(string), and additional information (string, possibly empty).
In the second case, no record, note that there still might
be a record at the position but with an id that differs from that
specified by recordElements.
In the third case, a database record, the recordType action should
be used. It returns the record type at the given position.
Some record types are:
UNIMARC
INTERMARC
CCF
USMARC
UKMARC
NORMARC
LIBRISMARC
DANMARC
FINMARC
SUTRS
Example
We continue our search-response example. In the DBOSD case,
we should inspect the result set items.
Recall that the ir set name was passed to the
search-response handler as argument rset.
if {$type == "DBOSD"} {
    set ret [$rset numberOfRecordsReturned]
    for {set i 1} {$i<=$ret} {incr i} {
        set itype [$rset type $i]
        if {$itype == "SD"} {
            # Surrogate diagnostic at position $i
            set diag [$rset Diag $i]
            set code [lindex $diag 0]
            set msg [lindex $diag 1]
            set addinfo [lindex $diag 2]
            puts "$i: SD $code: $msg: $addinfo"
        } elseif {$itype == "DB"} {
            # Database record at position $i
            set rtype [$rset recordType $i]
            puts "$i: type is $rtype"
        }
    }
}
Each item in the result set is examined.
If an item is a diagnostic message it is displayed; otherwise
if it's a database record its type is displayed.
End of example
When there is a MARC record at a given position, we
want to display it somehow. The action getMarc is what we need.
The getMarc action is followed by a position integer and the type of
extraction we want to make: field or line.
The field and line types are followed by three
parameters that serve as extraction masks.
They are called tag, indicator and field.
If the mask matches a tag/indicator/field of a record, the information
is extracted. Two characters have special meaning in masks: the
dot (any character) and star (any number of any character).
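To make the two special mask characters concrete, the sketch below emulates them in plain Tcl by translating a mask into a regular expression. This is only an illustration of the rules just described, not the toolkit's own matching code; mask-match is a hypothetical helper, and matching against the whole value (rather than a part of it) is an assumption.

```tcl
# Hypothetical helper: test a value against an extraction mask.
# "." matches any single character, "*" any number of any character.
# Assumption: the mask must match the whole value.
proc mask-match {mask value} {
    # Escape regexp metacharacters other than the dot and the star.
    regsub -all {[][\\^$+?(){}|]} $mask {\\&} re
    # The star stands for any number of any character.
    regsub -all {\*} $re {.*} re
    return [regexp "^$re\$" $value]
}

puts [mask-match 245 245]
puts [mask-match 2.5 245]
puts [mask-match * 10]
puts [mask-match 24. 260]
```

The first three calls report a match and the last one does not, since 260 does not start with 24.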
The field type returns one or more lists of field information
that match the mask specification. Only the content of fields
is returned.
The line type, on the other hand, returns a Tcl list that
completely describes the layout of the MARC record -- including
tags, fields, etc.
The field type is sufficient and efficient when only a
small number of fields are extracted and no
further processing (in Tcl) is necessary.
However, if the MARC record is to be edited or altered in any way, the
line extraction is more powerful -- only limited by the Tcl
language itself.
Example
Consider the record below:
001 11224466
003 DLC
005 00000000000000.0
008 910710c19910701nju 00010 eng
010 $a 11224466
040 $a DLC $c DLC
050 00 $a 123-xyz
100 10 $a Jack Collins
245 10 $a How to program a computer
260 1 $a Penguin
263 $a 8710
300 $a p. cm.
Assuming this record is at position 1 in ir-set z.1, we
might extract the title-field (245 * a) with the following command:
z.1 getMarc 1 field 245 * a
which gives:
{How to program a computer}
Using line instead of field gives:
{245 {10} {{a {How to program a computer}} }}
If we wish to extract the whole record as a list, we use:
z.1 getMarc 1 line * * *
giving:
{001 {} {{{} { 11224466 }} }}
{003 {} {{{} DLC} }}
{005 {} {{{} 00000000000000.0} }}
{008 {} {{{} {910710c19910701nju 00010 eng }} }}
{010 { } {{a { 11224466 }} }}
{040 { } {{a DLC} {c DLC} }}
{050 {00} {{a 123-xyz} }}
{100 {10} {{a {Jack Collins}} }}
{245 {10} {{a {How to program a computer}} }}
{260 {1 } {{a Penguin} }}
{263 { } {{a 8710} }}
{300 { } {{a {p. cm.}} }}
End of example
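The list returned by the line extraction can be turned back into a display similar to the record listing at the start of the example. The following is a minimal sketch that assumes the layout shown above, where each line is a list of tag, indicator and a list of code/data pairs; marc-format is a hypothetical helper, and the exact spacing of its output is a choice made here.

```tcl
# Hypothetical helper: format a MARC record in line (list) notation
# as one display string per tag.
proc marc-format {record} {
    set lines {}
    foreach line $record {
        lassign $line tag ind fields
        set text "$tag $ind"
        foreach f $fields {
            lassign $f code data
            if {$code eq ""} {
                # Control fields carry data without a subfield code.
                append text " $data"
            } else {
                append text " \$$code $data"
            }
        }
        lappend lines $text
    }
    return $lines
}

# Usage, assuming a MARC record at position 1 in ir set z.1:
#   foreach l [marc-format [z.1 getMarc 1 line * * *]] { puts $l }
```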
Example
This example demonstrates how Tcl can be used to examine a MARC record in the list notation.
The procedure extract-format makes an extraction of
fields in a MARC record based on a number of masks.
There are 5 parameters: r, a record in list notation;
tag, a regular expression to match the record tags;
ind, a regular expression to match indicators;
field, a regular expression to match fields;
and finally text, a regular expression to match the content of a field.
proc extract-format {r tag ind field text} {
    foreach line $r {
        if {[regexp $tag [lindex $line 0]] && \
            [regexp $ind [lindex $line 1]]} {
            foreach f [lindex $line 2] {
                if {[regexp $field [lindex $f 0]]} {
                    if {[regexp $text [lindex $f 1]]} {
                        puts [lindex $f 1]
                    }
                }
            }
        }
    }
}
To match comput followed by any number of characters in the
245 fields in the record from the previous example, we could use:
set r [z.1 getMarc 1 line * * *]
extract-format $r 245 .. . comput
which gives:
How to program a computer
End of example
The putMarc action does the opposite of getMarc. It
copies a record in Tcl list notation to an ir set object and is
needed if a result-set must be updated by a Tcl modified (user-edited)
record.
To retrieve a SUTRS record, use getSutrs followed by an index.
To retrieve an XML record, use getXml followed by an index.
A GRS-1 record is retrieved with the method getGrs, which
is followed by a record index and
optional specifiers that select a specific sub-tree. Each element
of the returned list consists of 5 items:
Tag set number.
Type of tag value. May be either numeric or string.
The tag value itself.
Data type. May be either octets, numeric, ext, string, bool, intUnit, empty, notRequested, diagnostic or subtree.
The data associated with the element, of the type
indicated before. If the data-type is numeric or string
then the data is encoded as a single Tcl token. The data-type bool
is encoded as 0 or 1 for false and true respectively. If the
data-type is subtree the data is a sub-list.
In all other cases, the data is the empty string.
Example
Consider the GRS-1 record below as shown by the YAZ client program:
(1,1) OID: GILS-schema
(1,14) 2
(2,1) UTAH EARTHQUAKE EPICENTERS
class=4,type=1,value=us
(4,52) UTAH GEOLOGICAL AND MINERAL SURVEY
(3,Local-Subject-Index) APPALACHIAN VALLEY; EARTHQUAKE; EPICENTER
(2,6)
(1,19) Five files of epicenter data arranged by ...
(3,Format) DIGITAL DATA SETS
(3,Data-Category) TERRESTRIAL
(3,Comments) Data are supplied by the University of Utah ...
(4,70)
(4,90)
(2,10) UTAH GEOLOGICAL AND MINERAL SURVEY
(4,2) 606 BLACK HAWK WAY
(4,3) SALT LAKE CITY
(3,State) UT
(3,Zip-Code) 84108
(2,16) USA
(2,14) (801) 581-6831
(4,7) UTAH EARTHQUAKE EPICENTERS
(4,1) ESDD0006
(1,16) 198903
The record may be fetched from the result set, z.1, at position 1
by using:
z.1 getGrs 1
which will return:
{ 1 numeric 1 oid 1.2.840.10003.13.2 }
{ 1 numeric 14 string 2 }
{ 2 numeric 1 string
{ UTAH EARTHQUAKE EPICENTERS} }
{ 4 numeric 52 string {UTAH GEOLOGICAL AND MINERAL SURVEY} }
{ 3 string Local-Subject-Index string
{APPALACHIAN VALLEY; EARTHQUAKE; EPICENTER} }
{ 2 numeric 6 subtree
{ { 1 numeric 19 string
{Five files of epicenter data arranged by ...} }
{ 3 string Format string {DIGITAL DATA SETS} }
{ 3 string Data-Category string TERRESTRIAL }
{ 3 string Comments string
{Data are supplied by the University of Utah ...} } } }
{ 4 numeric 70 subtree
{ { 4 numeric 90 subtree
{ { 2 numeric 10 string
{UTAH GEOLOGICAL AND MINERAL SURVEY} }
{ 4 numeric 2 string {606 BLACK HAWK WAY} }
{ 4 numeric 3 string {SALT LAKE CITY} }
{ 3 string State string UT }
{ 3 string Zip-Code string 84108 }
{ 2 numeric 16 string USA }
{ 2 numeric 14 string {(801) 581-6831} } } }
{ 4 numeric 7 string {UTAH EARTHQUAKE EPICENTERS} } } }
{ 4 numeric 1 string ESDD0006 }
{ 1 numeric 16 string 198903 }
We can choose only to get the path (2,6) by using:
z.1 getGrs 1 (2,6)
and we'll get:
{ 2 numeric 6 subtree { { 1 numeric 19 string
{Five files of epicenter data arranged by ...} }
{ 3 string Format string {DIGITAL DATA SETS} }
{ 3 string Data-Category string TERRESTRIAL }
{ 3 string Comments
string {Data are supplied by the University of Utah ...} } } }
To get the well known (1,19) within the subject (2,6) we use
z.1 getGrs 1 (2,6) (1,19)
and get:
{ 2 numeric 6 subtree
{ { 1 numeric 19 string
{Five files of epicenter data arranged by ...} } } }
End of example
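Since a GRS-1 record is an ordinary nested Tcl list, it can be traversed recursively. The sketch below produces one indented line per element in the (tag set, tag value) style of the listing at the start of the example. It assumes the five-item element layout described above; grs-lines is a hypothetical helper, and the sample data is a shortened piece of the (2,6) sub-tree.

```tcl
# Hypothetical helper: flatten a GRS-1 record (list notation) into
# indented "(tagset,tagvalue) data" lines, descending into subtrees.
proc grs-lines {record {depth 0}} {
    set out {}
    set indent [string repeat "  " $depth]
    foreach element $record {
        lassign $element tagset tagtype tagvalue datatype data
        if {$datatype eq "subtree"} {
            # A subtree has no data of its own; recurse into the sub-list.
            lappend out "${indent}(${tagset},${tagvalue})"
            foreach l [grs-lines $data [expr {$depth + 1}]] {
                lappend out $l
            }
        } else {
            lappend out "${indent}(${tagset},${tagvalue}) $data"
        }
    }
    return $out
}

# Shortened sample taken from the record above.
set sample {
    { 1 numeric 14 string 2 }
    { 2 numeric 6 subtree
        { { 3 string Format string {DIGITAL DATA SETS} }
          { 3 string Data-Category string TERRESTRIAL } } }
}
foreach l [grs-lines $sample] { puts $l }
```

With a live result set, the argument would instead be the value returned by getGrs for the desired position.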
Explain records are retrieved like other records. The method
getExplain is followed by an index and an optional
Explain record pattern.
The returned record is a canonical representation of the Explain record. An ASN.1 sequence is represented as a list. Each item in the list consists of the name of the element, followed by its value if the value is supplied.
The optional pattern that follows the index after getExplain
consists of one or more elements that are matched against the elements
of the actual record.
Example
One of the few targets that support explain is the ATT research server
at z3950.research.att.com.
The targetInfo record was returned by the target and it's stored in
position 1 in the result set, z.1. To retrieve the whole
record we must use
z.1 getExplain 1
and we get in return
{targetInfo commonInfo {name {Lucent Technologies Research Server}}
recentNews icon {namedResultSets 1} {multipleDBsearch 0}
{maxResultSets 100} {maxResultSize 600000} maxTerms timeoutInterval
{welcomeMessage {strings { {language eng}
{text
{Salutations - this is Lucent Technologies experimental Z39.50 server.
No guarentees, but free and unlimited access!}} } } }
{contactInfo {name {Robert Waldstein}} {description {strings
{ {language eng}
{text {Librarian system designer - no legal anythings}} } } }
{address {strings { {language eng} {text {Room 3D-591
600 Mountain Ave
Murray Hill
N.J. USA 07974}} } } } {email wald@lucent.com} {phone {908 582-6171}} }
description nicknames {usageRest {strings { {language eng}
{text {None - as long as nonProfit research}} } } } paymentAddr
{hours {strings { {language eng} {text {Should never be down}} } } }
dbCombinations addresses commonAccessInfo }
The targetInfo above indicates that the record is really a
targetInfo record. The commonInfo, which is optional, is
not supplied by this server. The name, however, is supplied,
with the value Lucent Technologies Research Server.
To retrieve the contactInfo
from the record above we can
extract the element from the record by using Tcl's list manipulation
facilities, for example by doing
set ti [z.1 getExplain 1]
lindex [lindex $ti 0] 12
which will return
contactInfo {name {Robert Waldstein}} {description {strings
{ {language eng}
{text {Librarian system designer - no legal anythings}} }
} } {address {strings { {language eng} {text {Room 3D-591
600 Mountain Ave
Murray Hill
N.J. USA 07974}} } } } {email wald@lucent.com} {phone {908 582-6171}}
We can also extract almost the same by doing
z.1 getExplain 1 targetInfo contactInfo
which will return
{name {Robert Waldstein}} {description {strings { {language eng}
{text {Librarian system designer - no legal anythings}} } } }
{address {strings { {language eng} {text {Room 3D-591
600 Mountain Ave
Murray Hill
N.J. USA 07974}} } } } {email wald@lucent.com} {phone {908 582-6171}}
End of example
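Extracting an element by its numeric position, as in the lindex call above, depends on the order in which a target supplies the elements. A lookup by name in plain Tcl avoids that. The following is a sketch under the representation described earlier, where each item is the element name followed by its value if one is supplied; explain-get is a hypothetical helper.

```tcl
# Hypothetical helper: look up an element by name in an Explain record
# given in list notation. The first item (the record type) is skipped.
# Returns the items following the name, or an empty list if the element
# is absent or carries no value.
proc explain-get {record name} {
    foreach item [lrange $record 1 end] {
        if {[lindex $item 0] eq $name} {
            return [lrange $item 1 end]
        }
    }
    return {}
}

# Usage, assuming a targetInfo record at position 1 in ir set z.1:
#   set ti [z.1 getExplain 1]
#   puts [explain-get [lindex $ti 0] contactInfo]
```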