NAME

Thesaurus::Httpd -- The httpd query standard for thesaurus interaction


DESCRIPTION

This document describes the URLs that a server can process from a client. Because these URLs provide functional access to the server's thesaurus, they are referred to as URL_functions in the following text. There is no difference between these URL_functions and the normal URLs handled by the server. Because parameters are passed to these URL_functions, the server must typically use an additional application to process these requests.


Command Syntax

All URL_functions conform to the following structure, in which BASE_URL identifies both the thesaurus server and a particular thesaurus served by that host. The BASE_URL therefore serves as the unique identifier for the thesaurus.

Format: BASE_URL?command=command_name&parameters

Example: http://ceres.ca.gov/cgi-bin/thesauri/Themes?match=exact&term=Ecosystems
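The structure above can be sketched on the client side as follows. This is a minimal illustration, not part of the standard: the helper name build_url_function is hypothetical, and Python's standard urllib.parse.urlencode is used only to join and percent-encode the name=value pairs.

```python
from urllib.parse import urlencode

def build_url_function(base_url, params):
    """Join a BASE_URL and a list of (name, value) pairs into one URL_function.

    `params` is a list of tuples rather than a dict so that the same
    parameter name can appear more than once.
    """
    if not params:
        return base_url
    return base_url + "?" + urlencode(params)

BASE_URL = "http://ceres.ca.gov/cgi-bin/thesauri/Themes"
url = build_url_function(BASE_URL, [("match", "exact"), ("term", "Ecosystems")])
print(url)
# http://ceres.ca.gov/cgi-bin/thesauri/Themes?match=exact&term=Ecosystems
```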


parameters

term=term_of_interest

This parameter is used to specify a term within the thesaurus. It is used for retrieval of known terms in the thesaurus.

type=type_of_term

This parameter specifies the return type(s) requested. It can be used to narrow match requests. The valid types are Descriptor, EntryTerm, Category, and Term, where Term stands for any of the specific term types. Multiple types can be specified with multiple type= parameters.

continue=continuation_parameter

The server may need to provide a way for a client to obtain the continuation of a request. For example, a client issues a request that would return many terms, and the server may wish to respond with only the first 100. In that case the thesaurus response contains a continuation parameter (a simple string). The client can call the same URL_function again, with the continuation parameter added, to retrieve the next set of terms.

It is suggested that the server return a number specifying the next term in the server's numeric sequence for the query.
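A client-side loop for following continuations might look like the sketch below. Everything in it is an assumption made for illustration: the names fetch_all and fake_fetch are hypothetical, the way the continuation value is extracted from a response is not defined here (so fetch is assumed to return it directly), and the stand-in server hands out two terms per response using the numeric continuation suggested above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

def fetch_all(base_url, params, fetch):
    """Collect terms across continuation pages.

    `fetch` is a hypothetical callable that takes a URL and returns
    (terms, continuation); continuation is None on the last page.
    """
    terms = []
    continuation = None
    while True:
        query = list(params)
        if continuation is not None:
            # Same URL_function as before, plus the continue= parameter.
            query.append(("continue", continuation))
        page, continuation = fetch(base_url + "?" + urlencode(query))
        terms.extend(page)
        if continuation is None:
            return terms

def fake_fetch(url):
    """Stand-in for a real server: serves five terms, two per response."""
    all_terms = ["Ecology", "Economics", "Ecoregions", "Ecosystems", "Ecotones"]
    query = parse_qs(urlsplit(url).query)
    start = int(query.get("continue", ["0"])[0])
    end = start + 2
    # The continuation is the index of the next term, or None when done.
    next_token = str(end) if end < len(all_terms) else None
    return all_terms[start:end], next_token

BASE_URL = "http://ceres.ca.gov/cgi-bin/thesauri/Themes"
result = fetch_all(BASE_URL, [("term", "Eco")], fake_fetch)
print(len(result))  # 5
```

Passing the fetch function in as an argument keeps the loop independent of any particular HTTP library or response parser.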


Commands


information

Required: Yes

Returns: text/HTML

Parameters:

Example: http://ceres.ca.gov/cgi-bin/thesauri/Themes

When the thesaurus is called without any parameters whatsoever, the server responds with information about the thesaurus. The result is an HTML document (the HTML version is unspecified). The following information must be included within the returned document.

Title

A short title for the thesaurus. This should be unique to this thesaurus.

Description

A general description of the thesaurus, its intended use and audience.

Contact

Contact information for the thesaurus. If there is a different contact for the thesaurus content, as opposed to its dissemination, that should be noted. At a minimum, a URL or email address is required.

BaseURL

The baseURL that is used for this thesaurus.

URLS supported

An indication of the server's support of optional URL_functions.

Optionally, the information returned can include:

Historical Notes

Notes regarding the historical changes to the thesaurus.

Statistics

Information regarding the number of Descriptors, EntryTerms, etc.

The information can also include anything else pertaining to the thesaurus that may be of interest to the user (links to a home page, related information, etc.).

More than one thesaurus may be served by the same HTML document.


Single Parameter Behavior

Parameters: identifier or term

Example: http://ceres.ca.gov/cgi-bin/thesauri/Themes?Ecosystems

When a single parameter is passed to the baseURL, the server treats it as a request for information about that specific term. The response will be a description of that term alone. Only one term is ever specified in this type of call.

The server should expect the client to identify the term either by its identifier or by the term itself. The server should either determine the nature of the request from the parameter itself, or attempt a match by identifier before a match by term. In cases where the term is not a unique specification (e.g. a multi-lingual thesaurus), the server may respond only to identifier requests.

If the identifier is not found in the thesaurus, the server should return a warning to that effect.
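The identifier-before-term rule can be sketched on the server side as below. The data layout is an assumption made for illustration only: by_identifier maps hypothetical identifiers to term records, by_term maps term text to the identifiers that carry it, and the function name lookup_single is likewise hypothetical. An ambiguous term yields no result, so the caller would return a warning.

```python
from urllib.parse import unquote, urlsplit

def lookup_single(parameter, by_identifier, by_term):
    """Resolve a single bare parameter: identifier first, then term text.

    Returns the term record, or None when the parameter is unknown or the
    term text is ambiguous (the caller should then return a warning).
    """
    if parameter in by_identifier:
        return by_identifier[parameter]
    identifiers = by_term.get(parameter, [])
    if len(identifiers) == 1:
        return by_identifier[identifiers[0]]
    return None

# Hypothetical thesaurus content; "Bank" is deliberately not unique.
by_identifier = {
    "T42": {"id": "T42", "term": "Ecosystems"},
    "T7": {"id": "T7", "term": "Bank"},
    "T8": {"id": "T8", "term": "Bank"},
}
by_term = {"Ecosystems": ["T42"], "Bank": ["T7", "T8"]}

url = "http://ceres.ca.gov/cgi-bin/thesauri/Themes?Ecosystems"
parameter = unquote(urlsplit(url).query)  # the bare text after '?'
record = lookup_single(parameter, by_identifier, by_term)
print(record["id"])  # T42
```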


match

Parameters: match=string term|match=string type=type_of_term

Example: http://ceres.ca.gov/cgi-bin/thesauri/Themes?match=sql&term=Eco%25 (the SQL wildcard % is percent-encoded as %25 in the URL)

The match URL_function is used by a client to receive multiple terms from the thesaurus. The term= parameter is used by the client to specify a search string for the server.

The match command has some additional syntax. The match command may be sub-specified with a match parameter. The values may be unique to the server, but the following behaviors are specified, and some are required.

match=default REQUIRED

This is the default behavior of the match function, and what is used if no match parameter is specified.

The term parameters are strings and the server must respond with terms that may share similar concepts to those of the parameters. How the server determines this is unspecified. No characters of the term parameters are considered 'special'; they are treated as simple text strings.

The server decides how to define 'similar' in its thesaurus.

Multiple term= parameters should return an RDF document that contains matches to all the passed concepts.

match=exact REQUIRED

The terms returned must exactly match the term parameters specified.

match=start REQUIRED

This match parameter allows the server to limit terms to those that would help a user begin browsing the thesaurus.

The server's response may or may not be affected by any term parameters, but the server should behave responsibly if no term parameters are included in the request. If the server's response is affected by term parameters, that should be stated in the thesaurus information.

match=all OPTIONAL

The server may respond with all the terms in the thesaurus (modified by any type parameters). The server may choose to implement this only for specific term types (e.g. Category) or not implement this at all.

The server behavior is not modified by any term parameters. Multiple type= parameters should return all requested term types.

Since this is an optional function, the server need not respond to this request, or may respond only with a subset of term types. If the client makes a request that is not supported by the server, the server should say so in its response.

match=sql OPTIONAL

Term parameters are matched to terms using SQL LIKE-style pattern matching.

match=glob OPTIONAL

Term parameters are matched to terms using Unix glob-style matching.

match=regexp OPTIONAL

Term parameters are matched to terms using Unix regexp-style pattern matching.
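The pattern-based match values could be implemented along the lines of the sketch below. It is only one possible reading: the function names are hypothetical, matching is assumed to be case-sensitive, a regexp is assumed to match anywhere in a term unless it is anchored, and the SQL translation handles only the % and _ wildcards without an escape character. The default, start and all behaviors are left out because they are server-defined.

```python
import fnmatch
import re

def like_to_regex(pattern):
    """Translate an SQL LIKE pattern into a regex meant for fullmatch().

    '%' matches any run of characters and '_' matches exactly one;
    every other character is taken literally.
    """
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)

def match_terms(mode, pattern, terms):
    """Return the terms selected by one term parameter under a match mode."""
    if mode == "exact":
        return [t for t in terms if t == pattern]
    if mode == "sql":
        rx = like_to_regex(pattern)
        return [t for t in terms if rx.fullmatch(t)]
    if mode == "glob":
        return [t for t in terms if fnmatch.fnmatchcase(t, pattern)]
    if mode == "regexp":
        rx = re.compile(pattern)
        return [t for t in terms if rx.search(t)]
    raise ValueError("unsupported match mode: " + mode)

terms = ["Ecology", "Ecosystems", "Economics", "Forests"]
print(match_terms("sql", "Eco%", terms))    # ['Ecology', 'Ecosystems', 'Economics']
print(match_terms("glob", "Eco*s", terms))  # ['Ecosystems', 'Economics']
```

Raising an error for an unknown mode gives the server a single place to report that an optional match value is not supported.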

Multiple term= parameters should return an RDF document that contains matches to all the passed concepts.

Multiple match= parameters are not allowed.

Multiple type= parameters should return all requested term types.
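The three rules above can be checked when the server parses a request, as in the following sketch. The function name parse_match_request and the returned dictionary layout are hypothetical, and rejecting an unknown type= value with an error is an assumption, since this document does not say how such a value should be handled.

```python
from urllib.parse import parse_qs

VALID_TYPES = {"Descriptor", "EntryTerm", "Category", "Term"}

def parse_match_request(query_string):
    """Split a query string into the match mode, the terms and the types."""
    params = parse_qs(query_string)
    # No match= parameter means the default match behavior.
    matches = params.get("match", ["default"])
    if len(matches) > 1:
        raise ValueError("multiple match= parameters are not allowed")
    types = params.get("type", [])
    unknown = [t for t in types if t not in VALID_TYPES]
    if unknown:
        # Assumption: unknown types are reported rather than ignored.
        raise ValueError("unknown type: " + ", ".join(unknown))
    return {"match": matches[0], "terms": params.get("term", []), "types": types}

request = parse_match_request(
    "match=exact&term=Ecosystems&term=Forests&type=Descriptor&type=Category"
)
print(request["terms"])  # ['Ecosystems', 'Forests']
print(request["types"])  # ['Descriptor', 'Category']
```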


RDF Document Structure

See Thesaurus for a description of the RDF format of response to these queries.


Unresolved Issues

timeHack (to get terms modified since some date): I can see some difficulty in this, since I don't know how to keep track of relationships that have gone away, other than to ping both Terms, which might work I suppose.