Thesaurus::Httpd -- The httpd query standard for thesaurus interaction
This document describes the URLs that a server can process from a client. Because these URLs provide functional access to the server's thesaurus, they are referred to as URL_functions in the following text. There is no difference between these URL_functions and the normal URLs handled by the server. Because parameters are passed to these URL_functions, the server must typically use an additional application to process the requests.
All URL_functions conform to the following structure, in which BASE_URL identifies both the thesaurus server and a thesaurus served by that host. The BASE_URL therefore serves as a unique identifier for the thesaurus.
Format: BASE_URL?command=command_name&parameters
Example: http://ceres.ca.gov/cgi-bin/thesauri/Themes?match=exact&term=Ecosystems
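The format above can be illustrated with a small client-side sketch. It is not part of the standard: the helper name build_url_function is hypothetical, and the sketch only assumes that parameters are ordinary name=value pairs joined with &, as in the example URL.

```python
from urllib.parse import urlencode


def build_url_function(base_url, params=()):
    """Return base_url with params appended as a query string.

    params is a sequence of (name, value) pairs rather than a dict,
    so that a parameter name can appear more than once.
    (Hypothetical helper, not defined by the standard.)
    """
    query = urlencode(list(params))
    return f"{base_url}?{query}" if query else base_url


url = build_url_function(
    "http://ceres.ca.gov/cgi-bin/thesauri/Themes",
    [("match", "exact"), ("term", "Ecosystems")],
)
print(url)
```

Calling the helper with no parameters returns the BASE_URL unchanged.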
The term parameter specifies a term within the thesaurus. It is used to retrieve known terms from the thesaurus.
The type parameter specifies the return type(s) requested. It can be used to narrow match requests. The valid types are Descriptor, EntryTerm, Category, and Term, where Term stands for any of the specific term types. Multiple types can be requested with multiple type= parameters.
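As a rough sketch of how repeated type= parameters might be produced by a client and read back by a server, the following uses Python's standard urllib.parse module. The function name build_query and the choice to reject unknown type values on the client side are assumptions made for illustration; the standard does not prescribe either.

```python
from urllib.parse import parse_qs, urlencode

# The four type values listed in the text above.
VALID_TYPES = ("Descriptor", "EntryTerm", "Category", "Term")


def build_query(match, term, types=()):
    """Build a query string with one type= parameter per requested type.

    (Hypothetical helper; rejecting unknown types here is an assumption.)
    """
    unknown = [t for t in types if t not in VALID_TYPES]
    if unknown:
        raise ValueError(f"unknown type value(s): {unknown}")
    pairs = [("match", match), ("term", term)]
    pairs += [("type", t) for t in types]
    return urlencode(pairs)


# Client side: request Descriptors and Categories only.
query = build_query("exact", "Ecosystems", ["Descriptor", "Category"])
print(query)

# Server side: parse_qs collects repeated names into a list.
requested_types = parse_qs(query).get("type", [])
print(requested_types)
```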
The server may need to give a client a way to obtain the continuation of a request. For example, a client issues a request that would return many terms, and the server chooses to respond with only the first 100. The thesaurus response then contains a continuation parameter (a simple string). The client can call the same URL_function again, adding the continuation parameter, to retrieve the next set of terms.
It is suggested that the server return a number specifying the next term in the server's numeric sequence for the query.
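The continuation exchange can be sketched as a client loop. Several things here are assumptions rather than parts of the standard: the query parameter is named continuation, the callable fetch_page is a hypothetical stand-in for issuing the HTTP request and extracting the terms and continuation string from the response, and the stand-in server uses the suggested numeric index as its continuation value.

```python
def fetch_all_terms(fetch_page, params):
    """Collect terms across continuation pages.

    fetch_page is a hypothetical callable: it takes a list of
    (name, value) pairs and returns (terms, continuation), where
    continuation is None once nothing is left to fetch.
    """
    terms = []
    continuation = None
    while True:
        request = list(params)
        if continuation is not None:
            # Repeat the same request, adding the continuation string.
            request.append(("continuation", continuation))
        page, continuation = fetch_page(request)
        terms.extend(page)
        if continuation is None:
            return terms


# Stand-in server: 250 terms, at most 100 per response. The
# continuation string is the index of the next term to return.
ALL_TERMS = [f"term{i}" for i in range(250)]


def fake_fetch_page(request):
    start = int(dict(request).get("continuation", "0"))
    end = min(start + 100, len(ALL_TERMS))
    next_token = str(end) if end < len(ALL_TERMS) else None
    return ALL_TERMS[start:end], next_token


result = fetch_all_terms(fake_fetch_page, [("match", "exact"), ("term", "Ecosystems")])
print(len(result))
```

Treating the continuation value as an opaque string on the client side keeps the loop independent of how a particular server numbers its results.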
Required: Yes
Returns: text/HTML
Parameters:
Example: http://ceres.ca.gov/cgi-bin/thesauri/Themes
When the thesaurus is called without any parameters whatsoever, the server responds with information about the thesaurus. The result is an HTML document (the version is unspecified). The following information must be included in the returned document.
A short title for the thesaurus. This should be unique to this thesaurus.
A general description of the thesaurus, its intended use, and its audience.
Contact information for the thesaurus. If the contact for the thesaurus content differs from the contact for its dissemination, that should be noted. At minimum, a URL or email address is required.
The baseURL that is used for this thesaurus.
An indication of the server's support of optional URL_functions.
Optionally, the information returned can include:
Notes regarding the historical changes to the thesaurus.
Information regarding the number of Descriptors, EntryTerms, etc.
The information can also include anything else about the thesaurus that may be of interest to the user (links to a home page, related information, etc.).
More than one thesaurus may be served by the same HTML document.
Parameters: identifier or term
Example: http://ceres.ca.gov/cgi-bin/thesauri/Themes?Ecosystems
When a single parameter is passed to the baseURL, the server treats it as a request for information about that specific identifier. The response is a description of that term alone. Only one term is ever specified in this type of call.
The server should expect the client to identify the term either by its identifier or by the term itself. The server should either determine the nature of the request from the parameter itself, or attempt a match by identifier before a match by term. In cases where the term is not a unique specification (e.g. a multilingual thesaurus), the server may respond only to identifier requests.
If the identifier is not found in the thesaurus, the server should return a warning to that effect.
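A minimal server-side sketch of the "identifier first, then term" lookup described above follows. The in-memory dictionary, the identifiers T001 to T003, and the function name find_term are all hypothetical; returning nothing for an ambiguous term is one possible reading of "may respond only to identifier requests".

```python
from urllib.parse import unquote, urlsplit


def find_term(thesaurus_by_id, key):
    """Resolve key as an identifier first, then as term text.

    Returns (identifier, record), or None when the key is unknown or
    when the term text is ambiguous. (Hypothetical helper.)
    """
    if key in thesaurus_by_id:
        return key, thesaurus_by_id[key]
    matches = [(ident, rec) for ident, rec in thesaurus_by_id.items()
               if rec["term"] == key]
    if len(matches) == 1:
        return matches[0]
    return None  # caller should answer with a "not found" warning


# Hypothetical data: two entries share the same term text.
THESAURUS = {
    "T001": {"term": "Ecosystems"},
    "T002": {"term": "Habitat"},
    "T003": {"term": "Habitat"},
}

# The single bare parameter is the whole query string of the request.
key = unquote(urlsplit("http://ceres.ca.gov/cgi-bin/thesauri/Themes?Ecosystems").query)
print(find_term(THESAURUS, key))
```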
Parameters: match=string term|match=string type=type_of_term
Example: http://ceres.ca.gov/cgi-bin/thesauri/Themes?match=sql&term=Eco%
The match URL_function lets a client receive multiple terms from the thesaurus. The term= parameter lets the client specify a search string for the server.
The match command has some additional syntax: it may be sub-specified with a match parameter. The values may be unique to the server, but the following behaviors are specified, and some are required.
This is the default behavior of the match function, and what is used if no match parameter is specified.
The term parameters are strings and the server must respond with terms that may share similar concepts to those of the parameters. How the server determines this is unspecified. No characters of the term parameters are considered 'special'; they are treated as simple text strings.
The server decides how to define 'similar' in its thesaurus.
Multiple term= parameters should return an RDF document that contains matches to all the passed concepts.
The term parameters must match exactly the terms specified.
This match parameter allows the server to limit terms to those that would help a user begin browsing the thesaurus.
The server's response may or may not be affected by any term parameters, but the server should behave responsibly if no term parameters are included in the request. If the server's response is affected by term parameters, that should be noted in the information about the thesaurus.
The server may respond with all the terms in the thesaurus (modified by any type parameters). The server may choose to implement this only for specific term types (e.g. Category) or not to implement it at all.
The server behavior is not modified by any term parameters. Multiple type= parameters should return all requested term types.
Since this is an optional function, the server need not respond to this request, or may respond only with a subset of term types. If the client makes a request that is not supported by the server, the server should say so in its response.
term parameters are matched to terms using SQL-style LIKE pattern matching.
term parameters are matched to terms using Unix glob-style pattern matching.
term parameters are matched to terms using Unix regexp-style pattern matching.
Multiple term= parameters should return an RDF document that contains matches to all the passed concepts.
Multiple match= parameters are not allowed.
Multiple type= parameters should return all requested term types.
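The three pattern styles above can be sketched with Python's standard re and fnmatch modules. Only the value sql appears in an example in this document; the values glob and regexp used below are assumed names for the other two styles. The sketch also assumes case-sensitive matching, that LIKE and glob patterns must cover the whole term, and that a regexp may match anywhere in the term; a real server may decide each of these differently.

```python
import fnmatch
import re


def like_to_regex(pattern):
    """Translate a SQL LIKE pattern into a regular expression.

    % matches any run of characters and _ matches exactly one;
    everything else is taken literally.
    """
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return "".join(parts)


def match_terms(style, pattern, terms):
    """Return the terms that match pattern under the given style."""
    if style == "sql":
        rx = re.compile(like_to_regex(pattern), re.DOTALL)
        return [t for t in terms if rx.fullmatch(t)]
    if style == "glob":  # assumed value name
        return [t for t in terms if fnmatch.fnmatchcase(t, pattern)]
    if style == "regexp":  # assumed value name
        rx = re.compile(pattern)
        return [t for t in terms if rx.search(t)]
    raise ValueError(f"unsupported match style: {style}")


TERMS = ["Ecosystems", "Ecology", "Economics", "Habitat"]
print(match_terms("sql", "Eco%", TERMS))
print(match_terms("glob", "Eco*s", TERMS))
print(match_terms("regexp", "^Eco.*y$", TERMS))
```

When such patterns are sent in a URL, a literal % in the query string has to be percent-encoded as %25, so a client would normally build the request with a URL-encoding routine rather than by hand.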
See Thesaurus for a description of the RDF format of response to these queries.
timeHack ( To get terms modified since some date ) - I can see some difficulty in this since I don't know how to keep track of relationships that have gone away, save to ping both Terms, which might work I suppose.