From: ····@pobox.com
Subject: Platform-independent higher-order distributed SQL database interface
Date: 
Message-ID: <72nj6p$9n$1@nnrp1.dejanews.com>
This article describes a higher-order Scheme interface to SQL
databases. The interface is similar in spirit to ChezSybase or
WB-tree. The major distinction is that the interface and its
implementation are database- and platform-independent. It can be used
with any database that provides command-line SQL access, on any
platform that supports POSIX pipes, and on any Scheme system that has
some POSIX interface. Furthermore, the database access tool -- let
alone the database server -- doesn't have to be available on the same
computer that runs the client Scheme application, as the last
paragraphs of this article show. The interface is thus truly
distributed.

The familiar X/Open SQL Call Level Interface (CLI), which is part of
ODBC, is rather low level. You submit a query, get a 'result-set', and
keep calling its 'next-row' method until the result-set is
exhausted. Scheme, as a higher-order language, can do iterations
better, without handing over the internal iteration context -- the
result set -- to an application. If users never get hold of the
result-set, they have no chance to corrupt it, pass it to the wrong
methods, or use it at the wrong time. Therefore, a database access
driver never has to bother checking the validity of a result set it
obtained from the user.

Examples:
                ; find the tables whose names match a 'table-pat-name'
(let ((table-names-rows       ; assoc list of table names and # rows
        (DB:for-each
          (lambda (tab-name nrows)
            (cout "Table " tab-name " has " nrows " row(s)\n")
            (assert (string->number nrows))
            (cons tab-name (string->number nrows)))
          "select trim(N.dbsname) || ':' || N.owner || '.' || N.tabname, "
          " I.ti_nrows from sysmaster:informix.systabnames N,"
          " sysmaster:informix.systabinfo I "
          "where N.partnum = I.ti_partnum AND tabname LIKE '"
          table-pat-name "' ORDER BY 2;"))) ...)

; Execute a query as specified by the query-strings,
; and format the reply in a set of HTML OPTIONS strings.
; The query is expected to return one or several rows
; with two columns: an 'id' and a 'descr'
; where descr is some string associated with an id.
; For each returned row, we write out an HTML OPTION tag
; <OPTION VALUE=id>descr
(define (query->OPTIONS . query-strings)
  (apply DB:for-each
    `(,(lambda (id descr)
      (cout "\n<OPTION VALUE=\"" id "\">" descr)
      '())
    ,@query-strings)))

(cout "<H3>You are currently subscribed to</H3><SELECT NAME=cid SIZE=10>"
  (lambda ()
    (query->OPTIONS
      "SELECT class_id, descr FROM ClassesOfThings WHERE class_id IN "
      (list-intersperse (cgi#subscribed :as-list) ",") ";" ))
 "</SELECT>\n<BR>You may select a channel and ...")

The interface consists of one major procedure, DB:for-each, and a few
minor variants optimized for common special cases: DB:for-singleton,
which expects a query to return at most one row, and DB:assoc-val,
which anticipates at most one row and one column in the result. There
is also a procedure DB:imperative-stmt, which does not expect any
result at all; it is to be used for INSERT, UPDATE, SET, etc., and for
DDL statements.
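A hedged sketch of how these variants might be used. The table and
column names below are invented for illustration, and the argument
conventions are assumed to parallel those of DB:for-each; see
db-util.scm for the authoritative signatures.

```scheme
; DB:for-singleton: a query expected to return at most one row.
; The handler receives the columns of that row, if any.
(DB:for-singleton
  (lambda (name elevation)                  ; one row, two columns
    (cout "Airport " name " at " elevation " ft\n"))
  "SELECT name, elevation FROM Airports "   ; hypothetical table
  "WHERE call_sign = 'KSFO';")

; DB:assoc-val: at most one row with one column -- a single scalar.
(define airport-count
  (DB:assoc-val "SELECT COUNT(*) FROM Airports;"))

; DB:imperative-stmt: no result expected; for DML and DDL.
(DB:imperative-stmt
  "UPDATE Airports SET elevation = 13 WHERE call_sign = 'KSFO';")
```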

The major DB:for-each procedure is similar to the 'for-each' or 'map'
primitives. Similar, but not identical: the list returned by
DB:for-each may contain fewer elements than the number of rows
processed, which in turn may be fewer than the number of rows in the
result table.
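For instance, the query->OPTIONS handler above returns '() for every
row; a handler can just as well keep some rows and drop others. A
plausible sketch, assuming (as the examples above suggest) that a
handler's '() results are not accumulated into the returned list:

```scheme
; Collect only non-empty tables: every row of the result table is
; processed, but rows whose handler yields '() contribute nothing
; to the list that DB:for-each returns.
(DB:for-each
  (lambda (tab-name nrows)
    (if (equal? nrows "0")
        '()                                  ; skip this row
        (cons tab-name (string->number nrows))))
  "SELECT tabname, ti_nrows FROM SomeCatalog;")  ; hypothetical query
```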

        http://pobox.com/~oleg/ftp/Scheme/db-util.scm
contains the full description and the implementation. The
implementation can either pool database connections or establish them
anew for each session. The latter is the default; the pieces of code
that implement connection pooling are marked with special comments.

The implementation relies on a database command-line access tool that
can read SQL statements from its standard input (a pipe) and is
capable of 'unloading' the result table into a file (which here is a
named pipe as well). Almost every RDBMS has such a primitive
command-line access tool. A query result table is literally 'streamed'
into a pipe and processed by DB:for-each as a stream of column
values. This is somewhat similar to JDBC 2.0.
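In outline, the driver's side of this protocol might look as
follows. This is only a sketch: open-output-pipe, read-line and
string-split stand in for whatever POSIX-pipe and string utilities the
host Scheme system provides, and the real implementation in db-util.scm
additionally handles errors, quoting and connection pooling.

```scheme
; Sketch of the streaming protocol, Informix dbaccess flavor:
; 1. Pipe the SQL statement into the access tool's stdin, asking it
;    to UNLOAD the result table into a named pipe.
; 2. Read that named pipe back, one '|'-delimited row per line,
;    applying proc-row to each row's list of column strings.
(define (run-query-sketch query-string proc-row)
  (let ((to-db (open-output-pipe "dbaccess DBname - > /tmp/error-log")))
    (display "UNLOAD TO '/tmp/sql-pipe' " to-db)
    (display query-string to-db)
    (close-output-port to-db))            ; the tool now runs the query
  (let ((from-db (open-input-file "/tmp/sql-pipe"))) ; a named pipe
    (let loop ((line (read-line from-db)) (acc '()))
      (if (eof-object? line)
          (begin (close-input-port from-db) (reverse acc))
          (loop (read-line from-db)
                (cons (proc-row (string-split line #\|)) acc))))))
```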

        http://pobox.com/~oleg/ftp/Scheme/vdbaccess.scm
is the validation code. It takes advantage of the fact that every
RDBMS has a special database that describes its own tables. Alas, each
RDBMS names this database differently; in Informix it is called
sysmaster. If you use the vdbaccess.scm code with another RDBMS, you
need to make trivial adjustments. The validation code runs several
queries on the sysmaster database, which are expected to return
several rows, a single row, or no rows at all. When there are several
ways to obtain a particular result, all of them are tried and the
results compared. The validation code also tries to run queries on
non-existent tables and columns, as well as queries that are
syntactically invalid.

        http://pobox.com/~oleg/ftp/Scheme/index.html#databases
has more details and references. This page points to complete working
code, which you can actually see in action.

I have used this system on Sun/Solaris with Illustra and Informix, and
on HP-UX and Linux with Informix IDS. In the latter case, the Informix
database server ran on a different computer (an HP-UX box). Also,
there is no Linux version of dbaccess, Informix's command-line SQL
tool. That was not a problem, however: rather than opening a "file"
        | dbaccess DBname - > /tmp/error-log
I merely had to open a "file"
        | rsh another-computer "dbaccess DBname - > /tmp/error-log"
write
        UNLOAD to '/tmp/sql-from' SELECT ...
into it, open another "file"
        rsh another-computer "cat /tmp/sql-from" |
and read from it. /tmp/sql-from is of course a named pipe. One can use
ssh for additional security and authentication.

This system has been in use for almost a year to manage a database of
weather observation reports, forecasts, and advisories for the entire
globe. Performance appears quite satisfactory: for example, decoding
661 METAR reports took 31 seconds on an HP J210 (HP-PA, 120 MHz). This
time includes reading the reports, decoding them, writing logs, etc.,
and 661 database queries for airport call signs. The decoder is
written in Gambit Scheme and is interpreted. This shows that the
database access overhead is _far_ less than 50 ms per query.


-----------== Posted via Deja News, The Discussion Network ==----------
http://www.dejanews.com/       Search, Read, Discuss, or Start Your Own