contrib/rdkit is a PostgreSQL contribution module, which implements 
*mol* datatype to describe molecules and *fp* datatype for fingerprints,  
basic comparison operations, similarity operation (tanimoto, dice) and  index 
support (using GiST indexing framework).

Compatibility: PostgreSQL 8.4+
 If using PostgreSQL packages from your linux distribution, be sure
 that you have the -devel package installed.

Installation:

 Module assumes that the RDKit is installed in $RDBASE and that $BOOSTHOME
 points to the directory where boost is installed.
 (check Makefile).

 cd $RDBASE/Code/PgSQL/rdkit
 make && make install && make installcheck 
 
For versions of PostgreSQL < 9.1:
 To install rdkit cartridge into the database DB: 
   psql DB < rdkit.sql

 Uninstall cartridge from database DB:
   psql DB < uninstall_rdkit.sql

For versions of PostgreSQL >= 9.1:
 To install rdkit cartridge into the database DB: 
   psql -c 'CREATE EXTENSION rdkit' DB

 Uninstall cartridge from database DB:
   psql -c 'DROP EXTENSION rdkit CASCADE' DB

Data types:

Input/output functions for *mol* datatype follow SMILES format.
Input/output functions for *fp* datatype follow bytea format.

It's possible to use implicit cast mol to fp, i.e., mol::fp

Functions:

double tanimoto_sml(mol,mol) - calculates tanimoto similarity
double tanimoto_sml(fp,fp)

double dice_sml(mol,mol)     - calculates dice similarity
double dice_sml(fp,fp)     

bool substruct(mol a, mol b)  - returns TRUE if second argument is a 
                                substructure of the first argument.

size(fp) - calculates length of fingerprint


GUC variables rdkit.tanimoto_threshold, rdkit.dice_threshold should be
defined, see shared_preload_libraries', 'custom_variable_classes' 
instructions in postgresql.conf. Default value for both of them is 0.5. 
We recommend to initialize them via  'shared_preload_libraries'.

Operations:

  mol % mol, fp % fp  - returns TRUE if tanimoto similarity between operands is 
                        lower than rdkit.tanimoto_threshold
  
  mol # mol, fp # fp  - returns TRUE if dice similarity between operands is 
                        lower than rdkit.dice_threshold
 
  mol @> mol          - returns TRUE if left operand contains right one
  mol <@ mol          - returns TRUE if left operand contained in right one
 
  comparison operations ( <, <=, =, >=, > )  were implemented using memcpy()

Indexes:

  Btree and Hash indexes support comparison operations for mol and fp 
  datatypes. 
  Example:  CREATE INDEX molidx ON pgmol (mol);
            CREATE INDEX molidx ON pgmol (fp);

  GiST index over mol supports %,#, @>, <@ operations.
  GiST index over fp supports %,# operations.
  Example:  CREATE INDEX molidx ON pgmol USING gist (mol);


Example:

  See sql/* for examples.
