# @HEADER
#
########################################################################
#
#  Zoltan Toolkit for Load-balancing, Partitioning, Ordering and Coloring
#                  Copyright 2012 Sandia Corporation
#
# Under the terms of Contract DE-AC04-94AL85000 with Sandia Corporation,
# the U.S. Government retains certain rights in this software.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are
# met:
#
# 1. Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
#
# 2. Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution.
#
# 3. Neither the name of the Corporation nor the names of the
# contributors may be used to endorse or promote products derived from
# this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY SANDIA CORPORATION "AS IS" AND ANY
# EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
# PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL SANDIA CORPORATION OR THE
# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
# PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
# LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
# NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
# SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#
# Questions? Contact Karen Devine	kddevin@sandia.gov
#                    Erik Boman	        egboman@sandia.gov
#
########################################################################
#
# @HEADER
ZOLTAN Distributed Directory (DD) Software Documentation


The owner (i.e. the processor number) of any computational object is
subject to change during computation as a result of load balancing.
An application may use this directory module to find objects when it
needs them. A distributed directory balances the load (in terms of
memory and processing time) and avoids the bottle neck of a centralized
directory design.

This distributed directory module may be used alone or in conjunction
with Zoltan's load balancing capability and memory and communication
services.  The user should note that external names (subroutines, etc.)
which prefaced by Zoltan_DD_ are reserved when using this module.

The user initially creates an empty distributed directory using
Zoltan_DD_Create. Then GID information is added to the directory using
Zoltan_DD_Update. Zoltan_DD_Update is also called after data migration
or refinements.  The directory maintains the global ID's basic
information: local ID (optional), partition (optional), arbitrary user
data (optional), and current data owner. Zoltan_DD_Find returns this
information for a list of global IDs.  When the user has finished using
the directory, its memory is returned to the system by Zoltan_DD_Destroy.
When necessary, a selected list of global IDs may be removed from the
directory by Zoltan_DD_Remove.

An object is known by its global ID.  Hashing provides very fast
lookup for the information associated with a global ID in a two step
process.  The first hash of the global ID yields the processor number
owning the directory entry for that global ID.  The directory entry
owner remains constant even if the object (global ID) migrates in time.
Second, a different hash algorithm of the global ID looks up the
associated information in directory processor's hash table.  The user
may optionally register their own (first) hash function to take
advantage of their knowledge of their global ID naming scheme and the
global ID's neighboring processors. See the documentation for
Zoltan_DD_Hash() for more information.  If no user hash function is
registered, Zoltan's LB_Hash() will be used for the first hash. This
module's design was strongly influenced by the paper  "Communication
Support for Adaptive Computation" by Pinar and Hendrickson.

The files DD_Set_Neighbor_Hash_Fn1 and DD_Set_Neighbor_Hash_Fn1 provide
examples of how a user can create their own hash functions taking advantage
of their knowledge of how the GID naming scheme might correlate to a
reasonable location for its directory information.  These are also fully
tested and useful functions.






-----------------------------------
-----------------------------------

Source code location:        Utilities/Directory
Function prototype file:     Utilities/Directory/DD_Const.h
Library name:                libzoltan_dd.a

Other libraries used by this library:
   libmpi.a
   libzoltan_mem.a
   libzoltan_comm.a

Other header files used by this library
   mem_const.h
   comm_const.h
   zoltan_util.h


Data Stuctures:
   Zoltan_DD_Directory: state & storage used by all DD routines.

Routines:
   Zoltan_DD_Create: Allocates memory and initializes the directory.
   Zoltan_DD_Destroy: Terminate the directory and frees its memory.

   Zoltan_DD_Update: Adds or updates global IDs' directory information.
   Zoltan_DD_Find: Returns global IDs' information (owner, local ID, etc.)
   Zoltan_DD_Remove: Eliminates selected global IDs from the directory.

   Zoltan_DD_Stats: Provides statistics about hash table & linked lists.
   Zoltan_DD_Print: Displays the contents (GIDs, etc) of each directory.

   Zoltan_DD_Set_Hash_Fn: Registers a user's optional hash function.
   Zoltan_DD_Set_Hash_Fn1: Model for user's to create their own hash function.
   Zoltan_DD_Set_Hash_Fn2: Model for user's to create their own hash function.

Internal Data Structures:  (not user accessable)
   DD_Node: linked list element storing information about a global ID
   DD_Update_Msg: Message with new data about global IDs
   DD_Find_Msg: Messages to/from directory returning global IDs' info
   DD_Remove_Msg: Message to eliminate all traces of selected global IDs.

Internal Routines: (not user callable)
   DD_Update_Local:
   DD_Find_local:
   DD_Remove_Local:
   DD_Hash2:            // renamed copy of LB_Hash

Files in DDirectory:
  DD_Const.h            // return error codes, prototypes, structures

  DD_Create.c
  DD_Destroy.c
  DD_Find.c
  DD_Hash2.c
  DD_Remove.c
  DD_Set_Hash_Fn.c
  DD_Set_Neighbor_Hash_Fn.c
  DD_Stats.c
  DD_Update.c
  DD_Set_Hash_Fn1.c
  DD_Set_Hash_Fn2.c

  README                // this file






---------------------------------------
---------------------------------------
int Zoltan_DD_Create (Zoltan_DD_Directory **dd, MPI_Comm comm,
 int num_gid, int num_lid, int user_length, int table_length,
 int debug_level)
----------------------------------------

Zoltan_DD_Create() allocates and initializes memory for the DD_Directory
structure.  It must be called before any other distributed directory
routines. MPI must be initialized prior to calling Zoltan_DD_Create().

The DD_Directory struct must be passed to all other distributed directory
routines.  The MPI Comm argument designates the processors used for the
distributed directory.  The MPI Comm argument is duplicated and stored for
later use.

Note the need for double indirection in the DD_Directory calling argument
in order to allocate memory in behalf of the user program.

The user can set the debug level argument in the Zoltan_DD_Create()
to determine the module's response to multiple updates for any global ID
within one update cycle.  If the argument is set to 0, all multiple updates
are ignored (but the last determines the directory information.)  If the
argument is set to 1, an error is returned if the multiple updates
represent different owners for the same global ID.  If the debug level is 2,
an error return and an error message are generated if multiple updates
represent different owners for the same global ID.  If the level is 3, an
error return and an error message are generated for a multiple update even
if the updates represent the same owner for a global ID.



Arguments: (all are in)
   dd            Structure maintains directory state and hash table. (in/out)
   comm          MPI comm dup'ed & stored specifying directory processors.
   num_gid       Length of global ID.
   num_lid       Length of local ID or zero to ignore local IDs.
   user_length   Length of user defined data field (optional, may be zero).
   table_length  Length of hash table (zero forces default value).
   debug_level   Legal values range in [0,3]. Sets response to various error
                 conditions where 3 is the most verbose.

Return Value:
   int           Error code.

----------------------------------
----------------------------------
void Zoltan_DD_Destroy (Zoltan_DD_Directory **dd)
----------------------------------

This routine frees all memory allocated for the distributed directory.
No calls to any distributed directory functions are permitted after
calling this routine.  Free() has no error return.  Hence this routine
has no practical error return.  MPI is necessary for this routine only
to free the previously saved MPI comm.


Arguments:
   dd    Directory structure to be deallocated

Returned Value:
   NONE

-----------------------------------
-----------------------------------
int Zoltan_DD_Update (Zoltan_DD_Directory *dd, LB_ID_PTR gid,
 LB_ID_PTR lid, LB_ID_PTR user, int *partition, int count)
-----------------------------------

Zoltan_DD_Update() takes a list of global IDs and corresponding lists of
optional local IDs, optional user data, and optional partitions. This
routine updates the information for existing directory entries or creates
a new entry (filled with given data) if a global ID is not found.  NULL
lists should be passed for optional arguments not desired.  If all
entries were found (and updated), the return is ZOLTAN_DD_NORMAL_RETURN.
If at least one global ID was not found, the return value is
ZOLTAN_DD_GID_ADDED.  This function should be called initially and
whenever objects are migrated to keep the distributed directory current.

The user can set the debug level argument in the Zoltan_DD_Create()
to determine the module's response to multiple updates for any global ID
within one update cycle.  If the argument is set to 0, all multiple updates
are ignored (but the last determines the directory information.)  If the
argument is set to 1, an error is returned if the multiple updates
represent different owners for the same global ID.  If the debug level is 2,
an error return and an error message are generated if multiple updates
represent different owners for the same global ID.  If the level is 3, an
error return and an error message are generated for a multiple update even
if the updates represent the same owner for a global ID.



Arguments:
  dd         Distributed directory structure state information.
  gid        List of global IDs to update (in).
  lid        List of corresponding local IDs (optional) (in).
  user       List of corresponding user data (optional) (in).
  partition  List of corresponding partitions (optional) (in).
  count      Number of global IDs in update list.

Returned Value:
  int        Error code.

------------------------------------
------------------------------------
int Zoltan_DD_Find (Zoltan_DD_DDirectory *dd, LB_ID_PTR gid,
 LB_ID_PTR lid, LB_ID_PTR data, int *partition, int  count,
 int *owner)
------------------------------------


Given a list of global IDs, Zoltan_DD_Find() returns corresponding
lists of the global IDs' owners, local IDs, partitions, and optional
user data.  NULL lists must be provided for optional information not
being used.


Arguments:
  dd         Distributed directory structure state information.
  gid        List of global IDs whose information is requested.
  lid        Corresponding list of local IDs (optional) (out).
  data       Corresponding list of user data (optional) (out).
  partition  Corresponding list of partitions (optional) (out).
  count      Count of global IDs in above list
  owner      Corresponding list of data owners (out).

Returned Value:
   int       Error code.

------------------------------------
------------------------------------
int Zoltan_DD_Remove (Zoltan_DD_Directory *dd, LB_ID_PTR gid,
 int count)
------------------------------------

Zoltan_DD_Remove() takes a list of global_IDs and removes all of
their information from the distributed directory.


Arguments:
  dd         Distributed directory structure state information.
  gid        List of global IDs to eliminate from the directory.
  count      Number of global IDs to be removed.

Returned Value:
  int        Error code.

-------------------------------------
-------------------------------------
void Zoltan_DD_Set_Hash_Fn (Zoltan_DD_Directory *dd,
 unsigned int (*hash) (LB_ID_PTR, int, unsigned int))
-------------------------------------

Enables the user to register a new hash function for the distributed
directory. (If this routine is not called, the default hash function
LB_Hash() will be used automatically.)  This hash function determines
which processor maintains the distributed directory entry for a given
global ID.  Inexperienced users do not need this routine.

Experienced users may elect to create their own hash function based on
their knowledge of their global ID naming scheme.  The user's hash
function must have calling arguments compatible with LB_Hash().
Consider that a user has defined a hash function, myhash, as
  unsigned int myhash(LB_ID_PTR gid, int length, unsigned int naverage)
      {
      return *gid / naverage ;  // length assumed to be 1
                                // naverage = total_gids/nproc
      }
Then the call to register this hash function is:
   Zoltan_DD_Set_Hash (myhash) ;

NOTE:  This hash function might group the gid's directory information
       near the gid's owning processor's neighborhood, for an
       appropriate naming scheme.


Arguments:
   dd     Distributed directory structure state information.
   hash   Name of user's hash function.

Returned Value:
   NONE

----------------------------------------
----------------------------------------
void Zoltan_DD_Stats (Zoltan_DD_Directory *dd)
----------------------------------------

This routine prints out summary information about the local distributed
directory. It includes the hash table length, number of GIDs stored in
the local directory, the number of linked lists, and the length of the
longest linked list.  The dd->debug_level (set by an argument into the
Zoltan_DD_Create function controls this routines verbosity.

Arguments:
   dd      Distributed directory structure for state information

Returned Value:
   NONE

----------------------------------------
----------------------------------------
int Zoltan_DD_Set_Neighbor_Hash_Fn1 (Zoltan_DD_Directory *dd, int size)
----------------------------------------

This routine associates the first size GIDs to proc 0, the next size to
proc 1, etc.  It assumes the GIDs are consecutive numbers.  It assumes
that GIDs primarily stay near their original owner. The GID length is
assumed to be 1. GIDs outside of the range are evenly distributed among
the processors via modulo(nproc). This is a model for the user to develop
their own similar routine.

Arguments: (all are in)
   dd    Structure maintains directory state and hash table.
   size  Number of consecutive GIDs associated with a processor.

Return Value:
   int   Error Code.

----------------------------------------
----------------------------------------
int Zoltan_DD_Set_Neighbor_Hash_Fn2 (Zoltan_DD_Directory *dd, int *proc,
 int *low, int *high, int n)
----------------------------------------

This routine allows the user to specify a beginning and ending GID
"numbers" per directory processor. It assumes that GIDs primarily stay
near their original owner. It requires that the numbers of high, low, &
proc entries are all n. It assumes the GID length is 1. It is a model for
the user to develop their own similar routine. Users should note the
registration of a cleanup routine to free local static memory when the
distributed directory is destroyed. GIDs outside the range specified by
high and low lists are evenly distributed among the processors via modulo
(nproc).

Arguments: (all are in)
   dd    Structure maintains directory state and hash table.
   proc  List of processor ids labeling for corresponding high, low value.
   low   List of low GID limits corresponding to proc list.
   high  List of high GID limits corresponding to proc list.
   n     Number of elements in the above lists. Should be nproc!

Return Value:
   int   Error Code.

----------------------------------------
----------------------------------------
int Zoltan_DD_Print (Zoltan_DD_Directory *dd)

----------------------------------------

This utility displays (to stdout) the entire contents of the distributed
directory at one line per GID.

Arguments:
   dd    Structure maintains directory state and hash table.

Return Value:
   int   Error Code.

----------------------------------------

/***************** Developer's Notes *****************************/


Because Zoltan places no restrictions on the content or length of global
IDs, hashing does not guarantee a balanced distribution of objects in
the distributed directory.  Note also, the worst case behavior of a hash
table lookup is very bad (essentially becoming a linear search).
Fortunately, the average behavior is very good!  The user may specify
their own hash function in place of the default LB_Hash() to improve
performance.

This software module is built on top of the Zoltan Communications
functions for efficiency. Improvements to the communications library
will automatically benefit the distributed directory.




Global BUG:  The distributed directory should be implemented via
             client-server processes or threads.  However, MPI is
             not fully thread aware, yet.

FUTURE:      The C99 capability for variable length arrays would
             significantly simplify many of these following
             routines.  (It eliminates the malloc/free calls for
             temporary storage.  This helps prevent memory leaks.)
             Other C99 features may also improve code readability.
             The "inline" capability can potentially improve
             performance.

   
