safe_save_retrieve.anubis 7.39 KB
   
   *Project*      Anubis
   *Title*        Safe Save/Retrieve Mechanism
   *Copyright*    Copyright (c) Alain Proute' 2006
   
   *Author*       Alain Prouté

   
   
   *Overview*
   
   We may need a more secure mechanism than save/retrieve, because when saving a datum the
   physical machine may be interrupted, producing a truncated file.
   
   
   
   *** How it works.
   
   To solve this problem, we use 2 file  names for the same datum file.  Assume the normal
   name  of  the  file  is  'datum_file'.   We  will  use  the  names  'datum_file.0'  and
   'datum_file.1'.
   
   When we retrieve the  datum, we look for these two files:
   
     - If none is present the datum cannot be retrieved.
     - If one of them is present, we try to retrieve the datum from it.
     - If both are present, we check the integrity  of the most recent one. If it's ok, we
       retrieve the datum from it. Otherwise, we  try to retrieve the datum from the other
       one.
   
   When we save a datum, we proceed as follows:
   
     - No file present on disk: we save the datum under the name 'datum_file.0'.
     - One file present on disk: we save the datum under the other name.
     - Two files present  on disk: one of them  (say B) is more recent than  the other one
       (say A). If 'B' is coherent we save the datum under name 'A'. Otherwise, we save it
       under name 'B'. 

   The coherency may be tested because data a saved within an SHA1 envelop. 
   
   
   
   *** How safe it is ? 
   
   This system is clearly safe provided that:
   
     - Only one virtual machine may manipulate the two files.
     - No other mechanism (for example exterior to Anubis) may manipulate the two files. 
   
   
   
   *** The interface. 
   
   We Define 'safe_save'  and 'safe_retrieve' of the same types  as 'save' and 'retrieve',
   so   as   to   make  it   easy   to   convert   a   'save/retrieve'  program   into   a
   'safe_save/safe_retrieve' program.  However, the  two mecanisms are incompatible in the
   sens that the same filename cannot be handled by both mecanism.
   
   When you replace 'save/retrieve' by 'safe_save/safe_retrieve' for a given filename, you
   don't have  to modify the files, because  'safe_retrieve' is able to  retrieve the file
   saved by 'save'.  Of course, it will do it  only if the new files do  not exist. So the
   conversion of the files is automatic. 
   

   
public define SaveResult
   safe_save
     (
       $T         datum, 
       String     filename
     ).

public define RetrieveResult($T)
  safe_retrieve
    (
      String     filename
    ).
   
   As usual, the type '$T' must be serializable. 
   
   
   
   --- That's all for the public part ! --------------------------------------------------
   
   
   
   *** SHA1 envelops.
   
   The  datum 'd'  to be  saved is  serialized  as 's'.  The content  of the  file is  the
   serialisation of '(sha1(s),s)' (a so-called 'sha1 envelop').
   
   
   
   *** Retrieving. 
   
   Checking the coherency, and retrieving the content at the same time, works like this:

define RetrieveResult($T)
   check_and_retrieve
     (
       String filename_0    // may be 'datum_file.0' or 'datum_file.1'
     ) =
   if read_from_file(filename_0) is
     {
       cannot_find_file     then cannot_find_file, 
       read_error(ba)       then read_error,
       ok(ba)               then 
         if (Maybe((ByteArray,ByteArray)))unserialize(ba) is
           {
             failure     then print("Cannot unserialize enveloped file '"+filename_0+"'.\n");
                              type_error, 
             success(p)  then if p is (hash,s) then 
               if hash = sha1(s)
               then if (Maybe($T))unserialize(s) is 
                 {
                   failure     then print("Cannot unserialize table from file '"+filename_0+"'.\n");
                                    type_error, 
                   success(d)  then ok(d)
                 }
               else print("Bad envelop: '"+filename_0+"'.\n");
                    type_error   // type_error may also mean 'incoherent'. 
           }
     }.

   
   
   Now we apply our protocol for retrieving the datum. 
   
public define RetrieveResult($T)
   safe_retrieve
     (
       String     filename      // 'datum_file' without '.0' nor '.1'
     ) =
   if get_file_times(filename+".0") is
     {
       failure then 
         //
         // 'datum_file.0' does not exist, try 'datum_file.1'
         //
         if check_and_retrieve(filename+".1") is 
           {
             //
             // if neither 'datum_file.0' nor 'datum_file.1' exist, 
             // try 'datum_file' by the old method (no envelop)
             //
             cannot_find_file  then retrieve(filename),
             read_error        then retrieve(filename),
             type_error        then retrieve(filename),
             ok(ba)            then ok(ba)             
           },
       success(times_0) then if get_file_times(filename+".1") is 
         {
           failure then check_and_retrieve(filename+".0"),
           success(times_1) then 
             //
             // both files exist. We don't need to keep the file in the 
             // old format:
             //
             forget(remove(filename)); 
             //
             if times_0 is times(lm_0,_) then 
             if times_1 is times(lm_1,_) then 
             if lm_0 < lm_1
             then 
               //
               // 'datum_file.1' more recent than 'datum_file.0'
               // 
               if check_and_retrieve(filename+".1") is 
                 {
                   cannot_find_file  then check_and_retrieve(filename+".0"),
                   read_error        then check_and_retrieve(filename+".0"),
                   type_error        then check_and_retrieve(filename+".0"),
                   ok(ba)            then ok(ba)
                 }
             else 
               //
               // 'datum_file.0' more recent than 'datum_file.1'
               // 
               if check_and_retrieve(filename+".0") is 
                 {
                   cannot_find_file  then check_and_retrieve(filename+".1"),
                   read_error        then check_and_retrieve(filename+".1"),
                   type_error        then check_and_retrieve(filename+".1"),
                   ok(ba)            then ok(ba)
                 } 
         }
     }. 

   
   
   
   
   
   *** Saving. 
   
public define SaveResult
   safe_save
     (
       $T         datum, 
       String     filename
     ) =
   //
   // prepare the envelop
   //
   with        s = serialize(datum), 
         envelop = (sha1(s),s), 
   
   if get_file_times(filename+".0") is 
     {
       failure then 
         //
         // if 'datum_file.0' does not exist, save under this name
         //
         save(envelop,filename+".0"),
   
       success(times_0) then if get_file_times(filename+".1") is 
         {
           failure then 
             //
             // otherwise, if 'datum_file.1' does not exist save under this name
             //
             save(envelop,filename+".1"),
   
           success(times_1) then 
             //
             // if both files exist, save under the name of the oldest one
             //
             if times_0 is times(lm_0,_) then 
             if times_1 is times(lm_1,_) then 
             if lm_0 < lm_1
             then save(envelop,filename+".0")
             else save(envelop,filename+".1")
         }
     }.