safe_save_retrieve.anubis
7.39 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
*Project* Anubis
*Title* Safe Save/Retrieve Mechanism
*Copyright* Copyright (c) Alain Proute' 2006
*Author* Alain Prouté
*Overview*
We may need a more secure mechanism than save/retrieve, because when saving a datum the
physical machine may be interrupted, producing a truncated file.
*** How it works.
To solve this problem, we use 2 file names for the same datum file. Assume the normal
name of the file is 'datum_file'. We will use the names 'datum_file.0' and
'datum_file.1'.
When we retrieve the datum, we look for these two files:
- If none is present the datum cannot be retrieved.
- If one of them is present, we try to retrieve the datum from it.
- If both are present, we check the integrity of the most recent one. If it's ok, we
retrieve the datum from it. Otherwise, we try to retrieve the datum from the other
one.
When we save a datum, we proceed as follows:
- No file present on disk: we save the datum under the name 'datum_file.0'.
- One file present on disk: we save the datum under the other name.
- Two files present on disk: one of them (say B) is more recent than the other one
(say A). If 'B' is coherent we save the datum under name 'A'. Otherwise, we save it
under name 'B'.
The coherency may be tested because data a saved within an SHA1 envelop.
*** How safe it is ?
This system is clearly safe provided that:
- Only one virtual machine may manipulate the two files.
- No other mechanism (for example exterior to Anubis) may manipulate the two files.
*** The interface.
We Define 'safe_save' and 'safe_retrieve' of the same types as 'save' and 'retrieve',
so as to make it easy to convert a 'save/retrieve' program into a
'safe_save/safe_retrieve' program. However, the two mecanisms are incompatible in the
sens that the same filename cannot be handled by both mecanism.
When you replace 'save/retrieve' by 'safe_save/safe_retrieve' for a given filename, you
don't have to modify the files, because 'safe_retrieve' is able to retrieve the file
saved by 'save'. Of course, it will do it only if the new files do not exist. So the
conversion of the files is automatic.
public define SaveResult
safe_save
(
$T datum,
String filename
).
public define RetrieveResult($T)
safe_retrieve
(
String filename
).
As usual, the type '$T' must be serializable.
--- That's all for the public part ! --------------------------------------------------
*** SHA1 envelops.
The datum 'd' to be saved is serialized as 's'. The content of the file is the
serialisation of '(sha1(s),s)' (a so-called 'sha1 envelop').
*** Retrieving.
Checking the coherency, and retrieving the content at the same time, works like this:
define RetrieveResult($T)
check_and_retrieve
(
String filename_0 // may be 'datum_file.0' or 'datum_file.1'
) =
if read_from_file(filename_0) is
{
cannot_find_file then cannot_find_file,
read_error(ba) then read_error,
ok(ba) then
if (Maybe((ByteArray,ByteArray)))unserialize(ba) is
{
failure then print("Cannot unserialize enveloped file '"+filename_0+"'.\n");
type_error,
success(p) then if p is (hash,s) then
if hash = sha1(s)
then if (Maybe($T))unserialize(s) is
{
failure then print("Cannot unserialize table from file '"+filename_0+"'.\n");
type_error,
success(d) then ok(d)
}
else print("Bad envelop: '"+filename_0+"'.\n");
type_error // type_error may also mean 'incoherent'.
}
}.
Now we apply our protocol for retrieving the datum.
public define RetrieveResult($T)
safe_retrieve
(
String filename // 'datum_file' without '.0' nor '.1'
) =
if get_file_times(filename+".0") is
{
failure then
//
// 'datum_file.0' does not exist, try 'datum_file.1'
//
if check_and_retrieve(filename+".1") is
{
//
// if neither 'datum_file.0' nor 'datum_file.1' exist,
// try 'datum_file' by the old method (no envelop)
//
cannot_find_file then retrieve(filename),
read_error then retrieve(filename),
type_error then retrieve(filename),
ok(ba) then ok(ba)
},
success(times_0) then if get_file_times(filename+".1") is
{
failure then check_and_retrieve(filename+".0"),
success(times_1) then
//
// both files exist. We don't need to keep the file in the
// old format:
//
forget(remove(filename));
//
if times_0 is times(lm_0,_) then
if times_1 is times(lm_1,_) then
if lm_0 < lm_1
then
//
// 'datum_file.1' more recent than 'datum_file.0'
//
if check_and_retrieve(filename+".1") is
{
cannot_find_file then check_and_retrieve(filename+".0"),
read_error then check_and_retrieve(filename+".0"),
type_error then check_and_retrieve(filename+".0"),
ok(ba) then ok(ba)
}
else
//
// 'datum_file.0' more recent than 'datum_file.1'
//
if check_and_retrieve(filename+".0") is
{
cannot_find_file then check_and_retrieve(filename+".1"),
read_error then check_and_retrieve(filename+".1"),
type_error then check_and_retrieve(filename+".1"),
ok(ba) then ok(ba)
}
}
}.
*** Saving.
public define SaveResult
safe_save
(
$T datum,
String filename
) =
//
// prepare the envelop
//
with s = serialize(datum),
envelop = (sha1(s),s),
if get_file_times(filename+".0") is
{
failure then
//
// if 'datum_file.0' does not exist, save under this name
//
save(envelop,filename+".0"),
success(times_0) then if get_file_times(filename+".1") is
{
failure then
//
// otherwise, if 'datum_file.1' does not exist save under this name
//
save(envelop,filename+".1"),
success(times_1) then
//
// if both files exist, save under the name of the oldest one
//
if times_0 is times(lm_0,_) then
if times_1 is times(lm_1,_) then
if lm_0 < lm_1
then save(envelop,filename+".0")
else save(envelop,filename+".1")
}
}.