$begin
$define(article)(0)()
$define(documentation)(0)()
$input($anubisdir/library/MAML4/basis.maml)
$input($anubisdir/library/anubis_doc.maml)
$input($anubisdir/library/names.maml)
$htmloptions(justify:true)
$define(term)(1)($code(240,240,240)($att($1)))
$title(The Anubis HTTP server (version 2))
$subtitle(User's documentation)
$center($italic(Alain Prouté))
Before you start reading this documentation, you should compile the program $fname(example_web_site.anubis) in the
directory $fname(library/http). This will produce an $em(example web site) and launch the HTTP server on port 2000
(or 2001 if 2000 is already in use, and so on). Then open a browser and enter the URI $tt(http://127.0.0.1:2000) in the
address field to get an idea of some of the features of the HTTP server.
Later on, after you have read this documentation, you can use a copy of this example program as a template for
constructing your own web site.
$tableofcontents
$section(Introduction)
This document describes how to use the $Anubis HTTP/HTTPS server version 2, written in 2020. The previous $Anubis
HTTP/HTTPS server was written in 2003, and it was high time to write a new one from scratch. This new server follows
very closely the specifications given in several more recent RFCs, namely RFC 7230, RFC 7231, RFC 7232, RFC 7233, RFC
2234 and RFC 3986. Not all the features defined in these RFCs are implemented, but we have implemented what is useful
for a well-behaved, modern general-purpose web server, including the possibility of streaming videos, together with some
capabilities of our own, some of which were already present in version 1.
Special attention has been paid to security. Web servers can be subject to attacks, and we have implemented passive
defenses together with active ones, along with recommendations to web site designers. Of course, the server can use TLS
encryption, which is now recommended in all situations.
The $Anubis HTTP/HTTPS server, despite the fact that it runs within a single thread of the underlying operating
system, is capable of serving simultaneously several web sites, and of course, several clients for each web site. The
server can start in HTTP or in HTTPS mode or in both modes.
We have implemented a $em(session) mechanism that makes it possible to follow clients from page to page. For example,
if a client chooses a language in the first page, the other pages will be served in this language thanks to the presence of
$em(session tickets) within the web pages. This system is automatic. You don't have to worry about it, but you must
follow the instructions given below for using it.
Tightly linked to session tickets is a system of login/logout that is also implemented in this version of the server,
so that you don't have anything to do in this respect except use the tools described below.
Each web site has a so-called $em(web site directory) on the server. A subdirectory $fname(public) is created within
this web site directory, where $em(only public resources) should go. Indeed, the content of this directory and its
subdirectories will be freely accessible by any client. Confidential resources must be stored outside the directory
tree whose root is this $fname(public) subdirectory of the web site directory.
A mechanism for $em(private downloading) of files is implemented that allows only an authorized client to download a
given file (which is of course not located in the $fname(public) subtree on the server). Authorizations are produced
by strong cryptographic mechanisms.
Because web sites are realized as secondary $Anubis modules, they can be modified and recompiled at any moment and
the server will automatically reload them without stopping. Furthermore, the server monitors its $em(configuration
file) and reloads it when it is modified without stopping.
One of our plans for the web server is the ability to obtain server certificates automatically via the ACME
(Automatic Certificate Management Environment) protocol.
$section(Configuring and starting the server)
The $Anubis HTTP/HTTPS server version 2 has several configuration parameters that must be given within a
$em(configuration file). Starting the server is performed as follows:
$term(
anbexec http_server $nt(path of configuration file)
)
Of course, you can start several HTTP/HTTPS servers provided that you create several configuration files, but they
will need to listen on different ports. Since listening on ports in the range 0-1023 requires elevated privileges,
$att(anbexec) must be set up correctly if you want to use these ports. The commands for this setup are (under Linux,
for example for version 1.19.3):
$term(
sudo chown root ~/bin/anbexec-1-19-3
sudo chmod ug+s ~/bin/anbexec-1-19-3
)
(recall that $att(anbexec) is just a shell calling $att(anbexec-1-19-3)).
$subsection(Format of a configuration file)
$subsection(What can be dynamically changed in the configuration)
$subsection(How the server interprets standard HTTP headers)
The semantics of HTTP headers is defined in RFC 7231 and in other RFCs referred to from within RFC 7231.
$list(
$item Cache-Control
$item Expect: 100-continue
$item Host:
$item Max-Forwards (TRACE and OPTIONS only)
$item Pragma
$item Range
$item TE
$item If-Match
$item If-None-Match
$item If-Modified-Since
$item If-Unmodified-Since
$item If-Range
$item Accept
$item Accept-Charset
$item Accept-Encoding
$item Accept-Language
$item Authorization
$item Proxy-Authorization
$item From
$item Referer
$item User-Agent
)
$section(Creating a web site)
The $Anubis HTTP server is a primary module (to be executed by $att(anbexec)), and each web site is loaded as a
secondary module. Hence, creating a web site requires the construction of such a module.
$subsection(The type of a web site secondary module)
The type of secondary modules that define web sites already conveys, by itself, much information on how to construct
a web site for the $Anubis web server. Indeed, from the point of view of this server, a web site is nothing other
than a datum of this type. Here it is:
$acode(
public type WebSiteV2:
web_site(
List(String) names,
String web_site_directory
).
)
$subsubsection($att(names))
These are the names that are acceptable as the value of the $tt(Host) HTTP header. For example, this can be
$att($["127.0.0.1", "www.my_site.com", "my_site.com"$]). In this example, $att("127.0.0.1") is meant to be used during
the development of the web site.
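The role of $att(names) can be pictured with the following minimal sketch. It is written in Python for illustration
only and is not the server's own code. The function name $att(host_is_acceptable) is hypothetical, and the two
normalizations it performs (ignoring case and dropping an optional port) are assumptions, not documented behavior.

```python
from typing import List


def host_is_acceptable(host_header: str, names: List[str]) -> bool:
    """Return True if the Host header value designates one of the site's names.

    Assumptions (not taken from the server): the comparison ignores case
    and an optional ":port" suffix is dropped before comparing.
    """
    host = host_header.strip().lower()
    # Drop an optional port, e.g. "127.0.0.1:2000" -> "127.0.0.1".
    # IPv6 literals such as "[::1]:2000" are deliberately left untouched.
    if ":" in host and not host.startswith("["):
        host = host.rsplit(":", 1)[0]
    return host in {name.lower() for name in names}
```

With the example list above, a request carrying the header value 127.0.0.1:2000 would be accepted, and a request for a
name that is not in the list would be refused.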
$subsubsection($att(web_site_directory))
This indicates the path of the directory on the server's disk that the web site can use as its $em(dedicated directory)
(the so-called $em(web site directory)). Prefer an absolute path, which makes this information independent of where
$att(anbexec) is started from.
The server creates (if they don't already exist) the following subdirectories in
the web site directory:
$list(
$item $fname(public/) This is where only public resources should go. Indeed, the content of this directory and its
subdirectories is freely accessible by any client of the web site.
$item $fname(states/) This is where the server stores session states.
$item $fname(members/) This is where the server keeps the database of registered clients (members).
$item $fname(private_download/) This is where the server puts files that are ready for private downloading.
$item $fname(upload_temporary/) This is where the server stores files that are uploaded before they are moved to
another place.
$item $fname(journal/) This is where the server puts information on the history of server events.
)
Of course, you can create other subdirectories in the web site directory, for example for installing a database.
$subsection(Layout and styling)
$subsection(Home page)
Each web site is supposed to have a $em(home page) that can be obtained from the name of the web site alone. For
example, the name $fname(google.com), without any further information, yields the Google home page.
This home page is the default entry point for your web site.
$subsection(When the client clicks)
Within a web page of your web site, a client can click on various buttons and links. Such a click can have one of
several kinds of effect:
$list(
$item trigger a purely local action (JavaScript program) that does not require a connection to a server (local
action),
$item leave the web site and visit another one (foreign link),
$item download a file or other resource from your web site that does not need to be computed on the fly (server
link),
$item trigger an action on your server that will compute another web page on the fly (server action).
)
$subsection(HTML elements and CSS)
$subsection(Web sockets)
$subsection(Login/logout)
The server has a ready-to-use mechanism for login/logout. You have to use the HTML element provided below, which
appears as a login form if the client is not logged in and as a logout form otherwise. The appearance of this gadget
is determined by some customizable CSS.
From now on, we call a client a $em(visitor) if he/she is not logged in, and a $em(member) if he/she is.
Session tickets are provided for both visitors and members.
$subsection(Session tickets and session states)
The server remembers $em(session information) attached to each client. Of course, this session information is
different for visitors and for members. Session information is never transmitted over the network. It is stored
within the $fname(states/) subdirectory of the web site directory. Each such state is associated with a ticket (a
cryptographic hash), which is placed within the web page sent to the client.
When the client triggers an action, the session ticket is sent as $em(form data) to the server. This allows the
server to recover the state of the client and to compute a new state for this client.
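The relation between a ticket and its state can be sketched as follows. This is an illustration in Python, not the
server's implementation. The function names $att(save_state) and $att(load_state), the JSON encoding of states, and
the way the hash is computed (SHA-256 over random bytes and the serialized state) are all assumptions. Only the general
idea comes from the text above: the state stays on the server, and the ticket alone travels in the page.

```python
import hashlib
import json
import os
import tempfile
from pathlib import Path
from typing import Optional


def save_state(states_dir: Path, state: dict) -> str:
    """Store a session state on disk and return its ticket."""
    payload = json.dumps(state, sort_keys=True).encode("utf-8")
    # Assumption: the ticket is a SHA-256 hash over fresh random bytes and
    # the serialized state, so it cannot be guessed from the state alone.
    ticket = hashlib.sha256(os.urandom(32) + payload).hexdigest()
    (states_dir / ticket).write_bytes(payload)
    return ticket


def load_state(states_dir: Path, ticket: str) -> Optional[dict]:
    """Recover the state for a ticket received as form data, if any."""
    # Accept only well-formed tickets so that a ticket can never act as a path.
    if len(ticket) != 64 or any(c not in "0123456789abcdef" for c in ticket):
        return None
    path = states_dir / ticket
    if not path.is_file():
        return None
    return json.loads(path.read_bytes())


# Usage: a temporary directory stands in for the states/ subdirectory.
states_dir = Path(tempfile.mkdtemp())
ticket = save_state(states_dir, {"language": "fr"})
```

When the ticket comes back with the next request, $att(load_state) returns the stored state, from which a new state
and a new ticket can be computed.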
$subsection(Streaming)
HTTP/1.1 has, as explained in RFC 7233, the capability of serving a range of bytes from a resource instead of the
whole resource. This can be used for streaming, but also for graceful recovery after a connection is cut in the
middle of a download. This version of the $Anubis HTTP server implements this feature, and you have essentially
nothing to do with regard to it. You need only design your web sites with this possibility in mind, for example
for the streaming of videos.
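To make the notion of a byte range concrete, here is a small Python sketch that resolves a single-range
$att(Range) header value against the size of a resource, following the three forms defined in RFC 7233 (first-last,
first-, and -suffix). The function name $att(resolve_range) is hypothetical, requests for several ranges are not
handled, and nothing here is taken from the server's own code.

```python
from typing import Optional, Tuple


def _is_number(text: str) -> bool:
    """True for a non-empty string of ASCII digits."""
    return text.isascii() and text.isdigit()


def resolve_range(header: str, size: int) -> Optional[Tuple[int, int]]:
    """Resolve a single-range "Range" value against a resource of `size` bytes.

    Returns the inclusive (first, last) byte positions to serve, or None
    when the value is malformed or the range cannot be satisfied.
    """
    unit, _, spec = header.partition("=")
    if unit.strip().lower() != "bytes" or "," in spec:
        return None
    first_text, dash, last_text = spec.strip().partition("-")
    if not dash:
        return None
    if first_text == "":
        # Suffix form "bytes=-N": the last N bytes of the resource.
        if not _is_number(last_text) or int(last_text) == 0 or size == 0:
            return None
        return max(0, size - int(last_text)), size - 1
    if not _is_number(first_text) or (last_text and not _is_number(last_text)):
        return None
    first = int(first_text)
    last = int(last_text) if last_text else size - 1
    if first >= size or last < first:
        return None
    # A last position beyond the end is clamped to the final byte.
    return first, min(last, size - 1)
```

For a resource of 1000 bytes, the value bytes=500- resolves to the positions 500 to 999, which is what a client
typically sends to resume an interrupted download.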
$section(Security considerations)
$subsection(Reviewing RFC 7231 recommendations)
A (non-exhaustive) list of possible attacks is given in section 9 of RFC 7231. They are discussed below.
$subsubsection(Attacks Based on File and Path Names)
The issue here is controlling access to the file system on the server. For example, if the server accepts
$att(..) or $att(~) within a URI, a client could possibly download a file located anywhere on the server.
The $Anubis server allows access to the $fname(public/) directory associated with the web site under consideration and
to its subdirectories. It is not necessarily a good idea to completely disallow double dots, because an HTML page
located somewhere in the $fname(public/) subtree may, for example, refer to an image that is not in
one of the subdirectories of the location of this HTML file. For example, the HTML file can contain something like
$att(<img src="../images/theimage.png">). This is acceptable if $fname(theimage.png) is still within the
$fname(public/) subtree.
To this end, the server checks that the resource is indeed within the $fname(public/) subtree. If this fails, an
error message $tt(404. Not$~found.) is sent to the client, and the server records the IP address and browser
fingerprint into its dictionary of $em(dubious clients). See below how to manage dubious clients.
Isolated dots are normal in URIs, for example in $fname(theimage.png). Even a URI containing $att(.) as one of the
directory names does not create a problem.
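The containment check described above can be sketched in Python as follows. This is an illustration of the
technique, not the server's code. The function name $att(resolve_public_resource) is hypothetical. The idea is to
normalize the requested path first, so that $att(..) and $att(.) segments are resolved, and only then to test whether
the result is still inside the $fname(public/) subtree.

```python
from pathlib import Path
from typing import Optional


def resolve_public_resource(public_dir: Path, uri_path: str) -> Optional[Path]:
    """Map the path part of a URI to a location inside `public_dir`, or None.

    ".." segments are allowed as long as the final location is still inside
    the public subtree; anything that escapes it is refused.
    """
    root = public_dir.resolve()
    # Treat the URI path as relative to the public directory, then normalize.
    candidate = (root / uri_path.lstrip("/")).resolve()
    if candidate != root and root not in candidate.parents:
        return None  # the caller would answer 404 and record the client
    return candidate
```

A path such as /pages/../images/theimage.png is accepted because it normalizes to a location under the public
directory, whereas /../secret.txt is refused.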
$subsubsection(Attacks Based on Command, Code, or Query Injection)
The $Anubis server is largely unaffected by this problem because it never treats anything present in the
request line (or in header values) as an executable command. For example, if an attacker inserts SQL commands into the
request line, they cannot be executed by the server even if you use an SQL database.
Also, recall that the request line should never contain sensitive information. For example, session tickets are
never inserted into the request line. They are transmitted as part of form data or within cookies.
$subsubsection(Disclosure of Personal Information)
This question is to be addressed by the designer of the web site. Of course, encryption should be used for
transferring any personal information. The $Anubis server also provides a mechanism for $em(private download) so that
only those clients that are allowed can access certain resources (including files), and these resources are of
course not located within the $fname(public/) subtree of the server.
$subsubsection(Disclosure of Sensitive Information in URIs)
As explained in RFC 7231, URIs are intended to be shared, not secured, even if they identify secured resources.
Anyway, we have already said several times in this documentation that a request line should never contain any
sensitive or private information. This is mainly the responsibility of the web site designer.
This has to do with the distinction between the two HTTP methods $att(GET) and $att(POST). In the case of $att(GET),
there is normally no body in the request (RFC 7231 section 4.3.1). Sensitive information should preferably be in the
body of a $att(POST) request.
Another question is the use of the $att(Referer) HTTP header. This header gives the URI of the resource from which
the request originates. The $Anubis server checks that any request that contains a session ticket has a $att(Referer)
header pointing to the right resource. If not, the request is rejected and the client is marked as dubious.
$subsubsection(Disclosure of Fragment after Redirects)
A $em(fragment) is the part of a URI that comes after the $att(#), indicating a precise position within a
web page or any other resource, with a semantics that depends on this resource. Under some circumstances, this
fragment can be forwarded to another web site, which is why it should not contain sensitive information. Again,
since the fragment is part of the URI, it should not contain any sensitive information anyway.
$subsubsection(Disclosure of Product Information)
Here the question is that some HTTP headers, namely $att(User-Agent), $att(Via) and $att(Server), reveal
which particular software is used by their sender. This information may help attackers, but the $Anubis server is not
sensitive to such things.
$subsubsection(Browser Fingerprinting)
According to RFC 7231 section 9.7, browser fingerprinting is a set of techniques for identifying a specific user
agent over time through its unique set of characteristics. Here are some HTTP headers that can provide information
on the client: $att(From Cookie User-Agent Accept Accept-Charset Accept-Encoding Accept-Language).
Because such information should be considered confidential, the web site designer should use it only for a
legitimate reason. However, this information is also useful for preventing attacks on the server. This is why it
is part of the information recorded about dubious clients. It allows the server to reidentify a client previously
marked as dubious and possibly to deny access. This, together with other methods, can be an efficient tool against
denial of service attacks.
$subsection(Handling dubious clients)
Each time a $em(dubious client) is detected, information about this client is recorded into a dictionary. The
server uses an algorithm for handling such clients. You can set parameters in the configuration file that affect
how this algorithm works.
This algorithm works as follows.
$list(
$item When a problem arises, administrators are warned by email. The email provides a link to a web page where the
administrators can follow the evolution of the situation in real time, and from where they can trigger
actions.
$item It associates a $em(level of dubiousness) with each dubious client, and a $em(level of trust) with each
registered client.
$item It refuses access to clients depending on the level of dubiousness of the client (configurable).
$item It deletes the dubious client informations after some time (or never) depending on the level of dubiousness
(configurable).
$item It rejects connections whose behavior is suspicious (configurable).
$item In case it begins to be overwhelmed by connections, it restricts access through a special web page that only
allows clients to log in. Clients who are already logged in can continue to operate, unless suspicious behavior is
detected, also depending on their trust level (configurable).
)
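Part of the bookkeeping listed above (levels of dubiousness, refusal above a threshold, deletion after some time) can
be pictured with the following Python sketch. It is a toy model under stated assumptions, not the server's
algorithm. The class name $att(DubiousClients), the parameter names $att(refuse_at_level) and
$att(retention_seconds), their default values, and the rule that retention grows with the level are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

ClientKey = Tuple[str, str]  # (IP address, browser fingerprint)


@dataclass
class DubiousClients:
    """Toy model of the dictionary of dubious clients (all values hypothetical)."""

    refuse_at_level: int = 3            # refuse access at or above this level
    retention_seconds: float = 3600.0   # base retention, multiplied by the level
    records: Dict[ClientKey, Tuple[int, float]] = field(default_factory=dict)

    def report(self, key: ClientKey, now: float) -> int:
        """Record one more suspicious event and return the new level."""
        level, _ = self.records.get(key, (0, now))
        self.records[key] = (level + 1, now)
        return level + 1

    def is_refused(self, key: ClientKey, now: float) -> bool:
        """Tell whether a request from this client should be refused."""
        self.expire(now)
        level, _ = self.records.get(key, (0, now))
        return level >= self.refuse_at_level

    def expire(self, now: float) -> None:
        """Drop records that are older than their retention period.

        Assumption: the more dubious a client is, the longer it is remembered.
        """
        self.records = {
            key: (level, seen)
            for key, (level, seen) in self.records.items()
            if now - seen <= self.retention_seconds * level
        }
```

In this model, a client reported often enough is refused until its record expires, after which it is treated as an
ordinary client again.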
$end