
                             The Anubis Project.
   
             Tracing a dialog between an HTTP server and its client. 
   
                       Copyright (c) Alain Proute' 2001. 
                            All rights reserved.
   
   
   
   This file defines the executable module 'http_trace'. This module is a very simplified
   proxy server. It sits between your browser and the web, so that the HTTP dialog between
   your browser and web servers can be traced. This is useful if you want to prepare
   programs which use 'http_get'.
   
   In order  to start this  tracer, just type  something like this  at the prompt  of your
   system (assuming that  the tracer will run on the machine  with address 192.168.0.10 on
   your local network):
   
     http_trace 192.168.0.10
   
   All  the  HTTP dialogs  will  be dumped  to  the  console, and  also  to  a file  named
   'http_trace_result.txt'.
   
   In order to tell your browser to pass through the tracer, you must configure it to use
   the tracer as a proxy. For example, with Netscape, use the
   
                        edit > preferences > advanced > proxies
   
   option in the menu. Then choose 'manual proxy configuration' and click the 'view'
   button. You must enter the HTTP proxy address and port (and, if necessary, the same
   information for the secured (SSL) HTTP proxy). The address of the proxy is the IP address
   of the machine the tracer is running on (its address on your LAN, or 127.0.0.1 if the
   tracer and the browser are running on the same machine), and the port number is 8080
   for HTTP, and 8081 for secured HTTP.
   
   However, if 8080 is not convenient (for example, you may already have a proxy listening
   on this port), start the tracer with the command:
   
     http_trace 192.168.0.10 9000
   
   if you want to  use port 9000 instead. If you also want to  change the port for secured
   HTTP, use the command (for example):
   
     http_trace 192.168.0.10 9000 9001
   
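   Note that the module defined at the end of this file reads only the address and the
   first port number; any further argument is ignored.
   
   Of course, this file must have been compiled before you can use 'http_trace'.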
   
   
   --- That's all ! ---
   
   
   
   
   An IP address is a Word32 which may be displayed numerically by the following function:

define String 
  ip_addr_to_string
    (
      Word32 addr
    ) =
  to_decimal(addr&255) + "." +
  to_decimal((addr>>8)&255) + "." +
  to_decimal((addr>>16)&255) + "." +
  to_decimal((addr)>>24).
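
   As an illustration of the byte order used above, here is a minimal Python sketch (it is
   not part of the Anubis sources). It assumes, as 'ip_addr_to_string' does, that the
   lowest byte of the 32-bit value holds the first number of the dotted address.

```python
def ip_addr_to_string(addr: int) -> str:
    """Format a 32-bit address whose lowest byte is the first dotted number."""
    # Extract the four bytes from least to most significant, as the Anubis code does.
    return ".".join(str((addr >> shift) & 255) for shift in (0, 8, 16, 24))


# 192.168.0.10 stored with 192 in the lowest byte and 10 in the highest one.
example = 192 | (168 << 8) | (0 << 16) | (10 << 24)
print(ip_addr_to_string(example))
```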
      
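   'string_to_int' converts an optional decimal string into an integer. It returns 0 if
   the string is missing or cannot be read as a decimal number.

define Int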
  string_to_int
    (
      Maybe(String) mb_s
    ) =
  if mb_s is 
    { 
      failure then 0, 
      success(s) then if decimal_scan(s) is
        {
          failure then 0, 
          success(n) then n
        }
    }. 
   

   
   The next function prints the outgoing traffic to the console and to the trace file. 
   
define One
  print_outgoing
    (
      List(String) l
    ) =
  if l is 
    {
      [ ] then unique,
      [h . t] then print("out: "); print(h); print("\n"); print_outgoing(t)
    }.
  
   
   'trace_outgoing' is the identity function, with the side effect that it prints the
   outgoing traffic.
   
define List(String)
  trace_outgoing
    (
      List(String) l
    ) =
   print_outgoing(l); l. 
   
   
   
   
   
   The following function receives a header line from a connection. Such a line is
   terminated by the ASCII characters 13 (carriage return) and 10 (line feed), as required
   by the HTTP protocol. The function reads character after character until it encounters
   13. Then it reads and discards the line feed. The variable 'so_far' contains, in
   reverse order, the characters which have been read so far. When the end of line is
   found, or if the connection is closed, the characters read are returned in the form of
   a string.
   
define String 
  receive_header_line
    (
      RWStream conn,
      List(Word8) so_far
    ) = 
  if *conn is 
    {
      failure    then // connection has been closed
        implode(reverse(so_far)),
   
      success(c) then 
        if c = 13             // carriage return
        then forget(*conn);   // read and forget the line feed
             implode(reverse(so_far))
        else receive_header_line(conn,[c . so_far])
    }. 
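
   The same reading loop can be sketched in Python (an illustration only, not part of the
   Anubis sources). The sketch assumes a binary file-like object in place of the RWStream,
   and decodes the bytes as Latin-1 so that every byte maps to one character.

```python
import io
from typing import BinaryIO


def receive_header_line(conn: BinaryIO) -> str:
    """Read bytes up to a carriage return, then discard the following line feed."""
    so_far = bytearray()
    while True:
        c = conn.read(1)
        if not c:            # connection closed: return what was read
            break
        if c == b"\r":       # carriage return (13)
            conn.read(1)     # read and forget the line feed (10)
            break
        so_far += c
    return so_far.decode("latin-1")


conn = io.BytesIO(b"GET / HTTP/1.0\r\nHost: example.org\r\n\r\n")
print(receive_header_line(conn))
```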
   
   
   
   The following function reads all the header lines of a request from the connection. As
   required by the HTTP protocol, the complete header is terminated by a blank line.
   Hence, the function reads header lines until it encounters a blank line. The result is
   returned in the form of a list of strings (in the natural order), including the last
   blank line.
   
define List(String)
  receive_header
    ( 
      RWStream conn,
      List(String) so_far
    ) =
  with line = receive_header_line(conn,[]),
  if line = ""
  then reverse(["" . so_far])
  else receive_header(conn,[line . so_far]). 
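
   As a rough Python equivalent (an illustration only), the header can be collected with
   the standard 'readline' method instead of reading byte by byte. The sketch assumes a
   binary file-like object, and keeps the final blank line in the result as the Anubis
   function does.

```python
import io
from typing import BinaryIO, List


def receive_header(conn: BinaryIO) -> List[str]:
    """Collect header lines up to and including the terminating blank line."""
    lines: List[str] = []
    while True:
        # readline() stops after the line feed; strip the trailing CR LF pair.
        line = conn.readline().decode("latin-1").rstrip("\r\n")
        lines.append(line)
        if line == "":       # blank line, or connection closed
            return lines


request = io.BytesIO(b"GET / HTTP/1.0\r\nHost: example.org\r\n\r\n")
print(receive_header(request))
```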
   
      

   
   Given a complete HTTP request header (in the form of a list of strings, one per line),
   the function 'find_destination' below gets the destination server of the request. The
   method is to find a header line of the form:
   
   Host: name
   
   Then 'name' is the destination of the request. The function returns 'failure' if it
   cannot find such a line. The comparison ignores case, which is what the 'tolower'
   helper functions defined first are for.

define Word8
  tolower
    (
      Word8 c
    ) =
  if ('A' +=< c & c +=< 'Z')
  then c - 'A' + 'a'
  else c. 
   
define String
  tolower
    (
      String s,
      Int n,
      List(Word8) so_far
    ) =
  if nth(n,s) is 
    {
      failure then implode(reverse(so_far)),
      success(c) then tolower(s,n+1,[tolower(c) . so_far])
    }.
   
define String 
  tolower
    (
      String s
    ) =
  tolower(s,0,[]). 
   
define Maybe(String)
  find_destination
    (
      List(String) header
    ) =
  if header is 
    {
      [ ] then failure,
      [h . t] then 
        if sub_string(h,0,6) is
          { 
            failure    then find_destination(t),
            success(s) then if tolower(s) = "host: "
                            then sub_string(h,6,length(h)-6)
                            else find_destination(t)
          }
    }. 
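
   The lookup can be sketched in Python as follows (an illustration only, with 'None'
   standing in for 'failure'). As in the Anubis code, only the first six characters of
   each line are compared, ignoring case.

```python
from typing import List, Optional


def find_destination(header: List[str]) -> Optional[str]:
    """Return the value of the first 'Host: ' line, or None if there is none."""
    for line in header:
        # Compare the first six characters case-insensitively.
        if line[:6].lower() == "host: ":
            return line[6:]
    return None


print(find_destination(["GET / HTTP/1.0", "HOST: www.example.org", ""]))
```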
   

   
   

   In order to read the body of the request, we first need to know its exact size in
   bytes. Indeed, according to the HTTP protocol, the end of the body of a message is
   either known from its size or corresponds to the closing of the connection (actually,
   there are two other methods). However, in the case of a request, the connection is
   never closed by the client, because the client is waiting for the answer on the same
   connection. The method is to find a header line of the form:
   
   Content-length: n
   
   Then 'n' is the size of the body. If this header line cannot be found, we assume that
   there is no body at all, and return 0. This is the right interpretation for 'GET'
   requests. As for 'Host', the comparison ignores case.
   
define Int
  find_body_size
    (
      List(String) header
    ) =
  if header is 
    {
      [ ] then 0,         // Content-length not found. 
      [h . t] then
        if sub_string(h,0,16) is
          {
            failure     then find_body_size(t),
            success(s)  then if tolower(s) = "content-length: "
                             then with size = 
                                  string_to_int(sub_string(h,16,length(h)-16)), 
                                    print("--- body size: "+abs_to_decimal(size)+"\n"); size
                             else find_body_size(t)
          }
    }.
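
   A minimal Python sketch of the same search (an illustration only). Like the Anubis
   code, it returns 0 both when the line is absent and when its value is not a number.

```python
from typing import List


def find_body_size(header: List[str]) -> int:
    """Return the Content-length value, or 0 when the line is absent or invalid."""
    prefix = "content-length: "
    for line in header:
        if line[:len(prefix)].lower() == prefix:
            try:
                return int(line[len(prefix):])
            except ValueError:
                return 0     # value present but not a decimal number
    return 0                 # Content-length not found


print(find_body_size(["POST /form HTTP/1.0", "Content-Length: 42", ""]))
```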
   
   
   
   The next function reads the body of the request from the connection. The size of the
   body, which is known, is used as a counter, and decremented at each character
   read. Characters read are stored in a list, which is transformed into a string when all
   characters are read.
   
define String
  receive_request_body
    (
      RWStream conn, 
      Int size,
      List(Word8) so_far
    ) =
  if size = 0
  then implode(reverse(so_far))
  else if *conn is 
    {
      failure        then implode(reverse(so_far)),
      success(c)     then 
        receive_request_body(conn,size-1,[c . so_far])
    }.
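
   The countdown can be sketched in Python like this (an illustration only, assuming a
   binary file-like object in place of the RWStream). Reading stops either when the
   counter reaches zero or when the connection is closed.

```python
import io
from typing import BinaryIO


def receive_request_body(conn: BinaryIO, size: int) -> bytes:
    """Read up to `size` bytes, stopping early if the connection is closed."""
    so_far = bytearray()
    while size > 0:
        c = conn.read(1)
        if not c:        # connection closed before the whole body arrived
            break
        so_far += c
        size -= 1        # the size doubles as the countdown counter
    return bytes(so_far)


print(receive_request_body(io.BytesIO(b"a=1&b=2"), 7))
```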
   
   
   

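   The next function opens a connection with a server, which is known by its name. The
   name is resolved through DNS and the connection is made on port 80. The result is
   'success(conn)', or 'failure' after an error message has been printed.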
   
define Maybe(RWStream)
  open_connection
    (
      String server_name
    ) =
  if dns(server_name) is 
    {
      host_not_found          then print("---- Host '" + server_name + "' not found.\n");       failure, 
      no_address_found        then print("---- No address found for '" + server_name + "'.\n"); failure, 
      try_again               then print("---- DNS server is busy.\n");                         failure, 
      non_recoverable_error   then print("---- Non recoverable DNS error.\n");                  failure, 
      ok(addr)                then 
        if (Result(NetworkConnectError,RWStream))connect(addr,80) is 
          {
            error(_)     then  print("--- Cannot connect to '" + ip_addr_to_string(addr) + ":80'.\n");   
                               failure,
            ok(conn)     then success(conn)
          }
    }. 
   
   


   Sending a string on a connection.
   
define One 
  send
    (
      RWStream conn,
      String s,
      Int n
    ) =
  if nth(n,s) is 
    {
      failure then unique, 
      success(c) then 
        if conn <- c is 
          {
            failure      then print("---- Cannot send to server.\n"), 
            success(_)   then send(conn,s,n+1)
          }
    }.
   
define One 
  send    
    (
      RWStream conn, 
      String s
    ) = 
  send(conn,s,0). 
   
   
   
   
   Forwarding the header (a list of strings) amounts to sending each string in the list,
   followed by CR LF. 
   
define One 
  forward_header
    (
      RWStream conn,
      List(String) l
    ) =
  if l is 
    {
      [ ] then unique, 
      [h . t] then 
        send(conn,h+implode([13,10])); forward_header(conn,t)
    }. 
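
   In Python, the same forwarding could look like the sketch below (an illustration only,
   writing to a binary file-like object). Because the header list ends with an empty
   string, the last write produces the blank line that terminates the header.

```python
import io
from typing import BinaryIO, List


def forward_header(conn: BinaryIO, header: List[str]) -> None:
    """Write each header line followed by CR LF."""
    for line in header:
        conn.write(line.encode("latin-1") + b"\r\n")


out = io.BytesIO()
forward_header(out, ["GET / HTTP/1.0", "Host: example.org", ""])
print(out.getvalue())
```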
   
   
   
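   Getting the answer from the server and forwarding it to the client. The first two
   definitions below ('get_line' and a line-by-line 'get_answer') begin with a space, so
   they do not start in the first column and appear to be disabled. The 'get_answer' in
   use is the last one, which forwards the answer byte by byte.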
   
 define Maybe(String)
  get_line
    (
      RWStream sconn,
      List(Word8) so_far
    ) =
  if *sconn is 
    {
      failure then 
        if so_far is 
          {
            [ ]     then failure, 
            [_._]   then success(implode(reverse(so_far)))
          },
   
      success(c) then 
        if c = 10
        then success(implode(reverse([c . so_far])))
        else get_line(sconn,[c . so_far])
    }. 
   
   
 define One
  get_answer
    (
      RWStream server_conn, 
      RWStream client_conn
    ) =
  if get_line(server_conn,[]) is 
    {
      failure then unique, 
      success(line) then 
    //    print(" in: " + line); 
        send(client_conn,line);
        get_answer(server_conn,client_conn)
    }. 

   
define One
  get_answer
    (
      RWStream server_conn, 
      RWStream client_conn
    ) =
  if *server_conn is 
    {
      failure then unique, 
      success(c) then if client_conn <- c is 
        {
          failure then print("Cannot forward incoming data.\n"), 
          success(_) then get_answer(server_conn,client_conn)
        }
    }.
   
   
   The request needs to be forwarded to its actual destination. The function
   'forward_request' knows the destination, the complete header, and the body of the
   request. Its job is just to open a connection with the destination server and to write
   the request to the connection. Then it waits for the answer and forwards it to the
   client as it arrives.
   
define One
  forward_request
    (
      RWStream client_conn, 
      String dest,
      List(String) header,
      String body
    ) =
  if open_connection(dest) is 
    {
      failure then unique,    // message already sent
      success(conn) then 
        forward_header(conn,header); 
        send(conn,body);
        get_answer(conn,client_conn)
    }. 
   

   
   
   
   Now, here is the request handler of our proxy server. 
   
define One
  handler
    (
      RWStream conn
    ) =
  with header = trace_outgoing(receive_header(conn,[])), 
  if find_destination(header) is 
    {
      failure           then print("Cannot find destination.\n"),
      success(dest)     then 
        //print("---- destination is: " + dest + "\n");
        with bsize  = find_body_size(header), 
             body   = receive_request_body(conn,bsize,[]), 
          print_outgoing([body]); 
          forward_request(conn,dest,header,body)
    }. 
   
   
define One env = unique. 
   
   
define One notify (One _) = print("Notify !\n").    
   
   
   
   Starting the server. The address is first converted to a Word32. 
   
define One
  http_trace
    (
      String url,
      Word32 port
    ) =
  if dns(url) is ok(addr) 
  then    
    if start_server(addr,port,(Server _) |-> handler,notify) is 
      {
        cannot_create_the_socket  then print("Cannot create the listening socket.\n"),
        cannot_bind_to_port       then print("Cannot bind to port: " + to_decimal(port) + ".\n"),  
        cannot_listen_on_port     then print("Cannot listen on port: " + to_decimal(port) + ".\n"),  
        ok(server)                then 
          print("Tracing proxy server started on " + ip_addr_to_string(addr) + 
             ":" + to_decimal(port) + " ...\n");
          checking every 10000 milliseconds, wait for is_down(server) then print("Server is down.\n")
      }
  else 
    print("Cannot get numerical IP address for " + url + ".\n"). 
   
   
   
   Recalling the invocation syntax for 'http_trace': 
  
define One 
  recall_syntax =
    print("Usage: http_trace ip_address [port]\n" +
          "   where ip_address may be either a name or numerical\n" + 
          "   and where 0 =< port =< 65535\n" +
          "   Default value for port is 8080.\n"). 

   
   
   Now, here is our module 'http_trace'. It needs at least one argument: the IP
   address. Then it tests the second argument, if it exists. It must be a non-negative
   number less than or equal to 65535. If all the conditions are satisfied, the previous
   'http_trace' function is called.
   
global define One
  http_trace
    (
      List(String) args
    ) =
  if args is
    {
      [ ]        then recall_syntax,
      [addr . t]    then if t is 
        {
          [ ] then http_trace(addr,8080), 
          [port . _] then if decimal_scan(port) is 
            {
              failure      then recall_syntax,
              success(n)   then 
                if n < 0 
                then recall_syntax
                else if n > 65535
                     then recall_syntax
                     else http_trace(addr, truncate_to_Word32(n))
            }
        }
    }.
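
   The argument checks above can be summarised by the following Python sketch (an
   illustration only; 'parse_args' is a hypothetical name, and 'None' stands for the case
   where the usage message is printed).

```python
from typing import List, Optional, Tuple


def parse_args(args: List[str]) -> Optional[Tuple[str, int]]:
    """Return (address, port), or None when the usage message should be shown."""
    if not args:
        return None
    addr = args[0]
    if len(args) == 1:
        return addr, 8080            # default port
    try:
        port = int(args[1])          # arguments after the port are ignored
    except ValueError:
        return None
    if port < 0 or port > 65535:
        return None
    return addr, port


print(parse_args(["192.168.0.10", "9000"]))
```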