Description of the Anubis Compiler
This file contains a description of the Anubis compiler together with a discussion on
how the next version of the compiler should be designed.
-------------------------------- Table of Contents ------------------------------------
*** (1) Overall description.
*** (1.1) Modules of the compiler.
*** (1.2) How it works.
*** (1.2.1) Phase 1.
*** (1.2.2) Phase 2.
*** (1.3) Garbage collection.
*** (2) Tools.
*** (2.1) Lisp-like representation.
*** (2.2) Exceptions to the Lisp-like 'Expr' representation.
*** (2.3) Catalog of all internal data representations.
*** (2.3.1) Internal representation of types.
*** (2.3.2) Internal representation of terms.
*** (2.3.3) Internal representation of interpretations.
*** (2.3.4) Internal representation of type implementations.
*** (2.3.5) Internal representation of virtual machine instructions.
*** (2.3.5.1) Pseudo instructions.
*** (2.3.5.2) Computing instructions.
*** (2.3.5.3) Garbage collector instructions.
*** (2.3.5.4) Closure instructions.
*** (2.3.5.5) Equality instructions.
*** (2.3.5.6) Dynamic variables instructions.
*** (2.3.5.7) Serialising/unserialising.
*** (2.3.5.8) Starting a new virtual machine.
*** (2.3.5.9) Obsolete and soon obsolete instructions.
*** (2.4) Unification.
*** (2.5) Organization of the knowledge base.
*** (2.6) Indexation of symbols.
*** (3) The lexer.
*** (3.1) Generalities.
*** (3.2) Finding source files.
*** (3.3) The 'read' stack.
*** (3.4) Remembering already read files.
*** (3.5) Computing visibilities.
*** (4) The parser.
*** (5) Computing interpretations of terms.
*** (5.1) Overview.
*** (5.2) Interpreting applicative terms.
*** (5.3) Interpreting conditionals.
*** (5.4) Interpreting selective conditionals.
*** (6) Implementation of types.
*** (6.1) Implementation of primitive types.
*** (6.1.1) 'String'.
*** (6.1.2) 'ByteArray'.
*** (6.1.3) 'Int32'.
*** (6.1.4) 'Float'.
*** (6.1.5) 'Listener'.
*** (6.2) Implementation of defined types.
*** (6.3) Implementation of functional types.
*** (6.4) Implementation of address types.
*** (6.4.1) File and network address types.
*** (6.4.2) Dynamic variable address types.
*** (6.5) Implementation of C structure pointer types.
*** (7) Code generation.
*** (7.1) General conventions.
*** (7.2) Elimination of terminal recursion.
*** (7.3) Generating the code.
*** (7.3.1) Protect.
*** (7.3.2) Lock.
*** (7.3.3) Global symbols.
*** (7.3.4) Strings, Int32, small data and Floats.
*** (7.3.5) Local symbols.
*** (7.3.6) Microlocal symbols.
*** (7.3.7) Closures.
*** (7.3.8) Applicative terms.
*** (7.3.9) Conditionals.
*** (7.3.10) Selective conditionals.
*** (7.3.11) Local definitions ('with ...').
*** (7.3.12) Delegate.
*** (7.3.13) Serialize/Unserialize.
*** (7.4) Peephole optimisation.
*** (8) Garbage collector code.
*** (9) Equality code.
*** (10) Serialisation/unserialisation.
*** (10.1) How it works.
*** (10.2) Descriptions of small types.
*** (10.3) Pseudo-instructions set.
*** (10.4) Serialising.
*** (10.5) Unserialising.
*** (11) Outputting the code (making an *.adm file).
*** (12) How to make a system of macros?
*** (12.1) Overview.
*** (12.2) The trick.
*** (12.3) How it works.
*** (12.4) An example.
*** (13) Proposals for a dynamically defragmenting memory management system.
*** (13.1) How it works.
*** (13.2) Moving an allocated segment within memory.
*** (13.3) Compatibility with segregated lists.
*** (13.4) Defragmenting memory.
*** (13.5) This allocator seen from the outside.
*** (13.5.1) Hidden words.
*** (13.5.2) Tools (interface to the allocator).
*** (13.6) Permanent data.
*** (13.7) Fake references.
*** (13.8) Only one main segment?
*** (14) Static garbage-collection.
*** (14.1) The facts.
*** (14.2) Moving 'copy' and 'delete' instructions.
*** (14.3) Moving 'copy' instructions downwards.
*** (14.4) Should functions eat their arguments?
*** (14.5) Moving 'delete' instructions upwards.
*** (15) Voluntary multitasking.
---------------------------------------------------------------------------------------
*** (1) Overall description.
*** (1.1) Modules of the compiler.
The Anubis compiler is made of the following main parts:
- the lexer (lexer.y)
- the parser (grammar.y)
- a module for analysing type definitions (typedef.c)
- a module for analysing data definitions (opdef.c)
- the interpretations module (this is not an interpreter) (interp.c)
- the compiler proper (generation/optimization of symbolic code) (compile.c)
- the module for computing the implementations of data types (implem.c)
- the output module (making the '*.adm' files) (output.cpp)
It also contains secondary modules. Some of them are always used:
- a module for deciding if a type is recursive (rectype.c)
- tools for working on data types (unify.c, typetools.c, typewidth.c, typecmp.c, unknowns.c)
- computing the implicit destructors (destruct.c)
- computing the deletion code (delcode.c)
- computing the equality code (eqcode.c)
- a module containing various tools for manipulating expressions (expr.cpp)
- a system for managing the memory of the compiler itself (mallocz.c)
- a set of messages and functions for printing messages (msgtexts.c, show.c)
- a set of predefined paragraphs (predef.c, predef.anubis, predefined.anubis)
- tools for computing the size and binary format of instructions (vminstr.c)
Some others are used only on demand (options of the compiler):
- making an HTML index of all public symbols (index.c)
- outputting a C version of some Anubis types together with the constructors (dumpct.c)
- making a 'polish' translation (polish.c)
- producing a symbolic code dump (*.sc) (symcode.c)
*** (1.2) How it works.
The compiler works in two phases:
- phase 1: reading source files, checking paragraphs, computing the interpretations.
- phase 2: generating a module for each 'global' paragraph.
*** (1.2.1) Phase 1.
During phase 1, the compiler reads source files and does the following for each
paragraph:
1. reading the paragraph (using the parser and the lexer). This may produce a
syntax error.
2. depending on the kind of paragraph:
- 'read' paragraph: checking whether the file referred to has already been read. The
file is read only if it has not already been read. In either case, visibility is set
up correctly. See the section on the lexer below.
- 'replaced by' paragraph: this is the same as 'read' paragraphs, except that
visibility is computed differently. See the section on the lexer below.
- 'type' paragraph: the type definition is checked. See the section 'Checking a
type definition' below.
- 'define' paragraph: the definition is checked. See the section 'Checking a datum
definition' below.
- 'C constructor' paragraph: an Anubis type is translated into code usable from
within a C program (for all these paragraphs, the output is written to the files
'construc.h.out' and 'construc.c.out').
At most one error is reported per paragraph. If an error is found, the analysis
restarts at the next paragraph.
*** (1.2.2) Phase 2.
Phase 2 is performed only if there is no error after phase 1. Phase 2 performs the
following tasks:
- check that all type definitions are complete. Since a type may be defined using
several paragraphs, it is possible to define incomplete types (not all alternatives
have been defined).
- determine which types are infinite (recursive).
- dump C versions of types (if required).
- make the modules (*.adm).
- dump implementations of types (if required).
- print statistics (if required).
- print possible errors after phase 2.
*** (1.3) Garbage collection.
Data are implemented in two different ways:
- directly
- indirectly
'Directly' means that what we have in hand is the datum itself, and 'indirectly' means
that what we have in hand is a pointer to a segment of memory containing the datum.
Data may be 'created', 'duplicated' and 'destroyed'. When we duplicate (or 'copy') a
direct datum, we have no choice: we just copy all the bits of the datum. When we
duplicate an indirect datum, we have the choice between:
- allocating a new segment of memory, and copying the content of the original segment
into the new segment (this is called 'real' copy),
- just duplicating the pointer (this is called 'virtual' copy).
Notice that these methods are semantically equivalent for abstract data. For concrete
data (for example: dynamic variables, connections, etc.), indirect implementation and
virtual copy are required.
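The two copy methods can be sketched as follows (a minimal illustration, not the
actual Anubis VM code; 'U32' is the VM's word type mentioned later, and the function
names are ours). Reference counting, introduced below, is ignored here:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef unsigned int U32;   /* the VM's 32-bit word type */

/* 'Real' copy: allocate a fresh segment and copy all the words of the
   original segment into it. The result shares nothing with the original. */
U32 *real_copy(const U32 *seg, size_t nwords) {
    U32 *fresh = malloc(nwords * sizeof(U32));
    memcpy(fresh, seg, nwords * sizeof(U32));
    return fresh;
}

/* 'Virtual' copy: just duplicate the pointer; the segment is now shared. */
const U32 *virtual_copy(const U32 *seg) {
    return seg;
}
```

Both yield a semantically identical datum for abstract data; only the virtual copy
creates sharing.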
Of course, virtual copy is faster, but it creates 'shared data'. When a (necessarily
indirect) datum is shared, we have several pointers pointing to the same segment of
memory. Then, the problem is just to know when the segment is no longer used, so that we
can return it to the pool of memory. This problem is easily solved by 'reference
counting', which works as follows.
The first 32 bits word of any allocated segment is reserved for reference counting. It
contains an unsigned integer, called the 'counter' (for this segment). This counter is
either:
- zero, in which case, the segment is 'permanent' and will never be freed,
- non zero, in which case its value is the number of pointers pointing to this
segment.
Permanent segments are created in general (always ?) at compile time. For example, if
you write a character string into your source text, the compiler creates a permanent
segment for holding this string (this segment is within the program code itself). Of
course the counter of a permanent datum is never changed and remains always 0.
Non permanent data are created dynamically at run time. The function
'AllocateDataSegment' (see the virtual machine source code) receives the number of 32
bits words to allocate, and returns the pointer to the segment (actually cast to
'U32'). The first 32 bits word of the segment (the counter) is already initialized to
1.
When we need to make a virtual copy of a (non permanent) indirect datum, we just have
to increment the counter of this segment (so that it always represents the number of
(significant) pointers to the segment). Notice that this will never overflow, because
an overflow would mean a number of pointers bigger than the total memory can
hold.
When we want to delete a (non permanent) indirect datum, we perform a 'virtual
deletion'. This means that we decrement the counter. If the counter becomes 0, the
segment may be freed.
But freeing the segment is not as simple as this, because the segment itself may
contain pointers which are counted within other segments. Freeing the segment means
destroying these pointers, so that we must also decrement the corresponding
counters. We see that this process is recursive, because destroying a pointer may lead
to decrement a counter to zero, and to free another segment, and so on ...
Hence, the process of freeing the segments must be coded as a (recursive) program. This
is what we call 'garbage collector code'. This code is generated by the compiler and
depends only on the definition of types. Actually, there is one garbage collector
function for each defined type which may contain indirect data (i.e. 'mixed' and
'large' types). Some other types (like functional types) require special garbage
collector functions.
*** (2) Tools.
*** (2.1) Lisp-like representation.
The Anubis compiler uses a particular style of programming inspired by Lisp. The
advantage is that this system is very flexible. The disadvantage is that, not being
typed, it is not very safe. I think I will never do that again. It works as follows.
The type 'Expr' represents all symbolic 'expressions' to be manipulated by the
compiler. This includes tokens generated by the lexer, syntactical trees generated by
the parser, interpretations generated by the interpretations module, contexts,
environments, symbolic code, ...
Actually 'Expr' is just a clone of U32, the type of unsigned 32 bits integers. These
integers are 'dynamically' typed (using the low bits) as follows:
0 mod 2 ---------------------------------> (signed) integers
1 mod 2 ---> 1 mod 32 ------------------> tags
3 mod 32 ------------------> temporary pairs
5 mod 32 ------------------> strings
7 mod 32 ---> 7 mod 64 ---> user defined type variables
39 mod 64 ---> internal type variables (unknowns)
9 mod 32 ------------------> permanent pairs
In other words:
xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxx0 integer
xxxxxxxx xxxxxxxx xxxxxxxx xxx00001 tags
xxxxxxxx xxxxxxxx xxxxxxxx xxx00011 temporary pairs
xxxxxxxx xxxxxxxx xxxxxxxx xxx00101 strings
xxxxxxxx xxxxxxxx xxxxxxxx xx000111 user defined type variables (like $T)
xxxxxxxx xxxxxxxx xxxxxxxx xx100111 unknowns
xxxxxxxx xxxxxxxx xxxxxxxx xxx01001 permanent pairs
Notice, for example, that pairs (both temporary and permanent) use their 5 lowest bits
for the tag. Hence, the maximal number of temporary pairs is 2^27, and the same for
permanent pairs. This makes at most 134217728 pairs of each sort. For temporary pairs, this is
not a problem, but for permanent pairs, this may be a problem. Nevertheless, since a
pair occupies 8 bytes, an array of permanent pairs containing 134217728 pairs would
contain 2^30 bytes, i.e. 1 gigabyte. So, with a machine of less than 1 gigabyte of
memory, no problem will happen (except that very big programs may not be compiled).
A good program should be protected against such problems independently of the size of
available memory. Examining the code I see that this is not the case for Anubis
1. Indeed, the function 'pcons' (defined in 'expr.cpp'), which is used for making a new
permanent pair, has no protection against an enlargement of the array of permanent
pairs above the limit. This should be fixed as soon as possible. (Similarly, 'cons' for
temporary pairs is not well protected).
All 'Expr's except temporary and permanent pairs are called 'atoms'. Pairs, either
temporary or permanent, are just made of two 'Expr's, and represented (in the comments
of the source of the compiler), as follows:
(a . b)
where 'a' and 'b' are two 'Expr's respectively called the 'head' and 'tail' of the
pair. The macro 'cons' enables the construction of such pairs.
As in Lisp, lists are represented using pairs and the tag 'nil' (the empty list) as
follows (for a list made of 4 'Expr's):
(a b c d)
which is actually a shortcut for:
(a . (b . (c . (d . nil))))
Such lists may be easily constructed using the macros 'list1', 'list2', ... defined in
'compil.h'.
Now, an 'Expr' like (a . (b . (c . d))) is written '(a b c . d)', and called a
'multiple cons'. A set of macros, called 'mcons3', 'mcons4', ... enable a direct
construction of such data.
Examples of use and meaning of these macros (and some others):
The C term represents
---------------------------------------------------------------
nil ( )
cons(a,b) (a . b)
mcons3(a,b,c) (a b . c)
mcons4(a,b,c,d) (a b c . d)
etc...
list1(a) (a) same as (a . nil)
list2(a,b) (a b) same as (a b . nil)
etc...
If x = (a b c d e):
car(x) a
second(x) b
third(x) c
etc...
cdr(x) (b c d e)
cdr2(x) (c d e)
cdr3(x) (d e)
etc...
Notice that 'mcons2' does not exist because it is the same as 'cons'.
'consp(x)' is 1 if and only if x is a pair (i.e. not an atom). Otherwise, it is 0. This
predicate ('consp' is a traditional Lisp abbreviation for 'cons predicate') is very
often used as follows:
while(consp(x))
{
    Expr head = car(x);
    ... do something with the head of x ...
    x = cdr(x); // continue with the tail of x
}
// in most cases, when we arrive here, we have x == nil (the empty list).
Normally, pairs are created as temporary pairs, and made permanent only if
needed. The point is that, while working on a paragraph, the compiler constructs
data which will not be part of the final result. When the analysis of a paragraph is
finished, only the data to be kept are made permanent (using the function 'save'), and
the array of temporary pairs is reset to zero for the next paragraph. This is how
memory is managed in the compiler (this has nothing to do with garbage collection in a
virtual machine).
Integers are coded on 31 bits (bits 1 to 31), and the bit number 0, whose value is 0,
means that this 'Expr' is an integer. The file 'compil.h' contains all the macros
needed to test the type of an 'Expr', or to construct 'Expr's.
Tags are used for representing simple atomic concepts. For example, the tag 'nil'
represents the empty list. They are also used for 'tagging' non atomic data. For
example, when the function
(Int32 i, String s) |-> a
is read from a source file, the parser constructs the following 'Expr':
(lambda <lc> (([Int32] . i) ([String] . s)) . [a])
where '[Int32]' is a representation of type 'Int32' (similarly for 'String'), and
'[a]' a representation of the term 'a'.
Here, the tag 'lambda' represents the arrow '|->'.
*** (2.2) Exceptions to the Lisp-like 'Expr' representation.
Actually, this system has exceptions. Normally, all tokens returned by the lexer have a
value of type 'Expr', i.e. a datum containing its own (dynamic) type. Examining such a
value it is possible to know if we have an integer, a string, a pair, etc...
However, the two tokens 'yy__integer' and 'yy__char' have a value of type 'integer'.
This means that when the lexer reads a number (regular expression: [0-9]+), it returns
the token 'yy__integer', with a value to be interpreted as a C integer, not as a
Lisp-like integer. We must be careful because for example the integer 6 is represented
like this in the two systems:
...0110 (C integer: this is not an 'Expr')
...01100 (Lisp-like integer: this is an 'Expr')
^
|
+--------- this bit (0) means that this 'Expr' is an integer
The reason for these exceptions is that we want the compiler to be able to read 32 bits
integers, not only 31 bits integers from source files.
The parser contains the following grammar rule:
Term: yy__integer %prec prec_symbol { $$ = mcons3(integer,linecol(),$1); }
This means that when an integer has been read by the lexer, the parser constructs the
following multiple cons as a representation of this integer:
(integer <lc> . <n>)
where <n> is the value of the integer as a C integer, not as a Lisp-like integer.
As a consequence, we must always remember that a list whose head is the tag 'integer'
always has the form: (integer <lc> . <n>), where <n> is not an 'Expr'. It would be
meaningless to apply a predicate like 'is_integer' to <n>, and the result would be
unpredictable.
The exceptions are the following (where only <Cint> is not an 'Expr'):
(integer <lc> . <Cint>)
(small_datum <type> . <Cint>)
(anb_int32 <lc> . <Cint>)
(load_word32 . <Cint>)
(I hope there is no other exception, but I'm not absolutely sure).
Of course, the existence of these exceptions is very bad. A better solution would be
to represent (say) the integer 746542 as the list (7 4 6 5 4 2), where the elements of
the list would be Lisp-like integers. In the compilation step, the compiler may
transform this representation into a packed integer of 32, 64 bits,... maybe with a
warning in the case of an overflow. Actually, this is what happens already at a smaller
scale. Indeed, the current version of the compiler finds 3 interpretations for the
integer 180 (of types 'Int8', 'Int32' and 'Float'), but only 2 for 4567, because this
number is greater than 255.
Added 2007/01/16: The tag 'integer' is now obsolete. There are two new tags:
integer_10
integer_16
They are used as follows:
(integer_10 <lc> bigit ... bigit)
(integer_16 <lc> bigit ... bigit)
where in both cases the bigits are Lisp-like integers which are bigits in base
256. The reason why there are two tags is just for knowing how to print the number in
an error message.
*** (2.3) Catalog of all internal data representations.
The compiler represents the following data (internally):
- 'types' (as produced by the parser)
- 'terms' (as produced by the parser)
- 'interpretations' (as produced by 'term_interpretations' and related functions)
- 'implementations' (as produced by 'type_implementation' and related functions)
- 'instructions' (as produced by 'compile_term' and related functions)
Note: All data are represented either by Lisp-like atoms (string, tags, ...), or by
lists. In the case of a list, the head of list is always a tag.
*** (2.3.1) Internal representation of types.
<type> :=
yy__Symbol just a type name in the form of a Lisp-like string
For example, the types 'One', 'Bool' etc ... are in this category. The file
'compil.h' contains the definitions of a lot of 'predefined strings'. Within the C
source, such symbols may be represented by the C symbols 'pdstr_One', 'pdstr_Bool'
etc... ('pdstr_' for 'predefined strings').
Equivalently, 'pdstr_Bool' may be replaced by 'new_string("Bool")'. This has exactly
the same effect, except that 'pdstr_Bool' is faster at execution (of the compiler).
Remark: 'new_string' (defined in 'expr.cpp') does not allocate a new segment for the
string if the string already exists. In this case, it returns the already existing
Lisp-like string.
All types (without parameters) whose names are created by the user are also
represented like this.
yy__utvar user type variable (type parameter)
Type parameters have a special Lisp-like representation as atoms (see above).
type_String a Lisp-like tag representing the type 'String'
type_ByteArray idem
type_Int32 idem
type_Float idem
type_Listener idem
The above being more 'primitive', they are represented by Lisp-like tags, instead of
Lisp-like strings.
(type_RAddr . <type>) represents 'RAddr(T)'; type_RAddr is a Lisp-like tag
(type_WAddr . <type>) idem
(type_RWAddr . <type>) idem
(type_GAddr . <type>) idem (obsolete: global variable)
(type_Var . <type>) idem (dynamic variable)
(type_MVar . <type>) idem (multiple dynamic variable)
The above are called 'address types'.
(app_ts <name> . <types>) type scheme applied to type operands
'app_ts' is for 'apply type scheme'. For example, 'List(String)' is represented as:
(app_ts new_string("List") type_String)
and 'Result(One,String)' as:
(app_ts new_string("Result") pdstr_One type_String)
Notice that there is no dot in the above Exprs. '(app_ts new_string("List")
. type_String)' is erroneous, because the second tail ('cdr2') of the list must be a
list of types, not a single type.
(type_struct_ptr . <id>) encapsulated C structure type
<id> is a Lisp-like integer identifying the structure.
(functype <types> . <type>) functional type
<types> is the list of the representations of the arguments types, and <type> is the
representation of the target type.
There is still another sort of 'types' represented internally: the 'type
unknowns'. They are also a special sort of Lisp-like Exprs. They are not generated by
the parser, but in general by the interpretation functions. When a symbol is found,
representing a scheme, all the type variables in this scheme are replaced by fresh
unknowns. Of course, there is a generator of fresh unknowns: 'fresh_unknown'.
From this we may conclude that defined types are those which are represented either by
a string or by (app_ts ...).
*** (2.3.2) Internal representation of terms.
Note: <lc> (generated by the lexer) is a packed version of the triplet
(filename,line,column). There are macros for extracting the components:
file_in(<lc>) a C char *
line_in(<lc>) a C int
col_in(<lc>) a C int
As usual, <Cint> represents a C integer (i.e. not a Lisp-like integer).
<term> :=
(alert <lc> . <filename>) <filename> is a Lisp-like string
(protect <lc> . <term>)
(lock <lc> <term> . <term>)
(debug_avm <lc> . <term>)
(avm <lc> . AVM) (AVM is explained below)
(terminal <lc> . <term>)
(symbol <lc> . <Lisp-like string>)
(integer <lc> . <Cint>)
(fpnum <lc> <mantissa> . <exponent>) (<mantissa> and <exponent> are Lisp-like integers)
(of_type <lc> <type> . <term>) (explicit typing)
(string <lc> . <string>) (<string> is a Lisp-like string)
(with <lc> <symbol> <value> . <term>)
(delegate <lc> <term> . <term>)
(lambda <lc> <FArgs> . <term>) (|->)
(rec_lambda <lc> <name> <FArgs> . <term>) (|-f->)
(app <lc> <term> . <terms>) (applicative term)
(cond <lc> <term> . <cases>) (conditional)
(select_cond <lc> <term> <case> . <term>) (selective conditional)
(serialize <lc> . <term>)
(unserialize <lc> . <term>)
The following are still used, but should become obsolete (and replaced by system calls).
(anb_read ...)
(anb_write ...)
(anb_exchange ...)
(connect_to_file ...)
(connect_to_IP ...)
(wait_for ...)
*** (2.3.3) Internal representation of interpretations.
The function 'term_interpretations' (in 'interp.c') computes the list of all
interpretations of a term (in a given appropriate context). An interpretation is a
pair:
(head . env)
where 'head' is called an 'interpretation head', and where env is an 'environment'.
Here, we describe interpretation heads.
Unfortunately, I've used the same tags as for terms. It would be safer to use different
tags. Furthermore, tags for some sort of expressions should have a uniform prefix, for
example:
typ_ for types
ter_ for terms
int_ for interpretations
imp_ for implementations
ins_ for instructions
The tags should also be numbered. This will limit the risk of forgetting a tag in a
switch. Hence for example, we should have the following tags for terms:
ter_00_alert
ter_01_protect
ter_02_lock
etc...
and a dummy one for marking the end of the list:
ter_20_dummy
Interpretations heads are the following:
(protect <lc> . <head>)
(lock <lc> <term> . <term>)
(avm <lc> . <instructions>)
(debug_avm <lc> . <head>)
(terminal <lc> . <head>)
(operation <lc> <opid> <name> <parms> <type> . <types>)
(string <lc> . <string>)
(anb_int32 <lc> . <Cint>)
(small_datum <type> . <Cint>)
(fpnum <lc> <mantissa> . <exponent>)
(local <name> <i> . <type>)
(micro_local <name> <d> <i> . <type>)
(closure <lc> (f_micro_ctxt <fname> <ftype> (<sym> . <type>) ...) <args> . <body>)
(app <lc> <head> . <heads>)
(cond <lc> <head> ((<name> (<sym> . <type>) ...) <lc> . <head>) ...)
(select_cond_interp <lc> <test head> <index> <clause head> <head then> . <head else>)
(with <lc> <symbol> <head> . <head>)
(delegate <lc> <head> . <head>)
(serialize <lc> . <head>)
(unserialize <lc> <type> . <head>)
The following are still used but should become obsolete:
(anb_read ...)
(anb_write ...)
(anb_exchange ...)
(wait_for ...)
*** (2.3.4) Internal representation of type implementations.
Type implementations are computed in 'implem.c'. We never compute an implementation for
a scheme, but only for an actual type (i.e. with no type parameter). For example,
'Maybe(Int8)' has an implementation different from 'Maybe(String)', but 'Maybe($T)' has
no implementation at all.
In the future, it would be nice to have quantification 'à la Girard' on types:
forall $T ...
exists $T ...
In this case, we would have to implement for example the type:
forall $T Maybe($T)
which has no parameter, since $T is bound in this expression.
Internal representations of implementations are the following. Again the same tags are
used, which is quite unfortunate.
<implem> :=
type_String
type_ByteArray
type_Int32
type_Float
(functype <implem> . <implems>)
(type_RAddr . <implem>)
(type_WAddr . <implem>)
(type_RWAddr . <implem>)
(type_GAddr . <implem>) (obsolete)
(type_MVar . <implem>)
It is very unfortunate that dynamic variables are not implemented as '(type_Var
. <implem>)'. Instead, this type is implemented as a large type. I don't understand
how this can work properly (but it seems to work miraculously). This should be
changed of course.
(type_struct_ptr . <id>)
(small_type <nalt> <iw> <alt geom> ... <alt geom>)
(mixed_type <nalt> <iw> <alt geom> ... <alt geom>)
(large_type <nalt> <iw> <alt geom> ... <alt geom>)
where:
<alt geom> :=
(small_alt (<imp> <offset> . <width>) ... (<imp> <offset> . <width>))
(mixed_alt (<imp> <offset> . <width>) ... (<imp> <offset> . <width>))
(large_alt (<imp> <offset> . <width>) ... (<imp> <offset> . <width>))
*** (2.3.5) Internal representation of virtual machine instructions.
*** (2.3.5.1) Pseudo instructions.
These instructions are of size 0.
(label . <addr>) position within the code
(ret_point . <addr>) same as label
(header ...) used for decorating the .sc files
(comment ...) idem
(context ...) idem
*** (2.3.5.2) Computing instructions.
(glue_index . <index>) used for small types
(glue_mixed_index . <index>) used for mixed types
(store_index . <index>) used for large types
These instructions are used for constructing a datum of a defined type. They put the
right index (i.e. alternative number of the datum) into the datum.
For a small datum, 'glue_index' puts the index within the least significant bits (of
R). This is the first operation when constructing the datum, so that the content of R
is actually overwritten.
For a mixed datum, the index is put into the 2 least significant bits by
'glue_mixed_index'.
For a large datum, the index (a byte) is put at offset 4 (just after the counter) into
the segment.
(glue . <bit width>) used for small and mixed types
(store_0 . ?) used for mixed and large types
(store_1 . ?)
(store_2 . ?)
(store_4 . ?)
These instructions are used for putting the components into the datum under
construction (of some defined type).
The instruction '(glue . <bit width>)' ORes the register R with a left shift of the top
of stack. The number of bits of the shift is given by the operand.
'(store_? . <offset>)' stores a word of 0, 1, 2 or 4 bytes at the offset given by the
operand within the segment pointed to by R.
(index_direct . <bit width>)
index_indirect
These instructions are used by conditionals. A datum of a defined type is in the
register 'R', and they put its index (alternative number) into the index register
'I'. The first one is used for small and mixed data, the second one for large data.
(switch <addr> ... <addr>)
This instruction checks the content of the index register 'I', and jumps to the
corresponding address. It is used by conditionals for jumping to the right case.
(unglue <bit width> . <right shift>)
(unstore <offset> . <n>) where <n> is a Lisp-like integer equal to 0, 1, 2 or 4
(unstore_copy . <offset>)
(unstore_copy_mixed <offset> . <mask>)
(unstore_copy_ptr . <offset>)
(unstore_copy_function . <offset>)
These instructions are used by conditionals for pushing the values of resurgent symbols
onto the stack.
'(unglue <bit width> . <right shift>)' is used for extracting a component from a small
datum. The datum is pushed into the stack.
'unstore' instructions are similar but for indirect data. Some of them need to make a
virtual copy of the datum to be extracted.
(jmp . <addr>)
This simple 'jump' instruction is used at the end of cases in conditionals for jumping
to the next code (which is common to all cases).
(apply . k)
(ret . k)
When '(apply . k)' is about to be executed, the stack contains (starting at the top of
stack) the arguments of a function and a return address. The function to be 'applied'
to these arguments is in 'R'.
This instruction acts in two different ways depending on the nature of the function,
which may be either 'top level' or a 'closure'. See the section on closures and
functions for more explanations.
'(ret . k)' is just an ordinary return instruction (except that it stops the virtual
machine if the return address is zero).
'k' is the number of arguments of the function.
(load_Int32 . <Cint>)
(load_float <mantissa> . <exponent>)
(string . <string index>)
These instructions are used for loading constants into 'R'.
(check_stack . <n>)
This instruction checks if the stack is big enough for pushing '<n>' 32 bits words. If
it is not, it puts the virtual machine in the state 'needs_bigger_stack' and gives up,
so that the scheduler may enlarge the stack of this machine.
(push_addr . <addr>)
This instruction pushes a return address on top of stack.
(address . <addr>)
The same but the address is put into 'R'.
push
This instruction pushes the content of 'R' onto the stack.
(select_index_direct <bit width> <index> . <addr>)
(select_index_indirect <index> . <addr>)
These instructions are generated for selective conditionals.
(peek <name> . <depth>)
(micro_peek <name> <depth> . <micro depth>)
The instruction 'peek' takes a 32 bits word at depth '<depth>' into the stack and puts
it into 'R'. The datum is not virtually copied. The compiler generates a 'copy'
instruction to follow this one, if needed, depending on the type of the datum.
'micro_peek' is similar except that the datum to be put into 'R' is located into a
micro context. The micro context is at depth <depth> into the stack, and the datum is
at position <micro depth> into the micro context.
protect
(unprotect . <addr>)
unlock
lock
odd_align
This instruction is put in front of the code of a top level function if this code would
be at an even address.
pop1
pop2
pop3
These instructions are just removing 1, 2 or 3 32 bits words from the top of stack.
invalid
This instruction should never be executed. It is generated in the garbage collector
code for small alternatives in mixed types. This means that the garbage collector code
must never be called for a datum belonging to small alternative of a mixed type. Other
instructions with a <mask> operand must use this mask to avoid such a call.
give_up
This instruction makes the virtual machine give up as if its slice of time was
finished.
start_debug_avm
stop_debug_avm
These instructions set and unset a flag. When this flag is set, the virtual machine
prints every executed instruction along with other information.
finish
(program . <instructions>)
do_alert
*** (2.3.5.3) Garbage collector instructions.
The null 32 bits word '0' is always a valid datum (of any type) for the garbage
collector. Of course, this datum doesn't need to be collected, but the garbage
collector may be passed such a datum.
Allocation of memory:
(alloc . <number of bytes>)
The pointer to the resulting segment is put in R. The counter (first 32 bits word of
the segment) always contains 1 after the allocation.
Making virtual copies. The following instructions are used for making a virtual copy of
a possibly indirect datum. This amounts to incrementing the counter which is at the
beginning of the segment. However, these instructions are generated on behalf of the
type of the datum only, and this is not enough to know if the counter exists (some data
are direct). Hence, the existence of several variants for these instructions. Also,
these instructions always check if the counter is zero. If it is the case, the datum
is permanent, and the counter is not incremented.
Copying virtually within the stack. The datum is at position <depth> in the stack (0 is
the top of stack).
(copy_stack_ptr . <depth>)
The counter always exists (except if the datum is the null word). The counter is
incremented if it is non zero.
(copy_stack_function . <depth>)
The datum represents a function, i.e. the address of some piece of code if it is odd,
and a pointer to a closure if it is even. The counter exists only in the second case.
(copy_stack_mixed <depth> . <mask>)
The <mask> has at most 4 bits. Each bit indicates if the corresponding alternative is
large or small. The counter exists only if the alternative is large.
Virtually copying in R. This is the same except that the datum to be copied is in R.
copy_ptr
copy_function
(copy_mixed . <mask>)
The next one should become obsolete:
copy
It's almost the same as 'copy_ptr' except that it does not check for permanent
data. Should disappear and be replaced by 'copy_ptr'.
Virtual deletion within the stack.
(collapse . <depth>)
This instruction deletes the datum at position <depth> into the stack. The datum is
direct, so that the instruction has just to move the <depth> words at top of stack one
slot downwards, and adjust the stack pointer. For example, (collapse . 3) works as
follows:
the stack before the stack after
a b c d e f g .... a b c e f g ...
('d' has disappeared)
The next instructions delete virtually a datum which is in R. These instructions are
generated only for those types which have indirect data.
del_ptr
'del_ptr' expects a pointer in R to a segment not containing any indirect data. In this
case, the deletion amounts to decrementing the counter pointed to by R. However, R may
contain 0 (in this case the instruction does nothing), or the counter may be zero (in
this case, the datum is permanent and the instruction does nothing). Otherwise, if the
counter becomes zero, the segment is freed.
del_conn
This is similar to 'del_ptr' except that the segment pointed to represents a
'connection' (file or network). The instruction works as 'del_ptr', except that if the
counter becomes zero, it also closes the connection.
del_function
This is similar to 'del_ptr', except that functions are either addresses of code (if
odd) or pointers to closures (if even). In the first case nothing is done. In the
second case, 'del_function' works like 'del_ptr', except that the address of the
deletion code is found within the closure itself instead of as an operand of the
instruction. This is because the types of the data in the micro context cannot be
known from the type of the function.
(del . <addr>)
'(del . <addr>)' expects a datum belonging to a large type in R. '<addr>' is the
address of the deletion code for this type. The instruction checks if R is zero. In
this case it does nothing. Otherwise, R contains a pointer to a valid segment. If the
counter is zero (permanent datum) the instruction does nothing. Otherwise, the counter
is decremented. If the counter becomes zero, a return address and the datum (in R) are
pushed onto the stack, and the deletion code at '<addr>' is jumped to.
(del_mixed <mask> . <addr>)
This is the same as '(del . <addr>)' except that the datum in R belongs to a mixed
type. The mask is used to check if the datum is direct or indirect. If it is direct,
nothing is done. If it is indirect, the same as for '(del . <addr>)' is performed.
(del_struct_ptr . <id>)
This instruction is the same one but for C structure encapsulation pointers. Somewhat
curiously, it is coded in the compiler ('compile.c'), but has never been generated (the
proof: it is not coded in 'vminstr.c'). This is probably due to the fact that
encapsulated C structures are manipulated only from within 'predefined.anubis'. Hence,
this is not a big problem, because only a modification of 'predefined.anubis' can lead
to generate this instruction.
(del_mvar . <addr>)
This instruction is even more phantomatic than the previous one. This is more
problematic, because multiple variables may be manipulated by anyone. Hence, this
instruction should be coded as soon as possible.
Note: We should also have a '(del_var . <addr>)'. The type 'Var' is very badly
implemented as a large type (one of my worst ideas). Everything concerning this type
should be examined and corrected. I suspect that this creates at least memory leaks,
but maybe also more serious problems.
(del_stack_ptr . <depth>)
(del_stack_conn . <depth>)
(del_stack_function . <depth>)
(del_stack <depth> . <addr>)
(del_stack_mixed <depth> <mask> . <addr>)
(del_stack_struct_ptr <id> . <depth>)
(del_stack_var <depth> . <addr>) (phantom)
(del_stack_mvar <depth> . <addr>)
The above instructions are the same as the previous ones, except that the datum to be
virtually deleted is in the stack at depth <depth> instead of being in R.
mvar_slots_del_ptr
mvar_slots_del_conn
mvar_slots_del_function
(mvar_slots_del . <addr>)
(mvar_slots_del_mixed <mask> . <addr>)
(mvar_slots_del_struct_ptr . <id>)
(mvar_slots_del_var . <addr>)
(mvar_slots_del_mvar . <addr>)
Again, these instructions are the same as above except that they have to delete
virtually the data in all the slots of the multiple variable.
indirect_del_ptr
indirect_del_conn
indirect_del_function
(indirect_del . <addr>)
(indirect_del_var . <addr>) (phantom)
(indirect_del_mvar . <addr>)
(indirect_del_mixed <mask> . <addr>)
(indirect_del_struct_ptr . <id>)
Again, these instructions are working like the above except that the datum to be
virtually deleted is pointed to by the top of stack.
(increment_del . <byte width>)
This instruction is used when we are really deleting an indirect datum. A pointer to
some position within the segment to be freed is on top of stack. This instruction just
increments this pointer by the number of bytes indicated by its operand.
del_index_direct
This instruction expects a datum of a mixed type on top of stack, belonging to a large
alternative. This datum is a 32 bits word which may be decomposed as follows:
pppppppp pppppppp pppppppp ppppppii
i.e. into a field of 30 bits and a field of 2 bits. The two bits 'ii' give the number of
the alternative the datum belongs to, and
pppppppp pppppppp pppppppp pppppp00
is the pointer to the segment. The role of the instruction is:
- put 'ii' into the index register I
- replace 'ii' by '00' on top of stack
- duplicate the top of stack.
The reason why we duplicate the top of stack is that one pointer will be incremented
for walking into the segment, and the other one will be used for freeing the segment.
del_index_indirect
This instruction is similar to 'del_index_direct', except that it concerns a large
datum. The index is read from the data segment, not from the manipulation word. The
pointer is also duplicated on top of stack.
free_seg_0
free_seg_1
These two instructions are just calling the 'FreeDataSegment' function of the memory
allocator. The first one transmits the word at *(SP-1) (top of stack: depth 0 in the
stack), whilst the second one transmits the word at *(SP-2) (depth 1 in the stack).
*** (2.3.5.4) Closure instructions.
These instructions are used for constructing new closures.
(put_copy_direct <depth> . <position>)
(put_copy_indirect <depth> . <position>)
(put_copy_function <depth> . <position>)
(put_copy_mixed <mask> <depth> . <position>)
A memory segment (the closure under construction) is pointed to by 'R'. The <depth>
operand is the stack depth of the datum to be put. The <position> operand is the
position where the datum must be put in the micro context of the closure. The offset
(in bytes) within the closure where the datum must be put is 4*(3+<position>), i.e.
<position> is 0 for the first datum of the micro context, 1 for the next one, etc...
For indirect data, the counter is incremented, because this is a duplication.
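The offset arithmetic can be written out explicitly (the helper name is illustrative, not from the sources): the first three 32-bit words of the closure hold the counter, the code label and the deletion label, so the micro context starts at byte 12.

```c
#include <assert.h>
#include <stdint.h>

/* The closure segment begins with three 32-bit words -- the counter,
   the code label and the deletion label -- so the datum at <position>
   in the micro context lives at byte offset 4*(3 + <position>). */
static uint32_t micro_context_offset(uint32_t position)
{
    return 4u * (3u + position);
}
```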
(put_micro_copy_direct <depth> <micro depth> . <position>)
(put_micro_copy_indirect <depth> <micro depth> . <position>)
(put_micro_copy_function <depth> <micro depth> . <position>)
(put_micro_copy_mixed <mask> <depth> <micro depth> . <position>)
Same as above, except that the datum is located into another micro context which is at
depth <depth> in the stack. The datum itself is at position <micro depth> in this other
micro context. Again <micro depth> is 0 for the first datum in the micro context, 1 for
the next one, etc...
(put_closure_labels <code> . <del code>)
This instruction puts its two operands at offsets 4 and 8 into the closure (pointed to
by 'R').
Below is an example of how a closure is constructed. First of all, here is the source
text (copy-pasted from the sources of our web site):
(DBRow(MTxt) dbrt) |-> if dbrt is dbrow(_,_,t) then
if t is mtxt(_,v,time) then
success(mtxt(new_name,v,time))
and below is the code generated for the construction of the closure (copy-pasted from
the .sc file):
| ; |-> in '/home/alp/www_dev/anubis-language/_1_6/translate_item.anubis' at line 451
| ; function code at label 31092
146195 | alloc 16
| ;--- stack ------------------------
| ; 0 (return address)
| ; 1 i LangInfo
| ; 2.0 tdb DB(TDB)
| ; 2.1 old_name String
| ; 2.2 new_name String
| ; 3 (return address)
| ;----------------------------------
146197 | put_micro_copy_indirect 2 2 0
146201 | put_closure_labels 31092 31093
'alloc 16' allocates four 32-bit words for the closure: counter, code, del and just one
datum in the micro context (the value of 'new_name').
Remark that 'new_name' is the sole local symbol occurring free in the above term, and
that its value is within an older micro context located at depth 2 in the stack. The
value of 'new_name' is at position 2 in this older micro context, so that the right
instruction for constructing the new micro context is:
146197 | put_micro_copy_indirect 2 2 0
which means: 'take the datum at position 2 in the micro context at depth 2, and put it
at position 0 in the new micro context' (of course, count a virtual copy).
('indirect' because the type is 'String'). The last instruction:
146201 | put_closure_labels 31092 31093
puts the address (label 31092) of the code for executing the function and the address
(label 31093) of the garbage collector code for deleting the closure itself.
*** (2.3.5.5) Equality instructions.
In the next version of Anubis, equality code should be generated as generic code (like
for implicit destructors), so that all the instructions below will become obsolete.
Furthermore, equality (with Boolean value) should be generated only for 'separable'
types, i.e. types for which this boolean equality, in the mathematical sense, is
computable. From a topos-theoretic viewpoint, this means that it should be generated
only for types 'T' such that the internal equality '(T,T) -> Omega' factors through the
canonical arrow 'Bool -> Omega' (sending 'true' onto 'true' and 'false' onto 'false').
However, I don't know if there is an algorithm able to decide this property. I know
only a recursive class of types having this property. Of course, internal equality
exists for all (abstract) types and may be defined by the Leibniz rule.
In Anubis 1, equality code is generated for all types (except 'Float', but this may be
the result of some misunderstanding). For concrete types (address types and several
primitive types, structure pointer types) equality is physical (compare the addresses
of the data). For functional types it is also physical, despite the fact that in some
cases, we could be more clever (when we know that the product of the types of the
arguments is finite). Theoretically, boolean equality should not be generated for
functional types in most cases.
So, we have something special to generate only for defined types. The principle of
comparison is as follows for 'x' and 'y' to be equal:
1. 'x' and 'y' must belong to the same alternative,
2. if so, the components must be equal two by two.
Of course, the second condition shows that this is a recursive process. Hence, an
'equality code' is generated for each defined type. This code uses special
instructions, which is quite stupid, because ordinary instructions would have done the
job (maybe somewhat more slowly).
(false_jmp . <addr>)
(true_jmp . <addr>)
Put either 'false' (i.e. 0) or 'true' (i.e. 1) in 'R' and jump (unconditionally) to the
operand address.
(jmp_false . <addr>)
Jump to address if the content of 'R' is 'false'.
(jmp_eq_stack . <addr>)
This instruction compares the two words on top of stack, puts 1 (true) in R if they are
equal and performs a jump to 'addr'. Otherwise the next instruction is executed, and R
remains empty.
(jmp_neq_indexes_large . <addr>)
(jmp_neq_indexes_mixed <bit width> <mask> . <addr>)
(jmp_neq_string . <addr>)
(jmp_neq_byte_array . <addr>)
(jmp_neq 0 . <addr>)
(jmp_neq 1 . <addr>)
(jmp_neq 2 . <addr>)
(jmp_neq 4 . <addr>)
(eq . <depth>)
This instruction tests the equality of the two words at depths '<depth>' and '<depth>+1'
in the stack. The register R is set to 1 (true) if they are equal, and to 0 (false)
otherwise.
eq_string
eq_byte_array
Tests the equality of the two strings or byte arrays on top of stack (depths 0 and 1),
and puts either 'true' (if equal) or 'false' (if not equal) in R. Of course, the
strings or byte arrays are walked through.
push_eq_data
(increment_eq . <byte width>)
*** (2.3.5.6) Dynamic variables instructions.
(get_var_handler . <addr>)
(get_mvar_handler . <addr>)
get_var_monitors
get_mvar_monitors
push_mvar_length
remove_monitor
ret_if_zero
get_vv
xchg_vv
get_mvv
xchg_mvv
create_var
free_var_seg
free_mvar_seg
*** (2.3.5.7) Serialising/unserialising.
(serialize . <addr>)
(unserialize . <addr>)
*** (2.3.5.8) Starting a new virtual machine.
(start <depth> . <addr>)
This instruction is generated by the keyword 'delegate' and starts a new virtual
machine. '<depth>' is the number of slots on top of the stack of the old machine which
should be copied into the stack of the new machine. No virtual copies are needed,
because they are already performed (by 'copy_stack_...' instructions). The new machine
starts execution at the instruction following this one, whilst the old machine jumps to
the operand address.
*** (2.3.5.9) Obsolete and soon obsolete instructions.
The following are still used but should become obsolete (used for global variables):
(create_vars ...)
(gv_address ...)
(init_gv ...)
(del_gv ...)
get_gvv
xchg_gvv
initialization_address
variables_deletion_address
The following have never been used (experimental?):
(call ...)
location
(code_for ...)
(type_list ...)
push_before_eq
(dec3 . <addr>)
*** (2.4) Unification.
Since the Anubis language allows the use of type parameters (parametric polymorphism),
we need to be able to unify types.
'Environments' are lists of the form:
((v_1 . T_1) ... (v_k . T_k))
where each 'v_i' is an 'unknown', and where each 'T_i' is a type (which may also
contain unknowns). Such an environment just says that:
v_1 has value T_1
...
v_k has value T_k
If a pair is of the form '(v_i . v_j)', i.e. if the value of 'v_i' is the unknown
'v_j', we assume that 'v_i > v_j' (recall that they are 'Expr's, hence integers). This
is to avoid some kind of circularity.
Because of the Lisp-like representation, unification is fairly simple. It works as
follows, where 'u(A,B)/E' is the result of unifying 'A' with 'B' relatively to
environment 'E'. The result is a new environment. '$1' and '$2' represent two arbitrary
unknowns.
u($1,$1)/E = E
u($1,$2)/E = (($1 . $2) . E) (if $1 > $2)
u($1,$2)/E = (($2 . $1) . E) (otherwise)
u($1,T)/E = (($1 . T) . E)
u(A,A)/E = E (for any atomic A)
| not_unifiable if u(A1,B1)/E = not_unifiable
u((A1 . B1),(A2 . B2))/E = |
| u(B1,B2)/(u(A1,A2)/E) otherwise
u(A,B)/E = not_unifiable in all other situations
We have no problem with the fake 'Expr' <Cint> because we unify only types, and types
never contain a <Cint>.
It may be the case that the two expressions to be unified are relative to different
environments. In that case we just have to merge the two environments before unifying
the two expressions.
We also want to avoid circularities, and to that end we use a non-circularity test
(like some Prolog systems do). The problem is that the above rules would give, for
example, the following:
u($1,Maybe($1))/E = (($1 . Maybe($1)) . E)
But this is not correct, because a type 'T' cannot be equal to 'Maybe(T)'. Hence, when
we have to compute 'u($1,T)/E', we use the function 'depends_on' to test if 'T' depends
on '$1' directly or indirectly. If it is the case, the unification fails.
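The rules above, together with the non-circularity test, can be sketched in C (a simplified model: in the compiler, 'Expr's are Lisp integers, whereas the struct below, the "unknowns are atoms >= 1000" convention and all names except 'depends_on' are assumptions of this sketch):

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Expr {
    int atom;                  /* nonzero: atomic value; 0: a pair */
    struct Expr *car, *cdr;
} Expr;

typedef struct Env {           /* ((v . T) . rest) as a linked list */
    Expr *var, *val;
    struct Env *rest;
} Env;

#define FAIL ((Env *)1)        /* stand-in for 'not_unifiable' */

static int is_unknown(const Expr *e) { return e->atom >= 1000; }

static Expr *lookup(Env *env, const Expr *v)
{
    for (; env && env != FAIL; env = env->rest)
        if (env->var->atom == v->atom) return env->val;
    return NULL;
}

/* The non-circularity test: does 't' depend on the unknown 'v',
   directly or indirectly through the environment? */
static int depends_on(Env *env, Expr *t, Expr *v)
{
    if (t->atom) {
        if (t->atom == v->atom) return 1;
        if (is_unknown(t)) {
            Expr *val = lookup(env, t);
            return val ? depends_on(env, val, v) : 0;
        }
        return 0;
    }
    return depends_on(env, t->car, v) || depends_on(env, t->cdr, v);
}

static Env *bind_unknown(Expr *v, Expr *t, Env *env)
{
    Env *e = malloc(sizeof *e);
    e->var = v; e->val = t; e->rest = env;
    return e;
}

static Env *unify(Expr *a, Expr *b, Env *env)
{
    if (env == FAIL) return FAIL;
    if (a->atom && b->atom && a->atom == b->atom)
        return env;                              /* u($1,$1), u(A,A)  */
    if (is_unknown(a)) {
        if (depends_on(env, b, a)) return FAIL;  /* circularity       */
        if (is_unknown(b) && b->atom > a->atom)  /* the greater       */
            return bind_unknown(b, a, env);      /* unknown is bound  */
        return bind_unknown(a, b, env);
    }
    if (is_unknown(b)) return unify(b, a, env);
    if (a->atom || b->atom) return FAIL;         /* atom vs pair      */
    return unify(a->cdr, b->cdr, unify(a->car, b->car, env));
}
```

With this sketch, unifying $1 with a pair containing $1 (the 'Maybe($1)' situation) yields FAIL, while unifying two unknowns binds the greater one, as in the rules above.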
Important note: Unification is used each time we have to compare types. Indeed, a type
with unknowns is a type which is not yet completely known. The comparison may succeed if
some unknowns are instantiated. This is why comparing types without unifying is
generally an error.
Also, all the computations of unification and interpretations of terms are purely
deterministic (no side effects at all). This guarantees that when we try several
possibilities, there is no clash between them.
It is often the case that we want to dereference a type relatively to its
environment. This means replacing (recursively) all type unknowns by their values (when
they have one). This is performed by the command 'dereference_type'.
*** (2.5) Organization of the knowledge base.
The compiler stores the result of checking the paragraphs into several arrays:
struct Type_struct *types; (array of type definitions)
struct Operation_struct *operations; (array of definitions, formerly called 'operations')
struct Variable_struct *variables; (array of global variables (obsolete))
See the definitions of these structures in 'compil.h'.
Checked paragraphs are stored into these arrays, which work like stacks. The stack
pointers are called 'next_type' and 'next_operation'.
*** (2.6) Indexation of symbols.
In a previous version of Anubis, when the interpretations of a symbol were searched
for, the arrays of types or operations were searched sequentially. During the year
2003 (as far as I can remember), it appeared that this was too slow (especially with
Olivier's program, containing thousands of types and symbols). I decided to index
these arrays by a hashing method. The result was that the compiler became twice as
fast.
The 'symbol index' is just an array of 256 arrays of 'SymbolIndexEntry'. When a symbol
is searched for, its 'hash' is first computed. This hash is an U8 (computed by
'u8_symbol_hash' in 'interp.c'). With this hash we may go directly to the right array
of 'SymbolIndexEntry'.
This secondary array is searched for an entry corresponding to our symbol (each entry
contains the name of a symbol). When such an entry is found, we get the index of the
actual definition in the array of types or operations.
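The two-level lookup can be sketched in C as follows (the bucket layout, the hash function body and 'find_definition' are assumptions of this sketch; the real hash is 'u8_symbol_hash' in 'interp.c', and the real entries carry more fields):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    const char *name;     /* symbol name stored in the entry        */
    int definition_index; /* index into the types/operations arrays */
} SymbolIndexEntry;

typedef struct {
    SymbolIndexEntry *entries;
    int count;
} SymbolBucket;

/* The symbol index: 256 arrays of SymbolIndexEntry, one per U8 hash. */
static SymbolBucket symbol_index[256];

/* A simple stand-in for 'u8_symbol_hash': any U8-valued hash works. */
static uint8_t u8_hash(const char *name)
{
    uint8_t h = 0;
    while (*name) h = (uint8_t)(h * 31 + (uint8_t)*name++);
    return h;
}

/* Hash first, then scan only the matching secondary array. */
static int find_definition(const char *name)
{
    SymbolBucket *b = &symbol_index[u8_hash(name)];
    for (int i = 0; i < b->count; i++)
        if (strcmp(b->entries[i].name, name) == 0)
            return b->entries[i].definition_index;
    return -1;            /* not found */
}
```

The point of the scheme is that the sequential scan is confined to one of 256 buckets instead of the whole array of definitions.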
In the future this system should be replaced by a B-tree, indexed on names of symbols.
*** (3) The lexer.
*** (3.1) Generalities.
The role of the lexer is essentially to cut the input into 'tokens'. 'lexer.y' is a
FLEX file, which must be compiled by 'flex'. This produces the file 'lex.yy.c' which
becomes part of the C source.
However, the lexer performs several other tasks:
- handling 'read' and 'replaced by' paragraphs, computing visibilities,
- counting lines and columns, and level of nested comments,
- working on several special tokens (strings, floats, named arrows,...).
The lexer has several 'states' (in the sense of FLEX). They are the following:
- INITIAL (used when reading comments outside paragraphs)
- COM (used for reading /* ... */ comments)
- PAR (used for reading paragraphs)
- INCL (used for reading file names after either 'read' or 'replaced by')
- STR (used for reading strings)
- STRTL (used when a string is too long)
- CONF (used for reading the configuration file)
- CONFCOM (used for reading comments within the configuration file)
The names of all the tokens returned by the lexer are prefixed by 'yy__'. They are
declared in 'grammar.y'.
*** (3.2) Finding source files.
The compiler must identify files clearly, not only by the name, because different files
in different directories may have the same name. Notice that the path which is given
after the keyword 'read' is not enough for identifying the file. This is because this
path is itself relative.
The compiler keeps in memory the absolute path of a 'current' directory. Notice that
this current directory is a notion internal to Anubis. It has nothing to do with the
current directory in the sense of the host operating system. Anyway, the Anubis
compiler never changes the current directory in the sense of the host system.
When the compiler starts, the 'current' directory is the directory from which the
Anubis compiler has been launched.
When the compiler needs to find the source file corresponding to a path, either the
path given on the command line, or a path found after a keyword 'read' or 'replaced
by', it looks for the file using the following absolute paths successively:
<current directory>/<given path>
<my_anubis directory>/<given path>
<anubis directory>/<given path>
except if <given path> is absolute. In this case only this path is used.
When the file has been found, the directory of this file becomes the new 'current'
directory.
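The lookup order can be sketched in C as follows (a hedged illustration: all names are invented for the sketch, and the existence test is passed as a parameter so the logic can be shown without touching the real filesystem):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* A fake existence test for demonstration: it recognises exactly one
   path, stored in 'existing_file'. */
static const char *existing_file;

static int fake_exists(const char *path)
{
    return existing_file && strcmp(path, existing_file) == 0;
}

typedef int (*exists_fn)(const char *path);

/* Try <current>/<given>, <my_anubis>/<given>, <anubis>/<given> in
   order, except when <given> is absolute, in which case it is used
   alone. Returns 1 and fills 'out' on success. */
static int resolve_source(const char *given, const char *current_dir,
                          const char *my_anubis_dir, const char *anubis_dir,
                          exists_fn exists, char *out, size_t outsz)
{
    if (given[0] == '/') {                    /* absolute: used alone */
        snprintf(out, outsz, "%s", given);
        return exists(out);
    }
    const char *dirs[3] = { current_dir, my_anubis_dir, anubis_dir };
    for (int i = 0; i < 3; i++) {             /* the three prefixes   */
        snprintf(out, outsz, "%s/%s", dirs[i], given);
        if (exists(out)) return 1;
    }
    return 0;
}
```

Note that the first directory that yields an existing file wins, which is why the 'current' directory shadows the installation directories.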
*** (3.3) The 'read' stack.
(also called 'include stack' in the source of the compiler).
Of course, when a 'read' or 'replaced by' keyword is encountered, the compiler must
read another file, but must be able to come back for reading the rest of the current
file. This is why it needs a stack.
Each 'frame' in this stack contains the following information:
- a datum of type YY_BUFFER_STATE, needed by FLEX/BISON for each file,
- the absolute path of the file currently read,
- the current line number in the file currently read,
- the absolute path of the 'current' directory for the file currently read.
Note: at the time of writing this, this stack is actually made of 4 distinct stacks,
one for each item. These stacks are growing in parallel, and hence share a unique stack
pointer. It would be a good idea to replace this by a single stack whose slots would be
a C structure.
When the end of a file is reached, a frame is popped from the stack, and the reading of
the 'calling' file may be continued.
At the time of writing this text, this stack may contain at most 100 frames.
*** (3.4) Remembering already read files.
The compiler remembers which files have been read. Indeed, during the same compilation,
it may encounter a request for reading the same file several times. Of course, the file
is read only once, and the compiler just has to adjust visibility.
The absolute paths of the files already read are stored into the array
'already_included' (of at most 'max_already_included' = 1024 paths, at the time of
writing this text).
*** (3.5) Computing visibilities.
A file B is visible from a file A only if the file A contains one of the paragraphs:
read B
replaced by B
Normally, a file containing a 'replaced by' paragraph does not contain any other
paragraph (but it may contain off paragraph comments). This kind of file is called a
'relay' file, and is used when the location of a file of the distribution is changed.
'read' is not transitive. In other words, if we have
read B
in file A, and
read C
in file B, this does not make file C visible from file A. However, if we have:
read B
in file A and
replaced by C
in file B, C is visible from file A.
The visibility information is stored in several arrays. First of all, notice that
each file is identified by an 'id' which is the position of its path in the array
'already_included'. Notice that 'predefined.anubis' (which is 'pre-read') has id 0.
- the array 'number_of_visible' gives for each file id, the number (of type U16) of
files which are visible from this one.
- the array 'is_relay_file' is an array of booleans (actually an array of U8). For
each file it indicates if the file is a relay file or not.
- the two dimensional array 'visibility'. This array provides some kind of
'visibility stack' for each file id. For example, if the number of files visible from
the file whose file id is idA is 3, and if the ids of these files are idA, idB and
idC, we have:
visibility[idA][0] = idA
visibility[idA][1] = idB
visibility[idA][2] = idC
Notice that since a file is always visible from itself, the first entry into this stack
always contains the id of the file to which this stack belongs.
When file B becomes visible from file A, idB is pushed into the visibility stack of
file A (if not already present). Such stacks are never popped off, because if B is
visible from A, it remains visible indefinitely. Notice that such stacks never need to
be enlarged, because the number of files read cannot be greater than
'max_already_included', which is the size of these stacks.
So, the question is just to know when a file becomes visible from another one. First of
all, each file must be visible from itself. Hence this is registered the first time the
file is encountered. As a particular case, 'predefined.anubis' is encountered before
any other one. Furthermore, it must be visible from all other files.
When a relay file becomes visible from some file, it must transmit its visibility to
the file it is pointing to.
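The visibility stacks and the "push if not already present" rule can be sketched in C (sizes are reduced for the sketch; the real arrays are dimensioned by 'max_already_included' = 1024, and the function names are invented here):

```c
#include <assert.h>
#include <stdint.h>

#define MAX_FILES 16   /* stand-in for max_already_included = 1024 */

/* number of files visible from each file id (U16 in the compiler) */
static uint16_t number_of_visible[MAX_FILES];
/* visibility[idA] is the 'visibility stack' of file idA            */
static int visibility[MAX_FILES][MAX_FILES];

static int is_visible(int from, int target)
{
    for (int i = 0; i < number_of_visible[from]; i++)
        if (visibility[from][i] == target) return 1;
    return 0;
}

/* Push 'target' onto the visibility stack of 'from', unless it is
   already there. Stacks only grow; visibility is never revoked. */
static void make_visible(int from, int target)
{
    if (!is_visible(from, target))
        visibility[from][number_of_visible[from]++] = target;
}
```

Registering a file as visible from itself first, as the text describes, guarantees that slot 0 of each stack holds the file's own id.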
*** (4) The parser.
The parser (grammar.y) is a BISON file. It provides the function 'yy_parse' used in
'main.cpp'. It contains essentially the following parts:
- Declaration of tokens and non terminals with their types.
- Precedence and association rules.
- Grammar rules.
The rules of precedence and association should not be changed (and their order is
important) because if they are changed, files which are currently compiling ok could
have syntax errors.
Each grammar rule has an 'action' part (between { and }) which constructs the
syntactical tree corresponding to the rule. These actions are the official source of
information for the Lisp-like format of terms.
*** (5) Computing interpretations of terms.
*** (5.1) Overview.
The Anubis language allows overloading of symbols and type parameters. These features
introduce many ambiguities which must be resolved.
A term may have several distinct interpretations. For example, when you write the
number:
2
it has at least 3 interpretations, one as an 'Int8', one as an 'Int32' and one as a
'Float'. Similarly, a symbol or a binary operator like '+', may have several
definitions, hence several interpretations.
When atomic terms (like numbers and symbols) are combined together for forming compound
terms, like applicative terms, conditionals,... the compound term may itself have
several interpretations, depending on which interpretation we choose for the atomic
subterms of this term.
Hence, what we need is a (necessarily recursive) function for computing the list of all
interpretations of a term. This function is called 'term_interpretations' and has
several auxiliary functions. All these functions are defined in 'interp.c'.
Also terms must be interpreted relative to a 'context'. A context is normally just a
set of declarations. But in the case of the Anubis compiler, a context is more
precisely an accurate representation of what will be the top of stack at run time. Such
a context is represented as a list, with the head of list being the most recently
pushed item.
For example, if we want to interpret a term like this one in some context C:
with x = a, t
we must first get the list of all interpretations of 'a' relative to C. For each
interpretation of 'a', we may compute the list of interpretations of 't', but it must
be relative to '((x . T) . C)', where 'T' is the type found for this interpretation of
'a', because after 'a' is computed (at run time) its value is pushed into the stack.
(Actually, in the case of this example, the compiler requires that 'a' has only one
interpretation. This is for making the source files more explicit and easier to read
for humans.)
*** (5.2) Interpreting applicative terms.
An applicative term is just made of a function applied to arguments. For example, it
may be:
f(a,b,c)
(or 'a + b'). Such a term is represented internally as:
(app <lc> [f] . ([a] [b] [c]))
where <lc> is a Lisp integer ('Expr') from which the name of the source file, together
with the line and the column of the term may be recovered.
In order to interpret this term, the compiler must first compute all interpretations of
the function itself. It also computes all the interpretations of the tuple of
arguments. This may lead to a very big number of interpretations. For example, if 'a',
'b' and 'c' have respectively 2, 4 and 7 interpretations, this yields 56
interpretations for the tuple '(a,b,c)'. If the number of interpretations of a tuple
exceeds say one million, the compiler may become very slow. The remedy for the time
being (coming from the user) is to type each argument explicitly in order to reduce
this number of interpretations.
Now, for each interpretation of the function and each interpretation of the tuple of
arguments, we must unify the signature of the function (type of the tuple of arguments)
with the type of the actual tuple of arguments. If this unification fails, this
combination is impossible. This is how the number of interpretations may stay at a
reasonable level.
I've tried to proceed in another way. Since for each interpretation of the function, we
get types for the arguments, I've tried to recompute the interpretations of the typed
arguments in each case. But experiments showed that this is even slower.
Added 2006/12/10: Finally, I found the right solution. It is so simple that I don't
understand why I didn't see it at once. Actually, I know why. This is because from the
very beginning I wrote a function for computing the interpretations of a tuple (which
is used for applicative terms and for bodies of conditionals), and I never had the idea
to question the necessity of this function. It is actually not required, and is the
cause of the problem. Here is the solution:
Instead of computing the list of interpretations of the tuple of arguments, we just
compute the lists of interpretations of each argument. For example, if we have 3
arguments having respectively 2, 4 and 7 interpretations, this makes only 2+4+7 = 13
interpretations (dispatched into 3 lists) instead of 56 interpretations (in a single
list). Now for each interpretation of the function we select for each argument the
compatible interpretations. This will lead to a list of interpretations of the
applicative term for each interpretation of the function. In any case, we will have to
handle much less interpretations than with the current algorithm.
We could also enhance the error message in case no combination makes it possible to
interpret the applicative term. The current error message (number 'E021') is very
troublesome for most users. We could do better, as follows:
- 1. Print one section for each interpretation of the function.
- 2. In each section, print information only on incompatible arguments, not on all
arguments.
*** (5.3) Interpreting conditionals.
The test is first interpreted. The compiler requires (again for making the source files
more explicit and easier to read for humans) that the test has one and only one
interpretation. For each interpretation of the test, the type of the test is known. Of
course all interpretations whose type is not a type with alternatives are
rejected. However, components in this type may still be unknown, the essential point
being that the alternatives are known.
Next, the compiler analyses the cases. We must have one case per alternative, and in
the same order.
For each alternative, the compiler checks that the name of the case is the same as the
name of the alternative, and that the number of resurgent symbols is the same as the
number of components. If a resurgent symbol is explicitly typed, its type is unified
with the type of the corresponding component. If this unification fails, an error
message is sent.
For each case, the compiler constructs a new context for interpreting the body of the
case. Indeed, resurgent symbols are declarations which enrich the context. This may
yield several interpretations for each body of case. The compiler constructs the list
of all interpretations of the tuple of bodies. Again (as for applicative terms) this
may lead to a big number of interpretations and slow down the compiler or even exhaust
the memory.
For each such interpretation of the tuple of bodies, the types of the bodies are
unified. This is because in a conditional, the bodies must all have the same type
(which will be the type of the conditional). When the unification fails, the tuple is
rejected. This leads in general to a reasonable number of interpretations of the tuple
of bodies.
Added 2006/12/10: As for applicative terms, this method may be enhanced. First of all,
we may compute the lists of interpretations of all bodies, but without constructing a
list of interpretations of the tuple of bodies. Then we just have to find all tuples of
interpretations of the bodies within which all bodies have the same type. This may be
done by induction on the number of bodies. For 1 body this is obvious. For n bodies (n
> 1), begin (induction hypothesis) by computing all compatible interpretations of n-1
bodies (say all but the first one). Then for each interpretation of the first body
select only the tuples of interpretations of the others whose type unify with the type
of the first one.
Even better for performance: Order the lists of interpretations of bodies according to
their lengths, the longest one first, and apply the above algorithm. That way the
first unifications of types of bodies will be performed on bodies having the smallest
number of interpretations, leading more quickly to the type to be found for all
bodies. In particular, if just one body is explicitly typed, and if this leads to only
one interpretation of that body, the type of all bodies will be known at once, and the
computation will be much faster.
*** (5.4) Interpreting selective conditionals.
The selected alternative is found using the name of the head of case. If there are
several alternatives with this name, the compiler checks the number of resurgent
symbols. If there are several alternatives with the same name and same number of
resurgent symbols, the compiler uses the declarations (if any) of the types of the
resurgent symbols. These types are unified with the actual types taken from the
definition of the type of the test.
Of course, if there is an 'else' case, the type of this case must be the same as the
type of the body of the selected case. This is checked by unification.
*** (6) Implementation of types.
Notice that we implement only 'actual types', not type schemes (for the time being; see
the discussion on quantified types). An actual type is a type with no unknowns and no
parameters.
Types in Anubis 1 are of the following sorts:
- primitive types: String, Int32, Listener, ...
- defined types: types which are defined by a paragraph, even if this paragraph is in
predef.c or predefined.anubis.
- functional types: (T_1,...,T_k) -> U
- address types: RAddr(T), WAddr(T), RWAddr(T), GAddr(T) (obsolete), Var(T), MVar(T).
- structure pointer types: These types are used for encapsulating C structures
coming from linked libraries (like SSL or SQLite3).
*** (6.1) Implementation of primitive types.
There is a particular implementation for each primitive type.
*** (6.1.1) 'String'.
A string in Anubis 1 is implemented as a pointer:
*
|
V
+-------+-------------------------------+---+
| cnt | a b c d .... | 0 |
+-------+-------------------------------+---+
0 4
The first 4 bytes are the counter for the garbage collector. Notice that strings which
are defined statically in the source files, have a 0 at this place, meaning that the
string is permanent (never collected).
Starting at offset 4, are the characters of the string. A trailing 0 is added at the
end, like in C.
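The layout above can be illustrated with a small C sketch (the helper names are invented for this illustration; the real strings are produced by the memory allocator and the static-data emitter):

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Build a segment laid out as an Anubis 1 string: a 4-byte counter at
   offset 0, the characters from offset 4, and a trailing 0 like in C. */
static unsigned char *make_anubis_string(const char *s)
{
    size_t n = strlen(s);
    unsigned char *seg = malloc(4 + n + 1);
    uint32_t cnt = 1;                 /* one virtual copy; 0 would mean
                                         a permanent (static) string   */
    memcpy(seg, &cnt, 4);             /* counter for the collector     */
    memcpy(seg + 4, s, n + 1);        /* characters plus trailing 0    */
    return seg;
}

/* The characters start right after the counter. */
static const char *anubis_string_chars(const unsigned char *seg)
{
    return (const char *)(seg + 4);
}
```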
*** (6.1.2) 'ByteArray'.
A byte array is a pointer as follows:
*
|
V
+-------+-------+-------------------------------+
| cnt | n | a b c d .... |
+-------+-------+-------------------------------+
0 4 8
where again 'cnt' is the counter, and 'n' the number of bytes in the byte array (not
including 'cnt' and 'n'). The bytes are stored starting at offset 8. There is no
trailing 0.
*** (6.1.3) 'Int32'.
An 'Int32' is just a C 'int' (of 32 bits, of course), i.e. a 32-bit word containing
the number itself.
*** (6.1.4) 'Float'.
Because data in Anubis 1 cannot have more than 32 bits, double precision floating point
numbers are implemented indirectly (which is very bad for performance):
*
|
V
+--------+-----------------+
| cnt | IEEE754 number |
+--------+-----------------+
0 4 12
where again 'cnt' is the counter for the garbage collector.
*** (6.1.5) 'Listener'.
Despite the fact that 'Listener' is a primitive type, it is implemented exactly like
network address types. See below.
*** (6.2) Implementation of defined types.
Defined types (types with alternatives, defined by paragraphs beginning with 'type' or
'public type') are implemented according to 3 different methods: 'small', 'mixed' and
'large'.
First of all, the width of the type is computed. The width is the number of bits
required for accommodating all data of this type. It is computed as follows.
Some kind of base-2 logarithm of the number of alternatives is computed. This is the
smallest number of bits required for accommodating all alternative numbers (also called
'indexes'). It is:
0 for at most 1 alternative (example: 'Empty' or 'One')
1 for 2 alternatives (example: 'Bit' or 'Bool')
2 for 3 to 4 alternatives
3 for 5 to 8 alternatives
4 for 9 to 16 alternatives
etc...
Since the maximal number of alternatives is limited to 256, the width of an index field
is at most 8.
This logarithm of the number of alternatives is called the 'index width' for this type.
Now, each alternative has its own width. It is the sum of the index width of the type,
and the widths of all components of the alternative. For example, the type
'Maybe(Int8)' has 2 alternatives:
failure
success(Int8)
the index width is 1 (2 alternatives). The width of the first alternative is 1 (= 1 +
0; no component), and the width of the second alternative is 9 (= 1 + 8), because
'Int8' is of width 8 (as expected).
Now, the width of the type is the supremum of the widths of its alternatives (0 for
empty types). So, it is 9 for 'Maybe(Int8)' for example.
These computations are performed in 'typewidth.c'.
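The computation can be sketched in C (a simplified model of what 'typewidth.c' does: here the component widths of each alternative are given directly as sums, instead of being computed recursively from the component types):

```c
#include <assert.h>

/* Index width: smallest number of bits accommodating all alternative
   numbers, i.e. ceil(log2(alternatives)); 0 for 0 or 1 alternative. */
static int index_width(int alternatives)
{
    int w = 0;
    while ((1 << w) < alternatives) w++;
    return w;
}

/* Theoretical width of a (non-recursive) type: the supremum over the
   alternatives of (index width + sum of the component widths). */
static int type_width(int alternatives, const int *component_sums)
{
    int iw = index_width(alternatives), sup = 0;
    for (int i = 0; i < alternatives; i++) {
        int w = iw + component_sums[i];
        if (w > sup) sup = w;
    }
    return sup;
}
```

Running this on 'Maybe(Int8)' (two alternatives, with component width sums 0 for 'failure' and 8 for 'success(Int8)') reproduces the width 9 computed in the text.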
Notice that this width is not well defined in the case of a recursive type. For a
recursive type (or a cross recursive type), the width is infinite (which corresponds to
the fact that the type itself is infinite (if it is not empty)). This is why we do not
compute the width of recursive types. Hence, we need a tool for detecting if a type is
recursive (maybe cross recursive through other types). This tool is defined in
'rectype.c'. Notice that a recursive type which is empty (this may happen: see the
discussion on 'NonEmptyList' in 'predefined.anubis') is still considered as recursive
by Anubis 1, and its width is still infinite.
The width of a type is used to determine the number of bits needed for holding a datum
of this type. Above a certain limit, and in particular for recursive types, the type
must be implemented indirectly, i.e. through a pointer to a segment.
The width discussed above will be called the 'theoretical width'.
Hence, we have another notion: that of 'real width'. The real width, for Anubis 1 is an
integer between 0 and 33 included. The value 33 means that the theoretical width is at
least 33, and that the type will be implemented indirectly. Values from 0 to 32 mean
that the type will be implemented directly within that number of bits.
Of course, allowing 64 bits or 128 bits direct data in the future, means updating all
these programs.
If the real width of a type is at most 32, the type is 'small'. It cannot be recursive,
and all its components, and the components of its components, and so on, are small. Any
datum of this type may be represented on 'width' bits.
If the real width is 33, there are two cases:
- the number of alternatives is at most 4: the type is 'mixed'.
- the number of alternatives is 5 or more: the type is 'large'.
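The small/mixed/large decision can be summed up in a few lines of C (a sketch; the
names 'impl_kind' and 'classify' are invented for illustration):

```c
typedef enum { TYPE_SMALL, TYPE_MIXED, TYPE_LARGE } impl_kind;

/* 'real_width' is the clamped width, between 0 and 33 included;
   33 means "theoretical width at least 33, implemented indirectly". */
static impl_kind classify(int real_width, int n_alternatives)
{
    if (real_width <= 32)
        return TYPE_SMALL;                 /* direct, within 32 bits */
    return (n_alternatives <= 4) ? TYPE_MIXED : TYPE_LARGE;
}
```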
If a type is large, all data of this type are represented indirectly, i.e. as a
pointer to a memory segment:
*
|
V
+----------+---+--------------+--------+---------------+
| cnt | i | c1 | c2 | c3 |
+----------+---+--------------+--------+---------------+
0 4 5
The first 4 bytes of the segment are always occupied by the counter ('cnt') of the
garbage collector. The next byte (at offset 4) is the alternative index for this
datum. It indicates to which alternative the datum belongs. Starting at offset 5 are
the representations of the components of the datum.
Since all pointers returned by the memory allocator are multiples of 4 (the two least
significant bits are always 0), these two bits are available for any purpose. I've
decided to use them for storing the alternative index (this is why this method cannot
be applied if there are more than 4 alternatives). This has the advantage of avoiding a
pointer to a segment for some alternatives. For example, the type 'List(T)' (for any
type 'T') is mixed. The first alternative has width 1, the second alternative has an
infinite width. Hence, it is tempting to avoid the use of an indirection for the first
alternative. This is what the 'mixed' implementation realizes.
Hence, the empty list (for any type 'T') is just the 32 bits word 0, while a non empty
list looks like this:
pppppppp pppppppp pppppppp pppppp01
where the least significant two bits (01) indicate alternative number 1 (second
alternative), and where:
pppppppp pppppppp pppppppp pppppp00
is the pointer:
*
|
V
+----------+-------+---------------+
| cnt | head | tail |
+----------+-------+---------------+
0 4
Notice that there is no alternative number at offset 4 (since it is already in the
least significant bits of the datum itself), and that components are stored starting at
offset 4. 'tail' always occupies 4 bytes, but 'head' may occupy 0, 1, 2 or 4 bytes.
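Assuming 4-byte-aligned pointers as described, tagging and untagging a mixed datum can
be sketched like this (the helper names are hypothetical, not the actual VM code):

```c
#include <stdint.h>

/* Store a 2-bit alternative index in the low bits of an aligned pointer. */
static uint32_t mix_tag(uint32_t ptr, unsigned alt)   /* alt in 0..3 */
{
    return ptr | alt;            /* ptr is a multiple of 4 */
}

static unsigned mix_alt(uint32_t d) { return d & 3u; }   /* alternative index */
static uint32_t mix_ptr(uint32_t d) { return d & ~3u; }  /* the bare pointer  */
```

For 'List(T)', the empty list is the word 0 (alternative 0, no segment), and a non
empty list is a segment pointer ORed with 1.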
*** (6.3) Implementation of functional types.
Functions are implemented in two different ways depending on how they are defined.
1. If the function is defined by a paragraph, it is just represented by the address
of its code.
2. If the function is defined by the arrow '|->' (or by the recursive arrow '|-f->'),
it is represented as a 'closure'. A closure is a pointer to a segment like this:
*
|
V
+----------+----------+----------+----------+----------+----------+...
| cnt | code | del | x1 | x2 | x3 |
+----------+----------+----------+----------+----------+----------+... etc
0 4 8 12 16 20 24
where 'cnt' is the counter of the garbage collector as usual. All slots in a closure
occupy 4 bytes. The slot 'code' is the address of the code of the function. The slots
'x1', 'x2', etc... contain the values of the symbols which were present in the context
at the time the function (i.e. the closure) was created, and which are referred to from
within the body of the function definition. This part of the closure is called the
'micro context'. The slot 'del' contains the address of the garbage collector code able
to properly destroy the closure itself.
When we have a function at hand (i.e. a 32 bits word representing a datum whose type is
functional), we must be able to decide if the function is implemented as a simple code
address or as a closure. Again we take advantage of the fact that pointers to memory
segments are divisible by 4. If the pointer is even (least significant bit = 0), the
function is a closure. If it is odd (least significant bit = 1) it is a simple code
address. For this reason the code of a function defined at top level (by a paragraph)
is always positioned at an odd address. This is the reason for the existence of the
instruction 'odd_align' (which is of course of size 1 and never executed).
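In C-like terms the test is a single bit inspection (illustrative names only, not the
actual VM code):

```c
#include <stdint.h>

/* Least significant bit decides: even = closure pointer (segments are
   aligned on 4), odd = plain code address (top-level functions are
   assembled at odd addresses thanks to 'odd_align'). */
static int is_closure(uint32_t fn)      { return (fn & 1u) == 0; }
static int is_code_address(uint32_t fn) { return (fn & 1u) == 1; }
```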
*** (6.4) Implementation of address types.
There are two sorts of address types:
1. File and network address types: 'RAddr(T)', 'WAddr(T)', 'RWAddr(T)', 'Listener'
2. Dynamic variable address types: 'Var(T)', 'MVar(T)'.
We ignore GAddr(T), which is obsolete.
*** (6.4.1) File and network address types.
A datum of such a type is a pointer:
*
|
V
+----------+---+---+---+---+---------------------------------+
| cnt | t | m | _ | _ | 12 bytes |
+----------+---+---+---+---+---------------------------------+
0 4 5 6 7 8 20
As usual the first 4 bytes are for the counter of the garbage collector. The next four
bytes are:
t type indicator
m mode indicator
_ not used
_ not used
The next 12 bytes starting at offset 8 are used depending on the value of the type
indicator.
The type indicator may take the following values ('conn' is for 'connection', a synonym
(!?) of 'address'):
conn_invalid invalid connection (should never happen)
conn_file file on disk
conn_listener listening TCP/IP connection (server: type 'Listener')
conn_network client TCP/IP connection
conn_ssl_network client SSL connection
conn_local (obsolete: local variable)
conn_global (obsolete: global variable)
conn_stdin standard input
conn_stdout standard output
conn_stderr standard error output
The values of the above are defined in 'vm.h'.
The mode is an OR of the following flags:
conn_read readable connection
conn_write writable connection
conn_read_pw (experimental: connection with password access)
conn_write_pw (idem)
in_progress (connection not yet up)
The 12 bytes after offset 8 are used as follows:
Files (conn_file):
+---+---+---+---+---+---+---+---+---+---+---+---+
| * | | | | | | | | |
+---+---|---+---+---+---+---+---+---+---+---+---+
V
FILE (in the sens of C)
Network (conn_network, conn_listener):
+---+---+---+---+---+---+---+---+---+---+---+---+
| socket handle | | | | | | | | |
+---+---+---+---+---+---+---+---+---+---+---+---+
SSL connection (conn_ssl_network):
+---+---+---+---+---+---+---+---+---+---+---+---+
| (SSL *) | TCP/IP handle | (char *) |
+---+---|---+---+---+---+---+---+---+---|---+---+
V V
SSL struct of OpenSSL server common name
Standard connections (conn_stdin, conn_stdout, conn_stderr):
+---+---+---+---+---+---+---+---+---+---+---+---+
| | | | | | | | | | | | | (no byte used)
+---+---+---+---+---+---+---+---+---+---+---+---+
*** (6.4.2) Dynamic variable address types.
A simple dynamic variable (type 'Var(T)') is a pointer:
*
|
V
+-----------+-----------+-----------+-----------+
| cnt | value | n | * |
+-----------+-----------+-----------+-----|-----+
0 4 8 12 | 16
V
monitors table
The segment contains 4 32 bits words:
- at offset 0: the counter
- at offset 4: the current value of the variable
- at offset 8: size of monitors table (number of slots)
- at offset 12: pointer to monitors table
'n' should be 0 exactly when the pointer to monitors table is NULL. When the variable
is created (by instruction 'create_var'), n is 0 and the pointer to monitors table is
NULL.
A monitor is just a datum of type 'One -> One' (most often it is a closure).
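A hypothetical C view of this layout (the field names are invented; the monitor table
pointer is kept as a 32 bits word, since the VM is a 32 bits machine):

```c
#include <stdint.h>
#include <stddef.h>

/* Sketch of the Var(T) segment described above. */
typedef struct {
    uint32_t cnt;       /* garbage collector counter          (offset 0)  */
    uint32_t value;     /* current value of the variable      (offset 4)  */
    uint32_t n;         /* size of monitors table, in slots   (offset 8)  */
    uint32_t monitors;  /* 32-bit VM pointer to monitors table,
                           NULL exactly when n == 0           (offset 12) */
} var_segment;
```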
Multiple dynamic variables (type 'MVar(T)'). This is again a pointer:
*
|
V
+----------+----------+----------+----------+----------+-----------------------
| cnt | ns | nm | * | nbw | data ...
+----------+----------+----------+----|-----+----------+-----------------------
0 4 8 12 V 16 20
monitors table
In this segment, we have:
- at offset 0: garbage collector counter
- at offset 4: number of slots for data in the variable
- at offset 8: number of slots for monitors
- at offset 12: pointer to the table of monitors
- at offset 16: normalized bit width (see below)
- at offset 20: data
A multiple dynamic variable is nothing else than a 'mutable array'. This array has a
size (number of data of type T that we can put in the array).
Normally, we should arrange things so that data slots have just the size required for
type T. For example, if we have a multiple variable of type 'MVar(Bool)', each slot
should occupy just 1 bit, not 32 bits. If we have a multiple variable of type
'MVar(Maybe(Int8))', each data slot should occupy 9 bits.
Because of byte boundaries, we use only powers of 2 for the number of bits of a
slot. Hence, for example, in a multiple variable of type 'MVar(Maybe(Int8))', each data
slot will be of 16 bits, the least number of bits which is a power of 2 and at least
equal to 9.
This number of bits is called the 'normalized bit width'. It may be computed as follows
in C:
if (bw <= 0) nbw = 0; else
if (bw <= 1) nbw = 1; else
if (bw <= 2) nbw = 2; else
if (bw <= 4) nbw = 4; else
if (bw <= 8) nbw = 8; else
if (bw <= 16) nbw = 16; else
nbw = 32;
where 'bw' is the real bit width, and 'nbw' the normalized bit width.
However, for the time being, Anubis 1 sets nbw = 32. This should be worked out, and
maybe, we should also consider 24 as an acceptable normalized bit width.
*** (6.5) Implementation of C structure pointer types.
A datum of type '£StructPtr(Name)' is as follows:
*
|
V
+--------+--------+
| cnt | * |
+--------+----|---+
0 4 | 8
V
C structure
i.e. it is a pointer to a segment of 8 bytes containing only a counter for the
garbage collector, and a pointer to the actual C structure of name 'Name'. Names of C
structures are declared in 'bytecode.h'.
*** (7) Code generation.
The function 'compile_term' (in 'compile.c') generates byte code from
interpretations. Below, we describe how this code is generated.
*** (7.1) General conventions.
Code generation is a recursive process (just because interpretations themselves are of
a recursive nature). The code resulting from the compilation of an interpretation
should behave always the same way with respect to the stack and the register R. Our
conventions are the following.
First of all, consider an interpretation '(head . env)'. We may of course replace all
occurrences of a type unknown in 'head' by its value as given by 'env'. Hence, we
consider only interpretation heads, and assume that they contain no unknowns.
An interpretation head must be compiled relative to a context. This context is a
precise description of what the stack will contain at run time. We represent contexts by
lists, the head of the list being the top of stack. Each element in the list corresponds to
a slot in the stack. Furthermore, the stack in Anubis 1 is homogeneous. Each slot is a
32 bits word, able to contain a datum of any type.
The Exprs which are listed in a context are the following:
(<symbol> . <type>)
This means that this slot contains a datum of type <type>, and that its name is
<symbol>.
(ret . <arity>)
This means that this slot contains a return address, corresponding to a call to a
function of <arity> arguments.
(f_micro_ctxt <fname> <ftype> (<symbol> . <type>) ...)
This slot contains a 'closure' (i.e. function defined by '|->'). '<fname>' is either
'nil' (in the case of '|->'), or "f" (in the case of '|-f->'). '<ftype>' is the type of
the function. The list of '(<symbol> . <type>)' is the micro context proper. It
indicates in which order the data remembered by the function are placed in the memory
segment of the closure.
(argument . <type>)
This means that the slot is occupied by a datum which has been pushed onto the stack
to serve as an argument to a function call. This datum is anonymous, but its type is
known.
For example, you may see below how the compiler prints the context:
(
(argument . "SSL_Connection")
(argument . "Bool")
(argument . "DenialOfService")
(ret . ...)
("connection" . "SSL_Connection")
(f_micro_ctxt nil ... ("sites" app_ts "List" "Web_Site_Description")
("dos" . "DenialOfService"))
(ret . ...)
)
into a '.sc' file:
| ;--- stack ------------------------
| ; 0 SSL_Connection
| ; 1 Bool
| ; 2 DenialOfService
| ; 3 (return address)
| ; 4 connection SSL_Connection
| ; 5.0 sites List(Web_Site_Description)
| ; 5.1 dos DenialOfService
| ; 6 (return address)
| ;----------------------------------
If '<head>' is an interpretation head, and 'Ctxt' is a context, we denote by:
[<head>]Ctxt
the sequence of instructions resulting from compiling '<head>' relative to the context
'Ctxt'.
Convention 1: The stack after execution of the sequence of instructions in
'[<head>]Ctxt' is the same as before this execution.
Convention 2: After the execution of '[<head>]Ctxt', R contains the value of '<head>'.
*** (7.2) Elimination of terminal recursion.
Since the language has no notion of loop, it is of fundamental importance to eliminate
terminal recursion. Roughly speaking, this amounts to replacing the sequence:
(apply . k)
(ret . 0)
by:
(collapse . k)
(apply . k)
Indeed, when the first sequence is about to be executed, the situation is as
follows:
stack R
--------------------------------------------
a_1 ... a_k r s ... f
where 'a_1,...,a_k' are the arguments of the function, 'r' the return address and 'f'
the function itself. 's' is the return address of the previous call.
'(apply . k)' will perform a jump to the code of the function (found in 'R'), and after
the return from this function call, 'a_1 ... a_k r' have been removed from the stack
which becomes:
s ...
Then the instruction '(ret . 0)' removes 's' from the stack and jumps to address 's',
so that the stack becomes:
...
The other sequence of instructions works as follows. '(collapse . k)' removes 'r' from
the stack, which becomes:
a_1 ... a_k s ...
Then '(apply . k)' calls the function 'f', and after the return of this call, the stack
is:
...
as expected.
Actually, what this does is just avoid performing the instruction '(ret . ?)' twice in
a row (once with address 'r' and once with address 's').
In practice, the situation is more complex because, in general, we have the arguments of
a previous call between 'r' and 's', so that the stack looks like this:
a_1 ... a_k r b_1 ... b_i s ...
When we return from the call to 'f', the stack becomes:
b_1 ... b_i s ...
and the next instruction cannot be '(ret . i)' (except if i == 0), because the top of
stack 'b_1' is not a return address. In this situation, terminal recursion can still
be eliminated. If the recursion is terminal, the next instructions are:
(del_stack ...)
...
(del_stack ...)
(ret . i)
(as many 'del_stack' or 'collapse' as there are arguments, i.e. 'i'). At that point the
machine deletes the arguments from the stack and performs the 'ret'.
So the general situation is the following sequence of instructions:
(apply . k)
(del_stack_? . <depth>)
(collapse . <depth>)
...
(ret . i)
and the stack is the following:
stack R
--------------------------------------------
a_1 ... a_k r b_1 ... b_i s ... f
We replace this sequence by the following one:
(collapse . k)
(del_stack_? . k+<depth>)
(collapse . k+<depth>)
...
(apply . k)
Here is how it works:
Instructions Stack
----------------------------------------------------------------------------------
a_1 ... a_k r b_1 b_2 b_3 ... b_i s ...
(collapse . k) a_1 ... a_k b_1 b_2 b_3 ... b_i s ...
(del_stack_? . k+<depth>) a_1 ... a_k b_2 b_3 ... b_i s ...
(collapse . k+<depth>) a_1 ... a_k b_3 ... b_i s ...
...
a_1 ... a_k s ...
(apply . k) ...
The main difference with the original code is just that 'b_1 ... b_i' are deleted
before the call to 'f'.
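The basic rewrite (the case i == 0, without intervening arguments) can be sketched as a
peephole pass over a toy instruction array; all names here are invented for
illustration and do not come from the actual compiler:

```c
typedef enum { I_APPLY, I_RET, I_COLLAPSE } opcode;
typedef struct { opcode op; int arg; } instr;

/* Rewrite the terminal-call pattern  (apply . k)(ret . 0)
   into                               (collapse . k)(apply . k)
   in place.  Returns 1 if the rewrite was performed at position i. */
static int eliminate_tail_call(instr *code, int i, int len)
{
    if (i + 1 < len &&
        code[i].op == I_APPLY &&
        code[i + 1].op == I_RET && code[i + 1].arg == 0) {
        int k = code[i].arg;
        code[i]     = (instr){ I_COLLAPSE, k };  /* drop 'r' first     */
        code[i + 1] = (instr){ I_APPLY,    k };  /* then jump into 'f' */
        return 1;
    }
    return 0;
}
```

The general case, with 'i' intervening arguments, additionally shifts the 'del_stack'
and 'collapse' instructions before the 'apply', as shown in the trace above.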
*** (7.3) Generating the code.
*** (7.3.1) Protect.
The interpretation head has the following form:
(protect <lc> . <head>)
The code generated is:
[(protect <lc> . <head>)]Ctxt =
label a:
protect
[<head>]Ctxt
unprotect a
<end code>
The 'protect' instruction is of size 2 (2 bytes). The first byte is the instruction
name 'protect', and the second byte is the 'lock'. This byte is zero when the code is
generated.
At run time, the instruction 'protect' checks the value of the lock (which may be 0 or
1). If it is 1, the subsequent code is locked, and the instruction gives up. If it is
0, the piece of code is not locked. So the instruction puts 1 in the lock so that the
subsequent code is now locked, and IP (Instruction Pointer) is incremented by 2.
The subsequent code '[<head>]Ctxt' is executed. Then the instruction 'unprotect' is
reached. This instruction puts 0 back into the byte at offset 'a+1' (i.e. into the lock) so
that the code is now unlocked.
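The semantics of 'protect'/'unprotect' can be sketched as follows (a toy model, not the
actual VM code; it assumes, as described above, that the lock byte lives in the code
area at offset ip+1 and is writable):

```c
#include <stdint.h>

/* Returns the new IP on success, or -1 when the section is already
   locked ("the instruction gives up"). */
static long exec_protect(uint8_t *code, long ip)
{
    if (code[ip + 1] != 0)
        return -1;        /* lock is 1: subsequent code is locked */
    code[ip + 1] = 1;     /* lock the subsequent code             */
    return ip + 2;        /* 'protect' is of size 2               */
}

/* 'unprotect a' clears the lock byte at offset a+1. */
static void exec_unprotect(uint8_t *code, long a)
{
    code[a + 1] = 0;
}
```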
*** (7.3.2) Lock.
*** (7.3.3) Global symbols.
Global symbols are those which are defined by paragraphs.
*** (7.3.4) Strings, Int32, small data and Floats.
*** (7.3.5) Local symbols.
*** (7.3.6) Microlocal symbols.
*** (7.3.7) Closures.
The code generated must construct a closure. Hence, it begins by the instruction 'alloc'.
[(closure <lc> (f_micro_ctxt fname ftype (sym . type)...) <args> . <body>)]Ctxt =
alloc 3+n 32 bits words (n = size of micro context)
put_copy y_0 in closure (at 3+0 word)
...
put_copy y_n-1 in closure (at 3+n-1 word)
put_closure_labels a d
<end code>
'n' is the number of values to be put in the micro context. Each value is taken from
the stack, virtually copied (if needed), and put at its place in the micro context.
This is what the instructions 'put_copy ...' are doing. However these instructions
are specialised depending on the type of the datum to be put. The function
'get_closure_copy' returns the list of these instructions. It receives the description
of the micro context of the closure to be constructed, and the current context
(description of the current stack).
When this is done, the closure is almost ready. Only the code of the function and the
code for garbage collection of the closure are missing. These codes are put into the
closure by the instruction 'put_closure_labels'.
Of course, these two codes must also be generated by the compiler. The code of the
function is generated as follows:
label a:
[<body>]<args>+(<micro_context>)+(ret . k)
del args
del_closure
ret k
When a function represented by a closure is called the stack is prepared as follows:
a_1 a_2 ... a_k c r ...
(the top of stack on the left side), where 'a_1' ... 'a_k' are the arguments (first
argument on top of stack), 'c' the closure itself, and 'r' the return address. Just
before the function returns the stack must become:
r ...
So the body of the function is compiled relatively to a context (denoted
<args>+(<micro_context>)+(ret . k) above) representing the above stack. When this code
has finished, the value to be returned by the function is in R. At that point, we just
have to delete (virtually) the arguments and the closure from the stack, and return.
*** (7.3.8) Applicative terms.
*** (7.3.9) Conditionals.
*** (7.3.10) Selective conditionals.
*** (7.3.11) Local definitions ('with ...').
*** (7.3.12) Delegate.
*** (7.3.13) Serialize/Unserialize.
*** (7.4) Peephole optimisation.
*** (8) Garbage collector code.
*** (9) Equality code.
*** (10) Serialisation/unserialisation.
Serialisation is also called 'marshalling'. Anubis 1 has a built-in
serialisation/unserialisation mechanism. In the next version of Anubis, this mechanism
should be written in Anubis using macros. It will be much simpler (and safer).
In principle any abstract datum could be serialised. In the current system, only the
data of the following types are serialisable:
- the primitive types: String, ByteArray, Int, Float,
- defined types, provided that all their components are serialisable.
Functional types should in principle be serialisable, but they are not in Anubis 1.
This would require a more sophisticated mechanism, which also has something to do with
dynamically loadable libraries.
Serialisation is not performed by a piece of code, but by a piece of 'pseudo-code'.
The same 'pseudo-code' is used for serialising and for unserialising. The
'pseudo-instructions' have just a different execution. Hence the compiler produces (if
needed) only one such pseudo-code per serialisable type.
Actually the pseudo-code is essentially a description of the type of the data to be
serialised/unserialised.
Note (2006/08/16): I just noticed, looking at the '.sc' file of the web site, that the
compiler generates pseudo code paragraphs like this one:
---------|--------------------------------------------------------
|
| * * * type implementation pseudo code * * *
|
42901 | label 590
| ;--- stack ------------------------
| ; 0 Var(List(Int32))
| ; 1 (return address)
| ;----------------------------------
42901 | type_large_switch 1336
42907 | label 1336
42907 | large_alt_begin 9
42909 | indirect_type_mixed 2 591
42915 | pop1
42916 | large_alt_end 0
42918 | jmp 1335
42923 | label 1335
42923 | swap
42924 | ret 1
---------|--------------------------------------------------------
Strange, isn't it? Because 'Var(...)' is not serialisable! However, I also checked
that such pieces of code are never called (by 'serialize/unserialize' instructions). I
also checked with the following example:
global define One
truc
(
List(String) args
) =
forget(serialize(var(args))).
that the compiler doesn't see the problem. This is probably due to the very bad
implementation of 'Var(...)' which is implemented as a large type, not as a 'Var'
type. This must be fixed as soon as possible. The best way is to implement 'Var(...)'
correctly.
*** (10.1) How it works.
Serialising means 'transforming into a sequence of bytes'. This sequence of bytes (a
byte array) may be saved on disk, sent over a network connection etc... Of course,
serialisation is interesting because it doesn't lose information. Serialised data may
be unserialised.
Now, when we are unserialising a byte array, expecting a datum of a certain type, we
are not always sure that the byte array is the result of the serialisation of a datum
of this type. Hence, unserialisation may produce errors. This is why the primitive
'unserialize' returns a datum of type 'Maybe($T)'.
Notice that when a datum is serialised, its type is not written into the serialisation.
Hence, when unserialising a datum we need to know the type of the datum. In other
words, the type is the 'key' (or the 'grammar') for unserialising.
Serialisation is fairly simple. It works as follows:
- String: the characters of the string, followed by a zero byte are written.
- ByteArray: a 4 bytes word is written in little endian order, representing the
number of bytes in the byte array. Then the bytes of the byte array are written.
- Int: The first byte is the sign (0 = positive, 1 = negative). The next 4 bytes are
the number of bigits (little endian). The subsequent groups of 4 bytes are the bigits
(most significant first, so that, unfortunately, the bytes are ordered as little
endian, whilst the bigits are ordered as big endian!).
- Float: The floating point number is written in the IEEE754 8 bytes format (and
stored little endian!).
- small type: The datum is written either as a 0 byte, 1 byte, 2 bytes or 4 bytes
word.
- mixed type: The index of the alternative is first written in the form of a
byte. Then the components are serialised. But in the case of a small alternative, the
components are serialized as a 'small datum', i.e. a 32 bits word containing all the
datum of the alternative (index included). As a consequence, in this case the index
is redundantly serialized. See the pseudo-instruction 'indirect_type_mixed' for more
information.
- large type: Same as for mixed types.
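For the two simplest cases above, the byte formats can be sketched as follows (a
minimal sketch, without the real buffer management through 'serial_buf'/'serial_next'
and without overflow checks):

```c
#include <stdint.h>
#include <string.h>

/* String: the characters, followed by a zero byte.
   Returns the number of bytes written. */
static size_t put_string(uint8_t *out, const char *s)
{
    size_t n = strlen(s) + 1;
    memcpy(out, s, n);
    return n;
}

/* ByteArray: a 4-byte length in little endian order, then the bytes. */
static size_t put_byte_array(uint8_t *out, const uint8_t *b, uint32_t len)
{
    out[0] =  len        & 0xff;
    out[1] = (len >> 8)  & 0xff;
    out[2] = (len >> 16) & 0xff;
    out[3] = (len >> 24) & 0xff;
    memcpy(out + 4, b, len);
    return 4 + (size_t)len;
}
```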
The result of serialisation is a byte array which needs to be allocated before
serialisation begins. The virtual machine has special registers for serialisation:
U8 *serial_buf; buffer for serialisation
U32 serial_size; size of this buffer
U32 serial_next; next free position in the buffer
Bool unserial_failed; flag for unserialisation
When serialisation begins (i.e. when the instruction 'serialize' is executed), the
datum to be serialised is on top of the stack. The instruction 'serialize' has an
address operand pointing to the pseudo code for the type of this datum. This
instruction works like a call (it calls the pseudo code), hence must push a return
address. However, the return address is inserted just below the datum to be
serialized. The instruction also changes the 'work sort' of the virtual machine from
'computing' to 'serializing', and allocates a data segment of some default size whose
address is put in 'serial_buf'. 'serial_size' is initialized to the size of this
segment, and 'serial_next' is initialized to 8 = 4+4, so as to accommodate room for the
garbage collector counter and the word for the size of the resulting byte array. Of
course, if the segment cannot be allocated, the instruction gives up, waiting for next
time.
From now on, we are serializing, executing some pseudo-code. This pseudo-code ends with
the instructions 'revert_to_computing' and 'ret'.
'revert_to_computing' changes the work sort from 'serializing' to 'computing',
reallocates the resulting byte array at the right size, puts the size at the right
place in the byte array, puts the new byte array in R, and a null pointer in
'serial_buf'.
'ret' returns from the pseudo-code to the normal code.
Unserializing works as follows.
The byte array 'b' to be unserialized is in R. The 'unserialize' instruction has an
address as its unique operand:
unserialize <addr>
It is the address of a piece of pseudo-code, namely the pseudo-code for the type 'T' of
the datum to be extracted from the byte array.
What 'unserialize' does is just:
1. put (the address of) 'b' into the 'serial_buf' register, and
initialize related registers ('serial_size' and 'serial_next'),
2. put 0 in the flag 'unserial_failed'
3. push a return address 'r',
4. put the machine in the state 'unserializing',
5. jump to <addr>
Hence, just after 'unserialize' has executed, we have:
R stack serial_buf unserial_failed
(empty) r ... b 0
The execution of the pseudo code yields one of the following two situations:
case 1: unserialization succeeded
R stack serial_buf unserial_failed
d ... b 0
case 2: unserialization did not succeed
R stack serial_buf unserial_failed
d ... b 1
In both cases, we get a datum 'd' of type 'T'. However,
- if 'unserial_failed' is 0, d is the datum successfully extracted from the byte
array,
- if 'unserial_failed' is 1, d is a valid datum of type T, at least from the point of
view of the garbage collector. Nevertheless, it is non significant.
The return address 'r' has been removed from the stack. The byte array 'b' is still in
the 'serial_buf' register. At that time, the machine is still in the 'unserializing'
state. The next instruction:
revert_to_computing <width>
has the responsibility of testing 'unserial_failed', virtually deleting 'b', putting the
machine in the state 'computing', and:
case 1 (unserial_failed == 0):
- put 'd' into a 'success', so as to construct a datum of type 'Maybe(T)', and push a
zero on the stack. Hence, the situation becomes:
R stack serial_buf unserial_failed
success(d) 0 ... (empty) (empty)
case 2 (unserial_failed == 1):
- push 'd' on the stack, and put a zero ('failure' of type 'Maybe(T)') in R. Hence,
the situation becomes:
R stack serial_buf unserial_failed
failure d ... (empty) (empty)
The next instruction is:
del_stack_? 0
which will delete correctly the top of stack, hence yielding in any case the following
situation:
R stack serial_buf unserial_failed
result ... (empty) (empty)
where result is of type 'Maybe(T)'.
So, if the initial situation is:
R stack
(empty) ...
after execution of:
[term]
unserialize <addr>
revert_to_computing <width>
del_stack_? 0
it becomes:
R stack
d ...
where 'd' is the result of the unserialization (of type 'Maybe(T)').
Notice that the <width> operand is used by 'revert_to_computing' in order to construct
'success(d)'. Indeed,
- if <width> <= 31, success(d) is just (d<<1)|1
- if <width> >= 32, success(d) requires a new data segment of two words, with:
- at offset 0: counter (= 1)
- at offset 4: d
and the pointer to this data segment must be ORed with 1 (mixed index for type
'Maybe').
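The two cases can be sketched in C (a toy model: 'malloc' stands in for the VM
allocator, and 'uintptr_t' for the VM's 32 bits words, so that the sketch also runs on
a 64 bits host):

```c
#include <stdint.h>
#include <stdlib.h>

/* Build success(d) for Maybe(T), following the two cases above. */
static uintptr_t make_success(uintptr_t d, int width)
{
    if (width <= 31)
        return (d << 1) | 1;        /* small: (d<<1)|1                 */
    uintptr_t *seg = malloc(2 * sizeof *seg);
    seg[0] = 1;                     /* offset 0: counter (= 1)         */
    seg[1] = d;                     /* offset 4: d                     */
    return (uintptr_t)seg | 1;      /* pointer ORed with mixed index 1 */
}
```

Note that 'failure' needs no segment at all: it is just the word 0, which is why
'revert_to_computing' can put it directly in R in case 2.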
To summarize, the code for (unserialize <lc> <type> . <term>) is:
[<term>]
unserialize <addr>
revert_to_computing <width>
<end code>
except if the declared type is ByteArray. In that case, we have:
[<term>]
success 32 needed to put the byte array into a 'success'
<end code>
*** (10.2) Descriptions of small types.
They are lists of integers defined by the following grammar:
<na> := 8 bits integer (number of alternatives)
<iw> := 8 bits integer (index width)
<nc> := 8 bits integer (number of components)
<small type descr> := <iw> <na> <alt descr> ... (as many <alt descr> as <na>)
<alt descr> := <nc> <small type descr> ... (as many as <nc>)
Hence a small type description is a list of 8 bits integers. These integers become part
of serialization/unserialization instructions like 'type_8', 'type_16' etc... (see
'vminstr.c'). One question is: `how long may this description be?'. The answer is not
so obvious! Remember that the number of alternatives is limited to 256, and likewise
for the number of components in an alternative. Now, if 'iw' is the number of bits
required for the index, the total width of the components is at most: 'bw - iw' where
'bw' is the bit width of our type. So, let 'dl(iw,bw)' be the maximal length for a
description of a type of index width 'iw' and bit width 'bw'. We have,
dl(iw1,bw1) <= 2 + (2^iw1 * (1 + dl(iw2,bw1 - iw1)))
*** (10.3) Pseudo-instructions set.
The official list of pseudo-instructions is defined in 'bytecode.h'. Recall that these
instructions have two meanings: one for serialisation and one for unserialisation.
action while action while
serialising unserialising
---------------------------------------------------------------------------------------
swap swaps the top of stack nothing
type_0 nothing nothing
type_8
type_16
type_32
type_String
type_ByteArray
type_Int32
type_Float
indirect_type_0
indirect_type_8
indirect_type_16
indirect_type_32
indirect_type_String
indirect_type_ByteArray
indirect_type_Int32
indirect_type_Float
indirect_type_mixed
indirect_type_large
type_mixed_switch
type_large_switch
mixed_alt_begin
large_alt_begin
mixed_alt_end
large_alt_end
type_small_alt
revert_to_computing
type_mixed (obsolete; never used)
type_large (obsolete; never used)
small_mixed_alt (obsolete; never used)
*** (10.4) Serialising.
*** (10.5) Unserialising.
The machine is now doing unserialization. When unserialization begins (just after
the instruction 'unserialize'), the machine is in the following state:
R stack serial_buf unserial_failed
(empty) r ... b 0
where 'r' is a return address (to be removed by the 'ret' at the end of the pseudo
code), and where 'b' is a byte array.
During unserialization, the registers serial_buf and serial_size remain constant. On
the contrary, serial_next is incremented while the byte array is read. It always
gives the offset (relative to serial_buf) of the next byte to be read.
Below, we simulate a successful unserialization for various types. The instruction
'swap' does nothing during unserialization.
Below, 'p' is a pointer to a newly allocated segment, whose size (in bytes) is given
by the instruction ?_alt_begin. 'e' is the result of unserialization when
instructions call another pseudo code. 'pop1' does not pop anything, but puts the
content of R at the address pointed to by the top of stack, and increments that
pointer. 'indirect' instructions which do not call another pseudo code, put their
result at the address pointed to by the top of stack.
small type: Bool
R stack
50 | label 20 (empty) r ...
50 | type_8 d r ...
51 | swap
52 | (ret . 1) d ...
primitive type: String
R stack
50 | label 4 (empty) r ...
50 | type_String d r ...
51 | swap
52 | (ret . 1) d ...
mixed type: List(String)
R stack
50 | label 15 (empty) r ...
50 | (type_mixed_switch 2 30 29)
61 | label 30 (empty) r ...
61 | type_32 d r ...
62 | (jmp . 28)
67 | label 29
67 | (mixed_alt_begin . 12) (empty) p+4 p r ...
69 | indirect_type_String (empty) p+8 p r ...
70 | (indirect_type_mixed 2 . 15) e p+8 p r ...
76 | pop1 (empty) p+12 p r ...
77 | (mixed_alt_end . 1) d = p|1 r ...
79 | (jmp . 28)
84 | label 28
84 | swap d r ...
85 | (ret . 1) d ...
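In the trace above, mixed_alt_end produces d = p|1: the alternative index is OR-ed into the low bits of the segment pointer, which are free because segments are 4-byte aligned. A minimal sketch of this tagging (names are mine, not the actual implementation):

```c
#include <stdint.h>

/* Tag a 4-byte aligned segment pointer with a mixed-type alternative
   index (0..3), as in 'd = p|1' in the trace above. */
uintptr_t tag_alt(uintptr_t p, unsigned index) {
    return p | index;                    /* low 2 bits are free */
}
unsigned alt_of(uintptr_t d)  { return (unsigned)(d & 3); }
uintptr_t seg_of(uintptr_t d) { return d & ~(uintptr_t)3; }
```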
large type: Printable_tree (defined in basis.anubis)
R stack
130 | label 20 (empty) r ...
130 | (type_large_switch 5 42 41 40 39 38)
152 | label 42
152 | (large_alt_begin . 5) (empty) p+5 p r ...
154 | (large_alt_end . 0) d = p|0 r ...
156 | (jmp . 37)
161 | label 41
161 | (large_alt_begin . 13) (empty) p+5 p r ...
163 | indirect_type_String (empty) p+9 p r ...
164 | (indirect_type_large . 20) e p+9 p r ...
169 | pop1 (empty) p+13 p r ...
170 | (large_alt_end . 1) d = p|1 r ...
172 | (jmp . 37)
177 | label 40
177 | (large_alt_begin . 10) (empty) p+5 p r ...
179 | indirect_type_8 (empty) p+6 p r ...
180 | (indirect_type_large . 20) e p+6 p r ...
185 | pop1 (empty) p+10 p r ...
186 | (large_alt_end . 2) d = p|2 r ...
188 | (jmp . 37)
193 | label 39
193 | (large_alt_begin . 13) (empty) p+5 p r ...
195 | indirect_type_Int32 (empty) p+9 p r ...
196 | (indirect_type_large . 20) e p+9 p r ...
201 | pop1 (empty) p+13 p r ...
202 | (large_alt_end . 3) d = p|3 r ...
204 | (jmp . 37)
209 | label 38
209 | (large_alt_begin . 13) (empty) p+5 p r ...
211 | (indirect_type_large . 20) e p+5 p r ...
216 | pop1 (empty) p+9 p r ...
217 | (indirect_type_large . 20) e p+9 p r ...
222 | pop1 (empty) p+13 p r ...
223 | (large_alt_end . 4) d = p|4 r ...
225 | (jmp . 37)
230 | label 37
230 | swap d r ...
231 | (ret . 1) d ...
Hence, instructions work as follows during unserialization (assuming no error occurs).
'direct type' instructions:
type_0
type_8
type_16
type_32
type_Int32
type_String
type_ByteArray
type_Float
--> unserialize a datum (by reading the byte array) and put it in R
'switch' instructions:
type_mixed_switch
type_large_switch
--> unserialize an index (always a byte), check it using second operand (number of
alternatives), and jump to the right label.
'alt_begin' instructions:
mixed_alt_begin
large_alt_begin
--> allocate a segment, push the pointer, duplicate it, and add either 4 or 5 to the
copy.
'indirect non calling type' instructions:
indirect_type_0
indirect_type_8
indirect_type_16
indirect_type_32
indirect_type_Int32
indirect_type_String
indirect_type_ByteArray
indirect_type_Float
--> unserialize a datum and put it at the address pointed to by pointer on top of
stack. Increment this pointer by the size of the datum.
'indirect calling type' instructions:
indirect_type_mixed
indirect_type_large
--> Call a pseudo code. This has the effect of putting an unserialized datum in R
upon return.
'pop1'
--> put the content of R at the address pointed to by the top of stack, and increment
that pointer (always by 4).
'alt_end' instructions:
mixed_alt_end
large_alt_end
--> remove the pointer on top of stack, put the next one in R, and OR it with the
index (given by the instruction itself).
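The 'alt_begin' step described above (allocate a segment, push the pointer, duplicate it, advance the copy past the header) can be sketched in C; the toy stack and all names here are assumptions, not the real machine:

```c
#include <stdlib.h>
#include <stdint.h>
#include <stddef.h>

typedef struct { uint8_t *slots[16]; int top; } ToyStack; /* assumption */

static void push(ToyStack *s, uint8_t *v) { s->slots[s->top++] = v; }

/* ?_alt_begin: allocate 'size' bytes, push the base pointer, then push
   a copy advanced past the header (4 bytes for mixed, 5 for large). */
void alt_begin(ToyStack *s, size_t size, size_t header) {
    uint8_t *p = malloc(size);  /* the real VM uses its own allocator */
    push(s, p);                 /* base pointer, kept for alt_end     */
    push(s, p + header);        /* write cursor for the components    */
}
```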
Now, unserialization may fail, because the bytes in the byte array are arbitrary. As
soon as such a failure is detected, the flag 'unserial_failed' is set to 1. In that
case, we must continue a 'fake' unserialization, because we must complete the data
constructed so far. This is the reason why all instructions must first test the
flag.
If the flag is 0 they operate as described above. Nevertheless, since they may detect
a problem, they may also put 1 in the flag.
'Fake execution' of the above instructions is as follows:
'direct type' instructions: put a value of 0 in R as the result of the unserialization
'switch instructions': jump to the first alternative, which always exists
'alt_begin' instructions: do not allocate a segment, but do as if, by pushing a NULL
pointer and duplicating it (hence we have two NULL pointers
on top of stack).
'indirect non calling type' instructions: there are two cases:
- the two pointers on top of stack are NULL:
do not do anything
- the two pointers on top of stack are non NULL:
put a value of zero as the result of unserialization, and increment the topmost
pointer by the right value.
'indirect calling type' instructions: again there are two cases:
- the two pointers on top of stack are NULL:
do not do anything
- the two pointers on top of stack are non NULL:
do not call any pseudo code, and put 0 in R.
'pop1': two cases:
- the two pointers on top of stack are NULL:
do not do anything
- the two pointers on top of stack are non NULL:
put 0 at the address pointed to by the top of stack, increment the pointer.
'alt_end' instructions: two cases:
- the two pointers on top of stack are NULL:
remove the two NULL pointers and put a NULL pointer in R,
- the two pointers on top of stack are non NULL:
remove the topmost pointer, and put the other one in R, and OR it with the index.
Notice that any instruction which detects a failure, and puts 1 in the flag, should
then behave as if the flag was already set.
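The real/fake two-case pattern above can be sketched for 'pop1' (a hedged model; the signature and names are assumptions):

```c
#include <stdint.h>
#include <stddef.h>

/* A sketch of 'pop1' during unserialization, covering both the normal
   case and the 'fake' execution after a failure.  'top' is the pointer
   on top of the stack; it is NULL after a failed alt_begin. */
void pop1(uint32_t R, uint32_t **top, int unserial_failed) {
    if (*top == NULL) return;        /* fake execution: do nothing    */
    **top = unserial_failed ? 0 : R; /* after a failure, store 0      */
    (*top)++;                        /* always advance by one word    */
}
```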
Finally, we must also discuss 'revert_to_computing'. After the unserialization, the state of
the machine is:
R stack serial_buf unserial_failed
d ... b 0/1
where 'd' is a datum, which is significant only if unserial_failed is 0.
In any case, the byte array in serial_buf must be virtually deleted, and a zero must
be put in serial_buf (this will be checked by 'unserialize'/'serialize').
If unserial_failed is:
0 then construct 'success(d)' in R (use the <width> operand to know the
bit width of type of d), and push a zero in the stack, so that
the next instruction (del_stack_?) may operate.
1 then push R in the stack (it will be deleted by the next instruction),
and put 0 (failure of type 'Maybe(T)') in R.
Note: the sequence of instructions:
unserialize <addr>
revert_to_computing <width>
is always followed by:
del_stack_? 0 (for the type of the unserialized datum)
*** (11) Outputting the code (making an *.adm file).
*** (12) How to make a system of macros ?
*** (12.1) Overview.
Anubis has no system of macros, but should have one. This would (as already remarked)
induce many simplifications in the system, and at the same time provide powerful
programming tools for the user. The best system of macros ever created is that of the
Lisp language. This system was natural in Lisp because of the particularly simple form
of the syntax. Of course, the syntax of Anubis is much more complex, but as we shall
see, constructing a system 'à la Lisp' is still possible.
What is a language good for? It is good for representing 'things' (belonging to some
'target' world) by 'expressions'. For example, the natural language represents things
of the real world by words, sentences, etc... but also abstract things like
statements, etc... Of course, each language has a syntax (grammar) but also a
semantics. The 'semantics' is the correspondence between the expressions (the
'signifiers') and the things they represent (the 'signified').
the semantics
|
symbolic side (language) | semantic side (target world)
-------------------------------------+----------------------------------------
|
|
expression |----------------------------> thing
(signifier) | (signified)
|
Note that in most cases (programming languages, mathematics, but not the natural
language), the symbolic side is more concrete (less abstract) than the semantic
side. For example, in mathematics we have many expressions for representing the
number (say) 2, like 2, 1+1, 5-3, etc... and these expressions may be manipulated using
computation rules. So, this is in some sense very concrete. On the contrary, the
signified, i.e. the mathematical object 2, is a complete abstraction by its very
nature.
So, we use the language for manipulating the things (or at least we give ourselves the
illusion that we manipulate the things, because what we actually manipulate are the
expressions, for example when we are computing). This sometimes leads us to
write the same expressions, or at least similar ones, several times. So, we
may want to make this writing automatic. In order to do that, we need to consider the
expressions themselves as a new sort of things (the 'meta-things') and to create a new
language (the 'meta-language') for manipulating the meta-things:
meta-semantics semantics
meta-language | language | target world
-------------------------+---------------------------+--------------------------
| |
meta-expression |-----------> expression |--------------> thing
| |
| |
In most cases, the meta-language is different from the language and simpler. This
is the case for the C language, where the expressions of the meta-language are all
those which begin with: #define #if etc... (a very poor language actually).
However in Lisp, the meta-language is Lisp itself. The creators of Lisp were truly
inspired when they did that. This has several consequences:
1. The meta-language is as powerful as the language itself.
2. The language is also the meta-meta-language, the meta-meta-meta-language, etc...
Fans of Lisp (like me) know how powerful this feature is.
*** (12.2) The trick.
In order to have the language be its own meta-language, it is enough to embed the
language into its own target universe. In other words, it should be possible to
consider all expressions of the language as data of the target universe:
+------------------------------------------------+
| |
| target universe (all sorts of data) |
| |
| |
| |
| |
| |
| +--------------------------+ |
| | | |
| | the language | |
| | | |
| | (a subset of the | |
| | universe) | |
| | | |
| | | |
| +--------------------------+ |
| |
| |
| |
+------------------------------------------------+
Hence the language, being able to represent all data, may also represent the
expressions of the language itself. It may also be able to represent expression which
themselves represent expressions and so on, so that the language is at the same time
the meta-language, the meta-meta-language, and so on... This is the big trick.
*** (12.3) How it works.
Basically, we need only a set of types (called 'syntactical types') for representing
expressions, and an operator (named the 'expansion' operator) for forcing the compiler
to evaluate data of these types. We may also have a 'quoting' operator for simplifying
the writing of data of these types.
One of the main types for representing expressions in Anubis should be a type of
'terms':
public type Term:
symbol(String name),
applicative(Term function, List(Term) arguments),
function(List(Declaration) arguments, Term body),
conditional(Term test, List(Case) cases),
...
The quoting operator (denoted: ' ) should simplify the writing of data of type
'Term'. For example, instead of writing:
conditional(symbol("x"),
[
case([("failure",[])],symbol("false")),
case([("success",[symbol("u")])],
applicative(symbol("f"),[symbol("u")]))
]
)
we would have just to write this:
'if x is
{
failure then false,
success(u) then f(u)
}
(Note the quoting mark in front of this term.) The above term is not of type 'Bool'
but just of type 'Term'. Thus, the quoting mark just means: 'don't interpret this
term, just consider it as a term'.
Now, we may write down functions for producing terms (of type 'Term'), type definitions
and all sorts of 'syntactical concepts', considered as data of our semantical
universe. From now on, the expressions of the language may be manipulated as any other
kind of data.
Manipulating expressions as data is called 'meta-programming'. But this is not the
whole story, because the destiny of a datum of type 'Term' or of some similar type is
at the very end to be interpreted (evaluated).
When we want to evaluate a datum (say 't') of type 'Term', we just have to write:
@t
'@' is the 'expansion' (or 'evaluation') operator. In some sense it is the converse of
quoting. It transforms a datum of type (say) 'Term' into what this datum represents if
we consider it as an expression.
*** (12.4) An example.
The most difficult part is to design the types 'TypeDefinition', 'Definition', 'Term',
'Case', etc... They should be the perfect image of the whole language. As an example,
we consider only a small fragment of this family.
public type Type:
scheme (String name, List(Type) operands),
functional (List(Type) source_types, Type target_type),
...
public type Component:
anonymous (Type type),
named (Type type, String name).
public type Alternative:
singleton (String name),
usual (String name, List(Component) components).
public type TypeDefinition:
public (String name,
List(String) parameters,
List(Alternative) alternatives),
private (String name,
List(String) parameters,
List(Alternative) alternatives).
public type Paragraph:
type_definition(TypeDefinition),
...
Now, we may want to define this:
define TypeDefinition
enumerated_type
(
String type_name
List(String) alternative_names
) =
private(type_name,
[],
map((String n) |-> singleton(n), alternative_names)).
define List(String) my_names = ["ga","bu","zo", "meu"].
@enumerated_type("Shadok",my_names).
In this last paragraph (a 'macro-paragraph' beginning by '@'), the term:
enumerated_type("Shadok",my_names)
is of type 'TypeDefinition'. Because of the presence of '@', the compiler will expand
it into an actual paragraph, namely this one:
type Shadok:
ga,
bu,
zo,
meu.
The syntax of 'macro-paragraphs' is:
@t
(with '@' in the leftmost column) where t is a term of type 'Paragraph' or
'List(Paragraph)'. In the case of a list of paragraphs the compiler expands the term
into a sequence of paragraphs.
Of course, the expansion operator should also be usable for the expansion of terms (and
any other sort of syntactical entity), not only for paragraphs. For example, we may
write something like this:
define Term
my_cond
(
Term test
List(String) alts
) =
conditional(test,
map((String n) |-> singleton_case(n,length(n)),
alts)).
Now, we may write (within a paragraph):
@my_cond(_x,my_names)
which will be expanded into the term:
if x is
{
ga then 2,
bu then 2,
zo then 2,
meu then 3
}
where 'x' is the expansion of '_x'.
The expansion may also be implicit. For example, if a term has an interpretation whose
type is a syntactical type, the compiler may also try to interpret its expansion. And
conversely, the quoting may also be implicit. However, this may possibly lead to
problems, and it should be experimented with.
*** (13) Propositions for a dynamically defragmenting memory management system.
For the time being, the main segments of the memory allocator are never freed. This is
because, since allocated segments cannot be moved within memory, main segments of the
allocator have almost no chance of becoming empty. Anyway, no system for testing this
emptiness has been implemented.
It would be nice to be able to move an allocated segment within memory without
disturbing the program. That way, it would be possible to gather allocated memory
segments and to free big parts of memory. The system described below allows moving of
allocated segments. Compared with the current system (usual allocator and reference
counting), it has the following characteristics:
- It doesn't need more memory than the current system within the allocated segments,
  but pointers are twice as large (64 bits instead of the current 32 bits).
- Virtual copy is almost as quick as in the current system, but virtual deletion may
  be slower when there are data shared by many pointers (which does not seem to
  happen very often).
- An allocated segment may be moved within memory by the system without perturbing
  the currently running virtual machine, provided it is done in one non-interruptible
  operation (whose duration is very short).
- Allocated segments may be moved and main segments returned to the host allocator
  by a low priority mechanism.
*** (13.1) How it works.
I would like to call this system 'Reference Chaining Garbage Collector' (compare with
'Reference Counting Garbage Collector', the current system). Here is how it works.
When a new segment is allocated a 'reference' to it is created and stored somewhere.
This reference represents the datum just created. It is made of two parts (two words):
--------------------
| |
| V
| +---------+---------+
| | * | (n<<2)|1| the 'reference' (two words)
| +----|----+---------+
| |
| |
| V
| +---------+---------+----------------------------------+
| | mc | * | the newly allocated segment |
| +---------+----|----+----------------------------------+
| |
--------------------
The first part of the reference is an ordinary pointer (say 32 bits) to the useful part
of the newly allocated segment. The second part has the form (assuming pointers are 32
bits wide):
nnnnnnnn nnnnnnnn nnnnnnnn nnnnnn01
where n...n is the size of the allocated segment (in words). The segment itself has two
hidden words at the beginning. The second word contains a pointer to the reference
itself. Hence, the reference can be recovered from the allocated segment and
conversely. The size of the segment (needed only for deallocation) may be recovered
from the second word of the reference. The word marked 'mc' will be explained later.
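Under the 32-bit assumptions above, this layout could be modelled in C as follows (the struct and field names are mine, not taken from the sources):

```c
#include <stdint.h>
#include <stddef.h>

/* A 'reference' is two words: a pointer to the data part of the
   segment, and either a pointer to the next reference or, for the
   last reference in the chain, the tagged size word (n<<2)|1. */
typedef struct Reference {
    void *data;               /* -> useful part of the segment   */
    uintptr_t next_or_size;   /* next Reference *, or (n<<2)|1   */
} Reference;

/* The two hidden words at the beginning of an allocated segment. */
typedef struct SegmentHeader {
    void *mc;                 /* move code (explained later)     */
    Reference *refs;          /* head of the chain of references */
} SegmentHeader;

/* Real pointers are divisible by 4, so the size word is odd. */
int is_last_reference(const Reference *r) {
    return (r->next_or_size & 3) == 1;
}
uintptr_t size_in_words(const Reference *last) {
    return last->next_or_size >> 2;
}
```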
Now, what happens if this reference is shared, i.e. if we make a virtual copy of it ?
Here is what happens:
-------------------- --------------
| | second | |
| V reference | V first reference
| +---------+----|----+ +---------+---------+
| | * | * | | * | (n<<2)|1|
| +----|----+---------+ +----|----+---------+
| | |
| |-----------------------------
| |
| V
| +---------+---------+----------------------------------+
| | mc | * | the allocated segment |
| +---------+----|----+----------------------------------+
| |
--------------------
Hence, the references to the allocated segment are chained into a list. Since pointers
are divisible by 4, the end of the list is found by the fact that (n<<2)|1 is 1 modulo
4. The list is pointed to by the second hidden word in the allocated segment
itself. Each new reference is added in front of the list, as already shown with the
second one. For example, here is how it looks with 3 references to our segment:
------- -------- --------------
| | third | | second | |
| V reference | V reference | V first reference
| +---------+----|----+ +---------+----|----+ +---------+---------+
| | * | * | | * | * | | * | (n<<2)|1|
| +----|----+---------+ +----|----+---------+ +----|----+---------+
| | | |
| -----------------------------------------------------
| |
| V
| +---------+---------+----------------------------------+
| | mc | * | the allocated segment |
| +---------+----|----+----------------------------------+
| |
--------------------
Notice that the segment is reachable directly from any reference, and that the
allocated segment has two words used for extra 'non data' informations. In the
Reference Counting system there are also two such words: one for the size of the
segment (used by the allocator), and another one for counting the references.
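The virtual-copy step (a new reference becoming the head of the chain) can be sketched in C under the same assumed two-word model; all names here are my own:

```c
#include <stdint.h>
#include <stddef.h>

/* Minimal model of the chain shown in the diagrams above. */
typedef struct Ref {
    void *data;               /* -> data part of the segment        */
    uintptr_t next_or_size;   /* next Ref *, or (n<<2)|1 at the end */
} Ref;

/* Virtual copy: the new reference points at the same data and
   becomes the new head of the chain; the head pointer lives in the
   second hidden word of the segment. */
void virtual_copy(Ref **chain_head, Ref *new_ref, void *data) {
    new_ref->data = data;
    new_ref->next_or_size = (uintptr_t)*chain_head;
    *chain_head = new_ref;
}
```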
Now, here is how virtual deletion is performed. When a reference needs to be deleted
(i.e. a copy of the datum is virtually deleted), we must know whether or not it is the
head of the list. To that end, we first go to the allocated segment, and then follow
the list until we find our reference again, which tells us whether it is the first one
in the list. If it is the first one, we may easily determine if it is the last reference
to our segment by looking at the parity of the second word of the reference.
Case 1: There is only one reference to the allocated segment. In this case the segment
is freed.
Before deletion:
--------------------
| |
| V
| +---------+---------+
| | * | (n<<2)|1| the 'reference' (two words)
| +----|----+---------+
| |
| |
| V
| +---------+---------+----------------------------------+
| | mc | * | the allocated segment |
| +---------+----|----+----------------------------------+
| |
--------------------
After deletion: There is nothing left. The allocated segment returns to the pool of
free segments, after all the references it contains are virtually deleted (recursive
process).
Case 2: There are several references, and the reference to be deleted is the head of
list. In this case the next reference becomes the head of list:
Before deletion:
-- -------- ---........---
| | | | | |
| V | V | V
| +---------+----|----+ +---------+----|----+ +---------+---------+
| | * | * | | * | * | | * | (n<<2)|1|
| +----|----+---------+ +----|----+---------+ +----|----+---------+
| | | |
| -----------------------------------------------------
| |
| V
| +---------+---------+----------------------------------+
| | mc | * | the allocated segment |
| +---------+----|----+----------------------------------+
| |
--------------------
After deletion:
-------------------------- ---........---
| deleted | | |
| reference V | V
| +---------+----|----+ +---------+----|----+ +---------+---------+
| | * | * | | * | * | | * | (n<<2)|1|
| +----|----+---------+ +----|----+---------+ +----|----+---------+
| | |
| -----------------------------------
| |
| V
| +---------+---------+----------------------------------+
| | mc | * | the allocated segment |
| +---------+----|----+----------------------------------+
| |
--------------------
Notice that we just have to change the pointer in the second hidden word of the
segment.
Case 3: The reference to be deleted is neither the head of the list nor the last one in
the list. In this case it must be removed from the list. We illustrate as above:
After deletion:
-- ---------------------------........---
| | | deleted |
| V | reference V
| +---------+----|----+ +---------+----|----+ +---------+---------+
| | * | * | | * | * | | * | (n<<2)|1|
| +----|----+---------+ +----|----+---------+ +----|----+---------+
| | |
| -----------------------------------------------------
| |
| V
| +---------+---------+----------------------------------+
| | mc | * | the allocated segment |
| +---------+----|----+----------------------------------+
| |
--------------------
Notice that this amounts to changing a pointer in the previous element of the list.
Case 4: The reference to be deleted is the last one in the list. In this case we just
have to move the word (n<<2)|1 from the last reference to the previous one, as
illustrated below:
After deletion:
-- ---.......--
| | | | deleted
| V | V reference
| +---------+----|----+ +---------+---------+ +---------+---------+
| | * | * | | * | (n<<2)|1| | * | (n<<2)|1|
| +----|----+---------+ +----|----+---------+ +----|----+---------+
| | |
| ---------------------------
| |
| V
| +---------+---------+----------------------------------+
| | mc | * | the allocated segment |
| +---------+----|----+----------------------------------+
| |
--------------------
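The four cases can be condensed into one sketch (the chain model and names are assumptions; note that cases 3 and 4 collapse into the same pointer move, because the tagged size word (n<<2)|1 travels in the same field as the next pointer):

```c
#include <stdint.h>
#include <stddef.h>

typedef struct Ref {
    void *data;               /* -> data part of the segment        */
    uintptr_t next_or_size;   /* next Ref *, or (n<<2)|1 at the end */
} Ref;

/* Delete one reference; returns 1 when it was the only one, i.e.
   when the segment itself may now be freed (case 1). */
int virtual_delete(Ref **chain_head, Ref *r) {
    if (*chain_head == r) {
        if ((r->next_or_size & 3) == 1)        /* case 1: only one  */
            return 1;
        *chain_head = (Ref *)r->next_or_size;  /* case 2: new head  */
        return 0;
    }
    Ref *prev = *chain_head;                   /* walk to the       */
    while ((Ref *)prev->next_or_size != r)     /* predecessor       */
        prev = (Ref *)prev->next_or_size;
    prev->next_or_size = r->next_or_size;      /* cases 3 and 4     */
    return 0;
}
```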
*** (13.2) Moving an allocated segment within memory.
This system has been designed for moving segments within memory without interrupting the
program for more than the time needed to move just one segment. However, there is a
problem. Indeed, if a pointer is moved in memory, it still points to the same
position. If a reference is moved, the left hand pointer will still point to the data
part of the segment, and the right hand pointer (if any) to the next reference, but the
problem is that the right hand pointer of the previous reference (or the second hidden
word in the segment) will no longer point to our reference. In other words, moving a
reference entails that the right hand pointer in the previous reference must be updated.
Of course, this is not very difficult to perform, but again, it implies a walk through
the list of references.
A more delicate problem comes from the fact that the data part of the segment
itself may contain references. If we move this segment by a naïve copy, the chains of
these references will be broken.
A solution is that the compiler produces a ``move code'' for each sort of segment. When
called, this move code may properly move the segment, because it knows where the
references are within it. The field 'mc' in the first hidden word is the address of the
move code of the segment.
Hence, in order to move an allocated segment within memory, we just have to do the
following (without being interrupted):
- copy the segment at the new position using its move code,
- follow the chain of references, and in each reference change the pointer to the
segment.
Notice that the data part of the segment and the chain of its references cannot share a
reference. Indeed, such a reference would point to the segment containing it, which is
impossible.
Before moving the segment:
-- -------- ---........---
| | | | | |
| V | V | V
| +---------+----|----+ +---------+----|----+ +---------+---------+
| | * | * | | * | * | | * | (n<<2)|1|
| +----|----+---------+ +----|----+---------+ +----|----+---------+
| | | |
| -----------------------------------------------------
| |
| V
| +---------+---------+----------------------------------+
| | mc | * | the allocated segment |
| +---------+----|----+----------------------------------+
| |
--------------------
After the segment is moved:
-- -------- ---........---
| | | | | |
| V | V | V
| +---------+----|----+ +---------+----|----+ +---------+---------+
| | * | * | | * | * | | * | (n<<2)|1|
| +----|----+---------+ +----|----+---------+ +----|----+---------+
| | | |
| -----------------------------------------------------
| |
| V
| +---------+---------+----------------------------------+
| | mc | * | the allocated segment moved |
| +---------+----|----+----------------------------------+
| |
----------------------------------------------
Notice that only the pointers in the first part of all references have to be
changed. Of course, this whole operation must be done without being interrupted. In
other words, this process must complete before yielding (to the scheduler, probably).
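The second step of the move (walking the chain to retarget the first word of every reference) could be sketched as follows, under the same assumed two-word model:

```c
#include <stdint.h>

typedef struct Ref {
    void *data;               /* -> data part of the segment        */
    uintptr_t next_or_size;   /* next Ref *, or (n<<2)|1 at the end */
} Ref;

/* After the move code has copied the segment, every reference in the
   chain must point to the new position of the data part. */
void retarget_references(Ref *head, void *new_data) {
    for (Ref *r = head;; r = (Ref *)r->next_or_size) {
        r->data = new_data;
        if ((r->next_or_size & 3) == 1)  /* tagged size word: last */
            break;
    }
}
```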
*** (13.3) Compatibility with segregated lists.
It is known that segregated lists are a very important feature of allocators. When we
replaced our old system without segregated lists by a new one with segregated lists,
the speed of Anubis was multiplied by 3 or 4. The system of segregated lists works
as follows.
The free segments (those which are not allocated) are gathered into several lists
instead of just one. There is one list for each small length: 1 word, 2
words, ..., 20 words (say). These lists are called 'segregated lists' because the
segments they contain are all of the same length. Of course, there is another list for
allocating bigger segments. This last list is not segregated because it contains
segments of different lengths. It should be noted that most allocated segments are
small, hence the importance of segregated lists.
The allocator has a small family of pointers (maybe 21 pointers) for all these
lists. When an allocation is required, the right list is used depending on the size of
the segment to be allocated. If it is small, the segment is taken from one of the
segregated lists. Otherwise, it is taken from the non segregated list.
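A sketch of this size-based dispatch (the constant 20 follows the text; the structure and names are assumptions):

```c
#include <stddef.h>

enum { NUM_SEGREGATED = 20 };   /* lists for 1..20 words, as above */

typedef struct FreeSeg { struct FreeSeg *next; } FreeSeg;

typedef struct {
    FreeSeg *segregated[NUM_SEGREGATED]; /* [i]: (i+1)-word segments */
    FreeSeg *non_segregated;             /* bigger segments          */
} FreeLists;

/* Take a segment from the right segregated list in O(1); a NULL
   result means the caller must fall back on the non segregated list. */
FreeSeg *take_small(FreeLists *fl, size_t words) {
    if (words == 0 || words > NUM_SEGREGATED)
        return NULL;                     /* not a 'small' request   */
    FreeSeg **list = &fl->segregated[words - 1];
    FreeSeg *seg = *list;
    if (seg != NULL)
        *list = seg->next;               /* unlink the head         */
    return seg;
}
```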
Since all segments in a segregated list have the same size, the first one in the list
is always suitable. On the contrary, in a non segregated list, we must find a segment
which is big enough, and we must carve a piece from it. When a big segment is freed, it
must be put into the non segregated list and possibly glued to its neighbours in order
to make bigger free segments. It is easy to understand that this requires much more
work. Our experiments have shown that finding a segment in a non segregated list may
require examining several thousand segments. With the new system, the non
segregated list remains quite small (several tens of segments). In the previous system
the unique non segregated list generally had several thousands, or even hundreds of
thousands, of segments.
The system of 'Reference Chaining' is obviously compatible with segregated lists. We
will assume that all the segments (free and allocated) are parts of a list of bigger
'main segments'. The segregated lists and the non segregated list are mixed together
within these main segments, along with the allocated segments.
When a segregated list is empty, and a segment is required from this list, a segment is
allocated from the non segregated list and put in the chosen segregated list. When the
non segregated list is empty, a new big 'main segment' is requested from the host
system, and the useful part of this segment becomes a new segment of the non segregated
list.
Main segments are also chained together into a list.
*** (13.4) Defragmenting memory.
Now, we arrive at the big question: How to defragment memory ?
The question is to move segments so that a main segment becomes 'empty'. When a main
segment is empty, it may be returned to the host allocator. By 'empty', we mean that it
contains no allocated segment, but it may contain free segments. Of course, the lists
of free segments must be updated before the main segment is freed, i.e. the free
segments which lie within the main segment to be returned to the host must be removed
from the lists of free segments.
The purpose of defragmenting is freeing main segments. Moving an allocated segment
from one main segment to another one may result in a main segment containing no
allocated segment at all. In this case, the free segments it contains may be removed
from their respective lists, and the main segment may be freed.
When the allocator allocates a new segment, is it heavy work to determine the main
segment to which it belongs? A priori, we must follow the chain of main segments. For
each main segment we easily get its beginning and end addresses, and
we just have to compare the address of the allocated segment to this beginning and
end.
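The address comparison just described could be sketched as follows (the chain node layout is an assumption, not the actual one):

```c
#include <stddef.h>

/* Main segments are chained into a list (section 13.3); here each
   node also records the address range it covers. */
typedef struct MainSeg {
    struct MainSeg *next;
    char *begin, *end;        /* the range [begin, end) it covers */
} MainSeg;

/* Follow the chain and compare addresses, as described above. */
MainSeg *find_main_segment(MainSeg *chain, const void *addr) {
    for (MainSeg *m = chain; m != NULL; m = m->next)
        if ((const char *)addr >= m->begin && (const char *)addr < m->end)
            return m;
    return NULL;
}
```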
However, each free segment could also contain a pointer to the main segment it belongs
to. This is easily done when creating the free segments. That way, we can determine the
main segment more quickly. However, this creates a problem when the allocated segment is
deallocated, because within an allocated segment, there is no pointer to the main
segment containing it. Hence, we could have to follow the chain of main segments when
deallocating an allocated segment. This is probably not a good solution.
But, we could also have two hidden words in any allocated segment, one pointing to the
chain of references to this segment, and another one pointing to the relevant main
segment. In this case, it is easy to count, for each main segment, the number of
allocated segments it contains. When this number becomes 0, the main segment may be
freed (after the lists of free segments have been cleaned up).
-- -------- ---........---
| | | | | |
| V | V | V
| +---------+----|----+ +---------+----|----+ +---------+---------+
| | * | * | | * | * | | * | (n<<2)|1|
| +----|----+---------+ +----|----+---------+ +----|----+---------+
| | | |
| -----------------------------------------------------
| |
| V
| +---------+---------+-----------------------------------+
| | * | * | the allocated segment |
| +----|----+----|----+-----------------------------------+
| | |
--------------- V
main segment containing this one
A main segment may become 'empty' (i.e. contain no allocated segment) even if we never
move a segment within memory. Now we have to examine a strategy for moving the
allocated segments.
The most obvious strategy is to move the allocated segments from main segments which
are far from the beginning of the chain of main segments into main segments which are
near the beginning of this chain. That way, main segments which are far from the
beginning have some chance to become empty. This is probably what we should do.
*** (13.5) This allocator seen from the outside.
*** (13.5.1) Hidden words.
We discuss how other parts of the system should see and use this
allocator/garbage-collector. First of all, what is visible from the outside? The parts
which are hidden are shown like this: //////// in the next picture:
+---------+---------+ +---------+---------+ +---------+---------+
| * |/////////| | * |/////////| | * |/////////|
+----|----+---------+ +----|----+---------+ +----|----+---------+
| | |
-----------------------------------------------------
|
V
+---------+---------+-----------------------------------+
|/////////|/////////| the allocated segment |
+---------+---------+-----------------------------------+
In other words:
- Each reference has the size of two words (presumably 64 bits for the time being).
- Only the first word in each reference may be used. It is used as a pointer to the
allocated segment. The second word is manipulated by the allocator only.
- The pointer points to the data area of the allocated segment (no 'reference
counting' word).
Of course, the references may be aligned anywhere in memory, i.e. not necessarily on an
address divisible by 4. So there is no unused bit in the second (hidden) word of a
reference. On the contrary, the allocated segment may be aligned on an address
divisible by 4, or preferably by 8. In this last case, there are 3 unused bits in the
first word (the 3 bits of lowest weight). This allows us to continue using our
implementation of 'mixed' types, but with up to 8 alternatives instead of only 4. Of
course, in a clever implementation all these constants should be parameters.
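As an illustration, the 3 low-weight bits freed by an 8-byte alignment can carry the
alternative of a mixed type; the macro names below are mine:

```c
#include <stdint.h>

/* With allocated segments aligned on addresses divisible by 8, the 3 bits
   of lowest weight of the pointer word are unused and can encode the
   alternative of a 'mixed' type (up to 8 alternatives). */
typedef uintptr_t Word;

#define MIX(p, alt)  ((Word)(p) | (Word)(alt))  /* alt in 0..7            */
#define MIX_ALT(w)   ((unsigned)((w) & 7u))     /* which alternative      */
#define MIX_PTR(w)   ((w) & ~(Word)7)           /* the pointer proper     */
```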
*** (13.5.2) Tools (interface to the allocator).
Several tools should be provided by the allocator to the rest of the system:
- The 'allocate' function receives the number of words to be allocated; it also
receives the address at which the first reference should be created. It returns a
boolean saying whether the segment has been allocated or not. For example, in C:
Bool allocate(U32 *where, U32 number_of_words);
If it returns 'TRUE', this function fills up the two words at address 'where' with a
pointer to the data part of the allocated segment and with the word (n<<2)|1.
- The 'move_reference' macro or function moves the whole reference from one place to
another one:
void move_reference(U32 *source, U32 *destination);
Of course the source should point to a valid reference, and the destination points to 2
words which will be overwritten. This function must update the reference chaining in
either the allocated segment or in the previous reference of the chain.
- The 'duplicate_reference' macro or function creates a duplicate of an existing
reference:
void duplicate_reference(U32 *source, U32 *destination);
It chains the new reference in front of the list of references.
- The 'delete_reference' macro or function. This function must update the chain of
references and possibly deallocate the segment:
void delete_reference(U32 *reference, U32 del_code);
This function receives the address of a piece of deletion code. This deletion code is
specific to the type of data to be deleted. It is used when the reference to be deleted
is the last one. Of course, this is a recursive process as usual. Actually, the set of
all these pieces of code is the garbage-collector code proper.
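As a sketch of how these tools could manipulate the chain, here is a minimal C model.
The structure and function names are mine, ordinary pointers replace the tagged words
(the end-of-chain tag (n<<2)|1 is modelled by NULL for simplicity), and the
type-specific deletion code is left out:

```c
#include <stddef.h>

/* Hypothetical reconstruction of the reference chaining; names are mine. */
typedef struct Reference {
    struct Segment   *segment; /* first word: the allocated segment       */
    struct Reference *next;    /* second word: next reference in chain    */
} Reference;

typedef struct Segment {
    Reference *first_ref;      /* hidden word: head of the reference chain */
    /* the data words follow */
} Segment;

/* Rewire whatever points at 'ref' (the segment itself or the previous
   reference of the chain) to point at 'repl' instead. */
static void rewire(Reference *ref, Reference *repl)
{
    Segment *seg = ref->segment;
    if (seg->first_ref == ref) {
        seg->first_ref = repl;
    } else {
        Reference *p = seg->first_ref;  /* walk the chain to find the     */
        while (p->next != ref)          /* predecessor of 'ref'           */
            p = p->next;
        p->next = repl;
    }
}

/* Move a whole reference from 'source' to 'destination'. */
static void move_reference(Reference *source, Reference *destination)
{
    *destination = *source;             /* copy the two words             */
    rewire(source, destination);
}

/* Duplicate a reference, chaining the new one in front of the list. */
static void duplicate_reference(Reference *source, Reference *destination)
{
    Segment *seg = source->segment;
    destination->segment = seg;
    destination->next = seg->first_ref;
    seg->first_ref = destination;
}

/* Delete a reference; returns 1 when it was the last one, i.e. when the
   segment may be handed back (after the deletion code has run). */
static int delete_reference(Reference *ref)
{
    Segment *seg = ref->segment;
    rewire(ref, ref->next);
    return seg->first_ref == NULL;
}
```

Note the cost of 'move_reference' and 'delete_reference': with a singly linked chain,
finding the predecessor requires walking the chain from the segment.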
*** (13.6) Permanent data.
We want to be able to handle permanent data, i.e. data created at compile time, which
are never collected.
The second least significant bit in the (hidden) pointer to the main segment in any
data segment may be used for detecting permanent data: 0 = non permanent, 1 =
permanent. Of course, if the segment is permanent, it has actually never been
allocated, and lies within some area outside the allocator's main segments. In this
case, the pointer to the main segment will never be used (it is meaningless) and its
value is 0. Hence, that word is 1. The second word of each reference in this case is
2, indicating that the reference points to a permanent segment (this is redundant of
course, but it avoids one indirection and we have plenty of room for that). References
to permanent segments are not chained together. Of course, the hidden pointer to the
chain of references in the permanent segment doesn't exist. However, the second words
in references must exist:
+---------+---------+ +---------+---------+ +---------+---------+
| * | 2| | * | 2| | * | 2|
+----|----+---------+ +----|----+---------+ +----|----+---------+
| | |
-----------------------------------------------------
|
V
+---------+-----------------------------------+
| 1| a permanent segment |
+---------+-----------------------------------+
*** (13.7) Fake references.
We also want to be able to handle fake references. It is sometimes useful to construct
fake references (as we already did for serialising). It is understood that such fake
references are never manipulated by regular programs but only by the garbage-collector.
The value 0 for a pointer (first word of a reference) may be interpreted as a fake
reference. In a fake reference, the second word is not significant (but must be
present):
+---------+---------+
| 0| not used| a fake reference
+---------+---------+
Notice that a fake reference may be indistinguishable from a datum in a small
alternative in a mixed type. But this has no importance since the garbage-collector
will handle both in the same way (i.e. just ignore them).
*** (13.8) Only one main segment ?
It could be possible to design the system with only one main segment. In this case,
instead of adding new main segments, we just ask the host system to enlarge or shrink
the unique main segment. Of course, in this case, allocated segments must be moved
towards the beginning of the main segment. Here, the question will be to determine the
offset to the right of which nothing is allocated. This is easily done by maintaining a
pointer to this offset.
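A minimal sketch of that watermark, with assumed names (asking the host to enlarge the
segment is left out):

```c
#include <stddef.h>

/* The unique main segment keeps a watermark 'top': the offset to the
   right of which nothing is allocated.  Names are assumptions. */
typedef struct {
    unsigned char *base;   /* start of the unique main segment            */
    size_t         size;   /* current size obtained from the host system  */
    size_t         top;    /* nothing is allocated beyond this offset     */
} UniqueSegment;

static void *watermark_allocate(UniqueSegment *m, size_t nbytes)
{
    if (m->top + nbytes > m->size)
        return NULL;             /* would have to ask the host to enlarge */
    void *p = m->base + m->top;
    m->top += nbytes;            /* maintain the watermark                */
    return p;
}
```

Compaction would move allocated segments towards 'base' and lower 'top' accordingly.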
*** (14) Static garbage-collection.
Whether we use the above reference-chaining garbage-collector or the more traditional
reference-counting garbage-collector, it is in all cases worthwhile to minimize the
number of virtual copies and virtual deletions by an optimisation of the code. This is
called ``static garbage-collection''. This may improve performance drastically,
because it will eliminate most unneeded copy/delete pairs.
*** (14.1) The facts.
Consider the function 'length' defined for lists by:
public define Int32
length
(
List($T) l
) =
if l is
{
[ ] then 0,
[h . t] then 1 + length(t)
}.
In the current implementation of Anubis (version 1.8), this function increments and
decrements the counter of each node of the list 3 times. All this is wasted time.
Static garbage-collection is able to eliminate 2 of these 3 increment/decrement pairs.
There is no hope of eliminating the third one. Indeed, imagine for example that the
list given to the function 'length' as its argument is not shared, i.e. that all the
counters in the nodes of the list are 1. Then, the list must have disappeared after the
function 'length' returns. Hence, the function must be able to decrement counters. What
happens in this case is that the counter of the tail is first incremented when a
virtual copy of the tail is taken from the first node. Then the first node is
decremented to zero, so that the second node is decremented to 1, etc... Now, if the
list is shared, for example if the counter of the first node is 2 and the counters of
the others are 1, the counter of the tail is similarly incremented, but when the first
node is abandoned by the function 'length', its counter is just decremented to 1,
etc...
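To make this counting concrete, here is a small C model of a reference-counted list.
All names are mine, and the operation counts model the mechanism rather than reproduce
the exact Anubis instructions: 'length_naive' mirrors the three copy/delete pairs of
the unoptimised code, 'length_opt' keeps only the unavoidable one; both consume their
argument, as Anubis functions do.

```c
#include <stdlib.h>

/* A reference-counted cons list; 'ops' counts every increment or
   decrement of a node counter. */
typedef struct Node {
    int          count;             /* reference counter                  */
    int          head;
    struct Node *tail;              /* NULL = empty list (direct, so no   */
} Node;                             /* counter operations on it)          */

static long ops;

static Node *retain(Node *n) { if (n) { n->count++; ops++; } return n; }

static void release(Node *n)        /* cascades when a counter reaches 0  */
{
    while (n) {
        ops++;
        if (--n->count > 0) return;
        Node *t = n->tail;
        free(n);
        n = t;
    }
}

/* Unoptimised 'length': the argument is copied at entry (clR), the tail
   during unstore (ctU) and again before the recursive call (ctR); every
   copy is deleted afterwards. */
static int length_naive(Node *l)
{
    retain(l);                            /* clR                          */
    if (l == NULL) return 0;
    Node *t = retain(l->tail);            /* ctU                          */
    release(l);                           /* dlR                          */
    retain(t);                            /* ctR (copy for the call)      */
    int n = 1 + length_naive(t);
    release(t);                           /* dtS                          */
    release(l);                           /* dlS                          */
    return n;
}

/* Statically garbage-collected version: one copy of the tail per node,
   deleted by the callee consuming its argument. */
static int length_opt(Node *l)
{
    if (l == NULL) return 0;
    Node *t = retain(l->tail);
    release(l);
    return 1 + length_opt(t);
}

static Node *cons(int h, Node *t)
{
    Node *n = malloc(sizeof *n);
    n->count = 1; n->head = h; n->tail = t;
    return n;
}
```

On an unshared 3-node list, the naive version performs 17 counter operations against 7
for the optimised one; in the naive version each node after the first is incremented 3
times, as stated above.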
Now, we examine the code for the function 'length'. Here is the code produced for
'length' by Anubis 1.8, instantiated for lists of strings.
9 | label 23
| ;--- stack ------------------------
| ; 0 l List(String)
| ; 1 (return address)
| ;----------------------------------
9 | check_stack 7
14 | peek "l" 0
19 | copy_mixed 2
21 | index_direct 2
23 | _switch 24 25
33 | label 25
33 | unstore_copy_mixed 8 2
36 | del_mixed 2 15
| ;--- stack ------------------------
| ; 0 t List(String)
| ; 1 l List(String)
| ; 2 (return address)
| ;----------------------------------
42 | push_addr 27
47 | peek "t" 1
52 | copy_mixed 2
54 | push
| ;--- stack ------------------------
| ; 0 List(String)
| ; 1 (return address)
| ; 2 t List(String)
| ; 3 l List(String)
| ; 4 (return address)
| ;----------------------------------
55 | address 23 ('length' in '/home/alp/essai.anubis' at line 3)
60 | apply 1
62 | ret_point 27
62 | push
63 | load_word32 1
68 | push
| ;--- stack ------------------------
| ; 0 Int32
| ; 1 Int32
| ; 2 t List(String)
| ; 3 l List(String)
| ; 4 (return address)
| ;----------------------------------
69 | int32_plus
72 | collapse 0
77 | collapse 0
82 | del_stack_mixed 0 2 15
92 | del_stack_mixed 0 2 15
102 | ret 1
103 | label 24
| ;--- stack ------------------------
| ; 0 l List(String)
| ; 1 (return address)
| ;----------------------------------
103 | load_word32 0
108 | del_stack_mixed 0 2 15
118 | ret 1
The argument 'l' is first copied at offset 19. In the case of a non-empty list, we
jump to label 25. The tail 't' of the list 'l' is copied at offset 33, and 'l' is
virtually deleted at offset 36. 't' is copied again at offset 52, because it is present
two times in the stack (at offset 55, in slots 0 and 2). At offset 60 the function
calls itself, so that 't' is again copied at offset 19 during the call. Hence, 't' has
been virtually copied 3 times. Of course, it is also virtually deleted 3 times (one
time at offset 82, and two times at offsets 36 and 92 (or 108) during the recursive
call). This is obviously too much and one of the main causes of the slowness of Anubis
1.
*** (14.2) Moving 'copy' and 'delete' instructions.
The first principle of static garbage-collection is that a sequence of instructions
like
copy
del
may be removed, provided they act on the same counter. Indeed, the counter is
incremented, and decremented immediately. So, this has no effect.
Consequently, the question is how to move the copy instructions downwards within the
code, and the delete instructions upwards within the code, so that they disappear when
they meet. Of course, a delete instruction may move up or down to a certain point
where it is blocked. In this case, if the corresponding copy instruction is also
blocked, they will not meet. In other words, static garbage-collection will not
eliminate all copy/delete pairs. Nevertheless, it eliminates most of them, providing an
important enhancement in speed. In the case of the 'length' function, it eliminates 2
of the 3 pairs, as we shall see.
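The cancellation rule can be sketched as a tiny peephole pass; the instruction
encoding below is an assumption, not the Anubis one:

```c
#include <string.h>

/* An adjacent 'copy x' / 'del x' pair acting on the same counter has no
   net effect and may be removed. */
typedef enum { COPY, DEL, OTHER } Op;
typedef struct { Op op; int target; } Instr;

/* Remove every adjacent copy/del pair on the same target; returns the
   new length.  Rescanning after each removal lets pairs that became
   adjacent cancel too. */
static int cancel_pairs(Instr *code, int n)
{
    int changed = 1;
    while (changed) {
        changed = 0;
        for (int i = 0; i + 1 < n; i++) {
            if (code[i].op == COPY && code[i + 1].op == DEL
                && code[i].target == code[i + 1].target) {
                memmove(&code[i], &code[i + 2],
                        (size_t)(n - i - 2) * sizeof(Instr));
                n -= 2;
                changed = 1;
                break;
            }
        }
    }
    return n;
}
```

The moving of copies downwards and deletes upwards, described next, is what brings
such pairs into adjacent positions in the first place.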
The code for a function is not linear. It looks much more like a tree, where nodes
correspond to conditionals. If the conditional has 'n' cases, the node has 'n'
branches. For example, the code for 'length' above, corresponds to the following tree,
where the numbers are the offsets of the instructions:
9 (beginning of function code)
|
|
23 (_switch)
---------------
33 103
| |
branch for non empty lists | | branch for the empty list
| |
| |
102 (ret) 118 (ret)
The reason why the branches do not join together is that optimisation using the end
code (in particular, elimination of terminal recursion) is not the same in all
branches. This is why the end code is duplicated and each branch optimized separately
with it.
Now, we consider the same tree, but with the copy/delete instructions:
9
|
19 (copy 'l' in R)
|
23
----------------------------------
33 (copy 't' during unstore) 103
36 (delete 'l' in R) |
| |
52 (copy 't' in R)
| |
82 (delete 't' in stack)
92 (delete 'l' in stack) 108 (delete 'l' in stack)
| |
102 118
We did not represent the 'collapse' instructions used for deleting Int32s, which
actually never call the garbage-collector nor increment/decrement anything, since Int32
is a direct type.
To simplify our explanations, we redraw the above tree like this:
|
clR
|
----------------
ctU |
dlR |
| |
ctR |
| |
dtS |
dlS dlS
| |
It may seem strange that l is deleted 2 times in the left hand branch and 1 time in the
right hand branch. The first copy instruction clR copies 'l' from the stack to R. The
two instructions dlR and dlS in the left hand branch delete the two copies of 'l' in R
and in the stack. On the contrary, in the right hand branch only the copy of 'l' in the
stack is deleted. This is due to the fact that in the right hand branch the compiler
knows that 'l' in R is in the first alternative of 'List(String)' (empty list). It
also knows that this first alternative is small, hence direct. This is why it does not
generate an instruction like dlR at the beginning of the right hand branch. Notice that
the instruction clR tests the alternative of 'l' and uses a mask for determining if
there is a counter or not.
*** (14.3) Moving 'copy' instructions downwards.
We examine how to move copy instructions downwards. In our example, the clR instruction
is concerned. We have the following:
19 | copy_mixed 2
21 | index_direct 2
23 | _switch 24 25
Is it true that 'copy_mixed' commutes with 'index_direct'? We may have a doubt.
Indeed, imagine that we transform the code into:
21 | index_direct 2
19 | copy_mixed 2
23 | _switch 24 25
(where, of course, offsets become meaningless, but we keep them because they allow us
to identify where instructions are coming from), and that the virtual machine gives up
precisely after 'index_direct'. Another virtual machine may have a reference to our
'l', and virtually delete this reference, so that the counter of 'l' is decremented
before having been incremented by our current virtual machine. Actually,
incrementations and decrementations may be permuted provided that the counter never
becomes zero. If a virtual machine executes a 'copy' instruction, this is because it
already has a reference to the datum. Hence, other machines cannot decrement the
counter to zero, because this would mean that this first reference of our machine would
have been deleted by another machine, which is impossible. Summarizing, if a machine
has access to a datum, even just through one path of pointers, the counter of this
datum cannot be decremented to zero by other machines.
So, this is ok. Notice that 'copy' instructions commute with most instructions, but of
course not all instructions.
Now, our 'copy_mixed' is just before the _switch:
21 | index_direct 2
19 | copy_mixed 2
23 | _switch 24 25
33 | label 25
33 | unstore_copy_mixed 8 2
36 | del_mixed 2 15
........
103 | label 24
103 | load_word32 0
108 | del_stack_mixed 0 2 15
........
Any instruction commutes with the _switch provided it is duplicated in all
branches. Hence, we will have this:
21 | index_direct 2
23 | _switch 24 25
33 | label 25
19 | copy_mixed 2
33 | unstore_copy_mixed 8 2
36 | del_mixed 2 15
........
103 | label 24
19 | copy_mixed 2
103 | load_word32 0
108 | del_stack_mixed 0 2 15
........
In other words, whichever branch we take, we still have to count a virtual copy for
'l'.
Now, 'copy_mixed' commutes with 'unstore_copy_mixed' for the reason that
incrementations of any two counters commute, even if they increment the same counter.
Hence, in the left hand branch, we have:
33 | label 25
33 | unstore_copy_mixed 8 2
19 | copy_mixed 2
36 | del_mixed 2 15
........
Now, we are at the point where 'copy_mixed' meets 'del_mixed', and both instructions
may disappear:
33 | label 25
33 | unstore_copy_mixed 8 2
........
In the right hand branch we have:
103 | label 24
19 | copy_mixed 2
103 | load_word32 0
108 | del_stack_mixed 0 2 15
118 | ret 1
which may clearly be transformed into:
103 | label 24
103 | load_word32 0
19 | copy_mixed 2
108 | del_stack_mixed 0 2 15
118 | ret 1
However, 'copy_mixed' cannot be put after 'del_stack_mixed', and more generally a
'copy' instruction cannot pass behind a 'del' instruction, even if they concern
distinct counters (and data of distinct types). Indeed, imagine that the delete
instruction concerns a segment containing a reference to the segment concerned by the
copy instruction:
+------+-------------------------+-----+
| 1 | segment to be deleted | * |
+------+-------------------------+--|--+
|
V
+------+-------------------------+
| 1 | segment to be copied |
+------+-------------------------+
If the second segment is first copied, its counter becomes 2, and is decremented to 1
by the real deletion of the first segment. On the contrary, if the first segment is
deleted first, the two segments will return to the allocator, and our reference to the
second segment will be dangling.
At that point, 'copy_mixed' seems to be meaningless, because there is no corresponding
delete instruction and no function call. However, this is just because the compiler did
not generate the dlR instruction in the right hand branch, as we have already noted. In
order to perform these optimisations, the compiler should generate these instructions.
In that case, the code would have been:
103 | label 24
19 | copy_mixed 2
| del_mixed 2 15
103 | load_word32 0
108 | del_stack_mixed 0 2 15
118 | ret 1
(even if 'del_mixed 2 15' has no effect, since at this point 'l' is the empty list),
and the instructions 'copy_mixed' and 'del_mixed' disappear.
After this 'moving downwards' of our clR instruction, our whole code becomes (where we
have removed the stack representations which may have become erroneous):
9 | label 23
9 | check_stack 7
14 | peek "l" 0
21 | index_direct 2
23 | _switch 24 25
33 | label 25
33 | unstore_copy_mixed 8 2
42 | push_addr 27
47 | peek "t" 1
52 | copy_mixed 2
54 | push
55 | address 23 ('length' in '/home/alp/essai.anubis' at line 3)
60 | apply 1
62 | ret_point 27
62 | push
63 | load_word32 1
68 | push
69 | int32_plus
72 | collapse 0
77 | collapse 0
82 | del_stack_mixed 0 2 15
92 | del_stack_mixed 0 2 15
102 | ret 1
103 | label 24
103 | load_word32 0
108 | del_stack_mixed 0 2 15
118 | ret 1
and each node of the list is incremented/decremented only 2 times. Now, we will see how
to eliminate another pair of copy/delete instructions.
First of all, we consider the 'copy_mixed' at offset 52. It is followed by a 'push'
instruction:
52 | copy_mixed 2
54 | push
The effect of 'push' is to move the content of R onto the stack, so that R becomes
empty. Hence, the counter to be incremented is not reached the same way, and the
permutation of the two instructions gives:
54 | push
52 | copy_stack_mixed 0 2
where '0' means slot 0 in the stack. Recall that the operand '2' in this instruction
is a bit mask (binary 10 in this example) indicating that the first alternative of
'List(String)' is direct (no pointer) and the second one indirect (through a pointer).
At that point, we have this:
54 | push
52 | copy_stack_mixed 0 2
55 | address 23 ('length' in '/home/alp/essai.anubis' at line 3)
60 | apply 1
62 | ret_point 27
The instruction 'copy_stack_mixed' commutes with 'address', since 'address' only puts
an address into R. Notice that 'copy_mixed' would not have commuted with 'address', but
this case will never happen since it would mean that 'address' is used with a non-empty
R, which is impossible. Hence, we have:
54 | push
55 | address 23 ('length' in '/home/alp/essai.anubis' at line 3)
52 | copy_stack_mixed 0 2
60 | apply 1
62 | ret_point 27
Now, our 'copy_stack_mixed 0 2' copies a datum which is transmitted as an argument to
the function called by 'apply'. The called function will virtually delete this
datum. Hence, the corresponding delete instruction is not in our code, but in the code
of the called function (even if it is the same function). Hence, we are blocked here.
Now, we consider the case of the instruction ctU, i.e. 'unstore_copy_mixed 8 2' at
offset 33, whose corresponding delete is 'del_stack_mixed 0 2 15' at offset 82. We
have the following between these two instructions (after our previous transformations):
33 | unstore_copy_mixed 8 2
42 | push_addr 27
47 | peek "t" 1
54 | push
55 | address 23 ('length' in '/home/alp/essai.anubis' at line 3)
52 | copy_stack_mixed 0 2
60 | apply 1
62 | ret_point 27
62 | push
63 | load_word32 1
68 | push
69 | int32_plus
72 | collapse 0
77 | collapse 0
82 | del_stack_mixed 0 2 15
First of all, 'unstore_copy_mixed 8 2' should be replaced by:
33 | unstore 8 4
33 | copy_stack_mixed 0 2
which is equivalent. So, we have this:
33 | copy_stack_mixed 0 2
42 | push_addr 27
47 | peek "t" 1
54 | push
.........
'copy_stack_mixed' commutes with 'push_addr' like this:
42 | push_addr 27
33 | copy_stack_mixed 1 2
47 | peek "t" 1
54 | push
.........
This is because the datum is now in slot 1 of the stack. It commutes with 'peek'
without modification:
42 | push_addr 27
47 | peek "t" 1
33 | copy_stack_mixed 1 2
54 | push
.........
Again the commutation with 'push' modifies the instruction, and the commutation with
'address' poses no problem:
.........
47 | peek "t" 1
52 | copy_mixed 2
54 | push
55 | address 23 ('length' in '/home/alp/essai.anubis' at line 3)
33 | copy_stack_mixed 2 2
52 | copy_stack_mixed 0 2
60 | apply 1
.........
The commutation between the two 'copy_stack_mixed' is also possible:
55 | address 23 ('length' in '/home/alp/essai.anubis' at line 3)
52 | copy_stack_mixed 0 2
33 | copy_stack_mixed 2 2
60 | apply 1
Now, we are just before 'apply 1'. Here, there is no problem for commuting, because
slot 2 in the stack is below the return address in slot 1. The position of this return
address is also the operand of 'apply', so that the decision to permute is easy to
make. However, since the called function deletes its arguments and pops its return
address from the stack, the depth operand of the instruction must be modified (it
becomes 0 here):
.........
55 | address 23 ('length' in '/home/alp/essai.anubis' at line 3)
52 | copy_stack_mixed 0 2
60 | apply 1
62 | ret_point 27
33 | copy_stack_mixed 0 2
62 | push
63 | load_word32 1
68 | push
69 | int32_plus
.........
Of course, we pass 'apply' and 'ret_point' together, because the return of the call is
just at the 'ret_point'. The subsequent 'push', 'load_word32' and 'push' may commute
like this:
.........
55 | address 23 ('length' in '/home/alp/essai.anubis' at line 3)
52 | copy_stack_mixed 0 2
60 | apply 1
62 | ret_point 27
62 | push
63 | load_word32 1
68 | push
33 | copy_stack_mixed 2 2
69 | int32_plus
.........
The instruction 'int32_plus' (a syscall) does not modify the stack. This is why its
two arguments are popped from the stack by the next two 'collapse' instructions, which
are
also easily permuted with our 'copy_stack_mixed'. Despite the fact that it is a delete
instruction, 'collapse' commutes with a copy instruction since it is just a stack
adjustment (no counter is decremented by 'collapse'). Now, we have:
.........
55 | address 23 ('length' in '/home/alp/essai.anubis' at line 3)
52 | copy_stack_mixed 0 2
60 | apply 1
62 | ret_point 27
62 | push
63 | load_word32 1
68 | push
69 | int32_plus
72 | collapse 0
77 | collapse 0
33 | copy_stack_mixed 0 2
82 | del_stack_mixed 0 2 15
.........
At that point, our copy instruction has met its corresponding delete instruction. That
it is actually the corresponding instruction is visible from the fact that they have
the same depth operand (0 in this case). Hence they disappear together; however, not
exactly, because they produce a 'collapse 0' for stack synchronization.
Finally, our code for the function 'length' is optimized as follows:
9 | label 23
9 | check_stack 7
14 | peek "l" 0
21 | index_direct 2
23 | _switch 24 25
33 | label 25
33 | unstore 8 4
42 | push_addr 27
47 | peek "t" 1
54 | push
55 | address 23 ('length' in '/home/alp/essai.anubis' at line 3)
52 | copy_stack_mixed 0 2
60 | apply 1
62 | ret_point 27
62 | push
63 | load_word32 1
68 | push
69 | int32_plus
72 | collapse 0
77 | collapse 0
| collapse 0
92 | del_stack_mixed 0 2 15
102 | ret 1
103 | label 24
103 | load_word32 0
108 | del_stack_mixed 0 2 15
118 | ret 1
and we still have one increment/decrement operation for each node of our list. The
increment is performed by 'copy_stack_mixed 0 2' at (pseudo-)offset 52, and the
corresponding decrement is performed by the called function either at offset 92 or at
offset 108.
*** (14.4) Moving 'delete' instructions upwards.
If a copy instruction cannot move downwards, we may try to move the corresponding
'delete' instruction upwards. In the case of our example, already optimized as shown
above, we know that we cannot eliminate the last pair of copy/delete instructions. If
we try to move the 'del_stack_mixed' at offset 92 upwards, it will be blocked just
below the 'copy_stack_mixed' at (pseudo-)offset 52, as already noticed. Hence, the
other 'del_stack_mixed', at offset 108, will not be able to pass the switch, because
the same instruction is not coming from the other branch.
However, what would happen in another situation? Probably, moving delete instructions
upwards is of no interest, because if a delete instruction can move upwards to meet a
copy instruction, the copy instruction can probably just as well move downwards to meet
the delete instruction. I did not check that formally, because there are too many
instructions in the Anubis 1 virtual machine.
*** (14.5) More peephole optimizations.
What we have done in the above subsections is a special case of 'peephole
optimization'. We may examine other possible similar optimizations.
*** (15) Voluntary multitasking.
In Anubis 1.8, multitasking is realized within one thread of the host (UNIX, Windows,
...). In other words, Anubis has its own multitasking system. This system works as
follows. The scheduler starts a virtual machine with a ``credit'' of instructions to
execute (for example 500). At each instruction executed, the system decrements the
credit. When the credit becomes 0, the system returns to the scheduler.
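A minimal model of this credit scheme, with assumed names (the 'instruction' executed
here is a toy):

```c
/* The scheduler hands each virtual machine a credit of instructions; the
   interpreter decrements it at every instruction and returns to the
   scheduler when it reaches 0. */
typedef struct {
    long pc;          /* instructions executed so far (toy state)         */
    long remaining;   /* work left before the machine finishes            */
} Machine;

/* Run one time slice; returns 1 if the machine still has work to do. */
static int run_slice(Machine *m, int credit)
{
    while (m->remaining > 0) {
        m->pc++;                      /* 'execute' one instruction        */
        m->remaining--;
        if (--credit == 0)            /* credit exhausted:                */
            return m->remaining > 0;  /* back to the scheduler            */
    }
    return 0;                         /* machine finished early           */
}
```

The per-instruction decrement and test in this loop is exactly the overhead the rest
of this section tries to remove.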
However, a virtual machine may give up (return to the scheduler) voluntarily. Some
instructions do that: of course the instruction 'give_up', but also the instruction
which tries to read a byte from a connection. If it cannot read the byte immediately,
it gives up (i.e. it waits).
This is not very good because of the overhead created by the decrement and test of the
credit at each instruction. A better solution would be to have an instruction
'may_give_up', for example
at the beginning of all functions, in any case at places where it is sure that such
instructions are executed often. This instruction may either give up directly, or
decrement a credit, and give up only if the credit is 0. I will call this ``voluntary
multitasking''.
Since function codes contain no loop (as we saw already they are much like trees, with
the entrance at the root and an exit at each leaf), the time needed for going from the
root to a leaf is always very small. Hence, putting the instruction 'may_give_up' at
the beginning of functions only is enough. This would also have the advantage that we
could rely on the assumption that the virtual machine will never be interrupted between
two function calls. This may be useful (maybe).
We may still enhance this solution, especially if we generate native code for the
physical machine. We may use a hardware interrupt (a timer). This interrupt would set a
flag meaning ``it's time to give up''. The 'may_give_up' instruction would just test
this flag, without having to decrement a credit counter.
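A sketch of this timer-flag variant, with assumed names; a real system would arm a
periodic timer (e.g. with setitimer on POSIX), while the test below simply raises the
signal by hand:

```c
#include <signal.h>

/* The timer interrupt only sets a flag; 'may_give_up' tests it, so there
   is no per-instruction credit counting at all. */
static volatile sig_atomic_t time_to_give_up = 0;

static void on_timer(int sig)
{
    (void)sig;
    time_to_give_up = 1;      /* ``it's time to give up'' */
}

/* Executed at the beginning of every function: a cheap test of the flag. */
static int may_give_up(void)
{
    if (!time_to_give_up)
        return 0;
    time_to_give_up = 0;      /* consume the flag                 */
    return 1;                 /* caller returns to the scheduler  */
}
```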