*Project* Anubis *Title* Outputing a parser automaton in Anubis. *Copyright* Copyright (c) Alain Proute' 2006. *Author* Alain Proute' *Created* March 2006 *Revised* March 2006 *Overview* The tool defined in this file is part of the Anubis Parser Generator (APG). Given an abstract parser automaton (as computed by `make_APG_automaton`; see `make_automaton.anubis`), it outputs the automaton in the form of an Anubis program. *Public* read common.anubis read tools/streams.anubis *Name* anubis_output *Description* This function takes a grammar and outputs a corresponding parser in Anubis into a file. public define One anubis_output ( APG_Grammar g, String output_filename, List(APG_Option) options ). *Private* read tools/basis.anubis ------------------------------- Table of Contents ------------------------------------- *** [1] Precedence, association mode and type tables. *** [1.1] The precedence table. *** [1.1.1] Computing the precedence table. *** [1.1.2] Using the precedence table. *** [1.1.3] Printing the precedence table. *** [1.2] The association mode table. *** [1.2.1] Computing the association mode table. *** [1.2.2] Using the association mode table. *** [1.2.3] Printing the association mode table. *** [1.3] The type table. *** [1.3.1] Computing the type table. *** [1.3.2] Using the type table. *** [1.3.3] Printing the type table. *** [1.4] Flattening classes. *** [1.5] Getting the longuest stack for a state. *** [1.6] Getting a grammar rule by its id. *** [1.7] Computing the precedence level of a grammar rule. *** [2] Resolving conflicts. *** [2.1] Computing the list of behaviors for a state. *** [2.2] Making the list of conflicts for a state. *** [2.2.1] Computing the resolution of a shift/reduce conflict. *** [2.2.2] Computing all conflicts and their resolutions. *** [2.2.3] Getting the resolution of a conflict. *** [2.3] Printing conflicts. *** [3] Outputing the parser program. *** [3.1] Outputing parser specific types. *** [3.1.1] The types of tokens. *** [3.1.2] The type of non terminals. *** [3.1.3] The type 'Ret_...'. *** [3.1.4] Outputing the 'Lexer_...' type. *** [3.2] Outputing the declaration of the function 'vmsg'. *** [3.3] Outputing the 'reduce_n' functions. *** [3.5] Outputing states. *** [3.5.1] Printing the state header. *** [3.5.2] Printing scenarios. *** [3.5.3] Printing the transitions. *** [3.5.4] Printing the reductions. *** [3.5.5] Printing acceptable tokens. *** [3.5.6] Printing the restart function *** [3.5.7] Printing the state function. *** [3.5.8] Printing a whole state. *** [4] The interface. *** [5] Testing. --------------------------------------------------------------------------------------- define One print(Stream s, String t) = forget(write_string(s,t)). *** [1] Precedence, association mode and type tables. The grammar contains precedence/association declarations and type declarations. We must organize these data into several tables (actually association lists): - a 'precedence table' with entries of the form (symbol,precedence_level), - a 'association mode table' with entries of the form (precedence_level,mode), - a 'type table' with entries of the form (symbol,type). *** [1.1] The precedence table. *** [1.1.1] Computing the precedence table. type PrecEntry: prec_entry(String symbol, Word32 level). define List(PrecEntry) make_precedence_table ( List(APG_Precedence_Dec) decs ) = if decs is { [ ] then [ ], [h . t] then with n = level(h), names = symbol_names(h), rest = make_precedence_table(t), map((String s) |-> prec_entry(s,n), names) + rest }. *** [1.1.2] Using the precedence table. define Maybe(Word32) get_precedence ( String symbol, List(PrecEntry) prec_table ) = if prec_table is { [ ] then failure, [h . t] then if h is prec_entry(s,l) then if symbol = s then success(l) else get_precedence(symbol,t) }. *** [1.1.3] Printing the precedence table. define One print ( Stream s, List(PrecEntry) prec_table ) = print(s,"\n --- Precedence table ---"); map_forget((PrecEntry e) |-> if e is prec_entry(sym,level) then print(s,"\n "+right_pad(sym,15)+" "+level), prec_table). *** [1.2] The association mode table. *** [1.2.1] Computing the association mode table. type AssocMode: left, right, non_assoc. define String to_string ( AssocMode m ) = if m is { left then "left", right then "right", non_assoc then "non_assoc" }. type AssocEntry: assoc_entry(Word32 level, AssocMode mode). define List(AssocEntry) make_association_table ( List(APG_Precedence_Dec) decs ) = if decs is { [ ] then [ ], [h . t] then [if h is { left(level,names) then assoc_entry(level,left), right(level,names) then assoc_entry(level,right), non_assoc(level,names) then assoc_entry(level,non_assoc), } . make_association_table(t)] }. *** [1.2.2] Using the association mode table. define Maybe(AssocMode) get_association_mode ( Word32 level, List(AssocEntry) assoc_table ) = if assoc_table is { [ ] then failure, [h . t] then if h is assoc_entry(l,mode) then if l = level then success(mode) else get_association_mode(level,t) }. *** [1.2.3] Printing the association mode table. define One print ( Stream s, List(AssocEntry) assoc_table ) = print(s,"\n --- Association mode table ---"); map_forget((AssocEntry e) |-> if e is assoc_entry(level,mode) then print(s,"\n "+right_pad(""+level,15)+" "+to_string(mode)), assoc_table). *** [1.3] The type table. type TypeEntry: type_entry(String symbol, String type). *** [1.3.1] Using the type table. define String get_type ( String symbol, List(TypeEntry) type_table ) = if type_table is { [ ] then "One", [h . t] then if h is type_entry(s,type) then if symbol = s then type else get_type(symbol,t) }. *** [1.3.2] Computing the type table. define List(TypeEntry) make_type_table_1 ( List(APG_Type_Dec) decs ) = if decs is { [ ] then [ ], [h . t] then if h is type_dec(type,names) then map((String n) |-> type_entry(n,type), names) + make_type_table_1(t) }. define List(TypeEntry) make_type_table ( List(APG_Type_Dec) decs, String axiom ) = with tt = make_type_table_1(decs), t = get_type(axiom,tt), [type_entry("start",t) . tt]. *** [1.3.3] Printing the type table. define One print ( Stream s, List(TypeEntry) type_table ) = print(s,"\n --- Type table ---\n"); map_forget((TypeEntry e) |-> if e is type_entry(sym,type) then print(s,"\n "+right_pad(sym,15)+" "+type), type_table). *** [1.4] Flattening classes. We do no more need classes of scenarios, and put all scenarios of a set of classes into a single list of scenarios. define List(APG_Scenario) flat_classes ( List(APG_Class) classes ) = if classes is { [ ] then [ ], [c1 . others] then scenarios(c1) + flat_classes(others) }. *** [1.5] Getting the longuest stack for a state. For each state, there is a longuest sequence of grammar symbols before the dot in the scenarios of the state. This sequence is the 'longuest stack' for that state. define List(String) get_longuest_stack ( List(APG_Scenario) l ) = if l is { [ ] then [ ], [sc1 . other_scenarios] then if sc1 is scenario(_,_,l1,_,_,_,_) then with l2 = get_longuest_stack(other_scenarios), if length(l1) > length(l2) then l1 else l2 }. *** [1.6] Getting a grammar rule by its id. define APG_Grammar_Rule get_rule ( Word32 rule_id, List(APG_Grammar_Rule) rules ) = if rules is { [ ] then should_not_happen(grammar_rule(0,symbol_value("",""),[],failure)), [g1 . others] then if g1 is grammar_rule(id,_,_,_) then if id = rule_id then g1 else get_rule(rule_id,others) }. *** [1.7] Computing the precedence level of a grammar rule. If the rule has a declared precedence, this is the precedence of the rule. Otherwise, it is the precedence of the rightmost token in its body (if there a token with a declared precedence in the body). Otherwise, there is no precedence level. The first function computes the precedence using the body of rule only. define Maybe(Word32) compute_rule_precedence ( List(String) body, // in reverse order List(PrecEntry) prec_table ) = if body is { [ ] then failure, [last . others] then if get_precedence(last,prec_table) is { failure then compute_rule_precedence(others,prec_table), success(level) then success(level) } }. define Maybe(Word32) compute_rule_precedence ( List(String) body, Maybe(String) mb_prec, List(PrecEntry) prec_table ) = if mb_prec is { failure then compute_rule_precedence(reverse(body),prec_table) success(sym) then if get_precedence(sym,prec_table) is { failure then success(-1), // a token used for a rule precedence should have a precedence success(level) then success(level) } }. *** [2] Resolving conflicts. When it reads a token 't' from the input in some state 'S', the automaton must choose between 3 possible behaviors: - 'shifting' the token and making a transition to some state, - 'reducing' using some grammar rule, - reporting an error. Possible behaviors are represented by the following type. We don't record error behaviors in our list of behaviors for a state, because by convention, if a token is not recorded in the list, it generates an error. Furthermore, we record restarting transitions, not only shifting transitions. type Behavior: restart (Word32 target_state_id), shift (Word32 target_state_id), reduce (Word32 grammar_rule_id). type BehaviorEntry: behavior (String symbol, Behavior behavior). We will compute a list of 'BehaviorEntry' below for each state. A 'conflict' arises when the automaton has several choices. The conflicts are of two sorts: - 'shift/reduce' conflict: the automaton may either shift or reduce, - 'reduce/reduce' conflict: the automaton has several different grammar rules for reducing. Shift/reduce conflict may always be resolved by declaring appropriate precedence levels and association modes. On the contrary, reduce/reduce conflicts cannot be resolved that way. They are in general the result of a bad design of the grammar itself. The resolution of a shift/reduce conflict is of the following type: type Resolved_As: non_assoc_error, // cannot be resolved because of non_assoc declaration unresolved, // unresolved because some precedence rule is missing no_token_level(Word32 rule_level), // idem no_rule_level(Word32 token_level), // idem shift1(Word32 tok_level, Word32 rule_level), // resolved as 'shift' shift2(Word32 level, AssocMode mode), // idem reduce1(Word32 tok_level, Word32 rule_level), // resolved as 'reduce' reduce2(Word32 level, AssocMode mode). // idem Conflicts are represented as follows: type Conflict: reduce_reduce(String token), shift_reduce (String token, Resolved_As resolution). define (Word32,Word32) number_of_conflicts ( List(Conflict) l, Word32 sr_so_far, Word32 rr_so_far ) = if l is { [ ] then (sr_so_far,rr_so_far), [h . t] then if h is { reduce_reduce(_) then number_of_conflicts(t,sr_so_far,rr_so_far+1), shift_reduce(_,ra) then if ra is { non_assoc_error then number_of_conflicts(t,sr_so_far,rr_so_far), unresolved then number_of_conflicts(t,sr_so_far+1,rr_so_far), no_token_level(_) then number_of_conflicts(t,sr_so_far+1,rr_so_far), no_rule_level(_) then number_of_conflicts(t,sr_so_far+1,rr_so_far), shift1(_,_) then number_of_conflicts(t,sr_so_far,rr_so_far), shift2(_,_) then number_of_conflicts(t,sr_so_far,rr_so_far), reduce1(_,_) then number_of_conflicts(t,sr_so_far,rr_so_far), reduce2(_,_) then number_of_conflicts(t,sr_so_far,rr_so_far) } } }. *** [2.1] Computing the list of behaviors for a state. The behaviors are comming from two sources: - the reducing scenarios, which produce one behavior per lookahead, - transitions, which produce either shifting behaviors or restarting behaviors define List(BehaviorEntry) compute_behaviors ( List(APG_Scenario) scs, // all scenarios in the state List(APG_Transition) transitions, // all transitions of the state List(String) all_tokens // all symbols which are tokens ) = if scs is { [ ] then map((APG_Transition tr) |-> if tr is transition(sym,target_id) then if sym:all_tokens then behavior(sym,shift(target_id)) else behavior(sym,restart(target_id)), transitions), [sc1 . others] then if sc1 is scenario(rule_id,head,bd,ad,prop,hg,lh_v) then if ad is { [ ] then map((String lookahead) |-> behavior(lookahead,reduce(rule_id)), *lh_v) + compute_behaviors(others,transitions,all_tokens), [_ . _] then compute_behaviors(others,transitions,all_tokens) } }. define Bool has_restarts ( List(BehaviorEntry) behaviors ) = if behaviors is { [ ] then false, [h . t] then if h is behavior(sym,b) then if b is { restart(_) then true, shift(_) then has_restarts(t), reduce(_) then has_restarts(t) } }. *** [2.2] Making the list of conflicts for a state. *** [2.2.1] Computing the resolution of a shift/reduce conflict. define Resolved_As compute_resolution // of shift/reduce conflict ( String token, // the token that may be shifted Word32 rule_id, // the id of the rule by which we may reduce List(APG_Grammar_Rule) rules, List(PrecEntry) prec_table, List(AssocEntry) assoc_table ) = if get_rule(rule_id,rules) is grammar_rule(_,_,body,mb_prec) then if compute_rule_precedence(map(name,body),mb_prec,prec_table) is { failure then if get_precedence(token,prec_table) is { failure then unresolved, success(tl) then no_rule_level(tl) }, success(rule_level) then if get_precedence(token,prec_table) is { failure then no_token_level(rule_level), success(token_level) then if token_level +< rule_level then reduce1(token_level,rule_level) else if token_level >+ rule_level then shift1 (token_level,rule_level) else if get_association_mode(token_level,assoc_table) is { failure then should_not_happen(unresolved), // if there is a precedence, there is an association mode success(mode) then if mode is { left then reduce2(token_level,mode), right then shift2 (token_level,mode), non_assoc then non_assoc_error } } } }. *** [2.2.2] Computing all conflicts and their resolutions. define List(Conflict) compute_conflicts ( BehaviorEntry b1, List(BehaviorEntry) others, List(APG_Grammar_Rule) rules, List(PrecEntry) prec_table, List(AssocEntry) assoc_table ) = if others is { [ ] then [ ], [b2 . rest] then if b1 is behavior(sym1,a1) then if b2 is behavior(sym2,a2) then if sym1 = sym2 then if a1 is { restart (target_state_id1) then compute_conflicts(b1,rest,rules,prec_table,assoc_table), shift (target_state_id1) then if a2 is { restart (target_state_id2) then compute_conflicts(b1,rest,rules,prec_table,assoc_table), shift (target_state_id2) then compute_conflicts(b1,rest,rules,prec_table,assoc_table), reduce (grammar_rule_id2) then [shift_reduce(sym1, compute_resolution(sym1,grammar_rule_id2,rules,prec_table,assoc_table)) . compute_conflicts(b1,rest,rules,prec_table,assoc_table)] } reduce (grammar_rule_id1) then if a2 is { restart (target_state_id2) then compute_conflicts(b1,rest,rules,prec_table,assoc_table), shift (target_state_id2) then [shift_reduce(sym1, compute_resolution(sym1,grammar_rule_id1,rules,prec_table,assoc_table)) . compute_conflicts(b1,rest,rules,prec_table,assoc_table)], reduce (grammar_rule_id2) then [reduce_reduce(sym1) . compute_conflicts(b1,rest,rules,prec_table,assoc_table)] } } else compute_conflicts(b1,rest,rules,prec_table,assoc_table) }. define List(Conflict) compute_conflicts ( List(BehaviorEntry) behaviors, List(APG_Grammar_Rule) rules, List(PrecEntry) prec_table, List(AssocEntry) assoc_table ) = if behaviors is { [ ] then [ ], [b1 . others] then compute_conflicts(b1,others,rules,prec_table,assoc_table) + compute_conflicts(others,rules,prec_table,assoc_table) }. *** [2.2.3] Getting the resolution of a conflict. define Resolved_As get_conflict_resolution ( String token, List(Conflict) conflicts ) = if conflicts is { [ ] then unresolved, [h . t] then if h is { reduce_reduce(_) then get_conflict_resolution(token,t), shift_reduce(tok,resol) then if tok = token then resol else get_conflict_resolution(token,t) } }. *** [2.3] Printing conflicts. The function below prints the conflicts for a state (it prints nothing if no conflict). define One print_conflicts ( Stream s, List(Conflict) conflicts ) = if conflicts is [] then unique else print(s,"\n\n --- Conflicts ---\n"); map_forget((Conflict c) |-> if c is { reduce_reduce(token) then print(s," "+right_pad(token,21)+" reduce/reduce\n"), shift_reduce(token,resol) then print(s," "+right_pad(token,21)+" shift/reduce "); if resol is { non_assoc_error then print(s,"(produces a 'non_assoc' syntax error)\n"), unresolved then print(s,"%(* * * unresolved * * * ?/?)\n"), no_token_level(rl) then print(s,"%(* * * unresolved * * * ?/"+rl+")\n"), no_rule_level(tl) then print(s,"%(* * * unresolved * * * "+tl+"/?)\n"), shift1(tl,rl) then print(s,"(resolved as 'shift' "+tl+"/"+rl+")\n"), shift2(l,m) then print(s,"(resolved as 'shift' "+l+"/"+to_string(m)+")\n"), reduce1(tl,rl) then print(s,"(resolved as 'reduce' "+tl+"/"+rl+")\n") reduce2(l,m) then print(s,"(resolved as 'reduce' "+l+"/"+to_string(m)+")\n") } },conflicts). *** [3] Outputing the parser program. The LALR1 parser constructed by APG needs a stack. In this stack we have to push two sorts of things: - return adresses (actually addresses of states of the automaton), - values of grammar symbols. However, unlike YACC/BISON, APG does not implement a stack. On the contrary, it uses the Anubis system stack as a stack for the automaton. This is a logical consequence of the rigidity of the typing system of Anubis. An 'implemented' stack would have been an array (or list) of heterogeneous data (i.e. data of some sum type). When working with such a datum, a conditional would be necessary to determine its actual type. However, the type of the datum is implicitly known by the automaton. Hence, this conditional (even if mandatory if we use this method) is clearly a waste of time, and always the same case would be used. In order to avoid this incongruity, we have designed a system for using the Anubis system stack. The consequence is that the type of data in the stack is always known at compile time. Note: In C there is no such problem, because the designer may always cast to an appropriate type, which is not possible in Anubis for safety reasons. For each state (say state number 'n'), we construct two functions: 'state_n' and 'restart_n'. The function 'state_n' reads the next token and decides what to do with it (either shift, reduce or report an error). When a reduction occurs, the 'state_n' function returns a value, which is designed in such a way that the calling functions will know at which state the returns must end. Then the state which ends the returns (i.e. which captures in some sens the result of the reduction) calls its 'restart_?' function in order to restart parsing from the right state. Reductions are performed by a set of 'reduce_n' functions (one per grammar rule). *** [3.1] Outputing parser specific types. Our parser program works with several data types whose definitions must be output. *** [3.1.1] The types of tokens. We need two types of tokens, both having one alternative per token. The first one is an enumeration. In the second one, each alternative has a component for holding the value of the token. define One print_token_type_alts ( Stream s, List(String) all_tokens ) = if all_tokens is { [ ] then unique, [h . t ] then print(s,"\n "+h+(if t is [] then "." else ",")); print_token_type_alts(s,t) }. define One print_token_type ( Stream s, String parser_name, List(String) all_tokens ) = print(s,"\n\npublic type Token_"+parser_name+":"); print_token_type_alts(s,all_tokens). define One print_token_name_function ( Stream s, String parser_name, List(String) all_tokens ) = print(s,"\n\npublic define String\n"); print(s," token_name_"+parser_name+"\n"); print(s," (\n"); print(s," Token_"+parser_name+" tok\n"); print(s," ) =\n"); print(s," if tok is\n"); print(s," {\n"); map_forget((String t) |-> print(s," "+right_pad(t,30)+" then \""+t+"\"\n"),all_tokens); print(s," }.\n"); print(s,"\n\npublic define String\n"); print(s," token_name_"+parser_name+"\n"); print(s," (\n"); print(s," Token_Value_"+parser_name+" tok\n"); print(s," ) =\n"); print(s," if tok is\n"); print(s," {\n"); map_forget((String t) |-> print(s," "+right_pad(t,30)+"(_) then \""+t+"\"\n"),all_tokens); print(s," }."). define One print_token_value_type_alts ( Stream s, List(String) all_tokens, List(TypeEntry) type_table ) = if all_tokens is { [ ] then unique, [h . t ] then print(s,"\n "+h+"("+get_type(h,type_table)+")"+(if t is [] then "." else ",")); print_token_value_type_alts(s,t,type_table) }. define One print_token_value_type ( Stream s, String parser_name, List(String) all_tokens, List(TypeEntry) type_table ) = print(s,"\n\npublic type Token_Value_"+parser_name+":"); print_token_value_type_alts(s,all_tokens,type_table). *** [3.1.2] The type of non terminals. We need a type with an alternative per non terminal with a component holding the value. define One print_non_terminals_type_alts ( Stream s, List(String) non_terminals, List(TypeEntry) type_table ) = if non_terminals is { [ ] then unique, [h . t] then print(s,"\n "+h+"("+get_type(h,type_table)+")"+(if t is [] then "." else ",")); print_non_terminals_type_alts(s,t,type_table) }. define One print_non_terminals_type ( Stream s, List(String) non_terminals, List(TypeEntry) type_table, String parser_name ) = print(s,"\n\ntype Non_Terminal_Value_"+parser_name+":"); print_non_terminals_type_alts(s,non_terminals,type_table). *** [3.1.3] The type 'Ret_...'. define One print_type_Ret ( Stream s, String parser_name ) = print(s,"\n\ntype Ret_"+parser_name+"($T):"); print(s,"\n error(Token_Value_"+parser_name+",List(Token_"+parser_name+")),"); print(s,"\n end_ret($T),"); print(s,"\n do_ret(Ret_"+parser_name+"($T))."). *** [3.1.4] Outputing the 'Lexer_...' type. The parser gets the tokens from a 'lexer'. This lexer is an object of the following type: define One print_Lexer_type ( Stream s, String parser_name ) = print(s,"\n\ntype Lexer_"+parser_name+":"); print(s,"\n lexer(One -> Token_Value_"+parser_name+" read_token,"); print(s,"\n Token_Value_"+parser_name+" -> One unput_token)."). *** [3.1.5] Outputing the parser function. define One print_parser_function ( Stream s, String parser_name, List(TypeEntry) type_table, Maybe(Extra) mb_extra ) = print(s,"\n\npublic define Result((Token_Value_"+parser_name+",List(Token_"+parser_name+")),\n"+ " "+get_type("start",type_table)+")\n"); print(s," "+parser_name+"\n"); print(s," (\n"); if mb_extra is { failure then unique, success(e) then if e is extra(t,n) then print(s," "+t+" "+n+",\n") }; print(s," One -> Token_Value_"+parser_name+" read_token\n"); print(s," ) =\n"); print(s," with unput_list = var((List(Token_Value_"+parser_name+"))[]),\n"); print(s," input = lexer((One u) |-> if *unput_list is\n"); print(s," {\n"); print(s," [ ] then read_token(unique),\n"); print(s," [h . t] then \n"); print(s," unput_list <- t;\n"); print(s," h\n"); print(s," },\n"); print(s," (Token_Value_"+parser_name+" t) |->\n"); print(s," unput_list <- [t . *unput_list]),\n"); if mb_extra is { failure then print(s," if state_0(input) is\n") success(e) then if e is extra(t,n) then print(s," if state_0("+n+",input) is\n") }; print(s," {\n"); print(s," error(a,b) then error((a,b)),\n"); print(s," end_ret(r) then if r is start(value)\n"); print(s," then ok(value)\n"); print(s," else should_not_happen(error((eof(unique),[]))),\n"); print(s," do_ret(_) then should_not_happen(error((eof(unique),[])))\n"); print(s," }."). define One print_parser_function_declaration ( Stream s, String parser_name, List(TypeEntry) type_table, Maybe(Extra) mb_extra ) = print(s,"\n\n public define Result((Token_Value_"+parser_name+",List(Token_"+parser_name+")),\n"+ " "+get_type("start",type_table)+")\n"); print(s," "+parser_name+"\n"); print(s," (\n"); if mb_extra is { failure then unique, success(e) then if e is extra(t,n) then print(s," "+t+" "+n+",\n") }; print(s," One -> Token_Value_"+parser_name+" read_token\n"); print(s," ).\n"). *** [3.2] Outputing the declaration of the function 'vmsg'. When the 'trace' option is used, APG outputs terms of the form 'vmsg("..."). These terms send messages which allow to follow the behavior of the parser. This is used only for debugging purpose. The function 'vmsg' must be provided by the user. The next function outputs the declaration of 'vmsg'. define One print_vmsg_declaration ( Stream s ) = print(s, "\n\n Declaration of 'vmsg'. This function must be provided by the user of APG. "); print(s,"\n\ndefine One vmsg(String text)."). *** [3.3] Outputing the 'reduce_n' functions. Reductions are performed by the 'reduce_n' functions (one per grammar rule). define String put_do_ret ( String s, Int n ) = if n =< 0 then "\n end_ret("+s+")\n " else "do_ret("+put_do_ret(s,n-1)+")". define One print_reduce_function_args ( Stream s, List(APG_Symbol_Value) rbody, // body of rule in reverse order List(TypeEntry) type_table ) = if rbody is { [ ] then unique, [h . t] then if h is symbol_value(sym,val) then print(s," "+get_type(sym,type_table)+" "+val+ (if t is [] then "" else ",")+"\n"); print_reduce_function_args(s,t,type_table) }. define One print_reduce_functions ( Stream s, List(APG_Grammar_Rule) rules, List(TypeEntry) type_table, String parser_name, Maybe(Extra) mb_extra, List(APG_Option) options ) = if rules is { [ ] then unique, [rule1 . other_rules] then if rule1 is grammar_rule(id,head,body,prec) then print(s,"\n\ndefine Ret_"+parser_name+"(Non_Terminal_Value_"+parser_name+")\n"); print(s," reduce_"+id+"\n"); print(s," (\n"); if mb_extra is { failure then unique, success(e) then if e is extra(t,n) then print(s," "+t+" "+n+",\n") }; if head is symbol_value(name,val) then if body is { [ ] then print(s," ) =\n"); (if trace:options then print(s," vmsg(\"Reducing using rule "+to_decimal(id)+"\");\n") else unique); print(s," end_ret("+name+"("+val+")).\n"), [_ . _] then print_reduce_function_args(s,reverse(body),type_table); print(s," ) =\n"); (if trace:options then print(s," vmsg(\"Reducing using rule "+to_decimal(id)+"\");\n") else unique); print(s," "+put_do_ret(name+"("+val+")", length(body))+".\n") }; print_reduce_functions(s,other_rules,type_table,parser_name,mb_extra,options) }. *** [3.5] Outputing states. *** [3.5.1] Printing the state header. define One print_state_header ( Stream s, Word32 state_id ) = print(s,"\n\n === State "+state_id+" ============================\n\n"). *** [3.5.2] Printing scenarios. Printing a single scenario. define One print ( Stream s, APG_Scenario sc ) = if sc is scenario(rid,head,bd,ad,prop,hg,lh_v) then print(s,right_pad(" ("+rid+")",10)+" "); print(s,right_pad(head+":",15)+" "); map_forget((String x) |-> print(s,x+" "),reverse(bd)); print(s,". "); map_forget((String x) |-> print(s,x+" "),ad); print(s,"\n"). Printing a list of classes of scenarios. define One print_scenarios ( Stream s, List(APG_Class) classes ) = map_forget((APG_Class c) |-> if c is class(_,scs) then map_forget((APG_Scenario sc) |-> print(s,sc),scs), classes). *** [3.5.3] Printing the transitions. define One print_transitions ( Stream s, List(APG_Transition) transitions, List(String) all_tokens ) = print(s,"\n --- Transitions/Reductions ---"); map_forget((APG_Transition tr) |-> if tr is transition(sym,id) then if sym:all_tokens then print(s,"\n "+right_pad(sym,20)+" shift and go to state "+id) else print(s,"\n "+right_pad(sym,20)+" restart from state "+id), transitions). *** [3.5.4] Printing the reductions. define One print_reductions ( Stream s, List(APG_Scenario) scs ) = if scs is { [ ] then unique, [sc1 . others] then if sc1 is scenario(rid,head,bd,ad,prop,hg,lh_v) then if ad is { [ ] then map_forget((String lh) |-> print(s,"\n "+right_pad(lh,20)+" reduce using rule "+rid), *lh_v), [_ . _] then print_reductions(s,others) } }. *** [3.5.5] Printing acceptable tokens. define One print_acceptable_tokens ( Stream s, List(String) tokens // with possible repetitions ) = if tokens is { [ ] then unique, [h . t] then if h:t then print_acceptable_tokens(s,t) else print(s,"\n "+h+(if t is [] then "" else ",")); print_acceptable_tokens(s,t) }. define One print_acceptable_tokens ( Stream s, Word32 state_id, String parser_name, List(BehaviorEntry) behaviors, List(String) non_terminals ) = print(s,"\n\ndefine List(Token_"+parser_name+") token_list_"+state_id+" ="); print(s,"\n ["); print_acceptable_tokens(s,map(symbol,behaviors) - non_terminals); print(s,"\n ]."). *** [3.5.6] Printing the restart function define Maybe(Word32) find_transition ( String symbol, List(APG_Transition) transitions ) = if transitions is { [ ] then failure, [tr1 . others] then if tr1 is transition(sym,target_id) then if symbol = sym then success(target_id) else find_transition(symbol,others) }. define String format_restart_case_args ( Int n ) = if n < 0 then "" else format_restart_case_args(n-1)+(if n = 0 then "" else ",")+"_"+abs_to_decimal(n). define List(String) get_longuest_stack_for ( String tok_name, List(APG_Scenario) scs ) = if scs is { [ ] then [ ], [h . t] then with rest = get_longuest_stack_for(tok_name,t), if h is scenario(id,head,bd,ad,prop,hg,lh_v) then if ad is { [ ] then rest, [sym . _] then if sym = tok_name then if length(bd) > length(rest) then bd else rest else rest } }. define One print_restart_args ( Stream s, List(String) stack, List(TypeEntry) type_table, Word32 i ) = if stack is { [ ] then unique, [h . t] then print(s,"\n "+right_pad(get_type(h,type_table),20)+right_pad(" _"+i,4)+ (if t is [] then " " else ",")+" // "+h); print_restart_args(s,t,type_table,i+1) }. define One print_restart_cases ( Stream s, Word32 state_id, List(APG_Scenario) scs, List(String) non_terminals, List(APG_Transition) transitions, Int num_args, List(APG_Option) options, Maybe(Extra) mb_extra ) = if non_terminals is { [ ] then unique, [h . t] then print(s,"\n "+h+"(value)"+" then"); (if trace:options then print(s,"\n vmsg(\"Got a '"+h+"'\");") else unique); if find_transition(h,transitions) is { failure then if (state_id = 0 & h = "start") then print(s,"\n end_ret(start(value))") else print(s,"\n should_not_happen(error(eof(unique),[]))"), success(target_id) then with n = length(get_longuest_stack_for(h,scs)), print(s,"\n if state_"+target_id+"("+ if mb_extra is { failure then "", success(e) then if e is extra(_,n1) then n1+"," } +"input,value"+ (if n = 0 then "" else ",")+ format_restart_case_args(n-1)+") is"); print(s,"\n {"); print(s,"\n error(a,b) then error(a,b),"); print(s,"\n end_ret(v) then /* 1 */ restart_"+state_id+"("+ if mb_extra is { failure then "", success(e) then if e is extra(_,n1) then n1+"," }+"input,v"+ (if num_args = 0 then "" else ",")+ format_restart_case_args(num_args-1)+"),"); print(s,"\n do_ret(v) then v"); print(s,"\n }") }; print(s,if t is [] then "" else ","); print_restart_cases(s,state_id,scs,t,transitions,num_args,options,mb_extra) }. define One print_restart_function ( Stream s, Word32 state_id, String parser_name, List(String) stack, List(APG_Scenario) scs, List(TypeEntry) type_table, List(String) non_terminals, List(APG_Transition) transitions, List(APG_Option) options, Maybe(Extra) mb_extra ) = print(s,"\n\ndefine Ret_"+parser_name+"(Non_Terminal_Value_"+parser_name+")"); print(s,"\n restart_"+state_id); print(s,"\n ("); if mb_extra is { failure then unique, success(e) then if e is extra(t,n) then print(s,"\n "+t+" "+n+",") }; print(s,"\n Lexer_"+parser_name+" input,"); print(s,"\n Non_Terminal_Value_"+parser_name+" result"+(if stack is [] then "" else ",")); print_restart_args(s,stack,type_table,0); print(s,"\n ) ="); (if trace:options then print(s,"\n vmsg(\"Restarting\");") else unique); print(s,"\n if result is"); print(s,"\n {"); print_restart_cases(s,state_id,scs,non_terminals,transitions,length(stack),options,mb_extra); print(s,"\n }."). *** [3.5.7] Printing the state function. Printing the declarations of the arguments of the state function. There is one argument per symbol in the (top of) stack. Types of arguments are obtained from the type table, and names of arguments are of the form: "_0", "_1", "_2" etc... define One print_state_function_args ( Stream s, List(String) stack, List(TypeEntry) type_table, Word32 rank // of next argument to be printed ) = if stack is { [ ] then unique, [name . t] then print(s," "+right_pad(get_type(name,type_table),20)+" _"+ right_pad(rank+(if t is [] then " " else ","),5)+" // "+name+"\n"); print_state_function_args(s,t,type_table,rank+1) }. Printing the beginning of the state function (which is common to the declaration and the definition). The state function begins like this: define Ret_...(Non_Terminal_Value_...) state_n ( Lexer input, Type0 _0, Type1 _1 ... ) define One print_state_function_beginning ( Stream s, Word32 state_id, String parser_name, List(String) stack, List(TypeEntry) type_table, Maybe(Extra) mb_extra ) = print(s,"\n\ndefine Ret_"+parser_name+"(Non_Terminal_Value_"+parser_name+")\n"); print(s," state_"+state_id+"\n"); print(s," (\n"); if mb_extra is { failure then unique, success(e) then if e is extra(t,n) then print(s," "+t+" "+n+",\n") }; print(s," Lexer_"+parser_name+" input"+(if stack is [] then "" else ",")+"\n"); print_state_function_args(s,stack,type_table,0); print(s," )"). Printing the declaration of the state function. This amounts to print the beginning and a dot. define One print_state_function_declaration ( Stream s, Word32 state_id, String parser_name, List(String) stack, List(TypeEntry) type_table, Maybe(Extra) mb_extra ) = print_state_function_beginning(s,state_id,parser_name,stack,type_table,mb_extra); print(s,"."). Printing the definition of the state function. We have to print the beginning, an 'equal' sign and the body of the function. The body is a conditional with one case per token of the grammar. First of all, here is a function for printing the cases of this conditional. define Maybe(APG_Scenario) find_reduction ( String token, List(APG_Scenario) scs ) = if scs is { [ ] then failure, [sc1 . others] then if sc1 is scenario(id,head,bd,ad,prop,hg,lh_v) then if token : *lh_v then if ad is { [ ] then success(sc1), [_ . _] then find_reduction(token,others) } else find_reduction(token,others) }. define One print_reduce_body ( Stream s, String token, List(APG_Scenario) scs, Word32 state_id, List(APG_Option) options, Maybe(Extra) mb_extra ) = if find_reduction(token,scs) is { failure then (if trace:options then print(s," vmsg(\"Unexpected token '"+token+"'\");\n") else unique); print(s," error(next,token_list_"+state_id+")"), success(sc) then if sc is scenario(rid,_,bd,_,_,_,_) then print(s," unput_token(input)(next);\n"); with nargs = length(bd), (if nargs = 0 then print(s," if reduce_"+rid+ if mb_extra is { failure then "", success(e) then if e is extra(t,n) then "("+n+")" }+" is\n") else print(s," if reduce_"+rid+"("+ if mb_extra is { failure then "", success(e) then if e is extra(t,n) then n+"," }+format_restart_case_args(nargs-1)+") is\n")); print(s," {\n"); print(s," error(a,b) then error(a,b),\n"); (if nargs = 0 then print(s," end_ret(v) then /* 2 */ restart_"+state_id+"("+ if mb_extra is { failure then "", success(e) then if e is extra(t,n) then n+"," }+"input,v"+ (with n = length(get_longuest_stack(scs)), (if n = 0 then "" else ",")+ format_restart_case_args(n-1))+"),\n") else print(s," end_ret(v) then end_ret(v),\n")); print(s," do_ret(v) then v\n"); print(s," }") }. define One print_state_function_cases ( Stream s, List(String) all_tokens, List(APG_Scenario) scs, List(APG_Transition) transitions, Word32 state_id, List(String) stack, List(Conflict) conflicts, List(APG_Option) options, Bool has_restarts, Maybe(Extra) mb_extra ) = if all_tokens is { [ ] then unique, [tok1 . others] then print(s," "+tok1+"(value)"+" then\n"); with resol = get_conflict_resolution(tok1,conflicts), (if resol is reduce1(Word32 tl, Word32 rl) then print_reduce_body(s,tok1,scs,state_id,options,mb_extra) else if resol is reduce2(Word32 l, AssocMode m) then print_reduce_body(s,tok1,scs,state_id,options,mb_extra) else if find_transition(tok1,transitions) is { failure then print_reduce_body(s,tok1,scs,state_id,options,mb_extra), success(target_state_id) then (if trace:options then print(s," vmsg(\"Shifting token '"+tok1+"'\");\n") else unique); with n = length(get_longuest_stack_for(tok1,scs)), print(s," if state_"+target_state_id+"("+ if mb_extra is { failure then "", success(e) then if e is extra(_,n1) then n1+"," }+"input,value"+ (if n = 0 then "" else ",")+ format_restart_case_args(n-1)+") is\n"); print(s," {\n"); print(s," error(a,b) then error(a,b),\n"); with m = length(stack), (if has_restarts then print(s," end_ret(value1) then /* 3 */ restart_"+state_id+ "("+if mb_extra is { failure then "", success(e) then if e is extra(_,n1) then n1+"," }+"input,value1"+ (if m = 0 then "" else ",")+ format_restart_case_args(m-1)+"),\n") else print(s," end_ret(value1) then should_not_happen(error(eof(unique),[])),\n")); print(s," do_ret(value1) then "); (if trace:options then print(s,"vmsg(\"Ignoring state "+state_id+ "\");\n ") else unique); print(s,"value1\n"); print(s," }") }); print(s,if others is [] then "\n" else ",\n"); print_state_function_cases(s,others,scs,transitions,state_id, stack,conflicts,options,has_restarts,mb_extra) }. define One print_state_function ( Stream s, Word32 state_id, String parser_name, List(String) stack, List(TypeEntry) type_table, List(String) all_tokens, List(APG_Transition) transitions, List(APG_Scenario) scs, List(Conflict) cfls, List(APG_Option) options, Bool has_restarts, Maybe(Extra) mb_extra ) = print_state_function_beginning(s,state_id,parser_name,stack,type_table,mb_extra); print(s," =\n"); (if trace:options then print(s," vmsg(\"Entering state "+state_id+"\");\n") else unique); print(s," with next = read_token(input)(unique),\n"); print(s," if next is\n"); print(s," {\n"); print_state_function_cases(s,all_tokens,scs,transitions,state_id,stack,cfls,options, has_restarts,mb_extra); print(s," }.\n"). *** [3.5.8] Printing a whole state. define One print_state ( Stream s, APG_State st, List(String) all_tokens, List(PrecEntry) prec_table, List(AssocEntry) assoc_table, List(TypeEntry) type_table, List(String) non_terminals, List(APG_Grammar_Rule) rules, String parser_name, List(APG_Option) options, Var(Word32) count_shift_reduce_conflicts, Var(Word32) count_reduce_reduce_conflicts, Maybe(Extra) mb_extra ) = if st is state(st_id,classes,transitions) then with scs = flat_classes(classes), stack = get_longuest_stack(scs), behaviors = compute_behaviors(scs,transitions,all_tokens), conflicts = compute_conflicts(behaviors,rules,prec_table,assoc_table), has_rest = has_restarts(behaviors), if number_of_conflicts(conflicts,0,0) is (sr,rr) then ( count_shift_reduce_conflicts <- *count_shift_reduce_conflicts + sr; count_reduce_reduce_conflicts <- *count_reduce_reduce_conflicts + rr ); print_state_header(s,st_id); print_scenarios(s,classes); print_transitions(s,transitions,all_tokens); print_reductions(s,scs); print_conflicts(s,conflicts); //print_conflicts(make_stream(stdout),conflicts); print_acceptable_tokens(s,st_id,parser_name,behaviors,non_terminals); (if has_rest then print_restart_function(s,st_id,parser_name,stack,scs, type_table,non_terminals,transitions,options,mb_extra) else unique); print_state_function(s,st_id,parser_name,stack,type_table, all_tokens,transitions,scs,conflicts,options,has_rest,mb_extra). *** [4] The interface. read make_automaton.anubis define One do_output ( APG_Grammar g, List(APG_State) auto, Stream s, List(APG_Option) options ) = if g is grammar(preambule,parser_name,prec_decs,type_decs,rules,mb_extra,postambule) then with non_terminals = (List(String))["start" . all_non_terminals(rules)], all_tokens = (List(String))["eof" . all_symbols(rules) - non_terminals], prec_table = make_precedence_table(prec_decs), assoc_table = make_association_table(prec_decs), _A = if get_rule(1,rules) is grammar_rule(_,_A,_,_) then _A, type_table = make_type_table(type_decs,name(_A)), sr_count_v = var((Word32)0), rr_count_v = var((Word32)0), (if verbose:options then print("\nOutputting the result ... "); forget(flush(stdout)) else unique); print(s,"\n\n This file was generated by APG (the Anubis Parser Generator)\n"); print(s," (to find unresolved shift/reduce conflicts, search for 'unresolved')\n"); print(s," (to find reduce/reduce conflicts, search for 'reduce/reduce')\n\n"); print(s," Your parser is declared as follows below in this file:"); print_parser_function_declaration(s,parser_name,type_table,mb_extra); print(s,"\n The public types 'Token_"+parser_name+"' and 'Token_Value_"+parser_name+"'\n"); print(s," are defined below in this file.\n\n"); print(s,preambule); print_token_type(s,parser_name,all_tokens); print_token_value_type(s,parser_name,all_tokens,type_table); print_token_name_function(s,parser_name,all_tokens); print_non_terminals_type(s,non_terminals,type_table,parser_name); print_type_Ret(s,parser_name); print_Lexer_type(s,parser_name); (if trace:options then print_vmsg_declaration(s) else unique); print_reduce_functions(s, [grammar_rule(0,symbol_value("start","_0"), [symbol_value(name(_A),"_0")],failure) . rules], type_table,parser_name,mb_extra,options); map_forget((APG_State st) |-> if st is state(state_id,classes,transitions) then with scs = flat_classes(classes), print_state_function_declaration(s, state_id, parser_name, get_longuest_stack(scs), type_table, mb_extra), auto); print_parser_function(s,parser_name,type_table,mb_extra); map_forget((APG_State state) |-> print_state(s, state, all_tokens, prec_table, assoc_table, type_table, non_terminals, rules, parser_name, options, sr_count_v, rr_count_v, mb_extra), auto); print(s,postambule); (if verbose:options then print("Done. \n") else unique); (if *sr_count_v /= 0 then print("\nThere are "+(*sr_count_v)+" shift/reduce conflicts.") else unique); (if *rr_count_v /= 0 then print("\nThere are "+(*rr_count_v)+" reduce/reduce conflicts.") else unique); (if (*rr_count_v + *sr_count_v) /= 0 then print("\n\n") else unique). public define One anubis_output ( APG_Grammar g, String output_filename, List(APG_Option) options ) = with auto = make_APG_automaton(g,options), if file(output_filename,new) is { failure then print("Cannot create file '"+output_filename+"'.\n"), success(f) then with s = make_stream(f), if time:options then show_time("do_output: ",(One u) |-> do_output(g,auto,s,options)) else do_output(g,auto,s,options) }. *** [5] Testing. read read_grammar.anubis global define One apg_test ( List(String) args ) = if read_APG_grammar(make_stream(example_grammar),[]) is { error(msg) then en_print(msg), ok(g) then anubis_output(g,make_stream(stdout), [ //trace ]) }.