REBOL [
    Title: "RTF Tools"
    Author: "Brett Handley"
    Email: brett@codeconscious.com
    Date: 6-July-2000
    Purpose: {To interpret RTF files and re-export if desired.}
    File: %rtf-tools.r
    Category: 'general
    Comment: {
        The spec can be found at
        http://msdn.microsoft.com/library/en-us/dnrtfspec/html/rtfspec.asp

        I've separated parsing from interpretation. This let me add
        more code without cluttering up the reading of the parser.

        The controller here only does a little interpreting, in order to
        nest the groups into blocks appropriately and to convert a couple
        of symbols into text.

        One can now build a more involved interpreter that takes the
        output from my code here and does something with the information,
        such as output to HTML, XML, whatever.

        I started on this journey to produce HTML from RTF, but with a
        level of control (filtering) you cannot get in other products.
        Having gone this far and learnt more about RTF, I've decided to
        look for alternatives to using RTF as an input source for
        producing HTML. I may yet take it up again if another reason
        arises, but at the moment I'll maybe spend my time elsewhere.

        This code has not been rigorously tested and it is entirely upon
        you to determine its suitability to your purposes.

        So that we can all benefit, I ask that if you improve the code
        then, at your discretion, you return the amendments to me.

        I hope someone finds this of use!
    }
    To-do: {
        Create interpreters to handle special chars, headers, binary data.

        Semicolons are a normal part of text, but they are also sometimes
        used as delimiters for control-words. This needs addressing.
    }
]

;
; A couple of stack like operations
;

; Remove and return the last value of a-stack (none if it is empty).
spop: function [a-stack] [a-val] [
    if 0 < length? a-stack [
        a-val: last a-stack
        remove back tail a-stack
        a-val
    ]
]

; Append val to a-stack as a single value.
spush: func [a-stack val] [
    append/only a-stack val
]

;
; A controller object for loading rtf into a block.
;

rtf-to-block: make object! [

    ; dbg: func [x] [print x]
    dbg: func [x] []

    output: none            ; the most recently closed group
    group: none             ; the group currently being filled
    group-references: none  ; stack of the groups that are still open

    initialise: func [] [
        dbg "initialise"
        group-references: make block! 1
    ]

    ; Open a new block, nested inside the current group if there is one.
    begin-group: function [] [new-block] [
        dbg "begin-group"
        new-block: make block! 2
        spush group-references new-block
        if group [append/only group new-block]
        group: new-block
    ]

    ; Close the current group and return to its parent, if any.
    end-group: func [] [
        dbg "end-group"
        spop group-references
        output: group
        group: either empty? group-references [none] [last group-references]
    ]

    ; A control word: stored as a word, followed by its integer
    ; parameter and its delimiter when they are present.
    interpret: func [code param delim] [
        dbg "interpret"
        append group to-word code
        if param [append group to-integer param]
        if delim [append group delim]
    ]

    ; A control symbol: escaped braces and backslashes become text,
    ; anything else is stored as a word.
    symbol: func [symb] [
        dbg "symbol.."
        switch/default symb [
            "{" [text symb]
            "}" [text symb]
            "\" [text symb]
        ][
            append group to-word symb
        ]
    ]

    ; Plain text: joined onto the preceding string if the group ends
    ; with one, otherwise added to the group as a new string.
    text: function [t] [append-point] [
        dbg "text"
        append-point: group
        if not empty? group [
            if equal? string! type? last group [append-point: last group]
        ]
        append append-point t
    ]
]

;
; The parser
;

rtf-reader: make object! [

    digit: charset [#"0" - #"9"]
    letter: charset [#"A" - #"Z" #"a" - #"z"]
    other-control: complement charset [
        #"0" - #"9" #"A" - #"Z" #"a" - #"z"
    ]

    ; 1 to 32 letters, an optional signed number, an optional space.
    control-word-rule: [
        copy control-code 1 32 letter
        copy control-param [opt "-" any digit]
        copy control-delim opt " "
        (
            if equal? control-delim " " [control-delim: none]
            controller/interpret control-code control-param control-delim
        )
    ]

    control-symbol-rule: [
        copy control-code 1 other-control
        (controller/symbol control-code)
    ]

    control-rule: [
        "\" [control-symbol-rule | control-word-rule]
    ]

    plain-text-data: complement charset "{}\^/"
    plain-text-rule: [
        copy text some plain-text-data
        (if text [controller/text text])
    ]

    ; A group is a brace-delimited run of newlines, controls, text and
    ; nested groups.
    rtf-rule: [
        "{" (controller/begin-group)
        any [
            "^/" | control-rule | plain-text-rule | rtf-rule
        ]
        "}" (controller/end-group)
    ]

    controller: none

    ; Parse rtf using the default controller, or with /mode the
    ; controller object that the word c refers to. Returns the
    ; controller's output, or none if the parse fails.
    process: func [rtf /mode c] [
        rtf-reader/controller: either mode [get c] [rtf-to-block]
        rtf-reader/controller/initialise
        either parse/all rtf rtf-rule [
            rtf-reader/controller/output
        ][
            none
        ]
    ]
]

; Escape the characters that have special meaning in RTF.
export-special-char: func [x] [
    replace/all replace/all replace/all copy x "\" "\\" "{" "\{" "}" "\}"
]

; Turn a block produced by load-rtf back into an RTF string.
export-rtf: function [
    rtf-block [block!]
][
    result-rtf e
][
    result-rtf: make string! 10000
    append result-rtf "{"
    foreach e :rtf-block [
        switch to-string type? e [
            "word"    [append result-rtf rejoin ["\" to-string e]]
            "integer" [append result-rtf to-string e]
            "block"   [append result-rtf export-rtf e]
            "string"  [append result-rtf rejoin [" " export-special-char e]]
        ]
    ]
    append result-rtf "}"
    result-rtf
]

load-rtf: func [x] [rtf-reader/process x]

; To use it do something like..
; rtf-block: load-rtf read %your-file.rtf

halt
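
;
; A minimal sketch of plugging in a different controller, as the header
; comment suggests. It assumes only what rtf-reader actually calls on a
; controller (initialise, begin-group, end-group, interpret, symbol, text)
; and the output field that process returns. The name rtf-to-text is
; hypothetical and not part of the original script. It throws away all
; control words and group structure and keeps only the text. Because the
; script ends with halt, paste these lines at the console (without the
; leading semicolons) or move them above the halt.
;
; rtf-to-text: make object! [
;     output: none
;     initialise:  does [output: make string! 1000]
;     begin-group: does []
;     end-group:   does []
;     interpret:   func [code param delim] []       ; ignore control words
;     symbol:      func [symb] [
;         ; keep escaped braces and backslashes, ignore other symbols
;         if find ["{" "}" "\"] symb [append output symb]
;     ]
;     text:        func [t] [append output t]
; ]
;
; /mode takes a word that refers to the controller object:
;
; print rtf-reader/process/mode read %your-file.rtf 'rtf-to-text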