Between the release of GF 3.5 and the next version, two changes were made relating to character encodings in GF grammar files:
The default encoding was changed from Latin-1 to UTF-8.
If you have a flags coding = ... declaration in the source file, you should now use a pragma --# -coding=... at the top of the file instead.
UTF-8 is the default encoding for text files on many systems these days, so it makes sense to use it as the default for GF grammar files too.
Changing how alternate encodings are specified allows conversion to Unicode to be done before parsing, which means that
If your files still compile without errors after the change, you don't need to do anything. (But see Known problems below!) If you get one of the following messages: "lexical error", "encoding mismatch", or "Warning: default encoding has changed from Latin-1 to UTF-8", you need to add a --# -coding=... pragma to your file (or convert it to UTF-8).
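For the conversion route, the following is a minimal sketch in Python rather than a GF tool. It assumes the file is currently in Latin-1 and that a file which already decodes as valid UTF-8 should be left untouched. The names to_utf8 and convert_file are illustrative and not part of GF.

```python
from pathlib import Path


def to_utf8(raw: bytes, source_encoding: str = "latin-1") -> bytes:
    """Return raw re-encoded as UTF-8, assuming it is in source_encoding.

    Input that already decodes as UTF-8 (including plain ASCII) is returned
    unchanged, so running the conversion twice does not double-encode it.
    """
    try:
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        # Not valid UTF-8: interpret the bytes in the assumed source encoding.
        return raw.decode(source_encoding).encode("utf-8")


def convert_file(path: str, source_encoding: str = "latin-1") -> None:
    """Rewrite the file at path in place as UTF-8 (hypothetical helper)."""
    file = Path(path)
    file.write_bytes(to_utf8(file.read_bytes(), source_encoding))
```

The "already UTF-8" check is a heuristic: a few Latin-1 byte sequences happen to be valid UTF-8 too, so keep a backup and inspect the result before recompiling the grammar.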
If your files are already in UTF-8 (with or without a flags coding=utf8 declaration), no change is needed.
If your files are in Latin-1, add a --# -coding=latin1 pragma at the top of the file.
If your files use another encoding, change flags coding=enc to a corresponding --# -coding=enc.
Grammars will still compile with GF-3.5 after these changes.
Note that GF only understands one option per pragma line. If you already have a --# -path=... pragma, you cannot put the -coding=... option on the same line. Add it on a separate line:
--# -path=...
--# -coding=...
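If many files need the pragma, the edit can be scripted. The sketch below is one way to do it in Python, under two assumptions that are not stated above: that the coding pragma may sit before an existing path pragma, and that a file with a coding pragma already in place should be left alone. The function name add_coding_pragma and the sample file contents are hypothetical.

```python
def add_coding_pragma(source: str, encoding: str) -> str:
    """Prepend a '--# -coding=<encoding>' line on its own line.

    Existing pragma lines (for example a path pragma) are kept as they are,
    so each pragma line still carries exactly one option.
    """
    for line in source.splitlines():
        if line.startswith("--#") and "-coding=" in line:
            # A coding pragma is already present; do not add a second one.
            return source
    return f"--# -coding={encoding}\n{source}"


# Illustrative input: the path value and the grammar line are placeholders.
example = "--# -path=.:present\nconcrete FooSwe of Foo = { }\n"
print(add_coding_pragma(example, "latin1"))
```

The function works on text that has already been decoded, so read the file with its actual encoding (for instance Latin-1) and write it back with the same encoding.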
The recommendation for the future is to use UTF-8 for all source files.
The intention is that if a grammar file is affected by the changed default encoding, then you will see one of the messages listed in the previous section when you compile the grammar. But there are a couple of issues to be aware of: