Hi all,

I've found that the Spanish XML files have not been encoded properly and use an assorted list of non-ASCII characters (one file even contains a Unicode-encoded character). These characters were not being encoded properly in the LaTeX files and, consequently, in the PDFs I generated. To find and fix the culprits I've devised a pair of scripts:

- convert-accent.pl: converts accented characters commonly used in Spanish to their XSL definitions
- find-nonascii.pl: a simple yet effective way to find non-ASCII characters in files

All the files can be easily converted with the first one (it takes a file as input and outputs the fixed version) and checked with the second one. If you do this for all files they will get fixed, save for 03lcdk, which has a Unicode char at line 1085 (-M'lare-) that needs to be fixed by manually editing the file.

Oh, and attached is yet another revision of the latex.xsl files with more Unicode characters properly defined. I've used this to generate PDF files for all four Spanish books at the Project (I've sent these to the coordinator for review), but they could be useful in the future for other internationalised editions of the books (if some other group starts transcribing them into XML).

Regards

Javier

#!/usr/bin/perl -p
# convert-accent.pl
s/á/\<ch.aacute\/\>/g;
s/é/\<ch.eacute\/\>/g;
s/í/\<ch.iacute\/\>/g;
s/ó/\<ch.oacute\/\>/g;
s/ú/\<ch.uacute\/\>/g;
s/ñ/\<ch.ntilde\/\>/g;
s/Á/\<ch.Aacute\/\>/g;
s/É/\<ch.Eacute\/\>/g;
s/Í/\<ch.Iacute\/\>/g;
s/Ó/\<ch.Oacute\/\>/g;
s/Ú/\<ch.Uacute\/\>/g;
s/ä/\<ch.auml\/\>/g;
s/ë/\<ch.euml\/\>/g;
s/ï/\<ch.iuml\/\>/g;
s/ö/\<ch.ouml\/\>/g;
s/ü/\<ch.uuml\/\>/g;
s/Ñ/\<ch.Ntilde\/\>/g;
s/´/\<ch.acute\/\>/g;
s/¡/\<ch.iexcl\/\>/g;
s/¿/\<ch.iquest\/\>/g;
s/«/\<ch.laquo\/\>/g;
s/»/\<ch.raquo\/\>/g;
#s/\&/\<ch.ampersand\/\>/g;

#!/usr/bin/perl -nw
# find-nonascii.pl
print if /[^[:ascii:]]/;
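
Since the leftover character in 03lcdk has to be located by line number, a variant of find-nonascii.pl that prints the file name and line number next to each hit might save some searching. What follows is only a sketch under a few assumptions: the script name find-nonascii-ln.pl and the sub find_nonascii are made up for illustration and are not among the attached scripts, and the files are read as raw bytes, so any byte above 0x7F counts as non-ASCII.

```perl
#!/usr/bin/perl
# find-nonascii-ln.pl (hypothetical name): like find-nonascii.pl, but each
# hit is prefixed with the file name and the line number.
use strict;
use warnings;

# Return one "name:line: text" string for every line read from $fh that
# contains at least one non-ASCII character (input is treated as raw bytes).
sub find_nonascii {
    my ($fh, $name) = @_;
    my @hits;
    my $lineno = 0;
    while (my $line = <$fh>) {
        $lineno++;
        push @hits, "$name:$lineno: $line" if $line =~ /[^[:ascii:]]/;
    }
    return @hits;
}

# When run as a script, check every file named on the command line.
unless (caller) {
    for my $file (@ARGV) {
        if (open my $fh, '<', $file) {
            print find_nonascii($fh, $file);
            close $fh;
        } else {
            warn "$file: $!\n";
        }
    }
}
```

It would be run as "perl find-nonascii-ln.pl *.xml" (the glob is an assumption about how the files are named); a run that prints nothing means every file is pure ASCII.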