1-Feb-2001 9:50:58-GMT,6405;000000000001
Return-Path:
Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by sunshine.math.utah.edu (8.9.3/8.9.3) with ESMTP id CAA21120 for ; Thu, 1 Feb 2001 02:50:56 -0700 (MST)
Received: from relay.urz.uni-heidelberg.de (relay-eth.urz.uni-heidelberg.de [129.206.100.201]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f119oTp25396; Thu, 1 Feb 2001 10:50:30 +0100 (MET)
Received: from relay (relay.urz.uni-heidelberg.de [129.206.119.201]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with SMTP id KAA03952; Thu, 1 Feb 2001 10:49:57 +0100 (MET)
Received: from RELAY.URZ.UNI-HEIDELBERG.DE by RELAY.URZ.UNI-HEIDELBERG.DE (LISTSERV-TCP/IP release 1.8b) with spool id 485685 for LATEX-L@RELAY.URZ.UNI-HEIDELBERG.DE; Thu, 1 Feb 2001 10:49:56 +0100
Received: from ix.urz.uni-heidelberg.de (mail.urz.uni-heidelberg.de [129.206.119.234]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id KAA03945 for ; Thu, 1 Feb 2001 10:49:55 +0100 (MET)
Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by ix.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id KAA42618 for ; Thu, 1 Feb 2001 10:49:55 +0100
Received: from moutvdom00.kundenserver.de (moutvdom00.kundenserver.de [195.20.224.149]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f119nup25260 for ; Thu, 1 Feb 2001 10:49:56 +0100 (MET)
Received: from [195.20.224.220] (helo=mrvdom04.kundenserver.de) by moutvdom00.kundenserver.de with esmtp (Exim 2.12 #2) id 14OGNG-0002Iu-00 for LATEX-L@urz.uni-heidelberg.de; Thu, 1 Feb 2001 10:49:54 +0100
Received: from manz-3e3648b1.pool.mediaways.net ([62.54.72.177] helo=istrati.zdv.uni-mainz.de) by mrvdom04.kundenserver.de with esmtp (Exim 2.12 #2) id 14OGN8-0000Qh-01 for LATEX-L@URZ.UNI-HEIDELBERG.DE; Thu, 1 Feb 2001 10:49:46 +0100
Received: (from latex3@localhost) by istrati.zdv.uni-mainz.de (8.9.3/8.9.3/SuSE Linux 8.9.3-0.1) id KAA22329; Thu, 1 Feb 2001 10:48:38 +0100
X-Authentication-Warning: istrati.zdv.uni-mainz.de: latex3 set sender to frank@mittelbach-online.de using -f
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
References: <14968.34118.306909.315983@istrati.zdv.uni-mainz.de> <200101312200.XAA09346@bar.loria.fr>
X-Mailer: VM 6.75 under Emacs 20.4.1
Message-ID: <14969.12533.759505.917813@istrati.zdv.uni-mainz.de>
Date: Thu, 1 Feb 2001 10:48:37 +0100
Reply-To: Mailing list for the LaTeX3 project
Sender: Mailing list for the LaTeX3 project
From: Frank Mittelbach
Subject: Re: default inputenc/fontenc tight to language
To: Multiple recipients of list LATEX-L
In-Reply-To: <200101312200.XAA09346@bar.loria.fr>

Denis,

> Well, when I brought up that issue, it was not pure theory.

didn't expect it otherwise.

> (inputenc2 is a variant of inputenc where you can switch the input encoding
> within a paragraph; it is possible that there is a standard package
> achieving this now)

think it is called inputenc these days

> Yes, I had such a problem with deferred layouts like the table of contents.
> For instance, say you have a French, then a Russian section. If you write
>
>   switch encoding to French
>
>   \tableofcontents
>
>   \section{French}
>
>   switch encoding to Russian
>
>   \section{Russian}
>
> you'll end up with `French' appearing in Cyrillic, because
> the state at the end of the \tableofcontents is not restored.
> You have to add an explicit change of encoding, for instance after
> \tableofcontents, or at the end of your document.

basically what you are saying is that moving text needs to keep information
about its state with it, right? it fortunately doesn't need to keep
information about its input encoding since that got all normalised into the
internal representation but unfortunately you need to keep information about
the encoding used (or rather the encoding intended)

a bit inconsistent that, isn't it?

but would it help if the language has a tie to the encoding?
i think current babel would handle the toc example right (if the output
encoding is set by the language) but i guess this would not be true for mark
entries.

concerning the output encoding: assume there is something like
\languagefontencoding which is either unset or set (per language) and which,
if set, results in a change to the font encoding whenever a language switch
happens. (if unset for a language that should probably mean revert to the
document default encoding).

for languages like Russian things are relatively clear (though not really
either) since due to the limitation of TeX we are forced to select an
encoding, so there the language might as well provide a default like T2A, but
for German this is not the case, ie one can type in OT1, T1, or even OT4.

a scheme like the above would then do well enough but the user still has to
make some font encoding decisions (like what is the default font encoding, or
do i want T1 with english ...)

however it would be far nicer if TeX (or LaTeX) would be able to take a bit of
text written in the internal LaTeX representation (plus language tags), ie
ascii + font encoding specific commands, and automagically figure out behind
the scenes how to typeset the lot.

only i can't see how to make this happen more automatically with the above
scheme, which requires a) language tags and b) potentially customisation in
the preamble. a lot of people find (and i agree) that \usepackage[T1]{fontenc}
is already too much to expect (and difficult or impossible to explain) --- and
why should a user be concerned with it?

but then, the same people have diametrically opposed ideas about what the
right font encodings are for a certain language: just look at the different
opinions on this list concerning French and OT1 or T1.

so we have to offer a choice, question is, is there a better way to present
it?
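
The table-of-contents case from Denis can be tried out with a small test file. What follows is only a sketch, assuming a babel installation with the french and russian options and T2A-encoded Cyrillic fonts; the section titles and text are invented for illustration, and the Cyrillic is written with 7-bit font-encoding-specific commands so that no input encoding is involved. Whether the .toc entries come out in the right font encoding depends on the babel version in use.

```latex
% Sketch only: assumes babel with french and russian support and
% T2A-encoded Cyrillic fonts (for example the lh or cm-super fonts).
\documentclass{article}
\usepackage[T2A,T1]{fontenc}        % T1 is listed last, so it is the document default
\usepackage[russian,french]{babel}  % french is listed last, so it is the main language
\begin{document}
\tableofcontents

\section{Fran\c{c}ais}
Texte en fran\c{c}ais.

\selectlanguage{russian}  % the russian support is expected to switch to T2A here
% 7-bit notation for the Cyrillic letters, so the file stays pure ascii
\section{\CYRR\cyru\cyrs\cyrs\cyrk\cyri\cyrishrt}
\CYRT\cyre\cyrk\cyrs\cyrt.
\end{document}
```

The file needs two runs so that the .toc file is read back. If the French entry then appears in Latin letters and the Russian one in Cyrillic, the language switch has travelled with the moving text.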
frank

1-Feb-2001 10:07:32-GMT,4833;000000000001
Return-Path:
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
User-Agent: Mutt/1.2i
Message-ID: <20010201110710.A16753@clipper.ens.fr>
Date: Thu, 1 Feb 2001 11:07:10 +0100
Reply-To: Mailing list for the LaTeX3 project
Sender: Mailing list for the LaTeX3 project
From: Eric Brunet
Subject: Re: default
        inputenc/fontenc tight to language
To: Multiple recipients of list LATEX-L

Franck Mittelbach wrote:
> Johannes talks about the difficulties in providing this, but is it really
> something one wants?
> - except when using mule (or emacs) one doesn't (automatically)
>   change input encodings when changing a language in the middle of
>   the document.

I am not quite sure I understand what you mean here, so I won't comment.

> - beside, even if, for the same language many different input
>   encodings would be in use so you can't even pick a default without
>   making a lot of people unhappy. you can write german using cp437de
>   or latin1 or ansinew or ... depending on the OS used or the keyboard
>   or ...
> - same is true for font encodings: my question about OT1 T1 showed
>   that clearly, some people never use OT1 these days others only (and
>   both writing in the same language)

Sure, there might be many different possible settings, and one can only
choose a default one. But I don't quite see how this could be an argument
for not making a decision. For the moment, all users must define an
inputencoding/fontencoding pair for every language in their document.
If a default is chosen, then only some of them (and hopefully a minority
of them) would have to specify something. Surely that is on the whole a
better solution.

> i guess the only way to tie something like this to the language is as
> an offering, ie by default nothing is tied to a language but you have a
> mechanism to say that all switches to language X result in switching
> the inputenc to Y and give the user a chance to specify this in the
> preamble.

Certainly it would be useful to have a mechanism that binds an input and
a font encoding to each language. But again, the fact that it is
impossible to please everybody does not mean that no default should be
chosen. Imagine that I don't like the default margins in a LaTeX document.
As it would be impossible to please me (and probably others), would that mean
that there should be no default margin and that everybody should invoke the
geometry package and explicitly give their preferred sizes?

I used to teach LaTeX to beginners, and it was always very difficult to
explain why it was necessary that any document should have 5 or 6 lines
of preamble, even to just typeset a single sentence.

--
Éric Brunet

1-Feb-2001 11:31:09-GMT,4276;000000000001
Return-Path:
X-Mailer: exmh version 2.1.1 10/15/1999 (debian)
References: <20010131103838.A5711@clipper.ens.fr> <14968.34118.306909.315983@istrati.zdv.uni-mainz.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
Message-ID: <200102011130.MAA02188@peano.cs.uni-dortmund.de>
Date: Thu, 1 Feb 2001 12:30:12 +0100
Reply-To: Mailing list for the LaTeX3 project
Sender: Mailing list for the LaTeX3 project
From: Karsten Tinnefeld
Subject: Re: default inputenc/fontenc tight to language
To: Multiple recipients of list LATEX-L
In-Reply-To: Your message of "Wed, 31 Jan 2001 22:36:06 +0100." <14968.34118.306909.315983@istrati.zdv.uni-mainz.de>

> > http://www.Uni-Mainz.DE/cgi-bin/ltxbugs2html?pr=babel/3046
>
> Johannes talks about the difficulties in providing this, but is it really
> something one wants?
>
> - except when using mule (or emacs) one doesn't (automatically) change input
>   encodings when changing a language in the middle of the document.

I have tried this, using emacs (for an English-German-Russian doc), and was
not quite happy about it. Then I found out that a utf-8 based encoding is
work-in-progress, and this would be A Good Thing to pursue and integrate into
the kernel in mid-term. See the current version at
http://www.unruh.de/DniQ/latex/unicode/ , CTAN is not up-to-date :-(hark, Robin).

Its author Dominique Unruh (cc:) also put some effort into an experimental
automatic font encoding selector (unsupported/autofe.sty) which is primarily
used with CJK (afai understood, I don't speak any Far Eastern language).

I would not vote for following the Java train to all-unicode-processing now,
but it would be fine if latex entered the platform.
Karsten
--
Karsten Tinnefeld                     tinnefeld@ls2.cs.uni-dortmund.de
Fachbereich Informatik, Lehrstuhl 2   T +49 231 755-4737
Universität Dortmund, D-44221 Dortmund, Deutschland   F +49 231 755-2047

1-Feb-2001 13:16:21-GMT,9013;000000000001
Return-Path:
X-Authentication-Warning: istrati.zdv.uni-mainz.de: latex3 set sender to frank@mittelbach-online.de using -f
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
References: <20010201110710.A16753@clipper.ens.fr>
X-Mailer: VM 6.75 under Emacs 20.4.1
Message-ID: <14969.24812.867227.593317@istrati.zdv.uni-mainz.de>
Date: Thu, 1 Feb 2001 14:13:16 +0100
Reply-To: Mailing list for the LaTeX3 project
Sender: Mailing list for the LaTeX3 project
From: Frank Mittelbach
Subject: Re: default inputenc/fontenc tight to language
To: Multiple recipients of list LATEX-L
In-Reply-To: <20010201110710.A16753@clipper.ens.fr>

Eric,

> Franck Mittelbach wrote:

it is Frank :-)

> > - except when using mule (or emacs) one doesn't (automatically)
> >   change input encodings when changing a language in the middle of
> >   the document.
>
> I am not quite sure I understand what you mean here, so I won't comment.

what i mean is that most people write their document in a single input
encoding and do not switch that encoding (or cannot even switch it) just
because they switch from one language to another.

> Sure, there might be many different possible settings, and one can only
> choose a default one. But I don't quite see how this could be an argument
> for not making a decision. For the moment, all the users must define an
> inputencoding/fontencoding tuple for all the languages in their document.
> If a default is chosen, then only some of them (and hopefully a minority
> of them) would have to specify something. Surely it is on the whole a
> better solution.

i think we have to distinguish between input encodings and font encodings.
for font encodings several languages need a default, or say, need a setting
which differs from the system default (which is OT1) though even for Russian
or Greek or other languages with a drastically different character set there
is often more than a single font encoding that could apply. For example
Polish could be typeset in OT4 (but you have only a small set of fonts
available in that encoding (typically, the situation might be different in
Poland)) or it could be typeset in T1. Anyway, for font encodings a default
setting different from the system default (if necessary) does make sense and
current babel already tries to do that, though as Denis' report shows not
always successfully.

but with input encodings the situation is quite different. on the desk here
writing in German i could use latin1, ansinew or cp437de (on the same computer
depending only on the OS i'm starting) so here choosing a default that is
better than saying "specify it" would be difficult. furthermore, by the
argument that the input encoding doesn't really change "wenn ich jetzt in
Deutsch schreibe" (both are latin1 as far as this mail is concerned), it
should probably be the same for english and german and french. ansinew because
a lot of people use PCs? or latin1 because Linux is going to take over the
world? or should it change in a year or two when the latter happens --- with
the result that then older documents would compile incorrectly because they
assume the no longer correct default?

finally, applying the wrong input encoding to a document not in that encoding
results in typesetting errors but not in compilation errors. true, this can
also happen if you explicitly specify the wrong encoding but this is a
conscious act (or so we would hope) and not something that happens behind the
scenes

i know that it is painful to have a five or 10 line preamble but at least you
know then what is in the document and that information is valid.
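
For a German document typed in latin1, the explicit preamble argued for here might look roughly as follows. This is a sketch using the standard inputenc, fontenc and babel packages; the particular choice of latin1, T1 and german is only an example, and the body is written in 7-bit notation so that the file is valid whatever encoding it is stored in.

```latex
% Sketch: every encoding decision is stated explicitly in the preamble.
\documentclass[a4paper]{article}
\usepackage[latin1]{inputenc}  % how the 8-bit bytes in this file are to be read
\usepackage[T1]{fontenc}       % which font encoding is used for output
\usepackage[german]{babel}     % hyphenation and fixed strings such as "Inhaltsverzeichnis"
\begin{document}
% With the inputenc declaration above, the latin1 byte for u-umlaut is
% mapped to the same internal form as the 7-bit notation used here.
Gr\"u\ss e aus Mainz.
\end{document}
```

Changing only the inputenc option is then enough to move the same text to a file stored in ansinew or cp437de, which is the kind of information a hidden default would leave the reader to guess.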
so in summary in my opinion:

 - output encoding should probably have a language dependent default (and
   already does to some extent). Babel in its present form will not be able
   to change the default for languages that currently use the document
   encoding (whatever it is) because of compatibility reasons but a successor
   probably could if there are good arguments for it.

 - input encoding should stay ascii by default and should not change on a
   per language change basis

though i think a successor to babel should offer this functionality on
request, eg for the example from Denis the code that i have here on my
computer would be set up like this to achieve an input encoding switch on a
per-language basis:

  \SetLanguageCommandAction{default}\inputencodingname{ascii}
  \SetLanguageCommandAction{russian}\inputencodingname{koi8-r}
  \SetLanguageCommandAction{french} \inputencodingname{latin1}

which would mean that the input encoding would switch to ascii for all
languages except Russian and French (though in real life one would probably
make the "default" latin1 in that case).

> Certainly it would be useful to have a mechanism that binds an input and
> a font encodings to each language. But again, it is not because it is
> impossible to please everybody that no default should be chosen. Imagine
> that I don't like the default margins in a LaTeX document. As it would be
> impossible to please me (and probably others), would that mean that there
> should be no default margin and that everybody should invoke the geometry
> package and explicitly give their preferred sizes ?

no, but i don't think the example quite fits: a) there isn't a single default
but rather a large number of defaults you can choose from, ie you use
\documentclass[a4paper]{article} or you use the koma classes or you use the
xyz journal classes or ... and b) should the margins be tied to the language?
ie would you like to get your margins changed when you switch languages in
the middle of the document?
> I used to teach LaTeX to beginners, and it was always very difficult to
> explain why it was necessary that any document should have 5 or 6 lines
> of preamble, even to just typeset a single sentence.

I think the more "international" we get the more we need to explicitly tag
our data. LaTeX's preamble stuff isn't the best but it isn't the worst either
i guess, and specifying, say, an input encoding there is in my opinion
preferable to a hidden default, especially if there is no easy explanation
why the default is as it is (ie if it is difficult to remember the default)

which reminds me: please take the list of languages babel currently supports
and attach to them input/font encoding defaults that would be suitable, i
would really be interested in seeing such a list (and having it discussed)

cheers
frank

2-Feb-2001 12:00:30-GMT,4146;000000000001
Return-Path:
X-VMS-To: IN%"LATEX-L@URZ.UNI-HEIDELBERG.DE"
MIME-version: 1.0
Content-type: TEXT/PLAIN; CHARSET=ISO-8859-1
Content-transfer-encoding: 8BIT
Message-ID: <01JZMVN1N7XK0009XR@ALPHA.NTP.SPRINGER.DE>
Date: Fri, 2 Feb 2001 13:00:06 +0100
Reply-To: Mailing list for the LaTeX3 project
Sender: Mailing list for the LaTeX3 project
From: J%ORG KNAPPEN
Subject: Re: inputenc text (and/or math)
To: Multiple recipients of list LATEX-L

Frank Mittelbach wrote:

> > If the standard inputenc files are changed, I strongly plea for "dual use"
> > characters. The standard ASCII characters with a few exceptions can be used
> > in text and math as well. A user expects the high character not to be
> > different in this respect.

> depends on what dual means. if you mean that you want the result of a key
> mapping (eg \"a) be available text or math then

No, no, I didn't want to go that far ...

> instead i would think something like \DeclareInputTextAndMath should be
> offered for those languages/keyboards where a dual nature for input "keys"
> really makes sense, greek and cyrillic comes to mind.

Yes, this was my intention. If I type "×" (the times sign in Latin-1,2,3,
and 4) I want it to work both in text and in math mode, giving something
sensible (i.e. \textmultiply or \times resp.). I don't necessarily want to
have the command \times in text mode and \textmultiply in math mode
(though it might lower the burden of memorising commands).

> \DeclareInputTextAndMath{}{\textalpha}{\alpha}
>
> you may have meant that as well, have you?

Yes, and the syntax looks fine to me.

--J"org Knappen

P.S.
> [...] i strongly plea for math only accepts the small set of real
> ascii, ie
>
>  0-9 a-z A-Z !"/()=?`'+*<>|,;.:-
>  plus commands
>
> by default (repeat: default).

There are two characters which are pretty useless in math, namely '"'
(the double quote) and '`' (the grave accent). The former is a real stroke
of luck because it allows TeX formulae in HTML ALT-tags without breaking
HTML syntax. The latter is just a curiosity, but without AMS fonts there
isn't any sensible symbol mapping to '`'.

2-Feb-2001 14:05:58-GMT,3764;000000000001
Return-Path:
References:
        <01JZMVN1N7XK0009XR@ALPHA.NTP.SPRINGER.DE>
Lines: 35
X-Mailer: Gnus v5.7/Emacs 20.7
Message-ID:
Date: Fri, 2 Feb 2001 09:05:05 -0500
Reply-To: Mailing list for the LaTeX3 project
Sender: Mailing list for the LaTeX3 project
From: Michael John Downes
Subject: Re: inputenc text (and/or math)
To: Multiple recipients of list LATEX-L
In-Reply-To: J%ORG KNAPPEN's message of "Fri, 2 Feb 2001 13:00:06 +0100"

J%ORG KNAPPEN writes:

> Yes, this was my intention. If I type "×" (the times sign in
> Latin-1,2,3, and 4) I want that it works both in text and in math mode giving
> something sensible (i.e. \textmultiply or \times resp.) I don't want to
> have the command \times in text mode and \textmultiply in math mode necessarily

The text/math ambiguity is a major problem for higher-level user
interfaces like Scientific Word. If the user enters

  \gamma + 1

without first starting a math formula, then the proper way for the
software to write it is:

  \textgamma \textplus 1

instead of

  $\gamma + 1$

Or, if all the math symbols are made to work equally well outside of
math, simply

  \gamma +1

in the middle of the text. This is first of all bad markup, and thus
sure to lead to problems later on; but there is already a problem at the
outset, namely that there will be either no space around the plus sign,
or (if the user explicitly adds spaces, which requires extra typing on
their part) then the spacing will not be the correct math formula
spacing for a binary operator.

Nevertheless it seems clear that it would be better to have a separate
hash table for math commands and text commands. So \gamma could have one
definition in math and another one in text without the constant use of
\relax \ifmmode a\else b\fi. This of course is not possible in TeX 3.x;
perhaps in NTS or e-TeX or Omega.
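
The \ifmmode idiom that this message would like to avoid can be sketched as follows. The command name \dualtimes is hypothetical, and \texttimes from the textcomp package stands in for the \textmultiply mentioned in the thread; neither is part of any proposal made on the list.

```latex
% Sketch only: \dualtimes is a made-up name used for illustration.
\documentclass{article}
\usepackage{textcomp}  % provides \texttimes for use in text mode
% \DeclareRobustCommand keeps the mode test from being carried out while the
% command is written to a file; the leading \relax follows the pattern
% quoted in the message.
\DeclareRobustCommand{\dualtimes}{%
  \relax\ifmmode\times\else\texttimes\fi}
\begin{document}
A 3\,\dualtimes\,4 matrix in text, and $3 \dualtimes 4$ in math.
\end{document}
```

Every dual-mode symbol needs a wrapper of this kind, and in text mode the spacing around the symbol is still left to the author, which is the markup problem described above.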
2-Feb-2001 17:13:33-GMT,4395;000000000001
Return-Path:
X-Authentication-Warning: fell.open.ac.uk: car2 set sender to car2@fell.open.ac.uk using -f
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
References: <14968.34118.306909.315983@istrati.zdv.uni-mainz.de> <200101312200.XAA09346@bar.loria.fr> <14969.12533.759505.917813@istrati.zdv.uni-mainz.de>
X-Mailer: VM 6.76 under Emacs 20.7.1
Message-ID: <14970.60068.179603.570418@fell.open.ac.uk>
Date: Fri, 2 Feb 2001 17:13:08 +0000
Reply-To: Mailing list for the LaTeX3 project
Sender: Mailing list for the LaTeX3 project
From: Chris Rowley
Subject: Re: default inputenc/fontenc tight to language
To: Multiple recipients of list LATEX-L
In-Reply-To: <14969.12533.759505.917813@istrati.zdv.uni-mainz.de>

Frank wrote --

> basically what you are saying is that moving text needs to keep information
> about its state with it, right? it fortunately doesn't need to keep
> information about its input encoding since that got all normalised into the
> internal representation but unfortunately you need to keep information about
> the encoding used (or rather the encoding intended)
>
> a bit inconsistent that, isn't it?

Not really: since input encoding really does mean just that. Once the
text is `inside LaTeX' the input encoding is irrelevant: that is the
beauty and strength of the LaTeX text character model.

The confusion perhaps comes because part of the `inside of LaTeX' is
various external files, .toc, .aux etc. These are unfortunately not
internal to TeX, only to LaTeX. The whole concept of `moving argument'
arises from this fundamental distinction between TeX and LaTeX: LaTeX,
in this sense, is not just a TeX macro package.

> but would it help if the language has a tie
> to the [font] encoding?

Whether the `intended font encoding' should be part of a moving
argument leads to an important question. Note the word `intended':
will it always be the case that text from a moving argument should be
turned into glyphs using the same font encoding as was used for the
original text?

> so we have to offer a choice, question is, is there a better way to present
> it?

This is not really an answer but we can certainly provide a better
interface than:

> \usepackage[T1]{fontenc}

... no, I am not sure what it would be:-).
chris

2-Feb-2001 17:47:45-GMT,5691;000000000001
References: <14968.6710.114015.220264@ux28.nets.de.eds.com> <200101292234.RAA14964@pluto.math.albany.edu> <14967.8829.903878.620595@istrati.zdv.uni-mainz.de> <200101310003.BAA02073@peano.cs.uni-dortmund.de> <14967.46479.253389.421142@istrati.zdv.uni-mainz.de> <14968.6710.114015.220264@ux28.nets.de.eds.com>
Date: Fri, 2 Feb 2001 18:47:16 +0100
From: Lars Hellström
Subject: Re: inputenc -> text+math
To: Multiple recipients of list LATEX-L
In-Reply-To: <14968.26883.994029.840070@istrati.zdv.uni-mainz.de>

At 20.35 +0100 2001-01-31, Frank Mittelbach wrote:
>I wrote:
>
> > do you mind outlining the solution in a few sentences? how do you want
> > to be able to find out that you are not in math but will be once
> > something (eg the actual letter) triggered the \halign u part without
> > actually triggering it (with something like \relax which kills
> > ligaturing)?
>
>but by now i got a chance to look at it. quite a nice idea but i don't think
>it is fully correct yet. you change \if@mmode at each \halign thus an \halign
>that doesn't generate math mode cells will have this setting throughout, eg
>something like
>
>\begin{tabular}[t]{..}
>
>will have broken text inside, wouldn't it? or do i overlook something?

There is an \ifmmode in \@inmathwarn, which appears in both \@current@cmd
and \@changed@cmd. If that isn't fully expandable all text commands might be
affected by broken ligatures and kerns.

>assuming that the analysis is right, what follows is that instead of changing
>\halign internally you would have to change those uses of \halign where it is
>needed (only) and that cuts through all existing macros and isn't transparent
>ie you can't simply get it done by a single package or inclusion of code in
>the kernel you actually have to change every second use of \halign

Simply fiddling with the definition of \ifmmode isn't sufficient for taking
care of all related problems.
The small document

  \documentclass{article}
  \usepackage{array}
  \usepackage[T1]{fontenc}

  \begin{document}

  \begin{tabular}{>{\fontencoding{OT1}\selectfont}l}
  a\\ \r{a}\\ \'{a}
  \end{tabular}

  \end{document}

will try to use the T1 definitions in an OT1 font; the log file contains:

  Missing character: There is no ^^e5 in font cmr10!
  Missing character: There is no ^^e1 in font cmr10!

Only the first row comes out right.

However, I noticed that a \noexpand\empty will stop the alignment mechanism
from looking any further for an \omit, so it isn't necessary to insert an
actual command to prevent it from scanning any further. Unfortunately a
\noexpand\empty also breaks ligatures (and most likely kerning as well; I
haven't checked) so simply inserting that into e.g. \IeC and \T1-cmd and
friends isn't the solution either.

What I suspect is the right solution is to have \protect set to
\@unexpandable@protect when scanning for \omit and have it reset to
\@typeset@protect in the column template---then the robustness mechanisms
for normal robust commands, text commands, and in the \IeC command
respectively would take care of sorting things out. I doubt this can be
done by patching the \halign primitive, but it could be built into e.g. the
array package.
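
The scanning behaviour described above can be tried out by hand. The following is only a minimal sketch of a per-cell stopgap, not the \protect-based fix suggested in this message: it assumes that an unexpandable token at the very start of the cell (here \relax) is enough to end TeX's search for \omit, so that the column template, and with it the switch to OT1, is inserted before \r and \' are expanded. At the start of a cell there is nothing to the left to ligature or kern with, so the drawback mentioned above should not matter in this position.

```latex
\documentclass{article}
\usepackage{array}
\usepackage[T1]{fontenc}

\begin{document}

\begin{tabular}{>{\fontencoding{OT1}\selectfont}l}
  a\\
  % \relax cannot be expanded, so the look-ahead for \omit stops here
  % and the >{...} material is inserted before \r is expanded
  \relax\r{a}\\
  \relax\'{a}
\end{tabular}

\end{document}
```

With this change the two "Missing character" lines should no longer appear in the log, but every affected cell has to be marked by hand, which is why a solution inside the array package (or the kernel) is being discussed.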
Lars Hellström

2-Feb-2001 19:10:22-GMT,3101;000000000001
References: <14968.6710.114015.220264@ux28.nets.de.eds.com> <200101292234.RAA14964@pluto.math.albany.edu> <14967.8829.903878.620595@istrati.zdv.uni-mainz.de> <200101310003.BAA02073@peano.cs.uni-dortmund.de> <14967.46479.253389.421142@istrati.zdv.uni-mainz.de> <14968.6710.114015.220264@ux28.nets.de.eds.com>
Date: Fri, 2 Feb 2001 14:09:52 -0500
From: Michael John Downes
Subject: Re: inputenc -> text+math
To: Multiple recipients of list LATEX-L
In-Reply-To: Lars Hellström's message of "Fri, 2 Feb 2001 18:47:16 +0100"

Lars Hellström writes:

> What I suspect is the right solution is to have \protect set to
> \@unexpandable@protect when scanning for \omit and have it reset to
> \@typeset@protect in the column templat ...
> ... I doubt this can be
> done by patching the \halign primitive, but it could be built into e.g. the
> array package.

I agree.

2-Feb-2001 21:08:54-GMT,7795;000000000001
References: <14968.34118.306909.315983@istrati.zdv.uni-mainz.de> <200101312200.XAA09346@bar.loria.fr> <14969.12533.759505.917813@istrati.zdv.uni-mainz.de> <14970.60068.179603.570418@fell.open.ac.uk>
Message-ID: <14971.8503.549122.613285@istrati.zdv.uni-mainz.de>
Date: Fri, 2 Feb 2001 22:05:59 +0100
From: Frank Mittelbach
Subject: Re: default inputenc/fontenc tight to language
To: Multiple recipients of list LATEX-L
In-Reply-To: <14970.60068.179603.570418@fell.open.ac.uk>

Chris wrote:

> > a bit inconsistent that, isn't it?
>
> Not really: since input encoding really does mean just that.

i meant inconsistent that we got input encodings fine but font encodings not
(or rather font encodings as well but missed out an important extra bit)

> Once the text is `inside LaTeX' the input encoding is irrelevant: that
> is the beauty and strength of the LaTeX text character model.

yes it is :-) so inputencodings are fine.

but the problem that i was trying to point at is this: assuming we have a bit
of text in the internal LaTeX representation, eg this:

  Trank der G\"otter \M{d} Trank der ...
then there is no way for LaTeX without further help to determine the best font
encoding to typeset this in. why is this so?

 - one first would need to analyse the whole text to find out which collection
   of glyphs are needed (that would result in a number of possible encodings,
   but it also might result in the need for more than one encoding)

 - but which of the possible encodings to use can depend on factors like do i
   have the desired fonts in this encoding or only in others ...

anyway, already the first analysis is a problem inside TeX because TeX works
sequentially so you would need to implement a multi pass system learning about
all the snippets of text as you go along and then reuse that information on
later passes. looks like a nightmare to me.

so if TeX can't do it automatically, we have to tell it what to use and with
NFSS2 we need to tell it which font encodings to use at those points. And this
is bad because users shouldn't be forced to bother about this font only
available in encoding A and that one in B and ...

Karsten pointed to some undocumented alpha code autofe.sty which attempts to
provide a solution for the problem. But this really is intended for a
different environment where you can (or can more easily) change font encodings
as you go along.

so back to the strange text above and think about how some algorithm (like
autofe) would work on finding the right encodings. assuming we start in OT1

  Trank der G   % no problem up to this point
  \"o           %* ahh, now this is in OT1 but it would be far better to use T1
                % now. but switching would be bad as well since we are in the
                % middle of a word ...
  tter          % so we are now either in T1 or OT1 depending on the decision
                % above
  \M{d}         % but this strange beast only exists in T4 so we have to switch
  Trank der     %* so what do we use now for this?
                % T4 does contain those letters. do we carry on?

whatever happens at the points marked * the typeset result would be a mess.
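
For comparison, this is roughly what telling LaTeX the encoding by hand looks like today. It is a minimal sketch under assumptions that are not part of the message: OT1 is the document default, T1 fonts for the current family are installed, and the switch is made around the whole phrase rather than in the middle of a word. The hypothetical \M{d} is left out because it does not exist in either encoding.

```latex
\documentclass{article}
% the last option becomes the default encoding, so this document starts in OT1
\usepackage[T1,OT1]{fontenc}

\begin{document}

% OT1: \"o is built with the \accent primitive from o and a dieresis
Trank der G\"otter

% T1: the same internal text, but \"o now maps to a single glyph;
% the user had to know that T1 is the encoding to ask for
{\fontencoding{T1}\selectfont Trank der G\"otter}

\end{document}
```

The input text is identical in both lines; only the explicit \fontencoding differs, and that is exactly the piece of knowledge the user should not need to have.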
when we write

  \fontencoding{FOO}\selectfont

we tell the system that we want it to select a font with the current
characteristics (ie family,shape...) in a very specific encoding but what we
actually only should say is "the following text is in a certain glyph
collection, ie contains certain glyphs"

we unfortunately can't express the latter so we are forced to do the former.

with moving arguments, eg a section head this becomes a real problem. if the
section head is, say in Russian (as in Denis' example) we have to somehow state
that the glyph collection for typesetting is one with cyrillic characters.
since we have no concept for this we can only express that it should be in the
encoding TA2 or X2 or whatever, which is (technically) fine for the heading
itself being typeset. but passing the information about the FONT encoding to,
say, the toc is wrong, since the toc might be typeset with different fonts or
different sizes for which we do not have TA2 fonts but only X2 fonts

this is i think a longer example of what Chris wrote:

> > but would it help if the language has a tie
> > to the [font] encoding?
>
> Whether the `intended font encoding' should be part of a moving
> argument leads to an important question.
>
> Note the word `intended': will it always be the case that text from a
> moving argument should be turned into glyphs using the same font encoding
> as was used for the original text?

no it need not, it only needs the same glyph collection.

so we would do better by tying "glyph collections" to languages and let the
system worry about which actual font encoding to use given other constraints
during the typesetting process.

this is the kind of extension NFSS2 would need in my opinion.
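
The moving-argument problem can be observed with a small test file. This is a sketch that uses T1 and OT1 in place of the Cyrillic encodings, purely so that it runs on an installation without Cyrillic fonts; the mechanism is assumed to be the same. Because \fontencoding and \selectfont are robust commands, they are written to the .toc file unchanged, so the table of contents is forced into the very font encoding chosen for the heading.

```latex
\documentclass{article}
\usepackage[T1,OT1]{fontenc}   % OT1 is the document default

\begin{document}

\tableofcontents

% the explicit encoding switch travels with the title into the .toc file
\section{{\fontencoding{T1}\selectfont Trank der G\"otter}}

Some text.

\end{document}
```

After two LaTeX runs, the .toc file can be inspected: the entry carries the explicit request for T1 rather than a statement about which glyphs the title needs, which is the distinction drawn above.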
frank

2-Feb-2001 21:35:56-GMT,6208;000000000001
References: <14968.6710.114015.220264@ux28.nets.de.eds.com> <200101292234.RAA14964@pluto.math.albany.edu> <14967.8829.903878.620595@istrati.zdv.uni-mainz.de> <200101310003.BAA02073@peano.cs.uni-dortmund.de> <14967.46479.253389.421142@istrati.zdv.uni-mainz.de>
Message-ID: <14971.10163.320991.33744@istrati.zdv.uni-mainz.de>
Date: Fri, 2 Feb 2001 22:33:39 +0100
From: Frank Mittelbach
Subject: Re: inputenc -> text+math
To: Multiple recipients of list LATEX-L

Lars wrote:

> >will have broken text inside, wouldn't it? or do i overlook something?
>
> There is an \ifmmode in \@inmathwarn, which appears in both \@current@cmd
> and \@changed@cmd. If that isn't fully expandable all text commands might be
> affected by broken ligatures and kerns.

yes this is why I claim it is wrong. However, thinking about it: for cyrillic
this isn't a problem (most likely) since the internal representation of text
is then consisting nearly exclusively of font encoding specific commands so
the first glyph would already reset the \ifmmode setting and thus no ligatures
get broken.

the problem only appears in languages which contain a good deal of ascii so
that the first font encoding specific command might appear in the middle of a
word.

> Simply fiddling with the definition of \ifmmode isn't sufficient for taking
> care of all related problems. The small document
>
> \documentclass{article}
> \usepackage{array}
> \usepackage[T1]{fontenc}
>
> \begin{document}
>
> \begin{tabular}{>{\fontencoding{OT1}\selectfont}l}

who would do such a thing? :-)

actually more or less the same sample document was shown to me last week.
yes you can't change font encodings safely inside array's >{...} right now.

> However, I noticed that a \noexpand\empty will stop the alignment mechanism
> from looking any further for an \omit, so it isn't necessary to insert an
> actual command to prevent it from scanning any further. Unfortunately a
> \noexpand\empty also breaks ligatures (and most likely kerning as well; I
> haven't checked) so simply inserting that into e.g. \IeC and \T1-cmd and
> friends isn't the solution either.

\noexpand\@empty is a somewhat expensive way to say \relax actually :-)

> What I suspect is the right solution is to have \protect set to
> \@unexpandable@protect when scanning for \omit and have it reset to
> \@typeset@protect in the column template---then the robustness mechanisms
> for normal robust commands, text commands, and in the \IeC command
> respectively would take care of sorting things out. I doubt this can be
> done by patching the \halign primitive, but it could be built into e.g. the
> array package.

yes, that would solve the problem i'm pretty sure of it

but as i said in my earlier mail this really doesn't help because it would
then only work in array but not in any contributed package that uses \halign
(this is why Vladimir tries to patch \halign).

perhaps one should investigate combining your solution (change of protect)
with a patch to \halign after all, eg via a clever use of \everycr and the
like.

where is the solution Donald? this should be the kind of problem you like to
tackle, or not?
frank

3-Feb-2001 14:22:14-GMT,6522;000000000001
References: <14968.34118.306909.315983@istrati.zdv.uni-mainz.de> <200101312200.XAA09346@bar.loria.fr> <14969.12533.759505.917813@istrati.zdv.uni-mainz.de> <14970.60068.179603.570418@fell.open.ac.uk> <14971.8503.549122.613285@istrati.zdv.uni-mainz.de>
Message-ID: <14972.4711.612679.929965@istrati.zdv.uni-mainz.de>
Date: Sat, 3 Feb 2001 15:15:03 +0100
From: Frank Mittelbach
Subject: glyph collections viz font encodings
To: Multiple recipients of list LATEX-L
In-Reply-To: <14971.8503.549122.613285@istrati.zdv.uni-mainz.de>

after having handwaved myself through the ideas of specifying glyph
collections rather than font encodings, here is a "hand waving" sort of
implementation of the idea.

basically \fontencoding is changed to accept a comma list of encodings and
\selectfont is changed to try these encodings in order (keeping the other font
characteristics) until it finds a font or runs out of encodings. in the latter
case it tries to find a font by changing the characteristics to defaults.

the latter process could and should be made smarter, eg given

  encodings T1,OT1
  family    xxx
  series    yyy
  shape     zzz

it will try

  T1/xxx/yyy/zzz
  OT1/xxx/yyy/zzz
  T1/xxx/yyy/
  T1/xxx//
  T1///
  FAIL

instead of

  T1/xxx/yyy/zzz
  OT1/xxx/yyy/zzz
  T1/xxx/yyy/
  OT1/xxx/yyy/
  T1/xxx//
  OT1/xxx//
  ...

which might be a better approach.

anyway, here it is and it seems to work more or less. what do you think?
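
The second search order (every encoding is tried at each step before a further attribute falls back to its default) can be written out mechanically. The sketch below only prints candidate names to the terminal and log; it loads no fonts and is not part of the sample implementation. \showsearchorder and \cand@enc are hypothetical names introduced for this illustration, and an attribute that has been given up is printed as the current value of \shapedefault, \seriesdefault or \familydefault instead of being left empty.

```latex
\documentclass{article}

\makeatletter
% Hypothetical helper: #1 = comma list of encodings,
% #2 = family, #3 = series, #4 = shape
\newcommand\showsearchorder[4]{%
  % step 1: all attributes as requested, every encoding in turn
  \@for\cand@enc:=#1\do{\typeout{\cand@enc/#2/#3/#4}}%
  % step 2: shape falls back to its default
  \@for\cand@enc:=#1\do{\typeout{\cand@enc/#2/#3/\shapedefault}}%
  % step 3: series falls back as well
  \@for\cand@enc:=#1\do{\typeout{\cand@enc/#2/\seriesdefault/\shapedefault}}%
  % step 4: finally the family falls back
  \@for\cand@enc:=#1\do{%
    \typeout{\cand@enc/\familydefault/\seriesdefault/\shapedefault}}%
}
\makeatother

\begin{document}
\showsearchorder{T1,OT1}{xxx}{yyy}{zzz}
\end{document}
```

The point of this ordering is that a font with the requested family in the second encoding is preferred over a default-family font in the first one.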
frank

--------------------------- glyphcoll.tex

\documentclass{article}

\usepackage[T1,OT6,OT1]{fontenc}

\makeatletter

\DeclareRobustCommand\fontencoding[1]{\edef\glyph@collection{#1}}

\DeclareRobustCommand\Xfontencoding[1]{%
    \expandafter\ifx\csname T@#1\endcsname\relax
      \@latex@error{Encoding scheme `#1' unknown}\@eha
    \else
      \edef\f@encoding{#1}%
      \ifx\cf@encoding\f@encoding
        \let\enc@update\relax
      \else
        \let\enc@update\@@enc@update
      \fi
    \fi
}

\DeclareRobustCommand\Xselectfont
        {%
    \ifx\f@linespread\baselinestretch \else
      \set@fontsize\baselinestretch\f@size\f@baselineskip \fi
    \expandafter\get@next@encoding\glyph@collection,\@nil
    \xdef\font@name{%
      \csname\curr@fontshape/\f@size\endcsname}%
    \Xpickup@font
    \font@name
    \size@update
    \enc@update
    }

\def\get@next@encoding#1,#2\@nil{%
  \Xfontencoding{#1}%
  \def\sub@glyph@collection{#2}%
}

\def\Xpickup@font{%
    \@font@warning{Trying for \font@name ...}%
    \expandafter \ifx \font@name \relax
       \Xdefine@newfont
    \fi}

\def\Xdefine@newfont{%
  \@font@warning{ ... undefined ... try to load it ...}%
  \begingroup
    \let\typeout\@font@info
    \escapechar\m@ne
    \expandafter\expandafter\expandafter
       \split@name\expandafter\string\font@name\@nil
    \try@load@fontshape % try always
    \expandafter\ifx
       \csname\curr@fontshape\endcsname \relax
      \@font@warning{ ... unloadable ... }%
      \Xwrong@fontshape\else
      \@font@warning{ ... loadable ... extract it}%
      \extract@font\fi
  \endgroup}

\def\Xwrong@fontshape{%
  \expandafter\get@next@encoding\sub@glyph@collection,\@nil
  \ifx\f@encoding\@empty
    \expandafter\get@next@encoding\glyph@collection,\@nil
    \wrong@fontshape
  \else
    \xdef\font@name{%
      \csname\curr@fontshape/\f@size\endcsname}%
    \Xpickup@font
  \fi
}

\makeatother

\begin{document}

\fontfamily{cmr}
\fontencoding{OT6,T1}\selectfont  AA

\fontfamily{ptm}\selectfont  BB

\fontseries{b}\selectfont  CC

\showoutput

\stop

4-Feb-2001 12:48:01-GMT,4406;000000000001
Message-ID: <200102041246.f14CkjS18998@smtp.wanadoo.es>
Date: Sun, 4 Feb 2001 13:43:51 +0100
From: Javier Bezos
Subject: Re: glyph collections viz font encodings
To: Multiple recipients of list LATEX-L

> after having handwaved myself through the ideas of specifying glyph
> collections rather than font encodings, here is a "hand waving" sort of
> implementation of the idea.
>
> basically \fontencoding is changed to accept a comma list of encodings and
> \selectfont is changed to try these encodings in order (keeping the other font
> characteristics) until it finds a font or runs out of encodings. in the latter
> case it tries to find a font by changing the characteristics to defaults.
>
> the latter process could and should be made smarter, eg given
>
>   encodings T1,OT1
>   family    xxx
>   series    yyy
>   shape     zzz

Hi,

Apparently, the sample always loads the ot1 variants, no matter
which encoding is selected. I think that you mean

  \DeclareRobustCommand\selectfont ...etc.

instead of

  \DeclareRobustCommand\Xselectfont ...etc.

But anyway...

As you know, I was experimenting a couple of months ago with this idea
in my draft for Lambda (the multilingual environment for Omega).
However, I found several problems. For example:

- if I say \fontencoding{T1,OT1} we will get t1cmr which points to another
  font (ec) and not to a t1 encoded cmr,
- more importantly, we lose control of the final result, because
  a faked accented letter may be not exactly the same as an actual composite
  letter. It so happens that no two TeX installations are the same and perhaps
  a different font is selected on another system just because a file has not
  been installed.
Despite that, I think that is the right way, and I'm studying how to solve
these issues. Any ideas?

Javier
___________________________________________________________
Javier Bezos              | TeX y tipografia
jbezos at wanadoo dot es  | http://perso.wanadoo.es/jbezos/
...........................................................
CervanTeX  http://apolo.us.es/CervanTeX/CervanTeX.html

4-Feb-2001 16:16:46-GMT,7129;000000000001
References: <200102041246.f14CkjS18998@smtp.wanadoo.es>
Message-ID: <14973.32706.432640.436896@istrati.zdv.uni-mainz.de>
Date: Sun, 4 Feb 2001 17:13:54 +0100
From: Frank Mittelbach
Subject: Re: glyph collections viz font encodings
To: Multiple recipients of list LATEX-L
In-Reply-To: <200102041246.f14CkjS18998@smtp.wanadoo.es>

Javier,

> Apparently, the sample always loads the ot1 variants, no matter
> which encoding is selected. I think that you mean
>
>   \DeclareRobustCommand\selectfont ...etc.
>
> instead of
>
>   \DeclareRobustCommand\Xselectfont ...etc.

hmmm, yes. plead guilty on that one. thought i had tested it after making
final changes from the private version to the one overwriting the nfss
primitives, but ...

> But anyway...

yes, anyway. please everybody: change the above in the sample if you try it.

> As you know, I was experimenting a couple of months ago with this idea
> in my draft for Lambda (the multilingual environment for Omega).

was actually not aware of that (that you experimented with multiple encodings)

> However, I found several problems.
> For example:
> - if I say \fontencoding{T1,OT1} we will get t1cmr which points to another
> font (ec) and not to a t1 encoded cmr,

perhaps we should have that as a separate debate, but the ec fonts were supposed to be extended cmr; at least this is what they started out to be. I know that Joerg in the end made one or two changes, but at least the original intention was that those fonts should be indistinguishable on their common subset of glyphs (and this is why in LaTeX both are considered family cm).

if a lot of people think this is not the case then this opens an important discussion about what to do with them, but it doesn't seem to me a criticism of, or a problem with, the general approach taken in my sample code.

> - more importantly, we lost the control of the final result, because
> a faked accented letter may be not exactly the same as an actual composite
> letter. It so happens that no TeX installations are the same and perhaps
> a different font is selected in another system just because a file has not
> been installed.

but this is true already, isn't it? as of today the formatting of a latex document depends on a number of factors, one of which is the available fonts. so 100% output compatibility is only achieved if you

 - have an identical set of fd files
 - have identical metrics (this was an issue in the past, especially with PostScript fonts)
 - actually have the fonts installed that the fd files and metrics are pointing to.

so i don't see that the situation would change in kind. Agreed, with more possibilities you are likely (potentially at least) to get a wider range of results; but on the other hand either you let the system take responsibility (which means trying to find fonts suitable for the intended script (glyph collection)) or you force this selection onto the user.
And we know that the latter is unsatisfactory as well, since not many people understand why they need to say \usepackage[...]{fontenc} etc, or rightly feel that they should not have to worry about it.

i don't really see that there is a chance in the world to achieve 100% output compatibility between sites unless you enforce a far more rigid scheme (which isn't really possible). I mean you would need to define a far bigger set of files to be untouchable and required, and you would also need to enforce that you don't have additional files that might change your setup.

if I now write a document and specify \fontencoding{T1} it might not run at all at a site not having T1 fonts (though such a site is in theory not allowed to exist) or it might switch to ec as default fonts, while with a range of encodings i would get a result "closer" to the intended output.

also please note that my code (after your fix:-) does both: you can still specify a single encoding and then only that encoding will get used, ie you get the situation as it is now where the user has total control (assuming that fd files are the same).

> Despite that, I think that is the right way, and I'm studying how to solve
> these issues. Any ideas?

do you have any other issues than the two above? you mentioned them as "for example".
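
For comparison with the discussion above, here is a minimal sketch of how a document pins its font encodings in standard LaTeX2e. It uses only the stock fontenc package, not the sample code under discussion; the multi-encoding form \fontencoding{T1,OT1} belongs to that sample and is assumed not to be available in the standard kernel.

```latex
\documentclass{article}
% Load the definitions for both encodings; the last option (T1) becomes
% the document default.
\usepackage[OT1,T1]{fontenc}
\begin{document}
% Typeset from T1-encoded fonts: accented letters are real glyphs.
Stra\ss e, na\"{\i}ve, caf\'e

% Same input, but from OT1-encoded fonts: the accented letters are now
% built with the \accent primitive.
{\fontencoding{OT1}\selectfont Stra\ss e, na\"{\i}ve, caf\'e}
\end{document}
```

Naming a single encoding in this way is the "total control" case: the result is predictable as long as the fd files and metrics agree between sites.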
cheers frank 4-Feb-2001 21:07:25-GMT,4084;000000000001 Return-Path: Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by sunshine.math.utah.edu (8.9.3/8.9.3) with ESMTP id OAA06452 for ; Sun, 4 Feb 2001 14:07:24 -0700 (MST) Received: from relay.urz.uni-heidelberg.de (relay-eth.urz.uni-heidelberg.de [129.206.100.201]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f14GYZu14290; Sun, 4 Feb 2001 17:34:35 +0100 (MET) Received: from relay (relay.urz.uni-heidelberg.de [129.206.119.201]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with SMTP id RAA24908; Sun, 4 Feb 2001 17:34:30 +0100 (MET) Received: from RELAY.URZ.UNI-HEIDELBERG.DE by RELAY.URZ.UNI-HEIDELBERG.DE (LISTSERV-TCP/IP release 1.8b) with spool id 486685 for LATEX-L@RELAY.URZ.UNI-HEIDELBERG.DE; Sun, 4 Feb 2001 17:34:29 +0100 Received: from ix.urz.uni-heidelberg.de (mail.urz.uni-heidelberg.de [129.206.119.234]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id RAA24901 for ; Sun, 4 Feb 2001 17:34:28 +0100 (MET) Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by ix.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id RAA33302 for ; Sun, 4 Feb 2001 17:34:28 +0100 Received: from venus.open.ac.uk (venus.open.ac.uk [137.108.143.2]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f14GYRu14272 for ; Sun, 4 Feb 2001 17:34:27 +0100 (MET) Received: from fell.open.ac.uk by venus.open.ac.uk via SMTP Local (Mailer 3.1) with ESMTP; Sun, 4 Feb 2001 16:34:26 +0000 Received: (from car2@localhost) by fell.open.ac.uk (8.9.3+Sun/8.9.1) id QAA15339; Sun, 4 Feb 2001 16:34:39 GMT X-Authentication-Warning: fell.open.ac.uk: car2 set sender to car2@fell.open.ac.uk using -f MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit References: <200102041246.f14CkjS18998@smtp.wanadoo.es> X-Mailer: VM 6.76 under Emacs 20.7.1 Message-ID: <14973.33950.634055.464928@fell.open.ac.uk> Date: Sun, 4 Feb 2001 
16:34:38 +0000 Reply-To: Mailing list for the LaTeX3 project Sender: Mailing list for the LaTeX3 project From: Chris Rowley Subject: Re: glyph collections viz font encodings To: Multiple recipients of list LATEX-L In-Reply-To: <200102041246.f14CkjS18998@smtp.wanadoo.es>

Javier

> - more importantly, we lost the control of the final result, because
> a faked accented letter may be not exactly the same as an actual composite
> letter.

I am not quite sure what you are saying is the problem here. Do you think that these two should be `the same' in some ways in which they are not in current fonts?

In particular, are you concerned about:

-- only differences in the final glyphs or
-- differences in the metrics (these can cause major differences in the typesetting).

In fact differences equally great in the actual glyphs can happen simply because two versions of what claims to be the same font are finally used in the actual rendering on some actual physical device.

> It so happens that no TeX installations are the same and perhaps
> a different font is selected in another system just because a file has not
> been installed.

This is true and a fact of the universe (and always will be).

The interesting question is: what differences in the font resources are important? How can a typesetting system usefully interact with these differences?

>
> Despite that, I think that is the right way, and I'm studying how to solve
> these issues. Any ideas?

Probably: but please be specific about what issues you think need to be dealt with --- then maybe we can deal with some of them by turning them into non-issues:-).
chris 5-Feb-2001 11:42:56-GMT,6771;000000000001 Return-Path: Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by sunshine.math.utah.edu (8.9.3/8.9.3) with ESMTP id EAA26517 for ; Mon, 5 Feb 2001 04:42:54 -0700 (MST) Received: from relay.urz.uni-heidelberg.de (relay-eth.urz.uni-heidelberg.de [129.206.100.201]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f15BgXu18361; Mon, 5 Feb 2001 12:42:33 +0100 (MET) Received: from relay (relay.urz.uni-heidelberg.de [129.206.119.201]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with SMTP id MAA06404; Mon, 5 Feb 2001 12:42:00 +0100 (MET) Received: from RELAY.URZ.UNI-HEIDELBERG.DE by RELAY.URZ.UNI-HEIDELBERG.DE (LISTSERV-TCP/IP release 1.8b) with spool id 486957 for LATEX-L@RELAY.URZ.UNI-HEIDELBERG.DE; Mon, 5 Feb 2001 12:41:59 +0100 Received: from ix.urz.uni-heidelberg.de (mail.urz.uni-heidelberg.de [129.206.119.234]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id MAA06397 for ; Mon, 5 Feb 2001 12:41:57 +0100 (MET) Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by ix.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id MAA26460 for ; Mon, 5 Feb 2001 12:41:57 +0100 Received: from smtp.wanadoo.es (m1smtpisp02.wanadoo.es [62.36.220.21] (may be forged)) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f15Bfuu18187 for ; Mon, 5 Feb 2001 12:41:57 +0100 (MET) Received: from wanadoo.es (m1wmail1.wanadoo.es [62.36.220.41]) by smtp.wanadoo.es (8.10.2/8.10.2) with ESMTP id f15BfuS19825 for ; Mon, 5 Feb 2001 12:41:56 +0100 (MET) MIME-Version: 1.0 Content-Type: text/plain X-XaM3-API-Version: 1.1.11.1.5 X-SenderIP: 195.53.220.3 Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by relay.urz.uni-heidelberg.de id MAA06398 Message-ID: Date: Mon, 5 Feb 2001 12:41:56 +0100 Reply-To: Mailing list for the LaTeX3 project Sender: Mailing list for the LaTeX3 project From: jbezos Subject: Re: glyph collections viz 
font encodings To: Multiple recipients of list LATEX-L

Frank, Chris,

I must admit that I'm too critical of my own macros and that I try to study every possible implication of them, even if in the end they are unimportant.

> > However, I found several problems. For example:
> > - if I say \fontencoding{T1,OT1} we will get t1cmr which points to another
> > font (ec) and not to a t1 encoded cmr,
>
> perhaps we should have that as a separate debate, but the ec fonts were
> supposed to be extended cmr at least this is what they started out to be. I
> know that Joerg in the end did one or the other change but at least the
> original intention was that those fonts should have been indistinguishable on
> their common subset of glyphs (and this is why in LaTeX they are considered
> both family cm).
>
> if a lot of people think this is not the case then this opens an important
> discussion about what to do with them, but it doesn't seem to me a criticism
> or a problem with the general approach taken in my sample code.

Actually, this is not a criticism of this approach, just an issue. While there are free PostScript ot1cmr fonts, there are only MF t1cmr ones, which to me is a huge difference. Sometimes I combine Palatino (T1) and cmtt (OT1), and \selectfont{T1,OT1} is not enough. The solution I took in my macros was to allow explicit declarations like:

   \SetFontEnconding{cmtt}{OT1}

> > - more importantly, we lost the control of the final result, because
> > a faked accented letter may be not exactly the same as an actual composite
> > letter. It so happens that no TeX installations are the same and perhaps
> > a different font is selected in another system just because a file has not
> > been installed.
>
> but this is true already, isn't it? as of today a formatting of latex document
> depends on a number of factors, one of which is the available fonts.
> so 100%
> output compatibility is only achieved if you
>
>  - have an identical set of fd files
>  - have identical metrics (this was especially with PostScript fonts an issue
>    in the past)
>  - actually have the fonts installed that the fd files and metrics are
>    pointing to.

But TeX complains, except in the case of locally generated metrics. One of the solutions I considered was to generate a file recording the decisions taken in a system when a document is typeset, so that if we really want to ensure that TeX complains if there is a different configuration we can distribute that file with the main .tex ones. And after all, if we really really want a certain layout, the obvious solution is to distribute only a pdf file...

> [Chris]-- differences in the metrics (these can cause major differences in
> the typesetting).

Yes, differences in the metrics, which can reshape the whole document.

> if I now write a document and specify \fontencoding{T1} it might not run at
> all at a site not having T1 fonts (though such a site is in theory not allowed
> to exist) or it might switch to ec as default fonts, while with a range of
> encodings i would get a result "closer" to the intended output.
>
>
> also please note that my code (after your fix:-) does both: you can still
> specify a single encoding and then only that encoding will get used ie you get
> the situation as it is now where the user has total control (assuming that fd
> files are the same).
>
> > Despite that, I think that is the right way, and I'm studying how to solve
> > these issues. Any ideas?
>
> do you have any other issues than the two above? you mentioned them as "for
> example".

They are in part related to the fact that a language or a script can provide a set of default encodings. (Note: in the current draft for Lambda there are files for both languages and scripts).
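
One existing facility goes a small way towards comparing configurations between sites: \listfiles. The sketch below is an illustration only and is not the recording file described above. \listfiles makes LaTeX write a list of the files it read during the run to the end of the log, including font definition (.fd) files that identify themselves with \ProvidesFile. It does not record which metric (.tfm) files were used, and fd files already preloaded in the format are not listed.

```latex
\listfiles % placed first so that every file read afterwards is recorded
\documentclass{article}
\usepackage[T1]{fontenc}
\begin{document}
Some text in the default family.

% The first use of a family/encoding pair that is not preloaded reads
% the matching fd file, which then shows up in the file list.
{\ttfamily Some typewriter text.}
\end{document}
```

After the run, the "*File List*" section of the .log file can be compared between two installations to spot differing package or fd file versions.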
But I think that: > also please note that my code (after your fix:-) does both: you can still > specify a single encoding and then only that encoding will get used ie you get > the situation as it is now where the user has total control (assuming that fd > files are the same). is the best solution. Cheers Javier ______________________________________________________________________________ Consigue tu cuenta de correo universal y gratuita en http://webmail.wanadoo.es 5-Feb-2001 14:58:18-GMT,4129;000000000001 Return-Path: Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by sunshine.math.utah.edu (8.9.3/8.9.3) with ESMTP id HAA00675 for ; Mon, 5 Feb 2001 07:58:16 -0700 (MST) Received: from relay.urz.uni-heidelberg.de (relay-eth.urz.uni-heidelberg.de [129.206.100.201]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f15Evmu08730; Mon, 5 Feb 2001 15:57:48 +0100 (MET) Received: from relay (relay.urz.uni-heidelberg.de [129.206.119.201]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with SMTP id PAA10159; Mon, 5 Feb 2001 15:57:21 +0100 (MET) Received: from RELAY.URZ.UNI-HEIDELBERG.DE by RELAY.URZ.UNI-HEIDELBERG.DE (LISTSERV-TCP/IP release 1.8b) with spool id 487243 for LATEX-L@RELAY.URZ.UNI-HEIDELBERG.DE; Mon, 5 Feb 2001 15:57:20 +0100 Received: from ix.urz.uni-heidelberg.de (mail.urz.uni-heidelberg.de [129.206.119.234]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id PAA10152 for ; Mon, 5 Feb 2001 15:57:19 +0100 (MET) Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by ix.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id PAA85284 for ; Mon, 5 Feb 2001 15:57:18 +0100 Received: from knatte.tninet.se (knatte.tninet.se [195.100.94.10]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with SMTP id f15EvGu08572 for ; Mon, 5 Feb 2001 15:57:16 +0100 (MET) Received: (qmail 25171 invoked from network); 5 Feb 2001 15:57:15 +0100 Received: from delenn.tninet.se (HELO algonet.se) (195.100.94.104) 
by knatte.tninet.se with SMTP; 5 Feb 2001 15:57:15 +0100 Received: from [195.100.226.134] (du134-226.ppp.su-anst.tninet.se [195.100.226.134]) by delenn.tninet.se (BLUETAIL Mail Robustifier 2.2.1) with ESMTP id 59298.385033.981delenn-s2 for ; Mon, 05 Feb 2001 15:57:13 +0100 X-Sender: haberg@pop.matematik.su.se (Unverified) References: J%ORG KNAPPEN's message of "Fri, 2 Feb 2001 13:00:06 +0100" <01JZMVN1N7XK0009XR@ALPHA.NTP.SPRINGER.DE> Mime-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by relay.urz.uni-heidelberg.de id PAA10153 Message-ID: Date: Mon, 5 Feb 2001 11:28:47 +0100 Reply-To: Mailing list for the LaTeX3 project Sender: Mailing list for the LaTeX3 project From: Hans Aberg Subject: Re: inputenc text (and/or math) To: Multiple recipients of list LATEX-L In-Reply-To:

At 09:05 -0500 2001/02/02, Michael John Downes wrote:
>> Yes, this was my intention. If I type "×" (the times sign in
>> Latin-1,2,3, and 4) I want that it works both in text and in math mode giving
>> something sensible (i.e. \textmultiply or \times resp.) I don't want to
>> have the command \times in text mode and \textmultiply in math mode necessarily
>
>The text/math ambiguity is a major problem for higher-level user
>interfaces like Scientific Word.

Should there not be a text-word mode, entirely dedicated to constructing natural language words, a text-symbol mode for non-math symbols, and a math mode for math typesetting?

>If the user enters \gamma + 1 without
>first starting a math formula, then the proper way for the software to
>write it is:
>
>  \textgamma \textplus 1
>
>instead of
>
>  $\gamma + 1$

Then an expression like \gamma + 1 without first starting a math formula would simply fail, as `+' and `1' are not used in forming natural language words.

Also, if there is a non-math use of the times sign in Latin-x, x in [1, 4], one would have to put it in the text-symbol environment.
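
For comparison, here is a minimal sketch of the behaviour asked for in the quoted text: one command that yields the text symbol in text and the math symbol in math, by testing the current mode. This is standard LaTeX2e, not the three-mode scheme suggested above. The name \anytimes is hypothetical; \texttimes is the text times sign from textcomp (the quoted message calls it \textmultiply).

```latex
\documentclass{article}
\usepackage{textcomp} % provides \texttimes; built into recent LaTeX kernels
% \anytimes is a hypothetical name used only for this sketch.
% The robust definition delays the mode test until typesetting time.
\DeclareRobustCommand\anytimes{%
  \ifmmode \times \else \texttimes \fi}
\begin{document}
A 3\,\anytimes\,4 grid (text symbol) and $3 \anytimes 4$ (math symbol).
\end{document}
```

An input encoding could map the 8-bit times sign to such a command, but the dispatch only looks at the mode; it does nothing about the wider text/math ambiguity raised for `+' and `1'.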
Hans Aberg 5-Feb-2001 23:28:56-GMT,3368;000000000001 Return-Path: Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by sunshine.math.utah.edu (8.9.3/8.9.3) with ESMTP id QAA17917 for ; Mon, 5 Feb 2001 16:28:54 -0700 (MST) Received: from relay.urz.uni-heidelberg.de (relay-eth.urz.uni-heidelberg.de [129.206.100.201]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f15NRSu26217; Tue, 6 Feb 2001 00:27:28 +0100 (MET) Received: from relay (relay.urz.uni-heidelberg.de [129.206.119.201]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with SMTP id AAA21569; Tue, 6 Feb 2001 00:26:08 +0100 (MET) Received: from RELAY.URZ.UNI-HEIDELBERG.DE by RELAY.URZ.UNI-HEIDELBERG.DE (LISTSERV-TCP/IP release 1.8b) with spool id 487403 for LATEX-L@RELAY.URZ.UNI-HEIDELBERG.DE; Tue, 6 Feb 2001 00:26:07 +0100 Received: from ix.urz.uni-heidelberg.de (mail.urz.uni-heidelberg.de [129.206.119.234]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id AAA21562 for ; Tue, 6 Feb 2001 00:26:06 +0100 (MET) Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by ix.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id AAA121750 for ; Tue, 6 Feb 2001 00:26:06 +0100 Received: from venus.open.ac.uk (venus.open.ac.uk [137.108.143.2]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f15NQ5u26048 for ; Tue, 6 Feb 2001 00:26:05 +0100 (MET) Received: from fell.open.ac.uk by venus.open.ac.uk via SMTP Local (Mailer 3.1) with ESMTP; Mon, 5 Feb 2001 23:26:03 +0000 Received: (from car2@localhost) by fell.open.ac.uk (8.9.3+Sun/8.9.1) id XAA16090; Mon, 5 Feb 2001 23:26:17 GMT X-Authentication-Warning: fell.open.ac.uk: car2 set sender to car2@fell.open.ac.uk using -f MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit References: X-Mailer: VM 6.76 under Emacs 20.7.1 Message-ID: <14975.13976.503037.872425@fell.open.ac.uk> Date: Mon, 5 Feb 2001 23:26:16 +0000 Reply-To: Mailing list for the LaTeX3 
project Sender: Mailing list for the LaTeX3 project From: Chris Rowley Subject: Re: glyph collections viz font encodings To: Multiple recipients of list LATEX-L In-Reply-To: Javier > > [Chris]-- differences in the metrics (these can cause major differences in > > the typesetting). > > Yes, differences in the metrics, which can reshape the whole document. Oh dear, yes! This does rather mess-up fine typesetting. My feeling is that since the fonts available are going to get more and more diverse (if only very slowly) the robust medium-term solution for a TeX-like typesetting engine is to add the ability to bundle the font-metrics with the document (possibly virtually when network font warehouses really exist). This is analogical to putting the glyph information into a ps or pdf form of a document. chris 6-Feb-2001 9:19:19-GMT,5804;000000000001 Return-Path: Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by sunshine.math.utah.edu (8.9.3/8.9.3) with ESMTP id CAA01133 for ; Tue, 6 Feb 2001 02:19:17 -0700 (MST) Received: from relay.urz.uni-heidelberg.de (relay-eth.urz.uni-heidelberg.de [129.206.100.201]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f169J7u06145; Tue, 6 Feb 2001 10:19:07 +0100 (MET) Received: from relay (relay.urz.uni-heidelberg.de [129.206.119.201]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with SMTP id KAA26383; Tue, 6 Feb 2001 10:17:06 +0100 (MET) Received: from RELAY.URZ.UNI-HEIDELBERG.DE by RELAY.URZ.UNI-HEIDELBERG.DE (LISTSERV-TCP/IP release 1.8b) with spool id 487762 for LATEX-L@RELAY.URZ.UNI-HEIDELBERG.DE; Tue, 6 Feb 2001 10:17:05 +0100 Received: from ix.urz.uni-heidelberg.de (mail.urz.uni-heidelberg.de [129.206.119.234]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id KAA26376 for ; Tue, 6 Feb 2001 10:17:04 +0100 (MET) Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by ix.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id KAA38784 for ; Tue, 6 Feb 
2001 10:17:04 +0100 Received: from nef.ens.fr (nef.ens.fr [129.199.96.32]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f169H4u05609 for ; Tue, 6 Feb 2001 10:17:04 +0100 (MET) Received: from clipper.ens.fr (clipper-gw.ens.fr [129.199.1.22]) by nef.ens.fr (8.10.1/1.01.28121999) with ESMTP id f169H3485465 for ; Tue, 6 Feb 2001 10:17:03 +0100 (CET) Received: from (ebrunet@localhost) by clipper.ens.fr (8.9.2/jb-1.1) Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit User-Agent: Mutt/1.2i Message-ID: <20010206101702.A5774@clipper.ens.fr> Date: Tue, 6 Feb 2001 10:17:03 +0100 Reply-To: Mailing list for the LaTeX3 project Sender: Mailing list for the LaTeX3 project From: Eric Brunet Subject: Re: default inputenc/fontenc tight to language To: Multiple recipients of list LATEX-L Sorry for replying late. Hey, in the internet age, a 4 days delay is a long time... Frank Mittelbach wrote: > what i mean is that most people write their document in a single input > encoding and do not switch that encoding (or even can switch) just > because they switch from one language to another. Sure. Mainly because usualy people don't switch from a language to another, or, if they do, it is usually languages with compatible encodings. But I imagine (maybe wrongly) that you need to switch encodings when writing an english-russian document. > Anyway, for font encodings a default setting different from the system default > (if necessary) does make sense and current babel already tries to do that, > though as Denis report shows not always successfully. I am happy to hear that. Now, about input encodings... > furthermore, because of the argument that the input encoding doesn't really > change "wenn ich jetzt in Deutsch schreibe" (both are latin1 as far as > this mail is concerned) so for english and german and french it should > probably be the same. ansinew because a lot of people use PCs? 
> or latin1
> because Linux is going to take over the world? or should it change in a
> year or two when the latter happens --- with the result that then older
> documents would compile incorrectly because they assume the no longer
> correct default?

I would go for latin1 by default, because it is a more widely accepted standard (in the sense that it is an ISO standard) than ansinew. But we are lucky: ansinew and latin1 are compatible, in the sense that latin1 is a subset of ansinew (there are 24 extra characters in ansinew, in the 130--159 range), so a source.tex composed in the ansinew encoding would be readable on a unix system, except for some very rare characters.

Probably the best of all worlds would be to advertise and document that latin1 is the default encoding (for standards compliance), and thus encourage people to use \oe or \dots or -- instead of the characters 156, 133 or 150, but silently accept all the extra ansinew characters so that careless Windows users don't get surprised.

What is sure is that once a default encoding is chosen, it will be hard to change it (the only way would probably be to change \documentclass into \documenttype :-)

> finally applying the wrong input encoding to a document not in that
> encoding results in typesetting errors but not in compilation errors.
> true, this can also happen if you explicitly specify the wrong encoding
> but this is a conscious act (or so we would hope) and not something that
> happens behind the scene

I have seen many beginners who start typing their document in French without declaring an inputenc, and do not realize at once that accents are missing in the document. I would not call forgetting a \usepackage declaration a conscious act.
> which reminds me: please take the list of languages babel currently supports > and attach to them input/font encoding defaults that would be suitable, i > would really be interested in see such a list (and have it disucssed) Oh, I am certainly not able to do that. If I was to make a choice, I would use the appropriate latinxxx encodings for each language, but I am certainly not qualified to choose for all those languages. Éric 6-Feb-2001 9:45:15-GMT,3222;000000000001 Return-Path: Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by sunshine.math.utah.edu (8.9.3/8.9.3) with ESMTP id CAA01594 for ; Tue, 6 Feb 2001 02:45:09 -0700 (MST) Received: from relay.urz.uni-heidelberg.de (relay-eth.urz.uni-heidelberg.de [129.206.100.201]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f169j0u13451; Tue, 6 Feb 2001 10:45:00 +0100 (MET) Received: from relay (relay.urz.uni-heidelberg.de [129.206.119.201]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with SMTP id KAA27264; Tue, 6 Feb 2001 10:43:32 +0100 (MET) Received: from RELAY.URZ.UNI-HEIDELBERG.DE by RELAY.URZ.UNI-HEIDELBERG.DE (LISTSERV-TCP/IP release 1.8b) with spool id 487821 for LATEX-L@RELAY.URZ.UNI-HEIDELBERG.DE; Tue, 6 Feb 2001 10:43:31 +0100 Received: from ix.urz.uni-heidelberg.de (mail.urz.uni-heidelberg.de [129.206.119.234]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id KAA27257 for ; Tue, 6 Feb 2001 10:43:30 +0100 (MET) Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by ix.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id KAA98728 for ; Tue, 6 Feb 2001 10:43:30 +0100 Received: from smtp.wanadoo.es (m1smtpisp02.wanadoo.es [62.36.220.21] (may be forged)) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f169hUu12915 for ; Tue, 6 Feb 2001 10:43:30 +0100 (MET) Received: from wanadoo.es (m1wmail1.wanadoo.es [62.36.220.41]) by smtp.wanadoo.es (8.10.2/8.10.2) with ESMTP id f169hTS20412 for ; Tue, 6 Feb 2001 10:43:29 
+0100 (MET) MIME-Version: 1.0 Content-Type: text/plain X-XaM3-API-Version: 1.1.11.1.5 X-SenderIP: 195.53.220.3 Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by relay.urz.uni-heidelberg.de id KAA27258 Message-ID: Date: Tue, 6 Feb 2001 10:43:29 +0100 Reply-To: Mailing list for the LaTeX3 project Sender: Mailing list for the LaTeX3 project From: jbezos Subject: Re: default inputenc/fontenc tight to language To: Multiple recipients of list LATEX-L > But we are lucky: ansinew and latin1 are compatible, in the sens that > latin1 is a subset of ansinew (there are 24 extra characters in ansinew, > in the 130--159 range), Not at all. In latin1, character in the range 130--159 are assigned to control characters (and hence they are not free), and Unix editors may use them. Further, a character in the ASCII range (I have not the tables at hands, but IIRC is right single quote) has been reassigned in ansinew. Regards Javier ______________________________________________________________________________ Consigue tu cuenta de correo universal y gratuita en http://webmail.wanadoo.es 6-Feb-2001 10:17:12-GMT,4752;000000000001 Return-Path: Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by sunshine.math.utah.edu (8.9.3/8.9.3) with ESMTP id DAA02163 for ; Tue, 6 Feb 2001 03:17:11 -0700 (MST) Received: from relay.urz.uni-heidelberg.de (relay-eth.urz.uni-heidelberg.de [129.206.100.201]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f16AH7u00989; Tue, 6 Feb 2001 11:17:07 +0100 (MET) Received: from relay (relay.urz.uni-heidelberg.de [129.206.119.201]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with SMTP id LAA28214; Tue, 6 Feb 2001 11:15:39 +0100 (MET) Received: from RELAY.URZ.UNI-HEIDELBERG.DE by RELAY.URZ.UNI-HEIDELBERG.DE (LISTSERV-TCP/IP release 1.8b) with spool id 487890 for LATEX-L@RELAY.URZ.UNI-HEIDELBERG.DE; Tue, 6 Feb 2001 11:15:38 +0100 Received: from ix.urz.uni-heidelberg.de 
(mail.urz.uni-heidelberg.de [129.206.119.234]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id LAA28207 for ; Tue, 6 Feb 2001 11:15:36 +0100 (MET) Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by ix.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id LAA44032 for ; Tue, 6 Feb 2001 11:15:37 +0100 Received: from nef.ens.fr (nef.ens.fr [129.199.96.32]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f16AFbu29944 for ; Tue, 6 Feb 2001 11:15:37 +0100 (MET) Received: from clipper.ens.fr (clipper-gw.ens.fr [129.199.1.22]) by nef.ens.fr (8.10.1/1.01.28121999) with ESMTP id f16AFa492364 for ; Tue, 6 Feb 2001 11:15:36 +0100 (CET) Received: from (ebrunet@localhost) by clipper.ens.fr (8.9.2/jb-1.1) Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit User-Agent: Mutt/1.2i Message-ID: <20010206111535.A18424@clipper.ens.fr> Date: Tue, 6 Feb 2001 11:15:35 +0100 Reply-To: Mailing list for the LaTeX3 project Sender: Mailing list for the LaTeX3 project From: Eric Brunet Subject: Re: default inputenc/fontenc tight to language To: Multiple recipients of list LATEX-L

jbezos wrote:
> Not at all. In latin1, character in the range 130--159
> are assigned to control characters (and hence

I wouldn't say it like that: the 128--159 range is reserved so that if some document passes through a buggy program that strips the eighth bit, then the resulting document doesn't have any extra control characters (in the 0--31 range) with some ``interesting'' properties.

> they are not free), and Unix editors may
> use them.

No, they may not use them in a text file, as they are not text characters. On the other hand, all the text editors I know are perfectly able to handle binary files, or meaningless characters in a text file. If someone runs any editor I know of on an ansinew file, he will see some strange codes in lieu of the ansinew characters, and that would be it.
Do you know of any editor which misbehaves?

And finally, the important thing is that LaTeX would get the document right (which it would).

> Further, a character in the ASCII
> range (I have not the tables at hands, but IIRC is
> right single quote) has been reassigned in ansinew.

$ diff ansinew.def latin1.def| grep DeclareInputText
< \DeclareInputText{130}{\quotesinglbase}
< \DeclareInputText{131}{\textflorin}
< \DeclareInputText{132}{\quotedblbase}
< \DeclareInputText{133}{\dots}
< \DeclareInputText{134}{\dag}
< \DeclareInputText{135}{\ddag}
< \DeclareInputText{136}{\^{}}
< \DeclareInputText{137}{\textperthousand}
< \DeclareInputText{138}{\v S}
< \DeclareInputText{139}{\guilsinglleft}
< \DeclareInputText{140}{\OE}
< \DeclareInputText{145}{\textquoteleft}
< \DeclareInputText{146}{\textquoteright}
< \DeclareInputText{147}{\textquotedblleft}
< \DeclareInputText{148}{\textquotedblright}
< \DeclareInputText{149}{\textbullet}
< \DeclareInputText{150}{\textendash}
< \DeclareInputText{151}{\textemdash}
< \DeclareInputText{152}{\~{}}
< \DeclareInputText{153}{\texttrademark}
< \DeclareInputText{154}{\v s}
< \DeclareInputText{155}{\guilsinglright}
< \DeclareInputText{156}{\oe}
< \DeclareInputText{159}{\"Y}

well, it doesn't show in the def file. IIRC, the story is that the ascii single quote character looks vertical in windows fonts, and so people prefer (incorrectly) to use the character 146 as an apostrophe.
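
The idea discussed in this thread, declaring latin1 while tolerating a few ansinew characters, can be tried by hand with the same inputenc declaration that appears in the diff. A minimal sketch, assuming an 8-bit engine such as pdfLaTeX; the two slots added here are merely examples picked from the diff, not a proposed default.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage[latin1]{inputenc}
% latin1.def defines nothing in the 128--159 range, so by default these
% bytes raise an inputenc error. Add two ansinew slots by hand.
\DeclareInputText{156}{\oe}
\DeclareInputText{146}{\textquoteright}
\begin{document}
% ^^9c and ^^92 are TeX's ASCII notation for bytes 156 and 146, so this
% file stays 7-bit clean while still exercising the declarations above.
c^^9cur, l^^92apostrophe
\end{document}
```

Without the two \DeclareInputText lines the same file stops with an inputenc error about an undefined keyboard character, which is the behaviour a pure latin1 default would give for these bytes.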
Éric 6-Feb-2001 11:21:12-GMT,5154;000000000001 Return-Path: Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by sunshine.math.utah.edu (8.9.3/8.9.3) with ESMTP id EAA03378 for ; Tue, 6 Feb 2001 04:21:10 -0700 (MST) Received: from relay.urz.uni-heidelberg.de (relay-eth.urz.uni-heidelberg.de [129.206.100.201]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f16BKuu00562; Tue, 6 Feb 2001 12:20:56 +0100 (MET) Received: from relay (relay.urz.uni-heidelberg.de [129.206.119.201]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with SMTP id MAA00088; Tue, 6 Feb 2001 12:20:42 +0100 (MET) Received: from RELAY.URZ.UNI-HEIDELBERG.DE by RELAY.URZ.UNI-HEIDELBERG.DE (LISTSERV-TCP/IP release 1.8b) with spool id 488031 for LATEX-L@RELAY.URZ.UNI-HEIDELBERG.DE; Tue, 6 Feb 2001 12:20:41 +0100 Received: from ix.urz.uni-heidelberg.de (mail.urz.uni-heidelberg.de [129.206.119.234]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id MAA00079 for ; Tue, 6 Feb 2001 12:20:40 +0100 (MET) Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by ix.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id MAA101846 for ; Tue, 6 Feb 2001 12:20:41 +0100 Received: from moutvdom00.kundenserver.de (moutvdom00.kundenserver.de [195.20.224.149]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f16BKfu00306 for ; Tue, 6 Feb 2001 12:20:41 +0100 (MET) Received: from [195.20.224.204] (helo=mrvdom00.kundenserver.de) by moutvdom00.kundenserver.de with esmtp (Exim 2.12 #2) id 14Q6Aq-0006Uq-00 for LATEX-L@urz.uni-heidelberg.de; Tue, 6 Feb 2001 12:20:40 +0100 Received: from dialin360.zdv.uni-mainz.de ([134.93.175.60] helo=istrati.zdv.uni-mainz.de) by mrvdom00.kundenserver.de with esmtp (Exim 2.12 #2) id 14Q6AE-0000yE-00 for LATEX-L@URZ.UNI-HEIDELBERG.DE; Tue, 6 Feb 2001 12:20:02 +0100 Received: (from latex3@localhost) by istrati.zdv.uni-mainz.de (8.9.3/8.9.3/SuSE Linux 8.9.3-0.1) id MAA20562; Tue, 6 Feb 2001 12:12:11 +0100 
X-Authentication-Warning: istrati.zdv.uni-mainz.de: latex3 set sender to frank@mittelbach-online.de using -f MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit References: <20010206101702.A5774@clipper.ens.fr> X-Mailer: VM 6.75 under Emacs 20.4.1 Message-ID: <14975.56331.365469.731085@istrati.zdv.uni-mainz.de> Date: Tue, 6 Feb 2001 12:12:11 +0100 Reply-To: Mailing list for the LaTeX3 project Sender: Mailing list for the LaTeX3 project From: Frank Mittelbach Subject: Re: default inputenc/fontenc tight to language To: Multiple recipients of list LATEX-L In-Reply-To: <20010206101702.A5774@clipper.ens.fr>

Eric,

> Sorry for replying late. Hey, in the internet age, a 4-day delay is a
> long time...

perhaps but sometimes on this list half a year is fast.

> What is sure, is that once a default encoding is chosen, it will be hard
> to change it (the only way would probably be to change \documentclass
> into \documenttype :-)

which is a problem at this current point in time as well.

> > finally applying the wrong input encoding to a document not in that
> > encoding results in typesetting errors but not in compilation errors.
> > true, this can also happen if you explicitly specify the wrong encoding
> > but this is a conscious act (or so we would hope) and not something that
> > happens behind the scenes
>
> I have seen many beginners that begin typing in french their document
> without declaring an inputenc, and not realizing at once that accents are
> missing in the document. I would not call forgetting an \usepackage
> declaration a conscious act.

this is a problem agreed, because of this unfortunate fact of letting 8bit loose if no inputenc is specified.

but providing an input encoding automatically would be a big compatibility problem and within 2e kernel we have a firm policy of not doing this (and if we move it to a package (aka inputenc) then you are back to your above problem).

> > which reminds me: please take the list of languages babel currently supports
> > and attach to them input/font encoding defaults that would be suitable, i
> > would really be interested in seeing such a list (and have it discussed)
>
> Oh, I am certainly not able to do that. If I was to make a choice, I
> would use the appropriate latinxxx encodings for each language, but I am
> certainly not qualified to choose for all those languages.

who will? the user groups? for many languages there isn't a user group

anyway, for the code that i'm currently writing i've added a way to specify inputencodings by language (or script) at least for the moment

frank

6-Feb-2001 11:42:45-GMT,4124;000000000001 Return-Path: Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by sunshine.math.utah.edu (8.9.3/8.9.3) with ESMTP id EAA03798 for ; Tue, 6 Feb 2001 04:42:43 -0700 (MST) Received: from relay.urz.uni-heidelberg.de (relay-eth.urz.uni-heidelberg.de [129.206.100.201]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f16BgKu06629; Tue, 6 Feb 2001 12:42:20 +0100 (MET) Received: from relay (relay.urz.uni-heidelberg.de [129.206.119.201]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with SMTP id MAA00851; Tue, 6 Feb 2001 12:42:13 +0100 (MET) Received: from RELAY.URZ.UNI-HEIDELBERG.DE by RELAY.URZ.UNI-HEIDELBERG.DE (LISTSERV-TCP/IP release 1.8b) with spool id 488108 for LATEX-L@RELAY.URZ.UNI-HEIDELBERG.DE; Tue, 6 Feb 2001 12:42:12 +0100 Received: from ix.urz.uni-heidelberg.de (mail.urz.uni-heidelberg.de [129.206.119.234]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id MAA00836 for ; Tue, 6 Feb 2001 12:41:58 +0100 (MET) Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by ix.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id MAA77058 for ; Tue, 6 Feb 2001 12:41:58 +0100 Received: from moutvdom00.kundenserver.de (moutvdom00.kundenserver.de [195.20.224.149]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) 
with ESMTP id f16Bfwu06474 for ; Tue, 6 Feb 2001 12:41:58 +0100 (MET) Received: from [195.20.224.204] (helo=mrvdom00.kundenserver.de) by moutvdom00.kundenserver.de with esmtp (Exim 2.12 #2) id 14Q6VR-0000Kc-00 for LATEX-L@urz.uni-heidelberg.de; Tue, 6 Feb 2001 12:41:57 +0100 Received: from dialin337.zdv.uni-mainz.de ([134.93.175.37] helo=istrati.zdv.uni-mainz.de) by mrvdom00.kundenserver.de with esmtp (Exim 2.12 #2) id 14Q6V2-0000Op-00 for LATEX-L@URZ.UNI-HEIDELBERG.DE; Tue, 6 Feb 2001 12:41:33 +0100 Received: (from latex3@localhost) by istrati.zdv.uni-mainz.de (8.9.3/8.9.3/SuSE Linux 8.9.3-0.1) id MAA20676; Tue, 6 Feb 2001 12:34:47 +0100 X-Authentication-Warning: istrati.zdv.uni-mainz.de: latex3 set sender to frank@mittelbach-online.de using -f MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit References: X-Mailer: VM 6.75 under Emacs 20.4.1 Message-ID: <14975.57687.924527.572758@istrati.zdv.uni-mainz.de> Date: Tue, 6 Feb 2001 12:34:47 +0100 Reply-To: Mailing list for the LaTeX3 project Sender: Mailing list for the LaTeX3 project From: Frank Mittelbach Subject: Re: glyph collections viz font encodings To: Multiple recipients of list LATEX-L In-Reply-To:

Javier, you wrote:

> > do you have any other issues than the two above? you mentioned them as "for
> > example".
>
> They are in part related to the fact that a language or a script
> can provide a set of default encodings. (Note: in the current draft
> for Lambda there are files for both languages and scripts).
> But I think that:
>
> > also please note that my code (after your fix:-) does both: you can still
> > specify a single encoding and then only that encoding will get used ie you get
> > the situation as it is now where the user has total control (assuming that fd
> > files are the same).
>
> is the best solution.

i don't yet see that it is the best solution, and i would be glad if you tried to explain anything that is potentially a problem if one thinks about languages and scripts.

yes i know that you differentiate between language and script and i'm currently musing over it :-)

cheers
frank

6-Feb-2001 11:42:46-GMT,5000;000000000001 Return-Path: Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by sunshine.math.utah.edu (8.9.3/8.9.3) with ESMTP id EAA03800 for ; Tue, 6 Feb 2001 04:42:44 -0700 (MST) Received: from relay.urz.uni-heidelberg.de (relay-eth.urz.uni-heidelberg.de [129.206.100.201]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f16BgEu06575; Tue, 6 Feb 2001 12:42:14 +0100 (MET) Received: from relay (relay.urz.uni-heidelberg.de [129.206.119.201]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with SMTP id MAA00834; Tue, 6 Feb 2001 12:41:57 +0100 (MET) Received: from RELAY.URZ.UNI-HEIDELBERG.DE by RELAY.URZ.UNI-HEIDELBERG.DE (LISTSERV-TCP/IP release 1.8b) with spool id 488103 for LATEX-L@RELAY.URZ.UNI-HEIDELBERG.DE; Tue, 6 Feb 2001 12:41:56 +0100 Received: from ix.urz.uni-heidelberg.de (mail.urz.uni-heidelberg.de [129.206.119.234]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id MAA00827 for ; Tue, 6 Feb 2001 12:41:55 +0100 (MET) Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by ix.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id MAA44494 for ; Tue, 6 Feb 2001 12:41:54 +0100 Received: from moutvdom01.kundenserver.de (moutvdom01.kundenserver.de [195.20.224.200]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f16Bfsu06442 for ; Tue, 6 Feb 2001 12:41:54 +0100 (MET) Received: from [195.20.224.204] (helo=mrvdom00.kundenserver.de) by moutvdom01.kundenserver.de with esmtp (Exim 2.12 #2) id 14Q6VH-0006yn-00 for LATEX-L@urz.uni-heidelberg.de; Tue, 6 Feb 2001 12:41:47 +0100 Received: from dialin337.zdv.uni-mainz.de ([134.93.175.37] helo=istrati.zdv.uni-mainz.de) by 
mrvdom00.kundenserver.de with esmtp (Exim 2.12 #2) id 14Q6V6-0000Op-00 for LATEX-L@URZ.UNI-HEIDELBERG.DE; Tue, 6 Feb 2001 12:41:36 +0100 Received: (from latex3@localhost) by istrati.zdv.uni-mainz.de (8.9.3/8.9.3/SuSE Linux 8.9.3-0.1) id MAA20670; Tue, 6 Feb 2001 12:31:39 +0100 X-Authentication-Warning: istrati.zdv.uni-mainz.de: latex3 set sender to frank@mittelbach-online.de using -f MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit References: X-Mailer: VM 6.75 under Emacs 20.4.1 Message-ID: <14975.57499.573295.373950@istrati.zdv.uni-mainz.de> Date: Tue, 6 Feb 2001 12:31:39 +0100 Reply-To: Mailing list for the LaTeX3 project Sender: Mailing list for the LaTeX3 project From: Frank Mittelbach Subject: Re: glyph collections viz font encodings To: Multiple recipients of list LATEX-L In-Reply-To: Javier, > Actually, this is not a criticism to this approach, just an issue. While > there are free PostScript ot1cmr fonts, there are only MF t1cmr ones, which > to me is a huge difference. Sometimes I combine Palatino (T1) and cmtt > (OT1), > and \selectfont{T1,OT1} is not enough. The solution I took in my macros was > to allow explicit declarations like: > \SetFontEnconding{cmtt}{OT1} found that by now in the code (it wasn't in the general documentation) well, can we analyse that a bit? what exactly is the problem here? or, say, why do you want to do that? it seems to me that this all boils down to "i want to ensure that all is Type1" so that i get proper pdf files. or am i wrong? perhaps i'm wrong and there are other reasons , but the above seems to me the kind of natural reason. so perhaps it is not the encoding you really want to force but to prevent selection of certain fonts. in that case, wouldn't it be better if we could come up with a different method of specifying this? 
assume for a moment that you have a font family with a large number of glyphs (say CM fonts :-) then a setting like yours

\SetFontEnconding{cmtt}{OT1}

would not work very well in a document with spanish and russian text. why? because there are russian CM fonts; at least they are identified as cmr and cmtt and so on. (and i think this is not really wrong even though perhaps not absolutely right either. but they have been made to look and fit the latin CM fonts). so fixing the encoding to OT1 would throw typewriter in the russian part of the document off track since that one would need LCY encoding.

i don't know what is the right approach but if it is really something related to type1 viz MF fonts then perhaps the whole thing should and can be done differently

what do you think?

frank

6-Feb-2001 12:58:15-GMT,3864;000000000001 Return-Path: Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by sunshine.math.utah.edu (8.9.3/8.9.3) with ESMTP id FAA05144 for ; Tue, 6 Feb 2001 05:58:13 -0700 (MST) Received: from relay.urz.uni-heidelberg.de (relay-eth.urz.uni-heidelberg.de [129.206.100.201]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f16Cvvu27028; Tue, 6 Feb 2001 13:57:57 +0100 (MET) Received: from relay (relay.urz.uni-heidelberg.de [129.206.119.201]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with SMTP id NAA02845; Tue, 6 Feb 2001 13:57:41 +0100 (MET) Received: from RELAY.URZ.UNI-HEIDELBERG.DE by RELAY.URZ.UNI-HEIDELBERG.DE (LISTSERV-TCP/IP release 1.8b) with spool id 488234 for LATEX-L@RELAY.URZ.UNI-HEIDELBERG.DE; Tue, 6 Feb 2001 13:57:40 +0100 Received: from ix.urz.uni-heidelberg.de (mail.urz.uni-heidelberg.de [129.206.119.234]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id NAA02838 for ; Tue, 6 Feb 2001 13:57:39 +0100 (MET) Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by ix.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id NAA69124 for ; Tue, 6 Feb 2001 13:57:37 
+0100 Received: from zambeze.ujf-grenoble.fr (zambeze.ujf-grenoble.fr [152.77.2.3]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f16Cvau26901 for ; Tue, 6 Feb 2001 13:57:37 +0100 (MET) Received: from mozart.ujf-grenoble.Fr (mozart.ujf-grenoble.fr [193.54.241.5]) by zambeze.ujf-grenoble.fr (Pro-8.9.3/8.9.3/Configured by AD & JE 25/10/1999) with ESMTP id NAA22996 for ; Tue, 6 Feb 2001 13:57:34 +0100 (MET) Received: (from bouche@localhost) by mozart.ujf-grenoble.Fr (8.9.3/8.8.5) id NAA02455; Tue, 6 Feb 2001 13:57:33 +0100 (MET) References: <20010206101702.A5774@clipper.ens.fr> X-Mailer: VM 6.22 under 19.15 XEmacs Lucid Mime-Version: 1.0 (generated by tm-edit 7.106) Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit Message-ID: <200102061257.NAA02455@mozart.ujf-grenoble.Fr> Date: Tue, 6 Feb 2001 13:57:33 +0100 Reply-To: Mailing list for the LaTeX3 project Sender: Mailing list for the LaTeX3 project From: Thierry Bouche Subject: Re: default inputenc/fontenc tight to language To: Multiple recipients of list LATEX-L In-Reply-To: <20010206101702.A5774@clipper.ens.fr> Concernant « Re: default inputenc/fontenc tight to language », Eric Brunet écrit : « » I should go to the latin1 by default, because it is somehow a more » accepted standard (in the sens that it is an ISO standard) than ansinew. You mean that people with macintoshes should have something do declare in their files and others not? And that this would be forced somewhere in latex or babel kernel? » I have seen many beginners that begin typing in french their document » without declaring an inputenc, and not realizing at once that accents are » missing in the document. Well, if they have T1/fontenc, they won't see anything wrong. As long as french is concerned, making latin1 the default is restrictive as T1 is mandatory for correct hyphenation/kerning and T1 coincides with latin1. 
You can live with only « » active with current latex, I don't see a good reason for requiring all >127 actives. Thierry Bouche __ « Ils vivent pour vivre, et nous, hélas ! nous vivons pour savoir. » Charles Baudelaire, Paris. 6-Feb-2001 13:09:04-GMT,3074;000000000001 Return-Path: Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by sunshine.math.utah.edu (8.9.3/8.9.3) with ESMTP id GAA05364 for ; Tue, 6 Feb 2001 06:09:03 -0700 (MST) Received: from relay.urz.uni-heidelberg.de (relay-eth.urz.uni-heidelberg.de [129.206.100.201]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f16D8qu02998; Tue, 6 Feb 2001 14:08:52 +0100 (MET) Received: from relay (relay.urz.uni-heidelberg.de [129.206.119.201]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with SMTP id OAA03178; Tue, 6 Feb 2001 14:08:45 +0100 (MET) Received: from RELAY.URZ.UNI-HEIDELBERG.DE by RELAY.URZ.UNI-HEIDELBERG.DE (LISTSERV-TCP/IP release 1.8b) with spool id 488275 for LATEX-L@RELAY.URZ.UNI-HEIDELBERG.DE; Tue, 6 Feb 2001 14:08:44 +0100 Received: from ix.urz.uni-heidelberg.de (mail.urz.uni-heidelberg.de [129.206.119.234]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id OAA03171 for ; Tue, 6 Feb 2001 14:08:43 +0100 (MET) Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by ix.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id OAA18134 for ; Tue, 6 Feb 2001 14:08:43 +0100 Received: from nag.co.uk (openmath.nag.co.uk [62.232.54.144]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f16D8fu02936 for ; Tue, 6 Feb 2001 14:08:41 +0100 (MET) Received: (from davidc@localhost) by nag.co.uk (AIX4.2/UCB 8.7/8.7) id NAA23622; Tue, 6 Feb 2001 13:08:21 GMT References: <20010206101702.A5774@clipper.ens.fr> <200102061257.NAA02455@mozart.ujf-grenoble.Fr> Message-ID: <200102061308.NAA23622@nag.co.uk> Date: Tue, 6 Feb 2001 13:08:21 GMT Reply-To: Mailing list for the LaTeX3 project Sender: Mailing list for the LaTeX3 project From: 
David Carlisle Subject: Re: default inputenc/fontenc tight to language To: Multiple recipients of list LATEX-L In-Reply-To: <200102061257.NAA02455@mozart.ujf-grenoble.Fr> (message from Thierry Bouche on Tue, 6 Feb 2001 13:57:33 +0100) > I don't see a good reason for requiring all >127 actives. For French. For English you could use 7bit TeX2, but in general to support multiple input encodings that are different to the font encodings you need them active. Changing catcodes anywhere causes multiple problems (It's not at all clear that the slight convenience feature of the \verb command was worth all the user errors and confusion it causes with \verb being used in arguments). So if you can't change catcodes and some encodings need the characters active, the only solution appeared to be (back then) that all characters above 127 should be active. David 6-Feb-2001 15:08:02-GMT,4158;000000000001 Return-Path: Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by sunshine.math.utah.edu (8.9.3/8.9.3) with ESMTP id IAA07979 for ; Tue, 6 Feb 2001 08:08:01 -0700 (MST) Received: from relay.urz.uni-heidelberg.de (relay-eth.urz.uni-heidelberg.de [129.206.100.201]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f16F7qu07622; Tue, 6 Feb 2001 16:07:52 +0100 (MET) Received: from relay (relay.urz.uni-heidelberg.de [129.206.119.201]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with SMTP id QAA07250; Tue, 6 Feb 2001 16:07:33 +0100 (MET) Received: from RELAY.URZ.UNI-HEIDELBERG.DE by RELAY.URZ.UNI-HEIDELBERG.DE (LISTSERV-TCP/IP release 1.8b) with spool id 488365 for LATEX-L@RELAY.URZ.UNI-HEIDELBERG.DE; Tue, 6 Feb 2001 16:07:33 +0100 Received: from ix.urz.uni-heidelberg.de (mail.urz.uni-heidelberg.de [129.206.119.234]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id QAA07243 for ; Tue, 6 Feb 2001 16:07:31 +0100 (MET) Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by ix.urz.uni-heidelberg.de 
(8.8.8/8.8.8) with ESMTP id QAA22976 for ; Tue, 6 Feb 2001 16:07:32 +0100 Received: from nef.ens.fr (nef.ens.fr [129.199.96.32]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f16F7Vu07390 for ; Tue, 6 Feb 2001 16:07:31 +0100 (MET) Received: from clipper.ens.fr (clipper-gw.ens.fr [129.199.1.22]) by nef.ens.fr (8.10.1/1.01.28121999) with ESMTP id f16F7U423344 for ; Tue, 6 Feb 2001 16:07:30 +0100 (CET) Received: from (ebrunet@localhost) by clipper.ens.fr (8.9.2/jb-1.1) Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit User-Agent: Mutt/1.2i Message-ID: <20010206160729.B14544@clipper.ens.fr> Date: Tue, 6 Feb 2001 16:07:29 +0100 Reply-To: Mailing list for the LaTeX3 project Sender: Mailing list for the LaTeX3 project From: Eric Brunet Subject: Re: default inputenc/fontenc tight to language To: Multiple recipients of list LATEX-L

Thierry Bouche wrote:

> You mean that people with macintoshes should have something to declare
> in their files and others not? And that this would be forced somewhere
> in latex or babel kernel?

Well, macintosh users have to declare something now, don't they? And setting a default wouldn't change that; it is not as if a default would force macintosh users to have a preamble longer than what they now have. It would be exactly the same thing.

But if we declare a default, we might shorten the preamble for all the macintosh users, or for all the unix plus all the windows users (thanks to the near compatibility of latin1 and ansinew). So it is possible to make latex easier for a large majority without making it more difficult for the minority. I cannot see how this could be regarded as a bad solution.

> Well, if they have T1/fontenc, they won't see anything wrong. As long
> as french is concerned, making latin1 the default is restrictive as T1
> is mandatory for correct hyphenation/kerning and T1 coincides with
> latin1.
You can live with only « » active with current latex, I don't > see a good reason for requiring all >127 actives. So you suggest that unix (or windows) users should just declare \usepackage[T1]{fontenc} without declaring any inputenc package (relying on the fact that the T1 layout reproduce the latin1 layout), and that macintosh users should declare \usepackage[T1]{fontenc} and \usepackage[applemac]{inputenc} (because otherwise it would not work) ? Well, I propose exactly the same thing, except that with my proposal, the characters above 127 would __moreover__ be correctly handled. Éric Brunet 6-Feb-2001 15:14:46-GMT,4574;000000000001 Return-Path: Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by sunshine.math.utah.edu (8.9.3/8.9.3) with ESMTP id IAA08162 for ; Tue, 6 Feb 2001 08:14:45 -0700 (MST) Received: from relay.urz.uni-heidelberg.de (relay-eth.urz.uni-heidelberg.de [129.206.100.201]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f16FEfu11389; Tue, 6 Feb 2001 16:14:41 +0100 (MET) Received: from relay (relay.urz.uni-heidelberg.de [129.206.119.201]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with SMTP id QAA07663; Tue, 6 Feb 2001 16:14:34 +0100 (MET) Received: from RELAY.URZ.UNI-HEIDELBERG.DE by RELAY.URZ.UNI-HEIDELBERG.DE (LISTSERV-TCP/IP release 1.8b) with spool id 488424 for LATEX-L@RELAY.URZ.UNI-HEIDELBERG.DE; Tue, 6 Feb 2001 16:14:33 +0100 Received: from ix.urz.uni-heidelberg.de (mail.urz.uni-heidelberg.de [129.206.119.234]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id QAA07656 for ; Tue, 6 Feb 2001 16:14:32 +0100 (MET) Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by ix.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id QAA40298 for ; Tue, 6 Feb 2001 16:14:33 +0100 Received: from nef.ens.fr (nef.ens.fr [129.199.96.32]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f16Eu6u01183 for ; Tue, 6 Feb 2001 15:56:06 +0100 (MET) Received: from 
clipper.ens.fr (clipper-gw.ens.fr [129.199.1.22]) by nef.ens.fr (8.10.1/1.01.28121999) with ESMTP id f16Eu5421915 for ; Tue, 6 Feb 2001 15:56:05 +0100 (CET) Received: from (ebrunet@localhost) by clipper.ens.fr (8.9.2/jb-1.1) Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit User-Agent: Mutt/1.2i Message-ID: <20010206155604.A14544@clipper.ens.fr> Date: Tue, 6 Feb 2001 15:56:04 +0100 Reply-To: Mailing list for the LaTeX3 project Sender: Mailing list for the LaTeX3 project From: Eric Brunet Subject: Re: default inputenc/fontenc tight to language To: Multiple recipients of list LATEX-L Frank wrote: > this is a problem agreed, because of this unfortunate fact of letting 8bit > loose if no inputenc is specified. > but providing an input encoding automatically would be a big compatibility > problem and within 2e kernel we have a firm policy of not doing this (and if > we move it to a package (aka inputenc) then you are back to your above > problem). Oh, that I understand very well that the 2e kernel should be always perfectly backward compatible. It is a good policy. However, I believe it is not true of the packages: if two people use two different versions of the same package, it is allowed that their document differ in the end for the same source, I think. That is why I was proposing in the beginning of this thread to put this new facility in a new version of the babel package. After all, the kernel has no way to determine a good default input encoding without knowing first which is the language of the document, and thus it makes sense to let babel choose the default encoding. In this way the 2e kernel stays compatible, only babel user see a difference. Moreover, the change wouldn't disturb anybody: if someone doesn't hear about the change, he would probably keep on declaring \usepackage[something]{inputenc} and might never become aware that there is a default encoding for the language he is using. 
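
For reference, the explicit declarations that a babel-chosen default would make optional look roughly like the sketch below. The choice of `latin1` and `french` here is only an assumption made for illustration, not a proposed default, and the body deliberately uses the 7-bit form \'e so that the file itself does not depend on any 8-bit encoding.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}       % font encoding with accented letters as single glyphs
\usepackage[latin1]{inputenc}  % assumed encoding of the source file (illustrative)
\usepackage[french]{babel}     % the language that would pick the default encoding
\begin{document}
Un exemple accentu\'e.         % 7-bit input, valid whatever the file encoding
\end{document}
```

Under the proposal, the inputenc line could be left out by users whose files match the language's default; users with another encoding (applemac, for instance) would keep declaring it exactly as they do now.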
> will? the user groups? for many lanugages there isn't a user group If not all languages have their newsgroups, I think you can find them in comp.text.tex. If not, somebody must have written all those language files for babel. This person might be qualified to choose. In last recourse, I would choose the suitable latinxxx for the language. Or maybe leave it to ascii till someone complains; we don't need to give a default to all language in one go. > anyway, for the code that i'm currently writing i've added a way to specify > inputencodings by language (or script) at least for the moment Yes, that is needed anyway, and it is excellent news. But it would be so much nicer to have a default value. Éric 6-Feb-2001 16:10:29-GMT,3737;000000000001 Return-Path: Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by sunshine.math.utah.edu (8.9.3/8.9.3) with ESMTP id JAA09874 for ; Tue, 6 Feb 2001 09:10:27 -0700 (MST) Received: from relay.urz.uni-heidelberg.de (relay-eth.urz.uni-heidelberg.de [129.206.100.201]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f16G9wu01075; Tue, 6 Feb 2001 17:09:58 +0100 (MET) Received: from relay (relay.urz.uni-heidelberg.de [129.206.119.201]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with SMTP id RAA10162; Tue, 6 Feb 2001 17:09:42 +0100 (MET) Received: from RELAY.URZ.UNI-HEIDELBERG.DE by RELAY.URZ.UNI-HEIDELBERG.DE (LISTSERV-TCP/IP release 1.8b) with spool id 488822 for LATEX-L@RELAY.URZ.UNI-HEIDELBERG.DE; Tue, 6 Feb 2001 17:09:41 +0100 Received: from ix.urz.uni-heidelberg.de (mail.urz.uni-heidelberg.de [129.206.119.234]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id RAA10129 for ; Tue, 6 Feb 2001 17:09:35 +0100 (MET) Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by ix.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id RAA29346 for ; Tue, 6 Feb 2001 17:09:36 +0100 Received: from csc.albany.edu (sarah.albany.edu [169.226.1.103]) by 
relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f16G9Zu00834 for ; Tue, 6 Feb 2001 17:09:36 +0100 (MET) Received: from pluto.math.albany.edu (pluto.math.albany.edu [169.226.23.44]) by csc.albany.edu (8.9.3/8.9.3) with ESMTP id LAA00378 for ; Tue, 6 Feb 2001 11:09:11 -0500 (EST) Received: (from hammond@localhost) by pluto.math.albany.edu (8.9.3/8.9.3) id LAA21018 for LATEX-L@URZ.UNI-HEIDELBERG.DE; Tue, 6 Feb 2001 11:09:10 -0500 (EST) Message-ID: <200102061609.LAA21018@pluto.math.albany.edu> Date: Tue, 6 Feb 2001 11:09:10 -0500 Reply-To: Mailing list for the LaTeX3 project Sender: Mailing list for the LaTeX3 project From: "William F. Hammond" Subject: Re: default inputenc/fontenc tight to language To: Multiple recipients of list LATEX-L

Just out of curiosity, I'm wondering what those here think about unicode and, in particular:

1. Is its concept of character -- basically unsigned 32 bit integer -- durable for, say, the next 100 years? (As I read the discussion here, I think not.)

2. Do we think that 2^32 is a wise upper bound? (This question vanishes if we think that representing characters as integers, rather than as more complicated data structures, is inadequate.)

Unicode is directly relevant to the future of LaTeX to the extent that LaTeX is going to be robust for formatting XML document types because normal document content can consist of arbitrary sequences of unicode characters.

XML systems are designed to make decisions only where markup occurs. It is reasonable for an XML processor writing in a typesetting language to know the markup ancestry of a character, e.g., whether it is within a math zone, but not reasonable -- unless the processor, like David Carlisle's xmltex, is a TeX thing -- for it to know that a particular character must have \ensuremath applied.

I note that in GNU Emacs these days characters can have property lists.

Thanks for your thoughts.
-- Bill 6-Feb-2001 16:17:46-GMT,3355;000000000001 Return-Path: Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by sunshine.math.utah.edu (8.9.3/8.9.3) with ESMTP id JAA10169 for ; Tue, 6 Feb 2001 09:17:44 -0700 (MST) Received: from relay.urz.uni-heidelberg.de (relay-eth.urz.uni-heidelberg.de [129.206.100.201]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f16GHPu03804; Tue, 6 Feb 2001 17:17:25 +0100 (MET) Received: from relay (relay.urz.uni-heidelberg.de [129.206.119.201]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with SMTP id RAA10339; Tue, 6 Feb 2001 17:17:19 +0100 (MET) Received: from RELAY.URZ.UNI-HEIDELBERG.DE by RELAY.URZ.UNI-HEIDELBERG.DE (LISTSERV-TCP/IP release 1.8b) with spool id 488866 for LATEX-L@RELAY.URZ.UNI-HEIDELBERG.DE; Tue, 6 Feb 2001 17:17:19 +0100 Received: from ix.urz.uni-heidelberg.de (mail.urz.uni-heidelberg.de [129.206.119.234]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id RAA10332 for ; Tue, 6 Feb 2001 17:17:17 +0100 (MET) Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by ix.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id RAA30980 for ; Tue, 6 Feb 2001 17:17:17 +0100 Received: from smtp.wanadoo.es (m1smtpisp02.wanadoo.es [62.36.220.21] (may be forged)) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f16GHHu03760 for ; Tue, 6 Feb 2001 17:17:17 +0100 (MET) Received: from wanadoo.es (m1wmail1.wanadoo.es [62.36.220.41]) by smtp.wanadoo.es (8.10.2/8.10.2) with ESMTP id f16GHGS21140 for ; Tue, 6 Feb 2001 17:17:16 +0100 (MET) MIME-Version: 1.0 Content-Type: text/plain X-XaM3-API-Version: 1.1.11.1.5 X-SenderIP: 195.53.220.3 Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by relay.urz.uni-heidelberg.de id RAA10333 Message-ID: Date: Tue, 6 Feb 2001 17:17:16 +0100 Reply-To: Mailing list for the LaTeX3 project Sender: Mailing list for the LaTeX3 project From: jbezos Subject: Re: default 
inputenc/fontenc tight to language To: Multiple recipients of list LATEX-L

> jbezos wrote:
> > Not at all. In latin1, characters in the range 130--159
> > are assigned to control characters (and hence
>
> I wouldn't say it like that: the 128--159 range is reserved so that if
> some document passes through a buggy program that strips the eighth bit,
> then the resulting document doesn't have any extra control characters (in
> the 0--31 range) with some ``interesting'' properties.

Right. And because of that, the ISO Standard states that these characters "correspond to bit combinations that do not represent graphic characters". So, ansinew is incompatible with ISO 8859-1.

Regards
Javier

6-Feb-2001 16:32:02-GMT,3951;000000000001 Return-Path: Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by sunshine.math.utah.edu (8.9.3/8.9.3) with ESMTP id JAA10620 for ; Tue, 6 Feb 2001 09:32:01 -0700 (MST) Received: from relay.urz.uni-heidelberg.de (relay-eth.urz.uni-heidelberg.de [129.206.100.201]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f16GVnu07197; Tue, 6 Feb 2001 17:31:49 +0100 (MET) Received: from relay (relay.urz.uni-heidelberg.de [129.206.119.201]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with SMTP id RAA10509; Tue, 6 Feb 2001 17:31:43 +0100 (MET) Received: from RELAY.URZ.UNI-HEIDELBERG.DE by RELAY.URZ.UNI-HEIDELBERG.DE (LISTSERV-TCP/IP release 1.8b) with spool id 488872 for LATEX-L@RELAY.URZ.UNI-HEIDELBERG.DE; Tue, 6 Feb 2001 17:31:42 +0100 Received: from ix.urz.uni-heidelberg.de (mail.urz.uni-heidelberg.de [129.206.119.234]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id RAA10502 for ; Tue, 6 Feb 2001 17:31:41 +0100 (MET) Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by ix.urz.uni-heidelberg.de (8.8.8/8.8.8) with 
ESMTP id RAA26366 for ; Tue, 6 Feb 2001 17:31:40 +0100 Received: from nag.co.uk (openmath.nag.co.uk [62.232.54.144]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f16GVcu07165 for ; Tue, 6 Feb 2001 17:31:39 +0100 (MET) Received: (from davidc@localhost) by nag.co.uk (AIX4.2/UCB 8.7/8.7) id QAA23642; Tue, 6 Feb 2001 16:31:23 GMT References: <200102061609.LAA21018@pluto.math.albany.edu> Message-ID: <200102061631.QAA23642@nag.co.uk> Date: Tue, 6 Feb 2001 16:31:23 GMT Reply-To: Mailing list for the LaTeX3 project Sender: Mailing list for the LaTeX3 project From: David Carlisle Subject: Re: default inputenc/fontenc tight to language To: Multiple recipients of list LATEX-L In-Reply-To: <200102061609.LAA21018@pluto.math.albany.edu> (hammond@CSC.ALBANY.EDU) > but not reasonable -- unless the > processor, like David Carlisle's xmltex, is a TeX thing -- for it to > know that a particular character must have \ensuremath applied. That isn't clear. A unicode text processor is supposed to know an awful lot about each character. It has to "know" that combing characters combine, and is supposed to know the default writing direction of every character, and various other properties. The property of being a math character is really just one of these. In fact it _is_ one of those see http://www.unicode.org/Public/UNIDATA/UnicodeData.html Informative Categories Abbr. 
Description
Lm  Letter, Modifier
Lo  Letter, Other
Pc  Punctuation, Connector
Pd  Punctuation, Dash
Ps  Punctuation, Open
Pe  Punctuation, Close
Pi  Punctuation, Initial quote (may behave like Ps or Pe depending on usage)
Pf  Punctuation, Final quote (may behave like Ps or Pe depending on usage)
Po  Punctuation, Other
Sm  Symbol, Math
^^^^^^^^^^^^^^^^^^^^^^
Sc  Symbol, Currency
Sk  Symbol, Modifier
So  Symbol, Other

one of the problems xmltex has is that it _doesn't_ know this stuff (and doesn't combine combining characters, for example)

Unicode as currently devised hasn't got 2^32 characters, just 17 planes of 2^16, but even so, that's probably enough.

But whether the internal canonical form is a unicode number or a latex style 7bit string \'e the issues of mapping between input encodings and this internal form, and from there to font encodings, are probably about the same.

David

6-Feb-2001 16:59:36-GMT,3538;000000000001 Return-Path: Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by sunshine.math.utah.edu (8.9.3/8.9.3) with ESMTP id JAA11502 for ; Tue, 6 Feb 2001 09:59:34 -0700 (MST) Received: from relay.urz.uni-heidelberg.de (relay-eth.urz.uni-heidelberg.de [129.206.100.201]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f16GxVu14469; Tue, 6 Feb 2001 17:59:31 +0100 (MET) Received: from relay (relay.urz.uni-heidelberg.de [129.206.119.201]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with SMTP id RAA10948; Tue, 6 Feb 2001 17:59:24 +0100 (MET) Received: from RELAY.URZ.UNI-HEIDELBERG.DE by RELAY.URZ.UNI-HEIDELBERG.DE (LISTSERV-TCP/IP release 1.8b) with spool id 488906 for LATEX-L@RELAY.URZ.UNI-HEIDELBERG.DE; Tue, 6 Feb 2001 17:59:23 +0100 Received: from ix.urz.uni-heidelberg.de (mail.urz.uni-heidelberg.de [129.206.119.234]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id RAA10939 for ; Tue, 6 Feb 2001 17:59:22 +0100 (MET) Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by
ix.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id RAA38632 for ; Tue, 6 Feb 2001 17:59:22 +0100 Received: from venus.open.ac.uk (venus.open.ac.uk [137.108.143.2]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f16GxLu14364 for ; Tue, 6 Feb 2001 17:59:22 +0100 (MET) Received: from fell.open.ac.uk by venus.open.ac.uk via SMTP Local (Mailer 3.1) with ESMTP; Tue, 6 Feb 2001 16:59:17 +0000 Received: (from car2@localhost) by fell.open.ac.uk (8.9.3+Sun/8.9.1) id QAA16376; Tue, 6 Feb 2001 16:59:32 GMT X-Authentication-Warning: fell.open.ac.uk: car2 set sender to car2@fell.open.ac.uk using -f MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit References: <200102061609.LAA21018@pluto.math.albany.edu> <200102061631.QAA23642@nag.co.uk> X-Mailer: VM 6.76 under Emacs 20.7.1 Message-ID: <14976.11635.576650.311233@fell.open.ac.uk> Date: Tue, 6 Feb 2001 16:59:31 +0000 Reply-To: Mailing list for the LaTeX3 project Sender: Mailing list for the LaTeX3 project From: Chris Rowley Subject: Re: default inputenc/fontenc tight to language To: Multiple recipients of list LATEX-L In-Reply-To: <200102061631.QAA23642@nag.co.uk>

David Carlisle wrote --

> But whether the internal
> canonical form is a unicode number or a latex style 7bit string \'e
> the issues of mapping between input encodings and this internal form,
> and from there to font encodings, are probably about the same.

And I just want to agree with this. In fact I would go a lot further and say that the problems raised in this discussion such as the following are the same whatever system you use to do quality typesetting:

  what is a character?

  what is the relationship between character strings and relatively positioned glyphs on a surface?

Thus LaTeX and its choice of internally using 7-bit strings is also a mere detail. And these problems do not go away just because you use larger integers to represent text streams.
chris 6-Feb-2001 17:02:15-GMT,3272;000000000001 Return-Path: Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by sunshine.math.utah.edu (8.9.3/8.9.3) with ESMTP id KAA11617 for ; Tue, 6 Feb 2001 10:02:14 -0700 (MST) Received: from relay.urz.uni-heidelberg.de (relay-eth.urz.uni-heidelberg.de [129.206.100.201]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f16H2Au15936; Tue, 6 Feb 2001 18:02:10 +0100 (MET) Received: from relay (relay.urz.uni-heidelberg.de [129.206.119.201]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with SMTP id SAA11030; Tue, 6 Feb 2001 18:02:04 +0100 (MET) Received: from RELAY.URZ.UNI-HEIDELBERG.DE by RELAY.URZ.UNI-HEIDELBERG.DE (LISTSERV-TCP/IP release 1.8b) with spool id 488914 for LATEX-L@RELAY.URZ.UNI-HEIDELBERG.DE; Tue, 6 Feb 2001 18:02:03 +0100 Received: from ix.urz.uni-heidelberg.de (mail.urz.uni-heidelberg.de [129.206.119.234]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id SAA11023 for ; Tue, 6 Feb 2001 18:02:02 +0100 (MET) Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by ix.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id SAA23916 for ; Tue, 6 Feb 2001 18:02:02 +0100 Received: from csc.albany.edu (sarah.albany.edu [169.226.1.103]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f16H1qu15721 for ; Tue, 6 Feb 2001 18:01:53 +0100 (MET) Received: from pluto.math.albany.edu (pluto.math.albany.edu [169.226.23.44]) by csc.albany.edu (8.9.3/8.9.3) with ESMTP id MAA11966 for ; Tue, 6 Feb 2001 12:01:28 -0500 (EST) Received: (from hammond@localhost) by pluto.math.albany.edu (8.9.3/8.9.3) id MAA21339 for LATEX-L@URZ.UNI-HEIDELBERG.DE; Tue, 6 Feb 2001 12:01:27 -0500 (EST) Message-ID: <200102061701.MAA21339@pluto.math.albany.edu> Date: Tue, 6 Feb 2001 12:01:27 -0500 Reply-To: Mailing list for the LaTeX3 project Sender: Mailing list for the LaTeX3 project From: "William F. 
Hammond" Subject: Re: default inputenc/fontenc tight to language To: Multiple recipients of list LATEX-L David Carlisle writes: > But whether the internal > canonical form is a unicode number or a latex style 7bit string \'e > the issues of mapping between input encodings and this internal form, > and from there to font encodings, are probably about the same. But isn't \'e an abbreviation for \acute{e}, and don't the French conceptualize it as an accented 'e'? And isn't that a better way to handle this particular thing when the author thinks of it as an accented 'e' rather than as a different character? I see \'e and \uE9 as formally different things, which probably should be typeset the same way by TeX in this case since \uE9 is a legacy hack for handling \'e . -- Bill 6-Feb-2001 17:13:47-GMT,3593;000000000001 Return-Path: Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by sunshine.math.utah.edu (8.9.3/8.9.3) with ESMTP id KAA12033 for ; Tue, 6 Feb 2001 10:13:46 -0700 (MST) Received: from relay.urz.uni-heidelberg.de (relay-eth.urz.uni-heidelberg.de [129.206.100.201]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f16HDbu22320; Tue, 6 Feb 2001 18:13:37 +0100 (MET) Received: from relay (relay.urz.uni-heidelberg.de [129.206.119.201]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with SMTP id SAA11262; Tue, 6 Feb 2001 18:13:28 +0100 (MET) Received: from RELAY.URZ.UNI-HEIDELBERG.DE by RELAY.URZ.UNI-HEIDELBERG.DE (LISTSERV-TCP/IP release 1.8b) with spool id 488926 for LATEX-L@RELAY.URZ.UNI-HEIDELBERG.DE; Tue, 6 Feb 2001 18:13:27 +0100 Received: from ix.urz.uni-heidelberg.de (mail.urz.uni-heidelberg.de [129.206.119.234]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id SAA11255 for ; Tue, 6 Feb 2001 18:13:26 +0100 (MET) Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by ix.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id SAA20938 for ; Tue, 6 Feb 2001 18:13:25 +0100 Received: 
from nag.co.uk (openmath.nag.co.uk [62.232.54.144]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f16HDKu22206 for ; Tue, 6 Feb 2001 18:13:20 +0100 (MET) Received: (from davidc@localhost) by nag.co.uk (AIX4.2/UCB 8.7/8.7) id RAA22236; Tue, 6 Feb 2001 17:13:04 GMT References: <200102061701.MAA21339@pluto.math.albany.edu> Message-ID: <200102061713.RAA22236@nag.co.uk> Date: Tue, 6 Feb 2001 17:13:04 GMT Reply-To: Mailing list for the LaTeX3 project Sender: Mailing list for the LaTeX3 project From: David Carlisle Subject: Re: default inputenc/fontenc tight to language To: Multiple recipients of list LATEX-L In-Reply-To: <200102061701.MAA21339@pluto.math.albany.edu> (hammond@CSC.ALBANY.EDU)

> But isn't \'e an abbreviation for \acute{e},

No. No on two levels, firstly \' doesn't expand to any (document usable) command form, it is essentially, but more importantly the latex internal form should be thought of as a symbolic name consisting of those characters. \'e (actually the internal form isn't quite that because of the annoying tabbing restrictions, but ignore that for now). \'e is a three letter name for that character, like e-acute or U+E9 or é

Sometimes latex passes it round as a string of three tokens, and sometimes the \' is tokenised but again this is an implementation detail. Conceptually it is just latex's canonical name for an e acute.

> I see \'e and \uE9 as formally different things,

That isn't the latex way. If you use a latin 1 input encoding and enter a é (which was an e acute if this mail path isn't 8bit safe) then latex will convert that to \'e internally before converting that back to the same byte as it started with if typesetting in T1 encoding.
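The round trip described above can be sketched in a few lines. This is only an illustration in Python rather than TeX: the two tables (`LATIN1_INPUT`, `T1_SLOTS`) and both function names are hypothetical one-entry stand-ins for what the input encoding and T1 font encoding definitions provide, and only the e-acute case from the message is modelled.

```python
# Hypothetical one-entry tables; the real encoding definitions cover far more.
# Only the e-acute example from the message is modelled here.
LATIN1_INPUT = {0xE9: r"\'e"}   # 8-bit input byte -> LaTeX internal name
T1_SLOTS = {r"\'e": 0xE9}       # internal name    -> slot in a T1-encoded font


def to_internal(byte: int) -> str:
    """Map one input byte to its internal name (7-bit bytes pass through)."""
    if byte < 0x80:
        return chr(byte)
    return LATIN1_INPUT[byte]


def to_font_slot(name: str) -> int:
    """Map an internal name to a slot number for typesetting in T1."""
    if len(name) == 1:
        # Single 7-bit characters keep their code in this simplified sketch.
        return ord(name)
    return T1_SLOTS[name]


if __name__ == "__main__":
    source_byte = "\u00e9".encode("latin-1")[0]   # e acute as a latin1 byte
    internal = to_internal(source_byte)
    print(hex(source_byte), "->", internal, "->", hex(to_font_slot(internal)))
```

The point of the intermediate name is that neither table needs to know about the other: a different input encoding only replaces the first table, a different font encoding only the second.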
This in fact is similar to a unicode combining character: if you do e' where ' is the combining acute, it is (to a unicode/xml system) supposed to be the same as if you'd entered the e acute character (but don't try it in xmltex)

David

6-Feb-2001 17:38:54-GMT,2892;000000000001 Return-Path: Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by sunshine.math.utah.edu (8.9.3/8.9.3) with ESMTP id KAA12828 for ; Tue, 6 Feb 2001 10:38:52 -0700 (MST) Received: from relay.urz.uni-heidelberg.de (relay-eth.urz.uni-heidelberg.de [129.206.100.201]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f16Hcnu27691; Tue, 6 Feb 2001 18:38:49 +0100 (MET) Received: from relay (relay.urz.uni-heidelberg.de [129.206.119.201]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with SMTP id SAA11657; Tue, 6 Feb 2001 18:38:42 +0100 (MET) Received: from RELAY.URZ.UNI-HEIDELBERG.DE by RELAY.URZ.UNI-HEIDELBERG.DE (LISTSERV-TCP/IP release 1.8b) with spool id 488944 for LATEX-L@RELAY.URZ.UNI-HEIDELBERG.DE; Tue, 6 Feb 2001 18:38:42 +0100 Received: from ix.urz.uni-heidelberg.de (mail.urz.uni-heidelberg.de [129.206.119.234]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id SAA11645 for ; Tue, 6 Feb 2001 18:38:40 +0100 (MET) Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by ix.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id SAA18610 for ; Tue, 6 Feb 2001 18:38:40 +0100 Received: from wisbech.cl.cam.ac.uk (mta1.cl.cam.ac.uk [128.232.0.15]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f16Hceu27643 for ; Tue, 6 Feb 2001 18:38:40 +0100 (MET) Received: from pallas.cl.cam.ac.uk ([128.232.8.88] helo=cl.cam.ac.uk ident=rf) by wisbech.cl.cam.ac.uk with esmtp (Exim 3.092 #1) id 14QC4Z-0006nV-00 for LATEX-L@URZ.UNI-HEIDELBERG.DE; Tue, 06 Feb 2001 17:38:35 +0000 Message-ID: Date: Tue, 6 Feb 2001 17:38:35 +0000 Reply-To: Mailing list for the LaTeX3 project Sender: Mailing list for the LaTeX3 project From: Robin
Fairbairns Subject: Re: default inputenc/fontenc tight to language To: Multiple recipients of list LATEX-L In-Reply-To: Your message of "Tue, 06 Feb 2001 12:01:27 EST." <200102061701.MAA21339@pluto.math.albany.edu> > But isn't \'e an abbreviation for \acute{e}, as david has already said, no. > and don't the French > conceptualize it as an accented 'e'? no, that's for english-speaking schoolchildren learning the language. just the same as, when i learnt latin, i learned that "v" and "u" are interchangeable, since there was in classical times no letter "u". the french alphabet has more letters than ours, that's all. 6-Feb-2001 17:58:40-GMT,6717;000000000001 Return-Path: Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by sunshine.math.utah.edu (8.9.3/8.9.3) with ESMTP id KAA13448 for ; Tue, 6 Feb 2001 10:58:39 -0700 (MST) Received: from relay.urz.uni-heidelberg.de (relay-eth.urz.uni-heidelberg.de [129.206.100.201]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f16HwZu01565; Tue, 6 Feb 2001 18:58:35 +0100 (MET) Received: from relay (relay.urz.uni-heidelberg.de [129.206.119.201]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with SMTP id SAA12053; Tue, 6 Feb 2001 18:58:29 +0100 (MET) Received: from RELAY.URZ.UNI-HEIDELBERG.DE by RELAY.URZ.UNI-HEIDELBERG.DE (LISTSERV-TCP/IP release 1.8b) with spool id 488963 for LATEX-L@RELAY.URZ.UNI-HEIDELBERG.DE; Tue, 6 Feb 2001 18:58:28 +0100 Received: from ix.urz.uni-heidelberg.de (mail.urz.uni-heidelberg.de [129.206.119.234]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id SAA12046 for ; Tue, 6 Feb 2001 18:58:26 +0100 (MET) Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by ix.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id SAA26524 for ; Tue, 6 Feb 2001 18:58:26 +0100 Received: from abel.math.umu.se (abel.math.umu.se [130.239.20.139]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f16HwQu01477 for ; Tue, 6 Feb 2001 
18:58:26 +0100 (MET) Received: from [130.239.20.144] (mac144.math.umu.se [130.239.20.144]) by abel.math.umu.se (8.9.2/8.9.2) with ESMTP id SAA19522 for ; Tue, 6 Feb 2001 18:56:41 +0100 (CET) X-Sender: lars@abel.math.umu.se Mime-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by relay.urz.uni-heidelberg.de id SAA12047 Message-ID: Date: Tue, 6 Feb 2001 18:58:25 +0100 Reply-To: Mailing list for the LaTeX3 project Sender: Mailing list for the LaTeX3 project From: Lars =?iso-8859-1?Q?Hellstr=F6m?= Subject: More template experience To: Multiple recipients of list LATEX-L

I've spent the weekend actually getting the templated index package running, so now I've gained some new experience with these things.

The good news is that I got the design as I wanted it almost immediately; this suggests to me that the separation of design (which gets put in ) and control structures (which is the for the template) furthers good programming. In fact I suspect that if one is to develop a new LaTeX2e package which involves a lot of design issues, the best route might be to start developing it as a LaTeX2e* package (or at least a templated package) and then write a 2e version which follows the structure of the 2e* version (emulating the instances only, not the templates or template types). In any case, I will try this approach in the next step of the templated index package project.

The bad news is that before I got it running I had to spend several hours tracking down the exact reason that the weren't parsed as I wanted them to. This was initially very hard but it got easier once I hacked ldcsetup.sty to actually print how it had parsed the keyvals (see latex-bugs expl3/3302; apparently something of the kind will be included in the official code as well), however it was still much too easy to make \DeclareTemplate go haywire by making a typo in the argument.
One thing I think would solve most of these problems, or at least enable TeX to catch them at a much earlier stage, would be to change the keyval defaults from being bracket-delimited to brace-delimited (i.e. put in a group). It is certainly no harder to check if the next character is a `{' than to check if it is a `[': just do \@ifnextchar\bgroup instead of \@ifnextchar[. The main advantage with this is that the use of special characters inside the keyval default will no longer mess up the parsing; currently

  \DeclareTemplate{foo}{bar}{0}{
    foo =f0 [, ] \punctuation
  }{\DoParameterAssignments}

will die rather horribly because the comma inside the brackets will start the declaration of a new key. When \TP@test@pt tries to grab the default value it will only stop at the end of the file or a \par, because the right brace that was intended to end the default value is hidden inside a group. Currently

  \DeclareTemplate{foo}{bar}{0}{
    foo =f0 [{, }] \punctuation
  }{\DoParameterAssignments}

does work and the braces are removed when the default value is grabbed, but one would avoid errors better if it was

  \DeclareTemplate{foo}{bar}{0}{
    foo =f0 {, } \punctuation
  }{\DoParameterAssignments}

since then the comma wouldn't be visible to \KV@parse. Other characters that need this kind of special care in defaults are = and ].

Some other things I reacted to:

The difference between a template and an instance could be explained better. I had gotten the impression that instance=template where there are no unrestricted keyvals left, but after looking closer at the code I can see that this is not the case; in particular you need to do much more processing of a template before you can use it than you need to do for an instance.

You'll have to put a lot of \describecsfamily's into template.dtx before you can call it reasonably documented, but before you do that you may want to consider changing the control sequence families used. Currently control sequence names are pieced together as ...
(for templates the seems to be TP>/, the is the template type, is /, is the template name, and n=2). This practice works fine if n=1 or the 's cannot appear in the 's, but that is not the case with the template code. A solution less sensitive to strange characters in 's would be to surround each with a brace group (\csname doesn't mind category 1 and 2 characters), since even an ingenious fool has to work very hard to get braces mismatched in arguments, whereas with the current naming scheme

  \DeclareTemplate{foo}{bar/baz}...

and

  \DeclareTemplate{foo/bar}{baz}...

would be stored in the same macros.

Lars Hellström

6-Feb-2001 18:15:03-GMT,3338;000000000001 Return-Path: Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by sunshine.math.utah.edu (8.9.3/8.9.3) with ESMTP id LAA14035 for ; Tue, 6 Feb 2001 11:15:01 -0700 (MST) Received: from relay.urz.uni-heidelberg.de (relay-eth.urz.uni-heidelberg.de [129.206.100.201]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f16IEsu09533; Tue, 6 Feb 2001 19:14:54 +0100 (MET) Received: from relay (relay.urz.uni-heidelberg.de [129.206.119.201]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with SMTP id TAA12330; Tue, 6 Feb 2001 19:14:27 +0100 (MET) Received: from RELAY.URZ.UNI-HEIDELBERG.DE by RELAY.URZ.UNI-HEIDELBERG.DE (LISTSERV-TCP/IP release 1.8b) with spool id 488975 for LATEX-L@RELAY.URZ.UNI-HEIDELBERG.DE; Tue, 6 Feb 2001 19:14:26 +0100 Received: from ix.urz.uni-heidelberg.de (mail.urz.uni-heidelberg.de [129.206.119.234]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id TAA12323 for ; Tue, 6 Feb 2001 19:14:25 +0100 (MET) Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by ix.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id TAA33780 for ; Tue, 6 Feb 2001 19:14:24 +0100 Received: from ams.org (sun06.ams.org [130.44.1.6]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f16IENu09416 for ; Tue, 6 Feb 2001 19:14:23 +0100 (MET)
Received: (from mjd@localhost) by ams.org (8.11.1/8.11.1) id f16IEKn00192; Tue, 6 Feb 2001 13:14:20 -0500 (EST) References: Lines: 17 X-Mailer: Gnus v5.7/Emacs 20.7 Message-ID: Date: Tue, 6 Feb 2001 13:14:20 -0500 Reply-To: Mailing list for the LaTeX3 project Sender: Mailing list for the LaTeX3 project From: Michael John Downes Subject: Recording font file requirements of a document To: Multiple recipients of list LATEX-L In-Reply-To: jbezos's message of "Mon, 5 Feb 2001 12:41:56 +0100" jbezos writes: > One of the solutions I considered was to generate a file recording the > decisions taken in a system when a document is typeset, so that if we > really want to ensure that TeX complains if there is a different > configuration we can distribute that file with the main .tex ones. I wrote a package "snapshot" that provides a way to make such complaints, by providing a way to embed a dependency list of all external files used by the document, along with their version numbers. But only files that use normal LaTeX input mechanisms are trackable. Because the checksum information of .tfm files is not accessible to LaTeX, this cannot be written into the dependency list by LaTeX. An external script could do it. There are some tricky questions such as, do we include all fonts whose .tfm files are loaded, or only those which actually have a glyph used in the document? (And I think most of you will see how it begins to get trickier from there ...) 
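The external script mentioned above could start from something like the following Python sketch. It relies only on the TFM file layout (twelve 16-bit big-endian lengths, followed by header data whose first word is the checksum); the function names and the output line format are assumptions made for illustration and are not taken from the snapshot package.

```python
import struct


def tfm_checksum(data: bytes) -> int:
    """Return the 32-bit checksum stored in the header of a TFM file."""
    if len(data) < 28:
        raise ValueError("too short to be a TFM file")
    # A TFM file begins with twelve 16-bit big-endian lengths; the first two
    # are lf (file length in 4-byte words) and lh (header length in words).
    lf, lh = struct.unpack(">HH", data[:4])
    if lh < 2 or lf * 4 > len(data):
        raise ValueError("implausible TFM lengths")
    # The header data follows those 24 bytes; its first word is the checksum.
    (checksum,) = struct.unpack(">I", data[24:28])
    return checksum


def dependency_line(font_name: str, data: bytes) -> str:
    """Format one entry for a hypothetical font dependency list."""
    return f"{font_name}.tfm checksum={tfm_checksum(data):08X}"
```

A caller would read each file with `open(path, "rb").read()` and pass the bytes in. Which fonts to list (every .tfm loaded, or only those with a glyph actually used) is exactly the open question raised in the message, and the sketch does not settle it.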
6-Feb-2001 18:39:58-GMT,4068;000000000001 Return-Path: Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by sunshine.math.utah.edu (8.9.3/8.9.3) with ESMTP id LAA14857 for ; Tue, 6 Feb 2001 11:39:57 -0700 (MST) Received: from relay.urz.uni-heidelberg.de (relay-eth.urz.uni-heidelberg.de [129.206.100.201]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f16Idhu13776; Tue, 6 Feb 2001 19:39:43 +0100 (MET) Received: from relay (relay.urz.uni-heidelberg.de [129.206.119.201]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with SMTP id TAA12697; Tue, 6 Feb 2001 19:39:36 +0100 (MET) Received: from RELAY.URZ.UNI-HEIDELBERG.DE by RELAY.URZ.UNI-HEIDELBERG.DE (LISTSERV-TCP/IP release 1.8b) with spool id 488986 for LATEX-L@RELAY.URZ.UNI-HEIDELBERG.DE; Tue, 6 Feb 2001 19:39:35 +0100 Received: from ix.urz.uni-heidelberg.de (mail.urz.uni-heidelberg.de [129.206.119.234]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id TAA12690 for ; Tue, 6 Feb 2001 19:39:33 +0100 (MET) Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by ix.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id TAA21636 for ; Tue, 6 Feb 2001 19:39:34 +0100 Received: from nag.co.uk (openmath.nag.co.uk [62.232.54.144]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f16IdYu13728 for ; Tue, 6 Feb 2001 19:39:34 +0100 (MET) Received: (from davidc@localhost) by nag.co.uk (AIX4.2/UCB 8.7/8.7) id SAA14684; Tue, 6 Feb 2001 18:39:18 GMT References: Message-ID: <200102061839.SAA14684@nag.co.uk> Date: Tue, 6 Feb 2001 18:39:18 GMT Reply-To: Mailing list for the LaTeX3 project Sender: Mailing list for the LaTeX3 project From: David Carlisle Subject: Re: More template experience To: Multiple recipients of list LATEX-L In-Reply-To: (message from Lars =?iso-8859-1?Q?Hellstr=F6m?= on Tue, 6 Feb 2001 18:58:25 +0100) > The main advantage with this is that the use of special > characters inside the keyval default will no longer mess up the 
parsing; > currently Yes this is of course a general problem with the latex [] syntax, but it is particularly bad here I agree, because if it goes wrong in the middle of that code then it _really_ goes wrong. There is a general rule of optional things using {} and mandatory things using [] but perhaps since the syntax for template declarations are so formalised anyway, the rules can be different here. Frank? Chris? > The difference between a template and an instance could be explained > better. ah documentation. Yes that could be improved. > in particular you need to do much more > processing of a template before you can use it than you need to do an > instance. Yes, in effect you have to make a (nameless) instance and run that. > practice works fine if n=1 or the 's cannot be appear in the > 's, but that is not the case with the template code. there were earlier versions where instances (at least) were called as \csname rather than via a \UseInstance{xxx} syntax that could allow such characters. Also the number of parts in those names grew as features such as collections got added (and the template/instance code got re-redesigned) I agree that it should be cleaned up. Hopefully though those internal names can be changed without really affecting the main interface or packages using it. Thanks again for your detailed reading of the code. It is heartening to have someone say that it is basically a good idea, modulo some technical "features". 
There's always a danger that nobody likes it at all:-) David 6-Feb-2001 19:34:11-GMT,3601;000000000001 Return-Path: Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by sunshine.math.utah.edu (8.9.3/8.9.3) with ESMTP id MAA16803 for ; Tue, 6 Feb 2001 12:34:10 -0700 (MST) Received: from relay.urz.uni-heidelberg.de (relay-eth.urz.uni-heidelberg.de [129.206.100.201]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f16JY0u29011; Tue, 6 Feb 2001 20:34:00 +0100 (MET) Received: from relay (relay.urz.uni-heidelberg.de [129.206.119.201]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with SMTP id UAA13484; Tue, 6 Feb 2001 20:32:52 +0100 (MET) Received: from RELAY.URZ.UNI-HEIDELBERG.DE by RELAY.URZ.UNI-HEIDELBERG.DE (LISTSERV-TCP/IP release 1.8b) with spool id 489034 for LATEX-L@RELAY.URZ.UNI-HEIDELBERG.DE; Tue, 6 Feb 2001 20:32:51 +0100 Received: from ix.urz.uni-heidelberg.de (mail.urz.uni-heidelberg.de [129.206.119.234]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id UAA13477 for ; Tue, 6 Feb 2001 20:32:50 +0100 (MET) Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by ix.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id UAA33902 for ; Tue, 6 Feb 2001 20:32:51 +0100 Received: from moutvdom00.kundenserver.de (moutvdom00.kundenserver.de [195.20.224.149]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f16JWpu28833 for ; Tue, 6 Feb 2001 20:32:51 +0100 (MET) Received: from [195.20.224.209] (helo=mrvdom02.schlund.de) by moutvdom00.kundenserver.de with esmtp (Exim 2.12 #2) id 14QDr8-0006kL-00 for LATEX-L@urz.uni-heidelberg.de; Tue, 6 Feb 2001 20:32:50 +0100 Received: from manz-3e3646fc.pool.mediaways.net ([62.54.70.252] helo=istrati.zdv.uni-mainz.de) by mrvdom02.schlund.de with esmtp (Exim 2.12 #2) id 14QDrA-00079u-00 for LATEX-L@URZ.UNI-HEIDELBERG.DE; Tue, 6 Feb 2001 20:32:53 +0100 Received: (from latex3@localhost) by istrati.zdv.uni-mainz.de (8.9.3/8.9.3/SuSE Linux 8.9.3-0.1) 
id UAA22368; Tue, 6 Feb 2001 20:18:34 +0100 X-Authentication-Warning: istrati.zdv.uni-mainz.de: latex3 set sender to frank@mittelbach-online.de using -f MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit References: X-Mailer: VM 6.75 under Emacs 20.4.1 Message-ID: <14976.19977.570637.825825@istrati.zdv.uni-mainz.de> Date: Tue, 6 Feb 2001 20:18:33 +0100 Reply-To: Mailing list for the LaTeX3 project Sender: Mailing list for the LaTeX3 project From: Frank Mittelbach Subject: Re: More template experience To: Multiple recipients of list LATEX-L In-Reply-To: > each with a brace group (\csname doesn't mind category 1 and 2 > characters), since even an ingenious fool has work very hard to get braces > mismatched in arguments, whereas with the current naming scheme this is weird :-) and i thought my learning days on TeX the programs have been over. even \expandafter\show \csname ab{cd\endcsname does work. well, off hand i think you do not only have a good point but also a good solution. 
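The collision in the current naming scheme, and why a brace group around each part avoids it, can be shown at the level of plain strings. This Python sketch models only how the name is assembled, not TeX's \csname itself; the prefix `TP>/` is the one quoted in the thread, while both function names are hypothetical.

```python
from typing import Sequence


def slash_name(prefix: str, parts: Sequence[str]) -> str:
    """Join the parts with '/', as the current scheme is described."""
    return prefix + "/".join(parts)


def braced_name(prefix: str, parts: Sequence[str]) -> str:
    """Wrap every part in its own brace group, as proposed."""
    return prefix + "".join("{" + part + "}" for part in parts)


if __name__ == "__main__":
    first = ["foo", "bar/baz"]    # \DeclareTemplate{foo}{bar/baz}...
    second = ["foo/bar", "baz"]   # \DeclareTemplate{foo/bar}{baz}...
    # With '/' as separator both declarations map to the same name.
    print(slash_name("TP>/", first) == slash_name("TP>/", second))
    # With one brace group per part the names stay distinct.
    print(braced_name("TP>/", first) == braced_name("TP>/", second))
```

The braced form is unambiguous as long as braces inside each part are balanced, which is the property the proposal counts on.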
frank 6-Feb-2001 20:45:10-GMT,5340;000000000001 Return-Path: Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by sunshine.math.utah.edu (8.9.3/8.9.3) with ESMTP id NAA19050 for ; Tue, 6 Feb 2001 13:45:09 -0700 (MST) Received: from relay.urz.uni-heidelberg.de (relay-eth.urz.uni-heidelberg.de [129.206.100.201]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f16Kixu16434; Tue, 6 Feb 2001 21:44:59 +0100 (MET) Received: from relay (relay.urz.uni-heidelberg.de [129.206.119.201]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with SMTP id VAA14503; Tue, 6 Feb 2001 21:43:55 +0100 (MET) Received: from RELAY.URZ.UNI-HEIDELBERG.DE by RELAY.URZ.UNI-HEIDELBERG.DE (LISTSERV-TCP/IP release 1.8b) with spool id 489083 for LATEX-L@RELAY.URZ.UNI-HEIDELBERG.DE; Tue, 6 Feb 2001 21:43:54 +0100 Received: from ix.urz.uni-heidelberg.de (mail.urz.uni-heidelberg.de [129.206.119.234]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id VAA14496 for ; Tue, 6 Feb 2001 21:43:53 +0100 (MET) Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by ix.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id VAA05866 for ; Tue, 6 Feb 2001 21:43:53 +0100 Received: from moutvdom00.kundenserver.de (moutvdom00.kundenserver.de [195.20.224.149]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f16Khru16169 for ; Tue, 6 Feb 2001 21:43:53 +0100 (MET) Received: from [195.20.224.209] (helo=mrvdom02.schlund.de) by moutvdom00.kundenserver.de with esmtp (Exim 2.12 #2) id 14QExs-0006ee-00 for LATEX-L@urz.uni-heidelberg.de; Tue, 6 Feb 2001 21:43:52 +0100 Received: from manz-3e364879.pool.mediaways.net ([62.54.72.121] helo=istrati.zdv.uni-mainz.de) by mrvdom02.schlund.de with esmtp (Exim 2.12 #2) id 14QEyG-0004lH-00 for LATEX-L@URZ.UNI-HEIDELBERG.DE; Tue, 6 Feb 2001 21:44:17 +0100 Received: (from latex3@localhost) by istrati.zdv.uni-mainz.de (8.9.3/8.9.3/SuSE Linux 8.9.3-0.1) id VAA22635; Tue, 6 Feb 2001 21:41:25 +0100 
X-Authentication-Warning: istrati.zdv.uni-mainz.de: latex3 set sender to frank@mittelbach-online.de using -f MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit References: <200102061839.SAA14684@nag.co.uk> X-Mailer: VM 6.75 under Emacs 20.4.1 Message-ID: <14976.24948.777202.226167@istrati.zdv.uni-mainz.de> Date: Tue, 6 Feb 2001 21:41:24 +0100 Reply-To: Mailing list for the LaTeX3 project Sender: Mailing list for the LaTeX3 project From: Frank Mittelbach Subject: Re: More template experience To: Multiple recipients of list LATEX-L In-Reply-To: <200102061839.SAA14684@nag.co.uk>

David Carlisle writes:

> There is a general rule of optional things using {} and mandatory things
> using [] but perhaps since the syntax for template declarations are so

you meant the other way around, didn't you?

> formalised anyway, the rules can be different here. Frank? Chris?

they could. i have no deep feelings here.

actually in practice defaults set up this way are far less efficient than coding them in the body of the template though for serious template writing i started to avoid using them. but that doesn't mean that we should not have them. for the less experienced writer specifying defaults up at the top is a) easier to specify and b) easier to modify and c) visually easier to capture and understand.

however, my goal would be that on top of a good selection of templates you have something like lynx sitting and allowing you to produce instances etc from a gui interface. with a formalised description block (not existing yet to that extent) you can even have a generic gui interface that can handle any template file and provides sensible documentation on what can be specified and changed to produce instances thereof.

so yes, in my opinion, the most important factor is that the syntax is strict and not that it resembles LaTeX body syntax.

> > The difference between a template and an instance could be explained
> > better.
>
> ah documentation.
Yes that could be improved.

well, i would say it could be written. actually what is needed is a real article about it. perhaps for tugboat

> I agree that it should be cleaned up. Hopefully though those internal
> names can be changed without really affecting the main interface
> or packages using it.

i think it can. and i would be really happy if Lars and/or anybody else continues with finding further snags and then we should try at getting it to the next level.

> Thanks again for your detailed reading of the code. It is heartening
> to have someone say that it is basically a good idea, modulo some
> technical "features". There's always a danger that nobody likes
> it at all:-)

isn't that true

good night (up since 4:50am and i'm gettin' ....)
frank

6-Feb-2001 22:28:20-GMT,3942;000000000001 Return-Path: Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by sunshine.math.utah.edu (8.9.3/8.9.3) with ESMTP id PAA22170 for ; Tue, 6 Feb 2001 15:28:18 -0700 (MST) Received: from relay.urz.uni-heidelberg.de (relay-eth.urz.uni-heidelberg.de [129.206.100.201]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f16MS9u17100; Tue, 6 Feb 2001 23:28:09 +0100 (MET) Received: from relay (relay.urz.uni-heidelberg.de [129.206.119.201]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with SMTP id XAA16069; Tue, 6 Feb 2001 23:25:42 +0100 (MET) Received: from RELAY.URZ.UNI-HEIDELBERG.DE by RELAY.URZ.UNI-HEIDELBERG.DE (LISTSERV-TCP/IP release 1.8b) with spool id 489174 for LATEX-L@RELAY.URZ.UNI-HEIDELBERG.DE; Tue, 6 Feb 2001 23:25:41 +0100 Received: from ix.urz.uni-heidelberg.de (mail.urz.uni-heidelberg.de [129.206.119.234]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id XAA16062 for ; Tue, 6 Feb 2001 23:25:40 +0100 (MET) Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by ix.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id XAA26960 for ; Tue, 6 Feb 2001 23:25:40 +0100 Received: from musse.tninet.se
(musse.tninet.se [195.100.94.12]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with SMTP id f16MPeu16851 for ; Tue, 6 Feb 2001 23:25:40 +0100 (MET) Received: (qmail 29540 invoked from network); 6 Feb 2001 23:25:38 +0100 Received: from delenn.tninet.se (HELO algonet.se) (195.100.94.104) by musse.tninet.se with SMTP; 6 Feb 2001 23:25:38 +0100 Received: from [195.100.226.137] (du137-226.ppp.su-anst.tninet.se [195.100.226.137]) by delenn.tninet.se (BLUETAIL Mail Robustifier 2.2.1) with ESMTP id 406673.498337.981delenn-s1 for ; Tue, 06 Feb 2001 23:25:37 +0100 X-Sender: haberg@pop.matematik.su.se (Unverified) References: Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Message-ID: Date: Tue, 6 Feb 2001 23:24:40 +0100 Reply-To: Mailing list for the LaTeX3 project Sender: Mailing list for the LaTeX3 project From: Hans Aberg Subject: Re: More template experience To: Multiple recipients of list LATEX-L In-Reply-To: <14976.19977.570637.825825@istrati.zdv.uni-mainz.de> At 18:39 +0000 2001/02/06, David Carlisle wrote: > each with a brace group (\csname doesn't mind category 1 and 2 > characters), since even an ingenious fool has work very hard to get braces > mismatched in arguments, whereas with the current naming scheme At 20:18 +0100 2001/02/06, Frank Mittelbach wrote: >this is weird :-) and i thought my learning days on TeX the programs have been >over. > >even > > \expandafter\show \csname ab{cd\endcsname > >does work. TeX uses the character catcodes to tokenize the input, but if the lexer finds a macro name, it must do an additional table lookup in order to stamp an additional, internal token number (not described in the TeX book) for the parser, evidently, as different macros can obey a different syntax. But my guess is that both the lexer regular words syntax and the parser LALR(!) (?) syntax are fixed, only the character catcodes and the macro internal codes can change. 
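For anyone who wants to try the \csname behaviour quoted above, here is a minimal plain TeX sketch. It assumes only what was reported in the thread (that \csname accepts category 1 and 2 characters); the macro name ab{cd and the replacement text "hello" are made up for illustration.

```tex
% Plain TeX. The name "ab{cd" and the text "hello" are hypothetical.
% \csname builds a control sequence name from character tokens,
% whatever their category codes, so the unmatched brace is accepted.
\expandafter\def\csname ab{cd\endcsname{hello}
\expandafter\show\csname ab{cd\endcsname  % displays the macro's meaning
\csname ab{cd\endcsname                   % typesets the replacement text
\bye
```

Such a name can only be reached through \csname ... \endcsname, since the ordinary tokenizer would never read ab{cd as a single control word.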
(So why is TeX's syntax then commonly called "extensible" if both grammars are fixed?) Has this TeX parser grammar been published somewhere? Hans Aberg 7-Feb-2001 10:56:10-GMT,3058;000000000001 Return-Path: Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by sunshine.math.utah.edu (8.9.3/8.9.3) with ESMTP id DAA08394 for ; Wed, 7 Feb 2001 03:28:32 -0700 (MST) Received: from relay.urz.uni-heidelberg.de (relay-eth.urz.uni-heidelberg.de [129.206.100.201]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f17ASKu03123; Wed, 7 Feb 2001 11:28:20 +0100 (MET) Received: from relay (relay.urz.uni-heidelberg.de [129.206.119.201]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with SMTP id LAA22276; Wed, 7 Feb 2001 11:25:31 +0100 (MET) Received: from RELAY.URZ.UNI-HEIDELBERG.DE by RELAY.URZ.UNI-HEIDELBERG.DE (LISTSERV-TCP/IP release 1.8b) with spool id 488076 for LATEX-L@RELAY.URZ.UNI-HEIDELBERG.DE; Wed, 7 Feb 2001 11:25:30 +0100 Received: from ix.urz.uni-heidelberg.de (mail.urz.uni-heidelberg.de [129.206.119.234]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id LAA22269 for ; Wed, 7 Feb 2001 11:25:29 +0100 (MET) Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by ix.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id LAA23282 for ; Wed, 7 Feb 2001 11:25:28 +0100 Received: from Sina.sharif.ac.ir (sina.Sharif.AC.IR [194.225.40.9]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f17APOu02450 for ; Wed, 7 Feb 2001 11:25:25 +0100 (MET) Received: from localhost (roozbeh@localhost) by Sina.sharif.ac.ir (8.9.3/8.9.3) with ESMTP id NAA04123 for ; Wed, 7 Feb 2001 13:55:18 +0330 MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Message-ID: Date: Wed, 7 Feb 2001 13:55:18 +0330 Reply-To: Mailing list for the LaTeX3 project Sender: Mailing list for the LaTeX3 project From: Roozbeh Pournader Subject: Re: default inputenc/fontenc tight to language To: Multiple recipients of list 
LATEX-L In-Reply-To: On Tue, 6 Feb 2001, jbezos wrote: > Further, a character in the ASCII > range (I have not the tables at hands, but IIRC is > right single quote) has been reassigned in ansinew. I can't get what you mean. The "Apostrophe" (character 0x27 in ASCII) is considered both an opening and closing quotation mark, and also an apostrophe in ASCII. But the new characters that you see in places 0x91 and 0x92 in CP1252, are "Left Single Quotation Mark" and "Right Single Quotation Mark". They're not the same characters, although we may have used 0x60 ("Grave Accent") and 0x27 in the TeX world instead of them. --roozbeh 7-Feb-2001 10:59:53-GMT,2839;000000000001 Return-Path: Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by sunshine.math.utah.edu (8.9.3/8.9.3) with ESMTP id DAA09031 for ; Wed, 7 Feb 2001 03:59:51 -0700 (MST) Received: from relay.urz.uni-heidelberg.de (relay-eth.urz.uni-heidelberg.de [129.206.100.201]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f17Axdu12314; Wed, 7 Feb 2001 11:59:39 +0100 (MET) Received: from relay (relay.urz.uni-heidelberg.de [129.206.119.201]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with SMTP id LAA23038; Wed, 7 Feb 2001 11:59:32 +0100 (MET) Received: from RELAY.URZ.UNI-HEIDELBERG.DE by RELAY.URZ.UNI-HEIDELBERG.DE (LISTSERV-TCP/IP release 1.8b) with spool id 488155 for LATEX-L@RELAY.URZ.UNI-HEIDELBERG.DE; Wed, 7 Feb 2001 11:59:31 +0100 Received: from ix.urz.uni-heidelberg.de (mail.urz.uni-heidelberg.de [129.206.119.234]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id LAA23031 for ; Wed, 7 Feb 2001 11:59:30 +0100 (MET) Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by ix.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id LAA31238 for ; Wed, 7 Feb 2001 11:59:30 +0100 Received: from Sina.sharif.ac.ir (sina.Sharif.AC.IR [194.225.40.9]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f17AxKu12152 for ; Wed, 7 Feb 
2001 11:59:24 +0100 (MET) Received: from localhost (roozbeh@localhost) by Sina.sharif.ac.ir (8.9.3/8.9.3) with ESMTP id OAA05452 for ; Wed, 7 Feb 2001 14:29:15 +0330 MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Message-ID: Date: Wed, 7 Feb 2001 14:29:15 +0330 Reply-To: Mailing list for the LaTeX3 project Sender: Mailing list for the LaTeX3 project From: Roozbeh Pournader Subject: Re: default inputenc/fontenc tight to language To: Multiple recipients of list LATEX-L In-Reply-To: On Tue, 6 Feb 2001, jbezos wrote: > Right. And because that, the ISO Standard states that > these characters "correspond to bit combinations that do not represent graphic characters". So, ansinew > is uncompatible with ISO 8859-1. I agree. I think that using a ISO 8859-1 character in range (0x80--0x9F) should create a warning or such, and not get silently replaced with the CP1252 character. 6-Feb-2001 0:25:27-GMT,5407;000000000001 Return-Path: Received: from suncore.math.utah.edu (suncore0.math.utah.edu [128.110.198.5]) by sunshine.math.utah.edu (8.9.3/8.9.3) with ESMTP id RAA19482; Mon, 5 Feb 2001 17:25:26 -0700 (MST) Received: (from beebe@localhost) by suncore.math.utah.edu (8.9.3/8.9.3) id RAA26769; Mon, 5 Feb 2001 17:25:25 -0700 (MST) Date: Mon, 5 Feb 2001 17:25:25 -0700 (MST) From: "Nelson H. F. Beebe" To: latex-l@URZ.UNI-HEIDELBERG.DE Cc: beebe@math.utah.edu X-US-Mail: "Center for Scientific Computing, Department of Mathematics, 322 INSCC, University of Utah, 155 S 1400 E RM 233, Salt Lake City, UT 84112-0090, USA" X-Telephone: +1 801 581 5254 X-FAX: +1 801 585 1640, +1 801 581 4148 X-URL: http://www.math.utah.edu/~beebe Subject: Re: glyph collections viz font encodings Message-ID: Chris Rowley writes on Mon, 5 Feb 2001 23:26:16 +0000: >> ... >> ... 
since the fonts available are going to get more and more >> diverse (if only very slowly) the robust medium-term solution for a >> TeX-like typesetting engine is to add the ability to bundle the >> font-metrics with the document (possibly virtually when network >> font warehouses really exist). >> ... To some extent, TeX already takes care of this, in that it records a font checksum in the DVI file that the DVI driver should match against the fonts used, and report any discrepancies. Of course, when they do mismatch, the average user has no idea what to do about it. Adobe Portable Document Format (PDF) takes this approach of saving metrics for all fonts used. I have an example here on my desk of a PostScript output from earlier today using a font in which the round-trip from PostScript to PDF to PostScript resulted in a substitution of Adobe Sans MM in place of the original, Impact (a Microsoft TrueType poster font). This is both good and bad: the substitution happens silently, so the user is neither informed of it, or bothered by it. However, the two PostScript files definitely look different when printed; even their page dimensions changed. PDF was trying to ensure that if font substitutions occurred, the character spacing should be close to the original, even if the glyph appearance is changed. Some people find this objectionable: the U.S. National Science Foundation, for example, REQUIRES that PDF files for grant proposals contain embedded subsetted fonts to ensure uniform appearance everywhere. The notion of network font warehouses is, I suspect, far off, given the copyright and licensing issues on the vast majority of fonts. While Adobe has officially taken the position that their licensed fonts, if subsetted, can be legally embedded in distributed PDF and PostScript documents without further license fee payments, other vendors are not so accommodating: Bitstream's license makes even the font metrics in the .afm files confidential, e.g. 
>> Comment Copyright 1987-1992 as an unpublished work by Bitstream Inc., Cambridge, MA. >> Comment All rights reserved >> Comment Confidential and proprietary to Bitstream Inc. It is not clear what this means for PDF; a lawyer might conceivably argue that a Bitstream font cannot be legally used in a PDF file at all, because that constitutes copying of the font metrics. Whether this is a good business decision is another question :^). Thus, this brings us round to the alternative offered by PostScript to PDF conversion software, like Adobe Acrobat Distiller, and ghostscript ps2pdf, of forcing the embedding of all fonts used. The feature was undoubtedly added because customers wanted it, which suggests that if a descendant of TeX were to consider this issue, it ought to carefully examine the PDF experience. In closing, I note that the PDF-1.0 specification defined 14 base fonts (1993 edition of Portable Document Format Reference Manual, p. 64) which all PDF viewers are required to provide, so that they never need to be embedded. The PDF-1.3 specification (2000 second edition of same) no longer even has the notion of `base fonts'. Perhaps this is in recognition of a problem that has been discussed before on the tex-fonts list, that various flavors of the PDF Times serif base font from different vendors have not only had somewhat different glyph offerings, but also different metrics. Or perhaps it is because with 20K+ fonts on the market (see http://www.math.utah.edu/~beebe/fonts/fonts-to-vendors.html for a catalog), why should 14 of them be singled out for special treatment as `base fonts'. As more (La)TeX users learn how to expand their font choices beyond (Extended) Computer Modern, our community has to face the font availability problem too. ------------------------------------------------------------------------------- - Nelson H. F. 
Beebe Tel: +1 801 581 5254 - - Center for Scientific Computing FAX: +1 801 585 1640, +1 801 581 4148 - - University of Utah Internet e-mail: beebe@math.utah.edu - - Department of Mathematics, 322 INSCC beebe@acm.org beebe@computer.org - - 155 S 1400 E RM 233 beebe@ieee.org - - Salt Lake City, UT 84112-0090, USA URL: http://www.math.utah.edu/~beebe - ------------------------------------------------------------------------------- 7-Feb-2001 10:36:02-GMT,2712;000000000001 Return-Path: Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by sunshine.math.utah.edu (8.9.3/8.9.3) with ESMTP id DAA08572 for ; Wed, 7 Feb 2001 03:36:01 -0700 (MST) Received: from relay.urz.uni-heidelberg.de (relay-eth.urz.uni-heidelberg.de [129.206.100.201]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f17AZmu05223; Wed, 7 Feb 2001 11:35:48 +0100 (MET) Received: from relay (relay.urz.uni-heidelberg.de [129.206.119.201]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with SMTP id LAA22597; Wed, 7 Feb 2001 11:35:41 +0100 (MET) Received: from RELAY.URZ.UNI-HEIDELBERG.DE by RELAY.URZ.UNI-HEIDELBERG.DE (LISTSERV-TCP/IP release 1.8b) with spool id 488129 for LATEX-L@RELAY.URZ.UNI-HEIDELBERG.DE; Wed, 7 Feb 2001 11:35:40 +0100 Received: from ix.urz.uni-heidelberg.de (mail.urz.uni-heidelberg.de [129.206.119.234]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id LAA22589 for ; Wed, 7 Feb 2001 11:35:39 +0100 (MET) Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by ix.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id LAA40106 for ; Wed, 7 Feb 2001 11:35:39 +0100 Received: from Sina.sharif.ac.ir (sina.Sharif.AC.IR [194.225.40.9]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f17AZVu05130 for ; Wed, 7 Feb 2001 11:35:36 +0100 (MET) Received: from localhost (roozbeh@localhost) by Sina.sharif.ac.ir (8.9.3/8.9.3) with ESMTP id OAA04575 for ; Wed, 7 Feb 2001 14:05:26 +0330 MIME-Version: 1.0 Content-Type: 
TEXT/PLAIN; charset=US-ASCII Message-ID: Date: Wed, 7 Feb 2001 14:05:26 +0330 Reply-To: Mailing list for the LaTeX3 project Sender: Mailing list for the LaTeX3 project From: Roozbeh Pournader Subject: Re: default inputenc/fontenc tight to language To: Multiple recipients of list LATEX-L In-Reply-To: <14975.56331.365469.731085@istrati.zdv.uni-mainz.de> On Tue, 6 Feb 2001, Frank Mittelbach wrote: > who will? the user groups? for many lanugages there isn't a user group There are many interested experts around for those languages without a user group. One of the gathering places is the Omega mailing list. --roozbeh 7-Feb-2001 10:56:10-GMT,3006;000000000001 Return-Path: Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by sunshine.math.utah.edu (8.9.3/8.9.3) with ESMTP id DAA08466 for ; Wed, 7 Feb 2001 03:31:09 -0700 (MST) Received: from relay.urz.uni-heidelberg.de (relay-eth.urz.uni-heidelberg.de [129.206.100.201]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f17AV0u03910; Wed, 7 Feb 2001 11:31:00 +0100 (MET) Received: from relay (relay.urz.uni-heidelberg.de [129.206.119.201]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with SMTP id LAA22420; Wed, 7 Feb 2001 11:30:53 +0100 (MET) Received: from RELAY.URZ.UNI-HEIDELBERG.DE by RELAY.URZ.UNI-HEIDELBERG.DE (LISTSERV-TCP/IP release 1.8b) with spool id 488094 for LATEX-L@RELAY.URZ.UNI-HEIDELBERG.DE; Wed, 7 Feb 2001 11:30:52 +0100 Received: from ix.urz.uni-heidelberg.de (mail.urz.uni-heidelberg.de [129.206.119.234]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id LAA22413 for ; Wed, 7 Feb 2001 11:30:51 +0100 (MET) Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by ix.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id LAA38716 for ; Wed, 7 Feb 2001 11:30:51 +0100 Received: from Sina.sharif.ac.ir (sina.Sharif.AC.IR [194.225.40.9]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f17AUdu03755 for ; Wed, 7 Feb 2001 11:30:41 
+0100 (MET) Received: from localhost (roozbeh@localhost) by Sina.sharif.ac.ir (8.9.3/8.9.3) with ESMTP id NAA04302 for ; Wed, 7 Feb 2001 13:59:49 +0330 MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Message-ID: Date: Wed, 7 Feb 2001 13:59:49 +0330 Reply-To: Mailing list for the LaTeX3 project Sender: Mailing list for the LaTeX3 project From: Roozbeh Pournader Subject: Re: default inputenc/fontenc tight to language To: Multiple recipients of list LATEX-L In-Reply-To: <20010206111535.A18424@clipper.ens.fr> On Tue, 6 Feb 2001, Eric Brunet wrote: > jbezos wrote: > > Not at all. In latin1, character in the range 130--159 > > are assigned to control characters (and hence > > I wouldn't say it like that: the 128--159 range is reserved so that if > some document pass through a buggy programm that strips the eight bit, > then the resulting document doesn't have any extra control characters (in > the 0--31 range) with some ``interesting'' properties. No. They have different meaning as control characters. They're not the same as lower control characters (0x00--0x1F). 
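A hedged plain TeX sketch of what a warning for such bytes might look like at the TeX level. This is only an assumption about one possible approach, not how inputenc or ansinew actually behave; the warning text is invented, and only the two bytes 0x91 and 0x92 mentioned in this thread are handled.

```tex
% Plain TeX sketch; the message wording is made up for illustration.
% Make the bytes 0x91 and 0x92 active so that each use can be reported.
\catcode"91=\active
\catcode"92=\active
% Each definition reports the byte, then falls back to the usual
% TeX-world substitutes (0x60 and 0x27) for the CP1252 reading.
\def^^91{\message{Warning: byte 0x91 is a C1 control in ISO 8859-1}`}
\def^^92{\message{Warning: byte 0x92 is a C1 control in ISO 8859-1}'}
Some ^^91quoted^^92 text.
\bye
```

Covering the whole 0x80--0x9F range would need a loop over the character codes; that is left out here to keep the sketch short.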
--roozbeh 7-Feb-2001 13:51:27-GMT,5417;000000000001 Return-Path: Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by sunshine.math.utah.edu (8.9.3/8.9.3) with ESMTP id GAA12232 for ; Wed, 7 Feb 2001 06:51:25 -0700 (MST) Received: from relay.urz.uni-heidelberg.de (relay-eth.urz.uni-heidelberg.de [129.206.100.201]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f17DpDu10264; Wed, 7 Feb 2001 14:51:13 +0100 (MET) Received: from relay (relay.urz.uni-heidelberg.de [129.206.119.201]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with SMTP id OAA27934; Wed, 7 Feb 2001 14:50:54 +0100 (MET) Received: from RELAY.URZ.UNI-HEIDELBERG.DE by RELAY.URZ.UNI-HEIDELBERG.DE (LISTSERV-TCP/IP release 1.8b) with spool id 488619 for LATEX-L@RELAY.URZ.UNI-HEIDELBERG.DE; Wed, 7 Feb 2001 14:50:53 +0100 Received: from ix.urz.uni-heidelberg.de (mail.urz.uni-heidelberg.de [129.206.119.234]) by relay.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id OAA27927 for ; Wed, 7 Feb 2001 14:50:52 +0100 (MET) Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by ix.urz.uni-heidelberg.de (8.8.8/8.8.8) with ESMTP id OAA05914 for ; Wed, 7 Feb 2001 14:50:52 +0100 Received: from abel.math.umu.se (abel.math.umu.se [130.239.20.139]) by relay.uni-heidelberg.de (8.10.2+Sun/8.10.2) with ESMTP id f17Doqu10171 for ; Wed, 7 Feb 2001 14:50:52 +0100 (MET) Received: from [130.239.20.144] (mac144.math.umu.se [130.239.20.144]) by abel.math.umu.se (8.9.2/8.9.2) with ESMTP id OAA32060; Wed, 7 Feb 2001 14:49:05 +0100 (CET) X-Sender: lars@abel.math.umu.se References: <14968.6710.114015.220264@ux28.nets.de.eds.com> <200101292234.RAA14964@pluto.math.albany.edu> <14967.8829.903878.620595@istrati.zdv.uni-mainz.de> <200101310003.BAA02073@peano.cs.uni-dortmund.de> <14967.46479.253389.421142@istrati.zdv.uni-mainz.de> Mime-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable 
to 8bit by relay.urz.uni-heidelberg.de id OAA27928 Message-ID: Date: Wed, 7 Feb 2001 14:50:50 +0100 Reply-To: Mailing list for the LaTeX3 project Sender: Mailing list for the LaTeX3 project From: Lars =?iso-8859-1?Q?Hellstr=F6m?= Subject: Re: inputenc -> text+math To: Multiple recipients of list LATEX-L In-Reply-To: <14971.10163.320991.33744@istrati.zdv.uni-mainz.de> At 22.33 +0100 2001-02-02, Frank Mittelbach wrote: >Lars wrote: [snip] > > What I suspect is the right solution is to have \protect set to > > \@unexpandable@protect when scanning for \omit and have it reset to Actually, after thinking about it a bit more I think it should be \noexpand\protect rather than \noexpand\protect\noexpand; there's no need for the extra \noexpand as the expansion is stopped at the first thing which is \noexpand'ed. > > \@typeset@protect in the column template---then the robustness mechanisms > > for normal robust commands, text commands, and in the \IeC command > > respectively would take care of sorting things out. I doubt this can be > > done by patching the \halign primitive, but it could be built into e.g. the > > array package. > >yes, that would solve the problem i'm pretty sure of it but as i said in my >earlier mail this really doesn't help because it would then only work in array >but not in, any contributed package that uses \halign (this is why Vladimir >tries to patch \halign). > >perhaps one should investigate combining your solution (change of protect) >with a patch to \halign after all, eg via a clever use of \everycr and the >like. One variant would be to start offering LaTeX-variants of \halign and \valign which (i) has less corny syntax, (ii) takes care to prevent errors of the kind discussed, and (iii) might offer a few extra features (e.g. calc-like syntax for \tabskip glue specifications or something; I don't know if that might be useful). What I'm thinking of is that you could write something like \latex@halign[]{ ... \do[]{} ... 
} \latex@endhalign and have it work sort of like \tabskip= \halign{ ... &\tabskip=& ... \cr \crcr} but also do the (ii) above (so there would probably be a prepended to each ). The arguments are optional in case you don't want to set the \tabskip glue (but it occurs to me that an explicit \NoValue would be more appropriate for this level). The idea is that since we don't try to invent a new syntax for the