diff options
Diffstat (limited to 'doc')
-rw-r--r-- | doc/addendum.ca | 7 | ||||
-rw-r--r-- | doc/addendum.es | 7 | ||||
-rw-r--r-- | doc/addendum.fr | 6 | ||||
-rw-r--r-- | doc/addendum.it | 7 | ||||
-rw-r--r-- | doc/addendum.ja | 7 | ||||
-rw-r--r-- | doc/addendum.pl | 6 | ||||
-rw-r--r-- | doc/addendum.zh_Hans | 6 | ||||
-rw-r--r-- | doc/addendum_man.de | 15 | ||||
-rw-r--r-- | doc/addendum_man.es | 6 | ||||
-rw-r--r-- | doc/addendum_man.fr | 6 | ||||
-rw-r--r-- | doc/addendum_man.ja | 7 | ||||
-rw-r--r-- | doc/addendum_man.pl | 6 | ||||
-rw-r--r-- | doc/addendum_man.zh_Hans | 6 | ||||
-rw-r--r-- | doc/po4a.7.pod | 887 |
14 files changed, 979 insertions, 0 deletions
diff --git a/doc/addendum.ca b/doc/addendum.ca new file mode 100644 index 0000000..369c5d5 --- /dev/null +++ b/doc/addendum.ca @@ -0,0 +1,7 @@ +PO4A-HEADER:mode=after;position=^=head1 AUTORS;beginboundary=^=head1 + +=head1 TRADUCCIÓ + + Carme Cirera <menxu@hotmail.com> + Jordi Vilalta <jvprat@gmail.com> + diff --git a/doc/addendum.es b/doc/addendum.es new file mode 100644 index 0000000..7e3f13c --- /dev/null +++ b/doc/addendum.es @@ -0,0 +1,7 @@ +PO4A-HEADER:mode=after;position=^=head1 AUTORES;beginboundary=^=head1 + +=head1 TRADUCCION + + Jordi Vilalta <jvprat@gmail.com> + Omar Campagne <ocampagne@gmail.com> + diff --git a/doc/addendum.fr b/doc/addendum.fr new file mode 100644 index 0000000..2bf0199 --- /dev/null +++ b/doc/addendum.fr @@ -0,0 +1,6 @@ +PO4A-HEADER:mode=after;position=^=head1 AUTEURS;beginboundary=^=head1 + +=head1 TRADUCTION + + Martin Quinson (mquinson#debian.org) + diff --git a/doc/addendum.it b/doc/addendum.it new file mode 100644 index 0000000..def8282 --- /dev/null +++ b/doc/addendum.it @@ -0,0 +1,7 @@ +PO4A-HEADER:mode=after;position=^=head1 AUTORI;beginboundary=^=head1 + +=head1 TRADUZIONE + + Danilo Piazzalunga <danilopiazza@libero.it> + Marco Ciampa <ciampix@posteo.net> + diff --git a/doc/addendum.ja b/doc/addendum.ja new file mode 100644 index 0000000..e3c0ac4 --- /dev/null +++ b/doc/addendum.ja @@ -0,0 +1,7 @@ +PO4A-HEADER:mode=after;position=^=head1 著者;beginboundary=^=head1 + +=head1 訳者 + + 倉澤 望 <nabetaro@debian.or.jp> + Debian JP Documentation ML <debian-doc@debian.or.jp> + diff --git a/doc/addendum.pl b/doc/addendum.pl new file mode 100644 index 0000000..b2a2a74 --- /dev/null +++ b/doc/addendum.pl @@ -0,0 +1,6 @@ +PO4A-HEADER:mode=after;position=^=head1 AUTORZY;beginboundary=^=head1 + +=head1 TŁUMACZENIE + + Robert Luberda <robert@debian.org> + diff --git a/doc/addendum.zh_Hans b/doc/addendum.zh_Hans new file mode 100644 index 0000000..42308e5 --- /dev/null +++ b/doc/addendum.zh_Hans @@ -0,0 +1,6 @@ +PO4A-HEADER:mode=after;position=^=head1 作者;beginboundary=^=head1 + +=head1 翻译 + +taotieren <admin@taotieren.com> + diff --git a/doc/addendum_man.de b/doc/addendum_man.de new file mode 100644 index 0000000..2799713 --- /dev/null +++ b/doc/addendum_man.de @@ -0,0 +1,15 @@ +PO4A-HEADER:mode=after;position=^\.TH;beginboundary=^\.SH "AUTOR" + +.SH ÜBERSETZUNG +Diese Übersetzung wurde 2011, 2012, 2017-2024 von Helge Kreutzmann und 2012 +von Chris Leick <c.leick@vollbio.de> erstellt. Sie unterliegt +der GNU GPL Version 2 (oder neuer). + +Um die englische Originalversion zu lesen, geben Sie »man -L C BEFEHL« ein. + +Fehler in der Übersetzung melden Sie bitte über die Fehlerdatenbank (BTS) +von Debian oder indem Sie eine E-Mail an +.nh +<\fIdebian\-l10\-german@lists.debian.org\fR>, +.hy +schreiben. diff --git a/doc/addendum_man.es b/doc/addendum_man.es new file mode 100644 index 0000000..61ad2b3 --- /dev/null +++ b/doc/addendum_man.es @@ -0,0 +1,6 @@ +PO4A-HEADER:mode=after;position=^\.SH AUTOR;beginboundary=^\.SH + +.SH TRADUCCION + +Omar Campagne <ocampagne@gmail.com> + diff --git a/doc/addendum_man.fr b/doc/addendum_man.fr new file mode 100644 index 0000000..bd524f9 --- /dev/null +++ b/doc/addendum_man.fr @@ -0,0 +1,6 @@ +PO4A-HEADER:mode=after;position=^\.SH AUTEUR;beginboundary=^\.SH + +.SH TRADUCTION + +Nicolas François <nicolas.francois@centraliens.net> + diff --git a/doc/addendum_man.ja b/doc/addendum_man.ja new file mode 100644 index 0000000..b78accb --- /dev/null +++ b/doc/addendum_man.ja @@ -0,0 +1,7 @@ +PO4A-HEADER:mode=after;position=^\.SH 著者;beginboundary=^\.SH + +.SH 訳者 + +倉澤 望 <nabetaro@debian.or.jp> +Debian JP Documentation ML <debian-doc@debian.or.jp> + diff --git a/doc/addendum_man.pl b/doc/addendum_man.pl new file mode 100644 index 0000000..c313199 --- /dev/null +++ b/doc/addendum_man.pl @@ -0,0 +1,6 @@ +PO4A-HEADER:mode=after;position=^\.SH AUTOR;beginboundary=^\.SH + +.SH TŁUMACZENIE + +Robert Luberda <robert@debian.org> + diff --git a/doc/addendum_man.zh_Hans b/doc/addendum_man.zh_Hans new file mode 100644 index 0000000..37ac5d7 --- /dev/null +++ b/doc/addendum_man.zh_Hans @@ -0,0 +1,6 @@ +PO4A-HEADER:mode=after;position=^\.SH 作者;beginboundary=^\.SH + +.SH 翻译 + +taotieren <admin@taotieren.com> + diff --git a/doc/po4a.7.pod b/doc/po4a.7.pod new file mode 100644 index 0000000..2e81f7d --- /dev/null +++ b/doc/po4a.7.pod @@ -0,0 +1,887 @@ +=encoding UTF-8 + +=head1 NAME + +po4a - framework to translate documentation and other materials + +=head1 Introduction + +po4a (PO for anything) eases the maintenance of documentation translation using +the classical gettext tools. The main feature of po4a is that it decouples the +translation of content from its document structure. + +This document serves as an introduction to the po4a project with a focus on +potential users considering whether to use this tool and on the curious wanting +to understand why things are the way they are. + +=head1 Why po4a? + +The philosophy of Free Software is to make the technology truly available to +everyone. But licensing is not the only consideration: untranslated free +software is useless for non-English speakers. Therefore, we still have some +work to do to make software available to everybody. + +This situation is well understood by most projects and everybody is now +convinced of the necessity to translate everything. Yet, the actual +translations represent a huge effort of many individuals, crippled by small +technical difficulties. + +Thankfully, Open Source software is actually very well translated using the +gettext tool suite. These tools are used to extract the strings to translate +from a program and present the strings to translate in a standardized format +(called PO files, or translation catalogs). A whole ecosystem of tools has +emerged to help the translators actually translate these PO files. The result +is then used by gettext at run time to display translated messages to the end +users. + +Regarding documentation, however, the situation still somewhat disappointing. +At first translating documentation may seem to be easier than translating a +program as it would seem that you just have to copy the documentation source +file and start translating the content. However, when the original +documentation is modified, keeping track of the modifications quickly turns +into a nightmare for the translators. If done manually, this task is unpleasant +and error-prone. + +Outdated translations are often worse than no translation at all. End-users can +be tricked by documentation describing an old behavior of the program. +Furthermore, they cannot interact directly with the maintainers since they +don't speak English. Additionally, the maintainer cannot fix the problem as +they don't know every language in which their documentation is translated. +These difficulties, often caused by poor tooling, can undermine the motivation +of volunteer translators, further aggravating the problem. + +B<The goal of the po4a project is to ease the work of documentation translators>. +In particular, it makes documentation translations I<maintainable>. + +The idea is to reuse and adapt the gettext approach to this field. As with +gettext, texts are extracted from their original locations and presented to +translators as PO translation catalogs. The translators can leverage the +classical gettext tools to monitor the work to do, collaborate and organize as +teams. po4a then injects the translations directly into the documentation +structure to produce translated source files that can be processed and +distributed just like the English files. Any paragraph that is not translated +is left in English in the resulting document, ensuring that the end users never +see an outdated translation in the documentation. + +This automates most of the grunt work of the translation maintenance. +Discovering the paragraphs needing an update becomes very easy, and the process +is completely automated when elements are reordered without further +modification. Specific verification can also be used to reduce the chance of +formatting errors that would result in a broken document. + +Please also see the B<FAQ> below in this document for a more complete list +of the advantages and disadvantages of this approach. + +=head2 Supported formats + +Currently, this approach has been successfully implemented to several kinds +of text formatting formats: + +=over + +=cut + +# TRANSLATOR: 'man' refers to manpages and should probably not be translated + +=item man (mature parser) + +The good old manual pages' format, used by so many programs out there. po4a +support is very welcome here since this format is somewhat difficult to use and +not really friendly to newbies. + +The L<Locale::Po4a::Man(3pm)|Man> module also supports the mdoc format, used by +the BSD man pages (they are also quite common on Linux). + +=item AsciiDoc (mature parser) + +This format is a lightweight markup format intended to ease the authoring of +documentation. It is for example used to document the git system. Those +manpages are translated using po4a. + +See L<Locale::Po4a::AsciiDoc> for details. + +=item pod (mature parser) + +This is the Perl Online Documentation format. The language and extensions +themselves are documented using this format in addition to most existing Perl +scripts. It makes easy to keep the documentation close to the actual code by +embedding them both in the same file. It makes programmer's life easier, but +unfortunately, not the translator's, until you use po4a. + +See L<Locale::Po4a::Pod> for details. + +=item sgml (mature parser) + +Even if superseded by XML nowadays, this format is still used for documents +which are more than a few screens long. It can even be used for complete books. +Documents of this length can be very challenging to update. B<diff> often +reveals useless when the original text was re-indented after update. +Fortunately, po4a can help you after that process. + +Currently, only DebianDoc and DocBook DTD are supported, but adding support for +a new one is really easy. It is even possible to use po4a on an unknown SGML +DTD without changing the code by providing the needed information on the +command line. See L<Locale::Po4a::Sgml(3pm)> for details. + +=item TeX / LaTeX (mature parser) + +The LaTeX format is a major documentation format used in the Free Software +world and for publications. + +The L<Locale::Po4a::LaTeX(3pm)|LaTeX> module was tested with the Python +documentation, a book and some presentations. + +=item text (mature parser) + +The Text format is the base format for many formats that include long blocks of +text, including Markdown, fortunes, YAML front matter section, debian/changelog, +and debian/control. + +This supports the common format used in Static Site Generators, READMEs, and +other documentation systems. See L<Locale::Po4a::Text(3pm)|Text> for details. + +=item xml and XHMTL (probably mature parser) + +The XML format is a base format for many documentation formats. + +Currently, the DocBook DTD (see L<Locale::Po4a::Docbook(3pm)> for details) and +XHTML are supported by po4a. + +=item BibTex (probably mature parser) + +The BibTex format is used alongside LaTex for formatting lists of references (bibliographies). + +See L<Locale::Po4a::BibTex> for details. + +=item Docbook (probably mature parser) + +A XML-based markup language that uses semantic tags to describe documents. + +See L<Locale::Po4a:Docbook> for greater details. + +=item Guide XML (probably mature parser) + +A XML documentation format. This module was developed specifically to help with supporting +and maintaining translations of Gentoo Linux documentation up until at least March 2016 +(Based on the Wayback Machine). Gentoo have since moved to the DevBook XML format. + +See L<Locale::Po4a:Guide> for greater details. + +=item Wml (probably mature parser) + +The Web Markup Language, do not mixup WML with the WAP stuff used on cell phones. +This module relies on the Xhtml module, which itself relies on the XmL module. + +See L<Locale::Po4a::Wml> for greater details. + +=item Yaml (probably mature parser) + +A strict superset of JSON. YAML is often used as systems or configuration projects. +YAML is at the core of Red Hat's Ansible. + +See L<Locale::Po4a::Yaml> for greater details. + +=item RubyDoc (probably mature parser) + +The Ruby Document (RD) format, originally the default documentation format for Ruby +and Ruby projects before converted to RDoc in 2002. Though apparently the Japanese +version of the Ruby Reference Manual still use RD. + +See L<Locale::Po4a::RubyDoc> for greater details. + + +=item Halibut (probably experimental parser) + +A documentation production system, with elements similar to TeX, debiandoc-sgml, +TeXinfo, and others, developed by Simon Tatham, the developer of PuTTY. + +See L<Locale::Po4a:Halibut> for greater details. + +=item Ini (probably experimental parser) + +Configuration file format popularized by MS-DOS. + +See L<Locale::Po4a::Ini> for greater details. + +=item texinfo (very highly experimental parser) + +All of the GNU documentation is written in this format (it's even one of the +requirements to become an official GNU project). The support for +L<Locale::Po4a::Texinfo(3pm)|Texinfo> in po4a is still at the beginning. +Please report bugs and feature requests. + +=item gemtext (very highly experimental parser) + +The native plain text format of the Gemini protocol. The extension +".gmi" is commonly used. Support for this module in po4a is still in +its infancy. If you find anything, please file a bug or feature +request. + +=item Others supported formats + +Po4a can also handle some more rare or specialized formats, such as the +documentation of compilation options for the 2.4+ Linux kernels (L<Locale::Po4a::KernelHelp>) or the diagrams +produced by the dia tool (L<Locale::Po4a:Dia>). Adding a new format is often very easy and the main +task is to come up with a parser for your target format. See +L<Locale::Po4a::TransTractor(3pm)> for more information about this. + +=item Unsupported formats + +Unfortunately, po4a still lacks support for several documentation formats. Many +of them would be easy to support in po4a. This includes formats not just used +for documentation, such as, package descriptions (deb and rpm), package +installation scripts questions, package changelogs, and all the specialized +file formats used by programs such as game scenarios or wine resource files. + +=back + +=head1 Using po4a + +The easiest way to use this tool in your project is to write a configuration file for the B<po4a> +program, and only interact with this program. Please refer to its documentation, in L<po4a(1)>. The +rest of this section provides more details for the advanced users of po4a wanting to deepen their +understanding. + +=head2 Detailed schema of the po4a workflow + +Make sure to read L<po4a(1)> before this overly detailed section to get a simplified overview of the +po4a workflow. Come back here when you want to get the full scary picture, with almost all details. + +In the following schema, F<master.doc> is an example name for the documentation to be translated; +F<XX.doc> is the same document translated in the language XX while F<doc.XX.po> is the translation +catalog for that document in the XX language. Documentation authors will mostly be concerned with +F<master.doc> (which can be a manpage, an XML document, an AsciidDoc file, etc); the translators +will be mostly concerned with the PO file, while the end users will only see the F<XX.doc> file. + +Transitions with square brackets such as C<[po4a updates po]> represent the execution of a po4a tool +while transitions with curly brackets such as C<{update of master.doc}> represent a manual +modification of the project's files. + + master.doc + | + V + +<-----<----+<-----<-----<--------+------->-------->-------+ + : | | : +{translation} | {update of master.doc} : + : | | : + XX.doc | V V + (optional) | master.doc ->-------->------>+ + : | (new) | + V V | | + [po4a-gettextize] doc.XX.po -->+ | | + | (old) | | | + | ^ V V | + | | [po4a updates po] | + V | | V + translation.pot ^ V | + | | doc.XX.po | + | | (fuzzy) | + {translation} | | | + | ^ V V + | | {manual editing} | + | | | | + V | V V + doc.XX.po --->---->+<---<-- doc.XX.po addendum master.doc + (initial) (up-to-date) (optional) (up-to-date) + : | | | + : V | | + +----->----->----->------> + | | + | | | + V V V + +------>-----+------<------+ + | + V + [po4a updates translations] + | + V + XX.doc + (up-to-date) + +Again, this schema is overly complicated. Check on L<po4a(1)> for a simplified overview. + +The left part depicts how L<po4a-gettextize(1)> can be used to convert an existing translation +project to the po4a infrastructure. This script takes an original document and its translated +counterpart, and tries to build the corresponding PO file. Such manual conversion is rather +cumbersome (see the L<po4a-gettextize(1)> documentation for more details), but it is only needed +once to convert your existing translations. If you don't have any translation to convert, you can +forget about this and focus on the right part of the schema. + +On the top right part, the action of the original author is depicted, updating the documentation. +The middle right part depicts the automatic updates of translation files: the new material is +extracted and compared against the exiting translation. The previous translation is used for the +parts that didn't change, while partially modified parts are connected to the previous translation +with a "fuzzy" marker indicating that the translation must be updated. New or heavily modified +material is left untranslated. + +Then, the I<manual editing> block depicts the action of the translators, that +modify the PO files to provide translations to every original string and +paragraph. This can be done using either a specific editor such as the B<GNOME +Translation Editor>, KDE's B<Lokalize> or B<poedit>, or using an online +localization platform such as B<weblate> or B<pootle>. The translation result is +a set of PO files, one per language. Please refer to the gettext documentation +for more details. + +The bottom part of the figure shows how B<po4a> creates a translated source document from the +F<master.doc> original document and the F<doc.XX.po> translation catalog that was updated by the +translators. The structure of the document is reused, while the original content is replaced by its +translated counterpart. Optionally, an addendum can be used to add some extra text to the +translation. This is often used to add the name of the translator to the final document. See below +for details. + +Upon invocation, B<po4a> updates both the translation files and the translated documentation files +automatically. + +=head2 Starting a new translation project + +If you start from scratch, you just have to write a configuration file for po4a, and you are set. +The relevant templates are created for the missing files, allowing your contributors to translate +your project to their language. Please refer to L<po4a(1)> for a quick start tutorial and for all +details. + +If you have an existing translation, i.e. a documentation file that was translated manually, you can +integrate its content in your po4a workflow using B<po4a-gettextize>. This task is a bit cumbersome +(as described in the tool's manpage), but once your project is converted to po4a workflow, +everything will be updated automatically. + +=head2 Updating the translations and documents + +Once setup, invoking B<po4a> is enough to update both the translation PO files and translated +documents. You may pass the C<--no-translations> to B<po4a> to not update the translations (thus +only updating the PO files) or C<--no-update> to not update the PO files (thus only updating the +translations). This roughly corresponds to the individual B<po4a-updatepo> and B<po4a-translate> +scripts which are now deprecated (see "Why are the individual scripts deprecated" in the FAQ below). + +=head2 Using addenda to add extra text to translations + +Adding new text to the translation is probably the only thing that is easier in +the long run when you translate files manually :). This happens when you want +to add an extra section to the translated document, not corresponding to any +content in the original document. The classical use case is to give credits to +the translation team, and to indicate how to report translation-specific +issues. + +With po4a, you have to specify B<addendum> files, that can be conceptually +viewed as patches applied to the localized document after processing. Each +addendum must be provided as a separate file, which format is however very +different from the classical patches. The first line is a I<header line>, +defining the insertion point of the addendum (with an unfortunately cryptic +syntax -- see below) while the rest of the file is added verbatim at the +determined position. + +The header line must begin with the string B<PO4A-HEADER:>, followed by a +semi-colon separated list of I<key>B<=>I<value> fields. + +For example, the following header declares an addendum that must be placed +at the very end of the translation. + + PO4A-HEADER: mode=eof + +Things are more complex when you want to add your extra content in the middle +of the document. The following header declares an addendum that must be +placed after the XML section containing the string C<About this document> in +translation. + +=cut + +#TRANSLATORS: you can keep the French example here, or invert it: Use a header in your own language at the beginning, and then say that it shouldn't be in English, translating the second header example to English + +=pod + + PO4A-HEADER: position=About this document; mode=after; endboundary=</section> + +In practice, when trying to apply an addendum, po4a searches for the first line +matching the C<position> argument (this can be a regexp). Do not forget that +po4a considers the B<translated> document here. This documentation is in +English, but your line should probably read as follows if you intend your +addendum to apply to the French translation of the document. + + PO4A-HEADER: position=À propos de ce document; mode=after; endboundary=</section> + +Once the C<position> is found in the target document, po4a searches for the next +line after the C<position> that matches the provided C<endboundary>. The +addendum is added right B<after> that line (because we provided an +I<endboundary>, i.e. a boundary ending the current section). + +The exact same effect could be obtained with the following header, that is equivalent: + + PO4A-HEADER: position=About this document; mode=after; beginboundary=<section> + +Here, po4a searches for the first line matching C<< <section> >> after the line +matching C<About this document> in the translation, and add the addendum +B<before> that line since we provided a I<beginboundary>, i.e. a boundary marking +the beginning of the next section. So this header line requires placing the +addendum after the section containing C<About this document>, and instruct po4a +that a section starts with a line containing the C<< <section> >> tag. This is +equivalent to the previous example because what you really want is to add this +addendum either after C<< </section> >> or before C<< <section> >>. + +You can also set the insertion I<mode> to the value C<before>, with a similar +semantic: combining C<mode=before> with an C<endboundary> will put the addendum +just B<after> the matched boundary, that is the last potential boundary line before +the C<position>. Combining C<mode=before> with an C<beginboundary> will put the +addendum just B<before> the matched boundary, that is the last potential boundary +line before the C<position>. + + Mode | Boundary kind | Used boundary | Insertion point compared to the boundary + ========|===============|========================|========================================= + 'before'| 'endboundary' | last before 'position' | Right after the selected boundary + 'before'|'beginboundary'| last before 'position' | Right before the selected boundary + 'after' | 'endboundary' | first after 'position' | Right after the selected boundary + 'after' |'beginboundary'| first after 'position' | Right before the selected boundary + 'eof' | (none) | n/a | End of file + +=head3 Hint and tricks about addenda + +=over + +=item + +Remember that these are regexp. For example, if you want to match the end of a +nroff section ending with the line C<.fi>, do not use C<.fi> as B<endboundary>, +because it will match with C<the[ fi]le>, which is obviously not what you +expect. The correct B<endboundary> in that case is: C<^\.fi$>. + +=item + +White spaces ARE important in the content of the C<position> and boundaries. So +the two following lines B<are different>. The second one will only be found if +there is enough trailing spaces in the translated document. + + PO4A-HEADER: position=About this document; mode=after; beginboundary=<section> + PO4A-HEADER: position=About this document ; mode=after; beginboundary=<section> + +=item + +Although this context search may be considered to operate roughly on each line +of the B<translated> document, it actually operates on the internal data string of +the translated document. This internal data string may be a text spanning a +paragraph containing multiple lines or may be a XML tag itself alone. The +exact I<insertion point> of the addendum must be before or after the internal +data string and can not be within the internal data string. + +=item + +Pass the C<-vv> argument to B<po4a> to understand how the addenda are added to the +translation. It may also help to run B<po4a> in debug mode to see the actual +internal data string when your addendum does not apply. + +=back + +=head3 Addenda examples + +=over + +=item + +If you want to add something after the following nroff section: + + .SH "AUTHORS" + +You should select a two-step approach by setting B<mode=after>. Then you should +narrow down search to the line after B<AUTHORS> with the B<position> argument +regex. Then, you should match the beginning of the next section (i.e., +B<^\.SH>) with the B<beginboundary> argument regex. That is to say: + + PO4A-HEADER:mode=after;position=AUTHORS;beginboundary=\.SH + +=item + +If you want to add something right after a given line (e.g. after the line +"Copyright Big Dude"), use a B<position> matching this line, B<mode=after> and +give a B<beginboundary> matching any line. + + PO4A-HEADER:mode=after;position=Copyright Big Dude, 2004;beginboundary=^ + +=item + +If you want to add something at the end of the document, give a B<position> +matching any line of your document (but only one line. Po4a won't proceed if +it's not unique), and give an B<endboundary> matching nothing. Don't use simple +strings here like B<"EOF">, but prefer those which have less chance to be in +your document. + + PO4A-HEADER:mode=after;position=About this document;beginboundary=FakePo4aBoundary + +=back + + +=head3 More detailed example + +Original document (POD formatted): + + |=head1 NAME + | + |dummy - a dummy program + | + |=head1 AUTHOR + | + |me + +Then, the following addendum will ensure that a section (in French) about +the translator is added at the end of the file (in French, "TRADUCTEUR" +means "TRANSLATOR", and "moi" means "me"). + + |PO4A-HEADER:mode=after;position=AUTEUR;beginboundary=^=head + | + |=head1 TRADUCTEUR + | + |moi + | + +To put your addendum before the AUTHOR, use the following header: + + PO4A-HEADER:mode=after;position=NOM;beginboundary=^=head1 + +This works because the next line matching the B<beginboundary> C</^=head1/> after +the section "NAME" (translated to "NOM" in French), is the one declaring the +authors. So, the addendum will be put between both sections. Note that if +another section is added between NAME and AUTHOR sections later, po4a +will wrongfully put the addenda before the new section. + +To avoid this you may accomplish the same using B<mode>=I<before>: + + PO4A-HEADER:mode=before;position=^=head1 AUTEUR + +=head1 How does it work? + +This chapter gives you a brief overview of the po4a internals, so that you +may feel more confident to help us to maintain and to improve it. It may also +help you to understand why it does not do what you expected, and how to +solve your problems. + +=head2 TransTractors and project architecture + +At the core of the po4a project, the +L<Locale::Po4a::TransTractor(3pm)|TransTractor> class is the common ancestor to +all po4a parsers. This strange name comes from the fact that it is at the same +time in charge of translating document and extracting strings. + +More formally, it takes a document to translate plus a PO file containing +the translations to use as input while producing two separate outputs: +Another PO file (resulting of the extraction of translatable strings from +the input document), and a translated document (with the same structure as +the input one, but with all translatable strings replaced with content of +the input PO). Here is a graphical representation of this: + + Input document --\ /---> Output document + \ TransTractor:: / (translated) + +-->-- parse() --------+ + / \ + Input PO --------/ \---> Output PO + (extracted) + +This little bone is the core of all the po4a architecture. If you provide both +input and disregard the output PO, you get B<po4a-translate>. If you disregard +the output document instead, you get B<po4a-updatepo>. The B<po4a> uses a first +TransTractor to get an up-to-date output POT file (disregarding the output +documents), calls B<msgmerge -U> to update the translation PO files on disk, and +builds a second TransTractor with these updated PO files to update the output +documents. In short, B<po4a> provides one-stop solution to update what needs to +be, using a single configuration file. + +B<po4a-gettextize> also uses two TransTractors, but another way: It builds one +TransTractor per language, and then build a new PO file using the msgids of the +original document as msgids, and the msgids of the translated document as +msgstrs. Much care is needed to ensure that the strings matched this way +actually match, as described in L<po4a-gettextize(1)>. + +=head2 Format-specific parsers + +All po4a format parsers are implemented on top of the TransTractor. Some of them +are very simple, such as the Text, Markdown and AsciiDoc ones. They load the +lines one by one using C<TransTractor::shiftline()>, accumulate the paragraphs' +content or whatever. Once a string is completely parsed, the parser uses +C<TransTractor::translate()> to (1) add this string to the output PO file and +(2) get the translation from the input PO file. The parser then pushes the +result to the output file using C<TransTractor::pushline()>. + +Some other parsers are more complex because they rely on an external parser to +analyze the input document. The Xml, HTML, SGML and Pod parsers are built on top +of SAX parsers. They declare callbacks to events such as "I found a new title +which content is the following" to update the output document and output POT +files according to the input content using C<TransTractor::translate()> and +C<TransTractor::pushline()>. The Yaml parser is similar but different: it +serializes a data structure produced by the YAML::Tiny parser. This is why the +Yaml module of po4a fails to declare the reference lines: the location of each +string in the input file is not kept by the parser, so we can only provide +"$filename:1" as a string location. The SAX-oriented parsers use globals and +other tricks to save the file name and line numbers of references. + +One specific issue arises from file encodings and BOM markers. Simple parsers +can forget about this issue, that is handled by C<TransTractor::read()> (used +internally to get the lines of an input document), but the modules relying on an +external parser must ensure that all files are read with an appropriate PerlIO +decoding layer. The easiest is to open the file yourself, and provide an +filehandle or directly the full string to your external parser. Check on +C<Pod::read()> and C<Pod::parse()> for an example. The content read by the +TransTractor is ignored, but a fresh filehandle is passed to the external +parser. The important part is the C<< "<:encoding($charset)" >> mode that is passed +to the B<open()> perl function. + +=head2 Po objects + +The L<Locale::Po4a::Po(3pm)|Po> class is in charge of loading and using PO and +POT files. Basically, you can read a file, add entries, get translations with +the B<gettext()> method, write the PO into a file. More advanced features such as +merging a PO file against a POT file or validating a file are delegated to +B<msgmerge> and B<msgfmt> respectively. + +=head2 Contributing to po4a + +Even if you have never contributed to any Open Source project in the past, you +are welcome: we are willing to help and mentor you here. po4a is best maintained +by its users nowadays. As we lack manpower, we try to make the project welcoming +by improving the doc and the automatic tests to make you confident in +contributing to the project. Please refer to the CONTRIBUTING.md file for more +details. + +=head1 Open-source projects using po4a + +Here is a very partial list of projects that use po4a in production for their +documentation. If you want to add your project to the list, just drop +us an email (or a Merge Request). + +=over + +=item + +adduser (man): users and groups management tool. + +=item + +apt (man, docbook): Debian package manager. + +=item + +aptitude (docbook, svg): terminal-based package manager for Debian + +=item + +L<F-Droid website|https://gitlab.com/fdroid/fdroid-website> (markdown): +installable catalog of FOSS (Free and Open Source Software) applications for the +Android platform. + +=item + +L<git|https://github.com/jnavila/git-manpages-l10n> (asciidoc): +distributed version-control system for tracking changes in source code. + +=item + +L<Linux manpages|https://salsa.debian.org/manpages-l10n-team/manpages-l10n> (man) + +This project provides an infrastructure for translating many manpages +to different languages, ready for integration into several major +distributions (Arch Linux, Debian and derivatives, Fedora). + +=item + +L<Stellarium|https://github.com/Stellarium/stellarium> (HTML): +a free open source planetarium for your computer. po4a is used to +translate the sky culture descriptions. + +=item + +L<Jamulus|https://jamulus.io/> (markdown, yaml, HTML): +a FOSS application for online jamming in real time. The website +documentation is maintained in multiple languages using po4a. + +=item + +Other item to sort out: +L<https://gitlab.com/fdroid/fdroid-website/> +L<https://github.com/fsfe/reuse-docs/pull/61> + +=back + +=head1 FAQ + +=head2 How do you pronounce po4a? + +I personally vocalize it as L<pouah|https://en.wiktionary.org/wiki/pouah>, which +is a French onomatopoetic that we use in place of yuck :) I may have a strange sense of humor :) + +=head2 Why are the individual scripts deprecated? + +Indeed, B<po4a-updatepo> and B<po4a-translate> are deprecated in favor of B<po4a>. The reason is +that while B<po4a> can be used as a drop-in replacement to these scripts, there is quite a lot of +code duplication here. Individual scripts last around 150 lines of codes while the B<po4a> program +lasts 1200 lines, so they do a lot in addition of the common internals. The code duplication results +in bugs occuring in both versions and needing two fixes. One example of such duplication are the +bugs #1022216 in Debian and the issue #442 in GitHub that had the exact same fix, but one in B<po4a> +and the other B<po4a-updatepo>. + +In the long run, I would like to drop the individual scripts and only maintain one version of this +code. The sure thing is that the individual scripts will not get improved anymore, so only B<po4a> +will get the new features. That being said, there is no deprecation urgency. I plan to keep the +individual scripts as long as possible, and at least until 2030. If your project still use +B<po4a-updatepo> and B<po4a-translate> in 2030, you may have a problem. + +We may also remove the deprecation of these scripts at some point, if a refactoring reduces the code +duplication to zero. If you have an idea (or better: a patch), your help is welcome. + +=head2 What about the other translation tools for documentation using gettext? + +There are a few of them. Here is a possibly incomplete list, and more tools are +coming at the horizon. + +=over + +=item B<poxml> + +This is the tool developed by KDE people to handle DocBook XML. AFAIK, it +was the first program to extract strings to translate from documentation to +PO files, and inject them back after translation. + +It can only handle XML, and only a particular DTD. I'm quite unhappy with +the handling of lists, which end in one big msgid. When the list become big, +the chunk becomes harder to swallow. + +=item B<po-debiandoc> + +This program done by Denis Barbier is a sort of precursor of the po4a SGML +module, which more or less deprecates it. As the name says, it handles only +the DebianDoc DTD, which is more or less a deprecated DTD. + +=item B<xml2po.py> + +Used by the GIMP Documentation Team since 2004, works quite well even if, +as the name suggests, only with XML files and needs specially configured +makefiles. + +=item B<Sphinx> + +The Sphinx Documentation Project also uses gettext extensively to manage its +translations. Unfortunately, it works only for a few text formats, rest and +markdown, although it is perhaps the only tool that does this managing the whole +translation process. + +=back + +The main advantages of po4a over them are the ease of extra content addition +(which is even worse there) and the ability to achieve gettextization. + +=head2 SUMMARY of the advantages of the gettext based approach + +=over 2 + +=item + +The translations are not stored along with the original, which makes it +possible to detect if translations become out of date. + +=item + +The translations are stored in separate files from each other, which prevents +translators of different languages from interfering, both when submitting +their patch and at the file encoding level. + +=item + +It is based internally on B<gettext> (but B<po4a> offers a very simple +interface so that you don't need to understand the internals to use it). +That way, we don't have to re-implement the wheel, and because of their +wide use, we can think that these tools are more or less bug free. + +=item + +Nothing changed for the end-user (beside the fact translations will +hopefully be better maintained). The resulting documentation file +distributed is exactly the same. + +=item + +No need for translators to learn a new file syntax and their favorite PO +file editor (like Emacs' PO mode, Lokalize or Gtranslator) will work just fine. + +=item + +gettext offers a simple way to get statistics about what is done, what should +be reviewed and updated, and what is still to do. Some example can be found +at those addresses: + + - https://docs.kde.org/stable5/en/kdesdk/lokalize/project-view.html + - http://www.debian.org/intl/l10n/ + +=back + +But everything isn't green, and this approach also has some disadvantages +we have to deal with. + +=over 2 + +=item + +Addenda are somewhat strange at the first glance. + +=item + +You can't adapt the translated text to your preferences, like splitting a +paragraph here, and joining two other ones there. But in some sense, if +there is an issue with the original, it should be reported as a bug anyway. + +=item + +Even with an easy interface, it remains a new tool people have to learn. + +One of my dreams would be to integrate somehow po4a to Gtranslator or +Lokalize. When a documentation file is opened, the strings are +automatically extracted, and a translated file + po file can be +written to disk. If we manage to do an MS Word (TM) module (or at +least RTF) professional translators may even use it. + +=back + +=head1 SEE ALSO + +=over + +=item + +The documentation of the all-in-one tool that you should use: L<po4a(1)>. + +=item + +The documentation of the individual po4a scripts: L<po4a-gettextize(1)>, +L<po4a-updatepo(1)>, L<po4a-translate(1)>, L<po4a-normalize(1)>. + +=item + +The additional helping scripts: +L<msguntypot(1)>, L<po4a-display-man(1)>, L<po4a-display-pod(1)>. + +=item + +The parsers of each formats, in particular to see the options accepted by each +of them: L<Locale::Po4a::AsciiDoc(3pm)> L<Locale::Po4a::Dia(3pm)>, +L<Locale::Po4a::Guide(3pm)>, L<Locale::Po4a::Ini(3pm)>, +L<Locale::Po4a::KernelHelp(3pm)>, L<Locale::Po4a::Man(3pm)>, +L<Locale::Po4a::RubyDoc(3pm)>, L<Locale::Po4a::Texinfo(3pm)>, +L<Locale::Po4a::Text(3pm)>, L<Locale::Po4a::Xhtml(3pm)>, +L<Locale::Po4a::Yaml(3pm)>, L<Locale::Po4a::BibTeX(3pm)>, +L<Locale::Po4a::Docbook(3pm)>, L<Locale::Po4a::Halibut(3pm)>, +L<Locale::Po4a::LaTeX(3pm)>, L<Locale::Po4a::Pod(3pm)>, +L<Locale::Po4a::Sgml(3pm)>, L<Locale::Po4a::TeX(3pm)>, +L<Locale::Po4a::Wml(3pm)>, L<Locale::Po4a::Xml(3pm)>. + +=item + +The implementation of the core infrastructure: +L<Locale::Po4a::TransTractor(3pm)> (particularly important to understand the +code organization), L<Locale::Po4a::Chooser(3pm)>, L<Locale::Po4a::Po(3pm)>, +L<Locale::Po4a::Common(3pm)>. Please also check the F<CONTRIBUTING.md> file in +the source tree. + +=back + +=head1 AUTHORS + + Denis Barbier <barbier,linuxfr.org> + Martin Quinson (mquinson#debian.org) + +=cut + +LocalWords: PO gettext SGML XML texinfo perl gettextize fr Lokalize KDE updatepo +LocalWords: Gtranslator gettextization VCS regexp boundary +LocalWords: lang TransTractor debconf diff poxml debiandoc LocalWords +LocalWords: Denis barbier linuxfr org Quinson |