Use of XSL-FO, CSS3 instead of CSS2 to create Paginated documents like PDF?
Thanks for all the comments and answers!
Now, in 2014, more than 1.5 years after my post (May 17 '12), it is time to consolidate: no single answer was, for me, a "full answer", but all of them (see Nenotlep's and Alex's) contributed to the big picture.
My main motivation for consolidating now is @mzjn's news (here) from 2013-11.
XSL-FO is officially dying
On Sat, 2013-11-02, Liam R. E. Quin wrote:
"We have closed the Working Group because not enough people were taking part", said the W3C XML Activity Lead about the discontinuation of XSL-FO 2.0 (see a better copy here).
The last update to the Working Draft was in January 2012, and it is now confirmed: the W3C has stopped developing XSL-FO 2.0.
Why? It will be replaced by CSS3-page, see below.
PS: to discuss the "official statement", use https://stackoverflow.com/a/21345449/287948
CSS3 is officially growing
The CSS3-page standard is still a draft, but many applications, like PrinceXML v9 and AntennaHouse Formatter v6, have demonstrated that it is ready (!), and the launch of HTML5 expected for 2014 is carrying the forecast release of CSS3 along with it.
So, as I understand it, for the W3C, CSS3-page does everything we need to express good prints and good PDF.
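To give an idea of what CSS3-page lets a stylesheet express, here is a minimal sketch. The page size, margins and footer text are arbitrary values I chose for illustration, not something taken from the tools above; margin boxes such as `@bottom-center` are defined in the draft, but support varies between renderers, so check your formatter's documentation.

```css
/* Sketch only: sizes, margins and footer text are example values. */
@page {
  size: A4;
  margin: 25mm 20mm;

  /* Running footer with the current and total page numbers */
  @bottom-center {
    content: "Page " counter(page) " of " counter(pages);
  }
}

/* No footer on the first page (e.g. a title page) */
@page :first {
  @bottom-center { content: none; }
}

/* Wider inner margin for binding in double-sided output */
@page :left  { margin-left: 20mm; margin-right: 30mm; }
@page :right { margin-left: 30mm; margin-right: 20mm; }
```

In XSL-FO the same layout needs page masters and static content regions; here it is a few rules on top of an ordinary HTML+CSS template.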
One day, in the far future... PDF will die (it is complex and is not part of the XML family or of the W3C's investments), and many claim that EPUB will replace it.
This is another good motivation: tablet readers and PC browsers will print HTML, XHTML and EPUB as well as they print PDF, so PDF will no longer be necessary... When that day comes, the only standard needed by, e.g., the WebKit printing project will be CSS3-page.
CSS3 is key to two strategic goals: 1) generating good PDF from XML or HTML content; 2) replacing PDF.
NOTE: other 2014 updates to the links in the question: wkHtmlToPDF is now here. As for "new texts", we now have many; see e.g. Building Books with CSS3.
An updated answer for programmers to this page's question, "Why use XSL-FO instead of CSS2 to transform HTML into good PDF?"
If you go further and implement a new system for XML publishing, there is no good reason to use XSL-FO. SUMMARIZING:
- XSL-FO is a dead technology today, used only by niche companies to maintain legacy systems at big publishers like Elsevier... Most writers/readers of Stack Overflow are from small and medium companies. Companies like O'Reilly Media, Inc. already use CSS3 for print.
- CSS3 will replace CSS2, covering all the gaps of CSS2 (and fears such as @AlexS's).
- Today (2014), as you can check with Google or my links (see PrinceXML v9 and AntennaHouse Formatter v6), we have some good software for rendering content with CSS2 or CSS3.
- As @bytebuster says, "CSS is much easier to develop" (and easier to learn!).
- As I said above, CSS3 is not isolated; it is a piece of the "XML/HTML/SVG" family.
- It is much cheaper to develop "HTML+CSS templates" (hourly cost of a standard web designer doing a simple task) than "XSL-FO templates" (hourly cost of a rare professional doing a complex task).
Jan'2016, the definitive CSS3 standard is coming!
About the W3C standards: the old "css-page" was replaced by "css-break", and "paged media" became "fragmentation"... It is now a Candidate Recommendation; see https://www.w3.org/TR/css-break-3
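For a taste of what css-break covers, here is a minimal sketch of break control for a book-like document. The selectors are hypothetical (they assume chapters start with `h1` and that figures and tables should stay whole); the properties themselves come from the css-break-3 draft linked above.

```css
/* Sketch only: the selectors are assumptions about the document markup. */
h1 { break-before: page; }              /* each chapter title starts on a new page */
h2, h3 { break-after: avoid; }          /* keep a heading with the text that follows it */
figure, table { break-inside: avoid; }  /* try not to split these across pages */
p { orphans: 2; widows: 2; }            /* at least 2 lines before and after a break */
```

Note that `avoid` is a request, not a guarantee: a renderer may still break inside a table that is taller than one page.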
Apr'2020, Blimey, +4 years and nothing!... Ok, need more tests
In total, 8 years since the question was posted, and 4 years since the "css-break-3 finished!" announcement...
Chrome was the first to finish, in 2019, but something went wrong in the W3C's test validation team, and in 2020 it went back... Now the status (over 23 tests) is:
- Chrome's Blink engine fails 1 test;
- Firefox's Gecko engine fails 3 tests.
The draft is now here and the tests are here.
XSL-FO compared to classical technologies
Newer, XML-based publishing is really taking off. Just dismissing XML as inefficient and as having "inherent problems" shows a lack of understanding of what customers really want: open technologies that are not locked to a specific vendor.
XSL-FO, just like XHTML, XSLT, SVG and other public standards, may not be perfect, and may not have everything the traditional technologies and file formats have.
But software solutions that implement these standards are cheaper, easier to integrate, and interchangeable. You can switch from one implementation to another much more easily than you can with a proprietary technology.
It's understandable why the software providers of the '80s hate open standards: they level the playing field. Adobe stopped supporting SVG after it became obvious that they could not control the standard; ISIS Papyrus could support XSL-FO, but they choose not to because they would be forced to compete on an equal footing with everybody else.
Anything better than XSL-FO?
The best XSL-FO engine is Antenna House XSL Formatter. RenderX XEP is also pretty good; Apache FOP is pretty average, but you can make it work for simple things.
There is no other "standard" for getting XML into PDF. For SGML there used to be DSSSL. I think some people have also implemented XML->TeX conversion and then use a TeX typesetter. The other (commercial) options off the top of my head are:
- PrinceXML (XML+CSS)
- PTC Arbortext (FOSI, XSL-FO and APP/3B2)
- TurnKey TopLeaf (proprietary)
- SDL XySoft XPP (proprietary)
- Typefi (based on InDesign Server)
I guess if your print publishing is simple enough you could use something like iText to build the PDF using a Java class or something.