How to Generate Plain-Text Source-Code PDF Examples That Work in a Document Viewer

How to generate plain-text source-code PDF examples that work in a document viewer?

You should append a (syntactically correct) xref and trailer section to the end of the file. That means: each object in your PDF needs one line in the xref table, even if the byte offset isn't correctly stated. Then Ghostscript, pdftk or qpdf can re-establish a correct xref and render the file:

[...]
endobj
xref
0 8
0000000000 65535 f
0000000010 00000 n
0000000020 00000 n
0000000030 00000 n
0000000040 00000 n
0000000050 00000 n
0000000060 00000 n
0000000070 00000 n
trailer
<</Size 8/Root 1 0 R>>
startxref
555
%%EOF
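
If you would rather write correct offsets yourself than rely on a repair tool, the bookkeeping can be scripted. The following is only a minimal sketch under stated assumptions: the helper name append_xref is hypothetical, every object is assumed to start at the beginning of a line as "N 0 obj" with generation number 0, objects are numbered 1..N without gaps, line endings are LF, and no stream happens to contain a line that looks like an object header.

```python
import re


def append_xref(body: bytes, root: int = 1) -> bytes:
    """Append an xref table, trailer and startxref to a hand-written PDF body.

    Assumptions (not checked beyond the gap test): objects start at the
    beginning of a line as 'N 0 obj', all generation numbers are 0, and
    `root` is the object number of the catalog.
    """
    if not body.endswith(b"\n"):
        body += b"\n"

    # Byte offset of every 'N 0 obj' header found at the start of a line.
    offsets = {
        int(m.group(1)): m.start()
        for m in re.finditer(rb"^(\d+) 0 obj\b", body, re.MULTILINE)
    }
    size = max(offsets) + 1  # object 0 is the head of the free list
    missing = [n for n in range(1, size) if n not in offsets]
    if missing:
        raise ValueError(f"object numbers missing from body: {missing}")

    # Each xref entry is exactly 20 bytes: offset, generation, flag, 2-byte EOL.
    out = [body, b"xref\n", b"0 %d\n" % size, b"0000000000 65535 f \n"]
    out += [b"%010d 00000 n \n" % offsets[n] for n in range(1, size)]
    out.append(b"trailer\n<</Size %d/Root %d 0 R>>\n" % (size, root))
    # The xref keyword starts right after the body, so its offset is len(body).
    out.append(b"startxref\n%d\n%%%%EOF\n" % len(body))
    return b"".join(out)
```

Read the hand-written objects in binary mode, pass them through append_xref, and write the result back in binary mode so that no newline translation shifts the offsets.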

Is there a way to obtain Source Code of a PDF file in Windows?

For what you propose, one potential solution is MuPDF/MuTool. If you wish to decompile an existing PDF, MuPDF-GL for Windows has options for that: use option A to convert to ASCII and "PrettyPrint".

You can write your own PDF as text, but that approach has limitations. The following is accepted as a working PDF:

%PDF-1.2
4 0 obj
<< >>
stream
BT/ 36 Tf((Hello World!))' ET
endstream
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R /Contents 4 0 R >>
endobj
2 0 obj
<< /Kids [3 0 R ] /Count 1 /Type /Pages /MediaBox [ -195 -442 400 400 ] >>
endobj
1 0 obj
<< /Pages 2 0 R /Type /Catalog >>
endobj
trailer
<< /Root 1 0 R >>
%%EOF

Courtesy of Thomas; see "Create Memorystream of type pdf and return to browser".

If you are "hand-balling" UTF-16 characters on a "small device", it becomes a step harder; see https://stackoverflow.com/a/68442444/10802527

More useful for producing your own PDFs: many Raspberry Pi users compile PDFs via mutool create, see https://mupdf.com/docs/manual-mutool-create.html

The input text that is translated during compilation is much simpler, especially for image handling:

%%MediaBox 0 0 612 792
%%Font TmRm Times-Roman
%%Font Helv-C Helvetica Cyrillic
%%Font Helv-G Helvetica Greek
%%Image I0 logo/ClientLogo.png

% Draw the image.
q
480 0 0 480 50 250 cm
/I0 Do
Q

% Draw a triangle. (Can be rectangles or a grid etc)
q
1 0 0 rg
50 50 m
100 200 l
200 50 l
f
Q

% Show some text. (Remember we humans work downwards, so 50 in then 760,730,700, etc. downwards)
q
0 0 1 rg
BT /TmRm 24 Tf 50 760 Td (Hello, from EPS32!) Tj ET
BT /Helv-C 24 Tf 50 730 Td <fac4d2c1d7d3d4d7d5cad4c521> Tj ET
BT /Helv-G 24 Tf 50 700 Td ( I am Line 3) Tj ET
Q
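
The hex string on the Helv-C line above is ordinary text in a single-byte Cyrillic encoding: its bytes decode as "Здравствуйте!" in KOI8. A small script can generate such strings and step the baseline down the page. This is a sketch only: the helper names hex_string and text_lines are made up for illustration, and the choice of the koi8_u codec rests on my reading of the mutool create manual (KOI8-U for Cyrillic fonts, ISO 8859-7 for Greek), so check that against the documentation for your version.

```python
def hex_string(text: str, encoding: str) -> str:
    """Encode text in a single-byte encoding and wrap it as a PDF hex string."""
    return "<" + text.encode(encoding).hex() + ">"


def text_lines(font: str, size: int, x: int, top: int, leading: int, strings):
    """Yield one BT...ET line per PDF string, each `leading` units lower.

    `strings` must already be PDF string literals, e.g. "(Hello)" or "<fa...>".
    """
    for i, s in enumerate(strings):
        yield f"BT /{font} {size} Tf {x} {top - i * leading} Td {s} Tj ET"


if __name__ == "__main__":
    # Assumption: the font alias Helv-C is declared with %%Font as in the input above.
    greeting = hex_string("Здравствуйте!", "koi8_u")
    for line in text_lines("Helv-C", 24, 50, 760, 30, [greeting, "(second line)"]):
        print(line)
```

The generated lines can be pasted between q and Q in the input file exactly like the hand-written BT ... ET lines.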

Is there a text string variable type in Adobe PDF specification?

PDF has nothing like a variable the way PostScript does. What may come close to what you are trying to achieve (outputting the same text in multiple places) is a form XObject. Just like a page, it has a content stream with graphics objects such as (Hello, world!) Tj, and it can be drawn on a page (or in another XObject) with the Do operator. The operand of Do corresponds to a key in the XObject dictionary inside the page's Resources dictionary. The PDF would look something like this. (Note that the stream lengths, the cross-reference table and the trailer are no longer valid, so consider this pseudo-PDF.)

%PDF-1.4

1 0 obj % entry point
<<
/Type /Catalog
/Pages 2 0 R
>>
endobj

2 0 obj
<<
/Type /Pages
/MediaBox [ 0 0 200 200 ]
/Count 1
/Kids [ 3 0 R ]
>>
endobj

3 0 obj
<<
/Type /Page
/Parent 2 0 R
/Resources <<
/Font <<
/F1 4 0 R
>>
/XObject <<
/A 6 0 R % XObject /A is obj 6 0
>>
>> % /Resources must close here
/Contents 5 0 R
>>
endobj

4 0 obj
<<
/Type /Font
/Subtype /Type1
/BaseFont /Times-Roman
>>
endobj

5 0 obj % page content
<<
/Length 44
>>
stream
BT
70 50 TD % this has no effect on `/A Do` - only on the "manual" `Tj`
/A Do % do the drawing of XObject A
/F1 12 Tf % without this line: "Error: No font in show;"
% if without TD, then the next text is just appended
%-10 50 TD
0 0 TD % "Td/TD move to the start of next line"; but here like \r
(Hello, world - manual!) Tj
ET
endstream
endobj

6 0 obj
<< /Type /XObject
/Subtype /Form
/FormType 1
/BBox [ 0 0 1000 1000 ]
/Matrix [ 1 0 0 1 0 0 ]
/Resources << /ProcSet [ /PDF ] >>
/Length 58
>>
stream
%70 50 TD % without this `TD` setting, `/A Do` places this in 0,0 - bottom left corner
/F1 12 Tf
(Hello, world!) Tj
endstream
endobj

xref
0 7
0000000000 65535 f
0000000010 00000 n
0000000079 00000 n
0000000173 00000 n
0000000301 00000 n
0000000380 00000 n
0000000450 00000 n
trailer
<<
/Size 7
/Root 1 0 R
>>
startxref
600
%%EOF

Output in evince:

[Screenshot: hello-evince.png]

EDIT: The text in the form XObject appears at the lower left corner because the current transformation matrix (CTM) equals the identity matrix at the time of the show-string operation. The initial CTM of the form XObject equals the concatenation of [the CTM in the parent stream when Do is invoked] and [the Matrix entry in the form XObject dictionary], which is the identity in this case. The text matrix is not propagated from the parent stream to the form XObject.
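
The concatenation can be checked with a few lines of arithmetic. The sketch below models PDF matrices as six-number tuples [a b c d e f]; the function names concat and apply are my own, and the 1 0 0 1 70 50 cm shown in the comments is a hypothetical change to the page stream (issued before /A Do), not something the listing above contains.

```python
IDENTITY = (1, 0, 0, 1, 0, 0)


def concat(m, n):
    """Return the matrix that applies m first and then n (PDF row-vector convention)."""
    a1, b1, c1, d1, e1, f1 = m
    a2, b2, c2, d2, e2, f2 = n
    return (
        a1 * a2 + b1 * c2,
        a1 * b2 + b1 * d2,
        c1 * a2 + d1 * c2,
        c1 * b2 + d1 * d2,
        e1 * a2 + f1 * c2 + e2,
        e1 * b2 + f1 * d2 + f2,
    )


def apply(m, x, y):
    """Map the point (x, y) through matrix m."""
    a, b, c, d, e, f = m
    return (a * x + c * y + e, b * x + d * y + f)


# Case from the listing: /Matrix is the identity and so is the page CTM at Do.
form_ctm = concat(IDENTITY, IDENTITY)
origin_as_listed = apply(form_ctm, 0, 0)   # the form's text origin stays at the page origin

# Hypothetical variant: '1 0 0 1 70 50 cm' in the page stream before '/A Do'.
shifted_ctm = concat(IDENTITY, (1, 0, 0, 1, 70, 50))
origin_shifted = apply(shifted_ctm, 0, 0)  # the form's origin now maps to (70, 50)
```

Under that model, moving the form's output is a matter of changing the CTM (or the form's /Matrix entry), not of issuing TD in the parent stream.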


