pdf2htmlEX tries its best to render the PDF precisely and maintain proper styling, while retaining text and optimizing for the Web.
Fonts are extracted from the PDF and then embedded into the HTML (Type 3 fonts are not supported). Text in the converted HTML file is usually selectable and copyable.
Other objects are rendered as images and also embedded.
If this switch is on, a locally matched font will be used and embedded; otherwise only font names are exported, so that web browsers may try to find suitable fonts themselves.
.TP
.B --embed-external-font <0|1> (Default: 0)
Similar to the above, but for non-base fonts.
.TP
.B --heps <len>, --veps <len> (Default: 1)
Specify the maximum tolerable horizontal/vertical offset (in pixels).
pdf2htmlEX will try to optimize the generated HTML file by moving text within this distance.
.TP
.B --space-threshold <ratio> (Default: 1.0/6)
pdf2htmlEX will insert a whitespace character ' ' if the distance between two consecutive letters on the same line is wider than ratio * font_size.
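.IP
The rule above can be sketched in Python as follows. This is only an illustration of the threshold test, not the code pdf2htmlEX actually runs; the glyph record (character, left edge, right edge) and the function name join_with_spaces are hypothetical, and all positions are assumed to be in the same units as font_size.
.nf
```python
from typing import List, Optional, Tuple

# Hypothetical glyph record: (character, left edge, right edge),
# measured in the same units as font_size.
Glyph = Tuple[str, float, float]


def join_with_spaces(glyphs: List[Glyph], font_size: float,
                     ratio: float = 1.0 / 6) -> str:
    """Rebuild one line of text, adding ' ' where a gap exceeds ratio * font_size."""
    threshold = ratio * font_size
    pieces: List[str] = []
    prev_right: Optional[float] = None
    for ch, left, right in glyphs:
        # Gap between the previous glyph's right edge and this glyph's left edge.
        if prev_right is not None and left - prev_right > threshold:
            pieces.append(" ")
        pieces.append(ch)
        prev_right = right
    return "".join(pieces)
```
.fi
With font_size 12 and the default ratio, the threshold is 2 units, so a gap of 3 units produces a space while a gap of 0.5 units does not. Raising the ratio makes space insertion less likely.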
.TP
.B --font-size-multiplier <ratio> (Default: 10)
Many web browsers limit the minimum font size, and many round the given font size, which results in incorrect rendering.
Specifying a ratio greater than 1 resolves this issue.
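.IP
A rough Python model of why a multiplier helps is given below. It rests on assumptions not stated above: that the browser rounds font sizes to whole pixels and clamps them to a minimum, and that the enlarged text is later scaled back down by the same ratio. The function rendered_size and its parameters are hypothetical.
.nf
```python
def rendered_size(font_size: float, multiplier: float = 1.0,
                  min_size: float = 0.0) -> float:
    """Effective size after an assumed browser rounds and clamps the font size."""
    enlarged = font_size * multiplier
    # Assumed browser behavior: round to whole pixels, then enforce a minimum.
    shown = max(float(round(enlarged)), min_size)
    # Assumption: the page scales the text back down by the same ratio.
    return shown / multiplier
```
.fi
In this model a 7.4 pixel font is rendered at 7 pixels with a multiplier of 1, but stays at 7.4 pixels with a multiplier of 10. Likewise, a 4 pixel font is forced up to 6 pixels by a 6 pixel minimum unless the multiplier lifts it above that limit.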
.TP
.B --always-apply-tounicode <0|1> (Default: 0)
A ToUnicode map may be provided for fonts in a PDF, indicating the 'meaning' of the characters.
However, there is often better "ToUnicode" info in Type 1 fonts, and sometimes the provided ToUnicode map is wrong. So by default pdf2htmlEX will find the Unicode value directly from the fonts instead of the ToUnicode map. This behavior may be changed by turning on this switch.