# pdf2htmlEX
[![Build Status](https://travis-ci.org/coolwanglu/pdf2htmlEX.png?branch=master)](https://travis-ci.org/coolwanglu/pdf2htmlEX)
A beautiful demo is worth a thousand words:
- [**Typography**](http://coolwanglu.github.com/pdf2htmlEX/demo/geneve.html) [Original](https://github.com/raphink/geneve_1564/raw/master/geneve_1564.pdf)
- [**Full Circle Magazine(large)**](http://coolwanglu.github.com/pdf2htmlEX/demo/issue65_en.html) [Sample](http://coolwanglu.github.com/pdf2htmlEX/demo/issue65_en_sample.html) [Original](http://dl.fullcirclemagazine.org/issue65_en.pdf)
- [**Formulas**](http://coolwanglu.github.com/pdf2htmlEX/demo/cheat.html) [Original](http://www.tug.org/texshowcase/cheat.pdf)
- [**Scientific Paper**](http://coolwanglu.github.com/pdf2htmlEX/demo/demo.html) [Original](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.148.349&rep=rep1&type=pdf)
- [**Chinese**](http://coolwanglu.github.com/pdf2htmlEX/demo/chn.html) [Original](http://files.cnblogs.com/phphuaibei/git%E6%90%AD%E5%BB%BA.pdf)
## Introduction

pdf2htmlEX renders PDF files in HTML using modern Web technologies. It aims to render accurately while staying optimized for Web display.

It is optimized for modern web browsers. On Linux/Mac, the generated HTML pages can be as beautiful as the PDF files.

This program is designed for scientific papers with complicated formulas and figures, so precise rendering is the #1 concern. General PDF files are of course also supported.

### Why HTML?

HTML, together with CSS and JavaScript, is much more open and flexible than PDF. Almost everything can be customized:

- Documents can be embedded in web pages with a consistent theme and behavior
- Cross-references to other documents are much easier and more intuitive
- JavaScript can add functions to the document, e.g. access control, animation, statistics

Readers benefit as well:

- Read while downloading
- Plugin-free

## Features

* Optional single HTML file output
* Precise rendering
* Text preserved - you can select & copy & paste
* Proper styling
    - Font - extracted and reencoded
    - Color
    - Transformation
* Links
* Outline
* [EXPERIMENTAL] Path drawing with CSS
    - Orthogonal lines
    - Rectangles
    - Linear gradients
* Not fully supported (rendered as images)
    - Type 3 fonts
    - Non-text objects

## Get started
### Install
Thanks to all packagers!
* [Ubuntu PPA](https://launchpad.net/~coolwanglu/+archive/pdf2htmlex) by Lu Wang <coolwanglu@gmail.com>, not always up-to-date.
* [ArchLinux AUR](https://aur.archlinux.org/packages.php?ID=62426) by Arthur Titeica <arthur.titeica@gmail.com>
* [Gentoo Overlay](http://gpo.zugaina.org/app-text/pdf2htmlex), gentoo-zh, mrueg or sunrise, by respective packagers.
* [Homebrew Formula](https://github.com/jamiely/homebrew/blob/pdf2htmlex/Library/Formula/pdf2htmlex.rb) by Jamie Ly <me@jamie.ly>
* [Macports (local repo)](https://github.com/iapain/pdf2htmlEX-macport) by Deepak Thukral <iapain@iapa.in>
### Build from source
#### Dependencies

* CMake, pkg-config
* GNU Getopt
* A compiler that supports C++11, for example
    * GCC >= 4.4.6
    * Clang (successful builds have been reported)
* **poppler** >= 0.20.0 with xpdf headers (compile with **--enable-xpdf-headers**)
    * Install **libpng** (and headers) BEFORE you compile poppler if you want background images generated
    * Install **poppler-data** if you want CJK support
* **fontforge** (with header files)
    * The git version is recommended to avoid annoying compilation issues
* [Optional] **ttfautohint**
    * Run pdf2htmlEX with **--external-hint-tool=ttfautohint** to enable it
* [For Windows]
    * Cygwin
    * or MinGW, with some modifications to pdf2htmlEX. See [pdf2htmlEX on TeX Wiki](http://oku.edu.mie-u.ac.jp/~okumura/texwiki/?pdf2htmlEX) (in Japanese). Special thanks to Haruhiko Okumura

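
Before building, it can help to check that the basic tools are on `PATH` and that pkg-config can see a recent enough poppler. The following is a minimal sketch, not part of pdf2htmlEX: the function names `have_cmd` and `check_build_deps` are made up for illustration, and it assumes poppler registers itself with pkg-config under the module name `poppler`. It does not verify that poppler was built with xpdf headers, nor does it check fontforge.

```shell
#!/bin/sh
# Sketch of a pre-build check for some of the dependencies listed above.
# have_cmd and check_build_deps are illustrative names, not pdf2htmlEX tools.

# have_cmd NAME: succeeds if NAME can be found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# check_build_deps: prints one line per missing item and
# returns non-zero if anything is missing.
check_build_deps() {
    missing=0
    # make is not in the list above, but the build command uses it.
    for tool in cmake pkg-config make; do
        if ! have_cmd "$tool"; then
            echo "missing tool: $tool"
            missing=1
        fi
    done
    # Assumption: the pkg-config module for poppler is called "poppler".
    if have_cmd pkg-config && ! pkg-config --atleast-version=0.20.0 poppler; then
        echo "poppler >= 0.20.0 not found via pkg-config"
        missing=1
    fi
    return "$missing"
}
```

Calling `check_build_deps` prints nothing and returns 0 when these checks pass, so it can be placed in front of the build command with `&&`.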
#### Compiling

    git clone --depth 1 git://github.com/coolwanglu/pdf2htmlEX.git
    cd pdf2htmlEX
    cmake . && make && sudo make install

## Usage

    pdf2htmlEX /path/to/foobar.pdf
    pdf2htmlEX --help
    man pdf2htmlEX

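
To convert several files, the single-file invocation above can be wrapped in a loop. This is only a sketch: `convert_all` and the `CONVERTER` variable are hypothetical names introduced here (the variable exists so the loop can be dry-run with another command), and the only pdf2htmlEX behavior it relies on is the `pdf2htmlEX /path/to/foobar.pdf` form shown above.

```shell
#!/bin/sh
# Sketch: run the converter once for every PDF in a directory.
# CONVERTER is a hypothetical override; it defaults to pdf2htmlEX.
CONVERTER="${CONVERTER:-pdf2htmlEX}"

# convert_all DIR: converts each DIR/*.pdf, reports failures on stderr,
# and returns non-zero if any conversion failed.
convert_all() {
    dir="$1"
    failed=0
    for pdf in "$dir"/*.pdf; do
        # Skip the unexpanded pattern when the directory has no PDFs.
        [ -e "$pdf" ] || continue
        if ! "$CONVERTER" "$pdf"; then
            echo "conversion failed: $pdf" >&2
            failed=1
        fi
    done
    return "$failed"
}
```

For example, `convert_all /path/to/pdfs` processes every `.pdf` file in that directory and keeps going after a failure, so one broken file does not stop the batch. See `pdf2htmlEX --help` for options that control the output.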
## FAQ
* [Troubleshooting compilation errors](https://github.com/coolwanglu/pdf2htmlEX/wiki/FAQ#wiki-compile)
* [The demo pages are ugly](https://github.com/coolwanglu/pdf2htmlEX/wiki/FAQ#wiki-ugly)
* [How can I help](https://github.com/coolwanglu/pdf2htmlEX/wiki/FAQ#wiki-help)
* [I want more features](https://github.com/coolwanglu/pdf2htmlEX/wiki/FAQ#wiki-feature_commission)
* [More](https://github.com/coolwanglu/pdf2htmlEX/wiki/FAQ)
## LICENSE

Dual licensed under GPLv2 & GPLv3

**pdf2htmlEX is totally free; please credit pdf2htmlEX if you use it.**

**Please consider sponsoring it if you use it for commercial purposes.**

**Font extraction, conversion or redistribution may be illegal; please check your local laws.**

### [**Donate Now**](http://coolwanglu.github.com/pdf2htmlEX/donate.html)
## Contact
* Mailing list <pdf2htmlex@googlegroups.com>
    * Please read `man pdf2htmlEX` and the [**FAQ**](https://github.com/coolwanglu/pdf2htmlEX/wiki/FAQ) before sending emails, or your message might be ignored.
    * Please use the **latest master branch**.
* Lu Wang <coolwanglu@gmail.com>
    * Please use the mailing list above except for personal enquiries.
    * Messages in **Chinese**, **English** or **Japanese** are accepted.
## Acknowledgements

pdf2htmlEX is made possible thanks to the following projects:

* [poppler](http://poppler.freedesktop.org/)
* [Fontforge](http://fontforge.org/)
* [jQuery](http://jquery.com/)

pdf2htmlEX is inspired by the following projects:

* pdftops & pdftohtml from poppler
* MuPDF
* PDF.js
* Crocodoc
* Google Docs

### Special Thanks
* Hongliang Tian <tatetian@gmail.com>
* Wanmin Liu <wanminliu@gmail.com>