Liberate printed matter, on the Web

View the Project on GitHub stevenjmesser/hypertext-store

Hypertext Store

Hypertext Store is a library of print documents, such as reports, analysis and election leaflets, liberated on the Web in Markdown and HTML.

How did this start?

I found a report about misinformation and wanted to read it on my smartphone on the bus, but it was published in PDF and I couldn’t. That sucked, so I asked the publisher for an HTML version. They said that, unfortunately, they ‘[couldn’t] do anything to change it this time around’ – i.e., couldn’t be bothered. After finding more PDFs that could do with being liberated, I decided to set up this little website.

How does it work?

I find PDFs and other documents that I think should be web pages and I spend time converting these to Markdown. Jekyll converts those Markdown files into HTML, so you can find them on the Web. Keeping everything in a Git repository on GitHub, and hosting it with GitHub Pages, means that each document has a change history.

Can I request an HTML version of a document?

Yes! Send me an email, tweet me or raise a GitHub Issue on this repository. I’ll need to know the document title, the publishing organisation and, where possible, a URL for the original document.

Who are you?

I’m Steve, a product manager in London, UK. You can visit my website.

Why do you need to ‘liberate’ print documents?

Most people and organisations use the World Wide Web as their main communications tool for distributing documents, but many of those documents are still designed and published for print, usually as PDFs.

Compared with HTML content, information published in a PDF is harder to find, use and maintain. More importantly, unless they are created with sufficient care, PDFs are often bad for accessibility and rarely comply with open standards. Good, standards-compliant HTML is almost always better for use on the Web.

For example, it’s hard to read PDFs on your smartphone during your daily commute. A4 print pages do not fit most smartphone screens neatly, so you have to zoom in and out of the page often. It’s a terrible reading experience.

Read more about PDF vs HTML and why people use PDF.

Why does the document look weird?

Converting from PDF to Markdown is not an exact science. It takes a few seconds to convert a document but considerably longer to go back and fix the formatting, so some of these files are raw and unprocessed. If you’d like to tidy up a document, please raise a pull request on the GitHub repository.