ePublishing | Modules Unraveled

101 Building an ePublishing Platform Using Drupal Modules with Liang Shen - Modules Unraveled Podcast

PDF

Yes, but the PDF and ePub modules are the second generation. The Fileviewer module was the first. It uses Poppler, the popular PDF library on Linux, to convert PDF files into PNG images and display them in the browser.
Several years later, the Mozilla Foundation created pdf.js, which lets browsers display PDF files using HTML5 and JavaScript. Today pdf.js is the default PDF viewer in Firefox. I wrote the PDF module to integrate it into Drupal.
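
The first-generation approach mentioned above, rendering PDF pages to PNG images with Poppler, might look roughly like the sketch below. It assumes Poppler's pdftoppm command-line tool is installed. The function names and the "page" file prefix are made up for illustration and are not taken from the Fileviewer module.

```python
import shutil
import subprocess
from pathlib import Path


def build_pdftoppm_command(pdf_path, output_prefix, dpi=150):
    """Return the argument list that asks pdftoppm to render a PDF as PNG pages."""
    # -png selects PNG output; -r sets the resolution in DPI.
    return ["pdftoppm", "-png", "-r", str(dpi), str(pdf_path), str(output_prefix)]


def render_pdf_to_png(pdf_path, output_dir, dpi=150):
    """Render every page of pdf_path into output_dir and return the PNG paths."""
    if shutil.which("pdftoppm") is None:
        raise RuntimeError("pdftoppm (part of Poppler's utilities) was not found")
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming choice: pdftoppm appends "-<page number>.png" to this prefix.
    prefix = output_dir / "page"
    subprocess.run(build_pdftoppm_command(pdf_path, prefix, dpi), check=True)
    return sorted(output_dir.glob("page-*.png"))
```

The resulting PNG files can then be served to the browser as ordinary images, which is the trade-off of this approach: it works everywhere, but the text is no longer selectable or searchable on the page.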

  • So, this just integrates pdf.js into Drupal?
  • Can you create pdfs with the module?

ePub

Since Amazon launched the Kindle, the ebook market has been heating up, and Google and Apple soon joined the battle. The EPUB format, an open standard chosen by many of the new competitors in this market, became popular. Thanks to Jake Hartnell, the author of epub.js, an open source JavaScript EPUB library, we can display EPUB files in the browser as well. So I wrote the epub module to integrate it into Drupal.
Google Book Search has been renamed Google Books and has become part of Play Books. Both Google and Amazon now have HTML5 online readers. Although epub.js is not as good as those, it has most of the features an online ebook reader needs.

  • Do either of these provide search functionality?

Apachesolr_file

  • How does Apachesolr_file fit into this?
    Ctrl-F is fine for searching within a single book. If you have thousands of books or more, you need a full-text search engine to index them all. The Apachesolr_file module uses Solr, the Apache Foundation’s popular full-text search engine, to index files.
    We already have the apachesolr module and the apachesolr_attachments module. The difference is that apachesolr_attachments was designed to index files attached to nodes, while apachesolr_file was designed to index file entities (a concept introduced in Drupal 7) for pure file management.
    Solr can index not only PDF and EPUB but also other popular file formats such as MS Word, Excel, and PowerPoint. (See https://tika.apache.org/1.5/formats.html for all the formats supported by Tika, the file parser used by Solr.) So you can also use this module on intranets for companies, schools, and other organizations.
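
To make the Solr and Tika relationship concrete, here is a rough sketch of sending a file to Solr's extracting request handler (/update/extract), which runs Tika on the uploaded bytes and indexes the extracted text. This is not necessarily how the Apachesolr_file module talks to Solr. The server URL, the core name "files", the document id, and the function names are all assumptions made for illustration, and the handler must be enabled in the Solr configuration.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_extract_url(solr_core_url, doc_id, commit=True):
    """Build the URL for Solr's extracting request handler for one document."""
    base = solr_core_url.rstrip("/") + "/update/extract"
    # literal.id sets the Solr document id for the extracted content.
    params = {"literal.id": doc_id}
    if commit:
        params["commit"] = "true"
    return base + "?" + urlencode(params)


def index_file(solr_core_url, doc_id, file_path, content_type="application/pdf"):
    """POST the raw file bytes to Solr so Tika can extract and index the text."""
    with open(file_path, "rb") as handle:
        body = handle.read()
    request = Request(
        build_extract_url(solr_core_url, doc_id),
        data=body,
        headers={"Content-Type": content_type},
    )
    with urlopen(request) as response:
        return response.status


# Hypothetical usage against a local Solr core named "files":
# index_file("http://localhost:8983/solr/files", "file-42", "book.pdf")
```

Because Tika detects and parses the format, the same call works for Word, Excel, or PowerPoint files by changing only the content type.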

Application

  • Do you know of any sites that are using these now?
  • What are some other applications you can see for these modules?

NodeSquirrel Ad

Have you heard of/used NodeSquirrel?
Use the code "StartToGrow" for a 12-month free upgrade from the Start plan to the Grow plan. That means the Grow plan (10 GB of storage on up to 5 sites) costs $5/month for the first year instead of $10.