polyglot.ebook module

Convert various document formats to epub or mobi

Author: David Young
Date Created: October 9, 2016
class polyglot.ebook.ebook(log, settings, urlOrPath, outputDirectory, bookFormat, title=False, header=False, footer=False)[source]

The worker class for the ebook module

Key Arguments:
  • log – logger
  • settings – the settings dictionary
  • urlOrPath – the url or path to the content source
  • bookFormat – the output format (epub, mobi)
  • outputDirectory – path to the directory to save the output ebook to
  • title – the title of the output document. If False, the title of the original source is used. Default False
  • header – content to add before the article/book content in the resulting ebook. Default False
  • footer – content to add at the end of the article/book content in the resulting ebook. Default False

Usage:

WebToEpub

To generate an ebook from an article found on the web, using the webpage's title as the filename for the book:

from polyglot import ebook
epub = ebook(
    log=log,
    settings=settings,
    urlOrPath="http://www.thespacedoctor.co.uk/blog/2016/09/26/mysqlSucker-index.html",
    title=False,
    bookFormat="epub",
    outputDirectory="/path/to/output/folder"
)
pathToEpub = epub.get()

To add a header and footer to the epub book, and specify the title/filename for the book:

from polyglot import ebook
epub = ebook(
    log=log,
    settings=settings,
    urlOrPath="http://www.thespacedoctor.co.uk/blog/2016/09/26/mysqlSucker-index.html",
    title="MySQL Sucker",
    bookFormat="epub",
    outputDirectory="/path/to/output/folder",
    header='<a href="http://www.thespacedoctor.co.uk">thespacedoctor</a>',
    footer='<a href="http://www.thespacedoctor.co.uk">thespacedoctor</a>'
)
pathToEpub = epub.get()

WebToMobi

To generate a mobi version of the web article, set bookFormat to mobi instead of epub:

from polyglot import ebook
mobi = ebook(
    log=log,
    settings=settings,
    urlOrPath="http://www.thespacedoctor.co.uk/blog/2016/09/26/mysqlSucker-index.html",
    title="MySQL Sucker",
    bookFormat="mobi",
    outputDirectory="/path/to/output/folder",
    header='<a href="http://www.thespacedoctor.co.uk">thespacedoctor</a>',
    footer='<a href="http://www.thespacedoctor.co.uk">thespacedoctor</a>'
)
pathToMobi = mobi.get()

DocxToEpub

To instead convert a DOCX document to epub, simply switch out the URL for the path to the DOCX file, like so:

from polyglot import ebook
epub = ebook(
    log=log,
    settings=settings,
    urlOrPath="/path/to/Volkswagen.docx",
    title="A book about a car",
    bookFormat="epub",
    outputDirectory="/path/to/output/folder",
    header='<a href="http://www.thespacedoctor.co.uk">thespacedoctor</a>',
    footer='<a href="http://www.thespacedoctor.co.uk">thespacedoctor</a>'
)
pathToEpub = epub.get()

DocxToMobi

Combine the two previous examples: pass the path to the DOCX file as urlOrPath and set bookFormat to mobi.
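
For reference, a minimal sketch of the DOCX-to-mobi case, reusing the DocxToEpub arguments with bookFormat switched to mobi. The logger, the empty settings dictionary and the convert_docx_to_mobi helper name are placeholders introduced here for illustration, not part of polyglot; in practice pass the same log and settings objects used in the examples above. polyglot is imported inside the helper, so nothing is converted until it is called.

```python
import logging

# Placeholder logger and settings: assumptions for illustration only.
# Substitute the log and settings objects used in the examples above.
log = logging.getLogger("polyglot-example")
settings = {}

# Same keyword arguments as the DocxToEpub example, with bookFormat set to mobi.
docx_to_mobi_kwargs = dict(
    log=log,
    settings=settings,
    urlOrPath="/path/to/Volkswagen.docx",
    title="A book about a car",
    bookFormat="mobi",
    outputDirectory="/path/to/output/folder",
    header='<a href="http://www.thespacedoctor.co.uk">thespacedoctor</a>',
    footer='<a href="http://www.thespacedoctor.co.uk">thespacedoctor</a>',
)


def convert_docx_to_mobi(kwargs=docx_to_mobi_kwargs):
    """Hypothetical helper: run the conversion and return whatever get() returns."""
    # Imported lazily so this sketch can be loaded without polyglot installed.
    from polyglot import ebook

    mobi = ebook(**kwargs)
    return mobi.get()
```

Calling convert_docx_to_mobi() then plays the role of pathToMobi = mobi.get() in the WebToMobi example.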

get()[source]

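Get the ebook object.

Return:
  • ebook – the ebook object; in the usage examples the result is assigned to a path variable (pathToEpub, pathToMobi)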

Usage:

See class docstring for usage
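
The class usage examples assign the result of get() to variables named pathToEpub and pathToMobi, which suggests the return value is a file path. Under that assumption, and assuming the file extension matches bookFormat, a caller might verify the output with a small check like the one below. The ebook_was_written helper is hypothetical and not part of polyglot.

```python
import os


def ebook_was_written(path, bookFormat):
    """Return True if path is a non-empty file whose extension matches bookFormat.

    Assumes get() returned a filesystem path string; anything else counts as failure.
    """
    if not path or not isinstance(path, str):
        return False
    has_extension = path.lower().endswith("." + bookFormat.lower())
    return has_extension and os.path.isfile(path) and os.path.getsize(path) > 0
```

For example, ebook_was_written(pathToEpub, "epub") could be called straight after pathToEpub = epub.get().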