Pandoc
Posted: Thu Feb 08, 2018 4:05 pm
Using Pandoc to convert .epub files to other formats
Pandoc is a universal document converter that can be installed on Windows, Mac, or Linux computers. For the most part, the tool preserves the original formatting. See https://pandoc.org/
I run Pandoc from either the Command Prompt or Windows PowerShell; both work the same way.
To convert the .epub file of Hit the Ground Running into .docx and .html formats, I did the following:
1. Put the file in a separate folder and renamed it to something simpler, preserving the epub extension (a simpler name helps when entering the specific command to convert); I renamed the file to "Running.epub".
2. Ensured I was in the correct working directory, where I had put the file. In the Command Prompt or PowerShell I typed "cd documents" and pressed Enter; because my file was in a subdirectory, I used the same command again to move into it: "cd foldername".
3. To convert to .docx I typed the following line (from the Command Prompt or PowerShell):
"pandoc -s Running.epub -o Running.docx"
To convert to .html, the command is almost the same, except for the file extension at the end:
"pandoc -s Running.epub -o Running.html"
The "-s" option produces a standalone file (for .html, a complete document with its own header rather than a fragment); "-o" names the output file, and Pandoc picks the output format from that file's extension.
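If you convert books often, the two commands above can be wrapped in a short script. This is only a minimal sketch, assuming Python is installed and pandoc is on the PATH; the function names pandoc_command and convert are my own and not part of Pandoc. It builds exactly the same "pandoc -s ... -o ..." command lines and runs one per output extension.

```python
import shutil
import subprocess
from pathlib import Path


def pandoc_command(source, extension):
    """Build the argument list for: pandoc -s SOURCE -o SOURCE-with-new-extension."""
    source = Path(source)
    target = source.with_suffix("." + extension.lstrip("."))
    return ["pandoc", "-s", str(source), "-o", str(target)]


def convert(source, extensions=("docx", "html")):
    """Run one pandoc conversion per requested extension (hypothetical helper)."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc was not found on the PATH")
    for extension in extensions:
        # check=True raises CalledProcessError if pandoc reports a failure.
        subprocess.run(pandoc_command(source, extension), check=True)


# Example, run from the folder that holds the file:
# convert("Running.epub")
```

Calling convert("Running.epub") from the folder that holds the file should leave Running.docx and Running.html next to it, the same as typing the two commands by hand.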
When I opened the resulting .html file in IE, I noticed a stray link named exactly like the folder where I had my files, but it did not open anything when I clicked on it. So I opened the same .html file in Firefox, Chrome and Safari and tried to find the link, but it is not there. When I bring up a list of links, I can find it in IE but not in any of the other browsers. I assume it is an issue with IE reading something else, but I don't know what.
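One way to check whether that link is really in the file, rather than something IE adds on its own, is to list every link the .html file contains. The following is a rough sketch using only Python's standard library; the names LinkLister and list_links are made up for illustration. It will not explain IE's behavior, it only shows which links are actually written in the file.

```python
from html.parser import HTMLParser


class LinkLister(HTMLParser):
    """Collect the href and visible text of every <a> element."""

    def __init__(self):
        super().__init__()
        self.links = []       # finished (href, text) pairs
        self._href = None     # href of the <a> currently open, if any
        self._text = []       # text pieces seen inside that <a>
        self._inside = False  # True while between <a> and </a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._inside = True
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._inside:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._inside:
            self.links.append((self._href, "".join(self._text).strip()))
            self._inside = False


def list_links(html_text):
    """Return a list of (href, link text) pairs found in html_text."""
    parser = LinkLister()
    parser.feed(html_text)
    parser.close()
    return parser.links
```

To use it on the converted book, read the file first, for example with open("Running.html", encoding="utf-8").read(), and pass the text to list_links. If the folder name does not show up in any of the pairs, the link is probably not coming from an ordinary link in the file itself.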
Unfortunately, Pandoc cannot read .pdf files as input. I usually run those first through Kurzweil 1000, or open them in a PDF viewer. Kurzweil seems to OCR the .pdf files, even if they are text-based.