Sigil 0.8.900 Released (first of the pre Sigil 0.9.0 release series)

Today we are releasing Sigil-0.8.900, the first of the Sigil 0.9.0 pre-releases, whose eventual goal is to add epub3 support without disrupting epub2 editing.

To accomplish this, most of Sigil’s internals have been torn out and rebuilt from scratch. Xerces and Tidy, which had some serious issues with potential ePub3 support, have been replaced by new components including Google’s Gumbo html5 parser and an embedded Python 3.4.3 interpreter with lxml. As such, we do expect users to run into corner cases we have not yet found ourselves. So please test this new release and report any irregularities you see so that we can quickly iron out any difficulties before continuing our development towards support for both ePub2 and ePub3.

In addition, due to Compiler/DLL hell on Windows we have been forced to mix a VS2013-built Sigil and Qt with a VS2010-built Python 3.4. Given that Python 3.4 is C based (not C++) and given that all Python memory allocation/deallocation never crosses over our embedded Python interface, this has not proven to be an issue in any of our testing. That said, once Qt 5.5.1 actually builds with VS2015 on Windows, we will move towards using a VS2015-built version of Sigil, Qt, and Python 3.5 to make sure all pieces of the Sigil app are compiled with the exact same compiler on Windows. On Linux and Mac OS X, where the compiler designers actually understand the concept of backwards compatibility using versioned symbols, separation of exception handling, and so on, none of this is an issue.

Features:

– Removed Tidy and replaced it with a specially modified Gumbo
parser that supports html5

– Removed Xerces, XercesExtras, Boost, and unused bundles.

– Updated the source code to be compatible with the latest clang compiler by replacing all “undefined behaviour” use of “Null” references with proper pointers across our entire codebase.

– Completely revamped the build process to embed the Python 3.4 Interpreter
inside of Sigil and integrate it in, including as site-packages:
[lxml, bs4, PIL, regex, six, html5lib]. This will allow plugins
that use the internal Python 3.X engine access to all of these
specialized packages by default with no additional action
needed by end users of their plugin.

– Created our own version of BeautifulSoup4-4.4.0 called sigil_bs4 that
fixes lxml namespace bugs, fixes serialization/prettyprinting of
inline xhtml tags, and modifies the bs4 codebase so that a single
source code works equally well on both Python 2.7 and Python 3.X

– Replaced internal opf and ncx xml processing and cleaning with a
combination of embedded Python 3.4, sigil_bs4 / lxml

– We now build Hunspell as a shared library and added a ctypes interface to allow
plugins to spellcheck

– We now build our modified gumbo html5 parser as a shared library
and provide a bs4 ctypes interface to it for easy xhtml processing
in plugins that use either Python 2.7 or Python 3.X

– Allowed plugins to auto-fix “text/html” media-types to “application/xhtml+xml”

– Began the transition to allow for both epub2 and epub3 editing
(Note: epub3 editing is still incomplete)

– Converted FlightCrew to become a Sigil plugin and replaced it with
a simple and fast internal sanity checker.

– Updated Hunspell dictionaries to be actual dictionaries and not just word lists for en_US and en_GB

– Updated other dictionaries to their most current version to match what is used in LibreOffice 5

– Fix issue #54: modified date using local numerals when it should be
using Arabic numerals per the spec.

– Set the book to modified when fonts are obfuscated.

– Change FORCE_BUNDLED_COPIES build flag to USE_SYSTEM_LIBS.  This flips the meaning of the flag. Now USE_SYSTEM_LIBS will enable using system
libraries. Also, SYSTEM_LIBS_REQUIRED was added which will fail the
cmake configure if any system libraries are not found instead of
falling back to the bundled copy. Finally, this makes the build more
consistent for Windows and OS X users.
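
To picture the “text/html” media-type fix from the list above, here is a minimal sketch. It is not the code Sigil or any plugin actually uses. It assumes a standard OPF manifest and relies only on Python’s standard library; the function name fix_html_media_types and the sample OPF are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by OPF package documents.
OPF_NS = "http://www.idpf.org/2007/opf"


def fix_html_media_types(opf_text):
    """Return (new_opf_text, number_of_manifest_items_fixed).

    Hypothetical helper: rewrites media-type="text/html" on manifest
    items to "application/xhtml+xml" and leaves everything else alone.
    """
    # Serialize OPF elements without an "ns0:" prefix.
    ET.register_namespace("", OPF_NS)
    root = ET.fromstring(opf_text)
    fixed = 0
    for item in root.iter("{%s}item" % OPF_NS):
        if item.get("media-type") == "text/html":
            item.set("media-type", "application/xhtml+xml")
            fixed += 1
    return ET.tostring(root, encoding="unicode"), fixed


# Made-up sample manifest, only for demonstration.
sample = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0">
  <manifest>
    <item id="ch1" href="Text/ch1.xhtml" media-type="text/html"/>
    <item id="css" href="Styles/main.css" media-type="text/css"/>
  </manifest>
</package>"""

new_opf, count = fix_html_media_types(sample)
print(count, "item(s) fixed")
```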

—-

See https://github.com/Sigil-Ebook/Sigil/releases/tag/0.8.900
to get Sigil-0.8.900.

One additional note: as Gumbo is an html5 aware library, Sigil will
now replace named and numeric entities with their actual unicode
characters.  So you should make sure Sigil's Preserve Entities Preferences
setting includes &nbsp; and any other named entities you want to
keep as entities.
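
To give a feel for what this means in practice, here is a rough sketch of entity replacement with a preserve list. It only illustrates the idea and is not how Sigil or Gumbo implement it. The function name replace_entities and its preserve parameter are hypothetical, and the conversion itself is done by Python's standard html.unescape.

```python
import html
import re

# Matches decimal, hexadecimal and named entities that end in a semicolon.
ENTITY_RE = re.compile(r"&(#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);")

# Characters that must stay escaped or the markup itself would break.
MARKUP_CHARS = {"&", "<", ">", '"', "'"}


def replace_entities(text, preserve=("&nbsp;",)):
    """Replace entities with unicode characters, except those in preserve.

    Hypothetical helper: `preserve` plays the role of a preserve-entities
    setting in this sketch.
    """
    keep = set(preserve)

    def convert(match):
        entity = match.group(0)
        if entity in keep:
            return entity
        char = html.unescape(entity)
        # Leave &amp;, &lt; and friends escaped, whatever form they use.
        return entity if char in MARKUP_CHARS else char

    return ENTITY_RE.sub(convert, text)


print(replace_entities("caf&eacute;&nbsp;&#169;&amp;"))
```

With the default preserve list, &eacute; and &#169; become real characters while &nbsp; and &amp; are left as written.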

Sigil 0.8.7 Released

This is a very small maintenance release. It mainly updates links for the change in code location. This release can be found here. The sha256sum of the checksum file is ec03bb7b586a4963fb9ac1ac22ae72c76360915775c0af631ce8e2da341aa0eb.
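
If you want to compare a download against a published sha256 value, something along these lines works. It is only a sketch using Python's standard hashlib. The function names are made up, and the name of the checksum file is not given in this post, so pass whatever path you actually downloaded.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex sha256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large packages do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_hex):
    """True if the file's digest equals the published hex string."""
    return sha256_of_file(path) == published_hex.strip().lower()


# Example use (the path is a placeholder, not a real file name):
# matches_published("path/to/checksum-file", "ec03bb7b...")
```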

Also, this is the last release that will have the Mac package signed by John Schember.

Leaving Sigil in the Hands of New Maintainers

As of today I’m retiring from being part of Sigil. Nothing nefarious, just lack of time (mainly) and motivation. This doesn’t mean the end of Sigil. Kevin Hendricks and Doug Massay have been working on Sigil with me for months now. I’m leaving it in their very capable hands.

To accommodate this transition I’ve created a GitHub organization for Sigil’s code. Kevin and Doug will be added as maintainers very soon. This way it’s not dependent on a repository connected to my personal account. Also there will be a 0.8.7 release which updates all the links to point to the new code location.

It’s been fun, bye.
John Schember

Sigil 0.8.6 Released

This release is a maintenance release of the 0.8.x series and fixes a few critical bugs that could cause Sigil to crash. You can find binary packages here and the change log, here.

Finally, the sha256 checksum for the checksum file is 732773ec4fc73ab2ba29584130833b53d96c6c1296c433d889f2cd4b55d565be. The Mac package is signed by my signing key (John Schember) and 10.9.5 is the minimum OS X version but it was built and tested on 10.10. The Windows builds were built on Windows 7.

Edit April 14, 2015.

There was an issue with the Mac Package where it would crash instead of opening. A new build has been uploaded. The new checksum for the checksum file is b05880c62ecd63a20225e13fc3868ea5520fe5f2e498842c542b1f32d525fee1.

Sigil 0.8.5 Released

This release is a maintenance release of the 0.8.x series and fixes a few critical bugs. Currently 0.8.x is being maintained for critical fixes while 0.9.0 is being worked on. You can find binary packages here and the change log, here.

Finally, the sha256 checksum for the checksum file is c34fe0e4d5d7fac3347a23e0644b1e72c6250579cc1939c625911d03800e967f. The Mac package is signed by my signing key (John Schember) and 10.9.5 is the minimum OS X version but it was built and tested on 10.10. The Windows builds were built on Windows 7.

Sigil Master Flux In Python

Right now Sigil master is in a state of flux. Many components are being removed and replaced. Python 3 is going to be a hard dependency (it will be embedded by default). Right now Python 3 and a few packages are required to be installed on your system to build and run Sigil. Specifically:

  • Python3.4+
  • lxml
  • six
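
A quick way to see whether your system has these before attempting a build is a check like the one below. It is just a sketch and not part of Sigil's build system. The function name missing_requirements is made up, and it only tests that the modules can be found, not which versions they are.

```python
import importlib.util
import sys

# Requirements taken from the list above.
REQUIRED_MODULES = ("lxml", "six")
MIN_PYTHON = (3, 4)


def missing_requirements():
    """Return a list of requirement names that were not found."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        problems.append("Python %d.%d+" % MIN_PYTHON)
    for name in REQUIRED_MODULES:
        # find_spec returns None when a top-level module cannot be located.
        if importlib.util.find_spec(name) is None:
            problems.append(name)
    return problems


if __name__ == "__main__":
    missing = missing_requirements()
    if missing:
        print("missing: " + ", ".join(missing))
    else:
        print("all build requirements found")
```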

I haven’t gotten around to researching and bundling all of this yet and as primary development happens on OS X these things are easy enough for me (and Kevin) to just not worry about at this moment. Anyone building from master themselves will need to deal with this though.

Already Tidy has been removed and replaced with a new parser, Gumbo+BS4. FlightCrew has been removed (if you want it see if someone is willing to make it into a plugin). Boost will be going next.

Gumbo+BS4 (Beautiful Soup)+lxml together mean we will be able to support epub3. Not to mention that, even with some of the code running in Python, it’s a lot faster than the current solution (Tidy and Xerces). These are also easier to use.

As for FlightCrew, validation isn’t going away. We’re just going to build in a non-schema validating validator. Meaning if you want true schema validation to the letter of the epub spec then use a plugin. I added a specific plugin type in a previous release for this very reason. The validation we have in mind is simple stuff like whether the HTML is well formed. Also, by utilizing Python we’ll be able to (hopefully) have Javascript and CSS validation as well.
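
To show the kind of “is it well formed” check meant here, as opposed to schema validation, here is a small sketch built on the expat parser that ships with Python. It is not the checker that will go into Sigil, and the function name check_well_formed is made up. Note that expat knows nothing about XHTML's named entities, so a file without a DOCTYPE that uses something like &nbsp; would be reported as an error by this sketch.

```python
import xml.parsers.expat


def check_well_formed(xhtml_text):
    """Return None if the text is well-formed XML.

    Otherwise return a (line, column, message) tuple describing the
    first problem the parser ran into.
    """
    parser = xml.parsers.expat.ParserCreate()
    try:
        # True marks this as the final (and only) chunk of input.
        parser.Parse(xhtml_text, True)
    except xml.parsers.expat.ExpatError as err:
        message = xml.parsers.expat.ErrorString(err.code)
        return (err.lineno, err.offset, message)
    return None


print(check_well_formed("<html><body><p>ok</p></body></html>"))
print(check_well_formed("<html><body><p>oops</body></html>"))
```

The first call reports nothing wrong, while the second returns the position and message for the unclosed paragraph tag. No schema is consulted at any point.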

Basically FlightCrew wasn’t up to the task. FlightCrew is written in C++, while the prebuilt libraries for validating things like CSS and Javascript are written in Python, which meant I’d have to write my own C++ versions. There really wasn’t any sense in reinventing something that is already available…

One last thing. The Gumbo parser and BS4 are forks we’re including directly. In Gumbo’s case, both the main person (a Google employee) and the GitHub employee who forked it (there are two competing Gumbos right now) don’t want our epub 3 changes. So there will probably be three versions of Gumbo as each diverges to meet the specific needs of each project. Right now we’re pretty close to the GitHub one, but that will probably change in the future and we’ll probably rename and maintain ours as part of Sigil.

As for BS4 there are patches we need that have been waiting in never never land for years. calibre for example uses a modified BS4 with a lot of the same patches because while they’ve been submitted upstream they’ve never even been looked at let alone included…

So right now Sigil built directly from the master git repo (which we don’t recommend unless you’re developing directly for it) is in a major state of flux so we can finally make the structural changes needed not just to support epub 3 but to fix long standing issues.

One issue the new parser should fix is that ebooks created using epubmerge often lose whole parts of the book because of file name conflicts caused by the use of sub directories.
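
To make the kind of conflict concrete: if files in different sub directories share the same base name, anything that keys on the base name alone will see them as one file. The sketch below just finds such collisions. It assumes that this is the failure mode, which is my simplification for illustration, and the function name and sample paths are made up.

```python
import posixpath
from collections import defaultdict


def find_name_conflicts(hrefs):
    """Map each base name used more than once to the paths that share it."""
    by_name = defaultdict(list)
    for href in hrefs:
        # Paths inside an epub use forward slashes, hence posixpath.
        by_name[posixpath.basename(href)].append(href)
    return {name: paths for name, paths in by_name.items() if len(paths) > 1}


# Made-up layout resembling two merged books.
conflicts = find_name_conflicts([
    "book1/ch1.xhtml",
    "book2/ch1.xhtml",
    "book1/cover.jpg",
])
print(conflicts)
```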

At this point I’m not willing to commit to saying exactly what the next release will bring other than underlying structural changes. Chances are Sigil itself, from a user perspective, will remain and function exactly the same in 0.9.0 (a long way off from now) as in 0.8.x. But underneath a lot will have changed.

Sigil and BookView Research Update

As many Sigil users know, Sigil has a WYSIWYG editor portion. It’s also in my mind substandard. It gets the job done for quick edits but it’s not as full featured as I’d like. Especially when I’m used to using editors like WordPress’s editor.

Back in 2012 I started researching updating the BookView editor. This was a planned feature for 0.6.0 but was ultimately dropped due to unresolvable issues.

Right now Sigil uses a QWebView set to allow editing. This is a very nice feature but requires a lot of Javascript glue to provide things like making selected text bold. Basically, we have a basic editor that gets the job done but isn’t really all that full featured. It’s also very difficult to work with (from a programming standpoint) and for a long time has been stagnant. Most users don’t care because they tend to use CodeView and now that there is a live Preview the use for BookView to see changes is reduced even further. One major issue with BookView is it will change the underlying code which is why Preview was introduced in the first place.

Now back to 2012. I was researching using CKEditor or TinyMCE as a replacement for QWebView in edit mode. This would put all the editor functionality into these editor packages and reduce the scope of BookView (and its Javascript glue) to nearly nothing.

As I said they never made it into any release because they just weren’t working like I needed them to. That’s not to say they aren’t terrific editors and provide all the functionality I wanted. They just didn’t perform like I wanted.

Well, it’s been a few years since then and both projects have matured and greatly improved over that time. So I decided to revisit using them to replace the current BookView. Unfortunately, I ran into the same issue as before. All text is in Javascript which uses a lot of memory and can be very slow to work with. Slow to the point of unusable. Slow to the point of loading the editor can take minutes. Slow to the point of scrolling, and typing can take minutes. I will say they cope much better than they did the last time I looked at them but they still just aren’t going to work for Sigil.

This isn’t an issue with the editors themselves. It’s an issue with all text being in Javascript. For a post on a blog or book divided into “short” chapters they work fine and very well. The issue is the amount of text you put into them. With long chapters you start to see the slowdown.

If you were to import a single HTML file and use BookView to put chapter split markers then split, well, the editor just can’t cope. Again it’s because of how Javascript uses strings. These editors and Javascript just can’t deal with multiple megabytes of text loaded into them.

I know that splitting single files into chapters is a typical use case so I can’t justify an editor that makes that impossible. So for the time being the best option is just to leave BookView as is. Simply because it’s able to cope with a much larger amount of text than these Javascript editors (which are beautiful and so nice to work with). Maybe in a few more years I’ll be able to switch to them for BookView but not right now.