This weekend was certainly an eye-opener. I won’t go into grody details (gag me with a spoon), but the end result is twofold. First, my publisher has given the job to someone outside the company. Second, we’re going to try to bring some of this work in-house, because we learned SO MUCH about the format and publishing that I think *I* can do e-books for the company in the future. But I am short on time for the learning curve, so that’s why we’re taking the plunge and paying someone else.

One of the things I think I will do is start from scratch with a basic format, and document as I go to help others. I have noticed that you can just put up a PDF, but it’s not going to look professional. You can do an e-book, but sometimes that doesn’t look good on Kindle. We’re not sure what happened with my book, but no matter what we did, either the chapters got missed or the formatting looked really bad. The previous converter person gave up, but didn’t charge us, so we’re okay with her giving up.

So what makes me think *I* can do this? During all the back and forth about formatting, I downloaded some tools to see what I could do. One of the first things that struck me was that e-book formats are nothing more than XML or XHTML code. This is plain text, and I could easily build Perl scripts to take raw data and carve it up. That’s what I do at work. I have noticed that a lot of the problems we had were based on poor tags and what I call “over-formatting.” This is where a program tries to compensate for all situations and bloats your page with hundreds of lines of formatting instructions that are completely redundant and unnecessary. Microsoft is NOTORIOUS for this, like when you convert a DOC file into HTML. But they are not alone. Over the years, I have seen many “WYSIWYG” web page makers that do this as well.
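To make “over-formatting” concrete, here is a rough sketch of the kind of cleanup I have in mind. It is written in Python rather than Perl purely for illustration, and everything in it is an assumption on my part: the function name `strip_bloat` is made up, the sample markup is invented to resemble Word-style HTML export, and regular expressions are a shortcut, not a real XML parser. Actual converter output will have plenty of cases this does not handle.

```python
import re

def strip_bloat(xhtml: str) -> str:
    """Remove a few common kinds of redundant markup (hypothetical helper)."""
    # Drop inline style attributes such as style="margin:0in"
    xhtml = re.sub(r'\s+style\s*=\s*"[^"]*"', '', xhtml)
    # Drop Word-style class names such as class="MsoNormal"
    xhtml = re.sub(r'\s+class\s*=\s*"Mso[^"]*"', '', xhtml)
    # Drop empty <o:p></o:p> placeholders
    xhtml = re.sub(r'<o:p>\s*</o:p>', '', xhtml)
    # Unwrap spans left with no attributes; loop so nested spans collapse too
    bare_span = re.compile(r'<span>([^<]*)</span>')
    while True:
        cleaned = bare_span.sub(r'\1', xhtml)
        if cleaned == xhtml:
            break
        xhtml = cleaned
    return xhtml

# Invented sample resembling bloated export markup
sample = (
    '<p class="MsoNormal" style="margin:0in;font-size:12.0pt">'
    '<span style="font-family:Times"><span style="color:black">'
    'Chapter One</span></span><o:p></o:p></p>'
)
print(strip_bloat(sample))
```

Spans that still carry a meaningful attribute (say, `lang="en"`) are left alone, since only attribute-free spans are unwrapped.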

But I am a programmer. Maybe not the BEST programmer, but I know how to write and debug code, and I have an eye for detail and what looks good. That last point is actually quite rare in the programming business; most programmers tend to be focused on getting things done without ever thinking about how good the end product looks. “Who gives a crap if there are extra lines,” they may say, “if the output is readable? Don’t be picky.” I am picky. I am an artist, after all. Not only that, many programmers never anticipate what kind of input they are going to get. “Some dumbass sent this to me in the wrong format; my scripts can’t handle this!” Well, maybe they should.

Perl was BUILT for data parsing. It’s not the BEST language, but it’s what I know, and I have made some pretty impressive internal tools using it over the years, including a completely automated system that checks on thousands of servers and automatically reboots any that have crashed. In fact, someone has already made a Perl module for e-books:

More info: