Moving at the Speed of Creativity by Wesley Fryer

Advice for web 1.0 to 2.0 (WordPress) page conversion?

Geeky but important question: Has anyone had luck batch-converting a large number of similar, static webpages into a format that can be imported into WordPress? I’m thinking a program like TextWrangler, which supports batch grep search-and-replace commands, might work, but that would be a pretty geeky / technical way to do this.

Does anyone know of any programs that can streamline the conversion process? These are the pages that need to be converted. Ideally, the solution would run on Windows XP in this specific case. (I’m asking on behalf of Mark Ahlness.)




Comments

4 responses to “Advice for web 1.0 to 2.0 (WordPress) page conversion?”

  1. Anthony Chivetta

    One option would be to try and convert the pages into RSS…

    For example, on Unix you could download all the HTML files into a directory and then run something like this (I ran it in zsh when I was playing):

    (for file in $(find ./); do echo "$file"; cat "$file"; echo ""; done) > foo.rss

    The resulting file will have a number of issues, such as the description elements containing and such as well as needing a tag, etc.

    The biggest issue here is that you are trying to take unstructured data (html files, not all of which follow the same format) and turn it into structured data (RSS, or whatever format you want to import).

    If you can get away with it, making posts that just link to the original html files may be much easier… something like Feed43 ( http://feed43.com/how-it-works.html ) might be able to help you with that…

    Hope that helps!

  2. Anthony Chivetta

    It looks like part of my post got mangled, one of the above paragraphs should read like (assuming it comes out correctly this time):

    The resulting file will have a number of issues, such as the description elements containing and such as well as needing a tag, etc.

  3. Anthony Chivetta

    (OK, third try is a charm, right?… replace the bracketed tags with normal tags)

    (for file in $(find . -type f -name "*.html"); do echo "[item][title]$file[/title][description]"; cat "$file"; echo "[/description][/item]"; done) > foo.rss

    The resulting file will still have a number of issues: the description elements will contain [html] tags and the like, the whole thing will need to be wrapped in a [channel] tag, etc.

  4. Mark Ahlness

    Wesley, thanks so much for putting the question out there! Anthony, thank you as well. I got a little mired down and lost in variable land over at feed43.com. I had looked at RSS as well. I do have a routine spelled out to convert all the HTML files to a WordPress-friendly file, but it will take me a while, as there are over 600 HTML docs, and it’s manual. Will let you know how it turns out. Thanks again! – Mark
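
For anyone who would rather script it than use a shell one-liner, here is a rough Python sketch of the same idea Anthony describes in the comments: wrap each HTML file in an item with a title and a description, then wrap everything in a channel. It is only a sketch under some assumptions. The function names, the "Converted pages" channel title, and the example.com link are made up for illustration, the pages are assumed to be UTF-8, and the title and body are pulled out with simple regular expressions, which will not cope with every page. Whether a given WordPress importer accepts the result untouched is something to test on a few pages first.

```python
import html
import re
from pathlib import Path


def extract_title(page: str, fallback: str) -> str:
    """Return the text of the first <title> element, or the fallback."""
    match = re.search(r"<title[^>]*>(.*?)</title>", page, re.IGNORECASE | re.DOTALL)
    if match and match.group(1).strip():
        return html.unescape(match.group(1).strip())
    return fallback


def extract_body(page: str) -> str:
    """Return the markup inside <body>, or the whole page if there is none."""
    match = re.search(r"<body[^>]*>(.*)</body>", page, re.IGNORECASE | re.DOTALL)
    return match.group(1).strip() if match else page.strip()


def build_rss(pages: dict[str, str],
              channel_title: str = "Converted pages",      # hypothetical default
              channel_link: str = "http://example.com/") -> str:  # placeholder URL
    """Build an RSS 2.0 document from a mapping of file name to HTML source."""
    items = []
    for name in sorted(pages):
        title = html.escape(extract_title(pages[name], fallback=name))
        # Escaping the body keeps stray HTML from breaking the XML.
        description = html.escape(extract_body(pages[name]))
        items.append(
            f"<item><title>{title}</title>"
            f"<description>{description}</description></item>"
        )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<rss version="2.0"><channel>'
        f"<title>{html.escape(channel_title)}</title>"
        f"<link>{html.escape(channel_link)}</link>"
        "<description>Static pages converted for import</description>\n"
        + "\n".join(items)
        + "\n</channel></rss>\n"
    )


def convert_directory(source_dir: str, output_file: str) -> int:
    """Read every .html file under source_dir and write one RSS file."""
    pages = {
        str(path.relative_to(source_dir)): path.read_text(encoding="utf-8", errors="replace")
        for path in Path(source_dir).rglob("*.html")
    }
    Path(output_file).write_text(build_rss(pages), encoding="utf-8")
    return len(pages)
```

Calling convert_directory("pages", "foo.rss") would read every .html file under a (hypothetical) pages folder and write a single foo.rss. Python runs on Windows XP as well as on Unix, which is the main reason for sketching it this way rather than in zsh.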