University Web Developers

We're in the process of moving from static HTML pages (Dreamweaver templates) to a content management system for our main university web site.  A lot of CMSes have import utilities that can use CSV (Comma Separated Value) files or XML.

I am looking for a way to grab all of the content on our old pages and put it into one giant CSV file (or at least a folder at a time) for bulk import into our new site.  Finding the content areas on our pages would be pretty easy, since they're already clearly marked in the code (i.e., "<!-- InstanceBeginEditable name="Text" -->").

Does anyone have experience doing something like this?  Something like Mozenda looks promising, but I'd prefer something that is a little cheaper or at least doesn't charge by the page.

Replies to This Discussion

I'm nearing the end of a 5-month CMS migration from Dreamweaver. I did it by hand (read: work-study students) because the HTML in Dreamweaver had 10 years of cruft built up and I didn't want that in my shiny new CMS. I could have done it in a couple days, though, with a Ruby script. If you don't have someone with those skills, I'm sure you could hire someone (or yours truly) for just a few hours to do it.
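
For anyone wondering what such a script might look like, here is a minimal sketch. It is written in Python rather than Ruby and is not the poster's script. It assumes every editable region is closed by a `<!-- InstanceEndEditable -->` comment, and the function names (`extract_regions`, `iter_pages`, `write_rows`) and the sample page are made up for illustration.

```python
import csv
import io
import re
from pathlib import Path

# Dreamweaver wraps each editable region in a pair of comments:
#   <!-- InstanceBeginEditable name="Text" --> ... <!-- InstanceEndEditable -->
REGION_RE = re.compile(
    r'<!--\s*InstanceBeginEditable\s+name="([^"]+)"\s*-->'
    r'(.*?)'
    r'<!--\s*InstanceEndEditable\s*-->',
    re.DOTALL,
)

def extract_regions(html):
    """Return {region name: inner HTML} for every editable region in a page."""
    return {name: body.strip() for name, body in REGION_RE.findall(html)}

def iter_pages(root):
    """Yield (relative path, file contents) for each .html file under root."""
    for path in sorted(Path(root).rglob("*.html")):
        yield str(path.relative_to(root)), path.read_text(encoding="utf-8", errors="replace")

def write_rows(pages, out, region_names):
    """Write one CSV row per page: its path, then the requested regions."""
    writer = csv.writer(out)
    writer.writerow(["path", *region_names])
    for path, html in pages:
        regions = extract_regions(html)
        writer.writerow([path, *(regions.get(name, "") for name in region_names)])

# Hypothetical sample page, used here instead of a real site folder.
sample = '''<html><head>
<!-- InstanceBeginEditable name="doctitle" --><title>About</title><!-- InstanceEndEditable -->
</head><body>
<!-- InstanceBeginEditable name="Text" -->
<p>Hello, "world"</p>
<!-- InstanceEndEditable -->
</body></html>'''

buf = io.StringIO()
write_rows([("about/index.html", sample)], buf, ["doctitle", "Text"])
print(buf.getvalue())
```

To run it over a whole folder, pass `iter_pages("path/to/site")` as the first argument to `write_rows` and open a real file (with `newline=""`) instead of the `StringIO` buffer. The `csv` module takes care of quoting commas, quotes, and line breaks inside the HTML.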

I'd second the notion that depending on how recently those pages were created, you could be bringing along font tags and the associated "cruft."
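
As a rough illustration of cleaning out that kind of cruft before import, the sketch below strips `<font>` tags while keeping the text inside them. It is an assumption about what a cleanup pass could look like, not something any poster described; `strip_font_tags` is a hypothetical helper, and a regular expression like this will trip on attribute values that contain a `>` character.

```python
import re

# Matches opening and closing <font> tags, in any letter case, with any attributes.
FONT_TAG_RE = re.compile(r"</?font\b[^>]*>", re.IGNORECASE)

def strip_font_tags(html):
    """Remove <font ...> and </font> tags but keep whatever they wrapped."""
    return FONT_TAG_RE.sub("", html)

dirty = '<p><FONT face="Arial" size="2">Hi &amp; <b>bye</b></font><br></p>'
print(strip_font_tags(dirty))
```

A real HTML parser would be the safer choice for messy pages; the regular expression is only meant to show the idea.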

I'm working on something similar, parsing through HTML and producing an XML markup used by our new CMS. I just want to get stuff in. My tool of choice is Perl, given all the pattern-matching abilities of that language.

Stephanie Leary built a really great plugin for WordPress that does just this. You could pull your static content into a WP install and export to XML. Might be a quick and easy way to accomplish the task.

Just curious, what kind of CMS will you be using?

dotCMS is the frontrunner at the moment.

I've helped with a few such migrations. Getting the content areas themselves isn't hard; a good developer could write a script for it rather easily. The problem is the formatting, accessibility, etc., especially with Dreamweaver, where there tends to be a lot of garbage code. I've tried plugins and other methods, but in the end your best bet might be something along the lines of a student worker and a text editor. Anything else can cause all sorts of extra problems down the line.

Hi Stephen,

At NNU, we just accomplished content migration from static HTML files (Dreamweaver) into Typo3 using a custom extension we built. I'd be happy to let you look at the code and see if you can use it. You might also consider Typo3 as a CMS.

How big is the site you are looking to migrate?



How deeply have you delved into this? Have you thought about site structure? Is that going to change at all? What about any attachments like PDFs and images that are part of the content?

The Typo3 extension we built not only imports the HTML files into the db, it also imports any files, such as PDF (you specify file types), rebuilds the links to work with the CMS, and if you have Dreamweaver template tags around your content it can target only that area.

Of course there is still some manual editing we are doing, but I processed a 35k+ file site in less than an hour and have only needed to make relatively small tweaks to what was imported (not counting actual restructuring).

The extension is still a little rough, but fully workable. We were planning on releasing it back to the Typo3 community after a little cleanup, but if you would like to take a gander at it, please feel free to talk to Zac or me.

I have a BBEdit Text Factory that does this. If you're on a Mac, you can download BBEdit (free for 30 days), go to the menu that looks like a gear, open the Text Factories folder, and then put in the file below (once you've extracted it from the .zip archive). Once you put it into the folder, you can run the Factory over a whole folder/site full of documents.

I would look into YQL (Yahoo Query Language). You can build your own query that delivers either an XML tree or a JSON structure, then parse the result with any language. I've never used it in the manner you are looking to, but it's worth a shot.

Well, it doesn't look like it will do the job en masse, but DW CS5 can export the template data as an XML document. You can export the XML two ways: one that uses the editable region names as the XML tags and one that uses DW XML tags.

I recently built a migration script as a proof of concept that my college could make the change to Drupal pretty seamlessly.


Our current CMS (RedDot) has the ability to output valid XHTML files. I built a script in classic ASP (am currently porting it to PHP) that scraped these XHTML files using XPath and output them into a CSV file. I just made sure to consistently name elements (e.g., <div id="header">) so that XPath could find them.
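
The same idea can be sketched in Python (not the ASP or PHP script described above). This version uses the standard library's `xml.etree.ElementTree`, which supports only a limited subset of XPath, and it assumes the files are well-formed XHTML in the XHTML namespace. The `content` id, the `scrape` and `element_text` helpers, and the sample markup are hypothetical; only the `header` id comes from the post.

```python
import csv
import io
import xml.etree.ElementTree as ET

# XHTML elements live in this namespace, so the XPath needs a prefix for it.
XHTML_NS = {"x": "http://www.w3.org/1999/xhtml"}

def element_text(el):
    """Join all text inside an element and collapse runs of whitespace."""
    return " ".join("".join(el.itertext()).split())

def scrape(xhtml, ids):
    """Return the text of <div id="..."> for each id, or "" when it is missing."""
    root = ET.fromstring(xhtml)
    row = []
    for element_id in ids:
        el = root.find(f".//x:div[@id='{element_id}']", XHTML_NS)
        row.append(element_text(el) if el is not None else "")
    return row

# Hypothetical sample page standing in for an exported XHTML file.
sample = """<html xmlns="http://www.w3.org/1999/xhtml">
<body>
<div id="header"><h1>Admissions</h1></div>
<div id="content"><p>Apply by <b>March 1</b>.</p></div>
</body></html>"""

ids = ["header", "content"]
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(ids)                  # header row: one column per element id
writer.writerow(scrape(sample, ids))  # one row per page
print(buf.getvalue())
```

One caveat: ElementTree rejects named entities such as `&nbsp;` unless they are declared, so real XHTML exports may need those replaced first or a more forgiving parser.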



UWEBD has been in existence for more than 10 years and is the very best email discussion list on the Internet, in any industry, on any topic


© 2019   Created by Mark Greenfield.   Powered by
