University Web Developers

In a few weeks, my organization will be removing our old Web site from our Web server and replacing it with a completely new site. The new site includes completely new content and a radically different structure.

We will be using a Google Custom Search Engine as our internal search engine on the Web site.

I am curious if anybody has any advice on how we should handle this situation.

Obviously, if at all possible, I would like to remove all of our old URLs from Google's index and have it completely re-index our site from scratch. Otherwise, visitors using our internal search engine will most likely get a huge number of search results that lead to 404 pages. Where there may be only one or two pages on our new site that match the entered search criteria, it's possible that our visitors will get anywhere between 10 and 50 results from the old Web site when searching on those criteria. If at all possible, I would like to avoid this.

However, I do not want to damage our search engine ranking in any way, so I am a little nervous about doing anything too drastic when making the switch.

As of today, I have Google indexing our new site at a temporary sub-domain, which I will redirect to our real domain once we make the switch.

Obviously, I will also set up the server so that any URLs to non-existent pages generate proper 404 messages, hoping that Google will automatically remove those pages from its index.

Our new Web site has a dynamically-generated Google Sitemap (whereas our old site doesn't even have a Google Sitemap - mostly because even I have no idea what pages are on the site or where they should be). I have also put a lot of effort into utilizing keywords and page descriptions, adding RSS feeds where applicable and implementing methods to make sure content gets updated regularly. As an organization, we have also made an effort to write much more user-friendly (and search engine-friendly) content for the site.

Has anyone else gone through this? How did you handle it? Do you have any specific advice for me? Thank you.

Replies to This Discussion

I see this post is old; I missed it when you posted it. How has it been going for you so far?

I converted our online college site from an old CMS to Drupal with custom URLs, so only a few URLs stayed the same.

Here are some thoughts:

I wouldn't use 404s on those old pages; use 301s so they pass on the link juice they previously had. As Google finds more and more of the new URLs, it will update its search results with the proper listings. And giving a user a 404 when you could 301 them to the right page is very poor for usability.
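For illustration, here is a minimal Python sketch of the "301 where you can, 404 otherwise" idea. The paths and the `REDIRECTS` table are hypothetical; a real map would be built from analytics or Webmaster Tools data, and the actual redirect would normally be issued by the web server or CMS rather than a function like this.

```python
# Hypothetical old-to-new path map for pages that have a clear successor.
REDIRECTS = {
    "/admissions/apply.asp": "/admissions/apply",
    "/about/president.html": "/about/president",
}

def resolve(path, live_paths):
    """Return (status, location) for a requested path."""
    if path in live_paths:
        return 200, None          # page exists on the new site
    target = REDIRECTS.get(path)
    if target is not None:
        return 301, target        # permanent redirect to the successor page
    return 404, None              # no sensible successor, so a real 404

live = {"/admissions/apply", "/about/president"}
print(resolve("/about/president.html", live))
print(resolve("/old/unknown.html", live))
```

Keeping the map as plain data makes it easy to grow one entry at a time as new 404s show up in the reports.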

We experienced a two-week drop before the road to recovery and better rankings than ever. Keep a close eye on Google Webmaster Tools and the 404 report it gives you. I added that report to my iGoogle homepage so I see it multiple times a day and get rid of all the 404s that occasionally pop up (using 301s to somewhere).

Our site is smaller, so to date I've written only about 60 redirects. I'd imagine on a huge site it may not be feasible to go that route.

Thank you for your response.

Unfortunately, the launch of our new Web site was delayed for a few reasons over which I had no control. Therefore, I can't yet tell you how the process with Google is going. However, due to the slow indexing of our new site and the fact that the addresses will obviously change from the development location to the production location, I've developed an internal search engine for the CMS. It's not quite as robust and reliable (I don't have it set up to search non-Web documents like PDF, DOC, etc., nor do I yet have it set up to rank the pages in any sort of order) as the Google CSE will be, but it's much easier to customize the look and will be more reliable at the start (searching everything on the site instead of just what's been indexed, so far).

Regarding the 301s, I think that's a great idea in theory. Unfortunately, our current Web site contains over 45,000 files (most of which have been on the site for many years before my employment here), which is part of the reason we decided to start over from scratch.

The new Web site currently has a total of 918 pages (that number includes multiple pages that show content duplicated from other pages, so the total number of unique pages is probably closer to 750-800).

In addition, because we completely overhauled the Web site, creating an entirely new structure and developing completely new content, there are going to be very few pages that carry over from the old site to the new site.

For important pages, I am setting up page aliases (so that the old address still leads to the same place without redirecting) rather than 301s. However, I just don't see an efficient way to set up 301s for everything on the site.

Any suggestions in that area would be greatly appreciated.

I have tried to enhance our 404 pages (obviously this won't have any effect on Google, but it will help our visitors) by showing the comprehensive site outline on the 404 page, along with a list of "similar" pages. At the moment, the "similarity" checker I built only checks for the existence of the keyword the user typed in the address bar within the identifiers of all the pages on the site. For instance, if someone were to try to access http://www.example.com/testing/test2/president.html, the CMS will check for all pages that include "president" in their identifier (which is generally just an SEF version of the page title).

Therefore, the 404 page would show a "Were you looking for..." list at the top of the page, listing all of the pages that include "president" in their identifier.
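A rough Python sketch of that keyword check, for anyone who wants to try the idea outside the CMS. The page identifiers are made up, and the real checker may pick the keyword differently; this version simply takes the last path segment without its extension.

```python
from posixpath import basename, splitext
from urllib.parse import urlparse

def keyword_from_url(url):
    """Take the last path segment of the URL and drop its file extension."""
    path = urlparse(url).path.rstrip("/")
    return splitext(basename(path))[0].lower()

def suggest_pages(url, identifiers):
    """Return the identifiers that contain the keyword taken from the URL."""
    keyword = keyword_from_url(url)
    if not keyword:
        return []  # e.g. a request for the site root has no keyword
    return [ident for ident in identifiers if keyword in ident.lower()]

# Hypothetical page identifiers (search-engine-friendly page titles).
PAGES = ["office-of-the-president", "presidents-list", "financial-aid"]
suggestions = suggest_pages(
    "http://www.example.com/testing/test2/president.html", PAGES
)
print(suggestions)
```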

I am thinking of developing a more robust "similarity" algorithm for the future, but have extremely little experience in building search algorithms, and Google's "Were you looking for..." feature is still very buggy.
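One low-effort option for a more forgiving match is approximate string matching. The sketch below uses `difflib` from the Python standard library, which is a swapped-in technique and not the CMS's own algorithm; the 0.8 cutoff and the assumption that identifiers are hyphen-separated words are both guesses that would need tuning.

```python
from difflib import get_close_matches

def fuzzy_suggest(keyword, identifiers, cutoff=0.8):
    """Return identifiers containing a word that closely resembles keyword."""
    keyword = keyword.lower()
    hits = []
    for ident in identifiers:
        # Assumes identifiers are hyphen-separated words, e.g. "financial-aid".
        words = ident.lower().split("-")
        if get_close_matches(keyword, words, n=1, cutoff=cutoff):
            hits.append(ident)
    return hits

# A transposed-letter typo still finds the president's page.
print(fuzzy_suggest("presidnet", ["office-of-the-president", "financial-aid"]))
```

This tolerates typos and small spelling differences, but it does not rank results or understand synonyms.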

I am always open to new suggestions, though, so please don't hesitate to comment back.

Well, just looking through your site very briefly, I see a few issues.

On internal pages, your logo links to the index.asp page when it should be linking to the root domain: http://www.lfcc.edu/. If you have the Google Toolbar installed, you can see that Google has assigned separate PageRank values to these two pages, which is not good. You want all of that going to the domain root, not the index.asp version. So the solution is to A) make all internal links go to the domain root and B) set up a 301 to redirect index.asp to /.
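Step B can be sketched in Python as a small path check. The set of default document names is an assumption, and in practice the redirect itself would be configured in the web server or CMS.

```python
import posixpath

# Assumed default document names; adjust to whatever the server actually uses.
INDEX_NAMES = {"index.asp", "index.html", "index.php"}

def canonical_redirect(path):
    """Return the path to 301 to, or None if the path is already canonical."""
    head, tail = posixpath.split(path)
    if tail.lower() in INDEX_NAMES:
        # "/index.asp" -> "/", "/about/index.asp" -> "/about/"
        return head if head.endswith("/") else head + "/"
    return None

print(canonical_redirect("/index.asp"))
print(canonical_redirect("/about/staff.asp"))
```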

Clicking around various places, I encountered a couple of 404s from left sidebar links. I'd run Xenu's Link Sleuth on your site so you can immediately fix on-site 404s.

I took a random sampling from the bottom of your XML sitemap file, and many of those pages are 404s. You should remove those from your sitemap.

The site you visited is still our old Web site. The new Web site has not yet been unveiled and, unfortunately, cannot be made public yet.

I will update the sitemap (I had forgotten that there even was an XML sitemap for the old site) and will update the Dreamweaver templates (unfortunately, there are about 100 different Dreamweaver templates in use on the site right now, which will all go away once we unveil the new site) when I am able to.

I will download and check out Link Sleuth. I'd not heard of that before. We sometimes use Dreamweaver to check for broken links, but with the massive number of files on the server, that can actually tie up my computer for an entire work day, so we try not to do that very often.

The advice you've given is sound advice, and I will definitely work on improving the new Web site to make sure that it does not suffer from any of the issues you raised about our old Web site. The CMS I built includes a built-in link checker, which runs as a weekly cron job (using PHP to check the header status of each link on the site) and notifies the CMS super users if any links are broken. Also, the CMS dynamically generates the XML sitemap each time a page is added, removed or modified, so I won't have to worry about 404 errors occurring from the sitemap itself.
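The CMS's checker is written in PHP and is not shown here; the following is only a rough Python equivalent of a header-status link check, using the standard library. The function names are illustrative, and the demonstration at the end uses a stand-in fetcher so it runs without touching the network.

```python
import urllib.error
import urllib.request

def head_status(url, timeout=10):
    """Return the HTTP status for a HEAD request, or None if unreachable."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        # urlopen follows redirects, so this is the status of the final response.
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code            # 4xx and 5xx responses arrive as HTTPError
    except (urllib.error.URLError, OSError):
        return None                # DNS failure, refused connection, timeout

def broken_links(urls, fetch=head_status):
    """Return {url: status} for links that failed or answered with 4xx/5xx."""
    report = {}
    for url in urls:
        status = fetch(url)
        if status is None or status >= 400:
            report[url] = status
    return report

# Offline demonstration with a stand-in fetcher instead of real requests.
fake_statuses = {"/ok": 200, "/gone": 404, "/down": None}
print(broken_links(fake_statuses, fetch=fake_statuses.get))
```

A weekly cron job could call `broken_links` with the site's link list and mail the resulting report to the super users.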

I will, however, make sure that the logo link leads to the root rather than index.html (or php or whatever extension I choose, since all of the links are built dynamically through a cross-section of htrewrite and PHP).

Thank you again for keeping up with me and providing the advice you've provided.

I sympathize with your situation. 45,000 obsolete pages is a lot to have to worry about. But changing all your URLs means that you lose all your old inbound links. That's serious. When Google reindexes your site, it will clean up its index (provided your 404 pages send 404 Not Found and not 200 OK). But Google can't clean up other people's web pages.
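A quick way to test the "404 Not Found and not 200 OK" condition is to request an address that cannot exist and look at the status code. This Python sketch assumes a `fetch` callable that takes a URL and returns its HTTP status; both that callable and the `missing-` naming scheme are illustrative.

```python
import uuid

def probe_url(base_url):
    """Build a URL that almost certainly does not exist on the site."""
    return base_url.rstrip("/") + "/missing-" + uuid.uuid4().hex + ".html"

def serves_real_404(base_url, fetch):
    """True if a made-up URL gets status 404 rather than a 'soft' 200 error page."""
    return fetch(probe_url(base_url)) == 404

# Stand-in fetchers for demonstration; a real one would make an HTTP request.
print(serves_real_404("http://www.example.com/", lambda url: 404))
print(serves_real_404("http://www.example.com/", lambda url: 200))
```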

To see your current link juice, search Google for "site:yourschool.edu". I would make sure I didn't lose those top results. Create individual 404 pages for them, if possible.

To see your current inbound links, search Google for "link:yourschool.edu" and check your logs for top referrers. If you have time, ask them to link to your new pages. Or create individual 301 pages for your most popular content.

Thank you for the tips.

I did look through our Google Analytics data to identify most of our popular pages, and set up 301 redirects for each of those.

In addition, I logged into my Google Webmaster Tools account and took a look at the pages that had inbound links ("Pages with external links"). I went ahead and set up 301 redirects for all of those that exist on the new site that weren't previously taken care of (there were only a total of 46 listed).

I just did a "link:lfcc.edu" search on Google, and went through and took care of most of those results as well (about 98% of them were only linking to our home page anyway).

We unveiled the new Web site at the beginning of this month. While we were at it, I went ahead and made official 301 redirects from each of our many domain names so that they all redirect to the official address of http://www.lfcc.edu/ (you can actually get there from www.lf.vccs.edu, www.lf.cc.va.us, and lfcc.edu, as well as a few others). That had never been done before, either.

Google Custom Search has a cool new "index now" button, so you no longer need to wait for a crawl. That was a great enhancement in the last release. Good luck.

Thank you for that information.

Because it was such a pain to get Google to index our site properly (since they didn't have that feature when I was actively working on it), I ended up having to write my own search engine for our site. In some ways, it's not nearly as good as using Google (for instance, my search engine only searches what's in the database, so it doesn't catch any PDF files, images, etc.), but in other ways it's much better (it's valid XHTML for starters).

I still have a use for Google CSE, though, so it's nice to know that I can force it to re-index my site.

© 2019 Created by Mark Greenfield.