We don't want to use Google for our site searching/indexing. Does anyone know of any alternative solutions?
At the simplest level, if you have been using Google CSE in a very basic way, you might want to test DuckDuckGo as an alternative: http://duckduckgo.com/search_box.html
However... some more information would be very helpful in thinking about the problem: Are you currently using Google CSE or are you replacing a Google search appliance? Are you using a CMS? Searching multiple sites? Roughly how many pages? Do you need to search within PDFs?
I'd be interested to hear about other options. We're expecting to transition to Google Site Search since the Mini will no longer be supported, but I'm wondering if anyone else is making a viable search appliance similar to the Mini?
Our problem isn't a technical one; we have issues with their terms of service. We'd rather have something proprietary than open source. It turns out our CMS has a Site Search module that we can add, and we're looking into that.
Check these Site Search scripts resources:
We're also considering other search options now that the Google Mini is end-of-life.
Lead contenders are Google Site Search and an upgrade to a Google Search Appliance. I've also been thinking about some sort of homegrown solution using Solr (for search) + Nutch (for crawling the web presence). Here's a tutorial on combining the two: http://wiki.apache.org/nutch/NutchTutorial
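For anyone wondering what the Solr side of that might look like from a site's search page, here is a rough sketch in Python, not a tested setup. It assumes Nutch has already pushed its crawl into a Solr core (the core name "nutch", the localhost URL, and the example hostnames are all made up) and that the index has "url", "title", and "host" fields, which may differ depending on your schema. It only builds a query URL for Solr's standard /select handler and pulls results out of the JSON response; fetching the URL is left out.

```python
import json
from urllib.parse import urlencode


def build_solr_query_url(base_url, core, terms, site_hosts=None, rows=10):
    """Build a URL for Solr's /select handler.

    base_url, core, and the "host" field name are assumptions for
    illustration; adjust them to match your own Solr install and schema.
    """
    params = [("q", terms), ("rows", str(rows)), ("wt", "json")]
    if site_hosts:
        # Filter query limiting results to the listed hosts, so one index
        # can cover several sites (main site, athletics site, etc.).
        hosts = " OR ".join('"%s"' % h for h in site_hosts)
        params.append(("fq", "host:(%s)" % hosts))
    return "%s/%s/select?%s" % (base_url.rstrip("/"), core, urlencode(params))


def extract_hits(response_text):
    """Return (url, title) pairs from a Solr JSON response body."""
    data = json.loads(response_text)
    docs = data.get("response", {}).get("docs", [])
    return [(doc.get("url"), doc.get("title")) for doc in docs]


if __name__ == "__main__":
    # Hypothetical hosts; replace with the sites you actually crawl.
    print(build_solr_query_url(
        "http://localhost:8983/solr", "nutch", "financial aid",
        site_hosts=["www.example.edu", "athletics.example.edu"],
    ))
```

The filter on host is one way to get the "entire web presence" into a single index while still letting a page restrict results to one site when needed.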
I'm looking for something that can search the entire web presence, including college-related sites like our Athletics web site (which is produced by a third party), rather than just a single site.
I know that there are also modules for integrating Drupal and WordPress with Solr, but I haven't delved into that too deeply.
I'm very curious to hear what others are doing with search in the future; it seems a good many colleges were using the Google Mini appliance.