Recently we have begun to experience an increase in spam generated from some of our web (HTML) forms. How do you deal with this? I'm concerned that some solutions may be inaccessible (e.g. CAPTCHA).
We are experimenting with the reCAPTCHA API for some of our forms. Its audio feature was quite appealing. It may not be a system-wide solution, but to us it was worth a look. http://recaptcha.net/learnmore.html
Another interesting method I have seen is to add
enctype="multipart/form-data"
to your form and check for that when processing. I guess most bots don't know to change their encoding type. Granted, this is only a short-term fix until they decide to do that.
The other ones I know of are mostly JavaScript-based, which I won't use on the off chance that a visitor doesn't have it enabled.
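For what it's worth, here is a rough Python sketch of what that server-side check might look like. The function name is made up for illustration, and it assumes your form handler can read the raw Content-Type header of the incoming request:

```python
def looks_like_multipart(content_type_header):
    """Return True if the request declares multipart/form-data.

    Hypothetical helper: pass in the raw Content-Type header value
    (or None if the request did not send one).
    """
    if not content_type_header:
        return False
    # The media type comes before any parameters such as "; boundary=..."
    media_type = content_type_header.split(";", 1)[0].strip().lower()
    return media_type == "multipart/form-data"
```

A handler would reject (or quietly drop) any submission where this returns False.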
I recently tried a concept that seems to work well. It's based on the fact that most bots only visit a form page once, so that they can harvest the form and all its field names, action URL, etc. This info about the form is then stored in their database somehow, and is then used to constantly hammer the action URL by sending values for all the fields.
(Before continuing, let me say that I took a pretty detailed look at my server logs, found IP addresses of bots sending form spam, and discovered that those IP addresses were not actually visiting the form page itself. They were only submitting to the action URL.)
So, knowing this, I created a method that logs the IP address of every visit to a form. So when you visit a web page with a form on it, your IP address is logged, and you now have permission to actually submit the form. Then I changed my form handling program to check the incoming IP address against that log, to make sure that it has permission to submit the form.
In other words, you're not allowed to submit a form unless you actually visit the form web page first. Most bots do not visit the form web page first, and therefore almost all form submissions are now coming from humans.
The main reason I like this solution is that it's all server-based. No JavaScript. No CAPTCHA. Nothing extra for users to do. And I got a fantastic response from the form submission recipients. Form spam dropped to virtually zero!
Then came a problem. A number of visitors started reporting that they could not submit our forms. After investigating IP addresses, I found that they were all AOL users. It seems that AOL is doing some stuff with their IP address allocation, so that a user's IP address can actually change during their online session. This of course throws my whole theory out the window.
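To make the idea concrete, here is a minimal Python sketch, not the actual code from my site. The class name, the in-memory dictionary standing in for the log, and the one-hour expiry are all assumptions made for illustration:

```python
import time


class VisitLog:
    """Hypothetical in-memory log of IP addresses that viewed the form page."""

    def __init__(self, max_age_seconds=3600):
        # Assumed expiry: a page view grants permission for one hour.
        self.max_age_seconds = max_age_seconds
        self._seen = {}  # ip -> time of the most recent form page view

    def record_visit(self, ip, now=None):
        """Call this when the form page is served."""
        self._seen[ip] = time.time() if now is None else now

    def may_submit(self, ip, now=None):
        """Call this in the form handler before accepting a submission."""
        now = time.time() if now is None else now
        visited_at = self._seen.get(ip)
        return visited_at is not None and now - visited_at <= self.max_age_seconds
```

The form page calls record_visit() with the visitor's IP address, and the form handler refuses any submission for which may_submit() returns False.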
So, I currently have this whole thing turned off, and I'm hoping to figure out a way to still use it. I figured I'd post it here anyway, with the hope that maybe someone else can expand on it with a good idea.
AOL uses proxy servers that make many users look like one IP address and, as you discovered, sometimes make the same user look like different IP addresses.
You can check the referrer in the script for the action URL to verify that the request came from the form page. That achieves about the same amount of security without all the extra work, and it avoids the AOL problem. Referrers can be spoofed, but so can IPs.
Most of the bots I have dealt with are smart enough to spoof the referrer, so that method hasn't been very fruitful for me.
Scott, I think a better way to handle your situation is this: instead of putting an IP address in your DB, generate a random seed, store it in the DB, and place it in a hidden field in the form. After the person submits the form, delete the seed from the database (so there is no chance you later generate the same seed again and that person can't submit). Then when the bots keep resubmitting the harvested form, they won't be able to get through.
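A referrer check along those lines could look roughly like this in Python. The function name and the decision to compare only scheme, host, and path are my own assumptions, and an empty or missing header is treated as a failure:

```python
from urllib.parse import urlsplit


def referer_matches_form(referer_header, form_url):
    """Return True if the Referer header points at the form page.

    Hypothetical helper: the query string and fragment are ignored.
    """
    if not referer_header:
        return False
    ref = urlsplit(referer_header)
    form = urlsplit(form_url)
    return (ref.scheme, ref.netloc.lower(), ref.path) == (
        form.scheme,
        form.netloc.lower(),
        form.path,
    )
```

Keep in mind that some browsers and privacy tools strip the header entirely, so legitimate visitors can fail this test too.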
I think that method would fix the issues had with the previous solution.
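Something like the following Python sketch is what I have in mind. It is only an illustration: the class name is made up, a set in memory stands in for the database table, and I've used Python's secrets module to generate the value rather than a seeded random number generator, since it is meant for unguessable tokens:

```python
import secrets


class FormTokens:
    """Hypothetical store of one-time form tokens (a set stands in for the DB)."""

    def __init__(self):
        self._issued = set()

    def issue(self):
        """Create a token to embed in the form's hidden field."""
        token = secrets.token_urlsafe(32)
        self._issued.add(token)
        return token

    def redeem(self, token):
        """Accept a submission once, then delete the token so it can't be replayed."""
        if token in self._issued:
            self._issued.remove(token)
            return True
        return False
```

issue() runs each time the form page is served, and the handler only accepts a submission if redeem() returns True for the hidden field's value. A bot replaying a harvested form sends a token that has already been deleted (or never existed), so it is refused.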
Yeah, when I did my initial look into the form spam in my server logs, I found a lot of spoofed HTTP_REFERERs. Their initial harvesting process just logged the URL of the original form and simply sent that with the form submissions. I know that IP addresses can be spoofed too, but after banging heads with a couple of other IT minds, we figured that using the IP address is more reliable (less often spoofed).
I also know about AOL's caching system (as well as the ability for any ISP to do the same). But I only thought that the risk would be multiple visitors with the same IP address. What I didn't realize is that an AOL user doesn't use the same cache server throughout the life of a single online session, hence the possibility of changing IP addresses.
I still like my original concept of ensuring that a user actually visits the form first before submitting. I like a couple of other ideas posted here too.