Preventing Your Web Server From Blocking Facebook Share
Today, I noticed that Facebook could not scrape any information from some of our web sites. In fact, when I tried to share articles from these web sites on Facebook, Facebook simply posted the URL without a title, a description, or an image.
Funny thing, this problem only occurs on web sites running on two servers that we operate ourselves. Our web sites running on shared hosting servers do not have this problem. That points to a web server configuration issue.
Using the Facebook debugger tool on the problematic web sites, we get this error message: "Error Parsing URL: Error parsing input URL, no data was scraped."
Finally, we traced it to Apache's default mod_security rules, which block all Facebook connections. One of these rules watches for a certain connection header that many spammers use and refuses the connection. It turns out that Facebook also uses this header. It's not clear why Facebook uses this header rather than a more "legitimate" one. However, the solution is to add a mod_security rule to allow Facebook connections:
Simply add the above rule to your Apache web server configuration. In our Ubuntu installation, that's the "/etc/apache2/apache2.conf" file. After adding the mod_security rule, restart the Apache server. With this change, Facebook should have no problem getting a valid response from your server.
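To check what the server returns without going through the Facebook debugger each time, a request that resembles the scraper's can be sent from a script. The following is a minimal sketch, not Facebook's actual behavior: the user agent string is the one Facebook's crawler is known to send, but the Range header and its byte range are assumptions (this article does not name the header), and the function names are made up for illustration.

```python
import urllib.error
import urllib.request

# User agent string sent by Facebook's crawler.
FACEBOOK_UA = "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"


def build_scraper_request(url, byte_range="bytes=0-524287"):
    """Build a GET request that resembles the Facebook scraper's.

    The Range header and its value are assumptions for illustration.
    """
    request = urllib.request.Request(url)
    request.add_header("User-Agent", FACEBOOK_UA)
    request.add_header("Range", byte_range)
    return request


def fetch_status(url, timeout=10):
    """Return the HTTP status code the server sends for a scraper-like request."""
    try:
        with urllib.request.urlopen(build_scraper_request(url), timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # A security rule that rejects the request typically shows up as a 4xx code.
        return error.code
```

Calling fetch_status("http://www.GearHack.com/") before and after changing the configuration shows whether the server's answer to this kind of request has changed. A 4xx code would suggest a rule is still rejecting it.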
It looks like the problem is a little more difficult than what's been outlined so far. For example, if you put the URL of this page into the Facebook debugger, you will likely get the "Error Parsing URL: Error parsing input URL, no data was scraped." error message. However, if you use "http://www.GearHack.com/", the debugger returns the results fine. Somehow, the root page behaves differently from the sub-page.
We are also noticing that the Facebook debugger may parse the sub-page correctly at first, then start returning the error message around the third attempt. It's not clear whether the Facebook debug tool is doing that or our web server is shutting Facebook out after the third attempt.
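One way to tell a server-side block from an intermittent failure is to repeat the same request several times and record both the outcome and how long each attempt took. The sketch below assumes a caller-supplied fetch function; probe_repeatedly is a hypothetical helper, not part of any Facebook or Apache tool.

```python
import time


def probe_repeatedly(fetch, attempts=5, pause=1.0):
    """Call fetch() several times and record (attempt, outcome, seconds) for each.

    The outcome is whatever fetch() returns, or the exception class name
    if the call fails (for example on a timeout or refused connection).
    """
    results = []
    for attempt in range(1, attempts + 1):
        started = time.monotonic()
        try:
            outcome = fetch()
        except Exception as error:
            outcome = type(error).__name__
        elapsed = time.monotonic() - started
        results.append((attempt, outcome, round(elapsed, 3)))
        if attempt < attempts:
            time.sleep(pause)  # space the attempts out a little
    return results
```

Here fetch could be a small function that requests the sub-page and returns the HTTP status code. If the status stays the same but the elapsed time grows from attempt to attempt, slowness is a more likely culprit than a blocking rule.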
It turns out that the mod_security problem identified above may be an issue for some servers, but it isn't the problem in our case. Our server consistently returns 206 to Facebook. However, the two servers in question respond too slowly at times, causing Facebook to time out. The slowness is caused by the "gethostbyaddr" PHP function call, where we convert the remote IP address to a domain name. It seems to respond fast enough on our two shared hosting servers, but not fast enough on our dedicated servers (go figure). Once we removed the call, the Facebook debugger had no problem retrieving and parsing our web pages. We still do not have a solution to speed up the "gethostbyaddr" function.
I got this error, "Error Parsing URL: Error parsing input URL, no data was scraped.", when I tried it on the Facebook debugger.
Copyright © 2004 - 2017. All Rights Reserved.