04.12.07
SES: I, Robots.txt
By David A. Utter
Danny Sullivan led the Robots.txt Summit session during SES New York, where representatives from major search engines discussed the future of the humble file used to manage crawler behavior.
Robots.txt serves as a tool to guide obedient search engine spiders to the content a webmaster wants indexed. It also tells those spiders which content the webmaster wants them to stay away from.
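As a rough illustration of how a well-behaved crawler consults the file, here is a minimal sketch using Python's standard `urllib.robotparser` module. The rules, the `ExampleBot` agent name, and the example.com URLs are made up for this example and do not come from any site mentioned in the session.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the path below is illustrative only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text that has already been fetched."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


robots = build_parser(ROBOTS_TXT)

# An obedient spider asks before requesting each URL.
home_allowed = robots.can_fetch("ExampleBot", "https://www.example.com/index.html")
private_allowed = robots.can_fetch("ExampleBot", "https://www.example.com/private/report.html")

print(home_allowed)
print(private_allowed)
```

Nothing forces a crawler to run a check like this, which is why the file only works for spiders that choose to obey it.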
It's a relatively simple concept to implement, and it grew out of a cooperative discussion between search engines over a decade ago. Since then, there has been little discussion of what changes robots.txt might need.
That's what brought Danny and people from Ask, Google, Microsoft, and Yahoo together at SES New York.
The search engines have ideas about enhancing the now-venerable standard.
Keith Hogan, Director of Program Management, Search Technology, Ask.com, talked about Ask's views on robots.txt. He put forth a handful of concepts they are puzzling over, like controlling when a crawler can traverse a site or even passing meta directives in HTML to the engine from the robots.txt file.
Eytan Seidman, Senior Program Manager Lead, Live Search, noted that hotelier Hilton.com seems to think robots.txt can already control indexing times universally.
In Hilton's file, the text "Do not visit Hilton.com during the day!" appears.
"This is very helpful to a search engine bot," he added in an eye-rolling way.
Sean Suchter, Director of Yahoo! Search Technology, Yahoo! Search, said Yahoo currently supports "crawl-delay," but it is frequently misused. "We would like to replace this with something that is better," he said.
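For readers unfamiliar with the directive, the sketch below shows one way a crawler might read a `Crawl-delay` value, using Python's standard `urllib.robotparser` (the `crawl_delay` method is available in Python 3.6 and later). The file contents and the `ExampleBot` and `OtherBot` agent names are hypothetical, and this is not a description of how Yahoo's crawler handles the value.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt asking one named crawler to pause between requests.
DELAY_ROBOTS_TXT = """\
User-agent: ExampleBot
Crawl-delay: 10
Disallow: /search
"""

delay_parser = RobotFileParser()
delay_parser.parse(DELAY_ROBOTS_TXT.splitlines())

# Returns the delay for a matching agent, or None when no rule applies.
example_delay = delay_parser.crawl_delay("ExampleBot")
other_delay = delay_parser.crawl_delay("OtherBot")

print(example_delay)
print(other_delay)
```

The value is a bare number with no standardized meaning across engines, which may be part of why the directive is easy to misuse.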
Dan Crow, Product Manager, Google, said his long-term goal for robots.txt is to standardize the common core feature set.
He suggested the term 'robots.txt' is incomplete, given the existence of robots meta tags.
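To make the distinction concrete, a robots meta tag lives in a page's HTML rather than in the site-wide file. The sketch below, built on Python's standard `html.parser`, pulls the directives out of such a tag. The `RobotsMetaParser` class, the `robots_directives` helper, and the sample page are all invented for illustration.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        if (attr_map.get("name") or "").lower() != "robots":
            return
        content = attr_map.get("content") or ""
        # Directives are comma-separated, e.g. "noindex, nofollow".
        for directive in content.split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html: str) -> set:
    """Return the set of robots meta directives found in an HTML document."""
    meta_parser = RobotsMetaParser()
    meta_parser.feed(html)
    return meta_parser.directives


# Hypothetical page that asks engines not to index it or follow its links.
PAGE = '<html><head><meta name="robots" content="noindex, nofollow"></head><body></body></html>'

print(sorted(robots_directives(PAGE)))
```

A crawler has to fetch the page before it can see these directives, whereas robots.txt is checked before any page is requested.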
About the Author: David Utter is a staff writer for WebProNews covering technology and business.