Saturday, 28 September 2013

How are web pages scraped, and how do I protect against someone doing it?

I'm not talking about extracting text or downloading a single web page. I
see people downloading whole websites. For example, there is a directory
called "example" that isn't even linked anywhere on the site; how do I know
it's there? How do I download "ALL" pages of a website, and how do I protect
against it?
This question is not language-specific. I would be happy with just a link
that explains the techniques that do this, or a detailed answer.
