- Create a robots.txt file and put it in the root folder of the website
- Use a password to protect sensitive content
- Add an X-Robots-Tag header to the HTTP responses
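- Add <meta name="robots" content="nofollow" /> to a page to ask crawlers not to follow its links
- Add <meta name="robots" content="noindex" /> to a page to ask search engines not to index it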
http://antezeta.com/news/avoid-search-engine-indexing
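As a rough illustration of the X-Robots-Tag option, here is a minimal Python sketch that wraps a WSGI application so every response carries the header. The names `add_x_robots_tag` and `demo_app` are made up for this example, and the value `noindex, nofollow` is just one possible choice; a real site would more likely set the header in the web server configuration.

```python
def add_x_robots_tag(app, value="noindex, nofollow"):
    """Wrap a WSGI app so that every response includes an X-Robots-Tag header.

    `app` and `value` are illustrative parameters, not part of any library API.
    """
    def wrapped(environ, start_response):
        def start_with_tag(status, headers, exc_info=None):
            # Drop any existing X-Robots-Tag so the header is not duplicated.
            headers = [(k, v) for k, v in headers if k.lower() != "x-robots-tag"]
            headers.append(("X-Robots-Tag", value))
            return start_response(status, headers, exc_info)
        return app(environ, start_with_tag)
    return wrapped


def demo_app(environ, start_response):
    """A tiny hypothetical application used only to demonstrate the wrapper."""
    body = b"Not for search engines.\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


# The wrapped application can be handed to any WSGI server.
application = add_x_robots_tag(demo_app)
```

To try it locally, `application` can be served with the standard library's `wsgiref.simple_server.make_server`. Like the meta tags, this header is only a request: well-behaved crawlers honor it, but it does not protect the content.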
Here is an example robots.txt that asks search engines not to crawl your website (some bad bots may ignore robots.txt):
# Ask search engines not to crawl your website
# Put this file in the root directory of your website
User-agent: *
Disallow: /
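
One way to check that these rules do what is intended is to feed them to Python's standard `urllib.robotparser` module. The sketch below is only an illustration: the helper `is_allowed`, the bot name `ExampleBot`, and the `example.com` URLs are made up for this example.

```python
from urllib.robotparser import RobotFileParser

# The same rules as in the robots.txt above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if `user_agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Print the verdict for a few hypothetical paths.
    for path in ("/", "/index.html", "/private/report.pdf"):
        url = "https://example.com" + path
        print(path, is_allowed(ROBOTS_TXT, "ExampleBot", url))
```

With these rules every path should be reported as not allowed for any user agent. The check only shows how a compliant crawler reads the file; it says nothing about bots that ignore robots.txt.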