The robots exclusion standard, also known as the Robots Exclusion Protocol or robots.txt protocol, is a convention for asking cooperating web spiders and other web robots not to access all or part of a website that is otherwise publicly viewable.
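As a rough illustration of how a cooperating robot might honor such rules, the sketch below uses Python's standard `urllib.robotparser` module. The robots.txt content, the `ExampleBot` user agent, and the `example.com` URLs are made up for the example and do not come from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A cooperating robot asks before fetching each URL.
blocked = parser.can_fetch("ExampleBot", "https://example.com/private/page.html")
allowed = parser.can_fetch("ExampleBot", "https://example.com/index.html")
print(blocked, allowed)
```

The rules are only advisory: the check runs in the robot itself, so a robot that does not cooperate is not stopped by them.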
Links are classified as either "external" or "internal" according to a defined boundary of which portion of the internet counts as "us" and which as "them." Most commonly, a link to a page outside the same domain is considered external, whereas one within the same domain is considered internal.
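One possible way to apply this rule in code is sketched below. It assumes that "same domain" means an exact hostname match, so a subdomain such as `blog.example.com` would count as external; a tool could just as well draw the boundary differently. The function name `classify_link` and the URLs are hypothetical.

```python
from urllib.parse import urljoin, urlparse


def classify_link(page_url: str, href: str) -> str:
    """Return "internal" or "external" for a link found on page_url.

    Assumption: two URLs belong to the same site only when their
    hostnames match exactly.
    """
    # Resolve relative links such as "/about" against the page URL.
    target = urljoin(page_url, href)
    page_host = urlparse(page_url).hostname
    target_host = urlparse(target).hostname
    return "internal" if target_host == page_host else "external"


page = "https://example.com/articles/index.html"
print(classify_link(page, "/about"))
print(classify_link(page, "https://other.example.org/"))
```

Links without a hostname, such as `mailto:` addresses, fall into the "external" group under this assumption.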
When HTML documents are served, there are three ways to tell the browser which character encoding is to be used for display to the reader.
One of these methods is for the HTML document to include this information at its top, inside the HEAD element.
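The in-document method can be checked with a short script. The sketch below uses Python's standard `html.parser` module to look for a `meta` element that declares the encoding, either as `<meta charset="...">` or in the older `http-equiv="Content-Type"` form. The class and function names are hypothetical, and the sketch does not look at the other ways an encoding can be declared.

```python
from html.parser import HTMLParser


class MetaCharsetFinder(HTMLParser):
    """Record the first character encoding declared in a meta element."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.charset is not None:
            return
        attr = dict(attrs)
        # Short form: <meta charset="utf-8">
        if attr.get("charset"):
            self.charset = attr["charset"].strip().lower()
            return
        # Older form: <meta http-equiv="Content-Type"
        #                   content="text/html; charset=utf-8">
        if (attr.get("http-equiv") or "").lower() == "content-type":
            for part in (attr.get("content") or "").split(";"):
                name, _, value = part.strip().partition("=")
                if name.lower() == "charset" and value:
                    self.charset = value.strip().lower()


def find_meta_charset(html: str):
    """Return the declared encoding in lowercase, or None if absent."""
    finder = MetaCharsetFinder()
    finder.feed(html)
    return finder.charset


print(find_meta_charset('<html><head><meta charset="UTF-8"></head></html>'))
```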
Favicon: a small logo image shown next to the website address in the browser. It is a 16 x 16 pixel icon, also known as a "shortcut icon" or "favorites icon." The favicon appears in the user's favorites list and also in the browser's address bar.
You can use the retrieved information for search engine optimization and to increase web page popularity.