How Search Engine Robots Work
Search engines do not actually search the internet each time somebody types in a search query; that would take far too long. Instead, they search through databases of websites they have already indexed. The search engine robots find pages that are linked to from pages they already know about, or pages that are submitted to them. When a web page is submitted to a search engine, the URL is added to the robot's queue of websites to visit. Even if you don't directly submit a website, or the web pages within it, most robots will find your content if other websites link to it. This is part of the process referred to as building reciprocal links, and it is one of the reasons why it is crucial to build link popularity for a website and to get links from other relevant sites back to yours. It should be part of any website marketing strategy you adopt.
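As a rough illustration of this discovery process, the sketch below (a minimal example, not how any real search engine is implemented) keeps a queue of known or submitted URLs, fetches each page, and adds any newly linked URLs to the queue. The seed URL, page limit and politeness delay are illustrative assumptions.

```python
from collections import deque
from urllib.parse import urljoin
from urllib.request import urlopen
from html.parser import HTMLParser
import time

class LinkExtractor(HTMLParser):
    """Collects href values from anchor tags."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(seed_urls, max_pages=50, delay=1.0):
    """Breadth-first discovery: fetch known pages, queue newly found links."""
    queue = deque(seed_urls)       # pages submitted or already known
    seen = set(seed_urls)          # the crawler's "database" of known URLs
    while queue and len(seen) <= max_pages:
        url = queue.popleft()
        try:
            html = urlopen(url, timeout=10).read().decode("utf-8", "ignore")
        except Exception:
            continue               # unreachable pages are skipped for now
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)   # discovered via a link, not a submission
        time.sleep(delay)          # be polite to the servers being crawled
    return seen

# Example: discovered = crawl(["https://example.com/"])
```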
Search engine databases update at varying times. Once a website is in a search engine's database, the robots will keep visiting it regularly to pick up any changes made to the website's pages and to ensure they have the most current data. How often a website is visited depends on how the search engine schedules its visits, which varies from engine to engine. However, the more active a website is, the more often it will be visited: if a website changes frequently, or is extremely popular or heavily trafficked, the search engine will send its robots more often.
Sometimes robots are unable to access the website they are visiting. If a website is down, the robot may not be able to reach it. When this happens, the website may not be re-indexed, and if it happens repeatedly, the website may drop in the rankings.
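One common technique a robot can use to pick up changes without re-downloading every page is a conditional HTTP request: if the server supports the Last-Modified and If-Modified-Since headers, an unchanged page comes back as 304 Not Modified. The sketch below is a minimal illustration of that idea, assuming the server honours these headers; it is not a description of how any particular search engine schedules its visits.

```python
from urllib.request import Request, urlopen
from urllib.error import HTTPError

def needs_reindex(url, last_modified):
    """Return True if the page appears to have changed since last_modified.

    last_modified is the HTTP date string saved from the previous visit,
    e.g. 'Wed, 01 Jan 2020 00:00:00 GMT'. A 304 response means the stored
    copy is still current and the page does not need re-indexing.
    """
    request = Request(url, headers={"If-Modified-Since": last_modified})
    try:
        response = urlopen(request, timeout=10)
        return response.status == 200      # fresh content was returned
    except HTTPError as err:
        if err.code == 304:
            return False                   # unchanged since last visit
        raise                              # site down or other error
```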
Types of Search Engines
There are basically two types of search engines, which gather their listings in different ways.
Crawler Based Search Engines - Crawler based search engines, such as Google, create their listings automatically. Google's robot crawls, or spiders, the web, and people then search through what it has found.
Human Powered Search Engines - A human powered directory, such
as the Open Directory, depends on humans for its listings. You
submit a short description to the directory for your entire
site, or editors write one for sites they review. A search looks
for matches only in the descriptions submitted.
How do you identify a Search Engine Robot?
Search engines send out what are called spiders, crawlers or robots to visit your site and gather web pages. These search engine robots leave traces behind in your access logs, just as an ordinary visitor does. If you know what to look for, you can tell when a spider has come to call, so you need not worry about whether your site has been visited, and you can see exactly what a robot has recorded or failed to record. You can also spot robots that are making a large number of requests, which can inflate your page impression statistics or even burden your server.
The simplest way of spotting spiders is to look for their agent names, or what some people call browser names. Spiders and search engine robots have their own names, just like browsers. For example, the Netscape browser identifies itself as Mozilla, AltaVista's spider calls itself Scooter, and Yahoo's spider is named Slurp.
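As a rough illustration of looking for agent names in an access log, the sketch below scans an Apache-style combined log, where the user agent is the last quoted field on each line, and tallies requests from a few robot names. The log path and the set of agent signatures are illustrative assumptions; real agent strings vary and should be checked against each search engine's own documentation.

```python
import re
from collections import Counter

# Agent substrings to look for; Googlebot, Yahoo's Slurp and AltaVista's
# Scooter are used here as examples and are not an exhaustive list.
BOT_SIGNATURES = ("Googlebot", "Slurp", "Scooter")

# Combined log format: the user agent is the last quoted field on the line.
AGENT_PATTERN = re.compile(r'"([^"]*)"\s*$')

def count_bot_hits(logfile_path):
    """Tally requests per known robot signature in an Apache-style access log."""
    hits = Counter()
    with open(logfile_path, encoding="utf-8", errors="ignore") as logfile:
        for line in logfile:
            match = AGENT_PATTERN.search(line)
            if not match:
                continue
            agent = match.group(1)
            for signature in BOT_SIGNATURES:
                if signature in agent:
                    hits[signature] += 1
    return hits

# Example (path is hypothetical): print(count_bot_hits("access.log"))
```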
Search Engine Robots Crawling Problems
Search engine robots follow standard links with slashes, but dynamic pages, generated from databases or content management systems, have dynamic URLs containing question marks (?) and other command punctuation such as &, %, + and $. Search engine robots can find such dynamic sites difficult to crawl because these parameters in the URLs can block them. The simplest search engine optimisation solution is to generate static pages from your dynamic data, store them in the file system, and link to them using simple URLs. Site visitors and robots can access these files easily. This also removes load from your back-end database, as it does not have to gather content every time someone wants to view a page.
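How the static pages are generated depends entirely on your own back end, but the idea can be as simple as a script that runs periodically, reads each record from the database and writes a plain HTML file with a clean filename that visitors and robots can fetch directly. The sketch below assumes a hypothetical SQLite table of articles; the table name, columns and output directory are made up for illustration.

```python
import sqlite3
from html import escape
from pathlib import Path

def publish_static_pages(db_path="site.db", output_dir="static"):
    """Render each article row to a static HTML file with a simple URL.

    Assumes a hypothetical table: articles(slug TEXT, title TEXT, body TEXT).
    Instead of /article.php?id=42, robots can crawl /static/some-slug.html.
    """
    out = Path(output_dir)
    out.mkdir(exist_ok=True)
    connection = sqlite3.connect(db_path)
    try:
        rows = connection.execute("SELECT slug, title, body FROM articles")
        for slug, title, body in rows:
            page = (
                "<!DOCTYPE html>\n"
                f"<html><head><title>{escape(title)}</title></head>\n"
                f"<body><h1>{escape(title)}</h1><p>{escape(body)}</p></body></html>\n"
            )
            (out / f"{slug}.html").write_text(page, encoding="utf-8")
    finally:
        connection.close()
```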
Search engine robots sometimes have problems finding pages on the web. Spidering issues can be caused by Macromedia Flash sites, which present their content visually rather than as text that search engine robots can read. Search engine robots can also have difficulty penetrating JavaScript navigation menus.
Conclusion
Search engine robots are your friends. They ensure that your site is seen by potential customers. The quality of its search results decides the fate of a search engine, and different search engines try to cater to different users. Search engine technology is evolving every day, and new research is being carried out to handle more conceptual and descriptive search queries. However, the same principle applies: the search engine that provides the most relevant results will rule.
About the author:
S Prema is Search Engine Optimisation Executive for UK-based
internet marketing company, Star Internet Ltd. Clients of Star
Internet benefit from a range of services designed to maximise
ROI from internet marketing activities. To find out more, visit
http://www.affordable-seo.co.uk