Thursday, 26 May 2011

How Search Engines Rank Pages

Every smart Search Engine Optimizer starts his or her career by looking at Web pages with the eye of a search engine spider. Once the optimizer is able to do that, the path to full mastery is half complete.

The first thing to remember is that the search engines rank "pages", not "sites". What this means is that you will not achieve a high ranking for your site by attempting to optimize your main page for ten different keyword phrases. However, different pages of your site WILL appear high in the results for different key phrases if you optimize each page for just one of them. If you can't use your keyword in the domain name, no problem – use it in the URL of some page within your site, e.g. in the file name of the page. This page will rise in relevance for the given keyword. All search engines show you URLs of specific PAGES when you search – not just the root domain names.

Second, understand that the search engines do not see the graphics and JavaScript dynamics your page uses to captivate visitors. You can use a graphic image of written text that says you sell beautiful Christmas gifts. But it does not tell the search engine that your website is related to Christmas Gifts – unless you add an ALT attribute that states the same thing in text.

Here's an example to illustrate.

What the visitor sees:

Beautiful Christmas Gifts!!!

What the search engine will read in this place:

<img src="http://training.webceo.com/images/assets/Stg2_St2_L5/0004.png" width="250" height="100" class="image" />

As you can see, there's nothing in the code that could tell the search robots that the content relates to "Christmas", "Gifts", or "Beautiful". The situation will change if we rewrite the code like this:

<img src="http://training.webceo.com/images/assets/Stg2_St2_L5/0004.png" width="250" height="100" class="image" alt="Beautiful Christmas Gifts!!!" />

As you can see, we've added the ALT attribute with a value that corresponds to what the image tells your visitors. Initially, the "alt" attribute was meant to provide alternative text for an image that for some reason could not be shown by the visitor's browser. Nowadays it has acquired one more function – to bring the same message to the search engines that the image itself brings to human Web surfers.
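
To make the difference concrete, here is a minimal sketch of what a text-only reader could extract from each of the two tags above. The helper name `imgTextForSpider` is hypothetical, and real spiders use full HTML parsers rather than a regular expression; this only illustrates the idea.

```javascript
// Hypothetical helper: returns the text a text-only crawler could take from an <img> tag.
// Assumption: the alt value is quoted; a real crawler would use a proper HTML parser.
function imgTextForSpider(imgTag) {
  // Match alt="..." or alt='...' as a standalone attribute (case-insensitive).
  const match = imgTag.match(/(?:^|\s)alt\s*=\s*(?:"([^"]*)"|'([^']*)')/i);
  if (!match) return "";
  return (match[1] !== undefined ? match[1] : match[2]).trim();
}

const src = "http://training.webceo.com/images/assets/Stg2_St2_L5/0004.png";
const withoutAlt = '<img src="' + src + '" width="250" height="100" class="image" />';
const withAlt =
  '<img src="' + src + '" width="250" height="100" class="image" alt="Beautiful Christmas Gifts!!!" />';

console.log(JSON.stringify(imgTextForSpider(withoutAlt))); // ""
console.log(JSON.stringify(imgTextForSpider(withAlt)));    // "Beautiful Christmas Gifts!!!"
```

The first tag yields an empty string, while the second yields the same words the visitor reads in the picture.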

The same concerns the usage of JavaScript. Look at these two examples:
  1. Visit our page about discounted Samsung Monitors!
  2. <script type="text/javascript">document.write("Visit our page about " + goods[Math.floor(Math.random() * goods.length)]);</script>
The first example is what visitors see; the second is the source code of the script that produces this output. Assume the search engine spider is intelligent enough to read the script (in practice, not all spiders do); is there anything in the code that can tell it about the Samsung Monitor? Hardly.
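
As a rough illustration, the sketch below models a spider that does not execute JavaScript: it discards script blocks and keeps only the plain text. The function name `textWithoutScripts` and the two sample pages are assumptions made for this example, not how any particular search engine works.

```javascript
// Hypothetical sketch of a crawler that does not execute JavaScript.
function textWithoutScripts(html) {
  return html
    .replace(/<script\b[^>]*>[\s\S]*?<\/script\s*>/gi, " ") // drop script blocks entirely
    .replace(/<[^>]+>/g, " ")                              // drop the remaining tags
    .replace(/\s+/g, " ")                                  // collapse whitespace
    .trim();
}

const staticPage = "<p>Visit our page about discounted Samsung Monitors!</p>";
const scriptedPage =
  '<p><script type="text/javascript">document.write("Visit our page about " + goods[0]);</script></p>';

console.log(textWithoutScripts(staticPage));                   // Visit our page about discounted Samsung Monitors!
console.log(JSON.stringify(textWithoutScripts(scriptedPage))); // ""
```

The static sentence survives intact, while the scripted version leaves such a spider with nothing to index.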

As a rule, search engine spiders limit how much of a page they load. For instance, the Googlebot will not read more than 100 KB of your page, even though it is instructed to check whether there are keywords at the end of your page. So if you use keywords somewhere beyond this limit, they are invisible to spiders. Therefore, you may want to acquire the good habit of not overloading the HEAD section of your page with scripts and styles. It is better to link them from external files, because inline code just pushes your important textual content further from the beginning of the page.
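
The effect of such a cap can be sketched as follows. The 100 KB figure is simply the one quoted above, and `keywordWithinLimit`, the sample pages, and the file name `site.css` are hypothetical. The snippet assumes Node.js, whose `Buffer` is used to count bytes.

```javascript
// The 100 KB figure quoted in the text, treated here as an assumption.
const LIMIT_BYTES = 100 * 1024;

// Hypothetical check: does the keyword appear before a size-capped crawler stops reading?
function keywordWithinLimit(html, keyword, limitBytes = LIMIT_BYTES) {
  // Cut the page at the byte limit, the way a size-capped crawler would.
  const head = Buffer.from(html, "utf8").subarray(0, limitBytes).toString("utf8");
  return head.toLowerCase().includes(keyword.toLowerCase());
}

// About 120 KB of inline styles placed in the HEAD section.
const inlineCss = "<style>" + "a{color:red}".repeat(10000) + "</style>";
const bloated = "<html><head>" + inlineCss + "</head><body>Christmas gifts</body></html>";
const lean =
  '<html><head><link rel="stylesheet" href="site.css"></head><body>Christmas gifts</body></html>';

console.log(keywordWithinLimit(bloated, "christmas gifts")); // false
console.log(keywordWithinLimit(lean, "christmas gifts"));    // true
```

Moving the styles to an external file brings the same body text back inside the portion that gets read.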

There are many more examples of relevancy indicators a spider considers when visiting your page, such as the proximity of important words to the beginning of the page. Here, as well, the spider does not necessarily see the same things a human visitor would see. Take, for instance, a left-hand menu pane on your Web page. People visiting your site will generally not pay attention to it first, focusing instead on the main section. The spider, however, will read your menu before passing to the main content – simply because it is closer to the beginning of the code.
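
The menu example can be sketched in a few lines. The page markup, the `id` values, and the helper `wordsInSourceOrder` are made up for illustration; the point is only that words are read in source order, not in visual order.

```javascript
// Hypothetical helper: lists the words of a page in the order they appear in the code.
function wordsInSourceOrder(html) {
  return html
    .replace(/<[^>]+>/g, " ") // drop tags, keep text
    .toLowerCase()
    .split(/[^a-z0-9]+/)      // split on anything that is not a letter or digit
    .filter(Boolean);         // remove empty strings
}

// The menu comes first in the code, even if it is displayed as a side pane.
const page =
  '<div id="menu">Home About Contact</div>' +
  '<div id="main">Christmas gifts for everyone</div>';

const words = wordsInSourceOrder(page);
console.log(words.slice(0, 4));          // [ 'home', 'about', 'contact', 'christmas' ]
console.log(words.indexOf("christmas")); // 3
```

Here the keyword is only the fourth word the spider meets, because the three menu labels precede it in the code.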

Remember: during the first visit, the spider does not yet know which words your page relates to! Keep this simple truth in mind. By reading your HTML code, the spider (which is just a computer program) must be able to guess the exact words that make up the theme of your site.

Then, the spider will compress your page and create the index associated with it. To keep things simple, you can think of this index as an enumeration of all words found on your page, with several important parameters associated with each word: its proximity, frequency, etc.
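
A toy version of such an index might look like the sketch below. It is deliberately simplified: `buildIndex` is a hypothetical name, and the only parameters recorded are how often a word occurs and where it first appears, as a stand-in for proximity to the beginning of the page.

```javascript
// Hypothetical toy index: maps each word to its frequency and first position.
function buildIndex(text) {
  const words = text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
  const index = new Map();
  words.forEach((word, position) => {
    // Create the entry on first sight, remembering where the word first appeared.
    const entry = index.get(word) || { frequency: 0, firstPosition: position };
    entry.frequency += 1;
    index.set(word, entry);
  });
  return index;
}

const index = buildIndex("Christmas gifts: beautiful Christmas gifts for the whole family");
console.log(index.get("christmas")); // { frequency: 2, firstPosition: 0 }
console.log(index.get("family"));    // { frequency: 1, firstPosition: 8 }
```

Words that occur often and early, such as "christmas" here, are the ones this toy index would treat as likely keywords.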

Certainly, no one really knows what the real indices look like, but the principles are as they have been outlined here. The words that are high in the list according to the main criteria will be considered your keywords by the spider. In reality, the parameters are quite numerous and include off-page factors as well, because the spider is able to detect the words every other page out there uses when linking to your page, and thus calculate your relevance to those terms also.

When a Web surfer queries the search engine, it pulls out all pages in its database that contain the user's query. And here the ranking begins: each page has a number of "on-page" indicators associated with it, as well as certain page-independent indicators (like PageRank). A combination of these indicators determines how well the page ranks.
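
The two steps described above, selecting the pages that contain the query and then ordering them by a combination of indicators, can be sketched like this. Everything specific in the snippet is an assumption made for illustration: the field names `onPageScore` and `linkScore`, the sample scores, and the 0.7/0.3 weights. Real search engines do not publish their formulas.

```javascript
// All weights and field names are illustrative assumptions, not real search engine parameters.
function rankPages(pages, query, onPageWeight = 0.7, offPageWeight = 0.3) {
  const term = query.toLowerCase();
  return pages
    .filter((page) => page.text.toLowerCase().includes(term)) // step 1: pages containing the query
    .map((page) => ({
      url: page.url,
      // step 2: combine on-page and page-independent indicators into one score
      score: onPageWeight * page.onPageScore + offPageWeight * page.linkScore,
    }))
    .sort((a, b) => b.score - a.score); // best score first
}

const pages = [
  { url: "/gifts.html", text: "Beautiful Christmas gifts", onPageScore: 0.9, linkScore: 0.2 },
  { url: "/index.html", text: "Christmas gifts and more", onPageScore: 0.5, linkScore: 0.8 },
  { url: "/about.html", text: "About our company", onPageScore: 0.4, linkScore: 0.9 },
];

console.log(rankPages(pages, "christmas gifts").map((p) => p.url));
// [ '/gifts.html', '/index.html' ]
```

Note that `/about.html` never enters the ranking despite its strong link score, because it does not contain the query at all.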

It's important to keep this in mind: after you have made your page attractive for visitors, ask yourself whether you have also made it readable for the search engine spiders. In the lessons that follow, we will give you detailed insight into the optimization procedure; however, try to remember the basics you've learned here, no matter how advanced you become.

Here's what you should remember from this lesson:

  1. Search engines rank pages, not sites.
  2. When a spider first visits a page on your website, it does not yet know the keywords for which your page is relevant; it does not know anything except your URL. Try to optimize your code to make it readable not only to visitors but also to spiders.
