Thursday 26 May 2011

How Search Engines Rank Pages

Every smart Search Engine Optimizer starts his or her career by learning to look at Web pages with the eye of a search engine spider. Once the optimizer is able to do that, he or she is halfway to full mastery.

The first thing to remember is that the search engines rank "pages", not "sites". What this means is that you will not achieve a high ranking for your site by attempting to optimize your main page for ten different keyword phrases. However, different pages of your site WILL move up the list for different key phrases if you optimize each page for just one of them. If you can't use your keyword in the domain name, no problem – use it in the URL of some page within your site, e.g. in the file name of the page. This page will rise in relevance for the given keyword. All search engines show you URLs of specific PAGES when you search – not just the root domain names.

Second, understand that the search engines do not see the graphics and JavaScript dynamics your page uses to captivate visitors. You can use an image of text that says you sell beautiful Christmas gifts, but it does not tell the search engine that your website is related to Christmas gifts – unless you write about them in an ALT attribute.

Here's an example to illustrate.

What the visitor sees:

Beautiful Christmas Gifts!!!

What the search engine will read in this place:

<img src="http://training.webceo.com/images/assets/Stg2_St2_L5/0004.png" width="250" height="100" class="image" />

As you see there's nothing in the code which could tell the search robots that the content relates to "Christmas", "Gifts", or "Beautiful". The situation will change if we rewrite the code like this:

<img src="http://training.webceo.com/images/assets/Stg2_St2_L5/0004.png" width="250" height="100" class="image" alt="Beautiful Christmas Gifts!!!" />

As you can see we've added the ALT attribute with the value that corresponds to what the image tells your visitors. Initially, the "alt" attribute was meant to provide alternative text for an image that for some reason could not be shown by the visitor's browser. Nowadays it has acquired one more function – to bring the same message to the search engines that the image itself brings to human Web surfers.

The same applies to the usage of JavaScript. Look at these two examples:
  1. Visit our page about discounted Samsung Monitors!
  2. <script language="JavaScript" type="text/javascript"><!-- document.write("Visit our page about " + goods[Math.round(0.5 + (3.99999 * Math.random())) - 1]); //--> </script>
The first example is what visitors see; the second is the source code that produces the output. Assume the search engine spider is intelligent enough to read the script (though in fact not all spiders do): is there anything in the code that could tell it about the Samsung monitors? Hardly.
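
A more spider-friendly variant (a hypothetical sketch – the file name is a placeholder) is to output the same message as plain HTML, so the keywords are present in the code itself:

<a href="/discounted-samsung-monitors.html">Visit our page about discounted Samsung Monitors!</a>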

As a rule, search engine spiders have a limit on how much page content they load. For instance, the Googlebot will not read more than 100 KB of your page, even though it is instructed to check whether there are keywords at the end of the page. So any keywords you place beyond this limit are invisible to spiders. Therefore, you may want to acquire the good habit of not overloading the HEAD section of your page with scripts and styles. It is better to link to them in external files; otherwise they just push your important textual content further down.
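
For instance, a lean HEAD section might look like this (the file names here are placeholders):

<head>
<title>Beautiful Christmas Gifts</title>
<link rel="stylesheet" type="text/css" href="styles.css" />
<script type="text/javascript" src="scripts.js"></script>
</head>

All the bulky style and script code now lives in styles.css and scripts.js, and the spider's reading limit is spent on your actual content.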

There are many more examples of relevancy indicators a spider considers when visiting your page, such as the proximity of important words to the beginning of the page. Here, as well, the spider does not necessarily see the same things a human visitor would see. Consider, for instance, a left-hand menu pane on your Web page. People visiting your site will generally not pay attention to it first, focusing instead on the main section. The spider, however, will read your menu before passing to the main content – simply because it is closer to the beginning of the code.

Remember: during the first visit, the spider does not yet know which words your page relates to! Keep this simple truth in mind. By reading your HTML code, the spider (which is just a computer program) must be able to guess the exact words that make up the theme of your site.

Then, the spider will compress your page and create the index associated with it. To keep things simple, you can think of this index as an enumeration of all words found on your page, with several important parameters associated with each word: their proximity, frequency, etc.
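
To make this concrete, here is a toy JavaScript sketch of how such an enumeration could be built from page text (an illustration only – the sample text is made up, and real indices are far more elaborate):

var text = "beautiful christmas gifts beautiful christmas gifts for everyone";
var index = {};
text.toLowerCase().split(/\s+/).forEach(function (word, position) {
  if (!index[word]) {
    index[word] = { frequency: 0, firstPosition: position }; // prominence data
  }
  index[word].frequency++; // frequency counter
});
// index["christmas"] now holds { frequency: 2, firstPosition: 1 }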

Certainly, no one really knows what the real indices look like, but the principles are as outlined here. The words that are high in the list according to the main criteria will be considered your keywords by the spider. In reality, the parameters are quite numerous and include off-page factors as well, because the spider is able to detect the words every other page out there uses when linking to your page, and thus calculate your relevance to those terms also.

When a Web surfer queries the search engine, it pulls out all pages in its database that contain the user's query. And here the ranking begins: each page has a number of "on-page" indicators associated with it, as well as certain page-independent indicators (like PageRank). A combination of these indicators determines how well the page ranks.

It's important to keep this in mind: after you have made your page attractive for visitors, ask yourself whether you have also made it readable for the search engine spiders. In the lessons that follow, we will provide detailed insight into the optimization procedure; however, try to keep in mind the basics you've learned here, no matter how advanced you become.

Here's what you should remember from this lesson:

  1. Search engines rank pages, not sites.
  2. When a spider first visits a page on your website, it does not yet know the keywords for which your page is relevant; it does not know anything except your URL. Try to optimize your code to make it readable not only to visitors but also to spiders.

Saturday 21 May 2011

Submitting to Directories

There are hundreds of directories on the Web that cover every possible market, offering many valuable opportunities to get your site listed in crawler-based engines, expose your site to your audience and increase the absolute value of your pages (also known as Google PageRank). The first (and, if you succeed, maybe the only) directories to get listed in are Yahoo and DMOZ.

There are other directories such as JoeAnt which can be quite useful also; however, many of them are just not worth the trouble. There's a good technique to determine if a directory can help you on your way to top rankings and traffic. When you are considering placement in a directory, check its "robots.txt" file (which we covered in the previous section about optimizing site structure) and see whether it allows the major search engines to crawl the directory. If crawlers are not allowed through, it is useless for you to get listed there.

As you remember, the robots.txt file is always in the root directory so just type the full URL of the site and add "/robots.txt" on the end to see the file.
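
For illustration, a directory that shuts all crawlers out would have a robots.txt like this (a hypothetical example):

User-agent: *
Disallow: /

whereas a crawler-friendly robots.txt would leave the Disallow value empty:

User-agent: *
Disallow: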

When submitting to the directories, remember that they are search engines powered by humans: all listings within the directories are compiled by human editors. You already know how important a listing in a directory can be, so please find the time to make some of the recommended preparations for directory submission. This includes writing a description of your entire website in 25 words or fewer that uses the primary keyphrase you've optimized your home page for. Review your description and make sure it doesn't overuse marketing language or hype.

If you are going to submit your site to several directories, or even a dozen, it is recommended that you prepare and save this information in advance. This will speed up the submission process. Keep the following data ready to copy and paste:
  • Email address
  • Website URL
  • A title for the website
  • A description for the website
When it comes to the title, use your company name or the official name of your website, as directories apply strict criteria here. Yahoo!, for example, will allow only these names.

As for the description, Jill Whalen suggests the following: “The website description posted with your URL is a big factor in how your site will rank once it's listed in the directory. It is very important to do this right the first time. If you put too much promotional jargon in your description or make it too long, for example, the editors are sure to change it. When they do, you can bet your keywords won't appear in the final listing.

If you've created a good meta description tag for your site, start with that. Copy and paste it into the submission form, then start deleting extraneous words. Move words around until you have the shortest yet most descriptive sentence possible. If you do this correctly, chances are the editors won't change it.

Be sure the words you're using in your description appear on the pages of your website. If they don't, and the site appears to be about subjects other than what you described in your form, your description might be edited.”

For deeper insight visit - http://www.webproguide.com/articles/How-to-Submit-Your-Site-to-Directories-such-as-Yahoo-DMOZ-and-Zeal/index.php?phrase_id=2796

Submitting to Yahoo

A listing in the Yahoo! Directory makes it easier to get indexed and rank higher in the crawler-based results of Yahoo! as well as Google and Live Search.

Yahoo! Directory offers two types of submission: "Standard" which is free, and "Yahoo Directory Submit", which involves a submission fee and annual fee of $299 ($600 if your site is adult-related). This guarantees your site will be reviewed by an editor within 7 days. It does not, however, guarantee inclusion and the fee is non-refundable.

To choose the submission option, go to http://dir.yahoo.com, select the appropriate category and then click the "Suggest a Site" link in the top right section.

Anyone can use Standard submission to submit for free to a non-commercial category. You'll know a category is non-commercial because, when you try to submit to it, the Standard submission option is offered in addition to the paid Yahoo! Directory Submit option discussed above.

If you choose the free submit, there's no guarantee that your submission will be reviewed quickly or reviewed at all.

You can have a commercial site and still try to submit for free to a non-commercial category; however use caution when submitting. Let's say you sell weather forecast software. If you submit your site as such, chances are good it will not be accepted; but, if you highlight a page that tells interesting facts about weather and weather forecasting, this information can be considered a good reason to list your site in a non-commercial category.

If accepted into a commercial category for money, you'll be reevaluated after a year and charged the submission fee again if you want to stay in Yahoo's commercial area. Review the traffic you've received from Yahoo over the past year to decide whether it is worth paying the fee again. If not, decline to be listed again and you will not be charged. Most often, you will decide to drop your listing after a year, because the category itself does not bring much traffic. Remember that the directory listing is initially important to us as a doorway to search engine listings. Once that is done, we may safely let the directory listing drop, usually without a significant impact on the search traffic the site receives. The crawler-based engines will keep revisiting and listing your site on their regular schedule.

Before submission be aware of the Terms and Conditions. Here are the most important:
  • I have verified that my site does not already appear in the Yahoo! Directory and I understand that this is not the place to request a change for an existing site.
  • My site supports multiple browsers and capabilities. (For example, Java-only sites will not be listed.)
  • I understand that if my site is added, it will be treated as any other site in Yahoo! and will receive no special consideration.
The full list of requirements can be found here https://ecom.yahoo.com/dir/reference/instructions

It is crucial to choose the category properly. Pick not the category you would prefer your site to be seen in, but the one it really matches. If you are not certain which category to choose, try this: query the directory for your site's most relevant keyword and observe the categories shown on the results page.

The next step is to choose a subcategory, because if you submit to a top-level category while disregarding an appropriate subcategory, your submission becomes questionable. Additionally, don't forget about geographic regions: if your site is local by nature, this should be taken into consideration during the submission process.

Remember: the more carefully you prepare – researching keywords, debugging pages, writing valuable content and compiling your description – the better your chances of inclusion, regardless of whether you submit through Yahoo! Directory Submit or the free Standard service.

Submitting to DMOZ

DMOZ / ODP is a catalog of the World Wide Web compiled by volunteers. DMOZ used to be a starting point for Google's crawler; nowadays it remains a good source of reliable inbound links for sites.

Submission to DMOZ is free; on the other hand, there's no guaranteed turnaround time for acceptance.

To suggest your site to DMOZ, go to http://www.dmoz.org and locate the category you want to be listed in. Then use the "Suggest URL" link that is visible on the top of the category page. Fill out the form, and the submission process is complete.

If accepted, you should see your site appear within approximately three to six weeks. If this doesn't happen, don't try to resubmit. Instead, try to get your site listed in several regional or thematic categories.

As with Yahoo, it's highly recommended that you take the time to learn more about the Open Directory before submitting, in order to maximize the amount of traffic you may receive.


What you should remember from this lesson:

  1. Submitting to directories provides a powerful boost to your rankings and search engine visibility, but to be successful, it requires thorough preparation.
  2. If your submission isn't listed after three to six weeks, the correct technique on Yahoo! is to resubmit. With DMOZ, by contrast, just try submitting to another category or categories.

Verifying Submission Success

When manually submitting to a search engine, it's clear whether your submission has been accepted: most commonly, you will be shown a message confirming that your page has been queued for crawling or an error message explaining why it hasn't.

Auto-submission software for the crawler-based engines, such as Web CEO, will sometimes show you the submission result in a report (something like 'OK' or 'Failed') as well as the actual response pages returned by the search engine.

Directories like Yahoo!, when you use paid submission, send a message from the editor explaining why your submission has been accepted or rejected.

Dynamic search marketing, however, requires staying updated not only on whether your submission has been queued: it is also extremely important to find out at once when your site has been crawled and indexed, and has thus found its place in the search engine index. Also, if the site does not appear in the index after a period of time, something must be done. In any case, you need to know when your site has been indexed.

Here are some techniques for verifying whether your submission has been successful.

First, use the site:URL syntax to check whether your site has been indexed by the given SE. Mind, however, that some SEs do not support this syntax, so for those you will have to use another method to check indexing.
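For example (with a placeholder domain), querying Google for site:www.yourdomain.com should list the pages of your site it has indexed; an empty result list means the site is not in the index yet.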
Another checking method is to include a special unique word or combination of words into the page you are going to submit. The idea is that your page will be matched against this word by the search engine when queried for this term.

For this purpose we may use a randomly generated alphanumeric string like "249ej38eh234ieb32i40ly5u05" or a real word combination which is unlikely to be found on any other page on the Web, e.g. "International red widgets online open the ranking contest". Include it somewhere on your page where a search engine spider can read it. Don't worry; as soon as you determine you're in, you can remove it from your page.

When you are included in the index, your page will be shown in the result list for this query and – as it's a unique search term not used by anyone else – your page will appear on the top of the list.

Simply check with the search engine regularly by querying it for this term and looking for your page in the first results. If your page isn't there, you aren't included in the index yet.

You can make your life easier by using a combination of Web CEO Ranking Checker and Scheduler to automatically and periodically perform this check for you. Once it detects you are found among the results, it means your page has been included into the index and you may celebrate your first SEO victory.

What you should remember from this lesson:

  1. Submission verification is important because it tells you when the first step is complete and allows you to move on to the next step in the process.
  2. For verification, include a string in your page that can be uniquely matched against your page by the search engine. Then regularly check the search results for this term.

What to Do if Your Site Has Been Penalized

We are pleased to congratulate you on a great job done studying White, Black, and Gray-Hat SEO. This final lesson is devoted to the signs of search engine penalties and the principles that differentiate them.

As evidence shows, websites can suffer penalties for numerous reasons. Search engines hate spammers violating or manipulating their rules, and they continuously work to strengthen their algorithms and sieve out spamming sites and pages with ever greater sophistication.

The initiative taken against spammers has resulted in scalable and intelligent methodologies for fighting manipulation. Today, search engines do their best to control and remove spam with hundreds of the world's best online engineers engaged in dealing with spammers.

We can recall the widely known methods Google, a leading spam fighter, applies. Their names are still buzzwords: the Google Sandbox, the supplemental index, penalties for SE rule violations, regular updates called "Google dances"… Google checks many factors and monitors each domain and website within its reach.

Widespread mistakes occur when your links directory expands and starts looking like a link farm (even if this was not your intention). That's why we have warned you to be very careful with link exchange programs and with whom you choose to exchange links. Too many sites have been banned because of the enormous number of links they displayed or for having paid or manipulative links.

Moreover, sometimes it's hard to tell whether your site or page actually has a penalty, or whether something has changed – either in the search engines' algorithms or on your site – that negatively impacted your rankings or inclusion.

This gap still exists in the Google and Yahoo! search engines, though Bing/Live Search/MSN has already added a special option for checking whether penalties have been applied.

Checking if your site has been banned
To check for penalties and recover from the penalty box, Bing/Live Search/MSN offers its Webmaster Center tools (http://www.bing.com/webmaster/). As usual, you first have to add a website and troubleshoot the crawling and indexing processes.
Google and Yahoo! users are still deprived of this option. Yahoo! offers a special contact form for you to send feedback regarding the status of your site in their Search Index. If your site disappeared from the Search Results, you can use this form - Yahoo Search Support.
Very often, what is initially perceived as a mistaken spam penalty with these engines is, in fact, related to accessibility issues.
That's why we suggest the following two-step checking process.
Step 1
  • Poor website availability
    Look deep into your website monitoring reports (e.g. Web CEO's Monitoring tool) and check your website availability over time. If your site was down for a period, chances are the crawler failed to access it several times and decided those pages no longer exist.
  • Changes to the site content
    Consider whether you have made any changes to the site content recently – maybe you have applied keyword stuffing (such changes could have triggered spam filters).
  • Participation in affiliate schemes
    If you are an affiliate or manage your affiliate program – make sure your site's content is unique and no one scrapes content from you. If you have stronger affiliates who use the same content but rank better, your site could be penalized for duplicate content.
  • Changes in the site structure
    Have you changed your site structure; perhaps added more outgoing links, changed the internal link structure, removed or added a new website section, removed pages, played around with the redirects, etc? Those changes might not be SE-friendly.
  • The quality and quantity of your backlinks
    Carefully check how the value and number of your backlinks have changed over time. E.g. look to see if most of your backlinks are off-topic, if some linking sites were banned or devalued because of Gray or Black-Hat SEO, or if there was an overnight boost in your link popularity, which may be a sign of aggressive link-building techniques.
  • Temporary changes in the SE database or ranking algorithms
    First, check sites that share similar backlink profiles and see whether they've also lost rankings. Then wait for a couple of hours and come back to the SE to check your rankings. Maybe you were just witnessing another Google update, or so-called "dance." When the engines update their ranking algorithms, links change in value, site importance changes, and you may suffer ranking movements. (Read the full explanation of the Google dance notion in our "Crawler-Based Search Engines" lesson.) Refer to the SE news to check if there have been any changes to the SE's ranking algos.
If any of these suggestions are true for your site – go to Step 2 to proceed with your investigation.

Penalties are very disappointing to get. All you'll see is a sudden drop of your site; furthermore, the search engine won't send you an email telling you to stop doing something wrong. As a rule, search engines just listen to a competitor who ratted on you. They take the competitor's word for it, without hearing any kind of defense from you.

Step 2

This step will cover the following factors:

Check the site's presence in the search engine index (use the site:url.com query) and see if it ranks for the domain name. Then search for your website's unique brand name and for five or six relatively unique terms from the title tags of your pages. Positive answers to these checks show your presence in the search engine's index and a possible loss of ranking.

Negative answers in this second check, on the other hand, mean a penalty has probably been applied.

Note: if only your home page is present in the index, your site was probably banned, while partial presence of your pages indicates a penalty. Once you've established this, clear your pages of spam and remove all potentially harmful links.

Cleaning penalized Web pages

Our new task is to check, or recall, what you did wrong while optimizing the pages. The search engines may help you with this job: the top engines have created tools to make the site owner's life easier, and they will inform you about wrong steps taken on your website.
First, register your site with the engine's Webmaster Tools service and brush your site up properly. Remove and fix everything you can with the help of the Webmaster Tools warnings and alerts.
The top search engines' Webmaster Tools services are:
Asking for Reconsideration
Now we are going to submit our cleaned pages for reconsideration. Feel free to use the following links to re-include the clean pages:
It is recommended that you include a clear explanation of what you have done to fix all previous conflicts. State the reasons for your mistakes: perhaps your site was hacked, or you used a problematic technique unintentionally, and so on.
To summarize: all search engines have driven down the value of search spam and made so-called "White-Hat" tactics far more attractive. The re-inclusion/reconsideration process can take months; moreover, be ready for rejection as well.

What you should remember:

  1. Website accessibility difficulties are often perceived as a penalty for spam. Check your pages for errors that can prevent the crawling process.
  2. If your site was banned or penalized, clean the pages of spam by using SE special tools and then use a standard re-consideration or re-inclusion request.

Google PageRank, Local Rank and Hilltop Algorithms

When estimating websites, crawler-based search engines usually consider many factors they can find on your pages and about your pages. Most important for Google are PageRank and links. Let's look closer at the algorithms applied by Google for ranking Web pages.

Google PageRank

Google PageRank (further referred to as PR) is a system for ranking Web pages used by the Google search engine. It was developed by Google founders Larry Page and Sergey Brin while they were students at Stanford University. PageRank ("PageRank" written together is a trademark that belongs to Google) is the heart of Google's algorithm and makes it the most complex of all the search engines.

PageRank uses the Internet's link structure as an indication of each Web page's relevancy value. Sites considered high quality by Google receive a higher Page Rank and – as a consequence – a higher ranking in Google results (the interdependence between PageRank and site rankings in the search results is discussed later in this lesson). Further, since Google is currently the world's most popular search engine, the ranking a site receives in its search results has a significant impact on the volume of visitor traffic for that site.

You can view an approximation of the PageRank value currently assigned to each of your pages by Google if you download and install Google's toolbar for Microsoft Internet Explorer (alternatives also exist for other popular browsers). The Google toolbar will display the PageRank based on a 0 to 10 scale, however a page's true PageRank has many contributing factors and is known only to Google.

For each of your pages PageRank may be different, and the PageRanks of all the pages of your site participate in the calculation of PageRank for your domain.

For each of your pages, the PR value is almost completely dependent upon links pointing to your site, reduced, to some degree, by the total number of links to other sites on the linking page. Thus, a link to your site will have the highest amount of impact on your PR if the page linking to yours has a high PR itself and the total number of links on that page is low, ideally, just the one link to your site.

The actual formula (well, an approximate one, according to Google's official papers) for PR is as follows:

PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))

where pages T1...Tn all point to page A. The parameter d is a damping factor which can be set between 0 and 1. Google usually sets d to 0.85. C(T) is defined as the number of links going out of page T.
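
For illustration, with made-up numbers: a link from a page T with PR(T) = 6 that carries C(T) = 30 outgoing links contributes d × PR(T)/C(T) = 0.85 × 6/30 = 0.17 to your page's PR, while the same link from a page with only 3 outgoing links would contribute 0.85 × 6/3 = 1.7, ten times as much.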

Thus, a site with a high PR but a large number of outbound links can nullify its own impact on your PR. To increase your PageRank, get as many links to your site as you can from pages with a high PR and a low number of total links. Alternatively, obtain as many links pointing to your site as you can, no matter what their PageRank is, as long as they have some rank at all. Which variant gets the best out of the PR formula depends on the specific case.

Those of you interested in the mathematical aspect will see that the formula is cyclic: the PR of each page depends on the PR of the pages pointing to it, but we won't know their PR until the PR of the pages pointing to them has been calculated, and so on. Google resolves this by implementing an iterative algorithm which starts without knowing the real PR of any page and assumes it to be 1. The algorithm then runs as many times as needed, and with each run it gets closer to the final value.

Each time the calculation runs, the value of PageRank for each page participating in the calculation changes. When these changes become insignificant or stop after a certain number of iterations, the algorithm assumes it now has the final Page Rank values for each page.
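
For those who like to experiment, here is a minimal JavaScript sketch of this iterative calculation over a made-up three-page link graph (an illustration of the published formula only, not Google's actual implementation):

// page -> list of pages it links to (a hypothetical graph)
var links = { A: ["B", "C"], B: ["A"], C: ["A"] };
var d = 0.85;                  // damping factor from the formula above
var pr = { A: 1, B: 1, C: 1 }; // start by assuming PR = 1 everywhere
for (var i = 0; i < 20; i++) { // iterate until the values settle
  var next = {};
  for (var page in pr) {
    var sum = 0;
    for (var src in links) {
      if (links[src].indexOf(page) !== -1) {
        sum += pr[src] / links[src].length; // PR(T)/C(T)
      }
    }
    next[page] = (1 - d) + d * sum;
  }
  pr = next;
}
// pr converges to roughly { A: 1.46, B: 0.77, C: 0.77 }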

Real PageRanks range from 0.15 (for pages that have no inbound links at all) up to a very large number. The actual values change every time Google re-indexes and adds new pages to its database. Most experts agree that toolbar PR maps to real PR on a logarithmic scale. Here's what that means if we assume the base of the logarithm is, for instance, 10:

Toolbar PageRank (log base 10)    Real PageRank
0                                 0.15 – 10
1                                 100 – 1,000
2                                 1,000 – 10,000
3                                 10,000 – 100,000
4                                 100,000 – 1,000,000
5                                 1,000,000 – 10,000,000
6                                 10,000,000 – 100,000,000
7                                 100,000,000 – 1,000,000,000
8                                 1,000,000,000 – 10,000,000,000
9                                 10,000,000,000 – 100,000,000,000
10                                100,000,000,000 – 1,000,000,000,000
Although there is no evidence that the logarithm is based on 10, the main point is that it becomes harder and harder to move up the toolbar, because the gaps to overcome become larger and larger with each step. This means that for new websites, "toolbar" PR values between 1 and 3 may be relatively easy to acquire, but getting to 4 requires considerably more effort and then pushing up to 5 is even harder still.

As you may have figured out from the formula above, every page has at least a PR of 0.15 even if it doesn’t have any inbound links pointing to it. But this may only be in theory – there are rumors that Google applies a post-spidering phase whereby any pages that have no incoming links at all are completely deleted from the index.

Local Rank

Local Rank is an algorithm similar to PR, written by Krishna Bharat of the Hilltop project. Google applied for a patent on it in 2001 and received it in early 2003. In short, this algorithm re-ranks the results returned for a user's query by looking at the inter-connectivity between the results: after a search is done, the PR algorithm is run among the result pages only, and the pages that have the most links from other pages in that set rank highest.

Essentially, it's a way of making sure that links are relevant and ranking sites accordingly. Please note that this algorithm does not count links from your own site – or, to be more exact, links from the same IP address.

Assuming that it is used by Google, make sure that you first get links pointing to you from other pages that rank well (or rank at all) for the keyword that you are targeting. Directories such as Yahoo! and DMOZ would be a good place to start – they tend to rank well for a wide range of keywords. Also, keep in mind that this is about pages, not sites. The links need to be from the pages that rank well – not other pages on sites that rank well.

Hilltop

Hilltop is a patented algorithm created in 1999 by Krishna Bharat and George A. Mihaila of the University of Toronto. The algorithm is used to find documents relevant to a particular keyword topic. Hilltop operates on a special index of "expert documents".

Basically, it looks at the relationship between "Expert" and "Authority" pages. An "Expert" is a page that links to lots of other relevant documents. An "Authority" is a page that has links pointing to it from the "Expert" pages. Here, this means pages that are about a specific topic and have links to many non-affiliated pages on that topic. Pages are defined as non-affiliated if they are authored by people from non-affiliated organizations. So, if your website has backlinks from many of the best expert pages, it will be an "Authority".

In theory, Google finds "Expert" pages and then the pages that they link to would rank well. Pages on sites like Yahoo!, DMOZ, college sites and library sites can be considered experts.

Google acquired the algorithm in February 2003.

Site Structure and PageRank

PageRank can be transmitted from page to page via links across different pages of your site as well as across all the sites in the Web. Knowing this, it’s possible to organize your link system in such a way that your content-rich pages receive and retain the highest PageRank.

The pages of your site receive PageRank from outside through inbound links. If you've got many inbound links to different pages of your site, it means PageRank enters your site at many points.

Such "PageRank entry points" can pass PageRank further on to other pages of your site.

The idea to keep in mind is that the amount of PageRank a page of your site can give to another page depends on how many links the first (linking) page itself contains. The page has only a certain amount of PageRank, which is distributed over all the pages it links to.

The best way to obtain a good PR on all of your pages is to have a well thought-out linking structure for your site.

What this means is that every page on your site should have multiple links from your other pages coming into it. Since PR is passed on from page to page - the higher the PR that a page has, the more it has to pass on. Pages with a low number of links on them will pass relatively more PR per link. However, on your own site, you want all of your pages to benefit - usually. Also, PR is passed back and forth between all of your pages - this means that your home page gets an additional boost because, generally, every page on your site links to your home page.

Let's look at the prototypes of site linking schemes that may be beneficial in terms of PR distribution.

1. Simple hierarchy.

[Diagram: simple hierarchy]

The boxes denote separate pages, and the figures in them denote the PR value calculated with a simple algorithm that takes only these pages into consideration. With a site structure like this, it's pretty easy to get a high PR for your home page; however, this is an ideal situation which is difficult to recreate in real life: you will want more cross-linking than just links from all your pages to the home page.

2. Linking to external pages that return backlinks

[Diagram: linking to external pages that return backlinks]

This simply means creating a link directory page on your site and benefiting a bit from link exchanges with external pages. Link exchanges are dealt with in the next lesson.

3. Site with inbound and outbound links

[Diagram: site with inbound and outbound links]

This is very similar to the first scheme; however, here there is an external site (Site A) passing its PR to your home page, which then distributes it to the child pages. You can see that both the home page's PR and that of the child pages have significantly increased. No matter how many pages your site has, your average PR will always be 1.0 at best; but a hierarchical layout can strongly concentrate votes, and therefore PR, into the home page.

So here are some main conclusions you should keep in mind when optimizing the link structure of your site for better PR distribution.
  • If a particular page is very important – use a hierarchical structure with the important page at the "top".
  • When a group of pages may contain outward links – increase the number of internal links to retain as much PR as possible.
  • When a group of pages do not contain outward links – the number of internal links in the site has no effect on the site's average PR. You might as well use a link structure that gives the user the best navigational experience.

How your PageRank influences your rankings

While the exact algorithm of each search engine is a closely guarded secret, search engine analysts believe that search engine results (rankings) are some form of multiplication of page relevance (which is determined from your many "on-page" and "off-page" factors) and PageRank. Simply put, the formula would look something like this:

Ranking = [Page Relevance] * [PageRank]

The PR logic makes sense, since the algorithm is relatively resistant to spammers. The search results of Google have demonstrated high relevance, and this is one of the main reasons for its resounding success. Most other major search engines have adopted this logic in their own algorithms in some form or other, varying the importance they assign to this value when ranking sites in their search engine result pages.

What you should remember from this lesson:

  1. PageRank was developed by Google to estimate the absolute (keyword-independent) importance of every page in its index. When Google pulls out the results in response to a Web surfer's query, it does something similar to multiplying the relevance of each page by the PR value. So, PageRank is really worth fighting for.
  2. PageRank depends on how many pages out there link to yours (the more, the better) and how many other links these pages contain (the less, the better).
  3. You may try to optimize the link structure of your site for better PageRank distribution. Most simply, you should create a site map, get many cross-links between your pages and organize a hierarchical link structure with the most important pages at the top.

Saturday 7 May 2011

HTML Elements (Page Areas) That Matter

Since spiders see your page as HTML code instead of what is directly visible through a browser, optimizers must gain a solid understanding of the structure of a typical HTML document.
This lesson will guide you through some HTML basics and then show which elements are critical for optimization and why.
First, we recommend that your HTML documents comply with the XHTML standard. XHTML is the strictest standard of HTML (hypertext markup language). By following this standard you ensure that your pages are easily readable for search engine spiders. You can learn more about XHTML at the official resource of the World Wide Web consortium:
http://www.w3.org/
Every HTML document that complies with the standards has two principal sections: the HEAD area and the BODY area. To illustrate this we can open the source code of any HTML page found on the Web. Open it in your browser, right-click on the page and select "view page source" or "view source".

The HEAD section is everything you see between the <head> and </head> tags. The content of this section is invisible when viewing the page in a browser. As you can see, one of the elements it includes is the title tag (between the <title> and </title> markup). This is what is shown in the title bar of the browser when the page is displayed, and it will also represent your page in the search engine results. As such, the title tag is a very important element.
The head section also includes various META tags. In the w3.org example we see the META keywords tag and the META description tag:
<meta name="keywords" content="…">
<meta name="description" content="…">
After the <head> tag is closed, the <body> tag opens. Everything that's within the body tag (i.e. between the <body> and </body> markup) is visible on the page when viewed in the browser.
In the body text of the w3.org example, we see the <h1> and <h2> tags. These are called HTML headings and range from the 1st (h1) to the 6th (h6) level; initially, they were meant to mark logical styles for different levels of heading importance: "h1" being the most important heading and "h6" the least important. Usually browsers display the headings from largest to smallest: the <h1> tag gets the largest font size, on down to <h6>, which gets the smallest. The search engines treat the heading tags the same way.
The link (anchor) tag is another important body element; it is delimited by the <a> and </a> markup.
The image tag <img> is responsible for displaying an appropriate image whenever a browser sees it in the source code.
Schematically, an HTML document in an optimizer's eyes (as well as in the search engine's eyes) looks like this:
<head>
<title>My title goes here</title>
<meta name="keywords" content="keyword 1, keyword 2, keyword 3">
<meta name="description" content="site description">
</head>
<body>
<h1>This is the first level heading which is important to search engines</h1>
<h2>this is a kind of subheading which is also important</h2>
This is a simple text in the body of the page. This content must include a minimum of 100 words, with keyword density around 3% to 7%, maximum keyword prominence towards the beginning, middle and end of the page, and maximum keyword proximity.
<b>This text will show in bold</b>
<a href="http://www.somesite.com" title="some widget site">Link to some widget site</a>
<img src="http://mysite.com/image.jpg" alt="and this is my image" />
</body>

Let's go through all the HTML elements and get some in-depth insight into how we can optimize each of them.

The title tag

The title tag of your Web page is probably the most important HTML tag. Search engines consider it when estimating your page's relevancy to certain keywords, and it is your title that finally shows up in the SERPs (search engine result pages), where a lot depends on how attractive it is to Web surfers and whether it compels them to click on your link.
All search engines consider the keywords in this tag and generally give those keywords a great deal of importance in their ranking system. It is as important as your visible text copy and the links pointing to your page.
Always use your primary keywords in the title tag at least once. Try to place them at the start of the tag, i.e. make their prominence 100%. The minimum keyword prominence for the title is 60%.
Don't use several title tags on one page. Make sure the title is the first tag in the HEAD section and that there are no tags before it. Avoid listing the same keyword multiple times in the tag; some engines may penalize this. Avoid using the same title throughout your site: use a unique title tag for each Web page, with key phrases thematically relevant to that page. You can use variant forms of a keyword when possible or applicable.
For instance, if you use "Designer" in your Title tag, a search on "design" will give you a match on most engines. However, words like "companies" will not always yield a match on "company" since "company" is not an exact substring of "companies".
Longer titles are generally better than shorter ones. However, the recommended word count for a title is only 5 to 9 words, with a length of up to 80 characters. Make your title interesting and appealing to searchers to convince them to click on it.
Moreover, you can put your company's name in the title tag, even at the very beginning of the tag. If your company is a well-known brand, it's essential for you to do this; if not, it's an excellent opportunity to promote the brand. What is more important is that you shouldn't stop with just your company name: definitely add one or two descriptive phrases to the tag. Those who already know your company will query for it specifically in the engines, and those who don't will find you while seeking the products or services you sell, based on the descriptive phrases.
One more point to remember is that you should be very specific if you are working in a certain area. Your keywords should reflect the geographical region where you are primarily seeking clients. For example, if a customer looks for some goods (let’s say slippers) first they will begin with typing simply “slippers” and after the engine returns an enormous list from all over the world, the customer will narrow the query by adding some geographical names (e.g. Utah slippers). That’s your chance to be in the Top 10 of the new results for that area. 
While creating the title you can use the following approaches:
<Title>My Company Inc. Utah Slippers</Title>
<Title>My Company Inc. – Utah Footwear</Title>
<Title>My Company Inc. – Utah Slippers – Footwear in Utah</Title>
In the last example the geographical name is used twice in different variations; it is crucial not to put the same words right next to each other, as that might be considered spam by SEs. Don't use ALL CAPS: SEs are not case-sensitive now, so it won't help; instead, it will look rather crude. Initial capitals are well suited for the title tag.
The title should reveal the main idea of the visible text and thus reflect your business in the best possible way.
Here are 10 tips for title tags given by John Alexander, a prominent search engine expert:
  1. "When working with your keyword phrase, get it positioned up front so that as you build a sentence it still reads well.
  2. Try working with your one important keyword phrase up front and another secondary phrase to the rear of the title.
  3. Try writing your title to make a thought provoking statement.
  4. Try writing your title so that it asks a "thought provoking" question.
  5. Try writing a title so that it reveals a hidden truth or misconception.
  6. While in creative mode, keep your mind on what it is that your target audience really wants.
  7. Build honest titles that are related to your site content.
  8. Do NOT resort to keyword stuffing or stacking the title full of multiple phrases that do not convey an intelligent message.
  9. Do not include a lot of capitals or special characters in your title.
  10. Do not get hung up on title length. The easiest rule is to simply keep your title under 59 characters (for Lycos sake) and honestly, you can build really excellent titles in this space."

The META tags

There are two META tags that still appear to be of use to the search spiders: META keywords and META description. These tags are very unlikely to impact rankings, but they can play a weighty role in the site's click-through rate from the SERPs, so it's worth optimizing them for your keywords as well.
If you use any other META tags, place them after these two.
The META Keywords
Syntax:
<meta name="keywords" content="keyword 1, keyword 2, keyword 3, …" />
Its initial purpose was to give search engine robots an idea of what the page is about to help with rankings. Unfortunately, as soon as this became evident, so many spammers started abusing it that spiders now have discounted the importance of this tag by at least half of its original ranking value. Most experts say this tag has no weight from the SEO perspective and does not influence your rankings.
If you still want to exploit this tag, use your main keyword phrase, a secondary keyword phrase, and a few synonyms of your keyword phrase in your keyword META tag. Make sure to focus the words in your keyword tag on that one page only, not on every keyword that could possibly be associated with your entire website.
Remember if you use many of the same words in your different keyword phrases, it could look as if you're spamming the engine, so be careful.
The META Description
Syntax:
<meta name="description" content="a short description of your site" />
The contents of the META description tag are what most search engines and directories will show under your title in the search result list. If you have not provided a META description tag for your Web page, the search engines will try to make one for you, often using the first few words of your page or a text selection where the keyword phrases searched for by the user appear. If the search engine makes up a description by picking text from your page, the generated description may not do your Web page justice.
The Meta description tag needs to be kept brief yet informative. A description of about 25-30 words should be fine. Keywords and key phrases should be included in the Meta description tag, though care should be taken not to repeat them too often. Like the title tag, the META description tag should be customized for each page depending on the content theme and target keywords of this page.
Remember that even though Google doesn't consider the META description tag when determining relevancy, it often uses the contents of this tag as the snippet description of your page in the search results. So, make your description captivating and designed to attract traffic. The META description tag should be appealing to users, tempting them to click on the link and visit your site. A good META description tag can help you increase the click-through rate of your page, which in turn increases the traffic you can get from any ranking position.
Below is a nice example of an informative description optimized for "weather forecast software":
"The only weather forecast software that brings long-range weather forecasts, daily horoscopes, biorhythm calculator, Web cams, and weather maps to your desktop."

The body text

The main textual content that is visible to your visitors is placed within the body tag. It still matters for some search engines when it comes to your page analysis and ranking.
Remember the importance of keyword prominence and place your keyword phrase early in the body text of the page. This may also become a means to communicate your message to prospects; some search engines retrieve the first few lines of your Web page and use them as the description of your site in the search results. So, put a number of important keywords in the first few lines in the visible part of your body text. Try to tailor the text in the beginning so that it can be used as a description of your site.
Spread your keyword phrases throughout the body of the page in natural-sounding paragraphs; try to keep the separate words of your key phrases close together for proximity's sake. Put a secondary key phrase in the middle and at the end of your body text. Have some of your keywords in bold (for this purpose, it's better to use the "<b>" tag instead of styles or logical <strong> formatting; you can still apply the necessary styles to this text with the following trick: <b style="font-weight:bold">).
Remember, the content minimum for a page is 125 words, but it's better to go well beyond this limit.
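As a quick, made-up illustration of the density arithmetic: if a 200-word page uses a two-word key phrase 6 times, the phrase accounts for 12 of the 200 words, i.e. 12 / 200 = 6% density, within the 3% to 7% range suggested earlier (counting methods vary between tools).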

HTML headings h1 – h6

The headings themselves are a good means of visually organizing your content. Besides that, search engines consider the headings and sub-headings of your page (especially those in bold) to be important. Take advantage of this by using H1, H2 and H3 tags instead of graphical headings, especially towards the top of your page.
Use heading tags to break up your page copy and make it easier for your visitors to read and absorb. Include your most important search keywords and phrases within the heading text; search engines will give them more relevancy weight there. A typical arrangement looks like this:
Page Heading incorporating most important keyword phrase 
Sub-Heading 1 incorporating most important keyword phrase
Paragraph of text incorporating other target keyword phrases 
Sub-Heading 2 incorporating next most important keyword phrase
Paragraph of text incorporating other target keyword phrases
And so on…

The problem with headings is that each browser has its own way of displaying them, which may not match your design ideas. You can apply the following workaround with the help of the style attribute:
<h1 style="font-size:10px;color:#00FFFF;font-weight:bold">This is the formatted heading</h1>
Or with the class attribute, provided the class is defined somewhere in a style sheet.
The search engine will see a first-level heading here, but the browser will show human visitors the text formatted as you need instead of the standard level-one heading.
What should be avoided is repeating the first-level tag more than once. In other words, you shouldn't have more than one <h1> tag on your page; a single one indicates that your main topic is centered on one definite concept.
As for the tags of other levels, it is up to you to use multiple <h2>, <h3> tags etc. on a page in order to structure information in a proper way. Just do your best not to overuse them, and keep to the quality content guidelines.

Link text

Keyword usage is important in the visible text (also called anchor text) of links pointing outside your domain (also called outbound links), as well as of links to your internal pages. When you give your users a reference to other documents related to your theme, the words you use to refer to those documents are considered descriptive of your page's profile.
A usual link would look like this:
Click here for <a href="http://www.somesite.com/keyword-phrase.html" title="this text will appear when user mouse-overs the link">Visible link text</a>
Note: When you link to your own pages, rename these pages so that the URLs contain keywords separated with a hyphen – instead of running the keywords together. By breaking the words up in some way, you let the engines see them as individual words in a phrase. If the words are not broken up, the spiders will see the words as a single term.
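For example (a hypothetical file name): a page about weather forecast software is better named weather-forecast-software.html than weatherforecastsoftware.html, so the engines see three separate words instead of one unknown term.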
Don't flood your links with keywords; usually it's enough to have up to three links per page containing your targeted terms, ideally the first three links.

ALT attributes of images

ALT text is alternative wording for an image: it is displayed by browsers that can't display images or did not download the image for some reason, and it is spoken by talking browsers for the blind. Search engines use the text in the ALT attribute as a substitute for the anchor text if the image is a hyperlink. This makes these attributes ideal for optimization.
Example:
<img src="images/logo.gif" alt="Graphic of a weather forecast software" width="415" height="100" />
As a rule, if you insert your keyword phrase in your ALT text (as long as you are also describing the graphic), you'll have a boost in relevancy with many of the engines. Google often picks up the first ALT text on the page and uses it as the description in the search results, so pay special attention to the ALT text in your first graphic.
To avoid spamming, never remove the actual graphic description from the ALT attributes when you're populating them with your key phrase, and do not plant your key phrase into more than the first three ALT image attributes on the page and then perhaps the last one as well.

What you should remember for this lesson:

  1. The areas of a standard HTML document which matter most for the search engine spiders are: the TITLE tag, the HTML headings, the link text and the ALT attributes of images.
  2. With most of these, observe four parameters: prominence, density, proximity and frequency when populating them with your keywords. This is most important with the body text. To improve keyword significance in the body text, use your keywords in bold once or twice.
  3. While working with your keywords, keep away from any kind of keyword stuffing. After finishing, ensure that the copy still reads naturally.

Country-Specific Domain Names and SEO

A client asked a question today: “Should I use a Country-Specific Domain Name (TLD) If I Am Targeting [That] Market?”
The short answer is Yes.
There are several benefits from doing this, and having a site aimed at each country you are targeting. But there are also some “traps” to watch out for if you do plan to do this.
Let me start with the benefits:
Increase Your Search Engine Rankings Where It Counts (Benefit #1)

This is the most simple, fundamental benefit of having separate web-sites for each country you are targeting - you get a big rankings boost.
Because most of my clients are Australian businesses who target an Australian audience, I work a lot with Australian (.com.au) domain names.
And it doesn’t take a rocket scientist to notice that it’s clearly easier to rank better in Australian users’ Google Search results if you have an Australian .com.au domain name.
Have a look at this:
[Image: Search Engine Rankings Australia]
Check it out - 9 out of the top 10 results are Australian web-sites.
Here’s what people in the U.K. see when they do the same search:
[Image: Search Engine Rankings UK]
(Here 5 out of the top 10 results in the Search Engine Rankings are British web-sites.)
And for comparison, here’s the same page in the International version of Google:
[Image: International Search Engine Rankings]
Here you see a different set of results again - this time, more American and International web-sites, with 8 of the top 10 sites being owned by American companies.
So we see a very clear positive correlation here between having a site targeted to a particular country and search engine rankings.
Increase your Click Through Rates (Benefit #2)
If SEO isn’t your style, and you’re more looking at Adwords results, consider what a country-specific domain name does to your credibility in that market.
If you have a .com.au domain name as the display URL in your Google Adwords account, and you’re targeting an American or British audience, in the half second it takes for someone to review your ad and decide to click, the seed of doubt enters their mind and they decide your site is “probably not relevant”.
On the other hand, if you were targeting an Australian audience, and your display URL showed a .com.au domain name, you would immediately GAIN credibility.
Increase your Conversion Rates by Tailoring Content to Each Market (Benefit #3)
Finally, one of the great things about having separate web-sites for each country you are targeting is that you can tailor the content to suit your market.
For example, the American market is less offended by heavy-selling, hyped-up advertising copy - whereas Australian, New Zealand and British consumers are generally more conservative in the marketing messages they respond to.
This presents businesspeople who own several different country-targeted web-sites with a unique opportunity to optimise their content, in the same way that some successful direct mail marketers segment and target their lists.
WARNING: Duplicate Content
But… There’s one BIG issue to consider if you’re planning to set up a second version of your web-site under a country-specific URL.
Duplicate Content on Multiple Domains Case Study #1
Recently, a client’s web-site was ranking poorly in search engines.
I found they were running the same site on both the .com and .com.au domain names - and every page (on one of the two sites) had received a duplicate content penalty. That meant roughly half the pages on each site were working.
The client was targeting an Australian audience, but they wanted to avoid some domain squatter or competitor owning the .com version of their domain name - so they registered both.
…And they couldn’t bear to not use the .com version of their domain (and let it sit there, going to waste), so they set their site up so that it would run on both the .com and .com.au domains.
It all seemed pretty logical to the web developer who set up their web-site… but little did they know it would end up causing big problems for their client.
When Google found their web-sites, it quickly discovered that all of the content was identical on the two domains and had to decide which domain was the “legitimate” owner of the content, and which should be penalised for duplicating it.
Google ended up deciding some of the content legitimately belonged to one web-site and some belonged to the other - and it penalised both web-sites.
Ouch!
It only took a tiny bit of Search Engine Optimisation work to fix this.
We 301 (permanent) redirected all pages from one version of the site to the other (so that the remaining site kept the links both domains had earned), changed some settings in their Google Webmaster Tools accounts (to fix a second issue to do with domain canonicalization) and did some minor on-page SEO tweaks. A sketch of such a redirect is shown below.
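For the curious, here is a minimal sketch of that kind of 301 redirect, assuming an Apache server with mod_rewrite enabled; the domain names are placeholders for the client's real .com and .com.au domains:
# in the .htaccess of the domain being retired (example.com)
RewriteEngine On
RewriteCond %{HTTP_HOST} ^(www\.)?example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com.au/$1 [R=301,L]
Every page on the old domain now answers with a permanent redirect to the same path on the kept domain, so link value accumulates on one site instead of being split.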
Ever since, their rankings have been improving, and their positioning in search engines has never been better!
The Lesson: if all of your customers are in one country, don’t run two web-sites.
(Oh, and speaking to an SEO guy can pay off - even if you’re *logically* doing the right thing. ;) )
Duplicate Content on Multiple Domains Case Study #2
This one’s a doozie… and it happened just prior to the last Google Algorithm update.
Another potential client - a corporate marketing firm - came to me asking to help them improve their search engine rankings.
Their web-site was part of a network of partner companies, each providing the same service in different regions. Their site was targeting Australia.
I looked into the job and found that one of the major issues they faced was duplicated content… the same content was repeated across the multiple web-sites.
I’m not just talking about content which described their packaged services and business philosophy - they also had duplicated links pages which served to cross-promote the various partner companies.
But the client didn’t want to change these duplicate pages.
“I can’t do that. Part of the agreement with the international partners says we will all display these pages. They help us to get links to our sites, and mean we can cross-promote each other. And we haven’t been penalised so far…”
Famous last words…
A few weeks later, when Google’s algorithm was updated, all of the partner sites dropped significantly - all pages lost PageRank, and the sites no longer featured on the first page of Google for any significant keywords.
They’re now “reassessing their options”.
The lesson: Don’t copy and paste content. Even if there is a short term benefit, it’s poison for search engine rankings.
Final Tip: Google Webmaster Tools
Australia (like many other countries) has tight restrictions on who can own its country-specific domain names. In Australia there are a whole host of restrictions - in general, you need a registered business entity in order to own the domain name you want to register.
But, if you don’t want to go to the effort of setting up structures in other countries, or if you already have a perfectly good .com domain name which you want to target to a specific country, the alternative is to set up a Google Webmaster Tools account for that domain and tell Google which country you are targeting.
Set up or log into your Google Webmaster Tools account, and make sure your site has been verified. Once you have done that…
[Image: Google Webmaster Tools geotargeting]
  1. Click on Tools;
  2. Then Set Geographic Target;
  3. Then Associate a geographic location with this site, and finally;
  4. Select the Country or Region you want to target
Google does use some of its own tricks to work out which country or region a web-site is targeting (you would have seen several British .com web-sites ranking well in the examples above), but this helps to make sure they get it right.
This is as good as a country-specific domain name for Search Engine Optimisation purposes.