Search 101

Major Search Engines Listing and Information

We have written and compiled a strategy guide explaining the major search engines, the various nuances of each one and the techniques necessary for effectively optimizing a site to rank well in each one. Any research information found on other websites has been noted accordingly, and links have been provided to those websites. The Major Search Engines:

Google | Yahoo | MSN | AOL | Ask


Google

Google.com has 52.7% of the market share of online searches. Google is currently the world's largest search engine (Nielsen//NetRatings 6/07).

Getting Listed in Google (Information pulled directly from the Google.com website)

How do I add my site to Google's search results?

Getting your site indexed by Google is easy and free through either of the following methods:

1) You may submit your website directly through Google's URL submission form by entering your URL and any comments you may have about your website. Although not all submissions are added to Google's index, this is a quick way to ask Google to review your website.

2) It is not necessary, however, to submit your website in order for Google to index it. The Google search engine includes software known as "spiders" that regularly crawl the Web in order to find new websites and review currently indexed sites. Most sites that are indexed by Google are found this way and added to the index with no direct action by the webmaster or website owner. Not all websites are automatically added to the Google index, so read their webmaster guidelines or read the tips below to understand how to create a website that is Google-friendly and more likely to be indexed.

To find out whether your website has already been found and indexed by Google's spiders, perform a site search (search for site:mywebsite.com).
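
As a small illustration of the site search described above, the sketch below builds the search URL for a given domain. The helper name site_query_url is made up for this example, and the URL pattern is assumed to be Google's ordinary public search address rather than any dedicated API.

```python
from urllib.parse import urlencode

def site_query_url(domain: str) -> str:
    """Build a Google search URL that runs a site: query for one domain.

    Hypothetical helper for illustration; it only formats a URL and does
    not contact Google.
    """
    return "https://www.google.com/search?" + urlencode({"q": f"site:{domain}"})

url = site_query_url("mywebsite.com")
print(url)
```

Opening the printed URL in a browser shows the pages from that domain that Google currently lists.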


How can I create a Google-friendly site?

Things to do

Our webmaster guidelines provide general design, technical, and quality guidelines. Below are more detailed tips for creating a Google-friendly site.

1) Quality: Give visitors the information they're looking for
Google's emphasis is on creating high quality content, particularly on your home page, that is beneficial to the user. If your website has high quality content, other webmasters are more likely to link to it, and Google is likely to see it as a resource worth indexing. Also, use words that accurately describe the topic of each page and terms that your users are likely to search for when they look for information, a product, or a service like yours.

2) Popularity: Make sure that other sites link to yours
Not only do links help Google spiders to find your website, but they also indicate how well-liked and useful your website is. Incoming links are one part of the equation to determine the PageRank of each page on your website, and each link counts as a vote for the importance of the page it points to. It's important to remember that the Google algorithm can distinguish between natural and unnatural links. Natural links, which develop from high quality content being recognized by and linked to by other webmasters, are useful to getting indexed and obtaining rankings. Unnatural links, which are usually bought with the sole purpose of making your website seem more popular to search engines, are not useful.

3) Accessibility: Make your site easily accessible
Most importantly, your website should have a logical linking structure that makes navigation simple for all users as well as Google's spiders. Pages should all be found through static text links. You should also provide your users with a simple, straightforward site map to help them more easily navigate the site.

Use a text browser, such as Lynx, to examine your site. Most spiders see your site much as Lynx would. Some website features make crawling more difficult for spiders, such as JavaScript, cookies, session IDs, frames, DHTML, or Macromedia Flash. Using text rather than images (particularly for important names, links and content) also makes your website more accessible to all visitors. Google's spiders do not recognize images, or even text contained in images.

Dynamic pages also cause some problems for spiders, and Google recommends that you make static copies of your dynamic pages to increase your chances of having all your pages indexed and ranking. Remember to add the dynamic pages to your robots.txt file in order to avoid having duplicate pages for spiders to crawl, as this may result in a penalty from Google.
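
The robots.txt advice above can be checked before you publish the file. The sketch below uses Python's standard urllib.robotparser module to confirm that a rule blocks the dynamic pages while leaving the static copies crawlable. The paths /catalog.php and /catalog/ and the example.com URLs are hypothetical, and the parser only approximates how a real crawler interprets the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the dynamic originals are served by /catalog.php
# and the static copies live under /catalog/ (both paths are made up).
ROBOTS_TXT = """\
User-agent: *
Disallow: /catalog.php
"""

def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser

parser = build_parser(ROBOTS_TXT)

# The static copy should be crawlable, the dynamic original should not.
static_ok = parser.can_fetch("Googlebot", "http://www.example.com/catalog/widgets.html")
dynamic_ok = parser.can_fetch("Googlebot", "http://www.example.com/catalog.php?id=42")
print("static copy allowed:", static_ok)
print("dynamic page allowed:", dynamic_ok)
```

Running a check like this after every edit to robots.txt helps catch a rule that accidentally blocks the pages you want indexed.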

Some More Tips:

  • Make sure that your TITLE tags and ALT attributes are descriptive and accurate.
  • Check for broken links and correct HTML.
  • Keep the links on a given page to a reasonable number (fewer than 100).
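
The tips above lend themselves to a quick automated check. The following is a minimal sketch, using only Python's standard html.parser module, that reports a page's title, counts its links against the 100-link guideline, and counts images with no ALT text. The class and function names are invented for this example, and the sample HTML is made up.

```python
from html.parser import HTMLParser

class PageAudit(HTMLParser):
    """Collects the title text, the link count, and images lacking ALT text."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.link_count = 0
        self.images_missing_alt = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "a" and attributes.get("href"):
            self.link_count += 1
        elif tag == "img" and not (attributes.get("alt") or "").strip():
            self.images_missing_alt += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data

def audit(html: str, max_links: int = 100) -> dict:
    """Return a small report; max_links mirrors the 'fewer than 100' tip."""
    parser = PageAudit()
    parser.feed(html)
    parser.close()
    return {
        "title": parser.title.strip(),
        "link_count": parser.link_count,
        "too_many_links": parser.link_count >= max_links,
        "images_missing_alt": parser.images_missing_alt,
    }

SAMPLE = """<html><head><title>Blue Widgets</title></head><body>
<a href="/a">A</a> <a href="/b">B</a>
<img src="x.png" alt="A blue widget"> <img src="y.png">
</body></html>"""

report = audit(SAMPLE)
print(report)
```

This only inspects markup. It does not verify that links resolve or that the title is accurate, so it complements a manual review rather than replacing one.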

Technical guidelines

  • Allow search bots to crawl your sites without session IDs or arguments that track their path through the site. These techniques are useful for tracking individual user behavior, but the access pattern of bots is entirely different. Using these techniques may result in incomplete indexing of your site, as bots may not be able to eliminate URLs that look different but actually point to the same page.
  • Make sure your web server supports the If-Modified-Since HTTP header. This feature allows your web server to tell Google whether your content has changed since Google last crawled your site. Supporting this feature saves you bandwidth and overhead.
  • Make use of the robots.txt file on your web server. This file tells crawlers which directories can or cannot be crawled. Make sure it's current for your site so that you don't accidentally block the Googlebot crawler. Visit http://www.robotstxt.org/wc/faq.html to learn how to instruct robots when they visit your site. You can test your robots.txt file to make sure you're using it correctly with the robots.txt analysis tool available in Google webmaster tools.
  • If your company buys a content management system, make sure that the system can export your content so that search engine spiders can crawl your site.
  • Use robots.txt to prevent crawling of search results pages or other auto-generated pages that don't add much value for users coming from search engines.
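
To illustrate the If-Modified-Since guideline above, here is a minimal sketch of the decision a web server makes when a crawler sends that header. Most web servers handle this automatically for static files, so the function conditional_status is hypothetical and only shows the logic: answer 304 (Not Modified, no body sent) when the page has not changed since the date the crawler supplies, otherwise answer 200 with the full page.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Optional

def conditional_status(last_modified: datetime, if_modified_since: Optional[str]) -> int:
    """Return 304 if the crawler's copy is still current, else 200.

    last_modified must be timezone-aware; if_modified_since is the raw
    header value, or None when the header is absent.
    """
    if not if_modified_since:
        return 200
    try:
        client_time = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable header: fall back to a full response
    if client_time.tzinfo is None:
        client_time = client_time.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds.
    if last_modified.replace(microsecond=0) <= client_time:
        return 304
    return 200

# Made-up modification time for a page.
page_changed = datetime(2007, 6, 1, 12, 0, 0, tzinfo=timezone.utc)
header_value = format_datetime(page_changed, usegmt=True)

print(conditional_status(page_changed, header_value))
print(conditional_status(page_changed, "Thu, 31 May 2007 12:00:00 GMT"))
```

The first call returns 304 because the crawler already holds the current version. The second returns 200 because the page changed after the date the crawler supplied.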

Quality guidelines

The basic principles and specific quality guidelines below are intended as examples of manipulative behavior that could hurt a website's rankings in Google. This is not to say that other manipulative behavior meant solely to improve search engine rankings won't also result in a penalty. These are simply examples of the type of behavior to avoid. Google's algorithm is designed to reward webmasters whose focus is to provide high quality products, service, and information to the user. Websites that you believe are abusing Google's quality guidelines can be reported to Google at https://www.google.com/webmasters/tools/spamreport. Google uses these reports to establish scalable solutions to major threats to the integrity of search engine results pages.

Basic principles

Design your website with the user in mind rather than for search engines. Tactics to manipulate search engine rankings that provide no benefit to users can be penalized by search engines. This means avoiding any practice that has no purpose other than to trick Google into giving your website higher rankings than it would get naturally. These methods include participating in unnatural linking schemes, or otherwise violating Google's Terms of Service.

Quality guidelines - specific guidelines

If you determine that your site doesn't meet these guidelines, you can modify your site so that it does and then submit your site for reconsideration.

How often does Google crawl the web?

Google's spiders are constantly crawling the Web and making updates to their index, but the commonly accepted idea is that a major crawl and re-indexing occurs roughly every month. Websites are not all crawled equally, either, and how often each one is crawled depends on factors such as PageRank, inbound links, and more. This is all done by computer software. Google does not accept payment to crawl websites more frequently.

PageRank Explained

PageRank is Google's rating of each Web page's value, and it is determined by a number of different factors. Each inbound link (or vote) factors into PageRank, but not all links are counted the same. A link from a page with a high PageRank benefits the page it links to more than a link from a page with a PageRank of 0. An indication of PageRank can be viewed on the Google Toolbar, although this is only an indication and there is no guarantee that the number is up-to-date or accurate.

PageRank is only a small part of how Google's algorithm selects websites to display in search engine results. The algorithm also carefully analyzes websites to return those most relevant to the searcher's query. A combination of higher PageRank and relevancy is what determines search engine results placement.
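
The idea that links act as weighted votes can be illustrated with the simplified PageRank iteration from the originally published description of the algorithm. This is a sketch for intuition only, not Google's production ranking: the damping factor of 0.85, the iteration count, and the four-page link graph are all assumptions chosen for the example.

```python
from typing import Dict, List

def pagerank(links: Dict[str, List[str]], damping: float = 0.85,
             iterations: int = 50) -> Dict[str, float]:
    """Simplified PageRank by repeated redistribution of rank along links."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its vote over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank

# Made-up site: every page links to "home", nothing links to "orphan".
LINKS = {
    "home": ["about", "products"],
    "about": ["home"],
    "products": ["home"],
    "orphan": ["home"],
}

ranks = pagerank(LINKS)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this graph "home" ends up with the highest score because every other page votes for it, while "orphan" receives only the baseline share because no page links to it.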


Yahoo

Yahoo has 20.2% of the market share of online searches and ranks second in terms of search market share.

Yahoo! Search Content Quality Guidelines for Indexing and Ranking

Yahoo! allows webmasters and website owners to submit their websites to its index through its site submission page. You will be required to create a Yahoo! ID.

Pages Yahoo! Wants Included in its Index

  • Original and unique content of genuine value
  • Pages designed primarily for human visitors, with search engine considerations secondary
  • Hyperlinks intended to help people find interesting, related content, when applicable
  • Metadata (including title and description) that accurately describes the contents of a web page
  • Good web design in general (easily navigable, informative, and fully functional)

As many people now realize, the Internet does not consist solely of high quality websites that benefit visitors. In fact, many websites are created for the sole purpose of tricking search engines into believing they are high quality sites, even though they provide nothing of value to Web users. This type of website is often called "spam" and Yahoo! does not want this type of website in its index.

What Yahoo! Considers Unwanted
Some, but not all, examples of the more common types of pages that Yahoo! does not want include:

  • Pages that harm accuracy, diversity or relevance of search results
  • Pages dedicated to directing the user to another page
  • Pages that have substantially the same content as other pages
  • Sites with numerous, unnecessary virtual hostnames
  • Pages in great quantity, automatically generated or of little value
  • Pages using methods to artificially inflate search engine ranking
  • The use of text that is hidden from the user
  • Pages that give the search engine different content than what the end-user sees
  • Excessively cross-linking sites to inflate a site's apparent popularity
  • Pages built primarily for the search engines
  • Misuse of competitor names
  • Multiple sites offering the same content
  • Pages that use excessive pop-ups, interfering with user navigation
  • Pages that seem deceptive, fraudulent or provide a poor user experience

Yahoo! Search Technology's Content Quality Guidelines are designed to ensure that poor-quality pages do not degrade the user experience in any way. As with Yahoo!'s other guidelines, Yahoo! reserves the right, at its sole discretion, to take any and all action it deems appropriate to ensure the quality of its index.


MSN Search

Guidelines for successful indexing in MSN Search

The following recommendations include technical advice as well as content guidelines and examples of things to avoid when designing your website. Following these guidelines makes it easier for MSNBot and other web crawlers to index and rank your Web pages.

Technical recommendations for your website

  • Check your HTML for errors and breaks. Bad HTML (especially broken links) makes it more difficult for MSNBot to crawl and index your site. Broken links may cause some pages on your site to not be indexed.
  • When moving pages, be sure to set the original URL up to redirect to the new URL. Also, make it clear whether the move is temporary or permanent.
  • Do not prohibit MSNBot from crawling and indexing your website.
  • Make good use of a robots.txt file. By controlling which files MSNBot can access, you ensure that your website is indexed the way you want.
  • Simple, static URLs are easier for search engine crawlers like MSNBot to handle. Complicated or frequently changed URLs are difficult to use as link destinations. For example, the URL www.example.com/mypage is easier for MSNBot to crawl and for people to type than a long URL with multiple extensions. In addition, a simple URL that doesn't change is easier for visitors to remember should they want to return to your website in the future.
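
The recommendation above about redirecting moved pages comes down to choosing the right HTTP status code: 301 signals a permanent move and 302 signals a temporary one. The sketch below shows that choice with a made-up lookup table. The paths, the MOVED_PAGES table, and the redirect_for function are hypothetical, and in practice this is usually configured in the web server rather than written by hand.

```python
from typing import Dict, Optional, Tuple

# Hypothetical table of moved pages: old path -> (new path, is the move permanent?)
MOVED_PAGES: Dict[str, Tuple[str, bool]] = {
    "/old-products.html": ("/products", True),   # page moved for good
    "/sale": ("/summer-sale", False),            # seasonal, temporary move
}

def redirect_for(path: str) -> Optional[Tuple[int, str]]:
    """Return (status code, Location header value) for a moved page, or None."""
    entry = MOVED_PAGES.get(path)
    if entry is None:
        return None
    new_path, permanent = entry
    return (301 if permanent else 302), new_path

print(redirect_for("/old-products.html"))
print(redirect_for("/sale"))
print(redirect_for("/contact"))
```

A crawler that receives a 301 can replace the old URL with the new one in its index, while a 302 tells it to keep the original URL.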

Content guidelines for your website

MSN Search values websites that are designed to provide the visitor with valuable content. A website designed with that goal should take the following content quality guidelines into account.

  • In the visible page text, include words users might choose as search query terms to find the information on your site. This helps MSN Search to match your website with relevant search queries.
  • Limit all pages to a reasonable size. One topic on each page naturally limits the page size as well as keeping the focus clear and easy to distinguish by MSNBot. An HTML page with no pictures should be under 150 KB.
  • Make sure that each page is accessible by at least one static text link.
  • Keep important text that you want indexed outside of images. Search engine crawlers cannot read images, so any important text in images is ignored by search engines.
  • Create a simple, straightforward site map. This is not only a resource for the visitor to use to help navigate your website, but also enables MSNBot to find and index every page on your site.
  • When designing page hierarchy, don't go too deep. MSN recommends that you have no pages that are farther than three clicks away from the home page.
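
The three-click recommendation above can be checked with a breadth-first search over a site's internal links. The sketch below assumes you already have a map from each page to the pages it links to. The SITE map and the function names are invented for this example.

```python
from collections import deque
from typing import Dict, List

def click_depths(links: Dict[str, List[str]], home: str) -> Dict[str, int]:
    """Return the minimum number of clicks from the home page to each reachable page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths

def too_deep(links: Dict[str, List[str]], home: str, max_clicks: int = 3) -> List[str]:
    """List reachable pages that are more than max_clicks away from home."""
    depths = click_depths(links, home)
    return sorted(page for page, depth in depths.items() if depth > max_clicks)

# Made-up internal link map.
SITE = {
    "/": ["/products", "/about"],
    "/products": ["/products/widgets"],
    "/products/widgets": ["/products/widgets/blue"],
    "/products/widgets/blue": ["/products/widgets/blue/specs"],
}

deep_pages = too_deep(SITE, "/")
print(deep_pages)
```

Here the specs page is four clicks from the home page, so it is the one flagged. Linking to it from a shallower page, or from the site map, would bring it within the limit.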

Items and techniques discouraged by MSN Search

MSN Search strongly recommends against using any of the following tactics, and the use of such may negatively impact your MSN Search rankings, or may even cause the removal of your website from MSN Search's index.

  • Keyword stuffing. Using keywords in an inappropriate way in an attempt to influence search engine rankings. This includes using irrelevant keywords, filling alt attributes with keywords, and more.
  • Hidden links and text. If text or links are valuable to the visitor, they should be easily visible by the visitor. If they are not valuable, they need not be there.
  • Other methods of increasing rankings in search engines artificially, without increasing the value of the page to visitors.

AOL Search

AOL is another major search engine that is now powered by Google technology. Following the guidelines provided above about how to get listed in and rank with Google is key to obtaining listing and rankings in AOL's search results pages.


Ask

Ask (formerly Ask Jeeves) is one of the highest ranked search engines in terms of search market share, and is currently ranked fourth by some sources. No longer associated with Jeeves, its long-time cartoon logo, Ask.com has shifted its focus to include results for keyword phrase queries as well as natural, everyday language. In order to get indexed by Ask.com, Ask recommends that webmasters or website owners submit a complete site map.

© 1997-2012 Web.com Search Agency - All Legal Rights Apply.