Search engine optimization (
SEO) is the process of improving the visibility of a
website or a
web page in
search engines via the "natural" or unpaid ("
organic" or "algorithmic")
search results.
In general, the earlier (or higher on the page), and more frequently a
site appears in the search results list, the more visitors it will
receive from the search engine's users. SEO may target different kinds
of search, including
image search,
local search,
video search,
academic search, news search and industry-specific
vertical search engines.
As an
Internet marketing
strategy, SEO considers how search engines work, what people search
for, the actual search terms typed into search engines and which search
engines are preferred by their targeted audience. Optimizing a website
may involve editing its content and
HTML and associated coding to both increase its relevance to specific keywords and to remove barriers to the
indexing activities of search engines. Promoting a site to increase the number of
backlinks, or inbound links, is another SEO tactic.
The acronym "SEOs" can refer to "search engine optimizers," a term adopted by an industry of
consultants
who carry out optimization projects on behalf of clients, and by
employees who perform SEO services in-house. Search engine optimizers
may offer SEO as a stand-alone service or as a part of a broader
marketing campaign. Because effective SEO may require changes to the
HTML source code of a site and site content, SEO tactics may be incorporated into
website development and
design. The term "search engine friendly" may be used to describe website designs,
menus,
content management systems, images, videos,
shopping carts, and other elements that have been optimized for the purpose of search engine exposure.
Another class of techniques, known as black hat SEO, search engine poisoning, or
spamdexing, uses methods such as
link farms,
keyword stuffing and
article spinning
that degrade both the relevance of search results and the quality of
user-experience with search engines. Search engines look for sites that
employ these techniques in order to remove them from their indices.
History
Webmasters
and content providers began optimizing sites for search engines in the
mid-1990s, as the first search engines were cataloging the early
Web. Initially, all webmasters needed to do was submit the address of a page, or
URL, to the various engines which would send a "
spider" to "crawl" that page, extract links to other pages from it, and return information found on the page to be
indexed. The process involves a search engine spider downloading a page and
storing it on the search engine's own server, where a second program,
known as an
indexer,
extracts various information about the page, such as the words it
contains and where these are located, as well as any weight for specific
words, and all links the page contains, which are then placed into a
scheduler for crawling at a later date.
Site owners started to recognize the value of having their sites
highly ranked and visible in search engine results, creating an
opportunity for both
white hat and
black hat SEO practitioners. According to industry analyst
Danny Sullivan, the phrase "search engine optimization" probably came into use in 1997.
The first documented use of the term Search Engine Optimization was by
John Audette and his company Multimedia Marketing Group, as documented by a web page from the MMG site from August 1997.
Early versions of search
algorithms relied on webmaster-provided information such as the keyword
meta tag, or index files in engines like
ALIWEB.
Meta tags provide a guide to each page's content. Using metadata to
index pages was found to be less than reliable, however, because the
webmaster's choice of keywords in the meta tag could potentially be an
inaccurate representation of the site's actual content. Inaccurate,
incomplete, and inconsistent data in meta tags could and did cause pages
to rank for irrelevant searches.
Web content providers also manipulated a number of attributes within
the HTML source of a page in an attempt to rank well in search engines.
By relying so much on factors such as
keyword density,
which were exclusively within a webmaster's control, early search
engines suffered from abuse and ranking manipulation. To provide better
results to their users, search engines had to adapt to ensure their
results pages
showed the most relevant search results, rather than unrelated pages
stuffed with numerous keywords by unscrupulous webmasters. Since the
success and popularity of a search engine is determined by its ability
to produce the most relevant results for any given search, allowing those
results to be false would drive users to other search sources.
Search engines responded by developing more complex ranking algorithms,
taking into account additional factors that were more difficult for
webmasters to manipulate.
Graduate students at
Stanford University,
Larry Page and
Sergey Brin,
developed "backrub," a search engine that relied on a mathematical
algorithm to rate the prominence of web pages. The number calculated by
the algorithm,
PageRank, is a function of the quantity and strength of
inbound links. PageRank estimates the likelihood that a given page will be reached by a
web user who randomly surfs the web, and follows links from one page to
another. In effect, this means that some links are stronger than
others, as a higher PageRank page is more likely to be reached by the
random surfer.
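The random-surfer idea can be sketched in a few lines of Python. This is an illustrative toy only: the three-page link graph and the 0.85 damping factor below are assumptions made for the example, not Google's actual parameters.

```python
# Minimal PageRank sketch (illustrative only; not Google's implementation).
# Assumed: a tiny link graph and the commonly cited damping factor of 0.85.
links = {
    "A": ["B", "C"],   # page A links to B and C
    "B": ["C"],
    "C": ["A"],
}

damping = 0.85
pages = list(links)
rank = {p: 1.0 / len(pages) for p in pages}  # start from a uniform distribution

# Power iteration: repeatedly redistribute rank along outbound links.
for _ in range(50):
    new_rank = {p: (1.0 - damping) / len(pages) for p in pages}
    for page, outbound in links.items():
        share = damping * rank[page] / len(outbound)
        for target in outbound:
            new_rank[target] += share
    rank = new_rank

# Pages with more (and stronger) inbound links end up with higher scores.
print(rank)
```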
Page and Brin founded
Google in 1998. Google attracted a loyal following among the growing number of Internet users, who liked its simple design.
Off-page factors (such as PageRank and hyperlink analysis) were
considered as well as on-page factors (such as keyword frequency,
meta tags,
headings, links and site structure) to enable Google to avoid the kind
of manipulation seen in search engines that only considered on-page
factors for their rankings. Although PageRank was more difficult to
game, webmasters had already developed link building tools and schemes
to influence the
Inktomi
search engine, and these methods proved similarly applicable to gaming
PageRank. Many sites focused on exchanging, buying, and selling links,
often on a massive scale. Some of these schemes, or
link farms, involved the creation of thousands of sites for the sole purpose of
link spamming.
By 2004, search engines had incorporated a wide range of undisclosed
factors in their ranking algorithms to reduce the impact of link
manipulation. Google says it ranks sites using more than 200 different
signals. The leading search engines,
Google,
Bing, and
Yahoo, do not disclose the algorithms they use to rank pages. Notable SEO service providers, such as Rand Fishkin,
Barry Schwartz,
Aaron Wall and
Jill Whalen, have studied different approaches to search engine optimization, and have published their opinions in online
forums and
blogs. SEO practitioners may also study patents held by various search engines to gain insight into the algorithms.
In 2005 Google began personalizing search results for each user.
Depending on their history of previous searches, Google crafted results
for logged-in users. In 2008,
Bruce Clay said that "ranking is dead" because of
personalized search.
It would become meaningless to discuss how a website ranked, because
its rank would potentially be different for each user and each search.
In 2007 Google announced a campaign against paid links that transfer PageRank.
On June 15, 2009, Google disclosed that they had taken measures to mitigate the effects of PageRank sculpting by use of the
nofollow attribute on links.
Matt Cutts,
a well-known software engineer at Google, announced that Googlebot
would no longer treat nofollowed links in the same way, in order to
prevent SEO service providers from using nofollow for PageRank
sculpting.
As a result of this change, the use of nofollow leads to the evaporation of
PageRank. To avoid this, SEO engineers developed
alternative techniques that replace nofollowed tags with obfuscated
JavaScript and thus permit PageRank sculpting. Additionally, several solutions have been suggested that include the use of
iframes,
Flash, and JavaScript.
In December 2009 Google announced it would be using the web search history of all its users in order to populate search results.
Real-time search was introduced in late 2009 in an attempt to
make search results more timely and relevant. Historically site
administrators have spent months or even years optimizing a website to
increase search rankings. With the growth in popularity of social media
sites and blogs, the leading engines made changes to their algorithms to
allow fresh content to rank quickly within the search results.
Relationship with search engines
By 1997 search engines recognized that
webmasters were making efforts to rank well in their search engines, and that some webmasters were even
manipulating their rankings in search results by stuffing pages with excessive or irrelevant keywords. Early search engines, such as
AltaVista and
Infoseek, adjusted their algorithms in an effort to prevent webmasters from manipulating rankings.
Due to the high marketing value of targeted search results, there is
potential for an adversarial relationship between search engines and SEO
service providers. In 2005, an annual conference, AIRWeb (Adversarial
Information Retrieval on the Web), was created to discuss and minimize the damaging effects of aggressive web content providers.
Companies that employ overly aggressive techniques can get their client websites banned from the search results. In 2005, the
Wall Street Journal reported on a company,
Traffic Power, which allegedly used high-risk techniques and failed to disclose those risks to its clients.
[23] Wired magazine reported that the same company sued blogger and SEO
Aaron Wall for writing about the ban.
[24] Google's
Matt Cutts later confirmed that Google did in fact ban Traffic Power and some of its clients.
[25]
Some search engines have also reached out to the SEO industry, and
are frequent sponsors and guests at SEO conferences, chats, and
seminars. In fact, with the advent of paid inclusion, some search
engines now have a vested interest in the health of the optimization
community. Major search engines provide information and guidelines to
help with site optimization.
[26][27][28] Google has a
Sitemaps program
[29]
to help webmasters learn if Google is having any problems indexing
their website and also provides data on Google traffic to the website.
Google guidelines are a list of suggested practices Google has provided
as guidance to webmasters.
Yahoo! Site Explorer provides a way for webmasters to submit URLs, determine how many pages are in the Yahoo! index and view link information.
[30] Bing Toolbox
provides a way for webmasters to submit a sitemap and web feeds,
allowing users to determine the crawl rate, and how many pages have been
indexed by their search engine.
Methods
Getting indexed
The leading search engines, such as
Google,
Bing and
Yahoo!, use
crawlers
to find pages for their algorithmic search results. Pages that are
linked from other search engine indexed pages do not need to be
submitted because they are found automatically. Some search engines,
notably Yahoo!, operate a paid submission service that guarantees
crawling for either a set fee or
cost per click.
[31] Such programs usually guarantee inclusion in the database, but do not guarantee specific ranking within the search results.
[32] Two major directories, the Yahoo! Directory and the
Open Directory Project, both require manual submission and human editorial review.
[33] Google offers
Google Webmaster Tools, for which an XML
Sitemap
feed can be created and submitted for free to ensure that all pages are
found, especially pages that aren't discoverable by automatically
following links.
[34]
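As a rough sketch of what such a Sitemap feed contains (the example.com URLs below are placeholders, not real pages), a minimal XML Sitemap file can be generated with Python's standard library:

```python
# Generate a minimal XML Sitemap (sketch; the example.com URLs are placeholders).
from xml.etree import ElementTree as ET

urls = ["https://www.example.com/", "https://www.example.com/about"]

urlset = ET.Element("urlset",
                    xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
for address in urls:
    url = ET.SubElement(urlset, "url")
    ET.SubElement(url, "loc").text = address

# Write sitemap.xml, which can then be submitted through a search engine's
# webmaster tools so that otherwise undiscoverable pages can be found.
ET.ElementTree(urlset).write("sitemap.xml",
                             encoding="utf-8", xml_declaration=True)
```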
Search engine crawlers may look at a number of different factors when
crawling
a site. Not every page is indexed by the search engines. Distance of
pages from the root directory of a site may also be a factor in whether
or not pages get crawled.
[35]
Additionally, search engines sometimes have problems with crawling
sites with certain kinds of graphic content, Flash files, portable
document format files, and dynamic content.
[36]
Preventing crawling
To avoid undesirable content in the search indexes, webmasters can
instruct spiders not to crawl certain files or directories through the
standard
robots.txt
file in the root directory of the domain. Additionally, a page can be
explicitly excluded from a search engine's database by using a
meta tag specific to robots. When a search engine visits a site, the robots.txt located in the
root directory
is the first file crawled. The robots.txt file is then parsed, and will
instruct the robot as to which pages are not to be crawled. As a search
engine crawler may keep a cached copy of this file, it may on occasion
crawl pages a webmaster does not wish crawled. Pages typically prevented
from being crawled include login-specific pages such as shopping carts
and user-specific content such as search results from internal searches.
In March 2007, Google warned webmasters that they should prevent
indexing of internal search results because those pages are considered
search spam.
[37]
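As a brief illustration of how such directives are interpreted (the rules and paths below are made-up examples, not a recommendation), Python's standard library can parse robots.txt rules and check whether a given URL may be fetched:

```python
# Sketch of robots.txt handling (the rules and paths are made-up examples).
from urllib.robotparser import RobotFileParser

robots_txt = """\
User-agent: *
Disallow: /cart/
Disallow: /search
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# A well-behaved crawler checks each URL against the rules before fetching it.
print(parser.can_fetch("*", "https://www.example.com/cart/checkout"))    # False
print(parser.can_fetch("*", "https://www.example.com/products/widget"))  # True
```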
Increasing prominence
A variety of methods can increase the prominence of a webpage within the search results.
Cross linking between pages of the same website to provide more links to the most important pages may improve its visibility.
[38]
Writing content that includes frequently searched keyword phrases, so as
to be relevant to a wide variety of search queries, will tend to
increase traffic.
[38]
Updating content so as to keep search engines crawling back frequently
can give additional weight to a site. Adding relevant keywords to a web
page's metadata, including the
title tag and meta description, will tend to improve the relevancy of a site's search listings, thus increasing traffic.
URL normalization of web pages accessible via multiple URLs, using the "canonical"
meta tag[39] or via
301 redirects, can help make sure links to different versions of the URL all count towards the page's link popularity score.
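As a rough illustration of the normalization idea (the specific rules and the example domain below are assumptions for this sketch, not a standard), equivalent URL variants can be collapsed to a single canonical form so links to any of them are credited to one page:

```python
# Sketch of URL normalization (the chosen rules and example domain are assumptions).
from urllib.parse import urlsplit, urlunsplit

def canonicalize(url):
    """Map URL variants (scheme, host case, trailing slash, fragment) to one form."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    path = parts.path.rstrip("/") or "/"
    return urlunsplit(("https", host, path, parts.query, ""))

variants = [
    "http://WWW.Example.com/widgets/",
    "https://www.example.com/widgets",
    "https://www.example.com/widgets#reviews",
]
# All three variants collapse to the same canonical URL, so inbound links
# pointing at any of them count towards a single page.
print({canonicalize(u) for u in variants})
```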
White hat versus black hat
SEO techniques are classified by some into two broad categories:
techniques that search engines recommend as part of good design, and
those techniques that search engines do not approve of and attempt to
minimize the effect of, referred to as
spamdexing.
Some industry commentators classify these methods, and the
practitioners who employ them, as either white hat SEO, or black hat
SEO.
[40]
White hats tend to produce results that last a long time, whereas black
hats anticipate that their sites will eventually be banned once the
search engines discover what they are doing.
[41]
An SEO tactic, technique or method is considered white hat if it
conforms to the search engines' guidelines and involves no deception. As
the search engine guidelines
[26][27][28][42]
are not written as a series of rules or commandments, this is an
important distinction to note. White hat SEO is not just about following
guidelines, but is about ensuring that the content a search engine
indexes and subsequently ranks is the same content a user will see.
White hat advice is generally summed up as creating content for
users, not for search engines, and then making that content easily
accessible to the spiders, rather than attempting to game the algorithm.
White hat SEO is in many ways similar to web development that promotes
accessibility,
[43] although the two are not identical.
White hat SEO is merely effective marketing, making efforts to
deliver quality content to an audience that has requested it. Traditional
marketing
means have allowed this through transparency and exposure. A search
engine's algorithm, such as Google's PageRank, takes this into account.
Black hat SEO
attempts to improve rankings in ways that are disapproved of by the
search engines, or involve deception. One black hat technique uses text
that is hidden, either as text colored similar to the background, in an
invisible
div,
or positioned off screen. Another method gives a different page
depending on whether the page is being requested by a human visitor or a
search engine, a technique known as
cloaking.
Search engines may penalize sites they discover using black hat
methods, either by reducing their rankings or eliminating their listings
from their databases altogether. Such penalties can be applied either
automatically by the search engines' algorithms, or by a manual site
review. One infamous example was the February 2006 Google removal of
both
BMW Germany and
Ricoh Germany for use of deceptive practices.
[44] Both companies, however, quickly apologized, fixed the offending pages, and were restored to Google's list.
[45]
Additionally, many professionals in the SEO industry refer to "gray
hat" tactics that skirt the line between black hat and white hat techniques.
Numerous references to gray hat techniques have been published, and
these usually constitute practices that are not strictly disapproved by
search engines, but may go against the spirit of the regulations that
search engines have laid out.
As a marketing strategy
SEO is not an appropriate strategy for every website, and other
Internet marketing strategies can be more effective, depending on the
site operator's goals.
A successful Internet marketing campaign may also depend upon building
high quality web pages to engage and persuade, setting up
analytics programs to enable site owners to measure results, and improving a site's
conversion rate.
SEO may generate an adequate
return on investment.
However, search engines are not paid for organic search traffic, their
algorithms change, and there are no guarantees of continued referrals.
Due to this lack of guarantees and certainty, a business that relies
heavily on search engine traffic can suffer major losses if the search
engines stop sending visitors.
It is considered wise business practice for website operators to liberate themselves from dependence on search engine traffic. The top-ranked SEO blog Seomoz.org
has suggested, "Search marketers, in a twist of irony, receive a very
small share of their traffic from search engines." Instead, their main
sources of traffic are links from other websites.
International markets
Optimization techniques are highly tuned to the dominant search
engines in the target market. The search engines' market shares vary
from market to market, as does competition. In 2003,
Danny Sullivan stated that Google represented about 75% of all searches. In markets outside the United States, Google's share is often larger,
and Google remains the dominant search engine worldwide as of 2007.
As of 2006, Google had an 85-90% market share in Germany.
While there were hundreds of SEO firms in the US at that time, there were only about five in Germany. As of June 2008, the market share of Google in the UK was close to 90% according to
Hitwise.
That market share is achieved in a number of countries.
As of 2009, there are only a few large markets where Google is not
the leading search engine. In most cases, when Google is not leading in a
given market, it is lagging behind a local player. The most notable
markets where this is the case are China, Japan, South Korea, Russia and
the Czech Republic where respectively
Baidu,
Yahoo! Japan,
Naver,
Yandex and
Seznam are market leaders.
Successful search optimization for international markets may require professional
translation of web pages, registration of a domain name with a
top level domain in the target market, and
web hosting that provides a local
IP address. Otherwise, the fundamental elements of search optimization are essentially the same, regardless of language.
Legal precedents
On October 17, 2002,
SearchKing
filed suit in the United States District Court, Western District of
Oklahoma, against the search engine Google. SearchKing's claim was that
Google's tactics to prevent
spamdexing constituted a
tortious interference
with contractual relations. On May 27, 2003, the court granted Google's
motion to dismiss the complaint because SearchKing "failed to state a
claim upon which relief may be granted."
In March 2006,
KinderStart filed a lawsuit against
Google
over search engine rankings. KinderStart's website was removed from
Google's index prior to the lawsuit and the amount of traffic to the
site dropped by 70%. On March 16, 2007 the
United States District Court for the Northern District of California (
San Jose Division) dismissed KinderStart's complaint without leave to amend, and partially granted Google's motion for
Rule 11 sanctions against KinderStart's attorney, requiring him to pay part of Google's legal expenses.