The Google search engine, which gave its name to the company
Google, is the most widely used search engine on the Internet
worldwide. In 2009, its share of Internet use was reported as
67% [1].
                 
                            Summary
     * 1 Principles and characteristics
           o 1.1 The PageRank system
           o 1.2 Sobriety and recovery of words
           o 1.3 Infrastructure
           o 1.4 Logos
           o 1.5 Beta
     * 2 Services
     * 3 Tips for Using the Google search engine
           o 3.1 Terms to search
           o 3.2 Logical operators (Boolean)
           o 3.3 Restrictions
           o 3.4 Dates
           o 3.5 Sorting results
           o 3.6 Additional functions
           o 3.7 SearchWiki
           o 3.8 Special characters
     * 4 Misuse of Google
           o 4.1 Positioning contests
           o 4.2 Google bomb
           o 4.3 Google Fight
           o 4.4 Google Whacks
           o 4.5 "Fake Google"
     * 5 Limitations and errors from Google
           o 5.1 The size of the database
           o 5.2 The effectiveness of searches
     * 6 Rating engine
     * 7 Controversies
           o 7.1 Controversy about the influence of the content of
             the results displayed
           o 7.2 The Tiananmen case
           o 7.3 The BMW Germany case
           o 7.4 The keywords case in France
           o 7.5 Controversy over the number of results displayed
     * 8 Notes and references
     * 9 See also
           o 9.1 Related articles
           o 9.2 External links

Principles and Characteristics
The PageRank system

Google's operating principle, which made its success, is based
on an invention of its creators, PageRank: when a document is
pointed to by many links (link popularity), its PageRank
increases. The higher a document's PageRank, the more likely
it is to be displayed among the first search results. This
system gives an indication of the "popularity" of a document
as judged by other web documents.
This principle was immediately successful because it produced
more relevant results than other search engines, which simply
recognized the keywords inserted in the pages of sites.
It also makes possible what is called Google bombing.
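
The principle can be illustrated with a minimal Python sketch
of the iterative computation. It uses the formula
PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn)) quoted
later in this article, with damping factor d = 0.85. The
three-page graph, the function name pagerank and the fixed
number of iterations are assumptions made for illustration;
this is not Google's actual implementation.

```python
def pagerank(links, d=0.85, iterations=50):
    """Iterate PR(A) = (1 - d) + d * sum(PR(T) / C(T)) over pages T linking to A.

    `links` maps each page to the list of pages it links to (hypothetical data).
    Every page is assumed to have at least one outgoing link.
    """
    pages = list(links)
    pr = {page: 1.0 for page in pages}  # arbitrary starting values
    for _ in range(iterations):
        new_pr = {}
        for page in pages:
            # Each linking page T passes on PR(T) divided by its C(T) outgoing links.
            incoming = sum(
                pr[src] / len(targets)
                for src, targets in links.items()
                if page in targets
            )
            new_pr[page] = (1 - d) + d * incoming
        pr = new_pr
    return pr


# Toy graph: A and C both link to B, and B links back to A.
toy_links = {"A": ["B"], "B": ["A"], "C": ["B"]}
ranks = pagerank(toy_links)
```

In this toy graph B, which receives two links, ends up with
the highest value, and C, which receives none, keeps the
minimum value 1 - d.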

Sobriety and recovery of words
In addition, this search engine is popular for its fast
searches and its sobriety: no Flash, no flashing banners, etc.
Its interface has inspired other engines such as Yahoo.
This sobriety, far from being anecdotal, is at least partly
behind the success of the site. At the time of its launch, the
fashion was for search engines embedded in pages heavily
loaded with content and advertising. Those pages were often
slow to appear and difficult to read.
Google nevertheless uses a paid advertising system, AdWords.
This system is based on a price per word that depends on
demand: the more a word is requested, the more expensive each
click on it is. The user can still block the display of these
advertisements with plugins such as CustomizeGoogle for
Firefox.

Infrastructure
Around 2002, Google claimed to distribute the load over 10 000
PCs running a modified Linux kernel. The figure of 1 000
simultaneous requests at peak was also frequently mentioned.
Actual figures appear to be 10 times higher. However, they are
kept secret, in particular so that the investment required to
compete with Google cannot easily be calculated.
     * Google and Akamai: Cult of Secrecy vs. Kingdom of Openness
Google uses robots named Googlebot, which visit at regular
intervals all the websites that have asked to be indexed, in
order to keep up to date the database that answers queries.
Main article: Google Platform.

Logos
Apart from the official logo [1], the site adopts special logos
for certain festivals and events: the Google Doodles. Made by
Dennis Hwang, an American designer of Korean origin aged 23,
they appear regularly as local or international festivals (New
Year, national holidays, etc.) or events (Olympics,
commemoration of a person, shows, etc.) provide the occasion.
All the festival and event logos put online on www.google.com
since 1999 are available online, as are, more specifically,
those that appeared in France.

Beta
A beta label is usually a note indicating that a program is
nearing completion. At Google it has become a trademark
affixed to most services and software, except the search
engine and the advertising services.
The interest of the term "beta" is that, in terms of quality
of service, it binds Google to no obligation of result, since
the product is in a development phase. It may also mean that
Google is in a phase of constant improvement.
This peculiarity of Google has become a fashion, reflected in
its competitors' more overt use of the label.

Services
This search engine is available in 35 languages and offers its
interface in over 100 languages.
Google is basically a search engine for web pages; it has
gradually been extended to other types of documents (PDF,
Microsoft Word, Flash, ...) and to images, as well as to Usenet
newsgroups, through Google Groups, since the purchase of Deja
News. The web2news site gives access to forums on Google.
It now has a directory section for finding sites by category
(the dmoz directory, ranked by PageRank), and a news portal
gathering the sites of major newspapers and major news
agencies.
The vast popularity of Google and its very diverse development
policy (advertising links, purchase of databases and forum
archives) eventually raised a number of concerns about the
potential abuse of that power: indeed, it is sometimes enough
to "google" a person's name to obtain thorough personal
information about them.
Google also offers an increasing number of ancillary functions,
available either through the normal Google search field or in
the form of web applications.

Tips for Using the Google search engine
Google offers a simple form and an advanced search form that
lets you exclude words or search for complete expressions
(other advanced features are described below).

Terms to search
Google's documentation on how it interprets queries is fairly
spartan. The changes observed in its operation suggest that
this is probably deliberate, to keep maximum freedom to
change. The following must therefore be continuously validated
and updated to track those changes.
     * H2O is searched as one word, so Google does not find
documents with H 2 O or H2 O in their text. These are found by
asking for "H 2 O". H-2-O (see the role of the hyphen) finds
H2O as well as H 2 O and H-2-O. Unfortunately, the hyphen
operator only looks for the two extreme combinations (all
words joined or all words separate: it does not find H2 O).
     * word: a word and its variants, singular / plural,
masculine / feminine, with or without accents. For example,
pommel horse also finds pommel horses. This algorithm works in
French and English but not in Dutch (it does not know the
plurals in "en"). Note: the variant you specify is favored in
the sorting of the documents returned.
     * ~word: a word and its synonyms. It works with an English
dictionary, even for searches in French and Dutch! Try the
query ~car -car to see the words found apart from the strict
term. ~arabic returns Egypt, Lebanon, Arab and Hindu...! The
source of the synonyms is not known.
     * "word": an exact word. Google does not take accents into
account for the search but favors the specified form when
sorting the documents returned.
     * "word ... word": an exact series of words (an
expression).
     * "* word word": within a series of words in quotes (and
only there), a star can stand for one or more complete words
that you do not wish to specify. For example: "* Ministry of
Trade and Commerce"
     * site:www...: a domain of origin. It may be more or less
general and may even be a top-level domain. For example:
site:org OR site:com
     * intitle:"word ... word": an exact series of words in the
document title (tag and/or first tag ...)
     * +word: search for the word even if it is a stop word in
the user's language (for example "plus" in French) and take
accents into account (e.g. +dés, French for "dice"). A "+" is
assumed if a single word is searched: thé (French for "tea")
alone is searched as if +thé had been typed. (This form has a
meaning very different from that of Altavista, where the "+"
indicates required words.) When sorting documents, Google
gives preference to the typed form, so the "+" operator no
longer has much interest.
     * word-word: search for a term made up of several words,
whether written with hyphens, with spaces or with no space at
all: gratte-ciel (French for "skyscraper") finds gratte-ciel,
gratte ciel and gratteciel. gratte -ciel does not mean at all
the same thing as gratte-ciel (see the "-" operator). Warning:
va-nu-pied finds va nu pied and vanupied but not va nupied.

Logical operators (Boolean)
     * Space: the documents must contain both what is on the
right and what is on the left (AND). Google's sorting favors
documents in which the specified words are close to each other
(see below).
     * OR or |: the documents may contain what is on the right
or what is on the left. Note: OR must be written in capital
letters!
     * Space followed by - (minus sign): excludes documents
containing the following word (NOT).
     * (...): sub-expression to evaluate before applying the
surrounding operations.
GoogleGuide gives other examples. The HotBot site (United
States) provides a Google search form that is sometimes more
convenient than Google's own.
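
As an illustration, the operators above can be combined
programmatically. The following Python sketch assembles a
query string and a search URL of the form
http://www.google.be/search?hl=fr&q=... that appears later in
this article. The helper name build_query, its parameters and
the sample terms are hypothetical; only the standard library
is used.

```python
from urllib.parse import urlencode


def build_query(all_words=(), any_words=(), excluded=(), phrase=None):
    """Combine terms using the operators described above (hypothetical helper)."""
    parts = list(all_words)                                # space = AND
    if phrase:
        parts.append('"%s"' % phrase)                      # quotes = exact series of words
    if any_words:
        parts.append("(" + " OR ".join(any_words) + ")")   # OR must be in capitals
    parts.extend("-" + word for word in excluded)          # minus sign = NOT
    return " ".join(parts)


query = build_query(
    all_words=["jaguar"],
    any_words=["site:org", "site:com"],
    excluded=["car"],
)

# hl is the two-letter interface language code; q carries the query itself.
url = "http://www.google.be/search?" + urlencode({"hl": "fr", "q": query})
```

Here query is the string jaguar (site:org OR site:com) -car,
and urlencode takes care of escaping the spaces, colons and
parentheses in the URL.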


Restrictions
     * Queries are limited to 32 words.
     * Only the first 1000 results relevant to a query are
available, even if more documents match. The results can
sometimes number fewer than 1000 because of the removal of
multiple pages from the same site. According to Google,
serving more than 1000 results would impose a heavy load for a
need that is actually rather rare.
In theory, sorting ensures that the most useful references
come first (difficult to validate).

Dates
     * When searching by date, the date is that of indexing in
the database (i.e. the visit of the Google "spider") and not
the actual publication date of the page (as provided by the
HTTP server).
     * In the advanced search form, you can search over the
last 3, 6 or 12 months.
     * The operator daterange:Julian date-Julian date (or the
HotBot site's form) specifies another date range. A Julian
date is the number of days elapsed since January 1, 4713 BC,
not since the beginning of our era: the site
http://www.numerical-recipes.com/julian.html can help you
calculate it.
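
The Julian day numbers expected by daterange: can also be
computed with a short Python sketch. It relies on the fact
that date.toordinal() counts days from January 1 of year 1,
and that the Julian day number of January 1, 2000 is 2451545.
The function names are hypothetical, and the hyphen-separated
start-end form of the operator is assumed.

```python
from datetime import date

# Offset between Python's proleptic Gregorian ordinal (day 1 = 0001-01-01)
# and the Julian day number of the same calendar date.
JDN_OFFSET = 1721425


def julian_day(d):
    """Return the Julian day number of a calendar date."""
    return d.toordinal() + JDN_OFFSET


def daterange(start, end):
    """Build a daterange: term from two calendar dates (hypothetical helper)."""
    return "daterange:%d-%d" % (julian_day(start), julian_day(end))


# Pages indexed during January 2005.
term = daterange(date(2005, 1, 1), date(2005, 1, 31))
```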

Sorting results
 The quality of Google comes from its ability to show first the
 pages deemed most relevant, both in general and for a
 particular search. Google sorts the documents found according
 to:
      * Measures of the quality of the site in general and of
 each page (for example, consistency of the meta-information
 with the visible text of the page). These measures are
 undocumented or poorly documented.
      * A measure of the weight of each indexed page: this is
 the PageRank algorithm, described in the following passage
 quoted from Google:
 We assume page A has pages T1...Tn which point to it (i.e.,
 are citations). The parameter d is a damping factor which can
 be set between 0 and 1. We usually set d to 0.85. There are
 more details about d in the next section. Also C(A) is defined
 as the number of links going out of page A. The PageRank of a
 page A is given as follows:
 PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))
 Note that the PageRanks form a probability distribution over
 web pages, so the sum of all web pages' PageRanks will be one.
 PageRank or PR(A) can be calculated using a simple iterative
 algorithm, and corresponds to the principal eigenvector of the
 normalized link matrix of the web. See also: [2]
      * An assessment of the relevance of the page to the
 search carried out. This takes into account:
            o the presence in the page of the search words
 (possibly extended to their synonyms or to their singular /
 plural variants)
            o the location of these words in the page (title,
 metadata, text) or in links to this page: the latter may raise
 ethical problems, because a page can end up indexed by words
 that others, not its authors, use to describe it. (Try:
 "miserable failure"; the author of the target page certainly
 did not intend this description!)
            o a tf-idf formula for each word, which takes into
 account the number of occurrences of the word in the page,
 weighted by the inverse of the relative frequency of this
 word in the part of the web indexed by Google:
                  + tfi = frequency of term i in the page
                  + dfi = number of web pages containing term i
                  + D = number of documents on the Web
                  + This formula was developed by Gerard Salton
 (1927-1995), Cornell University, based on the Information
 Theory of Claude Shannon.
            o the distance between the searched words in the
 page: the closer they are to each other, the more the page is
 considered relevant to the search. See: [3]
      * The country indicated by the Google URL used: google.be
 gives strong preference to Belgian sites, google.fr to French
 sites, google.com to U.S. sites, google.co.uk to British
 sites, etc. It is really important to choose the
 "localization" of one's search. The following page will
 therefore often serve as the start page of a search: [4]
      * The user's language, which is also that of the searched
 words: the only form for specifying it is [5]. The only other
 way to change the user's language is to edit "by hand" the
 Google URL (http://www.google.be/search?hl=fr&q=...) by
 changing the parameter &hl=xx (xx is the two-letter code of
 the desired language).
 When searching, it is essential to change the user language to
 match the language of your search words. Google then sorts the
 documents favoring that language (and will perhaps one day use
 a good dictionary of synonyms). It then uses the appropriate
 algorithm to treat singular and plural, feminine and masculine
 as equivalent (reminder: Dutch seems poorly supported at the
 moment).
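
The tf-idf weighting listed among the sorting criteria above
can be sketched in Python as follows. The article names the
quantities tfi, dfi and D but not the exact formula Google
uses, so the common variant tf * log(D / df) is an assumption
here, as are the function name and the toy figures.

```python
import math


def tf_idf(tf, df, total_docs):
    """Weight of a term in a page: its frequency in the page (tf), scaled by
    the log of the inverse fraction of documents containing it (D / df)."""
    if df == 0:
        return 0.0  # term absent from the collection: no weight
    return tf * math.log(total_docs / df)


# Toy figures: both terms occur 3 times in the page, in a collection of
# 1000 documents; "the" appears in every document, "pagerank" in only 10.
common = tf_idf(tf=3, df=1000, total_docs=1000)
rare = tf_idf(tf=3, df=10, total_docs=1000)
```

A word present in every document gets a weight of zero, while
a rare word with the same number of occurrences in the page
gets a much higher weight.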

Additional functions

Google also offers additional functions:

     * In the headlines: for some keywords related to current
events, the top of the results shows 3 titles of articles from
Google News, with a button to search the headlines.
     * Currency conversion: e.g., in the search field, type:
3 euros in dollars; Google will display: € 3 = x,xxxxx U.S.
dollars (rates provided by Citibank, without guarantee).
     * Google Calculator: in the search field, type a
mathematical formula
     * Machine translation
     * PDF files
     * Page caching: lets you display the page as stored in
Google's database, useful if the page no longer exists
     * Similar pages
     * Links: in the search field, type link:site.com to view
the external pages that point to the specified URL
     * Targeting operators: let you search exclusively within a
single web address. Syntax: "site:site.com your query".
     * I'm Feeling Lucky
     * Definitions: provides one or more (or no) definitions of
a word, taken from various websites (mainly Wikipedia and
Wiktionary, and other sites). This function is now available
in English, French, Spanish, German, Chinese, Italian and
Russian. Syntax: "define:word to define"
     * Google Movies: type a film title to view reviews of that
film (movies:title for reviews in English). On Google Movies,
you can choose between searching for reviews of the desired
film and searching for movie showtimes in cinemas in some
cities.

SearchWiki

Since November 21, 2008, the SearchWiki feature has made it
possible to personalize the Google results page on the English
version. The feature appeared on the French version of Google
on April 28, 2009 [2].

Special characters

Google handles accents written as entities, but not Unicode.
Therefore, a search for "ALKENE" and a search for "alkene" do
not give the same result (a single word is searched with
preference given to the form in which it was written), whereas
searching for "encyclopedia" or "ENCYCLOPEDIA" changes
nothing.

If you type "recipe for the soup * and tomato", Google will
offer basil or pumpkin in place of the star. You can extend a
search to the synonyms of a word by preceding it with the
symbol "~". The "+" forces Google to interpret the word as
written (this is particularly useful for accents in French).

Misuse of Google

The many features of Google have given rise to various
recreational uses by Internet users.

Positioning contests

Many positioning contests have emerged on Google and other
engines. The goal is to place a page, for a more or less
fictional keyword, in the first positions of the search
results. The first important contest was on the query SERPS.
In 2004, a contest on the French expression stork eater was
attended by 170 candidates, and the phrase reached 420 000
queries on Google. Controversies have arisen over the
motivations behind these contests: for some they are useful
tools for gaining experience in SEO, while for others the only
motivation is fun, making Google a simple playground.

Google bomb

Google bombing consists of having many web pages link a chosen
expression to a particular website, so that a Google search on
that expression returns the site in question among the first
results. Bombing campaigns are organized through forums or
blogs, which encourage users to participate. A participant
simply adds to a website or blog a link to the target site,
using the expression as the link text.

One of the first sites to be targeted by a bombing was the
biography of the President of the United States, George Walker
Bush [6], on the White House website. A Google search on the
term "failure" or "miserable failure" gave this page as the
first result, until the Mountain View company made adjustments
to its system that significantly reduced the number of Google
bombs (see below).

During autumn 2005, in retaliation for a massive email
campaign launched by Nicolas Sarkozy's political party,
webmasters called for a Google bombing on the name of the
Minister of the Interior. As a result, typing Nicolas Sarkozy
[7] in Google returned in second place a link to Iznogoud, the
cartoon character who wants to be caliph instead of the
caliph. The bombing consists of placing on a web page a link
(to Iznogoud or George Bush) and associating it with a text
(Nicolas Sarkozy or miserable failure). If the operation is
carried out by enough webmasters, the misleading links quickly
rise to the first results of Google.

At the end of January 2007, Google announced that it had
developed an algorithm to solve the problem of "Google
bombing" in any language. Now "miserable failure" leads to a
page explaining "Google bombing".

Google Fight

Google Fight consists of comparing the number of results
returned by Google for several expressions: the expression
with the most results is declared the winner. Internet users
have fun comparing names, ideas, policies, etc. A website has
even been created to provide an interface for this type of
"fight" [8].

Since January 2006, the Google team has been intercepting
Google Fight queries and returning fanciful results. You can
check this by querying the site several times with the same
pair of names.

Google Whacks

Google Whack is a game that consists of finding two words
which, combined in a Google search, give a single result. The
terms must exist in the dictionary, and the site found must
not be a simple list of words. Quotes and punctuation must not
be used. The score is often calculated by multiplying the
number of results for the first word by the number of results
for the second word. [9]
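
The scoring rule can be written as a one-line computation. In
this Python sketch the function name and the result counts are
invented for illustration; real counts would have to be read
from Google's result pages.

```python
def whack_score(first_word_results, second_word_results):
    """Score of a Google Whack: the product of the result counts obtained
    when each of the two words is searched on its own."""
    return first_word_results * second_word_results


# Invented counts for two words searched separately.
score = whack_score(120_000, 45_000)
```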

"Fake Google"

There are search engines that are copies of Google in a
minority, non-official language. Most of the time they are
created for humorous purposes.

     * Google in Ch'ti: Gogol
     * Google in Walloon: Gôgueule
     * Google in West Flemish: Hoegel
     * In the "présipauté" of Groland: Grögler

Limitations and errors from Google

The main limitation of Google is that the engine covers only
the visible web, leaving aside all professional databases,
sometimes enormous and often relevant, whose access is
restricted (but sometimes free). Example: Dialog (15 000 GB).

Studies show internal limits of Google, such as large
variations in the number of results announced for identical
searches at certain periods [3], or inconsistencies when
comparing the results of certain searches, due to technical
limitations [4] [5] [6].

The size of the database

Several studies have shown that the number of pages actually
indexed is only half the number announced: the other half
would be pages visited by Google's robot, but of which only a
part (the header, without the body of the page) would be
indexed. These pages are mostly non-English pages, because
AdWords technology, which is Google's main source of funding,
is used only for English.

The notion of index size has been, and remains, a marketing
argument for the major search engines. In late 2005, following
a critical analysis [5] of the size of its index, begun in
January 2005 by Jean Veronis, Google decided to stop putting
that argument forward. [citation needed]

As an example of this marketing approach, Google announced a
doubling of the size of its index the day after the launch of
MSN Search [citation needed].

The effectiveness of searches

For a search of medium complexity (using one Boolean operator,
that is to say a space [AND operator]), the number of results
can vary up to threefold within the same day and, in some
cases, by an order of magnitude ranging from one to ten.
Sometimes the search does not take the requested operators
into account.

This variability in the number of responses reflects the
architecture of Google. There are several servers scattered
around the world, each hosting the index of the pages visited
by Google. Depending on the location of a user (or on the
local Google site queried), the query is directed to one or
another of these servers. Normally each index is identical to
the others, but as they are not synchronized in real time (but
at intervals exceeding one month), only the main index,
located in California, is constantly updated and gives the
maximum number of correct answers. The main server can thus
give ten times more responses than a secondary server.

Rating engine

According to Jean Veronis [7], Yahoo! and Google are the two
best engines (among six major French-language engines). For
the author, these two engines have equivalent performance; the
reason for users' massive preference for Google is the
relevance of results.

But according to Trent, Google could rank below Windows Live
Search [citation needed].

Controversies
Controversy about the influence of the content of the results
displayed

By becoming the leading search engine in terms of use, Google
has become the leading vehicle of information on the Internet.
This role of conveying information is inherent in the business
of search engines, and the resulting problems are not all
attributable to Google, which is not the author of the content
of the pages.

Beyond the difficulties posed by the strategic importance of
Google ranking in the economic field, the real problem lies in
the strong ideological influence of the pages that appear
among the first results, which tend to be taken as gospel. The
popularity of a search engine such as Google can be used as a
vehicle for misinformation, a site's influence being all the
greater when the keyword is popular and the site tops the
list. Google executives admit [8] that they are powerless
against the disinformation and defamation that currently
appear among the first Google results, since the technology
cannot judge the sincerity of information.

The Tiananmen case

The leaders of the People's Republic of China, embarrassed
that a search for Tiananmen on Google Images returned photos
of tanks suppressing the student revolt, obtained from Google
in 2006 that the query "Tiananmen" on Google's Chinese portal
would no longer return those images [citation needed].

The BMW Germany case

Following attempts by BMW Germany and its SEO provider to
increase its PageRank (and therefore the ranking of links to
BMW for car-related queries in Google), the car company was
blacklisted by Google and removed from its index in January
2006. A search for "BMW" then returned only references to its
worldwide site [10].

The keywords case in France

In 2005, the UMP, and especially Nicolas Sarkozy, were
criticized for having bought dozens of keywords such as
"riot", "CPE", "Jack Lang"... that referred to the UMP's site.

Controversy over the number of results displayed

When the number of pages found is too large, only the first
1 000 are viewable, which is a reasonable limit adopted by
most search engines. However, some users suspect that the
number of pages found is artificially "inflated" when it
exceeds this limit. This hypothesis is based on two facts:

     * Google sometimes displays a number of pages larger than
the number of web pages it indexes (for example with a query
on a word used in all English pages, such as the definite
article "the");
     * When several successive searches are made on the same
keyword, the result varies. This reflects the number of
servers used by Google, each server having recorded a
different number of pages for the same query.

See for example the message
<449d92eb$0$1002$ba4acef3@news.orange.fr> on the Usenet group
fr.sci.physique and [11]

Notes and references

    1. "Who's afraid of Google", G.F., Challenges, no. 180,
September 17, 2009, p. 49
    2. New Google: customizing results
    3. http://www.bases-publications.com/revues/netsources/e-docs/00/00/02/C9/document_article.phtml,
Bases Publications article
    4. (en) Mark Liberman, "Google recall (They stole his mind,
he now wants it back.)", January 24, 2005
    5. Jean Veronis, "Accounts cans at google", January 26,
2005
    6. Jean Veronis, "Web: The mystery of the missing pages of
Google solved", February 8, 2005
    7. "Comparative study of six search engines" [PDF],
February 2006
    8. Interview with Eric Schmidt, Defining Google, television
documentary broadcast by CBS News (January 2005)

See also
Related articles

     * ElgooG, a humorous mirror site of Google.
     * Link farm, a method of manipulating the Google search
engine
     * SearchMash, Google's "experimental" engine (discontinued
since November 2008)

External links

     * Google.com
     * Google.fr
     * Special Features of Google
     * (en) Experimental Features
