Page Rank Tips
PageRank is a numeric value that represents how important a page is on the web.
Google figures that when one page links to another page, it is effectively casting a vote for the other page. The more votes that are cast for a page, the more important the page must be. Also, the importance of the page that is casting the vote determines how important the vote itself is. Google calculates a page's importance from the votes cast for it, and the importance of each vote is taken into account when a page's PageRank is calculated.
PageRank is Google's way of deciding a page's importance. It matters because it is one of the factors that determines a page's ranking in the search results. It isn't the only factor that Google uses to rank pages, but it is an important one.
Last Updated - 12th November 2005
How is PageRank Used?
PageRank is one of the methods Google uses to determine a page's relevance or importance. It is only one part of the story when it comes to the Google listing.
PageRank is also displayed on the toolbar of your browser if you've installed the Google toolbar (http://toolbar.google.com/). But the Toolbar PageRank only goes from 0 to 10 and seems to be something like a logarithmic scale:
Toolbar PageRank (log base 10)    Real PageRank
0                                 0 - 10
1                                 100 - 1,000
2                                 1,000 - 10,000
3                                 10,000 - 100,000
4                                 and so on...

We can't know the exact details of the scale because, as we'll see later, the maximum PR of all pages on the web changes every month when Google does its re-indexing! If we presume the scale is logarithmic (although there is only anecdotal evidence for this at the time of writing) then Google could simply give the highest actual PR page a toolbar PR of 10 and scale the rest appropriately.
Also the toolbar sometimes guesses! The toolbar often shows a Toolbar PR for pages that have only just been uploaded and cannot possibly be in the index yet!
What seems to be happening is that the toolbar looks at the URL of the page the browser is displaying and strips off everything back to the last "/" (i.e. it goes to the "parent" page in URL terms). If Google has a Toolbar PR for that parent then it subtracts 1 and shows that as the Toolbar PR for this page. If there's no PR for the parent it goes to the parent's parent's page, but subtracting 2, and so on all the way up to the root of your site. If it can't find a Toolbar PR to display in this way, that is if it doesn't find a page with a real calculated PR, then the bar is greyed out.
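The guessing behaviour described above can be sketched in a few lines of Python. This is only an illustration of the rule as observed from the outside, not the toolbar's actual code: the function names, the `known_pr` lookup table, the `example.com` URLs and the decision to stop at 0 rather than go negative are all assumptions made for the example.

```python
from urllib.parse import urlsplit


def parent_urls(url):
    """Yield the parent URLs of a page, nearest first, ending at the site root."""
    parts = urlsplit(url)
    base = f"{parts.scheme}://{parts.netloc}"
    path = parts.path or "/"
    while path != "/":
        # Strip everything after the previous "/" to reach the parent.
        path = path.rstrip("/").rsplit("/", 1)[0] + "/"
        yield base + path


def guess_toolbar_pr(url, known_pr):
    """Return (toolbar_pr, is_guess); (None, False) means the bar is greyed out.

    known_pr is a hypothetical table of URLs that already have a Toolbar PR.
    """
    if url in known_pr:
        return known_pr[url], False
    for levels_up, parent in enumerate(parent_urls(url), start=1):
        if parent in known_pr:
            # Subtract 1 per level climbed; clamping at 0 is an assumption.
            return max(known_pr[parent] - levels_up, 0), True
    return None, False


known_pr = {"http://example.com/": 5}
print(guess_toolbar_pr("http://example.com/docs/new.html", known_pr))
print(guess_toolbar_pr("http://example.com/", known_pr))
print(guess_toolbar_pr("http://other.example/page.html", known_pr))
```

The first call climbs two levels (to `/docs/`, then to the root) before it finds a known value, so it reports a guess of 5 - 2 = 3.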
Note that if the Toolbar is guessing in this way, the Actual PR of the page is 0 - though its PR will be calculated shortly after the Google spider first sees it.
PageRank says nothing about the content or size of a page, the language it's written in, or the text used in the anchor of a link!
Definitions
- PR: Shorthand for PageRank: the actual, real, page rank for each page as calculated by Google. As we'll see later this can range from 0.15 to billions.
- Toolbar PR: The PageRank displayed in the Google toolbar in your browser. This ranges from 0 to 10.
- Backlink: If page A links out to page B, then page B is said to have a backlink from page A.
What is PageRank?
In short PageRank is a "vote", by all the other pages on the Web, about how important a page is. A link to a page counts as a vote of support. If there's no link there's no support (but it's an abstention from voting rather than a vote against the page).
Quoting from the original Google paper, PageRank is defined like this:
We assume page A has pages T1...Tn which point to it (i.e., are citations). The parameter d is a damping factor which can be set between 0 and 1. We usually set d to 0.85. There are more details about d in the next section. Also C(A) is defined as the number of links going out of page A. The PageRank of a page A is given as follows:
PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))
Note that the PageRanks form a probability distribution over web pages, so the sum of all web pages' PageRanks will be one.
PageRank or PR(A) can be calculated using a simple iterative algorithm, and corresponds to the principal eigenvector of the normalized link matrix of the web.
But that's not too helpful so let's break it down into sections.
- PR(Tn) - Each page has a notion of its own self-importance. That's "PR(T1)" for the first page in the web all the way up to "PR(Tn)" for the last page
- C(Tn) - Each page spreads its vote out evenly amongst all of its outgoing links. The count, or number, of outgoing links for page 1 is "C(T1)", "C(Tn)" for page n, and so on for all pages.
- PR(Tn)/C(Tn) - so if our page (page A) has a backlink from page "n" the share of the vote page A will get is "PR(Tn)/C(Tn)"
- d(... - All these fractions of votes are added together but, to stop the other pages having too much influence, this total vote is "damped down" by multiplying it by 0.85 (the factor "d")
- (1 - d) - The (1 - d) bit at the beginning is a bit of probability math magic so the "sum of all web pages' PageRanks will be one": it adds in the bit lost by the d(.... It also means that if a page has no links to it (no backlinks) even then it will still get a small PR of 0.15 (i.e. 1 - 0.85). (Aside: the Google paper says "the sum of all pages" but they mean "the normalised sum", otherwise known as "the average" to you and me.)
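The pieces broken down above fit together in a single line of arithmetic. As a minimal sketch in Python (the function name and the way backlinks are passed in are choices made for this illustration, and the PR values fed in are made-up numbers):

```python
def page_rank_of(backlinks, d=0.85):
    """PR(A) = (1 - d) + d * (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn)).

    backlinks holds one (PR(T), C(T)) pair for every page T that links to A.
    """
    votes = sum(pr / outgoing_count for pr, outgoing_count in backlinks)
    return (1 - d) + d * votes


# A page nobody links to still gets the (1 - d) share.
no_backlinks = page_rank_of([])

# A page with two backlinks: one from a page with PR 1.0 and two outgoing
# links, one from a page with PR 0.4 and a single outgoing link.
two_backlinks = page_rank_of([(1.0, 2), (0.4, 1)])

print(round(no_backlinks, 4))   # 0.15
print(round(two_backlinks, 4))  # 0.15 + 0.85 * (0.5 + 0.4) = 0.915
```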
How is PageRank Calculated?
This is where it gets tricky. The PR of each page depends on the PR of the pages pointing to it. But we won't know what PR those pages have until the pages pointing to them have their PR calculated, and so on. And when you consider that page links can form circles it seems impossible to do this calculation!
But actually it's not that bad. Remember this bit of the Google paper:
PageRank or PR(A) can be calculated using a simple iterative algorithm, and corresponds to the principal eigenvector of the normalized link matrix of the web.
What that means to us is that we can just go ahead and calculate a page's PR without knowing the final value of the PR of the other pages. That seems strange but, basically, each time we run the calculation we're getting a closer estimate of the final value. So all we need to do is remember each value we calculate and repeat the calculations lots of times until the numbers stop changing much.
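Here is one way that "repeat until the numbers stop changing much" idea might look in Python. It is a sketch for small hand-built networks, not a description of how Google does it: the `links` dictionary format, the tolerance, the starting guess of 1.0 and the three-page network are all assumptions for the example.

```python
def pagerank(links, d=0.85, start=1.0, tol=1e-6, max_sweeps=10_000):
    """Repeat the PR formula until no value changes by more than tol.

    links maps every page to the list of pages it links out to (every page
    must appear as a key, even if its list is empty).
    Returns (pr, sweeps): the settled values and how many passes it took.
    """
    pr = {page: start for page in links}
    for sweep in range(1, max_sweeps + 1):
        biggest_change = 0.0
        for page in links:
            # Each page T that links here contributes PR(T) / C(T).
            votes = sum(pr[src] / len(out) for src, out in links.items() if page in out)
            new_value = (1 - d) + d * votes
            biggest_change = max(biggest_change, abs(new_value - pr[page]))
            pr[page] = new_value  # remember the newest guess and reuse it at once
        if biggest_change < tol:
            return pr, sweep
    raise RuntimeError("the numbers did not settle")


# A made-up three-page network with a circle of links in it.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
pr, sweeps = pagerank(links)
for page, value in pr.items():
    print(f"PR({page}) = {value:.4f}")
print(f"settled after {sweeps} passes, total PR = {sum(pr.values()):.4f}")
```

Even though A, B and C all depend on each other, the values settle, and the total comes out at 3.0 (an average of 1.0 per page).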
Let's take the simplest example network: two pages, each pointing to the other:
Page A <-> Page B
Each page has one outgoing link (the outgoing count is 1, i.e. C(A) = 1 and C(B) = 1).
Eg. 1
We don't know what their PR should be to begin with, so let's take a guess at 1.0 and do some calculations:
d = 0.85
PR(A) = (1 - d) + d(PR(B)/1)
PR(B) = (1 - d) + d(PR(A)/1)
i.e.
PR(A) = 0.15 + 0.85 * 1 = 1
PR(B) = 0.15 + 0.85 * 1 = 1
Hmm, the numbers aren't changing at all! So it looks like we started out with a lucky guess!!!
Eg. 2
No, that's too easy. OK, let's start the guess at 0 instead and re-calculate:
PR(A) = 0.15 + 0.85 * 0 = 0.15
PR(B) = 0.15 + 0.85 * 0.15 = 0.2775
NB. we've already calculated a "next best guess" at PR(A) so we use it here
And again:
PR(A) = 0.15 + 0.85 * 0.2775 = 0.385875
PR(B) = 0.15 + 0.85 * 0.385875 = 0.47799375
And again:
PR(A) = 0.15 + 0.85 * 0.47799375 = 0.5562946875
PR(B) = 0.15 + 0.85 * 0.5562946875 = 0.622850484375
and so on. The numbers just keep going up. But will the numbers stop increasing when they get to 1.0? What if a calculation over-shoots and goes above 1.0?
Eg. 3
Well, let's see. Let's start the guess at 40 each and do a few cycles:
PR(A) = 40
PR(B) = 40
First calculation:
PR(A) = 0.15 + 0.85 * 40 = 34.15
PR(B) = 0.15 + 0.85 * 34.15 = 29.1775
And again:
PR(A) = 0.15 + 0.85 * 29.1775 = 24.950875
PR(B) = 0.15 + 0.85 * 24.950875 = 21.35824375
Yup, those numbers are heading down alright! It sure looks like the numbers will get to 1.0 and stop.
Principle: it doesn't matter where you start your guess, once the PageRank calculations have settled down, the "normalized probability distribution" (the average PageRank for all pages) will be 1.0
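To check the principle without doing the arithmetic by hand, the two-page calculation from Eg. 1 to 3 can be run to completion from each of the three starting guesses. This is a small sketch; the function name and the stopping tolerance are choices made for the example.

```python
def settle_two_pages(start, d=0.85, tol=1e-9):
    """Repeat the Page A <-> Page B calculation from a given starting guess."""
    pr_a = pr_b = start
    cycles = 0
    while True:
        new_a = (1 - d) + d * pr_b / 1   # C(B) = 1
        new_b = (1 - d) + d * new_a / 1  # uses the fresh PR(A), as in Eg. 2
        cycles += 1
        if abs(new_a - pr_a) < tol and abs(new_b - pr_b) < tol:
            return new_a, new_b, cycles
        pr_a, pr_b = new_a, new_b


results = {start: settle_two_pages(start) for start in (0.0, 1.0, 40.0)}
for start, (pr_a, pr_b, cycles) in results.items():
    print(f"start at {start:4.1f}: PR(A) = {pr_a:.6f}, PR(B) = {pr_b:.6f} after {cycles} cycles")
```

All three runs end at 1.0 for both pages; only the number of cycles needed to get there differs.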
Getting the answer quicker
How many times do we need to repeat the calculation for big networks? That's a difficult question; for a network as large as the World Wide Web every pass over the whole link graph is expensive, so the number of iterations matters. The damping factor plays a part here. The closer it is to 1, the longer the numbers take to settle; a lower value settles faster, but then each page's PR depends less on its backlinks and more on the flat (1 - d) share that every page gets.
Also choosing the order of calculations can help. The answer will always come out the same no matter which order you choose, but some orders will get you there quicker than others.
The examples below use very simple code for clarity, and roughly 20 to 40 iterations are needed!
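The point about calculation order can be tried out on a small network. The sketch below assumes a made-up loop of four pages (A links to B, B to C, C to D, D back to A), starts every guess at 0, and counts the passes needed when pages are updated in the direction the links point versus the opposite direction. The function name and tolerance are choices made for the example.

```python
def sweeps_to_settle(links, order, d=0.85, tol=1e-6):
    """Run the PR calculation, visiting pages in the given order on every pass."""
    pr = {page: 0.0 for page in links}
    sweeps = 0
    while True:
        sweeps += 1
        biggest_change = 0.0
        for page in order:
            votes = sum(pr[src] / len(out) for src, out in links.items() if page in out)
            new_value = (1 - d) + d * votes
            biggest_change = max(biggest_change, abs(new_value - pr[page]))
            pr[page] = new_value  # later pages in this pass see the new value
        if biggest_change < tol:
            return pr, sweeps


loop = {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["A"]}
with_links, fast = sweeps_to_settle(loop, ["A", "B", "C", "D"])
against_links, slow = sweeps_to_settle(loop, ["D", "C", "B", "A"])

print(f"following the links: {fast} passes")
print(f"against the links:   {slow} passes")
```

Both orders end with every page at 1.0, but following the links lets each fresh value travel all the way round the loop within a single pass, so it settles in noticeably fewer passes.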
Example 1
![]()
So the correct PR for the example is:
![]()
You can see it took about 20 iterations before the network began to settle on these values!
Look at Page D though - it has a PR of 0.15 even though no-one is voting for it (i.e. it has no incoming links)! Is this right?
The first part, or "term" to be technical, of the PR equation is doing this:
PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))
So, for Page D, no backlinks means the equation looks like this:
PR(D) = (1-d) + d * (0) = 0.15
no matter what else is going on or how many times you do it.
Observation: every page has at least a PR of 0.15 to share out. But this may only be in theory - there are rumours that Google undergoes a post-spidering phase whereby any pages that have no incoming links at all are completely deleted from the index...
Example 2
A simple hierarchy with some outgoing links
![]()
As you'd expect, the home page has the most PR - after all, it has the most incoming links! But what's happened to the average? It's only 0.378!!!
Take a look at the "external site" pages - what's happening to their PageRank? They're not passing it on, they're not voting for anyone, they're wasting their PR.
Example 3
Let's link those external sites back into our home page just so we can see what happens to the average
![]()
That's better - it does work after all! And look at the PR of our home page! All those incoming links sure make a difference.
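Examples 2 and 3 can be approximated with a short script. The exact wiring of the original diagrams is not reproduced in this text, so the network below is an assumption: a home page with three children, one of which (called "Links" here) also points at four external pages. It was chosen because it gives the 0.378 average quoted for Example 2.

```python
def pagerank(links, d=0.85, tol=1e-9):
    """Repeat the PR formula until the values settle; links maps page -> outgoing links."""
    pr = {page: 1.0 for page in links}
    while True:
        biggest_change = 0.0
        for page in links:
            votes = sum(pr[src] / len(out) for src, out in links.items() if page in out)
            new_value = (1 - d) + d * votes
            biggest_change = max(biggest_change, abs(new_value - pr[page]))
            pr[page] = new_value
        if biggest_change < tol:
            return pr


def average(pr):
    return sum(pr.values()) / len(pr)


externals = ["Site A", "Site B", "Site C", "Site D"]

# Example 2 (assumed wiring): the external pages are dead ends.
example_2 = {
    "Home": ["About", "Product", "Links"],
    "About": ["Home"],
    "Product": ["Home"],
    "Links": ["Home"] + externals,
    **{site: [] for site in externals},
}

# Example 3: the same network, but every external page links back to Home.
example_3 = {**example_2, **{site: ["Home"] for site in externals}}

pr_2 = pagerank(example_2)
pr_3 = pagerank(example_3)
avg_2 = average(pr_2)
avg_3 = average(pr_3)
print(f"Example 2: Home = {pr_2['Home']:.2f}, average = {avg_2:.3f}")
print(f"Example 3: Home = {pr_3['Home']:.2f}, average = {avg_3:.3f}")
```

With the dead ends in place the average sinks to about 0.378; once every page passes its vote on, the average returns to 1.0 and the home page collects most of it.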
Example 4
What happens to PR if we follow a suggestion about writing page reviews?
![]()
![]()
Example 5
A simple hierarchy
![]()
Our home page has 2 and a half times as much PR as the child pages! Excellent!
Observation: a hierarchy concentrates votes and PR into one page
Example 6
Looping
![]()
This is what we'd expect. All the pages have the same number of incoming links, all pages are of equal importance to each other, all pages get the same PR of 1.0 (i.e. the "average" probability).
Example 7
Extensive Interlinking or Fully Meshed
![]()
Yes, the results are the same as the Looping example above and for the same reasons.
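Both layouts are easy to rebuild and test. The sketch below assumes four pages named Home, About, Product and More (the names used elsewhere in these examples); the loop sends each page's single link to the next page round, and the mesh links every page to every other page.

```python
def pagerank(links, d=0.85, tol=1e-9):
    """Repeat the PR formula until the values settle; links maps page -> outgoing links."""
    pr = {page: 0.0 for page in links}  # start from 0 so the values have to climb
    while True:
        biggest_change = 0.0
        for page in links:
            votes = sum(pr[src] / len(out) for src, out in links.items() if page in out)
            new_value = (1 - d) + d * votes
            biggest_change = max(biggest_change, abs(new_value - pr[page]))
            pr[page] = new_value
        if biggest_change < tol:
            return pr


pages = ["Home", "About", "Product", "More"]

# Looping: Home -> About -> Product -> More -> Home.
looped = {page: [pages[(i + 1) % len(pages)]] for i, page in enumerate(pages)}

# Fully meshed: every page links to every other page.
meshed = {page: [other for other in pages if other != page] for page in pages}

pr_looped = pagerank(looped)
pr_meshed = pagerank(meshed)
print({page: round(value, 3) for page, value in pr_looped.items()})
print({page: round(value, 3) for page, value in pr_meshed.items()})
```

In both cases every page ends up at 1.0: no page receives more of the vote than any other.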
Example 8
Hierarchical but with a link in and one out.
We'll assume there's an external site that has lots of pages and links with the result that one of the pages has the average PR of 1.0. We'll also assume the webmaster really likes us - there's just one link from that page and it's pointing at our home page.
![]()
In example 5 the home page only had a PR of 1.92 but now it is 3.31! Excellent! Not only has site A contributed 0.85 PR to us, but the raised PR in the "About", "Product" and "More" pages has had a lovely "feedback" effect, pushing up the home page's PR even further!
Principle: a well structured site will amplify the effect of any contributed PR
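The 1.92 and 3.31 figures can be reproduced with a short script. The wiring is an assumption, since the diagrams are not reproduced in this text: Example 5 is taken to be a home page with three children that each link straight back, and Example 8 adds an external page (Site A, PR held at 1.0) linking to Home plus one outgoing link from the "More" page to an external Site B. Which child carries the outgoing link is a guess; by symmetry any of the three gives the same home page figure.

```python
def pagerank(links, fixed=None, d=0.85, tol=1e-9):
    """Repeat the PR formula until the values settle.

    links maps every page to the pages it links out to.
    fixed maps external pages to a PR that is assumed rather than calculated.
    """
    fixed = fixed or {}
    pr = {page: fixed.get(page, 1.0) for page in links}
    while True:
        biggest_change = 0.0
        for page in links:
            if page in fixed:
                continue  # leave the assumed external PR alone
            votes = sum(pr[src] / len(out) for src, out in links.items() if page in out)
            new_value = (1 - d) + d * votes
            biggest_change = max(biggest_change, abs(new_value - pr[page]))
            pr[page] = new_value
        if biggest_change < tol:
            return pr


example_5 = {
    "Home": ["About", "Product", "More"],
    "About": ["Home"],
    "Product": ["Home"],
    "More": ["Home"],
}
example_8 = {
    **example_5,
    "Site A": ["Home"],          # the friendly external page
    "More": ["Home", "Site B"],  # assumed location of the one outgoing link
    "Site B": [],
}

home_5 = pagerank(example_5)["Home"]
home_8 = pagerank(example_8, fixed={"Site A": 1.0})["Home"]
print(f"Example 5 home page: {home_5:.2f}")
print(f"Example 8 home page: {home_8:.2f}")
```

Under these assumptions the script prints 1.92 and 3.31. Site A's link hands over only 0.85, yet the home page gains about 1.4: the rest is the feedback through the child pages.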
Example 9
Looping but with a link in and a link out
![]()
Well, the PR of our home page has gone up a little, but what's happened to the "More" page?
The vote of the "Product" page has been split evenly between it and the external site. We now value the external Site B equally with our "More" page. The "More" page is getting only half the vote it had before - this is good for Site B but very bad for us!
Example 10
Fully meshed but with one vote in and one vote out
![]()
That's much better. The "More" page is still getting less share of the vote than in example 7 of course, but now the "Product" page has kept three quarters of its vote within our site - unlike example 9 where it was giving away fully half of its vote to the external site!
Keeping just this small extra fraction of the vote within our site has had a very nice effect on the Home Page too - PR of 2.28 compared with just 1.66 in example 9.
Observation: increasing the internal links in your site can minimise the damage to your PR when you give away votes by linking to external sites.
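The 1.66 and 2.28 home page figures can be checked with the script below. As before the wiring is an assumption, because the diagrams are not reproduced here: an external Site A (PR held at 1.0) links to Home, and the "Product" page carries the one link out to Site B, first in the looped layout and then in the fully meshed one.

```python
def pagerank(links, fixed=None, d=0.85, tol=1e-9):
    """Repeat the PR formula until the values settle.

    links maps every page to the pages it links out to.
    fixed maps external pages to a PR that is assumed rather than calculated.
    """
    fixed = fixed or {}
    pr = {page: fixed.get(page, 1.0) for page in links}
    while True:
        biggest_change = 0.0
        for page in links:
            if page in fixed:
                continue
            votes = sum(pr[src] / len(out) for src, out in links.items() if page in out)
            new_value = (1 - d) + d * votes
            biggest_change = max(biggest_change, abs(new_value - pr[page]))
            pr[page] = new_value
        if biggest_change < tol:
            return pr


internal = ["Home", "About", "Product", "More"]

# Looping with a link in and a link out (assumed wiring).
looped = {
    "Site A": ["Home"],
    "Home": ["About"],
    "About": ["Product"],
    "Product": ["More", "Site B"],  # half of Product's vote leaves the site
    "More": ["Home"],
    "Site B": [],
}

# Fully meshed with the same link in and link out (assumed wiring).
meshed = {page: [other for other in internal if other != page] for page in internal}
meshed["Product"] = meshed["Product"] + ["Site B"]  # only a quarter leaves now
meshed["Site A"] = ["Home"]
meshed["Site B"] = []

pr_looped = pagerank(looped, fixed={"Site A": 1.0})
pr_meshed = pagerank(meshed, fixed={"Site A": 1.0})
print(f"looped: Home = {pr_looped['Home']:.2f}, Site B = {pr_looped['Site B']:.2f}")
print(f"meshed: Home = {pr_meshed['Home']:.2f}, Site B = {pr_meshed['Site B']:.2f}")
```

Under these assumptions the home page comes out at 1.66 in the looped layout and 2.28 in the meshed one, and Site B receives less in the meshed layout because Product's vote is split four ways instead of two.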
Principle:
- If a particular page is highly important - use a hierarchical structure with the important page at the "top".
- Where a group of pages may contain outward links - increase the number of internal links to retain as much PR as possible.
- Where a group of pages do not contain outward links - the number of internal links in the site has no effect on the site's average PR. You might as well use a link structure that gives the user the best navigational experience.
Site Maps
Site maps are useful in at least two ways:
- If a user types in a bad URL most websites return a really unhelpful "404 - page not found" error page. This can be discouraging. Why not configure your server to return a page that shows an error has been made, but also gives the site map? This can help the user enormously
- Linking to a site map on each page increases the number of internal links in the site, spreading the PR out and protecting you against your vote "donations"
Example 11
Let's try to fix our site to artificially concentrate the PR into the home page.
![]()
That looks good, most of the links seem to be pointing up to page A so we should get a nice PR.
Try to guess what the PR of A will be before you scroll down or run the code.
![]()
Oh dear, that didn't work at all well - it's much worse than just an ordinary hierarchy! What's going on is that pages C and D have such weak incoming links that they're no help to page A at all!
Principle: trying to abuse the PR calculation is harder than you think.
Example 12
A common web layout for long documentation is to split the document into many pages with a "Previous" and "Next" link on each plus a link back to the home page. The home page then only needs to point to the first page of the document.
![]()
In this simple example, where there's only one document, the first page of the document has a higher PR than the Home Page! This is because page B is getting all the vote from page A, but page A is only getting fractions of pages B, C and D.
Principle: in order to give users of your site a good experience, you may have to take a hit against your PR. There's nothing you can do about this - and neither should you try to or worry about it! If your site is a pleasure to use lots of other webmasters will link to it and you'll get back much more PR than you lost.
Can you also see the trend between this and the previous example? As you add more internal links to a site it gets closer to the Fully Meshed example where every page gets the average PR for the mesh.
Observation: as you add more internal links in your site, the PR will be spread out more evenly between the pages.
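The documentation layout can be modelled in a few lines. The sketch assumes a home page A and a three-page document B, C, D: A links only to B, and every document page links back to A plus whichever "Previous" and "Next" pages exist. The number of document pages is an assumption made for the example.

```python
def pagerank(links, d=0.85, tol=1e-9):
    """Repeat the PR formula until the values settle; links maps page -> outgoing links."""
    pr = {page: 1.0 for page in links}
    while True:
        biggest_change = 0.0
        for page in links:
            votes = sum(pr[src] / len(out) for src, out in links.items() if page in out)
            new_value = (1 - d) + d * votes
            biggest_change = max(biggest_change, abs(new_value - pr[page]))
            pr[page] = new_value
        if biggest_change < tol:
            return pr


document = {
    "A": ["B"],            # home page: points only at the first page
    "B": ["A", "C"],       # first page: Home, Next
    "C": ["A", "B", "D"],  # middle page: Home, Previous, Next
    "D": ["A", "C"],       # last page: Home, Previous
}
pr = pagerank(document)
for page, value in pr.items():
    print(f"PR({page}) = {value:.2f}")
```

Page B receives the whole of A's vote, while A only receives a half or a third of each document page's vote, so B ends up ahead of A even though A has more incoming links.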
Example 13
Getting high PR the wrong way and the right way.
Just as an experiment, let's see if we can get 1,000 pages pointing to our home page, but only have one link leaving it
![]()
Yup, those spam pages are pretty worthless but they sure add up!
Observation: it doesn't matter how many pages you have in your site, your average PR will always be 1.0 at best. But a hierarchical layout can strongly concentrate votes, and therefore the PR, into the home page!
This is a technique used by some disreputable sites (mostly adult content sites). But if Google's robots decide you're doing this there's a good chance you'll be banned from Google! Disaster!
On the other hand there are at least two right ways to do this:
- Be a Mega-site - Mega-sites, like http://news.bbc.co.uk have tens or hundreds of editors writing new content - i.e. new pages - all day long! Each one of those pages has rich, worthwhile content of its own and a link back to its parent or the home page! That's why the Home page Toolbar PR of these sites is 9/10 and the rest of us just get pushed lower and lower by comparison
  - Principle: Content Is King! There really is no substitute for lots of good content
- Give away something useful - www.phpbb.com has a Toolbar PR of 8/10 (at the time of writing) and it has no big money or marketing behind it! How can this be?
  - What the group has done is write a very useful bulletin board system that is becoming very popular on many websites. And at the bottom of every page, in every installation, is this HTML code:
    Powered by <a href="http://www.phpbb.com/" target="_blank">phpBB</a>
    The administrator of each installation can remove that link, but most don't because they want to return the favour
  - Can you imagine all those millions of pages giving a fraction of a vote to www.phpbb.com? Wow!
  - Principle: Make it worth other people's while to use your content or tools. If your give-away is good enough other site admins will gladly give you a link back.
Principle: it's probably better to get lots (perhaps thousands) of links from sites with small PR than to spend any time or money desperately trying to get just the one link from a high PR page.
A Discussion on Averages
The average Actual PR of all pages in the index is 1.0!
So if you add pages to a site you're building the total PR will go up by 1.0 for each page (but only if you link the pages together so the equation can work), but the average will remain the same.
If you want to concentrate the PR into one, or a few, pages then hierarchical linking will do that. If you want to average out the PR amongst the pages then "fully meshing" the site (lots of evenly distributed links) will do that - see examples 5, 6, and 7 above.
Getting inbound links to your site is the only way to increase your site's average PR. How that PR is distributed amongst the pages on your site depends on the details of your internal linking and which of your pages are linked to.
If you give outbound links to other sites then your site's average PR will decrease (you're not keeping your vote "in house" as it were). Again the details of the decrease will depend on the details of the linking.
Given that the average PR of all pages is 1.0 we can see that for every site that has an actual ranking in the millions (and there are some!) there must be lots and lots of sites whose Actual PR is below 1.0 (particularly because the absolute lowest Actual PR available is (1 - d)).
It may be that the Toolbar PR 1, 2 correspond to Actual PRs lower than 1.0! E.g. the log base for the Toolbar may be 10 but the Actual PR sequence could start quite low: 0.01, 0.1, 1, 10, 100, 1,000 etc...
PageRank is, in fact, very simple (apart from one scary looking formula). But when a simple calculation is applied hundreds (or billions) of times over the results can seem complicated.
PageRank is also only part of the story about what results get displayed high up in a Google listing. For example there's some evidence to suggest that Google is paying a lot of attention these days to the text in a link's anchor when deciding the relevance of a target page - perhaps more so than the page's PR.
PageRank is still part of the listings story though, so it's worth your while as a good designer to make sure you understand it correctly.
Disclaimer: The Page Rank Tips / Information presented and opinions expressed herein are those of the authors and do not necessarily represent the views of TipsAndTreats.com and/or its partners.