Friday, 19 July 2013

New Panda 26 update confirmed by Google

The 26th Panda update is rolling out as of 18 July 2013

Google has confirmed a Panda update is rolling out and this specific update is “more finely targeted.”

In the last few days we’ve been pushing out a new Panda update that incorporates new signals so it can be more finely targeted.

Google’s Panda update has had a major impact on search results, and it isn’t a one-time event, since Panda updates continue to roll out.
This Panda update is "softer" than the previous updates, and many webmasters who were originally hit by the algorithm are now claiming recovery.

Barry Schwartz over at Search Engine Roundtable noticed people at Webmaster World talking about “another shuffle taking place in Google”, which happens pretty much all the time. Panda was suspected, and according to Schwartz, Google has confirmed that Panda is indeed the culprit. He shares Google's statement, quoted above, at Search Engine Land.

Google has indicated in the past that it would no longer be confirming Panda updates, but apparently it has changed its mind. Schwartz quotes the company as saying this one is “more finely targeted”.

History of Google Panda Updates (Google Panda Refresh):
Panda #26 on July 18th
Panda #25 on March 15th
Panda #24 on January 22nd
Panda #23 on December 21st
Panda #22 on November 21st
Panda #21 on November 5th
Panda #20 on September 27th
Panda 3.9.2 on September 18th
Panda 3.9.1 on August 20th
Panda 3.9 on July 24th
Panda 3.8 on June 25th
Panda 3.7 on June 9th
Panda 3.6 on April 27th
Panda 3.5 on April 19th
Panda 3.4 on March 23rd
Panda 3.3 on about February 26th
Panda 3.2 on about January 15th
Panda 3.1 on November 18th
Panda 2.5.3 on October 19/20th
Panda 2.5.2 on October 13th
Panda 2.5.1 on October 9th
Panda 2.5 on September 28th
Panda 2.4 in August
Panda 2.3 on around July 22nd
Panda 2.2 on June 18th
Panda 2.1 on May 9th
Panda 2.0 on April 11th
Panda 1.0 on February 24th

Wednesday, 17 July 2013

Importance of Robots.txt

A robots.txt file is very important for SEO if you want a good ranking in the major search engines, yet many websites don't provide one. A robots.txt file helps keep out unwanted spiders such as email retrievers and image strippers. It defines which paths are off limits for spiders to visit. This is useful if you want to keep some personal information or private files away from search engines.

What is Robots.txt

The robots.txt file is a special text file that is always located in the server's root directory. It contains restrictions for spiders, telling them where they have permission to read. In effect, robots.txt defines rules for search engine spiders (robots): what to follow and what not to. It should be noted that web robots are not required to respect robots.txt files, but most well-written web spiders follow the rules you define.
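Because the file always lives at the server's root, its address can be worked out from any page URL on the same site. The snippet below is only a minimal sketch of that idea in Python; the function name robots_txt_url and the example.com address are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt address for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical page address, used only as an example.
print(robots_txt_url("http://www.example.com/blog/2013/07/post.html"))
```

For the example address this prints http://www.example.com/robots.txt, whatever folder the page itself sits in.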

How to Create Robots.txt

The format for the robots.txt file is special. It consists of records. Each record consists of two parts: a User-agent line and one or more Disallow lines. Each line has the format:
<Field> ":" <value>
The robots.txt file should be created with Unix line endings! Most good text editors have a Unix mode, or your FTP client *should* do the conversion for you. Do not attempt to create a robots.txt file with an HTML editor that does not specifically have a text mode.
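If your editor has no Unix mode, the conversion is easy to do yourself. Here is a small Python sketch, assuming the file content is already in a string; the helper name to_unix_line_endings is invented for this example.

```python
def to_unix_line_endings(text: str) -> str:
    """Convert Windows (CRLF) and old Mac (CR) line endings to Unix (LF)."""
    # Handle CRLF first so a pair is not turned into two newlines.
    return text.replace("\r\n", "\n").replace("\r", "\n")


windows_style = "User-agent: *\r\nDisallow: /cgi-bin/\r\n"
unix_style = to_unix_line_endings(windows_style)
print(repr(unix_style))
```

When writing the result to disk from Python, opening the file with newline="\n" keeps the Unix endings from being translated back on Windows.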


The User-agent line specifies the robot. For example: 
User-agent: googlebot

You may also use the wildcard character "*" to specify all robots:
User-agent: *
You can find user agent names in your own logs by checking for requests to robots.txt. Most major search engines have short names for their spiders.
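As a rough illustration of checking your logs, the Python sketch below counts which user agents asked for robots.txt. It assumes your server writes the common "combined" access log format; the sample lines, IP addresses and the ExampleBot name are invented, so adjust the pattern to whatever your own logs look like.

```python
import re
from collections import Counter

# Matches a combined-format log line whose request is for /robots.txt
# and captures the user agent string (the last quoted field).
ROBOTS_REQUEST = re.compile(
    r'"(?:GET|HEAD) /robots\.txt(?:\?[^" ]*)? [^"]*" \d{3} \S+ "[^"]*" "([^"]*)"'
)


def robots_txt_user_agents(log_lines):
    """Count the user agents that requested /robots.txt."""
    counts = Counter()
    for line in log_lines:
        match = ROBOTS_REQUEST.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Made-up sample lines for illustration only.
sample_log = [
    '203.0.113.5 - - [17/Jul/2013:10:00:00 +0000] "GET /robots.txt HTTP/1.1" 200 68 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '203.0.113.9 - - [17/Jul/2013:10:00:05 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '203.0.113.7 - - [17/Jul/2013:10:01:00 +0000] "GET /robots.txt HTTP/1.0" 200 68 "-" "ExampleBot/1.0"',
]

agents = robots_txt_user_agents(sample_log)
for agent, hits in agents.items():
    print(hits, agent)
```

The short name to put on a User-agent line is usually the first word of the logged string, for example googlebot.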


The second part of a record consists of Disallow directive lines. These lines specify files and/or directories, written as paths from the site root. For example, the following line instructs spiders that they may not download contactinfo.htm:
Disallow: /contactinfo.htm
You may also specify directories:
Disallow: /cgi-bin/
This would block spiders from your cgi-bin directory.

There is a wildcard nature to the Disallow directive. The standard dictates that /bob would disallow /bob.html and /bob/index.html (both the file bob and files in the bob directory will not be indexed).
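You can see this prefix behaviour with the robots.txt parser that ships in Python's standard library (urllib.robotparser). This is just a quick sketch; ExampleBot and /alice.html are made-up names, and other spiders may apply their own matching rules.

```python
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
# Feed the rules in directly instead of downloading a real file.
parser.parse([
    "User-agent: *",
    "Disallow: /bob",
])

paths = ["/bob", "/bob.html", "/bob/index.html", "/alice.html"]
# can_fetch() returns True when the named robot may request the path.
results = {path: parser.can_fetch("ExampleBot", path) for path in paths}

for path, ok in results.items():
    print(path, "allowed" if ok else "blocked")
```

Everything that starts with /bob comes back blocked, while /alice.html stays allowed.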

If you leave the Disallow line blank, it indicates that ALL files may be retrieved. At least one Disallow line must be present after each User-agent line for the record to be correct. A completely empty robots.txt file is the same as if it were not present.

White Space & Comments

Any line in robots.txt that begins with # is considered to be a comment only. The standard allows for comments at the end of directive lines, but this is really bad style:
Disallow: /bob #comment

Some spiders will not interpret the above line correctly and instead will attempt to disallow "/bob#comment". The moral is to place comments on lines by themselves.
White space at the beginning of a line is allowed, but not recommended:
   Disallow: /bob #comment
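To show what a tolerant spider is expected to do with such lines, here is a minimal Python sketch that drops comments and surrounding white space before reading the directive. The function name clean_robots_line and the /bob path are made up for this example; it is not how any particular spider is implemented.

```python
def clean_robots_line(line: str) -> str:
    """Remove a trailing '#' comment and any surrounding white space."""
    # Everything from the first '#' onward is a comment.
    return line.split("#", 1)[0].strip()


print(repr(clean_robots_line("Disallow: /bob #comment")))
print(repr(clean_robots_line("   Disallow: /bob")))
print(repr(clean_robots_line("# a comment on its own line")))
```

A spider that skips this step is the kind that ends up trying to disallow the comment text as part of the path.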


The following allows all robots to visit all files, because the wildcard "*" specifies all robots and the blank Disallow line blocks nothing.
User-agent: *
Disallow:

This one keeps all robots out.
User-agent: * 
Disallow: /

The next one bars all robots from the cgi-bin and images directories:
User-agent: * 
Disallow: /cgi-bin/
Disallow: /images/

This one bans Rover dog from all files on the server: 
User-agent: Rover dog 
Disallow: /

This one keeps googlebot from getting at the personal.htm file:
User-agent: googlebot
Disallow: /personal.htm
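Before uploading a robots.txt file, it can help to test a few paths against it. The sketch below does that with Python's standard urllib.robotparser for four of the examples above. The helper name allowed, the constant names, the OtherBot and ExampleBot agents and the sample paths are all invented for illustration, and real spiders may interpret edge cases differently.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, user_agent: str, path: str) -> bool:
    """Return True if user_agent may fetch path under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


# The example records from above, written with paths from the site root.
ALLOW_ALL = "User-agent: *\nDisallow:\n"
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
BLOCK_DIRS = "User-agent: *\nDisallow: /cgi-bin/\nDisallow: /images/\n"
BLOCK_ONE_FILE = "User-agent: googlebot\nDisallow: /personal.htm\n"

print(allowed(ALLOW_ALL, "ExampleBot", "/index.html"))        # True
print(allowed(BLOCK_ALL, "ExampleBot", "/index.html"))        # False
print(allowed(BLOCK_DIRS, "ExampleBot", "/images/logo.png"))  # False
print(allowed(BLOCK_DIRS, "ExampleBot", "/index.html"))       # True
print(allowed(BLOCK_ONE_FILE, "googlebot", "/personal.htm"))  # False
print(allowed(BLOCK_ONE_FILE, "OtherBot", "/personal.htm"))   # True
```

Note that the last record only names googlebot, so any other robot is still free to fetch personal.htm.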