
The Webmaster Tools GData API has been updated to allow you to get even more out of Webmaster Tools, such as setting a geographic location or your preferred domain. For those of you who aren't familiar with GData, it's a protocol for reading and writing data on the web. GData makes it very easy to communicate with many Google services, like Webmaster Tools. The Webmaster Tools GData API already allows you to add and verify sites for your account and to submit Sitemaps programmatically. Now you can also access and update site-specific information. This is especially useful if you have a large number of sites. With the Webmaster Tools API, you can perform hundreds of operations in the time it would take to add and verify a single site through the web interface.
What can I do?
We've included four new features in the API. You can see and update these settings for each site that you have verified. The features are:
  • Crawl Rate: You can request that Googlebot crawl your site slower or faster than it normally would (the details can be found in our Help Center article about crawl rate control). If many of your sites are hosted on the same server and you know your server's capacity, you may want to update all sites at the same time. This is now a trivial task using the Webmaster Tools GData API.
  • Geographic Location: If your site is targeted towards a particular geographic location but your domain doesn't reflect that (for example with a .com domain), you can provide information to help us determine where your target users are located.
  • Preferred Domain: You can select the canonical domain Google should use when indexing your pages. For example, if you have a site like www.example.com, you can set either example.com or www.example.com as the preferred domain to use. This avoids the risk of the www and non-www versions being treated as separate sites.
  • Enhanced Image Search: Tools like the Google Image Labeler allow users to tag images in order to improve Image Search results. Now opting in or out for all of your sites is a breeze with the Webmaster Tools API.
How do I do it?
We provide you with Java code samples for all the current Webmaster Tools API functionality. Here's a sample snippet of code that takes a list of sites and updates the geographic location of all of them:

  // Authenticate against the Webmaster Tools service
  WebmasterToolsService service;
  try {
    service = new WebmasterToolsService("exampleCo-exampleApp-1");
    service.setUserCredentials(USERNAME, PASSWORD);
  } catch (AuthenticationException e) {
    System.out.println("Error while authenticating.");
    return;
  }

  // Read sites and geolocations from your database
  readSitesAndGeolocations(sitesList, geolocationsList);

  // Update all sites
  Iterator<String> sites = sitesList.iterator();
  Iterator<String> geolocations = geolocationsList.iterator();
  while (sites.hasNext() && geolocations.hasNext()) {
    // Create a blank entry and add the updated information
    SitesEntry updateEntry = new SitesEntry();
    updateEntry.setGeolocation(geolocations.next());

    // Get the URL to update the site
    String encodedSiteId = URLEncoder.encode(sites.next(),
        "UTF-8");
    URL siteUrl = new URL(
        "http://www.google.com/webmasters/tools/feeds/sites/"
        + encodedSiteId);

    // Update the site
    service.update(siteUrl, updateEntry);
  }
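
The other site settings can be updated in exactly the same way: populate a SitesEntry and send it to the same sites feed URL. Here's a minimal sketch, assuming setter names and enum values (setCrawlRate, setPreferredDomain and setEnhancedImageSearch) that mirror the setGeolocation call above -- check the reference guide on the API's main page for the exact signatures:

  // Update crawl rate, preferred domain and Enhanced Image Search for
  // one site. Setter names and enum values here are assumptions modeled
  // on setGeolocation; consult the API reference for the exact ones.
  SitesEntry settingsEntry = new SitesEntry();
  settingsEntry.setCrawlRate(CrawlRate.SLOWER);          // or NORMAL / FASTER
  settingsEntry.setPreferredDomain(PreferredDomain.PREFER_WWW);
  settingsEntry.setEnhancedImageSearch(true);            // opt in

  String settingsSiteId = URLEncoder.encode("http://www.example.com/",
      "UTF-8");
  URL settingsUrl = new URL(
      "http://www.google.com/webmasters/tools/feeds/sites/"
      + settingsSiteId);
  service.update(settingsUrl, settingsEntry);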

Where do I get it?
The main page for the Webmaster Tools GData API explains all the details of the API. It has a detailed reference guide as well as many code snippets that explain how to use the Java client library, which is available for download. You can find more details about GData and all the different Google APIs on the Google Data APIs home page.

Ever since we released the crawl errors feature in Webmaster Tools, webmasters have asked for the sources of the URLs causing the errors. Well, we're listening! We know it was difficult to prevent a particular "Not found" error in the future, or to request a correction, without knowing the source URL. Now, Crawl error sources makes the process of tracking down the causes of "Not found" errors a piece of cake. This helps you improve the user experience on your site and gives you a jump start for links week (check out our updated post on "Good times with inbound links" to get the scoop).

In our "Not Found" and "Errors for URLs in Sitemaps" reports, we've added the "Linked From" column. For every error in these reports, the "Linked From" column now lists the number of pages that link to a specific "Not found" URL.



Clicking on an item in the "Linked From" column opens a separate dialog box which lists each page that linked to this URL along with the date it was discovered. The source URL for the 404 can be within or external to your site.





For those of you who just want the data, we've also added the ability to download all your crawl error sources at once. Just click the "Download all sources of errors on this site" link to download all your site's crawl error sources.



Again, if we report crawl errors for your website, you can use crawl error sources to quickly determine whether the cause lies on your own site or on someone else's. You'll have the information you need to contact the other site's webmaster to get the link fixed, and if needed, you can still put redirects in place on your own site to the appropriate URL. Just sign in to Webmaster Tools and check it out for your verified site. You can help people visiting your site—from anywhere on the web—find what they're looking for.

Inbound links are links from pages on external sites linking back to your site. Inbound links can bring new users to your site, and when the links are merit-based and freely-volunteered as an editorial choice, they're also one of the positive signals to Google about your site's importance. Other signals include things like our analysis of your site's content, its relevance to a geographic location, etc. As many of you know, relevant, quality inbound links can affect your PageRank (one of many factors in our ranking algorithm). And quality links often come naturally to sites with compelling content or offering a unique service.

How do these signals factor into ranking?

Let's say I have a site, example.com, that offers users a variety of unique website templates and design tips. One of the strongest ranking factors is my site's content. Additionally, perhaps my site is linked from three sources -- however, one inbound link is from a spammy site. As far as Google is concerned, we want only the two quality inbound links to contribute to the PageRank signal in our ranking.

Given the user's query, over 200 signals (including the analysis of the site's content and inbound links as mentioned above) are applied to return the most relevant results to the user.


So how can you engage more users and potentially increase merit-based inbound links?

Many webmasters have written about their success in growing their audience. We've compiled several ideas and resources that can improve the web for all users.
Create unique and compelling content on your site and the web in general
  • Start a blog: make videos, do original research, and post interesting stuff on a regular basis. If you're passionate about your site's topic, there are lots of great avenues to engage more users.

    If you're interested in blogging, see our Help Center for specific tips for bloggers.

  • Teach readers new things, uncover new news, be entertaining or insightful, show your expertise, interview different personalities in your industry and highlight their interesting side. Make your site worthwhile.

  • Participate thoughtfully in blogs and user reviews related to your topic of interest. Offer your knowledgeable perspective to the community.

  • Provide a useful product or service. If visitors to your site get value from what you provide, they're more likely to link to you.

  • For more actionable ideas, see one of my favorite interviews with Matt Cutts for no-cost tips to help increase your traffic. It's a great primer for webmasters. (Even before this post, I forwarded the URL to many of my friends. :)
Pursue business development opportunities
Use the "Links > Pages with external links" feature in Webmaster Tools to learn about others interested in your site. Expand the web community by figuring out who links to you and how they're linking. You may have new audiences or demographics you didn't realize were interested in your niche. For instance, if the webmasters for example.com noticed external links coming from art schools, they may start to engage with the art community -- receiving new feedback and promoting their site and ideas.

Of course, be responsible when pursuing possible opportunities in this space. Don't engage in mass link-begging; no one likes form letters, and few webmasters of quality sites are likely to respond positively to such solicitations. In general, many of the business development techniques that are successful in human relationships can also be reflected online for your site.
Now that you've read more information about internal links, outbound links, and inbound links (today's post :), we'll see you in the blog comments! Thanks for joining us for links week.

Update -- Here's one more business development opportunity:
Investigate your "Diagnostics > Web/mobile crawl > Crawl error sources" to not only correct broken links, but also to cultivate relationships with external webmasters who share an interest in your site. (And while you're chatting, see if they'll correct the broken link. :) This is a fantastic way to turn broken links into free links to important parts of your site.

In addition to contacting these webmasters, you may also wish to use 301 redirects to redirect incoming traffic from old pages to their new locations. This is good for users who may still have bookmarks with links to your old pages... and you'll be happy to know that Google appropriately flows PageRank and related signals through these redirects.
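
As an illustration, on an Apache server with the mod_alias module enabled, a permanent redirect can be a single line in your .htaccess file (the paths here are hypothetical):

  Redirect 301 /old-page.html http://www.example.com/new-page.html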

Creating outbound links on your site, or "linking out", is our topic for Day 3 of Links Week. Linking out happens naturally, and for most webmasters, it's not something you have to worry about. Nonetheless, in case you're curious about an otherwise simple topic that's fundamental to the web, here's the good, the bad, and answers to more advanced questions asked by our fellow webmasters. First, let's start with the good...

Relevant outbound links can help your visitors.
  • Provide your readers in-depth information about similar topics
  • Offer readers your unique commentary on existing resources
Thoughtful outbound links can help your credibility.
  • Show that you've done your research and have expertise in the subject matter
  • Make visitors want to come back for more analysis on future topics
  • Build relationships with other domain experts (e.g. sending visitors can get you on the radar of other successful bloggers and begin a business relationship)
When it comes to the less-than-ideal practices of linking out, there shouldn't be too many surprises, but we'll go on record to avoid any confusion...

The bad: Unmonitored (especially user-generated) links and undisclosed paid advertising outbound links can reduce your site's credibility.
  • Including too many links on one page confuses visitors (we usually encourage webmasters to not have much more than 100 links per page)
  • Hurts your credibility—turns off savvy visitors and reduces your authority with search engines. If you accept payment for outbound links, it's best to rel="nofollow" them or otherwise ensure that they don't pass PageRank for search engines. (As a user, I prefer to see disclosure to maintain my loyalty as well.)
  • Allows comment spam, which provides little benefit for users. Also, from a search engine perspective, comment spam can connect your site with bad neighborhoods instead of legitimate resources. Webmasters often add rel="nofollow" to links that are user generated, such as spammable blog comments, unless the comments are responsibly reviewed and thus vouched for (see the example below).

    See Jason Morrison's recent blog post about keeping comment spam off your site to prevent spam in the first place.
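
For instance, a user-submitted link in a blog comment could be rendered like this so that it doesn't pass PageRank (the URL is just a placeholder):

    <a href="http://www.example.com/" rel="nofollow">a commenter's link</a>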
Answers to advanced questions about outbound links

When linking out, am I sending visitors away forever?!
Hmmm... visitors may initially leave your site to check out relevant information. But can you recall your behavior on sites that link to good articles outside their domain? Personally, I always come back to sites I feel provide commentary and additional resources. Sometimes I stay on the original site and just open up the interesting link in a different tab. It's likely that with relevant outbound links you'll gain repeat visitors, and you won't lose them forever.
Yesterday's post mentioned that descriptive anchor text is helpful in internal links. Is it still important for outbound links?
Descriptive anchor text (the visible text in a hyperlink) helps accurately inter-connect the web. It allows both users and Googlebot to better understand what they're likely to find when following a link to another page. So if it's not too much trouble, try making anchor text descriptive.
Should I worry about the sites I choose to link to? What if their PageRank may be lower than mine?
If you're linking to content you believe your users will enjoy, then please don't worry about the site's perceived PageRank. As a webmaster, the things to be wary of regarding outbound links are listed above, such as losing credibility by linking to spammy sites. Otherwise, consider outbound links as a common sense way to provide more value to your users, not a complicated formula.

In Day 2 of links week, we'd like to discuss the importance of link architecture and answer more advanced questions on the topic. Link architecture—the method of internal linking on your site—is a crucial aspect of site design if you want your site indexed by search engines. It plays a critical role in Googlebot's ability to find your site's pages and ensures that your visitors can navigate and enjoy your site.

Keep important pages within several clicks from the homepage

Although you may believe that users prefer a search box on your site rather than category navigation, it's uncommon for search engine crawlers to type into search boxes or navigate via pulldown menus. So make sure your important pages are clickable from the homepage and easy for Googlebot to find throughout your site. It's best to create a link architecture that's intuitive for users and crawlable for search engines. Here are more ideas to get started:
Intuitive navigation for users

Create common user scenarios, get "in character," then try working through your site. For example, if your site is about basketball, imagine being a visitor (in this case a "baller" :) trying to learn the best dribbling technique.
  • Starting at the homepage, if the user doesn't use the search box on your site or a pulldown menu, can they easily find the desired information (ball handling like a superstar) from the navigation links?

  • Let's say a user found your site through an external link, but they didn't land on the homepage. Starting from any (sub-/child) page on your site, make sure they can easily find their way to the homepage and/or other relevant sections. In other words, make sure users aren't trapped or stuck. Was the "best dribbling technique" easy for your imaginary user to find? Often breadcrumbs such as "Home > Techniques > Dribbling" help users to understand where they are (a markup sketch follows below).
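
A breadcrumb trail like that is usually just plain linked text. One possible minimal sketch, with hypothetical page paths:

    <a href="/">Home</a> &gt; <a href="/techniques/">Techniques</a> &gt; Dribbling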
Crawlable links for search engines
  • Text links are easily discovered by search engines and are often the safest bet if your priority is having your content crawled. While you're welcome to try the latest technologies, keep in mind that when text-based links are available and easily navigable for users, chances are that search engines can crawl your site as well.

    This <a href="new-page.html">text link</a> is easy for search engines to find.

  • Sitemap submission is also helpful for major search engines, though it shouldn't be a substitute for crawlable link architecture. If your site utilizes newer techniques, such as AJAX, see "Verify that Googlebot finds your internal links" below.
Use descriptive anchor text

Writing descriptive anchor text, the clickable words in a link, is a useful signal that helps search engines and users alike better understand your content. The more Google knows about your site—through your content, page titles, anchor text, etc.—the more relevant results we can return for users (and your potential search visitors). For example, if you run a basketball site and you have videos to accompany the textual content, a not-very-optimal way of linking would be:

To see all our basketball videos, <a href="videos.html">click here</a> for the entire listing.

However, instead of the generic "click here," you could rewrite the anchor text more descriptively as:

Feel free to browse all of our <a href="videos.html">basketball videos</a>.

Verify that Googlebot finds your internal links

For verified site owners, Webmaster Tools has the feature "Links > Pages with internal links" that's great for verifying that Googlebot finds most of the links you'd expect. This is especially useful if your site uses navigation involving JavaScript (which Googlebot doesn't always execute)—you'll want to make sure that Googlebot is finding other internal links as expected.

Here's an abridged snapshot of our internal links to the introductory post for "404 week at Webmaster Central." Our internal links are discovered as we had hoped.


Feel free to ask more internal linking questions
Here are some to get you started...

Q: What about using rel="nofollow" for maximizing PageRank flow in my internal link architecture (such as PageRank sculpting, or PageRank siloing)?
A: It's not something we, as webmasters who also work at Google, would really spend time or energy on. In other words, if your site already has strong link architecture, it's far more productive to work on keeping users happy with fresh and compelling content rather than to worry about PageRank sculpting.

Matt Cutts answered more questions about "appropriate uses of nofollow" in our webmaster discussion group.
Q: Let's say my website is about my favorite hobbies: biking and camping. Should I keep my internal linking architecture "themed" and not cross-link between the two?
A: We haven't found a case where a webmaster would benefit by intentionally "theming" their link architecture for search engines. And, keep in mind, if a visitor to one part of your site can't easily reach other parts of your site, that may be a problem for search engines as well.
Perhaps it's cliche, but at the end of the day, and at the end of this post, :) it's best to create solid link architecture (making navigation intuitive for users and crawlable for search engines)—implementing what makes sense for your users and their experience on your site.

Thanks for your time today! Information about outbound links will soon be available in Day 3 of links week. And, if you have helpful tips about internal links or questions for our team, please share them in the comments below.

We hope that you're able to focus on helping users (and improving the web) by creating great content or providing a great service on your site. In between creating content and working on your site, you may have read some of the (often conflicting) link discussions circling the web. If you're asking, "What's going on -- what do I need to know about links?" then welcome to the first day of links week!

Day 2: Internal links (links within your site)
Internal linking is your homepage linking to your "Contact us" page, or your "Contact us" page linking to your "About me" page. Internal linking (also known as link architecture) is important because it's a major factor in how easily visitors can navigate your site. Additionally, internal linking contributes to your site's "crawlability" -- how easily a spider can reach your pages. More in Day 2 of links week.
Day 3: Outbound links (sites you link to)
Outbound links are links from your site to external sites. For example, www.google.com/webmasters links to the domain googlewebmastercentral.blogspot.com (our lovely blog!). Outbound links allow us to surf the web -- they're a big reason why the web is so exciting and collaborative. Without outbound links, your site can seem isolated from the community because each page becomes "brochure-ware." Most sites include outbound links naturally, so this shouldn't be a big concern. If you still have questions, we'll be covering outbound linking in more detail on Day 3.
Day 4: Inbound links (sites linking to you)
Inbound links are links from external sites to yours. There are many webmasters who (rightfully) aren't preoccupied by the subject of inbound links. So why do some webmasters care? It's likely because merit-based or volunteered inbound links may seem like a quick way to increase rankings and traffic. Answers to your questions like, "Are there no-cost methods to maximize my merit-based links?" are provided on Day 4.
Update: Included references to blog posts as they were published throughout links week.

Running a website can be complicated—so we've provided Google Webmaster Tools to help webmasters recognize potential issues before they become real problems. Some of the issues that you can spot there are relatively small (such as having duplicate titles and descriptions), while other issues can be bigger (such as your website not being reachable). While Google Webmaster Tools can't tell you exactly what you need to change, it can help you to recognize that there could be a problem that needs to be addressed.

Let's take a look at a few examples that we ran across in the Google Webmaster Help Groups:

Is your server treating Googlebot like a normal visitor?

While Googlebot tries to act like a normal user, some servers may get confused and react in strange ways. For example, although your server may work flawlessly most of the time, some servers running IIS may respond with a server error (or some other behavior triggered by a server error) when visited by a user with Googlebot's user-agent. In the Webmaster Help Group, we've seen IIS servers return result code 500 (Server error) and result code 404 (File not found) in the "Web crawl" diagnostics section, as well as result code 302 when submitting Sitemap files. If your server is redirecting to an error page, you should make sure that we can crawl the error page and that it returns the proper result code. Once you've done that, we'll be able to show you these errors in Webmaster Tools as well. For more information about this issue and possible resolutions, please see http://todotnet.com/archive/0001/01/01/7472.aspx and http://www.kowitz.net/archive/2006/12/11/asp.net-2.0-mozilla-browser-detection-hole.aspx.

If your website is hosted on a Microsoft IIS server, also keep in mind that URLs are case-sensitive by definition (and that's how we treat them). This includes URLs in the robots.txt file, which is something to be careful with if your server is using URLs in a non-case-sensitive way. For example, "disallow: /paris" will block /paris but not /Paris, as in the sketch below.
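
To illustrate, with a hypothetical path, a robots.txt like this:

  User-agent: *
  Disallow: /paris

would keep compliant crawlers away from /paris (and any URL starting with /paris, such as /paris/summer.html), while /Paris/summer.html would still be crawled.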

Does your website have systematically broken links somewhere?

Modern content management systems (CMS) can make it easy to create issues that affect a large number of pages. Sometimes these issues are straightforward and visible when you view the pages; sometimes they're a bit harder to spot on your own. If an issue like this creates a large number of broken links, they will generally show up in the "Web crawl" diagnostics section in your Webmaster Tools account (provided those broken URLs return a proper 404 result code). In one recent case, a site had a small encoding issue in its RSS feed, resulting in over 60,000 bad URLs being found and listed in their Webmaster Tools account. As you can imagine, we would have preferred to spend time crawling content instead of these 404 errors :).

Is your website redirecting some users elsewhere?

For some websites, it can make sense to concentrate on a group of users in a certain geographic location. One method of doing that can be to redirect users located elsewhere to a different page. However, keep in mind that Googlebot might not be crawling from within your target area, so it might be redirected as well. This could mean that Googlebot will not be able to access your home page. If that happens, it's likely that Webmaster Tools will run into problems when it tries to confirm the verification code on your site, resulting in your site becoming unverified. This is not the only reason for a site becoming unverified, but if you notice this on a regular basis, it would be a good idea to investigate. On this subject, always make sure that Googlebot is treated the same way as other users from that location, otherwise that might be seen as cloaking.

Is your server unreachable when we try to crawl?

It can happen to the best of sites—servers can go down and firewalls can be overly protective. If that happens when Googlebot tries to access your site, we won't be able to crawl the website and you might not even know that we tried. Luckily, we keep track of these issues and you can spot "Network unreachable" and "robots.txt unreachable" errors in your Webmaster Tools account when we can't reach your site.

Has your website been hacked?

Hackers sometimes add strange, off-topic hidden content and links to questionable pages. If it's hidden, you might not even notice it right away; but nonetheless, it can be a big problem. While the Message Center may be able to give you a warning about some kinds of hidden text, it's best if you also keep an eye out yourself. Google Webmaster Tools can show you keywords from your pages in the "What Googlebot sees" section, so you can often spot a hack there. If you see totally irrelevant keywords, it would be a good idea to investigate what's going on. You might also try setting up Google Alerts or doing queries such as [site:example.com spammy words], where "spammy words" might be words like porn, viagra, tramadol, sex or other words that your site wouldn't normally show. If you find that your site actually was hacked, I'd recommend going through our blog post about things to do after being hacked.

There are a lot of issues that can be recognized with Webmaster Tools; these are just some of the more common ones that we've seen lately. Because it can be really difficult to recognize some of these problems, it's a great idea to check your Webmaster Tools account to make sure that you catch any issues before they become real problems. If you spot something that you absolutely can't pin down, why not post in the discussion group and ask the experts there for help?

Have you checked your site lately?