Tuesday, May 25, 2010

Getting Started - Google Custom Search API - Google Code

Google Custom Search API

Getting Started

This page walks you through the creation of your first custom search engine and gives you a tour of the control panel.

What Is Custom Search?

Google Custom Search enables you to create a search engine for your website, your blog, or a collection of websites. You can fine-tune the ranking, customize the look and feel of the search results, and invite your friends or trusted users to help you build your custom search engine. You can even make money from your search engine by using your Google AdSense account.

You can create a search engine that searches only the contents of your website, or you can create one that focuses on a particular topic. You can use your expertise about a subject to tell Custom Search which websites to search, prioritize, or ignore. Because you know your users well, you can tailor the search engine to their interests. Your search engine can take into account the context in which your users are searching. For example, when an avid cyclist searches for "wheel" on Google search, she will have to sift through hundreds of results on automobile tires, steering wheels, or Buddhist wheels. This is because Google search cannot discern that the intended context is "wheels for bicycles." A custom search engine for bicycles, on the other hand, would search only pre-selected websites on bicycles and give relevant results to the cyclist.

Figure 1: Searching for "wheel" on Google search gets results about all kinds of wheels.

Figure 2: Searching for "wheel" on a search engine for bicycles yields more relevant results.

You can create your own custom search engine in a few minutes by filling out a wizard. Once you have defined your search engine, Custom Search generates code for a search box, which you can insert anywhere on your webpage or blog. You can look at Google Picks for examples of popular custom search engines.

Creating a Custom Search Engine (Hello World)

A good way to understand Custom Search is to create a simple search engine using the wizard. This section tells you how to define your first search engine, then shows you how the XML code underneath it might look.

Since you're experimenting and figuring out some basic concepts, spend only a couple of minutes making a simple search engine. Don't sweat the details until later. Keep the search engine simple so that you can follow what's happening when you start testing it. You can always change the definitions of the search engine and add more sites and jazzy features later; alternatively, you can delete it altogether and start from scratch.

Defining a Custom Search Engine

To create a custom search engine, do the following:

  1. Use your Google account (the same account that you use for Gmail, iGoogle, or Checkout) to log into the Custom Search wizard. If you do not have a free account with Google, create one first.
  2. Fill out basic information about your search engine, including the search engine name, description, and language. You can ignore the text box for search engine keywords.
  3. Under the What do you want to search? section, tell Custom Search how wide your search coverage should be. For your first search engine, choose Only sites I select.
  4. Under the Select some sites section, tell Custom Search which webpages or websites to search. For your first search engine, start out with a few websites, such as www.google.com/coop/docs/*.
  5. Select the free standard edition, not the business edition. You can always upgrade later.
  6. Read the Terms of Service, and, if you agree with it, select the "I have read and agree to the Terms of Service." checkbox.
  7. Click the Next button, so you can start testing your search engine.
  8. After you finish testing out a few search queries, select the Send confirmation email checkbox to receive tips on managing your search engine.
  9. Click Finish. That's it, you've defined your first custom search engine.

Figure 3: You can try out different queries on your new search engine.

Looking Under the Covers

You don't need to understand this yet, but if you want to look under the covers, the following two sets of XML code are what the wizard has created for you:

 id="_6zdjrkhn3a" creator="001058780666577659641" volunteers="false"
 keywords="" visible="false" encoding="UTF-8">
Hello World
Experimental search engine

 about="http://www.google.com/coop/docs/*" score="1">

The more advanced sections of the developer's guide will walk you through the CSE XML. In the meantime, you can just skim through it. If you are curious about the code for your search engine, you can go to the control panel and click the Advanced tab. The next section tells you about the control panel and all the tabs.

Managing Your Search Engine

You manage your custom search engines in the My search engines page and define the search engine specifications in the control panel.

Getting to Know the My Search Engines Page

You can create more than one search engine under your Google account. The search engines do not have to be related to each other. You can view, manage, and delete them in the My search engines page. You can also do all this through the Custom Search Console Gadget, which is a mini-application that you can add to your iGoogle homepage.

The My search engines page includes the following components:

  • Search engine name - Click the name of the search engine to view its public homepage. The homepage shows your profile and lets users try out your search engine and add a gadget version of it to their blogs, webpages, and iGoogle homepages. You can create your profile by clicking the My Profile link on the left.
  • Control panel - The administrative console for your search engine. Each search engine has its own control panel.
  • Statistics - The dashboard that shows the number of queries processed by your search engine. If enough users are using the same search terms, you will be able to see the most popular queries for your search engine. You can use this data to fine-tune your search engine.
  • Delete - Permanently delete the search engine. You cannot undo this.

Touring the Control Panel

After you create a custom search engine, you can use the control panel to modify it. You can access the control panel from the My search engines page or your Custom Search Console Gadget.

Basics

The Basics tab lets you define public information and general themes about your search engine. The public information is displayed on your custom search engine homepage. The tab has three main sections: Basic information, Language settings, and Preferences.

Warning: If you want to keep the changes you make in the Basics tab, click Save Changes before you move on to the next tab; otherwise, you will lose all your modifications.

Basic information

The Basic information section lets you define the following fields:

  • Search engine name - Descriptive name that gives people an idea of the type of search engine you are building. Short and descriptive names work well. Do not use a name that would infringe on other people's trademarks.

    The name appears on your custom search engine homepage that Google hosts and the search results page that Google serves.

  • Search engine description - Brief information about your custom search engine, such as what it searches and who might be interested in using it. Be succinct, and don't repeat information. For example, if your search engine name is "Badminton Search Engine", do not describe it as "search engine for badminton". Don't just jot down a description in a hurry, because users who are looking for a specific custom search engine might be frustrated if they encounter search engines that don't work as described or have descriptions that are too vague.

    The description appears on your custom search engine homepage and results page.

  • Search engine keywords - Optional. Keywords are a quick way of boosting certain webpages in your search results and getting more search results about the subject. You can add as many keywords as you want, as long as you don't exceed 100 characters. List words or phrases—which are a short series of words enclosed in quotation marks (for example, "mountain bike")—that describe the content of the webpages or the coverage of your search engine. While Custom Search boosts results that contain those keywords, it does not demote or filter out results that don't contain the keywords.

    When you are in the early stages of defining your search engine, you can skip this setting. Later, when you are fine-tuning results, you can define keywords. For more information, see the Changing the Ranking of Your Search Results page.
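
As a rough illustration of the keyword rules described above (no more than 100 characters, with phrases enclosed in quotation marks), the following Python sketch splits a keyword string into single words and phrases. The helper name parse_keywords and the use of shlex for quote handling are assumptions made for this example; how Custom Search itself parses the field is not described on this page.

```python
import shlex

MAX_KEYWORD_CHARS = 100  # limit stated in the text above


def parse_keywords(raw):
    """Split a keyword string into words and quoted phrases.

    Hypothetical helper: shlex is used only to honour double quotes.
    """
    if len(raw) > MAX_KEYWORD_CHARS:
        raise ValueError("keywords exceed %d characters" % MAX_KEYWORD_CHARS)
    return shlex.split(raw)


print(parse_keywords('cycling "mountain bike" helmets'))
```

A check like this can be handy before pasting a long keyword list into the Basics tab.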

Language Settings

The Language Settings section lets you define the following fields:

  • Search engine language - The language of the interface of the search engine, such as the search box button. It also boosts results in that language, but it does not keep results in other languages from appearing in the results page. For example, if you selected Chinese as your search engine language, Chinese webpages will be given higher priority than English webpages in the search results.
  • Transliteration - Enable the transcription of words in Romanized or English alphabets into another writing system. The query is converted into the phonetic equivalent of the script that the user has selected. For example, if you enabled Arabic transliteration, your users can search for news by typing "akhbar" and Custom Search transcribes it to its phonetic equivalent in Arabic (أخبار).

    If you select more than one writing system, your users can select the right script from a drop-down list beside the search box.

  • Search engine encoding - The text format of the search results. This setting must match the encoding of your webpage. In the vast majority of cases, Unicode (UTF-8) is the best option. This setting matters only if you embed a search box in your webpage.

Preferences

The Preferences section lets you define the following fields:

  • How to search included sites - The breadth of the coverage of your search engine. You can restrict the search to sites (webpages or websites) that you select, or you can create a junior Google that emphasizes certain sites. You list the specific websites and webpages in the Sites tab.
  • Search engine visibility - Whether your search engine should be included in the Custom Search directory. Not all public search engines are automatically included in the directory; only active and popular ones are. The best search engines are listed on the Google Picks page, and you can search for other active and popular search engines in the Find a Custom Search Engine search box on that same page. You might not see your search engine in the directory until it gains more users and activity.

    Not listing your search engine in the directory does not make it secret or hidden. Just as people can still dial an unlisted phone number, so can web surfers still view your search engine if they know the URL.

  • Advertising status - Whether Custom Search should show ads on results pages. You have to show ads, unless you are creating the search engine on behalf of a registered non-profit organization, a university, or a government agency. Define your AdSense settings in the Make Money tab to get a share of ad revenues that are generated from your search engine.
  • Enable special results, such as subscribed links and promotions. - If you have created subscribed links or promotions, you can set your search engine to display special results within the results page. Users of your search engine will automatically see your special results and do not need to take special action. To learn more, read the Creating Special Results page.

Sites

The Sites tab lets you tell Google Custom Search which sites to search. A site can be an exact or complete URL for a single page (such as http://www.example.com) or a URL pattern (such as *.example.com/*). For more examples of URL patterns, see the Help Center topic on URL patterns.
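
To get a feel for how a URL pattern such as *.example.com/* selects pages, here is a minimal Python sketch. It assumes that the asterisk stands for any run of characters and that the scheme (http://) is ignored; the function name matches_pattern is made up for this example, and the real matching rules are the ones in the Help Center topic, not this code.

```python
from fnmatch import fnmatchcase


def matches_pattern(url, pattern):
    """Return True if url fits a wildcard pattern such as *.example.com/*.

    Assumption: '*' matches any run of characters and the scheme is ignored.
    Note that fnmatch also treats '?' and '[' specially, so this sketch is
    only suitable for patterns whose sole wildcard is '*'.
    """
    bare = url.split("://", 1)[-1]  # drop "http://" if present
    return fnmatchcase(bare, pattern)


print(matches_pattern("http://www.example.com/docs/a.html", "*.example.com/*"))
print(matches_pattern("http://www.example.org/docs/a.html", "*.example.com/*"))
```

The first call matches and the second does not, because the second URL is on a different domain.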

To include sites in your search engine, click the Include sites link under the Included sites section; to exclude sites from your search engine, click the Exclude sites link under the Excluded sites section. Once you've added at least one site to include in or exclude from your search engine, a menu bar with Add Sites and Delete buttons appears right above your list of sites.

If you have a lot of sites to add, you can add them in bulk by clicking Add Sites, and then clicking the Include sites in bulk link at the top of the dialog box. To remove sites from Custom Search, simply select the check box next to the site and click Delete.

You can list up to 5,000 sites across all your custom search engines.

Including or Excluding Sites from Your Search Engine

When you add a site, a dialog box with the following fields appears:

  • URL - Enter the site that you want Custom Search to crawl. If you enter a URL without the subdomain (the www in www.example.com), Custom Search automatically searches all the subdomains of the site. For example, if example.com has the subdomains www, store, and home, Custom Search searches www.example.com, store.example.com, and home.example.com, so you do not need to add each subdomain.

    If you use a URL pattern with wild cards (*), you can skip the next field and just click Save Changes; the control panel automatically selects the correct What to include option for you.

  • What to include - Determines how extensively you want Custom Search to search the site. For example, you can have Custom Search search just the site you defined in the URL text box, or you can have Custom Search also search other webpages linked from the site. It includes the following options:
    • Include all pages whose address contains this URL - Custom Search also searches subdirectories and subpages of the site you defined in the URL text box. For example, if you defined your site as http://code.google.com/apis/, Custom Search automatically includes every URL that begins with http://code.google.com/apis/. You do not need to recursively add the URL for subdirectories and subpages.
    • Include just this specific page or URL pattern I have entered - Custom Search searches only the specific webpage or URL pattern you have defined. It is the most restrictive option, because it instructs Custom Search to follow your URL pattern exactly.
    • Dynamically extract links from this page and add them to my search engine - Custom Search searches not only for webpages that match the URL you specified, but also webpages linked from your target webpages. You cannot use URL patterns with wild cards (*) with this option.

      Note: If you want Custom Search to cover just your webpages in a domain and no other domain, do not select this option.

      After you select this option, determine how inclusive you want Custom Search to be with the linked webpages. The options are:

      • Include all pages this page links to - Includes individual webpages that are directly linked from your site. For example, if your site, blog.example.com, links to http://code.google.com/apis/customsearch/docs/dev_guide.html, and you select this option, Custom Search includes your site and just the linked page in your search results. It will not include other webpages from code.google.com or any other pages that were not explicitly linked from your site.
      • Include all partial sites this page links to - Includes all webpages under the parent directory of the webpage that is linked from your site. For example, if your site, blog.example.com, links to http://code.google.com/apis/customsearch/docs/dev_guide.html, and you select this option, Custom Search includes your site, the linked page, and all sibling webpages under the linked page's parent directory (docs/). If your site links to a URL that ends in a directory (such as code.google.com/apis/customsearch/docs/) and not an HTML page (such as code.google.com/apis/customsearch/docs/dev_guide.html), Custom Search searches all the webpages and subdirectories under that directory.
      • Include all sites this page links to - Includes all webpages in the domain of the webpage that is linked from your site. For example, if your site, blog.example.com, links to http://code.google.com/apis/customsearch/docs/dev_guide.html, and you select this option, Custom Search includes your site, the linked page, and everything under code.google.com. It will not include webpages from other subdomains, such as groups.google.com or docs.google.com.
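
The three link-extraction options above differ only in how much of the linked URL they keep. The Python sketch below mirrors the examples in the list by computing the URL prefix that each option would cover for one linked page. The option names ("page", "partial_site", "site") and the function linked_scope are hypothetical labels for this illustration, not identifiers used by Custom Search.

```python
from urllib.parse import urlsplit


def linked_scope(linked_url, option):
    """Return the URL prefix covered for one linked page.

    option is a hypothetical short name: "page", "partial_site" or "site".
    """
    parts = urlsplit(linked_url)
    if option == "page":          # only the linked page itself
        return parts.netloc + parts.path
    if option == "partial_site":  # everything under the parent directory
        return parts.netloc + parts.path.rsplit("/", 1)[0] + "/"
    if option == "site":          # everything on the same host
        return parts.netloc + "/"
    raise ValueError("unknown option: " + option)


url = "http://code.google.com/apis/customsearch/docs/dev_guide.html"
for name in ("page", "partial_site", "site"):
    print(name, "->", linked_scope(url, name))
```

For the dev_guide.html link, the three options cover the single page, everything under docs/, and everything under code.google.com, respectively.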

Viewing List of Sites

You can view twenty sites at a time. If you have a lot of sites, use the URL contains text box to view only URL patterns you want to see. When you are done searching for a specific site, you can click Clear to see all your sites again.

Tagging Sites with Labels

If you have created refinement labels in the Refinements tab, you can apply labels to sites by selecting their check boxes, then selecting the label from the Label actions drop-down list in the menu bar, which is above the list of sites.

Indexing

The Indexing tab helps you improve the breadth of coverage and freshness of the search results in your custom search engine. If you are satisfied with your search results or if your search engine covers popular sites that you can find on Google search, you don't need to do anything; you can move on to other tabs.

However, if the sites included in your search engine haven't been indexed by Google search, you can make the webpages more discoverable to Google by submitting a Sitemap. A Sitemap is an XML file that lists pages on your website and includes information about your webpages, such as when they were most recently updated, how frequently they change, and how important they are in relation to each other. To learn more about Sitemaps, see Selecting Sites to Search. If you do not want to use Sitemaps, you can add individual URLs of webpages you want Custom Search to index.
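
For readers who have not seen a Sitemap before, the following Python sketch builds a small one with the standard library. It uses the element names from the public Sitemap protocol (urlset, url, loc, lastmod, changefreq, priority); the function name build_sitemap and the example.com URLs are placeholders for illustration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build Sitemap XML from dicts with loc, lastmod, changefreq, priority."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        # Only loc is required; the other fields are written when present.
        for field in ("loc", "lastmod", "changefreq", "priority"):
            if field in page:
                ET.SubElement(url, field).text = str(page[field])
    return ET.tostring(urlset, encoding="unicode")


print(build_sitemap([
    {"loc": "http://www.example.com/", "lastmod": "2010-05-01",
     "changefreq": "weekly", "priority": 1.0},
    {"loc": "http://www.example.com/about.html", "priority": 0.5},
]))
```

The priority and lastmod values are the same fields that on-demand indexing, described later in this section, uses to decide which pages to index first.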

Google can index only webpages that can be accessed and crawled. Make sure that you do not have a robots.txt file or meta tags that block Googlebot from crawling the pages that you want indexed.
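
One quick way to sanity-check a robots.txt file is to run it through Python's standard urllib.robotparser module, as in the sketch below. This is only an approximation: it uses Python's own parser rather than Googlebot's, and it does not look at meta tags. The function name googlebot_allowed and the sample rules are made up for this example.

```python
from urllib.robotparser import RobotFileParser


def googlebot_allowed(robots_txt, url):
    """Check whether the given robots.txt text lets Googlebot fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Googlebot", url)


rules = """User-agent: Googlebot
Disallow: /private/
"""
print(googlebot_allowed(rules, "http://www.example.com/docs/index.html"))
print(googlebot_allowed(rules, "http://www.example.com/private/a.html"))
```

With these sample rules, the page under /docs/ is allowed and the page under /private/ is blocked.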

The improvement in the coverage of your index does not happen instantaneously, because it takes some time for the pages to be crawled and indexed. The cycle usually takes 24 hours or less. The improved indexing affects only search results within the search engine you create, not your rank and indexing on Google search.

The Indexing tab lets you do the following:

  • View information about the indexing status of your search engine and Sitemaps.

    You can request indexing for up to 50 webpages for the search engine. If you have upgraded to Google Site Search, you have higher limits that vary according to your account level. Additional on-demand URLS available tracks the number of webpages you can have Custom Search index for you.

    If you have Google Site Search, this tab also includes information about your plan quota and number of pages indexed.

  • Submit a Sitemap.

    If you have verified that you own the website on Google Webmaster Tools and have a Sitemap, you can enter the URL of the verified Sitemap in the Sitemap URL text box. A Sitemap helps Google discover additional pages on your site and learn which URLs are more important. Submitting a Sitemap does not guarantee that Google will crawl or index all of your URLs, but Google does use the data in your Sitemap to learn about the structure of your site. To learn more about Sitemaps, see Selecting Sites to Search.

    After Google has verified the Sitemap you have submitted, the Indexing tab lists it under the Identified Sitemaps section. To start on-demand indexing, click Index Now. Custom Search starts indexing the 50 most important webpages that haven't already been included in the index. Custom Search determines the importance of the webpages by the priority you have assigned to them in your Sitemap. If the number of webpages with the highest priority exceeds your allotment for on-demand indexing, Custom Search selects the highest priority webpages with the most recent last-modified dates. As Google crawls and indexes your webpages, your custom search engine results might improve.

    Note: Sitemaps are submitted to Google search, which means that you cannot have results appear only in your custom search engine but not on Google search.

  • Enter URLs of webpages that you want Custom Search to index.

    If you do not want to create and submit a Sitemap or if you are not the owner of the webpages that you want indexed, you can add the URLs of webpages to the On-demand indexing using individual URLs list box. After you click Index Now, Custom Search indexes the first 50 pages that have not yet been indexed by Google search.
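
The selection rule described above for Sitemap-based on-demand indexing (skip pages already in the index, take the highest priority first, and break ties by the most recent last-modified date) can be sketched in a few lines of Python. The function name pick_for_indexing and the tuple layout are assumptions for this illustration; this is a model of the described behaviour, not Google's implementation.

```python
from datetime import date


def pick_for_indexing(pages, indexed, quota=50):
    """Choose up to `quota` unindexed pages, highest priority first.

    pages: list of (url, priority, last_modified) tuples from a Sitemap.
    Ties on priority go to the most recently modified page.
    """
    pending = [p for p in pages if p[0] not in indexed]
    pending.sort(key=lambda p: (p[1], p[2]), reverse=True)
    return [url for url, _, _ in pending[:quota]]


pages = [
    ("/a.html", 1.0, date(2010, 1, 5)),
    ("/b.html", 1.0, date(2010, 5, 1)),
    ("/c.html", 0.5, date(2010, 5, 20)),
    ("/d.html", 0.8, date(2010, 3, 1)),
]
print(pick_for_indexing(pages, indexed={"/d.html"}, quota=2))
```

With a quota of 2, the two priority 1.0 pages are chosen, and /b.html comes first because it was modified more recently.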

Refinements

Refinements are a way for you to categorize sites by topic. For example, if you have a bicycling search engine, you can have categories such as bike maintenance, bike reviews, bike stores, biking skills, and so on. You can create refinement labels that you associate with the sites you listed in the Sites tab. The refinement links appear at the top of your search results page, and users can click them to narrow down their searches. A search page can have as many as 16 refinement links.

To create a refinement label, click Add Refinement, and define the settings. To tag websites with labels, go to the Sites tab, select check boxes next to the sites, and select the label from the Label actions drop-down list. You can tag sites with more than one label.

Before you create your labels, you might want to check out existing labels and pool your resources with Google and other users.

Promotions

The Promotions tab lets you create special results that appear at the top of the results page and define search terms for triggering the results. For example, if you want users who are searching for "cool stuff" to discover your latest widget, you can create a promotion result that would show a link to the webpage about the widget.

You can create a promotion by clicking Add and defining the queries that would trigger the promotion, as well as the content of the special result. The content includes a title, description, image, and the URL of the webpage. The description and the image, which are optional, are not displayed automatically. You must change the setting to activate that feature.

To display the optional description and image in your promotion, do the following:

  1. Click the Promotion Design Settings button on the right.
  2. Select the checkboxes to allow the inclusion of a brief description and an image for each promotion.
  3. Click OK.

To create content for the description or add an image, do the following:

  1. Add or edit a promotion.
  2. Fill out a description and add a link to an image. These additional fields are optional and do not need to be filled out.

To delete promotions, select the checkboxes next to the promotions you want to remove, and click Delete.

To change the appearance (such as the border, text, and background color) of all the promotion results, go to the Look and Feel tab, which is described in detail in the Designing the Look and Feel with the Control Panel page.

If you want to create a lot of promotions, create an XML file instead. To learn more about the XML format, see the Creating Special Results page. To upload the XML file you created, click Upload and select the file.
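
Conceptually, a promotion is a special result keyed by the queries that trigger it. The Python sketch below models that idea with plain dictionaries. It assumes exact matching after lower-casing and collapsing spaces, which may differ from how Custom Search actually matches trigger queries; the function name find_promotion and the sample data are hypothetical.

```python
def find_promotion(query, promotions):
    """Return the first promotion whose trigger queries include `query`.

    Assumption: matching is exact after lower-casing and trimming spaces.
    """
    normalized = " ".join(query.lower().split())
    for promo in promotions:
        if normalized in promo["queries"]:
            return promo
    return None


promotions = [
    {"queries": {"cool stuff", "new widget"},
     "title": "Our latest widget",
     "url": "http://www.example.com/widget.html"},
]
hit = find_promotion("  Cool   Stuff ", promotions)
print(hit["title"] if hit else "no promotion")
```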

Synonyms

You can expand your users' search queries by using synonyms, which are variants of a search term. For example, the search query "food" could have the following alternatives: "meal", "chow", "cooking", and so on. If you create synonyms for "food" in your search engine, your users do not need to type multiple variants to find the information they are seeking. The custom search engine automatically searches for all sites that are relevant to "food", "meal", "chow", "cooking", and other related terms. For recommendations on the types of synonyms to create, see the Improving User Queries for More Relevant Results page.

The Synonyms tab lets you create synonyms for specific search terms.

To create a synonym, click Add and define the search terms that would trigger the synonym expansion. To delete a synonym, select the checkboxes next to the synonyms you want to remove, and click Delete.

If you want to add a lot of synonyms, create an XML file instead, and use the Synonyms page to upload the file. To upload the XML file you created, click Upload and select the file.
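
To make the idea of synonym expansion concrete, here is a minimal Python sketch that widens a query using a table of variants, following the "food" example above. The function name expand_query and the dictionary layout are assumptions for illustration only; they do not reflect the XML upload format or the way Custom Search applies synonyms internally.

```python
def expand_query(query, synonyms):
    """Return the query terms plus any configured variants for each term."""
    expanded = []
    for term in query.lower().split():
        for variant in [term] + synonyms.get(term, []):
            if variant not in expanded:
                expanded.append(variant)
    return expanded


synonyms = {"food": ["meal", "chow", "cooking"]}
print(expand_query("cheap food", synonyms))
```

A user who types only "cheap food" is then treated as if they had also typed "meal", "chow", and "cooking".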

Look and feel

If you have your own website, you can change the design of your search box and customize the style of the search results page to match the look and feel of your website.

You can select one of the predefined themes that broadly matches the look and feel of your website. If the standard themes are not quite what you want, you can make further changes.

To keep the changes you made, click Save before you move on to the next tab; otherwise, you will lose all your modifications.

For more information, see the Designing the Look and Feel page.

Get Code

After you specify the look and feel for your search box and search results pages in the Look and feel tab, you can copy the generated code in the Get code tab and insert it in your webpages. Presto, you have an instant custom search engine.

Collaboration

You can make your search social by inviting collaborators to help you tweak your search engine. They can include or exclude sites from your search engine and apply search refinements to them. They cannot change the name, description, and look and feel of your search engine, nor can they access the code and the monetary features in your search engine. You can either invite contributors or accept volunteers who want to collaborate. Your collaborators must have a Google account to contribute to your search engine. If they don't already have one, they can easily create an account.

To invite contributors, simply fill out the Invite others to contribute section and click the Send Invite button. You can have up to 100 contributors.

If your contributors have their own search engines, you can use their refinement labels. You can include relevant sites that they have tagged with labels in your search engine results.

To start contributing, your collaborators can use their Google accounts to log into their My search engines pages. Under the Search engines I'm contributing to section, they can click the control panel for your search engine. The collaborators' control panels are spare and do not have all the fancy features of a full control panel.

Make money

Make money with your custom search engine by connecting it to your Google AdSense account. When users click on an ad in your search results, you get a share of the ad revenue.

If you do not have an AdSense account but want one, simply click I am a new AdSense user and fill out the form. After your application is approved, you can see your AdSense ID. You can also create channels to track the monetization performance of your search engines. To learn more about channels, see the Help Center for AdSense.

Warning: If you already have an existing AdSense account, do not create a new one, even if you create multiple search engines. Google automatically associates your search engines with the same AdSense account. Creating another AdSense account might result in the termination of your AdSense account.

If you have an existing AdSense account, click I already have an AdSense account and fill out the form. All the search engines in your account will be associated with your AdSense account.

Business account

Google Site Search lets you create search engines that do not include ads, remove Google branding (if you so choose), have access to XML feeds of your search results, and have more control over how the results are presented to your users, among other things. Although you can manage and define your Site Search search engine in the control panel, you have to use the WebSearch Protocol to implement the additional level of customization.

If you want to upgrade to the business edition, click the Convert to Google Site Search button and fill out the form. The business edition starts from $100 a year. The annual fees for the various plans are listed on the Google Enterprise page.

Advanced

When you've outgrown the control panel and want to start tinkering with the advanced features, you should consider using the Custom Search context and annotations files. The context file is in XML format, while the annotations file can be in OPML, TSV, or Custom Search XML format. Don't be intimidated if none of these terms sound familiar to you; the rest of the developer guide discusses them and shows you what you can do with the advanced tools.

Preview

When you tweak your search engine, you can test the changes in the Preview tab.

Statistics

The Statistics tab displays the same dashboard that you can access from the My search engines page. It shows the number of queries processed by your search engine. If enough users are using the same search terms, you will be able to see the most popular queries for your search engine. You can use this data to fine-tune your search engine.

Releasing Your Search Engine into the Wild

Once you have defined your search engine, your users can access it in four places:

  • Your custom search engine homepage - If your search engine is popular and you opted to have it listed (the listing is set in the Preferences section of the Basics tab), users can find your search engine in the Custom Search directory; otherwise, you can send the URL to your friends. To get the URL for your homepage, go to the My search engines page and click the homepage link for your search engine. Copy the URL from the address bar of your browser.
  • A search engine gadget in their iGoogle pages - Your homepage has a button that lets users add your search engine gadget to their iGoogle page. You don't have to do anything.
  • A search engine gadget in their webpages or blogs - Your homepage has an "Add this search engine to your blog or webpage »" link that generates code for your search engine gadget. Your users can paste that code into the HTML of their webpages, and their users can access your search engine from that webpage. The code for your gadget is created automatically, so you do not need to learn the Gadgets API; but if you have some time, you might want to check it out.
  • A search box in your website - If you embed a search box in your webpage, your users can make searches from your website. To get your very own search box, copy the generated code from the Get code tab of the control panel, and paste it into the HTML of your webpage.

Remember to create your profile to let your users know a bit about you.

Taking the Next Step

If the custom search engine that you just created with the wizard works for your needs, you're all set. But if you want to learn more about the more powerful and advanced features of the Custom Search API, you can continue to The Basics.
