Lesson 3.1: Web organization

Access Lesson 3.1 slides here

Welcome to Class 3, which will focus on operators.

What is an operator? An operator is a command that you add to your query to give Google special instructions about how you want it to deal with a specific search term. Some operators are symbols, while others are words. Which symbols or words act as operators is determined by Google, not by the searcher. Today you will learn about several operators and gain a better sense of what operators are.

For example, if you do a query for [tesla coil], which you saw earlier, you get results about Tesla coils, as well as Nikola Tesla and other associated ideas.

Figure: Search results for [tesla coil].

But suppose you want to limit results to pages that come from the Stanford University website (stanford.edu). You add an operator, in this case the site: operator: [tesla coil site:stanford.edu]. It limits the results to pages with the words tesla and coil, but just from the specified website, stanford.edu.

Figure: Notice that every web address is from stanford.edu in these results for the query [tesla coil site:stanford.edu].

Here is the key idea to remember about the way operators work: imagine a giant blob of all the results for the search [tesla coil]. When you add an operator like [site:stanford.edu], Google gives you a subset of those results. In effect, you are saying, "Here is the entire space of all possible results; now give me only the ones from this site."

Figure: Venn Diagram showing that the results for [tesla coil site:stanford.edu] are a subset of the results for [tesla coil].
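The subset idea can be sketched in a few lines of Python. This is only an illustration of the concept, not a description of how Google implements site:. The result URLs and the helper name from_site are made up for the example.

```python
from urllib.parse import urlparse

# Hypothetical result URLs standing in for the results of [tesla coil].
all_results = {
    "https://en.wikipedia.org/wiki/Tesla_coil",
    "https://www.stanford.edu/hypothetical/tesla-coil-demo",
    "https://news.stanford.edu/hypothetical/tesla-coil-story",
    "https://www.example.com/build-a-tesla-coil",
}


def from_site(url: str, site: str) -> bool:
    """Return True if the URL's host is `site` or a subdomain of it."""
    host = (urlparse(url).hostname or "").lower()
    site = site.lower()
    return host == site or host.endswith("." + site)


# Stand-in for [tesla coil site:stanford.edu]: keep only stanford.edu pages.
restricted = {url for url in all_results if from_site(url, "stanford.edu")}

for url in sorted(restricted):
    print(url)

# The restricted results are always a subset of the unrestricted ones.
print(restricted <= all_results)  # True
```

Adding the operator never introduces new results; it only removes the ones that fall outside the specified site.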

In addition to using the site: operator to narrow to a single site, you can also use it to limit to all the sites within a top-level domain:

Country code top-level domains are a good way to restrict what you are looking for, such as using .in (India), .br (Brazil), .es (Spain), or even .aq (Antarctica).

Be aware, too, of how top-level domains operate. For example:

If you need to find out the domain for a particular country, you can do a search like [government bosnia] and look at what domains come up in results.

You can use site: with parts of domains, such as ac.uk for academic sites within the United Kingdom or go.ke for government sites in Kenya.
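One way to picture how a top-level domain or a partial domain such as ac.uk restricts results is to compare the trailing labels of a hostname with the labels given to site:. The sketch below is an approximation for illustration only; the function name host_matches and the example hostnames are hypothetical, and Google's actual matching rules may differ.

```python
def host_matches(host: str, site: str) -> bool:
    """Return True if the trailing labels of `host` equal the labels of `site`."""
    host_labels = host.lower().strip(".").split(".")
    site_labels = site.lower().strip(".").split(".")
    # Compare whole labels, so "notac.uk" does not count as part of "ac.uk".
    return host_labels[-len(site_labels):] == site_labels


# Hypothetical hostnames paired with the value you might give to site:.
examples = [
    ("www.example.ac.uk", "ac.uk"),   # academic site in the United Kingdom
    ("www.example.co.uk", "ac.uk"),   # commercial site, so not under ac.uk
    ("data.example.go.ke", "go.ke"),  # government site in Kenya
    ("www.example.in", "in"),         # any site under India's country code
]

for host, site in examples:
    print(f"{host} matches site:{site}? {host_matches(host, site)}")
```

Matching on whole labels rather than raw text is what keeps a restriction like ac.uk from accidentally including unrelated domains that merely end in the same letters.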

Here are some tips for writing site: searches that work:

1. Spacing

The site: operator does not work if there is a space after the colon:

2. Query order

The site: portion of the query can come either before or after the other search terms:

3. Including the period

Now, take a look at what a search looks like with just the top-level domain: [business workplace accident rates site:gov].

Figure: Results for the query [business workplace accident rates site:gov].

Notice that all the results are coming from .gov sites, which denote government pages here in the United States.
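The spacing and query-order tips can be sketched as a small helper that assembles a query string. The function name site_query and its parameters are hypothetical, and the search URL pattern with a q parameter is an assumption based on how Google search links commonly look, not something this lesson specifies.

```python
from urllib.parse import urlencode


def site_query(terms: str, site: str, site_first: bool = False) -> str:
    """Build a query string that restricts `terms` to `site`."""
    site = site.strip()
    if not site or " " in site:
        raise ValueError("site must be a single domain with no spaces")
    # No space after the colon, otherwise the operator is not recognized.
    operator = f"site:{site}"
    # Collapse any repeated whitespace in the search terms.
    terms = " ".join(terms.split())
    # The operator can come before or after the other search terms.
    parts = [operator, terms] if site_first else [terms, operator]
    return " ".join(part for part in parts if part)


query = site_query("business workplace accident rates", "gov")
print(query)  # business workplace accident rates site:gov

# Same restriction with the operator placed first.
print(site_query("business workplace accident rates", "gov", site_first=True))

# Assumed URL pattern for running the query in a browser.
url = "https://www.google.com/search?" + urlencode({"q": query})
print(url)
```

Building the operator in one place makes it hard to slip a space in after the colon, which is the most common reason a site: search silently stops restricting results.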

Government websites may contain databases that house rich data. While Google can point you to the front pages of these databases, the databases themselves are often created in a manner that does not allow Google to crawl the data directly. So, you can use Google to find the homepage of the database, and you can go in and search the contents yourself.

For example, OSHA has a ton of data available to you, and you can find the databases that contain it through a site restriction mechanism like this, but you have to go to the front page of the database to actually search its contents.

If you want to learn more about this concept of the deep web, check out the bonus video in the forum.

You can also explore these features by starting with a simple search, like [mariculture], which means fish farming.

Start with a search like [mariculture site:com]:

Figure: Results for the query [mariculture site:com].

You can then use this search in other types of media on Google. For example, in Image search:

Figure: Results for the image search [mariculture site:com].

These images are all from .com sites. You can also change the query to find images from [site:gov].

Figure: Results from the query [mariculture site:gov].

Now, you are seeing only results from governmental sites in the US that have images involving mariculture.

The same trick works with other media, like News:

[business workplace site:gov]

Figure: Results from News search for [business workplace site:gov].

You can confirm that these results are from News because News is highlighted in red in the column on the left side of the screen.

So, in this lesson you have explored how the site: operator can be used to restrict results to either a specific top-level domain or a specific website.

As you will see in the next few classes, there are many different kinds of operators. Today you will learn about the top four or five that people use a lot. You can explore even more operators here and here, but these are the ones that we use day to day and that actually give you a lot more power.

Go ahead and try the site: searching activity.


Power Searching with Google © 2015  Google, Inc.  (DMR 7-19-15)