Tags

A tag is a special character string parameter that can be assigned to a set of documents to group them into a logical subsection of the search database. At search time you can specify a tag value to limit the search to the desired group. You can also pass a tag to indexer to have it re-crawl only the given group of documents, or to perform other actions on the group, such as displaying statistics for the group or deleting the documents that belong to it.

Adding tags

Use the Tag command in indexer.conf to assign a tag value to a site, a part of a site, or a group of sites. For example:


Tag cars
Server http://www.ford.com/
Server http://www.toyota.com/

Tag computers
Server http://www.sun.com/
Server http://www.apple.com/

Using tags at search time

When sending a search query, you can specify a tag value in the t parameter of search.cgi to limit the search to the desired subsection of the database. You may find it useful to add a SELECT variable to the search form in search.htm:


Search through:
<SELECT NAME="t">
  <OPTION VALUE="">All sites</OPTION>
  <OPTION VALUE="cars">Cars</OPTION>
  <OPTION VALUE="computers">Computers</OPTION>
</SELECT>

See Tag, indexer.conf-dist and search.htm-dist for more details and examples.
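
As a rough illustration of how the t parameter ends up in a request, the following Python sketch builds a search.cgi URL by hand. The host name and path are placeholders, build_search_url is a hypothetical helper, and the sketch assumes that the query words are passed in a parameter named q. Only the t parameter comes from the description above.

```python
from urllib.parse import urlencode


def build_search_url(base_url, query, tag=""):
    """Return a search URL, adding the t parameter only when a tag is given."""
    # "q" as the query-word parameter is an assumption of this sketch.
    params = {"q": query}
    if tag:
        params["t"] = tag  # limit the search to documents with this tag
    return base_url + "?" + urlencode(params)


# Placeholder location of search.cgi; adjust to your installation.
BASE = "http://localhost/cgi-bin/search.cgi"

print(build_search_url(BASE, "engine", "cars"))
print(build_search_url(BASE, "engine"))  # empty tag: search all sites
```

An empty tag corresponds to the "All sites" option in the form above. Note that urlencode percent-encodes special characters, so a tag value containing % is sent as %25.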

Using substring tag match

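Starting with version 3.1.x, tags are stored in the database as character strings and are matched using the SQL LIKE operator. Tag patterns can therefore contain the _ wildcard (matches exactly one character) and the % wildcard (matches any sequence of characters, including an empty one), which makes substring searches on tags possible.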
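
To make the wildcard semantics concrete, here is a small Python sketch that emulates LIKE matching by translating a pattern into a regular expression. It is an illustration only: the real matching is done by the SQL database, whose case sensitivity and escape rules may differ. The names like_to_regex and tag_matches are hypothetical.

```python
import re


def like_to_regex(pattern):
    """Translate a LIKE pattern into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")  # any sequence of characters, possibly empty
        elif ch == "_":
            parts.append(".")   # exactly one character
        else:
            parts.append(re.escape(ch))  # everything else matches literally
    return re.compile("".join(parts), re.DOTALL)


def tag_matches(tag, pattern):
    """Return True if the whole tag matches the LIKE pattern."""
    return like_to_regex(pattern).fullmatch(tag) is not None


print(tag_matches("computers", "comp%"))  # True: prefix match
print(tag_matches("computers", "%put%"))  # True: substring match
print(tag_matches("cars", "ca__"))        # True: two single-character wildcards
print(tag_matches("cars", "ca_"))         # False: the pattern is one character short
```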

Nested tags

SQL LIKE patterns also make it possible to have nested tags. For example, documents with the tag value computers-hardware can be found using any of the following tag patterns: computers-%, computers-hardware, %-hardware.
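
The nested-tag idea can be tried out with a throwaway SQLite database. This is a sketch under stated assumptions: the docs table, its columns, and the example URLs are invented for the illustration and do not reflect the actual search database schema, and SQLite merely stands in for whichever SQL backend is in use.

```python
import sqlite3

# Hypothetical documents with "parent-child" style tags.
DOCS = [
    ("http://example.com/cpu", "computers-hardware"),
    ("http://example.com/os", "computers-software"),
    ("http://example.com/tools", "garden-hardware"),
]


def urls_for_tag_pattern(pattern):
    """Return the URLs whose tag matches the given LIKE pattern."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE docs (url TEXT, tag TEXT)")
        conn.executemany("INSERT INTO docs VALUES (?, ?)", DOCS)
        rows = conn.execute(
            "SELECT url FROM docs WHERE tag LIKE ? ORDER BY url", (pattern,)
        ).fetchall()
        return [url for (url,) in rows]
    finally:
        conn.close()


print(urls_for_tag_pattern("computers-%"))         # everything under "computers"
print(urls_for_tag_pattern("%-hardware"))          # every "hardware" subsection
print(urls_for_tag_pattern("computers-hardware"))  # one exact subsection
```

The hyphen is only a naming convention here. Any separator works as long as it is not one of the LIKE wildcards.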

Multiple selections

By using LIKE wildcards you can make a document match multiple tag selections. For example, the tag ABCDE matches all of the following patterns:


_BCDE
A_CDE
AB_DE
ABC_E
ABCD_

Using tags with indexer

To limit an indexer action to a desired tag, use the -t command line option. For example:


indexer -t cars -S
This command displays statistics for the documents associated with the tag cars.

You can also use multiple -t options. For example:


indexer -t cars -t computers -am
This command marks all documents with the tags cars and computers as expired and re-crawls them, forcing a full update of the information about these documents.

The -t command line option also understands SQL LIKE patterns. This command:


indexer -t "c%" -C
will delete information about all documents associated with any tag starting with the letter c from the search database.
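
If indexer is driven from a script, the repeated -t options can be assembled programmatically. The Python sketch below only builds and prints the command lines shown above and does not run anything. indexer_command is a hypothetical helper, and the sketch assumes that indexer is on the PATH when the resulting list is eventually executed.

```python
import shlex


def indexer_command(tags, action):
    """Build an indexer argument list with one -t option per tag."""
    argv = ["indexer"]
    for tag in tags:
        argv += ["-t", tag]  # each tag (or LIKE pattern) gets its own -t
    argv.append(action)      # action option taken from the examples above
    return argv


# Reproduce the three commands from this section.
for tags, action in [(["cars"], "-S"),
                     (["cars", "computers"], "-am"),
                     (["c%"], "-C")]:
    print(shlex.join(indexer_command(tags, action)))
```

Passing the list form to a process launcher such as subprocess.run avoids having to quote tag patterns for the shell.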