HtmlSearch Usage
This document describes how to deploy and use the HtmlSearch local search.
HtmlSearch can be deployed as an applet or an application.
A number of parameters apply in either case.
The CLASSPATH environment variable may need to be set for your search to work.
There are a number of examples that describe various combinations of options.
See also the user help for screen description and options, as well as support and contact info.
When invoking HtmlSearch, you can specify parameters whether you use it as an applet or as an application.
Refer to the appropriate section for how to specify the parameters.
The HtmlSearch parameters correspond to settings on the screen (most of them in the Advanced panel), so you may want to refer
to the user help for more explanations.
The asterisk (*) indicates parameters that are ignored in the Lite version.
Parameters that are available in both versions can be set in the Lite version, but the user will not be able to
change them from the screen.
- Search target:
- lookFor
- String: string to look for.
- caseSensitive
- boolean: search case sensitive. Default = false (i.e. no)
- wordOnly
- boolean: match the string as a whole word only. Default = false (i.e. no)
- onlyHtml
- boolean: only read HTML files.
By default, HtmlSearch will only read .HTM or .HTML files.
You can set this option to 'false' if you want to access other files (e.g. .txt, .doc).
There is no guarantee that such files will be accessed, depending on the security settings of your browser.
See also nosearchSuffix. Default = 'true' (i.e. yes).
- otherHost
- boolean: allow search in other hosts.
By default, HtmlSearch only searches the host of the start URL,
i.e. it doesn't go out to the internet to follow links.
Set this option to 'true' to follow links elsewhere.
Going to other hosts may be disallowed by the security settings of your browser or firewall.
Default = 'false' (i.e. no).
- startUrl
- see specifics for applets or applications
- Restrictions:
- maxSize (*)
- int: maximum size of the files read.
This can be used to eliminate large files which can take a long time to access
(a typical HTML page is rather small, e.g. 10 to 50K). The default is 200K.
- nogoDir (*)
- String: directories into which links are not followed.
The pages in these directories are not searched for the string, nor for links to other pages.
Specify the directories as a comma separated list.
- nosearchDir (*)
- String: directories not to search for 'lookFor'.
Specify the directories as a comma separated list.
The pages in these directories are accessed to determine links to other pages, but not searched.
- nosearchSuffix (*)
- String: file suffixes to avoid.
A number of suffixes are defined by default (e.g. jpg, gif) but you can define your own if you wish.
That'll speed up the search because the corresponding pages are not accessed.
Specify the suffixes as a comma separated list.
- nogoHost (*)
- String: hosts to avoid.
If you want to avoid visiting pages on certain hosts, use this option.
That'll speed up the search because the corresponding pages are not accessed.
Specify the hosts as a comma separated list.
Each host should be specified as 'www.somehost.com'.
- retryErrors
- boolean: retry pages previously inaccessible.
HtmlSearch keeps track of pages it tried to access, and if an error is encountered, it doesn't retry.
You can set this option to 'true' to force HtmlSearch to retry after an error.
This can significantly slow down the search. Default = false (i.e. no)
- tryIndexOnError
- boolean: try index.htm and index.html for inaccessible links whose
name ends in '.htm' or '.html'. Default = false (i.e. no)
- checkContentType
- boolean: check the MIME type of pages before searching them. Default = true (i.e. yes)
- dirLevels (*)
- int: number of directory levels searchable above the start URL.
You can limit how far 'up' HtmlSearch can go in your directory structure.
This can be useful to search part of a tree without going back to the root of the tree, which would
be the case if a page down in one of the limbs had a link to the root of the tree.
To specify "don't go above the start URL", set this option to "0". Default = no limit.
- downLevels (*)
- int: number of directory levels searchable below the start URL.
You can limit how far 'down' HtmlSearch can go in your directory structure.
This can be useful to search part of a tree without going down to the leaves of the tree.
To specify "don't go below the start URL", set this option to "0". Default = no limit.
- metaOnly
- boolean: search in META tags only.
This speeds up the search because only the Header section is searched. Default = false (i.e. no).
See also the metaTagName parameter.
- metaTagName
- String: when searching in META tags only, you can restrict the search
to tags whose NAME attribute matches the value of metaTagName. Default = "" (i.e. all META tags
are examined)
- User interface:
- showLocationInOptionsPanel
- boolean: show the dialog allowing the user to select the starting URL
on the main panel. Otherwise, display it in the Advanced panel. You can set showLocationInOptionsPanel
to false to get a more 'bare' look, or if you don't want the users to easily change the starting location;
this is especially useful if you don't want to allow the users to go 'on the web' by setting an HTTP URL.
Default = 'true' (i.e. show the location dialog on the main panel).
- noDetailsPanel (*)
- boolean: do not show a button allowing the user to switch to the Details panel.
This option can be used for a simple search where the user has very limited choices (e.g. see the file help.html).
Default = 'false' (i.e. show the button).
- noAdvancedPanel (*)
- boolean: do not show a button allowing the user to switch to the Advanced panel.
This option can be used for a simple search where the user has very limited choices (e.g. see the file help.html).
It can be used in conjunction with 'showLocationInOptionsPanel="false"' if desired.
Default = 'false' (i.e. show the button).
- noDomainSelection
- boolean: do not show the radio buttons allowing the user to
select the search domain (new, found pages, previous pages).
Default = 'false' (i.e. show the radio buttons).
- helpFile
- String: location of the help file, if you want to use your own instead of the
default one. If you set the value to "", no help will be available.
You may have to set your CLASSPATH environment variable to point to the directory where your help file resides.
Default = "HtmlSearchApp/help.html".
- displayContainerPage
- boolean: display page in its container if possible, e.g. the frame the page belongs to.
Default = 'true' (i.e. display the page in its container)
- showFoundUrl
- int: when listing a found page, list its URL: -1 = before the title, 1 = after the title, 0 = don't list the URL.
Default = '-1' (i.e. display the URL first)
- forceLite (*)
- boolean: if you are using the HtmlSearch Pro version, it is sometimes useful to make it 'pretend' to be the Lite version
(e.g. because the Lite version loads faster). The difference between the 'forced lite' version and the 'real lite' version is that all the parameters
that are otherwise unavailable in the real Lite version are still available in the forced lite version.
Default = 'false' (i.e. use the Pro version)
- targetWindow
- see specifics for applets
- standAloneFrame
- see specifics for applets
- displayApp
- see specifics for applications
- Indexing:
- noIndexPanel (*)
- boolean: do not show a button allowing the user to switch to the Index panel.
This option can be used for a simple search where the user has very limited choices (e.g. see the file help.html).
Default = 'false' (i.e. show the button).
- indexExclusion (*)
- String: location of the index exclusion dictionary file, if you want to use your own instead of the
default one. If you specify a file with a relative path, the path is resolved from the location of the HTML file from which the applet is launched.
Default = "HtmlSearchApp/indexExclusion.txt".
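As an illustration of how several of these parameters combine, here is a minimal sketch of a restricted, simplified search written as applet param elements (the applet tag that surrounds them is described later in this document). The parameter names and boolean values come from the list above; the suffix values are a hypothetical choice, and the parameters marked with an asterisk would be ignored in the Lite version.

```html
<!-- Sketch only: these elements belong inside an HtmlSearch applet tag. -->

<!-- Don't go above the start URL (value "0" as described for dirLevels). -->
<param name="dirLevels" value="0">

<!-- Avoid files with these suffixes (hypothetical choice of suffixes). -->
<param name="nosearchSuffix" value="pdf,zip">

<!-- Simplified screen: move the location dialog off the main panel
     and hide the buttons that switch to the other panels. -->
<param name="showLocationInOptionsPanel" value="false">
<param name="noAdvancedPanel" value="true">
<param name="noDetailsPanel" value="true">
<param name="noIndexPanel" value="true">
```

With settings along these lines the user would only be able to type a search string and start the search, which matches the 'very limited choices' case mentioned for the panel options.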
When used as an applet, HtmlSearch is invoked from an HTML page. Browser security settings may restrict access to other hosts and/or your
PC. You may also have to set the CLASSPATH environment variable.
HtmlSearch can be used within an HTML page with the following applet tag:
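A minimal sketch of what such a tag can look like, assembled from the elements explained in this section. The codebase value is taken from the directory example given for codebase, and the width and height are the sizes suggested here. The archive name HtmlSearch.jar is a placeholder, not a name confirmed by the distribution; 'archive' is the standard applet attribute for naming a JAR file.

```html
<!-- Sketch only: adjust codebase to your own installation. -->
<applet codebase="../../h2"
        code="HtmlSearchApp.HtmlSearch.class"
        archive="HtmlSearch.jar"
        width="650" height="550">
  <!-- Optional: where the search starts ("." is the documented default). -->
  <param name="startUrl" value=".">
</applet>
```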
Here is a brief explanation of the various elements of the invocation (this is not an HTML tutorial!):
codebase = path to the directory containing the HtmlSearch directory
(where the search class files have been copied)
e.g.: if the applet has been installed in the directory D:\h1\h2\HtmlSearchApp
and this page is in D:\h1\p1\p2\search.html set codebase to: "../../h2".
If you can, it is better to use relative paths than absolute paths.
code = "HtmlSearchApp.HtmlSearch.class". No choice here.
width and height = any size you want to display your applet. The height
of 550 provides a nice balance between all the panels.
The width can be anything you want (less than 500 would look really
goofy though): 650 seems to provide enough room for most
error messages during the search; 550 gives a nicer look. Try what fits
for you. If you are using the standAloneFrame option
you can set these to 5 (or some lower value: there is a lurking Netscape bug, so
you'll have to try what works for the version you're using).
jar = the name of the JAR archive for loading a compressed image of the code.
This speeds the loading of the applet for Netscape and MsIE 4 and above, or
any other browser that supports loading JAR files (JDK 1.1 and above).
param: startUrl = optional if you want to specify where the search starts
from; e.g. startUrl = the page that invoked this search page
(default = from this page).
All the parameters follow the same syntax:
just replace "startUrl" by the name of the option,
and set the value to what you want for the option.
See below for the list of options.
You can also use the applet attributes 'MAYSCRIPT' and 'NAME' if you
want to use JavaScript to refer to the applet (you'll quickly
run into Netscape vs Explorer discrepancies though, so beware...)
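To illustrate that every parameter uses the same syntax as startUrl, here is a sketch of a tag that sets a few search-target parameters and also carries the NAME and MAYSCRIPT attributes. The applet name "htmlSearch", the search string, and the archive name HtmlSearch.jar are hypothetical; the parameter names and the host format come from the parameter list.

```html
<!-- Sketch only: names and values are illustrative. -->
<applet name="htmlSearch" mayscript
        codebase="../../h2"
        code="HtmlSearchApp.HtmlSearch.class"
        archive="HtmlSearch.jar"
        width="650" height="550">
  <!-- Same pattern for every option: name = option, value = setting. -->
  <param name="startUrl" value=".">
  <param name="lookFor" value="installation">
  <param name="caseSensitive" value="false">
  <!-- Follow links to other hosts, but avoid one of them. -->
  <param name="otherHost" value="true">
  <param name="nogoHost" value="www.somehost.com">
</applet>
```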
The following HtmlSearch parameters have a particular usage with applets:
- startUrl
- String: URL to start the search. By default, the startUrl is "." (i.e. the directory from which the applet was launched)
- standAloneFrame
- boolean: show the dialog in a stand alone frame instead of in the applet.
By default, the HtmlSearch dialog is displayed in the page where HtmlSearch is invoked.
If you set standAloneFrame to 'true', nothing will be displayed until you invoke the initsearch() function.
This can be used if you want a button to start the search (see example in top.htm).
The advantage is that the dialog window can be resized, and also that it doesn't occupy real estate on your HTML page.
There is however a Netscape bug, so beware: if you leave the HTML page from which the search was started
(e.g. by typing another URL or following a link), the 'display' button doesn't work anymore.
- targetWindow
- String: name of the window or frame in which to display the page.
By default, when you press the 'display' button, pages are shown in a new browser window.
You can tell HtmlSearch to display the page in a specific window.
This is useful if you want to display the results in a specific frame.
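The following sketch shows how standAloneFrame and targetWindow might be combined with a button that opens the search dialog. It assumes that initsearch() can be called on the applet object from a script, which the description above suggests but does not spell out; the example in top.htm is the reference to follow. The applet name "htmlSearch", the frame name "results", and the archive name HtmlSearch.jar are hypothetical.

```html
<!-- Sketch only: see top.htm for the example shipped with HtmlSearch. -->
<applet name="htmlSearch" mayscript
        codebase="../../h2"
        code="HtmlSearchApp.HtmlSearch.class"
        archive="HtmlSearch.jar"
        width="5" height="5">
  <!-- Show the dialog in its own frame, only once initsearch() is invoked. -->
  <param name="standAloneFrame" value="true">
  <!-- Display found pages in the frame or window named "results". -->
  <param name="targetWindow" value="results">
</applet>

<form>
  <!-- Assumption: initsearch() is reachable from script on the applet. -->
  <input type="button" value="Search this site"
         onclick="document.applets['htmlSearch'].initsearch()">
</form>
```

The small width and height follow the advice given for standAloneFrame; because of the Netscape bug mentioned there, the values may need adjusting for the browser version in use.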
To use HtmlSearch as an application, you need to start it with the Java interpreter, which you can download
from the Java site if you don't already have it on your machine. Java 1.1.x will also work
(it is also available at the Java site).
On Windows, if you download the Java Interpreter after you installed HtmlSearch, you may want
to re-run the installation because that will automatically create a batch file that invokes HtmlSearch.
When you use HtmlSearch as a Java application, there are no browser access restrictions, so searches can take place anywhere
your firewall lets you go.
The display button will work only if you specify which application you want to use to display the pages.
To specify the arguments, use a syntax such as: argumentName="value".
To specify quoted arguments, escape the quotes with a backslash and, if necessary,
separate the escaped quotes from the other quotes with a space; this is somewhat platform dependent,
so you may have to try what works in your environment.
The following HtmlSearch parameters have a particular usage with applications:
- startUrl
- String: URL to start the search.
You should set this one to a specific absolute URL, because the application has no frame of reference.
- displayApp
- String: application to use when displaying the pages.
Typically this would be the path to your browser, e.g. on Windows
"C:\program files\netscape\navigator\program\netscape.exe"
Depending on your Operating System and settings, this may start a new browser every time you click the display button.
You can start an HtmlSearch application with a command such as (example given for Windows, with lines split for better readability):
cd HTMLSEARCH_INSTALL_DIR
JAVA_BIN_DIR\java HtmlSearchApp.HtmlSearch startUrl="D:\dir\index.html"
displayApp="C:\program files\netscape\navigator\program\netscape.exe"
lookFor=" \"a complex sentence\" \"another sentence\" individual words "
with HTMLSEARCH_INSTALL_DIR = the directory where you installed HtmlSearch (i.e. the one containing the subdirectory HtmlSearchApp)
and JAVA_BIN_DIR = the directory where the Java interpreter resides (e.g. on a Windows PC, look for java.exe).
The HtmlSearch Java window can be closed with CTRL-F4, although that is somewhat platform
dependent, so try it first.