Stata commands often need to access external programs or ancillary files, which may be hard to find unless they are in the working directory or the ado path. The whereis
command provides a convenient way to keep track of resource locations by maintaining a directory or registry of external files and folders, making things simple for developers and users alike.
The syntax of the command is simple:
whereis [name [location]]
The name
argument specifies the name of a resource, and must be a single word conforming to Stata conventions for names, with no spaces.
The location
argument is used when registering a resource and should be a full path specifying the location of the file or folder. This must conform to Stata conventions for file names. In particular, it should be enclosed in quotes if it includes spaces.
When the command is called with name
and location
it checks that the named file or folder exists and creates or updates an entry in its registry. If only a name
is specified, the command retrieves and prints the location of the named resource and checks that it exists. In both cases the location is returned in the macro r(name)
, with name replaced by the name of the resource.
If the command is called with no arguments it simply lists all registered resources.
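As a minimal sketch of the three calling forms, assume a resource named pandoc and a file that actually exists at the path shown. The path is only illustrative, so substitute your own.

```stata
* Register a resource: whereis checks that the file exists before saving it.
* The path below is illustrative; replace it with the real location.
whereis pandoc "c:\program files (x86)\pandoc\pandoc.exe"

* Look up a registered resource: prints the location and returns it in r(pandoc).
whereis pandoc
display "`r(pandoc)'"

* List every registered resource.
whereis
```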
Consider a Stata command that needs to access an external executable, for example the pandoc
document converter. Suppose pandoc
was installed at c:\program files (x86)\pandoc\pandoc.exe
. How can we pass this information to the command?
One solution I have seen used is to provide an option for the user to specify the full path to the executable. For example the developer may provide a pandoc()
option, so the user can type pandoc(c:\program files (x86)\pandoc\pandoc.exe)
among the options. Unfortunately this procedure is tedious and error-prone, as the path has to be specified every time the program is used.
An alternative solution is to define a global macro, for example global PANDOC c:\program files (x86)\pandoc\pandoc.exe
, and an even better one is to store this macro in the user's profile.do
file, so it will be loaded when Stata starts up. There is a slight inefficiency in defining the macro regardless of whether the resource will be used, but presumably only a small number of programs would be involved for any given user.
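For comparison, the global-macro approach might look like the sketch below. The macro name PANDOC and the path come from the example above; the --version argument is just a harmless way to test the call.

```stata
* In profile.do, which Stata runs at startup:
global PANDOC "c:\program files (x86)\pandoc\pandoc.exe"

* Later, in any do-file or ado-file; the quotes protect the spaces in the path:
shell "$PANDOC" --version
```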
The whereis
command provides a simpler solution. Once pandoc
has been installed, the user registers its location by typing in Stata the one-time command whereis pandoc "c:\program files (x86)\pandoc\pandoc.exe"
, where the quotation marks are needed because the full path to the executable includes spaces.
In turn, the Stata command that needs to know the location of pandoc
uses the one-liner whereis pandoc
. The whereis
command will print the location of the file and, being an r-class command, will also store it in the macro r(pandoc)
, where it can be retrieved.
Diana Goldemberg from the World Bank suggested using whereis
to store the location of folders as well as files, and indicated how to modify the code to enable this extension. Their teams use GitHub and Dropbox with the same project folder structure, but everyone has a different GitHub or Dropbox root. Storing the root with whereis
provides a uniform way to refer to project files and folders.
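A sketch of that workflow follows. The resource name myproject, the Dropbox root, and the data/survey.dta file are all hypothetical.

```stata
* One-time step, run by each team member with their own root folder
* (hypothetical path; the folder must exist for registration to succeed).
whereis myproject "C:\Users\jane\Dropbox\myproject"

* In shared do-files: retrieve the root and build paths relative to it.
quietly whereis myproject
local root "`r(myproject)'"
use "`root'/data/survey.dta", clear
```

Forward slashes work in Stata file paths on every platform, and they avoid placing a backslash immediately before a macro reference, where Stata would treat it as an escape character.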
The advantages of the whereis
approach over storing global macros in profile.do
are that the resource location is retrieved only on demand and, more importantly, that the command checks that the file or folder exists at the given location, both on storage and on retrieval. This check can be important when Stata executes a command by "shelling out", as a failure there may not be noticed immediately.
For this scheme to work the user needs to know the location of the external resources. If you are not sure exactly where a program has been installed, the operating system may help locate the file.
On Mac and Linux systems there is a system command called which
that can find an executable by searching the user's path. If you are not quite sure where pandoc
was installed on your Mac, open a terminal window (select Applications, Utilities and then Terminal) and type which pandoc
. This will list the path to the executable if found. (There is also a Unix whereis
command, after which this Stata command is named, which searches the standard locations for binary files, but I have obtained better results with which
.)
On Windows there is a similar command called where
. By default this searches only the user's path, but there is an option to search recursively. If you think pandoc
was installed on your C drive, try opening a command prompt window and typing where /R c:\ pandoc.exe
.
Once you have identified the location of the file of interest using the operating system, don't forget to register it by running the Stata whereis
command.
Programmers using whereis
to access a resource should allow for the possibility that the path may include spaces, typically by enclosing the retrieved location in double quotes. For example, to execute pandoc
one could code
. whereis pandoc
. shell "`r(pandoc)'" *arguments*
Note that the command will fail with error code 601 if pandoc
has not been registered with whereis
or if the file is not found in the specified location.
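A calling program may prefer to trap that error and give the user a hint. The following is a minimal sketch, not part of whereis itself; the wording of the message and the use of --version as a test argument are assumptions made for illustration.

```stata
* Try to retrieve the location without aborting on failure.
capture whereis pandoc
if _rc == 601 {
    * Not registered, or the file is no longer at the registered location.
    display as error "pandoc not found; register it with: whereis pandoc <full path>"
    exit 601
}
else if _rc {
    * Any other error is passed through unchanged.
    exit _rc
}

* Success: r(pandoc) holds the location; quote it in case it contains spaces.
local pandoc "`r(pandoc)'"
shell "`pandoc'" --version
```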
The whereis
command is available from the Statistical Software Components (SSC) archive and can be installed by typing in Stata
. ssc install whereis
You may also try search whereis
and follow the links. The current version is 1.4, which became available on SSC on 28 Feb 2020.