Troubleshooting Beagle
If you run into any problems using Beagle, first and foremost please make sure you have the latest version of Beagle. Since it's under heavy development, bugs are frequently introduced and fixed, especially when running from SVN.
Also please check to make sure your question isn't answered in the FAQ.
Common problems
- beagled-helper process uses 100% CPU
- Search returns incomplete results or no results at all
- Other issues
Useful debugging information
Other useful information
- Using signals for diagnostic information at runtime
- Enabling and disabling backends
- If a filter isn't working properly, you can work around it with external filters.
Other issues
Signals
Signals can be sent to a running beagled or beagled-index-helper to retrieve runtime diagnostic information. They are meant to be used only for debugging.
- SIGUSR1: If beagle was built or run without debugging enabled, SIGUSR1 can be sent to turn on debug logging at runtime. The log messages can be found in the files ~/.beagle/Log/current-*.
- SIGUSR2: When sent SIGUSR2, beagled and index-helper print information about their current state to the log file. For example, beagled prints information about open IndexReaders, global tables, etc., and index-helper prints information about the file it is currently indexing.
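As a rough illustration, the sketch below sends one of these signals from a shell. The function name beagle_signal is hypothetical and not part of Beagle. It assumes the daemon's command line starts with beagled (the helper's with beagled-index-helper) and that pgrep is installed; check with ps how the processes actually appear on your system.

```shell
#!/bin/sh
# Hypothetical helper (not part of Beagle): send a diagnostic signal to every
# process of the current user whose command line starts with the given name.
beagle_signal() {
    sig=$1                 # USR1 (turn on debug logging) or USR2 (dump state)
    name=${2:-beagled}     # assumed process name; verify with ps
    # -f matches against the full command line, -u restricts to our own user.
    pids=$(pgrep -f -u "$(id -u)" "^$name( |\$)" 2>/dev/null)
    if [ -z "$pids" ]; then
        echo "no running process named $name" >&2
        return 1
    fi
    for pid in $pids; do
        kill -s "$sig" "$pid"
    done
}
```

For example, `beagle_signal USR2` followed by a look at ~/.beagle/Log/current-* should show the state dump, provided the assumptions above hold.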
Beagle loops on directories
(or, "Maximum inotify watch limit hit" or "ioctl: No space left on device" messages)
First and foremost, Beagle had two bugs (fixed in 0.2.14 and 0.2.15) which could cause directories to be processed over and over. This is one source of the 100% CPU issues people have been experiencing. If you are seeing this, first ensure that you are running Beagle 0.2.15 or later.
However, in certain cases looping over directories is correct behavior. Beagle will do this if you don't have inotify support on your operating system, or if you run out of the limited number of inotify watches available on your system. This latter problem will particularly affect systems which have users with very large numbers of directories underneath their home directory, or which have several simultaneous users.
To check whether inotify is enabled on your system, look for the file /proc/sys/fs/inotify/max_user_watches. If it does not exist, you are missing inotify support. See the Inotify Kernel page for more information on inotify.
If you do have inotify enabled, check your log files for the "Maximum inotify watch limit hit" message mentioned above. If it appears, you will need to increase the number of inotify watches available on your system.
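Both checks can be combined in a small script. This is only a sketch: the function name inotify_status is hypothetical, and it assumes the logs live in ~/.beagle/Log/current-* as described in the Signals section.

```shell
#!/bin/sh
# Hypothetical helper: report whether inotify is available and whether the
# Beagle logs mention the watch limit. Both paths can be overridden.
inotify_status() {
    limit_file=${1:-/proc/sys/fs/inotify/max_user_watches}
    log_dir=${2:-$HOME/.beagle/Log}    # assumed log location
    if [ ! -r "$limit_file" ]; then
        echo "inotify: not available ($limit_file is missing)"
        return 2
    fi
    echo "inotify: available, max_user_watches=$(cat "$limit_file")"
    # grep -l prints the names of the log files that contain the message.
    if grep -l "Maximum inotify watch limit hit" "$log_dir"/current-* 2>/dev/null; then
        echo "watch limit was hit; consider raising max_user_watches"
        return 1
    fi
    echo "no watch-limit message found in $log_dir"
}
```

Called without arguments, it inspects the live system. The return status is 0 when nothing is wrong, 1 when the limit message was found, and 2 when inotify is missing.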
Beagle sets up one watch per directory under your home directory, plus a few more for monitoring other files and directories. You can get an idea of how many directories you have under your home directory by running:
$ find $HOME -type d | grep -v "^$HOME/\." | wc -l
The default number of watches is 8192. 16384 is a good value for most people, and 32768 is probably more than enough. Each watch in use consumes some kernel memory, but raising the limit costs nothing for watches that are never used.
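To see how close you are to the limit, the directory count can be compared with the current value of max_user_watches. The sketch below is one way to do that; the function names are hypothetical, and the count ignores the "few more" watches Beagle uses for other files, so leave some headroom.

```shell
#!/bin/sh
# Count directories under a root, skipping top-level hidden directories
# (the same set that the find | grep pipeline above counts).
count_watch_dirs() {
    root=${1:-$HOME}
    root=${root%/}    # drop a trailing slash so the -path pattern matches
    find "$root" -path "$root/.*" -prune -o -type d -print | wc -l | tr -d ' '
}

# Hypothetical check: succeed only if the count is below the watch limit.
check_watch_limit() {
    root=${1:-$HOME}
    limit_file=${2:-/proc/sys/fs/inotify/max_user_watches}
    dirs=$(count_watch_dirs "$root")
    limit=$(cat "$limit_file") || return 2
    echo "directories: $dirs, watch limit: $limit"
    [ "$dirs" -lt "$limit" ]
}
```

Running `check_watch_limit` with no arguments checks your home directory against the live limit. A non-zero status suggests the limit should be raised.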
To change the limit:
# echo 16384 > /proc/sys/fs/inotify/max_user_watches
To make the change permanent you should put that command in one of your boot scripts. If you're running from a distribution package, you might want to ask them to create a boot script for you.
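As an alternative to a boot script, the same setting can be recorded under its sysctl key, fs.inotify.max_user_watches. The sketch below assumes your distribution applies /etc/sysctl.conf at boot, which you should verify; the function name persist_watch_limit is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: record the watch limit in a sysctl configuration file,
# replacing any earlier setting of the same key. Run as root for the default.
persist_watch_limit() {
    limit=${1:-16384}
    conf=${2:-/etc/sysctl.conf}    # assumed to be applied at boot
    key='fs.inotify.max_user_watches'
    tmp=$(mktemp) || return 1
    if [ -f "$conf" ]; then
        # Keep every line except previous settings of this key.
        grep -v "^[[:space:]]*$key" "$conf" > "$tmp" || true
    fi
    printf '%s = %s\n' "$key" "$limit" >> "$tmp"
    # Overwrite in place so the file keeps its owner and permissions.
    cat "$tmp" > "$conf" && rm -f "$tmp"
}
```

As root, `persist_watch_limit 16384` followed by `sysctl -p` would write the setting and apply it immediately.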
Recently a workaround was added to beagle to stop the recrawl if the inotify limit is reached. The downside is that changes in the unwatched directories will not be reported to beagle.
Beagle keeps looping on my Microsoft Word files
This is a bug in the wv1 software that Beagle uses to index Word documents. You need a patched wv1 package. See the Optional prerequisites page for more information and a link to the patch.
Beagle uses wrong filter for certain files
Beagle primarily uses freedesktop.org xdgmime to detect mimetypes. However, if a file has the extended attribute user.mime_type, Beagle uses that attribute's value as the mimetype instead. xdgmime is consulted only when the attribute does not exist.
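If a file is picked up by the wrong filter, it can help to check whether it carries a user.mime_type attribute. The sketch below uses getfattr from the attr package; the function name show_mime_override is hypothetical, and whether a stale attribute is the cause in your case is only a guess to be tested.

```shell
#!/bin/sh
# Hypothetical helper: print the user.mime_type extended attribute of a file,
# if one is set. Requires getfattr from the attr package.
show_mime_override() {
    file=$1
    [ -e "$file" ] || { echo "no such file: $file" >&2; return 2; }
    command -v getfattr >/dev/null 2>&1 || { echo "getfattr not found" >&2; return 3; }
    if value=$(getfattr --only-values -n user.mime_type "$file" 2>/dev/null); then
        echo "$file: user.mime_type=$value (used instead of xdgmime detection)"
    else
        echo "$file: no user.mime_type attribute; xdgmime decides"
        return 1
    fi
}
```

If the attribute holds a wrong value, `setfattr -x user.mime_type FILE` removes it, so that detection falls back to xdgmime.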
DocExtractor.exe crashes
This is related to the earlier problem with Microsoft Word files. Due to several open bugs in libwv1, the library beagle uses to parse Word .doc files, beagle ships with a standalone program called DocExtractor.exe. Instead of calling libwv1 internally from the indexing program, beagle calls this external program to extract text from Word .doc files. Documents which crash libwv1 will still crash DocExtractor, but because it is an external program, it is easier for the indexer to recover from the crash.
Beagle crashes on login
If you find that beagle is crashing at login, please report the problem on the mailing list or in the Beagle bugzilla. The crash stacktrace is very important in figuring out what went wrong, so please attach it if you have it. Unfortunately, the stacktraces collected by most crash reporting tools are not useful for mono applications like beagle; we need the mono stacktrace instead. Here is how to collect it.
Open the executable /usr/bin/beagled or /usr/local/bin/beagled (it is a shell script). Replace the line
exec -a $PROCESS_NAME $CMDLINE &
with
exec -a $PROCESS_NAME $CMDLINE 2>~/.beagle/crash &
Now whenever beagle is run, it will store any crash stacktrace in the file ~/.beagle/crash. Attach this file when reporting the problem.
Alternatively, instead of changing the installed beagled script, you can run beagled from a terminal. If the crash happens regularly or frequently, run the following from a terminal:
$ beagled --fg
This runs beagled in the terminal, printing each step as it goes, and prints the mono stacktrace if it crashes. You may omit the --fg, but it is generally helpful to know at what point the crash occurred.
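When running in a terminal, it is convenient to keep a copy of the output so the stacktrace can be attached to a bug report. A minimal sketch follows; the function name run_and_capture and the log file name in the example are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: run a command, show its output in the terminal, and
# keep a copy (stdout and stderr together) in a log file.
run_and_capture() {
    log=$1
    shift
    "$@" 2>&1 | tee "$log"
}
# Example (not run here):
#   run_and_capture "$HOME/beagled-fg.log" beagled --fg
```

Merging stderr into stdout matters here, because the line edited in the beagled script above redirects stream 2 to capture the stacktrace.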
Upon restart, beagle removes previous index and starts indexing from the beginning
If everything is normal, beagle will crawl the files and directories upon restart to make sure everything is indexed. It will not remove already indexed data. Thus there might be brief hard-disk activity upon restart while beagle quickly crawls its data, but searches should still return files from the already indexed data.
If you think beagle is removing its index at restart and restarting indexing from scratch, this could be due to a few potential problems:
- In the previous run, beagle-helper hung or crashed while indexing some file (possibly a Word .doc or JPEG file). The underlying index can get corrupted if the process crashes or is killed in an unclean manner.
- The low-level indexing subsystem failed to clean up lock files. This problem has been reported from time to time, but the cause is still unknown.
If beagle is removing the index for any of the above reasons, the beagle log will contain a line like this near its beginning: Purging index.
Two recent workarounds have gone into beagle to overcome the above problems (in beagle-0.2.17):
- If beagle detects a stale lock file at startup, it tries to verify whether the index is actually corrupt before removing it. Most likely the index will not be corrupted, and then beagle will not purge it.
- Note that verifying the integrity of the index is a heavy process. If the stale lock file was left behind because beagle crashed while indexing a certain file, then it is a good idea to remove that file from beagle's path. Beagle is likely to crash again if it encounters that file or similar files in the future.
- Beagle now uses low-level system calls to handle lock file operations; this eliminated the stale lock file problem for several users.
Thunderbird problems
If you are using 0.2.8 or above and Thunderbird is not being indexed, one cause may be multiple Thunderbird installs. Beagle searches for .mozilla-thunderbird and uses it as the default data store if it exists. This means that if your data store is .thunderbird, it will not be indexed. The solution is to remove the .mozilla-thunderbird data store.
A second problem is related to the first. When you run a search and then click on an email, Beagle prefers the mozilla-thunderbird application over the thunderbird application. If you have both installed and use thunderbird rather than mozilla-thunderbird, Beagle will start mozilla-thunderbird, which will try to create a new account and create the .mozilla-thunderbird directory. This of course breaks indexing. The solution is to remove the mozilla-thunderbird application.
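A quick way to spot the conflicting data stores is to check which of the two directories exist. The sketch below only reports what it finds and removes nothing; the function name thunderbird_store is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: report which Thunderbird data directories exist under
# a home directory and whether they conflict, following the text above.
thunderbird_store() {
    home=${1:-$HOME}
    if [ -d "$home/.mozilla-thunderbird" ]; then
        echo "Beagle will use $home/.mozilla-thunderbird"
        if [ -d "$home/.thunderbird" ]; then
            echo "warning: $home/.thunderbird also exists and will not be indexed"
            return 1
        fi
    elif [ -d "$home/.thunderbird" ]; then
        echo "only $home/.thunderbird exists; no conflict"
    else
        echo "no Thunderbird data directory found"
        return 2
    fi
}
```

A return status of 1 indicates the situation described above, where the stray .mozilla-thunderbird directory should be removed.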
Beagle doesn't index my Liferea feeds
Beagle expects your Liferea directory in a slightly different location than where it actually is; this will be rectified in a future release. In the meantime, you can open a terminal and type:
$ ln -s ~/.liferea_1.2 ~/.liferea
Beagle doesn't index my Pidgin logs
The location of your Pidgin logs changed with the rebranding from Gaim, and Beagle doesn't know where to find them. This will be rectified in a future release, but in the meantime, you can open a terminal and type:
$ ln -s ~/.purple ~/.gaim
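Both this symlink and the Liferea one fail, or end up inside the existing directory, if something is already present at the link path. A slightly more careful version might look like the sketch below; the function name link_if_missing is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: create a compatibility symlink only when nothing
# already exists at the link path, so real data is never shadowed.
link_if_missing() {
    target=$1
    link=$2
    if [ -e "$link" ] || [ -L "$link" ]; then
        echo "$link already exists; leaving it alone"
        return 1
    fi
    if [ ! -d "$target" ]; then
        echo "$target not found; nothing to link"
        return 2
    fi
    ln -s "$target" "$link"
}
# Examples (not run here):
#   link_if_missing "$HOME/.purple" "$HOME/.gaim"
#   link_if_missing "$HOME/.liferea_1.2" "$HOME/.liferea"
```

The extra `-L` test catches a dangling symlink, which `-e` alone reports as missing.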
Beagle does not index HTML files
Beagle does not index office doc and powerpoint files
Beagle uses the freedesktop.org shared-mime-info specification and its xdgmime implementation to detect the mimetypes of files. There are some bugs in the spec and the implementation which can cause incorrect mimetype detection from time to time. Please file a bug in the Freedesktop bugzilla; if workarounds are possible in beagle, please let us know.
Currently there are two known open bugs:
