Under review

Full Index Searching Help

NPb 3 weeks ago • updated 3 weeks ago 5

Hello,


I have a working installation of FileRun and I have followed the instructions to set up the full index searching.  

I have an ElasticSearch server installed using Bitnami on a Windows server and I have the Docker version of Apache Tika running in the same Ubuntu server VM as FileRun.

The configuration page of FileRun shows green responses when I click the test buttons:

Cluster name: elasticsearch
Nodes: mH9kZfj (6.5.4)
Index count: 178

and

Apache Tika 1.20

I have also excluded the suggested file extension.

Everything seems to be running fine, but I have only ever had 0 queued operations, and when I run the process_search_index_queue.php script, it says there is nothing to do.

When I log on as a standard user, the content search option is available but it doesn't find any files.


Please tell me where I am going wrong and how I can get the full file searching working.

Thanks, K.


Files are being queued for indexing when they are uploaded, created, edited, copied, etc.

If you have existing files from before you enabled full-text searching, you should use the "reindex_files.php" command line script (http://docs.filerun.com/command_line_tools) to index them.

Great, thanks for the really quick response.   

I've kicked off that script now and I'll see if that resolves the issue.

Sorry, I have a few followup questions:


I am using FileRun as a front end to various shares on a back end server.  The files on these shares are uploaded via other applications and not through the FileRun interface.  

With this in mind:

  1. Will I have to schedule this reindex_files.php script to run regularly so that any newly added files become searchable?
  2. Will this script reindex all the files in my entire directory structure (several TBs) every time it is run, or will it skip over the ones it already knows?  

Thanks again for your help.


K.

  1. FileRun won't know about the new files unless something is done to them through FileRun. There is also a command line script for indexing individual files or individual subfolders, but whether that works as a solution depends on your particular workflow.
  2. It will not skip the previously processed files.

I will look into the possibility of adding an option to the command for skipping previously indexed files.
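Until such an option exists, one possible workaround is to find recently changed files outside FileRun and hand only those to the per-file indexing script. The following is a minimal sketch, not part of FileRun: the function names and the state file are hypothetical, and it assumes that the applications writing to the shares update each file's modification time. Files copied in with an older, preserved timestamp would be missed.

```python
import os
import time
from pathlib import Path
from typing import Iterator, List


def files_modified_since(root: str, since_epoch: float) -> Iterator[Path]:
    """Yield files under `root` whose modification time is newer than `since_epoch`."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                if path.stat().st_mtime > since_epoch:
                    yield path
            except OSError:
                # The file vanished or is unreadable; skip it.
                continue


def load_last_run(state_file: str) -> float:
    """Return the timestamp stored by the previous run, or 0.0 if there is none."""
    try:
        return float(Path(state_file).read_text().strip())
    except (OSError, ValueError):
        return 0.0


def save_last_run(state_file: str, timestamp: float) -> None:
    """Persist the start time of this run for the next invocation."""
    Path(state_file).write_text(str(timestamp))


def changed_files_since_last_run(root: str, state_file: str) -> List[Path]:
    """Collect files changed since the previous run, then record this run's start time."""
    started = time.time()  # taken before the scan so files changed mid-scan are caught next time
    changed = list(files_modified_since(root, load_last_run(state_file)))
    save_last_run(state_file, started)
    return changed
```

Each returned path could then be passed to the command line script for indexing individual files described in the FileRun documentation; the exact invocation is not shown here because it depends on the installation.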

Hi, thanks again.


I will look at frequently indexing for targeted folders, and a less frequent reindex for the entire structure to catch any other changes.  
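For reference, the two-tier plan can be sketched as a small table of jobs with different intervals. This is only an illustration: the folder paths and intervals are made-up examples, and the `IndexJob` and `due_jobs` names are hypothetical rather than anything provided by FileRun.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IndexJob:
    target: str              # folder to index (example paths, not real ones)
    interval_seconds: int    # how often this job should run
    last_run: float = 0.0    # epoch time of the previous run; 0.0 means never

    def is_due(self, now: float) -> bool:
        """A job is due once at least `interval_seconds` have passed since `last_run`."""
        return now - self.last_run >= self.interval_seconds


def due_jobs(jobs: List[IndexJob], now: float) -> List[IndexJob]:
    """Return the jobs whose interval has elapsed at time `now`."""
    return [job for job in jobs if job.is_due(now)]


# Example plan: busy folders hourly, the whole structure weekly.
HOUR = 3600
PLAN = [
    IndexJob("/shares/incoming", 1 * HOUR),
    IndexJob("/shares/projects", 6 * HOUR),
    IndexJob("/shares", 7 * 24 * HOUR),
]
```

A scheduled task could call `due_jobs(PLAN, time.time())` and run the appropriate indexing script for each returned target, updating `last_run` afterwards.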


It would be awesome if you could amend the script so it only indexes new files.  I'm sure there are other people who use your software for web-based access to files that are updated by other applications, so this would be extremely useful and would make the process significantly faster.

Cheers, K.