This guide assumes you already have FileRun installed via Docker.
Your server needs at least 2 GB of RAM for Elasticsearch.
Edit your existing docker-compose.yml to include the two additional services (tika and elasticsearch) and link them by service name:
services:
  [...]
  web:
    image: [FileRun]
    links:
      - db
      - tika
      - elasticsearch
  tika:
    image: logicalspark/docker-tikaserver
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch-wolfi:9.2.1
    container_name: elasticsearch
    environment:
      - "discovery.type=single-node"
      - "xpack.security.http.ssl.enabled=false" #disable SSL
      - "xpack.security.enabled=false" #disable authentication
      - "xpack.security.enrollment.enabled=false" #disable authentication
      - cluster.name=docker-cluster
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms2g -Xmx2g"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    mem_limit: 6g
    volumes:
      - /filerun/esearch:/usr/share/elasticsearch/data
For more information on running Elasticsearch via Docker, please see the official documentation.
Note the volumes entry above, which stores the Elasticsearch index data: the host folder /filerun/esearch is mounted at
/usr/share/elasticsearch/data. Change the ownership of the host folder to 1000:1000 (chown) so that Elasticsearch can write to it.
You can use the following command with the docker-compose file above:
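A minimal sketch of that step, assuming the host folder is /filerun/esearch as in the volumes entry of the compose file. The helper name prepare_es_data_dir is hypothetical and not part of FileRun or Docker, and changing ownership usually requires root:

```shell
# Create the host directory backing the Elasticsearch data volume and
# hand it to the given owner. 1000:1000 is the owner named in this guide.
prepare_es_data_dir() {
    dir=${1:-/filerun/esearch}
    owner=${2:-1000:1000}
    mkdir -p "$dir" && chown -R "$owner" "$dir"
}

# Usage, on the Docker host (typically as root or via sudo):
#   prepare_es_data_dir /filerun/esearch
```

Creating the folder before the containers start means Docker does not have to create it itself, in which case it would be owned by root.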
To configure the file indexing feature please follow this guide.
The Elasticsearch Host URL that needs to be configured is
http://elasticsearch:9200.
Set the Apache Tika server hostname to tika and
the port number to 9998.
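Before entering these values, it can help to confirm that both services are reachable under those names from inside the FileRun container. The following is a minimal sketch, assuming the FileRun container is named filerun and that PHP's allow_url_fopen setting is enabled (the PHP default). The helper name fetch_from_filerun is hypothetical:

```shell
# Fetch a URL from inside the FileRun container using the PHP binary
# the container already ships with.
# Assumption: the container is named "filerun" unless a second
# argument says otherwise.
fetch_from_filerun() {
    url=$1
    container=${2:-filerun}
    # The URL is passed through an environment variable so it never
    # has to be quoted inside the PHP snippet.
    docker exec -e "CHECK_URL=$url" "$container" \
        php -r '$r = @file_get_contents(getenv("CHECK_URL")); if ($r === false) { fwrite(STDERR, "unreachable\n"); exit(1); } echo $r, PHP_EOL;'
}

# Usage, on the Docker host once the containers are running:
#   fetch_from_filerun http://elasticsearch:9200    # cluster info as JSON
#   fetch_from_filerun http://tika:9998/version     # Tika version string
```

The check runs from the FileRun container rather than from the host because the compose file above does not publish ports 9200 or 9998, so the service names only resolve on the Docker network.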
From the server's command line, open a console inside the FileRun Docker container:
docker exec -it filerun bash
Here, filerun is the container name. You can use the container ID
instead if no name was given. To find a container's ID, use the
docker ps command.
Create the indexing script file, which will run periodically:
vim /var/www/html/cron/process_search_index_queue.sh
and paste (press i and then CTRL+V) the following inside:
php /var/www/html/cron/process_search_index_queue.php
Press Esc then :wq and Enter to save the changes and close the
editor.
Adjust the script file permissions by making it executable:
chmod 755 /var/www/html/cron/process_search_index_queue.sh
Open the crontab:
vim /etc/crontab
and paste (press i and then CTRL+V) the following at its end
(leaving the empty line at the bottom of the file):
* * * * * root /var/www/html/cron/process_search_index_queue.sh
Press Esc then :wq and Enter to save the changes and close the
editor.
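For readers who prefer not to edit files in vim, the script and crontab steps above can be sketched as a single shell function. This is a minimal sketch and not part of FileRun: the function name setup_index_cron and its two optional arguments are hypothetical, with defaults taken from the paths used in this guide.

```shell
# Recreate the wrapper script and the cron entry without an editor.
# Defaults mirror the paths used in this guide; both can be overridden.
setup_index_cron() {
    cron_dir=${1:-/var/www/html/cron}
    crontab_file=${2:-/etc/crontab}
    script="$cron_dir/process_search_index_queue.sh"
    cron_line="* * * * * root $script"

    # Write the one-line wrapper script and make it executable.
    printf 'php %s/process_search_index_queue.php\n' "$cron_dir" > "$script" || return 1
    chmod 755 "$script" || return 1

    # Append the cron entry only if this exact line is not present yet,
    # so running the function twice does not schedule the job twice.
    if ! grep -qxF -- "$cron_line" "$crontab_file" 2>/dev/null; then
        # Make sure the existing content ends with a newline first.
        if [ -s "$crontab_file" ] && [ -n "$(tail -c 1 "$crontab_file")" ]; then
            printf '\n' >> "$crontab_file" || return 1
        fi
        printf '%s\n' "$cron_line" >> "$crontab_file" || return 1
    fi
}

# Usage, from the console inside the FileRun container:
#   setup_index_cron
```

Because the function skips the crontab entry when it is already there, it can be run again whenever these steps have to be repeated.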
FileRun should now automatically index in Elasticsearch the contents of all file types supported by Apache Tika. The cron job above runs once every minute, so it may take a minute or two after uploading before a file can be found by its content.
Important note: If your FileRun Docker container is ever stopped, you will need to repeat the steps in the "Setup the indexing process" section.