## Alfresco Search Services Implementation
Alfresco Search Services using Alfresco and Apache Solr
### Get the code
Git:
```bash
$ git clone https://github.com/Alfresco/SearchServices.git
```
### Use Maven
Build the project:
```bash
$ mvn clean install -DskipTests=true
```
All the resources required to run Alfresco Search Services will be available under the `packaging/target` folder.
### Start Alfresco Search Services from source
To run Alfresco Search Services locally, first build the ZIP distribution file.
```bash
$ mvn clean install -DskipTests=true
```
After the project is successfully built, extract the ZIP.
```bash
$ cd packaging/target
$ unzip alfresco-search-services-*.zip
$ cd alfresco-search-services
```
From Alfresco *Search Services 1.3.0.5* onwards, the distribution ZIP is released with Mutual Authentication TLS (SSL) enabled by default, so you must generate secure keys for SSL communication before starting the service. Detailed information for this step is available in the [Alfresco documentation](https://docs.alfresco.com/search-enterprise/tasks/generate-keys-ssl.html).
The `keystores` folder generated by the SSL Tool contains the keystores and truststores for SSL configuration. The following steps assume that the SSL Tool has been executed from the `/tmp` or `C:\tmp` folder.
```bash
$ tree /tmp/keystores/
keystores/
├── alfresco
│   ├── keystore
│   ├── keystore-passwords.properties
│   ├── ssl-keystore-passwords.properties
│   ├── ssl-truststore-passwords.properties
│   ├── ssl.keystore
│   └── ssl.truststore
├── client
│   └── browser.p12
├── solr
│   ├── ssl-keystore-passwords.properties
│   ├── ssl-truststore-passwords.properties
│   ├── ssl.repo.client.keystore
│   └── ssl.repo.client.truststore
└── zeppelin
    ├── ssl.repo.client.keystore
    └── ssl.repo.client.truststore
```
SOLR SSL configuration files are available in the `/tmp/keystores/solr` folder.
These files must be copied to the `rerank` configuration folder.
```
$ cp /tmp/keystores/solr/* solrhome/templates/rerank/conf
```
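
A quick way to confirm that the files are in place is sketched below. It is based only on the tree listing above; the function name `check_solr_keystores` is hypothetical and is not part of the distribution.

```bash
# Hypothetical helper (not shipped with Search Services): checks that a folder
# contains the four files listed under keystores/solr in the tree above.
check_solr_keystores() {
    local dir="$1" missing=0 f
    for f in ssl-keystore-passwords.properties \
             ssl-truststore-passwords.properties \
             ssl.repo.client.keystore \
             ssl.repo.client.truststore; do
        if [ ! -f "$dir/$f" ]; then
            # Report every missing file, then fail at the end.
            echo "missing: $dir/$f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example: check_solr_keystores solrhome/templates/rerank/conf
```

The function returns a non-zero status if any file is absent, so it can be chained in front of the start command.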
If you are running on a *Linux* or *Mac OS X* machine, add the following lines to the `solr.in.sh` file.
```
SOLR_SSL_KEY_STORE=/tmp/keystores/solr/ssl.repo.client.keystore
SOLR_SSL_KEY_STORE_PASSWORD=keystore
SOLR_SSL_KEY_STORE_TYPE=JCEKS
SOLR_SSL_TRUST_STORE=/tmp/keystores/solr/ssl.repo.client.truststore
SOLR_SSL_TRUST_STORE_PASSWORD=truststore
SOLR_SSL_TRUST_STORE_TYPE=JCEKS
SOLR_SSL_NEED_CLIENT_AUTH=true
SOLR_SSL_WANT_CLIENT_AUTH=false
```
If you are running on a *Windows* machine, add the following lines to the `solr.in.cmd` file.
```
set SOLR_SSL_KEY_STORE=C:\tmp\keystores\solr\ssl.repo.client.keystore
set SOLR_SSL_KEY_STORE_PASSWORD=keystore
set SOLR_SSL_KEY_STORE_TYPE=JCEKS
set SOLR_SSL_TRUST_STORE=C:\tmp\keystores\solr\ssl.repo.client.truststore
set SOLR_SSL_TRUST_STORE_PASSWORD=truststore
set SOLR_SSL_TRUST_STORE_TYPE=JCEKS
set SOLR_SSL_NEED_CLIENT_AUTH=true
set SOLR_SSL_WANT_CLIENT_AUTH=false
```
Once these settings are in place, start the SOLR service from the command line:
```
$ ./solr/bin/solr start -a "-Dcreate.alfresco.defaults=alfresco,archive \
-Dsolr.ssl.checkPeerName=false \
-Dsolr.allow.unsafe.resourceloading=true" -f
```
SOLR will create the Alfresco cores (`alfresco` and `archive`) on startup, and the configuration from the `rerank` template will be copied to each core.
If you have also started an ACS instance at [https://localhost:8443/alfresco](https://localhost:8443/alfresco) with the keystores provided by the SSL Tool (`keystores/alfresco` folder), the index will be populated.
SOLR Web Console will be available at:
[https://localhost:8983/solr](https://localhost:8983/solr)
**Note** The client certificate `browser.p12`, generated by the SSL Tool, must be installed in your browser in order to access this Web Console. See the "Installing Browser certificate" section below.
By default Alfresco Search Services runs on port 8983, but this can be changed by supplying e.g. `-p 8083` to the `solr start` command.
To set up remote debugging (on port 5005), start Alfresco Search Services with the following command and then connect using your IDE:
```bash
$ ./solr/bin/solr start -a "-Dcreate.alfresco.defaults=alfresco,archive \
-Dsolr.ssl.checkPeerName=false \
-Dsolr.allow.unsafe.resourceloading=true \
-Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=5005" -f
```
DBID based sharding can be set up from the command line. For example a core containing shards 0, 1, 6 and 7 from an
index with twelve shards can be set up by starting an instance of Alfresco Search Services with a command like:
```bash
$ ./solr/bin/solr start -a "-Dcreate.alfresco.defaults=alfresco,archive -Dnum.shards=12 -Dshard.ids=0,1,6,7"
```
Further instances should be set up to contain the other shards, and it is possible to adjust the distribution and
replication of shards to achieve the desired index performance and redundancy.
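
As a sketch of how the remaining shards might be distributed, the snippet below prints one start command per instance. The grouping of shard ids into three instances is an assumption made purely for illustration, and `shard_start_cmd` is a hypothetical helper; only the `-Dnum.shards` and `-Dshard.ids` properties come from the example above. In practice each instance would also need its own installation folder or host, and its own port if several share a machine.

```bash
# Hypothetical helper: prints (does not run) the start command for one instance.
# It rejects shard ids outside the range 0 .. num_shards-1.
shard_start_cmd() {
    local num_shards="$1" shard_ids="$2" id
    for id in ${shard_ids//,/ }; do
        if [ "$id" -lt 0 ] || [ "$id" -ge "$num_shards" ]; then
            echo "shard id $id is out of range for $num_shards shards" >&2
            return 1
        fi
    done
    echo "./solr/bin/solr start -a \"-Dcreate.alfresco.defaults=alfresco,archive -Dnum.shards=$num_shards -Dshard.ids=$shard_ids\""
}

# Assumed layout: twelve shards spread over three instances.
for ids in 0,1,6,7 2,3,8,9 4,5,10,11; do
    shard_start_cmd 12 "$ids"
done
```

Printing the commands rather than running them keeps the sketch safe to try; each printed line is meant to be run on the instance that should host those shards.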
To stop Alfresco Search Services:
```bash
$ ./solr/bin/solr stop
```
**Using Plain HTTP**
If you want to use plain HTTP for SOLR instead of Mutual Auth TLS (SSL), follow these steps. First build the project:
```bash
$ mvn clean install -DskipTests=true
```
After the project is successfully built, extract the ZIP.
```bash
$ cd packaging/target
$ unzip alfresco-search-services-*.zip
$ cd alfresco-search-services
```
Change default Alfresco Communication protocol to `none`, and set `alfresco.allowUnauthenticatedSolrEndpoint` to `true`:
```bash
$ sed -i 's/alfresco.secureComms=https/alfresco.secureComms=none\nalfresco.allowUnauthenticatedSolrEndpoint=true/' solrhome/templates/rerank/conf/solrcore.properties
```
*Note* The line above uses GNU sed syntax. On Mac OS X you can use `gsed`, or simply edit the file with a text editor.
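
Where GNU sed is not available, the same change can be made with `awk`, which behaves the same on Linux and Mac OS X. This is a minimal sketch, and `set_plain_http` is a hypothetical helper name. Unlike the sed command, it only rewrites a line that is exactly `alfresco.secureComms=https`.

```bash
# Hypothetical helper: rewrites solrcore.properties without relying on GNU sed.
set_plain_http() {
    local props="$1" tmp status
    tmp="$(mktemp)" || return 1
    # Replace the matching line with the two plain HTTP settings; copy all other lines as-is.
    awk '
        $0 == "alfresco.secureComms=https" {
            print "alfresco.secureComms=none"
            print "alfresco.allowUnauthenticatedSolrEndpoint=true"
            next
        }
        { print }
    ' "$props" > "$tmp" && cat "$tmp" > "$props"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Example: set_plain_http solrhome/templates/rerank/conf/solrcore.properties
```

Writing the result back with `cat` rather than `mv` keeps the original file's permissions.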
Start the SOLR service from the command line:
```
$ ./solr/bin/solr start -a "-Dcreate.alfresco.defaults=alfresco,archive" -f
```
SOLR will create the Alfresco cores (`alfresco` and `archive`) on startup, and the configuration from the `rerank` template will be copied to each core. If you have also started an ACS instance at [http://localhost:8080/alfresco](http://localhost:8080/alfresco), the index will be populated.
SOLR Web Console will be available at:
[http://localhost:8983/solr](http://localhost:8983/solr)
### Installing Browser certificate
To access the SOLR Web Console, available by default at [https://localhost:8983/solr](https://localhost:8983/solr), the browser certificate must be installed on your machine.
On *Windows* systems, the `client\browser.p12` file must be imported as a new private certificate into the `Windows Certificates` application.
On *Mac OS X* systems, the `client/browser.p12` file must be imported into the `Keychain Access` application.
You must also set the right options in these applications to *trust* this certificate.
Once the certificate is installed, your browser should show the following message when accessing the Solr Web Console:
```
Your connection is not private
Attackers might be trying to steal your information from localhost (for example, passwords, messages or credit cards). Learn more
NET::ERR_CERT_AUTHORITY_INVALID
```
As the certificate has been generated for `localhost`, this warning is expected. Just click on `Advanced >> Proceed` and use your browser certificate to access Solr Web Console.
### Use Alfresco Search Services Docker Image
Once the project has been built, the Docker image can also be built:
```bash
$ cd packaging/target/docker-resources/
$ docker build -t searchservices:develop .
```
*Search Services* Docker image is configured with **Mutual Authentication TLS (SSL)** by default.
**Building Docker Image from Windows**
When building Search Services or Insight Engine Docker images on **Windows**, some extra steps need to be added to the default build process.
*Clone the repository preserving Linux line endings*
```
$ git clone git@git.alfresco.com:search_discovery/insightengine.git --config core.autocrlf=input
```
Alternatively, you can apply the setting globally before cloning the repository:
```
$ git config --global core.autocrlf input
$ git config --list
core.autocrlf=input
$ git clone https://git.alfresco.com/search_discovery/insightengine.git
```
*Build the Maven project*
```
$ mvn clean package -DskipTests
```
*Modify default Dockerfile*
```
$ cd insight-engine/packaging/target/docker-resources/
```
Replace the last lines (from `EXPOSE` to `CMD`) of the `Dockerfile` in this folder with the following:
```
EXPOSE 8983
RUN chown -R solr:solr $DIST_DIR
RUN chmod +x $DIST_DIR/solr/bin/*
RUN set -x \
&& yum install -y dos2unix \
&& yum clean all
RUN dos2unix $DIST_DIR/solr.in.sh
USER ${USERNAME}
CMD $DIST_DIR/solr/bin/search_config_setup.sh "$DIST_DIR/solr/bin/solr start -f"
```
Then build the Docker image.
```
$ docker build -t insightengine:develop .
```
**Configuration**
The "-e" argument can be used to pass an environment variable:
```bash
$ docker run -e SOLR_JAVA_MEM="-Xms4g -Xmx4g" -p 8983:8983 searchservices:develop
```
To pass several environment variables (e.g. SOLR\_ALFRESCO\_HOST, SOLR\_ALFRESCO\_PORT, SOLR\_SOLR\_HOST, SOLR\_SOLR\_PORT, SOLR\_CREATE\_ALFRESCO\_DEFAULTS, SOLR\_HEAP, etc.), repeat the "-e" argument as many times as required:
```bash
$ docker run -e SOLR_ALFRESCO_HOST=localhost -e SOLR_ALFRESCO_PORT=8080 -p 8983:8983 searchservices:develop
```
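
When the list of variables grows, Docker's `--env-file` option is an alternative to repeating `-e`. The sketch below writes a file named `solr.env` (a hypothetical name) with a few of the variables mentioned above; the values are examples only. Note that `docker run` reads each line literally, so values are written without surrounding quotes.

```bash
# Write the variables to an env file, one NAME=VALUE per line.
cat > solr.env <<'EOF'
SOLR_ALFRESCO_HOST=localhost
SOLR_ALFRESCO_PORT=8080
SOLR_JAVA_MEM=-Xms4g -Xmx4g
EOF

# Then start the container with:
#   docker run --env-file solr.env -p 8983:8983 searchservices:develop
```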
The following environment variables are supported:
| Name | Format | Description |
|------|--------|-------------|
| SOLR_OPTS | "-Dparam=value ..." | Options to pass when starting the Java process. |
| SOLR_HEAP | Memory amount (e.g. 2g) | The Java heap assigned to Solr. |
| SOLR_JAVA_MEM | "-Xms... -Xmx..." | The exact memory settings for Solr. Note that SOLR_HEAP takes precedence over this. |
| MAX_SOLR_RAM_PERCENTAGE | Integer | The percentage of available memory to assign to Solr. Note that SOLR_HEAP and SOLR_JAVA_MEM take precedence over this. |
| SEARCH_LOG_LEVEL | ERROR, WARN, INFO, DEBUG or TRACE | The root logger level. |
| ENABLE_SPELLCHECK | true or false | Whether spellchecking is enabled or not. |
| DISABLE_CASCADE_TRACKING | true or false | Whether cascade tracking is disabled or not. Disabling cascade tracking will improve performance, but result in some feature loss (e.g. path queries). |
| SOLR_SSL_... | --- | These variables are also used to configure SSL. See below. |
| ALFRESCO_SECURE_COMMS | secret or https | This property instructs Solr if it should enable Shared Secret authentication or mTLS authentication with HTTPS. See below. |
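
The precedence between the three memory variables in the table can be read as the following decision order. This is only an illustrative sketch, not the image's actual startup script: the function name `effective_mem_opts` is hypothetical, and the mapping of `MAX_SOLR_RAM_PERCENTAGE` to the JVM flag `-XX:MaxRAMPercentage` is an assumption.

```bash
# Hypothetical helper: prints the memory options that would win under the
# precedence described in the table (SOLR_HEAP, then SOLR_JAVA_MEM, then
# MAX_SOLR_RAM_PERCENTAGE). Prints nothing if none of them is set.
effective_mem_opts() {
    if [ -n "${SOLR_HEAP:-}" ]; then
        echo "-Xms${SOLR_HEAP} -Xmx${SOLR_HEAP}"
    elif [ -n "${SOLR_JAVA_MEM:-}" ]; then
        echo "${SOLR_JAVA_MEM}"
    elif [ -n "${MAX_SOLR_RAM_PERCENTAGE:-}" ]; then
        # Assumed mapping to a JVM flag; the real script may differ.
        echo "-XX:MaxRAMPercentage=${MAX_SOLR_RAM_PERCENTAGE}"
    fi
}
```

For example, with both `SOLR_HEAP=2g` and `SOLR_JAVA_MEM="-Xms4g -Xmx4g"` set, the function prints the options derived from `SOLR_HEAP`.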
**Using Mutual Auth TLS (SSL)**
This Docker image exposes the folder `/opt/alfresco-search-services/keystores` as a VOLUME, which can be used to mount the `keystores` folder from the host.
By default the Docker image uses SSL, but the environment variable `ALFRESCO_SECURE_COMMS=https` can also be passed to the Docker container to declare the SSL mode explicitly.
Additionally, the SOLR Jetty server must be configured to start in SSL mode using the `SOLR_SSL_*` environment variables, and Search Services must be configured using Java system properties starting with `alfresco.encryption.ssl.*`.
The following command starts Search Services with SSL using the keystores located at `/tmp/keystores/solr`. Note that the paths inside the container are under `/opt/alfresco-search-services/keystores`, as this is the Docker container folder exposed to hold the keystores.
```bash
$ docker run -p 8983:8983 \
-v /tmp/keystores/solr:/opt/alfresco-search-services/keystores \
-e SOLR_CREATE_ALFRESCO_DEFAULTS=alfresco,archive \
-e SOLR_SSL_KEY_STORE=/opt/alfresco-search-services/keystores/ssl.repo.client.keystore \
-e SOLR_SSL_KEY_STORE_PASSWORD=keystore \
-e SOLR_SSL_KEY_STORE_TYPE=JCEKS \
-e SOLR_SSL_TRUST_STORE=/opt/alfresco-search-services/keystores/ssl.repo.client.truststore \
-e SOLR_SSL_TRUST_STORE_PASSWORD=truststore \
-e SOLR_SSL_TRUST_STORE_TYPE=JCEKS \
-e SOLR_SSL_NEED_CLIENT_AUTH=true \
-e SOLR_OPTS="-Dsolr.ssl.checkPeerName=false \
-Dsolr.allow.unsafe.resourceloading=true" \
searchservices:develop
```
SOLR Web Console will be available at:
[https://localhost:8983/solr](https://localhost:8983/solr)
*Note* You must install the `browser.p12` certificate in your browser in order to access this URL.
**Using Shared Secret Authentication**
An alternative is to use a shared secret in order to secure repo <-> solr communication. You just need to set `ALFRESCO_SECURE_COMMS=secret` **AND** `JAVA_TOOL_OPTIONS="-Dalfresco.secureComms.secret=my_super_secret_secret"`.
By default, the SOLR Web Console will be available at:
[http://localhost:8983/solr](http://localhost:8983/solr)
You can also start the Jetty server in SSL mode as explained above, in which case the SOLR Web Console will be available at:
[https://localhost:8983/solr](https://localhost:8983/solr)
*Note* You must install the `browser.p12` certificate in your browser in order to access this URL.
In both cases, when trying to access the SOLR Web Console you will have to provide the `X-Alfresco-Search-Secret` header in the request, specifying as its value the same value that was used for the `-Dalfresco.secureComms.secret` property.
You can do so natively on Safari through the `Dev Tools > Local Overrides` feature, or with a browser extension on Google Chrome/Firefox/Opera/Edge: [ModHeader](https://modheader.com/).
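
From the command line, the same header can be supplied with `curl`. The following is a minimal sketch, assuming the example secret `my_super_secret_secret` and the plain HTTP endpoint; `solr_get`, `SOLR_URL` and `SOLR_SECRET` are hypothetical names, and `/admin/cores?action=STATUS` is Solr's standard core status endpoint.

```bash
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr}"
SOLR_SECRET="${SOLR_SECRET:-my_super_secret_secret}"

# Hypothetical helper: GET a path under the Solr base URL,
# sending the shared secret in the X-Alfresco-Search-Secret header.
solr_get() {
    curl --silent --show-error \
         --header "X-Alfresco-Search-Secret: ${SOLR_SECRET}" \
         "${SOLR_URL}$1"
}

# Example: solr_get "/admin/cores?action=STATUS&wt=json"
```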
**Using Shared Secret Authentication**
By default the Docker image uses SSL, so to use SOLR with Shared Secret authentication you must set both `ALFRESCO_SECURE_COMMS=secret` and `JAVA_TOOL_OPTIONS="-Dalfresco.secureComms.secret=my_super_secret_secret"`.
To run the docker image:
```bash
$ docker run -p 8983:8983 -e ALFRESCO_SECURE_COMMS=secret -e SOLR_CREATE_ALFRESCO_DEFAULTS=alfresco,archive -e JAVA_TOOL_OPTIONS="-Dalfresco.secureComms.secret=my_super_secret_secret" searchservices:develop
```
SOLR Web Console will be available at:
[http://localhost:8983/solr](http://localhost:8983/solr)
You will have to provide the `X-Alfresco-Search-Secret` header in the request, specifying as its value the same value that was used for the `-Dalfresco.secureComms.secret` property.
Sample Docker Compose service settings
```yaml
solr6:
  image: searchservices:develop
  mem_limit: 2500m
  environment:
    # Solr needs to know how to register itself with Alfresco
    SOLR_ALFRESCO_HOST: "alfresco"
    SOLR_ALFRESCO_PORT: "8080"
    # Alfresco needs to know how to call solr
    SOLR_SOLR_HOST: "solr6"
    SOLR_SOLR_PORT: "8983"
    # HTTP settings
    ALFRESCO_SECURE_COMMS: "secret"
    # Create the default alfresco and archive cores
    SOLR_CREATE_ALFRESCO_DEFAULTS: "alfresco,archive"
    SOLR_JAVA_MEM: "-Xms2g -Xmx2g"
    JAVA_TOOL_OPTIONS: "
      -Dalfresco.secureComms.secret=my_super_secret_secret
      "
  ports:
    - 8083:8983 # Browser port
```
**Public Docker repository**
This Docker Image is available at Alfresco Docker Hub:
[https://hub.docker.com/r/alfresco/alfresco-search-services](https://hub.docker.com/r/alfresco/alfresco-search-services)
To use the public image instead of the local one (`searchservices:develop`), use an `alfresco/alfresco-search-services:1.3.x.x` tag.
## Docker Master-Slave setup
### Enable Search Slave Replica config
To enable a slave node, set the environment variable `REPLICATION_TYPE=slave`. By default the master configuration is enabled and the slave configuration is disabled.
Whenever the Search Services or Insight Engine image starts, it executes the script [search_config_setup.sh](/packaging/src/docker), which configures the slave setup based on the value specified.
To run the docker image:
```bash
$ docker run -p 8984:8983 -e REPLICATION_TYPE=slave -e ALFRESCO_SECURE_COMMS=secret -e SOLR_CREATE_ALFRESCO_DEFAULTS=alfresco,archive -e JAVA_TOOL_OPTIONS="-Dalfresco.secureComms.secret=my_super_secret_secret" searchservices:develop
```
Solr slave endpoint: [http://localhost:8984/solr](http://localhost:8984/solr)
To generate your own Docker Compose file, please follow [generator-alfresco-docker-compose](../e2e-test/generator-alfresco-docker-compose/README.md)
### Use Alfresco Search Services Docker Image with Docker Compose
Sample configuration in a Docker Compose file using **Shared Secret Authentication** to communicate with Alfresco Repository.
```
solr6:
  image: searchservices:develop
  mem_limit: 2500m
  environment:
    # Solr needs to know how to register itself with Alfresco
    SOLR_ALFRESCO_HOST: "alfresco"
    SOLR_ALFRESCO_PORT: "8080"
    # Alfresco needs to know how to call solr
    SOLR_SOLR_HOST: "solr6"
    SOLR_SOLR_PORT: "8983"
    # HTTP settings
    ALFRESCO_SECURE_COMMS: "secret"
    # Create the default alfresco and archive cores
    SOLR_CREATE_ALFRESCO_DEFAULTS: "alfresco,archive"
    SOLR_JAVA_MEM: "-Xms2g -Xmx2g"
    JAVA_TOOL_OPTIONS: "
      -Dalfresco.secureComms.secret=my_super_secret_secret
      "
  ports:
    - 8083:8983 # Browser port
```
SOLR Web Console will be available at:
[http://localhost:8083/solr](http://localhost:8083/solr)
You will have to provide the `X-Alfresco-Search-Secret` header in the request, specifying as its value the same value that was used for the `-Dalfresco.secureComms.secret` property.
Sample configuration in a Docker Compose file using **Mutual Auth TLS (SSL)** protocol to communicate with Alfresco Repository.
```
solr6:
  image: searchservices:develop
  mem_limit: 2500m
  environment:
    # Solr needs to know how to register itself with Alfresco
    SOLR_ALFRESCO_HOST: "alfresco"
    SOLR_ALFRESCO_PORT: "8443"
    # Alfresco needs to know how to call solr
    SOLR_SOLR_HOST: "solr6"
    SOLR_SOLR_PORT: "8983"
    # SSL settings
    ALFRESCO_SECURE_COMMS: "https"
    SOLR_SSL_TRUST_STORE: "/opt/alfresco-search-services/keystores/ssl.repo.client.truststore"
    SOLR_SSL_TRUST_STORE_PASSWORD: "truststore"
    SOLR_SSL_TRUST_STORE_TYPE: "JCEKS"
    SOLR_SSL_KEY_STORE: "/opt/alfresco-search-services/keystores/ssl.repo.client.keystore"
    SOLR_SSL_KEY_STORE_PASSWORD: "keystore"
    SOLR_SSL_KEY_STORE_TYPE: "JCEKS"
    SOLR_SSL_NEED_CLIENT_AUTH: "true"
    # Create the default alfresco and archive cores
    SOLR_CREATE_ALFRESCO_DEFAULTS: "alfresco,archive"
    SOLR_JAVA_MEM: "-Xms2g -Xmx2g"
    SOLR_OPTS: "
      -Dsolr.ssl.checkPeerName=false
      -Dsolr.allow.unsafe.resourceloading=true
      "
  ports:
    - 8083:8983 # Browser port
  volumes:
    - ./keystores/solr:/opt/alfresco-search-services/keystores
```
SOLR Web Console will be available at:
[https://localhost:8083/solr](https://localhost:8083/solr)
**Samples for development use only**
Docker Compose files can be used to start up Search Services with Alfresco and Share. There are two Docker Compose files available. Depending on the version you want to start, change to either the `5.x` or the `6.x` folder, e.g.
```bash
cd packaging/target/docker-resources/6.x
docker compose up
```
## Docker Master-Slave setup
A separate Docker Compose file is provided for the slave. To start the master-slave setup:
`docker compose -f docker-compose.yml -f ./master-slave/docker-compose.slave.yml up`
The slave runs behind the nginx load balancer on port 8084, so multiple slaves can be spun up behind the same port. To deploy multiple slaves:
`docker compose -f docker-compose.yml -f ./master-slave/docker-compose.slave.yml up --scale search_slave=2`
This will start up Alfresco, Postgres, Share and SearchServices. You can access the applications using the following URLs:
* Alfresco: http://localhost:8081/alfresco
* Share: http://localhost:8082/share
* Solr: http://localhost:8083/solr
* Solr-slave: http://localhost:8084/solr
Note: Once everything is deployed and all services are up and running, go to <http://localhost:8081/alfresco/s/enterprise/admin/admin-searchservice?m=admin-console.success>, scroll down and click the Save button.
This step works around a bug that is currently being worked on: [SEARCH-2085](https://issues.alfresco.com/jira/browse/SEARCH-2085)
If you start version 5.x instead you can also access the API Explorer:
* API Explorer: http://localhost:8084/api-explorer
### License
Copyright (C) 2005 - 2017 Alfresco Software Limited
This file is part of the Alfresco software.
If the software was purchased under a paid Alfresco license, the terms of
the paid license agreement will prevail. Otherwise, the software is
provided under the following open source license terms:
Alfresco is free software: you can redistribute it and/or modify
it under the terms of the GNU Lesser General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
Alfresco is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public License
along with Alfresco. If not, see <http://www.gnu.org/licenses/>.