Apache Solr integrates Apache Tika to extract metadata and content from files. Alternatively, Tika can be used as a standalone application. Both variants can be used with EXT:solr. The Tika integration is provided through EXT:tika, which can be found on TYPO3 Forge. EXT:tika can be configured to use either of the two variants; which one is used is completely transparent to EXT:solr.
Using Tika integrated as a request handler in Solr has the advantage that Java does not need to be present on the web server when Solr is installed on a separate host or when deploying a multi-web-server environment. Using Tika as a standalone application can be faster than the Solr request handler; additionally, files do not need to be sent over the network. A standalone Tika application can also use a newer version than the one shipping with Solr.
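To make the difference between the two variants concrete, here is a minimal sketch of what each call path might look like. It is not how EXT:tika is implemented (EXT:tika is a TYPO3 extension written in PHP); the function names, host, core name and jar path are hypothetical and chosen only for illustration. The first helper builds a URL for Solr's extracting request handler with extractOnly=true, so the file would be sent over the network and no Java is needed locally. The second builds a command line for a local Tika app jar, which requires Java on the web server.

```python
from urllib.parse import urlencode


def solr_extract_url(solr_base_url: str, core: str) -> str:
    """Build the URL of Solr's extracting request handler for one core.

    extractOnly=true asks Solr to return the extracted content
    instead of indexing the uploaded file.
    """
    params = urlencode(
        {"extractOnly": "true", "extractFormat": "text", "wt": "json"}
    )
    return f"{solr_base_url.rstrip('/')}/{core}/update/extract?{params}"


def tika_app_command(tika_jar: str, file_path: str, metadata: bool = False) -> list[str]:
    """Build the argument list for a local Tika app call (needs Java locally).

    --text prints the extracted plain text, --metadata prints only metadata.
    """
    mode = "--metadata" if metadata else "--text"
    return ["java", "-jar", tika_jar, mode, file_path]
```

The URL from the first helper would be the target of an HTTP POST carrying the file, while the list from the second could be passed to subprocess.run on the web server. Both helpers only build the request and perform no I/O themselves.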
When performing a search in the frontend, file results are shown differently from regular page results: they link to the file itself and additionally show file metadata such as the MIME type, as well as links to referencing pages and records.
File indexing with EXT:solr can be done in a couple of ways, described below.
Status of Implementation
If you are interested, contact us!