Site Hashes and API Keys
In previous posts we talked about how you can index external websites' content into a Solr index using Apache Nutch and then search it from a TYPO3 website using EXT:solr.
The issue was that our TYPO3-specific Solr schema.xml has a field called siteHash. The site hash is used to filter results, so that you can index multiple sites into one index while still only getting results for the site you are on.
For documents indexed using Nutch to be found, they also need to have a site hash. Of course, Nutch does not know about the site hash and its implementation. To overcome this, we now provide a small REST-style API to request a site hash to be used when indexing external resources.
A GET request to
http://www.typo3-solr.com/index.php?eID=tx_solr_api&apiKey=cdf676f0f0ff749d7fd0545ce4dff7bbeaa5e0741&api=siteHash&domain=www.dkd.de
would then for example provide the site hash needed to index a page from www.dkd.de.
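The request above could be issued from a script along these lines. This is only a minimal sketch: the function names, the timeout, and the UTF-8 decoding are assumptions for illustration, and since the response format is not described here, the raw response body is returned as-is rather than parsed.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_site_hash_url(endpoint: str, api_key: str, domain: str) -> str:
    """Build the GET URL for the site hash API from the parameters shown above."""
    query = urlencode({
        "eID": "tx_solr_api",   # eID value taken from the example URL
        "apiKey": api_key,      # the API key configured for your installation
        "api": "siteHash",      # the API method being requested
        "domain": domain,       # domain of the external site to be indexed
    })
    return f"{endpoint}?{query}"


def fetch_site_hash_response(endpoint: str, api_key: str, domain: str,
                             timeout: float = 10.0) -> str:
    """Send the request and return the raw response body.

    Assumption: the body is UTF-8 text. Its exact format is not
    specified in this post, so no parsing is attempted here.
    """
    url = build_site_hash_url(endpoint, api_key, domain)
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()
```

Calling build_site_hash_url("http://www.typo3-solr.com/index.php", "YOUR_API_KEY", "www.dkd.de") yields a URL of the same shape as the one above, with YOUR_API_KEY as a placeholder for your own key. Building the query with urlencode also takes care of escaping, should a domain or key ever contain special characters.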
To prevent abuse, we also protected the API: every request has to provide the correct API key. With that, you are now able to index external resources into a Solr index that is searched by TYPO3.
This and lots more will be part of the soon-to-be-released version 2.0 of Apache Solr for TYPO3.




