Loading Resumes and searching them using SQL Server – Part 2/2 (PDF+ FileTable + Full Text Search)

In this post we continue after creating our file table and posting some pdf documents into it. For the previous post please click here. Apparently there is very few content about how to achieve this so I am going to be as thorough as possible here.

First we need to download the filters that help SQL Server read and understand the varbinary content within the file table. The good news is when working with MS Office documents there Filters come preinstalled and are available by default. However when working with PDF documents there is a lot of confusion about what and how to go about it.

STEP 1:- Check if you have the IFilter for Adobe PDF Files installed already by running

Use the below query to determine if you have the filter installed.

select document_type, path from sys.fulltext_document_types
where document_type like '%pdf%'

STEP 2:-If it’s not installed then download and install the IFilter (PDFiFIlter64Installer.zip) from the below location.

There is a known bug in the Version 11 IFilter when it comes to SQL 2014 so in this case I am using the version 9. Also the links posted on most websites are not working so I have provided the direct FTP link here which has all the versions hosted. Please select the version needed as per your requirements.

ftp://ftp.adobe.com/pub/adobe/acrobat/win/9.x/

STEP 3:-

Once the above setup file is installed. You will need to set the environment variable path for the iFilter. This is the most important step.

 

If you simply create a Full text index after installing the above filter you will encounter one of the below error messages in your fulltextsearch logs located in the folder

 

C:/Program Files/Microsoft SQL Server/MSSQL12.MSSQLSERVER/MSSQL/Log /SQLFT000XXXXX.LOG

 

Error ‘0x8004fd02: The filter daemon MSFTEFD failed to load an IFilter interface for document, so it can’t be indexed.

 

No appropriate filter was found during full-text index population for table or indexed view

 

This is because SQL Server is not able to find the location for the filter within the environment variables. To correct the above issues, you need to add the path of the bin folder of the IFilter installation into the environment variables.

C:/Program Files/Adobe/Adobe PDF iFilter 9 for 64-bit platforms/bin/

The above path (slashes are actually backslash but my blog doesn’t apply them so replaced with forward slash). Steps shown in screenshot below.

022116_0942_LoadingResu1.png

STEP 4:- Restart Full text search and MS SQL Server Services

 

022116_0942_LoadingResu2.png

STEP 5:- Initialize the full text search modules to detect the new filter

EXEC sp_fulltext_service 'load_os_resources’, 1
EXEC sp_fulltext_service 'verify_signature', 0
EXEC sp_fulltext_service 'restart_all_fdhosts';

Once the above code has been run re-run the instructions in step 1 to verify the filter and the path are reflecting in the results.

STEP 6:- Create the full text index on the filetable by right clicking it and selecting define full text index as shown below:

create a full text index in SQL Server

For steps on how to create a full text index click here.

STEP 7:- Run a Full text search query to verify the functionality is working as expected.

022116_0942_LoadingResu4.png

References

 

https://www.adobe.com/devnet-docs/acrobatetk/tools/AdminGuide/Acrobat_Reader_IFilter_configuration.pdf

https://support.microsoft.com/en-us/kb/972650

The above link is not required for most cases but still worth looking into in case you get the error message listed in the blue box.