Wednesday, May 4, 2011

Unified Search Engine

A quick and dirty method to combine MS Search Service, Indexing Service and SQL Server into a unified search engine for your ASP.NET website



Recently I was developing a site for a company and, as usual, they needed me to write a search engine. Their content lived in .aspx pages, which was not a problem, but they also had forums whose posts were stored in a database table (in a single column, to be precise). They wanted results from both sources displayed through a common search engine. Since I had little time to write a search engine of my own, I put MS Search Service, Indexing Service and SQL Server together to do the job for me. There is a lot of scope for enhancement, but here is how you can implement a very basic yet powerful search engine of your own.



STEP I: Create a Web Catalog in Indexing Service



By default, the indexing service has two catalogs, one for the file system (System) and one for the default web site (Web).



If the Web catalog is not present, you can easily create one:



1. Open Control Panel --> Administrative Tools --> Computer Management



2. Scroll down to Computer Management--> Services and Applications --> Indexing Service in the MMC



3. Right click Indexing Service and choose New --> Catalog



4. Enter Web in the Name field and choose C:\Inetpub as the location



5. Right click the newly created catalog, choose Properties. Click on the Tracking tab of the properties window. Select "Default Web Site" as the WWW server.



6. Restart Indexing Service



7. Go to Computer Management --> Services and Applications --> Services and set the Indexing Service startup type to Automatic if it is Manual or Disabled.





STEP II: Optimize ASPX and ASCX files for full-text search





By default, the Indexing Service treats *.aspx and *.ascx files as plain text, which is not optimal for searching. To optimize searching for these two file types, copy the following into a new .reg file and run it on your computer. The customary warning: editing the registry incorrectly may prevent your computer from running properly. Edit the registry at your own risk. I cannot be held responsible for any damage you do to your computer by incorrectly following the steps below.



   REGEDIT4

   [HKEY_CLASSES_ROOT\.aspx\PersistentHandler]
   @="{eec97550-47a9-11cf-b952-00aa0051fe20}"

   [HKEY_CLASSES_ROOT\.ascx\PersistentHandler]
   @="{eec97550-47a9-11cf-b952-00aa0051fe20}"





You must have "Index files with unknown extensions" enabled. To enable it, right click on Indexing Service, choose Properties and click on the Generation tab of the window. Check the "Index files with unknown extensions" checkbox. Then restart the computer, stop the Indexing Service, and delete all the contents of the catalog.wci folder (not the folder itself) corresponding to your catalog (in this case C:\Inetpub\catalog.wci). Finally, start the Indexing Service and allow it to rebuild the catalog.





STEP III: Using Full-Text Searches directly in ASP.NET Applications



This is not actually a step but a side step where you can pause for a moment and test whether your newly created catalog returns any results. If you don’t have a database to worry about, this might be your last step, unless you want to link the Indexing Service with SQL Server.



The Indexing Service exposes itself via the OLE DB provider MSIDXS. You can take full advantage of the service in your ASP.NET application via ADO.NET. If you have a TextBox (TextBox1), a Button (Button1) and a DataGrid (DataGrid1) on your web form and the Web catalog in place, this might as well be the content of your button click handler:





using System;
using System.Data;
using System.Data.OleDb;

…….

…….

private void Button1_Click(object sender, EventArgs e)
{
      string strCatalog = "Web";

      // Double single quotes (standard SQL escaping) so the user's text
      // cannot break out of the FREETEXT('...') literal.
      string searchText = TextBox1.Text.Replace("'", "''");
      string strQuery = "Select Filename, Rank, VPath from SCOPE() where FREETEXT('" + searchText + "')";
      string connString = "Provider=MSIDXS;Data Source=" + strCatalog + ";";

      DataSet ds = new DataSet();

      // The using blocks dispose the connection and adapter even if the query fails.
      using (OleDbConnection connection = new OleDbConnection(connString))
      using (OleDbDataAdapter da = new OleDbDataAdapter(strQuery, connection))
      {
            da.Fill(ds); // Fill opens and closes the connection itself
      }

      DataView source = new DataView(ds.Tables[0]);
      DataGrid1.DataSource = source;
      DataGrid1.DataBind();
}





STEP IV: Link Indexing Service with SQL Server





The next step is to link the Indexing Service with your SQL Server. Open Query Analyzer or your favourite SQL script editor. Run the following script.





EXEC sp_addlinkedserver FTIndexWeb, 'Index Server', 'MSIDXS', 'Web'
GO





where FTIndexWeb is the chosen linked server name, and Web is the catalog name you created in STEP I.





STEP V: Querying Indexing Service via SQL Server





Let's modify the previous query and run it in SQL Server. Run the following query in Query Analyzer.





SELECT Q.FileName, Q.Rank, Q.VPath
FROM OpenQuery(
              FTIndexWeb,
              'Select Filename, Rank, VPath
               from SCOPE()
               where FREETEXT(''Calcutta'')
               ORDER BY Rank DESC'
              ) AS Q





Replace FTIndexWeb with whatever linked server name you chose in step IV and Calcutta with your search keyword(s).
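
If you would rather build this statement in code than edit it by hand, the substitution can be sketched in C# as below. This is only an illustration: OpenQueryBuilder and Build are hypothetical names that are not part of the original article, and the sketch assumes that doubling single quotes is the right escaping at both levels (once for the inner Indexing Service query, once more because that query is itself a T-SQL string literal).

```csharp
using System;

static class OpenQueryBuilder
{
    // Hypothetical helper, not from the original article.
    // linkedServer is inserted as-is, so it must come from trusted
    // configuration and never from user input.
    public static string Build(string linkedServer, string keywords)
    {
        // Inner Indexing Service query: the keyword is one literal deep.
        string inner = "Select Filename, Rank, VPath from SCOPE() where FREETEXT('"
                     + keywords.Replace("'", "''") + "') ORDER BY Rank DESC";

        // Outer T-SQL: the whole inner query becomes a string literal,
        // so every quote in it is doubled once more.
        return "SELECT Q.FileName, Q.Rank, Q.VPath FROM OpenQuery(" + linkedServer + ", '"
             + inner.Replace("'", "''") + "') AS Q";
    }
}

class OpenQueryDemo
{
    static void Main()
    {
        // Prints the statement for the linked server and keyword used in this step.
        Console.WriteLine(OpenQueryBuilder.Build("FTIndexWeb", "Calcutta"));
    }
}
```

For the keyword Calcutta this produces the same statement as above on a single line. A keyword containing an apostrophe ends up with four quotes in the final text, which is the same nesting a stored procedure has to deal with when it builds the statement dynamically.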





STEP VI: Enabling a table/column in SQL Server for full-text searches





Open Enterprise Manager. Browse to Console Root --> Microsoft SQL Servers --> Databases --> Tables. Check two things before you proceed.


1. The table where you want full-text searching enabled must have a unique index. If a primary key or a unique constraint is not present, create an "ID" column and apply a unique constraint to it.

2. Microsoft Search Service (mssearch.exe) must be enabled and running on your computer. If not, browse to Computer Management --> Services and Applications --> Services in the Computer Management MMC, configure Microsoft Search Service as Automatic and start the service.


In the Enterprise Manager MMC, right click on your table and choose Full-Text Index Table --> Define Full-Text Indexing on a table. If the option is grayed out, check #2 above.


Click Next in the wizard that pops up. Choose the unique index and click Next. Choose the columns where you want indexing enabled and click Next. Give the catalog a name, specify a physical location to store it and click Next. If you want control over how and when the catalog is filled (full or incremental), click on New Catalog Schedule. After configuring it, come back to the Full-Text Indexing Wizard and click Next, then click Finish. The wizard takes a minute or two to set up the catalog.


STEP VII: Querying Full-Text Catalog in SQL Server


Let’s test the newly created catalog in SQL Server. Run the following query.


SELECT FT_TBL.subject, KEY_TBL.RANK, FT_TBL.topicid
FROM forums_topics AS FT_TBL,
     CONTAINSTABLE ( forums_topics
                   , message
                   , '"Calcutta"' ) AS KEY_TBL
WHERE FT_TBL.topicid = KEY_TBL.[KEY]
ORDER BY KEY_TBL.RANK DESC


Forums_Topics is the table name and Message is the column on which the full-text catalog is built. Replace Calcutta with your search keyword(s).


STEP VIII: Combining the results


The steps to combine the results would be to


1. Create a temporary table

2. Insert the results of the first query

3. Insert the results of the second query

4. Query the temp table

5. Drop the temp table


We need a stored procedure for this and here it is:


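CREATE PROCEDURE sp_Accounts_SearchSite
@FreeText varchar (255)
AS

SET NOCOUNT ON

-- 1. Temporary table that holds the rows from both sources
CREATE TABLE #tempresults(
ID int IDENTITY,
FileNames varchar (255),
Rank int,
VPath varchar(255))

DECLARE @sql nvarchar(2000)
DECLARE @term nvarchar(1100)

-- 2. Indexing Service results via the linked server.
-- The keyword sits two string literals deep here, so each single quote is quadrupled.
SET @term = REPLACE(@FreeText, '''', '''''''''')

SET @sql = N'INSERT INTO #tempresults(FileNames, Rank, VPath) ' + CHAR(13) +
N'SELECT Q.FileName As FileNames, Q.Rank As Rank, Q.VPath As VPath ' + CHAR(13) +
N'FROM OpenQuery(FTIndexWeb, ''Select Filename, Rank, VPath from SCOPE() where FREETEXT(''''' + @term + ''''')'' ) AS Q'

EXECUTE sp_executesql @sql

-- 3. Forum results from the SQL Server full-text catalog.
-- The keyword sits one string literal deep here, so each single quote is doubled.
SET @term = REPLACE(@FreeText, '''', '''''')

SET @sql = N'INSERT INTO #tempresults(FileNames, Rank, VPath) ' + CHAR(13) +
N'SELECT FT_TBL.subject As FileNames, KEY_TBL.RANK As Rank, FT_TBL.topicid As VPath ' + CHAR(13) +
N'FROM forums_topics AS FT_TBL, ' + CHAR(13) +
N'CONTAINSTABLE ( forums_topics ' + CHAR(13) +
N', message' + CHAR(13) +
N', ''"' + @term + '"'' ) ' + CHAR(13) +
N'AS KEY_TBL' + CHAR(13) +
N'WHERE FT_TBL.topicid = KEY_TBL.[KEY] '

EXECUTE sp_executesql @sql

-- 4. Return the combined rows, best match first
SELECT FileNames, Rank, VPath from #tempresults ORDER BY Rank DESC

-- 5. Drop the temp table
DROP TABLE #tempresults

SET NOCOUNT OFF

GO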
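
To make the merge logic easier to see, here is a rough C# sketch of the same idea: pool the rows from both sources and sort them by rank. It is not a replacement for the stored procedure. SearchHit and ResultMerger are hypothetical names, and the sample rows are invented purely for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical row type mirroring the columns of #tempresults.
class SearchHit
{
    public string FileNames;
    public int Rank;
    public string VPath;

    public SearchHit(string fileNames, int rank, string vPath)
    {
        FileNames = fileNames;
        Rank = rank;
        VPath = vPath;
    }
}

static class ResultMerger
{
    // Pools both result sets and sorts by Rank, highest first, like
    // "SELECT ... FROM #tempresults ORDER BY Rank DESC".
    public static List<SearchHit> Merge(IEnumerable<SearchHit> pageHits,
                                        IEnumerable<SearchHit> forumHits)
    {
        List<SearchHit> all = new List<SearchHit>(pageHits);
        all.AddRange(forumHits);
        all.Sort(delegate(SearchHit a, SearchHit b) { return b.Rank.CompareTo(a.Rank); });
        return all;
    }
}

class MergeDemo
{
    static void Main()
    {
        // Sample rows invented for illustration only.
        List<SearchHit> pages = new List<SearchHit>();
        pages.Add(new SearchHit("about.aspx", 420, "/about.aspx"));

        List<SearchHit> forums = new List<SearchHit>();
        forums.Add(new SearchHit("Trip report", 610, "17"));

        foreach (SearchHit hit in ResultMerger.Merge(pages, forums))
            Console.WriteLine(hit.Rank + " " + hit.FileNames);
    }
}
```

Note that the two engines compute their ranks independently, so sorting them in one list, whether in T-SQL or in C#, is a pragmatic shortcut rather than a true relevance comparison.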


STEP IX: Modify your .NET Application


The rest is a piece of cake. Your Button click handler should now look like this:


using System;
using System.Data;
using System.Data.SqlClient; // Bye Bye OleDb

…….

…….

private void Button1_Click(object sender, EventArgs e)
{
      string connString = @"server=****;database=****;uid=****;pwd=****;";
      string storedProcName = "sp_Accounts_SearchSite";

      DataSet dataSet = new DataSet();

      // The using blocks dispose the connection, command and adapter even on errors.
      using (SqlConnection connection = new SqlConnection(connString))
      using (SqlCommand command = new SqlCommand(storedProcName, connection))
      {
            command.CommandType = CommandType.StoredProcedure;

            // Typed parameter matching the procedure's varchar(255) argument
            command.Parameters.Add("@FreeText", SqlDbType.VarChar, 255).Value = TextBox1.Text;

            using (SqlDataAdapter sqlDA = new SqlDataAdapter(command))
            {
                  sqlDA.Fill(dataSet, "mySearchResults"); // Fill opens and closes the connection itself
            }
      }

      DataView source = new DataView(dataSet.Tables["mySearchResults"]);
      DataGrid1.DataSource = source;
      DataGrid1.DataBind();
}


The grid will show results from your file system as well as from your database tables. With everything indexed, the search is lightning fast for hundreds of results, if not millions.


Many of you might think that there remains a lot to be told. But didn’t I say it was quick and dirty? No pun intended. To learn more about how to compose your own queries for full-text searches, visit the MSDN website at http://msdn.microsoft.com. With a little logic of your own, you can have a nice search engine that queries different sources differently based on your own requirements. For example, you can redefine the scope (Deep Traversal, Shallow Traversal ring a bell?) and can do regular expression searches. You are the one to set your own limits.
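
As a rough illustration of redefining the scope, the sketch below builds an Indexing Service query restricted to one virtual directory. ScopeQuery and Build are hypothetical names, and the SCOPE('DEEP TRAVERSAL OF "..."') and SCOPE('SHALLOW TRAVERSAL OF "..."') forms are written from my recollection of the Indexing Service SQL syntax, so check them against the MSDN documentation before relying on them.

```csharp
using System;

static class ScopeQuery
{
    // Hypothetical helper, not from the original article.
    // deep = true searches the path and its subdirectories;
    // deep = false searches only the path itself.
    public static string Build(string virtualPath, bool deep, string keywords)
    {
        string traversal = deep ? "DEEP TRAVERSAL OF" : "SHALLOW TRAVERSAL OF";

        // Remove double quotes and double single quotes so the path
        // stays inside its quoted position in the SCOPE argument.
        string safePath = virtualPath.Replace("\"", "").Replace("'", "''");
        string safeWords = keywords.Replace("'", "''");

        return "Select Filename, Rank, VPath from SCOPE('" + traversal + " \"" + safePath + "\"')"
             + " where FREETEXT('" + safeWords + "') ORDER BY Rank DESC";
    }
}

class ScopeDemo
{
    static void Main()
    {
        // "/forums" is an invented path used only for illustration.
        Console.WriteLine(ScopeQuery.Build("/forums", false, "Calcutta"));
    }
}
```

The resulting string can be used in place of the SCOPE() query in STEP III.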


A nice ASP.NET search engine article by Ram P Dash. I think it is nice to share such cool things on my blog. If anybody is offended by this post, write to me.



I implemented his work in my project too.



Author:

Ram Dash is an ASP.NET, C# developer and can be reached at “ram underscore dash at fastmail dot fm”. If you wish to reprint this article, a note with the link to the author would suffice.



© 2011 GIS and Remote Sensing Tools, Tips and more