Tutorial: Cleaning the Sharepoint codes with BIDS in SQL

May 4, 2009 at 12:54 PM
Following a request from a user in another thread, I decided to share how to clean all the Sharepoint codes with Business Intelligence Development Studio the easy way!
  1. First, please make sure to follow the MSDN Tutorial.  For this tutorial and my company needs, instead of a Sharepoint List Destination, I am using a OLE DB Destination.  I needed that Sharepoint info extrracted to a SQL database.
  2. Once done, in the Control Flow tab, add one Execute SQL Task right after the Data Flow Task and connect them with the arrow.
  3. Double-click on the Execute SQL Task  to configure it.
  4. Under SQL Statement --> Connection, add the proper SQL connection where your SQL table is sitting.
  5. Under SQL Statement --> SQLStatement, click on the ... button to enter the SQL query.
  6. To clean the #; code, use this query:
    UPDATE Name_of_the_table
    SET Name_of_the_field_1= SUBSTRING(Name_of_the_field_1,CHARINDEX('#',Name_of_the_field_1)+1,LEN(Name_of_the_field_1)-CHARINDEX('#',Name_of_the_field_1))
    SET Name_of_the_field_2= SUBSTRING(Name_of_the_field_2,CHARINDEX('#',Name_of_the_field_2)+1,LEN(Name_of_the_field_2)-CHARINDEX('#',Name_of_the_field_2))
    etc...
  7. To clean the square symbol in Text data type fields, use this query:
    UPDATE Name_of_the_tableSET Name_of_the_field_1 = REPLACE(SUBSTRING(Name_of_the_field_1, 1, DATALENGTH(Name_of_the_field_1)), char(0), '') WHERE charindex(char(0) , Name_of_the_field_1) > 0
    UPDATE Name_of_the_tableSET Name_of_the_field_2 = REPLACE(SUBSTRING(Name_of_the_field_2, 1, DATALENGTH(Name_of_the_field_2)), char(0), '') WHERE charindex(char(0) , Name_of_the_field_2) > 0
    etc...
    char(0) represent the square symbol.
  8. To clean the <div> and </div> in Text data type fields, use this query:
    UPDATE Name_of_the_tableSET Name_of_the_field_1 = REPLACE(SUBSTRING(Name_of_the_field_1, 1, DATALENGTH(Name_of_the_field_1)), '<div>', '') WHERE charindex('<div>' , Name_of_the_field_1) > 0
    UPDATE Name_of_the_tableSET Name_of_the_field_1 = REPLACE(SUBSTRING(Name_of_the_field_1, 1, DATALENGTH(Name_of_the_field_1)), '</div>', '') WHERE charindex('<div>' , Name_of_the_field_1) > 0
    etc...
There you go, I hope this info can help!

May 7, 2009 at 8:45 PM
That is very cool!! you rock. Thanks a lot for help.
One little help on this: I got the chars multiple times in data e.g ;#12;#14;# . This query removed only first occurance. Is there a way to do removals for multiple chars in query Or i have to look into cursor options?


Thanks Again!