September 03, 2010

"For successful technology, reality must take precedence over public relations, for nature cannot be fooled."    --  Richard Feynman
Moving Sql Server Code
Author:RBarryYoungCreated:8/11/2008 12:08 AM
My occasional adventures in performance, administration, secuirty and other database topics. -- RBarryYoung

Just a quick note ...

Upon further testing and analysis, I feel confident that the best solution for string concatenation in SQL Server 2005+ is the FOR XML method, like so:

SELECT CAST(
   (SELECT TABLE_NAME+', '
     From INFORMATION_SCHEMA.TABLES
     Order By TABLE_NAME
     FOR XML PATH('')
) as VARCHAR(MAX))

It has a lot of advantages, which I will go into later.  It also has one big problem(Entitization) which I will also demonstrate how to fix...

Last night I premiered my presentation "There Must Be 15 Ways To Lose Your Cursors" to the South Jersey division of the Philadelphia SQL Server Users Group.

As promised, I have uploaded the presentation and all of the SQL Scripts as a ZIP it to this site.  You can download it here.

Just a quick note on string concatenation:

 

  Concatenation(appending) a table of many small strings together into one big string is easy in SQL Server, but slow.  Fixng it is hard for the reasons that I list here.  I beleive that I have "solved" this, reducing the O(n2) operation to effectively O(n*Log(n)).

A preliminary version of this technique is demonstrated here.  The approach seen in these forum posts can be greatly improved by "stacking" the 2k roll-ups concatenating all of the lower level strings together at once, instead of "cacading" them up one level at a time, as demonstrated.

I will post a more complete version later...

A fast and accurate way to calculate the distance in miles between two points, based on the latitudes and longitudes.

Read More »

There are basically three (actually four) strategies for protecting against SQL Injection while allowing some use of Dynamic SQL:

  1.  BlackListing
  2.  Firewalling
  3.  Parametizing
  4.  Parsing*

"Blacklisting" is the practice of scanning the client-supplied text for certain "blacklisted" strings, such as semicolons or words like "EXEC" and "DROP", as a way of detecting injection attacks.  Blacklisting is a bad practice and fails for two simple reasons, first, it is not possible to detect every possible injection attack without actually fully parsing the final Dynamic SQL execution string.  No matter how many banned strings you included in your scanner, you still cannot be sure that you have accounted for every posibility.  For instance, consider the alternate quotation characters in the upper half of the ASCII character set (and many more in Unicode).  Do you know which ones of them can be used to close a quotation?  Neither do I, and neither does anybody that I know.

Secondly, blacklisting invariably breaks the application that it is trying to protect by preventing perfectly valid queries from being executed.  For instance, a query for an employee named "Sally Waldrop" is unlikley to make it through the blacklist.  A report query on a client called "Executive Cleaning" is also likely to be rejected.  And the more that you try to fix the first problem by adding more and more strings to the blacklist, the worse you will be making the second problem.  So blacklisting is a bad choice both because you cannot be sure that it will work and because you can be sure that it will break your application.

"Firewalling" is the practice of using security settings to restrict the Dynamic SQL's execution context permissions down to only what that Dynamic SQL statement is supposed to do.  Thus, for a procedure that is supposed to execute a dynamic SQL string to produce a report on the CLIENTS table, a special database User called "CLIENTS_Reports" would be created that had SELECT permission granted to the CLIENTS table, Like this:

    Grant Select on OBJECT::dbo.CLIENTS to CLIENTS_Reporter

 but no rights to anything else, nor any other rights on the CLIENTS table.  Then the procedure would execute the dynamic SQL like this:

    Execute(@query) As USER='CLIENTS_Reporter'


The weaknesses of firewalling are first, that it is a point solution only, as a separate database User needs to be created for each separate stored procedure that uses client text in Dynamic SQL.  Using a single User for multiple stored procedures can comprise the integrity of this strategy because you may have to combine different permission sets to be sure that this one user can do everything that it might be legitimatelly called on to do.  Seperate User accounts for each one is fine if you only have a few of these, but quickly becomes unmanageble with larger numbers.  More discussion on this appraoch by me can be found here.

Secondly, it can be difficult with just security settings and permissions to get the fine control that you may need.  For instance, an application that supports bank tellers may use dynamic SQL to access account information.  While this could be managed with firewalling, you would probably also want to insure that the query could only return information for a single account at a time as any query from a teller that lists information for ALL accounts is likely to be something illegitmate.  In this case, firewalling cannot do much for you.

"Parametizing" is the practice of only allowing client-supplied text to be passed through parameters and variables and never actualized as executable code.  Thus, instead of code like this:

    Select @query = 'SELECT Column FROM table WHERE key=''' + @UserInput + ''''
Execute(@query)


The code would be rewritten like this:

    Select @query = 'SELECT Column FROM table WHERE key=@p1'
Set @P1Def = N'@P1 NVarchar(255)';
EXEC sp_executesql @query, @P1Def, @UserQuery;


In the first case, a client string of " '; DELETE FROM CLIENTS; --" could spell disaster, however, in the parametized instance, it would just return no records.

The shortcomings of parametization are first, it requires more work and thought than other approaches.  Secondly, it initially seems to be applicable to only the simplest cases of dynamic SQL (as above).  However, it turns out that there are some tricky ways to use parametization that allow it to be applied safely and sucessfully to all but the most complex cases of dynamic SQL.  These methods (variable execution, reconstitution, and directed expression) are beyond the scope of this article, however, it should be noted that they have some disadvantages of their own: primarily that they typically require a stricter and more complex interface between the client and the server procedure and thus cannot usually be implemented with making changes in the client as well.

The fourth stategy, "Parsing" is really only included for completeness as it is not actually implmented in any case that I know of.  If it were used, parsing would invovle completely parsing the SQL syntax of dynamic text to be executed and then insuring that it conforms to restrictions defined for that particular use of dynamic SQL (for instance, that it consist of only a single SQL statement).  Parsing is not used primarily because it would be just too hard to implement on SQL Server.  SQL is not an easy language to parse completely and correctly and SQL is not a language that is well suited the task of syntactically and semantically parsing a string of text.  Even setting aside the difficulty of it, the overhead of doing it repeatedly for a heavily used procedure could be severe.  Consequently, without some substantial help in the future from SQL Server itself (or some other Microsoft facility), this approach will probably always remain impractical.

Here's a good article on SQL Injection by one of my favorite authors, Michael Coles: http://www.sqlservercentral.com/articles/Security/updatedsqlinjection/2065/.

Note that most of the SQL Injection articles will be filed under the "Dynamic SQL" heading.

I have implemented large shared schema, multi-tenant applications & databases before and it is suprisingly straight-forward as long as you follow certain rules:

1) All application users must be easily distinguishable by tenant to SQL. 

 Practically speaking this means that either each tenant has a separate Login or each user has a sepearate Login to SQL Server and a Tenant_Users table to map the user back to their tenant association.

2) Every table that the application can write to must have a tenant_id column that identifies the tenant-owner of the data.

3) Every such table must have a corresponding "security" view that insure that any user can only access rows in that table from the same tenant., like so:

CREATE VIEW Secure_TABLE as
Select *
  From Physical_TABLE T
   Inner Join Tenant_Users U
    ON U.tenant_id = T.tenant_id
     And U.UserName = sUser_sName()


4) No application code, including client code, application stored procedures, views, etc., is permitted to directly access the physical tables.  All such data access is only permitted through the Security views.  (use database roles, schemas and permissions to implement this, do not rely on code).

If you notice, this approach insures that, except for the user Logon's, the application has no knowledge of the multi-tenancy, and no application changes should normally be necessary.  This means that you can move a tenants data into or out of this approach without any changes to most applications, other than changing the Server.Database address. (note that to be truly transparent you would have to remove the tenant_id from the output of the security views).

The tenant_id should be made the first field of the primary key and probably also the Clustered Index (if different).

Recently asked question on SQLServerCentral.com:

Well here's my question. In some of the inline sql they are using variables to create the table name.
The code is in vb.net
name = "joesshop"
Example: "Select * from data_" + name
So the end result would be "Select * from data_joesshop"

Is there anyway to put this in a stored proc?

Dynamic SQL is the standard solution for this kind of problem, but Injection is an overriding concern.

Here is how you can recode this simple example into a stored procedure that protects itself from SQL Injection:

Create proc spSelectFromDataTable(

        @SuffixName Nvarchar(255))

/*        Procedure to demonstrate how to safely incorporate client

 text parameters into a SQL command.

 

        Note that the key to this technique is to NEVER actually

EXECute any string that has client-supplied text.  Rather all

of the client text must be purified by replacing it with the known

valid names of the objects being referenced.  Then Dynamic

SQL commands can be safely constructed using ONLY our own text and

these idealized replacement values.

*/

 AS

BEGIN

 

 Declare @ActualTableName SYSNAME

 --Find the actual table name that matches the clients

 -- text parameter:

 Select @ActualTableName = TABLE_NAME

  From INFORMATION_SCHEMA.TABLES

  Where TABLE_SCHEMA = N'dbo'

   And TABLE_NAME = N'data_' + @SuffixName

   --Note that it is safe to use the clients parameter

   -- here because it is only being used as a data value,

   -- it is not being EXECuted.

 

 IF @ActualTableName IS NOT NULL

  BEGIN

        --Here is where we construct and execute the Dynamic SQL

        EXEC(N'Select * From '+@ActualTableName)

        --Note that this execution string contains no part of

        -- the clients text parameter, it has been completely

        -- replaced with the known, valid, actual table name.

  END

 ELSE

  BEGIN

        Declare @msg as NVarchar(MAX)

        Select @msg = @SuffixName+N' is an invalid data table name.'

        RAISERROR(@msg, 11, 1)

        Return

  END

 

END

 

Comments welcome.

I've been thinking about writing an article that would be titled "15 Ways to Leave Your Cusors".  Ideas:

  1. Just put it in a Set, Brett  (change the declare cursor select statement  into a select)
  2. Put it all together, Heather (later one; using women's names opens this  up a lot too)
  3. Put it on the Stack, Jack (recursive CTE's)
  4. Its your last resort Mort (or While/Kyle?)
  5. Select into a variable, Aribel (simple pseudocursor)
  6. Use an ordered update, Nate (complex psuedocursor)
  7. Window on OVER, Grover (Row_Number(), etc.)
  8. Add a WITH CUBE, Noob (Rollup and Cube)
  9. Use a temproary table, Mabel
  10. Just begin again, Man (last one; start over with just specs)
  11. With OUTPUT (for loops with multiple outputs; no cute rhyme yet)
  12. Make a big string, Bing (dynamic SQL)
  13. bring it inline, Caroline
  14. ...?

Other ideas: Joins, Exists, IN(), derived tables, Merge(?)

Well at Jack Corbett's suggestion I have tunrned on RSS feeds for the blog.  I am still not sure if it will work right, so feel free to let me knowof any problems...

I've been invovled in some recent discussion on Multi-Tenenacy databases and in particular, the Shared Schema implementation:

http://www.sqlservercentral.com/Forums/FindPost610335.aspx
http://www.sqlservercentral.com/Forums/Topic529170-361-1.aspx
http://www.sqlservercentral.com/Forums/Topic569567-149-1.aspx

Multi-Tenancy databases are databases with multiple customer organizations ("tenants") all running the same application and all using the same database server.  The three stadard approaches to this are:

  1. Seperate Databases
  2. Same Database, Seperate Schemas
  3. Shared Schemas

In the last one, all of the tenants use the same tables, but a tenant_id is added to each table to distinguish them.  The is an MSDN article that explains this (http://msdn.microsoft.com/en-us/library/aa479086.aspx)

As it happens I designed and implemented a large Shared Schema Multi-Tenant DB several years ago, and  learned many insights from that.

Way back in March I answered question from two posters about the notorious Two-Hop Rule here: www.sqlservercentral.com/Forums/Topic471172-359-1.aspx#bm473248.  The thread has a good summary of the problem and cause by me, and an excellent post by Brian Kelley on how to use Kerberos to fix it.

A good article and monitoring script from Johan Bijnens (ALZDBA) at SQLServerCentral.com:

The Article: Adventures with Service Broker

And the monitoring script with just selects from every Service Broker DMV and catalog view: http://www.sqlservercentral.com/scripts/Maintenance+and+Management/31867/

Jon Reade describes a little known microsoft upgrade utility (SCPTXFR.EXE) from SQL Server 2000 that can be used to script the objects in a SQL Server 2000 database.  Apparently it works for SQL Server 2005 also:  www.sqlservercentral.com/articles/Administering/howtoscheduleasqlserverdatabasecreationscript/1834/

Armand Posted this cool function to tell if a specific year is a leap year or not at www.mssqltips.com/tip.asp :

CREATE Function dbo.fn_IsLeapYear (@year INT) returns BIT

 as

BEGIN

    RETURN(Select

            Case datepart(mm, dateadd(dd, 1, cast((cast(@year as varchar(4)) + '0228') as datetime)))

             When 2 Then 1

             Else 0 End)

END

 

OK, here is the site, blog & post that has the instructions on how to script SQL Server Objects with powershell, first he SQLServerCentral post: http://www.sqlservercentral.com/Forums/FindPost547249.aspx

Also the video Blog post:midnightdba.itbookworm.com/PowershellScriptDBObjects/PowershellScriptDBObjects.html

This is from the Blog: midnightdba.itbookworm.com/

Here is a great Simple-Talk article that explains how to script SQL Server Objects with powershell in SQL Server 2008: www.simple-talk.com/sql/sql-tools/using-powershell-to-generate-table-creation-scripts/

 

Thing is, I know that there is a shorter post a SQLServerCentral that explains how to do this also, but I cannot seem to find it now...

Recent conversation at SQLServerCentral.com:

Garadin Said:

 

In ASP for example, you can do:

sql="SELECT * FROM TABLE WHERE Field ='" & MyVariable & "'"
rs.open sql conn2

Instead of

sql ="Exec dbo.MySP @Variable =" & MyVariable
rs.open sql conn2

 

That said, I see how the hard coded one has injection vulnerabilities, but how is the SP vulnerable if that variable is used within a structured select statement?

 

My Response:

Well, let's say that MyVariable was set to " '; DROP TABLE USERS --", then the sql string in your client becomes:

 

SELECT * FROM TABLE WHERE Field =' '; DROP TABLE USERS --


, which might be a problem .  Effecitvely, the injection happened in the client code, instead of in the SQL Server, but the result is the same.

The question that has to be asked in cases like these is "Where is MyVariable coming from?"  If it is from a user who shouldn't be allowed to drop the Users table then you (both client & server) should take steps to prevent this.  If on the other hand, it is coming from you or another DBA or Admin who could do this anyway, then you do not normally have to be concerned.  The threat with injection is that an unprivliged user can play "stupid string tricks" to hijack the SQL Server rights of the server process, either directly or indirectly (through the client).

That's why protecting against injection is really a two tier defense;

1) The client should never allow user-supplied text to become part of a SQL command that it executes, and...

2) The server should never allow client-supplied text to become part of a SQL command that it executes

#2 is most easily accomplished by forcing the client to use stored procedures and then in those stored procedures, either never use dynamic SQL or by being extremely careful in how you construct the dynamic sql, again, following the rule that client-supplied text (parameters) must never become part of the actual SQL command.  (additionally, you should use different users/schemas to firewall the access rights of each piece of the execution chain).

If you do this, then #1 can be accomplished on the client by never trying to "compose" SQL commands (effectively, dynamic SQL) but rather always calling stored procedures and passing the user-supplied text as parameters.

 

(paraphrased from a recent SqlServerCenterl reply I posted)

The procedure below:

CREATE PROCEDURE FindUserDetails(

      @SelectColumns varchar(2000),

      @WhereClause varchar(2000),

      @OrderByClause varchar(50))

 AS

Declare @SQL varchar(2000)

 

Set @SQL = 'SELECT ' + @SelectColumns + '

 FROM CustomersDetails

  JOIN CompanyDetails ON CompanyID = UserCompanyID

' + @WhereClause + @OrderByClause

 

EXEC(@SQL)

GO

is pretty much exactly what the term "customizable report" means in any application feature list (not to be confused with "Ad-Hoc Reporting" which is a much more general thing).. 

To be clear: the SQL code above is a typical example of a bad use of dynamic SQL.  Not because Dynamic SQL is not needed here (it probably is to get decent performance) but because it is extremely unsafe, and there is virtually nothing that can be done within the stored procedure to remedy that.

Client-supplied text should never be allowed to become part of an executable SQL command.  That is how SQL Injection happens and that is what "Safe" dynamic SQL must never allow.

I saw this over a month ago and I can't beleive that I almost forgot to mention it...

SQL Server 2005 has a system trace running by default.  Recently Jack Corbett showed me a way (here) to not only find where these default trace files are, but also how to query & retrieve them into the running system!  For example, here is Jacks code that demonstrates how to find Logon events that used a Windows Group to Logon:

SELECT DISTINCT
  
I.NTUserName,
  
I.loginname,
  
I.SessionLoginName,
  
I.databasename,
  
S.*
FROM
  
sys.traces T CROSS Apply
  
::fn_trace_gettable(T.path, 5) I LEFT JOIN
  
sys.syslogins S ON
      
CONVERT(VARBINARY(MAX), I.loginsid) = S.sid    
WHERE
  
S.sid IS NULL

Obviously, this gets reset after every reboot.  Is this seriously cool or what?!?

 

Copyright 2008 by R. Barry Young
 RBarryYoung.net  |  Terms Of Use  |  Privacy Statement