March 11, 2010

"For successful technology, reality must take precedence over public relations, for nature cannot be fooled."    --  Richard Feynman
Moving Sql Server Code
Author:RBarryYoungCreated:8/11/2008 12:08 AM
My occasional adventures in performance, administration, secuirty and other database topics. -- RBarryYoung

Way back in March I answered question from two posters about the notorious Two-Hop Rule here: www.sqlservercentral.com/Forums/Topic471172-359-1.aspx#bm473248.  The thread has a good summary of the problem and cause by me, and an excellent post by Brian Kelley on how to use Kerberos to fix it.

A good article and monitoring script from Johan Bijnens (ALZDBA) at SQLServerCentral.com:

The Article: Adventures with Service Broker

And the monitoring script with just selects from every Service Broker DMV and catalog view: http://www.sqlservercentral.com/scripts/Maintenance+and+Management/31867/

Jon Reade describes a little known microsoft upgrade utility (SCPTXFR.EXE) from SQL Server 2000 that can be used to script the objects in a SQL Server 2000 database.  Apparently it works for SQL Server 2005 also:  www.sqlservercentral.com/articles/Administering/howtoscheduleasqlserverdatabasecreationscript/1834/

Armand Posted this cool function to tell if a specific year is a leap year or not at www.mssqltips.com/tip.asp :

CREATE Function dbo.fn_IsLeapYear (@year INT) returns BIT

 as

BEGIN

    RETURN(Select

            Case datepart(mm, dateadd(dd, 1, cast((cast(@year as varchar(4)) + '0228') as datetime)))

             When 2 Then 1

             Else 0 End)

END

 

OK, here is the site, blog & post that has the instructions on how to script SQL Server Objects with powershell, first he SQLServerCentral post: http://www.sqlservercentral.com/Forums/FindPost547249.aspx

Also the video Blog post:midnightdba.itbookworm.com/PowershellScriptDBObjects/PowershellScriptDBObjects.html

This is from the Blog: midnightdba.itbookworm.com/

Here is a great Simple-Talk article that explains how to script SQL Server Objects with powershell in SQL Server 2008: www.simple-talk.com/sql/sql-tools/using-powershell-to-generate-table-creation-scripts/

 

Thing is, I know that there is a shorter post a SQLServerCentral that explains how to do this also, but I cannot seem to find it now...

Recent conversation at SQLServerCentral.com:

Garadin Said:

 

In ASP for example, you can do:

sql="SELECT * FROM TABLE WHERE Field ='" & MyVariable & "'"
rs.open sql conn2

Instead of

sql ="Exec dbo.MySP @Variable =" & MyVariable
rs.open sql conn2

 

That said, I see how the hard coded one has injection vulnerabilities, but how is the SP vulnerable if that variable is used within a structured select statement?

 

My Response:

Well, let's say that MyVariable was set to " '; DROP TABLE USERS --", then the sql string in your client becomes:

 

SELECT * FROM TABLE WHERE Field =' '; DROP TABLE USERS --


, which might be a problem .  Effecitvely, the injection happened in the client code, instead of in the SQL Server, but the result is the same.

The question that has to be asked in cases like these is "Where is MyVariable coming from?"  If it is from a user who shouldn't be allowed to drop the Users table then you (both client & server) should take steps to prevent this.  If on the other hand, it is coming from you or another DBA or Admin who could do this anyway, then you do not normally have to be concerned.  The threat with injection is that an unprivliged user can play "stupid string tricks" to hijack the SQL Server rights of the server process, either directly or indirectly (through the client).

That's why protecting against injection is really a two tier defense;

1) The client should never allow user-supplied text to become part of a SQL command that it executes, and...

2) The server should never allow client-supplied text to become part of a SQL command that it executes

#2 is most easily accomplished by forcing the client to use stored procedures and then in those stored procedures, either never use dynamic SQL or by being extremely careful in how you construct the dynamic sql, again, following the rule that client-supplied text (parameters) must never become part of the actual SQL command.  (additionally, you should use different users/schemas to firewall the access rights of each piece of the execution chain).

If you do this, then #1 can be accomplished on the client by never trying to "compose" SQL commands (effectively, dynamic SQL) but rather always calling stored procedures and passing the user-supplied text as parameters.

 

(paraphrased from a recent SqlServerCenterl reply I posted)

The procedure below:

CREATE PROCEDURE FindUserDetails(

      @SelectColumns varchar(2000),

      @WhereClause varchar(2000),

      @OrderByClause varchar(50))

 AS

Declare @SQL varchar(2000)

 

Set @SQL = 'SELECT ' + @SelectColumns + '

 FROM CustomersDetails

  JOIN CompanyDetails ON CompanyID = UserCompanyID

' + @WhereClause + @OrderByClause

 

EXEC(@SQL)

GO

is pretty much exactly what the term "customizable report" means in any application feature list (not to be confused with "Ad-Hoc Reporting" which is a much more general thing).. 

To be clear: the SQL code above is a typical example of a bad use of dynamic SQL.  Not because Dynamic SQL is not needed here (it probably is to get decent performance) but because it is extremely unsafe, and there is virtually nothing that can be done within the stored procedure to remedy that.

Client-supplied text should never be allowed to become part of an executable SQL command.  That is how SQL Injection happens and that is what "Safe" dynamic SQL must never allow.

I saw this over a month ago and I can't beleive that I almost forgot to mention it...

SQL Server 2005 has a system trace running by default.  Recently Jack Corbett showed me a way (here) to not only find where these default trace files are, but also how to query & retrieve them into the running system!  For example, here is Jacks code that demonstrates how to find Logon events that used a Windows Group to Logon:

SELECT DISTINCT
  
I.NTUserName,
  
I.loginname,
  
I.SessionLoginName,
  
I.databasename,
  
S.*
FROM
  
sys.traces T CROSS Apply
  
::fn_trace_gettable(T.path, 5) I LEFT JOIN
  
sys.syslogins S ON
      
CONVERT(VARBINARY(MAX), I.loginsid) = S.sid    
WHERE
  
S.sid IS NULL

Obviously, this gets reset after every reboot.  Is this seriously cool or what?!?

Something that I have been thinking about lately is how you can write stored procedures that use Dynamic SQL with user-supplied text and still be safe and secure.  The previous post on Dynamic SQL "Levels" covers much of my thinking on this, but I had some related ideas today that I wanted to get down also.

Use Case 1:  The archetypical use-case for dynamic SQL using user or client supplied text is the "Dynamic Report" scenario, wherein the use specifies a table (or view) to report on, the columns (and column order), selection criteria and sorting. 

Use Case 2:  Additional complications to Use Case #1, might include functions, operators or expressions based on columns.

Thoughts:  The columns in (1) should be manageable through Level 1 transformations.  Likewise the sorting.  The overall query structure would be set by the particular stored procedure called.

The match criteria are slightly more difficult, and are usually handled with a fixed AND clause type of syntax like:

"Operand1 Operator1 Operand2 [Operator2 Operand3]"

Where the Operands and Operators are passed by the client.  The operators can be validated with Level 1 techniques, and the operators could be validated from a specific Operator List table.  Call this the "Single Template" method.

Allowing more flexibilty is complicated, but could be handled by a "template and substiution" approach, where the client would pass a table of criteria like this:

 Create Table #MatchClauses(
    ClauseText NVarchar(255),
    Subs1,
    Subs2,
    .... (up to 5?)
)
"AND @1 @2 @3", "C:col1", "O:<", "L:'Jones'"

Here, the embedded parameters are inteneded to be replaced, in order, by the passed parameter values.  The passed parameters are encoded with their "Type" (C=column name, O=operator, L=literal value).  Then the following table colud be defined to validate it:

 Create Table MatchTemplates(
    TemplateID int,
    TemplateText NVarchar(255)
)
Create Table MatchTemplateSubs(
    TemplateID int,
    SubsNumber int,
    AllowColumns bit,
    AllowOperators bit,
    AllowLiterals bit,
    ...
)

So that a table of pre-defined allowable Match templates could be used to transform the user text into lower level dynamic SQL.

 

The real exciting promise of this approach is to make one of the substitution types be (X:= expression) which could then substitute other MatchTemplate expressions or special ExpressionTemplates.  This would allow the client to specify "nested" query expressions. 

The difficulty of this is that the client woudl have to take some kind of input form the user and then transform it into a strict post-op or pre-op form and then construct the complicated strings and calling sequence from that.  Fortunately, client code is good at custom parsing (unlike SQL).

A poster (Garadin) at SQLServerCentral.com recently asked me if I knew of any good articles on how to protect a dynamic SQL  Here is my reply:

I wish I did, but I have not seen a really good one myself.For the purpose of discussion I like to classify different SQL code into levels according to the degree to which they incorporate user or client-supplied text into SQL statements:


  Level 0: No user text is used to construct dynamic SQL commands.
  Level 1: User text is only used to specify the values of columns, variables and expressions (i.e.: "something = client-text").
  Level 2: User text is used to specify Column Names.
  Level 3: User text is used to specify Table & View Names.
  Level 4: User text is used as expressions of a specific and limited type to be directly included in the SQL Command.
  Level 5: User text used as SQL commands or is directly incorporated as SQL commands.

Level 0 may have a lot of dynamic SQL in it but is perfectly safe, because it never includes user text in a SQL command.

Level 1 is not safe when done directly, but can be made safe by transforming it into a parametized call to sp_executesql that is then Level 0 Dynamic SQL.  Level 1 is as high as I have seen any article go, and I have seen many articles incorrectly assert (or imply) that this technique is applicable to and safe for all Dynamic SQL (it isn't).

The trick to making Levels 2 and 3 safe is called "Keying" wherein, instead of using the user text directly in creating the SQL string, it is used as a "Key" into a table of acceptable/allowable strings.  For instance, if the user text is supposed to specify a column name from a specific table then you do something like this:

Select @SafeColumnName = COLUMN_NAME
From INFORMATION_SCHEMA.COLUMNS
Where TABLE_NAME = 'mySpecificTable'
  And COLUMN_NAME = @UsersColumnText


  From this point you can transform the Dynamic SQL to Level 1 or 0, because the dynamic SQL text no longer includes any direct user-supplied text.  I have seen some (very few) responses in SQL support forums that discuss Level 2 techniques.  I have never seen or heard anything higher on the Internet (though of course, the Internet is a very big place).

Level 4 can be validated only by parsing the user text to insure that it stays strictly within the bounds of what it is supposed to be expressing.  Level 4 is distinct from Level 5 because Level 4 is still reasonably possible (if nonetheless extravagant).

Level 5 can only be validated through a full SQL parser.  Obviously unreasonable for anyone other than the vendor to attempt.

And all of this should be firewalled in special-purpose roles/schemas/db user accounts whose rights and permissions have been designed for this designated purpose.

This list is not 100% inclusive as there are some unusual schemes that do not quite fit into one of these levels and in particular, there is arguably enough room between Level 3 and Level 4 that you could squeeze in one or two more meaningful levels.  Hmm, maybe I should write an article...


Philly.net 2008.3 Code Camp is today, and I am delivering the above presentation.  So I am attaching the promised materials, including my scripts and slides. Here it is: dnn/Portals/4/Materials.zip

The answer to this question seems obvious at first: “Managing Performance to acceptable levels.” And indeed that is true.  However, how that is done is the real key, and the answer to that is different based on the context. 

In essence the practice of Performance Management can be broken down in the following ways:

PERFORMANCE MANAGEMENT:
1.       Performance Goal Management
a.       Tracking
b.      Monitoring
c.       Reporting
d.      General Goaling
e.      Service-Levels
f.       
2.       Performance Problem Management
a.       Problem Investigation
b.      Troubleshooting
c.      
3.       Performance Planning
a.       Server Sizing
b.      Application Sizing
c.       Capacity Planning
d.      Service Growth Planning
e.      ...
4.       Application Performance
a.       Application Performance Management
b.      Software Performance Engineering
c.      

This outline is not complete, and I will be filling it in over the next few months, as well as explaining the individual items.

An article at http://www.sql-server-pro.com/xp_cmdshell.html explains how to use the SQL Agent's CmdExec Job Steps as an alternative the xp_CmdShell.

Sometimes it would be useful to return a dataset of all possible permissions. Fortunately, there is a built-in system function that does exactly this It has the following syntax:

sys.fn_builtin_permissions('{securable_class}')

{securable_class} ::=

'' | APPLICATION ROLE | ASSEMBLY | ASYMMETRIC KEY | CERTIFICATE | CONTRACT | DATABASE | ENDPOINT | FULLTEXT CATALOG | LOGIN | MESSAGE TYPE | OBJECT | REMOTE SERVICE BINDING | ROLE | ROUTE | SCHEMA | SERVER | SERVICE | SYMMETRIC KEY | TYPE | USER | XML SCHEMA COLLECTION

If '' is passed, all possible permissions are returned. If a secureable class is passed, then all permissions that apply to that class are returned.

So, if you want to return a list of all possible permissions, you would do this:

Select * From sys.fn_builtin_permissions('').

If you want to limit the result set to just those permissions that could directly or indirectly affect a user’s access to an object, and exclude specific permissions such as SELECT and EXECUTE, then you would do something like this:

Select Class_desc, permission_name from fn_builtin_permissions('')
Where Class_desc IN ('SERVER','DATABASE','SCHEMA','OBJECT')
And permission_name NOT IN ('SELECT', 'EXECUTE')

There are a lot of reasons that I could give for the evilness of Cursors, but all of these specific instances and various details can be summarized in the following four problems:

  1. Performance: they are slooow.

  2. Impact on others: they impact other’s performance too.

  3. Power: Cursors have a low effort to impact ratio.

  4. Clarity: Hard to read, hard to understand.

So if Cursors (and other Loops) are so bad, why do SQL developers still use them? Here are the causes that I have observed:

  1. Ignorance: they don’t realize they’re bad. This we can fix.

  2. Familiarity: OO/3GL paradigm, Loops they know; Sets, not so much.

  3. Design-less Coding: just write a big Loop to process each input row and then insert a line/block for each requirement. The essence of bad programming, mediocre developers have been getting away with this for years.

  4. Applicability/Scope: Loops are easier to apply to more problems. Sure, everything after that is harder, but this way I can get say that I finished my task and leave those problems for another day/person.

More later …

Several months ago a poster at SqlServerCentral.com posed a problem that is all too common in SQL Server SAN environments:

Hi,

I have an application in use at a customer site that uses SQL Server 2000 and the customer is complaining about slow performance. The normal issues I receive is that a particular part of the application is running slow. So I request a profiler trace and identify the slow performing query and resolve the problem. This customer says its all slow. I have requested Profiler Traces and examined them and at times I can see queries that normally run in less than 1 second taking 20-30 seconds to complete.

I have also asked the customer for perfmon logs and examined these. The logs show that the database disk (not the log disk) has sometimes queues of 100's of I/O's. The Avg. Disk Sec/Reads show response times for the disk varying from less than 20ms upto 2 seconds.

I believe the issue is with the poor I/O performance, but the customer says its the application.

can anyone advise on how to proceed with an issue like this.

Ronnie

I gave the following response:

Although there are a huge number of possible reasons for this behavior, in my experience it is almost always due to one of two probable influences:

1) Irregularities in the optimizer/statistics facilities.  SQL Server 2000 had a number of issues in this area which you can research at Microsoft, however, it is usually easy enough to tell if this may be the cause.  Compare the query plan of 20 sec responses to the 2 sec instances.  If they are significantly different then that is your likely cause.  The easiest Quick & Dirty short-term fix is to add query hints to force them to choose the plan you know they should use.

However, the fact that CPU & Read IO's are the same is an indication that this is probably not what is happening.  Which leaves you with...

2) Contention.  Contention is a broad category that can take many forms, both obvious and subtle.  The high queue size on your data disk makes it likely that you are experiencing physical contention for that disk (and not contention for other resources, such as CPU and not logical application contention such as Locking).

At this point, the possible sources of contention break down as follows:

A) Application Internal:  In other words, you have too many users trying to do the same thing at the same time.  That disk has become the bottleneck that everyone is backing up behind.  You need to either put less demand on that disk or give that disk more IO capacity (ie., optimize the App, or the queries/sProcs, tune the tables/indexes, move the tables to less used disks, OR reconfigure the disk (see below)).

B) SQL Server internal: other usages of the SQL server are putting load on that disk that are contending with the app's usage of it.  You need to eliminate, reduce or redirect that other SQL usage.

C) Server Internal: non-SQL usages of that disk, but from the same server, are competing for that disk.  You need to eliminate, reduce or redirect that other usage.

D) SAN-concealed: SAN's are commonly carved up into separate Logical volumes that are nonetheless competing for the same physical resources.  So for instance, your Log disk and your Data disk may actually be sharing the same physical devices (this is a BAD thing, but much more common than you might think).  More likely, some completely unrelated server(s) have heavily-used logical volumes that are using the same physical drives as your data disk.  Solution: reconfigure your data disk on the SAN.

More at www.sqlservercentral.com/Forums/Topic466275-65-1.aspx

An interesting article from 2006 on SQLServerCentral.com  that details how to setup an "Asynchronus" Logon trigger, using a Logon Event Notification (and Service Broker):

http://www.sqlservercentral.com/articles/SQLServerCentral.com/sqlserver2005logontriggers/2366/

Like most Service Broker/Event Notifications articles, it appears that there may be some mistakes and/or omissions (Master Key?).


A pseudocursor is a type of external implicit cursor which means that it is an externally visible way of serializing a dataset by taking advantage of implicit behavior in Transact-SQL.  Specifically, a pseudocursor leverages the ability of SELECT and UPDATE to assign to variables on a per-output row basis.  The following is an example of a simple pseudocursor that counts the number of rows in a Cross Join (approximately 350,000):

Declare @count int

Set @count = 0

 

Select @count = @count + 1

 From sys.columns c1, sys.columns c2

 

Print @count

The Select statement simply adds one to the counter for each output record that it would have produced.  Note that the SELECT statement cannot both produce output and assign to a variable in the same statement.

A pseudocursor runs very fast compared to a true Cursor, but not nearly as fast as the best pure Set-based T-SQL, when that is possible.  For comparison, I tested this routine on my against two other functionally identical procedures.  First a true Cursor routine:

Declare @dummy int

Declare @count int

Set @count = 0

 

Declare CountCursor Cursor LOCAL FORWARD_ONLY For

       Select 1

        From sys.columns c1, sys.columns c2

 

OPEN CountCursor

 

FETCH NEXT FROM CountCursor Into @Dummy

WHILE @@Fetch_Status = 0

 Begin

       Set @count = @count + 1

       FETCH NEXT FROM CountCursor Into @Dummy

 End

Print @count

Secondly, a pure Set-based T-SQL using the built-in Count() aggregate function:

Declare @count int

 

Select @count=count(*) From sys.columns c1, sys.columns c2

Print @count

The results were as follows:

Method

Seconds

Pseudocursor

  0.121

True Cursor

30.823

Pure Set-based

  0.004


What this shows is that when there are functionally identical T-SQL scripts of the same apparent operational Order, then Pseudocursors will typically beat explicit Cursors, but they will also typically lose to pure Set-based SQL.  A look at the number of lines and obvious complexity differences between these three approaches reveals their other two characteristic differences: expressive Power and code Clarity, which follow the same precedence as performance: pure Set-based, then Pseudocursors and finally true Cursors.

What all of this presumes, however, is that these equivalent algorithms for these three approaches exist and this is not always so.  This is because of the final attribute or comparison: scope of Applicability.  Because Cursors can incorporate the attributes of unbound loops, they gain full Turing-completeness which means that they can be successfully applied to any programming problem that can be solved algorithmically.  Set-based T-SQL, on the other hand is implicitly bound by its source tables and is thus unable to applied to certain algorithms.  Theoretically, this only applies to problems whose halting is not predictable or certain; however, in practice it can become unusable well before that.  For instance, a Hailstone Function generator would be either impractical or impossible to implement in pure Set-based SQL (depending on the rules imposed).

As with performance, power and clarity, Pseudocursors again occupy a very useful middle ground between these two.  Although Pseudocursors are also not Turing-complete, they can implement a very large set of algorithms, beyond what is possible with pure Set-based SQL alone, though still within certain input bounds.  For instance, you could very easily implement a Hailstone Function generator with Pseudocursors, and although technically it would still be bound by its source tables, it could easily be made to run for longer than any of us will be alive.

More to come in Part 2 …

When discussions about Cursors get hot and heavy, one thing that comes out is that there is a lot of confusions about what a cursor (or "Cursor") really is.  In general a cursor is "a moving placement or pointer that indicates a position."1  More specifically for databases, a cursor can be formally defined as "a control structure for the successive traversal (and potential processing) of records in a result set."2  Less formally, a cursor is any method (structures and processes) that serializes a data set, that is sequences (imposes an order on) it and then processes it one record at a time, in order, keeping track or the current position with the sequence.

Given that definition, cursors are in a lot more places than we might initially assume and these more general cursors are not what we usually mean when we are talking about the Evils of Cursors.  For that reason, I distinguish betweengeneral "cursors" (lower case) and the explicit user written SQL code constructs called "Cursors".  This naturally raises two important questions;

  1. What's so bad about Explicit Cursors?  and,...
  2. Are these other cursors OK then?

I will deal with both of these issues in a future post...

 

Copyright 2008 by R. Barry Young
 RBarryYoung.net  |  Terms Of Use  |  Privacy Statement