March 11, 2010

"For successful technology, reality must take precedence over public relations, for nature cannot be fooled."    --  Richard Feynman
Moving Sql Server Code
Author:RBarryYoungCreated:8/11/2008 12:08 AM
My occasional adventures in performance, administration, secuirty and other database topics. -- RBarryYoung

The answer to this question seems obvious at first: “Managing Performance to acceptable levels.” And indeed that is true.  However, how that is done is the real key, and the answer to that is different based on the context. 

In essence the practice of Performance Management can be broken down in the following ways:

PERFORMANCE MANAGEMENT:
1.       Performance Goal Management
a.       Tracking
b.      Monitoring
c.       Reporting
d.      General Goaling
e.      Service-Levels
f.       
2.       Performance Problem Management
a.       Problem Investigation
b.      Troubleshooting
c.      
3.       Performance Planning
a.       Server Sizing
b.      Application Sizing
c.       Capacity Planning
d.      Service Growth Planning
e.      ...
4.       Application Performance
a.       Application Performance Management
b.      Software Performance Engineering
c.      

This outline is not complete, and I will be filling it in over the next few months, as well as explaining the individual items.

An article at http://www.sql-server-pro.com/xp_cmdshell.html explains how to use the SQL Agent's CmdExec Job Steps as an alternative the xp_CmdShell.

Sometimes it would be useful to return a dataset of all possible permissions. Fortunately, there is a built-in system function that does exactly this It has the following syntax:

sys.fn_builtin_permissions('{securable_class}')

{securable_class} ::=

'' | APPLICATION ROLE | ASSEMBLY | ASYMMETRIC KEY | CERTIFICATE | CONTRACT | DATABASE | ENDPOINT | FULLTEXT CATALOG | LOGIN | MESSAGE TYPE | OBJECT | REMOTE SERVICE BINDING | ROLE | ROUTE | SCHEMA | SERVER | SERVICE | SYMMETRIC KEY | TYPE | USER | XML SCHEMA COLLECTION

If '' is passed, all possible permissions are returned. If a secureable class is passed, then all permissions that apply to that class are returned.

So, if you want to return a list of all possible permissions, you would do this:

Select * From sys.fn_builtin_permissions('').

If you want to limit the result set to just those permissions that could directly or indirectly affect a user’s access to an object, and exclude specific permissions such as SELECT and EXECUTE, then you would do something like this:

Select Class_desc, permission_name from fn_builtin_permissions('')
Where Class_desc IN ('SERVER','DATABASE','SCHEMA','OBJECT')
And permission_name NOT IN ('SELECT', 'EXECUTE')

There are a lot of reasons that I could give for the evilness of Cursors, but all of these specific instances and various details can be summarized in the following four problems:

  1. Performance: they are slooow.

  2. Impact on others: they impact other’s performance too.

  3. Power: Cursors have a low effort to impact ratio.

  4. Clarity: Hard to read, hard to understand.

So if Cursors (and other Loops) are so bad, why do SQL developers still use them? Here are the causes that I have observed:

  1. Ignorance: they don’t realize they’re bad. This we can fix.

  2. Familiarity: OO/3GL paradigm, Loops they know; Sets, not so much.

  3. Design-less Coding: just write a big Loop to process each input row and then insert a line/block for each requirement. The essence of bad programming, mediocre developers have been getting away with this for years.

  4. Applicability/Scope: Loops are easier to apply to more problems. Sure, everything after that is harder, but this way I can get say that I finished my task and leave those problems for another day/person.

More later …

Several months ago a poster at SqlServerCentral.com posed a problem that is all too common in SQL Server SAN environments:

Hi,

I have an application in use at a customer site that uses SQL Server 2000 and the customer is complaining about slow performance. The normal issues I receive is that a particular part of the application is running slow. So I request a profiler trace and identify the slow performing query and resolve the problem. This customer says its all slow. I have requested Profiler Traces and examined them and at times I can see queries that normally run in less than 1 second taking 20-30 seconds to complete.

I have also asked the customer for perfmon logs and examined these. The logs show that the database disk (not the log disk) has sometimes queues of 100's of I/O's. The Avg. Disk Sec/Reads show response times for the disk varying from less than 20ms upto 2 seconds.

I believe the issue is with the poor I/O performance, but the customer says its the application.

can anyone advise on how to proceed with an issue like this.

Ronnie

I gave the following response:

Although there are a huge number of possible reasons for this behavior, in my experience it is almost always due to one of two probable influences:

1) Irregularities in the optimizer/statistics facilities.  SQL Server 2000 had a number of issues in this area which you can research at Microsoft, however, it is usually easy enough to tell if this may be the cause.  Compare the query plan of 20 sec responses to the 2 sec instances.  If they are significantly different then that is your likely cause.  The easiest Quick & Dirty short-term fix is to add query hints to force them to choose the plan you know they should use.

However, the fact that CPU & Read IO's are the same is an indication that this is probably not what is happening.  Which leaves you with...

2) Contention.  Contention is a broad category that can take many forms, both obvious and subtle.  The high queue size on your data disk makes it likely that you are experiencing physical contention for that disk (and not contention for other resources, such as CPU and not logical application contention such as Locking).

At this point, the possible sources of contention break down as follows:

A) Application Internal:  In other words, you have too many users trying to do the same thing at the same time.  That disk has become the bottleneck that everyone is backing up behind.  You need to either put less demand on that disk or give that disk more IO capacity (ie., optimize the App, or the queries/sProcs, tune the tables/indexes, move the tables to less used disks, OR reconfigure the disk (see below)).

B) SQL Server internal: other usages of the SQL server are putting load on that disk that are contending with the app's usage of it.  You need to eliminate, reduce or redirect that other SQL usage.

C) Server Internal: non-SQL usages of that disk, but from the same server, are competing for that disk.  You need to eliminate, reduce or redirect that other usage.

D) SAN-concealed: SAN's are commonly carved up into separate Logical volumes that are nonetheless competing for the same physical resources.  So for instance, your Log disk and your Data disk may actually be sharing the same physical devices (this is a BAD thing, but much more common than you might think).  More likely, some completely unrelated server(s) have heavily-used logical volumes that are using the same physical drives as your data disk.  Solution: reconfigure your data disk on the SAN.

More at www.sqlservercentral.com/Forums/Topic466275-65-1.aspx

An interesting article from 2006 on SQLServerCentral.com  that details how to setup an "Asynchronus" Logon trigger, using a Logon Event Notification (and Service Broker):

http://www.sqlservercentral.com/articles/SQLServerCentral.com/sqlserver2005logontriggers/2366/

Like most Service Broker/Event Notifications articles, it appears that there may be some mistakes and/or omissions (Master Key?).


A pseudocursor is a type of external implicit cursor which means that it is an externally visible way of serializing a dataset by taking advantage of implicit behavior in Transact-SQL.  Specifically, a pseudocursor leverages the ability of SELECT and UPDATE to assign to variables on a per-output row basis.  The following is an example of a simple pseudocursor that counts the number of rows in a Cross Join (approximately 350,000):

Declare @count int

Set @count = 0

 

Select @count = @count + 1

 From sys.columns c1, sys.columns c2

 

Print @count

The Select statement simply adds one to the counter for each output record that it would have produced.  Note that the SELECT statement cannot both produce output and assign to a variable in the same statement.

A pseudocursor runs very fast compared to a true Cursor, but not nearly as fast as the best pure Set-based T-SQL, when that is possible.  For comparison, I tested this routine on my against two other functionally identical procedures.  First a true Cursor routine:

Declare @dummy int

Declare @count int

Set @count = 0

 

Declare CountCursor Cursor LOCAL FORWARD_ONLY For

       Select 1

        From sys.columns c1, sys.columns c2

 

OPEN CountCursor

 

FETCH NEXT FROM CountCursor Into @Dummy

WHILE @@Fetch_Status = 0

 Begin

       Set @count = @count + 1

       FETCH NEXT FROM CountCursor Into @Dummy

 End

Print @count

Secondly, a pure Set-based T-SQL using the built-in Count() aggregate function:

Declare @count int

 

Select @count=count(*) From sys.columns c1, sys.columns c2

Print @count

The results were as follows:

Method

Seconds

Pseudocursor

  0.121

True Cursor

30.823

Pure Set-based

  0.004


What this shows is that when there are functionally identical T-SQL scripts of the same apparent operational Order, then Pseudocursors will typically beat explicit Cursors, but they will also typically lose to pure Set-based SQL.  A look at the number of lines and obvious complexity differences between these three approaches reveals their other two characteristic differences: expressive Power and code Clarity, which follow the same precedence as performance: pure Set-based, then Pseudocursors and finally true Cursors.

What all of this presumes, however, is that these equivalent algorithms for these three approaches exist and this is not always so.  This is because of the final attribute or comparison: scope of Applicability.  Because Cursors can incorporate the attributes of unbound loops, they gain full Turing-completeness which means that they can be successfully applied to any programming problem that can be solved algorithmically.  Set-based T-SQL, on the other hand is implicitly bound by its source tables and is thus unable to applied to certain algorithms.  Theoretically, this only applies to problems whose halting is not predictable or certain; however, in practice it can become unusable well before that.  For instance, a Hailstone Function generator would be either impractical or impossible to implement in pure Set-based SQL (depending on the rules imposed).

As with performance, power and clarity, Pseudocursors again occupy a very useful middle ground between these two.  Although Pseudocursors are also not Turing-complete, they can implement a very large set of algorithms, beyond what is possible with pure Set-based SQL alone, though still within certain input bounds.  For instance, you could very easily implement a Hailstone Function generator with Pseudocursors, and although technically it would still be bound by its source tables, it could easily be made to run for longer than any of us will be alive.

More to come in Part 2 …

When discussions about Cursors get hot and heavy, one thing that comes out is that there is a lot of confusions about what a cursor (or "Cursor") really is.  In general a cursor is "a moving placement or pointer that indicates a position."1  More specifically for databases, a cursor can be formally defined as "a control structure for the successive traversal (and potential processing) of records in a result set."2  Less formally, a cursor is any method (structures and processes) that serializes a data set, that is sequences (imposes an order on) it and then processes it one record at a time, in order, keeping track or the current position with the sequence.

Given that definition, cursors are in a lot more places than we might initially assume and these more general cursors are not what we usually mean when we are talking about the Evils of Cursors.  For that reason, I distinguish betweengeneral "cursors" (lower case) and the explicit user written SQL code constructs called "Cursors".  This naturally raises two important questions;

  1. What's so bad about Explicit Cursors?  and,...
  2. Are these other cursors OK then?

I will deal with both of these issues in a future post...

A helpful tip from Microsoft MVP Gail Shaw here on how to bypass a bad Logon Trigger:

Connections via the DAC don't fire login triggers. From management studio, open a new query and specift admin: before the server name (admin:MyServerName) and use windows authentication. You must be sysadmin.

If remote DAC hasn't been enabled (the default), you will have to do that from a querying tool on the server itself. If the server doesn't have management studio, you can use SQLCMD. Specify the -A switch

Once in, you can disable the trigger.

DISABLE TRIGGER MyBrokenLoginTrigger ON ALL SERVER
Be very, very careful when writing login triggers. If, for any reason, the trigger fires an error of Sev 16 or higher (object does not exist, database not found, permission denied) the trigger fails and rolls back the login.
I've had a couple panicked, late-night phone calls because of these.

Something that I have wanted to do for a while is to write a Logon Trigger for SQL Server.  This bit of code from BOL demonstrates how:

CREATE TRIGGER connection_limit_trigger
ON ALL SERVER WITH EXECUTE AS 'login_test'
FOR LOGON
AS
BEGIN
IF ORIGINAL_LOGIN()= 'login_test'
AND (SELECT COUNT(*) FROM sys.dm_exec_sessions
WHERE is_user_process = 1
AND original_login_name = 'login_test') > 3
ROLLBACK;
END;

Next up: how to bypass a bad Logon Triger with the DAC (Dedicated Adminstrator Connection).

SQL Server MVP Jeff Moden posted a technique (on SQLServerCentral.com forums) for something that I have struggled with many times: how to import CSV files that have a variable number of columns:

Heh... that's known as a "Headerless Ragged Right" file and they're a bugger to import correctly. BCP and BULK insert require that all the rows have the same number of delimiters so you can't do an import of such a file with either.

However... with a little help from a couple of DOS commands and a little prestidigitation with a text based linked server, it can be done with very little pain.First, I assume you have a file, like in your original post, in C:\Temp called Test01.txt and it looks like this...

col1,col2,col3
col1,col2
col1,col2,col3,col4
col1,col2,col,3,col4,col5

Next, let's setup a linked server and give it the necessary login privs...

--===== Create a linked server to the drive and path you desire.
EXEC dbo.sp_AddLinkedServer TxtSvr,
'Jet 4.0',
'Microsoft.Jet.OLEDB.4.0',
'C:\Temp',
NULL,
'Text'--===== Set up login mappings.
EXEC dbo.sp_AddLinkedSrvLogin TxtSvr, FALSE, NULL, Admin, NULL

Here comes the fun part... if we just read the file directly...
--===== Query one of the files by using a four-part name.
SELECT *
FROM TxtSvr...[Test01#txt]


... we get an awful mess that looks like this...

F4 F5 col1 col2 col3
col1 col2
col4 col1 col2 col3
col4 col5 col1 col2 col3

(3 row(s) affected)

 

notice that the first row was used as a header and that the data has been rearranged AND that two of the headers have been made up! We can fix that with a couple of simple DOS commands...

--===== Create a header that identifies the columns we want(any col names will do)
EXEC Master.dbo.xp_CmdShell
'ECHO HdrCol1,HdrCol2,HdrCol3,HdrCol4,HdrCol5 > C:\Temp\Header.txt'
--===== Create a new working file that has the header and the original file as one
EXEC Master.dbo.xp_CmdShell
'COPY C:\Temp\Header.txt+C:\Temp\Test01.txt C:\Temp\MyWork.txt'

Then, accessing the data in the file is a breeze... do with it what you want because it can now be referenced as a table with the correct column names...
--===== Read the csv text file as if it were a table
SELECT *
FROM TxtSvr...[MyWork#txt]

HdrCol1 HdrCol2 HdrCol3 HdrCol4 HdrCol5
col1 col2 col3
col1 col2
col1 col2 col3 col4
col1 col2 col3 col4 col5

(4 row(s) affected)

If you need to drop the linked server after than, the following command will do nicely...

  EXEC dbo.sp_DropServer 'TxtSvr', 'DropLogins'

--Jeff Moden

More at http://www.sqlservercentral.com/Forums/FindPost553970.aspx

Recently an article and discussion on SqlServerCentral.com turned to the popular opinion that "Cursors are OK somtimes."  Well, I couldn't let that go:

Wow. This obtuse defense of cursors and RBAR has more hand-waving than a Michael Jackson concert. Practically every other post here has statements to the the effect that "sometimes cursors are the best solution", starting with the original article itself, and yet none of these claims has been backed up with an example, or a pointer of a cursor-based SQL procedure that is the best solution for SQL Server in its current release.

Not one.

Hmm, makes me think of one of my own maxims:

When everyone says that something is true, but no one can provide an example, then it probably isn't true.

And let me be clear about what I mean by cursors being bad. Technically, anything in SQL that serializes a data stream and can keep positional context is a cursor. However, when we say "cursors are bad" we do not mean client-side cursors, internal cursors, implicit cursors, etc. We mean explicit Transact-SQL server-side cursors. The kind that people explicitly write in procedures all the time (or try to hide in another stored procedure) and that use the CURSOR datatype.

Since the advent of the new features in SQL Server 2005 (particularly Varchar(MAX) and CTE's), I have not seen one instance where a cursor was the best solution for a problem. The best that can be said for it is that there are some (very few) cases where a cursor was no worse than some other solutions. But real-world problems addressing situations that reasonably belong in the database-tier where cursors are the clear best solution? Not one.

More here: www.sqlservercentral.com/Forums/Topic550718-263-3.aspx

 

Copyright 2008 by R. Barry Young
 RBarryYoung.net  |  Terms Of Use  |  Privacy Statement