Monday 14 January 2013

Memory issues

I have a SQL script that reports cumulative waits (from sys.dm_os_wait_stats), then the key items for review (the same source, but giving each wait type's percentage of all waits and ignoring "benign" wait types), then the same information again as a delta so that current waits can be seen, and finally the current activities that are waiting, so it is possible to see what is hurting right now (via sys.dm_os_waiting_tasks and sys.dm_exec_sessions / sys.dm_exec_requests).
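
A minimal sketch of the cumulative-waits part of such a script might look like the one below; the wait types filtered out as "benign" here are a short illustrative list, not the full set my script excludes:

    -- cumulative waits since the stats were last cleared, with each wait
    -- type's share of the total (the exclusion list is illustrative only)
    ;with waits as
    (
        select wait_type, wait_time_ms, waiting_tasks_count
        from sys.dm_os_wait_stats
        where wait_type not in ('SLEEP_TASK', 'LAZYWRITER_SLEEP',
              'CHECKPOINT_QUEUE', 'REQUEST_FOR_DEADLOCK_SEARCH',
              'XE_TIMER_EVENT', 'LOGMGR_QUEUE', 'WAITFOR',
              'BROKER_TO_FLUSH', 'SQLTRACE_INCREMENTAL_FLUSH_SLEEP')
    )
    select wait_type, wait_time_ms, waiting_tasks_count,
           cast(100.0 * wait_time_ms / sum(wait_time_ms) over ()
                as decimal(5, 2)) as pct_of_all_waits
    from waits
    order by wait_time_ms desc;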

This gives me a broad summary of what's going on and is a good indicator of where to look when there are problems. What's more, it generates a useful viewpoint without taking an age to run, which is always good news as the only boxes it needs to be executed against are those with problems.

What I also have is an accumulated list of wait types and their meanings. I'm not sure where I picked up the initial list, but as time has gone on I've added to it and clarified the meanings. One area I have been looking at lately is memory grants, because I have a server which continually has outstanding memory grants - within the last 12 months we've gone from 16GB to 40GB and I think the machine would still like more. The core database suffers a range of problems, including the wrong fields / no fields being indexed, uniqueidentifiers, non-unique primary keys and a failure to agree an archive policy prior to implementation, but that's a little out of scope here. What I do know is that the machine is now largely stable but could do with more memory (EDIT: 10 days on, performance problems have persisted and more RAM would definitely be good).

During the course of investigating I have collected further information about the RESOURCE_SEMAPHORE wait type, which links that wait to memory grant shortages and so ties in with the outstanding grants issue.
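
As a quick check of how many grants are outstanding at any moment (not part of my script above, just a sketch of the sort of thing I mean), sys.dm_exec_query_resource_semaphores shows how many requests currently hold a grant and how many are queued waiting for one:

    -- how many requests currently hold a memory grant and how many are
    -- queued waiting for one (the waiters are the "outstanding" grants
    -- and show up as RESOURCE_SEMAPHORE waits)
    select resource_semaphore_id,
           target_memory_kb,
           granted_memory_kb,
           grantee_count,   -- grants currently held
           waiter_count     -- grants outstanding / queued
    from sys.dm_exec_query_resource_semaphores;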

In this article, MSSQL Tips : Resource_Semaphore waits, the requested_memory_kb column from sys.dm_exec_query_memory_grants is highlighted, showing how much memory has been requested, and the article then talks about looking at sys.dm_exec_query_memory_grants together with sys.dm_exec_sql_text(sql_handle) to find the largest memory requests.
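
A sketch of that kind of query (the column choice here is mine rather than the article's) might be:

    -- current memory grants, largest requests first, with the statement text
    select mg.session_id,
           mg.requested_memory_kb,
           mg.granted_memory_kb,
           mg.wait_time_ms,
           st.text as sql_text
    from sys.dm_exec_query_memory_grants as mg
    cross apply sys.dm_exec_sql_text(mg.sql_handle) as st
    order by mg.requested_memory_kb desc;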

Having been really rather instructive, the article then fizzles out and concludes that the answer is to look at indexes, or the lack thereof.

That is a good starting point, and all too often basic indexes are the fix, but indexes can be a minefield. Vendors are often very sensitive to changes being made, and if they discover changes have been made without agreement they will blame all future problems on that one thing - I once had a vendor offer to accept the idea of creating two indexes (which I had tested and which transformed their system's performance), but having received the creation statements from me they wanted to charge £800 to allow us to run them (I would have been the one running them in the production instance). Further, changing indexes on a mature system can often lead to unexpected effects that are difficult to predict without testing the application in greater depth.

Looking further at the memory requested, I would suggest there is a case for considering other factors too. Yes, they will be equally contentious with vendors, but adding indexes on its own can be a lazy solution:

  1. Are joins of the correct type (can that outer join be replaced with an inner join), and are all join criteria specified?
  2. Are functions being executed against fields in the where clause that might prevent index usage (perhaps not today, but always worth checking; comparisons against variables can often be rewritten so the function is applied to the variable rather than the field)?
  3. Is the query loading tables and fields that add no value?
  4. Are there varchar fields involved? The query optimiser usually estimates that a varchar will take half its defined length when calculating memory requirements - have a look at "Estimated Row Size" in the query plan. If your varchar fields frequently use a lot less (or a lot more) than half their defined size, the row size estimates might be a long way out, especially for longer fields. You can test this for yourself: turn on the actual query plan and try something like the following against any small table:

    select cast('hello' as varchar(30)) from dbo.AnySmallTable -- substitute any table you have to hand (mine had 129 rows)

    Right click on the "Compute Scalar" item and have a look at the estimated row size (I see 29 bytes), then change the varchar(30) to text, varchar(max), nvarchar(30) or anything else and see what sizes you get. For varchar(max) I get an estimate of 4035 bytes - in the table I picked I have 129 rows, so the difference is (129 * 4035) - (129 * 29) - that's about 504KB of difference on just 129 rows; it'll soon add up.
  5. How many rows does the query plan report it will retrieve? I've just been looking at a query that said it was retrieving 233,131 rows, which didn't match the 136 rows I could see in the result set. Looking further at the query, and at the context from Profiler, I can see that it is generating a parameter list for an SSRS report - still too many rows, though the "distinct" reveals why I see so few records. More examination shows that the number of rows reported comes about because the developer wants the "types" of something that are actually used, and is achieving that with an inner join between the types table (225 rows) and the "items" table, which has 15m rows. Replacing the inner join with an exists (sketched below) reduces the impact considerably - runtime dropped from 12 seconds to less than 1.
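
As a rough illustration of that last point (the table and column names here are made up, not the real report's), the original shape and the exists rewrite look something like this:

    -- original shape: joining to the 15m row items table just to filter the
    -- 225 row types table multiplies out the row estimate (and the memory
    -- grant), and then needs a distinct to collapse the duplicates
    select distinct t.TypeId, t.TypeName
    from dbo.Types as t
    inner join dbo.Items as i on i.TypeId = t.TypeId;

    -- rewrite: exists only has to find one matching item per type, so there
    -- is no join output to de-duplicate and a far smaller row estimate
    select t.TypeId, t.TypeName
    from dbo.Types as t
    where exists (select 1 from dbo.Items as i where i.TypeId = t.TypeId);
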
I'm not saying this is the be all and end all of query optimization - it's a massive subject, and I am grateful that the optimizer does a brilliant job most of the time - but I am trying to highlight other potential issues.

