Data and AI Knowledge: October 2015

Monday 12 October 2015

SQL Server Lock Pages in Memory

How to enable "Lock pages in Memory" at the Windows level?

1. On the Start menu, click Run. In the Open box, type gpedit.msc. The Group Policy dialog box opens.
2. On the Group Policy console, expand Computer Configuration, and then expand Windows Settings.
3. Expand Security Settings, and then expand Local Policies.
4. Select the User Rights Assignment folder. The policies will be displayed in the details pane.
5. In the pane, double-click Lock pages in memory.
6. In the Local Security Policy Setting dialog box, click Add.
7. In the Select Users or Groups dialog box, add the account that runs the SQL Server Service.

8. Restart the machine for these changes to take effect.

If you are running an Enterprise or Developer Edition of SQL Server 2005 or 2008, please stop here. The rest of the steps are for the Standard Edition Only.

Ensure that the build of SQL Server 2008 is Cumulative Update 2 on Service Pack 1. Preferably, run the "select @@version" command against the SQL Server Instance and verify that the build is 10.00.2714.00. In case of SQL Server 2005, please verify that the build is 9.00.4226.
9. Now set the Trace Flag 845 as a startup trace flag. This can be done by adding
-T845 to the startup parameters from the SQL Server Configuration Manager. Please refer to the screenshot below:

SQLConfigManager
10. Restart the SQL Server Service.
11. Verify that the following message is written in the SQL Server ErrorLog file at startup:
Using locked pages for buffer pool

Note: Enabling Locked Pages may have a negative performance impact on your system performance. Please consider all potential effects before you use this option. Enabling this trace flag on the Enterprise Edition or the Developer Edition has no effect.

Monday 5 October 2015

SQL Server Checkpoint Mystery

SQL Server Checkpoint Mystery
Checkpoint is the hidden Mystery in SQL Server,Thanks to Paul Randal for very good explanation.

Whenever a checkpoint occurs either auto/manual/indirect/direct following set of operations are occurred for the database:

1. All dirty data file pages for the database are written to disk (all pages that have changed in memory since they were read from disk or since the last checkpoint), regardless of the state of the transaction that made the change.
2. Before a page is written to disk, all log records up to and including the most recent log record describing a change to that page are written to disk (yes, log records can be cached in memory too). This guarantees recovery can work and is called write-ahead logging. Log records are written to the log sequentially, and log records from multiple transactions will be interspersed in the log. The log cannot be selectively written to disk, so writing a dirty page to disk that only has a single log record affecting it may mean writing many more previous log records to disk as well.
3. Log records describing the checkpoint are generated.
4. The LSN of the checkpoint is recorded in the database boot page in the dbi_checkptLSN field.

If in the SIMPLE recovery model, the VLFs in the log are checked to see whether they can be marked inactive (called clearing or truncating the log – both of which are terrible misnomers, as nothing is either physically cleared or truncated).

Checkpoints are not really tracked in the transaction log – it just serves as a useful repository for information about which transactions are active at the time of the checkpoint. The LSN of the last checkpoint is recorded in the database boot page. This is where recovery starts, and if this page is inaccessible, the database cannot be attached, opened, or processed in any way – partly because it’s the boot page that knows whether the database was cleanly shut down or not, and partly because it’s the only place that records the LSN of the last checkpoint record. You may say, well it’s recorded in the transaction log too, but what if the log is corrupt in some way?
One area of confusion I’ve seen is that the checkpoint log records are overwritten by subsequent checkpoints. Absolutely not – once written, a log record is NEVER updated or overwritten – it will only be overwritten when the log wraps and the VLFs are re-used.. This has led to further confusion about when checkpoint information is retrievable from the log, using commands such as fn_dblog.

Now What happens if a checkpoint starts but doesn’t finish before a crash occurs? Will that checkpoint be used for crash recovery?
The answer is no, it won’t. The purpose of a checkpoint is to bring the pages in the data files up-to-date with what’s in the transaction log. When a checkpoint ends, there’s a guarantee that as of the LSN of the LOP_BEGIN_CKPT log record, all changes from log records before that point are persisted in the data files on disk. There’s no guarantee about logged changes after that point, only before it. In other words, all the log records before the LSN of the LOP_BEGIN_CKPT log record are no longer required for crash recovery, unless there’s a long running transaction that started before that LSN.

When the checkpoint ends, the boot page of the database (page 9 in file 1) is updated with the beginning LSN of the checkpoint (and then if in the SIMPLE recovery mode, any log clearing/truncating can occur).

So if a checkpoint started but didn’t end before a crash, it’s LSN would not be in the boot page and so crash recovery would start from the previous checkpoint. This is good, because an incomplete checkpoint means there’s no guarantee about which logged changes are persisted in the data files, and so crash recovery wouldn’t be able to work correctly from only starting at the beginning of the incomplete checkpoint.

How does SQL Server guarantee that there’s always one complete checkpoint in the active portion of the log, in case a crash occurs?
The answer is that log clearing/truncation of a VLF containing an LOP_BEGIN_CKPT log record cannot happen until another complete checkpoint has occurred. In other words, a complete checkpoint has to occur since the last log clearing/truncation before the next one can happen. If a checkpoint hasn’t occurred, the log_reuse_wait_desc for the database in sys.databases will return CHECKPOINT. It’s not common to see this occur, but you might see it if there’s a very long running checkpoint (e.g. a very large update on a system with a slow I/O subsystem so the flushing of data file pages takes a long time) and very frequent log backups, so two log backups occur over the time taken by a single checkpoint operation. It could also happen if you’ve messed with the sp_configure recovery interval and set it higher than the default.

How Checkpoint works for tempDB database?
A checkpoint is only done for tempdb when the tempdb log file reaches 70% full – this is to prevent the tempdb log from growing if at all possible (note that a long-running transaction can still essentially hold the log hostage and prevent it from clearing, just like in a user database).
You know very well, when a checkpoint occurs for a user database, all dirty pages for that database are flushed to disk (as well as other operations). This does not happen for tempdb. Tempdb is not recovered in the event of a crash, and so there is no need to force dirty tempdb pages to disk, except in the case where the lazywriter process (part of the buffer pool) has to make space for pages from other databases. Of course, when you issue a *manual* CHECKPOINT, all the dirty pages are flushed, but for automatic checkpoints they’re not. You can easily prove this to yourself with a large transaction inserting into a user table on tempdb, and watch the Databases/Log File(s) Used Size (KB) and Databases/Log Truncations for tempdb, when you see them go up, check to see if the number of dirty pages in tempdb has reduced – it won’t have.

Thursday 1 October 2015

SQL Server Page Type

Types of Data Pages

In SQL Server, we have different type of Pages and its consists of numbers of 8K pages.
A Page includes 8192 bytes. First 96 bytes are used for header. The rest of the space is for data.
A variable length row offset array (or slot array) is located at the end of every page and grows backwards. Count of records is saved in the header.
The size of each element in the offset array is 2 bytes. Records in a page are not sorted even though it is an index page.
If the data needs to be sorted, the offset array will be is sorted by the key of the index.

Here is the Structure of the Data Page:

There are around 17 types of page in SQL Server data file and the arrangement of the data pages is same as shown below:

Type 1 – Data page
- Data records in heap
- Clustered index leaf-level
- Location can be random

Type 2 – Index page
- Non-clustered index
- Non-leave-level clustered index
- Location can be random

Type 3 – Text Mixed Page
- Small LOB value(s), multiple types and rows.
- Location can be random

Type 4 – Text Page
- LOB value from a single column value
- Location can be random

Type 7 – Sort Page
- Temporary page for sort operation.
- Usually tempdb, but can be in user database for online operations.
- Location can be random

Type 8 – GAM Page
- Global Allocation Map, track allocation of extents.
- One bit for each extent, if the bit is 1 then its free, otherwise its allocated.
- The first GAM page in each file is page 2

Type 9 – SGAM Page
- Shared Global Allocation Map, track allocation of shared extents
- One bit for each extent, if the bit is 1, then it has free space, otherwise its full
- The first SGAM page in each file is page 3

Type 10 – IAM Page
- Index Allocation Map. Extent allocation in a GAM interval for an index or heap table.
- Location can be random.

Type 11 – PFS Page
- Page Free Space. Byte map, keeps track of free space of pages
- The first PFS is page 1 in each file.

Type 13 – Boot Page
- Information about the page
- Only page 9 in file 1.
- To review "What's on boot page?" run below commands:

DBCC DBINFO (‘DB_Name’);
GO
OR
DBCC PAGE(DB_Name, 1, 9, 3)
GO
- More details click here to refer Paul's Article

Type 14 – Server Configuration Page (Don't know the official name)
- Part of information returned from sp_configure.
- It only exists in master database, file 1, page 10

Type 15 – File Header Page
- Information about the file.
- It’s always page 0 every data page.
- To review this page, run below command:

DBCC PAGE(DB_name, 1, 0, 3)
DBCC PAGE(DB_name, 1, 0, 3) WITH TABLERESULTS
GO

Type 16 – Differential Changed map
- Extents in GAM interval have changed since last full or differential backup
- The first Differential Changed Page is page 6 in each file

Type 17 – Bulk Change Map
- Extents in GAM interval modified by bulk operations since last backup
- The first Bulk Change Map page is page 7 in each file.