Guid vs IntRather interesting article on when tho use a guid or an int as a Primary Key:
http://databases.aspfaq.com/database/what-should-i-choose-for-my-primary-key.htmlSummary:
IDENTITY |
---|
PROS |
---|
- small storage footprint;
- optimal join / index performance (e.g. for time range queries, most rows recently inserted will be on a limited number of pages);
- highly useful for data warehousing;
- native data type of the OS, and easy to work with in all languages;
- easy to debug (where userid = 457);
- generated automatically (retrieved through SCOPE_IDENTITY() rather than assigned);
- not updateable (though some consider this a disadvantage, strangely enough).
|
CONS |
---|
- cannot be reliably "predicted" by applications รข€” can only be retrieved after the INSERT;
- need a complex scheme in multi-server environments, since IDENTITY is not allowed in some forms of replication;
- can be duplicated, if not explicitly set to PRIMARY KEY:
CREATE TABLE splunge( splungeID INT IDENTITY(1,1), foo CHAR(1) ) GO SET NOCOUNT ON INSERT splunge(foo) VALUES('a') INSERT splunge(foo) VALUES('b') INSERT splunge(foo) VALUES('c') GO DBCC CHECKIDENT('splunge', RESEED, 2) GO INSERT splunge(foo) VALUES('d') GO SELECT * FROM splunge GO DROP TABLE splunge GO |
- if part of the clustered index on the table, this can create an insert hot-spot;
- proprietary and not directly portable;
- only unique within a single table;
- gaps can occur (e.g. with a rolled back transaction), and this can cause chicken little-style alarms.
|
GUID() |
---|
PROS |
---|
- since they are {more or less} guaranteed to be unique, multiple tables/databases/instances/servers/networks/data centers can generate them independently, then merged without clashes;
- required for some forms of replication;
- can be generated outside the database (e.g. by an application);
- distributed values prevent hot-spot (as long as you don't cluster this column, which can lead to abnormally high fragmentation).
|
CONS |
---|
- the wider datatype leads to a drop in index performance (if clustered, each insert almost guaranteed to 'dirty' a different page), and an increase in storage requirements;
- cumbersome to debug (where userid = {BAE7DF4-DDF-3RG-5TY3E3RF456AS10});
- updateable (need to propogate changes, or prevent the activity altogether);
- sensitive to time rollbacks in certain environments (e.g. daylight savings time rollbacks);
- GROUP BY and other set operations often require CAST/CONVERT;
- not all languages and environments directly support GUIDs;
- there is no statement like SCOPE_GUID() to determine the value that was generated, e.g. by NEWID();
- there are display issues to consider, such as response.write issues (see Article #2358), and also consider cases where you display a large amount of rows as checkboxes or other form elements with IDs - your page just got a whole lot bigger, and you have to handle the IDs in a custom way because JavaScript can't deal with objects that have dashes in their name/id.
|
No comments:
Post a Comment