SQL Server Bitwise Processing Like C# Enum Flags

Disassembling Bit Flag Enumerations in SQL Server

SELECT * FROM first_table f 
JOIN second_table s ON s.ID & f.Flags <> 0
WHERE f.something = something

This would select all rows from second_table that match any of the flags on the given row in the first table.
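The join above can be tried out with a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server; the `name` and `label` columns and the sample rows are assumptions made up for illustration, and the bitwise test is parenthesized to make the intended grouping explicit.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE first_table  (name TEXT, Flags INTEGER);
    CREATE TABLE second_table (ID INTEGER, label TEXT);
    -- Each ID is a single bit, like the members of a C# [Flags] enum.
    INSERT INTO second_table VALUES (1, 'read'), (2, 'write'), (4, 'delete');
    -- 5 = 1 | 4, so alice has 'read' and 'delete'.
    INSERT INTO first_table  VALUES ('alice', 5), ('bob', 2);
""")

def flags_for(name):
    """Return the labels whose bit is set in the Flags value of the given row."""
    rows = conn.execute(
        """
        SELECT s.label
        FROM first_table f
        JOIN second_table s ON (s.ID & f.Flags) <> 0
        WHERE f.name = ?
        ORDER BY s.ID
        """,
        (name,),
    ).fetchall()
    return [label for (label,) in rows]

print(flags_for("alice"))  # ['read', 'delete']
print(flags_for("bob"))    # ['write']
```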

Any disadvantages to bit flags in database columns?

If you only have a handful of roles, you don't even save any storage space in PostgreSQL. An integer column uses 4 bytes, a bigint 8 bytes. Both may require alignment padding:

  • Making sense of Postgres row sizes
  • Calculating and saving space in PostgreSQL

A boolean column uses 1 byte. Effectively, you can fit four or more boolean columns for one integer column, eight or more for a bigint.

Also take into account that NULL values only use one bit (simplified) in the NULL bitmap.

Individual columns are easier to read and index. Others have commented on that already.

You could still utilize indexes on expressions or partial indexes to circumvent problems with indexes ("non-sargable"). Generalized statements like:

database cannot use indexes on a query like this

or

These conditions are non-SARGable!

are not entirely true, though they may hold for some other RDBMS lacking these features.

But why circumvent when you can avoid the problem altogether?

As you have clarified, we are talking about 6 distinct types (maybe more). Go with individual boolean columns. You'll probably even save space compared to one bigint. Space requirement seems immaterial in this case.


If these flags were mutually exclusive, you could use one column of type enum or a small look-up table and a foreign key referencing it. (Ruled out in question update.)

Most efficient way to keep status flags (under 32 items) in C#

As nobody answered my question, and I have since done some testing and research, I will answer it myself in the hope that it is useful to others:

Which is The Most Efficient way to keep status flags for this
situation?

Because the computer aligns data in memory according to the processor architecture, it is generally good advice to avoid separate boolean fields in classes, even in a high-level language like C#.

  • Bit-mask based solutions (a flags Enum, BitVector32, or manual bit-mask operations) are preferable. For two or more boolean values they reduce memory load and are fast. For a single boolean state variable they bring no benefit.

Generally, a flags Enum or BitVector32 should be almost as fast as manual bit-mask operations in C# in most cases.

  • When we need several small numeric ranges in addition to boolean values as state, BitVector32 is a helpful existing utility that keeps all of that state in one variable and saves memory.

  • We may prefer a flags Enum to make our code clearer and more maintainable.
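To make the flags-enum option concrete, here is a small sketch of the set, clear and test operations. It is written in Python with `enum.IntFlag`, which plays roughly the role of a C# `[Flags]` enum; the `Status` type, its members and the `has` helper are hypothetical names, not taken from the question.

```python
from enum import IntFlag

class Status(IntFlag):
    """Hypothetical status flags; each member occupies one bit."""
    NONE = 0
    ACTIVE = 1
    VERIFIED = 2
    LOCKED = 4

def has(value, flag):
    """True if every bit of `flag` is set in `value`."""
    return (value & flag) == flag

state = Status.ACTIVE | Status.VERIFIED  # set two flags at once
state |= Status.LOCKED                   # set one more flag
state &= ~Status.VERIFIED                # clear a flag

print(int(state))                   # 5 (ACTIVE | LOCKED)
print(has(state, Status.LOCKED))    # True
print(has(state, Status.VERIFIED))  # False
```

All state lives in a single integer, while the named members keep the call sites readable.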

As for the second part of the question:

Does it ( = the most efficient way) depend heavily on the tools
in use, such as the compiler or even the runtime (CLR)?

Partially, yes.
When we choose any of the solutions above other than manual bitwise operations, performance depends on the optimizations the compiler performs (for example, on the method calls made when using BitVector32 or Enum operations). Those optimizations speed up the code, and they appear to be common in C#. Still, for every solution other than manual bitwise operations, and with tools other than the official .NET ones, it is better to test the specific case.

Parse Enum with FlagsAttribute to a SqlParameter

I wrote the method below to get the comma-separated enum ids in order to pass them to the database.

private static string GetAlcoholStatuses(Enums.Enums.AlcoholStatus? alcoholStatus)
{
    if (alcoholStatus == null)
        return string.Empty;

    Enums.Enums.AlcoholStatus alcoholStatusValue = alcoholStatus.Value;
    string alcoholStatuses = string.Empty;

    if (alcoholStatusValue.HasFlag(Enums.Enums.AlcoholStatus.Drinker))
    {
        alcoholStatuses = string.Format("{0}{1}{2}", alcoholStatuses, (int)Enums.Enums.AlcoholStatus.Drinker, ",");
    }
    if (alcoholStatusValue.HasFlag(Enums.Enums.AlcoholStatus.NonDrinker))
    {
        alcoholStatuses = string.Format("{0}{1}{2}", alcoholStatuses, (int)Enums.Enums.AlcoholStatus.NonDrinker, ",");
    }
    if (alcoholStatusValue.HasFlag(Enums.Enums.AlcoholStatus.NotRecorded))
    {
        alcoholStatuses = string.Format("{0}{1}{2}", alcoholStatuses, (int)Enums.Enums.AlcoholStatus.NotRecorded, ",");
    }

    return alcoholStatuses;
}
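The method above repeats one branch per flag and leaves a trailing comma after the last id. The same idea can be sketched generically by looping over the enum members. This is a Python illustration, not the original code: the numeric values of `AlcoholStatus` are assumptions (the enum definition is not shown), and unlike the method above it joins the ids without a trailing comma.

```python
from enum import IntFlag

class AlcoholStatus(IntFlag):
    # Values are assumed for illustration; the original enum is not shown.
    DRINKER = 1
    NON_DRINKER = 2
    NOT_RECORDED = 4

def alcohol_status_ids(status):
    """Return the ids of all set flags as a comma-separated string."""
    if status is None:
        return ""
    # Keep every member whose bit is present in the combined value.
    ids = [str(int(flag)) for flag in AlcoholStatus if flag & status]
    return ",".join(ids)

print(alcohol_status_ids(AlcoholStatus.DRINKER | AlcoholStatus.NOT_RECORDED))  # 1,4
print(repr(alcohol_status_ids(None)))                                          # ''
```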

Join tables with enum values

Instead of hard-coding the roles, use a join.

SELECT
    *
FROM
    table2
LEFT JOIN
    table1
        ON (table2.roles & table1.RoleEnumValue) <> 0

This will give each user's roles on a separate row per role.

I recommend you stop there, as that's the correct normalised data-structure suitable for SQL and relational databases.

You can, however, collapse the results to a single row per user, with all roles in a single string.

This is usually a bad idea and will make subsequent SQL much harder...

SELECT
    table2.UserName,
    STRING_AGG(table1.Role, ', ') AS all_roles
FROM
    table2
LEFT JOIN
    table1
        ON (table2.roles & table1.RoleEnumValue) <> 0
GROUP BY
    table2.UserName
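A quick way to check the join condition is to run it against sample data. The sketch below uses Python's sqlite3 module rather than SQL Server, and the role names and user rows are made up. The bitwise AND of the user's mask with a single-bit role value is either 0 or that role's value, so the sketch tests for a nonzero result; comparing against the literal 1 would only ever match the role whose value is 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (RoleEnumValue INTEGER, Role TEXT);
    CREATE TABLE table2 (UserName TEXT, roles INTEGER);
    INSERT INTO table1 VALUES (1, 'Admin'), (2, 'Editor'), (4, 'Viewer');
    -- ann = Editor | Viewer, ben = Admin, cat has no roles.
    INSERT INTO table2 VALUES ('ann', 6), ('ben', 1), ('cat', 0);
""")

# One row per user and role; users without roles are kept by the LEFT JOIN.
per_role = conn.execute("""
    SELECT table2.UserName, table1.Role
    FROM table2
    LEFT JOIN table1
        ON (table2.roles & table1.RoleEnumValue) <> 0
    ORDER BY table2.UserName, table1.RoleEnumValue
""").fetchall()

# For contrast: comparing with 1 finds only the role stored as value 1.
equals_one = conn.execute("""
    SELECT table2.UserName, table1.Role
    FROM table2
    JOIN table1
        ON (table2.roles & table1.RoleEnumValue) = 1
    ORDER BY table2.UserName
""").fetchall()

print(per_role)    # [('ann', 'Editor'), ('ann', 'Viewer'), ('ben', 'Admin'), ('cat', None)]
print(equals_one)  # [('ben', 'Admin')]
```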

When is it better to store flags as a bitmask rather than using an associative table?

Splendid question!

Firstly, let's make some assumptions about "better".

I'm assuming you don't much care about disk space - a bitmask is efficient from a space point of view, but I'm not sure that matters much if you're using SQL Server.

I'm assuming you do care about speed. A bitmask can be very fast when using calculations - but you won't be able to use an index when querying the bitmask. This shouldn't matter all that much, but if you want to know which users have create access, your query would be something like

select * from user where (permission & CREATE) <> 0

(haven't got access to SQL Server today, on the road). That query would not be able to use an index because of the mathematical operation - so if you have a huge number of users, this would be quite painful.

I'm assuming you care about maintainability. From a maintainability point of view, the bitmask does not express the underlying problem domain as clearly as storing explicit permissions. You'd almost certainly have to synchronize the meaning of the bitmask flags across multiple components - including the database. Not impossible, but a pain in the backside.

So, unless there's another way of assessing "better", I'd say the bitmask route is not as good as storing the permissions in a normalized database structure. I don't agree that it would be "slower because you have to do a join" - unless you have a totally dysfunctional database, you won't be able to measure this (whereas querying without the benefit of an active index can become noticeably slower with even a few thousand records).
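As a rough illustration of the normalized alternative, the sketch below stores one row per user and permission and looks up everyone with create access through a plain equality filter, which is the kind of condition an index on the permission column can serve. It runs on SQLite via Python's sqlite3 module; the table, column and index names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_user (id INTEGER PRIMARY KEY, name TEXT);
    -- Association table: one row per (user, permission) pair.
    CREATE TABLE user_permission (
        user_id    INTEGER REFERENCES app_user(id),
        permission TEXT,
        PRIMARY KEY (user_id, permission)
    );
    CREATE INDEX ix_user_permission_permission
        ON user_permission (permission);
    INSERT INTO app_user VALUES (1, 'ann'), (2, 'ben'), (3, 'cat');
    INSERT INTO user_permission VALUES
        (1, 'create'), (1, 'read'), (2, 'read'), (3, 'create');
""")

# Equality on an indexed column instead of arithmetic on a bitmask.
creators = [name for (name,) in conn.execute("""
    SELECT u.name
    FROM app_user u
    JOIN user_permission p ON p.user_id = u.id
    WHERE p.permission = 'create'
    ORDER BY u.name
""")]

print(creators)  # ['ann', 'cat']
```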

Why use flags+bitmasks rather than a series of booleans?

It was traditionally a way of reducing memory usage. So, yes, it's quite obsolete in C# :-)

As a programming technique, it may be obsolete in today's systems, and you'd be quite alright to use an array of bools, but...

It is fast to compare values stored as a bitmask. Use the AND and OR logic operators and compare the resulting 2 ints.

It uses considerably less memory. Putting all 4 of your example values in a bitmask would use half a byte. An array of bools would most likely use a few bytes for the array object plus a long word for each bool. If you have to store a million values, you'll see exactly why a bitmask version is superior.

It is easier to manage: you only have to deal with a single integer value, whereas an array of bools would be stored quite differently in, say, a database.

And, because of the memory layout, it is much faster in every aspect than an array. It's nearly as fast as using a single 32-bit integer. We all know that is as fast as you can get for operations on data.

General method for constructing bitwise expressions satisfying constraints/with certain values?

The most general construction is an extended version of "minterms". Use bitwise operators to construct a predicate that is -1 iff the input matches a specific thing, AND the predicate with whatever you want the result to be, then OR all those things together. That leads to horrible expressions of course, possibly of exponential size.

Using arithmetic right shifts, you can construct a predicate p(x, c) = x == c:

p(x, c) = ~(((x ^ c) >> 31) | (-(x ^ c) >> 31))

Replace 31 by the size of an int minus one.

The only number such that it and its negation are both non-negative, is zero. So the thing inside the final complement is only zero if x ^ c == 0, which is the same as saying that x == c.
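The predicate can be checked with a short sketch. Python integers are unbounded, so the hypothetical `to_int32` helper below wraps values to signed 32-bit two's complement to mimic a 32-bit int; Python's `>>` on a negative number is already an arithmetic shift. The helper names are assumptions for this illustration.

```python
BITS = 32
MASK = (1 << BITS) - 1

def to_int32(n):
    """Wrap a Python int to a signed 32-bit two's-complement value."""
    n &= MASK
    return n - (1 << BITS) if n >> (BITS - 1) else n

def p(x, c):
    """Return -1 (all bits set) if x == c, otherwise 0."""
    d = to_int32(x ^ c)   # zero exactly when x == c
    neg = to_int32(-d)    # negation with 32-bit wraparound
    # The sign bit of d or of -d is set unless d is zero.
    return ~((d >> (BITS - 1)) | (neg >> (BITS - 1)))

print(p(7, 7))         # -1
print(p(7, 8))         # 0
print(p(-2**31, 0))    # 0 (the minimum int is its own negation, still nonzero)
```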

So in this example, you would have:

(p(a, 0x00) & p(b, 0x00)) |
(p(a, 0x10) & p(b, 0x10)) |
(p(a, 0x11) & p(b, 0x10))

Just expand it.. into something horrible.

Obviously this construction usually doesn't give you anything sensible. But it's general.

In the specific example, you could do:

f(a, b) = (p(a, 0) & p(b, 0)) | ~p(a & b, 0)

Which can be simplified a little again (obviously the xors go away if c == 0, and two complements balance each other out).
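One way to read that simplification, sketched in Python under the same 32-bit emulation assumption: with c == 0 the XOR disappears, so p(x, 0) is the complement of a "nonzero" mask, and ~p(a & b, 0) is that mask itself. The `to_int32` and `nonzero` helper names are hypothetical.

```python
BITS = 32
MASK = (1 << BITS) - 1

def to_int32(n):
    """Wrap a Python int to a signed 32-bit two's-complement value."""
    n &= MASK
    return n - (1 << BITS) if n >> (BITS - 1) else n

def nonzero(x):
    """Return -1 if x != 0, otherwise 0. This equals ~p(x, 0)."""
    x = to_int32(x)
    return (x >> (BITS - 1)) | (to_int32(-x) >> (BITS - 1))

def f(a, b):
    """-1 if a and b are both zero, or if they share a set bit; else 0."""
    # p(a, 0) & p(b, 0) becomes ~nonzero(a) & ~nonzero(b);
    # ~p(a & b, 0) becomes nonzero(a & b), the two complements cancelling.
    return (~nonzero(a) & ~nonzero(b)) | nonzero(a & b)

print(f(0, 0))  # -1
print(f(1, 0))  # 0
print(f(3, 1))  # -1
print(f(2, 1))  # 0
```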


