Except Has Similar Effect to Distinct

Except has similar effect to Distinct?

The documentation for the Except function states:

Produces the set difference of two sequences by using the default equality comparer to compare values.

The set difference of two sets is defined as the members of the first set that do not appear in the second set.

The important word here is set, which is defined as:

...an abstract data structure that can store certain values, without any particular order, and no repeated values...

Because Except is documented as a set-based operation, it also has the effect of making the resulting values distinct.

LINQ Except is not retrieving the duplicates

You can do Where and then get the Count of the output:

var result = listaA.Where(x =>!listaB.Contains(x));
var newcount=result.Count();

Alternate of except in sql

If you don't want nulls from the table TempNotRunResults then use this join:

SELECT A.ACCOUNT_NO, A.BILL_CYCLE_DATE, B.DATE_BILLED,
B.DATE_PAYMENT_DUE,B.TOTAL_BILL_AMT, B.LPC_AMT, B.BILL_FREQ, B.BILL_CYCLE_TYPE
FROM CSS_BILL_Job A LEFT JOIN TempNotRunResults B
ON A.ACCOUNT_NO = B.ACCOUNT_NO
WHERE
(A.BILL_CYCLE_DATE = 20190526 OR A.BILL_CYCLE_DATE = 20190525)
AND
B.ACCOUNT_NO IS NOT NULL

The condition B.ACCOUNT_NO IS NOT NULL fetches only matching rows, which would be done also by using an INNER join:

SELECT A.ACCOUNT_NO, A.BILL_CYCLE_DATE, B.DATE_BILLED,
B.DATE_PAYMENT_DUE,B.TOTAL_BILL_AMT, B.LPC_AMT, B.BILL_FREQ, B.BILL_CYCLE_TYPE
FROM CSS_BILL_Job A INNER JOIN TempNotRunResults B
ON A.ACCOUNT_NO = B.ACCOUNT_NO
WHERE (A.BILL_CYCLE_DATE = 20190526 OR A.BILL_CYCLE_DATE = 20190525)


But the title of your question is: "Alternate of except in sql", which is the opposite of what you say that you want to achieve: populate columns from table B in my Left join query, because if you want populated rows from table B then you need the matching rows.

LINQ Except sends first collection's items to IEqualityComparer Equals method

It's the expected behaviour; Except operates with sets (not bags) which contain uinque items only; so Except returns distinct items only:

var demo = new int[] {1, 1} 
.Except(new int[0])
.ToList();

Console.Write(string.Join(" ", demo));

Outcome:

1

In your case, Except tests items C and D (both from the 1st collection) in that very purpose: to ensure to return distinct items only:

https://referencesource.microsoft.com/#System.Core/System/Linq/Enumerable.cs,e289e6c98881b2b8

static IEnumerable<TSource> ExceptIterator<TSource>(
IEnumerable<TSource> first,
IEnumerable<TSource> second,
IEqualityComparer<TSource> comparer) {
Set<TSource> set = new Set<TSource>(comparer);

foreach (TSource element in second)
set.Add(element);

foreach (TSource element in first)
// Here Except tries adding element from first
// and have to compare if the element has been in set already.
// in your case 'D' will be tested on A, B (which are in second)
// and 'C' which has been added earlier
if (set.Add(element))
yield return element;
}

If you want "all items in first (duplicates included) except those which appear in second" you can create HashSet<T> manually and put a simple Where:

 var second = new MyType[] {A, B}; // or whatever IEnumerable<MyType>

...

// Items to exclude
HashSet<MyType> exclude = new HashSet<MyType>(second, new CustomComparer());

var result = first // {A, B, C, D, E}
.Where(item => !exclude.Contains(item)) // all items but appear in exclude - i.e. second
.ToList();

MySQL: Select rows that have only unique values except for a column

You can apply the max() aggregate to the ID column and then GROUP BY the rest:

select max(id) id, SATELLITE_ID, ATT_TYPE_ID, TIME, Roll, Pitch, yaw
from attitude
group by SATELLITE_ID, ATT_TYPE_ID, TIME, Roll, Pitch, yaw
order by id

See SQL Fiddle with Demo

Result:

| ID | SATELLITE_ID | ATT_TYPE_ID | TIME | ROLL | PITCH | YAW |
---------------------------------------------------------------
| 2 | 1 | 1 | 2012 | 1 | 2 | 1 |
| 3 | 1 | 1 | 2011 | 1 | 2 | 1 |

Using EXCEPT where 1=0

The reason that this works is due to the definition of EXCEPT which according to the MS docs is

EXCEPT returns distinct rows from the left input query that aren't
output by the right input query.

The key word here being distinct. Putting where 1 = 0 makes the second query return no results, but the EXCEPT operator itself then reduces the rows from the left query down to those which are distinct.

As @Gordon Linoff says in his answer, there is a simpler, more straightforward way to accomplish this.

The fact that the example uses the same table in the left and right queries could be misleading, the following query will accomplish the same thing, so long as the values in the right query don't exist in the left:

SELECT col1, col2
FROM @DuplicateRecordTable
EXCEPT
SELECT -1, -1

REF: https://learn.microsoft.com/en-us/sql/t-sql/language-elements/set-operators-except-and-intersect-transact-sql?view=sql-server-2017

SQL: SELECT DISTINCT not returning distinct values

In a nutshell - the way you use DISTINCT is logically wrong from SQL perspective.

Your DISTINCT is in an IN subquery in the WHERE clause - and at that point of code it has absolutely no effect (except from the performance penalty). Think on it - if the outer query returns non-unique values of dbo.ISW_LPTrans.lp_num (which obvioulsy happens) those values can still be within the distinct values of the IN subquery - the IN does not enforce a 1-to-1 match, it only enforces the fact that the outer query values are within the inner values, but they can match multiple times. So it is definitely not DISTINCT's fault.

I would go through the following check steps:

  • See if there is insufficient JOIN ON condition(s) in the outer FROM section that leads to data multiplication (e.g. if a table has primary-to-foreign key relation on several columns, but you join on one of them only etc.).
  • Check which of the sources contains non-distinct records in the outer FROM section - then either cleanse your source, or adjust the JOIN condition and / or the WHERE clause so that you only pick distinct & correct records. In fact you might need to SELECT DISTINCT in the FROM sections - there it would make much more sense.


Related Topics



Leave a reply



Submit