Split the distinct values in a list separated by a comma
You can use the pandas .str.split() method to split on commas and semicolons and put the result in a NumPy array. Then flatten the array of lists into a single list and use np.unique to get the unique values, as follows:
import numpy as np
list_all = df_new['DevType'].str.split(r'(?:,|;)\s*').dropna().to_numpy()
list_unique = np.unique(sum(list_all, []))
Result:
print(list_unique)
['Devel...' 'Developer' 'back-end' 'desktop or ente...'
'embedded applications or devices' 'front-end' 'full-stack']
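For readers who want to see the logic without pandas or NumPy, the same split, flatten and de-duplicate steps can be sketched in plain Python. The sample strings below are hypothetical stand-ins for the DevType column, not the original data, and None marks a missing entry (the role played by dropna() above).

```python
import re

# Hypothetical sample values standing in for df_new['DevType'];
# None marks a missing entry, which is skipped like dropna() does.
dev_types = [
    "Developer, back-end;Developer, front-end",
    None,
    "Developer, full-stack",
]

# Same separator pattern as in the pandas call: a comma or semicolon,
# followed by optional whitespace.
pattern = re.compile(r"(?:,|;)\s*")

# Split every non-missing value, flatten the pieces, and keep each one once.
unique_values = sorted(
    {
        part
        for value in dev_types
        if value is not None
        for part in pattern.split(value)
    }
)

print(unique_values)
```

Sorting puts capitalized entries first because uppercase letters sort before lowercase ones, which is also the order np.unique produces in the result above.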
Get unique values in column separated by comma
You can achieve this in SQL Server with a correlated subquery and FOR XML PATH:
SELECT DISTINCT C2.Refnumber,
SUBSTRING(
(
SELECT ', ' + CAST(C1.Menu_id AS VARCHAR(20))
FROM TempTable C1
WHERE C1.Refnumber = C2.Refnumber AND C1.Status = 'True'
FOR XML PATH ('')
), 3, 1000) AS "MenuIdList"
FROM TempTable C2
Output
Refnumber MenuIdList
12121313 1, 4
28273882 1, 2, 3
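The grouping that the query performs can be illustrated outside the database with a short Python sketch. The rows below are hypothetical, chosen only so that the result matches the output shown above; the real contents of TempTable are not given.

```python
from collections import defaultdict

# Hypothetical (Refnumber, Menu_id, Status) rows; assumed sample data,
# not the actual TempTable contents.
rows = [
    (12121313, 1, "True"),
    (12121313, 2, "False"),
    (12121313, 4, "True"),
    (28273882, 1, "True"),
    (28273882, 2, "True"),
    (28273882, 3, "True"),
]

# Collect the Menu_id values per Refnumber, keeping only rows whose
# Status is 'True' (the WHERE clause of the inner SELECT).
menu_ids = defaultdict(list)
for refnumber, menu_id, status in rows:
    if status == "True":
        menu_ids[refnumber].append(str(menu_id))

# Join each group with ', ' (the role of FOR XML PATH plus SUBSTRING).
menu_id_list = {ref: ", ".join(ids) for ref, ids in menu_ids.items()}

for ref, ids in menu_id_list.items():
    print(ref, ids)
```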
Find all unique values in column separated by comma
You're almost there; you just need to call unlist before unique:
all_observers <- unique(unlist(strsplit(as.character(data$observer), ",")))
How to find the number of unique values in comma separated strings stored in a pandas data frame column?
You can use a list comprehension with a set:
df['Unique_in_x'] = [len(set(x.split(','))) for x in df['x']]
Or use str.split with expand=True and nunique:
df['Unique_in_x'] = df['x'].str.split(',', expand=True).nunique(1)
Output:
x Unique_in_x
0 5,5,6,7,8,6,8 4
1 5,9,8,0 4
2 5,9,8,0 4
3 3,2 2
4 5,5,6,7,8,6,8 4
Unique Values in a Comma Separated List / Unique Values in Multiple Columns
=ARRAYFORMULA(UNIQUE(TRIM(TRANSPOSE(SPLIT(QUERY(REGEXREPLACE(
FILTER(Data!A:A, Data!A:A<>""), "(,)( )([A-Z])", "♦$2$3")&"♦",,999^99), "♦")))))
______________________________________________________________
=ARRAYFORMULA(QUERY(TRIM(TRANSPOSE(SPLIT(QUERY(REGEXREPLACE(
FILTER(Data!A:A, Data!A:A<>""), "(,)( )([A-Z])", "♦$2$3")&"♦",,999^99), "♦"))),
"select Col1,count(Col1) group by Col1 order by count(Col1) desc label count(Col1)''"))
How can I select a list of distinct values from a field of comma separated values?
Here is a method that uses an implicit conversion "trick" in MySQL:
select distinct substring_index(t.category, ',', -n) + 0
from (select 1 as n union all select 2 union all select 3
) n cross join
t
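substring_index(t.category, ',', -n) returns the last n elements of the list as a string. Adding + 0 then converts that string to a number. The conversion stops at the first comma, so the result is the nth value from the end. This only works when the values in the list are numeric.
You can extend the n derived table up to the maximum number of elements in a list.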
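To make the two steps concrete, here is a rough Python emulation. Both helper functions are hypothetical illustrations written for this sketch: substring_index mimics the MySQL function for a non-empty delimiter, and leading_number approximates the + 0 conversion for integer prefixes only.

```python
import re


def substring_index(text: str, delim: str, count: int) -> str:
    """Rough emulation of MySQL SUBSTRING_INDEX for a non-empty delimiter."""
    parts = text.split(delim)
    if count > 0:
        return delim.join(parts[:count])   # everything before the count-th delimiter
    if count < 0:
        return delim.join(parts[count:])   # the last |count| elements
    return ""


def leading_number(text: str) -> int:
    """Approximate the '+ 0' conversion: read the leading integer, else 0."""
    match = re.match(r"\s*[+-]?\d+", text)
    return int(match.group()) if match else 0


category = "10,20,30"  # assumed sample value for t.category

# n = 1, 2, 3 as in the derived table; n = 4 exceeds the list length,
# so the whole string comes back and the first element is repeated.
values = [leading_number(substring_index(category, ",", -n)) for n in (1, 2, 3, 4)]
distinct_values = sorted(set(values))

print(values)
print(distinct_values)
```

The repeated value for n = 4 shows why the query needs distinct: rows with fewer elements than the largest n return their first element more than once.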
Create a comma separated string in cell based on unique values in 2 columns
Put the following code in a regular module:
Function TEXTJOINIFS(rng As Range, delim As String, ParamArray arr() As Variant)
    Dim rngarr As Variant
    rngarr = Intersect(rng, rng.Parent.UsedRange).Value
    Dim condArr() As Boolean
    ReDim condArr(1 To Intersect(rng, rng.Parent.UsedRange).Rows.Count) As Boolean
    Dim i As Long
    For i = LBound(arr) To UBound(arr) Step 2
        Dim colArr() As Variant
        colArr = Intersect(arr(i), arr(i).Parent.UsedRange).Value
        Dim j As Long
        For j = LBound(colArr, 1) To UBound(colArr, 1)
            If Not condArr(j) Then
                Dim charind As Long
                charind = Application.Max(InStr(arr(i + 1), ">"), InStr(arr(i + 1), "<"), InStr(arr(i + 1), "="))
                Dim opprnd As String
                If charind = 0 Then
                    opprnd = "="
                Else
                    opprnd = Left(arr(i + 1), charind)
                End If
                Dim t As String
                t = """" & colArr(j, 1) & """" & opprnd & """" & Mid(arr(i + 1), charind + 1) & """"
                If Not Application.Evaluate(t) Then condArr(j) = True
            End If
        Next j
    Next i
    For i = LBound(rngarr, 1) To UBound(rngarr, 1)
        If Not condArr(i) Then
            TEXTJOINIFS = TEXTJOINIFS & rngarr(i, 1) & delim
        End If
    Next i
    TEXTJOINIFS = Left(TEXTJOINIFS, Len(TEXTJOINIFS) - Len(delim))
End Function
Then in D2 put:
=IF(OR(A2<>A3,B2<>B3),TEXTJOINIFS(C:C,", ",A:A,A2,B:B,B2),"")
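As a cross-check of what the formula is meant to produce, the same idea (join the column C values for every distinct pair of column A and column B values) can be sketched in Python. The rows are hypothetical sample data; the original worksheet is not shown.

```python
# Hypothetical (A, B, C) rows standing in for the worksheet columns.
rows = [
    ("x", 1, "red"),
    ("x", 1, "blue"),
    ("x", 2, "green"),
    ("y", 1, "red"),
]

# Group the C values by the (A, B) pair, preserving row order.
grouped = {}
for a, b, c in rows:
    grouped.setdefault((a, b), []).append(c)

# Join each group with the same ", " delimiter passed to TEXTJOINIFS.
joined = {key: ", ".join(values) for key, values in grouped.items()}

for key, text in joined.items():
    print(key, text)
```

Unlike the worksheet formula, which writes the joined string only on the last row of each block of equal A and B values, this sketch returns one entry per pair.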