Using read.csv.sql to select multiple values from a single column

1) fn$: Substitution can be done with fn$ from gsubfn (which sqldf automatically loads). See the fn$ examples on the sqldf home page. In this case we have:

fn$read.csv.sql("mtcars.csv",
    sql = "select * from file where carb in ( `toString(cc)` )")

2) join: Another approach is to create a data.frame of the desired carb values and join against it:

Carbs <- data.frame(carb = cc)
read.csv.sql("mtcars.csv", sql = "select * from Carbs join file using (carb)")
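Both approaches boil down to splicing a list of values into a SQL IN clause. As a language-neutral illustration (not part of the sqldf answer, and using made-up sample rows), here is a Python/sqlite3 sketch that does the same filtering with bound parameters:

```python
import sqlite3

# Hypothetical stand-in for mtcars.csv (model and carb columns only).
rows = [("Mazda RX4", 4), ("Datsun 710", 1), ("Valiant", 1), ("Duster 360", 4)]
cc = [1, 4]  # the carb values wanted, like cc in the R answer

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE file (model TEXT, carb INTEGER)")
con.executemany("INSERT INTO file VALUES (?, ?)", rows)

# One placeholder per value: the same effect as splicing toString(cc)
# into IN (...), but via bound parameters instead of string substitution.
placeholders = ",".join("?" * len(cc))
result = con.execute(
    f"SELECT * FROM file WHERE carb IN ({placeholders})", cc
).fetchall()
```

The join variant corresponds to inserting cc into a second table and writing `SELECT * FROM file JOIN carbs USING (carb)` instead.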

SQL Server: convert multiple rows in a single column

Example

Declare @S varchar(max) = 'Recipe,Recipe,Recipe,Recipe
0,1,3,4
Data1,Data2,Data3,Data4'

;with cte as (
Select CN=A.RetSeq
,RN=B.RetSeq
,Value=B.RetVal
From [dbo].[tvf-Str-Parse](@S,char(13)+char(10)) A
Cross Apply [dbo].[tvf-Str-Parse](A.RetVal,',') B
)
Select Str = Stuff((Select ',' +Value From cte Where RN=A.RN Order By CN For XML Path ('')),1,1,'')
From (Select Distinct RN from cte) A
Order By A.RN

Returns

Str
Recipe,0,Data1
Recipe,1,Data2
Recipe,3,Data3
Recipe,4,Data4

The function, if you're interested:

CREATE FUNCTION [dbo].[tvf-Str-Parse] (@String varchar(max),@Delimiter varchar(10))
Returns Table
As
Return (
Select RetSeq = row_number() over (order by 1/0)
,RetVal = ltrim(rtrim(B.i.value('(./text())[1]', 'varchar(max)')))
From (Select x = Cast('<x>' + replace((Select replace(@String,@Delimiter,'§§Split§§') as [*] For XML Path('')),'§§Split§§','</x><x>')+'</x>' as xml).query('.')) as A
Cross Apply x.nodes('x') AS B(i)
);

EDIT - OPTION WITHOUT FUNCTION

Declare @S varchar(max) = 'Recipe,Recipe,Recipe,Recipe
0,1,3,4
Data1,Data2,Data3,Data4'

;with cte as (
Select CN=A.RetSeq
,RN=B.RetSeq
,Value=B.RetVal
From (
Select RetSeq = row_number() over (order by 1/0)
,RetVal = ltrim(rtrim(B.i.value('(./text())[1]', 'varchar(max)')))
From (Select x = Cast('<x>' + replace((Select replace(@S,char(13)+char(10),'§§Split§§') as [*] For XML Path('')),'§§Split§§','</x><x>')+'</x>' as xml).query('.')) as A
Cross Apply x.nodes('x') AS B(i)
) A
Cross Apply (
Select RetSeq = row_number() over (order by 1/0)
,RetVal = ltrim(rtrim(B.i.value('(./text())[1]', 'varchar(max)')))
From (Select x = Cast('<x>' + replace((Select replace(A.RetVal,',','§§Split§§') as [*] For XML Path('')),'§§Split§§','</x><x>')+'</x>' as xml).query('.')) as A
Cross Apply x.nodes('x') AS B(i)
) B
)
Select Str = Stuff((Select ',' +Value From cte Where RN=A.RN Order By CN For XML Path ('')),1,1,'')
From (Select Distinct RN from cte) A
Order By RN

EDIT - JSON OPTION, correcting for double quotes

Declare @S varchar(max) = 'Recipe,Recipe,Recipe,Recipe
1,,3,4
Data1,Data2,Data"3,Data4'

;with cte as (
Select CN = A.[key]
,RN = B.[Key]
,Value = replace(B.Value,'||','"')
From OpenJSON('["'+replace(replace(@S,'"','||'),char(13)+char(10),'","')+'"]') A
Cross Apply (
Select *
From OpenJSON('["'+replace(A.Value,',','","')+'"]')
) B
)
Select Str = Stuff((Select ',' +Value From cte Where RN=A.RN Order By CN For XML Path ('')),1,1,'')
From (Select Distinct RN from cte) A
Order By RN

Returns

Str
Recipe,1,Data1
Recipe,,Data2 -- null (2 is missing)
Recipe,3,Data"3 -- has double quote
Recipe,4,Data4
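Stripped of the SQL machinery, all three variants perform the same transformation: a transpose. Split the string into lines, split each line on commas, and pair up the i-th element of every line. A minimal Python sketch of the same idea (illustration only, not part of the answer):

```python
# The input string from the example, without the embedded quote.
s = "Recipe,Recipe,Recipe,Recipe\n0,1,3,4\nData1,Data2,Data3,Data4"

# Split into lines, then into columns per line.
rows = [line.split(",") for line in s.split("\n")]

# zip(*rows) pairs up position i of every line: the transpose.
transposed = [",".join(col) for col in zip(*rows)]
```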

MySQL - Import a csv file with multiple fields into a single field using LOAD DATA LOCAL INFILE

You can import your data into an intermediary table first and then split it into rows.

Let's say you have your whole csv line in a table named import:

line
11111111,22222222,33333333
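The answer stops short of showing the split itself. The core idea (turning the imported line into one row per value) can be sketched in Python; in MySQL you could do the equivalent with SUBSTRING_INDEX in a loop or a recursive CTE, but that goes beyond what the answer shows:

```python
# The line stored in the intermediary `import` table, as in the answer.
line = "11111111,22222222,33333333"

# Split on commas; each value would become its own row in the target table.
rows = [(value,) for value in line.split(",")]
```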

How to split a single column (with multiple values) into multiple values through PowerShell

This seems similar to this answer, using the same technique of mixing string manipulation with the native Csv cmdlets.

$CSVpath = "c:\temp\test1.csv"
$Data = Get-Content $CSVpath
$Headers = $Data[0] -split ','

$Data = ($Data | Select-Object -Skip 1) -replace '\"|,'

$Data | ConvertFrom-Csv -Header $Headers -Delimiter ";"

Bring in the file and use the first line to determine the headers. Then convert the remaining lines from CSV to PSObjects using the ConvertFrom-Csv cmdlet with the -Header and -Delimiter parameters, also replacing what looks like superfluous quoting.

That will result in objects that look like:

H1 h2 H3 H4  H5 H6 H7                 H8 H9 H10
-- -- -- -- -- -- -- -- -- ---
A B C D E F G,,,,,,,,,,,,,,,,,
H I J K L M N O P Q
A B C D E F G,,,,,,,,H I J K
10 a C D,H I J K L M N

However, it's unclear what's to be done with the commas. I would suggest the data be cleaned up beforehand.

Replacing the commas as in the previous answer and also overwriting the original file:

$CSVpath = "c:\temp\test1.csv"
$Data = Get-Content $CSVpath
$Headers = $Data[0] -split ','

$Data = ($Data | Select-Object -Skip 1) -replace '\"|,'

$Data | ConvertFrom-Csv -Header $Headers -Delimiter ";" |
Export-Csv -Path $CSVpath -Delimiter ";" -NoTypeInformation

Notwithstanding the export, this should look like:

H1 h2 H3 H4 H5 H6 H7 H8 H9 H10
-- -- -- -- -- -- -- -- -- ---
A B C D E F G
H I J K L M N O P Q
A B C D E F GH I J K
10 a C DH I J K L M N

Note: Both tables were output using Format-Table which by default truncates some of the columns. Use Format-Table * -AutoSize to see the full table.

All of this can be shortened to:

$CSVpath = "c:\temp\test1.csv"
$Data = Get-Content $CSVpath
$Headers = $Data[0] -Split ','

($Data | Select-Object -Skip 1) -replace '\"|,' |
ConvertFrom-Csv -Header $Headers -Delimiter ";" |
Export-Csv -Path $CSVpath -Delimiter ";" -NoTypeInformation

# Trim superfluous ";" left behind by empty fields.
(Get-Content $CSVpath).TrimEnd(';') | Set-Content $CSVpath

Note: For a true Csv file it's unnecessary to remove trailing ';' characters; they are the natural representation of properties that don't have a value. At any rate, and per request, I added some code to remove them.

Read certain columns if exist using read.csv.sql from sqldf

This finds out which columns are available, intersects their names with the names of the columns that are wanted and only reads those.

library(sqldf)

nms_wanted <- c("locID", "City", "CRESTA", "Latitude", "Longitude")
nms_avail <- names(read.csv("data.csv", nrows = 0))
nms <- intersect(nms_avail, nms_wanted)
fn$read.csv.sql("data.csv", "select `toString(nms)` from file")

SQL : Output multiple left joined rows as one record for csv file

If you have a fixed number of pieces of equipment per stock (I assume 3 here), you can do a pivot using GROUP BY.

SQL DEMO

WITH tmpResult as (
SELECT
[Stock ID],
'Equipment ID ' + CAST(rn AS VARCHAR(16)) as lblEq,
[Equipment ID],
[Equipment Qty]
FROM ( SELECT *, row_number() over (partition by [Stock ID] ORDER BY [Equipment ID]) as rn
FROM LinkStockEquipment ) as rows
)
SELECT [Stock ID],
MAX( CASE WHEN lblEq = 'Equipment ID 1' THEN [Equipment ID] END) as [Eqp ID 1],
MAX( CASE WHEN lblEq = 'Equipment ID 1' THEN [Equipment Qty] END) as [Eqp Qty 1],
MAX( CASE WHEN lblEq = 'Equipment ID 2' THEN [Equipment ID] END) as [Eqp ID 2],
MAX( CASE WHEN lblEq = 'Equipment ID 2' THEN [Equipment Qty] END) as [Eqp Qty 2],
MAX( CASE WHEN lblEq = 'Equipment ID 3' THEN [Equipment ID] END) as [Eqp ID 3],
MAX( CASE WHEN lblEq = 'Equipment ID 3' THEN [Equipment Qty] END) as [Eqp Qty 3]
FROM tmpResult
GROUP BY [Stock ID];
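The ROW_NUMBER plus MAX(CASE ...) pattern can be mimicked in plain code, which may make the mechanics clearer. This is an illustrative Python sketch only; the column names follow the answer, but the sample rows are made up:

```python
from collections import defaultdict

# Hypothetical LinkStockEquipment rows: (stock_id, equipment_id, qty).
rows = [(1, "EQ-A", 2), (1, "EQ-B", 5), (2, "EQ-C", 1)]

# Number equipment within each stock (the ROW_NUMBER ... PARTITION BY step).
by_stock = defaultdict(list)
for stock_id, eq_id, qty in sorted(rows, key=lambda r: (r[0], r[1])):
    by_stock[stock_id].append((eq_id, qty))

# Spread each stock's equipment across columns (the MAX(CASE ...) step).
pivoted = {
    stock_id: {f"Eqp ID {i}": eq for i, (eq, _) in enumerate(eqs, 1)}
            | {f"Eqp Qty {i}": q for i, (_, q) in enumerate(eqs, 1)}
    for stock_id, eqs in by_stock.items()
}
```

Each key of the inner dict corresponds to one output column of the GROUP BY query above.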

OUTPUT

Sample Image

Now, if you want to use PIVOT, the important part is the data preparation. In this case the qty has to be converted to a string. Again, you need to know how many fields you want to pivot:

SQL DEMO

WITH tmpResult as (
SELECT
[Stock ID],
'Eqp ID ' + CAST(rn AS VARCHAR(16)) as label,
[Equipment ID] as [Value]
FROM ( SELECT *, row_number() over (partition by [Stock ID] ORDER BY [Equipment ID]) as rn
FROM LinkStockEquipment ) as rows

UNION ALL

SELECT
[Stock ID],
'Eqp Qty ' + CAST(rn AS VARCHAR(16)) as label,
CAST([Equipment Qty] AS VARCHAR(16)) as [Value]
FROM ( SELECT *, row_number() over (partition by [Stock ID] ORDER BY [Equipment ID]) as rn
FROM LinkStockEquipment ) as rows

)
SELECT [Stock ID],
[Eqp ID 1], [Eqp Qty 1],
[Eqp ID 2], [Eqp Qty 2]
FROM ( SELECT * FROM tmpResult ) as x
PIVOT (
max( [Value] )
for label in ( [Eqp ID 1], [Eqp Qty 1], [Eqp ID 2], [Eqp Qty 2] )
) as pvt

OUTPUT

Sample Image

Now, if you don't know how many pieces of equipment you have, you need a dynamic PIVOT.

SQL DEMO

First you need a temporary table.

SELECT 
[Stock ID],
[label],
[Value]
INTO tmpResult
FROM (
SELECT
[Stock ID],
'Eqp ID ' + CAST(rn AS VARCHAR(16)) as label,
[Equipment ID] as [Value]
FROM ( SELECT *, row_number() over (partition by [Stock ID] ORDER BY [Equipment ID]) as rn
FROM LinkStockEquipment ) as rows

UNION ALL

SELECT
[Stock ID],
'Eqp Qty ' + CAST(rn AS VARCHAR(16)) as label,
CAST([Equipment Qty] AS VARCHAR(16)) as [Value]
FROM ( SELECT *, row_number() over (partition by [Stock ID] ORDER BY [Equipment ID]) as rn
FROM LinkStockEquipment ) as rows
) as x;

Then you need to prepare the pivot query:

DECLARE @cols AS NVARCHAR(MAX),
@query AS NVARCHAR(MAX);

SET @cols = STUFF((SELECT distinct ',' + QUOTENAME(c.label)
FROM tmpResult c
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'');

SELECT @cols;

set @query = 'SELECT [Stock ID], ' + @cols + ' FROM
(
SELECT *
FROM tmpResult
) x
pivot
(
max(Value)
for label in (' + @cols + ')
) p '

execute(@query);

OUTPUT

Here the problem is the column order. I will see if I can fix it.

Sample Image

How do I extract multiple values out of a single cell from csv to xml using python

You need to add an inner loop to handle one or more results from the split operation. The error comes from trying to concatenate a string (the XML tag) and a list (the result of split).

distnumbers = '\n'.join([f'<distnumber>{item}</distnumber>'
for item in row[0].split('~')])

As an aside, this seems like an unpythonic and fragile way to build XML. I would build up a Python data structure and then convert it to XML in one pass, using a library such as xml.etree.ElementTree (standard library), dicttoxml, or a custom function.
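For example, with the standard library's xml.etree.ElementTree (a sketch assuming a '~'-separated cell as in the question; the <record> wrapper element is my own invention):

```python
import xml.etree.ElementTree as ET

# Hypothetical csv cell containing multiple '~'-separated values.
cell = "123~456~789"

# Build the tree, one <distnumber> child per split value.
root = ET.Element("record")
for item in cell.split("~"):
    ET.SubElement(root, "distnumber").text = item

xml_out = ET.tostring(root, encoding="unicode")
```

This avoids hand-concatenating tag strings entirely, so a list-vs-string type error cannot occur.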


