How to Skip the First Line of a CSV File and Make the Second Line the Header

I don't think there's an elegant way of doing it, but it can be done:

require "csv"

# Create a stream using the original file.
# Don't use `textmode` since it generates a problem when using this approach.
file = File.open "file.csv"

# Consume the first CSV row.
# `\r` is my row separator character. Verify your file to see if it's the same one.
loop { break if file.readchar == "\r" }

# Create your CSV object using the remainder of the stream.
csv = CSV.new file, headers: true
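
For comparison, the same idea (consume the first line from the stream, then hand the rest of the stream to the CSV parser) can be sketched with Python's standard library. This is only an illustrative sketch and not part of the original answer; the function name `read_with_second_line_header` and the sample data are made up.

```python
import csv
import io


def read_with_second_line_header(stream):
    """Discard the first line of `stream`, then parse the rest as CSV,
    treating the next line as the header row."""
    stream.readline()  # consume and throw away the first line
    return list(csv.DictReader(stream))


# Hypothetical sample: a junk first line, then the real header, then data.
sample = io.StringIO("report generated 2019\nname,qty\napple,3\npear,5\n")
rows = read_with_second_line_header(sample)
print(rows[0]["name"], rows[0]["qty"])
```

With a real file, open it with `newline=""` as the csv module documentation recommends. Like the Ruby version, this assumes the first row does not contain a quoted line break.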

How to skip the second line in a CSV file while maintaining the first line as column names with read_csv?

You can just read the file twice: once to get the column names, and then again to get the data.

library(readr)
library(dplyr)

csv_file <- "mpg,cyl,disp,hp,drat,wt
mpg,cyl,disp,hp,drat,wt
21.0,6,160,110,3.90,2.875
22.8,4,108,93,3.85,2.320
21.4,6,258,110,3.08,3.215
18.7,8,360,175,3.15,3.440
18.1,6,225,105,2.76,3.460"


df_names <- read_csv(csv_file, n_max = 0) %>% names()

df_names
#> [1] "mpg" "cyl" "disp" "hp" "drat" "wt"

df <- read_csv(csv_file, col_names = df_names, skip = 2)

df

#> # A tibble: 5 x 6
#>     mpg   cyl  disp    hp  drat    wt
#>   <dbl> <int> <int> <int> <dbl> <dbl>
#> 1  21.0     6   160   110  3.90 2.875
#> 2  22.8     4   108    93  3.85 2.320
#> 3  21.4     6   258   110  3.08 3.215
#> 4  18.7     8   360   175  3.15 3.440
#> 5  18.1     6   225   105  2.76 3.460
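
The read-twice approach is not specific to readr. As a rough sketch in Python's standard library (the helper names `read_names` and `read_data` and the shortened sample are assumptions for illustration), the first pass reads only the header row and the second pass skips two lines and applies the saved names:

```python
import csv
import io

# Hypothetical sample: header on line 1, an unwanted line 2, then data.
TEXT = "mpg,cyl,disp\nmpg,cyl,disp\n21.0,6,160\n22.8,4,108\n"


def read_names(text):
    """First pass: read only the header row."""
    return next(csv.reader(io.StringIO(text)))


def read_data(text, names, skip=2):
    """Second pass: skip the first `skip` lines, label the rest with `names`."""
    stream = io.StringIO(text)
    for _ in range(skip):
        next(stream)
    return list(csv.DictReader(stream, fieldnames=names))


names = read_names(TEXT)
data = read_data(TEXT, names)
print(names)
print(len(data), data[0]["mpg"])
```

Unlike read_csv, the csv module does no type inference here, so every value comes back as a string.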

read.csv, header on first line, skip second line

This should do the trick:

all_content = readLines("file.csv")
skip_second = all_content[-2]
dat = read.csv(textConnection(skip_second), header = TRUE, stringsAsFactors = FALSE)

The first step uses readLines to read the entire file into a character vector, where each element is one line of the file. Next, we discard the second line, using the fact that negative indexing in R means "select everything except this index". Finally, we feed the remaining lines to read.csv through a textConnection to produce a data.frame.
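
The same three steps (read all lines, drop the second one, parse what is left) can be sketched in Python's standard library. The function name `read_skipping_second_line` and the sample text are hypothetical, chosen only for this illustration.

```python
import csv


def read_skipping_second_line(text):
    """Parse CSV `text`, using line 1 as the header and ignoring line 2."""
    lines = text.splitlines(keepends=True)  # one string per line
    del lines[1:2]  # drop the second line (a no-op if there is none)
    # csv.DictReader accepts any iterable of lines, not only file objects.
    return list(csv.DictReader(lines))


# Hypothetical sample: header, a units row to discard, then data.
sample = "a,b\nunits,units\n1,2\n3,4\n"
rows = read_skipping_second_line(sample)
print(rows)
```

As with the R version, this holds the whole file in memory, which is fine for small files but may not suit very large ones.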

Not able to read csv while skipping first row and using second as header in pandas for raw tick data of symbols

In skiprows, give the number of rows you want to skip from the top of your CSV. Also read the file with the utf-16 encoding:

import pandas as pd

df = pd.read_csv(cwd + folder + name + '.csv', delimiter=';', encoding='utf-16', skiprows=1)

For more info: I checked the encoding by opening the file in LibreOffice. The dialog that appears when you open the file, where you choose the delimiter, also shows the file's encoding.
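
If LibreOffice is not at hand, one rough way to check for a Unicode encoding is to look for a byte order mark (BOM) at the start of the file. The sketch below is an assumption-laden illustration, not part of the original answer: the function name `guess_encoding_from_bom` is made up, and a file saved without a BOM yields `None`, so this cannot identify every encoding.

```python
import codecs


def guess_encoding_from_bom(first_bytes):
    """Return an encoding name if `first_bytes` starts with a known BOM,
    otherwise None."""
    # UTF-32 BOMs are tested before UTF-16 because the UTF-32-LE BOM
    # begins with the same two bytes as the UTF-16-LE BOM.
    boms = [
        (codecs.BOM_UTF32_LE, "utf-32"),
        (codecs.BOM_UTF32_BE, "utf-32"),
        (codecs.BOM_UTF8, "utf-8-sig"),
        (codecs.BOM_UTF16_LE, "utf-16"),
        (codecs.BOM_UTF16_BE, "utf-16"),
    ]
    for bom, name in boms:
        if first_bytes.startswith(bom):
            return name
    return None


# Python's "utf-16" codec writes a BOM when encoding.
sample = "a;b\n1;2\n".encode("utf-16")
print(guess_encoding_from_bom(sample[:4]))
```

For a real file, the first few bytes could be read with `open(path, "rb").read(4)` and the returned name passed as the `encoding` argument.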

How to skip the first row when reading a csv file?

For example,

package main

import (
	"bufio"
	"encoding/csv"
	"fmt"
	"io"
	"os"
)

func readSample(rs io.ReadSeeker) ([][]string, error) {
	// Skip first row (line)
	row1, err := bufio.NewReader(rs).ReadSlice('\n')
	if err != nil {
		return nil, err
	}
	_, err = rs.Seek(int64(len(row1)), io.SeekStart)
	if err != nil {
		return nil, err
	}

	// Read remaining rows
	r := csv.NewReader(rs)
	rows, err := r.ReadAll()
	if err != nil {
		return nil, err
	}
	return rows, nil
}

func main() {
	f, err := os.Open("sample.csv")
	if err != nil {
		panic(err)
	}
	defer f.Close()
	rows, err := readSample(f)
	if err != nil {
		panic(err)
	}
	fmt.Println(rows)
}

Output:

$ cat sample.csv
one,two,three,four
1,2,3
4,5,6
$ go run sample.go
[[1 2 3] [4 5 6]]
$

$ cat sample.csv
PTN Ethernet-Port RMON Performance,PORT_BW_UTILIZATION,2019-06-29 20:00:00,33366
DeviceID,DeviceName,ResourceName,CollectionTime,GranularityPeriod,PORT_RX_BW_UTILIZATION,PORT_TX_BW_UTILIZATION,RXGOODFULLFRAMESPEED,TXGOODFULLFRAMESPEED,PORT_RX_BW_UTILIZATION_MAX,PORT_TX_BW_UTILIZATION_MAX
3174659,H1095,H1095-11-ISM6-1(to ZJBSC-V1),2019-06-29 20:00:00,15,22.08,4.59,,,30.13,6.98
3174659,H1095,H1095-14-ISM6-1(to T6147-V),2019-06-29 20:00:00,15,2.11,10.92,,,4.43,22.45
$ go run sample.go
[[DeviceID DeviceName ResourceName CollectionTime GranularityPeriod PORT_RX_BW_UTILIZATION PORT_TX_BW_UTILIZATION RXGOODFULLFRAMESPEED TXGOODFULLFRAMESPEED PORT_RX_BW_UTILIZATION_MAX PORT_TX_BW_UTILIZATION_MAX] [3174659 H1095 H1095-11-ISM6-1(to ZJBSC-V1) 2019-06-29 20:00:00 15 22.08 4.59 30.13 6.98] [3174659 H1095 H1095-14-ISM6-1(to T6147-V) 2019-06-29 20:00:00 15 2.11 10.92 4.43 22.45]]
$

How to make first line of text file as header and skip second line in spark scala

scala> val ds = spark.read.textFile("data.txt")  // Spark 2.0 or later
(or)
scala> val ds = spark.sparkContext.textFile("data.txt")

scala> val schemaArr = ds.filter(x=>x.contains("time")).collect.mkString.split("\t").toList

scala> val df = ds.filter(x => !x.contains("time"))
         .map(x => {
           val cols = x.split("\t")
           (cols(0), cols(1), cols(2), cols(3), cols(4), cols(5))
         }).toDF(schemaArr: _*)

scala> df.show(false)
+------------+----+-----+----+---+--------------------------------------------+
|time        |task|event|port|cmd|args                                        |
+------------+----+-----+----+---+--------------------------------------------+
|03:27:51.199|FCPH|seq  |13  |28 |00300000,00000000,00000591,00020182,00000000|
|03:27:51.199|PORT|Rx   |11  | 0 |c0fffffd,00fffffd,0ed10335,00000001         |
|03:27:51.200|PORT|Tx   |13  |40 |02fffffd,00fffffd,0ed3ffff,14000000         |
|03:27:51.200|PORT|Rx   |13  | 0 |c0fffffd,00fffffd,0ed329ae,00000001         |
|03:27:59.377|PORT|Rx   |15  |40 |02fffffd,00fffffd,0336ffff,14000000         |
|03:27:59.377|PORT|Tx   |15  | 0 |c0fffffd,00fffffd,03360ed2,00000001         |
|03:27:59.377|FCPH|read |15  |40 |02fffffd,00fffffd,d0000000,00000000,03360ed2|
|03:27:59.377|FCPH|seq  |15  |28 |22380000,03360ed2,0000052b,0000001c,00000000|
|03:28:00.468|PORT|Rx   |13  |40 |02fffffd,00fffffd,29afffff,14000000         |
|03:28:00.468|PORT|Tx   |13  | 0 |c0fffffd,00fffffd,29af0ed5,00000001         |
+------------+----+-----+----+---+--------------------------------------------+

Please try something like the above. If you need a specific schema, apply a custom schema to the result.

Import csv via fread skipping first line and header on second line

fread is intended to be similar to read.table, read.csv, etc. so

data <- fread("C:/1.csv", skip=1, header=TRUE)

will work.

How can I skip the first line of a csv in Java?

You may want to read the first line before passing the reader to the CSVParser:

static void processFile(final File file) throws IOException {
    // try-with-resources closes the reader and the parser when done
    try (BufferedReader bufferedReader = new BufferedReader(new FileReader(file))) {
        bufferedReader.readLine(); // consume and discard the first line
        final CSVFormat format = CSVFormat.DEFAULT.withDelimiter(';');
        try (CSVParser parser = new CSVParser(bufferedReader, format)) {
            final List<CSVRecord> records = parser.getRecords();
            // stuff
        }
    }
}

