How to determine the delimiter in a CSV file
univocity-parsers supports automatic detection of the delimiter (as well as line endings and quotes). Use it rather than writing your own detection code:
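import com.univocity.parsers.csv.CsvFormat;
import com.univocity.parsers.csv.CsvParser;
import com.univocity.parsers.csv.CsvParserSettings;
import java.io.File;
import java.util.List;
// let the parser detect the delimiter, line endings and quotes
CsvParserSettings settings = new CsvParserSettings();
settings.detectFormatAutomatically();
CsvParser parser = new CsvParser(settings);
List<String[]> rows = parser.parseAll(new File("/path/to/your.csv"));
// if you want to see what it detected
CsvFormat format = parser.getDetectedFormat();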
Disclaimer: I'm the author of this library, and I made sure all sorts of corner cases are covered. It's open source and free (Apache 2.0 license).
Hope this helps.
How to determine a file is tab delimited in PowerShell?
Another approach would be to use Select-String to check the first line for a tab character and set the delimiter accordingly.
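# Read only the first line and test whether it contains a tab
if (Get-Content $csvfile -First 1 | Select-String -Pattern "`t")
{
    $delim = "`t"
}
else
{
    # No tab found, so assume a comma-separated file
    $delim = ','
}
Import-Csv $csvfile -Delimiter $delim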
How to check if CSV file has a comma or a semicolon as separator?
Here are a few approaches, assuming the files differ only in whether the separator is a semicolon and the decimal mark is a comma, or the separator is a comma and the decimal mark is a point.
1) fread As mentioned in the comments, fread in the data.table package automatically detects the separator (for common separators) and then reads the file using the separator it detected. It can also handle certain other format variations, such as automatically detecting whether the file has a header.
2) grepl Look at the first line to see whether it contains a semicolon, then re-read the file with the matching function:
L <- readLines("myfile", n = 1)
if (grepl(";", L)) read.csv2("myfile") else read.csv("myfile")
3) count.fields Assume a semicolon separator and count the fields in the first line. If there is only one field, the file is comma-separated; otherwise it is semicolon-separated.
L <- readLines("myfile", n = 1)
numfields <- count.fields(textConnection(L), sep = ";")
if (numfields == 1) read.csv("myfile") else read.csv2("myfile")
Update Added (3) and made improvements to all three.
How should I detect which delimiter is used in a text file?
You could show them the results in a preview window, similar to the way Excel does it. It's pretty clear when the wrong delimiter is being used in that case. You could then allow them to select from a range of delimiters and have the preview update in real time.
Then you could just make a simple guess as to the delimiter to start with (e.g. does a comma or a tab come first).
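A minimal sketch of that initial guess and preview in Python, assuming the candidate delimiters are a comma and a tab and that the first line is representative of the file. The names `guess_delimiter` and `preview` are hypothetical, chosen only for illustration, and the user would still confirm or override the guess in the preview.

```python
import csv
import io

def guess_delimiter(first_line, candidates=(",", "\t"), default=","):
    """Return the candidate delimiter that appears earliest in first_line."""
    positions = {d: first_line.find(d) for d in candidates}
    found = {d: pos for d, pos in positions.items() if pos != -1}
    if not found:
        return default  # no candidate present; fall back to the default
    return min(found, key=found.get)

def preview(text, delimiter, max_rows=5):
    """Parse at most max_rows rows of text using the given delimiter."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    rows = []
    for row in reader:
        rows.append(row)
        if len(rows) >= max_rows:
            break
    return rows

# Hypothetical sample: tab-delimited, with commas inside one field
sample = "name\tcity, state\nAda\tLondon, UK\n"
delim = guess_delimiter(sample.splitlines()[0])
rows = preview(sample, delim)
```

Calling `preview` again with a different delimiter is what a real-time update would do when the user changes the selection.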