How to Read from Text File Line-By-Line and Split the Line by a Character

Read text files and separate to lines by character

Are you sure you want a new line before every 12?

I see that it comes from a date, but if you look at:

12/13/18, 14:06 - her:IMG-20181213-WA0005.jpg

there is a 12 in the text where you might not want a new line.

The solution might be robust if you break before a date:

with open('file.txt', 'r') as file: # the file you want to read
with open('new_file.txt', 'w') as new_file: # the file you want to save
for line in file.readlines():

new_line = re.sub('(\d+/\d+/\d+)', '\n\g<1>', line).strip()) # find all dates

new_file.write('{}\n'.format(new_line)) # save to new file

How to read from text file line-by-line and split the line by a character?

while IFS=';' read -r fullName userName password; do
useradd ... # $fullName, $userName, and $password are available
done < users.txt

Python read text file and split on control character

Assuming that control-A is character "\x01" (ASCII code 1):

>>> line="field1\x01field2\x01field3\x01field4"
>>> line.split("\x01")
['field1', 'field2', 'field3', 'field4']

If you want to use the "\u0001" notation, you need the 'u' prefix (Python 2):

>>> line.split(u"\u0001")
[u'field1', u'field2', u'field3', u'field4']

How do you read a text file, line by line, and separate contents of each line in Java?

Use BufferedReader to read line, and then split on whitespace.

while((line=br.readLine())!=null){
String[] arry = line.split("\\s+");
inType2 = arry[0];
inAmount2 = arry[1];
}

Reading/splitting different parts from different lines in a text file in Java

Each data record in a text file always has a Start and an End. The easiest records are obviously those that are contained on a single delimited line within the text file, where each file line is in fact a record as you can see within a typical CSV format data file. The harder records to read are the Multi-Line records whereas each data record consists of several sequential text file lines but still, there is a Start and a End to each record.

The Start of a record is usually pretty easy to distinguish. For example, in the file example you provided in your post it is obviously any file line that starts with Student Name:.

The End of a record may not always be so easy to determine since many applications do not save fields which contain no data value in order to help increase access speed and reduce file bloat. The thought is "why have a text file full of empty fields" and to be honest, rightly so. I'm not a big fan of text file records anyways since utilizing a database would make far better sense for larger amounts of data. In any case, there will always be a file line that will indicate the Start of a record so it would make sense to read from Start to Start of the next record or in the case of the last record in file, from Start to End Of File (EOF).

Here is an example (read the comments in code):

// System line separator to use in files.
String ls = System.lineSeparator();
/* Array will hold student data: Student Name, Student ID, College,
Credits Attempted, Credits Earned, and finally Grade Points. */
String[] studentData = new String[6];
// String Array to hold Course Table Header Names.
String[] coursesHeader = {"COURSE NO", "COURSE TITLE", "CREDITS", "GRADE"};
// List Interface to hold all the course Data line Arrays for each record
java.util.List<String[]> cousesList = new java.util.ArrayList<>();
// Underlines to be used for Console display and file records
// Under courses Header
String underline1 = "-------------------------------------------------------------";
// Under all the courses
String underline2 = "------------------------------------------------------------------------------------";

/* Read and Write to files using 'Try With Resources' so to
automatically close the reader an writer objects. */
try (Scanner reader = new Scanner(new java.io.File("StudentData.txt"), "UTF-8");
java.io.Writer writer = new java.io.FileWriter("StudentsGPA.txt")) {
// For console display only! [Anything (except caught errors) to Console can be deleted]
System.out.println("The 'StudentsGPA.txt' file will contain:");
System.out.println("======================================");
System.out.println();

// Will hold each line read from the reader
String line = "";
/* Will hold the name for the next record. This would be the record START
but only AFTER the first record has been read. */
String newName = "";

// Start reading the 'StudentData.txt' file (line by line)...
while (reader.hasNextLine()) {
/* If newName is empty then we're on our first record or
there is only one record in file. */
if (newName.isEmpty()) {
line = reader.nextLine(); // read in a file line...
}
else {
/* newName contains a name so we must have bumped into
the START of a new record during processing of the
previous record. We aleady now have the first line
of this new record (which is the student's name line)
currently held in the 'newName' variable so we just
make 'line' equal what is in the 'newName' variable
and carry on processing the data as normal. in essance,
we simply skipped a read because we've already read it
earlier when processing the previous record. */
line = newName;
// Clear this variable in preparation for another record START.
newName = "";
}
/* Skip comment lines (lines that start with a semicolon (;)
or a hash mark (#). Also skip any blank lines. */
if (line.startsWith(";") || line.startsWith("#") || line.isEmpty()) {
continue;
}
/* Does this file line start with 'Student Name:'? If so then
this is a record START, let's process this record. If not
then just keep reading the file. */
if (line.startsWith("Student Name:")) {
/* Let's put the student name into the studentData array at
index 0. If it is detected that there has been no name
applied for some reason then we place "N/A" as the name.
We use a Ternary Operator for this. So, "N/A" will be a
Default if there is not name. This will be typical for
the other portions of student data. */
studentData[0] = line.isEmpty() ? "N/A" : line.split("\\s*:\\s*")[1].trim();
/* Let's keep reading the file from this point on and retrieve
the other bits of student data to fill the studentData[]
Array... */
for (int i = 1; i < 6; i++) {
line = reader.nextLine().trim();
/* If we encounter a comment line or a blank line then let's
just skip past it. We don't want these. */
if (line.startsWith(";") || line.startsWith("#") || line.isEmpty()) {
i--;
continue;
}
/* Store the other portions of student data into the
studentData Array using "N/A" as a default should
any student data field contain nothing. */
studentData[i] = line.isEmpty() ? "N/A" : line.split("\\s*:\\s*")[1].trim();
}
// The current Student's Courses...
/* Clear the List Interface object in preparation for new
Courses from this particular record. */
cousesList.clear();
// Read past the courses header line...We don't want it.
reader.nextLine();
// Get the courses data (line by line)...
while (reader.hasNextLine()) {
line = reader.nextLine().trim();
/* Again, if we encounter a comment line or a blank line
in this section then let's just skip past it. We don't
want these. */
if (line.startsWith(";") || line.startsWith("#") || line.isEmpty()) {
continue;
}
/* At this point, if we have read in a line that starts
with 'Student Name:' then we just hit the START of a
NEW Record! This then means that the record we're
currently working on is now finished. Let's store this
file line into the 'newRecord' variable and then break
out of this current record read. */
if (line.startsWith("Student Name:")) {
newName = line;
break;
}
/* Well, we haven't reached the START of a New Record yet
so let's keep creating the courses list (line by line).
Break the read in course line into a String[] array.
We use the String#split() method for this with a small
Regular Expression (regex) to split each line based on
comma delimiters no matter how the delimiter spacing
might be (ex: "," " ," " , " or even " , "). */
String[] coursesData = line.split("\\s*,\\s*");
/* Add this above newly created coursesData string array
to the list. */
cousesList.add(coursesData);
}
/* Write (append) this current record to new file. The String#format()
method is used here to save the desired data into the 'StudentGPA.txt'
file in a table style format. */
// Student Data...
writer.append(String.format("%-12s: %-25s", "ID", studentData[1])).append(ls);
writer.append(String.format("%-12s: %-25s", "Name", studentData[0])).append(ls);
writer.append(String.format("%-12s: %-25s", "College", studentData[2])).append(ls);
// Student Courses...
// The Header line
writer.append(String.format("%-13s %-30s %-10s %-4s", coursesHeader[0],
coursesHeader[1], coursesHeader[2], coursesHeader[3])).append(ls);
// Apply an Underline (underline1) under the header.
writer.append(underline1).append(ls);
// Write the Courses data in a table style format to make the Header format.
for (String[] cData : cousesList) {
writer.append(String.format("%-13s %-33s %-9s %-4s",
cData[0], cData[1], cData[2], cData[3])).append(ls);
}
// Apply an Underline (underline2) under the Courses table.
writer.append(underline2).append(ls);

// Display In Console Window (you can delete this if you want)...
System.out.println(String.format("%-12s: %-25s", "ID", studentData[1]));
System.out.println(String.format("%-12s: %-25s", "Name", studentData[0]));
System.out.println(String.format("%-12s: %-25s", "College", studentData[2]));
System.out.println(String.format("%-13s %-30s %-10s %-4s", coursesHeader[0],
coursesHeader[1], coursesHeader[2], coursesHeader[3]));
System.out.println(underline1);
for (String[] cData : cousesList) {
System.out.println(String.format("%-13s %-33s %-9s %-4s",
cData[0], cData[1], cData[2], cData[3]));
}
System.out.println(underline2);

// The LAST line of each record, the Credits...
// YOU DO THE CALCULATIONS FOR: totalAttemped, semGPA, and cumGPA
String creditsAttempted = studentData[3];
String creditsEarned = studentData[4];
int credAttempted = 0;
int credEarned = 0;
int totalAttempted = 0;
double semGPA = 0.0d;
double cumGPA = 0.0d;
/* Make sure the 'credits attemted' numerical value is in fact
a string representaion of an integer value. if it is then
convert that string numerical value to integer. */
if (creditsAttempted.matches("\\d+")) {
credAttempted = Integer.valueOf(creditsAttempted);
}
/* Make sure the 'credits earned' numerical value is in fact
a string representaion of an integer value. if it is then
convert that string numerical value to integer. */
if (creditsEarned.matches("\\d+")) {
credEarned = Integer.valueOf(creditsEarned);
}
// Build the last record line (the Credits string) with the acquired data.
String creditsString = new StringBuilder("CREDITS: TOTAL.ATTEMPTED ")
.append(totalAttempted).append("? EARNED ").append(credEarned)
.append(" ATTEMPTED ").append(credAttempted).append(" SEM GPA ")
.append(semGPA).append("? CUM GPA ").append(cumGPA).append("?")
.toString();
// Display it to the console Window (you can delete this).
System.out.println(creditsString);
System.out.println();

// Write the built 'credit string' to file which finishes this record.
writer.append(creditsString).append(ls);
writer.append(ls); // Blank Line in preparation for next record.
writer.flush(); // Flush the data buffer - write record to disk NOW.
}
}
}
// Trap Errors...Do whatever you want with these.
catch (FileNotFoundException ex) {
System.err.println("File Not Found!\n" + ex.getMessage());
}
catch (IOException ex) {
System.err.println("IO Error Encountered!\n" + ex.getMessage());
}

Yes, it looks long but if you get rid of all the comments you can see that it really isn't. Don't be afraid to experiment with the code. Make it do what you want.


EDIT: (as per comments)

To place the student info portion of each record into an ArrayList so that you can parse it the way you want:

Where the forloop is located within the example code above for gathering the student info, just change this loop to this code and parse the data the way you want:

// Place this ArrayList declaration underneath the 'underline2' variable declaration:
java.util.ArrayList<String> studentInfo = new java.util.ArrayList<>();

then:

if (line.startsWith("Student Name:")) {
studentInfo.clear();
studentInfo.add(line);
/* Let's keep reading the file from this point on and retrieve
the other bits of student data to fill the studentData[]
Array... */
for (int i = 1; i < 6; i++) {
line = reader.nextLine().trim();
/* If we encounter a comment line or a blank line then let's
just skip past it. We don't want these. */
if (line.startsWith(";") || line.startsWith("#") || line.isEmpty()) {
i--;
continue;
}
studentInfo.add(line);
}

// .................................................
// .... The rest of the code for this `if` block ...
// .................................................
}

You will of course need to change the code after this loop to properly represent this ArrayList.

Split text file into lines by key word python

When you get the content of a .txt file like this...

with open("file.txt", 'r') as file:
content = file.read()

...you have it as a string, so you can split it with the function str.split():

content = content.split(my_keyword)

You can do it with a function:

def splitter(path: str, keyword: str) -> str:
with open(path, 'r') as file:
content = file.read()
return content.split(keyword)

that you can call this way:

>>> splitter("file.txt", "data")
["I really like to write the word ", ", because I think it has a lot of meaning."]


Related Topics



Leave a reply



Submit