Parse Key Value Pairs in a Text File

I suggest storing the values in a dictionary instead of in separate local variables:

myvars = {}
with open("namelist.txt") as myfile:
    for line in myfile:
        name, var = line.partition("=")[::2]
        myvars[name.strip()] = float(var)

Now access them as myvars["var1"]. If the names are all valid Python identifiers, you can add this below (note that the base classes must be given as a tuple):

names = type("Names", (object,), myvars)

and access the values as e.g. names.var1.
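
A minimal sketch of the same idea that also skips blank lines and comment lines. The `#` comment convention is an assumption about the file, the helper name `parse_namelist` is hypothetical, and `types.SimpleNamespace` is swapped in here as an alternative to the `type(...)` call for attribute access:

```python
from types import SimpleNamespace


def parse_namelist(lines):
    """Parse 'name = value' lines into a dict of floats.

    Blank lines and lines starting with '#' are skipped (an assumed
    convention, not something the file format guarantees).
    """
    values = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, var = line.partition("=")
        if not sep:
            raise ValueError(f"no '=' in line: {line!r}")
        values[name.strip()] = float(var)
    return values


# In real use, pass an open file object instead of this sample list.
sample = ["var1 = 1.5", "", "# a comment", "var2=3"]
myvars = parse_namelist(sample)
names = SimpleNamespace(**myvars)  # attribute access: names.var1
```

Raising on a line without `=` is one possible choice; silently skipping such lines would work too, depending on how strict the file format is meant to be.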

How to parse a file in a key=value format

Something like this:

myObject = {}
with open("something.ini") as f:
    for line in f:
        # split only on the first "=" so values may contain "=" themselves
        key, value = line.rstrip("\n").split("=", 1)
        myObject[key] = value

Note that, as @Goodies mentioned below, if you assign to the same key multiple times, this will just take the last value. It is, however, trivial to add some error handling:

myObject = {}
with open("something.ini") as f:
    for line in f:
        key, value = line.rstrip("\n").split("=", 1)
        if key not in myObject:
            myObject[key] = value
        else:
            print("Duplicate assignment of key '%s'" % key)
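
If you would rather collect the duplicates than print them, something like the following may work. It is only a sketch: the function name `parse_pairs` and the choice to keep the first value for a repeated key are assumptions, not part of the original answer.

```python
def parse_pairs(lines):
    """Return (pairs, duplicates) for 'key=value' lines.

    The first value seen for a key is kept; later assignments to the
    same key are recorded in `duplicates` instead of overwriting it.
    """
    pairs = {}
    duplicates = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # ignore empty lines
        key, _, value = line.partition("=")  # splits on the first "=" only
        if key in pairs:
            duplicates.append(key)
        else:
            pairs[key] = value
    return pairs, duplicates


pairs, duplicates = parse_pairs(["a=1\n", "b=x=y\n", "a=2\n"])
```

The caller can then decide whether a non-empty `duplicates` list is an error or just a warning.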

Extracting value from a key value pair in a .txt file using python

open('filename.txt').readline().split(':')[1].strip()
  • open('filename.txt') opens the file
  • .readline() reads the first line
  • .split(':')[1] splits the line on : and takes the second element (the value)
  • .strip() removes leading and trailing whitespace from the value
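
The one-liner only looks at the first line. If the key you want may be on any line, a small loop along these lines could be used instead. This is a sketch: `find_value` and the sample data are hypothetical.

```python
def find_value(lines, wanted_key, sep=":"):
    """Return the stripped value for wanted_key, or None if it is absent."""
    for line in lines:
        key, found, value = line.partition(sep)
        if found and key.strip() == wanted_key:
            return value.strip()
    return None


# In real use, pass an open file object instead of this sample list.
sample = ["host: example.org\n", "port: 8080\n"]
port = find_value(sample, "port")
missing = find_value(sample, "user")
```

Using `partition` rather than `split` means a value that itself contains `:` is returned whole.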

parsing text file in python and creating dictionary of key value pair where values are in list format

I believe that this will do what you want.

import re

studentInfo = {} # create empty dict to fill
pattern = re.compile(r"(\S+)\s+\S+\s+(\S+).*\s(\S+)") # compile pattern to use

line = "Joy 23 Science Exp related to magnet 45 pass" # this is just for example

# repeat this for each line
match = pattern.match(line) # get match
name = match.group(1) # extract name (first word)
subject = match.group(2) # extract subject (third word)
result = match.group(3) # extract result (last word)
if result == "pass":
    studentInfo[subject] = name # add to dict
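
Since the question asks for the values in list format, the per-line step could be wrapped in a loop that appends to a list per subject. A rough sketch, assuming every line follows the same layout as the example. The second and third sample lines are invented for illustration, and `students_by_subject` is a hypothetical name:

```python
import re
from collections import defaultdict

pattern = re.compile(r"(\S+)\s+\S+\s+(\S+).*\s(\S+)")

lines = [
    "Joy 23 Science Exp related to magnet 45 pass",
    "Sam 21 Science Exp related to light 30 fail",  # invented sample
    "Ann 22 Science Exp related to sound 50 pass",  # invented sample
]

students_by_subject = defaultdict(list)
for line in lines:
    match = pattern.match(line)
    if match is None:
        continue  # skip lines that do not fit the expected layout
    name, subject, result = match.groups()
    if result == "pass":
        students_by_subject[subject].append(name)
```

With `defaultdict(list)` a second passing student for the same subject is appended rather than overwriting the first.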

Convert a text file with different key value pairs to a csv file

TL;DR:

pd.DataFrame.from_records(
    dict(field.split(': ') for field in line.split('|'))
    for line in lines
)

Long Version

Assuming you already split your data into lines you then need to process them into records such as:

{' address': '4853 Radio Park Drive', ' dob': '10-06-1960', 'name': 'john'}

Each line needs to be split into fields:

>>> line = 'name: john| dob: 10-06-1960| address: 4853 Radio Park Drive'
>>> line.split('|')
['name: john', ' dob: 10-06-1960', ' address: 4853 Radio Park Drive']

Then each field needs to be split into the name of the column and the value itself:

>>> field = 'name: john'
>>> field.split(': ')
['name', 'john']

Once you do this for every field in the line you end up with a list of these:

>>> [field.split(': ') for field in line.split('|')]
[['name', 'john'],
 [' dob', '10-06-1960'],
 [' address', '4853 Radio Park Drive']]

A dictionary initialised with this list gets you the record from the beginning of the answer.
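
As a quick check of that step, and of the leading spaces that end up in the keys, here is a small sketch. The second, whitespace-stripping variant is an addition for illustration, not part of the original answer:

```python
line = 'name: john| dob: 10-06-1960| address: 4853 Radio Park Drive'

# dict() accepts an iterable of [key, value] pairs.
record = dict(field.split(': ') for field in line.split('|'))

# Variant that strips the stray spaces around keys and values.
clean = {
    key.strip(): value.strip()
    for key, value in (field.split(':', 1) for field in line.split('|'))
}
```

Here `record` keeps keys such as `' dob'` with a leading space, while `clean` has plain `'dob'` and `'address'` keys, which are usually nicer as column names.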

Since you have many lines, you need to produce many records but it's better to produce these lazily, in other words using a generator:

>>> (dict(field.split(': ') for field in line.split('|')) for line in s.split('\n'))
<generator object <genexpr> at 0x7f0d06bf8dd0>

Rather than building a whole list of records up front, the generator yields one record at a time as you iterate over it. This way the dataframe constructor can start consuming records without waiting for all of them to be processed.

Python has a special syntax for this, called a generator expression, that lets you define a generator inline and pass it as an argument to functions and constructors.

Putting it all together, we construct a dataframe using the appropriate constructor (from_records) and the generator defined above:

pd.DataFrame.from_records(
    dict(field.split(': ') for field in line.split('|'))
    for line in lines
)

This produces the following output:

    name         dob                   address        mobile  telephone number       state  Telephone Number
0   john  10-06-1960     4853 Radio Park Drive           NaN               NaN         NaN               NaN
1   jane  07-10-1973     1537 Timbercrest Road  706-289-6746               NaN         NaN               NaN
2   liam  12-08-1986  4853 498 Fairmont Avenue           NaN      706-687-5021         NaN               NaN
3  chris  09-12-1965          485 Green Avenue           NaN               NaN  California      510-855-5213
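
To get from records to an actual CSV file, pandas offers `DataFrame.to_csv`. If pandas is not available, the standard-library `csv` module can do the same job. The sketch below uses `csv.DictWriter` as a swapped-in alternative, writes to an in-memory buffer instead of a file, and strips whitespace from keys and values. The second input line is reconstructed from the table above, so its exact layout is an assumption:

```python
import csv
import io

lines = [
    "name: john| dob: 10-06-1960| address: 4853 Radio Park Drive",
    # assumed input layout for the second row of the table
    "name: jane| dob: 07-10-1973| address: 1537 Timbercrest Road| mobile: 706-289-6746",
]

records = [
    {k.strip(): v.strip() for k, v in (f.split(":", 1) for f in line.split("|"))}
    for line in lines
]

# Collect column names in first-seen order, since records differ in their keys.
fieldnames = []
for record in records:
    for key in record:
        if key not in fieldnames:
            fieldnames.append(key)

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
writer.writeheader()
writer.writerows(records)  # missing keys are filled with restval
csv_text = buffer.getvalue()
```

Note that collecting `fieldnames` requires a full pass over the records first, so this variant is not lazy.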

As a bonus, you can speed this up further by reading the file lazily too. Define a custom generator for reading lines:

def lines(path):
    with open(path) as file:
        while line := file.readline():
            yield line.rstrip()

Note this will only work with Python 3.8+. Otherwise, instead of using the walrus operator you need to do this instead:

def lines(path):
    with open(path) as file:
        while True:
            line = file.readline()
            if line:
                yield line.rstrip()
            else:
                return
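
A shorter variant that should behave the same on any Python 3 version relies on the fact that a file object is itself a lazy iterator over its lines. This is a sketch, and the temporary file only exists to make the example self-contained:

```python
import os
import tempfile


def lines(path):
    """Yield lines lazily by iterating over the file object directly."""
    with open(path) as file:
        for line in file:
            yield line.rstrip()


# Write a small throwaway file so the example can run on its own.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("name: john\nname: jane\n")
    tmp_path = tmp.name

collected = list(lines(tmp_path))
os.remove(tmp_path)
```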

read key value pairs from a text file in pyspark

Read the 2 files into Dataframes and:

  1. get the list of keys (columns) of the first dataframe
  2. transform the second dataframe, which contains the data: split each value first by , and then by :, using a combination of the transform and map_from_entries functions to convert each row into a map column
  3. finally, use a list comprehension over the list of keys to select the columns, and fillna to replace nulls with $:

from pyspark.sql import functions as F

keys = spark.read.csv(keys_file_path, sep="|", header=True).columns
data = spark.read.text(data_file_path)

df = data.withColumn(
    "value",
    F.map_from_entries(
        F.expr("""transform(
            split(value , ','),
            x -> struct(split(x, ':')[0] as col, split(x, ':')[1] as val)
        )""")
    )
).select(*[
    F.col("value").getItem(k).alias(k) for k in keys
]).fillna("$")

df.show(truncate=False)
#+---+---+---+---+---+---+---+---+---+---+
#|a |b |c |d |e |f |g |h |i |j |
#+---+---+---+---+---+---+---+---+---+---+
#|1 |1 |$ |1 |1 |$ |$ |1 |$ |$ |
#|2 |$ |$ |2 |2 |2 |$ |2 |$ |$ |
#|$ |$ |3 |3 |3 |3 |$ |3 |$ |$ |
#|4 |4 |4 |$ |4 |4 |$ |4 |4 |4 |
#+---+---+---+---+---+---+---+---+---+---+
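
To make the logic of that transformation easier to follow, here is the same idea in plain Python, without Spark. It is only an illustration: the keys, the sample rows and the helper `to_row` are hypothetical, and the `key:value,key:value` row layout is inferred from the splits used above.

```python
keys = ["a", "b", "c"]       # hypothetical column names
rows = ["a:1,b:1", "c:3"]    # hypothetical data lines


def to_row(text, keys, fill="$"):
    """Split 'k:v,k:v' into a mapping, then pick each key or the fill value."""
    entries = dict(item.split(":", 1) for item in text.split(","))
    return [entries.get(key, fill) for key in keys]


table = [to_row(row, keys) for row in rows]
```

Each data line becomes a mapping (the counterpart of map_from_entries), and the list comprehension over `keys` plays the role of the select plus fillna step.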

Storing parameters as key value pairs in a text file and should be able to perform operations on the values

So the code is a bit confused with some debugging stuff mixed in. But if you look at what you've written you are missing a couple of pieces.

You convert the string to an integer with value = stoi(str); so that you can increment it. But you never convert the integer back to a string. Instead you do line[pos+1] = value+1;, which assigns a number to a single character of the line and makes no sense at all.

Secondly, although you extract the value from the string with pos = line.find(" "); str = line.substr(pos+1);, you never extract the key. When you put the key and the incremented value back together, you are going to need the key.

Here's a code fragment that does all these things:

pos = line.find(" ");              // find a space
key = line.substr(0, pos);         // extract the key
value_str = line.substr(pos+1);    // extract the value
value = stoi(value_str);           // convert the value
++value;                           // increment the value
value_str = std::to_string(value); // value back to a string again
line = key + " " + value_str;      // combine the key and incremented value
fn1 << line << endl;               // write to file

This code is a bit long-winded; I've broken it down so you can see all the steps. In reality you could combine some of the steps and get rid of some of the variables.

One simple improvement is to let the output stream convert the value back to a string as it writes it to the file. Replace the last three lines above with:

fn1 << key << ' ' << value << endl; // write key and new value

Your code has one other problem. You are trying to open the "samplefile.txt" file for reading and writing simultaneously, which will not work here. Instead, choose a different file name for the output file, and then rename that file at the end.

Something like this (notice I use ifstream for input files and ofstream for output files).

ifstream fin("samplefile.txt");
if (fin) // does the file exist
{
    ofstream fout("samplefile.tmp"); // temporary filename
    ...
    fin.close(); // must close the files before trying
    fout.close(); // to delete and rename
    remove("samplefile.txt"); // delete the original file
    rename("samplefile.tmp", "samplefile.txt"); // rename the temporary file
}
else
{
    ofstream fout("samplefile.txt");
    ...
}

Parsing key value pairs c#

You need to specify the IniFileType, i.e.:

IniConfigSource inifile = new IniConfigSource(file, IniFileType.MysqlStyle);

Long example:

IniDocument inifile = new IniDocument(file, IniFileType.MysqlStyle);
IniConfigSource source = new IniConfigSource(inifile);

Parsing key values pairs from text as dictionary

Iterate over your file and build a dictionary.

def create_colours_dictionary(filename):
    colours_dict = {}
    with open(filename) as file:
        for line in file:
            k, v = line.rstrip().split(':')
            colours_dict[k] = v

    return colours_dict

dct = create_colours_dictionary('file.txt')

Or, if you're looking for something compact, you can use a dict comprehension over a generator expression that splits each line on the colon.

colours_dict = {k : v for k, v in (
    line.rstrip().split(':') for line in open(filename)
)}

This approach will need some modification if the colon is surrounded by spaces—perhaps regex?
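
One possible modification for that case, as a sketch: split on a regex that swallows the whitespace around the first colon. The function name `parse_colours` and the sample lines are hypothetical, and a line without any colon would still raise a `ValueError` when unpacking.

```python
import re


def parse_colours(lines):
    """Parse 'key : value' lines, tolerating spaces around the colon."""
    colours = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # maxsplit=1 keeps any later colons inside the value
        key, value = re.split(r"\s*:\s*", line, maxsplit=1)
        colours[key] = value
    return colours


colours = parse_colours(["red : #ff0000\n", "green:#00ff00\n", "\n"])
```

Without a regex, `line.split(':', 1)` followed by `.strip()` on both parts gives the same result.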


