DOCX file to text file conversion using Python
Problem
As your code shows in the last for loop:
    for para in document.paragraphs:
        textFilename = path + d.split(".")[0] + ".txt"
        with io.open(textFilename,"w", encoding="utf-8") as textFile:
            x=unicode(para.text)
            textFile.write((x))
For each paragraph in the document, the loop opens the file named textFilename again. Say you have a file named MyFile.docx in /home/python/resumes/. The value of textFilename is then /home/python/resumes/MyFile.txt on every iteration of the for loop. The problem is that you open that same file in "w" (write) mode each time, which truncates it and overwrites its whole content, so only the last paragraph is left in the file.
Solution:
Open the file once, outside the for loop, and then write the paragraphs to it one by one.
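A minimal sketch of that fix, under a few assumptions: it is written for Python 3 (so the unicode() call is dropped, since para.text is already a str there), the helper name write_paragraphs is made up for illustration, and the newline added after each paragraph is my choice, not something the original code did. It accepts any iterable of objects with a .text attribute.

```python
import io


def write_paragraphs(paragraphs, text_filename):
    """Write the text of every paragraph to a single file.

    Hypothetical helper: `paragraphs` is any iterable of objects that
    have a `.text` attribute, such as `document.paragraphs`.
    """
    # Open the output file once, before the loop, so that "w" mode
    # truncates it a single time instead of once per paragraph.
    with io.open(text_filename, "w", encoding="utf-8") as text_file:
        for para in paragraphs:
            # Assumption: one paragraph per line; drop "\n" if not wanted.
            text_file.write(para.text + "\n")
```

With the variables from the question, the call would look like write_paragraphs(document.paragraphs, textFilename), with textFilename computed once before the call.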
Python - doc to docx file converter input, file path from a txt file
    with open("file_path", 'r') as file_content:
        content = file_content.read()
        content = content.split('\n')
You can read the contents of the file using the method above, then convert them into a list (or any other iterable type) so that you can use them in a for loop. I used content = content.split('\n') to split the contents on '\n' (the newline character that is inserted every time you press the Enter key); you can split on any other character instead.
    for i in content:
        # the code you want to execute for each line
        pass
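The two pieces above can be combined into one sketch. The function names read_paths and for_each_path are hypothetical, and the sketch assumes the text file holds one path per line. It uses splitlines() instead of split('\n') and skips blank lines, because a file that ends with a newline would otherwise produce an empty last item. The conversion step itself is left to the caller.

```python
def read_paths(list_filename):
    """Return the non-empty lines of a text file as a list of paths."""
    with open(list_filename, "r", encoding="utf-8") as file_content:
        # splitlines() leaves no trailing empty string when the file
        # ends with a newline, unlike split('\n').
        lines = file_content.read().splitlines()
    # Strip surrounding whitespace and drop blank lines.
    return [line.strip() for line in lines if line.strip()]


def for_each_path(list_filename, action):
    """Call `action(path)` once for every path listed in the file.

    `action` is a placeholder for whatever should happen to each file,
    for example a function that performs the conversion.
    """
    for path in read_paths(list_filename):
        action(path)
```

For example, for_each_path("file_path", print) would print every path listed in the file.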