How to search/replace a string in docx file by python

Hey guys,

I need to read a file and replace an old string by new string in some position which I already know.
I use python 2.7 and install python-docx.

I search on StackOverFlow.com and found that we could read a ms-word by convert to xml. I use re to find the string but I don’t know how to replace this string and save document after that.
I found that tag <w:t> contains text value.

Does anybody use python-docx and know how to search/replace in docx file ?

from docx import *
import re
with open ("text.docx", "rb") as f:
    document = Document(f)
    body_element = document._body._body    
    print body_element.xml
    print re.findall('<w:t>', body_element.xml)
1 Like

Check this out :smile:

http://stackoverflow.com/questions/24805671/how-to-use-python-docx-to-replace-text-in-a-word-document-and-save

In [1]: import docx

In [2]: doc = docx.Document('test.docx')

In [3]: doc.paragraphs[0]
Out[3]: <docx.text.paragraph.Paragraph at 0x43b72d0>

In [4]: doc.paragraphs[0].text
Out[4]: u'hello world, abcd '

In [5]: doc.paragraphs[0].text = doc.paragraphs[0].text.replace('hell
   ...: o', 'hi')

In [6]: doc.paragraphs[0].text
Out[6]: u'hi world, abcd '

In [7]: doc.save()
----------------------------------------------------------------------
TypeError                            Traceback (most recent call last)
<ipython-input-7-d2e0991f336a> in <module>()
----> 1 doc.save()

TypeError: save() takes exactly 2 arguments (1 given)

In [8]: doc.save('test.docx')

In [9]: doc = docx.Document('test.docx')

In [10]:

In [10]: doc.paragraphs[0].text
Out[10]: u'hi world, abcd '

In [11]:
1 Like
83% thành viên diễn đàn không hỏi bài tập, còn bạn thì sao?