Search and Replace in ODT using AODL

Recently I needed a way to search and replace a word in an OpenOffice OpenDocument (ODT) file. The project itself already has a library, AODL (take care: there are at least two other project pages, but they host older code), that exposes the format's object model, but I didn't find any utility methods for things like search and replace.

I've created two versions of the method: one in the old-style, imperative approach, and the other using LINQ. I'm still new to LINQ, and it seems the classical approach produces more efficient code. Anyone care to chip in and make a faster algorithm? This version uses simple string equality only; I didn't want to go into the other StringComparison options (just look at the overloads of the Equals method).


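public static void SearchAndReplaceString(TextDocument document, string searchText, string replaceText)
{
    var content = document.Content;

    foreach (var item in content)
    {
        // Only top-level paragraphs are searched here.
        if (item is Paragraph)
        {
            // IText is named explicitly so the loop also compiles
            // when TextContent is a non-generic collection.
            foreach (IText textContent in ((Paragraph)item).TextContent)
            {
                // Replace only when the whole text item equals the search string.
                if (textContent.Text == searchText)
                {
                    textContent.Text = replaceText;
                }
            }
        }
    }
}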


public static void SearchAndReplaceStringLINQ(TextDocument document, string searchText, string replaceText)
{
    var content = document.Content;

    // OfType<Paragraph>() filters and casts in one step (requires "using System.Linq;").
    IEnumerable<Paragraph> paragraphs = content.OfType<Paragraph>();

    foreach (var paragraph in paragraphs)
    {
        var paragraphText = paragraph.TextContent.Where<IText>(t => t.Text == searchText);
        foreach (var textItem in paragraphText)
        {
            textItem.Text = replaceText;
        }
    }
}


I should point out that this method replaces a text item only when its entire text equals the search string, so in practice it matches only single words.
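If the word sits inside a longer run of text, the equality check above will not find it. A minimal sketch of one way around that is a whole-word replace on the string itself. WholeWordReplace is a hypothetical helper, not part of AODL, and it assumes the search text starts and ends with a letter, digit, or underscore, since that is what the \b word boundary relies on.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper (not part of AODL): replaces whole-word occurrences
// of searchText inside a longer string.
public static class WholeWordReplace
{
    public static string Replace(string text, string searchText, string replaceText)
    {
        if (string.IsNullOrEmpty(text) || string.IsNullOrEmpty(searchText))
        {
            return text;
        }

        // Regex.Escape keeps characters such as '.' in the search text literal.
        // Assumption: searchText begins and ends with a word character,
        // otherwise \b will not match where you expect.
        string pattern = @"\b" + Regex.Escape(searchText) + @"\b";

        // A MatchEvaluator is used so that '$' in replaceText is inserted
        // literally instead of being read as a group reference.
        return Regex.Replace(text, pattern, match => replaceText);
    }
}
```

Inside the loops above, the assignment would then presumably become textContent.Text = WholeWordReplace.Replace(textContent.Text, searchText, replaceText). This is untested against AODL, and a phrase that is split across several text items (for example by formatting) would still not be matched.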

Fix 1. Also replaces text that sits inside a table structure (including nested tables). It still doesn't work if the cell is in the table header.

public static void SearchAndReplaceString(TextDocument document, string searchText, string replaceText)
{
    var content = document.Content;
    ReplaceInContent(searchText, replaceText, content);
}

private static void ReplaceInContent(string searchText, string replaceText, ContentCollection content)
{
    foreach (var item in content)
    {
        if (item is Paragraph)
        {
            // IText is named explicitly so the loop also compiles
            // when TextContent is a non-generic collection.
            foreach (IText textContent in ((Paragraph)item).TextContent)
            {
                if (textContent.Text == searchText)
                {
                    textContent.Text = replaceText;
                }
            }
        }
        else if (item is Table)
        {
            foreach (var row in ((Table)item).Rows)
            {
                foreach (var cell in row.Cells)
                {
                    // A cell has its own content collection, so recurse;
                    // this is what makes nested tables work.
                    var cellContent = cell.Content;
                    ReplaceInContent(searchText, replaceText, cellContent);
                }
            }
        }
    }
}
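
The recursion above reaches paragraphs at any table depth. The same walk can also be written as a lazy iterator, which lets a single LINQ query run over every nested item. This is only a sketch of the technique on a stand-in type: Flatten, Node, and Children are hypothetical names and not part of AODL.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TreeExtensions
{
    // Depth-first, pre-order walk: yields each node, then all of its descendants.
    public static IEnumerable<T> Flatten<T>(this IEnumerable<T> roots, Func<T, IEnumerable<T>> getChildren)
    {
        foreach (var node in roots)
        {
            yield return node;
            foreach (var descendant in getChildren(node).Flatten(getChildren))
            {
                yield return descendant;
            }
        }
    }
}

// Stand-in for a nested document structure (hypothetical, not an AODL type).
public class Node
{
    public string Text;
    public List<Node> Children = new List<Node>();
}
```

With this in place, the search becomes roots.Flatten(n => n.Children).Where(n => n.Text == searchText), followed by a loop that assigns the replacement. For AODL the child selector would presumably return the cell contents for a Table and an empty sequence for everything else; that mapping is an assumption and is untested against the library.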

Comments

Would it find text inside a table?
schrepfler said…
Good point. I've tried it and it doesn't work. I'll try to fix the imperative code. I wonder if LINQ could be used to identify any IText in the object graph, no matter what kind of structure it sits under, which would make the code very succinct? HELP, LINQ EXPERTS!
schrepfler said…
Fixed. It also works with nested tables, but not with header cells.
schrepfler said…
I missed this: it seems the XmlDocument is accessible directly from this API. I wonder if it would be best to use XPath or LINQ to XML to select the desired text.
I'm new to C# as well as to AODL. I want to find some values in an ODT text document and replace them as you have done, but when I try to do that I get:

"Error 1 'object' does not contain a definition for 'Text' and no extension method 'Text' accepting a first argument of type 'object' could be found (are you missing a using directive or an assembly reference?)"

That error comes from "textContent.Text".

If you can help me, I'd be thankful.

Nadz
schrepfler said…
Did you download the AODL library DLL and add a reference to it?
