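Environment
Visual Studio 2017, C# (Word .docx file)
Problem
Find/replace only replaces "{Today}"; it fails to replace the "{ConsultantName}" field. I have inspected the document and tried different approaches (see the commented-out code), but no joy.
The Word document contains only a few paragraphs of text; there are no tables or text boxes in it. What am I doing wrong?
Update
When I inspect the doc_text string, I can see "{Today}" intact, but "{ConsultantName}" is split across multiple runs. The opening and closing braces are not adjacent to the word; there are XML tags between them: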
{</w:t></w:r><w:proofErr w:type="spellStart"/><w:r w:rsidR="00544806"><w:t>ConsultantName</w:t></w:r><w:proofErr w:type="spellEnd"/><w:r w:rsidR="00544806"><w:t>}
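One way to confirm this diagnosis is to join the text of all `w:t` nodes in each paragraph and look for the placeholder in the joined string rather than in the raw XML. The following is only a minimal sketch using LINQ to XML; the `RunInspector` class and `ParagraphTexts` method are names made up for illustration, and it assumes you pass in the XML of the main document part (the `doc_text` string above).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class RunInspector
{
    // WordprocessingML main namespace used by w:p, w:r and w:t.
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Returns the visible text of every paragraph by joining all of its w:t nodes,
    // so a placeholder split across several runs shows up as one string again.
    public static List<string> ParagraphTexts(string documentXml)
    {
        return XDocument.Parse(documentXml, LoadOptions.PreserveWhitespace)
            .Descendants(W + "p")
            .Select(p => string.Concat(p.Descendants(W + "t").Select(t => t.Value)))
            .ToList();
    }
}
```

If a placeholder appears in `RunInspector.ParagraphTexts(doc_text)` but not in `doc_text` itself, it has been split across runs and a plain string or regex replace on the raw XML will miss it.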
Code
string doc_text = string.Empty;
List<string> s_find = new List<string>();
List<string> s_replace = new List<string>();
// Regex regexText = null;
s_find.Add("{Today}");
s_replace.Add("24 Sep 2018");
s_find.Add("{ConsultantName}");
s_replace.Add("John Doe");
using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(filePath, true))
{
// read document
using (StreamReader sr = new StreamReader(wordDoc.MainDocumentPart.GetStream()))
{
doc_text = sr.ReadToEnd();
}
// find replace
for (byte b = 0; b < s_find.Count; b++)
{
doc_text = new Regex(s_find[b], RegexOptions.IgnoreCase).Replace(doc_text, s_replace[b]);
// regexText = new Regex(s_find[b]);
// doc_text = doc_text.Replace(s_find[b], s_replace[b]);
// doc_text = regexText.Replace(doc_text, s_replace[b]);
}
// update document
using (StreamWriter sw = new StreamWriter(wordDoc.MainDocumentPart.GetStream(FileMode.Create)))
{
sw.Write(doc_text);
}
}
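Note: I want to avoid Word Interop. I do not want to create an instance of Word and use Word's object model to do the find/replace.
There is no way to prevent Word from splitting text into multiple runs. It happens even when you type the text straight into the document without making any changes or applying any formatting.
However, I worked around the problem by adding custom fields to the document, as follows:
This inserts the field into your document, and the field name stays intact and is not split across multiple runs, even if you apply formatting.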
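Update
To save users the tedious task of manually adding a large number of custom properties to a document, I wrote a method that does it with OpenXML.
Add the following usings:
using System;
using System.Collections.Generic;
using System.IO;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.CustomProperties;
using DocumentFormat.OpenXml.VariantTypes;
Code to add custom (text) properties to the document: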
static public bool RunWordDocumentAddProperties(string filePath, List<string> strName, List<string> strVal)
{
bool is_ok = true;
try
{
if (File.Exists(filePath) == false)
return false;
using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(filePath, true))
{
var customProps = wordDoc.CustomFilePropertiesPart;
if (customProps == null)
{
// no custom properties? Add the part, and the collection of properties
customProps = wordDoc.AddCustomFilePropertiesPart();
customProps.Properties = new DocumentFormat.OpenXml.CustomProperties.Properties();
}
var props = customProps.Properties;
if (props != null)
{
// an int counter avoids overflow when there are more than 255 properties
for (int i = 0; i < strName.Count; i++)
{
var newProp = new CustomDocumentProperty();
newProp.VTLPWSTR = new VTLPWSTR(strVal[i]);
newProp.FormatId = "{D5CDD505-2E9C-101B-9397-08002B2CF9AE}";
newProp.Name = strName[i];
props.AppendChild(newProp);
}
// fix up all the property ID values once, after everything has been appended
// property ID values must start at 2
int pid = 2;
foreach (CustomDocumentProperty item in props)
{
item.PropertyId = pid++;
}
props.Save();
}
}
}
catch (Exception ex)
{
is_ok = false;
ProcessError(ex);
}
return is_ok;
}
All you need to do is this:
*.csproj
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<OutputType>Exe</OutputType>
<TargetFramework>netcoreapp3.1</TargetFramework>
</PropertyGroup>
<ItemGroup>
<PackageReference Include="DocumentFormat.OpenXml" Version="2.12.3" />
</ItemGroup>
</Project>
Add these using directives:
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;
And put this code in your application:
using (WordprocessingDocument wordprocessingDocument =
WordprocessingDocument.Open(filepath, true))
{
var body = wordprocessingDocument.MainDocumentPart.Document.Body;
var paras = body.Elements<Paragraph>();
foreach (var para in paras)
{
foreach (var run in para.Elements<Run>())
{
foreach (var text in run.Elements<Text>())
{
if (text.Text.Contains("#_KEY_1_#"))
{
text.Text = text.Text.Replace("#_KEY_1_#", "replaced-text");
}
}
}
}
}
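Note that the loop above checks one Text element at a time, so it will not match a key that Word has split across several runs (the situation described in the question). One possible way around that is to work on the joined text of a paragraph. This is only a rough sketch of the idea on plain strings; `SplitRunReplacer` is a made-up name, and it assumes you are happy for the replaced text to take the formatting of the first run.

```csharp
using System;
using System.Collections.Generic;

public static class SplitRunReplacer
{
    // Takes the text of each run of ONE paragraph, in document order.
    // If the joined text contains `find`, returns a list of the same length in which
    // the first entry holds the whole replaced text and every other entry is empty.
    // Otherwise the input is returned unchanged.
    public static IReadOnlyList<string> Replace(
        IReadOnlyList<string> runTexts, string find, string replace)
    {
        if (runTexts.Count == 0 || string.IsNullOrEmpty(find))
            return runTexts;

        string joined = string.Concat(runTexts);
        if (!joined.Contains(find))
            return runTexts;

        var result = new string[runTexts.Count];
        result[0] = joined.Replace(find, replace);
        for (int i = 1; i < result.Length; i++)
            result[i] = string.Empty;
        return result;
    }
}
```

With the SDK you would collect `para.Descendants<Text>().ToList()`, pass the `Text` values in, and write the returned strings back to the same elements in the same order. The trade-off is that any formatting differences between the runs of a paragraph that contains the key are lost.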
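Done.
I would like to share a solution for replacing text in Word, Excel, and PowerPoint documents. It takes a simple dictionary-based approach that makes it easy to define your replacements.
Usage: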
Dictionary<string, string> replacements = new()
{
{ "##keyword##", sampleText },
};
IFileHandler fileHandler = FileHandlerFactory.Create(yourFileExtension); //docx, xlsx, pptx
byte[] updatedFile = fileHandler.UpdateFile(originalFile, replacements); //byte[]
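Best practice: I recommend using ## as the prefix and suffix of each keyword and inserting keywords in lowercase. This makes keyword recognition and replacement reliable.
Code: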
public interface IFileHandler
{
byte[] UpdateFile(byte[] file, Dictionary<string, string> replacements);
}
public static class FileHandlerFactory
{
public static IFileHandler Create(string fileExtension)
{
return fileExtension switch
{
"docx" => new WordDocumentHandler(),
"xlsx" => new ExcelHandler(),
"pptx" => new PowerPointHandler(),
_ => throw new NotSupportedException("File type not supported"),
};
}
}
public class WordDocumentHandler : IFileHandler
{
public byte[] UpdateFile(byte[] file, Dictionary<string, string> replacements)
{
string temporaryFilePath = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString() + ".docx");
File.WriteAllBytes(temporaryFilePath, file);
using (WordprocessingDocument document = WordprocessingDocument.Open(temporaryFilePath, isEditable: true))
{
MainDocumentPart? documentPart = document.MainDocumentPart;
if (documentPart != null)
{
var header = documentPart.HeaderParts.SelectMany(headerPart => headerPart.RootElement!.Descendants<DocumentFormat.OpenXml.Wordprocessing.Text>());
var body = documentPart.Document.Descendants<DocumentFormat.OpenXml.Wordprocessing.Text>();
var footer = documentPart.FooterParts.SelectMany(footerPart => footerPart.RootElement!.Descendants<DocumentFormat.OpenXml.Wordprocessing.Text>());
var allText = header.Concat(body).Concat(footer);
foreach (var textElement in allText)
{
string textContent = textElement.Text;
foreach (var replacement in replacements.Where(replacement => textContent.Contains(replacement.Key)))
{
// replace only the keyword, not the whole text of the element
textElement.Text = textElement.Text.Replace(replacement.Key, replacement.Value);
}
}
document.Save();
}
}
return File.ReadAllBytes(temporaryFilePath);
}
}
public class ExcelHandler : IFileHandler
{
public byte[] UpdateFile(byte[] file, Dictionary<string, string> replacements)
{
string temporaryFilePath = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString() + ".xlsx");
File.WriteAllBytes(temporaryFilePath, file);
using (SpreadsheetDocument document = SpreadsheetDocument.Open(temporaryFilePath, isEditable: true))
{
WorkbookPart? workbookPart = document.WorkbookPart;
if (workbookPart != null)
{
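// CalculationProperties is null when the workbook has no calcPr element, so create it if needed
workbookPart.Workbook.CalculationProperties ??= new CalculationProperties();
workbookPart.Workbook.CalculationProperties.ForceFullCalculation = true;
workbookPart.Workbook.CalculationProperties.FullCalculationOnLoad = true;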
SharedStringTablePart? sharedStringTablePart = workbookPart.GetPartsOfType<SharedStringTablePart>().FirstOrDefault();
if (sharedStringTablePart != null)
{
foreach (WorksheetPart worksheetPart in workbookPart.WorksheetParts)
{
GetCells(worksheetPart).ForEach(cell => ProcessCell(replacements, cell, sharedStringTablePart));
}
}
document.Save();
}
}
return File.ReadAllBytes(temporaryFilePath);
}
private static List<Cell> GetCells(WorksheetPart worksheetPart)
{
return worksheetPart.Worksheet.Elements<SheetData>().SelectMany(i => i.Elements<Row>())
.SelectMany(i => i.Elements<Cell>()).ToList();
}
private static void ProcessCell(Dictionary<string, string> replacements, Cell cell, SharedStringTablePart sharedStringTablePart)
{
bool isValidCell = cell.DataType != null && cell.DataType.Value == CellValues.SharedString && cell.CellValue != null;
if (isValidCell)
{
int sharedStringIndex = int.Parse(cell.CellValue.InnerText);
SharedStringItem sharedStringItem = sharedStringTablePart.SharedStringTable.Elements<SharedStringItem>().ElementAt(sharedStringIndex);
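string? text = sharedStringItem.Text?.Text;
if (!string.IsNullOrEmpty(text))
{
// replace only the keywords and keep the rest of the cell text
string updated = text;
foreach (var replacement in replacements)
{
updated = updated.Replace(replacement.Key, replacement.Value);
}
if (updated != text)
{
// write the result into this cell only, so other cells that use the same shared string are not affected
cell.CellValue = new CellValue(updated);
cell.DataType = new EnumValue<CellValues>(CellValues.String);
}
}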
}
}
}
public class PowerPointHandler : IFileHandler
{
public byte[] UpdateFile(byte[] file, Dictionary<string, string> replacements)
{
string temporaryFilePath = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString() + ".pptx");
File.WriteAllBytes(temporaryFilePath, file);
using (PresentationDocument document = PresentationDocument.Open(temporaryFilePath, isEditable: true))
{
PresentationPart? presentationPart = document.PresentationPart;
if (presentationPart != null)
{
foreach (SlideMasterPart slideMasterPart in presentationPart.SlideMasterParts)
{
ReplaceText(slideMasterPart.SlideMaster.Descendants<DocumentFormat.OpenXml.Drawing.Text>(), replacements);
}
foreach (SlidePart slidePart in presentationPart.SlideParts)
{
ReplaceText(slidePart.Slide.Descendants<DocumentFormat.OpenXml.Drawing.Text>(), replacements);
}
}
document.Save();
}
return File.ReadAllBytes(temporaryFilePath);
}
private static void ReplaceText(IEnumerable<DocumentFormat.OpenXml.Drawing.Text> texts, Dictionary<string, string> replacements)
{
foreach (var text in texts)
{
foreach (var replacement in replacements.Where(replacement => text.Text.Contains(replacement.Key)))
{
text.Text = text.Text.Replace(replacement.Key, replacement.Value);
}
}
}
}
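One thing to be aware of: the handlers above write every input to a temp file and never delete it. A possible alternative is to edit the package in memory, since the `Open` methods of the document classes also accept a `Stream`. This is only a sketch of that idea; `InMemoryEditor` is a hypothetical helper, not part of the solution above.

```csharp
using System;
using System.IO;

public static class InMemoryEditor
{
    // Copies the file into an expandable MemoryStream, lets the caller edit it,
    // and returns the resulting bytes. No temp file is created.
    public static byte[] Edit(byte[] file, Action<Stream> edit)
    {
        // new MemoryStream() can grow; new MemoryStream(file) is fixed-size
        // and throws if the edited document becomes larger than the input.
        using var stream = new MemoryStream();
        stream.Write(file, 0, file.Length);
        stream.Position = 0;

        edit(stream);

        // ToArray returns the full buffer regardless of the current position.
        return stream.ToArray();
    }
}
```

A handler could then call something like `InMemoryEditor.Edit(file, s => { using var document = WordprocessingDocument.Open(s, true); /* replace text here */ });` so that the package is disposed, and therefore flushed, before the bytes are read back.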
Key points: