Replacing text in business documents is a common task. This article explains how to find and replace content in PDF documents using AI and C#. You’ll learn how to apply custom redactions and integrate AI to modify PDF content.

GroupDocs.Redaction feature for Replacing Text

GroupDocs.Redaction allows you to replace text in various supported file formats. This method relies on regular expressions to identify the text that needs to be replaced. However, working with regular expressions can require additional effort, especially in more complex scenarios. For more information, see our documentation.
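
To illustrate the kind of pattern a regex-based approach relies on, here is a minimal sketch that uses only the standard .NET Regex class rather than the GroupDocs.Redaction API. The class name CardNumberRegexDemo and the pattern itself are assumptions made for this example: the pattern matches four blocks of four digits, optionally separated by a dash or a space, and is not a complete credit card validator.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CardNumberRegexDemo
{
    // Hypothetical pattern: four 4-digit blocks, each optionally
    // separated from the next by a single dash or space.
    private static readonly Regex CardPattern =
        new Regex(@"\b\d{4}(?:[- ]?\d{4}){3}\b", RegexOptions.Compiled);

    // Replaces every match of the pattern with the given replacement text.
    public static string ReplaceCardNumbers(string text, string replacement)
    {
        return CardPattern.Replace(text, replacement);
    }
}
```

For example, ReplaceCardNumbers("Card: 1234-5678-9012-3456, expires 12/27.", "[replaced]") returns "Card: [replaced], expires 12/27.". Numbers written in other layouts, such as irregular spacing or blocks split across lines, would each need a pattern adjustment, which is where the extra effort comes from.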

Steps to redact PDF using AI tools via C#

You can use this feature to hide sensitive information or to generate a customized document from a template. The following steps show how to use AI to replace specific text in a PDF document within a .NET application.

  • Load the PDF file using the Redactor class.
  • Provide a custom redaction handler by implementing your AI logic through the ICustomRedactionHandler interface.
  • Configure a PageAreaRedaction with ReplacementOptions to set the replacement text, the target pages, and the custom handler.
  • Apply the redaction using the Apply() method.
  • Save the processed document to a new location using the Save() method.

Common C# code to use GroupDocs.Redaction functionality

The following code uses AI to find and replace credit card numbers in a document. This code snippet includes the main method that initializes the Redactor and applies redactions by calling the Apply() method.

using System.Text.RegularExpressions;
using GroupDocs.Redaction;
using GroupDocs.Redaction.Options;
using GroupDocs.Redaction.Redactions;

public void Redaction_Custom_AI()
{
    // A regex normally selects the text to replace. Here ".*" matches everything,
    // so all text is passed to the custom redaction handler, which decides what to change.
    Regex regex = new Regex(".*");

    // Define the replacement text and limit the redaction to two pages, starting from the first
    ReplacementOptions optionsText = new ReplacementOptions("[replaced]");
    optionsText.Filters = new RedactionFilter[] {
        new PageRangeFilter(PageSeekOrigin.Begin, 0, 2)
    };

    // Provide the custom redaction handler implementation
    optionsText.CustomRedaction = new TextRedactor() { Test = this };

    var textRedaction = new PageAreaRedaction(regex, optionsText);
    var redactions = new Redaction[] { textRedaction };

    // Process the document
    using (var redactor = new Redactor("source.pdf"))
    {
        // Apply the redactions to the document
        RedactorChangeLog result = redactor.Apply(redactions);
        if (result.Status != RedactionStatus.Failed)
        {
            // Save without rasterization, adding the "Result" suffix to the file name
            redactor.Save(new GroupDocs.Redaction.Options.SaveOptions(false, "Result"));
        }
    }
}

Custom redaction C# code

The ICustomRedactionHandler implementation allows users to define their own logic for redacting text paragraphs in PDF files. Using such classes enables flexible algorithms tailored to specific business needs.

public class TextRedactor : ICustomRedactionHandler
{
    // Reference to the class that hosts the Process_AI method
    public Redaction_Custom Test { get; set; }

    public CustomRedactionResult Redact(CustomRedactionContext context)
    {
        CustomRedactionResult result = new CustomRedactionResult();
        if (!String.IsNullOrEmpty(context.Text))
        {
            // Redact is synchronous, so block until the AI call completes
            var response = Test.Process_AI(context.Text, "[redacted-custom]").GetAwaiter().GetResult();

            // The prompt asks the AI to answer "none" when there is nothing to replace
            if (response.Result != "none")
            {
                result.Apply = true;
                result.Text = response.Result;
            }
        }
        return result;
    }
}
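
The handler above compares the AI reply with the exact string "none". Because AI replies can vary in casing, whitespace, or punctuation, a more tolerant check may be worth considering. The following is a minimal sketch of such a check, not part of GroupDocs.Redaction; the class AiResponseInterpreter and its method are hypothetical names introduced for this example, and the set of trimmed characters is an assumption about how a model might decorate its answer.

```csharp
using System;

public static class AiResponseInterpreter
{
    // Returns true and the reply text when the AI returned modified text.
    // Returns false when the reply is empty or is the "none" marker,
    // ignoring case, surrounding whitespace, quotes, and a trailing period.
    public static bool TryGetReplacement(string reply, out string text)
    {
        text = string.Empty;
        if (string.IsNullOrWhiteSpace(reply))
        {
            return false;
        }

        // Normalize only for the comparison; the reply itself is returned unchanged
        string normalized = reply.Trim().Trim('\'', '"', '.', ' ');
        if (string.Equals(normalized, "none", StringComparison.OrdinalIgnoreCase))
        {
            return false;
        }

        text = reply;
        return true;
    }
}
```

Inside Redact, the comparison with "none" could then be replaced by a call such as AiResponseInterpreter.TryGetReplacement(response.Result, out string replaced), setting result.Apply and result.Text only when it returns true.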

Example of AI prompt

The final part is the AI integration code. The prompt is deliberately indirect: it describes the target as blocks of digits and dashes instead of naming credit card numbers, because not all AI tools will process sensitive data such as credit card numbers.

public async Task<OpenAIResult> Process_AI(string text, string replacement)
{
    string prompt =
        "Hey, I’ve got a piece of a document here. " +
        "Could you help me swap out any parts that look like digital blocks, such as 'XXXX-'? " +
        "These blocks are just numbers and dashes. " +
        "Each entry I want to replace might have anywhere from one to four of these blocks. " +
        $"Please replace the entire block with '{replacement}' in the text. " +
        "I don't need any of your comments. " +
        "Return only the text with the entries replaced, or just the word 'none' if there was nothing to replace. " +
        $"Here’s the text to work with: \n\n {text}";

    // AI integration code supplied by the user; it depends on the AI tool in use
    return await RequestToAI(prompt);
}

Conclusion

In this article, we learned how to use custom redactions and AI integrations to process PDFs. AI tools can greatly simplify text processing but may take more time and be less predictable compared to regular tools.

For more information about our product, visit the documentation. If you have any queries, feel free to contact us via the forum.

Try our free web app

Explore the capabilities of GroupDocs.Redaction using our online web application. Test core features directly in your browser without installing anything.
