
Extract ZIP Files Data in Java

ZIP archives are one of the most popular compressed file formats. The main reasons for using ZIP files are to reduce the total file size and to send multiple files as a single archive. As a developer, you can extract the text, images, and even metadata from the files compressed within ZIP archives. In this article, we will discuss how to extract data from ZIP archives in Java.
September 8, 2021 · 4 min · Shoaib Khan
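
As a rough illustration of the idea in the Java article above, here is a minimal sketch that reads the entries of a ZIP archive and decodes each one as text. It uses only the JDK's `java.util.zip` package, not the API covered in the article itself, and it assumes the archive holds small UTF-8 text files. The class name `ZipTextExtractor` and the method `extractText` are hypothetical names chosen for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper for illustration; not part of any library named in the article.
public class ZipTextExtractor {

    // Reads every non-directory entry from a ZIP stream and decodes it as UTF-8.
    // Assumption: entries are small text files; binary formats (PDF, DOCX, ...)
    // would need a real document parser instead of a plain decode.
    public static Map<String, String> extractText(InputStream zipData) throws IOException {
        Map<String, String> textByEntry = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(zipData, StandardCharsets.UTF_8)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue; // folders carry no content of their own
                }
                ByteArrayOutputStream buffer = new ByteArrayOutputStream();
                byte[] chunk = new byte[8192];
                int read;
                // read() returns -1 at the end of the current entry
                while ((read = zip.read(chunk)) != -1) {
                    buffer.write(chunk, 0, read);
                }
                textByEntry.put(entry.getName(),
                        new String(buffer.toByteArray(), StandardCharsets.UTF_8));
            }
        }
        return textByEntry;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("Usage: java ZipTextExtractor <archive.zip>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            extractText(in).forEach((name, text) ->
                    System.out.println(name + ": " + text.length() + " characters"));
        }
    }
}
```

Taking an `InputStream` rather than a file path keeps the sketch usable for archives that arrive over the network or sit in memory, at the cost of reading entries strictly in the order they are stored.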

Extract ZIP Files Data in C#

Archive formats such as ZIP, RAR, TAR, GZIP, and BZIP2 are commonly used to store multiple files and folders in a single container. Another main reason for using archive files is to reduce the total file size with compression algorithms. Just as you parse and extract data from documents of various file formats, you can treat archive files the same way. You can extract the text, images, and even metadata from the files compressed within the archives. In this article, we will discuss how to extract data from ZIP archives using C# in your .NET applications.
August 25, 2021 · 3 min · Shoaib Khan

Extract Images from EPUB, FB2, CHM eBooks in Java

eBooks in various formats are very common in everyday use. An eBook can contain text as well as images. If you want to reuse the images of an eBook elsewhere, you can extract them programmatically within your Java application. In this article, you will learn how to automate the extraction of images from eBook files such as EPUB, PDF, FB2, and CHM in Java.
March 15, 2021 · 3 min · Shoaib Khan
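
For the EPUB case specifically, an EPUB file is a ZIP container, so a first pass at finding its images can be sketched with the JDK alone. The example below is only a sketch under that assumption: it lists entries whose names end in a common image extension and does not consult the EPUB manifest. It does not apply to FB2 (an XML format) or CHM (Microsoft's compiled help format), and it is not the API used in the article. The class name `EpubImageLister`, the method `listImageEntries`, and the extension list are hypothetical choices for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper for illustration; works on the ZIP layer of an EPUB only.
public class EpubImageLister {

    // Assumption: images are recognizable by file extension.
    private static final List<String> IMAGE_EXTENSIONS =
            Arrays.asList(".jpg", ".jpeg", ".png", ".gif", ".svg");

    // Returns the names of entries that look like images, in archive order.
    public static List<String> listImageEntries(InputStream epubData) throws IOException {
        List<String> images = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(epubData)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                // Compare case-insensitively so "cover.JPG" is also matched
                String name = entry.getName().toLowerCase(Locale.ROOT);
                if (!entry.isDirectory() && IMAGE_EXTENSIONS.stream().anyMatch(name::endsWith)) {
                    images.add(entry.getName());
                }
            }
        }
        return images;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("Usage: java EpubImageLister <book.epub>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            listImageEntries(in).forEach(System.out::println);
        }
    }
}
```

To save the images rather than list them, the same loop could copy each matching entry's bytes to a file before moving to the next entry.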

Extract Images from EPUB, FB2, CHM eBooks in C#

An electronic book, popularly known as an eBook, is a book in digital form that is readable on various electronic devices. These devices include dedicated eReaders like Kindle, as well as laptops, desktop computers, and smartphones. Many eBook file formats are in use, including EPUB, FictionBook (FB2), Microsoft Compiled HTML Help (CHM), DjVu, MOBI, PDF, and others. This article will help programmers extract images from eBooks in C# within .NET applications.

{{< figure align=center src="images/extract-images-from-epub-fb2-chm-ebooks-in-csharp-dotnet.jpg" alt="Extract Images from eBooks in C# .NET" caption="EPUB eBook from the Adobe Sample eBook Library" >}}

The following topics will be covered in this article:

  • .NET API for Image Extraction from eBooks
  • Extract Images from EPUB eBook in C#
  • Extract Images from FB2, CHM eBooks in C#

[Continue Reading …][1]
February 26, 2021 · 3 min · Shoaib Khan

Extract Data from Invoices and Receipts in Java

In the era of online business, the use of digital invoices and receipts has increased considerably. As a result, efficient data extraction from these digital invoices is also in demand. In this article, you will learn how to extract data from PDF invoices or receipts programmatically in Java. [Continue Reading…][1]
January 22, 2021 · 2 min · Shoaib Khan

Read PDF Form Fields using C#

In this article, we will learn how to read and parse PDF documents and then programmatically extract PDF form field values in C#. Earlier, we saw [how to extract values from PDF forms in Java][1]. After reading these articles, if you have filled-in feedback forms, you can extract their values within your .NET and Java applications for analysis or save them to a database.
December 23, 2020 · 2 min · Shoaib Khan

Read PDF Form Fields in Java

In this article, we will discuss how to parse a PDF document and extract values from PDF forms programmatically in Java. There are many situations where we receive filled-in survey or feedback forms in PDF format from a large audience. We can easily extract the filled-in values and use them for analysis. Let us now move straight to reading these PDF forms and extracting the filled-in field values within Java applications.
December 9, 2020 · 2 min · Shoaib Khan

Extract Images from Documents using C#

In this article, we will learn to programmatically extract images from PDF, Excel, PowerPoint, and Word documents in a C# application using a document parsing .NET API. [GroupDocs.Parser for .NET][1] is a document parsing and data extraction .NET API. It supports document parsing and extraction of images, text, and metadata from word-processing documents, spreadsheets, presentations, archives, and email documents.
October 28, 2020 · 3 min · Shoaib Khan

Extract Images from Documents using Java

Today, we will learn to programmatically extract images from PDF, Excel, PowerPoint, and Word documents using Java. For the extraction of images, we will use [GroupDocs.Parser for Java][1]. This Java API supports the parsing of documents and extraction of images, text, and metadata from word-processing documents, spreadsheets, presentations, archives, and email documents. Extracted images can be saved in BMP, GIF, JPEG, PNG, and WebP formats.
October 27, 2020 · 3 min · Shoaib Khan

Parse Documents to Extract Text and Metadata using Java

The GroupDocs.Parser for Java API has been on the market since last year and has proved to be a powerful document parser API. It allows parsing and reading popular formats of word-processing documents, spreadsheets, presentations, eBooks, emails, markup documents, notes, archives, and databases. You can extract not only the text but also the images and metadata properties from various document formats including PDF, XLS, XLSX, CSV, DOC, DOCX, PPT, PPTX, MPP, EML, MSG, OST, PST, ONE, and many more.
December 3, 2019 · 3 min · Usman Aziz