In today’s digital age, document management can often become a daunting task. But what if there was a tool that could simplify this process, making it efficient and effortless?
Welcome to the world of ChatGPT Code Interpreter—your ultimate ally for handling documents. From batch processing documents, watermarking PDFs, merging and splitting files, to password protecting content and more, this all-in-one tool elevates your productivity to a whole new level.
This article uncovers the endless possibilities that Code Interpreter offers, making document management as simple as a breeze.
Let’s dive in and explore its amazing features.
Document Batch Processing
Documents are vital carriers of information. Processing them efficiently is key to enhancing productivity in daily office tasks.
If you want to significantly boost productivity, you can compress multiple documents at once and upload the compressed package to Code Interpreter and perform batch processing.
Batch Watermarking PDFs
Watermarking PDFs is common, especially in professional and business contexts. Many organizations, including corporations, educational institutions, government agencies, and content creators, choose to add watermarks to PDFs to protect their content, assert ownership, and uphold brand identity.
With Code Interpreter, you can bulk watermark PDFs using a simple prompt.
Please add a diagonal watermark to the center of each page in multiple PDFs found in the provided ZIP file. The watermark should be scaled to 75% of the target page size. The content of the watermark should be “myaiforce.com” written in Courier-Bold font, gray color, with 50% opacity. Once the task is finished, kindly share a download link for all the processed PDFs.
To demonstrate, I first used Code Interpreter to auto-generate 5 PDFs and compressed them into a package. Then, I uploaded the file and entered the prompt above.
Following a series of steps, Code Interpreter added a watermark to each page of every PDF.
In addition to specifying text content for the watermark, you can also upload a logo image or add a personalized signature at the end of the PDF.
Merge and Split PDFs
Merging multiple related PDF files into a single file can simplify storage and viewing. To merge multiple PDFs in the example above, simply enter the following prompt.
Please combine all the PDFs from the provided ZIP file into a single PDF. After completing the task, kindly provide a download link for the processed PDF.
ChatGPT quickly merged the PDFs.
Furthermore, splitting PDFs using Code Interpreter is also straightforward.
Extract Tables and Images from PDFs
Extracting tables from PDFs makes data analysis and visualization easier. After extracting the tables, we can convert them into Excel files for further processing.
To instruct Code Interpreter to perform this task, the prompt is simple:
Please extract the table from the PDF and save it as an Excel file. Let’s go step by step. After completing the task, kindly provide a download link for the Excel file.
In my initial attempt, it didn’t work, but adding the phrase “Let’s go step by step” made it work. If you encounter unstable output from ChatGPT, you can try adding this sentence.
I want to extract a table from the PDF shown below:
The resulting table generated by ChatGPT looks like this:
Since the original table had merged cells, the output table differed slightly in terms of row names, but the data below was correct. If you want to extract the original table without merged cells, the results should be identical to the original.
After extracting the table, we can also use Code Interpreter to analyze the data or create charts.
Additionally, there was an image of a bar chart in the original PDF, which Code Interpreter could also help extract.
Simply submit the following prompt to ChatGPT.
Please extract the image from the PDF. Let’s go step by step. After completing the task, kindly provide a download link for the image.
Code Interpreter successfully extracted the image for us.
Although I only used one PDF for the demonstration, the power of Code Interpreter lies in batch processing. You can compress multiple PDFs into a package and upload them to ChatGPT for batch extraction. It is also possible to extract tables and images from other document formats, such as Word documents.
Password Protect PDFs
Adding a password to a PDF ensures that only authorized individuals can access its content. By adding a password, you can protect sensitive information and reduce the risk of data leakage.
Adding password protection to PDFs can be achieved with a single line of prompt:
Add a password (yU5NIK5Y) to this pdf. Once done, kindly provide a download link.
You can also add multiple PDFs simultaneously for password protection, but make sure to keep a record of the passwords. Besides PDFs, you can encrypt other document types, such as Excel and Word files.
Summarize PDF Content
There are various methods to summarize PDF content, and Code Interpreter is one of them. One of the significant advantages of Code Interpreter over other methods is its ability to process documents in batches.
You can compress multiple PDFs and upload them, allowing ChatGPT to generate a summarized Excel file, making it more convenient to access the content.
To demonstrate, I uploaded a PDF mentioned earlier and submitted the following prompt.
Summarize this PDF in three sentences. Let’s go step by step.
Let’s examine ChatGPT’s response to understand the process:
Sure, let me first read the contents of the PDF and then I will provide a summary. Let’s start by loading and extracting the text from the PDF.
The initial extracted text contains quite a bit of repetitive information, likely due to metadata or headers and footers in the PDF. These repetitions could potentially dilute the actual content of the document. To get a more accurate summary, I will attempt to clean up this text by removing the repetitive phrases, if possible. Let’s proceed with cleaning and then continue with the summarization.
The text cleaning has helped to some extent, but the preview still contains repetitive information, likely because it’s from the header or footer of the document. To provide a better summary, let’s identify the main body of the text and focus on that. I’ll do this by removing repetitive patterns that occur throughout the text, which often represent headers, footers, or sidebars in a PDF document. Let’s proceed with this approach.
It appears that the extraction of main body content from the PDF is still not successful. The text extracted seems to be repeating certain phrases which look like part of a header or footer.
The PyPDF2 library that I’m using for PDF text extraction can sometimes struggle with complex PDF layouts or PDFs that have been created from scanned images rather than text.
To ensure the best results, I’m going to try another method using the
pdfplumberlibrary, which often provides better results with complex PDFs.
Let’s try extracting the text again using
The extraction with
pdfplumberseems to have been more successful. The text appears to be much cleaner and more coherent. I can see the title, authors, affiliations, and the start of an abstract. Let’s now proceed with summarizing the main points of the document. For this, I’ll use a text summarization model.
I’m sorry for the confusion, but it appears that the current environment does not support the
transformerslibrary, which I planned to use for text summarization. However, I can still provide a manual summary based on the first few lines of text from the document:
- The document, titled “Improved Performance of ChatGPT-4 on the OKAP Exam: A Comparative Study with ChatGPT-3.5”, was authored by a team from the University of Massachusetts Medical School Department of Ophthalmology.
- The study investigates the performance of ChatGPT-4, an advanced Artificial Intelligence (AI) language model, on the Ophthalmology Knowledge Assessment Program (OKAP) examination.
- It compares the performance of ChatGPT-4 with its predecessor, ChatGPT-3.5, using 180 OKAP practice questions.
Please note that this summary is based on the initial part of the document that I’ve read so far. I can read further into the document and provide a more complete summary if needed.
Upon analyzing ChatGPT’s processing, we noticed that it deleted duplicate content during the summarization process, resulting in a more accurate summary. Based on the summary provided in the previous part, it suggested that I could perform a more in-depth summary as well.
Generate Word Clouds for PDFs
Word clouds condense and visualize the most frequently used words in a document, helping us grasp the key concepts covered in the PDF.
Generating word clouds for multiple PDFs can reveal similarities and differences between texts, aiding in comparative analysis or cluster analysis.
To generate a word cloud for a PDF, I used the following prompt.
Please remove the stop words from the PDF and generate a word cloud.
However, Code Interpreter can also generate word clouds for multiple PDFs in bulk. It is not limited to PDFs only; it can process other document formats such as Word, Markdown, Epub, etc., to generate word clouds.
Extract Information from PDFs
Based on the word cloud generated earlier, we already know the main keywords in the PDF. For example, if I want to extract all sentences that include the word “model” and output them to a text file, Code Interpreter can easily accomplish this task.
Please extract the sentence that contains the word “model” (case insensitive) from the PDF and save it as a text file. Let’s go step by step. Once the task is complete, kindly provide a download link for the text file.
Code Interpreter extracted the sentences from the PDF according to my request, exported them to a text file, and provided a download link. The first five extracted sentences are highlighted in the red box below.
OCR (Optical Character Recognition) can convert scanned or image-based PDFs into searchable and selectable text, making PDFs more user-friendly and reducing file size. For images, OCR can convert text within the image into editable text for further processing.
To demonstrate the recognition function of Code Interpreter, I chose a scanned copy of a 1972 letter. You can also add multiple PDFs for recognition simultaneously.
Enter the following prompt:
Please apply OCR technology to recognize the text in the provided PDF and generate a new PDF containing the recognized text. After completing the task, please share a download link for the resulting PDF.
After applying OCR recognition to a two-page PDF, here is the effect of the first page:
The results are satisfactory, as almost all the words are correctly recognized.
Batch File Format Conversion
Reading and processing documents often requires the use of different applications, each supporting different formats. Utilizing the most suitable format for specific purposes allows for efficient processing, sharing, saving, and utilization of documents. Code Interpreter enables batch conversion between different formats, making files more versatile across different systems.
Batch WebP to JPG Conversion
Many images downloaded from the internet are in the WebP image format. Since many image editing software cannot recognize or support this format, we often need to convert WebP images to other formats like JPG.
Batch conversion with Code Interpreter is convenient. Simply place the images to be converted into a zip file, upload it, and use a simple prompt to guide ChatGPT in the conversion process.
Please convert all the pictures in the compressed package from the webp format to jpg format. Once the conversion is complete, please provide a download link for the compressed package containing the processed pictures.
Convert Web Pages to Markdown
Markdown is a plain text format that can be easily opened and read even without an internet connection. Converting web pages to Markdown allows for more convenient storage. Many note-taking applications, such as Notion, support importing Markdown content.
Converting web pages to Markdown is straightforward. First, save the web page as an MHTML file, then upload the file to Code Interpreter. Finally, write a prompt for ChatGPT, such as:
Convert this mhtml file to markdown file. Provide me a download link. Let’s go step by step.
Conversion of Other Formats
Code Interpreter supports many other format conversions in addition to the ones mentioned above.
- Archive: 7Z, TAR, ZIP, RAR, GZ, BZ2, XZ, LZMA
- Audio: MP3, WAV, FLAC, AAC, OGG, M4A, AIFF, WMA, AMR
- Documentation: DOC, DOCX, PDF, RTF, HTML, TXT, ODT, EPUB, CSV, XLS, XLSX, PPT, PPTX, Markdown, LaTeX
- Form: XLSX, CSV
- Images: PNG, JPG, JPEG, BMP, TIFF, GIF, SVG, ICO, WEBP, RAW, HEIC, EPS, PSD
- Video: MP4, AVI, MOV, FLV, MKV, WMV, 3GP, WebM, MPEG, VOB
- Code files: PY, JS, JAVA, C, CPP, CS, R, Swift, PHP, Ruby, Go, Kotlin, Lua, Shell
- Data files: JSON, XML, YAML, SQL, HDF5, PICKLE, Parquet, Protobuf
- Subtitles: SRT, ASS, SSA, VTT
- Fonts: TTF, OTF, WOFF, EOT
- 3D files: STL, OBJ, FBX, COLLADA, 3DS, IGES, STEP
- Geospatial data: GeoJSON, SHP, KML, GPX, GeoTIFF
- Scientific data formats: FITS, VTK, NetCDF, DICOM
- CAD files: DWG, DXF
- eBooks: EPUB, MOBI, AZW3
- Email: EML, MSG
- Web: HTML, CSS, JS, WebAssembly
- Other: LOG
Please note that although Code Interpreter can handle these formats, converting between them may not always be easy or even possible in some cases.
Each conversion depends on the specific Python library used, the complexity of the file, and the support provided by the format itself. For example, converting a complex DOCX document to HTML may not preserve the formatting and layout perfectly due to differences between the two formats.
And there you have it — with Code Interpreter, document management is elevated to new heights of efficiency. AI is truly revolutionizing how we handle files.
Recently, an exciting open-source project called Open Interpreter emerged as well. It allows you to leverage code interpretation directly on your local machine, streamlining certain workflows even further. Interested? Check this out: