A Python script for converting a PDF to images
Install dependencies
Python:
pip install pdf2image
Poppler:
Linux (Debian/Ubuntu):
sudo apt install poppler-utils
Windows:
Windows users need to download Poppler and add its bin directory to the PATH environment variable. Download: https://github.com/oschwartz10612/poppler-windows
Alternatively, install it directly with scoop:
scoop install main/poppler
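Before running the script, it can help to confirm that the Poppler command-line tools are actually reachable. The snippet below is a minimal sketch using only the standard library; the helper name find_poppler_tools is hypothetical, and it assumes pdf2image relies on the pdftoppm and pdfinfo executables being on PATH.

```python
import shutil


def find_poppler_tools(tools=("pdftoppm", "pdfinfo")):
    """Map each tool name to its resolved path, or None if it is not on PATH.

    find_poppler_tools is a hypothetical helper, not part of pdf2image.
    """
    return {name: shutil.which(name) for name in tools}


if __name__ == "__main__":
    for name, path in find_poppler_tools().items():
        print(f"{name}: {path or 'NOT FOUND on PATH'}")
```

If a tool is reported as not found, recheck the PATH setting (on Windows, open a new terminal after changing it).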
Python script
import argparse
import os

from pdf2image import convert_from_path
from PIL import Image


def pdf_to_images(pdf_path, output_folder, dpi=300):
    """Render every page of the PDF to a PNG file and return the file paths."""
    os.makedirs(output_folder, exist_ok=True)
    images = convert_from_path(pdf_path, dpi=dpi)
    image_paths = []
    for i, image in enumerate(images):
        # Pages are numbered from 1: page_1.png, page_2.png, ...
        image_path = os.path.join(output_folder, f'page_{i+1}.png')
        image.save(image_path, 'PNG')
        image_paths.append(image_path)
        print(f"Saved: {image_path}")
    return image_paths


def merge_images(image_paths, output_path):
    """Stack the images vertically into one long image on a white canvas."""
    images = [Image.open(p) for p in image_paths]
    widths, heights = zip(*(img.size for img in images))
    total_height = sum(heights)
    max_width = max(widths)
    merged_image = Image.new('RGB', (max_width, total_height), color=(255, 255, 255))
    y_offset = 0
    for img in images:
        # Narrower pages are left-aligned; the remaining width stays white.
        merged_image.paste(img, (0, y_offset))
        y_offset += img.height
    merged_image.save(output_path)
    print(f"Merged image saved to: {output_path}")


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='Convert a PDF to images, optionally merging them')
    parser.add_argument('pdf_path', help='Path to the input PDF file')
    parser.add_argument('output_folder', help='Path to the output image folder')
    parser.add_argument('--merge', action='store_true', help='Merge all pages into one long image')
    args = parser.parse_args()
    image_paths = pdf_to_images(args.pdf_path, args.output_folder)
    if args.merge:
        merged_path = os.path.join(args.output_folder, 'merged.png')
        merge_images(image_paths, merged_path)
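The script above loads every page into memory at once, which can be heavy for long PDFs at 300 DPI. One possible variation, sketched here under assumptions, converts the file a few pages at a time using the first_page and last_page parameters of convert_from_path and the page count reported by pdfinfo_from_path. The names page_ranges, pdf_to_images_chunked, and chunk_size are hypothetical and not part of the original script.

```python
import os


def page_ranges(total_pages, chunk_size=10):
    """Yield inclusive, 1-based (first, last) page ranges covering the document."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for first in range(1, total_pages + 1, chunk_size):
        yield first, min(first + chunk_size - 1, total_pages)


def pdf_to_images_chunked(pdf_path, output_folder, dpi=300, chunk_size=10):
    """Hypothetical variant of pdf_to_images that renders chunk_size pages at a time."""
    # Imported lazily so page_ranges can be used without pdf2image installed.
    from pdf2image import convert_from_path, pdfinfo_from_path

    os.makedirs(output_folder, exist_ok=True)
    total_pages = pdfinfo_from_path(pdf_path)["Pages"]
    image_paths = []
    for first, last in page_ranges(total_pages, chunk_size):
        images = convert_from_path(pdf_path, dpi=dpi, first_page=first, last_page=last)
        # Keep the same page_N.png naming as the original script.
        for page_number, image in enumerate(images, start=first):
            image_path = os.path.join(output_folder, f"page_{page_number}.png")
            image.save(image_path, "PNG")
            image_paths.append(image_path)
    return image_paths
```

For example, page_ranges(25, 10) yields (1, 10), (11, 20), (21, 25), so at most ten rendered pages are held in memory at any time.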
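Usage
Create a file named pdf2img.py and paste the script into it. The command format is:
python pdf2img.py ./test.pdf ./images_output
Each page is saved as a PNG image (page_1.png, page_2.png, and so on).
To combine the generated images into a single long image, add the --merge option. The result is saved as merged.png in the output folder.
python pdf2img.py ./test.pdf ./output_folder --merge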
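To make the --merge layout concrete, the sketch below reproduces only the arithmetic that merge_images performs: the canvas is as wide as the widest page and as tall as all pages combined, and each page is pasted at the running height total. The function name stack_layout is hypothetical, and the page sizes in the example are made up for illustration.

```python
def stack_layout(sizes):
    """Compute the merged canvas size and the y offset of each page.

    sizes is a list of (width, height) tuples in pixels.
    Returns ((canvas_width, canvas_height), [y_offset, ...]).
    """
    if not sizes:
        raise ValueError("need at least one image size")
    canvas_width = max(width for width, _ in sizes)
    offsets = []
    y_offset = 0
    for _, height in sizes:
        offsets.append(y_offset)  # this page starts where the previous one ended
        y_offset += height
    return (canvas_width, y_offset), offsets


if __name__ == "__main__":
    # Illustrative sizes only: a wide page followed by a narrower, shorter one.
    print(stack_layout([(800, 1000), (600, 500)]))
    # -> ((800, 1500), [0, 1000])
```

Because every page is pasted at x = 0, a page narrower than the canvas leaves a white strip on its right side.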