Is there a way to detect whether a PDF is an Adobe Portfolio PDF?

I am trying to prevent users from uploading this type of file (it is not really a different file type, since technically it is still a PDF).

I tried pdfinfo:

$ pdfinfo portfolio-sample.pdf 
Title:          Sample PDF Portfolio
Subject:        Adobe Acrobat XI
Keywords:       adobe, acrobat, xi, pdf, portfolio, sample
Creator:        Adobe Acrobat Pro 10.1.3
Producer:       Adobe Acrobat Pro 10.1.3
CreationDate:   Thu Jun 21 15:03:15 2012 EDT
ModDate:        Fri Sep 28 17:49:50 2012 EDT
Tagged:         yes
UserProperties: no
Suspects:       no
Form:           none
JavaScript:     no
Pages:          1
Encrypted:      no
Page size:      504 x 360 pts
Page rot:       0
File size:      3600732 bytes
Optimized:      no
PDF version:    1.7

and exiftool:

$ exiftool -a -G1 portfolio-sample.pdf 
[ExifTool]      ExifTool Version Number         : 10.80
[System]        File Name                       : portfolio-sample.pdf
[System]        Directory                       : .
[System]        File Size                       : 3.4 MB
[System]        File Modification Date/Time     : 2019:08:05 15:23:05-04:00
[System]        File Access Date/Time           : 2019:08:05 15:25:41-04:00
[System]        File Inode Change Date/Time     : 2019:08:05 15:23:10-04:00
[System]        File Permissions                : rw-rw-r--
[File]          File Type                       : PDF
[File]          File Type Extension             : pdf
[File]          MIME Type                       : application/pdf
[PDF]           PDF Version                     : 1.7
[PDF]           Linearized                      : No
[PDF]           Create Date                     : 2012:06:21 15:03:15-04:00
[PDF]           Creator                         : Adobe Acrobat Pro 10.1.3
[PDF]           Keywords                        : adobe, acrobat, xi, pdf, portfolio, sample
[PDF]           Modify Date                     : 2012:09:28 17:49:50-04:00
[PDF]           Producer                        : Adobe Acrobat Pro 10.1.3
[PDF]           Subject                         : Adobe Acrobat XI
[PDF]           Title                           : Sample PDF Portfolio
[PDF]           Language                        : en
[PDF]           Tagged PDF                      : Yes
[PDF]           Page Count                      : 1
[XMP-x]         XMP Toolkit                     : Adobe XMP Core 5.4-c005 78.147326, 2012/08/23-13:03:03
[XMP-xmp]       Modify Date                     : 2012:09:28 17:49:50-04:00
[XMP-xmp]       Create Date                     : 2012:06:21 15:03:15-04:00
[XMP-xmp]       Metadata Date                   : 2012:09:28 17:49:50-04:00
[XMP-xmp]       Creator Tool                    : Adobe Acrobat Pro 10.1.3
[XMP-dc]        Format                          : application/pdf
[XMP-dc]        Title                           : Sample PDF Portfolio
[XMP-dc]        Creator                         : 
[XMP-dc]        Description                     : Adobe Acrobat XI
[XMP-dc]        Subject                         : adobe, acrobat, xi, pdf, portfolio, sample
[XMP-xmpMM]     Document ID                     : uuid:2d7598db-3b0a-4510-bc0a-4ac1c570a3fa
[XMP-xmpMM]     Instance ID                     : uuid:153f73de-3b2a-4d04-ab31-bb46ec3a5b79
[XMP-pdf]       Producer                        : Adobe Acrobat Pro 10.1.3
[XMP-pdf]       Keywords                        : adobe, acrobat, xi, pdf, portfolio, sample

However, nothing in either output flags the PDF as an Adobe Portfolio PDF.

Answer 1

You can use the python-poppler Python module.

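from poppler import load_from_file

pdf_document = load_from_file("portfolio-sample.pdf")

# A portfolio stores its member files as embedded files (attachments).
# Note: an ordinary PDF can also carry attachments, so this flags both.
if pdf_document.has_embedded_files():
    print("PDF contains embedded files (possibly an Adobe Portfolio)")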
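
Because any PDF can carry attachments, the embedded-files check may reject ordinary PDFs too. A narrower test, offered here only as a sketch, is to look for a /Collection entry in the document catalog, which is how PDF 1.7 marks a collection (portfolio). The example below assumes the pypdf package is installed; the function names catalog_is_portfolio and pdf_is_portfolio are hypothetical helpers introduced for illustration, not part of any library.

```python
def catalog_is_portfolio(catalog):
    """Return True if a PDF document catalog declares a collection."""
    # PDF 1.7 marks a portfolio with a /Collection entry in the catalog.
    return "/Collection" in catalog


def pdf_is_portfolio(path):
    """Open the PDF at `path` and check its catalog for /Collection."""
    # Assumption: pypdf is installed. It is imported lazily so that
    # catalog_is_portfolio() stays usable without it.
    from pypdf import PdfReader

    reader = PdfReader(path)
    # The trailer's /Root entry is the document catalog dictionary.
    return catalog_is_portfolio(reader.trailer["/Root"])


if __name__ == "__main__":
    import sys

    # Usage: python check_portfolio.py portfolio-sample.pdf
    for pdf_path in sys.argv[1:]:
        label = "portfolio" if pdf_is_portfolio(pdf_path) else "regular PDF"
        print(f"{pdf_path}: {label}")
```

For an upload filter, this check could be combined with the embedded-files check above, depending on whether ordinary attachments should also be rejected.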
    
    
