Generate non-obfuscated binary content for PDF files
When I open a PDF file compiled from LaTeX in a text editor (e.g. Notepad++), the content of the file is not human-readable and looks like the sample below, so it seems to me that the information cannot be processed by potential crawlers.
nÃÂÃÂ¥9.â^ÃÂäùàÃÂèâ¢â HTÃÂâ¢ì#ò à ½âÂÂ}qâÂÂmÃÂÃ¤à  9ÃÂrbtRà ¡Ã¡â¢gâÂÂû}AçúænÃÂÃÂâ¦â¡ÃÂâÂÂjKà ÂàFàÃÂõÃÂmþÃ¥áâ¢N:úâÂÂ~éWFöDXâ¹m#âÂÂDÃÂÃÂm;ÃÂum?Oà  ÃÂÃÂâÃÂÃÂ[ÃÂuóõÃÂ÷;ÃÂ6"-@pñÃÂäÃÂ(ÃÂXÃÂÃÂÃÂaà ÂyýûdRìørêÃÂbÃÂò(n^ÃÂþ2ÃÂÃÂ;ì÷êûæÃÂv0þîñúÃÂY'ðóýâ¹%â¦ÃÂÃ¥úÃÂÃºà ¸KÃÂ¥ÃÂìööêæñÃÂ_âÂÂáúê âÂÂò1üj9ö,ÃÂÃÂVæYüwæìDöð}]
Is it possible to generate the PDF file so that when the document displays a piece of information such as "Specific detail 1", I can also find the string "Specific detail 1" in the binary content of the file when I open it with a text editor?
This is useful, for example, when a PDF resume is created in LaTeX and must be automatically parsed by various text analyzers.
Welcome to TeX.SX! What you see is called compression and contains the textual output (pdftotext etc. are able to process it). Which crawler do you refer to that does not support compression?
– TeXnician
Aug 29 at 9:22
I didn't know that crawlers support reading compressed PDFs. I assumed that if it's not human readable then it's also not crawler readable.
– Alexandru Irimiea
Aug 29 at 22:12
asked Aug 29 at 9:20
Alexandru Irimiea
1 Answer
For pdfTeX:
\pdfcompresslevel = 0 %
\pdfobjcompresslevel = 0 %
For LuaTeX:
\pdfvariable compresslevel = 0 %
\pdfvariable objcompresslevel = 0 %
For use with (x)dvipdfmx (XeTeX, upTeX, etc.):
\special{dvipdfmx:config z 0}
\special{dvipdfmx:config C 0x40}
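To show where these settings might go, here is a minimal sketch of a complete document for pdfLaTeX or LuaLaTeX. It assumes the iftex package is available to pick the right branch for the engine; the package choice and the sample sentence are illustrative and not part of the answer itself.

```latex
% Minimal sketch: compile with pdflatex or lualatex.
\documentclass{article}
\usepackage{iftex} % assumption: used only to detect the engine

\ifPDFTeX
  \pdfcompresslevel=0    % leave page content streams uncompressed
  \pdfobjcompresslevel=0 % do not pack objects into compressed object streams
\fi
\ifLuaTeX
  \pdfvariable compresslevel=0
  \pdfvariable objcompresslevel=0
\fi

\begin{document}
My super important text.
\end{document}
```

With compression switched off, the page content should appear as plain PDF operators when the file is opened in a text editor. The sentence will likely still be split into kerned fragments rather than stored as one contiguous string, so a plain text search for the full phrase may not match.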
Probably in the near future I'll add an interface for this to expl3.
– Joseph Wright♦
Aug 29 at 9:25
+1, but I wonder whether a crawler that doesn't support PDF compression likes BT /F8 9.9626 Tf 148.712 707.125 Td [(My)-333(sup)-28(er)-333(imp)-28(ortan)28(t)-334(text.)]TJ 154.421 -567.87 Td [(1)]TJ ET …
– TeXnician
Aug 29 at 9:27
@TeXnician Sure, we can't do that much about that!
– Joseph Wright♦
Aug 29 at 9:28
@AlexG Sure, but the point is that if a crawler can't understand PDF compression, only 'text' in a PDF, it probably can't follow the kerning and whatever either
– Joseph Wright♦
Aug 29 at 10:14
answered Aug 29 at 9:25
Joseph Wright♦