Text processing: PDF processing with xpdf

Installation

For details, see
http://blog.csdn.net/u011334621/article/details/44059697

PDF-to-text tool (version 4.00)
Syntax
pdftotext [options] [PDF-file [text-file]]

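Reads the PDF file and writes a text file. If text-file is not specified, the output goes to a .txt file with the same name as the PDF. If text-file is "-", the text is printed to standard output.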
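As a rough illustration of the syntax above, the following Python sketch builds a pdftotext argument list. The helper names default_text_path and build_command are made up for this example and are not part of xpdf. The rule used for the default output name (strip a trailing .pdf or .PDF, then append .txt) is an assumption; the text above only says that the output shares the PDF's name.

```python
def default_text_path(pdf_path):
    """Guess the output name used when no text-file is given (assumed rule)."""
    if pdf_path.endswith((".pdf", ".PDF")):
        return pdf_path[:-4] + ".txt"
    return pdf_path + ".txt"


def build_command(pdf_path, text_path=None, options=()):
    """Build an argument list for: pdftotext [options] [PDF-file [text-file]]."""
    cmd = ["pdftotext", *options, pdf_path]
    if text_path is not None:
        cmd.append(text_path)  # pass "-" to print the text to standard output
    return cmd


if __name__ == "__main__":
    print(build_command("report.pdf"))
    print(build_command("report.pdf", "-", options=["-f", "1", "-l", "3"]))
    print(default_text_path("report.pdf"))
```

The list can be handed to subprocess.run if pdftotext is installed and on the PATH; the sketch itself only constructs the command.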

CONFIGURATION FILE

Pdftotext reads a configuration file at startup. It first tries to find the user’s private config file, ~/.xpdfrc. If that doesn’t exist, it looks for a system-wide config file, typically /usr/local/etc/xpdfrc (but this location can be changed when pdftotext is built). See the xpdfrc(5) man page for details.
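The lookup order described above can be sketched in Python. The function name find_config is hypothetical, and the system-wide path is only the typical default mentioned in the text; a given build may use a different location.

```python
import os


def find_config(home_dir, system_path="/usr/local/etc/xpdfrc"):
    """Return the config file that would be read, or None if neither exists."""
    user_path = os.path.join(home_dir, ".xpdfrc")
    if os.path.isfile(user_path):
        return user_path  # the user's private file takes priority
    if os.path.isfile(system_path):
        return system_path  # fall back to the system-wide file
    return None


if __name__ == "__main__":
    print(find_config(os.path.expanduser("~")))
```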

OPTIONS

Many of the following options can be set with configuration file commands. These are listed in square brackets with the description of the corresponding command line option.
-f number

Specifies the first page to convert.

-l number

Specifies the last page to convert.

-layout

Maintain (as best as possible) the original physical layout of the text. The default is to ‘undo’ physical layout (columns, hyphenation, etc.) and output the text in reading order. If the -fixed option is given, character spacing within each line will be determined by the specified character pitch.

-simple

Similar to -layout, but optimized for simple one-column pages. This mode will do a better job of maintaining horizontal spacing, but it will only work properly with a single column of text.

-table

Table mode is similar to physical layout mode, but optimized for tabular data, with the goal of keeping rows and columns aligned (at the expense of inserting extra whitespace). If the -fixed option is given, character spacing within each line will be determined by the specified character pitch.

-lineprinter

Line printer mode uses a strict fixed-character-pitch and -height layout. That is, the page is broken into a grid, and characters are placed into that grid. If the grid spacing is too small for the actual characters, the result is extra whitespace. If the grid spacing is too large, the result is missing whitespace. The grid spacing can be specified using the -fixed and -linespacing options. If one or both are not given on the command line, pdftotext will attempt to compute appropriate value(s).

-raw

Keep the text in content stream order. Depending on how the PDF file was generated, this may or may not be useful.

-fixed number

Specify the character pitch (character width), in points, for physical layout, table, or line printer mode. This is ignored in all other modes.

-linespacing number

Specify the line spacing, in points, for line printer mode. This is ignored in all other modes.
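To make the grid idea behind line printer mode more concrete, here is a small conceptual sketch in Python. It is not pdftotext's actual algorithm: the function place_on_grid, the (x, y, character) input format, and the top-left origin are all assumptions made for illustration. The pitch and line_spacing arguments play the roles of -fixed and -linespacing.

```python
def place_on_grid(chars, pitch, line_spacing):
    """Place (x, y, ch) tuples, in points from the top-left, onto a text grid."""
    cells = {}
    for x, y, ch in chars:
        col = int(x // pitch)          # column index from the character pitch
        row = int(y // line_spacing)   # row index from the line spacing
        cells[(row, col)] = ch
    if not cells:
        return []
    n_rows = max(row for row, _col in cells) + 1
    lines = []
    for r in range(n_rows):
        cols = [c for (rr, c) in cells if rr == r]
        width = max(cols) + 1 if cols else 0
        # Grid cells with no character become spaces.
        lines.append("".join(cells.get((r, c), " ") for c in range(width)))
    return lines


if __name__ == "__main__":
    sample = [(0, 0, "H"), (6, 0, "i"), (24, 0, "!"), (0, 12, "x")]
    print(place_on_grid(sample, 6, 12))
    print(place_on_grid(sample, 3, 12))  # smaller pitch: extra whitespace
```

Running it with a pitch smaller than the real character width spreads the characters over more columns, which matches the "extra whitespace" effect described above.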
-clip

Text which is hidden because of clipping is removed before doing layout, and then added back in. This can be helpful for tables where clipped (invisible) text would overlap the next column.

-nodiag

Diagonal text, i.e., text that is not close to one of the 0, 90, 180, or 270 degree axes, is discarded. This is useful to skip watermarks drawn on top of body text, etc.

-enc encoding-name

Sets the encoding to use for text output. The encoding-name must be defined with the unicodeMap command (see xpdfrc(5)). The encoding name is case-sensitive. This defaults to "Latin1" (which is a built-in encoding). [config file: textEncoding]

-eol unix | dos | mac

Sets the end-of-line convention to use for text output. [config file: textEOL]

-nopgbrk

Don’t insert page breaks (form feed characters) between pages. [config file: textPageBreaks]
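Since pages are separated by form feed characters unless -nopgbrk is given, the extracted text can be split back into pages afterwards. A minimal Python sketch follows; the function name split_pages is made up, and dropping a trailing empty chunk assumes a form feed may follow the last page.

```python
def split_pages(text):
    """Split pdftotext output into pages on form feed characters."""
    pages = text.split("\f")
    # A form feed after the last page leaves an empty final chunk; drop it.
    if pages and pages[-1] == "":
        pages.pop()
    return pages


if __name__ == "__main__":
    print(split_pages("page one\fpage two\f"))
```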

-bom

Insert a Unicode byte order marker (BOM) at the start of the text output.

-opw password

Specify the owner password for the PDF file. Providing this will bypass all security restrictions.

-upw password

Specify the user password for the PDF file.

-q

Don’t print any messages or errors. [config file: errQuiet]

-cfg config-file

Read config-file in place of ~/.xpdfrc or the system-wide config file.

-v

Print copyright and version information.

-h

Print usage information. (-help and --help are equivalent.)

BUGS

Some PDF files contain fonts whose encodings have been mangled beyond recognition. There is no way (short of OCR) to extract text from these files.

EXIT CODES

The Xpdf tools use the following exit codes:

  • 0 - No error
  • 1 - Error opening the PDF file
  • 2 - Error with the output file
  • 3 - Error related to PDF permissions
  • 99 - Other error
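When pdftotext is called from a script, the exit codes listed above can be turned into readable messages. The Python sketch below is one way to do that; the names EXIT_CODES and describe_exit_code are hypothetical, and the message strings simply restate the list.

```python
# Exit codes as listed above; the wording of each message is a paraphrase.
EXIT_CODES = {
    0: "No error",
    1: "Error opening the PDF file",
    2: "Error with the output file",
    3: "Error related to PDF permissions",
    99: "Other error",
}


def describe_exit_code(code):
    """Return a readable message for an Xpdf tool exit code."""
    return EXIT_CODES.get(code, "Unknown exit code %d" % code)


if __name__ == "__main__":
    for code in (0, 1, 3, 42):
        print(code, describe_exit_code(code))
```

In practice the code would come from something like subprocess.run(cmd).returncode.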

AUTHOR

The pdftotext software and documentation are copyright 1996-2017 Glyph & Cog, LLC.

SEE ALSO

xpdf(1), pdftops(1), pdftohtml(1), pdfinfo(1), pdffonts(1), pdfdetach(1), pdftoppm(1), pdftopng(1), pdfimages(1), xpdfrc(5)
http://www.xpdfreader.com/

xpdf configuration file

xpdfrc
NAME
DESCRIPTION
INCLUDE FILES
GENERAL FONT CONFIGURATION
POSTSCRIPT FONT CONFIGURATION
POSTSCRIPT CONTROL
TEXT CONTROL AND CHARACTER MAPPING
RASTERIZER SETTINGS
VIEWER SETTINGS
MISCELLANEOUS SETTINGS
EXAMPLES
FILES
AUTHOR
SEE ALSO

NAME

xpdfrc - configuration file for Xpdf tools (version 4.00)

DESCRIPTION

All of the Xpdf tools read a single configuration file. If you have a .xpdfrc file in your home directory, it will be read. Otherwise, a system-wide configuration file will be read from /usr/local/etc/xpdfrc, if it exists. (This is its default location; depending on build options, it may be placed elsewhere.) On Win32 systems, the xpdfrc file should be placed in the same directory as the executables.

The xpdfrc file consists of a series of configuration options, one per line. Blank lines and lines starting with a ‘#’ (comments) are ignored.

Arguments may be quoted, using "double-quote" characters, e.g., for file names that contain spaces.
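The file format described above (one option per line, blank lines and ‘#’ comments ignored, optional double quotes around arguments) can be approximated with a few lines of Python. This is only a sketch: parse_config is a made-up name, and details such as stripping leading whitespace before checking for ‘#’ are assumptions rather than documented Xpdf behavior.

```python
import re

# A token is either a double-quoted string or a run of non-space characters.
_TOKEN = re.compile(r'"([^"]*)"|(\S+)')


def parse_config(text):
    """Return a list of (command, [args]) tuples from xpdfrc-style text."""
    entries = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank lines and comments are ignored
        tokens = [
            m.group(1) if m.group(1) is not None else m.group(2)
            for m in _TOKEN.finditer(stripped)
        ]
        entries.append((tokens[0], tokens[1:]))
    return entries


if __name__ == "__main__":
    sample = '# text output\ntextEncoding UTF-8\nfontDir "/usr/local/my fonts"\n'
    print(parse_config(sample))
```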

The following sections list all of the configuration options, sorted into functional groups. There is an examples section at the end.

INCLUDE FILES

include config-file

Includes the specified config file. The effect of this is equivalent to inserting the contents of config-file directly into the parent config file in place of the include command. Config files can be nested arbitrarily deeply.
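The "insert the contents in place of the include command" behavior can be sketched as a recursive expansion. In the Python example below, expand_includes is a hypothetical helper; the max_depth guard is not an Xpdf limit (it only protects the sketch against include cycles), and include paths are used exactly as written, since how Xpdf resolves relative paths is not stated above.

```python
def expand_includes(path, _depth=0, max_depth=50):
    """Return a config file's lines with include commands replaced by file contents."""
    if _depth > max_depth:
        raise RecursionError("include nesting too deep at " + path)
    out = []
    with open(path, encoding="utf-8") as fh:
        for raw in fh:
            line = raw.rstrip("\n")
            parts = line.split(None, 1)
            if len(parts) == 2 and parts[0] == "include":
                # Strip optional double quotes around the file name.
                target = parts[1].strip().strip('"')
                out.extend(expand_includes(target, _depth + 1, max_depth))
            else:
                out.append(line)
    return out
```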

GENERAL FONT CONFIGURATION

fontFile PDF-font-name font-file

Maps a PDF font, PDF-font-name, to a font for display or PostScript output. The font file, font-file, can be any type allowed in a PDF file. This command can be used for 8-bit or 16-bit (CID) fonts.

fontDir dir

Specifies a search directory for font files. There can be multiple fontDir commands; all of the specified directories will be searched in order. The font files can be Type 1 (.pfa or .pfb) or TrueType (.ttf or .ttc); other files in the directory will be ignored. The font file name (not including the extension) must exactly match the PDF font name. This search is performed if the font name doesn’t match any of the fonts declared with the fontFile command. There are no default fontDir directories.
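The fontDir search can be pictured as follows: try each directory in order and accept only a file whose base name equals the PDF font name and whose extension is one of the listed font types. This Python sketch uses the made-up name find_font_file; the order in which extensions are tried inside one directory is an arbitrary choice of the example.

```python
import os

# Extensions named in the fontDir description above.
FONT_EXTENSIONS = (".pfa", ".pfb", ".ttf", ".ttc")


def find_font_file(font_name, font_dirs):
    """Return the first matching font file in font_dirs, or None."""
    for directory in font_dirs:  # directories are searched in order
        for ext in FONT_EXTENSIONS:
            candidate = os.path.join(directory, font_name + ext)
            if os.path.isfile(candidate):
                return candidate
    return None


if __name__ == "__main__":
    print(find_font_file("Univers", ["/usr/local/fonts/bakoma"]))
```

Note that on a case-insensitive file system the "exact match" requirement is looser in this sketch than the text suggests.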

fontFileCC registry-ordering font-file

Maps the registry-ordering character collection to a font for display or PostScript output. This mapping is used if the font name doesn’t match any of the fonts declared with the fontFile, fontDir, psResidentFont16, or psResidentFontCC commands.

POSTSCRIPT FONT CONFIGURATION

psFontPassthrough yes | no

If set to "yes", pass 8-bit font names through to the PostScript output without substitution. Fonts which are not embedded in the PDF file are expected to be available on the printer. This defaults to "no".

psResidentFont PDF-font-name PS-font-name

When the 8-bit font PDF-font-name is used (without embedding) in a PDF file, it will be translated to the PostScript font PS-font-name, which is assumed to be resident in the printer. Typically, PDF-font-name and PS-font-name are the same. By default, only the Base-14 fonts are assumed to be resident.

psResidentFont16 PDF-font-name wMode PS-font-name encoding

When the 16-bit (CID) font PDF-font-name with writing mode wMode is used (without embedding) in a PDF file, it will be translated to the PostScript font PS-font-name, which is assumed to be resident in the printer. The writing mode must be either ‘H’ for horizontal or ‘V’ for vertical. The resident font is assumed to use the specified encoding (which must have been defined with the unicodeMap command).

psResidentFontCC registry-ordering wMode PS-font-name encoding

When a 16-bit (CID) font using the registry-ordering character collection and wMode writing mode is used (without embedding) in a PDF file, the PostScript font, PS-font-name, is substituted for it. The substituted font is assumed to be resident in the printer. The writing mode must be either ‘H’ for horizontal or ‘V’ for vertical. The resident font is assumed to use the specified encoding (which must have been defined with the unicodeMap command).

psEmbedType1Fonts yes | no

If set to "no", prevents embedding of Type 1 fonts in generated PostScript. This defaults to "yes".

psEmbedTrueTypeFonts yes | no

If set to "no", prevents embedding of TrueType fonts in generated PostScript. This defaults to "yes".

psEmbedCIDTrueTypeFonts yes | no

If set to "no", prevents embedding of CID TrueType fonts in generated PostScript. For Level 3 PostScript, this generates a CID font, for lower levels it generates a non-CID composite font. This defaults to "yes".

psEmbedCIDPostScriptFonts yes | no

If set to "no", prevents embedding of CID PostScript fonts in generated PostScript. For Level 3 PostScript, this generates a CID font, for lower levels it generates a non-CID composite font. This defaults to "yes".

POSTSCRIPT CONTROL

psPaperSize width(pts) height(pts)

Sets the paper size for PostScript output. The width and height parameters give the paper size in PostScript points (1 point = 1/72 inch).
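Because psPaperSize takes PostScript points, sizes given in inches or millimetres need converting first (1 point = 1/72 inch). The Python helpers below are hypothetical names for this illustration, and rounding to whole points is an assumption of the sketch.

```python
def inches_to_points(inches):
    """Convert inches to PostScript points (72 points per inch)."""
    return inches * 72.0


def mm_to_points(mm):
    """Convert millimetres to PostScript points (25.4 mm per inch)."""
    return mm / 25.4 * 72.0


def ps_paper_size_line(width_pts, height_pts):
    """Format a psPaperSize line, rounding to whole points."""
    return "psPaperSize %d %d" % (round(width_pts), round(height_pts))


if __name__ == "__main__":
    # US letter: 8.5 x 11 inches
    print(ps_paper_size_line(inches_to_points(8.5), inches_to_points(11)))
    # A4: 210 x 297 mm
    print(ps_paper_size_line(mm_to_points(210), mm_to_points(297)))
```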

psPaperSize letter | legal | A4 | A3 | match

Sets the paper size for PostScript output to a standard size. The default paper size is set when xpdf and pdftops are built, typically to "letter" or "A4". This can also be set to "match", which will set the paper size to match the size specified in the PDF file.

psImageableArea llx lly urx ury

Sets the imageable area for PostScript output. The four integers are the coordinates of the lower-left and upper-right corners of the imageable region, specified in points (with the origin being the lower-left corner of the paper). This defaults to the full paper size; the psPaperSize option will reset the imageable area coordinates.

psCrop yes | no

If set to "yes", PostScript output is cropped to the CropBox specified in the PDF file; otherwise no cropping is done. This defaults to "yes".

psUseCropBoxAsPage yes | no

If set to "yes", PostScript output treats the CropBox as the page size. By default, this is "no", and the MediaBox is used as the page size.

psExpandSmaller yes | no

If set to "yes", PDF pages smaller than the PostScript imageable area are expanded to fill the imageable area. Otherwise, no scaling is done on smaller pages. This defaults to "no".

psShrinkLarger yes | no

If set to "yes", PDF pages larger than the PostScript imageable area are shrunk to fit the imageable area. Otherwise, no scaling is done on larger pages. This defaults to "yes".

psCenter yes | no

If set to "yes", PDF pages smaller than the PostScript imageable area (after any scaling) are centered in the imageable area. Otherwise, they are aligned at the lower-left corner of the imageable area. This defaults to "yes".
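The interaction of psExpandSmaller, psShrinkLarger, and psCenter can be sketched as a small calculation. This Python example is a simplified model, not pdftops's implementation: the function name fit_page is made up, page rotation is ignored, and uniform scaling that preserves the aspect ratio is assumed. The keyword defaults mirror the defaults stated above.

```python
def fit_page(page_w, page_h, area_w, area_h,
             expand_smaller=False, shrink_larger=True, center=True):
    """Return (scale, x_offset, y_offset) for placing a page in the imageable area."""
    # Scale factor at which the page would exactly fit the area.
    fit = min(area_w / page_w, area_h / page_h)
    scale = 1.0
    if fit < 1.0 and shrink_larger:
        scale = fit      # page is larger than the area: shrink it
    elif fit > 1.0 and expand_smaller:
        scale = fit      # page is smaller than the area: expand it
    if center:
        # Center along each axis where the scaled page leaves room.
        x_off = max(0.0, (area_w - page_w * scale) / 2.0)
        y_off = max(0.0, (area_h - page_h * scale) / 2.0)
    else:
        x_off = y_off = 0.0  # aligned at the lower-left corner
    return scale, x_off, y_off


if __name__ == "__main__":
    print(fit_page(306, 396, 612, 792))
    print(fit_page(1224, 1584, 612, 792))
```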

psDuplex yes | no

If set to "yes", the generated PostScript will set the "Duplex" pagedevice entry. This tells duplex-capable printers to enable duplexing. This defaults to "no".

psLevel level1 | level1sep | level2 | level2gray | level2sep | level3 | level3gray | level3Sep

Sets the PostScript level to generate. This defaults to "level2".

psPreload yes | no

If set to "yes", PDF forms are converted to PS procedures, and image data is preloaded. This uses more memory in the PostScript interpreter, but generates significantly smaller PS files in situations where, e.g., the same image is drawn on every page of a long document. This defaults to "no".

psOPI yes | no

If set to "yes", generates PostScript OPI comments for all images and forms which have OPI information. This option is only available if the Xpdf tools were compiled with OPI support. This defaults to "no".

psASCIIHex yes | no

If set to "yes", the ASCIIHexEncode filter will be used instead of ASCII85Encode for binary data. This defaults to "no".

psLZW yes | no

If set to "yes", the LZWEncode filter will be used for lossless compression in PostScript output; if set to "no", the RunLengthEncode filter will be used instead. LZW generates better compression (smaller PS files), but may not be supported by some printers. This defaults to "yes".

psUncompressPreloadedImages yes | no

If set to "yes", all preloaded images in PS files will be uncompressed. If set to "no", the original compressed images will be used when possible. The "yes" setting is useful to work around certain buggy PostScript interpreters. This defaults to "no".

psMinLineWidth float

Set the minimum line width, in points, for PostScript output. The default value is 0 (no minimum).

psRasterResolution float

Set the resolution (in dpi) for rasterized pages in PostScript output. (Pdftops will rasterize pages which use transparency.) This defaults to 300.

psRasterMono yes | no

If set to "yes", rasterized pages in PS files will be monochrome (8-bit gray) instead of color. This defaults to "no".

psRasterSliceSize pixels

When rasterizing pages, pdftops splits the page into horizontal "slices", to limit memory usage. This option sets the maximum slice size, in pixels. This defaults to 20000000 (20 million).

psAlwaysRasterize yes | no

If set to "yes", all PostScript output will be rasterized. This defaults to "no".

psNeverRasterize yes | no

Pdftops rasterizes any pages that use transparency (because PostScript doesn’t support transparency). If psNeverRasterize is set to "yes", rasterization is disabled: pages will never be rasterized, even if they contain transparency. This will likely result in incorrect output for PDF files that use transparency, and a warning message to that effect will be printed. This defaults to "no".

fontDir dir

See the description above, in the GENERAL FONT CONFIGURATION section.

TEXT CONTROL AND CHARACTER MAPPING

textEncoding encoding-name

Sets the encoding to use for text output. (This can be overridden with the "-enc" switch on the command line.) The encoding-name must be defined with the unicodeMap command (see below). This defaults to "Latin1".

textEOL unix | dos | mac

Sets the end-of-line convention to use for text output. The options are:

unix = LF
dos = CR+LF
mac = CR

(This can be overridden with the "-eol" switch on the command line.) The default value is based on the OS where xpdf and pdftotext were built.
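The three end-of-line conventions map to concrete character sequences, which makes it straightforward to convert already extracted text from one convention to another. The Python sketch below uses the made-up names EOL_CHARS and convert_eol.

```python
# End-of-line sequences for the three textEOL values listed above.
EOL_CHARS = {"unix": "\n", "dos": "\r\n", "mac": "\r"}


def convert_eol(text, style):
    """Rewrite all line endings in text to the given textEOL style."""
    eol = EOL_CHARS[style]
    # Normalize every existing ending to LF first, then apply the target.
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    return normalized.replace("\n", eol)


if __name__ == "__main__":
    print(repr(convert_eol("a\nb\r\nc\rd", "dos")))
```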

textPageBreaks yes | no

If set to "yes", text extraction will insert page breaks (form feed characters) between pages. This defaults to "yes".

textKeepTinyChars yes | no

If set to "yes", text extraction will keep all characters. If set to "no", text extraction will discard tiny (smaller than 3 point) characters after the first 50000 per page, avoiding extremely slow run times for PDF files that use special fonts to do shading or cross-hatching. This defaults to "yes".

nameToUnicode map-file

Specifies a file with the mapping from character names to Unicode. This is used to handle PDF fonts that have valid encodings but no ToUnicode entry. Each line of a nameToUnicode file looks like this:

hex-string name

The hex-string is the Unicode (UCS-2) character index, and name is the corresponding character name. Multiple nameToUnicode files can be used; if a character name is given more than once, the code in the last specified file is used. There is a built-in default nameToUnicode table with all of Adobe’s standard character names.
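A reader for this file format, including the "last specified file wins" rule, might look like the Python sketch below. The function name load_name_to_unicode and the sample entries are invented for illustration; lines that do not have exactly two fields are skipped, which is an assumption of the sketch.

```python
def load_name_to_unicode(*file_texts):
    """Merge nameToUnicode file contents; later files override earlier ones."""
    table = {}
    for text in file_texts:
        for line in text.splitlines():
            parts = line.split()
            if len(parts) != 2:
                continue  # skip lines that are not "hex-string name"
            hex_string, name = parts
            table[name] = int(hex_string, 16)
    return table


if __name__ == "__main__":
    first = "0e01 kokaithai\n"             # sample data, not a real support file
    second = "0041 A\n0e02 kokaithai\n"
    print(load_name_to_unicode(first, second))
```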

cidToUnicode registry-ordering map-file

Specifies the file with the mapping from character collection to Unicode. Each line of a cidToUnicode file represents one character:

hex-string

The hex-string is the Unicode (UCS-2) index for that character. The first line maps CID 0, the second line CID 1, etc. File size is determined by size of the character collection. Only one file is allowed per character collection; the last specified file is used. There are no built-in cidToUnicode mappings.
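Since the line number is the CID, a cidToUnicode file maps naturally onto a list indexed by CID. A minimal Python sketch, with the hypothetical name load_cid_to_unicode and made-up sample data:

```python
def load_cid_to_unicode(text):
    """Return a list where index N holds the Unicode index for CID N."""
    return [int(line, 16) for line in text.splitlines()]


if __name__ == "__main__":
    sample = "0000\n0020\n3042\n"  # sample data: CID 0, 1 and 2
    table = load_cid_to_unicode(sample)
    print(hex(table[2]))
```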

unicodeToUnicode font-name-substring map-file

This is used to work around PDF fonts which have incorrect Unicode information. It specifies a file which maps from the given (incorrect) Unicode indexes to the correct ones. The mapping will be used for any font whose name contains font-name-substring. Each line of a unicodeToUnicode file represents one Unicode character:

in-hex out-hex1 out-hex2 ...

The in-hex field is an input (incorrect) Unicode index, and the rest of the fields are one or more output (correct) Unicode indexes. Each occurrence of in-hex will be converted to the specified output sequence.
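The one-to-many replacement described above can be sketched in Python as a dictionary from one code point to a list of code points. The names load_unicode_to_unicode and fix_text, and the sample mapping of a private-use code point to the letters "f" and "i", are assumptions for illustration only.

```python
def load_unicode_to_unicode(text):
    """Parse 'in-hex out-hex1 out-hex2 ...' lines into a dict of code point lists."""
    mapping = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # need one input and at least one output
        mapping[int(fields[0], 16)] = [int(f, 16) for f in fields[1:]]
    return mapping


def fix_text(text, mapping):
    """Replace each mapped character with its corrected output sequence."""
    out = []
    for ch in text:
        replacement = mapping.get(ord(ch))
        if replacement is None:
            out.append(ch)
        else:
            out.extend(chr(cp) for cp in replacement)
    return "".join(out)


if __name__ == "__main__":
    mapping = load_unicode_to_unicode("f001 0066 0069\n")  # sample data
    print(fix_text("\uf001x", mapping))
```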

unicodeMap encoding-name map-file

Specifies the file with mapping from Unicode to encoding-name. These encodings are used for text output (see textEncoding above). Each line of a unicodeMap file represents a range of one or more Unicode characters which maps linearly to a range in the output encoding:

in-start-hex in-end-hex out-start-hex

Entries for single characters can be abbreviated to:

in-hex out-hex

The in-start-hex and in-end-hex fields (or the single in-hex field) specify the Unicode range. The out-start-hex field (or the out-hex field) specifies the start of the output encoding range. The length of the out-start-hex (or out-hex) string determines the length of the output characters (e.g., UTF-8 uses different numbers of bytes to represent characters in different ranges). Entries must be given in increasing Unicode order. Only one file is allowed per encoding; the last specified file is used. The Latin1, ASCII7, Symbol, ZapfDingbats, UTF-8, and UCS-2 encodings are predefined.
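The linear range mapping can be illustrated with a short Python sketch. It assumes that two hex digits in out-start-hex correspond to one output byte, which is how the "length of the string determines the length of the output characters" rule is read here. The names load_unicode_map and encode_char, the sample ranges, and the choice to return empty bytes for unmapped characters are all assumptions of the example.

```python
def load_unicode_map(text):
    """Parse unicodeMap lines into (start, end, out_start, n_bytes) tuples."""
    ranges = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            start, end, out = fields
        elif len(fields) == 2:
            start, out = fields
            end = start  # abbreviated single-character entry
        else:
            continue
        n_bytes = len(out) // 2  # assumed: two hex digits per output byte
        ranges.append((int(start, 16), int(end, 16), int(out, 16), n_bytes))
    return ranges


def encode_char(code_point, ranges):
    """Encode one Unicode code point; returns b'' if no range covers it."""
    for start, end, out_start, n_bytes in ranges:
        if start <= code_point <= end:
            # Linear mapping: same offset within the output range.
            return (out_start + (code_point - start)).to_bytes(n_bytes, "big")
    return b""


if __name__ == "__main__":
    sample = "0000 007f 00\n00a0 00ff a0\n20ac a4\n"  # sample data
    ranges = load_unicode_map(sample)
    print(encode_char(ord("A"), ranges), encode_char(0x20AC, ranges))
```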

cMapDir registry-ordering dir

Specifies a search directory, dir, for CMaps for the registry-ordering character collection. There can be multiple directories for a particular collection. There are no default CMap directories.

toUnicodeDir dir

Specifies a search directory, dir, for ToUnicode CMaps. There can be multiple ToUnicode directories. There are no default ToUnicode directories.

mapNumericCharNames yes | no

If set to "yes", the Xpdf tools will attempt to map various numeric character names sometimes used in font subsets. In some cases this leads to usable text, and in other cases it leads to gibberish -- there is no way for Xpdf to tell. This defaults to "yes".

mapUnknownCharNames yes | no

If set to "yes", and mapNumericCharNames is set to "no", the Xpdf tools will apply a simple pass-through mapping (Unicode index = character code) for all unrecognized glyph names. (For CID fonts, setting mapNumericCharNames to "no" is unnecessary.) In some cases, this leads to usable text, and in other cases it leads to gibberish -- there is no way for Xpdf to tell. This defaults to "no".

mapExtTrueTypeFontsViaUnicode yes | no

When rasterizing text using an external TrueType font, there are two options for handling character codes. If mapExtTrueTypeFontsViaUnicode is set to "yes", Xpdf will use the font encoding/ToUnicode info to map character codes to Unicode, and then use the font’s Unicode cmap to map Unicode to GIDs. If mapExtTrueTypeFontsViaUnicode is set to "no", Xpdf will assume the character codes are GIDs (i.e., use an identity mapping). This defaults to "yes".

RASTERIZER SETTINGS

enableFreeType yes | no

Enables or disables use of FreeType (a TrueType / Type 1 font rasterizer). This is only relevant if the Xpdf tools were built with FreeType support. ("enableFreeType" replaces the old "freetypeControl" option.) This option defaults to "yes".

disableFreeTypeHinting yes | no

If this is set to "yes", FreeType hinting will be forced off. This option defaults to "no".

antialias yes | no

Enables or disables font anti-aliasing in the PDF rasterizer. This option affects all font rasterizers. ("antialias" replaces the anti-aliasing control provided by the old "t1libControl" and "freetypeControl" options.) This defaults to "yes".

vectorAntialias yes | no

Enables or disables anti-aliasing of vector graphics in the PDF rasterizer. This defaults to "yes".

antialiasPrinting yes | no

If this is "yes", bitmaps sent to the printer will be antialiased (according to the "antialias" and "vectorAntialias" settings). If this is "no", printed bitmaps will not be antialiased. This defaults to "no".

strokeAdjust yes | no | cad

Sets the stroke adjustment mode. If set to "no", no stroke adjustment will be done. If set to "yes", normal stroke adjustment will be done: horizontal and vertical lines will be moved by up to half a pixel to make them look cleaner when vector anti-aliasing is enabled. If set to "cad", a slightly different stroke adjustment algorithm will be used to ensure that lines of the same original width will always have the same adjusted width (at the expense of allowing gaps and overlaps between adjacent lines). This defaults to "yes".

screenType dispersed | clustered | stochasticClustered

Sets the halftone screen type, which will be used when generating a monochrome (1-bit) bitmap. The three options are dispersed-dot dithering, clustered-dot dithering (with a round dot and 45-degree screen angle), and stochastic clustered-dot dithering. By default, "stochasticClustered" is used for resolutions of 300 dpi and higher, and "dispersed" is used for resolutions lower than 300 dpi.

screenSize integer

Sets the size of the (square) halftone screen threshold matrix. By default, this is 4 for dispersed-dot dithering, 10 for clustered-dot dithering, and 100 for stochastic clustered-dot dithering.

screenDotRadius integer

Sets the halftone screen dot radius. This is only used when screenType is set to stochasticClustered, and it defaults to 2. In clustered-dot mode, the dot radius is half of the screen size. Dispersed-dot dithering doesn’t have a dot radius.

screenGamma float

Sets the halftone screen gamma correction parameter. Gamma values greater than 1 make the output brighter; gamma values less than 1 make it darker. The default value is 1.

screenBlackThreshold float

When halftoning, all values below this threshold are forced to solid black. This parameter is a floating point value between 0 (black) and 1 (white). The default value is 0.

screenWhiteThreshold float

When halftoning, all values above this threshold are forced to solid white. This parameter is a floating point value between 0 (black) and 1 (white). The default value is 1.

minLineWidth float

Set the minimum line width, in device pixels. This affects the rasterizer only, not the PostScript converter (except when it uses rasterization to handle transparency). The default value is 0 (no minimum).

enablePathSimplification yes | no

If set to "yes", simplify paths by removing points where it won’t make a significant difference to the shape. The default value is "no".

overprintPreview yes | no

If set to "yes", generate overprint preview output, honoring the OP/op/OPM settings in the PDF file. Ignored for non-CMYK output. The default value is "no".

VIEWER SETTINGS

These settings only apply to the Xpdf GUI PDF viewer.

initialZoom percentage | page | width

Sets the initial zoom factor. A number specifies a zoom percentage, where 100 means 72 dpi. You may also specify ‘page’, to fit the page to the window size, or ‘width’, to fit the page width to the window width.

defaultFitZoom percentage

If xpdf is started with fit-page or fit-width zoom and no window geometry, it will calculate a desired window size based on the PDF page size and this defaultFitZoom value. I.e., the window size will be chosen such that exactly one page will fit in the window at this zoom factor (which must be a percentage). The default value is based on the screen resolution.

initialSidebarState yes | no

If set to "yes", xpdf opens with the sidebar (tabs, outline, etc.) visible. If set to "no", xpdf opens with the sidebar collapsed. The default is "no".

paperColor color

Set the "paper color", i.e., the background of the page display. The color can be #RRGGBB (hexadecimal) or a named color. This option will not work well with PDF files that do things like filling in white behind the text.

matteColor color

Set the matte color, i.e., the color used for background outside the actual page area. The color can be #RRGGBB (hexadecimal) or a named color.

fullScreenMatteColor color

Set the matte color for full-screen mode. The color can be #RRGGBB (hexadecimal) or a named color.

popupMenuCmd title command ...

Add a command to the popup menu. Title is the text to be displayed in the menu. Command is an Xpdf command (see the COMMANDS section of the xpdf(1) man page for details). Multiple commands are separated by whitespace.

maxTileWidth pixels

Set the maximum width of tiles to be used by xpdf when rasterizing pages. This defaults to 1500.

maxTileHeight pixels

Set the maximum height of tiles to be used by xpdf when rasterizing pages. This defaults to 1500.

tileCacheSize tiles

Set the maximum number of tiles to be cached by xpdf when rasterizing pages. This defaults to 10.

workerThreads numThreads

Set the number of worker threads to be used by xpdf when rasterizing pages. This defaults to 1.

launchCommand command

Sets the command executed when you click on a "launch"-type link. The intent is for the command to be a program/script which determines the file type and runs the appropriate viewer. The command line will consist of the file to be launched, followed by any parameters specified with the link. Do not use "%s" in "command". By default, this is unset, and Xpdf will simply try to execute the file (after prompting the user).

movieCommand command

Sets the command executed when you click on a movie annotation. The string "%s" will be replaced with the movie file name. This has no default value.

bind modifiers-key context command ...

Add a key or mouse button binding. Modifiers can be zero or more of:

shift-
ctrl-
alt-

Key can be a regular ASCII character, or any one of:

space
tab
return
enter
backspace
esc
insert
delete
home
end
pgup
pgdn
left / right / up / down (arrow keys)
f1 .. f35 (function keys)
mousePress1 .. mousePress7 (mouse buttons)
mouseRelease1 .. mouseRelease7 (mouse buttons)
mouseClick1 .. mouseClick7 (mouse buttons)

Context is either "any" or a comma-separated combination of:

fullScreen / window (full screen mode on/off)
continuous / singlePage (continuous mode on/off)
overLink / offLink (mouse over link or not)
scrLockOn / scrLockOff (scroll lock on/off)

The context string can include only one of each pair in the above list.

Command is an Xpdf command (see the COMMANDS section of the xpdf(1) man page for details). Multiple commands are separated by whitespace.

The bind command replaces any existing binding, but only if it was defined for the exact same modifiers, key, and context. All tokens (modifiers, key, context, commands) are case-sensitive.

Example key bindings:

# bind ctrl-a in any context to the nextPage
# command
bind ctrl-a any nextPage

# bind uppercase B, when in continuous mode
# with scroll lock on, to the reload command
# followed by the prevPage command
bind B continuous,scrLockOn reload prevPage

See the xpdf(1) man page for more examples.
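To show how the pieces of a bind line fit together, the following Python sketch splits a line into modifiers, key, context, and commands, and rejects a context that names both members of a pair. It is only an illustration: parse_bind and split_modifiers are made-up names, and the sketch does not validate key names or commands.

```python
MODIFIERS = ("shift-", "ctrl-", "alt-")
CONTEXT_PAIRS = [
    ("fullScreen", "window"),
    ("continuous", "singlePage"),
    ("overLink", "offLink"),
    ("scrLockOn", "scrLockOff"),
]


def split_modifiers(key_spec):
    """Split 'ctrl-a' into (['ctrl'], 'a'); the key keeps at least one character."""
    modifiers = []
    rest = key_spec
    while True:
        for mod in MODIFIERS:
            if rest.startswith(mod) and len(rest) > len(mod):
                modifiers.append(mod[:-1])
                rest = rest[len(mod):]
                break
        else:
            return modifiers, rest  # no further modifier prefix found


def parse_bind(line):
    """Parse 'bind modifiers-key context command ...' into a dict."""
    tokens = line.split()
    if len(tokens) < 4 or tokens[0] != "bind":
        raise ValueError("expected: bind modifiers-key context command ...")
    modifiers, key = split_modifiers(tokens[1])
    context = tokens[2]
    if context != "any":
        parts = context.split(",")
        known = {name for pair in CONTEXT_PAIRS for name in pair}
        for part in parts:
            if part not in known:
                raise ValueError("unknown context: " + part)
        for first, second in CONTEXT_PAIRS:
            if first in parts and second in parts:
                raise ValueError("context uses both %s and %s" % (first, second))
    return {"modifiers": modifiers, "key": key,
            "context": context, "commands": tokens[3:]}


if __name__ == "__main__":
    print(parse_bind("bind ctrl-a any nextPage"))
    print(parse_bind("bind B continuous,scrLockOn reload prevPage"))
```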

unbind modifiers-key context

Removes a key binding established with the bind command. This is most useful to remove default key bindings before establishing new ones (e.g., if the default key binding is given for "any" context, and you want to create new key bindings for multiple contexts).

MISCELLANEOUS SETTINGS

drawAnnotations yes | no

If set to "no", annotations will not be drawn or printed. The default value is "yes".

drawFormFields yes | no

If set to "no", form fields will not be drawn or printed. The default value is "yes".

enableXFA yes | no

If set to "yes", an XFA form (if present) will be rendered in place of an AcroForm. If "no", an XFA form will never be rendered. This defaults to "yes".

printCommands yes | no

If set to "yes", drawing commands are printed as they’re executed (useful for debugging). This defaults to "no".

errQuiet yes | no

If set to "yes", this suppresses all error and warning messages from all of the Xpdf tools. This defaults to "no".

EXAMPLES

The following is a sample xpdfrc file.

# from the Thai support package

nameToUnicode /usr/local/share/xpdf/Thai.nameToUnicode

# from the Japanese support package

cidToUnicode Adobe-Japan1 /usr/local/share/xpdf/Adobe-Japan1.cidToUnicode
unicodeMap JISX0208 /usr/local/share/xpdf/JISX0208.unicodeMap
cMapDir Adobe-Japan1 /usr/local/share/xpdf/cmap/Adobe-Japan1

# use the Base-14 Type 1 fonts from ghostscript

fontFile Times-Roman /usr/local/share/ghostscript/fonts/n021003l.pfb
fontFile Times-Italic /usr/local/share/ghostscript/fonts/n021023l.pfb
fontFile Times-Bold /usr/local/share/ghostscript/fonts/n021004l.pfb
fontFile Times-BoldItalic /usr/local/share/ghostscript/fonts/n021024l.pfb
fontFile Helvetica /usr/local/share/ghostscript/fonts/n019003l.pfb
fontFile Helvetica-Oblique /usr/local/share/ghostscript/fonts/n019023l.pfb
fontFile Helvetica-Bold /usr/local/share/ghostscript/fonts/n019004l.pfb
fontFile Helvetica-BoldOblique /usr/local/share/ghostscript/fonts/n019024l.pfb
fontFile Courier /usr/local/share/ghostscript/fonts/n022003l.pfb
fontFile Courier-Oblique /usr/local/share/ghostscript/fonts/n022023l.pfb
fontFile Courier-Bold /usr/local/share/ghostscript/fonts/n022004l.pfb
fontFile Courier-BoldOblique /usr/local/share/ghostscript/fonts/n022024l.pfb
fontFile Symbol /usr/local/share/ghostscript/fonts/s050000l.pfb
fontFile ZapfDingbats /usr/local/share/ghostscript/fonts/d050000l.pfb

# use the Bakoma Type 1 fonts
# (this assumes they happen to be installed in /usr/local/fonts/bakoma)

fontDir /usr/local/fonts/bakoma

# set some PostScript options

psPaperSize letter
psDuplex no
psLevel level2
psEmbedType1Fonts yes
psEmbedTrueTypeFonts yes

# assume that the PostScript printer has the Univers and
# Univers-Bold fonts

psResidentFont Univers Univers
psResidentFont Univers-Bold Univers-Bold

# set the text output options

textEncoding UTF-8
textEOL unix

# misc options

enableFreeType yes
launchCommand viewer-script

FILES

/usr/local/etc/xpdfrc

This is the default location for the system-wide configuration file. Depending on build options, it may be placed elsewhere.

$HOME/.xpdfrc

This is the user’s configuration file. If it exists, it will be read in place of the system-wide file.

AUTHOR

The Xpdf software and documentation are copyright 1996-2017 Glyph & Cog, LLC.

SEE ALSO

xpdf(1), pdftops(1), pdftotext(1), pdftohtml(1), pdfinfo(1), pdffonts(1), pdfdetach(1), pdftoppm(1), pdftopng(1), pdfimages(1)
http://www.xpdfreader.com/
