Text
This chapter describes the Bigloo API for processing texts.
BibTeX
bibtex obj  [bigloo text function]
bibtex-port input-port  [bigloo text function]
bibtex-file file-name  [bigloo text function]
bibtex-string string  [bigloo text function]
These functions parse BibTeX sources. The argument obj can be either an input port or a string denoting a file name. bibtex returns a list of BibTeX entries. The functions bibtex-port, bibtex-file, and bibtex-string are mere wrappers that invoke bibtex.
Example:
(bibtex (open-input-string "@book{ as:sicp,
   author = {Abelson, H. and Sussman, G.},
   title = {Structure and Interpretation of Computer Programs},
   year = 1985,
   publisher = {MIT Press},
   address = {Cambridge, Mass., USA},
}"))
   ⇒ (("as:sicp" BOOK
        (author ("Abelson" "H.") ("Sussman" "G."))
        (title . "Structure and Interpretation of Computer Programs")
        (year . "1985")
        (publisher . "MIT Press")
        (address . "Cambridge, Mass., USA")))
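The result shown above is an ordinary list, so individual fields can be read with standard list operations. The sketch below assumes every entry has the shape (key type field ...) seen in that result. The names entry-key, entry-type and entry-field are hypothetical helpers, not part of the Bigloo text library.

```scheme
;; Entry copied from the documented result of `bibtex`.
;; Assumed shape: (key type (field . value) ...).
(define sicp
   '("as:sicp" BOOK
     (author ("Abelson" "H.") ("Sussman" "G."))
     (title . "Structure and Interpretation of Computer Programs")
     (year . "1985")
     (publisher . "MIT Press")
     (address . "Cambridge, Mass., USA")))

;; Hypothetical helpers (not provided by Bigloo).
(define (entry-key entry) (car entry))
(define (entry-type entry) (cadr entry))

(define (entry-field entry name)
   ;; The fields start after the key and the type.
   ;; Returns #f when the field is absent.
   (let ((cell (assq name (cddr entry))))
      (and cell (cdr cell))))

(entry-key sicp)              ; "as:sicp"
(entry-field sicp 'year)      ; "1985"
(entry-field sicp 'author)    ; (("Abelson" "H.") ("Sussman" "G."))
(entry-field sicp 'isbn)      ; #f
```

Returning #f for a missing field keeps the helper usable in conditionals, at the cost of not distinguishing an absent field from one whose value is #f.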
bibtex-parse-authors string  [bigloo text function]
This function parses the author field of a BibTeX entry.
Example:
(bibtex-parse-authors "Abelson, H. and Sussman, G.")
   ⇒ (("Abelson" "H.") ("Sussman" "G."))
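A parsed author list can be turned back into display text with a few lines of plain Scheme. This is only a sketch: it assumes each author is a list of the form (last-name first-name), as in the example result, and the names author->string and authors->string are hypothetical.

```scheme
;; Hypothetical helpers; not part of the Bigloo text library.
;; Assumption: each author is (last-name first-name), as in the example.
(define (author->string author)
   (if (null? (cdr author))
       (car author)                                  ; last name only
       (string-append (cadr author) " " (car author))))

(define (authors->string authors)
   (cond
      ((null? authors) "")
      ((null? (cdr authors)) (author->string (car authors)))
      (else (string-append (author->string (car authors))
                           " and "
                           (authors->string (cdr authors))))))

(authors->string '(("Abelson" "H.") ("Sussman" "G.")))
;; "H. Abelson and G. Sussman"
```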
Character strings
hyphenate word hyphens  [bigloo text function]
The function hyphenate accepts as input a single word and returns as output a list of subwords. The argument hyphens is an opaque data structure obtained by calling the function load-hyphens or make-hyphens.
Example:
(hyphenate "software" (load-hyphens 'en)) ⇒ ("soft" "ware")
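Since hyphenate returns a list of subwords, a caller usually has to join them again with some break marker. The sketch below does this in plain Scheme on the documented result; join-subwords is a hypothetical helper, not a Bigloo function.

```scheme
;; Hypothetical helper: join the subwords returned by `hyphenate`
;; with a separator string marking the allowed break points.
(define (join-subwords subwords sep)
   (if (null? subwords)
       ""
       (let loop ((rest (cdr subwords))
                  (acc (car subwords)))
          (if (null? rest)
              acc
              (loop (cdr rest)
                    (string-append acc sep (car rest)))))))

;; Using the documented result of (hyphenate "software" (load-hyphens 'en)):
(join-subwords '("soft" "ware") "-")   ; "soft-ware"
```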
load-hyphens obj  [bigloo text function]
Loads a hyphens table and returns a data structure suitable for hyphenate. The argument obj can be either the name of a file containing a hyphens table or a symbol denoting a pre-defined hyphens table. Currently, Bigloo supports two tables: en for an English table and fr for a French table. The procedure load-hyphens invokes make-hyphens to build the hyphens table.
Example:
(define (hyphenate-text text lang)
   (let ((table (with-handler
                   (lambda (e)
                      (unless (&io-file-not-found-error? e)
                         (raise e)))
                   (load-hyphens lang)))
         (words (string-split text " ")))
      (if table
          (append-map (lambda (w) (hyphenate w table)) words)
          words)))
The procedure hyphenate-text hyphenates the words of text according to the rules for the language denoted by the code lang, provided there is a file lang-hyphens.sch. If there is no such file, the text remains un-hyphenated.
make-hyphens [:language] [:exceptions] [:patterns]  [bigloo text function]
Creates a hyphens table out of the arguments exceptions and patterns. The implementation of the table of hyphens created by make-hyphens closely follows Frank Liang's algorithm as published in his doctoral dissertation Word Hy-phen-a-tion By Com-pu-ter, available on the TeX Users Group site at http://www.tug.org/docs/liang/. This table is a trie (see http://en.wikipedia.org/wiki/Trie for a definition and an explanation).
Most of this implementation is borrowed from Phil Bewig's work, available at http://sites.google.com/site/schemephil/ along with his paper describing the program.
exceptions must be a non-empty list of explicitly hyphenated words. Explicitly hyphenated words are like the following: "as-so-ciate", "as-so-ciates", "dec-li-na-tion", where the hyphens indicate the places where hyphenation is allowed. The words in exceptions are used to generate hyphenation patterns, which are added to patterns (described below).
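To make the exception format concrete, the following plain Scheme sketch recovers the pieces that an explicitly hyphenated word encodes. It only illustrates the notation and does not reproduce how make-hyphens processes exceptions internally; split-on-hyphens is a hypothetical name.

```scheme
;; Hypothetical helper: split an explicitly hyphenated word
;; at its hyphens, yielding the pieces between allowed break points.
(define (split-on-hyphens word)
   (let loop ((chars (string->list word))
              (current '())     ; characters of the piece being read, reversed
              (parts '()))      ; completed pieces, reversed
      (cond
         ((null? chars)
          (reverse (cons (list->string (reverse current)) parts)))
         ((char=? (car chars) #\-)
          (loop (cdr chars)
                '()
                (cons (list->string (reverse current)) parts)))
         (else
          (loop (cdr chars) (cons (car chars) current) parts)))))

(split-on-hyphens "as-so-ciate")     ; ("as" "so" "ciate")
(split-on-hyphens "dec-li-na-tion")  ; ("dec" "li" "na" "tion")
```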
patterns must be a non-empty list of hyphenation patterns. Hyphenation patterns are strings of the form ".anti5s", where a period denotes the beginning or the end of a word, an odd number denotes a place where hyphenation is allowed, and an even number a place where hyphenation is forbidden. This notation is part of Frank Liang's algorithm, created for Donald Knuth's TeX typesetting system.
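The pattern notation can be decoded mechanically. The sketch below, in plain Scheme, separates a pattern into its letters and the numbers attached to each inter-letter position, with 0 standing for "no number given". It illustrates the notation only and is not the representation used inside make-hyphens; parse-pattern and break-allowed? are hypothetical names.

```scheme
;; Hypothetical helper: decode a Liang-style pattern.
;; Returns (letters . weights), where weights has one more element
;; than letters has characters: weight i applies just before letter i,
;; and the last weight applies after the final letter.
(define (parse-pattern pattern)
   (let loop ((chars (string->list pattern))
              (letters '())
              (weights '())
              (pending 0))      ; digit seen since the last letter, if any
      (cond
         ((null? chars)
          (cons (list->string (reverse letters))
                (reverse (cons pending weights))))
         ((char-numeric? (car chars))
          (loop (cdr chars) letters weights
                (- (char->integer (car chars)) (char->integer #\0))))
         (else
          (loop (cdr chars)
                (cons (car chars) letters)
                (cons pending weights)
                0)))))

;; An odd number allows a break; an even number (including 0) does not.
(define (break-allowed? weight) (odd? weight))

(parse-pattern ".anti5s")
;; (".antis" 0 0 0 0 0 5 0)
```

In ".anti5s" the 5 sits between "i" and "s", so the pattern allows a break after a word-initial "anti" when the next letter is "s". In Liang's algorithm, when several patterns cover the same position, the largest number wins.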
Character encodings
gb2312->ucs2 string  [bigloo text function]
Converts a GB2312 (aka cp936) encoded 8-bit string into a UCS-2 string.