Dico |
|
GNU Dictionary Server |
Sergey Poznyakoff |
5.6 Python
The python module provides an interface that allows programmers to write loadable modules in Python. The syntax for loading the module is:
    load-module name {
        command "python"
                " init-script=name"
                " load-path=path"
                " root-class=name";
    }
All parameters are optional:
- python module: load-path=path
Augments the default search path for Python modules. The format of path is the usual UNIX path specification: a colon-separated list of directory names.
- python module: init-script=name
Specifies the name of the initial Python source file. This file will be loaded and interpreted immediately after loading the module.
- python module: root-class=name
Sets the name of the Python root class, which is responsible for the dictionary operations.
A particular instance of the python module is loaded using the handler statement within a database block. This statement takes the same parameters as described above, plus any number of command line arguments, which are passed to the root class constructor.
5.6.1 Python Dictionary Class
The dictionary class must define the following methods:
- Method on DictionaryClass: __init__ self *argv
Class constructor. The argv array supplies positional arguments from the handler statement in the configuration file.
- Method on DictionaryClass: open self dbname
Opens the database named dbname. Returns ‘True’ on success and ‘False’ on failure.
- Method on DictionaryClass: close self
Closes the database.
- Method on DictionaryClass: descr self
Returns a short description of the database.
- Method on DictionaryClass: info self
Returns a text describing the database.
- Method on DictionaryClass: lang self
Optional. Returns supported languages as ‘(src, dst)’.
- Method on DictionaryClass: define_word self word
Defines word. Returns a result (an opaque Python object) if the definition was found or ‘False’ otherwise.
- Method on DictionaryClass: match_word self strat word
Searches for word in the database using strategy strat. Returns a result (an opaque Python object) if some matches were found or ‘False’ otherwise.
- Method on DictionaryClass: output self result n
Outputs the nth result from the result set result.
- Method on DictionaryClass: result_count self result
Returns the number of elements in the result set.
- Method on DictionaryClass: compare_count self result
Optional. Returns the number of comparisons performed when constructing the result set.
- Method on DictionaryClass: result_headers self result hdr
Optional. Returns a dictionary of MIME headers.
- Method on DictionaryClass: free_result self result
Reclaims any resources used by the result set.
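The method set above can be summarized as a class skeleton. The following is only a minimal sketch written for Python 3: the class name MinimalDictionary, its fixed in-memory entries, and the use of plain lists as result objects are assumptions made for illustration, not part of Dico. The optional methods (lang, compare_count, result_headers) are omitted.

```python
class MinimalDictionary:
    """Skeleton of a dictionary class; the name and data are illustrative."""

    def __init__(self, *argv):
        # Positional arguments from the handler statement arrive in argv.
        self.args = argv
        self.entries = {}

    def open(self, dbname):
        # A real module would load its data here; this sketch uses
        # two hard-coded entries instead.
        self.dbname = dbname
        self.entries = {"one": "en", "two": "to"}
        return True

    def close(self):
        return True

    def descr(self):
        return "Minimal example database"

    def info(self):
        return "An in-memory database illustrating the required methods."

    def define_word(self, word):
        # The result is opaque to the caller, so a plain list is used here.
        if word in self.entries:
            return [self.entries[word]]
        return False

    def match_word(self, strat, word):
        # Only exact matching is sketched; str() is applied to the key
        # to obtain the search term.
        if strat.name == "exact" and str(word) in self.entries:
            return [str(word)]
        return False

    def output(self, result, n):
        print(result[n])
        return True

    def result_count(self, result):
        return len(result)

    def free_result(self, result):
        # Plain lists hold no external resources, so nothing is released.
        pass
```

Returning ‘False’ rather than an empty result follows the convention stated for define_word and match_word.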
5.6.2 Dico Python Primitives
- Python primitive: register_strat name descr [proc]
Registers a new match strategy. The arguments are:
- name
Strategy name for use in the MATCH command.
- descr
The description, which will appear in the output of the SHOW STRAT command.
- proc
Optional selector procedure.
If the proc argument is present, it must be the name of a Python function declared as:
    def select(opcode, key, headword):
Its arguments are:
- opcode
Integer operation code.
- key
A DicoSelectionKey object identifying the search term (see DicoSelectionKey).
- headword
The headword being examined.
At the beginning of the search, the function is called with ‘DICO_SELECT_BEGIN’ as its opcode argument. It must perform the necessary initialization and return.
At the end of the search loop, the function is called with opcode ‘DICO_SELECT_END’. It must perform the necessary deinitialization and return.
In both cases, the key and headword arguments are not defined.
Within the search loop, the function will be called for each headword from the database. The opcode parameter will be ‘DICO_SELECT_RUN’. In this case the function must return ‘True’ if the headword matches the key and ‘False’ otherwise.
- Python primitive: register_markup name
Registers a markup name.
- Python primitive: current_markup
Returns the name of the current markup.
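The three-phase selector protocol described for register_strat can be sketched as follows. This is an illustration only: the opcode constants are given arbitrary stand-in values here so that the block runs outside the server, endswith_select is a hypothetical selector, and run_search is a hypothetical driver that imitates the call sequence. Passing None for the undefined key and headword arguments is likewise an assumption.

```python
# Stand-in opcode values (assumption: real modules take these constants
# from the Dico primitives; the numbers below are arbitrary).
DICO_SELECT_BEGIN, DICO_SELECT_RUN, DICO_SELECT_END = range(3)


def endswith_select(opcode, key, headword):
    """Hypothetical selector: a headword matches if it ends with the key."""
    if opcode == DICO_SELECT_BEGIN:
        return None  # nothing to initialize in this sketch
    if opcode == DICO_SELECT_END:
        return None  # nothing to release in this sketch
    # DICO_SELECT_RUN: str(key) yields the search term.
    return headword.lower().endswith(str(key).lower())


def run_search(select, key, headwords):
    """Hypothetical driver imitating the call sequence described above."""
    select(DICO_SELECT_BEGIN, None, None)
    try:
        return [hw for hw in headwords
                if select(DICO_SELECT_RUN, key, hw)]
    finally:
        select(DICO_SELECT_END, None, None)
```

For instance, run_search(endswith_select, "ee", ["three", "tree", "four"]) selects the first two headwords.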
5.6.2.1 The DicoSelectionKey class
The DicoSelectionKey class represents a search key and is used when looking for matches. Calling str on an object of this class returns the search term itself, as does the word method:
- Method on DicoSelectionKey: word
Returns the search term. It is equivalent to calling the __str__ method.
5.6.2.2 The DicoStrategy class
A match strategy is represented by an object of the DicoStrategy class.
- Variable of DicoStrategy: name
The name of that strategy.
- Variable of DicoStrategy: descr
Textual description of the strategy.
- Variable of DicoStrategy: has_selector
‘True’ if this strategy has a selector (see Python Selector).
- Variable of DicoStrategy: is_default
‘True’ if this is the default strategy.
- Method on DicoStrategy: select headword key
Returns ‘True’ if key matches headword as per this strategy.
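When testing a dictionary class outside the server, a stand-in object exposing the same attributes may be convenient. The sketch below is such a test double, not the real DicoStrategy class: FakeStrategy and match_headwords are hypothetical names, and the selector is assumed to be any callable taking the headword and the key.

```python
class FakeStrategy:
    """Test double exposing the attributes listed above (not the real class)."""

    def __init__(self, name, descr, selector=None, is_default=False):
        self.name = name
        self.descr = descr
        self.is_default = is_default
        self._selector = selector
        self.has_selector = selector is not None

    def select(self, headword, key):
        # Argument order follows the method description: headword, then key.
        return bool(self._selector(headword, key))


def match_headwords(strat, key, headwords):
    """Return matching headwords, or False if there are none or the
    strategy has no selector."""
    if not strat.has_selector:
        return False
    found = [hw for hw in headwords if strat.select(hw, key)]
    return found or False
```

Checking has_selector before calling select keeps the code from invoking a selector that the strategy does not have.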
5.6.3 Python Example
In this subsection we will show a simple database module written in Python. This module handles simple textual databases in the following format:
- Empty lines and lines beginning with double dash are ignored.
- A line beginning with ‘descr:’ introduces a short dictionary description for SHOW DB. The ‘descr:’ prefix and the white space immediately following it are removed. E.g.:
    descr: Short English-Norwegian numerals dictionary
- Lines beginning with ‘info:’ provide a verbose description of the database. These lines are concatenated after removing the ‘info:’ prefix and white space immediately following it. E.g.:
    info: A short English-Norwegian (Bokmål) dictionary
    info: of numerals.
    info:
    info: This dictionary is public domain.
- A line beginning with ‘lang:’ defines source and destination languages for this dictionary. E.g.:
    lang: en : nb
- Any line consisting of exactly two words defines a dictionary entry. E.g.:
    one en
    two to
    three tre
    four fire
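The format rules above can be tried against a sample with a small stand-alone parser. This is a sketch for Python 3 and is independent of the module developed in the rest of this subsection: parse_database, the layout of the dictionary it returns, and the SAMPLE text are all assumptions made for illustration.

```python
def parse_database(text):
    """Parse text in the format described above into a plain dict."""
    db = {"descr": "", "info": [], "lang": (), "entries": {}}
    for raw in text.splitlines():
        line = raw.rstrip()
        # Empty lines and lines beginning with a double dash are ignored.
        if not line or line.startswith("--"):
            continue
        if line.startswith("descr:"):
            db["descr"] = line[len("descr:"):].strip()
        elif line.startswith("info:"):
            db["info"].append(line[len("info:"):].strip())
        elif line.startswith("lang:"):
            # Without a colon separator, the same list is used for both
            # source and destination languages.
            src, _, dst = line[len("lang:"):].partition(":")
            db["lang"] = (src.split(), (dst or src).split())
        else:
            words = line.split()
            if len(words) == 2:
                db["entries"][words[0].lower()] = words[1]
    return db


SAMPLE = """\
-- Sample database
descr: Short English-Norwegian numerals dictionary
info: A short English-Norwegian dictionary
info: of numerals.
info:
info: This dictionary is public domain.
lang: en : nb

one en
two to
"""
```

Calling parse_database(SAMPLE) collects the description, four ‘info:’ lines (one of them empty), the language pair, and two entries.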
Now, let’s create a module for handling this format. First, we need to import Dico primitives (see Dico Python Primitives) and the ‘sys’ module. The latter is needed for output functions:
    import dico
    import sys
Then, a result class will be needed for the match_word and define_word methods. It will contain the actual data in the variable ‘result’:
    class DicoResult:
        # actual data.
        result = {}
        # number of comparisons.
        compcount = 0

        def __init__ (self, *argv):
            self.result = argv[0]
            if len (argv) == 2:
                self.compcount = argv[1]

        def count (self):
            return len (self.result)

        def output (self, n):
            pass

        def append (self, elt):
            self.result.append (elt)
The following two classes extend ‘DicoResult’ for use with ‘DEFINE’ and ‘MATCH’ operations. The define_word method will return an instance of the ‘DicoDefineResult’ class:
    class DicoDefineResult (DicoResult):
        def output (self, n):
            print "%d. %s" % (n + 1, self.result[n])
            print "---------",
The match_word method will return an instance of the ‘DicoMatchResult’ class:
    class DicoMatchResult (DicoResult):
        def output (self, n):
            sys.stdout.softspace = 0
            print self.result[n],
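The print statements and the softspace attribute used in the result classes are Python 2 constructs. If the module is to run under a Python 3 interpreter (an assumption about the installation, not something this example requires), the same three classes might be written as in the sketch below, which writes to sys.stdout directly to produce the same output, including the absence of a trailing newline.

```python
import sys


class DicoResult:
    def __init__(self, items, compcount=0):
        # Actual data and the number of comparisons performed.
        self.result = list(items)
        self.compcount = compcount

    def count(self):
        return len(self.result)

    def output(self, n):
        pass

    def append(self, elt):
        self.result.append(elt)


class DicoDefineResult(DicoResult):
    def output(self, n):
        # Numbered definition, then a separator without a trailing newline.
        sys.stdout.write("%d. %s\n" % (n + 1, self.result[n]))
        sys.stdout.write("---------")


class DicoMatchResult(DicoResult):
    def output(self, n):
        # The bare headword, without surrounding white space.
        sys.stdout.write(self.result[n])
```

Here the constructor takes the comparison count as a keyword argument with a default, replacing the length check on argv in the original.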
Now, let’s define the dictionary class:
    class DicoModule:
        # The dictionary converted to associative array.
        adict = {}
        # The database name.
        dbname = ''
        # The name of the corresponding disk file.
        filename = ''
        # A short description of the database.
        mod_descr = ''
        # A verbose description of the database, kept
        # as an array of strings.
        mod_info = []
        # A list of source and destination languages:
        langlist = ()
The class constructor takes a single argument, defining the name of the database file:
        def __init__ (self, *argv):
            self.filename = argv[0]
The ‘open’ method opens the database and reads its data:
        def open (self, dbname):
            self.dbname = dbname
            file = open (self.filename, "r")
            for line in file:
                if line.startswith ('--'):
                    continue
                if line.startswith ('descr: '):
                    self.mod_descr = line[7:].strip (' \n')
                    continue
                if line.startswith ('info: '):
                    self.mod_info.append (line[6:].strip (' \n'))
                    continue
                if line.startswith ('lang: '):
                    s = line[6:].strip (' \n').split(':', 2)
                    if (len(s) == 1):
                        self.langlist = (s[0].split (), \
                                         s[0].split ())
                    else:
                        self.langlist = (s[0].split (), \
                                         s[1].split ())
                    continue
                f = line.strip (' \n').split (' ', 1)
                if len (f) == 2:
                    self.adict[f[0].lower()] = f[1].strip (' ')
            file.close()
            return True
The database is kept entirely in memory, so the ‘close’ method has nothing to do. However, it must be defined anyway:
        def close (self):
            return True
The methods returning database information are trivial:
        def descr (self):
            return self.mod_descr

        def info (self):
            return '\n'.join (self.mod_info)

        def lang (self):
            return self.langlist
The ‘define_word’ method checks if the search term is present in the dictionary and, if so, wraps its definition in a DicoDefineResult object:
        def define_word (self, word):
            if self.adict.has_key (word):
                return DicoDefineResult ([self.adict[word]])
            return False
The ‘match_word’ method supports the ‘exact’ strategy natively via the has_key method of adict:
        def match_word (self, strat, key):
            if strat.name == "exact":
                if self.adict.has_key (key.word.lower ()):
                    return DicoMatchResult \
                        ([self.adict[key.word.lower()]])
Other strategies are supported as long as they have selectors:
            elif strat.has_selector:
                res = DicoMatchResult ([], len (self.adict))
                for k in self.adict:
                    if strat.select (k, key):
                        res.append (k)
                # count is a method, so it must be called.
                if res.count () > 0:
                    return res
            return False
The remaining methods rely on the result object to do the right thing:
        def output (self, rh, n):
            rh.output (n)
            return True

        def result_count (self, rh):
            return rh.count ()

        def compare_count (self, rh):
            return rh.compcount
This document was generated on September 4, 2020 using makeinfo.
Verbatim copying and distribution of this entire article is permitted in any medium, provided this notice is preserved.