The piw-master script is intended to be run on the database and file-server machine. It is recommended that you do not run piw-slave on the same machine as piw-master. The database specified in the configuration must exist and must have been configured with the piw-initdb script. It is recommended you run piw-master as an ordinary unprivileged user, although it will naturally need write access to the output directory.
piw-master [-h] [--version] [-c FILE] [-q] [-v] [-l FILE] [-d DSN] [--pypi-xmlrpc URL] [--pypi-simple URL] [-o PATH] [--index-queue ADDR] [--status-queue ADDR] [--control-queue ADDR] [--builds-queue ADDR] [--db-queue ADDR] [--fs-queue ADDR] [--slave-queue ADDR] [--file-queue ADDR] [--import-queue ADDR]
-h
    show this help message and exit
--version
    show program’s version number and exit
-c FILE
    Specify a configuration file to load
-q
    produce less console output
-v
    produce more console output
-l FILE
    log messages to the specified file
-d DSN
    The database to use; this database must be configured with piw-initdb and the user should not be a PostgreSQL superuser (default: postgres:///piwheels)
-o PATH
    The path under which the website should be written; must be writable by the current user
--index-queue ADDR
    The address of the IndexScribe queue (default: inproc://indexes)
--status-queue ADDR
    The address of the queue used to report status to monitors (default: ipc:///tmp/piw-status)
--control-queue ADDR
    The address of the queue a monitor can use to control the master (default: ipc:///tmp/piw-control)
--builds-queue ADDR
    The address of the queue used to store pending builds (default: inproc://builds)
--db-queue ADDR
    The address of the queue used to talk to the database server (default: inproc://db)
--fs-queue ADDR
    The address of the queue used to talk to the file system server (default: inproc://fs)
--slave-queue ADDR
    The address of the queue used to talk to the build slaves (default: tcp://*:5555)
--file-queue ADDR
    The address of the queue used to transfer files from slaves (default: tcp://*:5556)
--import-queue ADDR
    The address of the queue used by piw-import (default: ipc:///tmp/piw-import); this should always be an ipc address
Although the piwheels master appears to be a monolithic script, it’s actually composed of numerous (often extremely simple) tasks. Each task runs in its own thread, and all communication between tasks takes place over ZeroMQ sockets. This is also how the master communicates with piw-slave and piw-monitor.
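The task-per-thread arrangement can be sketched roughly as follows. This is an illustrative simplification, not the actual implementation: it uses the standard library's `queue.Queue` to stand in for ZeroMQ inproc sockets, and the class and message names are invented for the example.

```python
# Sketch: each task is a thread owning an inbox; messages are lists whose
# first element is an action string (queue.Queue stands in for zmq inproc).
import threading
import queue

class Task(threading.Thread):
    """A task processes messages from its inbox until told to quit."""
    def __init__(self, inbox):
        super().__init__()
        self.inbox = inbox

    def run(self):
        while True:
            msg = self.inbox.get()        # msg is a list: [action, *params]
            if msg[0] == 'QUIT':
                break
            self.handle(msg)

class Echo(Task):
    """A trivial task that acknowledges each job on an output queue."""
    def __init__(self, inbox, outbox):
        super().__init__(inbox)
        self.outbox = outbox

    def handle(self, msg):
        self.outbox.put(['DONE', msg[1]])

inbox, outbox = queue.Queue(), queue.Queue()
task = Echo(inbox, outbox)
task.start()
inbox.put(['WORK', 'numpy'])
inbox.put(['QUIT'])
task.join()
print(outbox.get())                       # ['DONE', 'numpy']
```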
The following diagram roughly illustrates all the tasks in the system (including those of the build slaves and the monitor), along with details of the type of ZeroMQ socket used to communicate between them:
It may be confusing that the file server and database server appear to be separate from the master in the diagram. This is deliberate: the system’s architecture allows certain tasks to be broken off into entirely separate processes (potentially on separate machines), if required in future for performance or security reasons.
The following sections document the tasks shown above (listed from the “front” at PyPI to the “back” at Users):
2.4.1. Cloud Gazer
This task is the “front” of the system. It follows PyPI’s event log for new package and version registrations, and writes those entries to the database. It does this via The Oracle.
2.4.2. The Oracle
This task is the main interface to the database. It accepts requests from other tasks (“register this new package”, “log this build”, “what files were built with this package”, etc.) and executes them against the database. Because database requests vary widely in execution time, there are actually several instances of The Oracle, which sit behind Seraph.
2.4.3. Seraph
Seraph is a simple load-balancer for the various instances of The Oracle. This is the task that actually accepts database requests. It finds a free oracle and passes the request along, passing back the reply when it’s finished.
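The load-balancing idea can be sketched as follows. This is a stdlib approximation, not the real Seraph (which uses ZeroMQ sockets): a shared request queue means each job goes to whichever oracle worker is free first, and all names here are illustrative.

```python
# Sketch: several "oracle" workers drain a shared request queue, so requests
# with long execution times don't block the others (queue.Queue stands in
# for the real ZeroMQ sockets).
import threading
import queue

def oracle(name, requests, replies):
    """A worker thread standing in for one instance of The Oracle."""
    while True:
        msg = requests.get()
        if msg is None:                    # sentinel: shut down
            break
        replies.put((name, 'OK', msg))     # pretend we ran it on the database

requests, replies = queue.Queue(), queue.Queue()
workers = [threading.Thread(target=oracle, args=(f'oracle-{i}', requests, replies))
           for i in range(3)]
for w in workers:
    w.start()
for pkg in ['numpy', 'scipy', 'pandas']:
    requests.put(['NEWPKG', pkg])
results = [replies.get() for _ in range(3)]
for _ in workers:
    requests.put(None)
for w in workers:
    w.join()
```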
2.4.4. The Architect
This task is the final database-related task in the master script. Unlike The Oracle, it simply queries the database for the packages that need building. Whenever Slave Driver needs a task to hand to a build slave, it asks the Architect for one matching the build slave’s ABI.
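One plausible shape for the per-ABI hand-off is a set of pending-build queues keyed by ABI. This is a sketch only; the class, method names, and tuple layout are invented for illustration and do not reflect the actual database schema or queries.

```python
# Sketch: pending builds grouped by ABI, so a slave only ever receives a
# build matching the ABI it announced (names are illustrative).
from collections import defaultdict, deque

class Architect:
    def __init__(self):
        self.pending = defaultdict(deque)   # abi -> queue of (package, version)

    def add(self, abi, package, version):
        self.pending[abi].append((package, version))

    def next_build(self, abi):
        """Return the next pending build for this ABI, or None if idle."""
        try:
            return self.pending[abi].popleft()
        except IndexError:
            return None

arch = Architect()
arch.add('cp39', 'numpy', '1.24.0')
arch.add('cp311', 'numpy', '1.24.0')
print(arch.next_build('cp39'))   # ('numpy', '1.24.0')
print(arch.next_build('cp37'))   # None: nothing pending for that ABI
```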
2.4.5. Slave Driver
This task is the main coordinator of the build slave’s activities. When a build slave first comes online it introduces itself to this task (with information including the ABI it can build for), and asks for a package to build. As described above, this task asks The Architect for the next package matching the build slave’s ABI and passes this back.
Eventually the build slave will communicate whether or not the build succeeded, along with information about the build (log output, files generated, etc.). This task writes this information to the database via The Oracle. If the build was successful, it informs the File Juggler that it should expect a file transfer from the relevant build slave.
Finally, when all files from the build have been transferred, the Slave Driver informs the Index Scribe that the package’s index will need (re)writing.
2.4.6. Mr. Chase
This task talks to piw-import and handles importing builds manually into the system. It is essentially a cut-down version of the Slave Driver with a correspondingly simpler protocol.
Finally, when all files from the build have been transferred, it informs the Index Scribe that the package’s index will need (re)writing.
2.4.7. File Juggler
This task handles file transfers from the build slaves to the master. Files are transferred in multiple (relatively small) chunks and are verified with the hash reported by the build slave (retrieved from the database via The Oracle).
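The verify-by-hash step can be sketched like this. The chunk size and the choice of SHA-256 here are assumptions for the example, not the actual protocol parameters.

```python
# Sketch: reassemble a file from small chunks and verify the whole against
# the digest the build slave reported (hash choice is illustrative).
import hashlib

CHUNK = 4096

def send_chunks(data):
    """Yield the file in fixed-size chunks, as a slave might transfer it."""
    for i in range(0, len(data), CHUNK):
        yield data[i:i + CHUNK]

def receive(chunks, expected_hash):
    """Reassemble chunks, checking the final digest before accepting."""
    digest = hashlib.sha256()
    parts = []
    for chunk in chunks:
        digest.update(chunk)
        parts.append(chunk)
    if digest.hexdigest() != expected_hash:
        raise ValueError('hash mismatch: transfer corrupted')
    return b''.join(parts)

wheel = b'\x00' * 10000                        # stand-in for a wheel file
expected = hashlib.sha256(wheel).hexdigest()   # hash reported by the slave
assert receive(send_chunks(wheel), expected) == wheel
```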
2.4.8. Big Brother
This task is a bit of a miscellaneous one. It sits around periodically generating statistics about the system as a whole (number of files, number of packages, number of successful builds, number of builds in the last hour, free disk space, etc.) and sends these off to the Index Scribe.
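A statistics message of that kind might look like the sketch below. The field names and message shape are invented for illustration; only the general idea (periodically bundle counts and disk usage into one message) comes from the description above.

```python
# Sketch: assemble a periodic statistics message in the [action, params]
# convention used between tasks (field names are illustrative).
import shutil

def gather_stats(builds_last_hour, total_packages, total_files, path='/'):
    """Build a stats message to hand off to the Index Scribe."""
    usage = shutil.disk_usage(path)
    return ['STATS', {
        'packages': total_packages,
        'files': total_files,
        'builds_last_hour': builds_last_hour,
        'disk_free': usage.free,
        'disk_size': usage.total,
    }]

msg = gather_stats(builds_last_hour=12, total_packages=250_000,
                   total_files=1_000_000)
```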
It should be noted that the diagram omits several queues for the sake of brevity. For instance, there is a simple PUSH/PULL control queue between the master’s “main” task and each sub-task, which is used to relay control messages.
Most of the protocols used by the queues are (currently) undocumented with the exception of those between the build slaves and the Slave Driver and File Juggler tasks (documented in the piw-slave chapter).
However, all protocols share a common basis: messages are lists of Python objects. The first element is always a string containing the action. Further elements are parameters specific to the action. Messages are encoded with pickle. This is an unsafe format for untrusted input, but was the quickest to get started with (and the inter-process queues aren’t exposed to the internet). A future version may switch to something slightly safer like JSON or better still
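The message convention described above can be illustrated directly; the particular action name and parameters here are made up for the example.

```python
# Sketch: a message is a list whose first element is the action string,
# pickled for transport between tasks over the ZeroMQ sockets.
import pickle

message = ['LOG', 'numpy', '1.24.0', 'build succeeded']
wire = pickle.dumps(message)          # bytes placed on the socket
decoded = pickle.loads(wire)
assert decoded == message
assert isinstance(decoded[0], str)    # the action always comes first
```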