The Pi3Web Server has an object-oriented design, and the same concept is applied to its configuration. The configuration implements the concept of 'late binding'. The following diagram gives a simplified example of how dynamically loaded components are instantiated into a hierarchy of objects that implements a secure HTTP server:
                      ______________
                     |     Main     |
                     | (UNIXDaemon) |
                      --------------
                            |
                _________________________
               |  ThreadPoolDispatcher   |
               | (MultithreadedIOServer) |
                -------------------------
                   /                 \
          _____________        __________________
         | SSLIOObject |      | HTTPLogicObject  |
         |    (SSL)    |      | (HTTPDispatcher) |
          -------------        ------------------
             /                    /          \
     ________________   ___________________   ___________________
    | ServerIOObject | |        CGI        | |       ISAPI       |
    |   (TCPIPIO)    | | (FlexibleHandler) | | (FlexibleHandler) |
     ----------------   -------------------   -------------------
The following shows the same tree of objects in the (simplified) notation of the configuration.
    <Object>
        Name Main
        Class UNIXDaemonClass
        ServerObject ThreadPoolDispatcher
    </Object>

    <Object>
        Name ThreadPoolDispatcher
        Class MultithreadedIOServerClass
        IOObject SSLIOObject
    </Object>

    <Object>
        Name SSLIOObject
        Class SSLClass
        Type Passive
        IOObject ServerIOObject
    </Object>

    <Object>
        Name ServerIOObject
        Class TCPIPIOClass
        Type Passive
        BindHost localhost
        BindPort 80
    </Object>

    <Object>
        Name HTTPLogicObject
        Class HTTPDispatcherClass
        Handlers CGI SSI ISAPI
    </Object>

    <Object>
        Name CGI
        Class FlexibleHandlerClass
        Handle StandardCGI
    </Object>

    <Object>
        Name ISAPI
        Class FlexibleHandlerClass
        Handle ISAPIExtensions
    </Object>
Each object belongs to a configuration class, which connects the object with the binary code. The name of the object is used to identify the object in the configuration tree. The root object must have the name Main. Each configuration object may refer to its children by their names.
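The naming rules above can be illustrated with a small sketch. It is not the server's parser: it is a simplified Python model that reads `<Object>` blocks and walks the tree from Main, and it assumes that any parameter token matching the name of another object is a reference to a child. The functions `parse_objects` and `walk` are hypothetical helpers, and the sketch does not guard against cyclic references.

```python
import re

# A subset of the simplified example configuration from this document.
CONFIG = """
<Object>
    Name Main
    Class UNIXDaemonClass
    ServerObject ThreadPoolDispatcher
</Object>
<Object>
    Name ThreadPoolDispatcher
    Class MultithreadedIOServerClass
    IOObject SSLIOObject
</Object>
<Object>
    Name SSLIOObject
    Class SSLClass
    Type Passive
    IOObject ServerIOObject
</Object>
<Object>
    Name ServerIOObject
    Class TCPIPIOClass
    Type Passive
    BindHost localhost
    BindPort 80
</Object>
"""


def parse_objects(text):
    """Return {object name: [(parameter, value), ...]} for every <Object> block."""
    objects = {}
    for body in re.findall(r"<Object>(.*?)</Object>", text, re.S):
        params = []
        for line in body.splitlines():
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blank lines and comments
            key, _, value = line.partition(" ")
            params.append((key, value.strip()))
        objects[dict(params)["Name"]] = params
    return objects


def walk(objects, name="Main", depth=0, out=None):
    """Collect an indented listing of the tree, starting at the root object."""
    out = [] if out is None else out
    params = objects[name]
    out.append("  " * depth + f"{name} ({dict(params)['Class']})")
    for key, value in params:
        if key in ("Name", "Class"):
            continue
        # Assumption: a token equal to another object's name is a child reference.
        for token in value.split():
            if token in objects:
                walk(objects, token, depth + 1, out)
    return out


tree = walk(parse_objects(CONFIG))
print("\n".join(tree))
```

The listing starts at Main because the walk, like the server, treats the object with that name as the root.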
A Pi3Web configuration file starts with a "shebang line", as known from shell or Perl scripts on POSIX systems. On a Windows system this line is not interpreted:
#!../bin/Pi3
A top-level configuration file usually has the extension .pi3 and is located in the Conf subdirectory of the Pi3Web server. Neither is mandatory, because the complete name of the configuration file is a startup parameter of the server. The top-level configuration files are supplemented by generated files containing definitions of libraries and configuration classes. These supplementary files have the extension .cnf, and the top-level configuration file may include any number of them:
    include ../Conf/IO.cnf
    include ../Conf/Server.cnf
    include ../Conf/Pi3API.cnf
    include ../Conf/HTTP.cnf
Note that path and file names are case-sensitive on some operating systems (e.g. Linux, Solaris). The configuration file may also contain comments in the format known from shell or Perl scripts:
    # A hostname must be specified for the Pi3Web HTTP server. The hostname
    # will be used by the server to look up the IP address that this server will
    # listen on. A value of INADDR_ANY lets the server listen on any IP address.
The format of Pi3Web configuration files is extended by the administration applications using a proprietary syntax within comments. The whole configuration tree consists only of the following functional elements:
A library is defined within the tags <Library></Library>. This element defines where the binary implementation of one or more classes resides. Its parameters are Name, POSIXPath and Win32Path, e.g.:
    <Library>
        Name Pi3API
        POSIXPath ./libPi3API.so
        Win32Path Pi3API.dll
    </Library>
A class is defined within the tags <Class></Class>. The class element defines the interface of one class to its binary implementation. The parameters are Name, Type, Library, OnClassLoad, Constructor, CopyConstructor, Destructor, Execute, In and Out. The parameter Type has either the value 'IO' or 'LogicExtension' and thereby specifies the superclass of the class. 'In' and 'Out' are valid for I/O classes only, i.e. only for classes of type 'IO':
    <Class>
        Name TCPIPIOClass
        Type IO
        Library IO
        OnClassLoad TCPIPSocket_onClassLoad
        Constructor TCPIPSocket_TCPIPSocket
        CopyConstructor TCPIPSocket_copyTCPIPSocket
        Destructor TCPIPSocket_xTCPIPSocket
        Out TCPIPSocket_send
        In TCPIPSocket_recv
    </Class>
'Execute', on the other hand, is valid only for logic extensions, i.e. only for classes of type 'LogicExtension':
    <Class>
        Name HTTPDispatcherClass
        Type LogicExtension
        Library Pi3API
        OnClassLoad Dispatcher_onClassLoad
        Constructor Dispatcher_constructor
        CopyConstructor Dispatcher_copyConstructor
        Destructor Dispatcher_destructor
        Execute Dispatcher_execute
    </Class>
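The rule that 'In' and 'Out' belong to classes of type 'IO' while 'Execute' belongs to classes of type 'LogicExtension' can be checked mechanically. The following Python sketch is a hypothetical lint helper, not part of Pi3Web; whether the server itself rejects a class definition that violates the rule is not stated here. The function name `check_class` is an assumption made for illustration.

```python
# Parameters that are described as valid for every class, regardless of its type.
COMMON = {"Name", "Type", "Library", "OnClassLoad",
          "Constructor", "CopyConstructor", "Destructor"}

# Type-specific parameters, as described in the text above.
TYPE_SPECIFIC = {
    "IO": {"In", "Out"},
    "LogicExtension": {"Execute"},
}


def check_class(params):
    """Return a list of problems found in one <Class> parameter dictionary."""
    class_type = params.get("Type")
    if class_type not in TYPE_SPECIFIC:
        return [f"unknown Type {class_type!r}"]
    problems = []
    for key in params:
        if key not in COMMON and key not in TYPE_SPECIFIC[class_type]:
            problems.append(f"{key} is not valid for Type {class_type}")
    return problems


# The I/O class from the example above passes the check.
io_class = {
    "Name": "TCPIPIOClass", "Type": "IO", "Library": "IO",
    "OnClassLoad": "TCPIPSocket_onClassLoad",
    "Constructor": "TCPIPSocket_TCPIPSocket",
    "CopyConstructor": "TCPIPSocket_copyTCPIPSocket",
    "Destructor": "TCPIPSocket_xTCPIPSocket",
    "Out": "TCPIPSocket_send", "In": "TCPIPSocket_recv",
}

# A deliberately wrong definition: a logic extension that declares 'In'.
bad_class = {"Name": "Broken", "Type": "LogicExtension",
             "Library": "Pi3API", "In": "Broken_recv"}

print(check_class(io_class))
print(check_class(bad_class))
```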
An object is defined within the tags <Object></Object>. This element defines the parameters of an object of the specified class. Every object has the parameters Name and Class; in general an object also has one or more additional parameters that are specific to its class:
    <Object>
        Name [ObjectName]
        Class [ClassName]
        Parameter1 Value1
        Parameter2 Value2
        .
        .
        ParameterN ValueN
    </Object>
There are only a few formats for parameter values:
Parameter values representing a string can be quoted using '"'; quoting is required for strings containing spaces
Parameter values representing an enumerated value, e.g. phase name
Parameter values representing an integer number, e.g. port number, timeout
A string representing a list of flags separated by ' | '
A sequence of strings representing a configuration object and its parameters
The complete list of all configuration classes, objects and their parameters can be obtained from the online documentation of the Pi3Web server.
The last parameter format can be used to reuse a configuration object and its parameters. A 'default' configuration object can be defined once:
    <Object>
        Name CGIMapper
        Class PathMapperClass
        CaseSensitive "No"
        PathInfo "Yes"
        Action "&dbreplace(response,string,ObjectMap,Scripts)"
    </Object>
It may then be used multiple times. Each use may declare additional parameters or redeclare existing ones, overriding the original value:
    Mapping CGIMapper From="/cgi-bin/" To="Cgi-Bin\"
    Mapping CGIMapper From="/my_cgi/" To="MyCgi\" PathInfo="No"
The general format of a configuration parameter that refers to another configuration object and overrides its parameters is:
Object Parameter1="Value1" Parameter2="Value2" .. ParameterN="ValueN"
If the configuration class requires mandatory parameters that the 'default' configuration object does not declare, this 'abstract' object cannot be referenced directly by other configuration objects; every reference must override it and supply the missing parameters.
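The override mechanism can be modelled in a few lines of Python. This is only a sketch of the merging behaviour described above, not the server's implementation: it assumes that a reference consists of the object name followed by Parameter="Value" pairs, and the function `resolve` is a hypothetical helper. A regular expression is used instead of shell-style splitting so that the trailing backslash in values such as "Cgi-Bin\" is kept literally.

```python
import re

# The 'default' configuration object from the example above, as a dictionary.
DEFAULT_OBJECTS = {
    "CGIMapper": {
        "Class": "PathMapperClass",
        "CaseSensitive": "No",
        "PathInfo": "Yes",
        "Action": "&dbreplace(response,string,ObjectMap,Scripts)",
    }
}


def resolve(reference, defaults=DEFAULT_OBJECTS):
    """Merge the overrides of one reference into a copy of the default object."""
    name, _, overrides = reference.partition(" ")
    params = dict(defaults[name])  # copy, so the default object stays untouched
    for key, value in re.findall(r'(\w+)="([^"]*)"', overrides):
        params[key] = value  # declares a new parameter or overrides an existing one
    return params


first = resolve(r'CGIMapper From="/cgi-bin/" To="Cgi-Bin\"')
second = resolve(r'CGIMapper From="/my_cgi/" To="MyCgi\" PathInfo="No"')

print(first["PathInfo"], second["PathInfo"])
```

The first reference inherits PathInfo from the default object and the second overrides it, while the default object itself is left unchanged for later references.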
As mentioned above, the name of the top-level configuration file is part of the server startup command, e.g. Pi3.exe ../Conf/Config.pi3. This makes it possible to run the server in different configurations from a single server installation. At startup the configuration tree is parsed, beginning at the Main object, and the runtime objects are instantiated. Each object gets its parameters from the configuration. Changes to the configuration file have no effect on a running server; the server has to be stopped and started again for the changes to take effect.
The generated tree of dynamic components interacts at runtime in the following way: The main object represents the main process of the server, the server daemon. The daemon may fork one or more child processes (UNIX only; on Win32 there is exactly one server process). If one of the child processes dies, the main object starts a new child, and it continues to do so until the server is stopped. Each child process owns one prototype I/O object (the listening socket) and runs the accept loop. For each incoming connection a copy of the I/O object is handed over to the thread dispatcher. The dispatcher then starts a free worker thread, which handles the incoming connection by executing the HTTP logic object.
The HTTP logic object calls each handler which has been configured in its own list of handlers one time for each request processing phase until one handler completes the processing of that phase. After the last phase is finished, the execution of the HTTP logic object itself returns, the worker thread has finished and the copied I/O object is destroyed (the socket connection is closed).
There are ten phases of request processing: Init, Headers, HostMap, Mapping, CheckPath, CheckAccess, CheckType, Handle, Log and Destroy. A configured handler need not handle every phase of request processing, but for each phase exactly one of the handlers in the chain has to complete it; otherwise the HTTP logic object returns with an error. The collaboration of the handlers during all phases of request processing can easily be observed when the debug mode of the Pi3Web server is switched on.
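The phase loop can be sketched as a simplified Python model. It is an illustration of the description above, not the code of the HTTP logic object: the handler signature, the helper `completes` and the use of an exception for the error case are assumptions. The result codes COMPLETED and CONTINUE are the ones that appear in the server's configuration and debug output, and the handler names SendFile and Default are used here only as stand-ins.

```python
PHASES = ["Init", "Headers", "HostMap", "Mapping", "CheckPath",
          "CheckAccess", "CheckType", "Handle", "Log", "Destroy"]

COMPLETED, CONTINUE = "COMPLETED", "CONTINUE"


def completes(*phases):
    """Build a hypothetical handler that completes only the given phases."""
    done = set(phases)
    return lambda phase, request: COMPLETED if phase in done else CONTINUE


def dispatch(handlers, request):
    """Call the handlers in order for each phase until one completes it."""
    trace = []
    for phase in PHASES:
        for name, handler in handlers:
            if handler(phase, request) == COMPLETED:
                trace.append((phase, name))
                break  # this phase is done, move on to the next one
        else:
            # No handler in the chain completed the phase.
            raise RuntimeError(f"no handler completed phase {phase}")
    return trace


# One handler serves the Handle phase, a fallback completes everything else.
chain = [("SendFile", completes("Handle")), ("Default", completes(*PHASES))]
trace = dispatch(chain, "GET / HTTP/1.1")
for phase, name in trace:
    print(f"{phase:12s} completed by {name}")
```

Because handlers are tried in their configured order, the position of a handler in the chain decides which one completes a phase when several could.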
There is a special handler class called ReturnCode, which is used to specify the completion of a certain phase of request processing in the configuration file:
    <Object>
        Name Options
        Class FlexibleHandlerClass
        Condition "&cmpi($m,OPTIONS)"
        Mapping ReturnCode ReturnCode=COMPLETED
        CheckPath ReturnCode ReturnCode=COMPLETED
        CheckAccess ReturnCode ReturnCode=COMPLETED
        Handle SendFile
    </Object>
This example also shows the use of another important configuration class, the FlexibleHandler. It can be used to handle each processing phase under a general or a specific condition. With its handlers parameter it can even dispatch the request to a subsequent handler chain when the specified condition is true. Conditions are parameters that are evaluated as Pi3Expressions, a powerful logical extension of the configuration language. Refer to the documentation of Pi3Expressions for the details.
The Pi3API functions provide an interface to the complete configuration functionality for application development, e.g. PIConfig_loadConfigurationFile, PIClass_getLibrary or PIObject_load. Refer to the Pi3Web online API documentation for the complete list.
In general, a small configuration performs better and is more secure than a large one. Startup time and memory usage may also increase when many objects are parsed from the configuration. There are also some specific points that must be taken into consideration:
Optional components increase resource usage when they are added to the configuration (e.g. by including PHP4.cnf or SSL.cnf), because the related libraries are loaded and their classes are instantiated.
Different types of handlers for dynamic content differ in performance and resource usage, e.g. the embedded PHP4 interpreter processes requests faster than the CGI module (PHP4.exe) but keeps the resources allocated at startup in use for as long as the server lives. On the other hand, a dynamically loaded module may increase resource usage due to programming errors (e.g. memory leaks) and may even crash the whole server on failure, which a CGI program normally will not do.
The deeper the configured tree of dynamically loaded objects, the longer request processing takes.
The partial tree that is used to process a certain request depends on the URI, the configured conditions and the result of the execution of handler logic. Most requests, however, are handled by objects from the same part of the tree, and this is the critical path through the processing chain. For maximum performance, keep the configuration tree as shallow as possible along the critical path.
Some parameters have a major influence on performance and resource usage (e.g. number of parallel threads, timeout values).
Taking this into consideration, you can create different configurations for different purposes, such as an internet server, web application development or a proxy. Several top-level configuration files for different scenarios are part of the binary distribution of the Pi3Web Server. To observe the effects of configuration changes, such as the depth of request processing and the time consumed, and to detect probable configuration errors, the debug option of the HTTP logic object can be switched on by configuring its DebugLogFile parameter. After being restarted, the server writes comprehensive log information to the specified file:
    [3368:388] |--> 0.015000 --- HANDLE UNKNOWN TopSiteRoot "GET /pidocs/Features/test2.php HTTP/1.1" 0
    [3368:388] |--> 0.015000 --- HANDLE UNKNOWN Start "GET /pidocs/Features/test2.php HTTP/1.1" 0
    [3368:388] |<-- 0.015000 0 HANDLE CONTINUE Start "GET /pidocs/Features/test2.php HTTP/1.1" 0
    [3368:388] |--> 0.015000 --- HANDLE UNKNOWN Scripts "GET /pidocs/Features/test2.php HTTP/1.1" 0
    [3368:388] |<-- 0.015000 0 HANDLE CONTINUE Scripts "GET /pidocs/Features/test2.php HTTP/1.1" 0
    [3368:388] |--> 0.015000 --- HANDLE UNKNOWN ISAPI "GET /pidocs/Features/test2.php HTTP/1.1" 0
    [3368:388] |<-- 0.015000 0 HANDLE CONTINUE ISAPI "GET /pidocs/Features/test2.php HTTP/1.1" 0
    [3368:388] |--> 0.015000 --- HANDLE UNKNOWN Default "GET /pidocs/Features/test2.php HTTP/1.1" 0
    [3368:388] |--> 0.015000 --- HANDLE UNKNOWN PHP4 "GET /pidocs/Features/test2.php HTTP/1.1" 0
    [3368:388] |<-- 0.020000 5 HANDLE COMPLETED PHP4 "GET /pidocs/Features/test2.php HTTP/1.1" 200
    [3368:388] |<-- 0.020000 0 HANDLE COMPLETED Default "GET /pidocs/Features/test2.php HTTP/1.1" 200
    [3368:388] |<-- 0.020000 0 HANDLE COMPLETED TopSiteRoot "GET /pidocs/Features/test2.php HTTP/1.1" 200
The format of a single line in the debug log is a configurable Pi3Expression. For the above debug log the meaning of the logged fields is:
[PID:TID] nesting timestamp elapsed phase resultcode handler "request line" statuscode
The individual fields are each based on a Pi3Expression variable (given in brackets) and are configured in the parameters DebugBeforeHandler and DebugAfterHandler of the HTTP logic object. They have the following meaning:
Process identifier of the operating system ($P)
Thread identifier of the operating system ($k)
Nesting level in the handler hierarchy during request processing ($X)
Seconds elapsed from the beginning of processing of this request, e.g. 0.015000 means 15 milliseconds ($T)
Milliseconds elapsed in this handler object ($D)
Phase of request processing ($N)
Resultcode of processing by the current handler ($R)
Name of the current handler ($n)
Original HTTP request line ($r)
HTTP statuscode set during processing ($s)
Because the multithreaded Pi3Web Server processes multiple requests in parallel, the debug lines of different requests may be interleaved. They can be separated by the ID of the child process (PID, Unix and Linux only) and of the worker thread (TID) responsible for processing the request.
Writing this debug information is extremely verbose, so debugging should always be switched off on a production system.
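Separating the lines by process and thread can be automated with a short script. The following Python sketch assumes the debug line format shown above; since that format is a configurable Pi3Expression, the regular expression has to be adapted if DebugBeforeHandler or DebugAfterHandler are changed. The helpers `parse_line` and `group_by_thread` are hypothetical names, and the sample lines are taken from the debug log above.

```python
import re
from collections import defaultdict

# [PID:TID] nesting timestamp elapsed phase resultcode handler "request line" statuscode
LINE = re.compile(
    r'\[(?P<pid>\d+):(?P<tid>\d+)\]\s+'
    r'(?P<nesting>.*?(?P<direction>-->|<--))\s+'
    r'(?P<timestamp>\d+\.\d+)\s+'
    r'(?P<elapsed>\d+|---)\s+'
    r'(?P<phase>\S+)\s+'
    r'(?P<result>\S+)\s+'
    r'(?P<handler>\S+)\s+'
    r'"(?P<request>[^"]*)"\s+'
    r'(?P<status>\d+)'
)


def parse_line(line):
    """Parse one debug line into a dictionary, or return None if it does not match."""
    match = LINE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["pid"] = int(record["pid"])
    record["tid"] = int(record["tid"])
    record["status"] = int(record["status"])
    record["timestamp"] = float(record["timestamp"])  # seconds since request start
    # Lines written before a handler runs carry '---' instead of milliseconds.
    record["elapsed"] = None if record["elapsed"] == "---" else int(record["elapsed"])
    return record


def group_by_thread(lines):
    """Group the parsed records by (PID, TID) so interleaved requests separate."""
    groups = defaultdict(list)
    for line in lines:
        record = parse_line(line)
        if record is not None:
            groups[(record["pid"], record["tid"])].append(record)
    return dict(groups)


SAMPLE = [
    '[3368:388] |--> 0.015000 --- HANDLE UNKNOWN PHP4 "GET /pidocs/Features/test2.php HTTP/1.1" 0',
    '[3368:388] |<-- 0.020000 5 HANDLE COMPLETED PHP4 "GET /pidocs/Features/test2.php HTTP/1.1" 200',
]

groups = group_by_thread(SAMPLE)
for (pid, tid), records in groups.items():
    print(pid, tid, [(r["handler"], r["result"], r["elapsed"]) for r in records])
```

Within each group the records keep the order in which they were written, so the entry and exit lines of a handler stay paired.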