Lumiera
0.pre.03
»edit your freedom«
Generate synthetic computation load for Scheduler performance tests.
The Scheduler is a service to invoke Render Job instances concurrently, in accordance with a time plan. To investigate the runtime and performance characteristics of the implementation, a well-defined artificial computation load is necessary, comprising the invocation of a large number of Jobs, each configured to carry out a reproducible computation. Data dependencies between jobs can be established to verify the handling of dependent jobs and job completion messages within the scheduler.
A system of connected hash values is used as computation load, akin to a blockchain. Each processing step is embodied in a node, with a hash value computed by combining the hashes of all predecessor nodes. Connectivity is represented as bidirectional pointer links: each node knows its predecessors and successors (if any), while the maximum fan-out or fan-in and the total number of nodes are limited statically. All nodes are placed into a single pre-allocated memory block and are always processed in ascending dependency order. The result hash of the complete processing can thus be computed by a single linear pass over all nodes; alternatively, each node can be scheduled as an individual computation job, which obviously requires that its predecessor nodes have already been computed, otherwise the resulting hash will not match the expectation. If significant per-node computation time is required, an artificial computational load can be generated, controlled by a weight setting computed for each node.
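To make the hashing scheme concrete, here is a minimal standalone sketch (not the actual Node class from this header), assuming boost::hash_combine as the combining operation, which the include list below suggests; the precise combination used by TestChainLoad may differ.

#include <boost/functional/hash.hpp>
#include <cstddef>
#include <vector>

// Minimal stand-in for a processing step: the node's hash is obtained by
// folding the already-computed hashes of all its predecessors into a seed.
struct Step
{
    std::size_t hash{0};                 // non-zero start value would mark a seed node
    std::vector<Step*> pred;             // links to predecessor steps

    std::size_t
    compute()
    {
        for (Step* p : pred)
            boost::hash_combine (hash, p->hash);
        return hash;
    }
};

// Processing all steps in ascending dependency order — or scheduling each one
// as a job only after its predecessors — must yield the same hash values.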
The topology of connectivity is generated randomly, yet completely deterministically, through configurable control functions driven by each node's (hash) value. This way, each node can optionally fork out to several successor nodes, but can also reduce and combine several predecessor nodes; additionally, new chains can be spawned (to simulate the effect of data-loading Jobs without predecessor) and chains can be deliberately pruned, possibly splitting the computation into several disjoint sub-graphs. In any case, the computation always begins with the root node, establishes the node links and marks each open end as an exit node, until all nodes in the pre-allocated node space have been visited. The hash values of all exit nodes are combined into one characteristic hash for the graph. The probabilistic rules controlling the topology can be configured using the lib::RandomDraw component, allowing either a fixed probability to be set or elaborate dynamic configurations to be defined, based on the graph height or node connectivity properties.
A TestChainLoad instance is created with a predetermined maximum fan factor and a fixed number of nodes, which are immediately allocated and initialised. Using builder notation, the control functions can then be configured. Topology generation traverses the nodes, starting with the seed value of the root node, and establishes the complete node connectivity. After this priming, the expected result hash can be retrieved. The node structure can then be traversed, or scheduled as Render Jobs.
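A usage sketch under stated assumptions: the rule setters and accessors shown here (rule(), expansionRule(), pruningRule(), buildTopology(), getHash()) are illustrative guesses at the builder notation described above; only #setWeight() is mentioned explicitly in this documentation. Consult TestChainLoad< maxFan > for the actual interface.

#include "test-chain-load.hpp"          // path within the Lumiera test suite (assumed)

using vault::gear::test::TestChainLoad;

void
primeTestGraph()
{
    TestChainLoad<16> testLoad{256};    // maxFan = 16, 256 pre-allocated nodes

    testLoad.expansionRule (testLoad.rule().probability (0.4))   // chance to fork successors
            .pruningRule   (testLoad.rule().probability (0.2))   // chance to terminate a chain
            .setWeight     (2)                                   // per-node computational weight
            .buildTopology();                                    // traverse and wire all nodes

    auto expectedHash = testLoad.getHash();   // characteristic hash over all exit nodes
    (void) expectedHash;                      // ...to be verified after a scheduled run
}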
A special use case is to build no topology at all and to invoke only #setWeight(). All nodes will then reside at level 0 and be scheduled at t=0, causing the scheduler to process them best-effort in arbitrary order.
A tool for generating roughly calibrated computational load is included, controlled by the Node::weight stepping. Load can be generated either by arithmetic (hash) computations or by accessing and summing up memory in a privately allocated data block. To make this load controllable, the instance is configured with a time base setting, with sensible values between 50µs and 100ms; moreover, a calibration run is necessary once per runtime (stored in static variables). The actual invocation uses a multiple of this base setting, as determined by the Node::weight.
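The following standalone sketch illustrates the calibration principle described above; it is not the actual ComputationalLoad class, and names such as burnLoad() and roundsPerTimeBase() are purely illustrative. An arithmetic hash-crunching loop is measured once per runtime, and each invocation then scales the number of rounds by the given weight.

#include <boost/functional/hash.hpp>
#include <algorithm>
#include <chrono>
#include <cstddef>

using std::chrono::steady_clock;
using std::chrono::microseconds;
using std::chrono::duration_cast;

namespace {
    std::size_t
    crunch (std::size_t rounds)              // arithmetic load: repeated hash mixing
    {
        std::size_t h = 0xcafe;
        for (std::size_t i=0; i<rounds; ++i)
            boost::hash_combine (h, i);
        return h;
    }

    double
    roundsPerTimeBase (microseconds timeBase)   // calibration: rounds equivalent to one timeBase
    {
        const std::size_t PROBE = 1'000'000;
        auto start = steady_clock::now();
        volatile auto sink = crunch (PROBE);
        (void) sink;
        double spent = duration_cast<microseconds> (steady_clock::now() - start).count();
        return double(PROBE) * timeBase.count() / std::max (spent, 1.0);
    }
}

void
burnLoad (unsigned weight, microseconds timeBase = microseconds{100})
{
    static double unit = roundsPerTimeBase (timeBase);      // calibrated once per runtime
    volatile auto sink = crunch (std::size_t (unit * weight));
    (void) sink;
}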
This allows individual steps in the calculation graph to be rendered much more expensive than the associated organisational code and setup, and thus creates the foundation for building realistic test scenarios. For a given graph instance, a runtime weight can be estimated heuristically, expressed in multiples of Node::weight ≡ 1, taking into account the possible parallelisation per level of nodes, as described by an idealised model.
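A plausible rendition of such an idealised model, not necessarily identical to the heuristics in this header: a level with n nodes processed by c workers proceeds in ⌈n/c⌉ sequential waves, so the summed weight of the level is divided by the resulting speed-up factor.

#include <cmath>
#include <cstddef>

struct LevelWeightSketch          // hypothetical stand-in for the LevelWeight record
{
    std::size_t nodes;            // number of nodes on this time level
    std::size_t weight;           // summed Node::weight of this level
};

inline double
estimateLevelCost (LevelWeightSketch const& lw, unsigned concurrency)
{
    if (lw.nodes == 0 or concurrency == 0) return 0.0;
    double waves   = std::ceil (double(lw.nodes) / concurrency);   // sequential batches
    double speedUp = lw.nodes / waves;                             // effective parallelisation
    return lw.weight / speedUp;          // cost in multiples of Node::weight ≡ 1
}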
For the actual test run, a ScheduleCtx is built, using an actual Scheduler instance. Specialised render job functors are provided to perform incremental job planning and the invocation of individual nodes in the graph as computation steps, optionally with computational load. The Scheduler is triggered by inserting the initial planning job in a pre-roll phase, blocking the main thread until a callback job is invoked, which is set as final dependency behind the exit node of the complete graph; the observed runtime is then returned in microseconds, measured from the nominal start point of the schedule.
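A sketch of driving such a test run; the member names setupSchedule(), withLoadTimeBase() and launch_and_wait() are assumptions about the ScheduleCtx wiring described above, not verified against the header.

#include "vault/gear/scheduler.hpp"
#include <chrono>

using namespace std::chrono_literals;

double
runThroughScheduler (vault::gear::Scheduler& scheduler,
                     vault::gear::test::TestChainLoad<16>& testLoad)
{
    return testLoad.setupSchedule (scheduler)    // build a ScheduleCtx for this graph
                   .withLoadTimeBase (500us)     // attach a calibrated per-node load
                   .launch_and_wait();           // pre-roll, block on callback, runtime in µs
}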
The generated topology can be visualised as a graph, using the Graphviz-DOT language. Nodes are rendered from bottom to top, organised into strata according to the time-level and showing predecessor -> successor connectivity. Seed nodes are distinguished by circular shape.
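For illustration only, a hand-written miniature of the kind of DOT output described above (this does not use lib/dot-gen.hpp): two strata rendered bottom-to-top, with the seed node drawn circular.

#include <iostream>

int
main()
{
    std::cout << "digraph {\n"
                 "  rankdir=BT;\n"                        // render levels bottom-to-top
                 "  { rank=same; n0 }\n"                  // stratum: time-level 0
                 "  { rank=same; n1; n2 }\n"              // stratum: time-level 1
                 "  n0 [shape=circle];\n"                 // seed nodes get circular shape
                 "  n0 -> n1;  n0 -> n2;\n"               // predecessor -> successor links
                 "}\n";
}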
The complete graph can be performed synchronously, allowing a baseline run time to be observed when executing all nodes consecutively with the configured load but without any time gaps. This run time in µs can be compared to the timings observed when performing the graph through the Scheduler. Moreover, statistics can be computed over the generated graph, allowing some conclusions to be drawn regarding node distribution and connectivity.
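A sketch of obtaining the baseline and the statistics record; calcRuntimeReference() and computeGraphStatistics() are guesses at the accessor names, while the Statistic and Indicator records listed below hold the actual distribution indicators.

#include "lib/format-cout.hpp"
#include <cstddef>

template<std::size_t maxFan>
void
compareBaseline (vault::gear::test::TestChainLoad<maxFan>& testLoad)
{
    double baseline_us = testLoad.calcRuntimeReference();    // synchronous walk over all nodes, in µs
    auto   stats       = testLoad.computeGraphStatistics();  // Statistic record for this topology

    std::cout << "baseline: " << baseline_us << "µs" << std::endl;
    // individual Indicator fields of `stats` can then be inspected,
    // e.g. counts per node category (seed, fork, join, exit, ...)
    (void) stats;
}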
Definition in file test-chain-load.hpp.
#include "vault/common.hpp"
#include "lib/test/transiently.hpp"
#include "vault/gear/job.h"
#include "vault/gear/scheduler.hpp"
#include "vault/gear/special-job-fun.hpp"
#include "lib/uninitialised-storage.hpp"
#include "lib/test/microbenchmark.hpp"
#include "lib/incidence-count.hpp"
#include "lib/time/timevalue.hpp"
#include "lib/time/quantiser.hpp"
#include "lib/iter-explorer.hpp"
#include "lib/format-string.hpp"
#include "lib/format-cout.hpp"
#include "lib/random-draw.hpp"
#include "lib/dot-gen.hpp"
#include "lib/util.hpp"
#include <boost/functional/hash.hpp>
#include <functional>
#include <utility>
#include <future>
#include <memory>
#include <string>
#include <vector>
#include <tuple>
#include <array>
Classes | |
struct | TestChainLoad< maxFan >::NodeControlBinding::Adaptor< SIG > |
Adaptor to handle further mapping functions. More... | |
struct | TestChainLoad< maxFan >::NodeControlBinding::Adaptor< RES(double)> |
rules may also build solely on the (guessed) height. More... | |
struct | TestChainLoad< maxFan >::NodeControlBinding::Adaptor< RES(size_t)> |
allow simple rules directly manipulating the hash value More... | |
struct | TestChainLoad< maxFan >::NodeControlBinding::Adaptor< RES(size_t, double)> |
allow rules additionally involving the height of the graph, which also represents time. More... | |
class | ChainFunctor |
Baseclass: JobFunctor to invoke TestChainLoad. More... | |
class | ComputationalLoad |
A calibratable CPU load to be invoked from a node job functor. More... | |
struct | Indicator |
Distribution indicators for one kind of evaluation. More... | |
struct | LevelWeight |
struct | TestChainLoad< maxFan >::Node |
Graph Data structure. More... | |
class | TestChainLoad< maxFan >::NodeControlBinding |
Policy/Binding for generation of random parameters by »drawing« based on the node-hash. More... | |
class | RandomChainCalcFunctor< maxFan > |
Render JobFunctor to invoke the calculation of a single Node. More... | |
class | RandomChainPlanFunctor< maxFan > |
Render JobFunctor to perform chunk wise planning of Node jobs to calculate a complete Chain-Load graph step by step. More... | |
class | TestChainLoad< maxFan >::ScheduleCtx |
Setup and wiring for a test run to schedule a computation structure as defined by this TestChainLoad instance. More... | |
struct | Statistic |
Statistic data calculated for a given chain-load topology. More... | |
struct | StatKey |
struct | TestChainLoad< maxFan >::Node::Tab |
Table with connections to other Node records. More... | |
class | TestChainLoad< maxFan > |
A Generator for synthetic Render Jobs for Scheduler load testing. More... | |
Typedefs | |
using | LevelSums = std::array< uint, CAT > |
using | VecU = std::vector< uint > |
Functions | |
double | _uSec (microseconds ticks) |
double | computeWeightFactor (LevelWeight const &lw, uint concurrency) |
simplified model for expense of a level of nodes, computed concurrently. More... | |
uint | defaultConcurrency () |
SpecialJobFun | onetimeCrunch (milliseconds runTime) |
a »throw-away« render-job | |
template<class NOD > | |
auto | prepareEvaluations () |
Variables | |
const uint | CAT = KEYS.size() |
const size_t | DEFAULT_CHUNKSIZE = 64 |
number of computation jobs to prepare in each planning round | |
const size_t | DEFAULT_FAN = 16 |
default maximum connectivity per Node | |
const size_t | DEFAULT_SIZ = 256 |
default node count for the complete load graph | |
const size_t | GRAPH_BENCHMARK_RUNS = 5 |
repetition count for reference calculation of a complete node graph | |
const uint | IDX_SEED = 1 |
const std::array | KEYS = {STAT_NODE,STAT_SEED,STAT_EXIT,STAT_INNR,STAT_FORK,STAT_JOIN,STAT_LINK,STAT_KNOT,STAT_WGHT} |
const size_t | LOAD_BENCHMARK_RUNS = 500 |
repetition count for calibration benchmark for ComputationalLoad | |
const size_t | LOAD_DEFAULT_MEM_SIZE = 1000 |
default allocation base size used if ComputationalLoad.useAllocation | |
const microseconds | LOAD_DEFAULT_TIME = 100us |
default time delay produced by ComputationalLoad at Node.weight==1 | |
const double | LOAD_SPEED_BASELINE = 100 |
initial assumption for calculation speed (without calibration) | |
const auto | SAFETY_TIMEOUT = 5s |
maximum time limit for test run, abort if exceeded | |
const bool | SCHED_DEPENDS = false |
explicitly schedule a dependent job (or rely on NOTIFY) | |
const bool | SCHED_NOTIFY = true |
explicitly set notify dispatch time to the dependency's start time. | |
const Duration | SCHEDULE_LEVEL_STEP {_uTicks(1ms)} |
time budget to plan for the calculation of each »time level« of jobs | |
const Duration | SCHEDULE_NODE_STEP {Duration::NIL} |
additional time step to include in the plan for each job (node). | |
const Duration | SCHEDULE_PLAN_STEP {_uTicks(100us)} |
time budget to reserve for each node to be planned and scheduled | |
const Offset | SCHEDULE_WAKE_UP {_uTicks(10us)} |
tiny offset to place the final wake-up job behind any systematic schedule | |
const auto | STANDARD_DEADLINE = 30ms |
deadline to use for each individual computation job | |
const StatKey | STAT_EXIT {2,"exit"} |
exit node | |
const StatKey | STAT_FORK {4,"fork"} |
forking node | |
const StatKey | STAT_INNR {3,"innr"} |
inner node | |
const StatKey | STAT_JOIN {5,"join"} |
joining node | |
const StatKey | STAT_KNOT {7,"knot"} |
knot (joins and forks) | |
const StatKey | STAT_LINK {6,"link"} |
1:1 linking node | |
const StatKey | STAT_NODE {0,"node"} |
all nodes | |
const StatKey | STAT_SEED {1,"seed"} |
seed node | |
const StatKey | STAT_WGHT {8,"wght"} |
node weight | |
const double | UPFRONT_PLANNING_BOOST = 2.6 |
factor to increase the computed pre-roll to ensure up-front planning | |
Namespaces | |
vault | |
Vault-Layer implementation namespace root. | |
vault::gear | |
Active working gear and plumbing. | |
simplified model for expense of a level of nodes, computed concurrently.
Definition at line 237 of file test-chain-load.hpp.
References vault::gear::test::computeWeightFactor().
Referenced by TestChainLoad< maxFan >::calcExpectedCompoundedWeight(), vault::gear::test::computeWeightFactor(), and TestChainLoad< maxFan >::levelScheduleSequence().