$> dbtoaster [options] <input file 1> [<input file 2> [...]]
1. Command Line Options
- -c <target file>
- Compile the query into a standalone binary. By default, the C++ code generator will be used with G++ to generate the binary. An alternate compiled target language (currently, C++ or Scala) may be selected using the -l flag.
- -l <language>
- Compile the query into the specified target language (see below). By default, the query will be interpreted. The use of this flag overrides any previous -l or -r.
- -o <output file>
- Redirect the compiler's output to the specified file. If used in conjunction with -c, the source code for the compiled binary will be directed to this file. The special output filename '-' refers to stdout. By default, output is directed to stdout, or discarded if the -c flag is used.
- -r
- Run the query (or queries) in interpreter mode, overriding any target language previously specified by -l. This is the default.
- -F <optimization>
- Activate the specified optimization flag. These are documented below.
- -O1 | -O2 | -O3
- Set the optimization level to 1, 2, or 3 respectively. At optimization level 1, compilation is faster and generated code is (usually) easier to understand and follow. At optimization level 3, compilation is slower, but more efficient code is produced. Optimization level 2 is the default. Overrides any prior -O flags provided on the command line.
- -I <dir>
- When invoking a second-stage compiler with the -c flag, add dir to the include file search path.
- -L <dir>
- When invoking a second-stage compiler with the -c flag, add dir to the library file search path.
- -D <macro>
- When invoking a second-stage compiler with the -c flag, define the preprocessor macro macro.
- -g <arg>
- When invoking a second-stage compiler with the -c flag, pass through the argument arg.
- --depth <level>
- Limit the compiler's maximum recursive depth. By default, DBToaster compiles queries with the depth set to infinity.
- --custom-prefix <prefix>
- Prefix all dbtoaster-generated symbols with this character string. Use this if DBToaster generates a symbol that conflicts with a symbol in user code. (Default: "__")
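The options above can be combined on a single command line. The following is a minimal sketch of a helper that assembles such a command line from a script. The function name `build_dbtoaster_command` and the file names `query.sql` and `query.hpp` are hypothetical, and passing one `-F` per optimization flag is an assumption rather than documented behavior. The snippet only builds and prints the command; it does not run dbtoaster.

```python
import shlex


def build_dbtoaster_command(inputs, language=None, binary=None, output=None,
                            opt_level=None, flags=(), depth=None):
    """Assemble a dbtoaster argument list (hypothetical helper, not part of DBToaster)."""
    if not inputs:
        raise ValueError("at least one input file is required")
    if opt_level is not None and opt_level not in (1, 2, 3):
        raise ValueError("optimization level must be 1, 2, or 3")

    args = ["dbtoaster"]
    if language is not None:
        args += ["-l", language]       # target language, e.g. "cpp"
    if binary is not None:
        args += ["-c", binary]         # compile to a standalone binary
    if output is not None:
        args += ["-o", output]         # "-" refers to stdout
    if opt_level is not None:
        args.append(f"-O{opt_level}")  # -O1, -O2, or -O3
    for flag in flags:
        args += ["-F", flag]           # assumption: -F is repeated once per flag
    if depth is not None:
        args += ["--depth", str(depth)]
    # Options come first, input files last, as in the synopsis.
    return args + list(inputs)


cmd = build_dbtoaster_command(
    ["query.sql"],                     # hypothetical input file
    language="cpp",
    output="query.hpp",                # hypothetical output file
    opt_level=3,
    flags=["HEURISTICS-ENABLE-INPUTVARS"],
)
print(shlex.join(cmd))
# dbtoaster -l cpp -o query.hpp -O3 -F HEURISTICS-ENABLE-INPUTVARS query.sql
```

To actually run the command, the list can be passed to `subprocess.run(cmd, check=True)` on a machine where dbtoaster is installed.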
2. Supported Languages
|Language|Command-line Name|Output Format|Description|
|---|---|---|---|
|DBT Relational Calculus|calc|output|DBToaster's internal query representation. This is a direct translation of the input queries.|
|M3|m3|output|A map-maintenance messages program. This is the set of triggers (written in DBT Relational Calculus) that will incrementally maintain the input queries and all supporting data structures.|
|C++|cpp|output/compiled|A C++ class implementing the queries.|
|Scala|scala|output/compiled|A Scala class implementing the queries.|
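A wrapper script may want to validate a language name before invoking the compiler. The sketch below encodes the table as data. The names `SUPPORTED_LANGUAGES` and `check_language` are hypothetical, and the rule that an "output"-only language cannot be combined with -c is an assumption inferred from the Output Format column, not a documented error condition.

```python
# Command-line names from the table, mapped to whether the table lists the
# language as "output/compiled" (True) or "output" only (False).
SUPPORTED_LANGUAGES = {
    "calc": False,
    "m3": False,
    "cpp": True,
    "scala": True,
}


def check_language(name, want_binary=False):
    """Return `name` if it is usable as requested, else raise ValueError."""
    if name not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unknown language: {name!r}")
    # Assumption: only "output/compiled" languages make sense together with -c.
    if want_binary and not SUPPORTED_LANGUAGES[name]:
        raise ValueError(f"{name!r} is listed as output-only")
    return name


print(check_language("cpp", want_binary=True))  # cpp
print(check_language("m3"))                     # m3
```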
3. Optimization Flags
These flags are passed to the dbtoaster compiler with the -F flag. The -O1 and -O3 flags each activate a subset of these flags. -O2 is used by default (no optimization flags active).
- Enable experimental support for incremental view caches. Queries with joins (and correlations) on inequality predicates are implemented in a way that corresponds roughly to nested-loop one-way joins in stream processing (a tree-based implementation is in development). If this flag is on, the compiler will cache and incrementally maintain the results of this one-way join. This is typically a bad idea, since the cost of maintaining the cached values is often higher than the cost of the nested loop scan. However, if the domains of the variables appearing in the join predicate are small, this flag can drastically improve performance (e.g., for the VWAP example query). Future versions of DBToaster will include a cost-based optimizer that automatically applies this flag when appropriate. This optimization is not activated by default at any optimization level.
- Enable experimental support for aggressive materialization of maps with input variables (view caches). It requires the HEURISTICS-ENABLE-INPUTVARS flag to be active. If the calculus optimizations are disabled (see CALC-NO-OPTIMIZE), this option might significantly prolong compilation. Note: the C++ backend might fail to compile certain classes of queries when this flag is on.
- Prevent value terms (variables and comparisons) from being materialized inside maps. In certain cases (e.g., mddb/query2.sql), this option reduces the number of generated maps and speeds up compilation, at the expense of doing more computation at runtime.
- By default, each user-provided (top-level) query is materialized as a single map. If this flag is turned on, the compiler will materialize top-level queries as multiple maps (if it is more efficient to do so), and only combine them on request. For more complex queries (in particular nested aggregate, and AVG aggregate queries), this results in faster processing rates, and if fresh results are required less than once per update, a lower overall computational cost as well. However, because the final evaluation of the top-level query is not performed until a result is requested, access latencies are higher. This optimization is not activated by default at any optimization level.
- Do not generate code for deletion triggers. The resulting programs will be simpler, and sometimes have fewer data structures, but will not support deletion events. This optimization is not activated by default at any optimization level.
- In some cases, it is slightly more efficient to re-evaluate expressions from scratch rather than maintaining them with their deltas (for example, certain queries containing nested aggregates). Normally the compiler's heuristics will make a best-effort guess about whether to re-evaluate or incrementally maintain the expression. If this flag is on, the compiler will incrementally maintain all expressions and never re-evaluate.
- Do not use strings during evaluation. All strings are replaced by their integer hashes (using each runtime's native hashing mechanism) as soon as they are parsed. This makes query evaluation faster, but is not guaranteed to produce correct results if a hash collision occurs. Furthermore, strings that would normally appear in the output are output as their integer hash values instead. This optimization is not activated by default at any optimization level.
- Perform static linking on compiled binaries (e.g., invoke gcc with -static). The resulting binaries will be faster the first time they are run. This optimization is not activated by default at any optimization level.
- Avoid creating empty relation terms during pre-evaluation. Empty relation terms are aggressively propagated throughout expressions in which they occur, and may result in expressions that do not need to be incrementally maintained (because they are guaranteed to be always empty). Activating this flag is only useful if you want to inspect the generated Calculus/M3 code by hand. This optimization is not activated by default at any optimization level.
- When optimizing expressions in DBToaster relational calculus, perform factorization as aggressively as possible. For some queries, particularly those with nested subqueries, this can generate much more efficient code. However, it makes compilation slower on some queries. This optimization is automatically activated by -O3.
- When optimizing expressions in DBToaster relational calculus, inline lifted variables wherever possible, even if the lift term cannot be eliminated entirely. This can produce substantially tighter code for queries with many constants, but slightly increases compilation time. This optimization is automatically activated by -O3.
- In generated code, when a map value becomes 0, remove the value from the map. Resulting programs are more efficient over long stretches of insertions and deletions. This optimization is automatically activated by -O3.
- Request that the second-stage compiler disable any unnecessary optimizations (e.g., by default, GCC is invoked with -O3, but not if this flag is active). This optimization is automatically activated by -O1.
- Use nested tuples in generated C++ programs. This option is mostly used at lower compilation levels (depth 0 or 1) to overcome the Boost limitation that tuples may contain at most 50 attributes.
- When testing for expression equivalence, perform only a naive structural comparison rather than a full matching, which is at least quadratic and potentially exponential. This accelerates compilation, but may result in the creation of duplicate maps. This optimization is automatically activated by -O1.
- Do not apply calculus optimizations that simplify delta expressions. This option prevents range restrictions from being propagated through expressions, which usually leads to significantly worse performance. The resulting code is close to what the naive recursive incremental algorithm would produce. This flag is not activated by default at any optimization level.
- Do not apply query decomposition when computing deltas. This option is not activated by default at any optimization level.
- Do not apply functional optimizations. This optimization is automatically activated by -O1.
- When computing the viewlet transform, use the delta rule for lifts precisely as described in the PODS10 paper. If this flag is not active, a postprocessing step is applied to lift deltas; this step range-restricts the resulting expression to only those tuples that are affected. This optimization is automatically activated by -O1.
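Two of the behaviors described in the list above can be illustrated in a few lines: materializing a top-level query as several maps that are combined only on request, and removing a map entry once its value becomes 0. The following is a conceptual sketch in plain Python for an AVG-per-group query. It is not DBToaster output, and the class `AvgView` and its method names are hypothetical.

```python
class AvgView:
    """Illustrative only: maintains AVG(value) GROUP BY key as two maps."""

    def __init__(self):
        self.sums = {}    # key -> running SUM(value)
        self.counts = {}  # key -> running COUNT(*)

    @staticmethod
    def _apply_delta(table, key, delta):
        # Add the delta, then drop the entry if its value has become 0.
        new_value = table.get(key, 0) + delta
        if new_value == 0:
            table.pop(key, None)
        else:
            table[key] = new_value

    def on_insert(self, key, value):
        self._apply_delta(self.sums, key, value)
        self._apply_delta(self.counts, key, 1)

    def on_delete(self, key, value):
        self._apply_delta(self.sums, key, -value)
        self._apply_delta(self.counts, key, -1)

    def result(self):
        # The two maps are combined only when a result is requested.
        return {k: self.sums.get(k, 0) / n for k, n in self.counts.items()}


view = AvgView()
view.on_insert("a", 10)
view.on_insert("a", 20)
view.on_insert("b", 5)
view.on_delete("b", 5)   # "b" reaches 0 and is removed from both maps
print(view.result())     # {'a': 15.0}
```

Each update touches only the two small maps, while the division is deferred to `result()`. This mirrors the trade-off described above: faster update processing in exchange for higher access latency when a result is requested.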