A number of system configuration variables exist to allow fine tuning of the system to suit the physical hardware and query characteristics. In general the default values will work relatively well for many cases.
The configuration parameters are maintained in the /usr/local/mariadb/columnstore/etc/Columnstore.xml file. In a multiple server deployment these should only be edited on the PM1 server as this will be automatically replicated to other servers by the system. A system restart will be required for the configuration change to take affect.
Convenience utility programs getConfig and setConfig are available to safely update the Columnstore.xml without needing to be comfortable with editing XML files. The -h argument will display usage information.
The NumBlocksPct configuration parameter specifies the percentage of physical memory to utilize for disk block caching. For a Single Server or Combined Multi Server deployment the default value is 50 to ensure enough physical memory for the UM and for a non combined multi serve deployment the default value is 70.
The TotalUmMemory configuration parameter specifies the percentage of physical memory to utilize for joins, intermediate results and set operations on the UM. This specifies an upper limit for small table results in joins rather than a pre-allocation of memory.
In a single server or combined deployment, the sum of NumBlocksPct and TotalUmMemory should typically not exceed 75% of physical memory. With very large memory servers this could be raised but the key point is to leave enough memory for other processes including mysqld.
With version 1.2.2 onwards these can be set to static numeric limits instead of percentages by entering a number with 'M' or 'G' at the end to signify MiB or GiB.
ColumnStore handles concurrent query execution by managing the rate of concurrent batch primitive steps from the UM to the PM. This is configured using the MaxOutstandingRequests parameter and has a default value of 20. Each batch primitive step is executed within the context of 1 extent column according to this high level process:
This scheme allows for large queries to use all available resources when not otherwise being consumed and for smaller queries to execute with minimal delay. Lower values optimize for higher throughput of smaller queries while a larger value optimizes for response time of that query. The default value should work well under most circumstances however the value should be increased as the number of PM nodes is increased.
ColumnStore maintains statistics for table and utilizes this to determine which is the larger table of the two. This is based both on the number of blocks in that table and estimation of the predicate cardinality. The first step is to apply any filters as appropriate to the smaller table and returning this data set to the UM. The size of this data set is compared against the configuration parameter PmMaxMemorySmallSide which has a default value of 64 (MB). This value can be set all the way up to 4GB. This default allows for approximately 1M rows on the small table side to be joined against billions (or trillions) on the large table side. If the size of the small data set is less than PmMaxMemorySmallSide the dataset will be sent to the PM for creation of a distributed hashmap otherwise it is created on the UM. Thus this setting is important to tuning of joins and whether the operation can be distributed or not. This should be set to support your largest expected small table join size up to available memory:
The above logic for a single table join extrapolates out to multi table joins where the small table values are precalculated and performed as one single scan against the large table. This works well for the typical star schema case joining multiple dimension tables with a large fact table. For some join scenarios it may be necessary to sequence joins to create the intermediate datasets for joining, this would happen for instance with a snowflake schema structure. In some extreme cases it may be hard for the optimizer to be able to determine the most optimal join path. In this case a hint is available to force a join ordering. The INFINIDB_ORDERED hint will force the first table in the from clause to be considered the largest table and override any statistics based decision, for example:
select /*! INFINIDB_ORDERED */ r_regionkey from region r, customer c, nation n where r.r_regionkey = n.n_regionkey and n.n_nationkey = c.c_nationkey
When a join is very large and exceeds the PmMaxMemorySmallSide setting it is performed in memory in the UM server. For very large joins this could exceed the available memory in which case this is detected and a query error reported. A number of configuration parameters are available to enable and configure usage of disk overflow should this occur:
A mariadb global or session variable is available to specify a memory limit at which point the query is switched over to disk based joins:
© 2019 MariaDB
Licensed under the Creative Commons Attribution 3.0 Unported License and the GNU Free Documentation License.