W3cubDocs

/OpenTSDB

Query Examples

The following is a list of example queries using an example data set. We'll illustrate a number of common query types that may be encountered so you can get an understanding of how the query system works. Each time series in the example set has only a single data point stored and the UIDs have been truncated to a single byte to make it easier to read. The example queries are all Metric queries from the HTTP API and only show the m= component. See /api/query for details. If you are using a CLI tool, the query format will differ slightly so read the documentation for the particular command.

Sample Data

Time Series

TS# Metric Tags TSUID
1 sys.cpu.system dc=dal host=web01 0102040101
2 sys.cpu.system dc=dal host=web02 0102040102
3 sys.cpu.system dc=dal host=web03 0102040103
4 sys.cpu.system host=web01 010101
5 sys.cpu.system host=web01 owner=jdoe 0101010306
6 sys.cpu.system dc=lax host=web01 0102050101
7 sys.cpu.system dc=lax host=web02 0102050102
8 sys.cpu.user dc=dal host=web01 0202040101
9 sys.cpu.user dc=dal host=web02 0202040102

UIDs

Name UID
Metrics
cpu.system 01
cpu.user 02
Tagks
host 01
dc 02
owner 03
Tagvs
web01 01
web02 02
web03 03
dal 04
lax 05
doe 06

Warning

This isn't necesarily the best way to setup your metrics and tags, rather it's meant to be illustrative of how the query system works. In particular, TS #4 and 5, while legitimate timeseries, may screw up your queries unless you know how they work. In general, try to maintain the same number and type of tags for each timeseries.

Under the Hood

You may want to read up on how OpenTSDB stores timeseries data here: Storage. Otherwise, remember that each row in storage has a unique key formatted:

<metricID><normalized_timestamp><tagkID1><tagvID1>[...<tagkIDN><tagvIDN>]

The data table above would be stored as:

01<ts>0101
01<ts>01010306
01<ts>02040101
01<ts>02040102
01<ts>02040103
01<ts>02050101
01<ts>02050102
02<ts>02040101
02<ts>02040102

When you query OpenTSDB, here's what happens under the hood.

  • The query is parsed and verified to make sure that the format is correct and that the metrics, tag names and tag values exist. If a single metric, tag name or value doesn't exist in the system, it will kick back an error.
  • Then it sets up a scanner for the underlying storage system.
    • If the query doesn't have any tags or tag values, then it will grab any rows of data that match <metricID><timestamp>, so if you have a ton of time series for a particular metric, this could be many, many rows.
    • If the query does have one or more tags defined, then it will still scan all of the rows matching <metricID><timestamp>, but also perform a regex to return only the rows that contain the requested tag.
  • Once all of the data has been returned, OpenTSDB organizes it into groups, if required
  • If downsampling was requested, each individual time series is down sampled into smaller time spans using the proper aggregator
  • Then each group of data is aggregated using the specific aggregation function
  • If the rate flag was detected, each aggregate will then be adjusted to get the rate.
  • Results are returned to the caller

Query 1 - All Time Series for a Metric

m=sum:cpu.system

This is the simplest query to make. TSDB will setup a scanner to fetch all data points for the metric UID 01 between <start> and <end>. The result will be the a single dataset with time series #1 through #7 summed together. If you have thousands of unique tag combinations for a given metric, they will all be added together into one series.

[
  {
    "metric": "cpu.system",
    "tags": {},
    "aggregated_tags": [
      "host"
    ],
    "tsuids": [
      "010101",
      "0101010306",
      "0102050101",
      "0102040101",
      "0102040102",
      "0102040103",
      "0102050102"
    ],
    "dps": {
      "1346846400": 130.29999923706055
    }
  }
]

Query 2 - Filter on a Tag

Usually aggregating all of the time series for a metric isn't particularly useful. Instead we can drill down a little by filtering for time series that contain a specific tagk/tagv pair combination. Simply put the pair in curly braces:

m=sum:cpu.system{host=web01}

This will return an aggregate of time series #1, #4, #5 and #6 since they're the only series that include host=web01.

[
  {
    "metric": "cpu.system",
    "tags": {
      "host": "web01"
    },
    "aggregated_tags": [],
    "tsuids": [
      "010101",
      "0101010306",
      "0102040101",
      "0102050101"
    ],
    "dps": {
      "1346846400": 63.59999942779541
    }
  }
]

Query 3 - Specific Time Series

What if you want a specific timeseries? You have to include every tag and coresponding value.

m=sum:cpu.system{host=web01,dc=lax}

This will return the data from timeseries #6 only.

[
  {
    "metric": "cpu.system",
    "tags": {
      "dc": "lax",
      "host": "web01"
    },
    "aggregated_tags": [],
    "tsuids": [
      "0102050101"
    ],
    "dps": {
      "1346846400": 15.199999809265137
    }
  }
]

Warning

This is where a tagging scheme will stand or fall. Let's say you want to get just the data from timeseries #4. With the current system, you are unable to. You would send in query #2 m=sum:cpu.system{host=web01} thinking that it will return just the data from #4, but as we saw, you'll get the aggregate results for #1, #4, #5 and #6. To prevent such an occurance, you would need to add another tag to #4 that differentiates it from other timeseries in the group. Or if you've already commited, you can use TSUID queries.

Query 4 - TSUID Query

If you know the exact TSUID of the timeseries that you want to retrieve, you can simply pass it in like so:

tsuids=sum:0102040102

The results will be the data points that you requested.

[
  {
    "metric": "cpu.system",
    "tags": {
      "dc": "lax",
      "host": "web01"
    },
    "aggregated_tags": [],
    "tsuids": [
      "0102050101"
    ],
    "dps": {
      "1346846400": 15.199999809265137
    }
  }
]

Query 5 - Multi-TSUID Query

You can also aggregate multiple TSUIDs in the same query, provided they share the same metric. If you attempt to aggregate multiple metrics, the API will issue an error.

tsuids=sum:0102040101,0102050101
[
  {
    "metric": "cpu.system",
    "tags": {
      "host": "web01"
    },
    "aggregated_tags": [
      "dc"
    ],
    "tsuids": [
      "0102040101",
      "0102050101"
    ],
    "dps": {
      "1346846400": 33.19999980926514
    }
  }
]

Query 6 - Grouping

m=sum:cpu.system{host=*}

The * (asterisk) is a grouping operator that will return a data set for each unique value of the tag name given. Every timeseries that includes the given metric and the given tag name, regardless of other tags or values, will be included in the results. After the individual timeseries results are grouped, they'll be aggregated and returned.

In this example, we will have 3 groups returned:

Group Time Series Included
web01 #1, #4, #5, #6
web02 #2, #7
web03 #3

TSDB found 7 total timeseries that included the "host" tag. There were 3 unique values for that tag (web01, web02, and web03).

[
  {
    "metric": "cpu.system",
    "tags": {
      "host": "web01"
    },
    "aggregated_tags": [],
    "tsuids": [
      "010101",
      "0101010306",
      "0102040101",
      "0102050101"
    ],
    "dps": {
      "1346846400": 63.59999942779541
    }
  },
  {
    "metric": "cpu.system",
    "tags": {
      "host": "web02"
    },
    "aggregated_tags": [
      "dc"
    ],
    "tsuids": [
      "0102040102",
      "0102050102"
    ],
    "dps": {
      "1346846400": 24.199999809265137
    }
  },
  {
    "metric": "cpu.system",
    "tags": {
      "dc": "dal",
      "host": "web03"
    },
    "aggregated_tags": [],
    "tsuids": [
      "0102040103"
    ],
    "dps": {
      "1346846400": 42.5
    }
  }
]

Query 7 - Group and Filter

Note that the in example #2, the web01 group included the odd-ball timeseries #4 and #5. We can filter those out by specifying a second tag ala:

m=sum:cpu.nice{host=*,dc=dal}

Now we'll only get results for #1 - #3, but we lose the dc=lax values.

[
  {
    "metric": "cpu.system",
    "tags": {
      "dc": "dal",
      "host": "web01"
    },
    "aggregated_tags": [],
    "tsuids": [
      "0102040101"
    ],
    "dps": {
      "1346846400": 18
    }
  },
  {
    "metric": "cpu.system",
    "tags": {
      "dc": "dal",
      "host": "web02"
    },
    "aggregated_tags": [],
    "tsuids": [
      "0102040102"
    ],
    "dps": {
      "1346846400": 9
    }
  },
  {
    "metric": "cpu.system",
    "tags": {
      "dc": "dal",
      "host": "web03"
    },
    "aggregated_tags": [],
    "tsuids": [
      "0102040103"
    ],
    "dps": {
      "1346846400": 42.5
    }
  }
]

Query 8 - Grouping With OR

The * operator is greedy and will return all values that are assigned to a tag name. If you only want a few tag values, you can use the | (pipe) operator instead.

m=sum:cpu.nice{host=web01|web02}

This will find all of the timeseries that include "host" values for "web01" OR "web02", then group them by value, similar to the * operator. Our groups, this time, will look like this:

Group Time Series Included
web01 #1, #4, #5, #6
web02 #2, #7
[
  {
    "metric": "cpu.system",
    "tags": {
      "host": "web01"
    },
    "aggregated_tags": [],
    "tsuids": [
      "010101",
      "0101010306",
      "0102040101",
      "0102050101"
    ],
    "dps": {
      "1346846400": 63.59999942779541
    }
  },
  {
    "metric": "cpu.system",
    "tags": {
      "host": "web02"
    },
    "aggregated_tags": [
      "dc"
    ],
    "tsuids": [
      "0102040102",
      "0102050102"
    ],
    "dps": {
      "1346846400": 24.199999809265137
    }
  }
]

© 2010–2016 The OpenTSDB Authors
Licensed under the GNU LGPLv2.1+ and GPLv3+ licenses.
http://opentsdb.net/docs/build/html/user_guide/query/examples.html