I few months back we finally had our solar power system installed. This is all quite nice from an ecological point of view but one of the best things is that it yields a lot of data. I like data.
So first thing was to set up TimescaleDB, Node-RED and Grafana in Docker and attach it all to SolarEdge’s monitoring portal. That worked great and I’m tracking production, load, import, export and WHAT THE HELL IS THAT FLAT LINE DOING THERE!?
While it turns out that the inverter eventually updates all values, the monitoring portal itself can be somewhat flaky.
This his what the result set looks like:
+---------------------------------+----+----+---+------+------+
|time |grid|load|pv |import|export|
+---------------------------------+----+----+---+------+------+
|2022-01-15 08:20:01.698941 +00:00|220 |260 |40 |220 |0 |
|2022-01-15 08:20:11.678517 +00:00|1500|1530|30 |1500 |0 |
|2022-01-15 08:20:21.927625 +00:00|1500|1530|30 |1500 |0 |
|2022-01-15 08:20:31.638271 +00:00|1500|1530|30 |1500 |0 |
+---------------------------------+----+----+---+------+------+
... this just goes on and on every 10 seconds ...
+---------------------------------+----+----+---+------+------+
|2022-01-15 09:46:21.969070 +00:00|1500|1530|30 |1500 |0 |
|2022-01-15 09:46:31.960681 +00:00|1500|1530|30 |1500 |0 |
|2022-01-15 09:46:41.981632 +00:00|250 |300 |50 |250 |0 |
|2022-01-15 09:46:51.926296 +00:00|370 |260 |630|0 |370 |
|2022-01-15 09:47:01.943392 +00:00|360 |260 |620|0 |360 |
+---------------------------------+----+----+---+------+------+
This raises a couple of questions:
- How to detect if rows had repeated values over time?
- What are the time periods that need to be reconstructed?
Step 1: Detect unchanged rows
For the sake of readability I’ll stick to one column in this example.
I’ll go with grid
in this one since it’d be quite unusual to have the exact same energy usage over time.
The real query is a bit more complex of course but you get the gist.
SELECT s."time",
s.grid AS current,
lag(s.grid) OVER w1 AS previous,
coalesce(s.grid <> lag(s.grid) OVER w1, FALSE) AS changed
FROM solaredge AS s
WINDOW w1 AS (ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
ORDER BY "time";
There are a couple of things happening here:
First thing to notice is the WINDOW FUNCTION.
It gives us an additional way to access our results via a set of functions.
(Think of it as aggregate functions without the aggregation …)
Our window w1
is defined as ROWS BETWEEN 1 PRECEDING AND CURRENT ROW
which will let us access the current row and its predecessor.
This can be done using lag(<column>) OVER w1
.
We’re now able to compare the current value of s.grid
with its previous value using s.grid <> lag(s.grid) OVER w1
.
Next we wrap this comparison with coalesce()
.
The first row won’t have a preceding row, so lag()
will yield NULL
.
While it’s usually used to return the first non null value in a list of arguments, it can also be used to substitue a default value for null.
We go with false
.
Our result now looks like this:
+---------------------------------+-------+--------+-------+
|time |current|previous|changed|
+---------------------------------+-------+--------+-------+
|2022-01-15 08:20:11.678517 +00:00|1500 |220 |true |
|2022-01-15 08:20:21.927625 +00:00|1500 |1500 |false |
|2022-01-15 08:20:31.638271 +00:00|1500 |1500 |false |
|2022-01-15 08:20:41.648238 +00:00|1500 |1500 |false |
|2022-01-15 08:20:51.607577 +00:00|1500 |1500 |false |
...
|2022-01-15 09:46:31.960681 +00:00|1500 |1500 |false |
|2022-01-15 09:46:41.981632 +00:00|250 |1500 |true |
+---------------------------------+-------+--------+-------+
Step 2: Get time period for reconstruction
To get the time periods we can simply wrap the previous query as a subquery and select only the rows WHERE changed IS TRUE
.
SELECT *
FROM (SELECT s."time",
grid AS current,
lag(s.grid) OVER w1 AS previous,
coalesce(s.grid <> lag(s.grid) OVER w1, FALSE) AS changed
FROM solaredge AS s
WINDOW w1 AS (ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
ORDER BY "time") s1
WHERE changed IS TRUE;
Giving us …
+---------------------------------+-------+--------+-------+
|time |current|previous|changed|
+---------------------------------+-------+--------+-------+
|2022-01-15 08:20:11.678517 +00:00|1500 |220 |true |
|2022-01-15 09:46:41.981632 +00:00|250 |1500 |true |
|2022-01-15 09:46:51.926296 +00:00|370 |250 |true |
|2022-01-15 09:47:01.943392 +00:00|360 |370 |true |
|2022-01-15 09:47:11.918865 +00:00|380 |360 |true |
+---------------------------------+-------+--------+-------+
To finally get the start- and end-times of our missing data, we wrap this one once more:
SELECT "time" AS start,
lead("time") OVER w2 AS end,
lead("time") OVER w2 - "time" AS duration
FROM (SELECT *
FROM (SELECT s."time",
grid AS current,
lag(s.grid) OVER w1 AS previous,
coalesce(s.grid <> lag(s.grid) OVER w1, FALSE) AS changed
FROM solaredge AS s
WINDOW w1 AS (ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
ORDER BY "time") s1
WHERE changed IS TRUE) s3
WINDOW w2 AS (ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING);
As you can see, we apply another window function, but this time ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING
, covering the current line and the following one.
This one can be accessed using lead()
instead of lag()
.
Last thing is to subtract "time"
from lead("time")
and we’re done:
+---------------------------------+---------------------------------+----------------------------------------------------+
|start |end |duration |
+---------------------------------+---------------------------------+----------------------------------------------------+
|2022-01-15 08:20:11.678517 +00:00|2022-01-15 09:46:41.981632 +00:00|0 years 0 mons 0 days 1 hours 26 mins 30.303115 secs|
|2022-01-15 09:46:41.981632 +00:00|2022-01-15 09:46:51.926296 +00:00|0 years 0 mons 0 days 0 hours 0 mins 9.944664 secs |
|2022-01-15 09:46:51.926296 +00:00|2022-01-15 09:47:01.943392 +00:00|0 years 0 mons 0 days 0 hours 0 mins 10.017096 secs |
|2022-01-15 09:47:01.943392 +00:00|2022-01-15 09:47:11.918865 +00:00|0 years 0 mons 0 days 0 hours 0 mins 9.975473 secs |
|2022-01-15 09:47:11.918865 +00:00|2022-01-15 09:47:35.931409 +00:00|0 years 0 mons 0 days 0 hours 0 mins 24.012544 secs |
+---------------------------------+---------------------------------+----------------------------------------------------+