Hey,
I am calculating a moving average of [value] (n = 10) across a table. To reduce computation, I am using an app formula on a physical column, so [value_moving_avg] is calculated once during record entry and then saved.
I have a workflow which triggers on any update or delete which changes the [value]. The workflow identifies the following 9 records, and updates updates their [value_moving_avg] so the dataset remains accurate.
My moving average formula is shown below:
AVERAGE(
SELECT(
table[value],
IN(
[id],
TOP(ORDERBY(FILTER(table,[created_datetime]<[_THISROW].[created_datetime]),[created_datetime], TRUE),9)
)
)+LIST([value])
)
The identification of the rows to include in the moving average calculates rather quickly (the portion of the formula nested inside the IN()).
The challenge is, taking the referenced rows and turning them into [value] to be averaged. The IN() formula is very slow in this implementation. Now that my dataset has grown, the workflow rule hangs or fails entirely.
Any thoughts on how to optimize something like this?
@Aleksi please donโt say its not possible at this time
Solved! Go to Solution.
By the nature of the calculation, it will require multiple rows. One row for every row in the table, sort of as described here: https://www.essentialsql.com/sql-puzzle-calculate-moving-averages/
Maybe it will provide more clarity; here is what I am producing for my users:
The bigger problem is AppSheets operations for filtering, ordering, and selecting data from a list only returns a list of primary keys. There isnt a simple way to convert this list of primary keys back to a piece of data stored along with them. Its unfortunate that we canโt specifically return a column when using operators like ORDERBY(). Rarely am I wanting a list of primary keys, i usually want some associated value.
Iโve found a solution to streamline the calculation, although its clunky in implementation it works.
The solution requires two physical columns: one column is EnumList base type REF, and holds a list of the 10 related records I would like to average. Essentially, this part of the bigger formula:
TOP(ORDERBY(FILTER(table,[created_datetime]<[_THISROW].[created_datetime]),[created_datetime], TRUE),9)
Then, I have a second formula which looks up each associated value directly, rather than using IN(). Essentially,
AVERAGE(LIST(
[value],
LOOKUP(INDEX([_THISROW].[EnumListREF],1),โtableโ,โidโ,โvalueโ),
...,
LOOKUP(INDEX([_THISROW].[EnumListREF],9),โtableโ,โidโ,โvalueโ)
))
Doing the calculation this way has it performing an order of magnitude better. I still have a workflow rule to update associated records on updates or deletes.
I should also note, in case anyone uses this post in the future, that there are other options. Each has some benefits and downsidesโฆ to name two:
Cheers @Aleksi ! Would love to know if you have any other methods of doing this that may be better implented.
User | Count |
---|---|
15 | |
15 | |
8 | |
7 | |
4 |