SILICES AND SELECTS

Gustavo_Eduardo · 08-01-2024 04:53 AM

It's a great time to start a discussion about synchronization, speed, and expression optimization in AppSheet. While we don't directly code in AppSheet, these concepts significantly impact our app performance.

How about we each share tips for improving app efficiency in this post?

I am sure our community can benefit from these contributions and enhance the performance of their apps. Can you also check: AppSheet Work Standard Protocol

Gustavo_Eduardo

VIRTUAL COLUMS?

They are calculated in real-time, and all columns are recalculated during every synchronization. The more columns you have, the longer the synchronization will take, as this is not just a database read operation but also involves additional calculations for virtual column formulas that rely on table data.

It’s important to understand that calculations are performed as many times as they are requested, regardless of whether the column is virtual or physical. If you need to perform the same calculation multiple times (e.g., the same FILTER()), instead of including the expression directly in the field’s property, create a physical column with the expression and mark it as hidden. This way, the calculation is only performed once and then reused whenever needed, improving efficiency and keeping the "code" or expressions cleaner and more manageable. Of course, the database will grow by one column (so this concept should be handled carefully), but the impact on synchronization time is significant.

Imagine how clean and simple development becomes when, instead of seeing long chains of expressions, you see clear and specific names.

You can verify these optimizations using the Performance Monitor.

For example, with Show Ifs: if you have a sequence of columns to display in a form, it’s not efficient to write complete nested expressions in every column. It’s better to create a physical column with a name like [S_fieldName] for each column and write the visibility expression there. If an expression is shared between columns, point to the physical column instead of repeating the calculation. This approach is not only more efficient for calculations but also easier to maintain, as you only need to modify the expression in the [S_fieldName] column to change the entire logic.

What other tips do you have to share?

kvngo94

@Gustavo_Eduardo Hi Gustavo, could you please make a more example on this tip ?

Gustavo_Eduardo

Hello @kvngo94 , no problem! Sorry for the delay.

Example: Displaying Fields Progressively in a Form

Let’s say you have three fields in your form:

[Label]
[Comment]
[Date]

The goal is to display the fields progressively when the form is opened: first [Label], then [Comment], and finally [Date] (though you could have more fields).

Step-by-Step Solution:

1. Configure Show If for the Second Field

To ensure [Comment] appears only after [Label] has been completed, use the following expression in the Show If property of the [Comment] field:

ISNOTBLANK([Label])

This formula ensures that the [Comment] field is shown only if the [Label] field is not blank.

2. Configure Show If for the Third Field

If you also want [Date] to appear only after [Comment] has been completed, use the following expression:

AND(ISNOTBLANK([Label]), ISNOTBLANK([Comment]))

This checks that both preceding fields have been completed.

3. Scalability for Multiple Fields

If you have many fields and keep adding conditions with AND() in each Show If, maintaining these formulas can become complex and cumbersome. Additionally, each column will independently recalculate its expression, which can impact performance.

Efficient Solution:

Create physical columns to store the conditions for each step. These columns act as indicators to control the visibility of the fields.

Configuration Example:

Create a Physical Column for Each Condition:
- [S_Comment]:
```
ISNOTBLANK([Label])
```
- [S_Date]:
```
AND([S_Comment], ISNOTBLANK([Comment]))
```
Use These Columns for Subsequent Fields:
Instead of writing long, repetitive expressions in Show If, simply reference the column that stores the condition:
- For the [Comment] field, the Show If property will be:
```
[S_Comment]
```
- For the [Date] field, the Show If property will be:
```
[S_Date]
```

Benefits of This Strategy:

If you need to modify the conditions, you only edit the formula in the corresponding column, not in every Show If.
It’s easy to add more fields by following this logic.
It improves performance by avoiding repeated recalculations of complex expressions in each column.

Summary:

This technique modularizes your conditions and simplifies the maintenance of your app. It will allow you to scale and adjust the logic with ease!

Gustavo_Eduardo

SILICES AND SELECTS

Many times is better to use a slice instead of a FILTER. In AppSheet is more efficient for several reasons:

Performance and Scalability

Slices are applied on the server before the data is sent to the user's device, so only the necessary data is transferred. This reduces the amount of data processed and the load on the user's device. On the other hand, FILTER expressions are applied on the device, which means all the data is first transferred to the device and then filtered. This can be inefficient and slow, especially with large datasets.

Resource Usage

Slices optimize system resources by defining a subset of data that the app handles from the start, reducing memory and processing needs on the user's device. Using FILTER() in expressions can be resource-intensive because AppSheet has to review all relevant data each time the expression is evaluated.

Maintenance and Readability

Slices separate the filtering logic from the rest of the app, making it easier to maintain and understand since it's defined in one place. FILTER expressions can become complex and hard to maintain if used in multiple places, especially if the filtering criteria change.

User Experience

Slices improve the user experience by reducing initial load times and updates, providing a faster and smoother experience. FILTER expressions can cause delays and a less smooth experience due to the additional processing required on the user's device.

Example:

Suppose you have an app for orders and need to show only the pending orders for each customer.

Using `FILTER()`:

FILTER(Orders, [Status] = "Pending" AND [Customer_ID] = [_THISROW].[Customer_ID])

The problem with this approach is that AppSheet has to review all order records every time this expression is evaluated, which can be slow if there are many records.

Using a Slice:

Create a slice called "Pending_Orders" with the condition [Status] = "Pending".
In the Customers table, use REF_ROWS("Pending_Orders", "Customer_ID") to get the pending orders for each customer.

The advantage here is that AppSheet filters the pending orders on the server, sending only the necessary data to the user's device, thus improving overall performance.

Conclusion:

Using slices is a best practice in AppSheet to optimize performance, reduce load on the user's device, and improve app maintainability.

Aurelien

Hi

@Gustavo_Eduardo wrote:

FILTER(Orders, [Status] = "Pending" AND [Customer_ID] = [_THISROW].[Customer_ID])

Hi @Gustavo_Eduardo

Thanks for the tip.

Additionnaly, a FILTER() or SELECT() expression running on a related table can be hugely improved using:

SELECT([Related TABLEs][id], condition-to-apply)

or

SELECT([Related TABLEs][column-to-read],condition-to-apply)

Because [related TABLEs] already gives a filtered list of records, the expected calculation runs faster.

Gustavo_Eduardo

Applause! Thanks Aurelien, the community is very fortunate to have people like you and several others who have been working here for a long time.

Gustavo_Eduardo

FILTERING PARENTS AND CHILDS IN A NESTED SEQUENCE

We often find ourselves in situations where we want to select related lists.

The simplest case to understand is the following (in my view):

Country
State
City

If we relate the tables with a foreign key, AppSheet will take care of creating a virtual column in the parent table that relates the parent ID with its child records.

If for some reason we need to filter the child records of the selected parent, the best practice is to use DEREFERENCE.

For example, when choosing a country, it will have its Ref_Rows. So, it makes sense to use an expression in the Valid If field of the child like [_THISROW].[Country].[Related States].

The same applies to cities, [_THISROW].[State].[Related Cities].

This way, it is not necessary to configure the Show If of each of the predecessors because if one of them is invalid, everything resets. Moreover, by pointing to a column that is already calculated in a system-created virtual column, we save AppSheet resources and therefore the application runs more smoothly.

Gustavo_Eduardo

In addition, you might think that there are some registers that you don't want to filter, for example, those whose status has changed; in that case, you create a slice and point to the slice, not the original table.

leinharte

I figured all sentences out except the following one:

it is not necessary to configure the Show If of each of the predecessors because if one of them is invalid, everything resets.

Could you explain what kind of situation this is designed for?

Thank you.

Gustavo_Eduardo

Hi @leinharte , of cours

Many times you have a form whose fields you want/need to show one by one as they are filled out. For example, you want to show the first name once the last name has been completed. In those cases, you need to configure the Show If property of the field. In the case of the first name/last name example, it would be ISNOTBLANK([surname]), for instance. So, once you write the last name, the first name will be shown.

Now, an example that almost always occurs is that of related children. If you have a chained structure like Country/State/City, you know that the country is USA, the state could be California, and the city could be Los Angeles.

Sometimes it happens that you configure everything, and you need to show the related states and cities. You can create a FILTER() in suggested values to achieve this, but in my view, the most practical thing is to enter the list in Valid If (in this case) which will allow you to take the valid values for each field.

In the case of City, it could be [_THISROW].[State].[related Cities]

In the case of States, it could be [_THISROW].[Country].[Related States]

Let's make an assumption. You are writing USA, then California, and then choose Los Angeles, but you remember that it was not USA but Mexico. Well, by changing the country, the other fields corresponding to state and city disappear since they are not valid, however, the ShowIf property is not configured but marked as "visible". In this specific case, it works without the need to configure it.

Gustavo_Eduardo

Optimizing Application Performance by Reducing Filter Expressions

Suppose I have a child table that has a numeric type field where a natural value is recorded, with the purpose of numbering the records with a unique number as they grow. The parent table will naturally have a field called [Related Childs]. There is an option that I often use.

To number incrementally, I create a virtual column called [UDX]. This virtual column contains a filter expression as follows:

FILTER(
"Child Table",
AND(
[ID] <> [_THISROW].[ID],
[Parent Table] = [_THISROW].[Parent Table]
)
)

Essentially, I’m telling it to count all records that have the same parent, ignoring the ID of this row.

Then, as the "initial value" of the index field, I place the expression COUNT([UDX]) + 1.

Every time I add a record, it counts all existing records for the parent and adds one.

A more efficient way to do the same thing is to get rid of the FILTER.

For example, we can use a virtual column already created in the parent table. In the Initial Value property, we can place the following expression, obtaining the same result with half the synchronization speed since filtering is done only once and not twice. In the "Initial Value" property of the [UDX] field, enter:

[_THISROW].[Parent Table].[Related Childs] - LIST([_THISROW].[ID])

Notice how neat it becomes and how efficient the application is when we reduce the filtering. Note that when we write filter expressions, we almost always impact the performance of our applications. I hope this example helps you not only for a case like this but for most of the filters you need to make.

Have a good day.

Gustavo_Eduardo

EFFICIENTLY CREATING SLICES FOR PARENT-CHILD TABLES IN APPSHEET

Let’s consider a common scenario. Imagine a table containing purchase operations and another child table that contains the purchased items. Additionally, let's envision two areas: a purchasing area and a warehouse area. Logically, the warehouse cannot receive anything that has not been purchased.

In the first table of purchase operations (P_PURCHASE_TUAD), we will have:

[ID], [label], [date]

In the second inventory table (P_INVENTORY_TUAD), we will have:

[ID], [label], [I_itemPurchaseQty], [I_itemPendingQty], [I_itemDeliveredQty], [date]

As we can see, the P_INVENTORY_TUAD table contains a column for the purchased quantity, another for the quantity pending delivery, and a column for the quantity delivered to the warehouse.

It may be necessary to create slices from the operations and items table.

As mentioned earlier, in AppSheet, it is more efficient to filter with a slice than with FILTER() or SELECT() since the request is sent directly to the server and the data arrives already filtered.

For example, we create a slice for pending items in the inventory called "P_INVENTORY_SUAD_pending" with the condition [I_itemPendingQty] > 0.

In this way, we are asking the server: "give me all the data from the table where the pending quantity is greater than zero," meaning all items that have not been delivered.

Now, we want to create a new slice for the "P_PURCHASE_TUAD" table that stores those purchases that are pending. These will be the ones containing child records with a pending quantity greater than zero.

An efficient way to do this is:

In the P_PURCHASES_TUAD table, we create a virtual column similar to Related P_INVENTORY_TUADs (which is created automatically) and we will call it Related P_INVENTORY_SUAD_pendings. Instead of writing:

REF_ROWS("P_INVENTORY_TUAD", "P_PURCHASES_TUAD")

We will write:

REF_ROWS("P_INVENTORY_SUAD_delivered", "P_PURCHASES_TUAD")

This will bring the list of child records but from the slice P_INVENTORY_SUAD_pending, meaning the pending ones.

Next, we create the slice for the P_PURCHASES_TUAD table.

We name the slice P_PURCHASES_SUAD_pending.

In the conditional expression, we place the expression ISNOTBLANK([Related P_INVENTORY_SUAD_pendings]).

This expression will bring all the purchases that contain pending child records.

The column [Related P_INVENTORY_SUAD_pendings] could have been created with a FILTER():

FILTER(
"P_INVENTORY_TUAD",
AND(
[P_PURCHASES_TUAD] = [_THISROW].[ID],
[I_inventoryPendingQty] > 0
)
)

In no case have we used a SELECT() or FILTER(); we have simply filtered with slices, which has resulted in better performance.

Gustavo_Eduardo

How to Identify and Remove Unnecessary Virtual Columns in AppSheet

Before deleting a virtual column in AppSheet, it is advisable to rename it first to check if it is being used somewhere in the application. This includes verifying its use in other columns, actions, views, bots, or formatting rules. If renaming the column does not affect any functionality, it is a sign that the column is not being used and can be safely deleted without causing issues, thereby optimizing resource usage.

Often, due to the visual nature of AppSheet, we tend to create columns that eventually become redundant or unnecessary. A good practice is to periodically review the tables, rename suspect virtual columns, and assess their impact. This process not only keeps the application organized but also improves sync performance by removing unnecessary elements.

Steve

Although these are calculated in real-time,

Virtual columns are recalculated "in real-time" only when they are part of a row being modified, as when the user is in a form view, or when modified by an action. There are some other circumstances where virtual columns are updated immediately, but they are few, not well-defined, and are best not to be expected.

All virtual columns are recomputed when the app syncs.

Gustavo_Eduardo

Thank you so much, Steve, for your valuable contribution. I’ll give it a try and share my results. On that note, I wanted to ask you something. I’ve found that structuring my applications by using virtual columns with reusable calculations across different parts of the app has significantly simplified my design and reduced errors. For instance, if I need to use a FILTER() that applies to multiple columns, I prefer to consolidate the logic into a single virtual column and reference it, rather than replicating the formula across various columns. This way, if I need to make adjustments, I only have to modify the virtual column, and everything else updates automatically. This approach is far more efficient than editing each individual column, especially when there are more than three affected, which would otherwise be a tedious and error-prone task.

However, I’m concerned that this approach might negatively impact the app’s synchronization time. If that’s the case, I’d be willing to change my strategy. Previously, I used to include expressions directly in ShowIf fields, but if these were nested and dependent on other columns, adding a new column to the table required reprogramming the entire visibility flow, which was quite complex. What’s your opinion on my current methodology? Is it viable from a performance standpoint?
Also, could you share how you would approach this?

Steve

Your approach of using virtual columns for Show if is good as long as it doesn't affect sync time. As long as you are careful in your expressions, your approach can provide great improvements! I have myself used this approach.

Gustavo_Eduardo

Mil gracias @Steve

Gustavo_Eduardo

I’ve been thinking a lot about this concept and trying to come up with a practical solution. I’m not sure if it’s the best one, but I’d like to hear your opinion, dear @Steve . What do you think about the following?

My approach is to create reusable expressions. They might work in some cases (those related to the current row) and not in others, such as those requiring a specific context (this is just a hypothesis, as I haven’t tested it yet).

Would it be feasible to replace the virtual columns used in showIf with real columns in the database of the boolean type? I know this would increase the number of columns in the database, but it could simplify maintenance while keeping synchronization performance unaffected. What’s your take on this?

Steve

If you have a chain of columns to display in a form, it is inefficient to write complete nested expressions in each column. It’s better to create a virtual column called [ShowIf_Column] for each column and write the display expression there. If an expression is shared between columns, point to the virtual column instead of repeating the calculation. This is not only more efficient for calculation but also easier to maintain.

This can be a useful approach, but care should be taken if the expressions include queries (SELECT() or the like). All virtual columns are computed during a sync, so adding more virtual columns has the potential to increase sync time, whereas Show if is computed by the app itself when columns are actually displayed. So keeping an expression in Show if won't affect sync but may affect display performance. There are trade-offs that should be considered.

Steve

@Gustavo_Eduardo wrote:

Would it be feasible to replace the virtual columns used in showIf with real columns in the database of the boolean type?

That's fine, too, with the understanding that physical column values only update when the row itself is updated. If the expression needs to react to independent changes outside the row itself, you must use a virtual column.

Gustavo_Eduardo

I’m running these tests, and by the way, I’ve noticed some instant advantages. I was concerned about database storage, but since these are boolean fields and I place them all on one side of the table, it doesn’t affect performance or organization.

On the other hand, I was wondering (will the same happen with lists?)

For example, if I have a Categories table and a Subcategories table (child of Categories), the system creates a “Related Subcategories” filter for the Categories table.

When naming a subcategory in the form, I usually create a validation rule to prevent writing the same name (to avoid inconsistencies). For instance, if I’ve typed “drinks,” I can’t type it again.

There are several ways to do this, but the most practical one I found, which has the least impact, is to create a column that stores the list of subcategories (excluding the current ID) and then verify that the name isn’t in that list. How do I do that?

I call the column [UDX] (unique duplicated excluded) and inside it, I place the following expression:

[Categories].[Related Subcategories] - LIST([_THISROW])

This brings the list of sibling subcategories without including this row.

Then in the validation rule for the subcategory name (validIf), I write:

NOT(

IN(

)

[UDX] is a list that could be obtained using SELECT() or FILTER(), however, adding a FILTER() would impact the app’s performance, so I decided to reuse something already calculated by the system.

My question is: Since [UDX] is a dereferenced list, and assuming it has less impact than a new FILTER since it’s a reading of another pre-calculated column, is it better to keep this in a virtual column or do you think it’s better to move it to a real column? Often, there are many records in the list, and that's why I think adding such a column to the database would be unnecessary (unless there are real and tangible improvements during synchronization).

Also, there is always the need to read from the database and write values, which I’m not sure will have more impact in the case of lists. In the case of boolean fields, it’s clearly not an issue, but with large lists (I assume without being fully aware, it could have an impact).

For these cases, I’ve considered adding an action every time the form is saved, which resets the stored value (to avoid reading lists when requesting data), but that adds another step to the save process, which could also lead to errors.

Steve

do you think it’s better to move [the UDX virtual column] to a real column?

I'd leave it virtual.

For these cases, I’ve considered adding an action every time the form is saved, which resets the stored value (to avoid reading lists when requesting data), but that adds another step to the save process, which could also lead to errors.

You have to consider supportability. These ideas can seem revolutionary when you come up with them, but six months from now you may find the complexity challenging when you no longer remember why you did it this way.

Gustavo_Eduardo

Gracias por todo el tiempo que nos brindas y por tu conocimiento.

Gustavo_Eduardo

REFs o ENUMs base type REF?

I learned about this concept late and by chance. I'm sharing it with you. When you want a dropdown list in your forms, you usually use "ref" type columns, but this has a huge impact on synchronization since a virtual "RELATED" field is created in the referenced table.

These are dynamic lists in virtual columns that consume resources with each synchronization. If you want to improve your experience, choose Enum with base type Ref instead of Ref. This field does not create a reference in the table being called, and you can use all the records from the table that you need. It's very satisfying to see how performance improves and tables aren't filled with unnecessary RELATEDs.

Have a good day!

Marc_Dillon

Do you see an actual change in the sync time, or are you just looking at the virtual columns chart in the performance profile? If the latter, it has been explained by Appsheet personnel that the related columns are calculated in parallel with the loading of the tables, which just means that the size of the bar in that chart is just the amount of time it takes to load the tables, and that loading time is present whether or not the Related column exists.

Gustavo_Eduardo

Good morning, @Marc_Dillon !

It depends. Noticing changes and measuring them in seconds would lead to errors due to the speed of data transfer at any given moment, which would make it an unreliable way to ensure what I’m saying is accurate. In conclusion, the only way to measure it quantitatively (although not a reliable parameter either) would be to check AppSheet’s performance profile.

That said, we all know that the fewer virtual columns, the better, since there are fewer calculations to perform (whether during each sync or in parallel). Fewer calculations lighten the code, which should be a positive thing in terms of scalability.

To directly answer your question, I don’t notice significant changes when there are fewer "Related Columns," but there should be some impact. Moreover, the editor tends to be cleaner when unnecessary columns are removed, and it’s always better not to have them if they’re not useful.

Let me give you an example:
I have four tables:

Categories
Families
Types
Components

In the first three, I define things, and in the Components table, I select the items that will become components. For better organization, I arranged it this way: If all of them were references in the Components table, Categories would have a "related to Components" column, and the same would happen for Families and Types. However, I need Families to have a reference-type Categories column, Types to have a reference-type Families column, and in Components, I only need Types to be a reference. Categories and Families can instead be enum columns with a base type of reference.

Once I’ve selected the Categories and Families columns, I can access the necessary Types in Components through a simple dereference. If my case required them to be references, I wouldn’t hesitate to change it. But in my opinion, when it’s not necessary, adding it is not worth it. Let’s travel light.

Democratic post to discuss best practices for application performance

Example: Displaying Fields Progressively in a Form

Step-by-Step Solution:

1. Configure Show If for the Second Field

2. Configure Show If for the Third Field

3. Scalability for Multiple Fields

Efficient Solution:

Configuration Example:

Benefits of This Strategy:

Summary:

SILICES AND SELECTS

Performance and Scalability

Resource Usage

Maintenance and Readability

User Experience

Example:

Using FILTER():

Using a Slice:

Conclusion:

FILTERING PARENTS AND CHILDS IN A NESTED SEQUENCE

Optimizing Application Performance by Reducing Filter Expressions

Using `FILTER()`: