Is AppSheet capable of supporting long-term usage?

Wanted to see if anyone else had any experience supporting apps for multiple years.

One of our clients has been using our app for almost 2 years now; they collect data on a weekly basis. Their situation is a bit unique: they have around 10 tablets they take offline Monday through Friday into rural areas to collect data (medical information, nothing too large-scale), and they return to internet access to sync on the weekends.

It had been working well until the client started reporting slower sync times (their internet is around 3.5 Mbps download and upload; one weekend it took them over 14 hours to finish syncing). What's more concerning is that the apps sometimes won't start anymore unless connected to Wi-Fi (see attached screenshots). This is happening with increasing frequency.

My concerns are:

  • The app won't be able to support this client in the long run (5-10+ years) with all the data they've been collecting, and will just bug out more
  • The app will exceed the 1 million cell limit, if not soon then within 5-10 years

Is there anything we can do to address these issues and continue using AppSheet? We love the simplicity of AppSheet, but we have been exploring other low-code platforms for an on-premise approach.

Attachments: WhatsApp Image 2023-05-19 at 06.08.53.jpeg, WhatsApp Image 2023-06-04 at 16.31.37.jpeg, WhatsApp Image 2023-06-08 at 14.31.56.jpeg, WhatsApp Image 2023-06-15 at 13.13.23.jpeg



@riaxi18 wrote:

One of our clients has been using our app for almost 2 years now; they collect data on a weekly basis. Their situation is a bit unique: they have around 10 tablets they take offline Monday through Friday into rural areas to collect data (medical information, nothing too large-scale), and they return to internet access to sync on the weekends.


It's great to hear about that use case! Sounds perfect for AppSheet.


@riaxi18 wrote:

over 14 hours to finish syncing


When using Sheets as an app data source, I have observed that Sheets itself bogs down the sync when it has, say, many tens of thousands of rows. Depending on how and when you need to be able to access what data, you might be able to split any such tables into current and archived rows, e.g., have new data added to the current table and then asynchronously moved or cloned to the archived one.
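As a rough sketch of that split (the Visits table and its [CreatedAt] date column below are hypothetical names, not from the original app), a scheduled bot or action could pick the rows to move into the archive table with something like:

```
SELECT(Visits[ID], [CreatedAt] < TODAY() - 90)
```

The bot would copy those rows to the archive table and then delete them from the current table, keeping the table the app syncs small.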


@riaxi18 wrote:

1 million cell limit


I don't know what this refers to. If you're using Sheets, the documented limit is 10M cells. Regardless, unless all the cells are in a single table, you could simply separate the tables into multiple files.

To expand on @dbaum's comments:

10M cells per spreadsheet is fairly big - for a single table, that's 100,000 rows with 100 columns.

Typically, large app data is due to history, and only a very small fraction of that data is needed on a day-to-day basis. So, with care, I believe an AppSheet app can serve its users indefinitely - even when it's Sheets-based. How? By archiving older data completely out of the data source, and by limiting the downloaded data to what each user needs through Security Filters (e.g., a field tech may only need to see entries he personally made over the past week, while a supervisor may need to see all entries by all field techs over the past month).
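For example, a field tech's Security Filter might look like the sketch below (the [CreatedBy] and [VisitDate] column names are assumptions for illustration):

```
AND(
  [CreatedBy] = USEREMAIL(),
  [VisitDate] >= TODAY() - 7
)
```

A supervisor's filter could widen the window to `TODAY() - 30` and drop the [CreatedBy] condition.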

Once the minimal set of data has been established, the next focus is on Sync time.  See my previous comments in this thread.  Bottom line: once data is properly filtered, slow Sync times are almost always due to virtual column processing.  As I said before, most virtual columns can be re-implemented to eliminate them from Sync times.  This usually involves implementing normal columns that are updated by actions only when needed.

For example, in an Order system the total order amount is summed from the order details.  The tendency is to use a Virtual column for this calculation.  On every Sync the Order Total is re-calculated on every Order row.  Even if there are optimizations that only re-calc when there was a change, the check still needs to be performed on every row in the table.  As the table grows, so does the time to perform this check/recalc.

However, if the Order Total was a normal column and you used actions to recalculate the Order Total only when a user changed data that impacted the amount, that recalc is performed only that one time and then treated as a data column change within Syncs.  That virtual column processing is eliminated from the Sync time altogether.  Repeat for other virtual columns and you can/will drastically reduce the Sync time.
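As a sketch of that pattern (the Orders and Order Details tables here are hypothetical, and [Related Order Details] is assumed to be AppSheet's auto-generated reverse reference), the on-save action on Order Details would run a "set the values of some columns" step on the parent Order, assigning the normal [Order Total] column with:

```
SUM([Related Order Details][Amount])
```

Because this runs only when a detail row actually changes, the sum is never recomputed during a Sync.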

Note that this will introduce many more actions, creating app bloat, but the benefit in improved Sync time is well worth it.  The end user's perception is EVERYTHING!

Which virtual columns?  Use the Performance Analyzer; it will show you the top 10 virtual columns impacting Sync times.

I hope this helps!

 

This is super helpful for thinking about sync time! Any idea what might be causing the infinite sync/loading page in the screenshot?

My team encountered similar issues. We attempted to resolve the situation by using Apps Script to back up and remove records periodically to declutter. We are still observing the situation, but preliminary feedback from users looks optimistic.

You will definitely need to switch to SQL and make use of security filters. That is what I do to handle the 200,000+ rows I have in one "parent" table and the growing "child" table for field inspections. 

That said, my sync times are still pretty miserable, and crashing is a constant problem (the white screen of death). I think I've got it as good as I can get it, but I would love more speed and stability.


@Ryan_Wagner wrote:

You will definitely need to switch to SQL and make use of security filters.


This is not entirely accurate.  

Security Filters are beneficial in combination with a SQL database ONLY IF they are kept simple.  Basically, the Security Filter is converted into a SQL SELECT query to pull only the filtered rows from the database.  For complex Security Filters, the behavior with SQL is exactly the same as with Sheets: all data is loaded onto the server, the Security Filter is applied, and the filtered rows are handed to the user devices.  Where the threshold between simple and complex lies is anyone's guess.
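To illustrate with a hypothetical example (the table and column names are made up): a simple filter such as

```
[AssignedTo] = USEREMAIL()
```

can be pushed down to the database as roughly `SELECT * FROM Jobs WHERE AssignedTo = 'user@example.com'`. A filter that references other tables, e.g.

```
IN([JobID], SELECT(Assignments[JobID], [Tech] = USEREMAIL()))
```

may need the AppSheet server to evaluate part of the expression first, or, in the worst case, fall back to loading all rows and filtering server-side.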


@Ryan_Wagner wrote:

That said, my sync times are pretty miserable still and crashing is a constant problem


This brings up an often overlooked fact.  The majority of Sync time is taken up by virtual column processing.  Switching to SQL typically only improves the read times of the data, which is a very small part of the Sync time.  The good news is that, in many cases, the virtual column functionality can be implemented in a different way, eliminating its processing time completely from the Sync.

 

 

I was referring to his scale concerns...

Do you have any experience "hard-wiring" a column in SQL to do the heavy lifting that a virtual column does? I am new to BigQuery, but I'm wondering if it's possible to write calculations in a column...

I have reduced my virtual columns down to just a couple of ref-rows expressions. I know the Performance Analyzer is said not to be the best way to visualize what is taxing an app, but it has me clocked at a 4-second sync time... The reality is more like 20 seconds. My crew will typically come back from the field with anywhere from 50-200 edits still syncing in the queue.

Yes, it can.  But BigQuery is NOT for a transactional system.

There is, however, a unique function that allows BigQuery to reference a Google Sheet as a data source.  I have used this successfully for a small number of records (10K).  I still leverage this data as part of an aggregation (multiple joins and pivot reporting).

Still, you need to filter back to 100K records to pull and use.  So far, in the same AppSheet app, I have pulled 3 instances of 100K simultaneously: a 20-second load time with all 300K records.  All read-only data, but it then allows me to create related data against each record as needed in separate tables, and the sync is short for adding records.

This is the farthest I have stress tested. 

 


@Ryan_Wagner wrote:

I am new to BigQuery, but I'm wondering if it's possible to write calculations in a column...


Be careful with BigQuery.  My understanding, from checking into it for a client, is that BigQuery is more for data warehousing, such as number-crunching operations and reporting.  Below is what I found on some of its limitations.  Carefully double-check them and make sure the limitations work for your app.

WillowMobileSys_0-1688749444430.png


@Ryan_Wagner wrote:

but I'm wondering if it's possible to write calculations in a column...


Not that I'm aware of.  Many databases allow stored procedures to perform calculations and update columns, but I would only recommend this for background info - i.e., data that is calculated independently of the user-facing data.  For example, maybe a monthly summary table.


@Ryan_Wagner wrote:

I know the performance analyzer is said to not be the best way to visualize what is taxing an app


Correct!  The Performance Analyzer is just a guide.  There are too many factors that can impact Sync times for an accurate estimate to be determined.  Connectivity is one of the most frequent violators.

 

 

 

 

I can attest to BQ handling our payload, though it can be slow, as I said above... Roughly 2,000 edits a day. That is an interesting warning message you shared, though. I had not seen that, and it might have dissuaded me from trying it out...

Thanks for all the info! I am a heavy AppSheet user, but I have a lot to learn about the mechanics of it all...

@WillowMobileSys Hi.

Where are these BigQuery limitations from?

I am very interested in moving data sources to BigQuery, and the only limitations I had read about were the 100k row limit, partitioned tables, and cloud access, based on https://support.google.com/appsheet/answer/11905680?sjid=5573571566701618435-AP#limits


@ChristianLi wrote:

Where are these BigQuery limitations from?


Very good question.  I cannot seem to find it now.  I had reviewed dozens of pages on behalf of a client and quickly took a screen shot.  It is so unlike me not to include a link to the source as well, but I was in a hurry that day.

I think the AppSheet article you included is a good place to start.  You can also go here for details directly from BigQuery.  I have to admit, though, I don't understand even half of the limitations they outline.

 

I think those are older limitations from a specific solution that has APIs connecting into BigQuery.  https://community.microstrategy.com/s/article/Known-Limitations-of-Adding-Data-into-Google-BigQuery?... 

I use BigQuery extensively and very successfully.  Keep in mind, BigQuery is for data warehousing, not transactional systems, so those limitations may be real, or close, in a different context.  Up to a terabyte of data, it does extremely well without noticeable strain.

Mostly I have used it for ETL needs, pulling from a large number of other systems of record for reference.  So mostly batch processing that does not require real-time data (1-hour intervals).

The source documentation on limitations is quite extensive: https://cloud.google.com/bigquery/quotas


@WillowMobileSys wrote:

For example, in an Order system the total order amount is summed from the order details.  The tendency is to use a Virtual column for this calculation.  On every Sync the Order Total is re-calculated on every Order row.  Even if there are optimizations that only re-calc when there was a change, the check still needs to be performed on every row in the table.  As the table grows, so does the time to perform this check/recalc.

However, if the Order Total was a normal column and you used actions to recalculate the Order Total only when a user changed data that impacted the amount, that recalc is performed only that one time and then treated as a data column change within Syncs.  That virtual column processing is eliminated from the Sync time altogether.  Repeat for other virtual columns and you can/will drastically reduce the Sync time.

Note that this will introduce many more actions creating app-bloat, but the benefit in improved Sync time is well worth it.  The end users perception is EVERYTHING!


💯

Interesting topic 🙂

When using SQL as a data source, the most important thing is to reduce the data amount with the security filter, as you first need to decrease the data transferred from the SQL server to the AppSheet server. That in turn reduces the amount sent from the AppSheet server to the client. When the data on the AppSheet server is minimized, the app needs less computation for virtual columns. The best option is to have no virtual columns at all, but you can't always avoid that.

So.. then it's a question of which security filters can be converted to a SQL query. It's known that you can use AND(), CONTAINS(), IN(), and comparisons like equals, greater than, or less than. But that is actually not the whole truth. If you need OR() functionality, you can write it, for example, as
IN(USEREMAIL(), Employees[Email] + Companies[Email]). The catch here is that the list is generated on the AppSheet server before it's sent to the SQL server. You can verify that with the performance analyzer, as it shows what is converted for the SQL server. Check for example this one..

AleksiAlkio_0-1688838744946.png

The security filter in this case is 
AND(
[RestaurantID]=[_THISUSER].[Restaurant],
IN([RestaurantID],Restaurants[ID])
)

What all this means: you can actually write quite complex expressions, and they will be converted to a query.. when the expression generates a list.

I think there are two or three issues highlighted here: scale and sync time.

Scale appears to have been covered quite well, as does read sync time. Google Sheets will be slow with large tables, and SQL will be much faster (if properly implemented) and have more storage.

One aspect that has not been covered is syncing back to the database after being offline for some time. I have users who have had similar experiences of multiple hours of uploading changes. I only have what users tell me to go on, but it also seems that devices going to sleep during this may be involved; either way, it still takes a long time. Would anyone be able to comment on ways to improve this?


@Alan_Thorp wrote:

sync back to the database after being off-line for some time


I also welcome suggested techniques to address this issue.

Even when continuously online, users who make lots of data changes in a short period must wait an inordinate length of time for the dozens or hundreds of changes to upload. When one of those changes triggers a server-side action they must wait on before continuing to work, it's a big problem - e.g., they make 100 data changes, most are still pending sync when they make the 101st change, and their ongoing work depends on server-side actions triggered by that 101st change.

The first thing to check is how much difference there is (if any) when the adds or edits are done from the queue versus individually while online. You can find that out from the Performance Analyzer.

I would also check whether you are triggering bots that take time to execute. One possible way to help is to add a 5-minute wait (task), as then the queue doesn't need to wait for them.


@Alan_Thorp wrote:

One aspect that has not been covered is sync back to the database after being off-line for some time. ...

...Would anyone be able to comment on ways to improve this?



@dbaum wrote:

I also welcome suggested techniques to address this issue.


Unfortunately, edits are applied to the data source one by one.  Each edit needs to go through a cycle of processing and then updating the data before the next edit is applied.  This includes any automation triggers.  If the processing for each edit is long-running, then 100+ edits may take a while.

At the very least, the user needs to be sure they have a good, solid connection, preferably WiFi, before syncing the offline edits.  A weak connection will delay things even more.

What can you do to help speed up these offline Syncs?

These are the same for ANY app, offline or not:

  • Minimize the number of edits.  If you have processing such that columns on a row are updated by different actions, each one of those counts as a separate edit, and any automation triggered will likely be duplicated - i.e., wasted processing time.   Try to adjust the processing so ALL column changes are handled with a single action, resulting in just a single row edit and therefore a single automation trigger.
  • Do as much processing as possible with actions on Save instead of relying on automation.  These actions run on the device side and are much, MUCH faster than automation.  The result on the server side is a simple data row edit rather than an expensive automation trigger.  A side-benefit is that the user making the changes can see the results immediately on their device.  Others, obviously, will have to wait for the Sync.
  • Be stingy with automation triggers.  Only trigger when absolutely needed.  I think there is a tendency, especially on Updates, to be okay with an automation triggering on back-to-back edits even though the updates it makes result in no real additional row changes.  But each of these triggers has processing overhead that is expensive - even if it's doing nothing.  Over multiple unnecessary edits, the wasted time can add up.
  • Avoid complicated automations.  Try to build automations that perform isolated, discrete tasks rather than combining them into a single automation.  They can then run in parallel.
  • Analyze large data tables for splitting into multiple tables.  Many times, tables with a large number of columns end up being a combination of several "groups" of data that are better served as separate tables.  The larger single table has a much bigger data footprint than separate smaller tables would, which translates into longer processing times for that larger single table.
  • Keep the working data as trim as possible through Security Filters and/or archiving.  These affect server-side processing of the submitted edits just as much as the data processed on the device side.
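On the first point, here is a hypothetical sketch: rather than three separate actions that each set one column (three edits, three automation triggers), a single "Data: set the values of some columns in this row" action can assign them all in one edit:

```
[Status]     = "Reviewed"
[ReviewedBy] = USEREMAIL()
[ReviewedAt] = NOW()
```

(The column names are made up; the point is one action, one row edit, one trigger.)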

 

A little bit longer description 🙂 Well written!

That's a great overview, @WillowMobileSys. Thanks for taking the time to document those important considerations. It's worth posting as a tip if you're so inclined.


@WillowMobileSys wrote:

Do as much processing as possible with actions on Save instead of relying on automation.  These actions run on the device side and are much, MUCH faster than automation.  The result on the server side is a simple data row edit rather than an expensive automation trigger.  A side-benefit is that the user making the changes can see the results immediately on their device.  Others, obviously, will have to wait for the Sync.


I have found multi-step actions locally triggered by a save or an action button to be NOT reliable at all; much depends on whether the user patiently waits for the sync to complete or just closes the browser, truncating the process and resulting in data discrepancies and errors. From a data-protection perspective, server-side actions are far more secure even if they run slower.


@WillowMobileSys wrote:

Be stingy with automation triggers.  Only trigger when absolutely needed.  I think there is a tendency, especially on Updates, to be okay with an automation triggering on back-to-back edits even though the updates it makes result in no real additional row changes.  But each of these triggers has processing overhead that is expensive - even if it's doing nothing.  Over multiple unnecessary edits, the wasted time can add up.


Agreed. And the way to do that is by using the awesome [_ThisRow_Before] and [_ThisRow_After], to target only specific triggers.
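For example (using a hypothetical [Status] column), a bot's condition can be restricted so it fires only on a meaningful transition rather than on every edit of the row:

```
AND(
  [_ThisRow_Before].[Status] <> [_ThisRow_After].[Status],
  [_ThisRow_After].[Status] = "Completed"
)
```

Back-to-back edits that don't change [Status] then never trigger the automation at all.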

 


@Joseph_Seddik wrote:

I have found that multi-step actions locally triggered by a save or an action button to be NOT reliable at all, and it depends much on whether the user would patiently wait for the sync to complete or would he just close the browser, thus truncating the process resulting in data discrepancies and errors. From a data-protection perspective, server-side actions are far more secure even if they run slower. 


I could see that - especially in a browser app where users will simply close the browser tab when they think they are finished.  If it is a process where the user will just close the app, then... yes, I would use an automation as well.  These circumstances, in my experience, should be rare.

The focus of my suggestions was on offline apps, which implies the use of mobile devices.  In those circumstances I would recommend using device-side actions.  These should continue even if the user minimizes the app - though I am not 100% certain of that.

If the user is in the habit of fully closing an app on a mobile device, then that is a training concern.  I have seen an AppSheet message that states there are edits not yet submitted, so I think there is some protection against this use case.  I am not certain whether this same message is presented on desktop.

 

@WillowMobileSys 

Not rare at all. I'd say never depend on local actions with desktop users. Training cannot prevent accidental browser closure, or even shutdowns, especially with laptops.

Mobile apps, on the other hand, keep the list of pending actions alive even after closure. However, no action will be executed and no sync will be performed unless the app is running in the foreground, and this results in bigger delays in updating the data. Try it yourself: sync an app, then open another app; return after a minute or so and you'll find the app still syncing. This is also the case with the browser: it has to run in the foreground for the OS to prioritize processing power for its pending tasks. And when the browser tab is closed, all is lost.


@Joseph_Seddik wrote:

Not rare at all. I'd say never depend on local actions with desktop users. Training cannot prevent accidental browser closure, or even shutdowns, especially with laptops.


Not sure I'd go as far as saying "never", and I would place "accidental" closure in the rare category.

I suppose which way to go might depend on the app and its usage.  I have three clients that use the Desktop version nearly every day.  I have yet to see any issues related to local actions being executed.  But in all 3 cases, the AppSheet app is their work tool, so they have it open all day and probably leave the tab open from day to day.

 

@WillowMobileSys You are blessed with having such customers 🙂

But I'm curious, what do you do when such "rare" events occur?


@Joseph_Seddik wrote:

But I'm curious, what do you do when such "rare" events occur?


I don't know; I haven't had to handle those issues yet.

However, now that you've raised it, there is one use case where workers Clock Out and several things happen, resulting in a longer-than-human-patience-will-allow process (15 to 20 seconds).  Users, in their rush to go home, close the app, and their Clock Out details do not fully get saved.  We have been instructing them to make sure they leave the app open until all edits are pushed through, and so far the message seems to be getting through.  I am not aware of any additional issues, but it's possible I haven't been informed.

This may be a candidate for exactly the issue you have pointed out.  I'll look at it and see if splitting between device-side and server-side processing helps resolve it permanently.

 

@WillowMobileSys Thank you!

You're welcome!  And welcome back to the Community!


@Joseph_Seddik wrote:

And when the browser tab is closed, all is lost


Btw, mobile or desktop, changes are saved on the device until you have the app in the foreground and it starts syncing again. They don't get lost if someone closes the app; it's just that the changes are not synced. The only case where this wouldn't be true is if the user has set an option in their browser of choice to clear the cache every time they close it.


@WillowMobileSys wrote:

These should continue even if the user minimizes the app - though I am not 100% certain of that. 

If the user is in the habit of fully closing an app on a mobile device then that is a training concern.  I have seen an AppSheet message that states there are edits not yet submitted so I think there is some protection against this use case.  I am not certain if this same message is presented on desktop


The app working in the background is something I haven't seen, and even where it has worked, it depends on the device/OS.

AppSheet shows a message on desktop. On iOS I haven't seen it, since you just swipe and go home to close the app. On Android you can do the same, but maybe (I haven't used Android for a couple of months) it shows something when you try to close with the back button.

In any case, I'd love to see AppSheet find a way to do background sync even when the app is closed, and maybe the Quick Sync stuff will solve some of this.


@SkrOYC wrote:

Those don't get lost if someone closes the app


Yes, they certainly do! Let me be precise: first, we are speaking about local actions, not bots.

Case 1: Launch a Single action on a set of rows

Suppose I select 20 rows and launch an action on all of them, or launch an action that operates on 20 rows returned by an expression. If I close the tab after, say, 5 rows, it might complete on the remaining 15 when I open the app again; this largely depends on the browser, assuming the cache hasn't been cleared.

On Chrome, for example (the number 1 preference of my users), if you close the tab while having others open, or if you close the browser but have another instance running, you will not lose it. You open the app again and the action will continue with the 15 remaining rows. However, if you close ALL instances on your computer, you are sure to stop everything.

Safari on Mac, on the other hand, does a better job: even if you close all instances, you won't lose the pending action, perhaps because you can never really close Safari on Mac. But how many use that? Almost no users.

Case 2: Launch a Group action even on a single row - LOST CASE

No matter what browser you use or what you do, if you have a grouped action that launches a sequence of, say, 10 actions and you close the browser tab in the middle, the sequence gets broken and there's no way to make it continue from where it stopped. This is the type that I would never launch as a local action.

I'm talking specifically about pending syncs, not when the browser is stuck computing stuff.
I had syncs pending in the past, and even after I shut down the laptop, it showed me the pending ones the next time I opened the app.
