Hi everyone,
I was trying to validate an email field properly in AppSheet, and while I searched the community, I couldnโt find something that covered all the key cases.
So I came up with this expression that works without using REGEX, and it checks for the essentials (and a bit more):
That the field is not blank
That it contains an @
That thereโs at least one .
after the @
That there are at least 2 characters after the last dot, making sure the domain extension is valid (like .com
, .org
, .net
, etc.), and filtering out suspicious cases like someone@mail.
or user@web.c
AND(
ISNOTBLANK([email]),
CONTAINS([email], "@"),
CONTAINS(RIGHT([email], LEN([email]) - FIND("@", [email])), "."),
LEN(
INDEX(
SPLIT(
RIGHT([email], LEN([email]) - FIND("@", [email])),
"."
),
COUNT(
SPLIT(
RIGHT([email], LEN([email]) - FIND("@", [email])),
"."
)
)
)
) >= 2
)
Because according to ICANN (the global domain authority), there are no valid top-level domains (TLDs) with a single character. So this rule helps prevent typos or invalid emails that could slip through.
Suggestion of @Suvrutt_Gurjar
I'm also leaving a formula up here in case the comment ever gets pushed down, by the authority Suvrutt_Gurjar, much more efficient than the one I presented and without so many internal evaluations.
AND(
LEN(SUBSTITUTE(@&EXTRACTDOMAINS([Email]), EXTRACTMENTIONS([Email]),""))>=3,
ANY(EXTRACTEMAILS([Email]))=[Email]
)
Here is an even more compact suggestion that allows us to achieve the same result, also suggested by our colleague @Suvrutt_Gurjar
Here are his words
"Please note that for checking if a field is left blank, a default approach in AppSheet is to make the field "required." So my suggested expression does not include checking it for a blank field."
AND(
LEN([email]) - FIND(".", [email])>=2,
ANY(EXTRACTEMAILS([email]))=[email]
)
While the previous expression is more compact than all the others and is more efficient, there's a blind spot there that many might find interesting, and that's the validation with real Top-Level Domains (TLDs).
If you try to write user@email.com.x, this email will be considered valid because it doesn't verify that there are at least two letters after the second dot.
For this, I have tried the following alternative:
AND(
ISNOTBLANK(EXTRACTEMAILS([user_id])),
ANY(EXTRACTEMAILS([user_id])) = [user_id],
LEN(
INDEX(
SPLIT([user_id], "."),
COUNT(SPLIT([user_id], "."))
)
) >= 2
)
This requires the user to enter a second letter after the final dot.
If you also need to ensure the user enters this in lowercase, you can use:
AND(
ISNOTBLANK(EXTRACTEMAILS([user_id])),
ANY(EXTRACTEMAILS([user_id])) = [user_id],
LEN(
INDEX(
SPLIT([user_id], "."),
COUNT(SPLIT([user_id], "."))
)
) >= 2,
FIND(
[user_id],
LOWER([user_id])
) > 0
)
If you also want to ensure the user enters a unique email, you could create a validation to prevent duplicates of the same email. To do this, you create a real column that will be a kind of shadow column; you can call it 'aux_unique_ids' of type EnumList, with Base Type Ref, in the same table they are in. In the formula, they write table[id] - LIST([_THISROW])
, for example. This will bring the list of IDs from the table, excluding the current row, with a very efficient formula, that is, practically read-only.
Then, the formula would look like this:
AND(
NOT(
IN(
[_THISROW].[user_id],
[aux_unique_ids][user_id]
)
),
ISNOTBLANK(EXTRACTEMAILS([user_id])),
ANY(EXTRACTEMAILS([user_id])) = [user_id],
LEN(
INDEX(
SPLIT([user_id], "."),
COUNT(SPLIT([user_id], "."))
)
) >= 2,
FIND(
[user_id],
LOWER([user_id])
) > 0
)
This expression validates that there are no repeated emails, that it has email characteristics, and that it is also in lowercase (so they don't have to do it using a Bot to transform it to lowercase).
"The following is a list of common domains (But I don't think they're all); I preferred not to add anything because it wouldn't be dynamic, and would require our intervention each time a new domain appeared or disappeared:
LIST(
"com", "org", "net", "edu", "gov", "mil", "int", "biz", "info",
"name", "pro", "co", "io", "me", "tv", "us", "uk", "mx", "ar", "de", "fr", "es", "ca", "au"
)
So it has simply remained like this.
I hope this tip is as helpful to you as it is to me."
Hi @Gustavo_Eduardo ,
Have you explored the EXTRACT() functions?
With EXTRACT() functions you could do the needful with a much shorter expression, something like
AND(
LEN(SUBSTITUTE(@&EXTRACTDOMAINS([Email]), EXTRACTMENTIONS([Email]),""))>=3,
ANY(EXTRACTEMAILS([Email]))=[Email]
)
Additionally valid_if using EXTRACTEMAILS() also checks that there are no other special characters such as say $, %, & etc. are entered in email that are not allowed.
You have no idea how much I appreciate this, my friend @Suvrutt_Gurjar ! I changed the formula, and yours is more efficient. Thank you so much for sharing it with the community! A hug!
You are welcome @Gustavo_Eduardo .
Please note that for checking if a field is left blank, a default approach in AppSheet is to make the field "required." So my suggested expression does not include checking it for a blank field.
Also here is another version of the expression that I believe is still more compact.
AND(
LEN([Email])- FIND(".", [Email])>=2,
ANY(EXTRACTEMAILS([Email]))=[Email]
)
Thank you very much my friend!!