How to extract specific data

Hello Everyone,

I'm trying to extract specific data from an image. I can extract all of the text from the image into a field using OCRTEXT(), but I want to extract just the Results value. I have created another column below the Readings column for that specific value, but how do I populate it?

Nielay_0-1658836952303.png

 


If there's text that reliably immediately precedes or begins the results portion, then use FIND to determine that location and truncate everything prior using RIGHT. For example:

RIGHT(
  [Readings], 
  LEN([Readings]) - 
  (FIND("Results
", [Readings]) + 
  LEN("Results
") - 1)
  )
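To check the position arithmetic in the expression above outside AppSheet, here is a minimal Python sketch. The helpers `find`, `right`, and `after_marker` are hypothetical stand-ins written for this illustration (they are not AppSheet functions), and the sample text is made up. The sketch assumes FIND returns a 1-based position, or 0 when the text is not found.

```python
# Hypothetical Python stand-ins for FIND and RIGHT, used only to check the
# arithmetic of the expression; they are not part of AppSheet.

def find(needle: str, haystack: str) -> int:
    """1-based position of needle in haystack, or 0 if it is absent."""
    return haystack.find(needle) + 1


def right(text: str, count: int) -> str:
    """Last `count` characters of text; empty string if count is not positive."""
    return text[len(text) - count:] if count > 0 else ""


def after_marker(readings: str, marker: str = "Results\n") -> str:
    """Everything that follows the first occurrence of marker."""
    position = find(marker, readings)
    if position == 0:
        return ""  # marker not found, so there is nothing to extract
    # Characters to keep = total length minus everything up to the marker's end.
    keep = len(readings) - (position + len(marker) - 1)
    return right(readings, keep)


# Made-up sample text, not taken from a real scan.
sample = "Units\nmg/dL\nResults\n5.2\nEnd of report"
print(repr(after_marker(sample)))  # '5.2\nEnd of report'
```

Note that the result still contains every line after "Results", not just the value.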

Okay, so every time I upload an image, it will give me "result = {whatever value the image has}" as output?
For example, in the SS it is Results 5.2.

Try the example, adapt it to your situation, and iterate on it to meet your need.

The example I gave will provide all the text following "Results", beginning with the next line. If you need only the next line of text, you'd need to wrap my example within an analogous approach, but using LEFT to truncate the end of the string.
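The two-step idea (drop everything up to the marker, then cut at the next line break) can be sketched in Python as follows. The helpers `find`, `left`, and `first_line_after` are hypothetical stand-ins for illustration only, and the sketch assumes the value sits on the line directly after "Results".

```python
# Hypothetical stand-ins for FIND and LEFT; illustration only, not AppSheet.

def find(needle: str, haystack: str) -> int:
    """1-based position of needle in haystack, or 0 if it is absent."""
    return haystack.find(needle) + 1


def left(text: str, count: int) -> str:
    """First `count` characters of text; empty string if count is not positive."""
    return text[:count] if count > 0 else ""


def first_line_after(readings: str, marker: str = "Results\n") -> str:
    """Only the line that immediately follows the marker."""
    position = find(marker, readings)
    if position == 0:
        return ""  # marker not found
    # Step 1: keep the text after the marker (the RIGHT step).
    rest = readings[position - 1 + len(marker):]
    # Step 2: cut at the next line break, if there is one (the LEFT step).
    line_end = find("\n", rest)
    return rest if line_end == 0 else left(rest, line_end - 1)


# Excerpt shaped like the OCR output posted in this thread.
excerpt = "101-125\n>126\nResults\n80.00\nFasting plasma glucose\nin mg/dL"
print(first_line_after(excerpt))  # 80.00
```

Cutting at the line break avoids having to know the value's length in advance.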

If you know the length of the text following "Results" will always be 3 characters (e.g., "5.2"), then you could revise my example to instead use MID:

MID(
  [Readings], 
  LEN([Readings]) - 
  (FIND("Results
", [Readings]) + 
  LEN("Results
") - 1), 
  3
  )

Dr Lal PathLabs
LPL-LPL-ROHINI (NATIONAL REFERENCE
LAB)
SECTOR 18, BLOCK -E ROHINI
DELHI 110085
Name
Lab No.
: THREE DUMMY
Test Name
: 148712410
A/c Status : P
GLUCOSE, FASTING (F), PLASMA
(Hexokinase)
GLUCOSE, POST PRANDIAL (PP), 2 HOURS,
PLASMA
(Hexokinase)
Age: 25 Years
Ref By : Dr. UNKNWON
Interpretation
Status
Normal
Impaired fasting glucose
Impaired glucose tolerance
Pre-Diabetes
Diabetes mellitus
Buangshes
Dr Himangshu Mazumdar
MD, Biochemistry
Consultant Biochemist
NRL- Dr Lal PathLabs Ltd
Regd. Office/National Reference Lab: Dr. Lal PathLabs Ltd., Block E, Sector-18, Rohini, New Delhi - 110085
Tel: +91-11-30244-100, 3988-5050, Fax: +91-11-2788-2134, E-mail: lalpathlabs@lalpathlabs.com
Web: www.lalpathlabs.com, CIN No.: L74899DL1995PLC065388
N Kuusl
Gender:
NABL ACCREDITED
MC-2113
70-100
101-125
70-100
101-125
>126
Results
80.00
Fasting plasma glucose
in mg/dL
Dr Nimmi Kansal
MD, Biochemistry
National Head - Clinical Chemistry &
Biochemical Genetics
NRL - Dr Lal PathLabs Ltd
110.00
Male
--End of report
CAP
ACCREDITED
***********..…………………..
COLLEGE of AMERICAN PATHOLOGISTS
Collected
Received
Reported
Report Status
Units
mg/dL
mg/dL
Note
1. The diagnosis of Diabetes requires a fasting plasma glucose of > or = 126 mg/dL and/or a random / 2
hr post glucose value of > or = 200 mg/dL on at least 2 occasions
2. Very low glucose levels cause severe CNS dysfunction
3. Very high glucose levels (>450 mg/dL in adults) may result in Diabetic Ketoacidosis & is considered
critical
PP plasma glucose
in mg/dL
BSI
✪
70-140
70-140
141-199
141-199
>200
REG
RED
ISO 9001: 2008
FS 60411
bsi. ISO/IEC
27001
Information Security
Management
IS 616691
: 10/7/2019
3:07:00PM
10/7/2019 3:08:19PM
: 10/7/2019 3:44:01PM
: Final
Bio. Ref. Interval
70.00 100.00
70.00 140.00
Page 3 of 4
If test results are alarming or unexpected, client is advised to contact the laboratory immediately for possible remedial action.
@ Tests conducted at National Reference Lab, New Delhi, a CAP (7171001), NABL (MC-2113) and ISO (FS 60411) accredited laboratory


This is the extracted data on which I applied your formula, but it gives me "tin" as the output.

Nielay_0-1658905988485.png

 

That's probably the "tin" from within "fasting" a couple lines just after "Results". Maybe there are hidden characters throwing things off? Try tweaking the " - 1" and "3" portions of the expression to see whether you can get it to reliably return the portion you need.
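One thing that may be worth checking before hunting for hidden characters: MID takes a start position as its second argument, whereas the MID expression earlier in this thread passes the number of characters that follow the marker (the value RIGHT needs). That could make it return text from an unrelated spot. A minimal Python sketch of the difference, using hypothetical stand-ins `find` and `mid` and made-up sample text:

```python
# Hypothetical stand-ins for FIND and MID (1-based positions); not AppSheet.

MARKER = "Results\n"


def find(needle: str, haystack: str) -> int:
    """1-based position of needle in haystack, or 0 if it is absent."""
    return haystack.find(needle) + 1


def mid(text: str, start: int, length: int) -> str:
    """`length` characters of text beginning at 1-based position `start`."""
    if start < 1 or length < 1:
        return ""
    return text[start - 1:start - 1 + length]


def mid_as_posted(readings: str, length: int = 3) -> str:
    """Start argument computed as in the posted expression (a character count)."""
    start = len(readings) - (find(MARKER, readings) + len(MARKER) - 1)
    return mid(readings, start, length)


def mid_from_marker(readings: str, length: int = 3) -> str:
    """Start argument set to the first character after the marker."""
    position = find(MARKER, readings)
    if position == 0:
        return ""  # marker not found
    return mid(readings, position + len(MARKER), length)


sample = "Units\nmg/dL\nResults\n5.2\nEnd of report"
print(repr(mid_as_posted(sample)))    # 'lts'
print(repr(mid_from_marker(sample)))  # '5.2'
```

In AppSheet terms, the second variant corresponds to using FIND("Results
", [Readings]) + LEN("Results
") as the start argument of MID. Note also that the value in the sample report, "80.00", is five characters long, so a fixed length of 3 would return only "80.".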

Steve

Please provide a screenshot of the complete results of the OCR scan. Please indicate what exactly in that screenshot you want extracted. If there are multiple things you want extracted, indicate all of them.

report 2-3-1.png

This is the sample report I'm uploading. The extracted data from this report is pasted in my previous comment; I cannot SS it because it's too big. I want to extract the results of the report. In this case, I want the extracted output to be 80.

What is "SS"?

Screenshot

Based on the OCR output you provided previously, I can't imagine any way to interpret the output or to reliably extract any particular piece of data. The output appears to be in a random order.

If you can't find a way to accomplish it with an AppSheet expression and you're using Excel or Sheets as the data source, you could likely accomplish it with a spreadsheet formula. The text searching functions in the spreadsheet apps are much more robust than AppSheet's FIND.
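As an illustration of the kind of pattern-based search that is possible outside AppSheet (for instance with Google Sheets' REGEXEXTRACT), here is a minimal Python sketch. The function name `extract_result` and the pattern are assumptions made for this example. It only works when the number directly follows the word "Results", which the OCR output above does not guarantee for every value.

```python
import re
from typing import Optional

# Assumed pattern: the word "Results", some whitespace (a space or a line
# break), then a number with an optional decimal part.
RESULT_PATTERN = re.compile(r"Results\s+(\d+(?:\.\d+)?)")


def extract_result(ocr_text: str) -> Optional[float]:
    """First number that directly follows "Results", or None if there is none."""
    match = RESULT_PATTERN.search(ocr_text)
    return float(match.group(1)) if match else None


# Excerpt shaped like the OCR output posted in this thread.
excerpt = "101-125\n>126\nResults\n80.00\nFasting plasma glucose\nin mg/dL"
print(extract_result(excerpt))  # 80.0
```

Only the first value is found this way. In the OCR output above, the second reading (110.00) appears many lines later, so it would need a different anchor.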
