
Unable to import an NDJSON file from GCS into Vertex AI data stores using discoveryengine

Hi,

I am trying to build an automation solution for uploading data to Vertex AI data stores. I am saving the .ndjson file in Cloud Storage and trying to import it using the Discovery Engine API.
NDJSON format:

{"id":"f58e39b1-36ec-4614-b79d-dbbdec7b8168","structData":{"title":"Tiger.pdf","description":"(autogenerated)","text":"\n\nCare For Us\nTiger (Panthera tigris)\n\nAnimal Welfare\nAnimal welfare refers to an animal’s state \nor feelings. An animal’s welfare state can \nbe positive, neutral or negative. \nAn animal’s welfare has the potential to \ndiffer on a daily basis. When an animal’s \nneeds -nutritional, behavioural, health \nand environmental -are met, they will \nhave positive welfare.  \nA good life in captivity might be one where \nanimals can consistently experience good \nwelfare -throughout  their entire life.  \n\nUnderstanding that animals have both \nsentient and cognitive abilities as well \nas pain perception, reinforces the need \nto provide appropriate husbandry \nprovisions for all captive animals, to \nensure positive welfare.\nIn captivity, the welfare of an animal is \ndependent on the environment \nprovided for them and the daily care \nand veterinary treatment they receive.  \nFlickr@RobBixey\n\nThe tiger is the largest member of the cat \nfamily. They have distinct colouring, with \nreddish coats covered in narrow black or \nbrown stripes. \nThe tiger is Endangered on the IUCN Red \nList of Threatened Species. Today, an \nestimated total of only around 3,000-\n4,500 tigers exist in the wild. Subspecies \nof the tiger include the Sumatran Tiger, \nSiberian Tiger, Bengal Tiger, South China \nTiger, Malayan Tiger and Indochinese \nTiger. \nThe demand for tiger parts in traditional \nmedicine, hunting, and the destruction of \nhabitats means that many of the \nsubspecies are either endangered or \nalready extinct.\n\nTigers are obligate carnivores –they must \neat meat because they derive most of their \nenergy and nutrients from animal tissue. \nThey eat many different species and in the \nwild, tigers will stalk their prey, hiding in \nhigh grass using their striped colouring as \ncamouflage. 
They chase over short \ndistances and can attack in water or chase \ntheir prey up trees, killing with a bite to the \nneck.\nPositive Behaviours to Encourage\nIn captivity, food should be offered in a way \nthat encourages natural behaviours. Tying \nchunks of meat to the top of a pole, placing \nfood in a sack, or hiding food in a boomer \nball all work well. This encourages running, \nstalking, chasing and tearing. Feeding whole \ncarcasses encourages the use of a tiger’s \npowerful muscles and teeth. \n.   \nTigers Like to Hunt\n\nTigers are Strong\nTigers can jump up to five metres in \nheight and more than six metres in \nlength. Their extremely powerful leg \nmuscles and large size mean they are \nincredibly strong. \nPositive Behaviours to Encourage\nProvide an interesting and dynamic \nenvironment for tigers to allow them to \ndemonstrate their strength. Placing \ndurable ropes, barrels, climbing \nstructures  and poles in an enclosure \nwill encourage a tiger to use its powerful \nbody to stay healthy and strong.\n\nTigers Like to Climb\nTigers use their extremely powerful leg \nmuscles and strong claws to climb trees. \nTigers like to rest high up and will also \nuse the height to spot possible prey. \nPositive Behaviours to Encourage \nEncourage active climbing behaviours by \nproviding multiple and different level \nclimbing opportunities. Platforms for \nresting and sleeping and natural trees or \nlogs will allow tigers to move both \nvertically and horizontally around their \nenclosure.\nFlickr@TambakoThe Jaguar\n\nTigers Like to use their \nClaws\nTigers have large, padded feet with four \nclaws and one specialised claw called the \ndewclaw. Dewclaws are used for grasping \nprey and aid in climbing. They sharpen their \nclaws by scraping them on trees. \nThey stand on their hind legs and rake them \ndownwards into the bark of a tree.  Their \nclaws are retractable, so they remain sharp \nand do not make a noise when stalking prey.  
\nPositive Behaviours to Encourage\nIn captivity, large, strong poles wrapped \nwith thick rope can be provided to \nencourage grabbing and sharpening of the \nclaws.\n\nTigers Like to Swim\nTigers like water. They are good \nswimmers and even chase prey while \nin the water. They will seek out water \nto cool off during hot weather and can \neven dive and swim under water if \nnecessary.\nPositive Behaviours to \nEncourage\nProviding a deep pool and water \nmister within their enclosure will \nencourage swimming and enable tigers \nto cool off in hot weather.  Throwing \ndurable, scented toys into the water \nwill encourage play behaviours. Tigers \nalso enjoy natural running water to \ndrink from, such as a fountain or \nwaterfall.\n\nTigers Like to \nCommunicate\nTigers will use soft chuffing noises to \ncommunicate to each other. Young cubs \nplay in order to learn key survival skills \nfor later in life. Tigers also communicate \nthrough scent, marking their large \nhome territory or exchanging scents by \nrubbing against each other. \nPositive Behaviours to Encourage \nProviding an enclosure that is large, \nnatural and allows for species-specific \nactivities such as running, climbing, \nplaying, swimming and scent marking, \nwill encourage natural forms of \ncommunication.\n\nTigers Enjoy...\nPlaying, forming close bonds with their \nyoung, cooling off when hot, \nswimming, and eating different and \ninteresting foods.\nIn captivity we should always try and \nreplicate their natural and normal \nbehaviours so they are happy and \nhealthy throughout their lives.\nMore Tiger Care Information \nHERE"},"content":{"mime_type":"application/json","uri":"gs://autouploadstoragetest/KMS/pp@pp.com/Tiger.ndjson"}}

Error: 

Waiting for operation to complete: projects/projectid/locations/global/collections/default_collection/dataStores/kms-final-test-data-stores_/branches/0/operations/import-documents-4195131690287860842
error_samples {
code: 5
message: "Document projects/project id/locations/global/collections/default_collection/dataStores/kms-final-test-data-stores_/branches/0/documents/f58e39b1-36ec-4614-b79d-dbbdec7b8168 (uri: gs://autouploadstoragetest/KMS/pp@pp.com/Tiger.ndjson) is imported but not yet indexed. Please wait another hour for index. If it\'s still not searchable, there might be issues in the document, such as blank documents, formatting errors, or corrupted files."
details {
type_url: "type.googleapis.com/google.rpc.ResourceInfo"
value: "\0227gs://autouploadstoragetest/KMS/pp@pp.com/Tiger.ndjson:1"
}
}
error_config {
gcs_prefix: "gs://913102950475_us_import_document/errors4195131690287859730"
}

create_time {
seconds: 1719953828
nanos: 140053000
}
update_time {
seconds: 1719955197
nanos: 491205000
}
failure_count: 1
total_count: 1

I have tried everything in the Google docs, but it is still not fixed. Could anyone help me fix this?

4 REPLIES

Hello! The expected format for data stores is kind of hard to figure out. What are you trying to do? I'm playing with Agent Builder and ended up using plain .txt files.

Hi, I am also playing with Agent Builder, building automation between Cloud Storage and data stores. This process needs the discoveryengine API as middleware to import files from storage into data stores, and that requires an .ndjson file. No matter which JSONL format I try to import using discoveryengine, Agent Builder won't complete the indexing, so I want to know the correct .ndjson format. The format in the Google docs does not work as expected.
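
For reference, NDJSON/JSONL means one complete JSON object per physical line. Below is a minimal sketch for writing and sanity-checking such a file locally before uploading it. It uses only the Python standard library; the helper names `to_ndjson` and `check_ndjson` and the sample ID are hypothetical, and the `id`/`structData` fields simply mirror the record in the question. It does not call Discovery Engine and is not claimed to resolve the indexing error.

```python
import json


def to_ndjson(records):
    """Serialize a list of dicts to NDJSON: one JSON object per line."""
    # json.dumps escapes embedded newlines as \n (and, by default, all
    # non-ASCII characters), so each record stays on one physical line.
    return "".join(json.dumps(record) + "\n" for record in records)


def check_ndjson(text):
    """Return (line_number, problem) pairs for lines that are not valid records."""
    problems = []
    for number, line in enumerate(text.rstrip("\n").split("\n"), start=1):
        if not line.strip():
            problems.append((number, "blank line"))
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((number, f"invalid JSON: {exc.msg}"))
            continue
        # Assumption: every record is an object with an "id", as in the question.
        if not isinstance(record, dict) or "id" not in record:
            problems.append((number, "missing id"))
    return problems


# Illustrative record only; the ID and text are placeholders.
records = [
    {
        "id": "doc-1",
        "structData": {
            "title": "Tiger.pdf",
            "text": "Care For Us\nTiger (Panthera tigris)",
        },
    }
]
ndjson_text = to_ndjson(records)
```

Running `check_ndjson` on the file contents before uploading flags any record that was accidentally split across several lines, since each fragment then fails to parse on its own.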

Have you tried just changing the extension to .json?

I cannot figure it out either. I want to import entities from a SQL DB and then ask questions about them, but I'm not getting good results during retrieval. The main use case for JSON is referencing documents in GCS, such as PDFs. The only way I have tried using structured data was through CSV with the FAQ data store type, but I'm not having good results so far.