Hi, I'm trying to automate the creation of my data store and the import of data into it. I assembled the JSONL file as follows:
{"_id": "d001", "content": {"mimeType": "text/plain", "uri": "gs://storage_processados_txt/Como cadastrar um administrador na revenda.txt"}, "structData": {"title": "Como Cadastrar um administrador na revenda", "url": "gs://storage_processados_txt/Como cadastrar um administrador na revenda.txt"}}
{"_id": "d002", "content": {"mimeType": "text/plain", "uri": "gs://storage_processados_txt/Como criar um Dominio.txt"}, "structData": {"title": "Como criar um Dominio", "url": "gs://storage_processados_txt/Como criar um Dominio.txt"}}
{"_id": "d003", "content": {"mimeType": "text/plain", "uri": "gs://storage_processados_txt/Como criar uma empresa.txt"}, "structData": {"title": "Como criar uma empresa", "url": "gs://storage_processados_txt/Como criar uma empresa.txt"}}
{"_id": "d004", "content": {"mimeType": "text/plain", "uri": "gs://storage_processados_txt/Como criar uma revenda.txt"}, "structData": {"title": "Como criar uma revenda", "url": "gs://storage_processados_txt/Como criar uma revenda.txt"}}
I'm running the following script to import the data:
from google.cloud import discoveryengine
from google.api_core.client_options import ClientOptions
# Path to the service account JSON key
GOOGLE_APPLICATION_CREDENTIALS = './chave.json'
project_id = "rodrigo-estudos"
location = "global" # Values: "global"
data_store_id = "data-store-gpt-01"
# Format: `gs://bucket/directory/object.json` or `gs://bucket/directory/*.json`
gcs_uri = "gs://clean/lista_arquivos_bucket_2.json"
client_options = ClientOptions(api_endpoint=f"{location}-discoveryengine.googleapis.com")
client = discoveryengine.DocumentServiceClient(client_options=client_options)
parent = client.branch_path(project=project_id, location=location, data_store=data_store_id, branch="default_branch")
request = discoveryengine.ImportDocumentsRequest(
    parent=parent,
    gcs_source=discoveryengine.GcsSource(
        input_uris=[gcs_uri],
        data_schema="custom",
    ),
    reconciliation_mode=discoveryengine.ImportDocumentsRequest.ReconciliationMode.INCREMENTAL,
)
operation = client.import_documents(request=request)
response = operation.result()
print("Import completed:", response)
and I'm getting the following error message:
Import completed: error_samples {
code: 3
message: "To create document without content, content config of data store must be NO_CONTENT."
details {
type_url: "type.googleapis.com/google.rpc.ResourceInfo"
value: "\0229gs://clean-rodrigo-estudos/lista_arquivos_bucket_2.json:2"
}
}
error_samples {
code: 3
message: "To create document without content, content config of data store must be NO_CONTENT."
details {
value: "\0229gs://clean-rodrigo-estudos/lista_arquivos_bucket_2.json:3"
}
}
error_samples {
code: 3
message: "To create document without content, content config of data store must be NO_CONTENT."
details {
type_url: "type.googleapis.com/google.rpc.ResourceInfo"
value: "\0229gs://clean-rodrigo-estudos/lista_arquivos_bucket_2.json:4"
}
}
error_config {
gcs_prefix: "gs://748489500091_us_import_custom/errors44188869074447421"
}
I have already read the documentation on the data import process, but I am still having difficulty assembling the import file. I would also like to know whether I can simply pass the path to the bucket where the .txt files are stored directly.
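For reference, here is a minimal sketch of how the import file could be generated if each line is meant to follow the `document` layout (top-level `id`, `structData`, and `content`) instead of the `_id` layout used above. This is an assumption rather than a confirmed fix: as far as I understand the `GcsSource` options, such a file would be imported with `data_schema="document"`, while `data_schema="content"` may accept a pattern such as `gs://bucket/*.txt` directly, without any JSONL file. The helper name `build_document_lines` and the `d001`-style IDs are hypothetical and only for illustration.

```python
import json


def build_document_lines(bucket, titles, mime_type="text/plain"):
    """Return one JSON string per document, using top-level id/structData/content keys.

    Assumption: each file is stored as gs://<bucket>/<title>.txt.
    """
    lines = []
    for index, title in enumerate(titles, start=1):
        uri = f"gs://{bucket}/{title}.txt"
        record = {
            "id": f"d{index:03d}",  # hypothetical ID scheme: d001, d002, ...
            "structData": {"title": title, "url": uri},
            "content": {"mimeType": mime_type, "uri": uri},
        }
        # ensure_ascii=False keeps accented characters readable in the file
        lines.append(json.dumps(record, ensure_ascii=False))
    return lines


titles = ["Como criar um Dominio", "Como criar uma empresa"]
for line in build_document_lines("storage_processados_txt", titles):
    print(line)
```

Writing the returned lines to a file, one per line, and uploading it to the bucket would give a JSONL file in this layout.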