Document AI Warehouse API . projects . locations

Instance Methods

documentSchemas()

Returns the documentSchemas Resource.

documents()

Returns the documents Resource.

operations()

Returns the operations Resource.

ruleSets()

Returns the ruleSets Resource.

synonymSets()

Returns the synonymSets Resource.

close()

Close httplib2 connections.

initialize(location, body=None, x__xgafv=None)

Provisions resources for the given tenant project. Returns a long-running operation.

runPipeline(name, body=None, x__xgafv=None)

Run a predefined pipeline.

Method Details

close()
Close httplib2 connections.
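
One way to make sure close() is always called is to wrap the resource in `contextlib.closing`. The sketch below is illustrative only: `run_with_cleanup` and `FakeLocations` are hypothetical names, and the fake class stands in for a real client object so the example runs without network access.

```python
from contextlib import closing


def run_with_cleanup(resource, action):
    """Call action(resource) and close the resource afterwards, even on error."""
    with closing(resource):
        return action(resource)


class FakeLocations:
    """Hypothetical stand-in for the real resource; it only records close()."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


fake = FakeLocations()
result = run_with_cleanup(fake, lambda resource: "done")
```

Any object that exposes close() can be passed in the same way.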

initialize(location, body=None, x__xgafv=None)
Provisions resources for the given tenant project. Returns a long-running operation.

Args:
  location: string, Required. The location to be initialized. Format: projects/{project_number}/locations/{location}. (required)
  body: object, The request body.
    The object takes the form of:

{ # Request message for projectService.InitializeProject
  "accessControlMode": "A String", # Required. The access control mode for accessing the customer data
  "databaseType": "A String", # Required. The type of database used to store customer data
  "documentCreatorDefaultRole": "A String", # Optional. The default role for the person who creates a document.
  "kmsKey": "A String", # Optional. The KMS key used for CMEK encryption. The KMS key must be in the same region as the endpoint. The same key will be used for all provisioned resources, if encryption is available. If kms_key is left empty, no encryption will be enforced.
}

  x__xgafv: string, V1 error format.
    Allowed values
      1 - v1 error format
      2 - v2 error format

Returns:
  An object of the form:

    { # This resource represents a long-running operation that is the result of a network API call.
  "done": True or False, # If the value is `false`, it means the operation is still in progress. If `true`, the operation is completed, and either `error` or `response` is available.
  "error": { # The `Status` type defines a logical error model that is suitable for different programming environments, including REST APIs and RPC APIs. It is used by [gRPC](https://github.com/grpc). Each `Status` message contains three pieces of data: error code, error message, and error details. You can find out more about this error model and how to work with it in the [API Design Guide](https://cloud.google.com/apis/design/errors). # The error result of the operation in case of failure or cancellation.
    "code": 42, # The status code, which should be an enum value of google.rpc.Code.
    "details": [ # A list of messages that carry the error details. There is a common set of message types for APIs to use.
      {
        "a_key": "", # Properties of the object. Contains field @type with type URL.
      },
    ],
    "message": "A String", # A developer-facing error message, which should be in English. Any user-facing error message should be localized and sent in the google.rpc.Status.details field, or localized by the client.
  },
  "metadata": { # Service-specific metadata associated with the operation. It typically contains progress information and common metadata such as create time. Some services might not provide such metadata. Any method that returns a long-running operation should document the metadata type, if any.
    "a_key": "", # Properties of the object. Contains field @type with type URL.
  },
  "name": "A String", # The server-assigned name, which is only unique within the same service that originally returns it. If you use the default HTTP mapping, the `name` should be a resource name ending with `operations/{unique_id}`.
  "response": { # The normal response of the operation in case of success. If the original method returns no data on success, such as `Delete`, the response is `google.protobuf.Empty`. If the original method is standard `Get`/`Create`/`Update`, the response should be the resource. For other methods, the response should have the type `XxxResponse`, where `Xxx` is the original method name. For example, if the original method name is `TakeSnapshot()`, the inferred response type is `TakeSnapshotResponse`.
    "a_key": "", # Properties of the object. Contains field @type with type URL.
  },
}
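
As a rough illustration of the initialize() call described above, the sketch below assembles the request body and passes it to the method. The helper names (`build_initialize_body`, `location_name`, `initialize_location`) are hypothetical. The `service` argument is assumed to be a client object built elsewhere, and calling `.execute()` on the returned request is assumed to follow the usual googleapiclient pattern. The enum strings are supplied by the caller because this page does not list the allowed values.

```python
def build_initialize_body(access_control_mode, database_type,
                          document_creator_default_role=None, kms_key=None):
    """Assemble the request body for initialize().

    The two fields marked Required are always included. Optional fields are
    added only when a value is supplied.
    """
    if not access_control_mode or not database_type:
        raise ValueError("accessControlMode and databaseType are required")
    body = {
        "accessControlMode": access_control_mode,
        "databaseType": database_type,
    }
    if document_creator_default_role is not None:
        body["documentCreatorDefaultRole"] = document_creator_default_role
    if kms_key is not None:
        body["kmsKey"] = kms_key
    return body


def location_name(project_number, location):
    """Format the location as projects/{project_number}/locations/{location}."""
    return f"projects/{project_number}/locations/{location}"


def initialize_location(service, project_number, location, body):
    """Call initialize() and return the long-running operation as a dict.

    Assumption: `service` exposes projects().locations() and the request
    object has an execute() method.
    """
    request = service.projects().locations().initialize(
        location=location_name(project_number, location), body=body)
    return request.execute()
```

The returned dict has the shape shown under Returns. Its `done` field indicates whether provisioning has finished.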

runPipeline(name, body=None, x__xgafv=None)
Run a predefined pipeline.

Args:
  name: string, Required. The resource name which owns the resources of the pipeline. Format: projects/{project_number}/locations/{location}. (required)
  body: object, The request body.
    The object takes the form of:

{ # Request message for DocumentService.RunPipeline.
  "exportCdwPipeline": { # The configuration of exporting documents from the Document Warehouse to CDW pipeline. # Export documents from Document Warehouse to CDW for training purposes.
    "docAiDataset": "A String", # The CDW dataset resource name. Format: projects/{project}/locations/{location}/processors/{processor}/dataset
    "documents": [ # The list of all the resource names of the documents to be processed. Format: projects/{project_number}/locations/{location}/documents/{document_id}.
      "A String",
    ],
    "exportFolderPath": "A String", # The Cloud Storage folder path used to store the exported documents before being sent to CDW. Format: gs:///.
    "trainingSplitRatio": 3.14, # Ratio of training dataset split. When importing into Document AI Workbench, documents will be automatically split into training and test sets according to the specified ratio.
  },
  "gcsIngestPipeline": { # The configuration of the Cloud Storage ingestion pipeline. # Cloud Storage ingestion pipeline.
    "inputPath": "A String", # The input Cloud Storage folder. All files under this folder will be imported to Document Warehouse. Format: gs:///.
    "processorResultsFolderPath": "A String", # The Cloud Storage folder path used to store the raw results from processors. Format: gs:///.
    "schemaName": "A String", # The Document Warehouse schema resource name. All documents processed by this pipeline will use this schema. Format: projects/{project_number}/locations/{location}/documentSchemas/{document_schema_id}.
  },
  "gcsIngestWithDocAiProcessorsPipeline": { # The configuration of the document classify/split and entity/kvp extraction pipeline. # Use DocAI processors to process documents in Cloud Storage and ingest them to Document Warehouse.
    "extractProcessorInfos": [ # The extract processors information. One matched extract processor will be used to process documents based on the classify processor result. If no classify processor is specified, the first extract processor will be used.
      { # The DocAI processor information.
        "documentType": "A String", # The processor will process the documents with this document type.
        "processorName": "A String", # The processor resource name. Format is `projects/{project}/locations/{location}/processors/{processor}`, or `projects/{project}/locations/{location}/processors/{processor}/processorVersions/{processorVersion}`
        "schemaName": "A String", # The Document schema resource name. All documents processed by this processor will use this schema. Format: projects/{project_number}/locations/{location}/documentSchemas/{document_schema_id}.
      },
    ],
    "inputPath": "A String", # The input Cloud Storage folder. All files under this folder will be imported to Document Warehouse. Format: gs:///.
    "processorResultsFolderPath": "A String", # The Cloud Storage folder path used to store the raw results from processors. Format: gs:///.
    "splitClassifyProcessorInfo": { # The DocAI processor information. # The split and classify processor information. The split and classify result will be used to find a matched extract processor.
      "documentType": "A String", # The processor will process the documents with this document type.
      "processorName": "A String", # The processor resource name. Format is `projects/{project}/locations/{location}/processors/{processor}`, or `projects/{project}/locations/{location}/processors/{processor}/processorVersions/{processorVersion}`
      "schemaName": "A String", # The Document schema resource name. All documents processed by this processor will use this schema. Format: projects/{project_number}/locations/{location}/documentSchemas/{document_schema_id}.
    },
  },
  "processWithDocAiPipeline": { # The configuration of processing documents in Document Warehouse with DocAi processors pipeline. # Use a DocAI processor to process documents in Document Warehouse, and re-ingest the updated results into Document Warehouse.
    "documents": [ # The list of all the resource names of the documents to be processed. Format: projects/{project_number}/locations/{location}/documents/{document_id}.
      "A String",
    ],
    "exportFolderPath": "A String", # The Cloud Storage folder path used to store the exported documents before being sent to CDW. Format: gs:///.
    "processorInfo": { # The DocAI processor information. # The CDW processor information.
      "documentType": "A String", # The processor will process the documents with this document type.
      "processorName": "A String", # The processor resource name. Format is `projects/{project}/locations/{location}/processors/{processor}`, or `projects/{project}/locations/{location}/processors/{processor}/processorVersions/{processorVersion}`
      "schemaName": "A String", # The Document schema resource name. All documents processed by this processor will use this schema. Format: projects/{project_number}/locations/{location}/documentSchemas/{document_schema_id}.
    },
    "processorResultsFolderPath": "A String", # The Cloud Storage folder path used to store the raw results from processors. Format: gs:///.
  },
}

  x__xgafv: string, V1 error format.
    Allowed values
      1 - v1 error format
      2 - v2 error format

Returns:
  An object of the form:

    { # This resource represents a long-running operation that is the result of a network API call.
  "done": True or False, # If the value is `false`, it means the operation is still in progress. If `true`, the operation is completed, and either `error` or `response` is available.
  "error": { # The `Status` type defines a logical error model that is suitable for different programming environments, including REST APIs and RPC APIs. It is used by [gRPC](https://github.com/grpc). Each `Status` message contains three pieces of data: error code, error message, and error details. You can find out more about this error model and how to work with it in the [API Design Guide](https://cloud.google.com/apis/design/errors). # The error result of the operation in case of failure or cancellation.
    "code": 42, # The status code, which should be an enum value of google.rpc.Code.
    "details": [ # A list of messages that carry the error details. There is a common set of message types for APIs to use.
      {
        "a_key": "", # Properties of the object. Contains field @type with type URL.
      },
    ],
    "message": "A String", # A developer-facing error message, which should be in English. Any user-facing error message should be localized and sent in the google.rpc.Status.details field, or localized by the client.
  },
  "metadata": { # Service-specific metadata associated with the operation. It typically contains progress information and common metadata such as create time. Some services might not provide such metadata. Any method that returns a long-running operation should document the metadata type, if any.
    "a_key": "", # Properties of the object. Contains field @type with type URL.
  },
  "name": "A String", # The server-assigned name, which is only unique within the same service that originally returns it. If you use the default HTTP mapping, the `name` should be a resource name ending with `operations/{unique_id}`.
  "response": { # The normal response of the operation in case of success. If the original method returns no data on success, such as `Delete`, the response is `google.protobuf.Empty`. If the original method is standard `Get`/`Create`/`Update`, the response should be the resource. For other methods, the response should have the type `XxxResponse`, where `Xxx` is the original method name. For example, if the original method name is `TakeSnapshot()`, the inferred response type is `TakeSnapshotResponse`.
    "a_key": "", # Properties of the object. Contains field @type with type URL.
  },
}
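
The request and response shapes above can be exercised with a small sketch like the following. It builds a body for the Cloud Storage ingestion pipeline, submits it through runPipeline(), and classifies the returned operation using its `done`, `error`, and `response` fields. The helper names are hypothetical. The `service` object and its `.execute()` call are assumed to follow the usual googleapiclient pattern. This page does not say which pipeline fields are mandatory, so the sketch does not validate them.

```python
def build_gcs_ingest_request(input_path, schema_name,
                             processor_results_folder_path=None):
    """Build a runPipeline() body that selects the gcsIngestPipeline option."""
    pipeline = {"inputPath": input_path, "schemaName": schema_name}
    if processor_results_folder_path is not None:
        pipeline["processorResultsFolderPath"] = processor_results_folder_path
    return {"gcsIngestPipeline": pipeline}


def run_pipeline(service, name, body):
    """Submit the pipeline and return the long-running operation as a dict.

    `name` has the form projects/{project_number}/locations/{location}.
    Assumption: the request object has an execute() method.
    """
    return service.projects().locations().runPipeline(
        name=name, body=body).execute()


def summarize_operation(operation):
    """Return a (state, detail) pair for a long-running operation dict.

    state is 'running' while `done` is false or absent, 'failed' when an
    `error` is present, and 'succeeded' otherwise.
    """
    if not operation.get("done", False):
        return ("running", None)
    if "error" in operation:
        error = operation["error"]
        return ("failed", f"{error.get('code')}: {error.get('message')}")
    return ("succeeded", operation.get("response"))
```

A caller might pass a placeholder path such as `gs://example-bucket/incoming` (a made-up value) as `input_path`, then feed the returned operation to `summarize_operation` to decide whether to keep waiting.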