If you’re new to Unstructured, read this note first.

Before you can create a source connector, you must first sign in to your Unstructured account:
- If you do not already have an Unstructured account, go to https://unstructured.io/contact and fill out the online form to indicate your interest.
- If you already have an Unstructured account, go to https://platform.unstructured.io and sign in by using the email address, Google account, or GitHub account that is associated with your Unstructured account.
- For the Unstructured UI or the Unstructured API, only Couchbase Capella clusters are supported.
- For Unstructured Ingest, Couchbase Capella clusters and local Couchbase server deployments are supported.
For Couchbase Capella, you will need:
- A Couchbase Capella account.
- A Couchbase Capella cluster.
- A bucket, scope, and collection on the cluster.
- The cluster’s public connection string.
- The cluster access name (username) and secret (password).
- Incoming IP address allowance for the cluster.

  To get Unstructured’s IP address ranges, go to https://assets.p6m.u10d.net/publicitems/ip-prefixes.json and allow all of the `ip_prefix` fields’ values that are listed. These IP address ranges are subject to change. You can always find the latest ones in the preceding file.
For a local Couchbase server, you will need:
- Installation of a local Couchbase server.
- Connection details to the local Couchbase server.
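The IP allowance requirement for Couchbase Capella can be partly scripted. The following is a minimal sketch, assuming the published file is JSON and that its `ip_prefix` fields may sit at any nesting depth. The exact layout of the file is not confirmed here, and the function names `collect_ip_prefixes` and `fetch_ip_prefixes` are hypothetical helpers, not part of any Unstructured library.

```python
import json
from urllib.request import urlopen

# URL taken from the requirements list above.
IP_PREFIXES_URL = "https://assets.p6m.u10d.net/publicitems/ip-prefixes.json"


def collect_ip_prefixes(node):
    """Recursively collect every string stored under an "ip_prefix" key.

    Assumption: the file's structure is unknown, so dicts and lists are
    walked at any depth instead of relying on a fixed schema.
    """
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "ip_prefix" and isinstance(value, str):
                found.append(value)
            else:
                found.extend(collect_ip_prefixes(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(collect_ip_prefixes(item))
    return found


def fetch_ip_prefixes(url=IP_PREFIXES_URL):
    """Download the JSON file and return the ip_prefix values it lists."""
    with urlopen(url, timeout=30) as response:
        return collect_ip_prefixes(json.load(response))
```

Calling `fetch_ip_prefixes()` returns the ranges to add to the cluster's allowed IP list. Because the ranges are subject to change, it may be worth re-running the script periodically and comparing the result with the current allowlist.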
- `<name>` (required) - A unique name for this connector.
- `<username>` (required) - The username for the Couchbase server.
- `<bucket>` (required) - The name of the bucket in the Couchbase server.
- `<connection-string>` (required) - The connection string for the Couchbase server.
- `<scope>` - The name of the scope in the bucket. The default is `_default` if not otherwise specified.
- `<collection>` - The name of the collection in the scope. The default is `_default` if not otherwise specified.
- `<password>` (required) - The password for the Couchbase server.
- `<batch-size>` - The maximum number of records to transmit per batch. The default is `50` if not otherwise specified.
- `<collection-id>` (source connector only) - The name of the collection field that contains the document ID. The default is `id` if not otherwise specified.
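
To make the required settings and defaults above concrete, here is a minimal sketch that assembles them into a single dictionary. The helper name `build_couchbase_settings`, the snake_case key names, and the overall payload shape (`name`, `type`, `config`) are assumptions made for illustration. Check the Unstructured API reference for the exact request format before sending anything like this.

```python
def build_couchbase_settings(name, username, password, bucket, connection_string,
                             scope="_default", collection="_default",
                             batch_size=50, collection_id="id", is_source=True):
    """Assemble connector settings, applying the documented defaults.

    Hypothetical helper: key names and payload shape are assumptions.
    """
    # The five settings marked (required) in the list above.
    required = {
        "name": name,
        "username": username,
        "password": password,
        "bucket": bucket,
        "connection_string": connection_string,
    }
    missing = [key for key, value in required.items() if not value]
    if missing:
        raise ValueError("Missing required settings: " + ", ".join(missing))

    config = {
        "username": username,
        "password": password,
        "bucket": bucket,
        "connection_string": connection_string,
        "scope": scope,            # defaults to _default
        "collection": collection,  # defaults to _default
        "batch_size": batch_size,  # defaults to 50
    }
    # collection_id applies to source connectors only.
    if is_source:
        config["collection_id"] = collection_id

    return {"name": name, "type": "couchbase", "config": config}
```

Validating the required values up front means an omitted bucket or password is reported before any request is made. Passing `is_source=False` leaves out `collection_id`, which the list above marks as a source-connector-only setting.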