Salesforce

Incubating

Important Capabilities

Capability | Status | Notes
Data Profiling | | Only table level profiling is supported via profiling.enabled config field
Detect Deleted Entities | | Not supported yet
Domains | | Supported via the domain config field
Platform Instance | | Can be equivalent to Salesforce organization

Prerequisites

In order to ingest metadata from Salesforce, you will need:

  • A Salesforce username, password, and security token, OR
  • A Salesforce instance URL and an access token/session ID (suitable for one-shot ingestion only, as an access token typically expires after 2 hours of inactivity)

The account used to access Salesforce requires the following permissions for this integration to work:

  • View Setup and Configuration
  • View All Data

Integration Details

This plugin extracts Salesforce Standard and Custom Objects and their details (fields, record count, etc.) from a Salesforce instance. The Python library simple-salesforce is used to authenticate and to call the Salesforce REST API to retrieve these details from the Salesforce instance.
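The two credential options listed under Prerequisites can be sketched as a small helper that builds keyword arguments for the simple_salesforce.Salesforce constructor. This is an illustrative sketch, not the connector's own code: the function name build_salesforce_kwargs is hypothetical, and treating is_sandbox as domain="test" is an assumption based on how simple-salesforce selects the sandbox login host.

```python
from typing import Dict, Optional


def build_salesforce_kwargs(
    username: Optional[str] = None,
    password: Optional[str] = None,
    security_token: Optional[str] = None,
    instance_url: Optional[str] = None,
    access_token: Optional[str] = None,
    is_sandbox: bool = False,
) -> Dict[str, str]:
    """Return keyword arguments for simple_salesforce.Salesforce (hypothetical helper)."""
    if instance_url and access_token:
        # Session-based login: the token is short-lived, so this suits one-shot runs.
        return {"instance_url": instance_url, "session_id": access_token}
    if username and password and security_token:
        kwargs = {
            "username": username,
            "password": password,
            "security_token": security_token,
        }
        if is_sandbox:
            # Assumption: sandbox orgs log in via test.salesforce.com (domain="test").
            kwargs["domain"] = "test"
        return kwargs
    raise ValueError(
        "Provide either username/password/security_token "
        "or instance_url/access_token"
    )
```

With simple-salesforce installed, the result could be passed as Salesforce(**kwargs), after which describe() returns the global list of objects visible to the account.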

REST API Resources used in this integration

Concept Mapping

This ingestion source maps the following Source System Concepts to DataHub Concepts:

Source Concept | DataHub Concept | Notes
Salesforce | Data Platform |
Standard Object | Dataset | subtype "Standard Object"
Custom Object | Dataset | subtype "Custom Object"
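As a rough illustration of the mapping above, the following sketch picks a dataset subtype from one entry of a Salesforce global describe result. It assumes each entry carries the "name" and "custom" keys that the describe response provides, and it relies on the Salesforce naming convention that External Objects end in "__x". The function name dataset_subtype is hypothetical and is not part of the connector.

```python
from typing import Dict, Optional


def dataset_subtype(sobject: Dict[str, object]) -> Optional[str]:
    """Map one describe entry to a DataHub dataset subtype (illustrative only)."""
    name = str(sobject["name"])
    if name.endswith("__x"):
        # External Objects are not ingested by this integration.
        return None
    # The "custom" flag distinguishes Custom Objects from Standard Objects.
    return "Custom Object" if sobject.get("custom") else "Standard Object"
```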

Caveats

  • This connector has only been tested with Salesforce Developer Edition.
  • This connector currently supports only table level profiling (row and column counts). Row counts are approximate, as returned by the Salesforce RecordCount REST API.
  • This integration does not support ingesting Salesforce External Objects.
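To make the row-count caveat concrete, here is a minimal sketch of building a Record Count request URL and reading its response. The API version v53.0 is an assumption chosen for illustration, both function names are hypothetical, and the sample payload only mirrors the documented response shape (an "sObjects" list of name/count pairs).

```python
from typing import Dict, Iterable
from urllib.parse import urlencode

API_VERSION = "v53.0"  # assumption: any version that supports limits/recordCount


def record_count_url(
    instance_url: str, sobjects: Iterable[str], api_version: str = API_VERSION
) -> str:
    """Build the Record Count resource URL for the given object names."""
    # Keep commas unescaped so the object list stays readable in the URL.
    query = urlencode({"sObjects": ",".join(sobjects)}, safe=",")
    base = instance_url.rstrip("/")
    return f"{base}/services/data/{api_version}/limits/recordCount?{query}"


def parse_record_counts(payload: Dict[str, list]) -> Dict[str, int]:
    """Turn the response body into {object name: approximate row count}."""
    return {entry["name"]: entry["count"] for entry in payload.get("sObjects", [])}


# Illustrative payload, not real data.
sample = {"sObjects": [{"count": 3, "name": "Account"}, {"count": 22, "name": "Lead"}]}
counts = parse_record_counts(sample)
```

Because Salesforce reports these counts as approximate, they are best treated as estimates rather than exact table sizes.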

CLI based Ingestion

Install the Plugin

pip install 'acryl-datahub[salesforce]'

Starter Recipe

Check out the following recipe to get started with ingestion! See below for full configuration options.

For general pointers on writing and running a recipe, see our main recipe guide.

pipeline_name: my_salesforce_pipeline
source:
  type: "salesforce"
  config:
    instance_url: "https://mydomain.my.salesforce.com/"
    username: user@company
    password: password_for_user
    security_token: security_token_for_user
    platform_instance: mydomain-dev-ed
    domain:
      sales:
        allow:
          - "Opportunity$"
          - "Lead$"

    object_pattern:
      allow:
        - "Account$"
        - "Opportunity$"
        - "Lead$"

sink:
  type: "datahub-rest"
  config:
    server: "http://localhost:8080"

Config Details

Note that a . is used to denote nested fields in the YAML recipe.
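The dotted notation can be read as a path into the nested recipe config. A small sketch of that expansion follows; expand_dotted is a hypothetical helper written only for illustration and is not part of DataHub.

```python
from typing import Any, Dict


def expand_dotted(flat: Dict[str, Any]) -> Dict[str, Any]:
    """Expand keys such as 'profiling.enabled' into nested dictionaries."""
    nested: Dict[str, Any] = {}
    for dotted_key, value in flat.items():
        *parents, leaf = dotted_key.split(".")
        node = nested
        for part in parents:
            # Create intermediate levels on demand.
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


config = expand_dotted(
    {"profiling.enabled": True, "object_pattern.allow": ["Account$"]}
)
```

Here profiling.enabled corresponds to an enabled key nested under profiling in the YAML recipe.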

Field [Required] | Type | Description | Default | Notes
access_token | string | Access token for instance url | |
auth | Enum | | USERNAME_PASSWORD |
ingest_tags | boolean | Ingest Tags from source. This will override Tags entered from UI | False |
instance_url | string | Salesforce instance url. e.g. https://MyDomainName.my.salesforce.com | |
is_sandbox | boolean | Connect to Sandbox instance of your Salesforce | False |
password | string | Password for Salesforce user | |
platform | string | | salesforce |
platform_instance | string | The instance of the platform that all assets produced by this recipe belong to | |
security_token | string | Security token for Salesforce username | |
username | string | Salesforce username | |
env | string | The environment that all assets produced by this connector belong to | PROD |
domain | map(str,AllowDenyPattern) | A class to store allow deny regexes | |
domain.key.allow | array(string) | | |
domain.key.deny | array(string) | | |
domain.key.ignoreCase | boolean | Whether to ignore case sensitivity during pattern matching. | True |
object_pattern | AllowDenyPattern | Regex patterns for Salesforce objects to filter in ingestion. | {'allow': ['.*'], 'deny': [], 'ignoreCase': True} |
object_pattern.allow | array(string) | | |
object_pattern.deny | array(string) | | |
object_pattern.ignoreCase | boolean | Whether to ignore case sensitivity during pattern matching. | True |
profile_pattern | AllowDenyPattern | Regex patterns for profiles to filter in ingestion, allowed by the object_pattern. | {'allow': ['.*'], 'deny': [], 'ignoreCase': True} |
profile_pattern.allow | array(string) | | |
profile_pattern.deny | array(string) | | |
profile_pattern.ignoreCase | boolean | Whether to ignore case sensitivity during pattern matching. | True |
profiling | SalesforceProfilingConfig | | {'enabled': False} |
profiling.enabled | boolean | Whether profiling should be done. Supports only table-level profiling at this stage | False |
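The allow, deny and ignoreCase fields can be understood through a minimal filter sketch. It assumes that patterns are matched from the start of the object name and that a deny match takes precedence over any allow match. It illustrates those semantics only and is not DataHub's AllowDenyPattern implementation; is_allowed is a hypothetical name.

```python
import re
from typing import Iterable


def is_allowed(
    name: str,
    allow: Iterable[str] = (".*",),
    deny: Iterable[str] = (),
    ignore_case: bool = True,
) -> bool:
    """Return True when name passes the allow/deny regex filters."""
    flags = re.IGNORECASE if ignore_case else 0
    # Assumption: a deny match always excludes the object.
    if any(re.match(pattern, name, flags) for pattern in deny):
        return False
    # Otherwise at least one allow pattern must match from the start of the name.
    return any(re.match(pattern, name, flags) for pattern in allow)


allow_patterns = ["Account$", "Opportunity$", "Lead$"]
kept = [
    n
    for n in ["Account", "Opportunity", "OpportunityHistory", "Contact"]
    if is_allowed(n, allow=allow_patterns)
]
```

The trailing $ in patterns such as "Opportunity$" is what keeps related objects like OpportunityHistory out of the result under these assumptions.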

Code Coordinates

  • Class Name: datahub.ingestion.source.salesforce.SalesforceSource
  • Browse on GitHub

Questions

If you've got any questions on configuring ingestion for Salesforce, feel free to ping us on our Slack.