Configuring Imports
From the ExpressionEngine Control Panel, go to Add-Ons > Modules and choose the DataGrab module.
DataGrab can import several types of data: XML, CSV, JSON, and WordPress. Before configuring your import, double-check the validity of your import file.
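One way to check a JSON or XML file before pointing DataGrab at it is a small standalone script. The following is a minimal sketch, not part of DataGrab: the function name checkImportContents is hypothetical, it assumes PHP's standard json, libxml, and SimpleXML extensions are available, and it only checks that the contents parse, not that they contain the entries you expect.

```php
<?php
// Hypothetical helper, not part of DataGrab.
// Returns null when the contents parse cleanly, or a short error message otherwise.
function checkImportContents(string $contents, string $type): ?string
{
    switch ($type) {
        case 'json':
            json_decode($contents);
            return json_last_error() === JSON_ERROR_NONE
                ? null
                : 'Invalid JSON: ' . json_last_error_msg();

        case 'xml':
            // Collect parser errors instead of emitting PHP warnings.
            $previous = libxml_use_internal_errors(true);
            $xml = simplexml_load_string($contents);
            $errors = libxml_get_errors();
            libxml_clear_errors();
            libxml_use_internal_errors($previous);
            if ($xml === false) {
                $first = $errors ? trim($errors[0]->message) : 'unknown parser error';
                return 'Invalid XML: ' . $first;
            }
            return null;

        default:
            // CSV and WordPress exports are not covered by this sketch.
            return 'No check implemented for type: ' . $type;
    }
}
```

Passing the file contents and either 'json' or 'xml' returns null for a well-formed document, so any non-null result is worth investigating before continuing.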
For additional security, you can add $config['datagrab_verify_peer'] = 'y';
to your config file. This verifies the SSL certificate of the host for the URL of the file you are importing.
Once you have chosen your data type, you are presented with options determined by that data type. Every data type requires a Filename or URL, and most require a path to each entry in the document. This could be the XML path or the node in a JSON file.
The Filename or URL field can also be an environment variable. For example, if your .env.php
file defines a variable named MY_IMPORT_FILE_URL, you can use $MY_IMPORT_FILE_URL
as the field value. Using an environment variable means you can use different import files in a local or dev environment and in production, without changing the DataGrab settings when moving between environments.
DataGrab supports basic authentication. If your import URL requires additional authentication, such as OAuth for a private REST API, you will need to create a custom mediator script to handle the authentication, then use that mediator as the URL in your DataGrab configuration. For example: https://mysite.com/api-authenticator.php
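A mediator of this kind could look roughly like the sketch below. It is not shipped with DataGrab and makes several assumptions: the API uses the OAuth 2.0 client credentials grant, the token and feed URLs are placeholders, the credentials come from hypothetical API_CLIENT_ID and API_CLIENT_SECRET environment variables, and the token response carries an access_token field. Adjust each of these to match the API you are calling.

```php
<?php
// api-authenticator.php: hypothetical mediator sketch, not part of DataGrab.
// All URLs, environment variable names, and field names are placeholders.

const TOKEN_URL = 'https://api.example.com/oauth/token';
const FEED_URL  = 'https://api.example.com/v1/entries';

/** Perform one HTTP request with cURL and return the response body. */
function httpRequest(string $url, array $headers = [], ?array $postFields = null): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_HTTPHEADER     => $headers,
        CURLOPT_TIMEOUT        => 30,
    ]);
    if ($postFields !== null) {
        curl_setopt($ch, CURLOPT_POST, true);
        curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($postFields));
    }
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('Request failed: ' . curl_error($ch));
    }
    return $body;
}

/**
 * Fetch the protected feed and return its raw body.
 * $http is injectable so the flow can be exercised without a network.
 */
function fetchFeed(callable $http, string $clientId, string $clientSecret): string
{
    // Step 1: exchange the client credentials for an access token.
    $tokenResponse = json_decode($http(TOKEN_URL, [], [
        'grant_type'    => 'client_credentials',
        'client_id'     => $clientId,
        'client_secret' => $clientSecret,
    ]), true);

    if (!is_array($tokenResponse) || empty($tokenResponse['access_token'])) {
        throw new RuntimeException('No access token in token response');
    }

    // Step 2: request the feed with the bearer token.
    $authHeader = 'Authorization: Bearer ' . $tokenResponse['access_token'];
    return $http(FEED_URL, [$authHeader], null);
}

// Only respond when served over HTTP, so including this file has no side effects.
if (PHP_SAPI !== 'cli') {
    try {
        header('Content-Type: application/json');
        echo fetchFeed(
            'httpRequest',
            getenv('API_CLIENT_ID') ?: '',
            getenv('API_CLIENT_SECRET') ?: ''
        );
    } catch (RuntimeException $e) {
        http_response_code(502);
        echo json_encode(['error' => $e->getMessage()]);
    }
}
```

DataGrab then only ever sees the mediator's URL and the plain JSON it returns. Because the script exposes data that was previously protected, consider restricting access to it, for example with basic authentication.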
If your file is valid and readable, you will be presented with an example of the data found in the file. If no data is found, there was probably an error in the settings on the previous Import Settings page.
If everything looks correct, click on “Configure Import” to continue to the next step.
If you see an error on this page, usually something to do with cURL, it is almost certainly not a DataGrab issue. For some reason, DataGrab cannot read your import file. Check the following:
Make sure the path in the config is correct.
Make sure $config['datagrab_verify_peer'] = 'y';
is not in your config.php file, or that it is set to 'n'.
Open the import feed in a separate browser window. If it does not load and you do not see the XML or JSON data, DataGrab cannot read it either.
If you are importing from a local file or URL and using basic auth to block access to your site while in development, then be sure to include the credentials in your import URL as well. The import process does not use the same authenticated session as your browser. For example: http://user:pass@mystagingsite.com/import-file.json
If you are importing from a remote file or URL that is behind basic auth, make sure the credentials are correct, and make sure the username and password don't contain an @. If you open the feed with the username and password in a separate browser window and you receive a login prompt, then DataGrab sees the same thing and cannot read the file. This likely indicates your username or password is wrong, or something else is misconfigured on the server.
Make sure your site and feed protocols match. Don't load a feed from https if your site is running from http.
This documentation will not cover every option on the Configure Import page, because the options vary depending on the data type chosen and on your ExpressionEngine configuration. The configuration screen is self-documenting: every option has a description explaining what it does.
If the channel you are importing into has category groups assigned to it, you will be able to choose which values in your import file map to which category. The main options are the Title field and the Custom Fields section. You will see an option for every custom field assigned to your channel. Just as with categories, you choose which values in your import file map to which custom field.
Additional options let you tell DataGrab what to do if an entry already exists. This is especially useful if we want to periodically re-run the import. The best way to do this is to import a unique value into a custom field (often referred to as a GUID). This unique value could be an ID, a URL specific to the entry, a stock control number, or just something as simple as the title.
If the module encounters a duplicate entry, we can tell it to ignore the entry and not import it, or we can have it check whether anything has changed and update the existing record.
Sometimes, we want to delete entries that are not in the current import file. By adding a timestamp, we can see when a record was last updated. We can then delete older entries if desired.
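The interaction between the unique value, the duplicate handling, and the timestamp can be sketched as follows. This is an illustration of the idea only, not DataGrab's internal code: the function name planImport, the array shapes, and the rule that any entry not seen during the current run is a deletion candidate are all assumptions made for the example.

```php
<?php
// Illustrative sketch only; this is not DataGrab's internal code.
// $existing maps a unique value (GUID) to ['data' => array, 'last_seen' => int].
// $incoming maps a unique value (GUID) to the data found in the import file.
function planImport(array $existing, array $incoming, int $now, bool $updateChanged): array
{
    $plan = ['create' => [], 'update' => [], 'skip' => [], 'delete' => []];

    foreach ($incoming as $guid => $data) {
        $guid = (string) $guid; // PHP turns numeric string keys into integers
        if (!isset($existing[$guid])) {
            $plan['create'][] = $guid;
        } elseif ($updateChanged && $existing[$guid]['data'] != $data) {
            $plan['update'][] = $guid;
        } else {
            // Duplicate that is unchanged, or updates are switched off.
            $plan['skip'][] = $guid;
        }
        // Every entry present in the file gets its timestamp refreshed.
        $existing[$guid]['last_seen'] = $now;
    }

    // Entries whose timestamp was not refreshed were missing from this file.
    foreach ($existing as $guid => $entry) {
        if ($entry['last_seen'] < $now) {
            $plan['delete'][] = (string) $guid;
        }
    }

    return $plan;
}
```

With $updateChanged set to false, every duplicate lands in the skip list, which corresponds to ignoring existing entries. The delete list is only a set of candidates, since deleting older entries is optional.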
After you save the configuration, you will have the chance to give the import a name and a short description of what it does, and optionally to give it a passkey.