Many of the projects I've worked on over the past several years have required importing data either from SQL or from an existing Sitecore installation itself. The original application was a page form that imported data from a database and was developed by Mark Graber, Sitecore MVP and employee at Agency Oasis. Each time I've used it I improve on it and have since retooled it as an internal Sitecore application. I've recently cleaned it up to be a reusable, customizable application and am releasing it as a shared source module. You can find it on Sitecore Shared Source, or by getting the SVN repo at http://svn.sitecore.net/DataImporter. If you want to test it out you'll just need the package install. You'll also want to look over the documentation, which largely reiterates the information in this article.
There are two main pieces of the importer. The first is the import mapping which you define and then there is the application which runs the mapping. This was set up in this way so that you could test run an import by just querying for a subset of the results and then change the query and re-run an import easily. After the initial installation you'll need to login under desktop mode in the master database and browse to /sitecore/System/Modules/DataImports.
By right-clicking on the "Data Imports" folder you can create Import Map Folders, a Sitecore Import Map or a SQL Import Map as shown below.
The Import Map Folders are used to help organize a large set of imports which in my experience is common. One reason is that you will often find that there is a lot of information to import and it helps to break it up into several imports so that the import processing time can be broken up a bit.
The Sitecore Import Map is used to create new content items from existing ones. If, for example, you were upgrading an old Sitecore site with a new set of templates and wanted to copy the information from the old template model to items using the new template model. This allows you to do that. The only field that is specific to the Sitecore Import Map is the "XPath Query". This will be used by the importer to get the set of items you want to copy over.
The SQL Import Map is used to create content items from content in a SQL Database. This would be useful when you're rebuilding a site that was using another system other than in Sitecore. You will save a lot of time by importing content this way. Similar to the Sitecore Import Map the SQL Import Map only has one unique field which is the "SQL Query". This is used by the importer to query for the result set to be imported.
There are also a set of fields shared by both the Sitecore Import Map and SQL Import Map. The Base Import Map section defines several fields that help you setup your import process. The "Import To Where" field is a DropLink field and allows you choose a folder where you want to import the content to. The "Import To What Template" field defines which template the new items will be created in. The "Pull Item Name from What Fields" field is used to create the new items with a name from the comma separated set of fields provided. The "Item Name Max Length" determines how long is too long for an item name.
There is also an Item Folder section that is shared by both map types. This is useful for importing articles or people since you'll likely want to folder them by date or name. You can select either folder by date or by name. If you choose to folder by date you must provide a "Date Field" value so that the importer knows where to look for the date value. You can also select a specific folder type to folder with by using the "Folder Template" field which is a DropLink used to select a template in Sitecores template section.
That's all there is for the setup of an import map. So far you have defined enough information to pull a dataset and create a new item for each row or item in that dataset. Now we need to define how to pass the information from the existing row or item to the new item. Generally this is handled on a field by field basis. This means that if you are importing articles you would define how to import the date field, author, article title and the article body separately. There is also one difference between importing content from Sitecore than from SQL. When importing Sitecore items you will often times want to access properties of that item and not just fields. A path to the item or the url of the item would be considered a property. This is something a SQL DataRow will not have. So when you create a Sitecore Import Map you'll see that there is a folder specifically for the properties separate from the fields.
There are also a number of property and field types already defined. If you find that the existing set isn't enough to fill your needs you can also create your own custom import types, which I'll describe later in this article. When you right-click on a Fields folder you'll be given a set of types shown below.
The list of provided Field types are:
ListToGuid
This Type is used to match the import value to the display name on a list of child items. You would set the list in the field and all the child item’s names will be compared. If there is a match then the GUID of the matched item will be stored in the new field. This is useful when you’ve imported an enumeration previously and you want to import items that point to the enumeration using link fields like DropLink.
ToDate
This is used to import date values into DateTime field types.
ToStaticValue
This is used when all the import items need to have a field populated with the same value. For example if you’re importing a list of blog articles and you want to set the value of a DropLink field to the same GUID
ToText
This is used to set the text value of the import to the TextField of the new item.
UrlToLink
This is used to set the import value which would be a url to the URL property of the LinkField..
When you right-click on a Properties folder you'll be given a set of types shown below.
The list of provided Property types are:
PathToText
This will set the Item.Paths.Path property of the imported item to a TextField on the new item
UrlToText
This will set the url of the imported item to the TextField on the new item.
Both fields and properties share some definition fields. The To What Field is the name of the field on the new item you'll be inserting the value into. The Handler Class and Handler Assembly is the class and binary file that will do the importing. This is what eventually allows you to customize your own mappings/handlers. Again more on that later.
The ToDate, ToText and UrlToLink fields use the Field Mapping section. The From What Fields allows you to select multiple fields to populate the new field value with.
The ToStaticValue field type has a Value Mapping section whose lone field is Value. The Value field is used so that you can set to copy a static value into a field during import
The ListToGuid field has a Droplink Mapping section. The Source List field allows you to define the parent of a list of items.
Once you've fully defined an import map you're now ready to start importing that data. You'll want to open up the Sitecore start menu and go to Development Tools->Data Import.
You'll be presented with an internal application. For SQL imports the Connection String section drop down list will populate with connection strings defined in your web.config or connectionstrings.config files. You'll want to select the database connection that you want to import from. The Import Setting drop down will populate with any SQL Import maps that you've defined. Once you've selected the import map you want to run just hit the Import button and wait for it to finish. Depending on the number of items to import you could be waiting for a while. I recommend you break up your imports if you can. I've imported close to 7 thousand items at a time but it took a good 30 - 45 minutes to run on a fairly powerful machine. Anything in the hundreds should be reasonably quick but on large sets you should consider whether or not you can wait so long to have your browser locked up.
Custom Mappings
To create your own custom mappings you will need to create a template item and class to handle the template. You could also download the source code to build the class library with your custom mappings in them or add them to your own library and reference them properly as explained below. Property Mappings are only for Sitecore imports since properties are only related to Sitecore items.
Creating the Template
You would need to define a template and extend from either BaseField or BaseProperty depending on if you are creating a Field or Property Mapping. You could also inherit from another existing class. You can also add any fields to store information you want on this template. The field values on this template will be manipulated in the class you define later. Now you would need to create a Standard Values for that template. In the Standard Values item you will see there are two inherited fields: Handler Class and Handler Assembly. These are used to determine which class runs this mapping. The “Handler Assembly” field should be populated with the “Sitecore.SharedSource.DataImporter”. This assumes you will be downloading the source and adding your fields to this class library. If you are not you can override the value in this field and then set the “Handler Class” field to the fully qualified name of your class (namespace and class name).
Creating the Class
The class you create should inherit from the BaseField or BaseProperty depending on if you are creating a Field or Property Mapping. You could also inherit from existing classes. In the new class the constructor should call the constructor of the BaseField or BaseProperty class. It should also implement the two FillField methods. One is to handle SQL imports and the other is to handle Sitecore imports. They’re separated because the import rows are different types. The FillField methods can be used to manage one field or if you’re in a hurry many fields. I was trying to define types that would be reusable so I decided to break each field into its own class but since you have access to both the existing DataRow and the new item. You can do much more than just handle a single field. You could write an entirely custom import in this one class. It’s intended for you to be able to manage the process of transforming the data so do what you will.
Insert Options
Once you’ve created your mappings update the Fields or Properties Folder insert options. This will allow you to create the mappings in the system and run the import utility against them.
Custom Importer - Support for Oracle or XML
I know that some people use Oracle. I haven’t used Oracle so I don’t know what form a resulting row is like. If it is a standard .NET DataRow You should be able to use the SQL importer. If this is not the case you need to create a new DataMap Class and sitecore template similar to the field and property mappings above. You would do the same for importing XML as well. From here you can add fields to the template and add them as properties to the class. The class should inherit from BaseDataMap and call the constructor on the BaseDataMap on its own constructor. You will also need to implement a method called Process. You can largely just copy from the SQLDataMap.Process method and make changes to query the data and loop through the result set.
When you create a new import map type you will be iterating through a different object type instead of DataRow or Items. To support this type you'll want to add a FillField method to the BaseField class so that each of the existing field classes will be forced to implement it to handle consuming the custom row's specific object type.
To support the right-click insert options you'll also want to create a branch template with a Field Folder as a child item. Change the branch item icon to match the template icon and add the new branch template to the insert options of the Data Import Folder and Import Map Folder.