AABC Toolkit: SFU data migration

From AtoM wiki

Main Page > Development > Development/Projects > Development/Projects/AABC Data Migration Toolkit > SFU

Note

This is historical development documentation relating to the AABC Data Migration Toolkit project (2010-2011). It has been migrated here from a now-defunct wiki for reference. The content in this page was first created in December of 2010, and last updated January 3, 2011. This content was migrated to the AtoM wiki in July of 2015. Links have been updated where possible, and some presentation elements may have been to account for style differences between the two wikis. Otherwise, the content is the same. For more information, see the project landing page: AABC Data Migration Toolkit.

Overview[edit]

Screenshot of final SFU -> ICA-AtoM PDI Spoon migration script

Extract[edit]

Transform[edit]

SFU AIS Data Mapping[edit]

Database: armd__zdata__archives.fp7[edit]

Table: FND__Fonds[edit]

Source column Dest. column Notes
_kplct__fondsRefCode information_object.identifier
fondsTitle__lat information_object_i18n.title
_kflxt__creatorID event.actor_id "Creation" event type; link with dates (below)
dateRange__lat event.date "Creation" event type
startYear__lxn event.start_date "Creation" event type
endYear__lxn event.end_date "Creation" event type
See note information_object.level_of_description_id All rows are "fonds" (id: 184) level
ExtentAll_lct information_object_i18n.extent_and_medium
See notes information_object.repository_id Link to pre-populated SFU repository record
RADNoteSourceAcquisition__lxt information_object_i18n.acquisition
scopeContent__lxt information_object_i18n.scope_and_content
RADNoteAccruals__lxt information_object_i18n.accruals
RADNoteArrangement__lxt information_object_i18n.arrangement
RADNoteRestrictions__lxt information_object_i18n.access_conditions
RADNoteFindingAids__lxt information_object_i18n.finding_aids
RADNoteRelatedMaterial__lxt information_object_i18n.related_units_of_description
RADNoteOther__lxt note.i18n "General note" (type_id: 125)
_kflxt__descriptionID information_object.description_identifier Use as primary key for source (keymap.source_id)
recordStatus__lat information_object.description_status_id
see notes information_object.description_detail_id Set to "Full" for all Fonds-level descriptions
faPreparedBy__lxt notes.i18n "Archivist's note" (type_id: 124)
faFinancialAssistance__lxt notes.i18n "Archivist's note" (type_id: 124)

Table: SRS__Series[edit]

Source column Dest. column Notes
_kplct__seriesRefCode information_object.identifier
_kfuct__parentID information_object.parent_id
seriesTitle__lat information_object_i18n.title
_kflxt__creatorID event.actor_id "Creation" event type; link with dates (below)
dateRange__lat event.date "Creation" event type
startYear__lxn event.start_date "Creation" event type
endYear__lxn event.end_date "Creation" event type
levelDescription__lct information_object.level_of_description_id
ExtentAll_lct information_object_i18n.extent_and_medium
See notes information_object.repository_id Link to pre-populated SFU repository record
RADNoteSourceAcquisition__lxt information_object_i18n.acquisition
scopeContent__lxt information_object_i18n.scope_and_content
RADNoteAccruals__lxt information_object_i18n.accruals
RADNoteArrangement__lxt information_object_i18n.arrangement
RADNoteRestrictions__lxt information_object_i18n.access_conditions
RADNoteFindingAids__lxt information_object_i18n.finding_aids
RADNoteRelatedMaterial__lxt information_object_i18n.related_units_of_description
RADNoteOther__lxt note.i18n "General note" (type_id: 125)
_kflxt__descriptionID information_object.description_identifier Use as primary key for source (keymap.source_id)
recordStatus__lat information_object.description_status_id
findingAidIncludeFlag__lat information_object.description_detail_id "Full" if flag is "Yes", "Partial" otherwise

Table: FLE__File[edit]

Source column Dest. column Notes
_kplct__fileRefCode information_object.identifier
_kfuct__parentID information_object.parent_id
fileTitle__lat information_object_i18n.title
_kflxt__creatorID event.actor_id "Creation" event type; link with dates (below)
dateRange__lat event.date "Creation" event type
startYear__lan event.start_date "Creation" event type
endYear__lan event.end_date "Creation" event type
See note information_object.level_of_description_id All rows are "file" (id: 189) level
scopeContent__lxt information_object_i18n.scope_and_content
accessStatus__lat information_object_i18n.access_conditions
RADNotes__lxt note.i18n "General note" (type_id: 125)
_kflxt__descriptionID information_object.description_identifier Use as primary key for source (keymap.source_id)
recordStatus__lat information_object.description_status_id

Table: ITM__Item[edit]

Source column Dest. column Notes
_kplct__itemRefCode information_object.identifier
_kfuct__parentID information_object.parent_id
itemTitle__lat information_object_i18n.title
_kflxt__creatorID event.actor_id "Creation" event type; link with dates (below)
dateRange__lat event.date "Creation" event type
startYear__lan event.start_date "Creation" event type
endYear__lan event.end_date "Creation" event type
See note information_object.level_of_description_id All rows are "item" (id: 189) level
scopeContent__lxt information_object_i18n.scope_and_content
accessStatus__lat information_object_i18n.access_conditions
RADNotes__lxt note.i18n "General note" (type_id: 125)
_kflxt__descriptionID information_object.description_identifier Use as primary key for source (keymap.source_id)
recordStatus__lat information_object.description_status_id

Table: ACP__AccessPoint[edit]

Source column Dest. column Notes
_kplat__accessPointID keymap.source_id
_kflxt__termID object_term_relation.term_id (Subject)
relation.object_id (Name)
event.actor_id (Custodian)
event.actor_id (Collector)
_kflxt__descriptionID object_term_relation.object_id (Subject)
relation.subject_id (Name)
event.information_object_id (Custodian)
event.information_object_id (Collector)

Database: armd__zdata__authoritycontrol.fp7[edit]

Table: CORP__CorporateBody[edit]

Source column Dest. column Notes
_kflxt__termID actor.description_identifier Also used as primary key (source) in keymap table
nameList__lxt actor_i18n.authorized_form_of_name
legalNumber__lxt actor.corporate_body_identifiers
dateExistence__lat actor_i18n.dates_of_existence
adminHistory__lxt actor_i18n.history
location__lxt actor_i18n.places
legalFramework__lxt actor_i18n.legal_status
mandateFunctions__lxt actor_i18n.functions
legalAuthorities__lxt actor_i18n.mandates
adminStructure__lxt actor_i18n.internal_structures
otherInfo__lxt actor_i18n.general_context
sources__lxt actor_i18n.sources

Table: FAM__Family[edit]

Source column Dest. column Notes
nameList__lxt actor_i18n.authorized_form_of_name
dateExistence__lat actor_i18n.dates_of_existence
familyHistory__lxt actor_i18n.history
residence__lxt actor_i18n.places
activities__lxt actor_i18n.functions
otherInfo__lxt actor_i18n.general_context
sources__lxt actor_i18n.sources
_kflxt__termID actor.description_identifier Also used as primary key (source) in keymap table

Table: PER__Person[edit]

Source column Dest. column Notes
nameList__lxt actor_i18n.authorized_form_of_name
dateExistenceDates__lat actor_i18n.dates_of_existence
familyHistory__lxt actor_i18n.history
residence__lxt actor_i18n.places
activities__lxt actor_i18n.functions Concatenation of "activities__lxt" and "occupations__lxt"
occupations__lxt actor_i18n.functions Concatenation of "activities__lxt" and "occupations__lxt"
otherInfo__lxt actor_i18n.general_context
sources__lxt actor_i18n.sources
_kflxt__termID actor.description_identifier Also used as primary key (source) in keymap table

Table: NME__Name[edit]

Alternate forms of name. We are only migrating "Previous form of name".

Source column Dest. column Notes
_kplat__nameID keymap.source_id primary key
_kflxt__termID other_name.object_id
startYear__lan other_name.start_date
endYear__lan other_name.end_date
name__lxt other_name_i18n.name
effectiveDateRange__lat other_name_i18n.dates

Table: RLT__RelatedTerm[edit]

The SFU data uses two reciprocal rows to describe (most) term relations with a different predicate for each. For the migration only the first relation is used (the row with the smallest "_kplat__relatedTermID") and a concatenate string is used to create a subject-predicate-object descriptive statement.

Source column Dest. column Notes
_kplat__relatedTermID keymap.source_id primary key
__kflxt__term1ID relation.subject_id
__kflxt__term2ID relation.object_id
startFormattedDates__lct relation.start_date
endFormattedDates__lct relation.end_date
datesFull__lct relation_i18n.date
See notes relation_i18n.description Concatenate of term1Name__lat (subject) + relation__lxt (predicate) + term2Name__lat (object)

Load[edit]

Notes[edit]

SFU system documentation[edit]

SFU AIS documentation

2010-Dec-10 meeting[edit]

Present: David (Artefactual), Richard and Paul (SFU Archives).

Archival descriptions[edit]

https://wiki.sfu.ca/departments/archives/index.php/AIS_ICA-AtoM_Field_Mapping:_Archival_Description

  • Richard to add "parent_id" field to FMPro to facilitate migration
  • Series table self-relates?
    • Yes. Capped at three levels - series, sub-series and sub-sub-series. All three levels are included in SFU 6-place hierarchical numbering system (e.g. "F-202-1-0-0-2-118")
  • Gap: RetrievalDescription table?
    • Used for physical access control. Not migrating in pilot.
  • Gap: Holding table?
    • Used for records management. Not migrating in pilot.
  • Will calc Level of Description in FMPro during migration
  • File::accessStatus__lat, Item::accessStatus__lat - gap in workflow (controlled vocab, default status to "pending review")
    • Can implement SFU vocab via taxonomy and migrate directly (no mapping)
    • For pilot default status will be migrated from system. May still be a workflow gap in the future if all descriptions are entered via ICA-AtoM.
  • RADNoteOther - keep as a single "blob" (don't break field by headers).
  • AccessPoint::heading__lxt
    • Add RAD GMD terms to "subject" taxonomy to replicate existing usage as subject access points
    • Ignore qualifier for amount of materials for pilot.
  • Officers are a special name access point relation
    • Description A -> corporation Acme -> officer Andy
    • Officers must be listed in description A to be included in access points according to RAD
    • Migrate officer (Andy) as name access point for Description A
    • Migrate corporation Acme as authority record linked to Description A
    • In migration, concatenate list of officers of corporation Acme to the "Admin history" field for corporation Acme
  • Institution identifier & Rules or conventions - not necessary to create a field in FMPro, will set during tranform
  • *::recordStatus__lat
    • duplicate SFU vocab in ICA-AtoM taxonomy
    • Don't migrate "superseded" or "obsolete" descriptions for pilot.
    •  ::recordStatus__lat - workflow gap "set to 'Provisional' on creation. Not a concern for pilot.
  • Language & script of description - set in migration tool
  • Fonds::faPreparedBy__lxt - store in "Date of creation, revision & deletion"?
  • Fonds::faFinancialAssistance__lxt - migrate to General note field for now, no better mapping in RAD.
  • Don't migrate version history

Authority records[edit]

https://wiki.sfu.ca/departments/archives/index.php/AIS_ICA-AtoM_Field_Mapping:_Authority_Record

  • Type of entity - set during transform
  • Authorized form of name - Richard will create a separate field for Persons in FMPro to get name in "Last, First" format. Migrate this name instead of "authorized form"
  • Other name - just migrate previous titles (have previousTitleFlag__lct flagged)
  • Most of fields not used, and have inconsistent data, so we will not migrate
    • Migrate all identity area
    • Migrate dates and history from description area
    • Don't migrate rest of authority record data
  • Relations
    • SFU will map SFU entity relation terms to ISAAR terms (hierarchical, family, temporal, associative)
    • Use SFU relation term to generate descriptive text for relation "e.g. Corp A supervises Corp B".
    • SFU relations are contextual (e.g. parent Person A/ parent of Person B), will make generic for ICA-Atom (e.g. Person A parent of Person B).

Online finding aids[edit]