BCAUL pilot project - Qubit input: RAD

From AtoM wiki
Revision as of 13:12, 29 July 2015 by Dan (Add historical note)


Note

This is historical development documentation, migrated from the now-defunct Artefactual wiki. The content was first added there September 26, 2008, and last updated on October 30, 2008. For more information, see the landing page for this development project: BCAUL Pilot Project. The content was moved to the AtoM wiki on July 29, 2015.


Notes

  • This table identifies the EAD elements output by Qubit's EAD export, mapping EAD tags to Qubit fields and methods.
  • In the Output column, red text indicates database content (table_name::field_name).
  • Qubit generally adheres to the Research Libraries Group's RLG Best Practice Guidelines for Encoded Archival Description (August 2002). Exceptions are indicated below in the Notes column.
  • When exporting EAD, Qubit sets the relatedencoding attribute of <eadheader> to "MARC21" and of <archdesc> to "ISAD(G)".
  • When exporting EAD, Qubit does not use the <frontmatter> element; when importing, Qubit ignores any data wrapped in <frontmatter> tags.

Tags

EAD header tags

<eadheader> | <eadid> | <filedesc> | <titlestmt> | <titleproper> | <author> | <editionstmt> | <edition> | <publicationstmt> | <publisher> | <date> | <address> | <seriesstmt> | <notestmt> | <profiledesc> | <creation> | <langusage> | <desrules> | <revisiondesc>

ARCHDES tags

<archdesc>

<did> tags: <did> | <origination> | <unittitle> | <unitdate> | <physdesc> | <phystech> | <abstract> | <physloc> | <originalsloc> | <repository> | <unitid> | <langmaterial> | <materialspec>

Other tags: <bioghist> | <scopecontent> | <arrangement> | <controlaccess> | <accessrestrict> | <accruals> | <acqinfo> | <altformavail> | <appraisal> | <custodhist> | <prefercite> | <processinfo> | <userestrict> | <relatedmaterial> | <separatedmaterial> | <otherfindaid> | <bibliography> | <odd>

Lower level tags: <dsc> | <c> | <daogrp> | <daoloc> | <daodesc>

Mapping

EAD header tags

EAD element Output Repeatable? Notes

<eadheader>   No Do not include relatedencoding attribute here.

Use different relatedencoding values for <eadheader> (MARC21) and <archdesc> (ISAD(G)).

  langencoding= "iso639-1" No Qubit uses language codes from Symfony framework.
  • ISO 639-1 = 2-letter alpha codes.
  • ISO 639-2 = 3-letter alpha codes.
  • Symfony appears to use 639-1 codes where possible and 639-2 codes only when no two-letter code is available.
  • Qubit therefore uses 639-2 in some cases, 639-1 in others.
  scriptencoding= "iso15924" No Qubit uses ISO-compliant 4-letter alpha codes for scripts from Symfony framework.
  relatedencoding= "MARC21" No Data in the <eadheader> section primarily relates to finding aid as a quasi-published work.
  • Therefore map to MARC21 as bibliographic standard.
  repositoryencoding= "iso15511" No Alphanumeric code (maximum 16 characters) uniquely identifying any library or related institution in the world.
  • Typically formed by 2-letter country code of authority that issues identifier, followed by dash, followed by identifier.
  • Qubit forms by concatenating the repository's Country code and Identifier.
  countryencoding= "iso3166-1" No Qubit uses ISO-compliant 2-letter alpha codes for countries from Symfony framework.
  dateencoding= "iso8601" No Qubit normalizes dates without dashes, e.g. September 29 2008 = 20080929.

  <eadid> information_object::identifier [get_current_date_timestamp] No Identifies document as unique instance of an EAD document.
  • Concatenate database id number with time-stamp.
  • Database id points to description.
  • Time-stamp differentiates different EAD outputs of same description.
    countrycode= contact_information::country_code No Get Country code value from primary contact of repository.

Assumes repository is the same as the agency responsible for maintaining description (not always the case).

  • In future iterations, should link to information_object::institution_responsible_identifier.
    mainagencycode= repository::identifier No Assumes repository is the same as the agency responsible for maintaining description (not always the case).
  • In future iterations, should link to information_object::institution_responsible_identifier.
    encodinganalog= "856$u" No MARC21 856$u = Electronic location and access / Uniform Resource Identifier
    url= [get_url_of_server] + "/information/show/id/information_object::id" No Qubit permanent url.

RLG Guidelines mandate using at least one of publicid, identifier, or url attributes.

  <filedesc>   No  

    <titlestmt>   No  

      <titleproper> "Finding aid: " information_object::title No "Finding aid" label needs to be translatable.

In future iterations, administrator should have interface to set default titles.

        encodinganalog= "245$a" No MARC21 245$a = Title statement
      <author> information_object::revision_history No In a future iteration, the finding aid may be a separate Qubit object with its own metadata.
        encodinganalog= "245$c" No MARC21 245$c = Title statement / Statement of responsibility
      <sponsor>   No Not currently supported by Qubit; data may be captured in future iteration.
    <editionstmt>   No  
      <edition> information_object::edition No  
        encodinganalog= "250$a" No MARC21 250$a = Edition statement

    <publicationstmt>   No  

      <publisher> actor::authorized_form_of_name No Get name of repository from actor table.

Assumes repository is the same as the agency responsible for publishing description (not always the case).

  • In future iterations, should link to information_object::institution_responsible_identifier.
        encodinganalog= "260$b" No MARC21 260$b = Publication, distribution, etc / Name of publisher, distributor, etc.

      <date> [Get current date] No Use current date as date of publication, formatted as text "MonthName Day, Year".
  • Dates of revision are registered in the <revisiondesc> tag.
        encodinganalog= "260$c" No MARC21 260$c = Publication, distribution, etc / Date of publication, distribution, etc
        normal= [get_current_date] No Normalize current date as "YYYYMMDD"

      <address>   No Get address info from repository's primary contact record.
  • In future iterations, should link to information_object::institution_responsible_identifier.
        encodinganalog= "260$a" No MARC21 260$a = Publication, distribution, etc / Place of publication, distribution, etc.

        <addressline>   <addressline>actor::authorized_form_of_name</addressline>

  <addressline>contact_information::street</addressline>

  <addressline>contact_information::city contact_information::region contact_information::country_code contact_information::postal_code</addressline>

  <addressline>Telephone: contact_information::telephone</addressline>

  <addressline>Fax: contact_information::fax</addressline>

  <addressline>Email: contact_information::email</addressline>

  <addressline>URL: contact_information::url</addressline>

No Return info in separate <addressline> tags.

    <seriesstmt>   No Not currently supported. In future iterations, Qubit may include a separate finding_aid object with its own metadata, which may include a field for information relating to the published monographic series to which the finding aid belongs.

    <notestmt>   Yes Not currently supported. In future iterations, Qubit may include a separate finding_aid object with its own metadata, which may include notes relating to its publication.

  <profiledesc>   No  

    <creation> "EAD finding aid output from ICA-AtoM by " [get_user_name] " on " <date normal="YYYYMMDD">[get_current_date]</date> No Indicates that EAD was machine-generated rather than manually coded and uses current date and user name.
      encodinganalog= "500" No MARC21 500 = General note

    <langusage>   No  

      <language> property::name="language_of_information_object_description" value=" " Yes Get values from related records in property table.

Code each language in separate <language></language> tags.

        encodinganalog= "041" No MARC21 041 = Language code
        langcode= property::name="language_of_information_object_description" value=" " No  
        scriptcode= property::name="script_of_information_object_description" value=" " No  

    <desrules> information_object::rules No Get value from text field. There is no MARC encoding analog.

  <revisiondesc>   No Not currently supported.
  • Qubit stores information relating to revisions in a single text field, Revision history, but this cannot be easily normalized into separate <change><item> entries as required by EAD (one <item></item> tag for each revision).

Future iteration:

  • Get data for these tags from Qubit versioning module.
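
The header rules above for <eadid> and <date> can be sketched in Python as follows. This is a minimal illustration, not Qubit's actual export code: the function names, the underscore separator and the YYYYMMDDHHMMSS time-stamp format in the <eadid> text are all assumptions, since the table only says the database id and a time-stamp are concatenated.

```python
import xml.etree.ElementTree as ET
from datetime import datetime


def normalize_date(moment: datetime) -> str:
    """Return the dash-free ISO 8601 form (YYYYMMDD) used in normal= attributes."""
    return moment.strftime("%Y%m%d")


def display_date(moment: datetime) -> str:
    """Return the text form "MonthName Day, Year" used as <date> content."""
    return f"{moment.strftime('%B')} {moment.day}, {moment.year}"


def build_eadid(object_id: int, country_code: str, agency_code: str,
                server_url: str, moment: datetime) -> ET.Element:
    """Build <eadid> from the description's database id and an export time-stamp."""
    eadid = ET.Element("eadid", {
        "countrycode": country_code,        # contact_information::country_code
        "mainagencycode": agency_code,      # repository::identifier
        "encodinganalog": "856$u",          # MARC21 electronic location / URI
        "url": f"{server_url}/information/show/id/{object_id}",
    })
    # Assumption: separator and time-stamp format are illustrative only.
    eadid.text = f"{object_id}_{moment.strftime('%Y%m%d%H%M%S')}"
    return eadid


def build_publication_date(moment: datetime) -> ET.Element:
    """Build <date> with display text and a normalized normal= attribute."""
    date = ET.Element("date", {
        "encodinganalog": "260$c",
        "normal": normalize_date(moment),
    })
    date.text = display_date(moment)
    return date


if __name__ == "__main__":
    exported_at = datetime(2008, 9, 29, 13, 5, 0)
    print(normalize_date(exported_at))  # 20080929, as in the dateencoding note
    print(ET.tostring(build_eadid(42, "CA", "EXAMPLE", "http://example.org",
                                  exported_at), encoding="unicode"))
```

The time-stamp is what lets two exports of the same description carry different <eadid> values, while the url attribute always points back to the same record.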

ARCHDES tags

EAD element Output Repeatable? Notes

<archdesc>   No  

  level= term::name No Get term::name (name of Level of description) via information_object::level_of_description_id

  relatedencoding= "ISAD(G) 2nd edition, 2000" No Map <archdesc> tags to ISAD(G) elements, as ISAD(G) is the standard for archival description on which ICA-AtoM is built.

Problem: what if ICA-AtoM takes in a description originally based on another standard (e.g. RAD, DACS), but now exports EAD as if ISAD(G) were the original source standard?

Future iteration:

  • Take value from information_object::rules field.
  • Use different encodinganalog values throughout <archdesc> depending on the standard.
  • Assumes that values in rules are controlled through taxonomy.
DID tags

EAD element Output Repeatable? Notes

  <did>   No  

    <origination>   Yes Get <origination> values from event table (creation events).
      <corpname>

      <famname>

      <persname>

actor::authorized_form_of_name Yes Get actor name via related creation event.

Use <corpname>, <famname> or <persname> as appropriate (from actor::entity_type_id).

Where multiple creators are registered, return each in its own <origination> tags.

        encodinganalog= "3.2.1" No ISAD(G) 3.2.1 = Name of creator(s)
        role= term::name No Get term::name via creation event::actor_role_id

    <unittitle> information_object::title

information_object::alternate_title

Yes Return Title and Alternate title in separate <unittitle> tags.
      encodinganalog= "3.1.2" No ISAD(G) 3.1.2 = Title
      type= "alternate" No Use only for Alternate title values.

    <unitdate> event::date_display Yes Get date information from related events.

Return multiple dates each in its own <unitdate></unitdate> tags.

      type=   No Not currently supported. While users can enter either inclusive or bulk dates in Date display field, there is no way to easily extract the Type from the data.
      normal= event::start_date

event::end_date

No Normalize as YYYYMMDD/YYYYMMDD
      encodinganalog= "3.1.3" No ISAD(G) 3.1.3 = Date(s)
      datechar= term::name No Get term::name via event::type_id

    <physdesc>   No  

      <extent> information_object::extent_and_medium No While EAD can accommodate multiple <extent> tags, Qubit stores all extent-related data in one field as a single string, which cannot easily be normalized into multiple extent statements.
        encodinganalog= "3.1.5" No ISAD(G) 3.1.5 = Extent and medium.

    <phystech> information_object::physical_characteristics No  
      encodinganalog= "3.4.4" No ISAD(G) 3.4.4 = Physical characteristics.

    <abstract>   No Not currently supported.

Future iterations:

  • May include a field for brief summary of description.

    <container> physical_object::name No Get from related physical_object record.

Problem: EAD distinguishes <container> (storage device, e.g. cartons, boxes, reels, folders) and <physloc> (place where storage devices are located - building, room, stack, shelf).

  • Qubit doesn't distinguish, treats all physical locations as containers within containers within containers.

    <physloc> physical_object::location No Get info from the related physical_object record; include this element only if such a record exists.

Include audience attribute = "internal" to make non-public?

    <originalsloc> information_object::location_of_originals No  
      encodinganalog= "3.5.1" No ISAD(G) 3.5.1 = Existence and location of originals.

    <repository>   No Get repository's data from related actor and repository records via repository_id value.

No ISAD(G) analog for <repository> element or its sub-elements.

      <corpname> actor::authorized_form_of_name No  

      <address>   No Get address info from repository's primary contact record.
  • In future iterations, should link to information_object::institution_responsible_identifier.

        <addressline>   <addressline>actor::authorized_form_of_name</addressline>

  <addressline>contact_information::street</addressline>

  <addressline>contact_information::city contact_information::region contact_information::country_code contact_information::postal_code</addressline>

  <addressline>Telephone: contact_information::telephone</addressline>

  <addressline>Fax: contact_information::fax</addressline>

  <addressline>Email: contact_information::email</addressline>

  <addressline>URL: contact_information::url</addressline>

No Return info in separate <addressline> tags.

    <unitid> information_object::identifier No  
      countrycode= contact_information::country_code No Get repository's country code from primary contact's related contact_information record.
      repositorycode= repository::identifier   Get repository code from related repository record.
      encodinganalog= "3.1.1"   ISAD(G) 3.1.1 = Reference code.

    <langmaterial>   No  
      encodinganalog= "3.4.3" No ISAD(G) 3.4.3 = Language / scripts of material.

      <language> property::information_object_language Yes Qubit stores code of language; transform to full string.
        langcode= property::information_object_language No  
        scriptcode= property::information_object_script No Problem: how to connect scripts and languages?
  • Qubit stores as unrelated properties.

    <materialspec>     Not currently supported by Qubit. But note that RAD version requires this EAD element for Class of material specific details (handled as properties).
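
As an illustration of the <unittitle> and <unitdate> rows above, the following Python sketch builds a partial <did>. It assumes the description is available as plain values and a list of event dicts; the function names and the dict keys (date_display, start_date, end_date, type) are hypothetical stand-ins for the Qubit fields named in the table, not Qubit's real API.

```python
import xml.etree.ElementTree as ET
from datetime import date
from typing import Optional


def normal_range(start: date, end: date) -> str:
    """Normalize an event's start and end dates as YYYYMMDD/YYYYMMDD."""
    return f"{start:%Y%m%d}/{end:%Y%m%d}"


def build_did(title: str, alternate_title: Optional[str],
              events: list) -> ET.Element:
    """Build a partial <did> holding <unittitle> and <unitdate> elements."""
    did = ET.Element("did")

    # Title and Alternate title go in separate <unittitle> tags.
    unittitle = ET.SubElement(did, "unittitle", {"encodinganalog": "3.1.2"})
    unittitle.text = title
    if alternate_title:
        alternate = ET.SubElement(
            did, "unittitle", {"encodinganalog": "3.1.2", "type": "alternate"})
        alternate.text = alternate_title

    # Each related event yields its own <unitdate>.
    for event in events:
        unitdate = ET.SubElement(did, "unitdate", {
            "normal": normal_range(event["start_date"], event["end_date"]),
            "encodinganalog": "3.1.3",
            "datechar": event["type"],   # term::name via event::type_id
        })
        unitdate.text = event["date_display"]
    return did


if __name__ == "__main__":
    sample_events = [{
        "date_display": "1900-1950",
        "start_date": date(1900, 1, 1),
        "end_date": date(1950, 12, 31),
        "type": "creation",
    }]
    did = build_did("Example fonds", "Example papers", sample_events)
    print(ET.tostring(did, encoding="unicode"))
```

The type= attribute on <unitdate> is left out on purpose, matching the note that inclusive versus bulk dates cannot be extracted from the Date display field.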
Other tags

EAD element Output Repeatable? Notes

  <bioghist> actor::history Yes Can include multiple admin / bio histories (any actor registered as creator in creation event).
    encodinganalog= "3.2.2" No ISAD(G) 3.2.2 = Administrative / biographical history.

  <scopecontent> information_object::scope_and_content No  
    encodinganalog= "3.3.1" No ISAD(G) 3.3.1 = Scope and content.

  <arrangement> information_object::arrangement No  
    encodinganalog= "3.3.4" No ISAD(G) 3.3.4 = System of arrangement.

  <controlaccess>   No Return access points.
    <corpname> actor::authorized_form_of_name Yes Get actor via event.
    <persname> actor::authorized_form_of_name Yes Get actor via event.
    <famname> actor::authorized_form_of_name Yes Get actor via event.
    <geogname> term::name Yes Get term via object_term_relation.
    <subject> term::name Yes Get term via object_term_relation.
    <genreform>   Yes Not currently supported.
    <occupation>   Yes Not currently supported.
    <function>   Yes Not currently supported.
    <title>   Yes Not currently supported.

  <accessrestrict> information_object::access_condition No  
    encodinganalog= "3.4.1" No ISAD(G) 3.4.1 = Conditions governing access.

  <accruals> information_object::accruals No  
    encodinganalog= "3.3.3" No ISAD(G) 3.3.3 = Accruals.

  <acqinfo> information_object::acquisition No  
    encodinganalog= "3.2.4" No ISAD(G) 3.2.4 = Immediate source of acquisition or transfer.

  <altformavail> information_object::location_of_copies No  
    encodinganalog= "3.5.2" No ISAD(G) 3.5.2 = Existence and location of copies.

  <appraisal> information_object::appraisal No  
    encodinganalog= "3.3.2" No ISAD(G) 3.3.2 = Appraisal, destruction and scheduling information.

  <custodhist> information_object::archival_history No  
    encodinganalog= "3.2.3" No ISAD(G) 3.2.3 = Archival history.

  <prefercite>     Qubit does not currently support this element.

  <processinfo> information_object::revision_history No  
    encodinganalog= "3.7.3"   ISAD(G) 3.7.3 = Date(s) of description.

  <userestrict> information_object::reproduction_conditions No  
    encodinganalog= "3.4.2" No ISAD(G) 3.4.2 = Conditions governing reproduction.

  <relatedmaterial> information_object::related_units_of_description No Neither ISAD(G) nor Qubit makes the distinction available in EAD between <relatedmaterial> (may be of use to the researcher, but not related by provenance, accumulation or use) and <separatedmaterial> (related by provenance but physically dispersed, e.g. to different repositories).
  • Need to choose between the two EAD elements, but this will sometimes result in poor EAD data.
  • E.g. Qubit's Related units of description field will be output as <relatedmaterial> but may in fact contain only data that is properly <separatedmaterial>.
    encodinganalog= "3.5.3" No ISAD(G) 3.5.3 = Related units of description.

  <separatedmaterial>   No Not currently supported by Qubit.
  • Qubit does not distinguish between "related" and "separated" material (all contained in single Related units of description field).

  <otherfindaid> information_object::finding_aids No  
    encodinganalog= "3.4.5" No ISAD(G) 3.4.5 = Finding aids.

  <bibliography> information_object::publication_note No  
    encodinganalog= "3.5.4" No ISAD(G) 3.5.4 = Publication note.

  <odd> note::content Yes  
    encodinganalog= "3.6.1" No ISAD(G) 3.6.1 = Notes.
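
The single-field rows in this table share one pattern: a Qubit field becomes one EAD element carrying an ISAD(G) encodinganalog. The Python sketch below expresses that pattern as a lookup table, using the tag names, field names and analog values listed above. The function name and the plain-dict record are assumptions for illustration. Wrapping each value in <p> is also an addition not stated in the table; it is included because EAD 2002 does not allow bare text directly inside these elements.

```python
import xml.etree.ElementTree as ET

# (EAD tag, information_object field, ISAD(G) encodinganalog), from the table.
ARCHDESC_FIELD_MAP = [
    ("scopecontent", "scope_and_content", "3.3.1"),
    ("arrangement", "arrangement", "3.3.4"),
    ("accessrestrict", "access_condition", "3.4.1"),
    ("accruals", "accruals", "3.3.3"),
    ("acqinfo", "acquisition", "3.2.4"),
    ("altformavail", "location_of_copies", "3.5.2"),
    ("appraisal", "appraisal", "3.3.2"),
    ("custodhist", "archival_history", "3.2.3"),
    ("processinfo", "revision_history", "3.7.3"),
    ("userestrict", "reproduction_conditions", "3.4.2"),
    ("relatedmaterial", "related_units_of_description", "3.5.3"),
    ("otherfindaid", "finding_aids", "3.4.5"),
    ("bibliography", "publication_note", "3.5.4"),
]


def append_mapped_elements(archdesc: ET.Element, record: dict) -> ET.Element:
    """Append one EAD element per non-empty mapped field of the record."""
    for tag, field, analog in ARCHDESC_FIELD_MAP:
        value = record.get(field)
        if not value:
            continue  # Assumption: empty fields produce no element.
        element = ET.SubElement(archdesc, tag, {"encodinganalog": analog})
        # Assumption: text is wrapped in <p>, as EAD 2002 requires block content.
        ET.SubElement(element, "p").text = value
    return archdesc


if __name__ == "__main__":
    archdesc = ET.Element("archdesc", {"level": "fonds"})
    append_mapped_elements(archdesc, {
        "scope_and_content": "Correspondence and diaries.",
        "access_condition": "Open to researchers.",
    })
    print(ET.tostring(archdesc, encoding="unicode"))
```

Repeatable elements such as <bioghist>, <controlaccess> and <odd> draw on related records rather than a single field, so they are left out of this lookup.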
Lower level tags

EAD element Output Repeatable? Notes

  <dsc>   No  
    type= "combined" No "combined" indicates that a given series is always followed immediately by listing of its lower-level contents (any sub-series, files, items).

    <c01>

    <c02>

    <c03> ...

[Return <c01>, <c02> etc according to level of description] No Use numbered components <c01> rather than unnumbered <c>.

The number should equal the number of parents above the description in the hierarchy.

      level= term::name No Get name of level of description from term table via information_object::level_of_description_id field.

Note that the relation between the component number and the level attribute is not constant; it depends on the number of levels in the hierarchy of description.

  • E.g. an item registered directly to a series = <c02 level="item">.
  • E.g. a file registered to a sub-series included in a sub-fonds within a fonds = <c04 level="file">.

      <daogrp>   No Get related digital objects in separate <daoloc></daoloc> tags.
  • Return all three digital objects (master, reference, thumbnail) or just one?

      <daoloc>   Yes  
        role= digital_object::mime_type No  
        label=   No Returns "master", "reference" or "thumbnail".
        href=   No  

        <daodesc>   No Not currently supported by Qubit.
  • Provides description of individual digital object (child of <daoloc>) or group (child of <daogrp>).
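
The numbered-component rule above (the number equals the count of parents above the description) can be sketched in Python as a small recursive walk. The nested-dict input and both function names are hypothetical. The upper bound of 12 reflects the fact that EAD 2002 defines numbered components only from <c01> to <c12>.

```python
import xml.etree.ElementTree as ET


def component_tag(parent_count: int) -> str:
    """Return the numbered component tag for a description with N parents."""
    if not 1 <= parent_count <= 12:
        raise ValueError("EAD 2002 defines only <c01> through <c12>")
    return f"c{parent_count:02d}"


def append_components(parent: ET.Element, children: list, depth: int = 1) -> None:
    """Recursively append child descriptions as <c01>, <c02>, ... elements."""
    for child in children:
        component = ET.SubElement(
            parent, component_tag(depth), {"level": child["level"]})
        # Lower levels follow their parent immediately (dsc type="combined").
        append_components(component, child.get("children", []), depth + 1)


if __name__ == "__main__":
    # A fonds (described in <archdesc>) with one series holding one item.
    hierarchy = [{"level": "series", "children": [{"level": "item"}]}]
    dsc = ET.Element("dsc", {"type": "combined"})
    append_components(dsc, hierarchy)
    print(ET.tostring(dsc, encoding="unicode"))
```

Here the item sits directly under a series, so it has two parents (the series and the fonds) and is emitted as <c02 level="item">, while the same level attribute could appear at a different component number in a deeper hierarchy.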