Derived and Applied Metadata
The other day, I worked with a UX designer on a project to present assets and their metadata to users. It became clear in our discussion that the designer’s lack of experience with Digital Asset Management (DAM), and with metadata in general, meant that I would need to provide a clear definition of what metadata is. I also felt it was important to explain what are arguably the two most important top-level classifications of metadata: Derived and Applied Metadata.
While it may be evident to some, it is not always clear what we mean when we talk about the different kinds of metadata used in our daily lives. As a result, I thought writing a short post to clear up some of the confusion out there would be helpful. I expect it will be especially helpful for those just starting their DAM journey.
Derived and Applied Metadata are at the core of describing assets. Each complements the other, and together they make the digital assets that matter to your business easier to find once your DAM platform makes them available for search. At the heart of every DAM system is its ability to make assets searchable and present them to users. In general, the more business-relevant information available to describe your assets the better, although there are limits: you can have too much of a good thing. This information describing your assets is metadata. Some metadata is systematically extracted from assets, while other metadata is context-driven, informed and applied by users based on business needs. These are Derived and Applied Metadata, respectively.
Derived Metadata
Derived Metadata is descriptive information extracted directly from an asset. Because it is extracted as assets are imported into a DAM, it provides users with an immediately searchable baseline of asset properties. It makes it possible for users to find assets before a user or asset librarian has enriched their metadata with information relevant to why they were created. Derived metadata comes from numerous places:
- File Information
- Embedded Data
  - IPTC
  - XMP
- AI
  - Facial Detection and Recognition
  - Optical Character Recognition
  - Speech to Text
File Information
File Information is the most basic information available for every file created on a computer. This information is directly related to the file. File information includes details like the name, size, type, and creation and modification dates. Additionally, attributes specific to each format provide further information such as dimensions, number of pages, and color space details, among others.
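To make this concrete, here is a minimal sketch of how file information might be gathered on import using only the Python standard library. The function name `file_information` and the dictionary keys are illustrative assumptions, not part of any particular DAM platform, and the type is only guessed from the file extension.

```python
import mimetypes
import os
from datetime import datetime, timezone


def _to_iso(timestamp):
    """Convert a POSIX timestamp to an ISO 8601 string in UTC."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat()


def file_information(path):
    """Collect basic, filesystem-level properties for one file (hypothetical helper)."""
    stat = os.stat(path)
    mime_type, _ = mimetypes.guess_type(path)  # guessed from the extension only
    # A true creation time is not available on every platform, so it may be None.
    birth = getattr(stat, "st_birthtime", None)
    return {
        "name": os.path.basename(path),
        "size_bytes": stat.st_size,
        "type": mime_type or "application/octet-stream",
        "created": _to_iso(birth) if birth is not None else None,
        "modified": _to_iso(stat.st_mtime),
    }
```

Calling `file_information` on an imported file returns a small dictionary that could be indexed straight away. Format-specific attributes such as dimensions or page count would need a parser for each format and are left out of this sketch.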
Embedded Data
Embedded metadata is information stored in a known location within the file. Although it is typically found in the header or footer of a file (the beginning or end), embedded metadata can be stored anywhere in the file; each file format is different. Some file formats store information in only one location, while others duplicate data in multiple places to provide a level of redundancy that reduces the risk of embedded metadata being lost to corruption.
IPTC and XMP are two of the most commonly used standards for embedded metadata in file formats today. These standards facilitate the sharing and use of metadata, allowing a richer and more complete descriptive record to travel with the file wherever it goes. While embedded metadata can be considered applied metadata, it is more derived than applied because it is not typically applied by your organization in the first place. GPS (Global Positioning System) data inserted into photos and videos taken with smartphones, modern cameras, and camcorders is a good example of embedded metadata; it makes it easier to search for imagery and video of specific places. Another example is the keywording, photographer, and photoshoot details that stock houses include in their files.
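As an illustration of how embedded metadata can be derived from a file, the sketch below scans raw bytes for an XMP packet and reads keywords from its `dc:subject` field. It relies on the fact that XMP is serialized as XML, conventionally wrapped in an `x:xmpmeta` element. The function names and the sample bytes are made up for this example. It also assumes a UTF-8 packet, and a production system would more likely rely on a dedicated metadata library than on byte scanning.

```python
import re
import xml.etree.ElementTree as ET

# XMP is serialized as XML; the wrapper element is conventionally <x:xmpmeta>.
XMP_PATTERN = re.compile(rb"<x:xmpmeta\b.*?</x:xmpmeta>", re.DOTALL)

NAMESPACES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def find_xmp_packet(data):
    """Return the first XMP packet found in raw file bytes as text, or None."""
    match = XMP_PATTERN.search(data)
    if match is None:
        return None
    return match.group(0).decode("utf-8", errors="replace")  # assumes UTF-8


def xmp_keywords(packet):
    """Read the keyword list stored in dc:subject, if any."""
    root = ET.fromstring(packet)
    items = root.findall(".//dc:subject/rdf:Bag/rdf:li", NAMESPACES)
    return [item.text for item in items if item.text]


# Made-up sample: a keyword packet surrounded by unrelated binary data.
sample = (
    b"\xff\xd8 binary image data "
    b'<x:xmpmeta xmlns:x="adobe:ns:meta/">'
    b'<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">'
    b'<rdf:Description rdf:about="" xmlns:dc="http://purl.org/dc/elements/1.1/">'
    b"<dc:subject><rdf:Bag><rdf:li>beach</rdf:li><rdf:li>sunset</rdf:li></rdf:Bag></dc:subject>"
    b"</rdf:Description></rdf:RDF></x:xmpmeta>"
    b" more binary data"
)

print(xmp_keywords(find_xmp_packet(sample)))  # ['beach', 'sunset']
```

The keywords found this way were written by whoever produced the file, such as a stock house, which is why they arrive with the asset rather than being applied by your team.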
Artificial Intelligence (AI)
AI is becoming ever more accessible and available. It is used in many applications that are part of our daily lives. Through AI, platforms can derive detailed information about the assets they manage. They can analyze them and tag them with ever-improving accuracy. AI toolsets can perform facial recognition, identify text in images using OCR (optical character recognition), and transcribe speech in audio and video files without the need for human interaction. These automated processes must be trained to improve their accuracy. As AI becomes ever more accurate, it’s possible to have better-described and more searchable assets without human intervention.
Applied Metadata
Applied metadata is information that has been applied to an asset to help describe what it is and what it represents in the context of the organization. Context is fast becoming king. As a result, having searchable metadata applied to your assets, indexed by your DAM so that users and systems can quickly identify the “right assets”, has become the holy grail. The ability to quickly find and reuse content is imperative to marketing organizations and has become increasingly important in recent years. Organizations today want to measure the success of assets and optimize processes to ensure the right asset is presented, creating a personalized experience.
Applied metadata is information that is neither specific to a file nor inserted into the file by the tool that creates it. It often describes the mood depicted, telling the asset’s story in the context of its creation or use within your organization. Applied metadata allows for categorization and association with various parts of your business. It allows you to apply product and other related information to assets, increasing findability for users. Applied metadata is any supplementary information describing an asset stored in your DAM. It can range from simple categorization like Asset Type, to Usage and Rights information, to product and brand details. Applied metadata can be, and often is, sourced from other enterprise systems, including Product Information Management (PIM) and Title Management systems.
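One way to picture how the two kinds of metadata sit side by side is a record that keeps them in separate fields but searches across both. The sketch below is a hypothetical data model, not the schema of any real DAM; the `Asset` class, its field names, and the sample values are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    asset_id: str
    derived: dict = field(default_factory=dict)  # extracted on import
    applied: dict = field(default_factory=dict)  # added by users or other systems

    def search_text(self):
        """Flatten both kinds of metadata into one lowercase string."""
        values = list(self.derived.values()) + list(self.applied.values())
        return " ".join(str(value) for value in values).lower()


def search(assets, term):
    """Return the IDs of assets whose metadata contains the term."""
    term = term.lower()
    return [asset.asset_id for asset in assets if term in asset.search_text()]


assets = [
    Asset(
        "a1",
        derived={"name": "IMG_0042.jpg", "type": "image/jpeg"},
        applied={"asset_type": "Lifestyle photo", "brand": "Acme"},
    ),
    # Imported but not yet enriched: only derived metadata is available.
    Asset("a2", derived={"name": "IMG_0043.jpg", "type": "image/jpeg"}),
]

print(search(assets, "jpeg"))  # ['a1', 'a2']
print(search(assets, "acme"))  # ['a1']
```

Both assets can be found by their derived properties as soon as they are imported, but only the enriched one turns up in a search for the brand, which is the business context that applied metadata adds.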
Conclusion
Many ways exist to categorize the metadata used to describe assets. Derived and Applied Metadata are two major categorizations that help us understand what can be automatically generated and extracted versus what is applied by the team. The more metadata that can be automatically applied, the better. Reducing what is required of users who import or curate assets as part of their duties frees them to work on higher-value tasks.
If you and your team need to learn more about metadata or DAM, contact us. We’re here to help.