Editing IV Information Objects

 
A copy of (persistent) data so it may be used to restore the original after a data loss event.  
 
  
<span style="color: #5E6C84" id="mappingrule">'''mapping rule'''</span>
  
 
Configuration directives used for model-to-model transformation.  
 
 
|}
 
  
<span style="color: #5E6C84" id="persistentdata">'''persistent data'''</span>
  
 
Data are the representations of information dealt with by information systems and their users (as defined in ODP, ISO/IEC 10746-2). Persistent data denotes data that are persisted (stored for the long term).
 
 
<span style="color: #5E6C84" id="persistentdatastate">'''persistent data state'''</span>
 
 
A persistent data state is the property of an object that determines, at a given instant in time, the set of all sequences of actions (or traces) in which the object can participate (as defined in ODP, ISO/IEC 10746-2). The persistent data states, and the changes between them that result from actions, are illustrated in [[IV States|'''IV States''']].
 
 
In their lifecycle, persistent data may have the states described in the following table.
 
 
{| class="wikitable sortable"
 
|-
 
! style="padding: 8px" |'''<span style="color: #BBCE00">State</span>''' !! '''<span style="color: #BBCE00">Description</span>'''
 
|-
 
| raw || data derived from the primary results of observations or measurements
 
|-
 
| identified || data that have been assigned a unique identifier
 
|-
 
| annotated || data that are associated with concepts describing their meaning
 
|-
 
| qa assessed || data that have undergone checks and are associated with descriptions of the results of those checks
 
|-
 
| assigned metadata || data that are associated with metadata which describe those data
 
|-
 
| backed up || data of which an identical copy has been stored securely
 
|-
 
| finally reviewed || data that have undergone a final review and therefore will not be changed any more
 
|-
 
| mapped || data that are mapped to a certain conceptual model
 
|-
 
| published || data that are presented to the outside world
 
|-
 
| processed || data that have undergone processing (e.g. evaluation, transformation)
 
|}
 
 
<p style="background-color: #fcfcfc; border-radius: 5px; border: 1px solid #ccc;  padding: 10px 20px 10px 20px;">
 
'''Note'''<br>
 
The state 'raw' refers to data as received into the ICT elements of the research infrastructure. Some pre-processing may or may not have been carried out closer to where the measurements and observations were made.
 
</p>
 
 
These states are referential states. The chain of states actually instantiated over a data lifecycle can be expressed in data provenance.
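As a rough illustration of how such a chain could be recorded, the following Python sketch models the states from the table as an enumeration and keeps an ordered trace per data object. The names <code>DataState</code>, <code>ProvenanceTrace</code> and <code>sample-series</code> are hypothetical and are not part of the reference model; no ordering between states is enforced, since the table does not prescribe one.

```python
from dataclasses import dataclass, field
from enum import Enum


class DataState(Enum):
    """Reference states taken from the table of persistent data states."""
    RAW = "raw"
    IDENTIFIED = "identified"
    ANNOTATED = "annotated"
    QA_ASSESSED = "qa assessed"
    ASSIGNED_METADATA = "assigned metadata"
    BACKED_UP = "backed up"
    FINALLY_REVIEWED = "finally reviewed"
    MAPPED = "mapped"
    PUBLISHED = "published"
    PROCESSED = "processed"


@dataclass
class ProvenanceTrace:
    """Hypothetical record of the states one data object has passed through."""
    object_name: str
    states: list = field(default_factory=lambda: [DataState.RAW])

    def record(self, state: DataState) -> None:
        # Append the newly reached state; list order is the lifecycle chain.
        self.states.append(state)

    def chain(self) -> str:
        # Render the instantiated chain as a readable string.
        return " -> ".join(s.value for s in self.states)


trace = ProvenanceTrace("sample-series")  # illustrative object name
trace.record(DataState.IDENTIFIED)
trace.record(DataState.QA_ASSESSED)
print(trace.chain())
```

A real provenance record would typically also capture who performed each action and when; the sketch only keeps the order of states.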
 
 
<span style="color: #5E6C84" id="qanotation">'''qa notation'''</span>
 
 
Notation of the result of a Quality Assessment. This notation can range from a nominal value drawn from a classification system to a comprehensive (machine-readable) description of the whole QA process.
 
 
In practice, this can range from:
 
 
* simple flags like "valid" / "invalid"; to
 
* comprehensive descriptions like "data set to invalid by xxxxxx on ddmmyy because of yyyyyyy".
 
 
QA notation can be seen as a special annotation. To allow sharing with other users, the QA notation should be unambiguously described so as to be understood by others or interpretable by software tools.
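One possible machine-readable form, sketched in Python under the assumption that a QA notation is a small record with a mandatory flag and optional detail. The class <code>QANotation</code> and its field names are illustrative only; the placeholder values are the ones used in the example above.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class QANotation:
    """Hypothetical QA notation: a nominal flag plus optional detail."""
    flag: str                           # nominal value, e.g. "valid" / "invalid"
    assessed_by: Optional[str] = None   # who set the flag
    assessed_on: Optional[str] = None   # date of the assessment
    reason: Optional[str] = None        # why the flag was set

    def to_json(self) -> str:
        # Omit unset fields so that a bare flag serialises as a bare flag.
        filled = {k: v for k, v in asdict(self).items() if v is not None}
        return json.dumps(filled, sort_keys=True)


simple = QANotation(flag="valid")
detailed = QANotation(flag="invalid", assessed_by="xxxxxx",
                      assessed_on="ddmmyy", reason="yyyyyyy")
print(simple.to_json())
print(detailed.to_json())
```

Both ends of the range share one structure, so software tools can read the flag without needing to understand the optional detail.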
 
 
<span style="color: #5E6C84" id="servicedescription">'''service description'''</span>
 
 
A description of the services and processes available for reuse, needed to facilitate their use. A service description usually includes a reference to the service or process, making it available for reuse within a research infrastructure or within an open network such as the Internet. Such descriptions usually cover the accessibility of the service, its interfaces, and its behavior and/or implemented algorithms. They usually follow service description standards (e.g. WSDL, the Web Services Description Language). Some service description languages also allow semantic descriptions of the services and/or interfaces (e.g. SAWSDL, Semantic Annotations for WSDL).
 
 
<span style="color: #5E6C84" id="specificationofinvestigationdesign">'''specification of investigation design'''</span>
 
 
This is the background data needed to understand the overall goal of the measurement or observation. It could be the sampling design of observation stations, the network design, the description of the setup parameters (e.g. the interval of measurements) and so on. It usually contains data that determine which evaluations of research results are permissible (e.g. whether a sampling design was random or stratified determines which statistical methods can be applied).
 
 
Investigations (and hence measurement and observation results) need not be quantitative. They can also be qualitative results (like "healthy", "ill") or classifications (like assignments to biological taxa). It is important for data processing to know whether they are quantitative or qualitative.
 
 
The specification of investigation design can be seen as part of metadata or as part of the [[Appendix B Terminology and Glossary#SemanticAnnotation|'''Semantic Annotation''']]. It is important that this description follows certain standards and it is desirable that the description is machine readable.
 
 
<span style="color: #5E6C84" id="specificationofmeasurementsorobservations">'''specification of measurements or observations'''</span>
 
 
The description of the measurement/observation which specifies:
 
 
* what is measured/observed;
 
* how it is measured/observed (including processes/methods and instruments to be used);
 
* by whom it is measured/observed (including project, organisation and experimenter/observer profile); and
 
* what the temporal design is (single / multiple measurements / interval of measurement etc.).
 
 
<p style="background-color: #fcfcfc; border-radius: 5px; border: 1px solid #ccc;  padding: 10px 20px 10px 20px;">
 
'''Note'''<br>
 
This specification can be included as metadata or as [[Appendix B Terminology and Glossary#SemanticAnnotation|'''Semantic Annotation''']] of the scientific data to be collected. It is important that such a design specification is both explicit and correct, so as to be understood or interpreted by external users or software tools. Ideally, a machine readable specification is desired.
 
</p>
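A minimal Python sketch of what an explicit, machine-readable specification might look like, assuming the four facets listed above are stored under the hypothetical keys <code>what</code>, <code>how</code>, <code>who</code> and <code>temporal_design</code>. The example values are invented for illustration.

```python
# Hypothetical key names for the four facets of a specification.
REQUIRED_FACETS = ("what", "how", "who", "temporal_design")


def missing_facets(spec: dict) -> list:
    """Return the required facets that are absent or empty in spec."""
    return [name for name in REQUIRED_FACETS if not spec.get(name)]


# Illustrative specification; all values are made up.
spec = {
    "what": "air temperature",
    "how": {"method": "continuous logging", "instrument": "thermometer"},
    "who": {"project": "example project",
            "organisation": "example organisation",
            "observer": "example observer"},
    "temporal_design": {"mode": "multiple measurements",
                        "interval_minutes": 10},
}

print(missing_facets(spec))               # complete specification
print(missing_facets({"what": "x"}))      # incomplete specification
```

A check like this only verifies that the specification is explicit; whether it is correct still has to be judged against the actual investigation.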
 
 
<span style="color: #5E6C84" id="UID">'''unique identifier (UID)'''</span>
 
 
With reference to a given type of data objects, a unique identifier (UID) is any identifier which is guaranteed to be unique among all identifiers used for that type of object and for a specific purpose.
 
 
There are three main generation strategies:
 
 
* serial numbers, assigned incrementally;
 
* random numbers, selected from a number space much larger than the maximum (or expected) number of objects to be identified. Although not strictly unique, some identifiers of this type may be appropriate for identifying objects in many practical applications and are, by an abuse of language, still referred to as "unique";
 
* names or codes allocated by choice which are forced to be unique by keeping a central registry.
 
 
The above methods can be combined, hierarchically or singly, to create other generation schemes which guarantee uniqueness.
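The three strategies, and one hierarchical combination of them, can be sketched in Python as follows. The class and function names (<code>SerialGenerator</code>, <code>random_id</code>, <code>NameRegistry</code>, <code>hierarchical_id</code>) and the name <code>station-a</code> are hypothetical; the random strategy uses the standard library's <code>uuid.uuid4</code>, and the registry is a single in-memory set rather than a real central service.

```python
import itertools
import uuid


class SerialGenerator:
    """Serial numbers, assigned incrementally."""

    def __init__(self, start: int = 1):
        self._counter = itertools.count(start)

    def new_id(self) -> int:
        return next(self._counter)


def random_id() -> str:
    """Random identifier: a version 4 UUID, which carries 122 random bits."""
    return uuid.uuid4().hex


class NameRegistry:
    """Chosen names, forced to be unique by a central registry."""

    def __init__(self):
        self._taken = set()

    def register(self, name: str) -> str:
        if name in self._taken:
            raise ValueError(f"name already registered: {name}")
        self._taken.add(name)
        return name


def hierarchical_id(namespace: str, serial: int) -> str:
    """Combine a registered namespace with a serial number."""
    return f"{namespace}:{serial}"


registry = NameRegistry()
namespace = registry.register("station-a")   # illustrative name
serials = SerialGenerator()
first = hierarchical_id(namespace, serials.new_id())
second = hierarchical_id(namespace, serials.new_id())
print(first, second)
```

In the combined scheme, uniqueness follows from the registry guaranteeing unique namespaces and each namespace handing out its own serial numbers.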
 
 
In many cases, a single object may have more than one unique identifier, each of which identifies it for a different purpose. For example, a single object can be assigned the following identifiers:
 
 
* global: unique for a higher level community
 
* local: unique for the subcommunity
 
 
Critical issues for unique identifiers include, but are not limited to:
 
 
* long-term persistence: without efficient management tools, UIDs can be lost;
 
* resolvability: without efficient management tools, the linkage between a UID and its associated contents can be lost.
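To make the resolvability issue concrete, here is a small Python sketch of a resolver that maps global identifiers to the location of their contents and lets local identifiers act as aliases. The <code>Resolver</code> class, the identifiers and the location string are all invented for illustration; a real infrastructure would rely on a persistent, managed resolution service.

```python
class Resolver:
    """Hypothetical in-memory resolver: UID -> location of the content."""

    def __init__(self):
        self._locations = {}   # global id -> location
        self._aliases = {}     # local id -> global id

    def bind(self, global_id: str, location: str) -> None:
        # Record where the content identified by global_id can be found.
        self._locations[global_id] = location

    def alias(self, local_id: str, global_id: str) -> None:
        # Let a subcommunity's local id point at the global id.
        self._aliases[local_id] = global_id

    def resolve(self, identifier: str):
        # Translate a local id if one is known, then look up the location.
        global_id = self._aliases.get(identifier, identifier)
        # None signals that the linkage is missing or has been lost.
        return self._locations.get(global_id)


resolver = Resolver()
resolver.bind("community:42", "archive/objects/42")
resolver.alias("lab-7", "community:42")
print(resolver.resolve("lab-7"))
print(resolver.resolve("community:99"))
```

If the mapping held by the resolver is not maintained, an identifier may still exist but no longer lead to its content, which is the loss of linkage described above.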
 
  
 
[[Category:IV Components]]
 
