Difference between revisions of "IV Lifecycle in Detail"

Revision as of 15:30, 27 March 2020

This section expands on the IV Lifecycle Overview, which aligns the information viewpoint with the data lifecycle. The descriptions make greater use of the IV Information Objects and IV Information Action Types, providing deeper insight into how the RI processes information objects.

The notation for the diagrams in this section is as follows. The rounded rectangles represent IV actions on data and the straight rectangles represent instances of information objects at different stages. The arrow lines link actions and objects as follows: arrows leaving an action connect to IV objects created by the action while arrows entering an action connect IV objects to actions using them.
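The notation above can be read as a directed graph with two kinds of nodes. The following is a minimal Python sketch of that reading, not a transcription of the actual diagrams: the `Diagram` class, its method names, and the `MeasurementResult` object name are assumptions made for illustration, and the wiring between actions and objects is only an example.

```python
from dataclasses import dataclass, field


@dataclass
class Diagram:
    """Actions (rounded rectangles) linked to information objects (rectangles)."""
    uses: dict = field(default_factory=dict)     # action -> objects on arrows entering it
    creates: dict = field(default_factory=dict)  # action -> objects on arrows leaving it

    def add_action(self, action, uses=(), creates=()):
        self.uses[action] = list(uses)
        self.creates[action] = list(creates)

    def producers_of(self, obj):
        """Actions with an arrow leaving them towards obj."""
        return [a for a, objs in self.creates.items() if obj in objs]

    def consumers_of(self, obj):
        """Actions with an arrow entering them from obj."""
        return [a for a, objs in self.uses.items() if obj in objs]


# Illustrative wiring only (hypothetical, not copied from the figures).
diagram = Diagram()
diagram.add_action(
    "Perform measurement or observation",
    uses=["SpecificationOfMeasurementsOrObservations"],
    creates=["MeasurementResult"],  # assumed object name
)
diagram.add_action(
    "Store data",
    uses=["MeasurementResult"],
    creates=["PersistentData"],
)
```

With this representation, `diagram.producers_of("MeasurementResult")` lists the actions that create an object and `diagram.consumers_of("MeasurementResult")` lists the actions that use it.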

Data Acquisition, Data Curation, Data Publishing, Data Processing, Data Use

Data Acquisition

The data acquisition phase encompasses the actions defined for the observation/experimentation, storage, identification and backup of measurements/observations (raw data).

The following paragraphs explain the detailed diagram of how the IV actions can be combined to support data acquisition.

Note
This example is provided for illustrative purposes. The example shows one of many alternatives for performing data acquisition. Other IV actions and IV objects can be introduced at this stage. Additional actions and objects not described in the IV of the ENVRI RM can also be incorporated.

Specify investigation design: Before a measurement or observation can be started, the design (or setup) must be defined, including the working hypothesis and scientific question, the method for selecting sites (stratified / random), the necessary precision of the observation or measurement, boundary conditions, etc. To use the resulting data correctly, details about their processing and the parameters defined have to be available (e.g. if sites are selected by stratification according to parameter A, the resulting value of parameter A cannot be evaluated in the same way as other results).

Specify measurement or observation: After the overall design of measurements or observations is defined, the measurement method that complies with the design has to be specified, including the devices to be used, the standards / protocols to be followed, and other details. The details of the process and the parameters used have to be preserved to guarantee correct interpretation of the resulting data (e.g. when modelling the dependency of parameter B on a wind velocity measured in parallel, the limit of detection of the anemometer used influences the range of values about which assertions can be made).

Perform measurement or observation: After the measurement or observation method is defined, the experiment can be performed, producing measurement results, which are a form of persistent data in a raw state.

Store data: The measurement result data is stored. This action can be very simple when a measurement device periodically sends the data to the data management system, but it can also be a sophisticated harvesting process or, e.g. in the case of biodiversity observations, a process carried out by humans. The storage process is the first step in the data lifecycle that makes data accessible in digital form.

Data curation: Once data is stored, the next phase of the data lifecycle is data curation.
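The acquisition steps above can be sketched as a small Python pipeline in which each specification is carried along with the data it explains. This is a sketch under stated assumptions: the class names, field names, the `read_device` callback, and the sample values are hypothetical and are not defined by the ENVRI RM.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InvestigationDesign:
    question: str
    site_selection: str        # e.g. "stratified" or "random"
    required_precision: float


@dataclass(frozen=True)
class MeasurementSpec:
    design: InvestigationDesign
    device: str
    protocol: str              # hypothetical protocol label
    detection_limit: float


@dataclass(frozen=True)
class PersistentData:
    values: tuple
    spec: MeasurementSpec      # preserved so the data can be interpreted later
    state: str = "raw"


def perform_measurement(read_device, n=3):
    """Collect n readings from a device callback (assumed interface)."""
    return [read_device() for _ in range(n)]


def store_data(values, spec, store):
    """Persist the raw readings together with their specification."""
    record = PersistentData(values=tuple(values), spec=spec)
    store.append(record)
    return record


design = InvestigationDesign("Does B depend on wind velocity?", "stratified", 0.1)
spec = MeasurementSpec(design, device="anemometer",
                       protocol="example-protocol", detection_limit=0.5)
readings = iter([3.2, 4.1, 0.7])   # stand-in for a real device
store = []
record = store_data(perform_measurement(lambda: next(readings)), spec, store)
```

Keeping `spec` (and, through it, `design`) attached to the stored record reflects the requirement that the parameters of the design and the measurement method remain available when the data is interpreted.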

[Figure: Data Acquisition (IVEvolutionAcquisition03.png); see Notation]

Data Curation

The data curation phase encompasses the actions that support the long-term preservation and use of research data. The main product of this set of actions is persistent data in a stable state (annotated data). The following paragraphs explain the detailed diagram of how the IV actions can be combined to support data curation.

Note
This example is provided for illustrative purposes. The example shows one of many alternatives for performing data curation. Other IV actions and IV objects can be introduced at this stage, for instance: Check quality, Register metadata, or Publish metadata. Actions and objects not described in the IV of the ENVRI RM can also be incorporated.

Data Acquisition: The first action is Data Acquisition, the phase of the data lifecycle that precedes data curation. This action produces three IV Objects: PersistentData, SpecificationOfMeasurementsOrObservations and SpecificationOfInvestigationDesign.

Carry out backup: As soon as data are available to the RI a backup can be made, independently of the state of the persisted data. This can be done locally or remotely, by the data owners or by dedicated data archiving centres.

Assign Unique Identifier: Data needs to be uniquely identified for correct retrieval and processing. The unique identifier can be local to the RI, or global so that it can be used from outside the RI. As such, it can be a simple numerical value assigned by the RI DBMS or a specific PID assigned following the standards of an external PID provider.

Add metadata: This action uses the specifications of investigation and measurements to facilitate the understanding of the associated persistent data object. In addition to these specifications, the RI can add timestamps and other identification data as metadata. Once the data is correctly stored and identified, and the corresponding metadata has also been created, persistent data can be linked to metadata.

Annotate data: Data is further enriched with additional metadata which can correspond to a specific ontology for the research field.

Annotate metadata: Metadata can also be further enriched with additional metadata which can correspond to a specific ontology for the research field.

Build conceptual model: The building of a local conceptual model mirrors the wider research community's efforts to build a global conceptual model. In this set of activities, concepts are added to the local conceptual model of the RI. The conceptual model is composed of concepts, which are used to help people know, understand, or simulate the subject the model represents. The pairing of data and metadata using semantic annotations creates a local concept (a new metadata object) and changes the state of the persistent data object to annotated.

Global conceptual models are ontologies, thesauri, dictionaries, or hierarchies built by a larger community than a single RI, such as GEMET, DOLCE, SWEET. This action normally happens outside of the RI's main activities. Through feedback mechanisms, RIs participate in the creation of global conceptual models while developing their own models.

Data Publishing: Once data have been curated, the next phase of the data lifecycle is data publishing.
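The curation actions above can be sketched in Python as operations on a persistent data object. This is a minimal illustration, assuming hypothetical class and function names. A UUID-based URN stands in for a global identifier here; a real external PID provider would have its own assignment procedure. The sketch also collapses annotation and the state change performed while building the conceptual model into a single `annotate` function, and the concept URI is a placeholder.

```python
import copy
import itertools
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Metadata:
    fields: dict
    annotations: list = field(default_factory=list)


@dataclass
class PersistentData:
    values: list
    state: str = "raw"
    identifier: Optional[str] = None
    metadata: Optional[Metadata] = None


_local_ids = itertools.count(1)   # stand-in for a counter kept by the RI DBMS


def carry_out_backup(data, archive):
    """Copy the data as it is now, regardless of its state."""
    archive.append(copy.deepcopy(data))


def assign_unique_identifier(data, scope="local"):
    """Local: simple numeric value. Global: UUID URN (assumed stand-in for a PID)."""
    if scope == "local":
        data.identifier = str(next(_local_ids))
    else:
        data.identifier = f"urn:uuid:{uuid.uuid4()}"
    return data.identifier


def add_metadata(data, design_spec, measurement_spec, timestamp):
    """Link the data to metadata derived from the two specifications."""
    data.metadata = Metadata(fields={
        "design": design_spec,
        "measurement": measurement_spec,
        "timestamp": timestamp,
    })


def annotate(data, concept_uri):
    """Attach a semantic annotation and mark the data as annotated."""
    if data.metadata is None:
        raise ValueError("add metadata before annotating")
    data.metadata.annotations.append(concept_uri)
    data.state = "annotated"


data = PersistentData(values=[3.2, 4.1])
archive = []
carry_out_backup(data, archive)
local_id = assign_unique_identifier(data)
add_metadata(data, "stratified site selection", "anemometer readings",
             "2020-03-27T15:30:00Z")
annotate(data, "https://example.org/concepts/wind-speed")  # placeholder URI
```

Because the backup is a deep copy taken before the later steps, the archived object stays in the raw state while the working object moves to annotated.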

[Figure: Data Curation (IVEvolutionCuration05.png); see Notation]

@@ Line 11: / Line 11: @@
 {| style="width: 95%;"
 |-
-| style="width: 60%; padding: 10px 10px 60px 10px"|
+| style="width: 60%; padding: 10px 10px 80px 10px"|
 === <span style="color: #BBCE00">Data Acquisition</span> ===
@@ Line 38: / Line 38: @@
 |-
 | style="background-color:#ffffff;"| [[File:IVEvolutionAcquisition03.png|300px]]
+<div style='text-align: right;'>'''Notation'''</div>
+|}
+|}
+{| style="width: 95%;"
+|-
+| style="width: 50%; padding: 10px 10px 10px 10px"|
+=== <span style="color: #BBCE00">Data Curation</span> ===
+The data curation phase encompasses the actions that support the long-term preservation and use of research data. The main product of this set of actions is persistent data in a stable state (annotated data). The following paragraphs explain the detailed diagram of how the IV actions can be combined to support data curation.
+<p style="background-color: #fcfcfc; border-radius: 5px; border: 1px solid #ccc;  padding: 10px 20px 10px 20px;">
+'''Note'''<br>
+This example is provided for illustrative purposes. The example shows one of many alternatives for performing data curation. Other IV actions and IV objects can be introduced at this stage, for instance: Check quality, Register metadata, or Publish metadata. Actions and objects not described in the IV of the ENVRI RM can also be incorporated.
+</p>
+'''Data Acquisition''': The first action is Data Acquisition, the phase of the data lifecycle that precedes data curation. This action produces three IV Objects: PersistentData, SpecificationOfMeasurementsOrObservations and SpecificationOfInvestigationDesign.
+'''Carry out backup''': As soon as data are available to the RI a backup can be made, independently of the state of the persisted data. This can be done locally or remotely, by the data owners or by dedicated data archiving centres.
+'''Assign Unique Identifier''': Data needs to be uniquely identified for correct retrieval and processing. The unique identifier can be local to the RI, or global so that it can be used from outside the RI. As such, it can be a simple numerical value assigned by the RI DBMS or a specific PID assigned following the standards of an external PID provider.
+'''Add metadata''': This action uses the specifications of investigation and measurements to facilitate the understanding of the associated persistent data object. In addition to these specifications, the RI can add timestamps and other identification data as metadata. Once the data is correctly stored and identified, and the corresponding metadata has also been created, persistent data can be linked to metadata.
+'''Annotate data''': Data is further enriched with additional metadata which can correspond to a specific ontology for the research field.
+'''Annotate metadata''': Metadata can also be further enriched with additional metadata which can correspond to a specific ontology for the research field.
+'''Build conceptual model''': The building of a '''local conceptual model''' mirrors the wider research community's efforts to build a global conceptual model. In this set of activities, concepts are added to the local conceptual model of the RI. The conceptual model is composed of concepts, which are used to help people know, understand, or simulate the subject the model represents. The pairing of data and metadata using semantic annotations creates a local concept (a new metadata object) and changes the state of the persistent data object to annotated.
+'''Global conceptual models''' are ontologies, thesauri, dictionaries, or hierarchies built by a larger community than a single RI, such as GEMET, DOLCE, SWEET. This action normally happens outside of the RI's main activities. Through feedback mechanisms, RIs participate in the creation of global conceptual models while developing their own models.
+'''Data Publishing''': Once data have been curated, the next phase of the data lifecycle is data publishing.
+| style="width: 50%; padding: 10px 10px 420px 25px"|
+{| class="wikitable"
+|-
+! style="padding: 10px"| <div style='text-align: left;'>'''Data Curation'''</div>
+|-
+| style="background-color:#ffffff;"| [[File:IVEvolutionCuration05.png|550px]]
 <div style='text-align: right;'>'''Notation'''</div>
 |}
