TC_2 EuroArgo Data Subscription Service


Latest revision as of 14:20, 26 August 2020

Background

Short description

The objective is to provide a data subscription service to scientific users. Each user provides data selection criteria. At regular intervals, a dataset matching those criteria is extracted from the Research Infrastructures cloud and delivered to the user's personal cloud account.

Contact

Background | Contact Person | Organization | Contact email
RI-ICT (Use Case proposer, Agile Group leader) | Thierry Carval | Ifremer, Euro-Argo | Thierry.Carval@ifremer.fr
RI-ICT | Robert Huber | University of Bremen, EMSO | rhuber@uni-bremen.de
RI-ICT | Benjamin Pfeil | University of Bergen, ICOS | benjamin.pfeil@gfi.uib.no
e-Infrastructure | Yin Chen | EGI | yin.chen@egi.eu
e-Infrastructure | Leonardo Candela | CNR | leonardo.candela@isti.cnr.it
e-Infrastructure | Gergely Sipos | EGI | gergely.sipos@egi.eu
e-Infrastructure | Daan Broeder | EUDAT | daan.broeder@meertens.knaw.nl
RI-ICT | Jerome Detoc | Ifremer, Euro-Argo | Jerome.Detoc@ifremer.fr
RI-ICT | Antoine Queric | Ifremer, Euro-Argo | Antoine.Queric@ifremer.fr
Task 5.3/7.2 leader | Zhiming Zhao, Paul Martin | University of Amsterdam (UvA) | z.zhao@uva.nl, P.W.Martin@uva.nl

Use case type

This use case is an implementation case.
The Euro-Argo, EMSO and ICOS Research Infrastructures will push a series of data and metadata to a so-called ENVRIPLUS cloud.
A data subscription service will be developed for scientific users.
Data provided for the cloud service:

  • All Argo observations, daily updated
  • Copernicus in situ observations
  • A selection of EMSO observatories data
  • ICOS-SOCAT carbon data observed from voluntary observation ships

Scientific domain and communities

Scientific domain

Atmosphere, hydrosphere, geosphere


Community

Data service provision, data usage.


Behavior

Detailed description

Objective and Impact

The objective is to provide scientists with a regular data flow from different Research Infrastructures.


Challenges

  • Provide data and metadata from complementary Research Infrastructures (ocean, atmosphere, space)
  • Aggregate and distribute billions of observations
  • Link between RI and E-infrastructures


Detailed scenario

The data subscription service to scientific users:

  • The user provides selection criteria
    • time, spatial extent, parameter, data type
    • update period for delivery (daily, monthly, yearly, on the spot)
  • The relevant data are extracted on the ENVRI cloud
  • Data may be converted/transformed on the ENVRI grid
  • The user's cloud account is updated regularly with the newly extracted data
  • An accounting of data delivery is performed (MDC ?)
    • A citation scheme is attached to the delivered data (DOI)
      • bibliographic surveys can track the use of these data in publications
      • reproducibility is possible
  • A user identification scheme is implemented (MarineID, OpenID, Shibboleth ?)
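
The selection criteria and delivery period described above could be modelled roughly as follows. This is only a minimal sketch, not the service's actual design: the class names `Observation` and `Subscription`, the function `next_delivery`, the bounding-box fields, and the reading of x, y, z as longitude, latitude and depth are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Observation:
    """One observation (hypothetical subset of the observation record)."""
    platform_code: str
    data_type: str
    x: float      # assumed: longitude in degrees
    y: float      # assumed: latitude in degrees
    z: float      # assumed: depth or height
    t: date
    parameter: str
    value: float


@dataclass(frozen=True)
class Subscription:
    """Hypothetical container for one user's selection criteria."""
    parameters: frozenset
    data_types: frozenset
    x_min: float
    x_max: float
    y_min: float
    y_max: float
    t_start: date
    t_end: date
    period: str   # "daily", "monthly", "yearly" or "on_the_spot"

    def matches(self, obs: Observation) -> bool:
        """True if the observation satisfies every criterion."""
        return (
            obs.parameter in self.parameters
            and obs.data_type in self.data_types
            and self.x_min <= obs.x <= self.x_max
            and self.y_min <= obs.y <= self.y_max
            and self.t_start <= obs.t <= self.t_end
        )


def next_delivery(period: str, last: date) -> date:
    """Date of the next delivery after `last` for a given update period."""
    if period == "daily":
        return last + timedelta(days=1)
    if period == "monthly":
        # First day of the following month (rolls over in December).
        return date(last.year + last.month // 12, last.month % 12 + 1, 1)
    if period == "yearly":
        return date(last.year + 1, 1, 1)
    if period == "on_the_spot":
        return last  # delivered immediately, no schedule
    raise ValueError(f"unknown update period: {period}")
```

Scheduling monthly and yearly deliveries on the first day of the next period is a design choice of this sketch; it sidesteps the question of what "one month after 31 January" should mean.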

The cloud content:

  • Euro-Argo and Copernicus datasets
    • 4 billion ocean observations
    • 300 parameters
    • 15 000 observing platforms
    • from 1900 to today.
  • A selection of EMSO observatories data
  • ICOS-SOCAT carbon data observed from voluntary observation ships

This "cloud" of observations is pushed to the ENVRIPLUS cloud (EGI, EUDAT …) and continuously updated there.
Copies are replicated at several sites close to users' locations (EU, US, Australia, Japan).

The cloud data model

  • Observation data model: a flat table of 4 billion records
    • Columns: ID, DOI, platformCode, dataType, x, y, z, t, parameter, value, dateUpdate
  • The observations table is hosted in a workplace such as
    • NoSQL Elasticsearch
    • The use of in-memory features would provide the best responsiveness (instant answers)
  • The workplace is activated in a virtual server, replicated on the cloud
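
If the flat observation table were indexed in Elasticsearch with one field per column, a subscription's criteria could be translated into a query body along these lines. This is a sketch under assumptions: the function `build_query` is hypothetical, the field names simply reuse the column names above, and x and y are assumed to be stored as plain numeric longitude and latitude fields. Only the standard `bool`, `terms` and `range` query clauses are used, and no client call is made.

```python
def build_query(parameters, x_range, y_range, t_range):
    """Build an Elasticsearch query body (a plain dict) for one selection.

    parameters: list of parameter codes to keep
    x_range, y_range: (min, max) tuples, assumed longitude and latitude
    t_range: (start, end) tuple of ISO 8601 date strings
    """
    return {
        "query": {
            "bool": {
                # Filter context: exact matching, no relevance scoring.
                "filter": [
                    {"terms": {"parameter": list(parameters)}},
                    {"range": {"x": {"gte": x_range[0], "lte": x_range[1]}}},
                    {"range": {"y": {"gte": y_range[0], "lte": y_range[1]}}},
                    {"range": {"t": {"gte": t_range[0], "lte": t_range[1]}}},
                ]
            }
        }
    }


# Illustrative values only; "TEMP" is a placeholder parameter code.
query = build_query(["TEMP"], (-10, 0), (40, 50), ("2016-01-01", "2016-12-31"))
```

Putting every clause in the filter context suits this use, since a subscription needs an exact subset of records rather than a ranked result list.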

Data services to be developed around Euro-Argo cloud

  • Ocean observations API: scoop
  • Metadata visualisation: map WMS services
  • Data visualisation: graphics
  • Data products such as mixed-layer depth maps

Agile and incremental implementation

  • Step 1 : Euro-Argo data file on the cloud (1 TB file, updated daily)
  • Step 2 : VM for indexation of data file (Elasticsearch)
  • Step 3 : data file generation service (scoop Java API)
  • Step 4 : data subscription/distribution service to OwnCloud accounts
  • Step 5 : replicate data and VM in mirror sites (EU, US, AU, JP)
  • Next steps : promote the development of various services around an ENVRIPLUS cloud
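
To make the data file generation and delivery steps more concrete, the sketch below writes selected records to a CSV file whose header uses the observation record's columns. It is not the scoop Java API, whose interface is not described here; the function name `write_subscription_file`, the CSV format and the sample record are assumptions made for illustration.

```python
import csv
import io

# Column names taken from the observation data model.
COLUMNS = ["ID", "DOI", "platformCode", "dataType",
           "x", "y", "z", "t", "parameter", "value", "dateUpdate"]


def write_subscription_file(records, out):
    """Write records (dicts keyed by COLUMNS) as CSV to a text stream.

    Returns the number of data rows written, which could feed the
    accounting of data delivery.
    """
    writer = csv.DictWriter(out, fieldnames=COLUMNS)
    writer.writeheader()
    count = 0
    for record in records:
        writer.writerow(record)
        count += 1
    return count


# Illustrative record; every value here is a placeholder.
sample = [{
    "ID": 1, "DOI": "10.0000/placeholder", "platformCode": "P1",
    "dataType": "profile", "x": -5.0, "y": 45.0, "z": 10.0,
    "t": "2016-06-01", "parameter": "TEMP", "value": 12.5,
    "dateUpdate": "2016-06-02",
}]
buffer = io.StringIO()
rows_written = write_subscription_file(sample, buffer)
```

Keeping the DOI as a column in every delivered row is one way to carry the citation scheme along with the data; a real service might attach it as file-level metadata instead.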


Technical status and requirements

This use case involves EGI and EUDAT.


Implementation plan and timetable

Timetable

  • Step 1: 2016 Q1 Euro-Argo RI, EGI, EUDAT
  • Step 2: 2016 Q2 Euro-Argo RI, EGI, EUDAT
  • Step 3: 2016 Q3 Euro-Argo RI, EGI, EUDAT, EMSO RI, ICOS-SOCAT RI
  • Step 4: 2016 Q4 Euro-Argo RI, EGI, EUDAT, EMSO RI, ICOS-SOCAT RI
  • Step 5: 2017 Euro-Argo RI, EGI, EUDAT, EMSO RI, ICOS-SOCAT RI

Allocation of resources

  • Euro-Argo RI: 10 man months, 20 000€ (2 months of subcontracting)
  • EGI
  • EUDAT
  • EMSO RI
  • ICOS-SOCAT RI


Expected output and evaluation of output

In 2017, the first data subscribers and the number of downloaded files will be counted.
The experiment will be considered a success if regular users receive the data files they need.

External Links

  1. TC_2 notebook: https://envriplus.manageprojects.com/projects/wp9-service-validation-and-deployment-1/notebooks/630/pages/331