Abstract

D2.2 Methodology report for handling of data heterogeneity
Project	ENVRIplus
Deliverable nr	D2.2
Submission date	2017-05-02
Type	Report

The deliverable is related to the work carried out in Task T2.2, "Time-series heterogeneities: innovative user services" (Task leader: INGV[EMSO]; Participants: IFREMER[EURO-ARGO], UiT[ESONET-VI], UvA[LIFEWATCH], NERC[FixO3]), of WP2.

The deliverable, after setting a common vocabulary and terminology, analyses the most recurrent sources of heterogeneity affecting time-series in spite of the usual standardisation internal to Research Infrastructures, and focuses on some of them, namely data gaps and breaking points.

For a feasibility study on applying heterogeneity-detection methods, the time-series provided by Research Infrastructures have been classified into 'very-long time-series', lasting from one year to several years (typical of parameters related to global changes), and 'short time-series', lasting from a few seconds to some months (typical of parameters related to abrupt phenomena).

The methodologies used in the feasibility study on heterogeneity detection in time-series from Research Infrastructures have been borrowed from geophysics. The Probability Density Function of the Power Spectral Density of the time-series is computed for data gap detection, and the ratio between the Short-Time Average and the Long-Time Average is shown as an example for detecting breaking points (the start times of heterogeneities). A brief description of the methods is given together with basic references. The treatment of heterogeneities is considered strongly dependent on the features of the corresponding parameters, the site of measurement acquisition and the modeling scale, so a trans-disciplinary approach to the various ENVRIplus time-series deserves a deeper analysis.
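As an illustration of the breaking-point idea, the following is a minimal sketch of a Short-Time Average / Long-Time Average (STA/LTA) ratio computed with NumPy. It is not the implementation used in the deliverable: the function names `sta_lta_ratio` and `first_trigger`, the choice of trailing windows on the squared signal, the window lengths, the threshold and the synthetic test series are all assumptions made for this example.

```python
import numpy as np


def sta_lta_ratio(x, n_sta, n_lta):
    """STA/LTA ratio of the squared signal, using trailing windows.

    Hypothetical helper: n_sta and n_lta are window lengths in samples.
    The ratio is left at zero until a full long window is available.
    """
    x = np.asarray(x, dtype=float)
    if not 0 < n_sta < n_lta <= x.size:
        raise ValueError("need 0 < n_sta < n_lta <= len(x)")
    energy = x ** 2
    # Cumulative sum with a leading zero: window sums become differences.
    csum = np.concatenate(([0.0], np.cumsum(energy)))
    sta = (csum[n_sta:] - csum[:-n_sta]) / n_sta  # short-window means
    lta = (csum[n_lta:] - csum[:-n_lta]) / n_lta  # long-window means
    # Keep only the short windows that end on the same sample as a long one.
    sta = sta[n_lta - n_sta:]
    ratio = np.zeros(x.size)
    np.divide(sta, lta, out=ratio[n_lta - 1:], where=lta > 0)
    return ratio


def first_trigger(ratio, threshold):
    """Index of the first sample whose ratio exceeds the threshold, or None."""
    idx = np.flatnonzero(ratio > threshold)
    return int(idx[0]) if idx.size else None


# Synthetic series (assumption): a sinusoid whose amplitude jumps at sample 1200.
t = np.arange(2000)
x = np.sin(2 * np.pi * t / 10)
x[1200:] *= 10

ratio = sta_lta_ratio(x, n_sta=20, n_lta=200)
onset = first_trigger(ratio, threshold=4.0)
print("first trigger at sample", onset)
```

On this synthetic series the ratio stays close to 1 while the signal is stationary and rises within a few samples of the amplitude change, so the first trigger falls shortly after sample 1200. For real series the window lengths and the threshold would need tuning to the parameter and sampling rate.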

The promising results obtained across disciplines and domains support the proposal to implement services that help scientists and data managers select the most suitable data for their original elaborations. In shared virtual environments (i.e., cloud computing), the service can provide basic processing tools for time-series in different domains, based on the proposed methodologies for heterogeneity detection. The service can be a very helpful option, assisting data managers in the regular Quality Assessment/Quality Check procedures and supporting scientists in accepting, discarding or correcting data before the final data selection in view of original analytical elaborations.

Note: The full contents of this document have not yet been moved to the wiki. Please use the links to access the original document.

Document metadata

Document history
Original document	[1]
Title	D2.2 Methodology report for handling of data heterogeneity
Work package	WORK PACKAGE 2 – Metrology, quality and harmonization
Leading beneficiary	CNR
Authors	Laura Beranzoli (laura.beranzoli@ingv.it) (INGV)
	Mariagrazia De Caro (Mariagrazia.decaro@ingv.it) (INGV)
	Caterina Montuori (caterina.montuori@ingv.it) (INGV)
	Vito Vitale (v.vitale@isac.cnr.it) (CNR)
	Mauro Mazzola (m.mazzola@isac.cnr.it) (CNR)
	Boyan Petkov (b.petkov@isac.cnr.it) (CNR)
	Herve Petetin (Herve.Petetin@aero.obs-mip.fr) (OBSERVATOIRE MIDI-PYRÉNÉES)
	Justin Buck (juck@bodc.ac.uk) (NOCS)
	Cathrine Lund Myhre (Cathrine.Lund.Myhre@nilu.no) (NILU)
Accepted by	Jean-Daniel Paris (WP 2 leader)
Deliverable type	Report
Dissemination level	Public
Deliverable due date	2017-04-28/M24
Actual Date of Submission	2017-05-02/M25
Project internal reviewer	Ari Asmi (University of Helsinki)
Date	Version
15.04.2017	Draft for comments
25.04.2017	Corrected version
30.04.2017	Accepted by J.-D. Paris, V. Vitale, A. Asmi
