The flow is to extract the data, prepare it, attach metadata, clean it, apply descriptive and predictive algorithms, and present the results visually. The data volume is on the order of petabytes (PB), so loading, storing, processing, and visualizing it is not a trivial task.
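
The stages above can be sketched as a chain of Python generators, which is one way to avoid holding a whole dataset in memory. This is only an illustration at toy scale, not a description of any particular system: every function name, the `source` tag, and the sample rows are assumptions made for the example, and the prediction stage is left out for brevity.

```python
from typing import Iterable, Iterator


def extract(rows: Iterable[str]) -> Iterator[str]:
    # Hypothetical source: in practice this would read from files or a database.
    yield from rows


def prepare(records: Iterable[str]) -> Iterator[dict]:
    # Wrap each raw string in a record and trim surrounding whitespace.
    for raw in records:
        yield {"value": raw.strip()}


def add_metadata(records: Iterable[dict], source: str) -> Iterator[dict]:
    # Tag each record with where it came from (assumed metadata field).
    for record in records:
        yield {**record, "source": source}


def clean(records: Iterable[dict]) -> Iterator[dict]:
    # Drop records whose value cannot be parsed as a number.
    for record in records:
        try:
            yield {**record, "value": float(record["value"])}
        except ValueError:
            continue


def describe(records: Iterable[dict]) -> dict:
    # Descriptive step: running count and mean, computed in a single pass.
    count, total = 0, 0.0
    for record in records:
        count += 1
        total += record["value"]
    return {"count": count, "mean": total / count if count else float("nan")}


def present(summary: dict) -> str:
    # Presentation step: a plain-text line stands in for a real visualization.
    return f"count={summary['count']} mean={summary['mean']:.2f}"


def run_pipeline(rows: Iterable[str], source: str = "demo") -> str:
    # Each stage pulls records lazily from the previous one.
    staged = clean(add_metadata(prepare(extract(rows)), source))
    return present(describe(staged))


if __name__ == "__main__":
    print(run_pipeline([" 1.0 ", "abc", "3.0"]))
```

Because each stage consumes the previous one lazily, only one record is in flight at a time. At petabyte scale the same shape would normally be spread across a distributed storage and processing layer rather than run in a single process.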