1.4. Analysing raw cluster output
The rough workflow is as follows:
Each cluster job produces many output files containing raw data.
This raw data is first collected into a single `DataFrame` object.
That object is then aggregated, which reduces its size substantially. We usually refer to the results as reduced data frames.
The `drrc` package contains a module, `drrc.analysis`, which has helper functions for these steps.
Refer to the `main` function in [this example](/Analysis/Visualise_ValidTime.py).
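
For orientation, the two-step workflow described in this section can be sketched with plain pandas. This is only an illustration and does not use the `drrc.analysis` helpers, whose interface is not described here. It assumes that each job writes CSV files, and the function names `load_raw_outputs` and `reduce_frame` as well as the column names `parameter` and `valid_time` are hypothetical.

```python
from pathlib import Path

import pandas as pd


def load_raw_outputs(output_dir, pattern="*.csv"):
    """Collect all raw output files in a directory into one DataFrame.

    Assumption: raw outputs are CSV files; adapt the reader to the real format.
    """
    frames = []
    for path in sorted(Path(output_dir).glob(pattern)):
        frame = pd.read_csv(path)
        frame["source_file"] = path.name  # remember which file each row came from
        frames.append(frame)
    if not frames:
        raise FileNotFoundError(f"no files matching {pattern!r} in {output_dir}")
    return pd.concat(frames, ignore_index=True)


def reduce_frame(raw, group_cols, value_col):
    """Aggregate the raw frame to one row per combination of group_cols."""
    return (
        raw.groupby(group_cols)[value_col]
        .agg(["mean", "std", "count"])  # summary statistics per group
        .reset_index()
    )
```

A call such as `reduce_frame(load_raw_outputs("output/"), ["parameter"], "valid_time")` would then yield a reduced data frame with one row per parameter value, where `"output/"` stands in for the actual job output directory.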