Data, Data, Data
Large amounts of data
- 800-1000 input files; 200 different formats
- Gigabytes of input and output per analysis
- Hundreds of “active” analyses
Data management issues
- Received from many different agencies
- Regular and irregular updates
- Meandering formats
- Duplication between analyses:
  - Computations
  - Intermediate & output files