Change request #1339
Reduce memory usage to ctools
Status: | Closed | Start date: | 10/23/2014 |
---|---|---|---|
Priority: | Normal | Due date: | |
Assigned To: | Knödlseder Jürgen | % Done: | 100% |
Category: | - | | |
Target version: | 00-08-00 | | |
Duration: | | | |
Description
ctobssim can use a large amount of memory when multiple runs are simulated at once. This is because all events are simulated first, and the results are saved only after all simulations are done. One can circumvent this by adding a save step directly after the simulation step when the execute() method is called. The only remaining problem is that ctobssim may process the runs in parallel (due to OpenMP support), which still leads to substantial memory requirements. This can be controlled by setting the OMP_NUM_THREADS environment variable, for example
export OMP_NUM_THREADS=2
to limit the maximum number of threads to 2.
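The idea of saving each run directly after it has been simulated can be sketched in plain Python. This is only an illustration of the pattern, not the ctobssim code: simulate_run(), save_events() and simulate_and_save() are hypothetical helpers, and the event content is random placeholder data.

```python
import os
import random
import tempfile


def simulate_run(run_id, n_events=1000):
    """Hypothetical stand-in for simulating one run; returns a list of events."""
    rng = random.Random(run_id)
    return [(rng.random(), rng.random()) for _ in range(n_events)]


def save_events(events, path):
    """Write one run to disk as plain text (placeholder for an event file)."""
    with open(path, "w") as f:
        for x, y in events:
            f.write(f"{x} {y}\n")


def simulate_and_save(run_ids, outdir):
    """Simulate runs one by one, saving each before the next one starts."""
    paths = []
    for run_id in run_ids:
        events = simulate_run(run_id)
        path = os.path.join(outdir, f"events_{run_id:03d}.txt")
        save_events(events, path)
        paths.append(path)
        del events  # drop the run so only one run is held in memory at a time
    return paths


outdir = tempfile.mkdtemp()
paths = simulate_and_save(range(4), outdir)
print(len(paths), "runs written to", outdir)
```

With this ordering the peak memory scales with the size of a single run rather than with the number of runs. If runs are processed in parallel, the peak scales with the number of threads instead, which is why limiting OMP_NUM_THREADS helps.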
Recurrence
No recurrence.
History
#1 Updated by Knödlseder Jürgen about 10 years ago
- Subject changed from Reduce ctobssim memory usage to Reduce memory usage to ctools
- Status changed from New to In Progress
- Assigned To set to Knödlseder Jürgen
- Target version set to 00-08-00
#2 Updated by Knödlseder Jürgen about 10 years ago
Here is some information about memory usage.
The size of a GCTAEventAtom object is 264 Bytes, which is about 2.3 times larger than the event written to disk. The difference mainly comes from the GCTAInstDir member, which takes 112 Bytes due to pre-computed information.
The IFAE20120510_50h response predicts about 60000 events for 30 min, which corresponds to 16 MB in RAM and 7 MB on disk. The IFAE20120510_50h response file takes 210 kB on disk. The binning of the response file is still relatively coarse, and it is conceivable that the final response file will be a factor of 10 larger on disk, i.e. 2 MB. As a rule of thumb, one can thus assume that a 30 min run will take 10 MB on disk and 20 MB in RAM. The following table summarizes the expected memory needs:
Duration | Disk | RAM |
---|---|---|
30 min | 10 MB | 20 MB |
50 hr | 1 GB | 2 GB |
200 hr | 4 GB | 8 GB |
1 year | 28 GB | 56 GB |
30 years | 840 GB | 1.7 TB |
(1400 hr of observing time have been assumed per year).
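The table follows directly from the rule of thumb. A short Python sketch that reproduces it, assuming decimal units (1 GB = 1000 MB); the function names memory_needs() and fmt() are made up for this example:

```python
# Rule of thumb from the text: one 30 min run needs ~10 MB on disk, ~20 MB in RAM.
DISK_MB_PER_RUN = 10
RAM_MB_PER_RUN = 20
RUN_HOURS = 0.5
HOURS_PER_YEAR = 1400  # assumed observing time per year


def memory_needs(hours):
    """Return (disk, RAM) in MB for a given observing time in hours."""
    n_runs = hours / RUN_HOURS
    return n_runs * DISK_MB_PER_RUN, n_runs * RAM_MB_PER_RUN


def fmt(mb):
    """Format a size in MB using decimal units."""
    if mb >= 1e6:
        return f"{mb / 1e6:.1f} TB"
    if mb >= 1e3:
        return f"{mb / 1e3:.0f} GB"
    return f"{mb:.0f} MB"


durations = [
    ("30 min", 0.5),
    ("50 hr", 50),
    ("200 hr", 200),
    ("1 year", HOURS_PER_YEAR),
    ("30 years", 30 * HOURS_PER_YEAR),
]
for label, hours in durations:
    disk, ram = memory_needs(hours)
    print(f"{label} | {fmt(disk)} | {fmt(ram)} |")
```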
Since today's laptops can accommodate a few GB of RAM, analyses of 50 hr can be done in memory with the existing code. Memory usage can be reduced by cutting the size of a GCTAEventAtom object down to the information that is needed for analysis. The minimal information is Right Ascension, Declination, energy and time, which would occupy 32 Bytes for double precision values. This corresponds to a reduction by a factor of about 8 (264 / 32).
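The 32 Bytes can be checked with Python's struct module. The layout below (four packed doubles) is an assumption made for illustration, not an actual GammaLib data structure, and the packed values are arbitrary:

```python
import struct

# Assumed minimal layout: Right Ascension, Declination, energy, time,
# each stored as a double-precision value without padding.
MINIMAL_EVENT = struct.Struct("<4d")
FULL_EVENT_BYTES = 264  # size of a GCTAEventAtom object quoted above

packed = MINIMAL_EVENT.pack(83.63, 22.01, 1.0, 0.0)  # arbitrary example values
reduction = FULL_EVENT_BYTES / MINIMAL_EVENT.size

print("minimal event size:", MINIMAL_EVENT.size, "Bytes")
print("reduction factor:", reduction)
```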
To go beyond the RAM limitations, data may be stored on disk when not in use. This of course comes at the expense of computation speed.
#3 Updated by Knödlseder Jürgen about 10 years ago
- % Done changed from 0 to 20
ctobssim now uses less memory when the execute() method is called. In that case, events are written immediately to disk and are not kept in memory. They are, however, read back when needed.
#4 Updated by Knödlseder Jürgen about 10 years ago
- % Done changed from 20 to 50
I implemented a new logic in GCTAObservation where the events are not actually loaded when reading information from an XML file. The read() method now just stores the event file name, and the events() method makes sure that the events are loaded if they are not yet in memory. To get rid of events once they have been loaded, the dispose_events() method has been added.
With this logic, only minimal changes need to be made on the ctools side.
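The load-on-demand logic described here can be sketched with a toy Python class. This is not the GammaLib implementation: LazyObservation and is_loaded() are hypothetical, the event file is a plain text file, and only the method names read(), events() and dispose_events() are taken from the comment above.

```python
import os
import tempfile


class LazyObservation:
    """Toy stand-in for the described logic; not the GCTAObservation class."""

    def __init__(self):
        self._eventfile = None
        self._events = None

    def read(self, eventfile):
        """Store the event file name without loading any events."""
        self._eventfile = eventfile
        self._events = None

    def events(self):
        """Return the events, loading them from disk on first access."""
        if self._events is None:
            with open(self._eventfile) as f:
                self._events = [line.split() for line in f]
        return self._events

    def dispose_events(self):
        """Release the events; they are reloaded on the next events() call."""
        self._events = None

    def is_loaded(self):
        """Hypothetical helper to inspect whether events are in memory."""
        return self._events is not None


# Write a small placeholder event file (RA, Dec, energy, time per line).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write("83.6 22.0 1.0 0.0\n83.7 22.1 2.0 1.0\n")
    path = f.name

obs = LazyObservation()
obs.read(path)
loaded_after_read = obs.is_loaded()        # nothing loaded yet
n_events = len(obs.events())               # first access triggers the load
loaded_after_access = obs.is_loaded()
obs.dispose_events()
loaded_after_dispose = obs.is_loaded()     # memory released again
os.remove(path)
```

Because callers only ever go through events(), code that uses the observation does not need to know whether the events are currently in memory, which is consistent with needing only minimal changes on the calling side.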
#5 Updated by Knödlseder Jürgen about 10 years ago
- Description updated (diff)
#6 Updated by Knödlseder Jürgen about 10 years ago
- File test.py added
- Status changed from In Progress to Feedback
- % Done changed from 50 to 90
Attached a test script that I used to test a pipeline with reduced memory usage. Looks good from my side.
#7 Updated by Knödlseder Jürgen about 10 years ago
- Status changed from Feedback to Closed
- % Done changed from 90 to 100