Bug #1308

ctlike crashes when several background models are given

Added by Lu Chia-Chun over 9 years ago. Updated over 9 years ago.

Status: Closed
Start date: 08/01/2014
Priority: Urgent
Due date:
Assigned To: Knödlseder Jürgen
% Done: 100%
Category: -
Target version: 00-08-00
Duration:

Description

ctlike crashes when several background models in model.xml are each linked to the observation with the same id.

No error messages except 'Segmentation fault (core dumped)'


Recurrence

No recurrence.

History

#1 Updated by Knödlseder Jürgen over 9 years ago

This is probably something we never tested.

I think I understand what happens, but I cannot really verify unless you can provide me with some test data.

I think the problem is related to parallel processing. It should disappear when you switch off OpenMP support (./configure --disable-openmp).

GammaLib loads map cubes only when it really needs them. When you use OpenMP, GammaLib creates multiple copies of the model, and at that point multiple threads try to access the same file at the same time, which leads to the cfitsio problem. The error code (108) is
READ_ERROR 108 error reading from FITS file
(see http://heasarc.gsfc.nasa.gov/docs/software/fitsio/quick/node26.html).

Can you try switching off OpenMP support to see whether the problem disappears?

#2 Updated by Lu Chia-Chun over 9 years ago

  • Status changed from New to Feedback

Switching off OpenMP does resolve the problem!

#3 Updated by Knödlseder Jürgen over 9 years ago

  • Status changed from Feedback to In Progress
  • % Done changed from 0 to 10

Good. Now I have to think about how to prevent this problem.

It is in fact annoying that GammaLib creates multiple copies of the model just for OpenMP. Holding these copies in memory can be quite expensive. I don’t fully understand why the problem occurs, as each thread (which corresponds to one observation) should only use its specific model.

I still would like to have a test case to better understand the problem. We need a fix for that.

#4 Updated by Knödlseder Jürgen over 9 years ago

  • Status changed from In Progress to Feedback
  • Target version set to 00-08-00
  • % Done changed from 10 to 90

I added to GModelSpatialDiffuseCube::fetch_cube an OpenMP protection:

    if (!m_loaded && !m_filename.empty()) {
        #pragma omp critical
        {
            const_cast<GModelSpatialDiffuseCube*>(this)->load(m_filename);
        }
    }

Hope that this solves the issue.
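For readers who want to try the pattern in isolation, here is a minimal, self-contained sketch of a lazily loaded object whose load step is serialised with an OpenMP critical section. It is not the GammaLib code: the class LazyCube, the function sum_first_values, the critical-section name cube_load, and the dummy load() body are all hypothetical stand-ins, and the real load would read a FITS file. The sketch assumes, as described above, that every thread works on its own copy of the model.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a lazily loaded map cube (not the GammaLib class).
class LazyCube {
public:
    explicit LazyCube(const std::string& filename) : m_filename(filename) {}

    // Const accessor that triggers the load on first use.
    double value(std::size_t index) const {
        fetch();
        assert(index < m_data.size());
        return m_data[index];
    }

    bool loaded() const { return m_loaded; }

private:
    // Load the data if this copy has not done so yet.
    void fetch() const {
        if (!m_loaded && !m_filename.empty()) {
            // Only one thread at a time may run the load, so accesses to the
            // shared file never overlap. Without OpenMP the pragma is ignored.
            #pragma omp critical(cube_load)
            {
                load();
            }
        }
    }

    // Placeholder for the file read (assumption: fills four dummy pixels).
    void load() const {
        m_data.assign(4, 1.0);
        m_loaded = true;
    }

    std::string                 m_filename;
    mutable std::vector<double> m_data;
    mutable bool                m_loaded = false;
};

// Each iteration uses a private copy, mirroring the per-thread model copies.
double sum_first_values(const LazyCube& prototype, int n_obs) {
    double total = 0.0;
    #pragma omp parallel for reduction(+:total)
    for (int i = 0; i < n_obs; ++i) {
        LazyCube copy(prototype);   // the prototype itself is never modified
        total += copy.value(0);
    }
    return total;
}
```

The critical section serialises the loads but does not share the loaded data, so each copy still reads the file once. That keeps the sketch close to the snippet above; avoiding the repeated reads would need a different design.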

#5 Updated by Knödlseder Jürgen over 9 years ago

Chia-Chun, did you have a chance to check whether this is fixed now?

#6 Updated by Lu Chia-Chun over 9 years ago

No. ctselect crashes without error messages after I updated. I am not sure what the problem is. I am still testing.

#7 Updated by Lu Chia-Chun over 9 years ago

Hi Juergen,

I believe it works now. Thank you for your effort in resolving this problem!

#8 Updated by Knödlseder Jürgen over 9 years ago

  • Status changed from Feedback to Closed
  • % Done changed from 90 to 100
