Feature #1597

Please add option in XML observation spec to select HDU names

Added by Deil Christoph over 8 years ago. Updated over 8 years ago.

Status:ClosedStart date:
Priority:NormalDue date:
Assigned To:Mayer Michael% Done:

100%

Category:-Estimated time:5.00 hours
Target version:-
Duration:

Description

As discussed on the ctools mailing list and as a follow-up to the Gammalib issue https://cta-redmine.irap.omp.eu/issues/1596 , here’s the corresponding feature request for ctools.

Please add an option to specify HDU names in the observation XML model specification, e.g. “extname” in addition to “file” here:
http://cta.irap.omp.eu/ctools-devel/user_manual/getting_started/response.html#specifying-individual-instrument-response-files

At the moment several names are hard-coded, like “POINT SPREAD FUNCTION”.

This poses extra requirements on how we have to structure and name HDUs in HESS in the FITS exporters that we’d like to get rid of:
http://gamma-astro-data-formats.readthedocs.org/en/latest/data_storage/hdu_index/index.html#disclaimer-about-hduname

Maybe this addition can still be made before the 1.0 release?


Recurrence

No recurrence.

History

#1 Updated by Knödlseder Jürgen over 8 years ago

  • Project changed from ctools to GammaLib

#2 Updated by Mayer Michael over 8 years ago

  • Tracker changed from Change request to Feature
  • Status changed from New to Feedback
  • Assigned To changed from Knödlseder Jürgen to Mayer Michael
  • Start date deleted (12/10/2015)
  • % Done changed from 0 to 100
  • Estimated time set to 5.00

I have implemented the changes to the CTA response interface. Generally, all load functions now also take the string parameter extname, which doesn’t have to be filled on default.
The derived response classes which are based on FITS tables (e.g. GCTAAeff2D) now can take the extension name on construction. The extension name can also be passed on saving.
Furthermore, I modified GCTAResponseIrf::read to look for a hdu attribute in the XML file. If present, the attribute is used and stored for saving later on. If the hdu attribute is not present it is not written to XML in the save method, and the default is used.

I have added unit test cases that read and write from and to XML, which all run smoothly.

Changes are available on 1597-Add-option-to-specify-hdu-for-cta-irfs

#3 Updated by Mayer Michael over 8 years ago

  • Status changed from Feedback to Pull request

#4 Updated by Deil Christoph over 8 years ago

Michael - sounds good to me. Thanks!

One minor question / comment: why call it “extname” in the code and “hdu” in the XML file ... make it consistent?

#5 Updated by Mayer Michael over 8 years ago

One minor question / comment: why call it “extname” in the code and “hdu” in the XML file ... make it consistent?

Don’t have a strong opinion “hdu” is shorter and more handy, “extname” makes it more consistent. Jürgen, any opinion?

#6 Updated by Knödlseder Jürgen over 8 years ago

Mayer Michael wrote:

One minor question / comment: why call it “extname” in the code and “hdu” in the XML file ... make it consistent?

Don’t have a strong opinion “hdu” is shorter and more handy, “extname” makes it more consistent. Jürgen, any opinion?

I agree to have it consistently named extname would be better.

#7 Updated by Mayer Michael over 8 years ago

Ok, I have made the change.
We might need the same behaviour for CTAEventList and GTI?

#8 Updated by Deil Christoph over 8 years ago

We might need the same behaviour for CTAEventList and GTI?

Yes, please!

For HESS, IMO it would be nice if I have the option to store the 100k HDUs for all HESS data in one FITS file that’s a few GB in size (depending on the cuts) and distribute that instead of 100k separate files for each HDU or 20k where the HDUs for one obs are bundled.

If you can make Gammalib / ctools so flexible the that HDU names can be specified everywhere / processing data from such a large multi-obs bundled FITS file, that would be great!

#9 Updated by Knödlseder Jürgen over 8 years ago

Mayer Michael wrote:

Ok, I have made the change.
We might need the same behaviour for CTAEventList and GTI?

I support to have this option always when a FITS file is loaded.

We should create a new feature on this and screen the code to see what needs to be modified (I’m current working on integrating your code, Michael). When working on the new feature, please also think about updating the Doxygen description of the interface (I’m about to add the missing extname parameters).

#10 Updated by Mayer Michael over 8 years ago

We should create a new feature on this and screen the code to see what needs to be modified

I have created #1598

When working on the new feature, please also think about updating the Doxygen description of the interface (I’m about to add the missing extname parameters).

Oh right, I forgot about this. Thanks.

#11 Updated by Knödlseder Jürgen over 8 years ago

I’m in the process to merge the change in, yet I was thinking about another possibility.

The cfitsio convention is to specify an extension name in squared brackets following

filename[extname]

(one can also given an extension number in that way). We could use this convention by adding a convenience function that pulls the filename[extname] input string apart into a filename and extname string, using that for opening the file (or alternatively an extension number). The advantage is to have simpler interfaces (now we have methods that don’t use extname, but they need it specified as a dummy argument of “”).

How does this sound (sorry Michael, I should have come up with this though before you start coding).

#12 Updated by Mayer Michael over 8 years ago

How does this sound (sorry Michael, I should have come up with this though before you start coding).

No worries :)
Sounds like a very good idea which just didn’t came to my mind. This would simplify things a lot I guess. Would you then already have this as filename in the xml file?
E.g.

<parameter name="PointSpreadFunction" file="psf.fits.gz[PSF]"/>

#13 Updated by Knödlseder Jürgen over 8 years ago

  • Status changed from Pull request to In Progress
  • % Done changed from 100 to 50

Yes. A complete list of the syntax supported by cfitsio is here: https://heasarc.gsfc.nasa.gov/docs/software/fitsio/c/c_user/node81.html (see also https://heasarc.gsfc.nasa.gov/docs/software/fitsio/filters.html).

I see that there is even an extension version (EXTVER) that I must admit I was not aware of.

My idea would be to implement a GFilename class that ultimately will replace the std::string types and that emulates somehow the C++ string class, but provides methods to extract extname, extver and extno from the filename. I created a feature for that (#1599).

I would not replace std::string by GFilename now (basically not to hold up release 1.0), but we can implement the class and use it for the moment internally.

#14 Updated by Mayer Michael over 8 years ago

Thanks for adding GFilename (#1599) - does it need to be GFileName though?.
Just to clarify what is necessary now:
  • Revert my changes and stick with the current interface in CTA response functions i.e. use load(std::string& filename).
  • In CTA response components load() functions, create a GFilename instance from the input string
  • Set the default_extname of the GFilename instance to the respective component (e.g. “EFFECTIVE AREA” in GCTAAeff2D)
  • Get the FITS table in extension GFilename::extname()
  • read the FITS table from the corresponding extension name.
  • Remove “extname” from XML interface as this can be handled via the file name parameter

Did I miss something?

#15 Updated by Mayer Michael over 8 years ago

Here is an example here how I would modify the code e.g. in GCTAAeff2D::load.

void GCTAAeff2D::load(const std::string& filename)
{
    // Initialise filename
    GFilename fname(filename);

    // Set default extension name
    fname.default_extname("EFFECTIVE AREA");

    // Open effective area FITS file
    GFits file(fname.filename());

    // Initialise FITS table
    GFitsTable& table;

    // Check for extno, otherwise use extname to initialise the table
    if (fname.has_extno()) {
        table = *file.table(fname.extno());
    }
    else {
        table = *file.table(fname.extname());
    }

    // Read effective area
    read(table);

    // Close effective area FITS file
    file.close();

    // Store filename
    m_filename = filename;

    // Return
    return;
}

Any input is very appreciated.

We could also think about putting some more logic into the function GFits::table, e.g. add GFits::table(GFilename&) or even a new constructor for the table: GFitsTable(GFilename&)?

#16 Updated by Knödlseder Jürgen over 8 years ago

Mayer Michael wrote:

Thanks for adding GFilename (#1599) - does it need to be GFileName though?.
Just to clarify what is necessary now:
  • Revert my changes and stick with the current interface in CTA response functions i.e. use load(std::string& filename).
  • In CTA response components load() functions, create a GFilename instance from the input string
  • Set the default_extname of the GFilename instance to the respective component (e.g. “EFFECTIVE AREA” in GCTAAeff2D)
  • Get the FITS table in extension GFilename::extname()
  • read the FITS table from the corresponding extension name.
  • Remove “extname” from XML interface as this can be handled via the file name parameter

Did I miss something?

I’m still fiddling around with the GFilename interface. I basically try to use the interface in GEbounds. Still wait a bit (I just removed default_extname).

#17 Updated by Knödlseder Jürgen over 8 years ago

Mayer Michael wrote:

Thanks for adding GFilename (#1599) - does it need to be GFileName though?.

I have not a very strong opinion about the name.

Sometimes I write filename as a single word, some times as two words (see https://en.wikipedia.org/wiki/Filename).

We should decide before using it if we keep it as is or we want to change. Any votes?

#18 Updated by Mayer Michael over 8 years ago

Any votes?

Didn’t know that it can be both one word or two. I also don’t have a strong opinion. I guess I would vote for keeping GFilename. I recall there was a similar discussion about GSkymap vs GSkyMap (which is why I brought up the point in the first place).
You make the call :)

#19 Updated by Knödlseder Jürgen over 8 years ago

Here’s how you can use now GFilename (I changed GEbounds as a test case):

void GEbounds::load(const std::string& filename)
{
    // Create file name
    GFilename fname(filename);

    // Allocate FITS file
    GFits file;

    // Open FITS file
    file.open(fname.filename());

    // Get energy boundary table
    const GFitsTable& table = *file.table(fname.extname("EBOUNDS"));

    // Read energy boundaries from table
    read(table);

    // Close FITS file
    file.close();

    // Return
    return;
}

Just construct a GFilename object from a string and then use the filename() and extname() method to get filename and extension name. Note that the argument of extname() specifies the default extension name. If the user specified an extension name this overwrites the default extension name. This way of dealing with the default extension name is more compact than the former default_extname method.

For saving you would do

void GEbounds::save(const std::string& filename,
                    const bool&        clobber,
                    const std::string& unit) const
{
    // Create file name
    GFilename fname(filename);

    // Allocate FITS file
    GFits file;

    // Write energy boundaries to FITS file
    write(file, fname.extname("EBOUNDS"), unit);

    // Save to file
    file.saveto(fname.filename(), clobber);

    // Return
    return;
}

Note that ultimately one should be able to replace the std::string argument by a GFilename argument in the methods directly, but there are issues in Python with automatic type conversion that I have not yet settled. So for release 1.0 I propose to use the above method. We then can switch to GFilename in a later release which should in principle be transparent to the user (due to automatic type casting between std::string and GFilename - once I get this also properly working in Python).

#20 Updated by Mayer Michael over 8 years ago

Thanks a lot for the input. I will checkout a fresh branch and make the changes to the CTA response methods.
One question though:
In the example I provided above, we would also take into account the possibility to specify an HDU number instead of a name:

// Check for extno, otherwise use extname to initialise the table
    if (fname.has_extno()) {
        table = *file.table(fname.extno());
    }
    else {
        table = *file.table(fname.extname());
    }

Do we want to support that in the CTA response classes, too? Note that currently GCTAPsfVector uses:
    // Get PSF table
    const GFitsTable& table = *file.table(1);

#21 Updated by Knödlseder Jürgen over 8 years ago

You are right that we may have the two cases, and I agree that ultimately we should support both. I’m wondering what we should do for release 1.0 and what for beyond. I would limit release 1.0 to what is strictly necessary (extno seems a bit more low level, and maybe no one needs it now), and do other things later with more time for thinking about the best way to do it.

With the actual GFilename class you would need to add a lot of if’s in many methods. This can be avoided by introducing a GFitsExtension class that handles extname and extno (and possibly also extver) in a unified way. GFilename would then fill a GFitsExtension object as needed, and you pass this to the FITS classes instead of having methods for extension number and extension name. But we should do this beyond release 1.0 (otherwise we never meet the date).

I recognized that we should also add default values to the extno and extver methods. If you want you can do that (in the way how it’s done for extname). Make sure you commit this separately from any other change (makes it easier to deal with stuff for release 1.0 and other stuff).

#22 Updated by Knödlseder Jürgen over 8 years ago

P.S. Will be offline now until tonight.

#23 Updated by Mayer Michael over 8 years ago

I added the default arguments to the functions extno() and extver on branch 1599-add-defaults-to-GFilename

#24 Updated by Mayer Michael over 8 years ago

  • Status changed from In Progress to Pull request
  • % Done changed from 50 to 100

The methods read(), write(), load(), save() from most of the CTA response components were adapted to use GFilename. All unit tests run smoothy. With this change we actually do not need any modification for GCTAResponseIrf::read() anymore, right?
One can just specify HDUs in the XML observation definition files in the following:

<parameter name="PointSpreadFunction" file="psf.fits.gz[PSF]"/>

The changes are available on branch 1597-use-GFilename-in-CTAresponse-components

#25 Updated by Knödlseder Jürgen over 8 years ago

Mayer Michael wrote:

I added the default arguments to the functions extno() and extver on branch 1599-add-defaults-to-GFilename

Merged into devel.

#26 Updated by Knödlseder Jürgen over 8 years ago

  • Status changed from Pull request to Closed

Mayer Michael wrote:

The methods read(), write(), load(), save() from most of the CTA response components were adapted to use GFilename. All unit tests run smoothy. With this change we actually do not need any modification for GCTAResponseIrf::read() anymore, right?
One can just specify HDUs in the XML observation definition files in the following:
[...]
The changes are available on branch 1597-use-GFilename-in-CTAresponse-components

Merged into devel.

#27 Updated by Mayer Michael over 8 years ago

I have realised that I forgot to store the filename in the load functions of the response components.
Therefore,I have updated the branch 1597-use-GFilename-in-CTAresponse-components to the latest version and included the changes.
Could you merge that in, too? Sorry for the inconvenience.

#28 Updated by Mayer Michael over 8 years ago

I also realised that the following functions need adaption, too:
  • GCTAResponseIrf::load_aeff()
  • GCTAResponseIrf::load_psf()
  • GCTAResponseIrf::load_edisp()
  • GCTAResponseIrf::load_background()

I have updated them on the same branch (including the file inst/cta/test/data/test_1dc.xml, now specifying extension names).

#29 Updated by Knödlseder Jürgen over 8 years ago

Mayer Michael wrote:

I have realised that I forgot to store the filename in the load functions of the response components.
Therefore,I have updated the branch 1597-use-GFilename-in-CTAresponse-components to the latest version and included the changes.
Could you merge that in, too? Sorry for the inconvenience.

Merged into devel.

#30 Updated by Knödlseder Jürgen over 8 years ago

Mayer Michael wrote:

I also realised that the following functions need adaption, too:
  • GCTAResponseIrf::load_aeff()
  • GCTAResponseIrf::load_psf()
  • GCTAResponseIrf::load_edisp()
  • GCTAResponseIrf::load_background()

I have updated them on the same branch (including the file inst/cta/test/data/test_1dc.xml, now specifying extension names).

I went over your proposed changes which required to add extension names to the inst/cta/test/data/test_1dc.xml file.

I made a slightly different change so that this is not needed. Here an example of how I modified the GCTAResponseIrf::load_aeff method. The main difference is that the default extension name is created on basis of the extensions that exist in the FITS file. Thus if a 1DC file is read without providing the extension, the default extension name will be set to SPECRESP in that case. Please let me know if this looks also fine for you, or if you have trouble with this new implementation.

    // If file is a FITS file ...
    if (gammalib::is_fits(filename)) {

        // Open FITS file
        GFits file(fname.filename());

        // Get the extension name. If an extension name has been specified
        // then use this name, otherwise use either the "EFFECTIVE AREA" 
        // or the "SPECRESP" extension.
        std::string extname = "";
        if (fname.has_extname()) {
            extname = fname.extname();
        }
        else {
            if (file.contains("EFFECTIVE AREA")) {
                extname = "EFFECTIVE AREA";
            }
            else if (file.contains("SPECRESP")) {
                extname = "SPECRESP";
            }
        }

        // Continue only if extension name is not empty
        if (!extname.empty()) {

            // Get FITS table
            const GFitsTable& table = *file.table(extname);

            // Read save energy thresholds if available
            if (table.has_card("LO_THRES")) {
                m_lo_save_thres = table.real("LO_THRES");
            }
            if (table.has_card("HI_THRES")) {
                m_hi_save_thres = table.real("HI_THRES");
            }

            // Check for specific table column
            if (table.contains("SPECRESP")) {

                // Close file
                file.close();

                // Allocate Aeff from file
                m_aeff = new GCTAAeffArf(filename);

            } // endif: load as GCTAAeffArf

            else {

                // Close file
                file.close();

                // Allocate Aeff from file
                m_aeff = new GCTAAeff2D(filename);

            } // endelse: load as GCTAAeff2D

        } // endif: extension name is not empty

        // Signal that no extension was found
        else {
            std::string msg = "FITS file \""+filename+"\" does not " 
                              "contain a valid effective area table. " 
                              "Please specify a valid effective area " 
                              "response file.";
            throw GException::file_error(G_LOAD_AEFF, msg);
        }

    } // endif: file was FITS file

#31 Updated by Mayer Michael over 8 years ago

Excellent, the code looks fine with me. Much better now as we do not have to specify extension names for the 1DC format. Thanks!

Also available in: Atom PDF