Should we change the GTI logic?

Added by Knödlseder Jürgen about 5 years ago

For the moment we assume that the event list is always consistent with the GTIs, hence that the event list (on disk) has been filtered according to the GTIs. This implies that if somebody changes the GTIs, he/she has to create a new copy of the event files. One may argue that this duplicates event files and requires disk space; on the other hand, only this guarantees the coherence of the data with the GTIs, and the disk space required is not so large.

If we want to avoid data duplication, we would first need to carefully evaluate the implications. This thread is for discussing that (and also the Use Cases).

At which step would we like to modify the events according to the GTIs (upon loading or upon event access)? The first simplifies things (I think), while the latter would be more flexible (but probably slows down things).

We also have to be aware that if we change GTIs we need to recompute ontime and livetime (and maybe something else?).
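As a rough illustration of that recomputation, here is a minimal sketch. It assumes the GTIs are non-overlapping [start, stop] intervals in seconds and that one constant deadtime correction factor is valid for the whole observation, which may not hold once GTIs are modified. `Interval`, `ontime` and `livetime` are hypothetical names for illustration, not GammaLib API.

```cpp
#include <cassert>
#include <vector>

// Hypothetical GTI representation: a closed interval [start, stop] in seconds.
struct Interval {
    double start;
    double stop;
};

// Ontime: summed duration of all GTIs (assumed non-overlapping).
double ontime(const std::vector<Interval>& gtis)
{
    double sum = 0.0;
    for (const Interval& gti : gtis) {
        assert(gti.stop >= gti.start);  // guard against malformed intervals
        sum += gti.stop - gti.start;
    }
    return sum;
}

// Livetime: ontime scaled by a deadtime correction factor in [0, 1].
// Assumption: a single constant factor applies to every GTI.
double livetime(const std::vector<Interval>& gtis, double deadc)
{
    assert(deadc >= 0.0 && deadc <= 1.0);
    return ontime(gtis) * deadc;
}
```

Both values depend only on the current GTI set here, so they would have to be re-derived whenever the GTIs change rather than read from a stored header value.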

We would also need a mechanism that avoids making GTIs “larger” than the existing GTIs (hence a careful check would be needed to ensure that GTIs can only be more restrictive than the previous ones).
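Such a check could look roughly like the sketch below, which accepts a new GTI set only if every new interval lies fully inside one of the existing intervals. It assumes the existing GTIs neither overlap nor touch (otherwise they would have to be merged first). `Interval` and `is_more_restrictive` are hypothetical names, not existing GammaLib functions.

```cpp
#include <cassert>
#include <vector>

// Hypothetical GTI representation: a closed interval [start, stop] in seconds.
struct Interval {
    double start;
    double stop;
};

// Returns true if every interval in `updated` is fully contained in a single
// interval of `existing`, i.e. the new GTIs never extend beyond the old ones.
bool is_more_restrictive(const std::vector<Interval>& existing,
                         const std::vector<Interval>& updated)
{
    for (const Interval& n : updated) {
        assert(n.stop >= n.start);  // guard against malformed intervals
        bool covered = false;
        for (const Interval& e : existing) {
            if (n.start >= e.start && n.stop <= e.stop) {
                covered = true;
                break;
            }
        }
        if (!covered) {
            return false;  // this interval reaches outside the existing GTIs
        }
    }
    return true;
}
```

The nested loop is quadratic in the number of intervals, which should be acceptable as long as the check runs once per GTI update and not once per event.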

So before taking any action here, I would suggest that we have a careful discussion of this issue. Note that in any case, the events should always have a default GTI that corresponds to the initial selection.


Replies (23)

RE: Should we change the GTI logic? - Added by Deil Christoph about 5 years ago

Related: https://github.com/gammapy/gamma-astro-data-formats/issues/20

Completely agree this needs more thought to figure out the use cases / good semantics how things should work.

For HESS we’re currently not using GTIs, so I don’t have a concrete use case / proposal how it should work at this point.

RE: Should we change the GTI logic? - Added by Mayer Michael about 5 years ago

I agree this should be discussed.
The first question that comes to my mind is how a user wants to and can edit GTIs. Will there be a ctmktime tool eventually? Should this tool already select events by the GTIs or only compute the GTIs?

One option (which might be the simplest and most static one) could be to run ctselect after the GTIs have been changed. We would just need to adapt ctselect to always apply a time selection according to the given GTIs. Is that reasonable or too prone to errors?

If GTIs can (and should be) simply changed by the user, it makes sense that filtering and computing ontime and livetime should be done on loading.
I am still unsure about the advantage to filter on runtime versus filtering on loading.

For HESS we’re currently not using GTIs, so I don’t have a concrete use case / proposal how it should work at this point.

Well we are using GTIs but they are at run start and run stop.

RE: Should we change the GTI logic? - Added by Knödlseder Jürgen about 5 years ago

Mayer Michael wrote:

I agree this should be discussed.
The first question that comes to my mind is how a user wants to and can edit GTIs. Will there be a ctmktime tool eventually? Should this tool already select events by the GTIs or only compute the GTIs?

Yes, the plan is to have a tool that allows redefining the GTIs based on any auxiliary information. The working name of the tool is ctfilter (we may of course change the name). See #1635.

One option (which might be the simplest and most static one) could be to run ctselect after the GTIs have been changed. We would just need to adapt ctselect to always apply a time selection according to the given GTIs. Is that reasonable or too prone to errors?

My idea was that ctfilter makes the time selection and writes out a new event file with adjusted GTIs. So the workflow is to run ctfilter and then ctselect (having two tools was inspired from the Fermi workflow and looked more flexible to me, but there could also be arguments for combining this into a single tool).

If GTIs can (and should be) simply changed by the user, it makes sense that filtering and computing ontime and livetime should be done on loading.
I am still unsure about the advantage to filter on runtime versus filtering on loading.

The thing would be that you could set the GTI for example from within Python and rerun the analysis without changing anything on disk. Maybe this is what the user wants (to test quickly a bunch of GTIs) but I’m not sure about that. Maybe we really need to develop use cases for this.

RE: Should we change the GTI logic? - Added by Deil Christoph about 5 years ago

Are IRFs per-observation or per-GTI?

In HESS we currently have one GTI and one IRF per observation, but I think for VERITAS Nathan has split one observation in multiple chunks with their own IRFs.
I’m not sure how / if he’s using GTIs and how he does the GTI / IRF association.
That’s the first and most concrete use-case for GTIs and we probably want to support this for high-precision IRFs during long observations, no?

RE: Should we change the GTI logic? - Added by Mayer Michael about 5 years ago

Yes, the plan is to have a tool that allows redefining the GTIs based on any auxiliary information. The working name of the tool is ctfilter (we may of course change the name). See #1635.

Sounds good. Just a matter of taste but to me ctfilter sounds very generic (what will be filtered by what). Therefore, e.g. also ctmkgti could be more meaningful.

My idea was that ctfilter makes the time selection and writes out a new event file with adjusted GTIs. So the workflow is to run ctfilter and then ctselect (having two tools was inspired from the Fermi workflow and looked more flexible to me, but there could also be arguments for combining this into a single tool).

Agreed, this looks like a reasonable workflow (however, in Fermi we first run gtselect and then gtmktime). Accordingly, if the user wants to change the GTIs, the complete chain has to be redone. That seems fine with me as GTIs are really fundamental for the analysis.

The thing would be that you could set the GTI for example from within Python and rerun the analysis without changing anything on disk. Maybe this is what the user wants (to test quickly a bunch of GTIs) but I’m not sure about that. Maybe we really need to develop use cases for this.

I agree this is also not quite clear to me. This might also be different for binned and unbinned analysis. For binned analysis, the intermediate products like the exposure cube would have to be recomputed when the GTI changes. In case of an unbinned analysis, an option could be to do something like:

if (!m_gti.contains(event.time())) {
    continue;
}

But I don’t know how costly, i.e. time consuming such a call will be at the end.
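One way to bound that cost, sketched here under the assumption that both the event times and the GTIs are sorted in ascending order and that the GTIs do not overlap, is a single sweep over both lists instead of a separate lookup per event. `Interval` and `filter_by_gti` are hypothetical names for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical GTI representation: a closed interval [start, stop] in seconds.
struct Interval {
    double start;
    double stop;
};

// Returns the event times that fall inside any GTI.
// Assumes `times` and `gtis` are sorted ascending and the GTIs do not overlap.
std::vector<double> filter_by_gti(const std::vector<double>& times,
                                  const std::vector<Interval>& gtis)
{
    assert(std::is_sorted(times.begin(), times.end()));
    std::vector<double> kept;
    std::size_t i = 0;  // index of the first GTI that has not ended yet
    for (double t : times) {
        // Skip GTIs that end before this event
        while (i < gtis.size() && gtis[i].stop < t) {
            ++i;
        }
        if (i == gtis.size()) {
            break;  // no GTI left, so all later events are rejected as well
        }
        if (t >= gtis[i].start) {
            kept.push_back(t);
        }
    }
    return kept;
}
```

Each event and each GTI is visited at most once, so the cost grows linearly with the number of events plus the number of GTIs.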

RE: Should we change the GTI logic? - Added by Knödlseder Jürgen about 5 years ago

Deil Christoph wrote:

Are IRFs per-observation or per-GTI?

In HESS we currently have one GTI and one IRF per observation, but I think for VERITAS Nathan has split one observation in multiple chunks with their own IRFs.
I’m not sure how / if he’s using GTIs and how he does the GTI / IRF association.
That’s the first and most concrete use-case for GTIs and we probably want to support this for high-precision IRFs during long observations, no?

IRFs are per observation, but you can easily split a single observation into sub-observations to have a more fine grained IRF handling. I think that’s what he is doing.

RE: Should we change the GTI logic? - Added by Deil Christoph about 5 years ago

So what’s the use case you have in mind where one needs GTIs at all?

- Assigning IRFs?
- Data quality selection?
- Time bins for lightcurves?

It’s only useful if you have an observation where the GTI is not simply [TSTART, TSTOP], right?

RE: Should we change the GTI logic? - Added by Knödlseder Jürgen about 5 years ago

The plan in CTA is to keep as many events as possible, hence if something bad happens within a run (e.g. clouds) the period should simply be cut out (by using for example GTIs) instead of throwing away an entire run.

The observatory will certainly provide “standard” GTIs, but to test whether a result is solid I could imagine that a user would like to play with the GTI selection to see how sensitive the result is to a specific selection. Most users would probably not go into run-wise GTI modification, but would run the ctmkgti (or whatever we name it) tool using different filtering criteria, and rerun the analysis.

Time bins for light curves should not be done using GTI (as GTI means Good Time Intervals, hence all intervals that are good for analysis). For user specified light curve bins we should invent something different (the format could however be very similar, but we should not use the official GTIs for that).

RE: Should we change the GTI logic? - Added by Deil Christoph about 5 years ago

Then it’s pretty simple: by default check / apply GTI filter on events in any tool that takes event lists as input (because it’s cheap). Maybe, if speed is a concern anywhere, add an option to not do this.

What about combining several observations into one EVENTS list?
Or even a global photon database?
Is this still being discussed / worth inventing a format for to prototype?

RE: Should we change the GTI logic? - Added by Knödlseder Jürgen about 5 years ago

By the way: if we assume that CTA will provide standard GTIs, a user can only make a looser GTI cut if the event file contains all events, i.e. also those that are not falling within a GTI. Currently we do not support that logic.

The problem with

if (!m_gti.contains(event.time())) {
    continue;
}
is that eventually the contains() method can be quite time consuming (in particular if the number of GTIs is large; but maybe we can assume that this is rarely the case). Also note that the ontime and livetime need to be computed correctly. This probably implies that we do not use the information provided in the FITS header for this. So far, however, we have no way to re-evaluate the deadc fraction, which may change if we modify the GTIs. For this, more fine-grained deadtime information would be needed.
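On the cost of contains(): if the GTIs are kept sorted and non-overlapping, a binary search reduces the lookup to O(log n) per event. The following is only a minimal sketch under those assumptions, using plain double times in seconds. `Interval` and `SortedGti` are hypothetical names and do not describe how the GammaLib GTI class is implemented.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical GTI representation: a closed interval [start, stop] in seconds.
struct Interval {
    double start;
    double stop;
};

// Hypothetical container that keeps GTIs sorted by start time.
class SortedGti {
public:
    explicit SortedGti(std::vector<Interval> gtis) : m_gtis(std::move(gtis))
    {
        std::sort(m_gtis.begin(), m_gtis.end(),
                  [](const Interval& a, const Interval& b) {
                      return a.start < b.start;
                  });
        // The binary search below relies on non-overlapping intervals
        for (std::size_t i = 1; i < m_gtis.size(); ++i) {
            assert(m_gtis[i].start >= m_gtis[i - 1].stop);
        }
    }

    // Returns true if time t lies within one of the intervals.
    bool contains(double t) const
    {
        // Find the first interval whose start is strictly greater than t
        auto it = std::upper_bound(m_gtis.begin(), m_gtis.end(), t,
                                   [](double value, const Interval& gti) {
                                       return value < gti.start;
                                   });
        if (it == m_gtis.begin()) {
            return false;  // t precedes every interval
        }
        --it;  // last interval with start <= t
        return t <= it->stop;
    }

private:
    std::vector<Interval> m_gtis;
};
```

With only a handful of GTIs per observation, a plain linear scan would probably be just as fast, so the binary search mainly pays off when the number of intervals is large.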

RE: Should we change the GTI logic? - Added by Knödlseder Jürgen about 5 years ago

Deil Christoph wrote:

What about combining several observations into one EVENTS list?
Or even a global photon database?
Is this still being discussed / worth inventing a format for to prototype?

Not sure what the status of that is, I have not heard about this for a long time.

RE: Should we change the GTI logic? - Added by Mayer Michael about 5 years ago

I understand that contains can become quite time consuming.

So the solution for now could be (until we have ctmkgti) to add a flag to ctselect to remove events when they are not contained in the GTIs (or even do it by default?). I am not sure if tools like ctlike should deal with this at all (maybe a simple check at the beginning that throws an exception if there are still events outside GTIs?)

Time bins for light curves should not be done using GTI (as GTI means Good Time Intervals, hence all intervals that are good for analysis). For user specified light curve bins we should invent something different (the format could however be very similar, but we should not use the official GTIs for that).

I had a look at cslightcrv. The script can read FITS files that have “START” and “STOP” columns such as GTIs (the script uses the GTI class to handle the time bins internally). But the user can simply provide an arbitrary FITS (or even ascii) file to read the time bins.

RE: Should we change the GTI logic? - Added by Deil Christoph about 5 years ago

Concerning light curves: a spec of the time binning input format and also the output results format would be very welcome!
http://gamma-astro-data-formats.readthedocs.org/en/latest/results/light_curves/index.html
Writing up how exactly that should work might even be a good use case for figuring out how GTIs should work.

RE: Should we change the GTI logic? - Added by Knödlseder Jürgen about 5 years ago

So far the tools do things consistently, but as soon as we use data from somewhere else (i.e. provided by HESS) we need to make sure that these data are consistent.

For now, we should probably add a verification step to GCTAEventList as all event loading goes through this class, and throw an exception if an event outside a GTI is found (one reason more to limit the handling of GTIs to GCTAEventList).
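A verification step of this kind might look like the sketch below, which throws as soon as one event time lies outside every GTI. This is an illustration under simple assumptions (times as plain doubles in seconds, closed intervals). `Interval` and `verify_events_in_gti` are hypothetical names and not part of GCTAEventList.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical GTI representation: a closed interval [start, stop] in seconds.
struct Interval {
    double start;
    double stop;
};

// Throws std::runtime_error if any event time lies outside all GTIs.
void verify_events_in_gti(const std::vector<double>& times,
                          const std::vector<Interval>& gtis)
{
    for (std::size_t k = 0; k < times.size(); ++k) {
        bool inside = false;
        for (const Interval& gti : gtis) {
            assert(gti.stop >= gti.start);  // guard against malformed intervals
            if (times[k] >= gti.start && times[k] <= gti.stop) {
                inside = true;
                break;
            }
        }
        if (!inside) {
            throw std::runtime_error("Event " + std::to_string(k) +
                                     " lies outside all GTIs");
        }
    }
}
```

Running the check once at load time keeps the rest of the tools free of GTI logic, at the price of one extra pass over the events.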

For cslightcrv: it is fine to use some GTI stored in a user FITS file, I just would not use the GTI selection for events for that.

RE: Should we change the GTI logic? - Added by Mayer Michael about 5 years ago

For now, we should probably add a verification step to GCTAEventList as all event loading goes through this class, and throw an exception if an event outside a GTI is found (one reason more to limit the handling of GTIs to GCTAEventList).

Yes I thought about this too. However, if we load and dispose events during run time to save memory, would this become also quite time consuming? Maybe this check could be handled on the tool level, i.e. add GCTAEventList::apply_gti() that is called when loading the observations the first time only?

For cslightcrv: it is fine to use some GTI stored in a user FITS file, I just would not use the GTI selection for events for that.

Yes, currently, the time selection for cslightcrv is done via time parameters in ctselect.

RE: Should we change the GTI logic? - Added by Knödlseder Jürgen about 5 years ago

Deil Christoph wrote:

Concerning light curves: a spec of the time binning input format and also the output results format would be very welcome!
http://gamma-astro-data-formats.readthedocs.org/en/latest/results/light_curves/index.html
Writing up how exactly that should work might even be a good use case for figuring out how GTIs should work.

Are there any standard formats for light curves? (e.g. a VO spec?)

To my knowledge there is no VO standard, but there is http://wiki.ivoa.net/bin/view/IVOA/LightCurves. See also:

RE: Should we change the GTI logic? - Added by Knödlseder Jürgen about 5 years ago

Mayer Michael wrote:

For now, we should probably add a verification step to GCTAEventList as all event loading goes through this class, and throw an exception if an event outside a GTI is found (one reason more to limit the handling of GTIs to GCTAEventList).

Yes I thought about this too. However, if we load and dispose events during run time to save memory, would this become also quite time consuming? Maybe this check could be handled on the tool level, i.e. add GCTAEventList::apply_gti() that is called when loading the observations the first time only?

We can of course do this job only on the first time. But: the dispose_events() is only used explicitly by ctobssim, ctbin and ctmodel so far after a one-time usage of the event data. The basic idea was to dispose the events when you are sure that they will not be used again by a tool, and thus save memory. So it would actually make no difference.

RE: Should we change the GTI logic? - Added by Mayer Michael about 5 years ago

We can of course do this job only on the first time. But: the dispose_events() is only used explicitly by ctobssim, ctbin and ctmodel so far after a one-time usage of the event data. The basic idea was to dispose the events when you are sure that they will not be used again by a tool, and thus save memory. So it would actually make no difference.

Ok sounds good - then let’s do it that way. I guess GCTAEventList::read_events() needs to be expanded then.

RE: Should we change the GTI logic? - Added by Deil Christoph about 5 years ago

Jürgen, thanks for the links.

I’ve made an issue here as a reminder that we want to write up a spec for this, linking back to this discussion:
https://github.com/gammapy/gamma-astro-data-formats/issues/22

RE: Should we change the GTI logic? - Added by Knödlseder Jürgen about 5 years ago

Mayer Michael wrote:

Ok sounds good - then let’s do it that way. I guess GCTAEventList::read_events() needs to be expanded then.

I propose to continue the specific discussion about file off loading here: #1648.

RE: Should we change the GTI logic? - Added by Mayer Michael about 5 years ago

I propose to continue the specific discussion about file off loading here: #1648.

Agreed.

For the logic regarding the GTIs, we should make a separate issue to implement the flag in ctselect.

RE: Should we change the GTI logic? - Added by Knödlseder Jürgen about 5 years ago

Mayer Michael wrote:

For the logic regarding the GTIs, we should make a separate issue to implement the flag in ctselect.

Wondering whether we need a flag for this or whether ctselect should just do this job (and eventually log this into the log file). This would make it easier to remove this once we have ctmkgti.

RE: Should we change the GTI logic? - Added by Mayer Michael about 5 years ago

Of course, we could also make ctselect do this without asking. However, maybe a user wants to use ctselect to make other selections (e.g. look at energy distributions outside GTIs, or select all events with certain sky directions regardless of GTIs). For this case ctselect wouldn’t be useful any longer.
I am not sure if these cases are needed at all though.
