Per Plane Hit Tuning Parameters for GaussHitFinderSBN #626
carriganm95 wants to merge 2 commits into SBNSoftware:develop from
Pull request overview
This PR introduces per-plane tuning parameters for the GaussHitFinderSBN module, allowing different chi-squared and maximum multi-hit thresholds for each detector plane. The parameter values have been optimized through a grid search to improve hit finding performance.
Changes:
- Added per-plane MaxMultiHitPerPlane and Chi2NDFPerPlane parameters to replace single-value MaxMultiHit and Chi2NDF parameters
- Updated LongPulseWidth from [10, 10, 10] to [5, 5, 5] and MinPulseHeight from [5.0, 5.0, 5.0] to [2.0, 2.0, 2.0] based on grid search optimization
- Added sbnanaobj StandardRecord dependencies to CAFMaker/RecoUtils CMakeLists.txt
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| sbncode/HitFinder/hitfindermodules_sbn.fcl | Adds per-plane parameter arrays and updates tuned parameter values; retains old deprecated parameters |
| sbncode/HitFinder/GaussHitFinderSBN_module.cc | Converts MaxMultiHit and Chi2NDF from scalars to vectors and updates all usages with plane indexing |
| sbncode/CAFMaker/RecoUtils/CMakeLists.txt | Adds StandardRecord dependencies (appears unrelated to hit finder changes) |
| , fAreaMethod(pset.get<int>("AreaMethod")) | ||
| , fAreaNormsVec(FillOutHitParameterVector(pset.get<std::vector<double>>("AreaNorms"))) | ||
| , fChi2NDF(pset.get<double>("Chi2NDF")) | ||
| , fChi2NDF(pset.get<std::vector<double>>("Chi2NDFPerPlane", std::vector<double>() = {500.0, 500.0, 500.0})) |
The Chi2NDFPerPlane parameter is not validated to ensure its size matches the number of planes in the geometry. Other similar parameters like AreaNorms use FillOutHitParameterVector to validate size and handle cases where the input vector has 1 element or N_PLANES elements. Without this validation, if a user provides a vector with an incorrect size, the code could fail at runtime with an out-of-bounds error when accessing fChi2NDF.at(plane). Consider using FillOutHitParameterVector or adding similar validation to ensure the vector size matches the geometry.
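The size check requested here can be sketched without the art/LArSoft dependencies. This is a minimal illustration, not the module's actual `FillOutHitParameterVector`: the helper name `fillOutPerPlane` and its `nPlanes` and `key` arguments are hypothetical, and the real function obtains the plane count from the geometry service.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a size-validating fill helper:
// one value is broadcast to every plane, exactly nPlanes values are kept,
// and any other size is rejected at configuration time.
template <class T>
std::vector<T> fillOutPerPlane(std::vector<T> const& input,
                               std::size_t nPlanes,
                               std::string const& key)
{
  if (input.size() == 1) return std::vector<T>(nPlanes, input.front());
  if (input.size() == nPlanes) return input;
  throw std::invalid_argument(key + ": expected 1 or " + std::to_string(nPlanes) +
                              " values, got " + std::to_string(input.size()));
}
```

Failing in the constructor with a message that names the key is usually easier to diagnose than an out-of-range exception raised later from `.at(plane)` inside the event loop.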
| , fChi2NDF(pset.get<std::vector<double>>("Chi2NDFPerPlane", std::vector<double>() = {500.0, 500.0, 500.0})) | |
| , fChi2NDF(FillOutHitParameterVector( | |
| pset.get<std::vector<double>>("Chi2NDFPerPlane", | |
| std::vector<double>() = {500.0, 500.0, 500.0}))) |
| const std::vector<double> | ||
| fAreaNormsVec; ///<factors for converting area to same units as peak height | ||
| const double fChi2NDF; ///maximum Chisquared / NDF allowed for a hit to be saved | ||
| const std::vector<double> fChi2NDF; ///maximum Chisquared / NDF allowed for a hit to be saved |
Comment formatting is inconsistent with surrounding code. The comment should use three slashes '///<' to match the style used for other member variable comments in this class (e.g., line 82).
| const std::vector<double> fChi2NDF; ///maximum Chisquared / NDF allowed for a hit to be saved | |
| const std::vector<double> fChi2NDF; ///<maximum Chisquared / NDF allowed for a hit to be saved |
| AllHitsInstanceName: "" # If non-null then this will be the instance name of all hits output to event | ||
| # in this case there will be two hit collections, one filtered and one containing all hits | ||
| MaxMultiHit: 5 # maximum hits for multi gaussian fit attempt | ||
| MaxMultiHitPerPlane: [ 10, 10, 8 ] # maximum hits per plane for multi gaussia fit attempt |
Spelling error in comment: "gaussia" should be "gaussian".
| MaxMultiHitPerPlane: [ 10, 10, 8 ] # maximum hits per plane for multi gaussia fit attempt | |
| MaxMultiHitPerPlane: [ 10, 10, 8 ] # maximum hits per plane for multi gaussian fit attempt |
| # will use "long" pulse method to return hit | ||
| AllHitsInstanceName: "" # If non-null then this will be the instance name of all hits output to event | ||
| # in this case there will be two hit collections, one filtered and one containing all hits | ||
| MaxMultiHit: 5 # maximum hits for multi gaussian fit attempt |
The old scalar parameter MaxMultiHit is still present in the configuration file but is no longer read by the code. The code now only reads MaxMultiHitPerPlane. This parameter should be removed to avoid confusion and maintain consistency, as it no longer has any effect on the module's behavior.
| MaxMultiHit: 5 # maximum hits for multi gaussian fit attempt |
| TryNplus1Fits: false # Don't try to refit with extra peak if bad chisq | ||
| LongMaxHits: [ 25, 25, 25] # max number hits in long pulse trains | ||
| LongPulseWidth: [ 5, 5, 5] # max widths for hits in long pulse trains | ||
| Chi2NDF: 500. # maximum Chisquared / NDF allowed to store fit |
The old scalar parameter Chi2NDF is still present in the configuration file but is no longer read by the code. The code now only reads Chi2NDFPerPlane. This parameter should be removed to avoid confusion and maintain consistency, as it no longer has any effect on the module's behavior.
| Chi2NDF: 500. # maximum Chisquared / NDF allowed to store fit |
| sbnanaobj::StandardRecord | ||
| sbnanaobj::StandardRecordFlat |
The CMakeLists.txt changes add sbnanaobj StandardRecord dependencies, but these don't appear to be used by RecoUtils.cc or RecoUtils.h. Neither file includes or references StandardRecord or StandardRecordFlat. This change appears unrelated to the hit finder tuning described in the PR description. If these dependencies are needed for other purposes, that should be explained. If not, they should be removed as they add unnecessary dependencies and build-time coupling.
There was a problem hiding this comment.
From the header included in RecoUtils.{h,cpp} it looks like this dependency is still not needed. I request that you remove that and verify that the code still builds.
| , fLongPulseWidthVec( | ||
| pset.get<std::vector<int>>("LongPulseWidth", std::vector<int>() = {16, 16, 16})) | ||
| , fMaxMultiHit(pset.get<int>("MaxMultiHit")) | ||
| , fMaxMultiHit(pset.get<std::vector<size_t>>("MaxMultiHitPerPlane", std::vector<size_t>() = {5, 5, 5})) |
The MaxMultiHitPerPlane parameter is not validated to ensure its size matches the number of planes in the geometry. Other similar parameters like AreaNorms use FillOutHitParameterVector to validate size and handle cases where the input vector has 1 element or N_PLANES elements. Without this validation, if a user provides a vector with an incorrect size, the code could fail at runtime with an out-of-bounds error when accessing fMaxMultiHit.at(plane). Consider using FillOutHitParameterVector or adding similar validation to ensure the vector size matches the geometry.
Summoning @PetrilloAtWork and @henrylay97 for review, could you please take a look at this? Thanks!
henrylay97 left a comment
Looks like copilot got in there with most of the good comments ;)
I have one concern about preserving default behaviour for SBND.
| AllHitsInstanceName: "" # If non-null then this will be the instance name of all hits output to event | ||
| # in this case there will be two hit collections, one filtered and one containing all hits | ||
| MaxMultiHit: 5 # maximum hits for multi gaussian fit attempt | ||
| MaxMultiHitPerPlane: [ 10, 10, 8 ] # maximum hits per plane for multi gaussia fit attempt |
The default behaviour should be retained here as this will be inherited by SBND as well. i.e. [5, 5, 5].
You will then want to override this parameter in an ICARUS fcl.
| LongMaxHits: [ 25, 25, 25] # max number hits in long pulse trains | ||
| LongPulseWidth: [ 5, 5, 5] # max widths for hits in long pulse trains | ||
| Chi2NDF: 500. # maximum Chisquared / NDF allowed to store fit | ||
| Chi2NDFPerPlane: [ 2500., 1750., 2500.] # maximum Chisquared / NDF allowed per plane to store, if fail |
Same comment here, need to retain default behaviour.
Hi! I am taking over from Michael's work on the hit finder tuning and thought of chiming in to reply to Henry's point. I agree that even for the ICARUS reconstruction this might be a breaking change (we would need to update all the relevant FHiCLs), and would suggest a way of going about this. We should keep only the [...] The changes would still require the private attributes to be changed to vectors:
const std::vector<double> fChi2NDF;
const std::vector<size_t> fMaxMultiHit;
With the introduction of the utility function (suggested by @PetrilloAtWork)
template<class T>
std::vector<T> getValueOrListOf(fhicl::ParameterSet const& pset, std::string const& key) {
  auto const& wireReadoutGeom = art::ServiceHandle<geo::WireReadout const>()->Get();
  const unsigned int N_PLANES = wireReadoutGeom.Nplanes();
  if (pset.is_key_to_sequence(key))
    return pset.get<std::vector<T>>(key);
  else
    return std::vector<T>(N_PLANES, pset.get<T>(key));
} // getValueOrListOf()
the interface to assign them would change to
GaussHitFinderSBN::GaussHitFinderSBN(fhicl::ParameterSet const& pset, art::ProcessingFrame const&)
  : SharedProducer{pset},
  ...
  fChi2NDF(getValueOrListOf<double>(pset, "Chi2NDF")),
  fMaxMultiHit(getValueOrListOf<size_t>(pset, "MaxMultiHit")),
  ...
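The scalar-or-list behaviour of the proposed utility can be tried outside the framework. The sketch below is an assumption-laden illustration: `ScalarOrList` and `valueOrListOf` are hypothetical names, a `std::variant` (C++17) stands in for a FHiCL key that may hold either a single value or a sequence, and the plane count is passed in rather than read from the geometry service.

```cpp
#include <cstddef>
#include <variant>
#include <vector>

// Hypothetical stand-in for a FHiCL value that is either a scalar or a sequence.
template <class T>
using ScalarOrList = std::variant<T, std::vector<T>>;

// Return the list unchanged if one was given; otherwise repeat the scalar
// once per plane, mirroring the two branches of the proposed utility.
template <class T>
std::vector<T> valueOrListOf(ScalarOrList<T> const& value, std::size_t nPlanes)
{
  if (auto const* list = std::get_if<std::vector<T>>(&value)) return *list;
  return std::vector<T>(nPlanes, std::get<T>(value));
}
```

With this shape, an existing configuration that sets a single number keeps working, while a configuration that sets a list gets per-plane values, so no new parameter name is needed.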
Another, less elegant, possibility is to read the object as a sequence (which, if I am not mistaken, is what was happening) and to extend it to NPlanes with the last specified value if necessary (so, …
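The padding alternative described in this comment could look roughly like the following. The helper name `padWithLast` is hypothetical, and the choice to leave lists longer than the plane count untouched is an assumption, since the comment only discusses extending short lists.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: extend a non-empty list up to nPlanes entries by
// repeating its last value. Lists that already have nPlanes or more entries
// are returned unchanged (an assumption; the comment does not cover that case).
template <class T>
std::vector<T> padWithLast(std::vector<T> values, std::size_t nPlanes)
{
  if (values.empty()) throw std::invalid_argument("at least one value is required");
  if (values.size() < nPlanes) {
    T const last = values.back(); // copy first, so resize never reads a moved element
    values.resize(nPlanes, last);
  }
  return values;
}
```

One trade-off of this approach is that a list that is accidentally one entry short is silently completed instead of being reported as a configuration error.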
…ne and Chi2PerPlane
These changes have been implemented thanks to @mattiasotgia!
PetrilloAtWork left a comment
I left two technical comments and support for one of the Copilot comments (the others may be outdated).
In particular, that Copilot comment needs to be addressed.
| std::vector<double> FillOutHitParameterVector(const std::vector<double>& input); | ||
|
|
||
| template<class T> | ||
| inline std::vector<T> getValueOrListOf(fhicl::ParameterSet const& pset, std::string const& key) { |
The inline keyword is redundant in case of:
- template definitions (which must always be treated in an inline-like way);
- definitions of member functions inside the definition of the class (where inline is implicit).
Consider removing it. On the other hand, this function does not need to change the state of the class it belongs to, so it should be declared const. In fact, since this is a general utility that does not depend on the class at all, it could even be declared static.
I recommend one of the two changes, and suggest the former (which allows future caching of the number of planes):
| inline std::vector<T> getValueOrListOf(fhicl::ParameterSet const& pset, std::string const& key) { | |
| std::vector<T> getValueOrListOf(fhicl::ParameterSet const& pset, std::string const& key) const { |
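The two options named in this comment can be compared side by side in a small standalone class. Everything here is illustrative: `HitFinderSketch`, `broadcast`, and `broadcastTo` are hypothetical names, and the cached `fNPlanes` member is only one way the const variant might later reuse the number of planes.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical class used only to contrast the two suggested declarations.
class HitFinderSketch {
public:
  explicit HitFinderSketch(std::size_t nPlanes) : fNPlanes(nPlanes) {}

  // const member template defined in-class: implicitly inline, so the
  // keyword is unnecessary. It reads object state (the cached plane count)
  // but cannot modify it.
  template <class T>
  std::vector<T> broadcast(T value) const
  {
    return std::vector<T>(fNPlanes, value);
  }

  // static alternative: uses no object state at all, so every input
  // must be passed explicitly.
  template <class T>
  static std::vector<T> broadcastTo(std::size_t nPlanes, T value)
  {
    return std::vector<T>(nPlanes, value);
  }

private:
  std::size_t fNPlanes; // cached number of planes
};
```

The const form keeps the door open for caching, while the static form makes it explicit that the helper is independent of any particular object.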
| auto const& wireReadoutGeom = art::ServiceHandle<geo::WireReadout const>()->Get(); | ||
| const unsigned int N_PLANES = wireReadoutGeom.Nplanes(); | ||
|
|
||
| if (pset.is_key_to_sequence(key)) | ||
| return pset.get<std::vector<T>>(key); | ||
| else | ||
| return std::vector<T>(N_PLANES, pset.get<T>(key)); |
Consider enforcing that, when users specify a list of numbers, the list has the right size. If you don't want to do that, or don't want to do it here, then move the discovery of the number of planes into the else branch, where it is needed:
| auto const& wireReadoutGeom = art::ServiceHandle<geo::WireReadout const>()->Get(); | |
| const unsigned int N_PLANES = wireReadoutGeom.Nplanes(); | |
| if (pset.is_key_to_sequence(key)) | |
| return pset.get<std::vector<T>>(key); | |
| else | |
| return std::vector<T>(N_PLANES, pset.get<T>(key)); | |
| if (pset.is_key_to_sequence(key)) | |
| return pset.get<std::vector<T>>(key); | |
| else { | |
| auto const& wireReadoutGeom = art::ServiceHandle<geo::WireReadout const>()->Get(); | |
| const unsigned int N_PLANES = wireReadoutGeom.Nplanes(); | |
| return std::vector<T>(N_PLANES, pset.get<T>(key)); | |
| } |
so that the service is not fetched unless it is strictly required.
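Both options from this comment, enforcing the list size or deferring the plane-count lookup, can be sketched in one standalone function. The names are hypothetical: `perPlaneValues` is not part of the module, the `std::optional` list stands in for a FHiCL key that may or may not be a sequence, and `countPlanes` stands in for the geometry service query.

```cpp
#include <cstddef>
#include <functional>
#include <optional>
#include <stdexcept>
#include <vector>

// Hypothetical sketch: the plane count is requested through a callable so
// that it is only looked up on the paths that actually need it.
template <class T>
std::vector<T> perPlaneValues(std::optional<std::vector<T>> const& list,
                              T const& scalar,
                              std::function<std::size_t()> const& countPlanes,
                              bool enforceSize)
{
  if (list) {
    // A list was given: the count is needed only when the size is checked.
    if (enforceSize && list->size() != countPlanes())
      throw std::invalid_argument("list size does not match the number of planes");
    return *list;
  }
  // Scalar case: the count is always needed to broadcast the value.
  return std::vector<T>(countPlanes(), scalar);
}
```

When `enforceSize` is false and a list is supplied, `countPlanes` is never invoked, which corresponds to not fetching the service unless it is strictly required.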
Description
This PR adds per-plane variables for the GaussHitFinderSBN Chi2NDF and MaxMultiHit parameters. It also sets some parameters to the values that performed best in a grid search over the parameter space.