BIGCAT Change Log
Known Bugs
- Ingest will crash when trying to write out integrations that timed out because some of the required subbands did not arrive.
- Some zoom configurations will cause some correlators to crash shortly after being given a frequency configuration.
- Caobs will try to continue observations even if not all correlators start, which is unintended behaviour.
- Caobs does not handle malformed schedules very gracefully, and has been known to crash when told to load them. We advise all observers to carefully ensure that their schedule file is valid. A common issue is when a UTC schedule is specified but not all scans are labelled with UTC times. For most normal observations, UTC schedules are not what you want anyway, and you should choose relative timing in the scheduler.
- SSPD has a severe memory leak when asked to produce and display integrated data, and so should not be used to do so.
- It is easy to get the system into an uncontrollable state by starting more than one caobs_server.
- Starting a schedule that has both 16cm and 4cm/15mm/7mm scans, and starting on a 16cm scan, will cause the correlators that aren't required for processing 16cm subbands to crash. Then, when switching to a scan requiring more than 15 subbands, no additional correlators will be started, and this will cause ingest crashes (see the ingest bug listed above).
- When automatically stopping the correlator and starting new ones for a new scan, when that new scan is at 16cm, caobs will not necessarily wait for the "correct" correlators to be ready, and thus may not actually start new correlators, even though it thinks correlators will start. This will result in no data being produced, and will cause ingest to crash.
- Safety stows do not work.
- Caobs does not check that the correlator has properly shut down before restarting if a start or track command is issued too quickly after a corr stop.
- Occasionally a correlator process does not shut down cleanly and gets stuck. This means no data from that GPU for future scans, and eventually ingest crashes. ^C the process and re-run start_correlator.
- Ingest does not support more than a single frequency configuration in a dataset.
- Ingest regularly fails to close scans properly in the output (A)SDM. This results in datasets that cannot be converted to MeasurementSet. Thankfully, the fix for this is easy, using the fixasdm.py script on panthera-ingest.
- Caobs fails to control the correlator if the first scan that launches the correlator is stopped before the correlator is properly instructed to begin processing data. The only fix for this is to shut everything down and restart it.
- Caobs cannot properly tell when the antenna is on-source after instructing the antenna to change receivers. The fix for this is to stop the scan and restart it, at which point caobs will recognise that the correct receiver is on axis (if it is).
- Caobs can do pointing scans and determine the az/el offsets from them, but cannot use the updated model for scans that ask for the updated model. This is less a bug than a feature that has yet to be implemented.
- The 16cm delay solution is not very good, and the mm delay solution has not yet been tested.
- Runcorr.py will exit if a Config message is sent without the subbands that it is controlling. This is not what we want most of the time.
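The malformed-schedule bug in the list above names one common cause: a UTC schedule in which not every scan carries a UTC time. A check along the following lines could catch that case before the schedule is loaded. This is only a sketch: the `Scan` class, the `timing` string and the function name are hypothetical stand-ins, since the real schedule file format and the caobs loader are not described here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Scan:
    """Hypothetical in-memory scan record; not the real schedule format."""
    name: str
    utc_start: Optional[str] = None  # None means the scan has no UTC label


def find_unlabelled_scans(timing: str, scans: List[Scan]) -> List[str]:
    """Return the names of scans lacking a UTC time in a UTC-timed schedule."""
    if timing != "utc":
        # Relative timing needs no per-scan UTC labels.
        return []
    return [scan.name for scan in scans if not scan.utc_start]


# Example: the second scan is missing its UTC time.
scans = [Scan("1934-638", "2026-01-21T07:30:00"), Scan("0823-500")]
problems = find_unlabelled_scans("utc", scans)
if problems:
    print("Scans without UTC times:", ", ".join(problems))
```

Running a check like this first gives the observer a list of offending scans to fix, rather than a crash on load.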
2026-01-21
Changes to caobs
- Fix the safety stow bug, which was the result of the stow starting but then being told immediately to stop when the observation stops. The observation now stops, but the antennas continue to stow.
- Fix the issue with starting correlators for 16cm scans, so that caobs will now wait for the correct correlators to be ready before trying to start.
- Stop trying to use the same correlator configuration for all scans, to prevent starting more correlators than required for 16cm scans.
- Reduce the wait time for starting scans now that the correlator starts up much more quickly, and that the packet loss issue seems to be solved.
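The "wait for the correct correlators" behaviour described above amounts to polling until a required set is a subset of the ready set, with a timeout. The sketch below illustrates that idea only; it is not the caobs implementation, and `wait_for_correlators`, `get_ready` and the correlator names are hypothetical.

```python
import time
from typing import Callable, Iterable, Set


def wait_for_correlators(required: Iterable[str],
                         get_ready: Callable[[], Set[str]],
                         timeout_s: float = 30.0,
                         poll_s: float = 0.5) -> bool:
    """Poll until every required correlator reports ready, or time out.

    get_ready is an assumed callback returning the names of the
    correlators that are currently ready.
    """
    needed = set(required)
    deadline = time.monotonic() + timeout_s
    while True:
        # Extra ready correlators are fine; only the required ones matter.
        if needed <= get_ready():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)


# Example with a fake status source: both required correlators are ready.
ok = wait_for_correlators(["corr01", "corr02"],
                          lambda: {"corr01", "corr02", "corr03"},
                          timeout_s=1.0, poll_s=0.1)
```

Returning `False` on timeout leaves the caller free to decide whether to abort the scan rather than carry on without all of the correlators.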
Changes to correlator
- Greatly improve the start-up and shut-down times using inter-process signalling between runcorr and the correlator processes, and by parallelising memory cleanup at the end.
Known Bugs
- Ingest will crash when trying to write out integrations that timed out because some of the required subbands did not arrive.
- Some zoom configurations will cause some correlators to crash shortly after being given a frequency configuration.
- Caobs will try to continue observations even if not all correlators start, which is unintended behaviour.
- Caobs does not handle malformed schedules very gracefully, and has been known to crash when told to load them. We advise all observers to carefully ensure that their schedule file is valid. A common issue is when a UTC schedule is specified but not all scans are labelled with UTC times. For most normal observations, UTC schedules are not what you want anyway, and you should choose relative timing in the scheduler.
- SSPD has a severe memory leak when asked to produce and display integrated data, and so should not be used to do so.
Software versions used
- tag: production_2026-01-21
- mc (github): 238fc2911f94cc51e37da51fe1f4c3c69d51b9de
- correlator (bitbucket): 141041421b4d616089fb80a35b3242cc835adbc7
- ingest (github): f2431615f1e6512388b3b2fde4a6c39fe0d9e107
2025-12-19
Changes to caobs
- BIGCAT RF is now installed, and caobs can now control the new hardware and use four 2 GHz IFs.
- Enable automatic correlator configuration changes when a scan requires them.
- Enable updating of MoniCA observation points, although currently only the array configuration is being updated.
- Detect whether ingest has been restarted, and close any open datasets when that happens. Caobs also emits a warning message to the user if ingest starts or stops.
- Update the pointing update algorithm for use with BIGCAT RF.
Changes to correlator
- Fix packet loss problem when starting the correlator processes.
- Fix problem with runcorr not regaining control when the correlator is told to stop.
Changes to ingest
- BIGCAT RF is now installed, and ingest can now handle 60 subbands.
- Fix bug which prevented more than one frequency configuration being stored in a single dataset.
- Fix bug where end times were not being written for some of the XML tables.
- Subbands now always appear in the (A)SDM file in increasing frequency order, regardless of the actual sideband.
- Change the Tsys scaling to use only a single value per subband. The scaling now also properly discards erroneous GTP/SDO values.
- Increase the parallelisation of all the integrator workers, to allow for the extra work ingest has to do with the larger data rates.
- Allow for integrations to get serviced slowly if required.
- Remove zoom bands from the data published to nvis.
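The subband ordering change above can be pictured as a sort on frequency that ignores the sideband. The sketch below assumes a hypothetical `Subband` record with a centre frequency in MHz; the actual ingest data structures are not shown in this change log.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Subband:
    """Hypothetical subband description; field names are assumptions."""
    index: int
    centre_mhz: float
    sideband: str  # "USB" or "LSB"


def frequency_order(subbands: List[Subband]) -> List[Subband]:
    """Return subbands in increasing centre frequency, whatever the sideband."""
    return sorted(subbands, key=lambda sb: sb.centre_mhz)


# In a lower-sideband IF the subband index runs against frequency.
lsb = [Subband(0, 2100.0, "LSB"), Subband(1, 1972.0, "LSB"), Subband(2, 1844.0, "LSB")]
ordered = frequency_order(lsb)
print([sb.index for sb in ordered])
```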
Other changes
- SSPD now only connects to the subbands selected by the user, which should make it more reliable and need fewer resources.
Known Bugs
- Starting a schedule that has both 16cm and 4cm/15mm/7mm scans, and starting on a 16cm scan, will cause the correlators that aren't required for processing 16cm subbands to crash. Then, when switching to a scan requiring more than 15 subbands, no additional correlators will be started, and this will cause ingest crashes (see below).
- Ingest will crash when trying to write out integrations that timed out because some of the required subbands did not arrive.
- When automatically stopping the correlator and starting new ones for a new scan, when that new scan is at 16cm, caobs will not necessarily wait for the "correct" correlators to be ready, and thus may not actually start new correlators, even though it thinks correlators will start. This will result in no data being produced, and will cause ingest to crash.
- Safety stows do not work.
- Caobs does not handle malformed schedules very gracefully, and has been known to crash when told to load them. We advise all observers to carefully ensure that their schedule file is valid. A common issue is when a UTC schedule is specified but not all scans are labelled with UTC times. For most normal observations, UTC schedules are not what you want anyway, and you should choose relative timing in the scheduler.
- SSPD has a severe memory leak when asked to produce and display integrated data, and so should not be used to do so.
Software versions used
- tag: production_2025-12-19
- mc (github): 33445ff0311e57f66fdb0bfd78483cc94195c865
- correlator (bitbucket): abec88075af64afe540d5f1969591c93be3b06d2
- ingest (github): eaf6b5c3db5cb418d7a14059ff4e731f8827f992
2025-10-30
Changes to caobs
- Made another attempt at fixing a bug that may occasionally cause the correlator to be told to run, but not have it produce data, by limiting the Pause command to only be issued when the correlator is actually running.
- Fix an issue where the mm LO would not be set after the initial setting.
Changes to correlator
- Fixed runcorr.py to clear old information from previous configure messages when receiving new messages. This bug I think was responsible for runcorr crashes when switching to 16cm, where fewer subbands are specified; without clearing the old information, runcorr was launching correlators that were not included in the Config message, and thus would immediately exit with an error.
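The stale-configuration problem described above can be illustrated with a small state holder that clears itself before applying each new message. This is a sketch of the general pattern, not the runcorr.py code: the `ConfigState` class and the message layout (a dict with a `"subbands"` mapping) are assumptions made for illustration.

```python
from typing import Dict, Iterable, List


class ConfigState:
    """Hypothetical holder for the subbands named in the latest Config message."""

    def __init__(self) -> None:
        self.subbands: Dict[int, dict] = {}

    def apply(self, message: dict) -> None:
        # Clear first, so subbands from an earlier, larger configuration
        # do not linger when the new message names fewer of them.
        self.subbands.clear()
        self.subbands.update(message.get("subbands", {}))

    def to_launch(self, controlled: Iterable[int]) -> List[int]:
        """Subbands this process controls that the current config asks for."""
        return [sb for sb in controlled if sb in self.subbands]


state = ConfigState()
state.apply({"subbands": {1: {}, 2: {}, 3: {}}})  # wide configuration
state.apply({"subbands": {1: {}}})                # narrower configuration
launch = state.to_launch([1, 2, 3])
```

Without the `clear()` call, the second message would leave subbands 2 and 3 in place, and they would be launched even though the latest message did not include them.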
Known Bugs
- Caobs does not handle malformed schedules very gracefully, and has been known to crash when told to load them. We advise all observers to carefully ensure that their schedule file is valid. A common issue is when a UTC schedule is specified but not all scans are labelled with UTC times. For most normal observations, UTC schedules are not what you want anyway, and you should choose relative timing in the scheduler.
- Issuing a "corr stop" in caobs before the correlator has started (i.e. shortly after the first start/track command is given) will always fail, and make it difficult or impossible to control the correlator without restarting everything. To avoid this problem, we recommend observers carefully start their schedule so they don't need to stop and restart immediately, and if they do make a mistake, to either restart without "corr stop" or to wait for a minute before doing that.
- The 7mm delay solution has not yet been tested.
- SSPD has a severe memory leak when asked to produce and display integrated data, and so should not be used to do so.
Software versions used
- tag: production_2025-10-30
- mc: cdebdcd4103db89776e2824ad42789b4271d5a09
- correlator: 798ee3252bd28fd0988346262141854bae76253d
- ingest: 9b7db7b3a50e496ce8859158c1dc1bd82f5089bc
2025-10-28
Changes to caobs
- Fixed a bug that might have been causing the correlator to run but not produce data, which is a failure mode that we have seen. Caobs now no longer issues the same scan number more than once per correlator run.
- Enable Update pointing scans to work properly, along with the Offset pointing type. The pointing corrections made are now broadcast to clients and shown in the caobs command-line client.
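The scan-number fix above relies on never handing out the same number twice within one correlator run. A counter that only resets when the correlator restarts is one simple way to get that property. The sketch below is illustrative only; `ScanNumbers` and its `new_run` hook are hypothetical and not taken from caobs.

```python
class ScanNumbers:
    """Hands out scan numbers that never repeat within one correlator run."""

    def __init__(self) -> None:
        self._last = 0

    def new_run(self) -> None:
        # Assumed hook: called whenever the correlator is (re)started.
        self._last = 0

    def next(self) -> int:
        # A restarted scan gets a fresh number rather than reusing the old one.
        self._last += 1
        return self._last


numbers = ScanNumbers()
first_run = [numbers.next() for _ in range(3)]
numbers.new_run()
second_run = [numbers.next() for _ in range(2)]
```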
Changes to correlator
- Move to the correlator version that supports VLBI tied-array mode. No changes to non-VLBI operations should be noticed.
Other changes
- Corrected 15mm delay model.
- Fixed phase corrections for caobs, so now all frequency combinations within the 16cm, 4cm and 15mm bands should look alright, with no phase discontinuities.
Known Bugs
- Caobs does not handle malformed schedules very gracefully, and has been known to crash when told to load them. We advise all observers to carefully ensure that their schedule file is valid. A common issue is when a UTC schedule is specified but not all scans are labelled with UTC times. For most normal observations, UTC schedules are not what you want anyway, and you should choose relative timing in the scheduler.
- Issuing a "corr stop" in caobs before the correlator has started (i.e. shortly after the first start/track command is given) will always fail, and make it difficult or impossible to control the correlator without restarting everything. To avoid this problem, we recommend observers carefully start their schedule so they don't need to stop and restart immediately, and if they do make a mistake, to either restart without "corr stop" or to wait for a minute before doing that.
- The 7mm delay solution has not yet been tested.
- SSPD has a severe memory leak when asked to produce and display integrated data, and so should not be used to do so.
Software versions used
- tag: production_2025-10-28
- mc: 09b925f0be505ade6fd0f6dc48e9e73a04b2778a
- correlator: 5fa82d1f9182a8ca598a697fb8796e9a288a3953
- ingest: 9b7db7b3a50e496ce8859158c1dc1bd82f5089bc
2025-10-23 Revision 2
After observations using the production_2025-10-23 software showed that some of the bugs that we thought were fixed were not, we made some further alterations in the green time starting at 07:30 UTC.
Changes to caobs
- Fixed a bug that was causing scan dataset labelling issues when sending scans in advance, which is normal behaviour. This doesn't seem to have caused any issues with the output data, but was causing error messages from ingest.
Changes to ingest
- Ingest now seems to properly close scans when it should. Previous testing was done only by stopping and starting scans, not running the schedule with multiple scans in the same execution block, but now both modes of observing seem to work as intended.
Changes to correlator
- Fix runcorr.py to launch only those GPU streams that are requested by caobs. While runcorr.py continues to output messages saying that a GPU stream it supports was not requested, this no longer causes runcorr.py to exit.
Known Bugs
- Caobs does not handle malformed schedules very gracefully, and has been known to crash when told to load them. We advise all observers to carefully ensure that their schedule file is valid. A common issue is when a UTC schedule is specified but not all scans are labelled with UTC times. For most normal observations, UTC schedules are not what you want anyway, and you should choose relative timing in the scheduler.
- Issuing a "corr stop" in caobs before the correlator has started (i.e. shortly after the first start/track command is given) will always fail, and make it difficult or impossible to control the correlator without restarting everything. To avoid this problem, we recommend observers carefully start their schedule so they don't need to stop and restart immediately, and if they do make a mistake, to either restart without "corr stop" or to wait for a minute before doing that.
- Caobs can do pointing scans and determine the az/el offsets from them, but cannot use the updated model for scans that ask for the updated model. This is less a bug than a feature that has yet to be implemented.
- The mm delay solution has not yet been tested.
- SSPD has a severe memory leak when asked to produce and display integrated data, and so should not be used to do so.
Software versions used
- tag: production_2025-10-23_rev2
- mc: 51700b07ec8883a66a2eb55689d76bb8e2af724e
- correlator: 30267297dde3beedf70b7e35bac38bbc402968a5 (caveat: only the runcorr.py changes from the previous commit have been incorporated; the old correlator binaries are still being used)
- ingest: 9b7db7b3a50e496ce8859158c1dc1bd82f5089bc
2025-10-23
Changes to caobs
- Fixed the issue that was causing on-source flagging problems when changing bands.
- Improved handling of start/stop/start sequences issued before the correlator is properly configured by the first start. This should mean the system is much less likely to lose control of the correlator processes.
- Fixed the caobs_server launcher on skull, to make it less likely to launch more than one server at a time.
Changes to ingest
- Fixed issues that were preventing correct writing of more than one frequency configuration to a single dataset; this should (and has been tested to) work now.
- Ingest is now much better at closing scans in the dataset.
- Changed the output of ingest in the terminal to be less unnecessarily verbose.
Other changes
- Corrected 16cm delay model.
Known Bugs
- Issuing a "corr stop" in caobs before the correlator has started (i.e. shortly after the first start/track command is given) will always fail, and make it difficult or impossible to control the correlator without restarting everything. To avoid this problem, we recommend observers carefully start their schedule so they don't need to stop and restart immediately, and if they do make a mistake, to either restart without "corr stop" or to wait for a minute before doing that.
- Ingest sometimes fails to close scans properly in the output (A)SDM, although the situation has improved in this software release. This results in datasets that cannot be converted to MeasurementSet. Thankfully, the fix for this is easy, using the fixasdm.py script on panthera-ingest.
- Caobs can do pointing scans and determine the az/el offsets from them, but cannot use the updated model for scans that ask for the updated model. This is less a bug than a feature that has yet to be implemented.
- The mm delay solution has not yet been tested.
- SSPD has a severe memory leak when asked to produce and display integrated data, and so should not be used to do so.
- Runcorr.py will exit if a Config message is sent without the subbands that it is controlling. This is not what we want most of the time.
Software versions used
- tag: production_2025-10-23
- mc: 9f6bdf7ddcbdb1f72ef3c9c78befce6bdd9a3605
- correlator: 2ac59b18dbaf5cbd534f84e21b0add36a7be03d2
- ingest: 9b61c29539696d96ddb9f57a5a71e684a9be8b67
2025-10-21
BIGCAT operations begin.
Original: Jamie Stevens (23-Oct-2025)
Modified: Jamie Stevens (23-Oct-2025)
Modified: Chris Phillips (23-Dec-2025)
