
Accessing BIGCAT data before it is in CASDA

BIGCAT Data

BIGCAT data is stored in a format called (A)SDM. This is actually a directory structure, with XML files describing the contents of binary blobs, mixed with MIME headers. These (A)SDM datasets will find their way into CASDA, but some people may want access to the data before that happens.

The other reason to access data before it is deposited into CASDA is to get the converted MeasurementSet (MS) that we generate from the (A)SDM. At present, only a customised version of CASA is able to load the BIGCAT (A)SDM files into MS format, so the observatory does this conversion as a service to our users. Eventually we will push our changes to the NRAO CASA so that this step will no longer be required at the observatory.

Let's look inside an example dataset, to make it clear what is contained in it. This is from the dataset 2025-10-21_0329_C3761.


	  ls 2025-10-21_0329_C3761
	  casa-20251021-065949.log  casa-20251023-041228.log  casa-20251023-041620.log
	  raw  raw.fixed  raw.fixed.ms  raw.fixed.ms.flagversions  raw.ms
      
The log files are the output from CASA when we converted the (A)SDM into MS. The raw directory is the (A)SDM dataset, and raw.ms is the MS produced by the conversion. For many datasets, that is all that will be present. For some early BIGCAT datasets, however, a bug in ingest meant that the scans in the raw directory weren't closed properly. To fix this, we made a tool that repairs the dataset: it creates an (A)SDM dataset called raw.fixed, and any MS converted from this is called raw.fixed.ms. If raw.fixed.ms is present in your dataset, you should use it for your data reduction. If it isn't present, your data was not affected by the bug.
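
The rule for choosing which MS to reduce can be sketched as a small shell function. This is only an illustration: the function name pick_ms is hypothetical and not part of any observatory tool. It assumes the dataset has already been downloaded and follows the layout shown in the listing above.

```shell
# Hypothetical helper: print the MeasurementSet to use for data reduction.
# Prefers raw.fixed.ms (from the repaired dataset) and falls back to raw.ms.
# Usage: pick_ms 2025-10-21_0329_C3761
pick_ms() {
    dataset_dir="$1"
    if [ -d "$dataset_dir/raw.fixed.ms" ]; then
        # The repair tool was run, so the fixed MS is the one to use.
        echo "$dataset_dir/raw.fixed.ms"
    elif [ -d "$dataset_dir/raw.ms" ]; then
        # No repaired copy: the dataset was not affected by the bug.
        echo "$dataset_dir/raw.ms"
    else
        echo "no MeasurementSet found in $dataset_dir" >&2
        return 1
    fi
}
```

For the example dataset above, which contains both raw.ms and raw.fixed.ms, the function would print the path ending in raw.fixed.ms.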

To access data from BIGCAT before it has made its way into CASDA, we recommend the following methods.

IMPORTANT NOTE: Only download data that is from your project; do not download other people's data.

Getting the whole dataset

To get the entire dataset, including (A)SDM and MS directories, you can use the following very simple command, remembering to replace the usr123 with your ATNF Linux username, and the 2025-10-21_1330_C9999 with the actual name of your dataset.

	rsync --progress -a usr123@venice.atnf.csiro.au:/DATA/BIGCAT_1/2025-10-21_1330_C9999 .
      
This command should be executed in the directory on your machine where you want to download the data to. You will also need to have SSH access to venice, which is described here.
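
Before starting a long transfer, it may be worth confirming that your SSH access works and that the dataset name is right. The sketch below is one way to do that. The function name check_dataset is hypothetical, and it assumes your account on venice is allowed to run ls over SSH, which this page does not confirm.

```shell
# Hypothetical helper: confirm SSH access to venice and that the dataset
# directory exists before starting a long transfer.
# Usage: check_dataset usr123 2025-10-21_1330_C9999
check_dataset() {
    user="$1"
    dataset="$2"
    # ls -d lists the directory itself; ssh passes back its exit status.
    if ssh "${user}@venice.atnf.csiro.au" "ls -d /DATA/BIGCAT_1/${dataset}" >/dev/null; then
        echo "OK: ${dataset} is reachable"
    else
        echo "Could not reach ${dataset}; check your SSH access and the dataset name" >&2
        return 1
    fi
}
```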

Getting just the MeasurementSet

You can use rsync to only download the MS instead of the (A)SDM if you want, using the command:

	  rsync --progress -a --include="*/" --include="*.ms/*" --include="*.ms/*/*" --exclude="*" \
	  usr123@venice.atnf.csiro.au:/DATA/BIGCAT_1/2025-10-21_1330_C9999 .
      

You may wish to do this because the MS is essentially a complete duplicate of the data contained in the (A)SDM, so downloading both roughly doubles the download size. By downloading only the MS, especially on a slower or metered connection, you still receive your data, but with a much smaller download.
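
To see how much data the MS-only filter would transfer before committing to the download, rsync's standard -n (--dry-run) and --stats options can be added to the same command. The wrapper below is a sketch: the function name preview_ms_download is hypothetical, and the host, path and filters are copied unchanged from the command above.

```shell
# Hypothetical helper: report what the MS-only filter would transfer,
# without downloading anything.
# Usage: preview_ms_download usr123 2025-10-21_1330_C9999
preview_ms_download() {
    user="$1"
    dataset="$2"
    # -n (--dry-run) makes no changes; --stats prints transfer totals.
    rsync -a -n --stats \
        --include="*/" --include="*.ms/*" --include="*.ms/*/*" --exclude="*" \
        "${user}@venice.atnf.csiro.au:/DATA/BIGCAT_1/${dataset}" .
}
```

Running the same dry run without the --include and --exclude options gives the corresponding totals for the whole dataset, so the two can be compared.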


Original: Jamie Stevens (31-Oct-2025)
Modified: Jamie Stevens (31-Oct-2025)