Programs to generate XEphem 3.4 xe2 catalogs



Elwood Downey's XEphem 3.4 is a popular UNIX/X-based graphical astronomical ephemeris program. Previous versions of XEphem used some publicly-distributed star catalogs in a compact binary format with filenames ending in ".xe". Starting with XEphem 3.4, the catalogs use a different format with filenames ending in ".xe2", and are distributed only on the XEphem 3.4 CD-ROM.

Since there are public sources for several major star catalogs and the xe2 file format is described in the XEphem source code, it's fairly easy to process that data into xe2 files usable by XEphem. I've written a couple of programs to do that. You might also find this information of use if you want to import other sources of catalog data into XEphem xe2 files, or make your own customized catalogs.

Data sources

If you want to run these programs yourself to produce customized versions of the XEphem catalogs, you'll need to start with these catalog data files. Links are to the FTP directories, since in the case of the Tycho-2 catalog you'll need to get more than one file. Each FTP directory also includes a ReadMe file describing the catalog file format. Regional mirrors are listed below: the Centre de Données astronomiques de Strasbourg for Europe, and NASA's Goddard Space Flight Center Astronomical Data Center for North America.

Tycho-2
Hipparcos
PPM North
PPM South
PPM Supplement

You will need a lot of disk space to work with these files; the uncompressed version of the Tycho-2 catalog tyc2.dat is over 520 megabytes in size, for example.

To get the full Tycho-2 catalog data file tyc2.dat, you'll need to uncompress and concatenate the split files tyc2.dat.00.gz through tyc2.dat.19.gz with a UNIX command like:

zcat tyc2.dat.??.gz >tyc2.dat

To get the full PPM catalog data file ppm.dat, you'll need to concatenate the three data files, which are named either ppm1, ppm2, ppm3 or ppmnorth.dat.gz, ppmsouth.dat.gz, ppm3.gz, with a UNIX command like one of these:

cat ppm1 ppm2 ppm3 >ppm.dat

zcat ppmnorth.dat.gz ppmsouth.dat.gz ppm3.gz >ppm.dat

You can uncompress the file hip_main.dat.gz to get hip_main.dat with a UNIX command like:

gunzip hip_main.dat.gz

Building and running the processing programs

This assumes some familiarity on your part with how to build C programs in a UNIX environment. If you're not that familiar with how to compile and run C programs, please try to get help locally before pestering me.

Download makexe2.tar.gz and use zcat makexe2.tar.gz | tar xvf - to extract the source code (all files will be placed in a subdirectory named makexe2 relative to wherever you extract it), or download all the individual files here.

You may need to edit the Makefile to customize it for your system's C compiler and programming environment, although the provided Makefile should work on a lot of different systems.

Run make in the makexe2 directory to build the programs. If you get errors, this probably means you need to tweak the Makefile. If you can't fix the errors by changing the Makefile, then maybe I made some nonportable assumption in my code. If you figure out how to fix it, feel free to drop me a note.

The two programs that will be built are named tyc2xe2 and ppmxe2.

tyc2xe2 will process the Hipparcos catalog hip_main.dat and the Tycho-2 catalog tyc2.dat and produce the two files hip.xe2 and tycho2.xe2.

ppmxe2 will process the PPM catalog ppm.dat and produce the file ppm1.xe2.

Each program requires an amount of memory comparable to the size of the catalog it generates. tyc2xe2 will require at least 44 megabytes of memory, and ppmxe2 will require at least 9 megabytes of memory. This is because they need to sort the output data, and the expedient way to do this was simply to hold it all in memory and use the C qsort() function to do the sorting.
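The hold-everything-in-memory approach can be pictured roughly as follows. This is only a sketch and not the code from tyc2xe2.c or ppmxe2.c: the struct star_rec layout, the sort_stars() name, and the choice of declination then right ascension as the sort key are all assumptions made for illustration. The real record layout and ordering come from the xe2 format in the XEphem source.

```c
#include <stdlib.h>

/* Hypothetical in-memory record; the real xe2 layout and sort key are
   defined by the XEphem source, not by this sketch. */
struct star_rec {
    double ra;    /* right ascension (unit assumed) */
    double dec;   /* declination (unit assumed) */
    float  mag;   /* visual magnitude */
};

/* qsort() comparison function: order by declination, then by right
   ascension. This key is an assumption chosen only for illustration. */
int cmp_star(const void *a, const void *b)
{
    const struct star_rec *x = a;
    const struct star_rec *y = b;

    if (x->dec != y->dec)
        return (x->dec < y->dec) ? -1 : 1;
    if (x->ra != y->ra)
        return (x->ra < y->ra) ? -1 : 1;
    return 0;
}

/* Sort the whole catalog in place once every record has been read. */
void sort_stars(struct star_rec *stars, size_t count)
{
    qsort(stars, count, sizeof stars[0], cmp_star);
}
```

Because qsort() works on one contiguous array, every record has to be resident at once, which is where a memory requirement comparable to the output size comes from.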

On my Alphastation 255/233, tyc2xe2 takes about 5 minutes to run, and ppmxe2 takes less than a minute. Your run times will depend on the sort of system you run them on, but even on a slower system they should be quite tolerable.

How the catalog data is processed

tyc2xe2 uses both the Hipparcos and Tycho-2 catalog data because each supplies something the other lacks. The Hipparcos catalog includes the standard visual magnitude and spectral class data for the stars it lists, but does not use the J2000 coordinate system for right ascension and declination values. The Tycho-2 catalog includes coordinates in the J2000 system expected by XEphem, but does not include spectral class information or standard magnitude information.

Stars that have valid Hipparcos magnitude and spectral class data will be shown with the designation "HIP" in XEphem; otherwise the "Tnnnn-nnnnn-n" designation will be shown. Stars from the Tycho-2 catalog that do not correspond to Hipparcos stars have their magnitude calculated according to the formula in the Tycho-2 catalog's ReadMe file if both the "BT" and "VT" magnitudes are available; otherwise whichever of the "BT" or "VT" magnitudes is available is used. Stars without J2000 position data in the Tycho-2 catalog are ignored. If the star is listed as some type of multiple in the Tycho-2 catalog, it will be shown as a "double star" in XEphem.
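The magnitude fallback for Tycho-2 stars can be sketched like this. The approximation V = VT - 0.090 * (BT - VT) is the one given in the Tycho-2 ReadMe for approximate Johnson photometry; check it against your copy of the ReadMe. The tycho_vmag() name and the NO_MAG marker for a blank field are hypothetical and are not taken from tyc2xe2.c.

```c
/* Hypothetical marker meaning "this magnitude field was blank". */
#define NO_MAG 99.0

/* Approximate Johnson V magnitude from the Tycho BT and VT magnitudes.
   With both values present, apply V = VT - 0.090 * (BT - VT).
   With only one present, return that one unchanged.
   With neither present, return NO_MAG. */
double tycho_vmag(double bt, double vt)
{
    int have_bt = (bt != NO_MAG);
    int have_vt = (vt != NO_MAG);

    if (have_bt && have_vt)
        return vt - 0.090 * (bt - vt);
    if (have_vt)
        return vt;
    if (have_bt)
        return bt;
    return NO_MAG;
}
```

For a star with BT = 10.0 and VT = 9.0 the correction is small: V comes out about 0.09 magnitudes brighter than VT.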

As the PPM catalog data is very regular (all stars listed have position data in J2000) its processing is much more straightforward. Stars from the PPM catalog will be shown with their "PPM" designation in XEphem, and will be shown as a "double star" if flagged as a multiple.

Overall, my approach was to avoid doing complicated processing on the catalog data (such as precession or proper motion corrections for positions not given in J2000) while otherwise trying to make the data as complete as possible. However, I'm not a professional astronomer, so I'd welcome any reports of errors in the processed catalog data or suggestions for how to improve the quality of the catalogs.

Pre-made catalog data files

Of course, you may not be interested in building the catalogs yourself, so here are pre-made versions. Please note that these files are large, so be patient when downloading them. Also, please do not bookmark the file locations directly -- currently I'm serving this from my personal system, but can move the catalog data to a higher-capacity server should it prove popular enough to affect my use of this system.

Notes on the source code

You may notice that the source code files include a short copyright notice with what I think are fairly generous terms for their use and redistribution. If you think these terms would somehow unduly restrict your use of this code, I might be persuaded to grant you different terms. The basic idea is that I'm not planning to make any money off this stuff and you probably shouldn't either, and you're perfectly free to modify and redistribute the code as long as I get credit for the portions I wrote.

packxe2.c is just a routine to create an xe2 record from a more C-friendly representation of the same data in a struct xe2data as defined in packxe2.h. It does some basic range-checking on most of the input data and returns 0 if anything falls out of range. For all but Tycho-2 stars, you just need to fill the num1 structure member with the catalog number of the star. For a Tycho-2 star, you put the triple of catalog numbers in the respective three num1, num2, num3 members. packxe2() will figure out which members to use based on how you set namecode.
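To illustrate the idea of a namecode selecting which of the num members matter, here is a small sketch that formats a designation string. Everything in it is a hypothetical stand-in: the struct star_id type, the enum values, the format_designation() name, and the exact output formats are invented for this example. The real struct xe2data and its namecode values are in packxe2.h.

```c
#include <stdio.h>

/* Hypothetical stand-ins; the real struct xe2data and its namecode
   values are defined in packxe2.h and will differ from these. */
enum name_code { NAME_HIP, NAME_PPM, NAME_TYCHO2 };

struct star_id {
    enum name_code namecode;  /* selects which num members are meaningful */
    long num1, num2, num3;    /* num2 and num3 matter only for Tycho-2 */
};

/* Write a designation for the star into buf (at most size bytes).
   Returns 1 on success, or 0 if the namecode is not recognized. */
int format_designation(const struct star_id *id, char *buf, size_t size)
{
    switch (id->namecode) {
    case NAME_HIP:
        snprintf(buf, size, "HIP %ld", id->num1);
        return 1;
    case NAME_PPM:
        snprintf(buf, size, "PPM %ld", id->num1);
        return 1;
    case NAME_TYCHO2:
        /* Only this case reads all three catalog numbers. */
        snprintf(buf, size, "T%ld-%ld-%ld", id->num1, id->num2, id->num3);
        return 1;
    }
    return 0;
}
```

The caller fills in all the members it has and sets namecode; the routine decides which ones to read, so single-number catalogs can leave num2 and num3 unset.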

readcat.c has some utility functions for extracting certain types of data from catalog lines, with some basic data validation. They're designed to use the field starting and ending locations exactly as shown in the catalog ReadMe files, which are 1-based (the first character in a line has index 1) rather than 0-based the way C strings are normally indexed. I suppose if I get more ambitious I could write an automatic translator that produces the appropriate C source to parse entire catalog records using those functions. I found it easiest to have these functions return special values as defined in readcat.h rather than use a separate flag to return information about whether a valid value was found in the field.

get_char(line, index, &return) just pulls out the character at a given index in the line; '\0' is put in the returned character if the index is out of range.

get_string(line, start, end, string) tries to extract a string in the field between start and end. Leading and trailing spaces are removed. Make sure string has room for the full field width of end-start+1 characters, plus a terminating '\0'. It may give you an empty string if the field contained only spaces.
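The 1-based column convention and the space trimming can be sketched as below. This is not the code from readcat.c; the field_string() name and its return convention (trimmed length, or -1 for an invalid column range) are assumptions made for this example.

```c
#include <string.h>

/* Copy the field in 1-based columns start..end of line into out,
   dropping leading and trailing spaces. out needs room for
   end - start + 2 bytes (the field width plus the terminating '\0').
   Returns the length of the trimmed string, or -1 if the column range
   is invalid. A field lying past the end of the line gives "". */
int field_string(const char *line, int start, int end, char *out)
{
    size_t len = strlen(line);
    size_t first, last;                 /* 0-based; last is exclusive */

    out[0] = '\0';
    if (start < 1 || end < start)
        return -1;

    first = (size_t)start - 1;          /* column 1 is line[0] */
    last = (size_t)end;                 /* column end is line[end - 1] */
    if (last > len)
        last = len;                     /* short line: stop at its end */
    if (first >= last)
        return 0;                       /* nothing of the field is present */

    while (first < last && line[first] == ' ')
        first++;
    while (last > first && line[last - 1] == ' ')
        last--;

    memcpy(out, line + first, last - first);
    out[last - first] = '\0';
    return (int)(last - first);
}
```

With the line "AB  CD  EF", columns 3 through 8 hold "  CD  ", so the extracted string is "CD".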

get_integer(line, start, end, &integer) tries to parse a (base-10) integer in the field using the C strtol() function. If there are no digits in the field, BAD_INT is stored in integer.

Finally, get_fp(line, start, end, &fp) tries to parse a floating-point number in the field using the C strtod() function. If a valid floating-point number is not found in the field, BAD_DOUBLE is stored in fp.
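The special-value style of error reporting with strtol() and strtod() might look something like the sketch below. It is not the readcat.c implementation. The function names and the SKETCH_BAD_INT and SKETCH_BAD_DOUBLE values are hypothetical, and for brevity these versions return the parsed value directly rather than storing it through a pointer argument.

```c
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sentinels; readcat.h defines the real BAD_INT and
   BAD_DOUBLE, which may have different values. */
#define SKETCH_BAD_INT    LONG_MIN
#define SKETCH_BAD_DOUBLE (-1.0e300)

/* Copy 1-based columns start..end of line into buf as a C string.
   Returns 1 on success, or 0 if the range is invalid, starts past the
   end of the line, or does not fit in buf. */
static int copy_field(const char *line, int start, int end,
                      char *buf, size_t bufsize)
{
    size_t len = strlen(line);
    size_t first, width;

    if (start < 1 || end < start || (size_t)start > len)
        return 0;
    first = (size_t)start - 1;            /* column 1 is line[0] */
    width = (size_t)(end - start) + 1;
    if (width > len - first)
        width = len - first;              /* short line: take what is there */
    if (width >= bufsize)
        return 0;
    memcpy(buf, line + first, width);
    buf[width] = '\0';
    return 1;
}

/* Parse a base-10 integer from the field, or return SKETCH_BAD_INT. */
long field_long(const char *line, int start, int end)
{
    char buf[64];
    char *stop;
    long value;

    if (!copy_field(line, start, end, buf, sizeof buf))
        return SKETCH_BAD_INT;
    value = strtol(buf, &stop, 10);
    return (stop == buf) ? SKETCH_BAD_INT : value;    /* nothing parsed */
}

/* Parse a floating-point number from the field, or return
   SKETCH_BAD_DOUBLE. */
double field_double(const char *line, int start, int end)
{
    char buf[64];
    char *stop;
    double value;

    if (!copy_field(line, start, end, buf, sizeof buf))
        return SKETCH_BAD_DOUBLE;
    value = strtod(buf, &stop);
    return (stop == buf) ? SKETCH_BAD_DOUBLE : value; /* nothing parsed */
}
```

Both strtol() and strtod() skip leading white space and leave the end pointer at the start of the buffer when nothing could be converted, which is what makes a blank field detectable. This sketch accepts trailing characters after a valid number; a stricter version would also check what the end pointer stopped on.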

Of course, tyc2xe2.c and ppmxe2.c show plenty of examples of how to use these functions.

Steve VanDevender