06. User Uploaded Tables#
User Uploaded Tables¶
Contact authors: Yumi Choi
Last verified to run: 2024-08-23
LSST Science Piplines version: Weekly 2024_32
Container Size: small
Targeted learning level: Beginner
Description: Use the TAP upload functionality for user-supplied tables and join them with DP0.3 catalogs.
Skills: Use the TAP service to upload a table and join it to an LSST table with ADQL.
LSST Data Products: TAP tables dp03_catalogs_10yr.SSObject, dp03_catalogs_10yr.MPCORB, dp03_catalogs_10yr.DiaSource
Packages: lsst.rsp.get_tap_service
Credit: Developed by Yumi Choi. This tutorial is based on a Portal tutorial by Christina Williams for using user-supplied tables in queries for DP0.3 and a Jupyter Notebook by Melissan Graham and Jake Kurlander for accessing Gaia data and matching with DP0.3 data for solar system objects.
Get Support: Find DP0-related documentation and resources at dp0.lsst.io. Questions are welcome as new topics in the Support - Data Preview 0 Category of the Rubin Community Forum. Rubin staff will respond to all questions posted there.
1. Introduction¶
This notebook illustrates the process of uploading user-provided tables via the TAP service and integrating them into queries for DP0.3. It focuses on two types of user-provided tables: (1) tables generated outside the RSP, and (2) tables created within the RSP but retrieved from an external database using PyVO.
import os
import getpass
import matplotlib.pyplot as plt
import numpy as np
import pyvo
from astropy.table import Table
import pandas as pd
from lsst.rsp import get_tap_service
1.2. Define parameters¶
Set a few style parameters for the plots.
plt.style.use('tableau-colorblind10')
params = {'axes.labelsize': 12,
'font.size': 12,
'legend.fontsize': 10}
plt.rcParams.update(params)
For the LSST filter set ugrizy, always use symbols to represent the filters in addition to color.
plot_filter_colors_white_background = {
'u': '#0c71ff',
'g': '#49be61',
'r': '#c61c00',
'i': '#ffc200',
'z': '#f341a2',
'y': '#5d0000'
}
plot_symbols = {
'u': 'o',
'g': '^',
'r': 'v',
'i': 's',
'z': '*',
'y': 'p'
}
Define the path to the home directory, where outputs generated by this tutorial will be saved.
my_home_dir = '/home/' + getpass.getuser() + '/'
print(my_home_dir)
/home/galaxyumi331/
Start the TAP service and assert that it exists.
rsp_tap = get_tap_service("ssotap")
assert rsp_tap is not None
2. Spatial and temporal cross-match to diaSources¶
The scientific scenario used for this demonstration is to answer the question of whether LSST detected a moving object which, based on its known orbit, was expected to appear at given coordinates on nine different nights.
The list of expected coordinates and dates is provided in a file with nine rows (i.e., nine expected locations and times), with the format expected for user-uploaded tables: column names in the first row with no # symbol at the start of the row. The list contains columns of identifier index, right ascension, declination, and modified julian date (id, ra, dec, and mjd).
The ADQL query below uploads this list to the TAP service and cross-matches to
the table of detected sources that results from difference image analysis: the diaSource
table
(learn more about DP0.3 catalogs).
fnm1 = 'data/dp03_06_user_table_1.cat'
Option: View the contents of the file with more
, or return the word count (lines, words, characters) with wc
.
# os.system('more ' + fnm1)
# os.system('wc ' + fnm1)
Read the file as user table 1, ut1
.
ut1 = Table.read(fnm1, format='ascii.basic')
Define a query to cross match the coordinates
and dates in the file to detections in the DP0.3 diaSource
catalog.
This query is applying a spatial threshold of 10 arcseconds (0.00278 degrees), and a temporal threshold of half a day.
Notice that the columns from the user-uploaded table are being renamed in the query, e.g., ut1.ra AS ut1_ra
.
If this renaming is not done, the TAP service will rename the columns from the user table as ra2
and dec2
to distinguish them from the ra
and dec
columns from the DiaSource
table.
query = """
SELECT dias.ra, dias.dec, dias.midPointMjdTai, dias.ssObjectId,
ut1.ra AS ut1_ra, ut1.dec AS ut1_dec, ut1.mjd AS ut1_mjd, ut1.id AS ut1_id
FROM dp03_catalogs_10yr.DiaSource AS dias, TAP_UPLOAD.ut1 AS ut1
WHERE CONTAINS(POINT('ICRS', dias.ra, dias.dec),
CIRCLE('ICRS', ut1.ra, ut1.dec, 0.00278))=1
AND ABS(dias.midPointMjdTai - ut1.mjd) < 0.5
ORDER BY dias.ssObjectId
"""
Create the job by submitting the query and then run it asynchronously.
job = rsp_tap.submit_job(query, uploads={"ut1": ut1})
job.run()
<pyvo.dal.tap.AsyncTAPJob at 0x78525200e4d0>
Check that the job is completed.
job.wait(phases=['COMPLETED', 'ERROR'])
print('Job phase is', job.phase)
Job phase is COMPLETED
Retrieve the results and display them.
results = job.fetch_result().to_table()
results
ra | dec | midPointMjdTai | ssObjectId | ut1_ra | ut1_dec | ut1_mjd | ut1_id |
---|---|---|---|---|---|---|---|
deg | deg | d | |||||
float64 | float64 | float64 | int64 | float64 | float64 | float64 | int64 |
314.9407129 | -31.5520653 | 62800.04496 | 2956589648411852100 | 314.9407129 | -31.5520653 | 62800.04496 | 0 |
314.939805 | -31.5544472 | 62800.02065 | 2956589648411852100 | 314.9407129 | -31.5520653 | 62800.04496 | 0 |
338.0151274 | -14.4770618 | 61741.04101 | 2956589648411852100 | 338.0151274 | -14.4770618 | 61741.04101 | 1 |
338.014364 | -14.4779895 | 61741.02776 | 2956589648411852100 | 338.0151274 | -14.4770618 | 61741.04101 | 1 |
22.8351124 | 13.0273351 | 63450.38788 | 2956589648411852100 | 22.8351124 | 13.0273351 | 63450.38788 | 2 |
352.2623818 | -13.4892628 | 61632.24295 | 2956589648411852100 | 352.2623818 | -13.4892628 | 61632.24295 | 3 |
349.8507502 | -14.3276051 | 61644.29927 | 2956589648411852100 | 349.8507502 | -14.3276051 | 61644.29927 | 4 |
349.8505526 | -14.3276598 | 61644.30017 | 2956589648411852100 | 349.8507502 | -14.3276051 | 61644.29927 | 4 |
338.0151274 | -14.4770618 | 61741.04101 | 2956589648411852100 | 338.014364 | -14.4779895 | 61741.02776 | 5 |
338.014364 | -14.4779895 | 61741.02776 | 2956589648411852100 | 338.014364 | -14.4779895 | 61741.02776 | 5 |
317.6806847 | -35.8352905 | 61291.02415 | 2956589648411852100 | 317.6805893 | -35.8352626 | 61291.0246 | 6 |
317.6805893 | -35.8352626 | 61291.0246 | 2956589648411852100 | 317.6805893 | -35.8352626 | 61291.0246 | 6 |
345.2016394 | -15.5920336 | 61665.05291 | 2956589648411852100 | 345.2016394 | -15.5920336 | 61665.05291 | 7 |
324.1758204 | -35.9475937 | 61267.33416 | 2956589648411852100 | 324.1765195 | -35.9475313 | 61267.33191 | 8 |
324.1765195 | -35.9475313 | 61267.33191 | 2956589648411852100 | 324.1765195 | -35.9475313 | 61267.33191 | 8 |
In the table above, notice that the ssObjectId
is all the same.
This is because the file was created to contain the detections of a single moving object across multiple nights.
Notice also that whereas the user-uploaded table had 9 rows, there are 15 matches to the diaSource
table.
Six of the rows had two matches.
This means that six times, this moving object was detected twice within 10 arcseconds in the same night.
2.1. Plot the spatial and temporal offsets¶
For the purposes of demonstrating some kind of plot out of the results of this cross-matching, calculate the spatial and temporal
offsets (s_off
and t_off
) between the detected diaSource
s and the user uploaded expectations,
and then plot the spatial vs. the temporal offsets.
cos_dec = np.cos(np.deg2rad(results['dec']))
delta_ra = cos_dec * np.abs(results['ra'] - results['ut1_ra'])
delta_dec = np.abs(results['dec'] - results['ut1_dec'])
s_off = 3600.0 * np.sqrt(delta_ra**2 + delta_dec**2)
t_off = 24.0 * 60.0 * np.abs(results['midPointMjdTai'] - results['ut1_mjd'])
del cos_dec, delta_ra, delta_dec
Moving objects do not exhibit a constant rate of motion on the sky in terms of, e.g., arcsec per minute,
but a correlation between spatial and temporal offset is expected.
Keep in mind that nine of these points will be at zero (no offset; perfect match).
But for the six additional, newly-discovered detections in
the diaSource
catalog which were not in the user-uploaded table,
there should be larger spatial offsets for larger temporal offsets.
fig = plt.figure(figsize=(6, 4))
plt.plot(t_off, s_off, 'o', ms=10, alpha=0.3, mew=0)
plt.xlabel('time difference (minutes)')
plt.ylabel('sky distance (arcseconds)')
plt.title('distance between same-night observations')
plt.show()
Figure 1: Above, the sky distance in arcseconds is plotted versus the time difference in minutes for the 15 detections cross-matched with the 9 anticipated coordinates and times in the user-uploaded table.
Clean up.
del t_off, s_off
del fnm1, ut1, query, job, results
3. Object identifier cross-match to diaSources¶
This section demonstrates how to upload a user-supplied table and join it with a DP0.3 table.
In this scenario, a list of identifiers (ssObjectId
) for moving objects has been assembled
by the user and stored in a file (one column, two rows of data).
As in Section 2, define the file name, optionally print the contents or the word count, and then read in the file.
fnm2 = 'data/dp03_06_user_table_2.cat'
# os.system('more ' + fnm2)
# os.system('wc ' + fnm2)
ut2 = Table.read(fnm2, format='ascii.basic')
Define a query to use an inner join on the user-uploaded table and the diaSource
table
to return only records (objects) that have matching values in both tables.
Inner join: An inner join is the default type of join, and is what is done if simply JOIN
is used in a query.
Other types of joins include outer joins (e.g., left, right, full) which optionally return
all records in one or the other table, with matches where available.
query = """
SELECT ut2.ssObjectId_user, dias.ssObjectId, dias.ra, dias.dec
FROM TAP_UPLOAD.ut2 as ut2
INNER JOIN dp03_catalogs_10yr.diaSource as dias
ON ut2.ssObjectId_user = dias.ssObjectId
"""
Run the query asynchronously and then retrieve the results.
job = rsp_tap.submit_job(query, uploads={"ut2": ut2})
job.run()
job.wait(phases=['COMPLETED', 'ERROR'])
print('Job phase is', job.phase)
Job phase is COMPLETED
results = job.fetch_result().to_table()
results
ssObjectId_user | ssObjectId | ra | dec |
---|---|---|---|
deg | deg | ||
int64 | int64 | float64 | float64 |
5977535780727431144 | 5977535780727431144 | 297.6175308 | -0.4666219 |
5977535780727431144 | 5977535780727431144 | 267.9602991 | 13.7540882 |
5977535780727431144 | 5977535780727431144 | 242.0993711 | 6.867205 |
5977535780727431144 | 5977535780727431144 | 251.1593754 | 0.5298268 |
5977535780727431144 | 5977535780727431144 | 243.2244299 | 1.8095232 |
5977535780727431144 | 5977535780727431144 | 287.9080887 | 8.1408245 |
5977535780727431144 | 5977535780727431144 | 290.2771912 | 8.3605998 |
5977535780727431144 | 5977535780727431144 | 252.1509606 | 10.6167461 |
5977535780727431144 | 5977535780727431144 | 287.8090077 | -16.8921954 |
... | ... | ... | ... |
4350915375550808373 | 4350915375550808373 | 86.806202 | -28.6962433 |
4350915375550808373 | 4350915375550808373 | 61.7813131 | -15.7272584 |
4350915375550808373 | 4350915375550808373 | 75.2083479 | -31.4801478 |
4350915375550808373 | 4350915375550808373 | 71.6289181 | -30.4687589 |
4350915375550808373 | 4350915375550808373 | 3.1459663 | -44.9514265 |
4350915375550808373 | 4350915375550808373 | 82.8768975 | -27.5150918 |
4350915375550808373 | 4350915375550808373 | 80.2784977 | -28.6247472 |
4350915375550808373 | 4350915375550808373 | 84.344066 | -29.4088081 |
4350915375550808373 | 4350915375550808373 | 66.8995306 | -31.6402378 |
Print the unique ssObjectId
and how many detections each had.
uniqueIds, counts = np.unique(results['ssObjectId'], return_counts=True)
for uniqueId, count in zip(uniqueIds, counts):
print("ssObjectId:", uniqueId, "Number of detections:", count)
ssObjectId: 4350915375550808373 Number of detections: 322 ssObjectId: 5977535780727431144 Number of detections: 350
3.1. Plot the sky distribution of LSST detections¶
Create a plot showing sky distribution of the detections of these two unique objects over 10 years.
fig = plt.figure(figsize=(6, 4))
for uniqueId in uniqueIds:
tx = results['ssObjectId'] == uniqueId
plt.plot(results['ra'][tx], results['dec'][tx],
'o', ms=5, alpha=0.3, mew=0, label=str(uniqueId))
plt.xlabel('Right Ascension [deg]')
plt.ylabel('Declination [deg]')
plt.title('LSST detections of two moving objects')
plt.legend(loc='lower left')
plt.show()
Figure 2: Above, the coordinates of the LSST difference-image detections from the
diaSource
catalog for the two moving objects listed in the user-uploaded table.
Clean up.
del fnm2, ut2, query, job, results, uniqueIds, counts
4. Gaia data for DP0.3 asteroids¶
This section demonstrates how to do an external TAP query to retrieve a Gaia data table using PyVO
from the RSP,
and then upload the table and use it in a joint query with the LSST DP0.3 catalogs.
Gaia data is also accessible via the Gaia Archive (see the Help tab for more information about Gaia catalogs and data access).
4.1. Retrieve Gaia data for main-belt asteroids¶
Get an instance of the Gaia TAP service using PyVO
and assert that it exists.
gaia_tap_url = 'https://gea.esac.esa.int/tap-server/tap'
gaia_tap = pyvo.dal.TAPService(gaia_tap_url)
assert gaia_tap is not None
assert gaia_tap.baseurl == gaia_tap_url
Query the Gaia database for main-belt asteroids (MBAs) following the population definition used by the
JPL Horizons small body database query tool:
2.0 < a
< 3.25 au and q
> 1.666 au.
To expedite the query, only return objects with more than 200 observations (i.e., well-observed objects),
and limit the number of objects returned to 1000 with SELECT TOP 1000
.
gaia_query = """
SELECT TOP 1000 denomination, inclination,
eccentricity, semi_major_axis
FROM gaiadr3.sso_orbits
WHERE num_observations > 200
AND semi_major_axis > 2.0
AND semi_major_axis < 3.25
AND semi_major_axis*(1-eccentricity) > 1.666
"""
Run the query and retrieve the results.
gaia_job = gaia_tap.submit_job(gaia_query)
gaia_job.run()
gaia_job.wait(phases=['COMPLETED', 'ERROR'])
print('Job phase is', gaia_job.phase)
Job phase is COMPLETED
Retrieve the results and convert them to an astropy
table with to_table
, and then
to a pandas
table with to_pandas
, in order to enable some column reformatting.
gaia_results = gaia_job.fetch_result().to_table().to_pandas()
gaia_results
denomination | inclination | eccentricity | semi_major_axis | |
---|---|---|---|---|
0 | 2000_so179 | 0.058380 | 0.120836 | 2.701355 |
1 | 1999_gu8 | 0.113790 | 0.208611 | 2.203320 |
2 | 2000_ge143 | 0.185379 | 0.074487 | 3.091090 |
3 | 5082_t-3 | 0.097229 | 0.149435 | 2.313023 |
4 | 2000_su299 | 0.227229 | 0.075321 | 3.169976 |
... | ... | ... | ... | ... |
995 | 1999_jx102 | 0.107432 | 0.114787 | 3.137027 |
996 | 1993_fv82 | 0.049898 | 0.202413 | 2.189258 |
997 | 1999_jv24 | 0.267638 | 0.046316 | 2.570573 |
998 | andrewchiang | 0.147260 | 0.037843 | 3.062983 |
999 | 2001_qs256 | 0.079710 | 0.130844 | 2.671761 |
1000 rows × 4 columns
4.1.1. Plot the orbital parameters¶
To explore the retrieved data for 1000 MBAs, make plots showing eccentricity (e
) and inclination (i
) against semi-major axis (a
).
fig = plt.figure(figsize=(8, 3))
plt.subplot(121)
plt.scatter(gaia_results['semi_major_axis'], gaia_results['eccentricity'],
s=3, alpha=0.3)
plt.xlabel('a [au]')
plt.ylabel('e')
plt.subplot(122)
plt.scatter(gaia_results['semi_major_axis'], np.sin(gaia_results['inclination']),
s=3, alpha=0.3)
plt.xlabel('a [au]')
plt.ylabel(r'sin($i$)')
plt.tight_layout()
plt.show()
Figure 3: Above, scatter plots of the orbital parameters for 1000 MBAs from Gaia. At left, eccentricity (e) versus the semi-major axis (a). At right, inclination versus semi-major axis.
4.1.2. Reformat the MPC designation column¶
The MPCORB
table in DP0.3 follows the standard format for MPC designation, which is the year, then a space,
and then an identifier that is several characters, capital letters and numbers.
However, the denomination
in some Gaia tables (like sso_orbit
, queried above) uses underscores instead of
spaces and/or lower-case letters instead of capital letters.
To ensure accurate cross-matching (table joining) between the Gaia data and the DP0.3 data in Section 4.2.2,
the denomination
column must first be converted to the standard MPC designation format (mpc_desig1
).
A known issue
with the mpcDesignation
column in the MPCORB
table can lead to incorrect cross-matches for objects with mpcDesignation
values longer than 8 characters. To avoid this problem, the fullDesignation
column from the MPCORB
table will be used in Section 4.2.2, despite its own issue with an unnecessary prefix of "2011".
Additionally, the denomination
column needs to be converted to use spaces instead of underscores, while retaining
the lowercase letters (mpc_desig2
), to correctly identify a moving object by name in the Gaia sso_observation
table in Section 4.2.3.
gaia_results['mpc_desig1'] = '2011 '+ gaia_results['denomination'].str.replace('_', ' ').str.upper()
gaia_results['mpc_desig2'] = gaia_results['denomination'].str.replace('_', ' ')
gaia_results
denomination | inclination | eccentricity | semi_major_axis | mpc_desig1 | mpc_desig2 | |
---|---|---|---|---|---|---|
0 | 2000_so179 | 0.058380 | 0.120836 | 2.701355 | 2011 2000 SO179 | 2000 so179 |
1 | 1999_gu8 | 0.113790 | 0.208611 | 2.203320 | 2011 1999 GU8 | 1999 gu8 |
2 | 2000_ge143 | 0.185379 | 0.074487 | 3.091090 | 2011 2000 GE143 | 2000 ge143 |
3 | 5082_t-3 | 0.097229 | 0.149435 | 2.313023 | 2011 5082 T-3 | 5082 t-3 |
4 | 2000_su299 | 0.227229 | 0.075321 | 3.169976 | 2011 2000 SU299 | 2000 su299 |
... | ... | ... | ... | ... | ... | ... |
995 | 1999_jx102 | 0.107432 | 0.114787 | 3.137027 | 2011 1999 JX102 | 1999 jx102 |
996 | 1993_fv82 | 0.049898 | 0.202413 | 2.189258 | 2011 1993 FV82 | 1993 fv82 |
997 | 1999_jv24 | 0.267638 | 0.046316 | 2.570573 | 2011 1999 JV24 | 1999 jv24 |
998 | andrewchiang | 0.147260 | 0.037843 | 3.062983 | 2011 ANDREWCHIANG | andrewchiang |
999 | 2001_qs256 | 0.079710 | 0.130844 | 2.671761 | 2011 2001 QS256 | 2001 qs256 |
1000 rows × 6 columns
Save the result as a CSV (comma separated values) file in the home directory to explore an option of using pandas
table joins in Section 5.
gaia_results.to_csv(my_home_dir+'gaia_mba.cat', index=False)
Clean up. Only gaia_results
is needed for the next section.
del gaia_tap_url, gaia_tap
del gaia_query, gaia_job
4.2. Gaia object cross-match to MPCORB and diaSources¶
The scientific scenario used for this demonstration is that for the 1000 Gaia moving objects, the question is whether LSST detected them and, if so, what their $g$-band difference-image magnitudes are for each detection.
4.2.1. Reformat the table¶
As gaia_results
is a pandas
table, but the TAP service does not accept uploads in a pandas DataFrame
format,
convert it back to an astropy
table using the imported Table
method.
Follow the convention established in the sections above and name the table ut3
.
ut3 = Table.from_pandas(gaia_results)
ut3.dtype
dtype([('denomination', '<U15'), ('inclination', '<f8'), ('eccentricity', '<f8'), ('semi_major_axis', '<f8'), ('mpc_desig1', '<U20'), ('mpc_desig2', '<U15')])
Notice that the denomination
and mpc_desig
columns are of data type (dtype
) Unicode (U
).
Uploading a table which includes Unicode columns will not work.
Convert the Unicode columns (dtype.kind='U'
) to bytestring (dtype.kind='S'
) using the
convert_unicde_to_bytestring
.
ut3.convert_unicode_to_bytestring()
Clean up.
del gaia_results
4.2.2. Execute the query¶
Prepare a query to join three tables: the user-specified Gaia table data, the DP0.3 MPCORB
table, and the DP0.3 diaSource
table.
This query will return the magnitudes for all LSST detections in $g$-band difference images from the DP0.3 diaSource
table
for all objects in the user-uploaded Gaia table data (where a match exists).
In the query, the first join between the Gaia user-uploaded table and the DP0.3 MPCORB
table obtains the internal LSST ssObjectId
for a given moving object by joining on MPC designation.
The second join between the MPCORB
table and the diaSource
table obtains all $g$-band detections associated
with the ssObjectId
.
This triple join is necessary because the diaSource
table does not contain the MPC designation.
query = """
SELECT mpc.fullDesignation, mpc.ssObjectId,
ut3.denomination, ut3.mpc_desig2,
dias.midPointMjdTai, dias.mag, dias.band
FROM dp03_catalogs_10yr.MPCORB AS mpc
INNER JOIN TAP_UPLOAD.ut3 as ut3
ON ut3.mpc_desig1 = mpc.fullDesignation
INNER JOIN dp03_catalogs_10yr.DiaSource as dias
ON dias.ssObjectId = mpc.ssObjectId
Where dias.band = 'g'
"""
Execute the query, then retrieve the results.
job = rsp_tap.submit_job(query, uploads={"ut3": ut3})
job.run()
job.wait(phases=['COMPLETED', 'ERROR'])
print('Job phase is', job.phase)
Job phase is COMPLETED
results = job.fetch_result().to_table()
results
fullDesignation | ssObjectId | denomination | mpc_desig2 | midPointMjdTai | mag | band |
---|---|---|---|---|---|---|
d | ||||||
str26 | int64 | str15 | str15 | float64 | float32 | str1 |
2011 1020 T-3 | -942361610017462599 | 1020_t-3 | 1020 t-3 | 63562.16077 | 18.813 | g |
2011 1020 T-3 | -942361610017462599 | 1020_t-3 | 1020 t-3 | 63096.99067 | 21.002 | g |
2011 1020 T-3 | -942361610017462599 | 1020_t-3 | 1020 t-3 | 61470.26138 | 19.519 | g |
2011 1020 T-3 | -942361610017462599 | 1020_t-3 | 1020 t-3 | 62919.34252 | 20.99 | g |
2011 1020 T-3 | -942361610017462599 | 1020_t-3 | 1020 t-3 | 61993.09195 | 19.406 | g |
2011 1020 T-3 | -942361610017462599 | 1020_t-3 | 1020 t-3 | 63512.28129 | 18.624 | g |
2011 1020 T-3 | -942361610017462599 | 1020_t-3 | 1020 t-3 | 63559.12272 | 18.723 | g |
2011 1020 T-3 | -942361610017462599 | 1020_t-3 | 1020 t-3 | 62447.30256 | 20.195 | g |
2011 1020 T-3 | -942361610017462599 | 1020_t-3 | 1020 t-3 | 61467.2197 | 19.568 | g |
... | ... | ... | ... | ... | ... | ... |
2011 6640 P-L | -8247848826207698658 | 6640_p-l | 6640 p-l | 62716.11787 | 20.357 | g |
2011 6640 P-L | -8247848826207698658 | 6640_p-l | 6640 p-l | 63115.09364 | 19.096 | g |
2011 6640 P-L | -8247848826207698658 | 6640_p-l | 6640 p-l | 60852.31446 | 19.685 | g |
2011 6640 P-L | -8247848826207698658 | 6640_p-l | 6640 p-l | 60912.03036 | 19.94 | g |
2011 6640 P-L | -8247848826207698658 | 6640_p-l | 6640 p-l | 61797.19675 | 18.413 | g |
2011 6640 P-L | -8247848826207698658 | 6640_p-l | 6640 p-l | 63088.26039 | 19.082 | g |
2011 6640 P-L | -8247848826207698658 | 6640_p-l | 6640 p-l | 62217.27509 | 19.772 | g |
2011 6640 P-L | -8247848826207698658 | 6640_p-l | 6640 p-l | 60874.27013 | 19.289 | g |
2011 6640 P-L | -8247848826207698658 | 6640_p-l | 6640 p-l | 60857.35035 | 19.573 | g |
Print the number of objects (out of the 1000 from Gaia) that were detected by LSST, and also the minimum, maximum, and mean number of LSST detections over the ten-year survey.
uniqueIds, counts = np.unique(results['ssObjectId'], return_counts=True)
print('Number of LSST-detected objects: ', len(uniqueIds))
print('Minimum, maximum, and mean number of detections per object: ',
np.min(counts), np.max(counts), int(np.mean(counts)))
Number of LSST-detected objects: 769 Minimum, maximum, and mean number of detections per object: 4 215 34
Clean up.
del ut3, query, job
del uniqueIds, counts
4.2.3. Plot the Gaia+LSST light curve for one MBA¶
Select one random MBA and save its data entry as a small table called random_mba
.
uniqueObj = np.random.choice(results['ssObjectId'], 1)[0]
print('ssObjectId: ', uniqueObj)
random_mba = results[results['ssObjectId'] == uniqueObj]
print('MPC designation: ', random_mba['mpc_desig2'][0])
ssObjectId: 4411859400751185100 MPC designation: 2000 uh39
Retrieve Gaia photometry for the selected MBA.
Occasionally, the following error message may occur:
DALServiceError: 401 Client Error: 401 for url: https://gea.esac.esa.int/tap-server/tap/sync
.
This usually represents an intermittent service error.
To avoid it, rerun the following cell.
gaia_tap_url = 'https://gea.esac.esa.int/tap-server/tap'
gaia_tap = pyvo.dal.TAPService(gaia_tap_url)
assert gaia_tap is not None
assert gaia_tap.baseurl == gaia_tap_url
Define and execute a query to retrieve the dates (epoch_utc
) and $g$-band magnitudes (g_mag
)
of Gaia observations for the randomly selected MBA by only returning rows
where the denomination
is equal to mpc_desig2
, which is the MPC
Designation reformatted to the convention of the Gaia sso_observation
table
(see Section 4.1.2).
gaia_query = """
SELECT epoch_utc, g_mag
FROM gaiadr3.sso_observation
WHERE denomination = '{}'
""".format(random_mba['mpc_desig2'][0])
gaia_job = gaia_tap.submit_job(gaia_query)
gaia_job.run()
gaia_job.wait(phases=['COMPLETED', 'ERROR'])
print('Job phase is', gaia_job.phase)
Job phase is COMPLETED
Retrieve the results.
gaia_results = gaia_job.fetch_result().to_table()
Convert Universal Time Coordinated (UTC) observation date (epoch_utc
) to a Modified Julian Date (MJD) observation date.
From the Gaia documentation, epoch_utc
is the Gaiacentric epoch UTC, while the LSST records observing time in MJD. The conversion is MJD = UTC + 55197.5 (day).
gaia_results['epoch_mjd'] = gaia_results['epoch_utc'] + 55197.5
Plot the $g$-band light curve for the selected asteroid.
fig = plt.figure(figsize=(6, 4))
plt.plot(gaia_results['epoch_mjd'], gaia_results['g_mag'],
's', color='dodgerblue', label='Gaia G')
plt.plot(random_mba['midPointMjdTai'], random_mba['mag'],
plot_symbols['g'],
color=plot_filter_colors_white_background['g'], label='LSST g')
ymin = min(gaia_results['g_mag'].min(), random_mba['mag'].min())
ymax = max(gaia_results['g_mag'].max(), random_mba['mag'].max())
plt.ylim(ymax+0.2, ymin-0.2)
plt.legend(loc='upper left')
plt.xlabel('MJD [days]')
plt.ylabel('magnitude')
plt.title('MPC designation:' + random_mba['mpc_desig2'][0])
plt.show()
Figure 4: Above, the Gaia G-band detections in cyan squares and the simulated DP0.3 LSST $g$-band magnitudes in green circles as a function of time. The variation in apparent magnitude over time is primarily due to the object's changing distance from Earth.
Clean up.
del results, uniqueObj, random_mba
del gaia_tap_url, gaia_tap, gaia_query, gaia_job, gaia_results
5. Combine a user-created table and the LSST query result table using Pandas
¶
The scientific scenario used for this demonstration is for a user-created pandas
table
of moving objects (the saved 1,000 Gaia MBA data is recycled as an example), whether LSST detected
them during a specific night, and if so how the LSST data look like.
The LSST query result will be also saved as a pandas
table, and the join with the user-created
table will be done using the merge
function provided by pandas
.
5.1. Read the Gaia MBA table as a pandas DataFrame
¶
Follow the convention established in the sections above and name the table ut4.
ut4 = pd.read_csv(my_home_dir+'gaia_mba.cat')
ut4
denomination | inclination | eccentricity | semi_major_axis | mpc_desig1 | mpc_desig2 | |
---|---|---|---|---|---|---|
0 | 2000_so179 | 0.058380 | 0.120836 | 2.701355 | 2011 2000 SO179 | 2000 so179 |
1 | 1999_gu8 | 0.113790 | 0.208611 | 2.203320 | 2011 1999 GU8 | 1999 gu8 |
2 | 2000_ge143 | 0.185379 | 0.074487 | 3.091090 | 2011 2000 GE143 | 2000 ge143 |
3 | 5082_t-3 | 0.097229 | 0.149435 | 2.313023 | 2011 5082 T-3 | 5082 t-3 |
4 | 2000_su299 | 0.227229 | 0.075321 | 3.169976 | 2011 2000 SU299 | 2000 su299 |
... | ... | ... | ... | ... | ... | ... |
995 | 1999_jx102 | 0.107432 | 0.114787 | 3.137027 | 2011 1999 JX102 | 1999 jx102 |
996 | 1993_fv82 | 0.049898 | 0.202413 | 2.189258 | 2011 1993 FV82 | 1993 fv82 |
997 | 1999_jv24 | 0.267638 | 0.046316 | 2.570573 | 2011 1999 JV24 | 1999 jv24 |
998 | andrewchiang | 0.147260 | 0.037843 | 3.062983 | 2011 ANDREWCHIANG | andrewchiang |
999 | 2001_qs256 | 0.079710 | 0.130844 | 2.671761 | 2011 2001 QS256 | 2001 qs256 |
1000 rows × 6 columns
5.2 Retrieve the DP0.3 data and save the result as a pandas
table¶
As opposed to Section 4.2.2, the query in this section joins only two tables:
the DP0.3 MPCORB
table and the DP0.3 diaSource
table.
This query will return all LSST detections from the DP0.3 diaSource
table
for MBAs (2.0 < a
< 3.25 au and q
> 1.666 au) observed at 62001 < midPointMjdTai
< 62002.
query = """
SELECT mpc.fullDesignation, mpc.ssObjectId, dias.ssObjectId,
mpc.e, mpc.incl, mpc.q/(1-mpc.e) as a,
dias.midPointMjdTai, dias.mag, dias.band
FROM dp03_catalogs_10yr.MPCORB AS mpc
INNER JOIN dp03_catalogs_10yr.DiaSource as dias
ON dias.ssObjectId = mpc.ssObjectId
Where midPointMjdTai > 62001 AND midPointMjdTai < 62002
AND mpc.q/(1-mpc.e) > 2.0 AND mpc.q/(1-mpc.e) < 3.25
AND mpc.q > 1.666
"""
job = rsp_tap.submit_job(query)
job.run()
job.wait(phases=['COMPLETED', 'ERROR'])
print('Job phase is', job.phase)
Job phase is COMPLETED
results = job.fetch_result().to_table().to_pandas()
results
fullDesignation | ssObjectId | ssObjectId2 | e | incl | a | midPointMjdTai | mag | band | |
---|---|---|---|---|---|---|---|---|---|
0 | 2011 2007 CO32 | 6102997768245 | 6102997768245 | 0.129147 | 3.82716 | 2.460951 | 62001.26994 | 23.445000 | g |
1 | 2011 2014 DY22 | 47995361189908 | 47995361189908 | 0.288534 | 30.28869 | 2.577251 | 62001.12358 | 22.701000 | r |
2 | 2011 2008 RW112 | 51566589826717 | 51566589826717 | 0.075504 | 8.89582 | 2.936360 | 62001.30411 | 21.393999 | r |
3 | 2011 2008 RW112 | 51566589826717 | 51566589826717 | 0.075504 | 8.89582 | 2.936360 | 62001.27631 | 21.379999 | r |
4 | 2011 2008 RW112 | 51566589826717 | 51566589826717 | 0.075504 | 8.89582 | 2.936360 | 62001.25500 | 22.037001 | g |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
447674 | 2011 S100jZN6a | -325612888168026 | -325612888168026 | 0.068819 | 9.43365 | 3.161782 | 62001.22832 | 22.962000 | r |
447675 | 2011 S1005kB6a | -174501892927906 | -174501892927906 | 0.077079 | 21.92935 | 3.210731 | 62001.13702 | 20.893999 | r |
447676 | 2011 S1005kB6a | -174501892927906 | -174501892927906 | 0.077079 | 21.92935 | 3.210731 | 62001.14200 | 20.889000 | r |
447677 | 2011 S1007L1va | -72352755829421 | -72352755829421 | 0.112628 | 14.93522 | 2.666496 | 62001.19220 | 22.156000 | r |
447678 | 2011 2008 EU130 | -25365820730656 | -25365820730656 | 0.236274 | 4.01956 | 3.159747 | 62001.30949 | 21.840000 | r |
447679 rows × 9 columns
uniqObjId = np.unique(results['ssObjectId'])
print(f'{len(uniqObjId)} unique MBAs have LSST detections between MJD=62001 and 62002.')
241107 unique MBAs have LSST detections between MJD=62001 and 62002.
5.3 Join two pandas DataFrames
using the merge
function¶
Pandas
can combine two DataFrames with different column names for a common key.
The common key is the designation for objects, with the corresponding column name
being mpc_desig1
in the ut4
table and fullDesignation
in the results
table.
When using the 'inner' method, the merge operation returns only the rows that have
matching values in both DataFrames. This type of join is known as an "inner join".
merged_df = pd.merge(ut4, results, left_on='mpc_desig1', right_on='fullDesignation', how='inner')
print(f'{len(merged_df)} matches between the ut4 ({len(ut4)} MBAs) and '
f'the results ({len(results)} diaSources) tables.')
merged_df
97 matches between the ut4 (1000 MBAs) and the results (447679 diaSources) tables.
denomination | inclination | eccentricity | semi_major_axis | mpc_desig1 | mpc_desig2 | fullDesignation | ssObjectId | ssObjectId2 | e | incl | a | midPointMjdTai | mag | band | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 2000_ge143 | 0.185379 | 0.074487 | 3.091090 | 2011 2000 GE143 | 2000 ge143 | 2011 2000 GE143 | 5411272580347895600 | 5411272580347895600 | 0.073031 | 10.62879 | 3.081456 | 62001.19271 | 18.775000 | r |
1 | 1994_eo6 | 0.088736 | 0.098679 | 2.361715 | 2011 1994 EO6 | 1994 eo6 | 2011 1994 EO6 | -2828025230589562555 | -2828025230589562555 | 0.099343 | 5.08267 | 2.358358 | 62001.10954 | 17.509001 | r |
2 | 1999_cb43 | 0.034245 | 0.133156 | 3.101095 | 2011 1999 CB43 | 1999 cb43 | 2011 1999 CB43 | -6876285656538331515 | -6876285656538331515 | 0.131949 | 1.96211 | 3.092506 | 62001.10354 | 16.827000 | r |
3 | 1991_aj | 0.180869 | 0.172780 | 3.198215 | 2011 1991 AJ | 1991 aj | 2011 1991 AJ | -2147153103981108893 | -2147153103981108893 | 0.171861 | 10.35835 | 3.195750 | 62001.10219 | 18.228001 | r |
4 | 1999_jj54 | 0.096823 | 0.106759 | 3.140851 | 2011 1999 JJ54 | 1999 jj54 | 2011 1999 JJ54 | 4962058787561465018 | 4962058787561465018 | 0.105927 | 5.54762 | 3.137985 | 62001.24566 | 18.966999 | g |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
92 | 3424_t-3 | 0.040064 | 0.033362 | 2.631548 | 2011 3424 T-3 | 3424 t-3 | 2011 3424 T-3 | 7545917565920585779 | 7545917565920585779 | 0.031006 | 2.29617 | 2.628013 | 62001.28768 | 18.836000 | r |
93 | 1999_ve170 | 0.036370 | 0.153232 | 2.390796 | 2011 1999 VE170 | 1999 ve170 | 2011 1999 VE170 | 1159259540580112333 | 1159259540580112333 | 0.153633 | 2.08492 | 2.387461 | 62001.25726 | 21.114000 | g |
94 | 1999_ve170 | 0.036370 | 0.153232 | 2.390796 | 2011 1999 VE170 | 1999 ve170 | 2011 1999 VE170 | 1159259540580112333 | 1159259540580112333 | 0.153633 | 2.08492 | 2.387461 | 62001.27857 | 20.475000 | r |
95 | 2001_ag4 | 0.059589 | 0.272405 | 2.564281 | 2011 2001 AG4 | 2001 ag4 | 2011 2001 AG4 | -698031098315107607 | -698031098315107607 | 0.271387 | 3.41216 | 2.563919 | 62001.22109 | 16.940001 | r |
96 | 2001_ag4 | 0.059589 | 0.272405 | 2.564281 | 2011 2001 AG4 | 2001 ag4 | 2011 2001 AG4 | -698031098315107607 | -698031098315107607 | 0.271387 | 3.41216 | 2.563919 | 62001.24521 | 17.587999 | g |
97 rows × 15 columns
uniqObj, cnts = np.unique(merged_df['denomination'], return_counts=True)
most_detected = np.where(cnts == cnts.max())
print(f'{len(uniqObj)} Gaia MBAs out of {len(ut4)} are detected with Rubin '
f'between MJD=62001 and 62002.')
print(f'{len(most_detected[0])} of them have the maximum number '
f'of the LSST detections of {cnts.max()}.')
48 Gaia MBAs out of 1000 are detected with Rubin between MJD=62001 and 62002. 6 of them have the maximum number of the LSST detections of 4.
5.3.1 Compare the orbital parameters from Gaia vs. the LSST for the matched MBAs¶
Plot the orbital parameters such as eccentricity (e
), inclination (i
), and semi-major axis (a
)
from the Gaia and the LSST $g$-band for matched MBAs.
fig, axs = plt.subplots(1, 3, figsize=(16, 3))
fig.suptitle('For matched MBAs between the Gaia and the LSST g-band')
axs[0].scatter(merged_df['eccentricity'], merged_df['e'])
axs[0].set_xlabel('e from Gaia')
axs[0].set_ylabel('e from Rubin')
axs[1].scatter(np.rad2deg(merged_df['inclination']), merged_df['incl'])
axs[1].set_xlabel('i [deg] from Gaia')
axs[1].set_ylabel('i [deg] from Rubin')
axs[2].scatter(merged_df['semi_major_axis'], merged_df['a'])
axs[2].set_xlabel('a [au] from Gaia')
axs[2].set_ylabel('a [au] from Rubin')
plt.show()
Clean up.
del ut4, results, uniqObjId, merged_df, uniqObj, cnts, most_detected
6. Excercises for the learner¶
In Section 4.1, the Gaia database was queried for main-belt asteroids (MBAs) following the population definition used by the JPL Horizons small body database query tool. At that link, click on "Limit by Orbit Class" and then hover the cursor over any "information" symbol to see any definition. Use the definition of TransNeptunian Object (TNO) to rewrite the query in Section 4.1 and create a table of the nine Gaia objects with >200 observations and the semi-major axis of a TNO. Clear the kernel and re-execute the notebook in full, using Gaia TNOs instead of MBAs; there should be 4 detected by LSST.
(Advanced) Generate a table of DP0.3 objects of interest and save it as a file (e.g., the MPC designations of TNOs identified in step 1). Using Sections 3 and 4 as a guide, use the user-upload functionality and do a tripe-table join with the DP0.3
SSObject
catalog (match on MPC designation) andSSSource
catalog (match onssObjectId
) to obtain and plot, e.g., heliocentric distances.