Working with the LLAMMa data


This page is an updated version of the pages I've made for most of my teams. This page was developed and updated specifically for the 2016 LLAMMa-Ceph C team visit.

Please note: THIS IS NOT meant to be used without applying your brain! This is NOT a cookbook! This is presented as a linear progression because of the nature of this page, but we have already done some things "out of order", and moreover, chances are excellent that you will go back and redo different pieces of this at different stages of your work.

FOR REFERENCE: LLAMMa Bigger Picture and Goals

FOR CONTEXT: I know we have a wide range of ages and backgrounds here. There are things tagged "BONUS" in here - this means "if you get to this point and need something to do while everyone else catches up, work on this." You can also do this later, at home, when you are reviewing what we did this summer. You need not do it here and now, or even necessarily at all. But it will give you a deeper understanding of what is going on.

Useful Positions

(just for reference)

We are studying a region that is 10-15 arcmin in radius, centered on 23:05:51 +62:30:55. (In Galactic coordinates, that's l = 111.08 deg, b = +2.098 deg.)
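
If you want to check those numbers yourself, here is a minimal Python sketch (not something we used in the project) that turns the sexagesimal position into decimal degrees and then into Galactic coordinates. The function names are made up for this example, and the pole constants are the standard J2000 values for the Galactic system; a library such as astropy would do the same job with less typing.

```python
import math

def sexagesimal_to_degrees(ra_hms, dec_dms):
    """Convert 'HH:MM:SS' and '+DD:MM:SS' strings to decimal degrees."""
    h, m, s = (float(x) for x in ra_hms.split(":"))
    ra_deg = 15.0 * (h + m / 60.0 + s / 3600.0)  # 1 hour of RA = 15 degrees
    sign = -1.0 if dec_dms.strip().startswith("-") else 1.0
    d, dm, ds = (abs(float(x)) for x in dec_dms.split(":"))
    dec_deg = sign * (d + dm / 60.0 + ds / 3600.0)
    return ra_deg, dec_deg

# J2000 orientation of the Galactic coordinate system (standard values).
RA_NGP = math.radians(192.85948)   # RA of the north Galactic pole
DEC_NGP = math.radians(27.12825)   # Dec of the north Galactic pole
L_NCP = math.radians(122.93192)    # Galactic longitude of the north celestial pole

def equatorial_to_galactic(ra_deg, dec_deg):
    """Convert J2000 RA/Dec in degrees to Galactic (l, b) in degrees."""
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    sin_b = (math.sin(DEC_NGP) * math.sin(dec)
             + math.cos(DEC_NGP) * math.cos(dec) * math.cos(ra - RA_NGP))
    b = math.asin(sin_b)
    y = math.cos(dec) * math.sin(ra - RA_NGP)
    x = (math.cos(DEC_NGP) * math.sin(dec)
         - math.sin(DEC_NGP) * math.cos(dec) * math.cos(ra - RA_NGP))
    l = (L_NCP - math.atan2(y, x)) % (2.0 * math.pi)
    return math.degrees(l), math.degrees(b)

ra_deg, dec_deg = sexagesimal_to_degrees("23:05:51", "+62:30:55")
gal_l, gal_b = equatorial_to_galactic(ra_deg, dec_deg)
print(f"RA = {ra_deg:.4f} deg, Dec = {dec_deg:+.4f} deg")
print(f"l = {gal_l:.2f} deg, b = {gal_b:+.3f} deg")
```

The small Galactic latitude is a reminder that we are looking almost exactly through the plane of the Galaxy, which matters when you think about reddening and background stars later on.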

Why? (a) Because we are looking for YSOs. (b) Because we have time series data (light curves) for objects in this region, and we want to know which are the young ones so that we can better interpret the light curves.

Relevant links for reference: LLAMMa Bigger Picture and Goals

Obtaining the imaging data

DONE (but be sure you have the files you need!)

Why? We need to figure out what data are in this region from which we might obtain photometry (=quantitative measures of brightness of objects) to use to look for IR excess sources.

We found imaging data for this region from CXO, SDSS, IPHAS, 2MASS, Spitzer (cryo and post-cryo), WISE, Herschel, and SCUBA. We can get our hands on imaging from all of those, plus POSS (DSS). We have already used FinderChart (easy access to DSS, SDSS, 2MASS, WISE) and other IRSA tools. We have already used Skyview at Goddard. We have already used ds9.

Big goal: Learn how to get images so that you can do this in the future without me.

Process: Either re-pull FITS images for yourself for our region in DSS, 2MASS, and WISE, or get them from the Box drive. You'll need at least these images for the next section. BONUS: Spitzer, Herschel, IPHAS.

Relevant links for reference:

Investigating the big mosaics


Why? There is astrophysics in understanding what is bright/faint in each band. Spatial resolution is going to play a role for us downstream.

It is "real astronomy" to spend a lot of time staring at the mosaics and understanding what you are looking at. Don't dismiss this step as not "real astronomy" just because you are not making quantitative measurements. This is time well-spent, and you should plan on investing some time doing this section. Some aspects of this were already discussed in the context of the Resolution worksheet.

Big goal: Understand what is part of the sky and what is an artifact (i.e., not part of the sky). Recognize how the images differ among the various bands, and why. (NB: this has come up during more than one telecon, which is why this task is here!) Understand (remind yourself) which survey has the lowest (worst) spatial resolution, and which has the best.

Relevant links for reference:

Process: Load images of our region over a range of wavelengths into a viewer of your choice, ds9 or IRSA tools, or a mixture of them. Ideally, you should pick at least one optical band (POSS, SDSS, or IPHAS), one NIR (2MASS), one MIR (IRAC, or WISE1/2/3), and one FIR band (MIPS, WISE4, PACS, or SPIRE). Compare the images. Answer the questions below.

Hints and tips: FinderChart is fine to use (and may be easiest) but you may run into problems getting "big enough" tiles to answer the questions. You may not have enough memory to load everything into ds9, in which case you'll want to try the generic IRSA Viewer and the images I put on the Box drive. Align the images by position to compare the same portions of the sky. You may find it helpful to make 3-color images to more directly compare images in exactly the same region. Zoom in/out. Play with color stretches to bring out detail in the images.

Questions for you:

  1. MOST IMPORTANT of these questions: Compare the mosaics across the bands. What changes? What stays the same? Why? (This is a DEEP question! See also next questions.)
  2. How does the number of stars differ across the bands? Which band has the most stars? The fewest? (BONUS question: why?) The most nebulosity? The least? (BONUS: why?) Are there more stars in the regions of nebulosity, or fewer? Why? (BONUS: Where do we need to worry the most about reddening?)
  3. What are some instrumental effects you can identify?
  4. Notice the pixel scale. Which survey has the lowest resolution (biggest pixels)? (BONUS: is that the same as the native pixels for the survey? You will need to Google, or go back to your LLAMMa Resolution Worksheet notes.)
  5. Make a three-color image. Do the stars match up? Does the nebulosity?
  6. BONUS: How big are any of the features in the image (nebulosity, galaxy, space between objects) in pixels, arcseconds, parsecs, and/or light years? (What do I mean by big?) (Hint: you need to know how far away the thing is -- check the proposal for the number. If it helps, there are 3.26 light years in a parsec.)
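
For the BONUS size question, the chain of conversions (pixels to arcseconds to parsecs to light years) can be sketched in a few lines of Python. This is only an illustration under the small-angle approximation; the function name is made up, and the pixel scale and distance below are placeholders, NOT the real numbers. Read the pixel scale from your image and take the distance from the proposal.

```python
import math

ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi   # about 206265
LIGHT_YEARS_PER_PARSEC = 3.26                  # value quoted in the question above

def feature_size(n_pixels, arcsec_per_pixel, distance_pc):
    """Return (arcsec, parsecs, light years) for a feature n_pixels across."""
    size_arcsec = n_pixels * arcsec_per_pixel
    # Small-angle approximation: physical size = distance * angle in radians.
    size_pc = distance_pc * size_arcsec / ARCSEC_PER_RADIAN
    size_ly = size_pc * LIGHT_YEARS_PER_PARSEC
    return size_arcsec, size_pc, size_ly

# Hypothetical inputs: replace with the values you measure and look up.
example_pixels = 100
example_scale = 1.0        # arcsec per pixel (placeholder)
example_distance = 1000.0  # parsecs (placeholder, NOT the proposal's number)
arcsec, pc, ly = feature_size(example_pixels, example_scale, example_distance)
print(f"{arcsec:.1f} arcsec = {pc:.3f} pc = {ly:.3f} ly")
```

Note that the same feature measured in a survey with bigger pixels spans fewer pixels but the same number of arcseconds, which is one reason to think in angular units rather than pixels.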

Obtaining the catalog data and bandmerging across catalogs

DONE, because I did this for you, except that the "SCUBA 16" may need more work.

Why? We need photometry (=a quantitative measure of brightness) of our sources. Others have already done photometry for us so we don't have to do it ourselves. We need to make the matches across catalogs -- no one else has ever done this before, by which I mean identified which sources in this region are seen in each one of these surveys, and tied the measurements together. (Think about that for a bit -- no one else has ever done this before...)

We found data for this region from CXO, SDSS, IPHAS, 2MASS, Spitzer (cryo and post-cryo), WISE, Herschel, and SCUBA. We have already used FinderChart and other IRSA tools to retrieve 2MASS and WISE (and Spitzer) catalogs. I've used additional archives to pull additional data tables from various places.

Big goal: Learn how to get catalogs so that you can do this in the future without me. Understand what bandmerging is and why we need to do it.

Relevant links for reference:

The process of merging the bands across catalogs is called "bandmerging." I did this for you because it would be a GIGANTIC pain in Excel (especially for 30,000 sources), or (worse) by hand. I've heard TopCat can do it easily, but I've never used that.

Process (What I did):

  1. Make a "master catalog"
    1. Download catalogs from these sources over our region.
    2. Using a computer, load in catalogA. Then, metaphorically sit on each source in catalogB and look for a match in catalogA. If I found a match, I associated those sources. If I did not find a match, sometimes I added the entire source to the catalog, and sometimes I just dropped it. (There are a LOT of sources here, many we do not care about, so having each and every source in here is less important than it might be.)
    3. Now I have a merged catalogA+catalogB. Do the same for each source in catalogC, such that I merge in catalogC with catalogA+B. Repeat for catalogD, etc.
  2. Identify "interesting sources" in the master catalog
    1. Use Rob Gutermuth's approach to identify YSO candidates with IR excesses. Tag those as interesting.
    2. Find the objects with an X-ray detection and star-like SEDs. Tag those as interesting.
    3. Find the objects that are variable in the YSOVAR data. Tag those as interesting.
    4. Find the (very few) objects that are tagged by the IPHAS people as Halpha-bright. Tag those as interesting.
    5. Find the 16 SCUBA sources and tag those as interesting.
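
To make the matching step in part 1 of the process concrete, here is a toy Python sketch of positional bandmerging. It is NOT the code I actually used. The function names, the 1 arcsec matching radius, and the example rows are all assumptions for illustration. It is also brute force (every source in catalogB is compared against every source in catalogA), which is fine for a handful of sources but slow for 30,000, and it does nothing clever when two catalogB sources land on the same catalogA source (the later one simply overwrites the earlier one).

```python
import math

def separation_arcsec(ra1, dec1, ra2, dec2):
    """Angular separation in arcsec between two positions given in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula, which is numerically stable for small separations.
    a = (math.sin((dec2 - dec1) / 2.0) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2.0) ** 2)
    return math.degrees(2.0 * math.asin(math.sqrt(a))) * 3600.0

def bandmerge(catalog_a, catalog_b, radius_arcsec=1.0, keep_unmatched=True):
    """Merge catalog_b into catalog_a by nearest position within radius_arcsec."""
    merged = [dict(src) for src in catalog_a]
    n_a = len(merged)
    for src_b in catalog_b:
        best_index, best_sep = None, radius_arcsec
        for i in range(n_a):
            sep = separation_arcsec(merged[i]["ra"], merged[i]["dec"],
                                    src_b["ra"], src_b["dec"])
            if sep <= best_sep:
                best_index, best_sep = i, sep
        if best_index is not None:
            # Matched: attach the new measurements, keep catalogA's position.
            extra = {k: v for k, v in src_b.items() if k not in ("ra", "dec")}
            merged[best_index].update(extra)
        elif keep_unmatched:
            # Unmatched: either add the whole source or drop it.
            merged.append(dict(src_b))
    return merged

# Hypothetical example rows (made-up positions and magnitudes).
catalog_a = [{"ra": 346.4625, "dec": 62.5153, "J": 12.3},
             {"ra": 346.5000, "dec": 62.5000, "J": 14.1}]
catalog_b = [{"ra": 346.4626, "dec": 62.5153, "W1": 11.0},
             {"ra": 346.6000, "dec": 62.6000, "W1": 13.5}]
merged = bandmerge(catalog_a, catalog_b)
for row in merged:
    print(row)
```

Merging a third catalog is just `bandmerge(merged, catalog_c)`, which mirrors the "A+B, then C, then D" sequence described above. The choice of matching radius is where spatial resolution comes back to bite you: too small and real counterparts are missed, too large and unrelated neighbors get glued together.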

Note: There are still unresolved issues with source matching across bands for the "SCUBA 16". When we get the PACS data, some of these will be solved. I believe that there are some SPIRE sources that are still unreported in the source list we got from the SPIRE team, and I bet similar issues will be present in the PACS data. There are several sources that appear to be missing counterparts; these all need to be checked again in the context of the SED checking below.

After this process, I have about 30,000 objects in a master catalog, about 300 of which we care about. Remember that we are interested in the UNION of all the "interesting objects", that is, things that are tagged as YSO candidates from IR, X-ray, Halpha, SCUBA, and/or variability.

Previously identified sources

DONE because you worked on it this Spring

Why? Others have gone before us, and it pays to learn from them rather than reinvent the wheel.

Big goal: Understand what has already been studied and what hasn't in the region we care about.

Relevant links for reference: "How can I find out what scientists already know about a particular astronomy topic or object?"; "I'm ready to go on to the 'Advanced' Literature Searching section"; and IRSA Viewer

Process (what we did before):

  1. Search ADS, SIMBAD.
  2. Identify literature of relevance. (In this case, really just the SCUBA paper and the Halpha paper.)
  3. Read literature.
  4. Extract from it the data we care about.
  5. Assess how good the positions are. Can we blindly match these sources to the ensemble catalog (which has positions better than an arcsec)?
  6. If not, use FinderChart to investigate each source by hand to 'correct' its position to be one that can be merged blindly with the rest of the catalog (which is based on 2MASS and Spitzer positions, which should be good to within an arcsecond).
  7. Then, merge in the literature observations (SCUBA sources, Halpha sources).
  8. Tag the interesting objects as interesting in the database so we can find them again. Keep careful track of the multiple short-wavelength sources that could be tied to just one SCUBA source.

Data Tables (part 1) and Color-Color and Color-Magnitude Diagrams (part 1)

DO THIS! Skipped for now. Consolidate with later tasks?

Why? Rob found sources with YSO-like colors by making a bunch of CMDs and CCDs and selecting objects from these diagrams. It behooves us to get a sense of what he did. (And, we will need to make more CMDs/CCDs downstream.) The lab this morning was a metaphor. Let's at least try a few similar plots.

Big goal: Learn how to manipulate data tables using IRSA Viewer for now (because it's easier for a quick plot and because it handles 30,000 sources more elegantly than Excel does). Make some color-color and color-mag plots to compare to plots in the literature. Do they look similar? We will use WISE rather than Spitzer data for this because WISE data comes in units of mags and because the Spitzer data we are using is not transparently visible as a single catalog to IRSA tools.

Relevant links for reference:

Process: Go get a WISE catalog for our region from IRSA, not me. Look at the data tables and Xavier's paper (available on the Box drive) to identify what you should plot. Make some plots that he made and see how our region compares to the regions he used in his most recent paper.

Advice and Hints: Remember that a dust-free star should have zero infrared color for basically any combination of bands. (At least, it is 0 as long as the color is (shorter wavelength) minus (longer wavelength)!) You may find that W3-W4 is notably NOT zero for rather a lot of objects, because the only objects seen at W3 or W4 at the distances we are talking about here are the ones notably bright at W4, so they all are brighter than plain stars at W4. This is going to be a different morphology than an IRAC color-color diagram (I1-I2 vs I3-I4 plot), where a much larger number of sources are seen, they are closer on average, and a large fraction of those are plain stars. This gave me heart failure during the 2012 summer visit until I realized this. YSO candidates are bright and red, generally. There are other CMDs you can try. After we include some optical data, there will be even more CMDs we can try.

Specific questions/tasks for you: In his paper, Xavier was not using our region. His plots WILL look different than ours. But can you find points from Ceph-C that are in the same region as the YSOs in Xavier's plots?

  1. His Fig. 4, left, has w1-w2 on the y axis and w2-w3 on the x axis. Does it look like ours? Why or why not? (Hint: what region is plotted?)
  2. His Fig. 4, right, has w1 on the y axis and w1-w3 on the x axis. Make this plot in IRSA Viewer. Do we have objects in the same place as the YSOs?
  3. BONUS: Keep going with more plots from Xavier's paper. Do our versions of his plots look like his? Why or why not?
  4. BONUS #2: Make a plot of SDSS data -- put i-z (psfMag_i-psfMag_z) on the x axis and r on the y-axis (-psfMag_r to get bright objects at the top). The SDSS pipeline has 5 bands of data to play with, and so it makes a guess as to whether it thinks each source is a galaxy or star (or something else). Toggle back to the table view, and filter it down to be just stars. What does the plot look like? Toggle back to the table view and filter it instead to be just galaxies. How is the plot different, and why?

How I would do this (there is more than one way...):

  1. Use FinderChart.
  2. Search on 23:05:51 +62:30:55, 30 arcminutes, and ask for the catalogs "within the image boundary".
  3. When the catalogs load, there are icons in the upper right of the browser window, one of which denotes "table view" and the other of which denotes "plot view". This is how you toggle between plots and a table view of the catalog you have in the foreground. Click on the plot icon.
  4. Click on the gears icon to change what is plotted. Make Xavier's Figure 4, left. Change the x-axis to w2mpro-w3mpro and the y-axis to w1mpro-w2mpro. You can change the limits of the plot under "more options" to match Xavier's plot more closely.
  5. Look at the plot and think about how it is different from Xavier's plot, and why.
  6. Repeat for other plots in the paper. To get bright objects on the top, plot, say, "-w1mpro."
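
If you would rather see the arithmetic behind those plot axes spelled out, here is a small Python sketch. The column names w1mpro, w2mpro, and w3mpro are the ones from the steps above; the rows themselves are made-up numbers standing in for a real WISE catalog download, and the helper name is hypothetical.

```python
# Hypothetical rows standing in for a WISE catalog download; the column names
# w1mpro, w2mpro, and w3mpro are the ones used in the steps above.
rows = [
    {"w1mpro": 10.0, "w2mpro": 10.0, "w3mpro": 10.0},   # bare star: colors near zero
    {"w1mpro": 11.5, "w2mpro": 10.7, "w3mpro": 8.2},    # red in both colors
    {"w1mpro": 13.0, "w2mpro": 12.9, "w3mpro": None},   # no W3 measurement
]

def color(row, short_band, long_band):
    """Return (shorter wavelength) minus (longer wavelength), or None if missing."""
    if row[short_band] is None or row[long_band] is None:
        return None
    return row[short_band] - row[long_band]

# Color-color diagram: x = W2-W3, y = W1-W2 (the axes of Xavier's Fig. 4, left).
points = []
for row in rows:
    x = color(row, "w2mpro", "w3mpro")
    y = color(row, "w1mpro", "w2mpro")
    if x is not None and y is not None:
        points.append((x, y))

# Color-magnitude diagram: x = W1-W3, y = -W1 so that bright objects plot at the top.
cmd_points = [(color(r, "w1mpro", "w3mpro"), -r["w1mpro"])
              for r in rows if color(r, "w1mpro", "w3mpro") is not None]

print(points)
print(cmd_points)
```

Sources with a missing band simply drop out of any plot that needs that band, which is part of why the W3 and W4 plots contain so many fewer objects than the W1 and W2 plots.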

Image Inspection

STARTED, BUT NEED TO FINISH, probably concurrently with the SED inspection below.

Why? OK, we've picked candidate YSOs based on a variety of techniques. For each of the sources in which we are interested, are they really point sources in the images? (AKA, Do you believe what the computer is telling you?) Example: Do you believe the computer if it says that there is a detection there, especially at 22 um?

Relevant links for reference: FinderChart

Process (what we got started on, June 2016):

  1. Assemble list of sources in which we are interested from work above.
  2. Feed list to FinderChart and load POSS, 2MASS, and WISE images. Watch the size of the images you retrieve because it matters for context and automatic stretching that FinderChart does.
  3. Inspect each image. Is it really there at all bands? Is it a point source? Remember the reason that the source is on the list in the first place. (This is encoded in the stuff I gave you - see list next section.) I expect sources selected from the IR properties (IR excess or YSOVAR variability) to have some WISE data, because they should be (*should* be) bright enough for WISE. Stars that are Halpha-selected may in fact NOT be detected in WISE. Resolution matters.
  4. Sources may well not be there in POSS, or many of the SDSS bands, but that won't affect whether or not we think they are good YSO candidates, because (a) we aren't using photometry from POSS; (b) POSS is relatively shallow (the SDSS and IPHAS data go deeper -- reach fainter magnitudes -- than POSS); (c) we've selected these (mostly) based on IR properties, not optical properties.
  5. For each source that may be questionable, check on it in the other images we have. Note that this is harder because the Spitzer, Chandra, IPHAS, Herschel, and SCUBA data are not in FinderChart. Of all of these more-difficult-to-check images, the Spitzer data is the most important. You can force the SHA to help you find these sources in the SEIP images. For the 16 SCUBA sources, you already saw that it's very important to use all the long-wavelength data we have to trace the same sources across bands (and much less important to use the optical).
  6. For each source, check and see if we all agree. Reconcile differences. Note that all of this may be best done in concert with SED assessment in a few steps.

Data Tables (part 2): Moving into Excel


Why? For each of the sources we care about, we need to make SEDs so that we can decide if these sources have IR excesses. We also need to make CMDs/CCDs. Getting data tables into Excel is the first step in that process.

Relevant links for reference: YouTube video on what tbl files are, how to access them, and specifically how to import tbl files into xls. (10min)

Process: Get the bandmerged catalog for the ~300 objects of interest into Excel, with all columns divided appropriately. I've made a .csv file for you, so this should be trivial. Do the same for the complete (~30,000 object) catalog, if your Excel can handle it.

Hints and Advice: Note that many data tables come with many, many, many lines (like more than 100) at the top explaining what the contents of the file are. These are useful for keeping with the file (like a FITS header is useful to keep with the image), but when reading it into Excel, you may wish to delete all but a note to yourself about what the file is, and the headers of the data columns themselves. Personally, I recommend generally keeping the original file and giving subsequent files similar names. For example, iphas.original.txt, iphas.xlsx, etc.

It's useful to keep track of why the sources are in the list. Values for the "whyhere" column are combinations of letter codes:

  • I = selected based on IR colors (Rob's approach; I=*I*RAC)
  • X = selected based on X-ray detection + star-like SED (X=*X*-ray)
  • V = selected based on it being variable over the YSOVAR campaign (V=*V*ariable)
  • H = selected based on it being bright in Halpha (only two of these; H = *H*alpha)
  • L = selected because it matches a SCUBA source (L = "*L*ong wavelength" because I kept getting "S" confused with Spitzer rather than SCUBA)
  • L? = selected because it MAY match a SCUBA source (source confusion issues we discussed before)
  • C = selected because it seems to have changed in brightness significantly between the cryo and YSOVAR data (C = *C*ryo-to-YSOVAR variable)

Many have just one code, but many have multiple codes. IVC means the source is in the list because it has IRAC colors consistent with a YSO (I), it's variable over the YSOVAR data (V), and variable between that and the cryo data (C).
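
If you end up filtering the catalog by these codes outside of Excel, a decoder might look like the following Python sketch. It assumes the codes are simply run together with no separators (as in the IVC example) and that "L?" is the only two-character code; the function and dictionary names are made up for this example.

```python
WHYHERE_CODES = {
    "I": "IR colors (Rob's approach)",
    "X": "X-ray detection + star-like SED",
    "V": "variable over the YSOVAR campaign",
    "H": "bright in Halpha",
    "L": "matches a SCUBA source",
    "L?": "may match a SCUBA source",
    "C": "changed in brightness between the cryo and YSOVAR data",
}

def decode_whyhere(value):
    """Split a whyhere string such as 'IVC' or 'IL?' into its reasons."""
    reasons = []
    i = 0
    while i < len(value):
        # 'L?' is the only two-character code, so check for it first.
        code = value[i:i + 2] if value[i:i + 2] == "L?" else value[i]
        if code not in WHYHERE_CODES:
            raise ValueError(f"unknown whyhere code {code!r} in {value!r}")
        reasons.append(WHYHERE_CODES[code])
        i += len(code)
    return reasons

print(decode_whyhere("IVC"))
print(decode_whyhere("L?"))
```

Raising an error on an unknown letter is deliberate: a typo in the whyhere column is much easier to fix when it is caught right away than after it has silently dropped a source from a selection.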

  • CAUTION 1: There are multiple files from me with everything in the region (long) and files with just the things in which we are interested (short). Some have many columns (wide), and some have just a few (narrow). Look at the filename and contents, and ask questions until you are sure you are using the right file for whatever task you're working on.
  • CAUTION 2, AND THIS ONE'S A BIGGIE: These catalog files generally have a mixture of detections and limits, measurements and errors, flux densities and magnitudes. You will need to be careful in importing this into Excel. The data are NOT all Vega mags; the SDSS measurements are AB mags. Some of the measurements come in flux densities, not magnitudes.

BONUS: Try making some color-color or color-magnitude diagrams of just the interesting sources. For example, make a new column for W1-W4 and program Excel to do the math for you. Plot W1 vs. W1-W4. Make sure the axes go in the correct direction such that brighter objects are at the top. Compare this to the plot of everything in the field that you made a few steps above using the full WISE catalog. What is different, and why?

Making SEDs


Why? For each of the sources we care about, we need to make SEDs so that we can decide if these sources have IR excesses. Let's do this!

BRACE YOURSELF: lots of math and spreadsheet programming here... (You may already have developed some of these skills via the shortlist of stuff I sent before. If not, this is the time to learn!) You WILL do this more than once to get the units right!

Relevant links for reference:

Process: Program a spreadsheet to convert between mags and flux densities. Make at least one SED yourself. Even if you run out of time so that you don't actually make your own SEDs right now, make sure you understand how to get the fluxes from the magnitudes (and then make your own SEDs later). This is not easy to do right the first time, so you will get the wrong answer the first few times you try.

We will ultimately need to make SEDs for everything, but for purposes of this example, let's work with these three objects from our shortlist: 230443.74+623406.8, 230444.46+623233.4, 230446.76+622907.8. Start with just one. You will ultimately plot log (lambda*F(lambda)) vs log (lambda) -- see the Units page. It will take time to get the units right, but once you do it right the first time, all the rest come along for free (if you're working in a spreadsheet). Spend some time looking at these SEDs. Look at their similarities and differences. Make sure to keep careful track of those things that are limits rather than detections. (Build skills for next step.)

AT MINIMUM, the goal here is to get at least an optical+2MASS+Spitzer SED just for the three sources I'm asking about here. Therefore, you may wish to start from the most pared-down version of the Excel spreadsheets I've provided. You can always do more if you are feeling ambitious (e.g., doing all optical through 850 um, or doing more SEDs than just these 3).

Another try at explaining:

  • What do you have? for the minimum 3: SDSS ugriz (in AB mags), 2MASS JHK (in Vega mags), Spitzer I1I2I3I4 (in Vega mags). More complete list: r'i'Ha, JHKs, WISE, Spitzer data, all in Vega mags. ugriz in AB mags. Herschel and SCUBA in flux densities, but slightly different units -- Herschel is in mJy and SCUBA is in Jy.
  • What do you need to get? Everything into Jy, which are units of Fnu -- look up how to convert between mags and flux density (Units page and Central wavelengths and zero points). Then convert your Fnu in Jy into Fnu in cgs units, erg/s/cm^2/Hz, so multiply by 10^-23. Then convert your Fnu into Flambda in cgs units, so multiply by c/lambda^2, with c = 2.998 x 10^10 cm/s and lambda in cm (not microns!). Then get lambda*Flambda by multiplying by lambda in cm. Plot log (lambda*Flambda) vs. log (lambda).
  • Once you make your first SED correctly, the rest are easy. But that first one is hard!
  • Ultimately (see next section), you need to look through each of the SEDs and decide which look like you expect, which need photometry to be checked, and which seem unlikely to be legitimate YSOs. This is a judgement call, and your judgement will improve with time as you gain some experience. (This is also the next step.)
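
Here is the same unit chain written as a short Python sketch, in case it helps you debug your spreadsheet formulas. The function names are made up. The 2MASS Ks zero point and wavelength (666.7 Jy at 2.159 microns) and the AB zero point (3631 Jy) are the commonly quoted values, but please check every band against the Central wavelengths and zero points page rather than trusting this example. The Ks = 10.0 mag input is made up.

```python
import math

C_CM_PER_S = 2.998e10   # speed of light in cm/s
JY_TO_CGS = 1.0e-23     # 1 Jy = 1e-23 erg/s/cm^2/Hz

def vega_mag_to_jy(mag, zero_point_jy):
    """Vega magnitude to flux density in Jy, given that band's zero point in Jy."""
    return zero_point_jy * 10.0 ** (-0.4 * mag)

def ab_mag_to_jy(mag):
    """AB magnitude (e.g., SDSS ugriz) to flux density in Jy."""
    return 3631.0 * 10.0 ** (-0.4 * mag)

def sed_point(flux_jy, wavelength_um):
    """Return (log10 lambda [cm], log10 lambda*F_lambda [erg/s/cm^2])."""
    lam_cm = wavelength_um * 1.0e-4               # microns -> cm
    f_nu = flux_jy * JY_TO_CGS                    # Jy -> erg/s/cm^2/Hz
    f_lambda = f_nu * C_CM_PER_S / lam_cm ** 2    # F_nu -> F_lambda
    return math.log10(lam_cm), math.log10(lam_cm * f_lambda)

# Assumed 2MASS Ks values; verify against the zero points page.
KS_ZERO_POINT_JY = 666.7
KS_WAVELENGTH_UM = 2.159

flux_jy = vega_mag_to_jy(10.0, KS_ZERO_POINT_JY)      # made-up Ks = 10.0 mag
log_lam, log_lam_flam = sed_point(flux_jy, KS_WAVELENGTH_UM)
print(f"F_nu = {flux_jy:.5f} Jy")
print(f"log(lambda) = {log_lam:.3f}, log(lambda*F_lambda) = {log_lam_flam:.3f}")

# Herschel fluxes arrive in mJy, so divide by 1000 before calling sed_point.
herschel_jy = 250.0 / 1000.0                           # made-up 250 mJy source
```

A handy sanity check is that lambda*F_lambda must equal nu*F_nu, with nu = c/lambda. If your spreadsheet gives two different numbers for those, the units are wrong somewhere (most often microns left where cm were needed).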

You can do all of this in one massive spreadsheet such that you do the calculations for all ~300 SEDs at once. This is the power of Excel. Or, you can make one at a time. (You will probably need to plot one at a time anyway, because stupid Excel.) You can start from the Excel you yourself created in the prior task, or from mine. Your call.

Questions for you:

  1. What do the IR excesses look like in your plots for these three example sources? Do they look like you expected? Like objects in Monday's ppt or elsewhere?
  2. BONUS, if you made more SEDs: For comparison, find these objects and look at their SEDs: 230637.76+622857.3 and 230637.57+622726.5. Do you expect them to have a large excess based on [3.4]-[22]? What do they look like? Why are we considering these objects?

Assessing SEDs


Why? Now that you've made at least one SED, you know how to do this. We will wave our magic wand and assume you can do it, given enough time, for all 300 sources of interest. With some help from me, then, to jump that barrier in the limited time we have here at Caltech, we need to next look at the SEDs from all 300 sources of interest with a critical eye. They will not all be clean and neat. We will need to fold in information learned from the image assessments.

Big goal: Understand what an SED is and why it matters. Understand what to expect in a YSO SED and how to discard objects for having questionable SEDs (or put them on the list for checking source matching, photometry, etc).

Relevant links for reference:

Process Overview: Examine the SEDs for all of our candidate objects. Use them to further evaluate the quality of the YSO candidates from the YSO candidate list. Combine with notes from the image assessment (or redo image assessment on the fly) to decide if each is a good candidate. Identify the bad ones, and discuss with the others why/whether to drop them off the list of YSO candidates. Look at their similarities and differences. Make sure to recognize those points that are limits rather than detections, and assess the SED appropriately. After you get through the SED assessment, we will reconvene and compare all our notes. Then, we will have a set of objects in which we are interested, and we should have (will have) notes on each of the objects, obtained from the image check and the SED check. We need to next collate all of these such that we can tag objects as "unlikely to be real YSOs", or "YSO that does not seem to have a disk", or "still surviving as new YSO candidates" or "maybe this is ok" or some other status.

More mechanics of process: Get the file with all the SEDs in it from the Box drive. We will do the first 12 (the first page) as a group. Then you should work as a 2 or 3-person team and go through each of the objects you're assigned and make notes about what you see. There is a blank Excel file in the Box drive for you to use, or maybe we should use a Google doc to collect responses. (Ideally, we would have at least 5 teams of 2 or 3, and have at least 2 teams do each source. For 300 sources split among 5 teams, that's at absolute minimum ~60 per team to have just one team per object.)

Questions to think about for each source: Does it look like a YSO SED? Does it look like the data weren't tied to the correct source or there were spatial resolution problems? Are there disagreements (e.g., WISE and Spitzer don't agree)? Look at the images at the same time if at all possible... Are there data missing that should be there (e.g., there are no 2MASS data in the SED but you can see a source at that location in the 2MASS images)? Discuss with your partner whether or not you believe each source, and why. Do you believe each point in the SED? If not, why not? Don't forget to compare your notes on SEDs with notes on images (e.g., if you decide it is "iffy" in images AND "iffy" in SEDs, chances are excellent that it is not a good candidate). Keep good notes on this! (If we are working in a communal document, be sure not to overwrite each other!)


  1. Limits in SEDs. Sometimes the source is too bright or too faint for these catalogs. In either case, the catalogs will often report, in essence, "I don't know how bright this thing really is, but I can tell that it must be brighter or fainter than this." That is what a limit means. The limits can be important in the SEDs -- because we are combining catalogs with different sensitivities, there are objects that are undetected. Limits can also help us determine if the source is correctly matched across bands -- detected at a particular brightness but also having a limit at a nearby band that is much below that detection suggests a source mismatch. I may have inverted the direction of the limits in places -- please tell me if you find these.
  2. Accreting young stars, or stars that are rotating quickly (often because they are young), are bright in Halpha and ultraviolet (u band). If Halpha or the u-band points are much above the rest of the SED, that's OK, and in fact a GOOD THING, because that is more evidence that the star is young. The Halpha point is red and the u band point is blue in my SEDs to help you remember this.
  3. Remember that in the context of IR excesses and SED classes, 2 to 25 microns is the most important. There may be wackiness in the optical, and it may matter (especially if there is a mismatch between sources) but wackiness in the optical is less of a critical issue than if, say, IRAC 3.6 microns doesn't match with WISE 3.4 microns, because we care more about the IR side of the SED. Longer wavelengths than 25 um help us understand the nature of the object, but the formal SED classification is based on 2-25 um (only).
  4. Black triangles = IPHAS ri; red triangle = Halpha (which means that bright is ok). Black + signs = SDSS griz; blue + sign = SDSS u (which means that bright is ok). Arrows anywhere are limits in the direction indicated. Black diamonds = 2MASS. Blue circles = IRAC; green circles = mean (average) IRAC from YSOVAR light curves. Black stars = WISE. Black box (near the IRAC points) = MIPS. Longer-wavelength SEDs continue: asterisks out here mean SPIRE, and boxes/arrows out here mean SCUBA.

CMDs and CCDs, part 2


Why? We have weeded the objects in which we are interested to get rid of the things that really have problems. We now have a shortlist of things that we are starting to believe may be YSOs. We have a lot of ancillary data, and we can do more checking using these data to see how confident we are in these objects, so that we can refine our list of true YSO candidates vs. not.

We will reassess on the fly, but most likely, we will have everyone make one plot (like we did for the SEDs) and then we will wave our magic wand and assume you can make these plots, given enough time, for our entire catalog, and highlight the sources of interest. Again with some help from me (I will make several CMDs and CCDs), we will talk about them as a group.

Process: By this point you should have a list of things that have survived tests. You can make a wide variety of CMDs and CCDs with these objects highlighted. We will then go through each of them and decide what to believe.

Make a WISE or Spitzer color-mag diagram such as one of the ones you made earlier, but this time do it in Excel. Overplot the survivors of the image and SED tests on this diagram. Where do they fall? Are they where you expect them to be? You may wish to plot the data for *everything* and then just overplot the interesting ones; I find this makes it easier to identify whether or not the objects of interest are where they should be. Or you may wish to just plot the interesting ones, and compare to plots of other regions in the literature; it's up to you. Tag any of the survivors as less likely if they aren't where they should be.

Repeat for several other color-color and color-mag diagrams. Don't forget about reddening. Are the sources where you expect them to be? I've made a few, but I'm sure you can think of more. Remember that the convention is shorter wavelength band - longer wavelength band for colors.

We have optical data for a lot of these objects. You can make several different possible optical CMDs and see if the "literature YSOs (that may or may not have a disk)" and "still surviving as new YSO candidates" fall in the 'right place' in these diagrams. For context, you could take all the optical data we have for all the objects in the region and plot r vs r-i from the IPHAS data. Overplot the survivors on this diagram. Are they where they should be (above the ZAMS)? Do this again, but for r-Halpha vs. r-i. Are the survivors where they should be (substantially above the unreddened main sequence locus)?

Analyzing SEDs


Why? There are empirically defined groupings of SED shapes -- Class 0s and Class Is are the most embedded (presumably youngest); Class IIIs are the least embedded (presumably oldest). How do our new YSOs compare to this? Have we mostly found Class IIs?

Process: Add a new column in Excel to calculate the slope between 2 and 25 microns (using just 2 points) in the log (lambda*F(lambda)) vs log (lambda) parameter space. This task only makes sense for those objects with both K band and either MIPS-1 or WISE-4 detections. (For advanced folks: fit the slope to all available points between 2 and 25 microns. How does this change the classifications, if at all?)

  • if the slope > 0.3 then the class = I
  • if the slope < 0.3 and the slope > -0.3 then the class = 'flat'
  • if the slope < -0.3 and the slope > -1.6 then class = II
  • if the slope < -1.6 then class = III

These classifications come from Wilking et al. (2001, ApJ, 551, 357); yes, they are the real definitions (read more about the classes here)!
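
The two-point slope and the class cutoffs above translate directly into a few lines of Python, sketched here so you can cross-check your Excel column. The function names are made up, the input values are invented, and how to treat a slope that lands exactly on a boundary (0.3, -0.3, or -1.6) is an assumption in this sketch, since the list above only gives strict inequalities.

```python
import math
from collections import Counter

def sed_slope(lam1_um, log_lam_flam_1, lam2_um, log_lam_flam_2):
    """Two-point slope in log(lambda*F_lambda) vs log(lambda)."""
    # The slope is the same whether lambda is in microns or cm, because only
    # the DIFFERENCE of the logs enters.
    return ((log_lam_flam_2 - log_lam_flam_1)
            / (math.log10(lam2_um) - math.log10(lam1_um)))

def sed_class(slope):
    """Assign an SED class from the 2-25 micron slope."""
    # Assumption: values exactly on a boundary go to the earlier class.
    if slope >= 0.3:
        return "I"
    if slope >= -0.3:
        return "flat"
    if slope >= -1.6:
        return "II"
    return "III"

# Made-up example: K band (2.159 um) and 24 um points for one source.
slope = sed_slope(2.159, -10.0, 24.0, -11.0)
print(f"slope = {slope:.2f}, class = {sed_class(slope)}")

# Counting classes for a list of (made-up) slopes.
example_slopes = [0.5, 0.0, -0.9, -1.2, -2.8]
counts = Counter(sed_class(s) for s in example_slopes)
print(counts)
```

For reference, a bare stellar photosphere falls off steeply at these wavelengths (a slope near -3 in this parameter space), so it lands firmly in Class III. Anything much shallower than that between 2 and 25 microns is a sign of extra infrared light.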

  1. How many class I, flat, II and III objects do we have? What are any implications for apparent ages?
  2. Where are the objects with infrared excesses located on the images? Are all the Class Is in similar sorts of locations, but different from the Class IIIs?
  3. Look at where the "16 SCUBA sources" fall with respect to, well, everything else on these plots. Are they similar to or different from the YSO candidates selected via other mechanisms? Do the positions have anything to tell us about some of the source confusion issues?

For very advanced folks: suite of online models from D'Alessio et al. and suite of online models from Robitaille et al. Compare these to the SEDs we have observed.

Going back to check the literature

DO THIS (if we can).

Why? We did a pretty thorough literature check a few months ago, but (a) it's not infallible, and (b) astronomers have kept on publishing since then. For each of the objects we are asserting are new YSOs, we should go back and check the literature to be sure, e.g., that someone else didn't just publish a spectrum that says it's a carbon star (meaning, not a young star).

Process: Go back into SIMBAD and search for each of our sources. Has anyone done anything on them before? Are we the first ever on our planet to care about this source? Keep careful notes!

Putting this in context a little: Science


Why? We've been doing a lot of nitty gritty work with the data. But now it's time to back up and look at the big picture again.

Goal: put our work in context with the literature.


  1. Compare the refined YSO yield now with what I had done before blindly, just taking the computer to be correct. What are the ratios of, say, Class I to the total number of YSOs?
  2. Look at the various ratios in the Rebull et al. (2014) paper that tries to put all the clusters in context with each other. How does our new assessment of YSOs in Ceph C compare with the old ratios, and how does it compare to the other YSOVAR clusters?

Writing it up!


Why? Now that we have completed a lot of work, we need to tell other people what we did, and what we found out about the Universe that no one else knows yet.

Goal: We need to write an AAS abstract and then the poster.

We need to include:

  1. How the data were taken.
  2. How the data were reduced.
  3. What the IR properties are of the previously identified YSOs here, in context with other observations from the literature.
  4. What the IR properties are of the new sources we have found, including objects you think are new YSOs (or objects you think are not), and why you think that.