The DIRAC Institute in the Department of Astronomy at the University of Washington is seeking applicants for two postdoctoral positions. Successful candidates will have a strong research record in the development of statistical techniques or algorithms for analyzing large astrophysical data sets.
AstroML: The first position is to help develop the second edition of astroML (http://astroml.org), a popular Python-based machine learning package for astrophysics. New components being incorporated within astroML include methodologies from deep learning and hierarchical Bayesian statistics. Special emphasis will be placed on building a broader community and making astroML a sustainable open-source project. The successful candidate will lead these activities, including the application of the new codes to datasets available to UW researchers.
Time Series Data: The second position is to develop new approaches for analyzing astronomical time series data using modern computational frameworks. The goal of this framework is to enable science with the ZTF and LSST data sets. Promising applicants should have an interest in time domain science and experience with, or interest in, databases and large-scale compute platforms such as Spark, Dask, or similar. Good Python skills and experience with machine learning libraries, processing of astronomical images, or astronomical databases are desirable.
The DIRAC Institute is a newly formed center for data intensive astrophysics at the University of Washington. The Institute consists of six faculty and senior fellows, and over 20 postdoctoral researchers and research scientists. It has active research programs in Cosmology, Solar System science, Milky Way structure, the Variable and Transient Universe, and Astronomical Software.
The University of Washington is a partner in the Zwicky Transient Facility (ZTF) project, a new time-domain survey that will begin operations in early 2018. The UW is a founding partner of the LSST project and leads the construction of its time domain and Solar System processing pipelines. Other research activities at UW/DIRAC include topics in extragalactic science, as well as understanding the structure, formation, and evolution of the Milky Way using large surveys (SDSS, WISE, Pan-STARRS PS1, and others).
A Ph.D. degree in astronomy, physics, computer science, or a related subject is required. The initial appointment is for two years, renewable up to three years, and offers competitive salary and benefits. The appointments are available immediately and are expected to start no later than September 2018.
Applicants should submit a curriculum vitae and a description of research interests (with links to GitHub if relevant), and arrange for three letters of reference to be sent to Nikolina Horvat at email@example.com with the subject line “DIRAC postdoc application (your name)”. Applications will be accepted until the positions are filled; to ensure full consideration, please send your application by December 31st, 2017.
For detailed information about the benefits available through the University of Washington, including dental, medical and disability insurance, retirement, and childcare centers, see the University of Washington benefits page: https://www.washington.edu/admin/hr/benefits/.
The DIRAC Institute is a community of people with diverse interests and areas of expertise, engaged in the understanding of our universe through the analysis of large and complex data sets. We are an open, ethical, highly engaged and collaborative community based on trust, transparency and mutual respect. We believe in providing a welcoming and inclusive environment, in the importance of quality of life, in embracing diversity, in making a difference and having fun.
Monday, February 10th, 2020 @ 12:30pm
PAA, Room A214
Supermassive black holes – with masses of millions to billions of times that of the Sun – reside in the nuclei of galaxies. While black holes are not directly visible, surrounding material becomes extremely luminous before being accreted, creating telltale signatures of black hole activity. In turn, the amount of activity tells us about black hole growth, and about energy injection back into the host galaxies. This so-called black hole feedback is thought to play a role in regulating the rate at which galaxies form new stars, thereby directly affecting their evolution across cosmic time. After a brief overview, I will highlight new findings from a multi-scale analysis of gas ionization and dynamics enabled by 3D spectroscopy with the VLT/MUSE instrument. I will then present observational constraints on the fueling of black holes, and on the extent to which they can change the fate of galaxies, from statistical analyses of large datasets derived from SDSS. The latter are paving the way to yet larger experiments such as the Dark Energy Spectroscopic Instrument (DESI), which will yield over 35 million spectra of galaxies and quasars. I will conclude by briefly showcasing how the Astro Data Lab (datalab.noao.edu) and other science platforms play a role in the analysis of large datasets to further our knowledge of supermassive black holes, galaxies, and beyond.
Stephanie joined the NSF’s OIR Lab Astro Data Lab team as a staff scientist, coming from a staff scientist position at CEA Saclay in France. She received her PhD in astronomy from the University of Arizona in 2011 under the supervision of the NSF’s OIR Lab’s Mark Dickinson. Her research interests are focused on the evolution of galaxies and supermassive black holes across cosmic time. She brings to the Data Lab team a wealth of experience and ideas in developing and applying new methods for turning large survey data sets into scientific knowledge.
The astroML project was started in 2012 to accompany the book Statistics, Data Mining, and Machine Learning in Astronomy, by Željko Ivezić, Andrew Connolly, Jacob Vanderplas, and Alex Gray.
The astroML Python package is publicly available and designed as a repository of statistical routines and machine learning tools for astrophysics. It builds on the scientific Python ecosystem, on well-known libraries such as NumPy, SciPy, scikit-learn, and Astropy, extending the functionality available in these general-purpose libraries.
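As a flavor of how the package is used, below is a minimal sketch of one of its routines: an adaptive-width histogram computed with Bayesian blocks via astroML's hist() wrapper. The toy data and the exact import path (which may differ between astroML versions) are assumptions, not an excerpt from the astroML documentation.

```python
# Minimal sketch: adaptive-width histogram with Bayesian blocks via astroML.
# The import path astroML.plotting.hist is assumed from recent astroML releases.
import numpy as np
import matplotlib.pyplot as plt
from astroML.plotting import hist

rng = np.random.default_rng(42)
# toy data: two Gaussian "populations" with different widths
x = np.concatenate([rng.normal(-2.0, 0.3, 500),
                    rng.normal(1.0, 1.0, 1000)])

fig, ax = plt.subplots()
hist(x, bins='blocks', ax=ax, histtype='step', density=True,
     label='Bayesian blocks')          # adaptive bin widths
hist(x, bins=30, ax=ax, histtype='step', density=True, alpha=0.5,
     label='30 fixed bins')            # conventional histogram for comparison
ax.legend()
plt.show()
```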
astroML is designed to be a resource for both researchers and students of astronomy and Python. It is envisioned as a community resource, with the development and submission of new algorithms, data sets, and examples supported by GitHub’s collaborative coding interface. In addition to its use in astronomical research, several university courses build on astroML, for example at the University of Washington, the University of Cambridge, and Drexel University.
astroML strives to bring the astronomical community closer to the ideals of Reproducible Research, in which research papers are accompanied by well-written code to reproduce, check, and extend the results. With this in mind we share the source code used to generate the figures in both editions of the textbook in a separate GitHub repository.
Updates and news about the astroML project can be found here.
Searching for faint Solar System objects
Kuiper Belt Objects (KBOs) are a population of Solar System objects that exist beyond the orbit of Neptune. Finding these objects is important because understanding their true distribution teaches us about the formation history of the Solar System, and especially about the evolution of the orbits of the gas giants. However, due to their large distances from the Earth and Sun, KBOs are very faint and hard to find. Some existing techniques for finding moving objects rely on the objects being bright enough for observation in a single image, but here at DIRAC we are working on a type of technique based upon “shift-and-stack” algorithms.
Find more information in the following papers [Gladman & Kavelaars 1997, Kuiper Belt searches from the Palomar 5-m telescope]; [Allen et al. 2001, The Edge of the Solar System]; [Bernstein et al. 2004, The Size Distribution of Trans-Neptunian Bodies].
“Shift-and-stack” techniques are able to find objects fainter than those that can be found in a single image, by shifting multiple images of the same part of the sky along the path of a potential orbit and adding up the light from any moving objects following that orbit. Static objects blur out, while moving objects pop out as point sources, as in the figure below from our paper [Whidden et al. 2019, Fast Algorithms for Slow Moving Asteroids: Constraints on the Distribution of Kuiper Belt Objects].
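As an illustration of the idea (and not the DIRAC production code), a shift-and-stack step can be sketched in a few lines of NumPy/SciPy: shift each exposure back along an assumed sky-plane velocity and co-add, so that a source moving with that velocity stacks coherently. A search over many candidate velocities is sketched after the next paragraph.

```python
# Illustrative shift-and-stack sketch: co-add exposures after shifting each
# one back along an assumed constant sky-plane velocity (pixels per day).
import numpy as np
from scipy.ndimage import shift as nd_shift

def shift_and_stack(images, times, vx, vy):
    """Co-add `images` (list of 2D arrays) taken at `times` (days),
    assuming an object moving at (vx, vy) pixels/day."""
    t0 = times[0]
    stack = np.zeros_like(images[0], dtype=float)
    for img, t in zip(images, times):
        dt = t - t0
        # move each frame back to the epoch of the first exposure
        stack += nd_shift(img, shift=(-vy * dt, -vx * dt), order=1, mode='constant')
    return stack / len(images)
```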
Digital Tracking
This “shift-and-stack” method originally worked by shifting along a limited number of trajectories, on the order of a few dozen, and then looking by eye for point sources in the resulting stacks. More recently, astronomers developed “digital tracking”, in which computers search many more possible trajectories and find point sources in large stacks of data. As we move to larger stacks of images and longer baselines, we are able to find fainter and slower moving objects, but this also creates challenges: the search parameter space grows rapidly as we go further in time, because we want to find the slowest objects without missing faster ones and must therefore search a much larger set of possible orbits. To help solve this problem we developed our technique, Kernel Based Moving Object Detection (KBMOD).
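A toy version of such a digital-tracking search (again illustrative only, not KBMOD itself, which evaluates vastly more trajectories on GPUs as described below) simply scores a coarse grid of candidate velocities with the shift_and_stack() sketch above:

```python
# Illustrative brute-force "digital tracking": score every velocity on a
# coarse grid and keep the trajectory whose stack contains the brightest
# point source. Reuses shift_and_stack() from the sketch above.
import numpy as np

def search_velocity_grid(images, times, vx_grid, vy_grid):
    best = (None, -np.inf)
    for vx in vx_grid:
        for vy in vy_grid:
            stacked = shift_and_stack(images, times, vx, vy)
            score = stacked.max()          # crude detection statistic
            if score > best[1]:
                best = ((vx, vy), score)
    return best   # ((vx, vy), peak flux) of the best-scoring trajectory
```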
KBMOD
To tackle the problem of searching on the order of 10^12 possible orbits, we have turned to Graphics Processing Units (GPUs). GPUs are much better suited to highly parallel applications than traditional CPUs, and since we are repeatedly performing the same operation, adding up the flux values along trillions of trajectories, the GPU is a perfect fit for our algorithm. In fact, KBMOD is capable of searching on the order of 10^10 trajectories in a stack of 10-15 4k-by-4k images in a minute using a consumer-grade GPU. The software is up and running, and in our first application of KBMOD to the 2015 HITS dataset [Förster et al. 2016, The High Cadence Transient Survey (HITS). I. Survey Design and Supernova Shock Breakout Constraints] we discovered 39 new KBOs that were reported to the Minor Planet Center, as well as recovering 6 previously reported objects. Further development of KBMOD is ongoing and we are applying it to new and different datasets.
To follow along with our progress stay tuned to our GitHub Repository.
The increased number of asteroid discoveries over the past few decades, along with the expectation that the number of known asteroids will grow roughly five-fold once the Large Synoptic Survey Telescope (LSST; https://www.lsst.org/) comes online in 2022, brings with it an increased number of potentially hazardous asteroid (PHA) discoveries. PHAs are near-Earth objects – either asteroids or comets – for which the closest points between their orbits and the Earth’s orbit are less than 0.05 astronomical units (19.5 times the distance between the Earth and Moon) apart, and whose diameters are approximately 460 ft (140 m) or greater. An object this large is big enough to devastate a populated region in the case of a land impact or to cause a major tsunami for an impact into the ocean.
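The two thresholds in that definition translate directly into code; the snippet below simply encodes the numbers quoted above (0.05 au and roughly 140 m). The aside about absolute magnitude is how surveys commonly proxy for size, not something stated in this article.

```python
# Direct encoding of the PHA definition quoted above: Earth minimum orbit
# intersection distance (MOID) below 0.05 au and diameter of ~140 m or more.
# Illustrative only; in practice surveys use absolute magnitude (H <= 22)
# as the size proxy rather than a measured diameter.
AU_PER_LUNAR_DISTANCE = 0.00257  # 1 lunar distance expressed in au (approx.)

def is_pha(earth_moid_au, diameter_m):
    return earth_moid_au < 0.05 and diameter_m >= 140.0

print(is_pha(earth_moid_au=0.03, diameter_m=200.0))   # True
print(0.05 / AU_PER_LUNAR_DISTANCE)                   # ~19.5 lunar distances
```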
Not all PHAs are likely to impact the Earth in the foreseeable future, but their orbits put them in close enough proximity to the Earth that they need to be monitored in case their impact probability changes. One way the chances of an impact can change is if a PHA comes sufficiently close to a planet for it to gravitationally tug on the asteroid, changing its orbit enough that its future trajectory puts it on an impact course with the Earth. Fortunately, however, the threat due to asteroid impacts is a natural disaster we have the ability to avoid.
One of the projects supported by the DiRAC Institute is the Asteroid Decision Analysis and Mapping (ADAM) platform being developed by the Asteroid Institute, a program of B612. B612 is a non-profit organization dedicated to protecting the Earth from asteroid impacts as well as advising and advancing decision-making on planetary defense issues on a world-wide scale (https://b612foundation.org).
The ADAM platform is being developed to answer questions such as “How long after discovery does it take for typical Earth-impacting asteroids to be labeled as impact threats?” and “How far in advance do we need to deflect such asteroids to avoid a collision?”. To answer these questions and others, ADAM is being built in Google Cloud, which allows the required computations to be run on a large-scale platform that provides ample data for analysis. One of the goals of ADAM is to make these computations accessible to the greater scientific community, not only in scale and accuracy, but in ease of use. ADAM is being developed as open-source software that, upon completion of initial development and testing, will be available for the scientific community both to use and to contribute to its computational capabilities.
One capability of ADAM is to compute large-scale asteroid orbit propagations, which predict the orbital characteristics and locations of a large set of asteroids at times in the future given their current orbital characteristics. The animation shown here is of an orbit propagation of a synthetic Earth-impacting asteroid; it shows the orbital motion of the four terrestrial planets (Mercury, Venus, Earth, and Mars) as well as the asteroid (labeled 129_2011_04_DeltaV) over a period of roughly 8 months. The animation ends when the asteroid impacts the Earth. This animation was produced using the tools upon which ADAM’s orbit propagation is based; visualization of orbit propagations will eventually be available to users computing propagations with ADAM.
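ADAM's propagator handles planetary perturbations at cloud scale, but the basic idea of orbit propagation, advancing a body from its orbital elements to a later epoch, can be illustrated with a bare-bones two-body (Keplerian) step. The sketch below is not ADAM code, and the in-plane, Sun-plus-asteroid-only setup is a deliberate simplification.

```python
# Not ADAM code: a bare-bones two-body (Keplerian) propagation step, solving
# Kepler's equation to advance a heliocentric orbit from its elements to a
# later epoch. ADAM's propagations include planetary perturbations and run
# at much larger scale in Google Cloud.
import numpy as np

GM_SUN = 0.0002959122082855911  # au^3 / day^2 (solar gravitational parameter)

def propagate_in_plane(a_au, e, M0_rad, dt_days):
    """Return (x, y) in the orbital plane, dt_days after the epoch at which
    the mean anomaly was M0_rad, for semi-major axis a_au and eccentricity e."""
    n = np.sqrt(GM_SUN / a_au**3)           # mean motion [rad/day]
    M = M0_rad + n * dt_days                # mean anomaly at the new epoch
    E = M                                    # solve Kepler's equation M = E - e sin E
    for _ in range(50):                      # Newton iterations
        E -= (E - e * np.sin(E) - M) / (1.0 - e * np.cos(E))
    x = a_au * (np.cos(E) - e)               # position in the orbital plane [au]
    y = a_au * np.sqrt(1.0 - e**2) * np.sin(E)
    return x, y

# e.g. an Earth-like orbit, half a year after perihelion passage:
print(propagate_in_plane(a_au=1.0, e=0.0167, M0_rad=0.0, dt_days=182.6))
```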
Along with computing large-scale orbit propagations, ADAM can calculate the deflection impulse, or nudge, needed to avoid an asteroid impact. Such a nudge could be imparted to an asteroid using a spacecraft called a kinetic impactor, which would rendezvous with an Earth-impacting asteroid before it collides with the Earth and gently push the asteroid by an amount large enough to avoid the collision. The goal of such a maneuver is to ensure the asteroid and the Earth are not at the point where their orbits intersect at the same time, thus avoiding a collision. This is similar to either stepping on the brakes or the gas in your car to avoid a traffic collision.
In a study [1] recently submitted to the journal Icarus and expected to be published in early 2020, members of the Asteroid Institute and the DiRAC Institute led by Dr. Sarah Greenstreet used ADAM to determine the distribution of nudges needed by a large sample of synthetic Earth-impacting asteroids to avoid collision with Earth. They found that the required nudges range from a few hundredths of an inch per second to a few inches per second, depending on the time before impact available to impart the nudge. In terms of the amount of energy this would impart to the asteroid, for a 450-ft-diameter asteroid made of typical rocky material, a nudge of roughly half an inch per second is the equivalent of the energy required to power a 60 W light bulb for one hour.
The researchers found that the required deflection impulse, or nudge, typically changes roughly as the inverse of the time before impact that the deflection impulse can be applied. This means that a nudge applied to an asteroid 20 years before impact needs to be approximately half the size of a nudge applied 10 years before impact to miss the Earth by the same distance.
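Both of these numbers can be checked with back-of-the-envelope arithmetic; in the sketch below the bulk density assumed for "typical rocky material" (about 2000 kg/m^3) is our assumption, not a value taken from the study.

```python
# Back-of-the-envelope check of the numbers quoted above (illustrative; the
# assumed bulk density of 2000 kg/m^3 for "typical rocky material" is ours).
import numpy as np

diameter_m = 140.0                    # ~450 ft
density = 2000.0                      # kg/m^3, assumed rocky bulk density
mass = density * (4/3) * np.pi * (diameter_m / 2)**3   # ~2.9e9 kg

dv = 0.5 * 0.0254                     # half an inch per second, in m/s
kinetic_energy = 0.5 * mass * dv**2   # ~2.3e5 J
print(kinetic_energy / (60.0 * 3600)) # ~1 hour of a 60 W light bulb

# Inverse scaling with warning time: dv ~ 1/t, so doubling the lead time
# halves the required nudge (for the same Earth miss distance).
dv_10yr = 1.0                         # arbitrary units
dv_20yr = dv_10yr * (10.0 / 20.0)     # half as large
print(dv_20yr)
```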
Another finding of the study described above is that a small fraction of the synthetic Earth-impacting asteroid population studied requires a velocity change (nudge) either 10 times larger or 10 times smaller than the median value. This means some asteroids are much harder or much easier to deflect than the typical Earth-impacting asteroid. These types of impact scenarios are important to study in addition to the typical cases in order to best understand the full breadth of the threat due to asteroid impacts.
An additional capability of ADAM currently being developed is determining the evolution of the impact probability for a large sample of synthetic Earth-impacting asteroids. For a given asteroid, as further observations are made after discovery, the computed orbit of the asteroid evolves. Each new observation adds data that can be used to compute an orbit for the asteroid, with each new calculation producing a slightly different orbit until the solution stabilizes and further observations do very little to change the orbit. As the determined orbit evolves, so does the probability of a future Earth impact for the computed orbit. Like the evolution of the asteroid’s orbit, the impact probability changes with additional observations. Studying the impact probability evolution of a large sample of synthetic Earth-impacting asteroids can provide a better understanding of how long it can take to say with confidence that an impact is expected for a wide range of Earth-impact scenarios.
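One common way to quantify such an impact probability (not necessarily the exact method implemented in ADAM) is Monte Carlo sampling: draw orbit "clones" consistent with the current best-fit orbit and its uncertainties, propagate each to the encounter epoch, and count the fraction that hit the Earth. The propagate_to_encounter helper in the sketch below is a hypothetical stand-in for a real propagator.

```python
# Illustrative Monte Carlo impact probability estimate (not necessarily ADAM's
# method): sample orbit clones from the fitted elements and their covariance,
# propagate each clone, and count the fraction that impact the Earth.
# `propagate_to_encounter` is a hypothetical function returning True on impact.
import numpy as np

def impact_probability(best_fit_elements, covariance, propagate_to_encounter,
                       n_clones=10000):
    rng = np.random.default_rng(0)
    clones = rng.multivariate_normal(best_fit_elements, covariance, n_clones)
    n_hits = sum(propagate_to_encounter(clone) for clone in clones)
    return n_hits / n_clones
```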
Altogether, the capabilities of ADAM are helping us to better understand the threat due to asteroid impacts and what we can do to avoid them. This information can further future discussions and decisions regarding impact hazard mitigation on a global scale, as is the mission of the Asteroid Institute.
[1] Greenstreet, S., Lu, E., Loucks, M., Carrico, J., Kichkaylo, T., & Jurić, M., “Required deflection impulses as a function of time before impact for Earth-impacting asteroids”, 2019, Icarus, in review.
Dr. Sarah Greenstreet is a joint postdoctoral fellow with the Asteroid Institute, a program of B612, and the DiRAC Institute at the University of Washington. Her research interests include the study of orbital dynamics and impacts of small bodies in the Solar System.
Our Solar System is the current frontier of human and robotic exploration. Part of planetary science and Solar System astronomy is tasked with informing decisions regarding where and what to explore next. The DIRAC Institute has been helping answer these questions by developing state-of-the-art algorithms that enable the discovery of asteroids and comets — small bodies in our Solar System that very well may be the next points of interest on the ever growing map of our Solar System.
Discovering small bodies is not an easy task. Unlike most astrophysical objects, small bodies move on appreciable time scales; asteroids and comets can move at a wide variety of speeds, and our own motion, the motion of the observer, further complicates the problem. As bigger telescopes are built, the number of observations that need to be processed also increases dramatically. The problem is akin to throwing a bucket full of sand on a table and taking a picture, then having a friend shake the table and taking a new picture: your task is to figure out which grain moved where using just those two images. With unlimited computing resources, the way you would approach this problem is to let a telescope observe the sky and, every time a new detection occurs, test all other unidentified detections for a possible linkage with it. These linkages correspond to orbits, which define how an object moves in space. To discover a moving object is to know, with a high degree of certainty, its orbit. But when a telescope generates millions of new detections in a single evening, it is simply not possible to test every combination of un-linked detections for an orbit.
Astronomers figured out a way to solve this problem: the “tracklet”. A “tracklet” is a combination of two or more detections, with the time between detections typically no more than 30 minutes. A tracklet, which is essentially just a motion vector, constrains the position and speed of a potential moving object. In a 30-minute time span, an asteroid or comet can only have moved a certain distance, and so, by limiting the time between two exposures of the same patch of sky, a survey telescope limits the number of combinations of detections it will need to test for an orbit down the road. Typically, to discover a moving object a telescope needs to observe three tracklets (three pairs of at least two detections) over a two-week window.
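Since a tracklet is just a pair (or more) of detections close in time and sky position, the pairing step can be sketched directly; the snippet below is a toy version, not any survey's production code, and the maximum rate of motion used to reject implausible pairs is an assumed, illustrative value.

```python
# Toy tracklet builder (not any survey's production code): pair detections
# within 30 minutes of each other that are close enough on the sky to be
# consistent with a plausible asteroid rate of motion (assumed value below).
import numpy as np

def build_tracklets(detections, max_dt_min=30.0, max_rate_deg_per_day=2.0):
    """detections: time-sorted list of (t_mjd, ra_deg, dec_deg) tuples.
    Returns pairs of indices forming candidate tracklets (motion vectors)."""
    tracklets = []
    for i, (t1, ra1, dec1) in enumerate(detections):
        for j in range(i + 1, len(detections)):
            t2, ra2, dec2 = detections[j]
            dt_days = t2 - t1
            if dt_days * 24 * 60 > max_dt_min:
                break                       # sorted by time, so stop early
            # small-angle separation, correcting RA for declination
            dra = (ra2 - ra1) * np.cos(np.radians(dec1))
            sep_deg = np.hypot(dra, dec2 - dec1)
            if dt_days > 0 and sep_deg / dt_days <= max_rate_deg_per_day:
                tracklets.append((i, j))    # a candidate motion vector
    return tracklets
```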
Tracklets are problematic. Requiring tracklets for moving object discovery forces telescopes to operate in a very specific way: they must come back to the same area of the sky to take at least a second exposure within 30 minutes. Effectively, discovering moving objects by building tracklets limits a telescope to observing, at best, only half the sky it could otherwise cover in a single night. Datasets from past missions or surveys that did not have this specific observing cadence are also unsuitable for retrospective searches for moving objects, since they do not allow tracklets to be built.
Tracklet-less Heliocentric Orbit Recovery. Aside from the awesome acronym, THOR aims to solve these problems by removing the need for “tracklets” to be observed. The algorithm makes use of certain aspects of the motion of small bodies in the Solar System. THOR assumes a series of test orbits; once a test orbit is assumed, you know exactly where in space that potential object would be at any point in the past or future, which means you can look for it in datasets from any survey, regardless of the time between detections (i.e., no tracklets needed!). Naively, if there are 800,000 objects you would need to test 800,000 orbits to discover them all. However, small bodies in the Solar System tend to have similar orbits. THOR exploits this fact, so instead of needing one test orbit per object, a single test orbit can be used to discover hundreds or even thousands of objects. The power of the THOR framework is that all you need to discover more moving objects is another well-selected test orbit.
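The central idea can be caricatured without any of THOR's actual machinery (see the repository linked below for the real implementation): propagate a single test orbit to every exposure time and gather the detections that fall near its predicted positions, since objects on orbits similar to the test orbit stay close to those predictions. The predict_radec helper here is a hypothetical stand-in for an ephemeris call, and the simple distance cut glosses over the clustering THOR actually performs.

```python
# A caricature of the THOR idea, not the actual algorithm: propagate one test
# orbit to every exposure time and keep detections near the predicted positions.
# `predict_radec` is a hypothetical stand-in for an ephemeris/propagation call.
import numpy as np

def gather_candidates(detections, exposure_times, test_orbit,
                      predict_radec, tol_deg=1.0):
    """detections: dict mapping exposure time -> iterable of (ra, dec) in deg."""
    candidates = []
    for t in exposure_times:
        ra_pred, dec_pred = predict_radec(test_orbit, t)
        for ra, dec in detections[t]:
            dra = (ra - ra_pred) * np.cos(np.radians(dec_pred))
            if np.hypot(dra, dec - dec_pred) < tol_deg:
                candidates.append((t, ra, dec))   # feed these to orbit fitting
    return candidates
```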
What has THOR achieved thus far? We ran THOR on two weeks’ worth of detections from the Zwicky Transient Facility (ZTF, a survey operating from the Palomar Observatory in California). ZTF’s internal tracklet-based moving object algorithm was able to recover about 14,000 previously known moving objects in that two-week period. THOR, running on the exact same dataset, recovered a little over 21,000 objects (97% of the objects with at least five detections), a factor of 1.5 improvement. We are now working to deploy a newer, better, and faster version of THOR on all of the detections coming from ZTF.
THOR is a completely open-source project. Find it here on Github: https://github.com/moeyensj/thor
The Large Synoptic Survey Telescope (LSST) is an upcoming project that will conduct a 10-year survey of the sky, from which we hope to answer questions about dark matter, dark energy, hazardous asteroids, and the formation and structure of the Milky Way. To find these answers, LSST will image the entire visible sky every three nights. It is estimated that over its 10 years of operations LSST will deliver 500 petabytes (PB) of data, the largest astronomical dataset released to date.
Science catalogs, on which most of the science will be performed, are produced by image reduction pipelines that are part of LSST’s code base, called the Science Pipelines. While LSST’s Science Pipelines adopt a set of image processing algorithms and metrics that cover as many science goals as possible, and while LSST will set aside 10% of its compute power to be shared by collaboration members, enabling processing of the underlying pixel data by scientists remains a very challenging problem. The largest obstacle to widespread data processing is the sheer data volume that will be produced by LSST, which requires large compute infrastructure. If pixel data (re)processing were accessible to more astronomers, it would undoubtedly improve repeatability and reproducibility and would, in general, increase the type and quantity of science that can be done with the data.
The tech industry, which in many cases handles data volumes significantly larger than LSST’s, has adopted cloud-based solutions because of their ability to scale up and down depending on the size and complexity of the data. LSST Data Management (DM) commissioned an Amazon Web Services (AWS) Proof of Concept (PoC) group to determine whether a cloud deployment of the LSST codebase is feasible, to measure its performance, and to determine the cost of cloud-native options.
The first results of this work were presented at the Petabytes to Science conference in Boston, where Dino Bektesevic and colleagues from LSST and Amazon demonstrated how the LSST Science Pipelines can be run on Amazon’s cloud, scaling up to thousands of compute cores. The preliminary tests indicate that the cloud has the potential for significant scaling while remaining affordable.
Read more here.