Square Kilometre Array (SKA) Use Case

The SKA is an unprecedented international engineering endeavor to create the largest radio telescope in the world. Completing this project requires state-of-the-art technologies to handle the massive amount of data that will be captured [1]. Once captured, the data will require advanced high-performance computing centers to process it and extract valuable insight. While the SKA involves many innovative ideas, this use case examines only the technologies and processes directly related to the SKA’s big data needs.

What is a radio telescope?

Before examining the data needs of the SKA, it is important to understand what a radio telescope is. Many people are familiar with an optical telescope, which uses a series of lenses to gather and focus light from distant objects to create an image. A radio telescope is similar in that it collects weak electromagnetic radiation from great distances and then amplifies it so that it can be analyzed. Another application is to send radio waves in a given direction and record the reflection off celestial bodies. In either case, the signals that astronomers are interested in are extremely weak, and many earthly sources of electromagnetic radiation are many times stronger. This earth-based interference can be reduced in hardware or in software, and the SKA also uses other measures to combat it. Modern radio telescopes accept a wide range of radio frequencies and then computationally split that range into as many as thousands of channels. To increase their efficacy, radio telescopes are generally used in groups rather than alone, which complicates matters further. Multiple positions on the ground receive the same radio signal, but at slightly different times and at slightly different phases of the waveform. This variation allows for more complex analysis of the radio signal. It adds another step to the computational work, but a large array of radio telescopes is imperative for most modern astronomical research goals [2].
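To make the channelization step concrete, the following is a minimal sketch, not the SKA’s actual signal chain. It assumes a made-up sample rate, block length, and pair of test tones, and uses a naive discrete Fourier transform in plain Python to split a sampled signal into frequency channels. The function name channelize and all constants are illustrative.

```python
import cmath
import math

# All values below are illustrative assumptions, not SKA parameters.
SAMPLE_RATE_HZ = 1024.0   # samples per second
N_SAMPLES = 256           # samples in one block
STRONG_TONE_HZ = 100.0    # stronger test tone
WEAK_TONE_HZ = 200.0      # weaker test tone


def channelize(samples):
    """Return the power in each frequency channel using a naive DFT.

    Channel k is centered on k * sample_rate / len(samples).
    Only channels 0 .. N/2 are returned because the input is real-valued.
    """
    n = len(samples)
    powers = []
    for k in range(n // 2 + 1):
        acc = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n)
            for t in range(n)
        )
        powers.append(abs(acc) ** 2)
    return powers


# Build one block of samples containing two tones of different strength.
signal = [
    math.cos(2 * math.pi * STRONG_TONE_HZ * t / SAMPLE_RATE_HZ)
    + 0.5 * math.cos(2 * math.pi * WEAK_TONE_HZ * t / SAMPLE_RATE_HZ)
    for t in range(N_SAMPLES)
]

channel_width_hz = SAMPLE_RATE_HZ / N_SAMPLES
powers = channelize(signal)
peak_channel = max(range(len(powers)), key=lambda k: powers[k])

print(f"channel width: {channel_width_hz} Hz")
print(f"strongest channel: {peak_channel} "
      f"(about {peak_channel * channel_width_hz} Hz)")
```

The nested loop is used here only for readability. In practice a fast Fourier transform or a polyphase filter bank, often on dedicated hardware, is the usual choice for this step.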

Science Goals

The vast size of the SKA project allows the exploration of a variety of burning questions that intrigue not only astrophysicists but nearly everyone on the planet. One overarching design goal of the SKA is to be flexible enough to serve as a “discovery machine” for the “exploration of the unknown”. With that said, the SKA has five broad research goals [3].

Galaxy Evolution and Dark Energy

As a central goal of the SKA, this is a broad question that requires a great deal of study to fully understand. With the data gathered, researchers hope to answer fundamental questions about how galaxies change over the course of their lifetimes. One problem is that most of the galaxies nearest to us are so far along in their evolution that it is hard to know what happens in a galaxy’s early years. The SKA can overcome this challenge through its “sensitivity and resolution”. It will be able to focus on younger galaxies that are much earlier in their evolution, showing what our own galaxy was like shortly after the big bang. Understanding the creation and evolution of galaxies also requires a study of dark energy. While this mysterious energy has made headlines in the past decade, it is still the subject of a lot of speculation. Because gravity is a main driving factor in the evolution of cosmic objects, understanding dark energy is needed to gain a full picture of galactic evolution. Our current fundamental physical theories, derived by Einstein, suggest that the expansion of the universe should be slowing, but it is not. This is where dark energy plays a part in the formation of our universe [4].

Was Einstein’s theory of relativity correct?

It is a tall order to question the most influential physicist in history. Technology is catching up with our theoretical understanding of physics, so we can now test fundamental theories that we have held true for many years. The SKA hopes to use its incredible sensitivity to investigate gravitational waves from extremely powerful sources of gravity such as black holes. While Einstein’s theories are very likely to be mostly true, they might not be complete, and that is what the SKA hopes to find out [1].

What are the sources of large magnetic fields in space?

We know that Earth creates a magnetic field that is imperative for life to exist. For the most part, we understand that this field is due to the composition and motion of the planet’s core. When it comes to magnetic fields in space, however, we are not completely sure what creates them. Studying these magnetic fields will support further study of the evolution of galaxies and our universe [5].

What are the origins of our universe?

This is a burning question that we have some theories about, but a great deal of exploration remains to be done. The prevailing theory relies on the big bang, and the SKA hopes to study the eras shortly after the big bang to gain insight into the origins of our universe. It plans to do this by once again using its sensitivity to give the most accurate measurements of the first light sources in our universe [6]. As long as this question remains unsolved, humans will always want to understand where we all came from.

As living beings, are we alone in the universe?

Using Drake’s equation and new exoplanet information, scientists are extremely optimistic that life exists somewhere in our universe. By some estimates, what has happened on our planet could have happened about “10 billion other times over in cosmic history!” [7]. One way the SKA can look for extraterrestrial life is by searching for radio signals sent out by advanced civilizations such as ours. Another way is by looking for signs of the building blocks of life. Amino acids are one such building block, and they can be identified by the SKA.
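For reference, the classic Drake equation multiplies seven factors to estimate the number of detectable civilizations in our galaxy. The sketch below only illustrates the arithmetic: the function name is hypothetical, and every input value is a placeholder chosen for readability, not an estimate taken from [7] or from the SKA project.

```python
def drake_equation(r_star, f_p, n_e, f_l, f_i, f_c, lifetime_years):
    """Classic Drake equation: N = R* * fp * ne * fl * fi * fc * L.

    r_star          average star formation rate (stars per year)
    f_p             fraction of stars with planets
    n_e             habitable planets per star that has planets
    f_l             fraction of those planets where life appears
    f_i             fraction of life-bearing planets with intelligence
    f_c             fraction of those that emit detectable signals
    lifetime_years  how long such signals are emitted (years)
    """
    return r_star * f_p * n_e * f_l * f_i * f_c * lifetime_years


# Placeholder inputs for illustration only; none are measured values.
n_civilizations = drake_equation(
    r_star=1.0, f_p=0.5, n_e=2.0, f_l=0.5, f_i=0.1, f_c=0.1,
    lifetime_years=1000.0,
)
print(f"Illustrative result: {n_civilizations:.1f} civilizations")
```

Because the result is a plain product, it is only as reliable as its least certain factor, which is why published estimates vary so widely.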

Current Progress

The SKA telescopes reside in two separate locations. One is in Western Australia and will focus on low frequencies. The other is in South Africa and will have two arrays, one for mid frequencies and one for mid to high frequencies [8].

South Africa

Design and preparations for the final SKA implementation are still ongoing. Two arrays, KAT-7 and MeerKAT, are already installed and functioning, and they serve as precursors to the SKA arrays in South Africa.

Australia

This site also has an SKA precursor already operating, named ASKAP. It sits in the same location that the SKA’s major components will eventually occupy, so it will give insight into how well the site performs for radio telescopes. In addition, within the past year, prototype antennas have been set up in smaller arrays in Australia to capture data and run tests before the design is used in the final array [10].

Big Data Challenges and Solutions

The SKA presents many big data challenges, from preprocessing to long-term storage of data. The estimated output of all the telescopes is around 700 PB per year [12].

Raw Data and Preprocessing

The data arrives as analog radio signals collected over a vast geographical area. Before any analytics can be run, the data must be converted from analog to digital. Although this conversion is usually done in dedicated hardware rather than on general-purpose computers, it is still a data processing step that must be done at scale. Some preprocessing must also happen continuously as data is collected. While this could be done after the data reaches the supercomputer, it is a repetitive task that could be done using FPGAs. The benefit of an FPGA is that it can run many more operations in parallel and execute repetitive algorithms faster and with less power than conventional CPUs [12].
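The analog-to-digital step can be pictured with a small software model. This is a minimal sketch under assumed values (8-bit resolution, a full-scale range of ±1.0, and a made-up test tone). Real converters are hardware devices, and nothing here reflects the SKA’s actual digitizer design. The helper name quantize is hypothetical.

```python
import math


def quantize(value, bits=8, full_scale=1.0):
    """Map an analog value to a signed integer code.

    Inputs outside [-full_scale, +full_scale] are clipped, much as a
    real converter saturates. bits=8 gives codes from -127 to 127.
    """
    max_code = 2 ** (bits - 1) - 1
    clipped = max(-full_scale, min(full_scale, value))
    return round(clipped / full_scale * max_code)


# Illustrative assumptions: a 50 Hz tone sampled 1000 times per second.
SAMPLE_RATE_HZ = 1000.0
TONE_HZ = 50.0

digital_samples = [
    quantize(0.8 * math.sin(2 * math.pi * TONE_HZ * t / SAMPLE_RATE_HZ))
    for t in range(20)
]
print(digital_samples)
```

Each sample is an independent, identical operation, which is the kind of repetitive work that maps well onto parallel hardware.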

Storage and Access

As mentioned previously, the estimated data output of the telescopes at peak is 700 PB per year. The initiative also hopes to keep all data for the lifetime of the project, which is around 50 years. This means eventually storing on the order of 35 EB of data. For more immediate storage, the SKA team plans to use a buffer system: a large array of fast read and write storage devices such as SSDs and NVMe drives (a specialized type of SSD). This buffer takes in the data as it arrives, at rates that traditional spinning disks generally cannot sustain. After the data is written to the buffer, it is moved more slowly onto more affordable storage with slower read/write speeds. While the team could use SSDs for all storage, the cost would be enormous. It is much more cost effective to keep most of the data on hard disk. For long-term storage, even cheaper media such as tape could be used. After a certain period following collection, the data will be opened to the public, which means it will likely not end up in a cold storage system [12].
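The storage figures above follow from simple arithmetic, sketched below. The calculation assumes decimal units (1 EB = 1000 PB), a constant 700 PB per year over the full 50 years, and a 365.25-day year. The average ingest rate is a derived illustration, not a figure from the survey [12].

```python
PB_PER_YEAR = 700          # estimated output, from the text
PROJECT_YEARS = 50         # planned retention period, from the text
PB_PER_EB = 1000           # decimal units (assumption)
SECONDS_PER_YEAR = 365.25 * 24 * 3600
BYTES_PER_PB = 10 ** 15

total_pb = PB_PER_YEAR * PROJECT_YEARS
total_eb = total_pb / PB_PER_EB

# Average sustained rate if the yearly output were spread evenly.
avg_gb_per_s = PB_PER_YEAR * BYTES_PER_PB / SECONDS_PER_YEAR / 10 ** 9

print(f"Total over {PROJECT_YEARS} years: {total_pb} PB = {total_eb} EB")
print(f"Average ingest rate: {avg_gb_per_s:.1f} GB/s")
```

Even as an average, a rate in the tens of gigabytes per second suggests why a fast buffer tier sits in front of slower, cheaper disks.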

Processing of Data

Under current plans, data will be processed across a large network of sites built on a variety of technologies. For the most part, no new high-performance computing centers will be created. Existing infrastructure, including public clouds, will be used instead. Along with using FPGAs for preprocessing and possibly for later processing, the SKA team plans to use GPU accelerators for efficient processing. Each team of researchers will have its own goals for the data and therefore a variety of processing needs, which will be carried out in SKA Regional Centers (SRCs). This processing might range from machine learning programs that extract insights from the data to other mathematical operations that prepare the data for study. In any case, the expectation is that the data these steps produce is preserved as well, leading to even more data to manage [12].

Other Challenges

While this data is not the most sensitive data on the planet, security must still be considered. The SKA team plans to create a sort of firewall between users and the actual HPC centers by using an AAAI (authorization, access, authentication, and identification) system. Security of proprietary data is a concern that will have to be addressed. Because a large team works on the project alongside many external actors, security becomes extremely complex, and more so with every additional access point to the data [12]. A project this large and versatile also requires many software tools. These tools generally need some level of automatic communication when they are used together in a project. With a large number of tools, the IT infrastructure becomes complex and must be managed and constantly monitored. For example, one tool may receive a critical update that then causes integration issues with other software systems.

References

[1] “Square Kilometre Array - ICRAR”, ICRAR, 2020. [Online]. Available: https://www.icrar.org/our-research/ska/. [Accessed: 23- Sep- 2020].
[2] “What are Radio Telescopes? - National Radio Astronomy Observatory”, National Radio Astronomy Observatory, 2020. [Online]. Available: https://public.nrao.edu/telescopes/radio-telescopes/. [Accessed: 23- Sep- 2020].
[3] “SKA Science - Public Website”, SQUARE KILOMETRE ARRAY, 2020. [Online]. Available: https://www.skatelescope.org/science/. [Accessed: 24- Sep- 2020].
[4] “Galaxy Evolution, Cosmology and Dark Energy - Public Website”, SQUARE KILOMETRE ARRAY, 2020. [Online]. Available: https://www.skatelescope.org/galaxyevolution/. [Accessed: 24- Sep- 2020].
[5] “Cosmic Magnetism - Public Website”, SQUARE KILOMETRE ARRAY, 2020. [Online]. Available: https://www.skatelescope.org/magnetism/. [Accessed: 24- Sep- 2020].
[6] “Probing the Cosmic Dawn - Public Website”, SQUARE KILOMETRE ARRAY, 2020. [Online]. Available: https://www.skatelescope.org/cosmicdawn/. [Accessed: 24- Sep- 2020].
[7] L. Sierra, “Are we alone in the universe? Revisiting the Drake equation”, Exoplanet Exploration: Planets Beyond our Solar System, 2020. [Online]. Available: https://exoplanets.nasa.gov/news/1350/are-we-alone-in-the-universe-revisiting-the-drake-equation/. [Accessed: 24- Sep- 2020].
[8] “Design - ICRAR”, ICRAR, 2020. [Online]. Available: https://www.icrar.org/our-research/ska/design/. [Accessed: 24- Sep- 2020].
[9] “Africa - Public Website”, SQUARE KILOMETRE ARRAY, 2020. [Online]. Available: https://www.skatelescope.org/africa/. [Accessed: 24- Sep- 2020].
[10] Square Kilometre Array, Building a giant telescope in the outback - part 2. 2020.
[11] “Australia - Public Website”, SQUARE KILOMETRE ARRAY, 2020. [Online]. Available: https://www.skatelescope.org/australia/. [Accessed: 24- Sep- 2020].
[12] Filled in Use Case Survey for SKA