Use Cases for Radiation Software
Hello! Thank you for deploying this JupyterHub instance!
Context
I work in the EM2C Laboratory of CentraleSupélec, Paris-Saclay, where we do Combustion, Plasma & Energy Science. As I told @nicolas.thiery on the phone, we developed an open-source infrared code called RADIS, which uses a new method to compute high-resolution molecular spectra (with thousands or millions of lines) orders of magnitude faster than comparable codes in the literature. We use the code (on our laptops) for plasma physics research, but there is large potential in other communities, in particular for high-temperature cases such as combustion (comparison with the approximate models used in fluid codes) and astrophysics (exoplanet detection).
We'd like to use the JupyterHub platform to test these new potential use cases at high temperature.
About the code:
Temperature is an important parameter: at low temperatures (< 700 K), databases contain thousands of lines (~ a few MB) and can be downloaded at runtime (see the documentation examples). For high-temperature cases, databases contain millions of lines and weigh a few GB (< 2000 K) or dozens of GB (< 4000 K). Specs (1 CPU):
| Temperature | Low temperature (< 700 K) | High temperature (< 2000 K) | Very high temperature (< 4000 K) |
|---|---|---|---|
| Calculation time (RADIS) | < 2 s | ~ 3 s | < 10 s |
| Calculation time (others) | << 1 s | ~ 10 - 30 s | ~ minutes or hours |
| Disk space (all) | ~ few MB | 3 - 10 GB | 10 - 100 GB |
| RAM for best performance (all) | < 1 GB | 10 - 16 GB | as much as possible :) [we split the spectra] |
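To make the sizing above easier to reuse (for instance when deciding how much disk and RAM a session needs), here is a minimal sketch that maps a gas temperature to the matching column of the table. The names `Tier`, `TIERS` and `tier_for` are hypothetical helpers written for this issue and are not part of RADIS; the figures are simply copied from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    """One column of the specs table (hypothetical helper, not a RADIS class)."""
    name: str
    max_temperature_k: float  # exclusive upper bound of the tier, in kelvin
    disk: str                 # database size on disk
    ram: str                  # RAM for best performance


# Figures copied from the table; thresholds from the paragraph before it.
TIERS = (
    Tier("Low temperature", 700.0, "~ few MB", "< 1 GB"),
    Tier("High temperature", 2000.0, "3 - 10 GB", "10 - 16 GB"),
    Tier("Very high temperature", 4000.0, "10 - 100 GB", "as much as possible"),
)


def tier_for(temperature_k: float) -> Tier:
    """Return the first tier whose upper bound is above the given temperature."""
    if temperature_k <= 0:
        raise ValueError("temperature must be positive (kelvin)")
    for tier in TIERS:
        if temperature_k < tier.max_temperature_k:
            return tier
    raise ValueError("temperature is above the range covered by the table")


if __name__ == "__main__":
    for t in (300.0, 1500.0, 3000.0):
        tier = tier_for(t)
        print(f"{t:.0f} K: {tier.name}, disk {tier.disk}, RAM {tier.ram}")
```

A lookup like this could help decide whether a session can download its database at runtime (low temperature) or needs the shared disk (high and very high temperature).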
Today the code is starting to be used by other groups. We have co-development with other institutions (Ohio State University) and with the open-source community (through GitHub), and two Google Summer of Code projects pending. We're trying to spread the use of the code and see its real potential. One obstacle for users not familiar with the Python environment is having to install an Anaconda distribution, in particular for one-shot uses. For student projects, we are sometimes limited by the students' hardware (RAM), and by the need to manually download and link the databases.
Use cases
Typical use cases are:
1. "Spectrum discovery": User wants to calculate the high-resolution spectrum of a molecule at low temperature (single session). User logs in to the Jupyter Notebook. RADIS is pre-installed. User calculates their low-temperature spectrum, with databases downloaded automatically. Alternative today: there are a couple of in-the-browser tools such as SpectraPlot in the US and hitran.iao.ru in Russia. I've also already deployed Binder instances for this use case.
2. "Combustion Researcher": User wants to calculate the high-resolution spectrum of a molecule at high temperature (single session). User logs in to the Jupyter Notebook. High-temperature databases are already available on a shared disk. Alternative today: none (competitor codes cannot compute spectra fast enough for in-the-browser applications, and Binder does not support hosting a line database). Today, these users usually find someone else to calculate their spectra (which is fine, but there is definitely a need).
3. "Fitting": User wants to fit experimental spectra (i.e. has experimental data, wants to upload it and compare it with synthetic spectra generated by RADIS). 1-hour-long session. Fast alternative today: none (even at low temperatures). Online environments like SpectraPlot do not let users upload files.
4. "Student Project": same as 3., but with recurrent use over the year.
5. "Collaborative Projects": there could be interest in collaborative, multi-user use (in particular for education or student research projects) with a collaborative JupyterHub (@nicolas.thiery told me it may be possible within 6 months to 1 year; maybe related to #7). We're currently trying that on CoCalc instances with tokens given by @nicolas.thiery to see if there is interest.
6. "Public": a final use would be no-login access for cases 1., 2. and 3. (one-time session, no persistent storage), to test the interest from researchers outside of Paris-Saclay, in particular for cases 2. and 3., where they have no real alternatives today.
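To illustrate what the "Fitting" use case involves computationally, here is a minimal sketch of a brute-force temperature fit: a synthetic spectrum is generated for each candidate temperature and the one with the smallest sum of squared residuals against the experimental data is kept. Everything here is an assumption made for illustration: `toy_spectrum` is a stand-in for a real RADIS calculation (a single Gaussian line whose width grows as the square root of temperature, Doppler-like), the line position and width are arbitrary, and the "experimental" data is itself synthetic.

```python
import math


def toy_spectrum(wavenumbers, temperature_k, center=2143.0, width_at_300k=0.5):
    """Stand-in for a RADIS calculation (hypothetical model, not the RADIS API).

    Returns one area-normalized Gaussian line whose width scales as sqrt(T).
    """
    sigma = width_at_300k * math.sqrt(temperature_k / 300.0)
    norm = sigma * math.sqrt(2.0 * math.pi)
    return [math.exp(-0.5 * ((w - center) / sigma) ** 2) / norm for w in wavenumbers]


def sum_squared_residuals(experimental, synthetic):
    """Sum of squared point-by-point differences between two spectra."""
    return sum((e - s) ** 2 for e, s in zip(experimental, synthetic))


def fit_temperature(wavenumbers, experimental, candidates, model=toy_spectrum):
    """Grid search: return the candidate temperature with the smallest residual."""
    return min(
        candidates,
        key=lambda t: sum_squared_residuals(experimental, model(wavenumbers, t)),
    )


# Wavenumber grid in cm-1 (arbitrary range around the toy line center).
wavenumbers = [2138.0 + 0.1 * i for i in range(101)]

# Fake "uploaded" experimental data, generated at 1500 K with the same toy model.
experimental = toy_spectrum(wavenumbers, 1500.0)

candidates = range(300, 3001, 100)
best = fit_temperature(wavenumbers, experimental, candidates)
print(f"Best-fit temperature: {best} K")
```

Each candidate requires one full spectrum calculation, so the cost of a fit scales with the calculation time per spectrum, which is why a fast code matters for an interactive session.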
Question
I'd be very interested in getting to know which of the use cases above seem possible:
- Cases 1. and 3. are for sure already possible (I tried in my own session).
- Case 2. may be possible too.
- Case 4. may require some data limitations (for research projects we're talking about 2-3 students max).
- Case 5.: we'll try on CoCalc.
- For 6., I have no idea, as the JupyterHub instance is mostly dedicated to internal Paris-Saclay use (but this could also be the most interesting use case here). In general, I'd be very happy to learn more about the future of the JupyterHub instance and of all similar projects in Paris-Saclay.
Erwan
PS (not related):
- We also have another radiation code called Specair, which is written in FORTRAN with dependencies such as HDF5. I built a Docker image that comes pre-built with all the dependencies so we can simply run our CI test cases on our own GitLab instance. At some point I wanted to host this Docker image somewhere (I tried an Amazon EC2 instance) and make it accessible through a Jupyter Notebook frontend (to avoid installing, similarly to RADIS here), but I have no idea how to do this. I would be quite interested in learning (but that's a separate issue).